
Doctors


30-10-2018


Are you sure, Alvim?

Comment from luizirber (Davis, CA), 5 days ago: DES PA CHI TO

Building a conda-forge package from an R CRAN package

For almost every workflow I run, I use snakemake. Snakemake is a workflow manager written by bioinformaticians for bioinformaticians. It has a lot of wonderful features (file tracking, cluster integration, reports, and integration with multiple languages), including support for conda environments. I use conda to specify all of the software I need for a rule, and conda takes care of building the environments for me. This makes my workflows more repeatable, and allows me to quickly switch between different systems (e.g. my campus compute cluster and NSF XSEDE’s Jetstream).

Here is an example of a simple Snakefile, as well as the accompanying conda environment.

Snakefile:

    rule tally_iris:
        output: "iris_tally.csv"
        conda: 'dplyr.yml'
        shell: '''
        Rscript -e "library(dplyr); library(readr); iris %>% group_by(Species) %>% tally() %>% write_csv('iris_tally.csv')"
        '''

dplyr.yml:

    channels:
      - conda-forge
      - bioconda
      - defaults
    dependencies:
      - r-dplyr=0.8.3
      - r-readr=1.3.1

To execute this Snakefile, I would run:

    snakemake --use-conda

However, sometimes I run into a situation where the package or library I use in my workflow does not have a conda package associated with it. In this case, I used to install all of the dependencies (assuming they had conda packages) in a conda environment, and then install the package I was interested in from within a snakemake rule. I would usually also make this installation script write out a text file so that I knew the rule had finished running.

No more! Luiz Irber recently showed me how to make a conda-forge recipe from a CRAN package. I did this successfully with the optimr package, and couldn’t believe how streamlined the process is. I walk through the process below.

1. Build the recipe using conda_r_skeleton_helper

conda_r_skeleton_helper is a GitHub repository that builds a properly formatted conda-forge recipe from an R CRAN package.
With a user-created list of packages, it builds a recipe for each package in the list. It uses the documentation on CRAN to auto-populate recipe fields such as dependencies and description.

1a. Installing conda-build

conda_r_skeleton_helper requires conda-build, so I first installed it into its own conda environment.

    conda create -n conda_build conda-build
    conda activate conda_build

1b. Cloning conda_r_skeleton_helper

Next, I cloned the repository to my laptop.

    git clone https://github.com/bgruening/conda_r_skeleton_helper.git

1c. Building the recipe

With conda_r_skeleton_helper on my laptop, I then followed the instructions in the README file and added the R CRAN packages I wanted to build a recipe for to the packages.txt file. I removed the packages that were already there. In this case, I was building a recipe for optimr. My packages.txt file ended up looking like this:

    r-optimr

With my packages of interest in the packages.txt file, I then ran the run script. I chose to run it in R, but it can be run in bash and python as well.

    Rscript run.R

When this finished running, I had a newly created folder called r-optimr. Inside it were three files: bld.bat, build.sh, and meta.yaml. I had successfully built a conda-forge recipe!

As a last step, I added my GitHub username to the maintainers section so that I can approve version bumps on the recipe. conda-forge has set up a bot to orchestrate these changes, but a maintainer still needs to click the merge button to propagate those changes. Because I made the recipe, I added myself so I can click the button.

2. Submit the recipe to conda-forge

With the recipe built, I now needed to submit it to conda-forge. For no particular reason, I decided to do this entirely within the GitHub web interface instead of using git. To start the process, I first forked the conda-forge staged-recipes repository into my own GitHub account. Once there, I created a branch that I named r-optimr.
I switched to that branch and changed into the recipes directory. I clicked upload, and uploaded my local r-optimr folder.

Lastly, I started a pull request to merge the changes on my r-optimr branch into the conda-forge master branch. I followed the checklist that is in the PR template and clicked submit! My recipe passed all checks, so it was merged within a few hours of my posting it. You can see my merged PR here, and the r-optimr conda-forge package here.

Thank you to Luiz Irber for teaching me this process, and for feedback on this post!
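For the maintainers step at the end of section 1c, the edit goes in the generated meta.yaml. The fragment below is a minimal sketch of what that part of the file might look like, assuming the usual conda-forge recipe layout in which maintainers are listed under `extra` / `recipe-maintainers`. The `conda-forge/r` team entry is an assumption about what the helper generates (the post does not show it), and `your-github-username` is a placeholder to replace with your own handle.

```yaml
# Hypothetical excerpt from r-optimr/meta.yaml (only the maintainers section).
extra:
  recipe-maintainers:
    - conda-forge/r          # assumed to be pre-filled by the helper
    - your-github-username   # placeholder: add your own GitHub handle here
```

Anyone listed here can merge the bot's version-bump pull requests once the recipe has its own feedstock.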

Thorndike won. Dewey Lost: The Most Important 4 Words about the US Education System


One cannot understand the history of education in the United States during the twentieth century unless one realizes that Edward L. Thorndike won and John Dewey lost.
— Ellen Condliffe Lagemann

I mentioned that “Thorndike won and Dewey lost” on Twitter a couple months ago. I realized that some education researchers didn’t know this story. I first learned about it in Lagemann’s intriguing book, An Elusive Science.

Lagemann explains that Dewey was the pioneer at Chicago and Columbia, and recruited faculty and administrators that supported his perspective. But Thorndike came later and replaced those faculty.

Unlike Dewey, Thorndike favored the separation of philosophy and psychology. Despite considerable disdain for educators and an extremely imperialistic view of psychology, which he thought supreme for studying and controlling human affairs, Thorndike formulated ideas that were more suited to translation into formulas for educational practice. A conservative person whose prose was clear, to the point, humorless, and colorless, Thorndike was about as different from Dewey as two men could be. (pp. 56-57)

Five years after Dewey left Chicago, Charles Hubbard Judd took his place. While Judd and Thorndike were rivals, they had similar views about the role and definition of school.

Over the years, Judd also recruited a faculty that was as supportive of his views as the Dewey group had been supportive of the views of their chief. (p. 68)

Although both thought experimentation was necessary in education, Dewey saw the school as the laboratory of education, whereas Judd saw the school as primarily the place for the implementation of real laboratory findings… Whereas Dewey saw teachers and researchers as more alike than different, wanting both to be skilled students of education, Judd believed that the improvement of education required the professionalization of education, which, in turn, necessitated that teachers and researchers fulfill distinct roles. (pp. 69-70)

Audrey Watters has written a great blog post about the tension:

Ed-tech has always been more Thorndike than Dewey because education has been more Thorndike than Dewey. That means more instructivism than constructionism. That means more multiple choice tests than projects. That means more surveillance than justice.

If you do Web searches on “Thorndike won. Dewey lost,” you’ll find many relevant essays and papers. Dewey (wikipedia page) believed in educating the student, meeting them where they were, and helping them to develop in their community through teacher-driven innovations in the classroom. Thorndike (wikipedia page) was about administrative systems: grades, teacher requirements and credentialing, preparing students for vocations, testing (Thorndike is best known in psychology for his work on measurement), and teachers implementing what researchers invent. The US education system favors the latter.

I like David Labaree’s paper “How Dewey Lost: The Victory of David Snedden and Social Efficiency in the Reform of American Education” which summarizes why Thorndike won.

The pedagogically progressive vision of education — child-centered, inquiry based, and personally engaging — is a hothouse flower trying to survive in the stony environment of public education. It won’t thrive unless conditions are ideal, since, among other things, it requires committed, creative, energetic, and highly educated teachers, who are willing and able to construct education to order for students in the classroom; and it requires broad public and fiscal support for education as an investment in students rather than an investment in economic productivity.

But the administrative progressive vision of education — as a prudent investment in a socially efficient future — is a weed. It will grow almost anywhere.

When I look at computing education interventions, I see a lot informed by Dewey. Caring teachers, researchers working in partnership with practitioners (RPPs), and developers want students to engage and learn. That’s great, and as Audrey Watters has suggested, technology may be a way of making Dewey’s vision work in US classrooms today. But there’s likely a reason why Thorndike won.

It wasn’t luck. The US school system is built following Thorndike’s vision because his vision was more in concert with US values. I’m not an expert on how US values have driven the US education system, but I can guess at some of the factors. The US system is driven by the promise of compulsory education for all, a belief in rugged individualism, and the values of a capitalistic society.

  • We have a mission to educate everyone. When there’s a trade-off between increasing quality somewhere and making sure that we can provide something for everybody, the most common choice is something for everybody.
  • We like our image of Americans as settlers/pioneers. No “hothouse flowers.” We’re “weeds” that can rise up to handle adversity. We want our education system to be small, minimalist, and local.
  • Education is expensive. States increase their investments only if (on paper at least) they can offer the same thing to everyone. The top goal in US education is to prepare workers, over a goal to prepare citizens. Our education decisions are dominated by economics.

Few students get access to computing education today (as I described in this blog post). The biggest barrier is that we’re too busy and resource-limited providing all the students the classes they need to meet current school requirements. See the principal in Miranda Parker’s dissertation who chooses to keep choir (which helps many students to get the credits they need to graduate) over CS (which only a few students might take). See the education faculty I talked about in my recent CACM blog, who are far too busy meeting state requirements for mathematics and science teachers to fit in CS, which isn’t required to be taught pre-service. CS is something new that only a few students get excited about. That might be something Dewey would like, since he values individuals finding their interests, but not Thorndike, who values education as a system for everybody.

The lesson is that if we want to get computing education in front of US students, we need to figure out how to make it work within Thorndike’s system. We have to be efficient. We have to do it with few resources. We have to fit into existing models. Alternatively, we can try to move the US education system into a more Dewey-like model — but we have to realize how big a shift that is. Thorndike won almost 100 years ago. The US education system has a century of ingrained views that align with Thorndike.

I wish I could argue for a more progressive view, but in the end: Thorndike won. Dewey lost.




Scaling up and down


There’s a worn-out analogy in software development that you cannot build a skyscraper the same way you build a dog house. The idea is that techniques that will work on a small scale will not work on a larger scale. You need more formality to build large software systems.

The analogy is always applied in one direction: up. It’s always an exhortation to use techniques appropriate for larger projects.

But the analogy works in the other direction as well: it’s inappropriate to build a dog house the same way you’d build a skyscraper. It would be possible to build a dog house the way you’d build a skyscraper, but it would be very expensive. Amateur carpentry methods don’t scale up, but professional construction methods don’t scale down economically.

Bias for over-engineering

There’s a bias toward over-engineering because it works, albeit inefficiently, whereas under-engineering does not. You can use a sledgehammer to do a hammer’s job. It’ll be clumsy, and you might hurt yourself, but it can work. And there are tasks where a hammer just won’t get the job done.

Another reason for the bias toward over-engineering is asymmetric risk. If an over-engineered approach fails, you’ll face less criticism than if a simpler approach fails. As the old saying goes, nobody got fired for choosing IBM.

Context required

Simple solutions require context to appreciate. If you do something simple, you’re open to the criticism “But that won’t scale!” You have to defend your solution by explaining that it will scale far enough, and that it avoids costs associated with scaling further than necessary.

Suppose a group is debating whether to walk or drive to lunch. Someone advocating driving requires less context to make his point. He can simply say “Driving is faster than walking,” which is generally true. The burden is on the person advocating walking to explain why walking would actually be faster under the circumstances.

Writing prompt

I was using some database-like features in Emacs org-mode this morning and that’s what prompted me to write this post. I can just hear someone say “That won’t scale!” I often get this reaction from someone when I write about a simple, low-tech way to do something on a small scale.

Using a text file as a database doesn’t scale. But I have 88 rows, so I think I’ll be OK. A relational database would be better for storing millions of records, but that’s not what I’m working on at the moment.
