Scenes from the ant colony's growing magician problem

jwz
probablybadrpgideas:
If Cthulhu can be summoned by humans who are so far beneath it, why can't humans be summoned by ants?

The answer is they should be.

20thcenturyvole:

Well if a bunch of ants formed a circle in my house I'd certainly notice, try to figure out where they'd all come from, and possibly wreak destruction there.

weasowl:

That's why knowing and correctly pronouncing the true name is so important to the ritual. Imagine how impossible it would be to not go take a look if the circle of ants started chanting your name.

And they're like, you can't leave because we drew a line made of tiny crystals - now you have to do us a favor.

And you're like, let's just see where this goes "yup, you got me... what's the favor?"

and usually the favor is like, "kill this one ant for us" or "give me a pile of sugar" and you're like... okay? and you do, because why not, it isn't hard for you and boy is this going to be a fucking story to tell, these fucking ants chanting your name and wanting a spoonful of sugar or whatever.

And SOMEtimes you get asked for things you can't really do, one of them, she's like, "I love this ant but she won't pay any attention to me, make me important to her" and you're like... um? how? So you just kill every ant in the colony except the two of them, ta-da! problem solved! and the first ant is like horrified whisper "what have I done"



Matthew Rocklin: Pickle isn't slow, it's a protocol


This work is supported by Anaconda Inc.

tl;dr: Pickle isn’t slow, it’s a protocol. Protocols are important for ecosystems.

A recent Dask issue showed that using Dask with PyTorch was slow because sending PyTorch models between Dask workers took a long time (Dask GitHub issue).

This turned out to be because serializing PyTorch models with pickle was very slow (1 MB/s for GPU-based models, 50 MB/s for CPU-based models). There is no architectural reason why this needs to be this slow. Every part of the hardware pipeline is much faster than this.

We could have fixed this in Dask by special-casing PyTorch models (Dask has its own optional serialization system for performance), but being good ecosystem citizens, we decided to raise the performance problem in an issue upstream (PyTorch GitHub issue). This resulted in a five-line fix to PyTorch that turned a 1-50 MB/s serialization bandwidth into a 1 GB/s bandwidth, which is more than fast enough for many use cases (PR to PyTorch).

     def __reduce__(self):
-        return type(self), (self.tolist(),)
+        b = io.BytesIO()
+        torch.save(self, b)
+        return (_load_from_bytes, (b.getvalue(),))


+def _load_from_bytes(b):
+    return torch.load(io.BytesIO(b))

Thanks to the PyTorch maintainers, this problem was solved pretty easily. PyTorch tensors and models now serialize efficiently in Dask or in any other Python library that might want to use them in distributed systems like PySpark, IPython parallel, Ray, or anything else, without having to add special-case code or do anything special. We didn’t solve a Dask problem, we solved an ecosystem problem.

However, before we solved this problem we discussed things a bit. This comment stuck with me:

[Image: GitHub screenshot of a maintainer commenting that PyTorch’s pickle implementation is slow]

This comment contains two beliefs that are both very common, and that I find somewhat counter-productive:

  1. Pickle is slow
  2. You should use our specialized methods instead

I’m sort of picking on the PyTorch maintainers here a bit (sorry!), but I’ve found that these beliefs are quite widespread, so I’d like to address them here.

Pickle is slow

Pickle is not slow. Pickle is a protocol. We implement pickle. If it’s slow then it is our fault, not Pickle’s.

To be clear, there are many reasons not to use Pickle.

  • It’s not cross-language
  • It’s not very easy to parse
  • It doesn’t provide random access
  • It’s insecure
  • etc.

So you shouldn’t store your data or create public services using Pickle, but for things like moving data on a wire it’s a great default choice if you’re moving strictly from Python processes to Python processes in a trusted and uniform environment.

It’s great because it’s as fast as you can make it (up to a memory copy), and other libraries in the ecosystem can use it without needing to special-case your code into theirs.
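
As a rough illustration (a minimal sketch using only the standard library; the exact numbers depend on your machine), pickling a plain bytes buffer runs at roughly memory-copy speed, while pickling the same volume of data exploded into millions of Python objects is dramatically slower:

    import pickle
    import time

    blob = b"\x00" * 100_000_000        # 100 MB of raw bytes
    objects = list(range(10_000_000))   # 10 million Python ints

    for name, obj in [("bytes", blob), ("list of ints", objects)]:
        start = time.perf_counter()
        data = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
        elapsed = time.perf_counter() - start
        print(f"{name}: {len(data) / elapsed / 1e6:.0f} MB/s")

The second case is essentially what a .tolist()-based __reduce__ forces pickle to do.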

This is the change we did for PyTorch.

     def __reduce__(self):
-        return type(self), (self.tolist(),)
+        b = io.BytesIO()
+        torch.save(self, b)
+        return (_load_from_bytes, (b.getvalue(),))


+def _load_from_bytes(b):
+    return torch.load(io.BytesIO(b))

The slow part wasn’t Pickle, it was the .tolist() call within __reduce__ that converted a PyTorch tensor into a list of Python ints and floats. I suspect that the common belief of “Pickle is just slow” stopped anyone else from investigating the poor performance here. I was surprised to learn that a project as active and well maintained as PyTorch hadn’t fixed this already.

As a reminder, you can implement the pickle protocol by providing the __reduce__ method on your class. The __reduce__ function returns a loading function and sufficient arguments to reconstitute your object. Here we used torch’s existing save/load functions to create a bytestring that we could pass around.
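
As a concrete sketch (the class and its attribute are hypothetical; only the shape of __reduce__ matters), an object that already holds its payload as bytes can hand pickle that buffer directly instead of expanding it into Python objects:

    import pickle

    class Blob:
        """Hypothetical container whose payload is already a bytestring."""

        def __init__(self, data):
            self.data = data

        def __reduce__(self):
            # A callable plus the argument tuple needed to rebuild the object;
            # pickle will call Blob(self.data) again at load time.
            return (Blob, (self.data,))

    blob = Blob(b"\x00" * 1_000_000)
    restored = pickle.loads(pickle.dumps(blob))
    assert restored.data == blob.data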

Just use our specialized option

Specialized options can be great. They can have nice APIs with many options, they can tune themselves to specialized communication hardware if it exists (like RDMA or NVLink), and so on. But people need to learn about them first, and learning about them can be hard in two ways.

Hard for users

Today we use a large and rapidly changing set of libraries. It’s hard for users to become experts in all of them. Increasingly we rely on new libraries making it easy for us by adhering to standard APIs, providing informative error messages that lead to good behavior, and so on.

Hard for other libraries

Other libraries that need to interact definitely won’t read the documentation, and even if they did it’s not sensible for every library to special case every other library’s favorite method to turn their objects into bytes. Ecosystems of libraries depend strongly on the presence of protocols and a strong consensus around implementing them consistently and efficiently.

Sometimes Specialized Options are Appropriate

There are good reasons to support specialized options. Sometimes you need more than 1 GB/s bandwidth. While this is rare in general (very few pipelines process faster than 1 GB/s per node), it is true in the particular case of PyTorch when doing parallel training on a single machine with multiple processes. Soumith (PyTorch maintainer) writes the following (a minimal usage sketch follows the quote):

When sending Tensors over multiprocessing, our custom serializer actually shortcuts them through shared memory, i.e. it moves the underlying Storages to shared memory and restores the Tensor in the other process to point to the shared memory. We did this for the following reasons:

  • Speed: we save on memory copies, especially if we amortize the cost of moving a Tensor to shared memory before sending it into the multiprocessing Queue. The total cost of actually moving a Tensor from one process to another ends up being O(1), and independent of the Tensor’s size

  • Sharing: If Tensor A and Tensor B are views of each other, once we serialize and send them, we want to preserve this property of them being views. This is critical for neural-nets where it’s common to re-view the weights / biases and use them for another. With the default pickle solution, this property is actually lost.
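
From user code, that shared-memory path looks roughly like the minimal sketch below (the documented torch.multiprocessing pattern, not PyTorch’s internal reduction code): the parent moves a tensor’s Storage to shared memory, and the child’s in-place writes become visible without the data itself being copied through the queue.

    import torch
    import torch.multiprocessing as mp

    def worker(q):
        t = q.get()     # receives a handle to the shared Storage, not a copy
        t += 1          # in-place update is visible to the parent

    if __name__ == "__main__":
        tensor = torch.zeros(1000)
        tensor.share_memory_()      # move the underlying Storage to shared memory
        q = mp.Queue()
        p = mp.Process(target=worker, args=(q,))
        p.start()
        q.put(tensor)
        p.join()
        print(tensor[:5])           # tensor([1., 1., 1., 1., 1.])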


Kylian Mbappé imposes the new order on Neymar Júnior

1 Share

Fayza Lamari was a good basketball player in France's First Division. Her devotion to a sport that demands sacrifice led her to form a severe idea of athletic practice. At Paris Saint-Germain they say that her rough character shaped her son, the new star of the French national team, Kylian Mbappé.

Continue reading.


Computer Science to the Second Degree


Some thoughts on studying computer science from Gian-Carlo Rota:

A large fraction of MIT undergraduates major in computer science or at least acquire extensive computer skills that are applicable in other fields. In their second year, they catch on to the fact that their required courses in computer science do not provide the whole story. Not because of deficiencies in the syllabus; quite the opposite. The undergraduate curriculum in computer science at MIT is probably the most progressive and advanced such curriculum anywhere. Rather, the students learn that side by side with required courses there is another, hidden curriculum consisting of new ideas just coming into use, new techniques that spread like wildfire, opening up unsuspected applications that will eventually be adopted into the official curriculum.

Keeping up with this hidden curriculum is what will enable a computer scientist to stay ahead in the field. Those who do not become computer scientists to the second degree risk turning into programmers who will only implement the ideas of others.

MIT is, of course, an exceptional school, but I think Rota's comments apply to computer science at most schools. So much learning of CS happens in the spaces between courses: in the lab, in the student lounge, at meetings of student clubs, at part-time jobs, and so on. That can sometimes be a challenge for students who don't have much curiosity or don't develop one as they are exposed to new topics.

As profs, we encourage students to be aware of all that is going on in computer science beyond the classroom and to take part in the ambient curriculum to the extent they are able. Students who become computer scientists only to the first degree can certainly find good jobs and professional success, but there are more opportunities open at the second degree. CS can also be a lot more fun there.


How long does it take to produce scientific software?


Over here at UC Davis, the Lab for Data Intensive Biology has been on extended walkabout developing software for, well, doing data intensive biology.

Over the past two to three years or so, various lab members have been working on the following new pieces of software: dammit, sourmash, kevlar, spacegraphcats, and boink.

I should say that all of these except for kevlar have been explicitly supported by my Moore Foundation funding from the Data Driven Discovery Initiative.

With the possible exception of dammit, every single one of these pieces of software was developed entirely since the move to UC Davis (so, since 2015 or later). And almost all of them are now approaching some reasonable level of maturity, defined as "yeah, not only does this work, but it might be something that other people can use." (Both dammit and sourmash are being used by other people already; kevlar, spacegraphcats, and boink are being written up now.)

All of these coming together at the same time seems like quite a coincidence to me, and I would like to make the following proposition:

It takes a minimum of two to three years for a piece of scientific software to become mature enough to publicize.

This fits with my previous experiences with khmer and the FamilyRelations/Cartwheel set of software as well - each took about two years to get to the point where anyone outside the lab could use them.

I can think of quite a few reasons why some level of aging could be necessary -

  • often in science you have no real idea of what you're doing at the beginning of a project, and that just takes time to figure out;

  • code just takes time to get reasonably robust when interfacing with real world data;

  • there are lots of details that need to be worked out for installation and distribution of code, and that also just takes time;

but I'm somewhat mystified by the 2-3 year arc. It could be tied to the funding timeline (the Moore grant ends in about a year) or career horizons (the grad students want to graduate, the postdocs want to move on).

My best guess, tho, is that there is some complex tradeoff between scope and effort that breaks the overall software development work into multiple stages - something like,

  1. figure out the problem
  2. implement a partial solution
  3. make an actual solution
  4. expand solution cautiously to apply to some other nearby problems.

I'm curious as to whether or not this pattern fits with other people's experiences!

I do expect these projects to continue maturing as time and opportunity permits, much like khmer. boink, spacegraphcats, and sourmash should all result in multiple papers from my lab; kevlar will probably move with Daniel to his next job, but may be something we also extend in our lab; etc.

Another very real question in my mind is: which software do we choose to maintain and extend? It's clearly dependent on funding, but also on the existence of interesting problems that the software can still address, and on who I have in my lab... right now a lot of our planning is pretty helter skelter, but it would be good to articulate a list of guiding considerations for when I do see pots of money on the horizon.

Finally: I think this 2-3 year timeline has some interesting implications for the question of whether or not we should require people to release usable software. I think it's a major drain on people to expect them to not only come up with some cool new idea and implement it in software they can use, but then also make software that is more generally usable. Both sides of this take special skills - some people are good at methods & algorithms development, some people are good at software development, but very few people are good at both. And we should value both, but not require that people be good at both.

--titus
