Five Years of Rust

With all that's going on in the world you'd be forgiven for forgetting that as of today, it has been five years since we released Rust 1.0 in 2015! Rust has changed a lot these past five years, so we wanted to reflect back on all of our contributors' work since the stabilization of the language.

Rust is a general purpose programming language empowering everyone to build reliable and efficient software. Rust can be built to run anywhere in the stack, whether as the kernel for your operating system or your next web app. It is built entirely by an open and diverse community of individuals, primarily volunteers who generously donate their time and expertise to help make Rust what it is.

Major Changes since 1.0

2015

1.2 — Parallel Codegen: Compile time improvements are a large theme of every release of Rust, and it's hard to imagine that there was a time, not so long ago, when Rust had no parallel code generation at all.

1.3 — The Rustonomicon: Our first release of the fantastic "Rustonomicon", a book that explores Unsafe Rust and its surrounding topics and has become a great resource for anyone looking to learn and understand one of the hardest aspects of the language.

1.4 — Windows MSVC Tier 1 Support: The first tier 1 platform promotion was bringing native support for 64-bit Windows using the Microsoft Visual C++ toolchain (MSVC). Before 1.4 you also needed to have MinGW (a third party GNU environment) installed in order to use and compile your Rust programs. Rust's Windows support is one of the biggest improvements these past five years. Just recently Microsoft announced a public preview of their official Rust support for the WinRT API! Now it's easier than ever to build top quality native and cross platform apps.

1.5 — Cargo Install: The addition of being able to install Rust binaries alongside cargo's pre-existing plugin support has given birth to an entire ecosystem of apps, utilities, and developer tools that the community has come to love and depend on. Quite a few of the commands cargo has today were first plugins that the community built and shared on crates.io!

2016

1.6 — Libcore: libcore is a subset of the standard library that only contains APIs that don't require allocation or operating system level features. The stabilization of libcore, which brought the ability to compile Rust with no allocation or operating system dependency, was one of the first major steps towards Rust's support for embedded systems development.
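
As a rough illustration, here is a minimal sketch of a no_std library that relies only on core; the function names are made up for the example, but everything used here (slices, iterators, integer arithmetic) lives in libcore rather than std.

#![no_std]

// Everything below uses only items from core, so this library can be
// compiled for targets with no allocator and no operating system.
pub fn checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

pub fn is_power_of_two(n: u32) -> bool {
    n != 0 && (n & (n - 1)) == 0
}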

1.10 — C ABI Dynamic Libraries: The cdylib crate type allows Rust to be compiled as a C dynamic library, enabling you to embed your Rust projects in any system that supports the C ABI. Some of Rust's biggest success stories among users come from being able to write a small critical part of a system in Rust and seamlessly integrate it into a larger codebase, and that's now easier than ever.
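
As a hedged sketch (the function name and signature here are purely illustrative), exposing a Rust function over the C ABI looks roughly like this, with crate-type = ["cdylib"] set in Cargo.toml:

// In Cargo.toml:  [lib]  crate-type = ["cdylib"]
// `#[no_mangle]` keeps the symbol name intact and `extern "C"` uses the C
// calling convention, so any FFI-capable host (C, Python, Ruby, ...) can call it.
#[no_mangle]
pub extern "C" fn rust_add(a: i32, b: i32) -> i32 {
    a + b
}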

1.12 — Cargo Workspaces: Workspaces allow you to organise multiple Rust projects so they share the same lockfile. Workspaces have been invaluable in building large multi-crate projects.

1.13 — The Try Operator: The first major syntax addition was ?, the "Try" operator. The operator allows you to easily propagate errors up your call stack. Previously you had to use the try! macro, which required you to wrap the entire expression each time you encountered a Result; now you can easily chain methods with ? instead.

try!(try!(expression).method()); // Old
expression?.method()?;           // New

1.14 — Rustup 1.0: Rustup is Rust's toolchain manager; it allows you to seamlessly use any version of Rust or any of its tooling. What started as a humble shell script has become what the maintainers affectionately call a "chimera". Being able to provide first class compiler version management across Linux, macOS, Windows, and dozens of target platforms would have seemed like a myth just five years ago.

2017

1.15 — Derive Procedural Macros: Derive macros allow you to create powerful and extensive strongly typed APIs without all the boilerplate. This was the first version of Rust where you could use libraries like serde or diesel's derive macros on stable.
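
For a flavour of what that looks like, here is a small, hedged sketch using serde's derive macros; it assumes serde (with the "derive" feature) and serde_json are listed as dependencies, and the Config type is made up for the example.

use serde::{Deserialize, Serialize};

// The derive macros generate all of the (de)serialization boilerplate.
#[derive(Serialize, Deserialize, Debug)]
struct Config {
    name: String,
    retries: u32,
}

fn main() -> Result<(), serde_json::Error> {
    let config: Config = serde_json::from_str(r#"{"name": "demo", "retries": 3}"#)?;
    println!("{:?}", config);
    println!("{}", serde_json::to_string(&config)?);
    Ok(())
}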

1.17 — Rustbuild: One of the biggest improvements for our contributors to the language was moving our build system from the initial make based system to using cargo. This has made rust-lang/rust a lot easier for members and newcomers alike to build and contribute to.

1.20 — Associated Constants: Previously constants could only be associated with a module. In 1.20 we stabilised associating constants on structs, enums, and, importantly, traits, making it easier to add rich sets of preset values for the types in your API, such as common IP addresses or interesting numbers.
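
A small sketch of what that enables (the types here are invented for illustration):

trait Shape {
    // Traits can now carry constants alongside methods.
    const SIDES: u32;

    fn describe() -> String {
        format!("a shape with {} sides", Self::SIDES)
    }
}

struct Square;

impl Shape for Square {
    const SIDES: u32 = 4;
}

// Inherent associated constants work too, e.g. for preset values.
struct Rgb(u8, u8, u8);

impl Rgb {
    const BLACK: Rgb = Rgb(0, 0, 0);
}

fn main() {
    println!("{}", Square::describe());
    let _background = Rgb::BLACK;
}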

2018

1.24 — Incremental Compilation: Before 1.24, when you made a change in your library, rustc had to re-compile all of the code. Now rustc is a lot smarter about caching as much as possible and only re-generating what's needed.

1.26 — impl Trait: The addition of impl Trait gives you expressive dynamic APIs with the benefits and performance of static dispatch.
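
A brief sketch of impl Trait in both positions (the functions are invented for illustration):

// Return position: the caller gets static dispatch without having to
// name the (unnameable) closure type.
fn multiplier(factor: i32) -> impl Fn(i32) -> i32 {
    move |x| x * factor
}

// Argument position: shorthand for a generic type parameter.
fn total(values: impl Iterator<Item = i32>) -> i32 {
    values.sum()
}

fn main() {
    let double = multiplier(2);
    println!("{}", double(21));                        // 42
    println!("{}", total([1, 2, 3].iter().copied()));  // 6
}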

1.28 — Global Allocators: Previously you were restricted to using the allocator that rust provided. With the global allocator API you can now customise your allocator to one that suits your needs. This was an important step in enabling the creation of the alloc library, another subset of the standard library containing only the parts of std that need an allocator like Vec or String. Now it's easier than ever to use even more parts of the standard library on a variety of systems.
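
Switching a whole program to the system allocator is a one-liner with the global allocator API; a minimal sketch:

use std::alloc::System;

// Every heap allocation in this program (Box, Vec, String, ...) now goes
// through the declared global allocator instead of the default one.
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    let numbers = vec![1, 2, 3];
    println!("{:?}", numbers);
}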

1.31 — 2018 edition: The release of the 2018 edition was easily our biggest release since 1.0, adding a collection of syntax changes and improvements to writing Rust, all delivered in a completely backwards compatible fashion that allows libraries built with different editions to seamlessly work together.

  • Non-Lexical Lifetimes: A huge improvement to Rust's borrow checker, allowing it to accept more verifiably safe code.
  • Module System Improvements: Large UX improvements to how we define and use modules.
  • Const Functions: Const functions allow you to run and evaluate Rust code at compile time (a small example follows this list).
  • Rustfmt 1.0: A new code formatting tool built specifically for Rust.
  • Clippy 1.0: Rust's linter for catching common mistakes. Clippy makes it a lot easier to make sure that your code is not only safe but correct.
  • Rustfix: With all the syntax changes, we knew we wanted to provide the tooling to make the transition as easy as possible. Now when changes are required to Rust's syntax they're just a cargo fix away from being resolved.
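
As referenced above, a minimal const fn sketch (the names are illustrative):

// A const fn can be evaluated at compile time, for example to size an array.
const fn kilobytes(n: usize) -> usize {
    n * 1024
}

const BUF_SIZE: usize = kilobytes(4);
static BUFFER: [u8; BUF_SIZE] = [0; BUF_SIZE];

fn main() {
    println!("{}", BUFFER.len()); // 4096
}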

2019

1.34 — Alternative Crate Registries: As Rust is used more and more in production, there is a greater need to be able to host and use your projects in non-public spaces. While cargo has always allowed remote git dependencies, Alternative Registries let your organisation easily build and share its own registry of crates that can be used in your projects as if they were on crates.io.

1.39 — Async/Await: The stabilisation of the async/await keywords for handling Futures was one of the major milestones in making async programming in Rust a first class citizen. Even just six months after its release, async programming in Rust has blossomed into a diverse and performant ecosystem.
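
A minimal sketch, assuming the futures crate is available to provide a simple executor (the function names are invented):

use futures::executor::block_on;

// An async fn returns a Future; nothing runs until the future is polled.
async fn fetch_number() -> u32 {
    42
}

async fn compute() -> u32 {
    // `.await` suspends this future until `fetch_number` finishes,
    // without blocking the thread it runs on.
    fetch_number().await * 2
}

fn main() {
    println!("{}", block_on(compute())); // 84
}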

2020

1.42 — Subslice patterns: While not the biggest change, the addition of the .. (rest) pattern has been a long awaited quality of life feature that greatly improves the expressivity of pattern matching with slices.
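
A small sketch of the rest pattern in action:

fn describe(values: &[i32]) -> String {
    match values {
        [] => String::from("empty"),
        [only] => format!("just {}", only),
        // `..` matches any number of remaining elements.
        [first, .., last] => format!("starts with {}, ends with {}", first, last),
    }
}

fn main() {
    println!("{}", describe(&[]));
    println!("{}", describe(&[7]));
    println!("{}", describe(&[1, 2, 3, 4]));
}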

Error Diagnostics

One thing that we haven't mentioned much is how much Rust's error messages and diagnostics have improved since 1.0. Looking at older error messages now feels like looking at a different language.

We’ve highlighted a couple of examples that best showcase just how much we’ve improved at showing users where they made mistakes and, importantly, helping them understand why the code doesn’t work and teaching them how they can fix it.

First Example (Traits)
use std::io::Write;

fn trait_obj(w: &Write) {
    generic(w);
}

fn generic<W: Write>(_w: &W) {}
1.2.0 Error Message
   Compiling error-messages v0.1.0 (file:///Users/usr/src/rust/error-messages)
src/lib.rs:6:5: 6:12 error: the trait `core::marker::Sized` is not implemented for the type `std::io::Write` [E0277]
src/lib.rs:6     generic(w);
                 ^~~~~~~
src/lib.rs:6:5: 6:12 note: `std::io::Write` does not have a constant size known at compile-time
src/lib.rs:6     generic(w);
                 ^~~~~~~
error: aborting due to previous error
Could not compile `error-messages`.

To learn more, run the command again with --verbose.

A terminal screenshot of the 1.2.0 error message.

1.43.0 Error Message
   Compiling error-messages v0.1.0 (/Users/ep/src/rust/error-messages)
error[E0277]: the size for values of type `dyn std::io::Write` cannot be known at compilation time
 --> src/lib.rs:6:13
  |
6 |     generic(w);
  |             ^ doesn't have a size known at compile-time
...
9 | fn generic<W: Write>(_w: &W) {}
  |    ------- -       - help: consider relaxing the implicit `Sized` restriction: `+  ?Sized`
  |            |
  |            required by this bound in `generic`
  |
  = help: the trait `std::marker::Sized` is not implemented for `dyn std::io::Write`
  = note: to learn more, visit <https://doc.rust-lang.org/book/ch19-04-advanced-types.html#dynamically-sized-types-and-the-sized-trait>

error: aborting due to previous error

For more information about this error, try `rustc --explain E0277`.
error: could not compile `error-messages`.

To learn more, run the command again with --verbose.

A terminal screenshot of the 1.43.0 error message.

Second Example (help)
fn main() {
    let s = "".to_owned();
    println!("{:?}", s.find("".to_owned()));
}
1.2.0 Error Message
   Compiling error-messages v0.1.0 (file:///Users/ep/src/rust/error-messages)
src/lib.rs:3:24: 3:43 error: the trait `core::ops::FnMut<(char,)>` is not implemented for the type `collections::string::String` [E0277]
src/lib.rs:3     println!("{:?}", s.find("".to_owned()));
                                    ^~~~~~~~~~~~~~~~~~~
note: in expansion of format_args!
<std macros>:2:25: 2:56 note: expansion site
<std macros>:1:1: 2:62 note: in expansion of print!
<std macros>:3:1: 3:54 note: expansion site
<std macros>:1:1: 3:58 note: in expansion of println!
src/lib.rs:3:5: 3:45 note: expansion site
src/lib.rs:3:24: 3:43 error: the trait `core::ops::FnOnce<(char,)>` is not implemented for the type `collections::string::String` [E0277]
src/lib.rs:3     println!("{:?}", s.find("".to_owned()));
                                    ^~~~~~~~~~~~~~~~~~~
note: in expansion of format_args!
<std macros>:2:25: 2:56 note: expansion site
<std macros>:1:1: 2:62 note: in expansion of print!
<std macros>:3:1: 3:54 note: expansion site
<std macros>:1:1: 3:58 note: in expansion of println!
src/lib.rs:3:5: 3:45 note: expansion site
error: aborting due to 2 previous errors
Could not compile `error-messages`.

To learn more, run the command again with --verbose.

A terminal screenshot of the 1.2.0 error message.

1.43.0 Error Message
   Compiling error-messages v0.1.0 (/Users/ep/src/rust/error-messages)
error[E0277]: expected a `std::ops::FnMut<(char,)>` closure, found `std::string::String`
 --> src/lib.rs:3:29
  |
3 |     println!("{:?}", s.find("".to_owned()));
  |                             ^^^^^^^^^^^^^
  |                             |
  |                             expected an implementor of trait `std::str::pattern::Pattern<'_>`
  |                             help: consider borrowing here: `&"".to_owned()`
  |
  = note: the trait bound `std::string::String: std::str::pattern::Pattern<'_>` is not satisfied
  = note: required because of the requirements on the impl of `std::str::pattern::Pattern<'_>` for `std::string::String`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0277`.
error: could not compile `error-messages`.

To learn more, run the command again with --verbose.

A terminal screenshot of the 1.43.0 error message.

Third Example (Borrow checker)
fn main() {
    let mut x = 7;
    let y = &mut x;

    println!("{} {}", x, y);
}
1.2.0 Error Message
   Compiling error-messages v0.1.0 (file:///Users/ep/src/rust/error-messages)
src/lib.rs:5:23: 5:24 error: cannot borrow `x` as immutable because it is also borrowed as mutable
src/lib.rs:5     println!("{} {}", x, y);
                                   ^
note: in expansion of format_args!
<std macros>:2:25: 2:56 note: expansion site
<std macros>:1:1: 2:62 note: in expansion of print!
<std macros>:3:1: 3:54 note: expansion site
<std macros>:1:1: 3:58 note: in expansion of println!
src/lib.rs:5:5: 5:29 note: expansion site
src/lib.rs:3:18: 3:19 note: previous borrow of `x` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `x` until the borrow ends
src/lib.rs:3     let y = &mut x;
                              ^
src/lib.rs:6:2: 6:2 note: previous borrow ends here
src/lib.rs:1 fn main() {
src/lib.rs:2     let mut x = 7;
src/lib.rs:3     let y = &mut x;
src/lib.rs:4
src/lib.rs:5     println!("{} {}", x, y);
src/lib.rs:6 }
             ^
error: aborting due to previous error
Could not compile `error-messages`.

To learn more, run the command again with --verbose.

A terminal screenshot of the 1.2.0 error message.

1.43.0 Error Message
   Compiling error-messages v0.1.0 (/Users/ep/src/rust/error-messages)
error[E0502]: cannot borrow `x` as immutable because it is also borrowed as mutable
 --> src/lib.rs:5:23
  |
3 |     let y = &mut x;
  |             ------ mutable borrow occurs here
4 |
5 |     println!("{} {}", x, y);
  |                       ^  - mutable borrow later used here
  |                       |
  |                       immutable borrow occurs here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0502`.
error: could not compile `error-messages`.

To learn more, run the command again with --verbose.

A terminal screenshot of the 1.43.0 error message.

Quotes from the teams

Of course we can't cover every change that has happened. So we reached out and asked some of our teams what changes they are most proud of:

For rustdoc, the big things were:

  • The automatically generated documentation for blanket implementations
  • The search itself and its optimizations (last one being to convert it into JSON)
  • The possibility to test doc code blocks more accurately ("compile_fail", "should_panic", "allow_fail")
  • Doc tests are now generated as their own separate binaries.

— Guillaume Gomez (rustdoc)

Rust now has baseline IDE support! Between IntelliJ Rust, RLS and rust-analyzer, I feel that most users should be able to find a "not horrible" experience for their editor of choice. Five years ago, "writing Rust" meant using an old school Vim/Emacs setup.

— Aleksey Kladov (IDEs and editors)

For me that would be: adding first class support for popular embedded architectures and achieving a thriving ecosystem to make microcontroller development with Rust an easy and safe, yet fun experience.

— Daniel Egger (Embedded WG)

The release team has only been around since (roughly) early 2018, but even in that time, we've landed ~40000 commits just in rust-lang/rust without any significant regressions in stable.

Considering how quickly we're improving the compiler and standard libraries, I think that's really impressive (though of course the release team is not the sole contributor here). Overall, I've found that the release team has done an excellent job of managing to scale to the increasing traffic on issue trackers, PRs being filed, etc.

— Mark Rousskov (Release)

Within the last 3 years we managed to turn Miri from an experimental interpreter into a practical tool for exploring language design and finding bugs in real code—a great combination of PL theory and practice. On the theoretical side we have Stacked Borrows, the most concrete proposal for a Rust aliasing model so far. On the practical side, while initially only a few key libraries were checked in Miri by us, recently we saw a great uptake of people using Miri to find and fix bugs in their own crates and dependencies, and a similar uptake in contributors improving Miri e.g. by adding support for file system access, unwinding, and concurrency.

— Ralf Jung (Miri)

If I had to pick one thing I'm most proud of, it was the work on non-lexical lifetimes (NLL). It's not only because I think it made a big difference in the usability of Rust, but also because of the way that we implemented it by forming the NLL working group. This working group brought in a lot of great contributors, many of whom are still working on the compiler today. Open source at its best!

— Niko Matsakis (Language)

The Community

As the language has changed and grown a lot in these past five years, so has its community. There have been so many great projects written in Rust, and Rust's presence in production has grown exponentially. We wanted to share some statistics on just how much Rust has grown.

  • Rust has been voted "Most Loved Programming Language" every year in the past four Stack Overflow developer surveys since it went 1.0.
  • We have served over 2.25 Petabytes (1PB = 1,000 TB) of different versions of the compiler, tooling, and documentation this year alone!
  • In the same time we have served over 170TB of crates to roughly 1.8 billion requests on crates.io, doubling the monthly traffic compared to last year.

When Rust turned 1.0 you could count the number of companies that were using it in production on one hand. Today, it is being used by hundreds of tech companies with some of the largest tech companies such as Apple, Amazon, Dropbox, Facebook, Google, and Microsoft choosing to use Rust for its performance, reliability, and productivity in their projects.

Conclusion

Obviously we couldn't cover every change or improvement to Rust that's happened since 2015. What have been your favourite changes or new favourite Rust projects? Feel free to post your answer and discussion on our Discourse forum.

Lastly, we wanted to thank everyone who has contributed to Rust; whether you contributed a new feature or fixed a typo, your work has made Rust the amazing project it is today. We can't wait to see how Rust and its community will continue to grow and change, and to see what you all will build with Rust in the coming decade!

Putting it all together

sourmash 3.3 was released last week, and it is the first version supporting zipped databases. Here is my personal account of how that came to be =]

What is a sourmash database?

A sourmash database contains signatures (typically Scaled MinHash sketches built from genomic datasets) and an index allowing efficient similarity and containment queries over these signatures. The two types of index are SBT, a hierarchical index that uses less memory by keeping data on disk, and LCA, an inverted index that uses more memory but is potentially faster. Indices are described as JSON files, with LCA storing all the data in one JSON file, and SBT saving a description of the index structure in JSON and all the data in a hidden directory with many files.

We distribute some prepared databases (with SBT indices) for Genbank and RefSeq as compressed TAR files. The compressed file is ~8GB, but after decompressing it turns into almost 200k files in a hidden directory, using about 40 GB of disk space.

Can we avoid generating so many hidden files?

The initial issue in this saga is dib-lab/sourmash#490, and the idea was to take the existing support for multiple data storages (hidden dir, TAR files, IPFS and Redis) and save the index description in the storage, allowing loading everything from the storage. Since we already had the databases as TAR files, the first test tried to use them, but it didn't take long to see it was a doomed approach: TAR files are terrible for random access (or at least the tarfile module in Python is).

Zip files showed up as a better alternative, and it helps that Python has the zipfile module already available in the standard library. Initial tests were promising, and led to dib-lab/sourmash#648. The main issue was performance: compressing and decompressing was slow, but there was also another limitation...

Loading Nodegraphs from a memory buffer

Another challenge was efficiently loading the data from a storage. The two core methods in a storage are save(location, content), where content is a bytes buffer, and load(location), which returns a bytes buffer that was previously saved. This didn't interact well with the khmer Nodegraphs (the Bloom Filter we use for SBTs), since khmer only loads data from files, not from memory buffers. We ended up doing a temporary file dance, which made things slower for the default storage (hidden dir), where it could have been optimized to work directly with files, and involved interacting with the filesystem for the other storages (IPFS and Redis could be pulling data directly from the network, for example).

This one could be fixed in khmer by exposing C++ stream methods, and I did a small PoC to test the idea. While doable, this was happening while the sourmash conversion to Rust was underway, and depending on khmer was a problem for my Webassembly aspirations... so having the Nodegraph implemented in Rust seemed like a better direction. A Rust implementation had actually been quietly living in the sourmash codebase for quite some time, but it was never exposed to Python (and it was also lacking more extensive tests).

After the release of sourmash 3 and the replacement of the C++ implementation with the Rust one, all the pieces for exposing the Nodegraph were in place, so dib-lab/sourmash#799 was the next step. It wasn't a priority at first because other optimizations (that were released in 3.1 and 3.2) were more important, but then it was time to check how this would perform. And...

Your Rust code is not so fast, huh?

Turns out that my Nodegraph loading code was way slower than khmer's. The Nodegraph binary format is well documented, and doing an initial implementation wasn't so hard: use the byteorder crate to read binary data with the right endianness, then set the appropriate bits in the internal fixedbitset in memory. But the khmer code doesn't parse bit by bit: it reads a long char buffer directly, and that is many orders of magnitude faster than setting bits one by one.
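
To make the comparison concrete, the slow version looked roughly like this: a hedged sketch using the byteorder and fixedbitset crates, where the length prefix, endianness, and layout are illustrative rather than the actual Nodegraph format.

use std::io::Read;

use byteorder::{LittleEndian, ReadBytesExt};
use fixedbitset::FixedBitSet;

// Illustrative only: read a length-prefixed bit table and set each bit
// individually. Setting bits one at a time like this is what turned out to
// be orders of magnitude slower than copying whole blocks at once.
fn load_bit_table<R: Read>(mut reader: R) -> std::io::Result<FixedBitSet> {
    let n_bits = reader.read_u64::<LittleEndian>()? as usize;
    let mut table = FixedBitSet::with_capacity(n_bits);

    let mut byte = [0u8; 1];
    for i in 0..n_bits {
        if i % 8 == 0 {
            reader.read_exact(&mut byte)?;
        }
        if byte[0] & (1u8 << (i % 8)) != 0 {
            table.insert(i);
        }
    }
    Ok(table)
}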

And there was no way to replicate this behavior directly with fixedbitset. At this point I could either do bit-indexing into a large buffer and lose all the useful methods that fixedbitset provides, or try to find a way to support loading the data directly into fixedbitset and open a PR.

I chose the PR (and even got #42! =]).

It was more straightforward than I expected, but it did expose the internal representation of fixedbitset, so I was a bit nervous it wasn't going to be merged. But bluss was super nice, and his suggestions made the PR way better! This simplified the final Nodegraph code, and it actually ended up more correct (because I was messing up a few corner cases when doing the bit-by-bit parsing before). Win-win!

Nodegraphs are kind of large, can we compress them?

Being able to save and load Nodegraphs in Rust allowed using memory buffers, but also opened the way to support other operations not supported in khmer Nodegraphs. One example is loading/saving compressed files, which is supported for Countgraph (another khmer data structure, based on Count-Min Sketch) but not in Nodegraph.

If only there was an easy way to support working with compressed files...

Oh wait, there is! niffler is a crate that I made with Pierre Marijon based on some functionality I saw in one of his projects, and we iterated a bit on the API and documented everything to make it more useful for a larger audience. niffler tries to be as transparent as possible, with very little boilerplate when using it but with useful features nonetheless (like auto detection of the compression format). If you want more about the motivation and how it happened, check this Twitter thread.

The cool thing is that adding compressed files support in sourmash was mostly one-line changes for loading (and a bit more for saving, but mostly because converting compression levels could use some refactoring).

Putting it all together: zipped SBT indices

With all these other pieces in place, it's time to go back to dib-lab/sourmash#648. Compressing and decompressing with the Python zipfile module is slow, but Zip files can also be used just for storage, handing back the data without extracting it. And since we have compression/decompression implemented in Rust with niffler, that's what the zipped sourmash databases are: data is loaded and saved into the Zip file without using the Python module's compression/decompression, and all that work is done before (or after) on the Rust side.

This allows keeping the Zip file with similar sizes to the original TAR files we started with, but with very low overhead for decompression. For compression we opted for using Gzip level 1, which doesn't compress perfectly but also doesn't take much longer to run:

Level   Size     Time
0       407 MB   16s
1       252 MB   21s
5       250 MB   39s
9       246 MB   1m48s

In this table, 0 is without compression, while 9 is the best compression. The size difference from 1 to 9 is only 6 MB (~2% difference) but runs 5x faster, and it's only 30% slower than saving the uncompressed data.

The last challenge was updating an existing Zip file. It's easy to support appending new data, but if any of the already existing data in the file changes (which happens when internal nodes change in the SBT, after a new dataset is inserted) then there is no easy way to replace the data in the Zip file. Worse, the Python zipfile module will add the new data while keeping the old one around, leading to ginormous files over time.¹ So, what to do?

I ended up opting for dealing with the complexity and complicating the ZipStorage implementation a bit, by keeping a buffer for new data. If it's a new file or it already exists but there are no insertions the buffer is ignored and all works as before.

If the file exists and new data is inserted, then it is first stored in the buffer (where it might also replace a previous entry with the same name). In this case we also need to check the buffer when trying to load some data (because it might exist only in the buffer, and not in the original file).

Finally, when the ZipStorage is closed it needs to verify if there are new items in the buffer. If not, it is safe just to close the original file. If there are new items but they were not present in the original file, then we can append the new data to the original file. The final case is if there are new items that were also in the original file, and in this case a new Zip file is created and all the content from buffer and original file are copied to it, prioritizing items from the buffer. The original file is replaced by the new Zip file.

Turns out this worked quite well! And so the PR was merged =]

The future

Zipped databases open the possibility of distributing extra data that might be useful for some kinds of analysis. One thing we are already considering is adding taxonomy information; let's see what else shows up.

Having Nodegraph in Rust is also pretty exciting, because now we can change the internal representation for something that uses less memory (maybe using RRR encoding?), but more importantly: now they can also be used with Webassembly, which opens many possibilities for running not only signature computation but also search and gather in the browser, since now we have all the pieces to build it.

Comments?


Footnotes

  1. The zipfile module does throw a UserWarning pointing out that duplicated files were inserted, which is useful during development but generally doesn't show up during regular usage...

The Culture War in Open Source is On

Perhaps you remember Eric S. Raymond from his memorable essay “The Cathedral and the Bazaar,” or his advice to programmers on “not reacting like a loser,” or his enthusiasm for guns. Regardless, the legendary open-source coder and commentator recently made a brief comeback to the Open Source Initiative, an organization he co-founded in 1998. On February 24, he announced on an OSI email list that “a wild co-founder appears” after being absent for many years. By February 28, after denouncing the “political ratfucking” and “vulgar Marxism” he perceived to be infecting the organization, he was removed from the list for violating the code of conduct—the existence of which he had also railed against. The previous month, also, Raymond’s OSI co-founder Bruce Perens withdrew his membership over an apparently unrelated dispute about a software license.

These happenings are not unrelated, really. They are part of what has been variously called “the culture war at the heart of open source” and “the great open source shake-up.” The disputes concern the decades-old legal technology at the heart of the movement known variously as “free software” or “open source”: software licenses, particularly ones that turn proprietary code into a common good. The conflicts at hand raise questions about ethics and economics that the open source establishment believes have long been answered and can be put away for good. Oh, and as programmer Steve Klabnik notes in passing, without further explanation, “I personally think that gender plays a huge role.”

Richard Stallman, originator of the first free-and-open license, resigned from the Free Software Foundation last year over comments sympathetic to the late billionaire and molester Jeffrey Epstein. The reputation of Creative Commons founder Lawrence Lessig also took a hit for defending one of Epstein’s enablers. Maybe all this set a mood, heightened the intensity. It is probably not an accident that, in a world where ~95% of GitHub contributors identify as male, several of the major partisans in this culture war do not.

The bug that has been lurking in the open source codebase all along, for the partisans, might be best summarized as neutrality. The OSI’s Open Source Definition prohibits value judgments about such things as “fields of endeavor,” business models, and technology stacks. The Free Software Foundation puts the idea this way: “The freedom to run the program as you wish, for any purpose.” As long as the code remains free and open, users—whether individual or corporate—should not be constrained by a license in what they do or how they make money. Any such constraint, the argument goes, is a slippery slope. During his brief return to OSI’s email lists, Raymond quoted Thomas Paine: “He that would make his own liberty secure, must guard even his enemy from oppression; for if he violates this duty, he establishes a precedent that will reach to himself.” Restrict others, and it will come back to haunt you.

A gear in the shape of a flower.

Photo CC-BY Mark Fischer.

The challenges to this noble neutrality come in two forms: ethics and economics.

On the first front, there is the Ethical Source Movement, the doing of Ruby developer and Model View Culture contributor Coraline Ada Ehmke, against whom Raymond’s worst recent vitriol has been directed. Ethical Source builds on the waves of tech workers rebelling against their bosses for taking contracts with the likes of the Pentagon and ICE. Ethical Source offers its own counter-definition to OSI’s, outlining a type of license that allows developers to restrict certain uses of their software. The handful of compliant licenses prohibit violations of the Universal Declaration of Human Rights, for instance, or International Labour Organization standards. The promise is some peace of mind for programmers and perhaps a more ethical world; the danger is a mess of unenforceable licenses whose prohibitions become impossible to keep track of.

The second challenge is economic. This has long been a sore spot in open source, which has largely outsourced its economics to companies that might contribute employee time or grants to what projects are building. Those companies also get a lot of the financial rewards, leaving many contributors to offer their unpaid free time—an unequally available resource. New lawyerly efforts, such as Heather Meeker’s PolyForm Project and Kyle E. Mitchell’s License Zero, offer menus of licenses designed to enable certain popular business models. The code is still “source available”—anyone can read it, and use it under certain terms—but there are restrictions to prevent some company from taking it and profiting from others’ work. One PolyForm license permits only small businesses to use the software for free; big corporations have to pay. Again, though, critics fear a free-for-all of licenses that end up piling on restrictions and constraining the commons.

Let me get back to that question of gender. From the perspective of the “meritocracy” ideal familiar in tech culture—and much maligned in these pages—the neutrality principle makes sense. The spree of new license-making is confounding. Forget money, forget ideology, we’ll judge you by your code! But through the lens of feminist thought, the challenges should come as no surprise.

For instance, feminists have been really good at noticing the undervalued, under-noticed labor propping up everything else—because often it is women doing that stuff. Why not make sure that people creating open code are also the ones receiving financial benefits? Especially since women tend to have far less free time on their hands, getting paid matters. Meanwhile, feminists have stressed that there is really no value-neutral, objective, meritocratic vantage point. All knowledge is situated. There is no excuse for dispensing with ethical discussion and ethical standards. Why not find reasonable ways for groups to articulate and enforce the standards they hold?

The more you look at open source through a feminist lens—as I’ve done at greater length elsewhere—the more those two-decade-old certainties start to crack. We need to see those cracks to build better tools. 

People often talk about open source as a commons, but if you go back and read Elinor Ostrom’s studies on pre-digital practices of commoning around the world, you’ll find that there are some important things missing. (She didn’t identify her work with feminism, but she is still the first woman to have won the Nobel in economics.) Ostrom recognized boundary-making as an essential practice for commoners—defining who is a member and what purpose the commons serves, whether economic, social, or otherwise. She also found that a healthy commons depends on participants having a voice in changing its rules and norms, which is a far cry from the dictatorships-for-life that tend to rule open source. If open source listened better to the legacy of commoning before it, projects wouldn’t just choose a license, they would start by adopting a governance structure and finding a place to manage funds. They would plan for economic and ethical self-determination—autonomy was another principle Ostrom emphasized—rather than outsourcing those matters to corporate whims.

Anyway, it is only natural that open source contributors should want to hack the process, not just the software they write. For all that open source has achieved, there is so much more that it hasn’t. “The year of the Linux desktop” has turned from an expectation to a joke, unless you count Google’s Android and Chromebooks. The open Web has closed shut. Surveillance-powered corporate platforms have accumulated power because of, not despite, open source. And the staggering homogeneity that still persists among open source developers testifies to the talent and life experience that their projects are missing.

If the early architects of open source feel threatened by a new generation demanding more, that is probably a sign of progress.

Software and workflow development practices (April 2020 update)

Over the last 10-15 years, I've blogged periodically about how my lab develops research software and builds scientific workflows. The last update talked a bit about how we've transitioned to snakemake and conda for automation, but I was spurred by an e-mail conversation into another update - because, y'all, it's going pretty well and I'm pretty happy!

Below, I talk through our current practice of building workflows and software. These procedures work pretty well for our (fairly small) lab of people who mostly work part-time on workflow and software development. By far the majority of our effort is usually spent trying to understand the results of our workflows; except in rare cases, I try to guide people to spend at most 20% of their time writing new analysis code - preferably less.

Nothing about these processes ensures that the scientific output is correct or useful, of course. While scientific correctness of computational workflows necessarily depends (often critically) on the correctness of the code underlying those workflows, the code could ultimately be doing the wrong thing scientifically. That having been said, I've found that the processes below let us focus much more cleanly on the scientific value of the code because we don't worry as much about whether the code is correct, and moreover our processes support rapid iteration of software and workflows as we iteratively develop our use cases.

As one side note, I should say that the complexity of the scientific process is one thing that distinguishes research computing from other software engineering projects. Often we don't actually have a good idea of what we're trying to achieve, at least not at any level of specificity. This is a recipe for disaster in a software engineering project, but it's our day-to-day life in science! What ...fun? (I mean, it kind of is. But it's also hellishly complicated.)

Workflows and scripts

Pretty much every scientific computing project I've worked on in the last (counts on fingers and toes... runs out of toes... 27 years!? eek) has grown into a gigantic mess of scripts and data files. Over the (many) years I've progressively worked on taming these messes using a variety of techniques.

Phillip Brooks, Charles Reid, Tessa Pierce, and Taylor Reiter have been the source of a lot of the workflow approaches I discussed below, although everyone in the lab has been involved in the discussions!

Store code and configuration in version control

Since I "grew up" simultaneously in science and open source, I started using version control early on - first RCS, then CVS, then darcs, then Subversion, and finally git. Version control is second nature, and it applies to science too!

The first basic rule of scientific projects is, put it in git.

This means that I can (almost) always figure out what I was doing a month ago when I got that neat result that I haven't been able to replicate again. More importantly I can see exactly what I changed in the last hour, and either fix it or revert to what last worked.

Over almost 30 years of sciencing, project naming becomes a problem! Especially since I tend to start projects small and grow them (or let them die on the vine if my focus shifts). So my repo names usually start with the year, followed by a few keywords -- e.g. 2020-long-read-assembly-decontam. While I can't predict which code I'll go back to, I always end up going back to some of it!

Write scripts using a language that encourages modularity and code sharing

I've developed scientific workflows in C, bash, Perl, Tcl, Java, and Python. By far my favorite language of these is Python. The main reason I switched wholeheartedly to Python is that, more than any of the others, Python had a nice blend of modularity and reusability. I could quickly pick up a blob of useful code from one script and put it in a shared module for other scripts to use. And it even had its own simple namespace scheme, which encouraged modularity by default!

At the time (late '90s, early '00s) this kind of namespacing was something that wasn't as well supported by other interpreted languages like Perl (v4?) and Tcl. While I was already a knowledgeable programmer, the ease of code reuse combined with such simple modularity encouraged systematic code reuse in my scripts in a new way. When combined with the straightforward C extension module API, Python was a huge win.

Nowadays there are many good options, of course, but Python is still one of them, so I haven't had to change! My lab now uses an increasing amount of R, of course, because of its dominance in stats and viz. And we're starting to use Rust instead of C/C++ for extension modules.

Automate scientific workflows

Every project ends up with a mess of scripts.

When you have a pile of scripts, it's usually not clear how to run them in order. When you're actively developing the scripts, it becomes confusing to remember whether your output files have been updated by the latest code. Enter workflows!

I've been using make to run workflows for ages, but about 2 years ago the entire lab switched over to snakemake. This is in part because it's well integrated with Python, and in part because it supports conda environments. It's been lovely! And we now have a body of shared snakemake expertise in the lab that is hard to beat.

snakemake also works really well for combining my own scripts with other programs, which is of course something that we do a lot in bioinformatics.

There are a few problems with snakemake, of course. It doesn't readily scale to 100s of thousands of jobs, and we're still working out the best way to orchestrate complex workflows on a cluster. But it's proven relatively straightforward to teach, and it's nicely designed, with an awful lot of useful features. I've heard good things about nextflow, and if I were going to operate at larger scales, I'd be looking at CWL or WDL.

New: Work in isolated execution environments

One problem that we increasingly encounter is the need to run different incompatible versions of software within the same workflow. Usually this manifests in underlying dependencies -- this package needs Python 2 while this other package requires Python 3.

Previously, tackling this required ugly heavyweight hacks such as VMs or docker containers. I personally spent a few years negotiating with Python virtualenvs, but they only solved some of the problems, and only then in Python-land.

Now, we are 100% conda, all the time. In snakemake, we can provide environment config files for running the basic pipeline, with rule/step-specific environment files that rely on pinned (specific) versions of software.

Briefly, with --use-conda on the command line and conda: directives in the Snakefile, snakemake manages creating and updating these environments for you, and activates/deactivates them on a per-rule basis. It's beautiful and Just Works.

New: Provide quickstart demonstration data sets.

(This is a brand new approach to my daily practice, supported by the easy configurability of snakemake!)

The problem is this: often I want to develop and rapidly execute workflows on small test data sets, while also periodically running them on bigger "real" data sets to see what the results look like. It turns out this is hard to stage-manage! Enter ...snakemake config files! These are YAML or JSON files that are automatically loaded into your Snakefile name space.

Digression: A year or three ago, I got excited about using workflows as applications. This was a trend that Camille Scott, a PhD student in the lab, had started with dammit, and we've been using it for spacegraphcats and elvers.

The basic idea is this: Increasingly, bioinformatics "applications" are workflows that involve running other software packages. Writing your own scripts that stage-manage other software execution is problematic, since you have to reinvent a lot of error handling that workflow engines already have. This is also true of issues like parallelization and versioning.

So why not write your applications as wrappers around a workflow engine? It turns out with both pydoit and snakemake, you can do this pretty easily! So that's an avenue we've been exploring in a few projects.

Back to the problem to be solved: What I want for workflows is the following:

  1. A workflow that is approximately the same, independent of the input data.
  2. Different sets of input data, ready to go.
  3. In particular, a demo data set (a real data set cut down in size, or synthetic data) that exercises most or all of the features of the workflow.
  4. The ability to switch between input data sets quickly and easily without changing any source code.
  5. In a perfect world, I would have the ability to develop and run the same workflow code on both my laptop and in an HPC queuing system.

This set of functionality is something that snakemake easily supports with its --configfile option - you specify a default config file in your Snakefile, and then override that with other config files when you want to run for realz. Moreover, with the rule-specific conda environment files (see previous section!), I don't even need to worry about installing the software; snakemake manages it all for me!

With this approach, my workflow development process becomes very fluid. I prototype scripts on my laptop, where I have a full dev environment, and I develop synthetic data sets to exercise various features of the scripts. I bake this demo data set into my default snakemake config so that it's what's run by default. For real analyses, I then override this by specifying a different config file on the command line with --configfile. And this all interacts perfectly well with snakemake's cluster execution approach.

As a bonus, the demo data set provides a simple quickstart and example config file for people who want to use your software. This makes the installation and quickstart docs really simple and nearly identical across multiple projects!

(Note that I develop on Mac OS X and execute at scale on Linux HPCs. I'd probably be less happy with this approach if I developed on Windows, for which bioconda doesn't provide packages.)

Libraries and applications

On the opposite end of the spectrum from "piles of scripts" is research software engineering, where we are trying explicitly to build maintainable and reusable libraries and command-line applications. Here we take a very different approach from the workflow style detailed above, although in recent years I've noticed that we're working across this full spectrum on several projects. (This is perhaps because workflows, done like we are doing them above, start to resemble serious software engineering :).

Whenever we find a core set of functionality that is being used across multiple projects in the lab, we start to abstract that functionality into a library and/or command line application. We do this in part because most scripts have bugs that should be fixed, and we remain ignorant of them until we start reusing the scripts; but it also aids in efficiency and code reuse. It's a nice use-case driven way to develop software!

We've developed several software packages this way. For example, the khmer and screed libraries emerged from piles of code that slowly got unified into a shared library.

More recently, the sourmash project has become the in-lab exemplar of intentional software development practices. We now have 3-5 people working regularly on sourmash, and it's being used by an increasingly wide range of people. Below are some of the key techniques we've been using, which will (in most cases) be readily recognized as matching basic open source development practices!

I want to give an especially big shoutout here to Michael Crusoe, Camille Scott, and Luiz Irber, who have been the three key people leading our adoption of these techniques.

Automate tests

Keeping software working is hard. Automated tests are one of the solutions.

We have an increasingly expansive set of automated tests for sourmash - over 600 at the moment. It takes about a minute to run the whole test suite on my laptop. If it looks intimidating, that's because we've grown it over the years. We started with one test, and went from there.

We don't really use test-driven development extensively, or at least I don't. I know Camille has used it in her De Bruijn graph work. I tend to reserve it for situations where the code is becoming complicated enough at a class or function level that I can't understand it -- and that's rarely necessary in my work. (Usually it means that I need to take a step back and rethink what I'm doing! I'm a big believer in Kernighan's Lever - if you're writing code at the limit of your ability to understand it, you'll never be able to debug it!)

Use code review

Maintainability, sustainability, and correctness of code are all enhanced by having multiple people's eyes on it.

We basically use GitHub Flow, as I understand it. Every PR runs all the tests on each commit, and we have a checklist to help guide contributors.

We have a two-person sign-off rule on every PR. This can slow down code development when some of us are busy, but on the flip side no one person is solely responsible when bad code makes it into a release :).

Most importantly, it means that our code quality is consistently better than what I would produce working on my own.

Use semantic versioning

Semantic versioning means that when we release a new version, outside observers can quickly know if they can upgrade without a problem. For example, within the sourmash 3.x series, the only reason for the same command line options to produce different output is if there was a bug.

We are still figuring out some of the details, of course. For example, we have only recently started tracking performance regressions. And it's unclear exactly what parts of our API should be considered public. Since sourmash isn't that widely used, I'm not pushing hard on resolving these kinds of high level issues, but they are a regular background refrain in my mind.

In any case, what semantic versioning does is provide a simple way for people to know if it's safe to upgrade. It also lets us pin down versions in our own workflows, with some assurance that the behavior shouldn't be changing (but performance might improve!) if we pin to a major version.

Nail down behavior with tests, then refactor underneath

I write a lot of hacky code when I'm exploring research functionality. Often this code gets baked into our packages with a limited understanding of its edge cases. As I explore and expand the use cases more and more, I find more of these edge cases. And, if the code is in a library, I nail down the edge cases with stupidity-driven testing. This then lets me (or others) refactor the code to be less hacky and more robust, without changing its functionality.

For example, I'm currently going through a long, slow refactor of some formerly ugly sourmash code that creates a certain kind of indexed database. This code worked reasonably well for years, but as we developed more uses for it, it became clear that there were, ahem, opportunities for refactoring it to be more usable in other contexts.

We don't start with good code. We don't pretend that our code is good (or at least I wouldn't, and can't :). But we iteratively improve upon our code as we work with it.

Explore useful behavior, then nail it down with tests, and only then optimize the heck out of it

The previous section is how we clean up code, but it turns out it also works really well for speeding up code.

There is this really frustrating bias amongst software developers towards premature optimization, which leads to ugly and unmaintainable code. In my experience, flexibility trumps optimization 80% or more of the time, so I take this to the other extreme and rarely worry about optimizing code. Luckily some people in my lab counterbalance me in this preference, so we occasionally produce performant code as well :).

What we do is get to the point where we have pretty well-specified functionality, then benchmark, and then refactor and optimize based on the benchmarking.

A really clear example of this applied to sourmash was here, when Luiz and Taylor noticed that I'd written really bad code that was recreating large sets again and again in Python. Luiz added a simple "remove_many" method that did the same operation in place and we got a really substantial (order of magnitude?) speed increase.

Critically, this optimization was to a new research algorithm that we developed over the period of years. First we got the research algorithm to work. Then we spent a lot of time understanding how and why and where it was useful. During this period we wrote a whole bunch of tests that nailed down the behavior. And then when Luiz optimized the code, we just dropped in a faster replacement that passed all the tests.

This has become a bit of a trend in recent years. As sourmash has moved from C to C++ to Rust, Luiz has systematically improved the runtimes for various operations. But this has always occurred in the context of well-understood features with lots of tests. Otherwise we just end up breaking our software when we optimize it.

As a side note, whenever I hear someone emphasize the speed of their just-released scientific software, my strong Bayesian prior is that they are really telling me their code is not only full of bugs (all software is!) but that it'll be really hard to find and fix them...

Collaborate by insisting the tests pass

Working on multiple independent feature sets at the same time is hard, whether it's only one person or five. Tests can help here, too!

One of the cooler things to happen in sourmash land in the last two years is that Olga Botvinnik and some of her colleagues at CZBioHub started contributing substantially to sourmash. This started with Olga's interest in using sourmash for single-cell RNAseq analysis, which presents new and difficult scalability challenges.

Recently, the CZBioHub folk submitted a pull request to significantly change one of our core data structures so as to scale it better. (It's going to be merged soon!) Almost all of our review comments have focused on reviewing the code for understandability, rather than questioning the correctness - this is because the interface for this data structure is pretty well tested at a functional level. Since the tests pass, I'm not worried that the code is wrong.

What this overall approach lets us do is simultaneously work on multiple parts of the sourmash code base with some basic assurances that it will still work after all the merges are done.

Distribute via (bio)conda, install via environments

Installation for end users is hard. I've spent many, many years writing installation tutorials. Conda just solves this, and is our go-to approach now for supporting user installs.

Conda software installation is awesome and awesomely simple. Even when software isn't yet packaged for conda install (like spacegraphcats, which is research-y enough that I haven't bothered) you can still install it that way -- see the pip commands, here.

Put everything in issues

You can find most design decisions, feature requests, and long-term musings for sourmash in our issue tracker. This is where we discuss almost everything, and it's our primary help forum as well. Having a one-stop shop that ties together design, bugs, code reviews, and documentation updates is really nice. We even try to archive slack conversations there!

Concluding thoughts

Academic workflow and software development is a tricky business. We operate in slow moving and severely resource-constrained environments, with a constant influx of people who have a variety of experience, to solve problems that are often poorly understood in the beginning (and maybe at the end). The practices above have been developed for a small lab and are battle-tested over a decade and more.

While your mileage may vary in terms of tools and approaches, I've seen convergence across the social-media enabled biological data science community to similar practices. This suggests these practices solve real problems that are being experienced by multiple labs. Moreover, we're developing a solid community of practice in not only using these approaches but also teaching them to new trainees. Huzzah!

--titus

(Special thanks go to the USDA, the NIH, and the Moore Foundation for funding so much of our software development!)

luizirber
This is how we science =)
Davis, CA

RIP John Conway

1937-2020
kyleniemeyer
Ugh. Made me tear up, when I realized what it was showing.
Corvallis, OR

laza
So beautiful!
Belgrade, Serbia

rraszews
Conway dying of the disease that we fight via social distancing feels like an O Henry twist ending.
Columbia, MD

istoner
Wow. Poignant.
Saint Paul, MN, USA

Let's All Wear A Mask

Let's talk about masks!

On Friday, the Centers for Disease Control recommended that every American wear a face covering when in public. Masks will be the hot, bold look for summer.

The medical evidence for the practice is overwhelming. The post-SARS countries in East Asia have known this for a long time, and America and Europe are finally coming around. I've put a bunch of resources about the medical benefits of mask wearing in a further reading section at the bottom of this post.

But in this essay, I want to persuade you not just to wear a mask, but to go beyond the new CDC guidelines and help make mask wearing a social norm. That means always wearing a mask when you go out in public, and becoming a pest and nuisance to the people in your life until they do the same.

Mask Types

Before I go further, it's really important to distinguish between two kinds of masks: N95 respirator masks and procedure (surgical) masks.

When I talk about mask use in this essay, I'll be referring exclusively to the second kind of mask, or its cloth equivalent.

If you have any N95 masks, you need to donate them to a hospital. These masks are lifesaving protective equipment for doctors and medical staff. They are in incredibly short supply. Wearing them in daily life is like wearing a fireman's coat instead of suntan lotion—it doesn't do much for you, and wastes an invaluable resource that could save the life of a first responder.

Depending on where you live, mass-produced procedure masks might also be in short supply. If your local hospital needs them, and you have some on hand, donate them too! You can easily make all the cloth or paper masks you need for your own use at home, using resources I'll link at the end of this essay.

Why we wear masks

In America, we still tend to think of face masks as a defensive shield to ward off illness. This is one reason there has been a run on respirators and hospital-grade masks. People treat it like a space helmet and want the strongest possible protection, doctors be damned.

But that is thinking about it backwards. The point of wearing a mask in public is not to protect yourself, but to protect other people from you. We know that many people who fall ill won't show symptoms during the time when they are most infectious. Some people may even remain asymptomatic through the whole course of the disease, never knowing they had it.

The safest thing to do is assume you're sick all the time, and wear the mask.

The job of protecting you, meanwhile, falls to everyone else! That's why it's so important that we adopt mask wearing as a social norm. When enough sick people wear masks, even of the most rudimentary kind, it becomes difficult for a disease like coronavirus to spread in the population.

In countries like Taiwan and Japan, even before this pandemic started, it was common for people to put on a mask at the first sign of a cold, or to wear one at all times in the winter months, particularly on public transportation. It was a small courtesy to fellow passengers.

If you've never seen it before, a subway car full of people wearing surgical masks can be an arresting sight. In America, we still tend to associate face masks with hospitals and illness. But it only takes a short time for the practice to start feeling normal, and that's where we want to get to in the next couple of weeks across America and Europe.

By the end of the month, wearing a face mask should be like wearing a shirt—a routine social behavior that is expected of everyone and gets you weird looks if you don't do it.

So that's the main reason we need to wear masks—to protect others in case we are sick! But there are some other good reasons that get less attention. Hipster reasons! I've divided them here into two groups: the ways a mask helps you individually, and the ways it helps us collectively.

Here are the ways wearing a mask will help you as an individual:

  1. A mask is a barrier that keeps you from touching your nose and mouth. By now you've probably noticed how irresistibly drawn your hands are to your face, far more than you would have guessed possible before paying attention to it. Masks make it harder to indulge that habit, as well as other unconscious habits like nose-picking, nail-biting, chewing on pens, or licking your finger when you count money.
  2. Wearing a mask is a mental reminder that things are not normal. Just like many religions ask believers to wear a special garment to keep them mindful of their duty to God, having a mask on your face can help you remember that you are in a situation that calls for special behavior.

  3. Masks are somewhat uncomfortable, a helpful feature when we're trying to limit time spent in public places. Wearing one out in the world gives you an incentive to get your business done quickly so you can go home, scrub your hands, and paw at your naked face in voluptuous luxury.
  4. Masks can help you remember to wash your hands. If you form an association between handwashing and touching your mask, it becomes harder to forget to wash your hands when you come home and take your mask off.

Here are some ways that wearing masks helps other people:

  1. To repeat the most important point, masks reduce the quantity of virus-laden aerosols that come out of our mouths. How much depends on what the mask is made of, how it's fitted, whether it's wet, and so on. But the evidence is compelling (see below) that there is benefit to almost any kind of facial covering, even a scarf, compared to leaving your face bare. And when you cough or sneeze or talk, the mask reduces your epidemiological blast radius.
  2. A mask is a visible public signal to strangers that you are trying to protect their health. No other intervention does this. It would be great if we had a soap that turned our hands gold for an hour, so everyone could admire our superb hand-washing technique. But all of the behaviors that benefit public health are invisible, with the exception of mask wearing.

    If I see you with a mask on, it shows me you care about my health, and vice versa. This dramatically changes what it feels like to be in a public space. Other people no longer feel like an anonymous threat; they are now your teammates in a common struggle.

  3. Universal mask use gives cover to sick people who, for whatever reason, need to be out in the world. If we only ask people to wear a mask when they have symptoms, they might as well put on a flashing neon sign that says INFECTED. Obviously, we want sick people to stay at home, but if they have to go out, they need to be able to wear a mask without stigma.

    If you're a nerd, you may recognize this as the same rationale that we give for mandating end-to-end encryption. Everyone needs to follow the safe behavior if we don't want the people who need its protection most to stand out.

  4. Mask wearing prevents harassment of people from other cultures who choose to wear a mask in public. Adopting a culture of mask wearing may not stop racist assholes, but it will at least make their targets less prominent.

  5. Most importantly, a culture of mask wearing protects doctors, nurses, medical staff, and retail employees who are right now being punished for wearing masks at work, on the grounds that it alarms patients and clients. The same goes for clerks, delivery people, and anyone else who has to do a public-facing job. We want those people to be able to cover up, and for everyone they interact with to be covered up as well.

  6. Finally, because there is a mask shortage, learning to make masks at home will increase the supply of masks we can donate to hospitals, nursing homes, and anywhere else where personal protective equipment is in short supply. While these homemade masks may not be as good as specialized equipment, they are infinitely better than having nothing at all.

So I hope I've convinced you that creating a culture of mask wearing is one of the most effective things we can do, right now, to defeat coronavirus.

The next step is for all of us to make masks, wear masks, make a big ruckus about it on social media, and to put pressure on anyone who tries to impede people from wearing masks in the workplace.

It is important that we do this without exception. That means even if you're going for a jog in the woods, or you've recovered from coronavirus, or you've recently been thawed from a glacier and know you can't possibly be contagious, you still put a mask on.

The goal is not only to keep people safe, but to make it rare and weird to see anyone outside with a bare face.

We've done this kind of thing before in the name of public health! Spitting in the street used to be routine in America; now it's considered extremely rude. Public smoking is no longer tolerated, and smokers have been shamed and confined into furtive puffing, even though they're still a sixth of the population.

This doesn't mean we'll never see a stranger's bare face again. Once the immediate crisis is over, we can dial things down and adopt the same social norms as East Asia, where wearing a mask is an optional and unremarkable choice people make, one that tends to increase during cold and flu season, but is just a normal part of the social landscape.

But for the time being, let's get fundamentalist about it! Mask wearing is a powerful weapon, and if we combine it with hand washing, social distancing, staying home whenever possible, and not touching our face, we can really start to kick this virus's ass.

So please, commit with me to starting a lifesaving trend: always wear a face mask when you go out in public, with no exceptions, and make as many masks as you can for the people who need them.

Addendum I: How to make masks

Masks are in short supply, but you can MacGyver one out of practically anything, including paper towels, cotton, and vacuum cleaner bags. Expect the number of online tutorials to proliferate. Here are some I am partial to:

Addendum II: Further reading on mask use
