ezyang's blog

the arc of software bends towards understanding

Posts

A year into Backpack

It’s been a year since I got my hood and gown and joined Facebook (where I’ve been working on PyTorch), but while I’ve been at Facebook Backpack hasn’t been sleeping; in fact, there’s been plenty of activity, more than I could have ever hoped for. I wanted to summarize some of the goings on in this blog post.

Libraries using Backpack

There’s been some really interesting work going on in the libraries using Backpack space. Here are the two biggest ones I’ve seen from the last few months:

Read more...

A compile-time debugger that helps you write tensor shape checks

A run-time debugger allows you to see concrete values in a program, make changes to them, and continue running your program. A compile-time debugger allows you to see symbolic values in a program, reason about them, and write the rest of your program, e.g. filling in missing tensor size checks, for example.

Here’s an example of a compiler-time debugger in action.

Let’s suppose you are writing a simple program to read a pair of tensors from two files and do a matrix multiply on them. “Easy,” you think, while writing the following program:

Read more...

Online/offline continuous integration

Raise your hand if you’ve ever put one of these commands in your continuous integration scripts:

  • apt install somepackage
  • pip install -r requirements.txt or pip install somepkg
  • conda install blah
  • cabal update or cabal install blah
  • git clone https://github.com/someguy/somerepo
  • wget http://some-website/thingy-latest.tgz

Can you tell what the problem is? These commands are not reproducible: depending on when you run them, they may give different results. More insidiously, most of the time they give you the same result (or, perhaps, a different result that still works for your use case).

Read more...

Semantic Import Versioning in the wild

The best and worst thing about semantic import versioning is that it makes BC-breaking changes hard.

In the past few days, Russ Cox has made a splash in a series of white papers describing Go and Versioning. In them, he coins a new term, Semantic Import Versioning, distilling it to the following principle:

If an old package and a new package have the same import path, the new package must be backwards compatible with the old package.

Read more...

Systems ML workshop panel

  • JG: Joseph Gonzalez
  • GG: Garth Gibson (CMU)
  • DS: Dawn Song (UC Berkeley)
  • JL: John Langford (Microsoft NY)j
  • YQ: Yangqing Jia (Facebook)
  • SB: Sarah Bird
  • M: Moderator
  • A: Audience

M: This workshop is bringing together ML and systems. Can you put your place on that spectrum? Who is your home community?

YJ: Right in the middle. I’d like to move more towards systems side, but Berkeley Parallel Labs kicked me out. ML is my home base.

Read more...

Accelerating Persistent Neural Networks at Datacenter Scale (Daniel Lo)

The below is a transcript of a talk by Daniel Lo on BrainWave, at the ML Systems Workshop at NIPS'17.


Deploy and serve accelerated DNNs at cloud scale. As we’ve seen, DNNs have enabled amazing applications. Architectures achieve SoTA on computer vision, language translation and speech recognition. But this is challenging to serve in large-scale interactive because there are latency, cost and power constraints. Also, DNNs are growing larger in size and complexity.

Read more...

MOCHA: Federated Multi-Tasks Learning (Virginia Smith)

The below is a transcript of a talk by Virginia Smith on MOCHA, at the ML Systems Workshop at NIPS'17.


The motivation for this work comes from the way we think about solving ML problems in practice is changing. The typical ML workflow looks like this. You start iwth dataset and problem to solve. Say you want to build a classifier to identify high quality news articles. Next step is to select an ML model to solve the problem. Under the hood, to fit the model to your data, you have to select an optimization algorithm. The goal is to find an optimal model that minimizes some function over your data.

Read more...

A Machine Learning Approach to Database Indexes (Alex Beutel)

The below is a transcript of a talk by Alex Beutel on machine learning database indexes, at the ML Systems Workshop at NIPS'17.


DB researchers think about there research differently. You have a system that needs to work for all cases. Where as in ML, we have a unique circumstance, I’ll build a model that works well. In DB, you have to fit all.

To give an example of this is a B-tree. A B-tree works for range queries. We have records, key, we want to find all records for range of keys. 0-1000, you build tree on top of sorted array. To quickly look up starting point in range. What if all my data, all of the keys, from zero to million… it becomes clear, you don’t need the whole tree above. You can use the key itself as an offset into the array. Your lookup is O(1), O(1) memory, no need for extra data structure.

Read more...

Ray: A Distributed Execution Framework for Emerging AI Applications (Ion Stoica)

The below is a transcript of a talk by Ion Stoica on Ray, at the ML Systems Workshop at NIPS'17.


We’ve been working on it at Berkeley for more than one year. Over the past years, there’s been tremendous progress in AI. Ad targeting, image&speech, many more. Many applications are based on supervised learning with DNNs. Supervised plus unsupervised are the two dominant approaches.

However, the next generation of AI applications will be very different. They’re deployed in mission critical scenarios, need to continually learn from a rapidly changing env. Robotics, self driving cars, unmanned drones, dialogue systems. Implementing this new generation of AI applications requires a broader range of techniques. Stochastic optimization, parallel simulations, many more.

Read more...

Backpack for deep learning

This is a guest post by Kaixi Ruan.

Backpack is a module system for Haskell, released recently in GHC 8.2.1. As this is a new feature, I wanted to know how people use it. So I searched Twitter every day, and the other day I saw this tweet:

Are there other examples than String/Bytestring/Text? So far I haven’t seen any; it seems like backpack is just for glorified string holes.

There were a number of good responses, but I want to give another use case from deep learning.

Read more...