Systems and Storage Research Lab: Sosp 2017

Now that I have some time to breathe in the semester, I thought I would talk about the 26th Symposium on Operating Systems Principles (SOSP 2017). I attended SOSP 2017 in Shanghai, China where my student Pandian Raju presented our work on PebblesDB. Two other UT professors, Emmett Witchel and Simon Peter, also attended SOSP along with their student, Youngjin Kwon, who presented Strata. The UT Systems group has had at least one paper in SOSP/OSDI every year since 2005!

The view from my hotel room

The conference was held in the beautiful Pudong Shangri-La hotel in Shanghai. Right off the bat, I want to thank the volunteers of Shanghai Jiao Tong University (about 30 of them!) for their amazing hospitality. Here’s some of the great things they did:

Received conference participants at the Pudong Airport, gave them a metro-card with 100 RMB, and directed them to where they could catch the train.
Took us sight-seeing to the Shanghai Tower. Another group went on a river cruise.
Had people posted at different locations in the enormous Shangri-La hotel to direct people to the right conference rooms.

Overall, the hospitality was amazing, and the conference participants were made to feel comfortable. Kudos to the amazing volunteer team!

The AI Systems Workshop

I went to Shanghai a day earlier to catch the AI Systems workshop. It did not disappoint! The event was standing room only, and whenever the speakers said something, a dozen cameras would go up and flash, leaving me wondering whether I was at an academic conference or a fashion show.

The craze in AI is just insane, and a lot of it is just re-branding whatever you are currently doing as ML or AI. However, the workshop also pointed out that behind the empty marketing buzz words there are some good systems problems that are worth attacking. Here is one thing I found insightful (as someone not part of the AI scene): to improve performance, you need to batch your computations; however, while improving performance, batching decreases accuracy. So it is not as simple as lets batch and throw more compute at it! I wonder which other tried-and-tested systems techniques to improve performance fail when applied to AI Systems.

There was a panel where they discussed what current researchers are doing wrong. The panelists mentioned that everything they do is in search of that elusive +1% or 0.5% in accuracy: systems that sacrifice this to increase performance are a total no-go. Similarly, AI-Systems research seems to suffer drastically from the “Bad Baseline” evaluation problem where researchers pick a bad baseline to compare their system against.

Diversity Workshop. I quickly want to mention the Ada Diversity Workshop, which was conducted for the first year this time at SOSP. The workshop allowed female researchers and other under-represented minorities to hear from leading researchers on how to have a successful research career. My student Jayashree got to meet her academic grand-advisor, my advisor Andrea Arpaci-Dusseau, at the workshop! Kudos to the organizers for this year’s workshop, and I hope they continue doing this at OSDI and SOSP!

The Program

We had an excellent program this year under the watch of PC chairs Peter Chen and Lorenzo Alvisi. They accepted 39 papers (a record number), signifying the increasing number of submissions to SOSP. The acceptance rate was 16.8%. One thing to note though was that it was rather tiring sitting through three full days of presentations. As the number of papers at SOSP or OSDI goes up, I’m not sure how the community will handle this (I really don’t want parallel tracks).

There were many interesting presentations at the conference. Let me highlight a few of them.

I loved the My VM is Lighter (and Safer) than your Container presentation by Costin Raiciu. Costin spoke at break-neck pace (it was almost like a rap song!), but the presentation was still clear and engaging. This is very hard to pull off! I think the paper reflects the systems spirit of carefully analyzing where the overhead is coming from, and achieving a seemingly impossible goal (VMs as fast as containers to start up!) through a sequence of careful engineering decisions. We covered the LightVM paper in my undergrad Virtualization class.

I liked the talks from MIT: clear and engaging. I especially liked the Scaling a file system to many cores using an operation log talk which uses ideas similar to my old Optimistic Crash Consistency work. The Algorand talk, which introduced a new cryptocurrency that confirms transactions in minutes, was also intriguing.

I also liked the Tock work. Tock introduces a new embedded operating system written entirely in Rust. Tock introduces multi-programming for embedded devices, using a combination of Rust features and new abstractions such as grants. I’m glad SOSP included at least one instance where somebody was actually building an operating system :)

Trends. The folks at MIT and Xi Wang at Washington continue to work on the challenge of building verified systems. I believe this line of research is crucial, and once done, will transform the way we build systems. I envision a world where we don’t write low-level code anymore (similar to how we don’t write assembly anymore for the most part). We will be writing specs which will synthesize code for us. The Hyperkernel and verifying crash-safe file system work are in this direction. There is a lot more work to be done, as evidenced by the fact that only now can we verify a non-toy file system (with a lot of limitations). I’m quite excited to see where this will go!

The other trend is folks building techniques and systems to help with debugging. The Pensieve work and the Lazy Bug Diagnosis work fall in this bucket. Tools like these will become more and more important as our systems scale and become more complex. My group’s CrashMonkey project is working in this direction to find crash-consistency bugs in file systems.

Minor Issues. The audio at the conference was not very good. Often, in the back, we wouldn’t be able to hear what the speakers were saying. When people asked questions, we often didn’t hear what the questions were either. The slides were projected a bit low, so that the folks in the front blocked the bottom part of the slides for those in the back. These were minor issues which took away from an otherwise excellent and enjoyable conference.

Focus. One thing I wondered back at OSDI 16, and again at SOSP 17, was whether the conference has gotten too broad. Unlike conferences like FAST, there is no single theme that connects together all the papers. I suppose “systems” is the connecting link, but that is too broad to be useful. Going by the number of questions, many talks do not seem to excite or engage a large fraction of the audience. I remember at OSDI 16, even one of the best paper talks hardly had any questions! My (perhaps faulty) memory is that SOSP 13 and OSDI 14 had far more engagement (I missed SOSP 15). I do not know if this is inevitable as the systems community expands, but it is something to think about.

Arpaci-Dusseau Extended Family at SOSP! My academic siblings who are professors, Haryadi Gunawi (UChicago) and Yiying Zhang (Purdue), were attending the conference along with their students who were presenting the MittOS and the Lite Kernel RDMA papers. Since my advisor Andrea was also attending the conference, it gave us a chance to take a picture of Andrea with her academic children and grand-children: Andrea, Yiying, Haryadi, and I with all of our students who were attending SOSP :)

The ADG Extended Family!

Awards

The best paper awards went to DeepXplore from Columbia and The Efficient Server Audit Problem, Deduplicated Re-execution, and the Web from NYU. Nickolai Zeldovich won the Mark Weiser award. Nickolai’s student Haogang Chen won the Dennis M. Ritchie Dissertation Award. The Hall of Fame awards went to the Chubby and Dynamo papers. Interestingly, all Hall of Fame awards in 2016 and 2017 went to industry papers.

Student Research Competition. Ding Yuan organized the first ACM Student Research Competition at SOSP. I was part of the program committee which evaluated posters and presentations at the conference. The SRC is an excellent way to encourage undergraduate and graduate researchers! I hope the SRC is held every year from now on. The final winners were Jon Gjengset (MIT), Tej Chajed (MIT), and Brandon Zhang (UBC).

Congratulations to all the winners!