My Top 8 Moments at TWIMLcon 2019

What do Andrew Ng, Eric Colson, and Peter Skomoroch (and many others) all have in common? Aside from being some of the biggest names in the machine learning community, they were all speakers at the inaugural TWIMLcon conference that took place last week in San Francisco. Machine learning practitioners from many industries convened at the Mission Bay Conference Center to hear industry leaders discuss the platforms, tools, technologies, and practices necessary to enable and scale machine learning and AI in the enterprise. Presenters explored topics such as the barriers to getting models into production, how to apply MLOps and DevOps to ML workflows, and organizational and cultural best practices for success.

I had an incredible time speaking about building a machine learning platform on Kubernetes, but my talk was just a small part of what was otherwise an extremely informative and well-organized conference. Aside from keynote interviews and technical sessions by industry thought leaders like Andrew Ng and Deepak Agarwal, the conference featured "team teardown" panels, where several members of a single company explored relationships between ML platform builders and users, and an Unconference, where attendees could suggest and vote on group-led sessions through the conference mobile app.

What follows are my top 8 moments, experiences, and takeaways from the conference.

Sam Charrington presents Unconference voting results.

1. "It takes a village to get AI right."

While building ML platforms requires data scientists to partner with platform engineers and SREs, building machine learning products involves incorporating additional expertise from UX, Product, and Design. "At LinkedIn," says Deepak Agarwal, "we have a machine learning engineer sitting with product designers at the design stage when building new products." If you want to get the AI right, the VP of AI at LinkedIn continued, these different roles need to work together from the planning stage all the way through implementation. But bringing these diverse roles together requires extensive planning and coordination.

To attract and keep high-quality UX designers on ML projects, says Hussein Mehanna, you need to treat them as first-class citizens along with your data scientists and machine learning engineers. This coordination becomes considerably more challenging when you include teams responsible for manually labeling datasets, which may be distributed across the world. In their session on human-in-the-loop machine learning, Robert Munro and Radha Basu discussed best practices for ensuring that data labelers are treated fairly and humanely, regardless of whether they’re internal employees or externally sourced. Deepak Agarwal summarized it perfectly: "It takes a village to get AI right."

2. Companies everywhere are making the Build vs Buy vs Open Source decision for ML platforms

A recurring question asked at the conference was whether organizations should build their own ML platforms or buy from vendors. During my presentation I discussed my experience having both built and bought platforms throughout my career. Afterwards I was asked about my experiences by the heads of data science departments from several different companies. They were interested in what vendors I had tested, which ones I preferred, and how many engineers it took to build a platform from scratch.

Franziska Bell, Director of Data Science Platforms at Uber, answered several of these questions during her keynote. When deciding whether or not to build a platform at Uber, Franziska said she considers three questions: 1) Will the investment in the platform lead to a step-function improvement in the modeling? 2) What is the breadth of applications of a particular methodology at the company? and 3) How reusable is the platform being considered?

The Build vs. Buy topic was the most popular Unconference session. During this group chat I offered additional advice from my experience. For example, I believe strongly that companies should only build platforms if the platform provides differentiated business value. If you’re considering different vendors, you should invest time up front in defining what features you’re looking for and determine criteria by which to compare different vendors. I’ll be following up with an entire blog post on the topic, so if you’re interested in receiving it, sign up at the bottom of this page.

3. Importance of language and framework agnosticism

A theme that emerged over and over again throughout the sessions was the importance of language and framework agnosticism in machine learning development. Ameen Kazerouni, the Director of Data Science at Zappos, described how his team uses tools like protocol buffers and Avro/Parquet files to establish contract-first protocols and promote cross-language data exchange. They also use model interchange formats like PMML and ONNX to enable data scientists to build a model in one language and serve it in another. John Swift and Sumit Daryani from Capital One included framework and language agnosticism as one of the key features they prioritized when planning their internal Kubernetes-based machine learning platform.

This emphasis on framework and language agnosticism points to how rapidly new ML libraries are being introduced and developed. Rather than constrain data scientists to develop in a single language, which tends to reduce operational concerns, platform engineers are building solutions that rely on efficient data transfer and containerization to enhance flexibility. As Deepak Agarwal said in his keynote, "A proper platform is an algorithm agnostic platform."
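To make the contract-first idea a bit more concrete, here is a minimal sketch in Python. It is my own illustration, not the setup any of the speakers described: plain JSON stands in for protocol buffers or Avro, and the contract name and its fields (`PREDICTION_REQUEST_CONTRACT`, `user_id`, `item_ids`, `max_results`) are hypothetical. The point is only that the schema is agreed on first, and every producer and consumer validates against it, whatever language they happen to be written in.

```python
import json

# Hypothetical contract: field name -> expected type after JSON decoding.
# In a real system this role would be played by a .proto or Avro schema file
# shared across languages; a dict is used here only to keep the sketch small.
PREDICTION_REQUEST_CONTRACT = {
    "user_id": str,
    "item_ids": list,
    "max_results": int,
}


def validate(record: dict, contract: dict) -> None:
    """Raise ValueError if the record's fields or types differ from the contract."""
    missing = contract.keys() - record.keys()
    extra = record.keys() - contract.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for name, expected in contract.items():
        if not isinstance(record[name], expected):
            raise ValueError(f"{name}: expected {expected.__name__}")


def encode(record: dict, contract: dict) -> str:
    """Producer side: validate against the contract, then serialize."""
    validate(record, contract)
    return json.dumps(record, sort_keys=True)


def decode(payload: str, contract: dict) -> dict:
    """Consumer side: deserialize, then validate against the same contract."""
    record = json.loads(payload)
    validate(record, contract)
    return record
```

A producer would call `encode(...)` and a consumer `decode(...)` with the same contract, so a mismatch fails loudly at the boundary instead of deep inside a model-serving path.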

4. Kubernetes is everywhere

It was clear from the conference that companies everywhere have embraced Kubernetes for their machine learning workloads. In his opening statement, Sam Charrington declared that in the ML community, Kubernetes has gone from a bleeding-edge idea to "almost passé", and that TWIMLcon had received a large number of proposals from individuals seeking to speak about their experience with the container orchestration platform. Although many of these proposals weren’t accepted, there were several fantastic talks on the topic.

Sumit Daryani and John Swift presented the microservice-based architecture their team has built on top of Kubernetes at Capital One. The system has enabled self-service model deployment for data scientists, provided resilient and scalable on-demand resources, and allowed them to leverage open source components like seldon-core for deployment. In the autonomous vehicle space, Clément Farabet from NVIDIA described MagLev, their end-to-end GPU-enabled AI platform running on Kubernetes. I was amazed by the level of scale and processing Farabet described: the MagLev system is running across thousands of GPUs and processing petabyte-scale datasets. His was definitely a talk for scale/speed junkies!

And I’d be remiss not to mention my own talk. Directly after Clément spoke, I presented a case study of building a Kubernetes-based ML platform at 2U. Judging by the number of photos taken by audience members, the private messages I received in the conference app, and the number of people who came up to me afterwards to ask questions, I’d say that the talk was extremely well received ; )

There was also a lot of discussion around the Kubeflow platform. During the Unconference, Kubeflow co-founder and Kubernetes contributor David Aronchick led a packed session on how companies are utilizing the platform in production settings.

Luigi Patruno discusses Kubernetes at 2U — Hey, that’s me!

5. Human-in-the-loop is creating new jobs that were previously unimagined

One of the sessions I enjoyed most was the joint talk by Robert Munro and Radha Basu. Robert Munro, author of the new book Human-in-the-Loop Machine Learning, discussed systems and processes companies can employ to efficiently utilize humans for data labeling while ensuring that these workers are treated fairly. For example, companies should always assume good will on the part of the humans doing the labeling, even if there are errors in the labeling process. Workers should still be compensated in full for their efforts, although companies can decide to exclude those individuals or teams from future work.

Robert introduced Radha, the CEO of iMerit, a data labeling company employing 2,500 individuals responsible for tagging and labeling datasets. iMerit effects positive social and economic change by tapping into a talent pool that was under-resourced and digitally excluded. It was incredibly inspiring to hear Radha talk about iMerit, its mission, and its employees, especially since discussions about artificial intelligence and labor usually revolve around workforce displacement rather than engagement. Instead, data labeling is an example of machine learning creating jobs that were previously unimagined. iMerit has capitalized on this opportunity with their social mission.

Radha Basu discusses her company iMerit.

6. Autonomous vehicles driving the field of machine learning forward

It’s hard to find a more exciting arena for the application of AI algorithms than autonomous vehicles. But several sessions demonstrated that autonomous vehicles are also a hotbed of innovation when it comes to AI platforms. In his keynote, Hussein Mehanna described how his previous platform work at Facebook and Google compares to his current work at the self-driving car company Cruise. The biggest difference, he said, was the scale of data. According to Hussein, people believe that Facebook and Google are working with the largest datasets in the world. In reality, the size of their datasets pales in comparison to the amount of sensor data being generated by autonomous vehicles. This sensor data, he continued, doesn’t just have many samples (long) but is also extremely high-dimensional (wide) due to the number of sensors on the vehicle. Hussein believes that these constraints in the autonomous space will drive innovation and propel the fields of artificial intelligence and machine learning forward. He concluded this topic with what was probably my favorite quote of the conference: "I think machine learning and AI will fulfill their promise through the vehicle of robotics."

Clément Farabet picked up where Hussein left off by discussing the combination of hardware and platform innovation currently in development at NVIDIA. Clément mentioned that the scale of data in the autonomous space is absolutely unprecedented and requires innovation at all levels of the stack, from the data centers through distributed algorithms.

Hussein Mehanna discusses his work at Cruise.

7. Full Stack Data Scientists Vs Specialization

Another major topic of conversation was whether to build a data science team around generalists (a.k.a. fullstack data scientists) or specialists. The most common reason folks give for choosing the specialist approach is that finding, hiring, and retaining fullstack data scientists is difficult. During his session, Ameen Kazerouni said that interviewing for disparate skill sets in product and design along with the traditional topics of statistics and machine learning is both extremely difficult and expensive. This sentiment was echoed in an Unconference session where the major takeaway was that fullstack data scientists are expensive and hard to find, and that "If you hire one, you should keep him around, even if he’s a bit of an ass" (LOL).

But while finding and hiring fullstack data scientists may be difficult, these arguments say nothing about the quality of work performed by a generalist. Eric Colson of Stitch Fix believes that generalists are capable of doing higher quality work because the number of handoffs between different team members is minimized. I followed up with Eric and asked if he believes there are situations in which specialists are favored over generalists. Essentially, he replied that specialists are preferred when the cost of errors, either on the modeling or infrastructure side, is very high (such as in autonomous vehicles). His statement resonated with my experience as a machine learning engineer at CTRL-Labs. We had to drive innovation on the algorithms side (computational neuroscientists modeling neural EMG signals), on the software side (low latency data transfer), and on the hardware side (designing and fabricating the wearable).

8. Takeaways from Andrew Ng’s talk

Without a doubt, my favorite experience of the conference was watching Andrew Ng’s keynote interview. With a resume that includes roles like Chief Scientist at Baidu and founding lead of Google Brain, it’s hard to argue that anyone in the world has more experience bringing machine learning and artificial intelligence to production than Andrew Ng. His interview was filled with wisdom, insights, and recommendations from each of his experiences.

Companies that are new to machine learning, said Ng, should avoid lofty goals and focus first on completing small projects. The success from completing these small projects will help create momentum within a company and build confidence to enable the business to do bigger projects. When it comes to planning these bigger, loftier projects, he continued, machine learning practitioners should apply the Goldilocks Rule for AI: don’t be overly optimistic about what current approaches can achieve, but don’t be overly pessimistic and undershoot either. After achieving initial success with machine learning, Andrew claimed, business risks rather than technical challenges are what make scaling AI hard. Before implementing an ML project, teams should brainstorm the risks associated with an idea and, for each risk, estimate how probable that risk is, how severe it would be should it occur, and how detectable it is.
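If you want to try that risk brainstorm on your own project, a small risk register is enough to get started. The sketch below is my own, not something Andrew presented: the 1 to 5 scales, the example risks, and especially the scoring rule (multiplying the three estimates, in the spirit of the risk priority number used in FMEA) are all assumptions you should feel free to replace.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Risk:
    name: str
    probability: float   # estimated chance the risk occurs, 0.0 to 1.0
    severity: int        # impact if it occurs: 1 (minor) to 5 (critical)
    detectability: int   # 1 (caught immediately) to 5 (likely to go unnoticed)

    def priority(self) -> float:
        # Assumed scoring rule: a simple product of the three estimates.
        # The talk only said to estimate them, not how to combine them.
        return self.probability * self.severity * self.detectability


def rank_risks(risks: Iterable[Risk]) -> List[Risk]:
    """Return risks ordered from highest to lowest priority score."""
    return sorted(risks, key=lambda r: r.priority(), reverse=True)


# Hypothetical output of a brainstorming session.
EXAMPLE_RISKS = [
    Risk("Model latency exceeds the serving budget", 0.8, 2, 1),
    Risk("Training data drifts from production traffic", 0.5, 4, 4),
    Risk("Labels encode a bias that harms users", 0.2, 5, 5),
]
```

Calling `rank_risks(EXAMPLE_RISKS)` puts the hard-to-detect risks ahead of the likely but easily spotted latency problem, which is exactly the kind of conversation the exercise is meant to start.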

But my favorite part of his talk was the idea of the 1-day sprint. In the morning, data scientists spend time performing error analysis on the results of an experiment they initiated the night prior. In the early afternoon they spend time brainstorming new experiments and ideas. Teammates then split up in the afternoon to implement the ideas from the brainstorming session. Finally, they start these experiments in the evening before heading home. The next day the loop is repeated. According to Andrew Ng, these 1-day sprints are more like day-long debugging sessions. Brilliant!

Andrew Ng and Sam Charrington on stage. — The Great Andrew Ng! (and the great Sam Charrington 😀 )

Conclusion

It’s hard to summarize all of the great ideas and conversations from my experiences at TWIMLcon into one article. And my experience suffers from the selection bias of being able to attend only a single session at a given time. But I’m willing to bet that the other attendees and speakers had their own terrific experiences. If you attended the conference, please let me know what your favorite moment was in the comment section below!

THANK YOU! It’s hard to express the immense gratitude to everyone who helped with, participated in, and supported the very idea of #TWIMLcon. pic.twitter.com/PlCTxx0pcd
— The TWIML AI Podcast (@twimlai) October 3, 2019