Below is an interview I conducted with Yochay Ettun, Co-founder & CEO of cnvrg.io, an end-to-end machine learning platform to build and deploy AI models at scale. Yochay Ettun and his fellow Co-founder Leah Kolben launched cnvrg.io in 2017.
I caught up with Yochay to discuss the launch of CORE, a free community version of the data science platform.
1. Can you tell us about the motivation behind cnvrg.io? Why did you decide to start the company? What is the main problem you’re seeking to solve?
My co-founder Leah Kolben and I built cnvrg.io 3 years ago as AI consultants. Our vision was to help other data scientists manage and simplify their machine learning projects with an end-to-end platform. With the growing technical complexity of the AI field, cnvrg.io helps data science teams focus less on DevOps and dependencies, and more on the real data science – algorithms.
Our platform has grown to provide model management and MLOps solutions, and we constantly release new features to help accelerate AI development and give data scientists the best possible end-to-end experience.
2. Based on your discussions with different companies, can you describe what differentiates successful machine learning projects in industry from unsuccessful projects?
We’ve been working with companies on ML for years, and remarkably have found that more than 80% of models don’t ever make it to production. That means the time and money invested in these ML projects never produce ROI for the company. The most successful ML projects have been the ones that actually make it into production, and have the infrastructure to be managed while in production.
No matter how complex the machine learning, the most important thing is getting it to production. That is how companies can realize the potential of their machine learning team, and increase ROI for the company so that there is continued buy-in from the business leaders.
3. What are the main challenges that companies face as they incorporate machine learning into their products and services?
Oftentimes, companies spend most of their efforts on building their models and ensuring they are perfect. We’ve found that building the model is only half the work. Maintaining a model in production brings challenges of its own. For one, it’s rare to have engineers who are trained in the unique requirements of managing a model in production, which is quite different from deploying a typical application.
These models also need to be prepared to handle a burst in inputs at any time, with the right compute available and automatically scaled to match demand. A further major challenge is monitoring specific parameters in real time.
Models also take a lot of tweaking and maintenance while in production. Models are not static: they should continue to learn while in production and stay up to date with current data. As we have seen with the recent Covid-19 pandemic, models that were running on data from a few weeks ago are no longer relevant. Models need to be constantly updated with current data to maintain accuracy; otherwise, stale predictions can have serious implications for companies and for society.
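Editor's note: the idea of checking whether production data still resembles the training data can be sketched in a few lines of Python. This is a minimal illustration, not cnvrg.io's implementation. The function names, the choice of the population stability index (PSI) as the drift measure, the 10-bin layout, and the 0.2 threshold (a common rule of thumb) are all assumptions made for the example.

```python
import math
from typing import Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Compare two samples of one numeric feature; larger means more drift."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / bins

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range fall into the edge bins.
            idx = max(0, min(bins - 1, int((v - lo) / width)))
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref = fractions(reference)
    cur = fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def needs_retraining(
    reference: Sequence[float],
    current: Sequence[float],
    threshold: float = 0.2,  # assumed cut-off; tune per feature and use case
) -> bool:
    """Flag a feature whose recent values have drifted past the threshold."""
    return population_stability_index(reference, current) > threshold
```

In practice a check like this would run on a schedule against each input feature, with a flagged feature prompting a closer look or a retraining run.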
4. How does cnvrg.io help companies with these challenges?
As data scientists, we know that a lot of time is wasted on non-data science tasks such as DevOps, configuration, versioning, data management, tracking experiments, visualization, deployments, monitoring performance and more. cnvrg.io’s enterprise machine learning platform helps data science teams focus less on the technical complexities of ML, and allows them to focus on the real data science – building high impact models.
We’ve developed a code-first platform for data scientists, with effective model management and one-click MLOps solutions to accelerate the time from research to production. Our container-based platform has simplified infrastructure management with automated cluster orchestration, meta-scheduler features, and one-click integration with any NVIDIA GPU container.
5. Two challenges I see again and again from practitioners have to do with model deployment and model monitoring. How does cnvrg.io address these challenges?
A major focus of cnvrg.io is on how to get more models to production and keep them running at optimized performance. We’ve made deployment of any ML application simple. In one click, any model can be deployed as a scalable REST API running on Kubernetes. Once published, data scientists and engineers have complete control over their model in production with an in-depth visual dashboard to monitor all parameters.
Our monitoring dashboard is complete with tools like Grafana and Kibana to monitor and visualize system and ML health. Users can set alerts for underperforming models, and can set up automatic retraining triggers, which we call continual learning, to ensure their models remain at peak performance while in production, with zero downtime.
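Editor's note: the general pattern of alerting on an underperforming model and triggering retraining can be sketched as follows. This is a hypothetical illustration and does not use or reflect cnvrg.io's API. The `AccuracyMonitor` class, its parameters, and the `on_alert` callback are names invented for the example, and it assumes ground-truth labels eventually arrive for each prediction.

```python
from collections import deque
from typing import Callable


class AccuracyMonitor:
    """Track accuracy over a sliding window and fire a callback when it drops."""

    def __init__(
        self,
        window: int,
        min_accuracy: float,
        on_alert: Callable[[float], None],
    ) -> None:
        self.outcomes = deque(maxlen=window)  # True for each correct prediction
        self.min_accuracy = min_accuracy
        self.on_alert = on_alert  # e.g. notify the team or start a retraining job

    def record(self, prediction, actual) -> bool:
        """Log one labeled prediction; return True if an alert was fired."""
        self.outcomes.append(prediction == actual)
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # wait until the window is full
        accuracy = sum(self.outcomes) / len(self.outcomes)
        if accuracy < self.min_accuracy:
            self.on_alert(accuracy)
            self.outcomes.clear()  # start a fresh window after alerting
            return True
        return False
```

A caller would pass a function as `on_alert` that sends a notification or submits a retraining job; clearing the window afterwards is one simple way to avoid firing the same alert on every subsequent prediction.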
6. You recently announced CORE, a lightweight free version of the cnvrg.io platform. What motivated you to create this offering and what are you hoping to achieve through a freemium offering?
At a time when there is so much uncertainty, we believe that data scientists have the ability to solve some of the world’s most complex problems with machine learning. As was our vision from the very start, cnvrg.io wants to help data scientists do what they do best. With the growing technical complexity of the AI field, the data science community has strayed from the core of what makes data science such a captivating profession: the algorithms.
Today’s reality is that data scientists are spending 80% of their time on non-data science tasks, and 65% of models don’t make it to production. cnvrg.io CORE is an opportunity to open our end-to-end solution to the community to help data scientists and engineers focus less on technical complexity and DevOps, and more on the core of data science – solving complex problems.
At such a crucial time in history, it’s so important to empower the data science community that has been the catalyst for innovation over the past few decades. We hope that while many other fields must slow down, CORE can help the data science community accelerate AI innovation and use it to solve the unique complex challenges of today. Now, any data scientist can launch cnvrg CORE from our website and start an ML workstation from their home office for free – whether on premise or in the cloud.
7. What advice do you have for ML teams that are struggling to build machine learning solutions into products? What about teams that are just starting out with machine learning?
Getting started with the right infrastructure is key to accelerated growth of a machine learning team. We’ve worked with a multitude of organizations that have wasted years building in-house infrastructure to support their data science team. Our advice – don’t reinvent the wheel. What these organizations failed to realize is that they are building for a moving target.
You never know when your team or AI production needs to scale. Teams that are trying to build solutions with high business impact should focus on that, not on trying to stay up-to-date with the most recent AI technology. That is what companies like cnvrg.io are for. Data science platforms like ours provide all the latest MLOps tools and infrastructure to actually accelerate your machine learning development so you can focus on impact, and attaining company buy-in with your high performing models.
8. Why should companies choose cnvrg.io over platform solutions from more established companies like AWS SageMaker or Google Cloud ML?
cnvrg.io is a tool by data scientists, for data scientists. We’ve worked with data science teams across industries to provide everything a data scientist needs to build high impact ML solutions. As data scientists, we understand that practitioners value the flexibility to use any language, framework or compute, whether on premise or in the cloud.
cnvrg.io blends in easily with existing IT infrastructure, and ensures users never need to be vendor-locked so they can utilize the most economical multi-cloud or hybrid-cloud options for their specific use case.
Our unified code-first workbench takes care of all the “plumbing” so data scientists and ML engineers can focus on building solutions, gathering insights, and delivering business value. While cnvrg.io has everything a data scientist needs to build high impact models, you can also integrate cnvrg.io with AWS SageMaker, MLflow and Google Cloud ML to enhance your existing ML workflow.
About Yochay Ettun, Co-founder & CEO of cnvrg.io
Yochay is an experienced tech leader with a background in building and designing products. He has been writing code since the age of 7. He served in an Israel Defense Forces intelligence unit for 4 years, and studied Computer Science at the Hebrew University of Jerusalem (HUJI), where he founded the HUJI Innovation Lab. Yochay is the former CTO of Webbing labs, and has consulted for companies on AI and machine learning. After 3 years of consulting, Yochay, along with Co-founder Leah Kolben, decided to create cnvrg.io, a tool to help data scientists and companies scale their AI and machine learning. The company continues to help data science teams from Fortune 500 companies manage, build and automate machine learning from research to production. Feel free to connect with Yochay Ettun on LinkedIn.
Visit cnvrg.io to learn more about the end-to-end data science platform.