Interview Transcript

This is a snippet of the transcript, sign up to read more.

That's quite a statement. Could you elaborate on why that is?

To put it bluntly, Snowflake's ML story is a hodgepodge. I wouldn't even know where to begin using Snowflake to build medium-sized or complex ML models. Perhaps I could do simple things by running Docker containers with Python scripts in Snowpark, but there isn't a robust end-to-end pipeline of products spanning Spark, which is massive, and MLflow. It's important to remember that Databricks also has something that Snowflake lacks in the ML use case - developer mindshare. If you and I were to start a machine learning company or hire engineers who know how to build models, they would likely be familiar with TensorFlow and Spark. They probably wouldn't know much about Snowflake. Therefore, Snowflake has a significant gap when it comes to developer mindshare. Snowflake is definitely stronger on the BI analysts and Tableau side. Similarly, the work that Databricks has done with Databricks SQL and Delta Lake is commendable. However, I don't believe their data warehouse products are on par. I think Snowflake's product scales better and performs better. But if you're not looking for the fastest and most extensive data warehouse out there, the Databricks model will work just fine. It also has additional benefits. It's open, allowing you to store your data in a variety of formats and different storage systems. This is very appealing to businesses that don't want to store all their data in a single closed ecosystem product like Snowflake.

Could you elaborate on that? Given your area of expertise, could you be more specific about the advantages and disadvantages?

With open approaches, such as the lakehouse approach espoused by Databricks and Dremio, the two vendors making noise there, the story is very different. The story is, “You’ve got some data in Postgres; you’ve got some data in S3, in Parquet files. You’ve also got CSV files.” Not a problem. You can keep that data stored wherever you have it today, in whatever format you have, and you just layer our software on top. In Dremio’s case, it’s the Dremio product. For Databricks, it will be Delta Lake and Databricks SQL. We will manage, we will query all of the data for you. You don’t need to move it. And if you don’t like our products, you can just remove the engine sitting on top of your data. But the data is yours. You did not give it up to someone else.

So, what does this dynamic with Databricks mean for these large customers? Do you have any insights?

They can coexist in the sense that I would assign analytical workloads to Snowflake, while Machine Learning workloads would go to Databricks. I can envision a scenario where some analytical workloads are also on Databricks, but I can't see Machine Learning workloads moving to Snowflake unless they make further investments.
