It’s Go Time for Open Data Lakehouses

June 10, 2024

If you’re a supporter of open data, it’s hard not to feel good about last week’s news around Apache Iceberg. Customers demanded an open storage format, and the two leading providers, Snowflake and Databricks, are delivering it, in a big way.

To recap: Databricks surprised the big data community last Tuesday by throwing its weight behind Apache Iceberg with the announcement of its intent to acquire Tabular, which was founded by former Netflix engineers who created Iceberg.

That announcement came a day after Snowflake unveiled Polaris, a new metadata catalog designed to work with Iceberg, thereby enabling customers to use open query engines with their data. The move furthered Snowflake’s transition from a proudly proprietary cloud data warehouse into an open data platform for analytics and AI.

Members of the open data ecosystem responded with applause. Among the biggest supporters is Dremio, which develops an open-source query engine of the same name, is the main backer of the open metadata catalog Project Nessie, and manages an Iceberg-based lakehouse for customers.

“I think it’s a statement that, in table formats, Iceberg won. I think it’s the realization of that,” said James Rowland-Jones (JRJ), Dremio’s vice president of product management. “It’s also the realization that table format bifurcation, when you are not winning, is not helpful to your business.”

Databricks’ table format, called Delta, was the most-used table format when Dremio surveyed customers on their lakehouse technologies in late 2023. While Delta was number one in terms of total deployments, Iceberg was the leader in terms of planned deployments over the next three years, said Read Maloney, Dremio’s chief marketing officer.

“Who’s driving these changes? It’s customers. Customers are sick of being locked-in, and the only way to do that is to ensure that you’re not only in an open table format, but then you have an open catalog,” Maloney told Datanami in an interview at Snowflake’s Data Cloud Summit in San Francisco last week.

Read the full article by Alex Woodie, via Datanami.
