Companies can choose between two high-level architectures to support analytics: data warehouses and data lakehouses. Cloud data warehouses have become popular because they’re relatively easy to get started with. However, once a company is actively using its warehouse, it quickly becomes one of its largest expenses — often 10 times more expensive than anticipated.
It’s easy to see how data warehouse costs can quickly get out of hand.
The market seems to agree that cloud data warehouses are overwhelmingly expensive. Countless blogs have been written, pre-built dashboards have been developed, and entire companies have emerged to help organizations better manage their cloud data warehouse spend. Capital One, one of Snowflake’s largest customers (and an investor in Snowflake’s Series D), even released a tool, Slingshot, to help “[s]eamlessly manage your Snowflake data costs.”
The price isn’t purely monetary either. By requiring you to load data before usage*, data warehouses delay time to insight and introduce vendor lock-in.
Migrating to a new warehouse that promises to charge less per unit of consumption won’t solve these problems either — these issues are common across all warehouses.
*Here, “usage” specifically refers to running production workloads that fully benefit from the performance and other features/optimizations that warehouses provide.
Data lakehouses combine the scalability and openness of data lakes with the performance and functionality of data warehouses. Lakehouses enable companies to run all their analytical workloads on a single copy of data as it lives in cloud object storage (where most of their data already is), rather than needing to copy it into different proprietary systems before it can be analyzed. With a lakehouse architecture, data consumers can use their favorite tools to analyze data immediately, and data engineers save the time and money needed to load data into a warehouse for others to use (and to maintain the associated infrastructure).
In addition, data lakehouses eliminate vendor lock-in and lock-out that cloud data warehouses are notorious for. Data in cloud object storage is stored in open, vendor-agnostic formats like Apache Parquet and Apache Iceberg, so no vendor has leverage over the data. And, companies benefit from competition and innovation in the lakehouse space, so they can use best-of-breed compute engines to process this data today, and easily try new compute engines as they emerge.
If companies want to fully experience the many benefits of cloud storage and computing, they’ll see the most success by selecting vendors that embrace open architectures and understand the need for other vendors to be at the table.
Dremio is the easy and open lakehouse platform. Data teams use Dremio to deliver self-service analytics on the lakehouse, while enjoying the flexibility to use Dremio's lightning-fast SQL query service and any other processing engine on the same data. Companies looking to get started with their lakehouse journey choose Dremio for several reasons.
First, Dremio makes it easy for companies to immediately start running all their SQL workloads directly on data lake storage, from ad-hoc queries to mission-critical BI dashboards. Dremio supports full DML on Apache Iceberg tables (including inserts, updates, deletes, merges, and truncations), which means companies no longer need to load data into a warehouse to tackle workloads that require data mutations. And, with data stored in open formats on the lake, companies get the flexibility to use any other processing engine on that same data.
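The DML statements above are standard SQL. As a self-contained sketch, the snippet below runs insert, update, and delete statements against Python's built-in sqlite3 — used here purely as a stand-in engine, not Dremio's API; in a lakehouse, an engine like Dremio would run the same statements against an Apache Iceberg table:

```python
import sqlite3

# sqlite3 stands in for a lakehouse engine so the example runs anywhere;
# the SQL shapes are what matter, not the engine.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")

con.execute("INSERT INTO orders VALUES (1, 'open'), (2, 'open')")  # insert
con.execute("UPDATE orders SET status = 'shipped' WHERE id = 1")   # update
con.execute("DELETE FROM orders WHERE id = 2")                     # delete
# (MERGE and TRUNCATE, which the text also mentions, are omitted only
# because sqlite3 doesn't support them.)

rows = con.execute("SELECT id, status FROM orders").fetchall()
```

The point is that mutation workloads no longer force a copy into a warehouse: the same familiar statements run directly over table formats on the lake.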
Companies can also use Dremio to combine their lakehouse data with data that resides in external sources. We know companies won’t have all their data in data lake storage on day one, so Dremio supports a variety of connectors to external systems like relational databases. Connectors, combined with Dremio’s DML support, enable teams to run a wide range of analytics workloads across a variety of sources, and get started with a lakehouse architecture with less up front work.
On top of connectivity, companies choose Dremio because it gives data consumers a self-service experience to explore, analyze, curate, and share data through a modern, intuitive UI. With Dremio, data analysts and data scientists can analyze and experiment with data immediately, without needing help from data engineers. In addition, Dremio’s transparent query acceleration enables teams to use any SQL client to work directly with their logical data model, without needing to maintain physically optimized tables or create BI extracts.
Companies across the world have already used Dremio to melt away their cloud data warehouse costs as part of their lakehouse journey.
Companies have become interested in data lakehouse architectures because they eliminate the operational costs and vendor lock-in associated with data warehouses, and enable their teams to use best-of-breed tooling on their data. You can get started with your lakehouse architecture for free today with Dremio! If you have any questions or want to discuss cost optimization strategies when thinking about a lakehouse architecture, feel free to reach out to our experts here.