CMSWire: Understanding the Difference Between Data Lakes and Data Warehouses

Our CMO and VP of Strategy, Kelly Stirman, was quoted in this great piece by CMSWire on data lakes.

Here is what we liked:

“Since data lakes primarily store raw and unprocessed data, the stored data can be utilized for any purpose, which makes it ideal for artificial intelligence (AI), machine learning and data science. However, unprocessed data does require a large storage capacity and there is also an issue of data governance.

Data warehouses are great for exploring data relationships across the enterprise. For example, if customer, product and facility information are all in the data warehouse, the [data] warehouse makes it relatively easy to see [how] the customer satisfaction and returns ratings are related to the different facilities at which those products are made.

A data lake can be considered a vast pool of raw data where the purpose is not defined. A data warehouse is a repository for structured and defined data that has already been processed for a particular purpose.”