Compression, Dedupe and Encryption Conundrums in Cloud Data Lakes

Cloud data lake footprints are measured in exabytes and growing exponentially, and companies pay billions of dollars to store and retrieve data. In this talk, we will cover some of the space and time optimizations that have historically been applied to on-premises file storage, and how they can be applied to objects stored in cloud data lakes.

Deduplication and compression are techniques that have traditionally been used to reduce the amount of storage used by applications. Data encryption is table stakes for any remote storage offering, and today cloud providers support both client-side and server-side encryption.

Combining compression, encryption and deduplication for object stores in the cloud is challenging due to the nature of overwrites and versioning, but the right strategy can save an organization millions of dollars. We will cover some strategies for employing these techniques, depending on whether an organization prefers client-side or server-side encryption, and discuss online and offline deduplication of objects.

Companies such as Box and Netflix employ a subset of these techniques to reduce their cloud footprint and provide agility in their cloud operations.
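
To make the interaction between deduplication and compression concrete, here is a minimal sketch of online (write-time) deduplication, not the approach presented in the talk. It assumes a toy in-memory store; the class `DedupeStore` and its methods are hypothetical names introduced for illustration. Objects are identified by the SHA-256 hash of their uncompressed content, and each unique blob is compressed once. With client-side encryption, the hash would generally need to be computed before encrypting, since encrypting identical plaintexts under different keys or nonces yields different ciphertexts, and compression would also come first because ciphertext is close to incompressible. Encryption itself is left out of the sketch.

```python
import hashlib
import zlib


class DedupeStore:
    """Toy in-memory object store (hypothetical, for illustration only).

    Dedupes by content hash and compresses each unique blob once.
    """

    def __init__(self):
        self.blobs = {}    # SHA-256 hex digest -> compressed bytes
        self.objects = {}  # object key -> SHA-256 hex digest

    def put(self, key: str, data: bytes) -> bool:
        """Store data under key. Returns True if a new blob was written."""
        digest = hashlib.sha256(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            # Compress only content we have not seen before.
            self.blobs[digest] = zlib.compress(data)
        # An overwrite just repoints the key; the old blob is not reclaimed here.
        self.objects[key] = digest
        return is_new

    def get(self, key: str) -> bytes:
        """Return the original, uncompressed bytes for key."""
        return zlib.decompress(self.blobs[self.objects[key]])

    def stored_bytes(self) -> int:
        """Total compressed bytes held across all unique blobs."""
        return sum(len(blob) for blob in self.blobs.values())


store = DedupeStore()
payload = b"event,user,timestamp\n" * 1000
store.put("logs/a.csv", payload)
store.put("logs/b.csv", payload)  # identical content, so no new blob is written
```

The sketch deliberately ignores the hard parts the abstract points to: an overwritten key leaves its previous blob orphaned, so a real system would need reference counting or garbage collection, and versioned buckets would have to track which versions still point at which blob.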
