Webinars

Panel: Open Data Architecture

The Open Data Architecture panel closes Subsurface LIVE Summer 2021 with a lively discussion of the state of the cloud data lake, featuring some of the most influential creators of and contributors to key open source data lake software. The moderator, Gartner analyst Sanjeev Mohan, opens the panel with highlights of recent cloud data lake industry trends. He then asks the open source panelists questions such as:

  1. Why data lakes have not, until now, delivered the fast data turnaround that was expected;
  2. What missteps there have been along the way, and what lessons we have learned from them;
  3. What each contributor's journey has been, and why they have succeeded.

Topics Covered

Apache Arrow Flight
Dremio Subsurface for Apache Arrow
Dremio Subsurface for Apache Parquet
In-Memory Formats
Metastores
Subsurface: Nessie Project Insights
Table Formats
Unlocking Potential with Apache Iceberg

Speakers

Sanjeev Mohan

Sanjeev Mohan is an established thought leader in the areas of cloud, big data and analytics. He researches and advises on changing trends and technologies in modern cloud data architectures. He started his data and analytics journey at Oracle, where he worked on emerging technologies. Until recently, he was a Gartner vice president known for his prolific and detailed research, and for directing the data and analytics agenda. Now a Principal at SanjMo, he provides advisory and consulting services covering modern data architectures, governance and operations. He regularly presents on topics pertaining to end-to-end data pipelines and is excited to help businesses discover what their data can do for them.

Ryan Blue

Ryan Blue is the co-creator of Apache Iceberg, and he works on open source data infrastructure. He is also an Avro, Parquet, and Spark committer.

Ryan Murray

Ryan Murray is an open source Engineering Lead at Dremio. He previously worked in the financial services industry in roles ranging from bond trader to data engineering lead. Ryan holds a PhD in theoretical physics and is an active open source contributor who dislikes it when data isn’t accessible in an organisation. He is passionate about making customers successful and self-sufficient, and still dreams of one day winning the Stanley Cup.

Julien Le Dem

Julien Le Dem is the Chief Architect of Astronomer and Co-Founder of Datakin. He co-created Apache Parquet and is involved in several open source projects, including OpenLineage, Marquez (LF AI & Data), Apache Arrow, Apache Iceberg, and others. Previously, he was a senior principal at WeWork; a principal architect at Dremio; a tech lead for Twitter’s data processing tools, where he obtained a two-character Twitter handle (@J_); and a principal engineer and tech lead working on content platforms at Yahoo, where he received his Hadoop initiation. His French accent makes his talks particularly attractive.

Wes McKinney

Wes McKinney is a software developer and entrepreneur focusing on analytical computing. He created the Python pandas project and is a co-creator of Apache Arrow. He authored two editions of the reference book Python for Data Analysis. Wes is a member of The Apache Software Foundation and a PMC member for Apache Parquet. He is now the CTO and co-founder of Voltron Data, a new startup working on accelerated computing technologies powered by Apache Arrow.
