Panel: Open Data Architecture
The Open Data Architecture panel closes Subsurface LIVE Summer 2021 with a lively discussion about the state of the cloud data lake, featuring some of the most influential creators of and contributors to key open source data lake software. Moderator and Gartner analyst Sanjeev Mohan opens the panel with highlights of recent cloud data lake industry trends. He then asks the open source panelists questions such as:
- Why data lakes have not, until now, delivered the fast data turnaround that was expected;
- What missteps there have been along the way, and what lessons we have learned from them;
- What each contributor's journey has been, and why they have succeeded.
Sanjeev Mohan is an established thought leader in the areas of cloud, big data and analytics. He researches and advises on changing trends and technologies in modern cloud data architectures. He started his data and analytics journey at Oracle, where he worked on emerging technologies. Until recently, he was a Gartner vice president known for his prolific and detailed research, and for directing the data and analytics agenda. Now a Principal at SanjMo, he provides advisory and consulting services covering modern data architectures, governance and operations. He regularly presents on topics pertaining to end-to-end data pipelines and is excited to help businesses discover what their data can do for them.
Ryan Blue is the co-creator of Apache Iceberg, and he works on open source data infrastructure. He is also an Avro, Parquet, and Spark committer.
Ryan Murray is an open source Engineering Lead at Dremio. He previously worked in the financial services industry, in roles ranging from bond trader to data engineering lead. Ryan holds a PhD in theoretical physics and is an active open source contributor who dislikes it when data isn’t accessible in an organisation. He is passionate about making customers successful and self-sufficient, and still dreams of one day winning the Stanley Cup.
Julien Le Dem is the Chief Architect of Astronomer and Co-Founder of Datakin. He co-created Apache Parquet and is involved in several open source projects including OpenLineage, Marquez (LF AI & Data), Apache Arrow, Apache Iceberg, and others. Previously, he was a senior principal at WeWork; a principal architect at Dremio; a tech lead for Twitter’s data processing tools, where he obtained a two-character Twitter handle (@J_); and a principal engineer and tech lead working on content platforms at Yahoo, where he received his Hadoop initiation. His French accent makes his talks particularly attractive.
Wes McKinney is a software developer and entrepreneur focusing on analytical computing. He created the Python pandas project and is a co-creator of Apache Arrow. He authored two editions of the reference book, Python for Data Analysis. Wes is a member of The Apache Software Foundation and also a PMC member for Apache Parquet. He is now the CTO and co-founder of Voltron Data, a new startup working on accelerated computing technologies powered by Apache Arrow.