Juan Pan is the CTO and co-founder of an open source SaaS commercial startup that has received seed funding from some of the world’s top VCs. Her entrepreneurial journey started just half a year ago – when she was wearing the hats of open source enthusiast, Apache member, distributed database developer, and DBA. In this […]
Apache Iceberg was created out of necessity by companies like Netflix and Apple. Now it’s available to all of us, so it’s time to answer “What can Iceberg do for you?”! In this session, Capitalize will provide the history, benefits, and possibilities of Iceberg in the discipline of BI and analytics. This talk will cover:• […]
Germany-based Douglas GmbH is the leading premium beauty group in Europe, offering more than 160,000 beauty and lifestyle products in online shops, the beauty marketplace, and around 2,000 stores. In this session you will learn how Douglas is using a cloud-based data lake and Dremio to accelerate the delivery of business-critical information to drive more […]
This session provides an overview of the largest AWS build in Fannie Mae, a leading source of mortgage financing, to support finance and the enterprise in centralizing finance data and delivering core calculations to improve analytics, modeling, governance, and regulatory reporting.
Data integration has been solved in countless different ways and is constantly being reinvented from scratch at thousands of companies around the globe. While many sectors of technology have made massive strides in simplifying and democratizing their fundamental tasks, moving data from one place to another has somehow taken longer to arrive to the data […]
As the supply of data continues to explode, enterprises keep constant watch over their data and the pipelines through which it is ingested and managed. Yet many organizations still have issues with data quality and reliability. Multidimensional data observability optimizes the health of the data, the data pipelines, and the data processing of modern data systems. Adding […]
Inside Adobe Experience Platform we noticed we needed to track actions happening at the control plane level and act upon them at lower levels like data lake, ingestion processes, etc. Using Apache Spark Stateful Streaming we’ve been able to create services that act by starting processes like compacting data, consolidating data, and cleaning data, minimizing […]
Expand your technical knowledge. Hear from your peers and industry experts about cloud data lake use cases and architectures at Subsurface, where we explore what’s below the surface of the data lake. Hear firsthand from open source and technology leaders at companies about their experiences spearheading open source projects and building modern data lakes. Explore […]
Bill Inmon, “The Father of the Data Warehouse,” discusses the evolution to the lakehouse. From OLTP and data marts to enterprise data warehouses and now lakehouses, the analytics landscape has changed considerably. Learn how a lakehouse architecture addresses the issues of closed data warehouses and discover: Why SQL is a fundamental language for the past, […]
Organizations want data lakes to be the source of truth for analytics. But typically there are more users that treat operational tools like Salesforce, Marketo, and NetSuite as their source of truth. The reality is these tools are all missing critical pieces of data and don’t provide the full picture. Reverse ETL solves this last […]
In the era of cloud computing, it can be a challenge to consolidate data to successfully serve self-service analytics. It is impractical for organizations to rely on the traditional ETL approach. At HyreCar, we successfully deployed self-service analytics by leveraging open data architecture. The implementation not only streamlined the self-serve BI experience, kept business stakeholders […]
Join AWS and Dremio as they explore trends in open data architectures, the emergence of the lakehouse, and how S3 is helping organizations do more with their data.
You have big data: now what? The real complexity comes from the number of systems and stakeholders that interact with the data, not from its volume. Learn how to manage this complexity and activate your data using your data lakehouse or warehouse.
India’s National Data Platform is one of the largest public sector data lake projects of its kind and is used by everyone from the Prime Minister of India himself to every member of parliament and 100,000 government officials. The project brought together disparate data sources from 42 of India’s government ministries and siloed databases. The […]
The Data Platform team at Mercedes-Benz Research & Development is in the critical path of the company’s road to building the world’s best EV and autonomous vehicles. The MBOS Data Platform team will present how Dremio became instrumental to this effort, as part of a “one-click deployment” approach that allows internal teams to leverage infrastructure-as-code […]