Open Data Architecture
Open Data Architecture
The Why and How of Using Apache Iceberg on Databricks
The Databricks platform is widely used for extract, transform, and load (ETL), machine learning, and data science. When using Databricks, it's essential to save your data in a format compatible with the Databricks File System (DBFS). This ensures that either the Databricks Spark or the Databricks Photon engine can access it. Delta Lake is designed for […]
Open Data Architecture
Intro to Dremio, Nessie, and Apache Iceberg on Your Laptop
We're always looking for ways to better handle and save money on our data. That's why the "data lakehouse" is becoming so popular. It offers a mix of the flexibility of data lakes and the ease of use and performance of data warehouses. The goal? Make data handling easier and cheaper. So, how do we […]
Open Data Architecture
Exploring the Architecture of Apache Iceberg, Delta Lake, and Apache Hudi
In the age of data-centric applications, storing, accessing, and managing data can significantly influence an organization's ability to derive value from a data lakehouse. At the heart of this conversation are data lakehouse table formats, which are metadata layers that allow tools to interact with data lake storage like a traditional database. But why do […]
Open Data Architecture
How to Create a Lakehouse with Airbyte, S3, Apache Iceberg, and Dremio
Using Airbyte, you can easily ingest data from countless possible data sources into your S3-based data lake, and then directly into Apache Iceberg format. The Apache Iceberg tables are easily readable directly from your data lake using the Dremio data lakehouse platform, which allows for self-service data curation and delivery of that data for ad hoc analytics, BI dashboards, analytics applications, and more.
Open Data Architecture
Revolutionizing Open Data Lakehouses, Data Access and Analytics with Dremio & Alteryx
In the last decade, a lot of innovation in the data space took place to address the need for more storage, more compute, hybrid environments, data management automation, self-service analytics, and lower TCO to operate business at scale.
Dremio News
Dremio Arctic is Now Your Data Lakehouse Catalog in Dremio Cloud
Dremio Arctic brings new features to Dremio Cloud, including Apache Iceberg table optimization and Data as Code.
Dremio News
5 Use Cases for the Dremio Lakehouse
With its capabilities in on-prem to cloud migration, data warehouse offload, data virtualization, upgrading data lakes and lakehouses, and building customer-facing analytics applications, Dremio provides the tools and functionalities to streamline operations and unlock the full potential of data assets.
Open Data Architecture
10 Data Quality Checks in SQL, Pandas and Polars
By implementing these data quality checks, you can trust the accuracy of your data and make reliable decisions based on high-quality information. Data quality is an ongoing process, and these techniques serve as essential tools to ensure the integrity and usefulness of your datasets.
Open Data Architecture
Getting Started with Flink SQL and Apache Iceberg
This blog shows how to get started with Flink SQL and Apache Iceberg for streaming analytical workloads.
Open Data Architecture
Using Flink with Apache Iceberg and Nessie
This tutorial shows you how to work with Flink and Iceberg, using a Nessie catalog you can configure for your needs.
Open Data Architecture
Arctic Changelog – July 2023
This month, we’re excited to announce new features to help you enrich, accelerate, and secure your data in Dremio Arctic!
Open Data Architecture
Introducing Nessie as a Dremio Source
In this blog, learn how Dremio helps end users analyze massive datasets in their semantic layer without sacrificing speed or ease of use.
Open Data Architecture
How to Convert JSON Files Into an Apache Iceberg Table with Dremio
Apache Iceberg is an open table format that enables robust, affordable, and quick analytics on the data lakehouse and is poised to change the data industry in ways we can only begin to imagine. Check out our Apache Iceberg 101 course to learn all the nuts and bolts about Iceberg. By storing your data in […]
Open Data Architecture
Deep Dive Into Configuring Your Apache Iceberg Catalog with Apache Spark
Apache Iceberg is a data lakehouse table format that has been taking the data world by storm with robust support from tools like Dremio, Fivetran, Airbyte, AWS, Snowflake, Tabular, Presto, Apache Flink, Apache Spark, Trino, and so many more. Although one of the tools most data professionals use is Apache Spark and many introductory tutorials […]
Open Data Architecture
Using Generative AI as a Data Engineer
Generative AI can be a powerful tool for data engineers, but it must be used responsibly and effectively.