Dremio Blog: Open Data Insights
Virtual Data Marts 101: The Benefits and How-To
The concept of data marts has long been a pivotal strategy for organizations seeking to provide specialized access to critical data for their business units. Traditionally, data marts were seen as satellite databases, each carved out from the central data warehouse and designed to serve specific departments or teams. These data marts played a vital […]
Dremio Blog: Open Data Insights
The Why and How of Using Apache Iceberg on Databricks
The Databricks platform is widely used for extract, transform, and load (ETL), machine learning, and data science. When using Databricks, it's essential to save your data in a format compatible with the Databricks File System (DBFS). This ensures that either the Databricks Spark or Databricks Photon engines can access it. Delta Lake is designed for […]
Dremio Blog: Open Data Insights
Intro to Dremio, Nessie, and Apache Iceberg on Your Laptop
We're always looking for ways to better handle and save money on our data. That's why the "data lakehouse" is becoming so popular. It offers a mix of the flexibility of data lakes and the ease of use and performance of data warehouses. The goal? Make data handling easier and cheaper. So, how do we […]
Dremio Blog: Open Data Insights
Exploring the Architecture of Apache Iceberg, Delta Lake, and Apache Hudi
This article has been revised and updated from its original version published in 2022 to reflect the latest developments in all three table formats. The three major open table formats (Apache Iceberg, Delta Lake, and Apache Hudi) each solve the "open lakehouse" problem differently at the architectural level. While the high-level comparison covers features and ecosystem support, […]
Dremio Blog: Open Data Insights
How to Create a Lakehouse with Airbyte, S3, Apache Iceberg, and Dremio
Using Airbyte, you can easily ingest data from countless possible data sources into your S3-based data lake, and then directly into Apache Iceberg format. The Apache Iceberg tables are easily readable directly from your data lake using the Dremio data lakehouse platform, which allows for self-service data curation and delivery of that data for ad hoc analytics, BI dashboards, analytics applications, and more.
Dremio Blog: Open Data Insights
Revolutionizing Open Data Lakehouses, Data Access and Analytics with Dremio & Alteryx
In the last decade, the data space has seen a wave of innovation to address the need for more storage, more compute, hybrid environments, data management automation, self-service analytics, and a lower TCO to operate business at scale.
Dremio Blog: News Highlights
Dremio Arctic is Now Your Data Lakehouse Catalog in Dremio Cloud
Dremio Arctic brings new features to Dremio Cloud, including Apache Iceberg table optimization and Data as Code.
Dremio Blog: News Highlights
5 Use Cases for the Dremio Lakehouse
With its capabilities in on-prem to cloud migration, data warehouse offload, data virtualization, upgrading data lakes and lakehouses, and building customer-facing analytics applications, Dremio provides the tools and functionality to streamline operations and unlock the full potential of data assets.
Dremio Blog: Open Data Insights
10 Data Quality Checks in SQL, Pandas and Polars
By implementing these data quality checks, you can trust the accuracy of your data and make reliable decisions based on high-quality information. Data quality is an ongoing process, and these techniques serve as essential tools to ensure the integrity and usefulness of your datasets.
Dremio Blog: Open Data Insights
Getting Started with Flink SQL and Apache Iceberg
This blog shows how to get started with Flink SQL and Apache Iceberg for streaming analytical workloads.
Dremio Blog: Open Data Insights
Using Flink with Apache Iceberg and Nessie
This tutorial shows you how to work with Flink and Apache Iceberg, using a Nessie catalog that you can configure for your needs.
Dremio Blog: Open Data Insights
Arctic Changelog – July 2023
This month, we’re excited to announce new features that help you enrich, accelerate, and secure your data in Dremio Arctic!
Dremio Blog: Open Data Insights
Introducing Nessie as a Dremio Source
In this blog, learn how Dremio helps end users analyze massive datasets in their semantic layer without sacrificing speed or ease of use.
Dremio Blog: Open Data Insights
How to Convert JSON Files Into an Apache Iceberg Table with Dremio
This article has been revised and updated from its original version published in 2022 to reflect the latest Dremio and Apache Iceberg capabilities. JSON is the lingua franca of APIs, event streams, and NoSQL databases, but its nested, schema-on-read nature makes it challenging for analytical queries. Converting JSON data to Apache Iceberg tables flattens nested […]
Dremio Blog: Open Data Insights
Deep Dive Into Configuring Your Apache Iceberg Catalog with Apache Spark
Apache Iceberg is a data lakehouse table format that has been taking the data world by storm, with robust support from tools like Dremio, Fivetran, Airbyte, AWS, Snowflake, Tabular, Presto, Apache Flink, Apache Spark, Trino, and many more. Although Apache Spark is one of the tools most data professionals use, and many introductory tutorials […]