Data Mastery Hub: Term Resource for Data Professionals

Whether you're a newcomer to the world of big data and data lakes or an experienced pro looking to expand your knowledge, the Dremio Wiki provides insights and guidance for all your data-related needs. Dive in and unlock the power of your data today!

Data Management

Data Rollback

Data Rollback is a feature that allows businesses to revert their data to a previous state, typically to undo a faulty load or an unintended change before further processing and analytics.

Network Infrastructure

Data Routing

Data Routing is the process of directing data flows to the appropriate systems for processing and analytics.

Data Management

Data Sampling

Data Sampling is a technique used to select a subset of data from a larger dataset to perform analysis, processing, or testing.
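As an illustration of the definition above, here is a minimal sketch of simple random sampling using only the Python standard library. The function name `sample_rows` and its parameters are hypothetical, chosen for this example, and simple random sampling is only one of several sampling strategies.

```python
import random

def sample_rows(rows, fraction, seed=0):
    """Return a simple random sample holding roughly `fraction` of `rows`.

    `sample_rows` is a hypothetical helper written for this illustration.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    # Take at least one row, but never more rows than exist.
    k = min(len(rows), max(1, round(len(rows) * fraction)))
    return rng.sample(rows, k)  # sampling without replacement

dataset = list(range(1000))
subset = sample_rows(dataset, 0.05)
print(len(subset))
```

Fixing the seed is a design choice that makes test runs repeatable; drop it when a fresh sample is wanted on every call.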

Data Management

Data Schema Evolution

Data Schema Evolution is the process of modifying the structure of a database or data warehouse to accommodate changes in data requirements.
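One common form of schema evolution is adding a field with a default value so that older records remain readable. The sketch below assumes records are plain dictionaries; the helper `upgrade_record` and the field names are hypothetical and not tied to any particular database or table format.

```python
def upgrade_record(record, new_field_defaults):
    """Return a copy of `record` with missing new fields filled from defaults.

    Hypothetical helper: existing values always win over defaults.
    """
    upgraded = dict(new_field_defaults)  # start from the defaults
    upgraded.update(record)              # overlay whatever the old record has
    return upgraded

old_row = {"order_id": 1, "amount": 9.99}          # written under the old schema
new_row = upgrade_record(old_row, {"currency": "USD"})
print(new_row)
```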

Machine Learning

Data Science

Within data science, feature engineering is the process of selecting, manipulating, and transforming raw data into features used in machine learning algorithms to improve model accuracy on unseen data.

Data Management

Data Scrubbing

Data Scrubbing is the process of identifying and correcting or removing errors, inconsistencies, and inaccuracies in datasets to ensure data quality and reliability.
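A small sketch of what scrubbing can look like in practice: normalizing values, dropping invalid rows, and removing duplicates. The record layout (`name`, `email`) and the validity rule (an email must contain `@`) are assumptions made for this example only.

```python
def scrub(records):
    """Normalize, validate, and de-duplicate a list of contact records.

    Hypothetical helper; the field names and rules are illustrative.
    """
    seen_emails = set()
    cleaned = []
    for rec in records:
        # Normalize: treat None as empty, trim whitespace, lowercase emails.
        name = (rec.get("name") or "").strip()
        email = (rec.get("email") or "").strip().lower()
        # Remove rows that are missing a name or a plausible email.
        if not name or "@" not in email:
            continue
        # Remove duplicates, keeping the first occurrence.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned

raw = [
    {"name": " Ada ", "email": "ADA@example.com"},
    {"name": "Ada", "email": "ada@example.com "},   # duplicate after normalizing
    {"name": "", "email": "x@example.com"},         # missing name
    {"name": "Bob", "email": None},                 # missing email
]
cleaned = scrub(raw)
print(cleaned)
```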

Data Security

Data Security and Governance Policies

Data Security and Governance Policies are the guidelines and practices that organizations implement to protect their data assets, ensure compliance with regulations, and facilitate effective data processing and analytics.

Data Security

Data Security and Privacy

Data Security and Privacy covers the practices that protect data from unauthorized access and govern how sensitive information is handled, along with their benefits, their challenges, and how they integrate with data lakehouse environments.

Data Management

Data Segregation

Data Segregation is the practice of organizing and separating data based on its attributes or characteristics to optimize data processing and analytics.

Data Engineering

Data Serialization

Data Serialization is the process of converting structured or semi-structured data into a serialized format, such as JSON or XML, for storage or transmission.
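For example, a JSON round trip with Python's standard `json` module might look like the sketch below. The sample record is invented for illustration.

```python
import json

# An in-memory record (invented sample data).
record = {"id": 42, "name": "sensor-a", "readings": [1.5, 2.0, 3.25]}

# Serialize: convert the structure to a JSON string for storage or transmission.
serialized = json.dumps(record, sort_keys=True)

# Deserialize: rebuild an equivalent structure from the string.
restored = json.loads(serialized)

print(serialized)
```

Sorting keys is optional; it simply makes the serialized output stable across runs, which helps when diffing or hashing it.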

Data Storage

Data Sharding

Data Sharding is a technique for horizontally partitioning large datasets into smaller, more manageable parts.
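A minimal sketch of hash-based sharding, one common way to assign rows to shards, follows. The helpers `shard_for` and `shard_records` are hypothetical names; real systems may instead use range-based or directory-based schemes.

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards

def shard_records(records, key_field, num_shards):
    """Split records into `num_shards` lists based on the hash of `key_field`."""
    shards = [[] for _ in range(num_shards)]
    for record in records:
        shards[shard_for(str(record[key_field]), num_shards)].append(record)
    return shards

records = [{"user_id": i} for i in range(100)]
shards = shard_records(records, "user_id", 4)
print([len(s) for s in shards])
```

A cryptographic hash is used here rather than Python's built-in `hash()` because the built-in is randomized per process for strings, so it would not give a stable shard assignment across runs.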

Data Management

Data Silos

Data Silos are isolated repositories of data within an organization that are not easily accessible to, or interoperable with, other systems.

Data Management

Data Skew

Data Skew is an imbalance in the distribution of data within a dataset that can impact data processing and analytics.

Data Analysis

Data Skewness

Data Skewness is the imbalance in the distribution of data across partitions or nodes in a distributed computing environment.
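One simple way to quantify this kind of imbalance is to compare the largest partition with the average partition size. The metric and the function name `skew_ratio` below are assumptions chosen for illustration, not a standard definition.

```python
def skew_ratio(partition_sizes):
    """Return max partition size divided by mean partition size.

    A value near 1.0 means the partitions are balanced; larger values mean
    one partition holds a disproportionate share of the data.
    Hypothetical metric chosen for this illustration.
    """
    if not partition_sizes:
        raise ValueError("partition_sizes must not be empty")
    mean = sum(partition_sizes) / len(partition_sizes)
    if mean == 0:
        raise ValueError("partitions contain no data")
    return max(partition_sizes) / mean

balanced = skew_ratio([100, 100, 100, 100])
skewed = skew_ratio([10, 10, 10, 370])
print(balanced, skewed)
```

In the skewed case, the node holding the largest partition does several times the average work, so it tends to determine how long the whole job takes.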

Data Management

Data Snapshot

A Data Snapshot is a static copy of data captured and stored at a specific point in time.
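To make the idea concrete, the sketch below captures labeled snapshots of an in-memory dataset and restores one later. The `SnapshotStore` class is hypothetical and uses full deep copies for simplicity; production systems usually rely on more space-efficient mechanisms.

```python
import copy

class SnapshotStore:
    """Hold a dataset plus labeled point-in-time copies (hypothetical class)."""

    def __init__(self, data):
        self.data = data
        self._snapshots = {}

    def snapshot(self, label):
        # Deep copy so later changes to `data` cannot alter the snapshot.
        self._snapshots[label] = copy.deepcopy(self.data)

    def rollback(self, label):
        # Restore a copy so the stored snapshot itself stays untouched.
        self.data = copy.deepcopy(self._snapshots[label])

store = SnapshotStore({"orders": [1, 2, 3]})
store.snapshot("before-load")
store.data["orders"].append(999)   # a change we later decide to undo
store.rollback("before-load")
print(store.data)
```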
