Azure Data Lake Store (ADLS) is one of several offerings on Microsoft’s Azure cloud platform. Its primary function is to store enterprise data, mainly big data. It is a highly scalable and secure system. Besides data storage, Azure Data Lake has other interesting features. Some of them include: In this tutorial, we […]
Tutorials
Azure Blob Storage is a cloud storage solution for massive amounts of unstructured data. It can also be used for streaming video, saving logs, storing data for backup and restore, storing data for analysis, and more. While it is a powerful tool for storing data, sometimes there is a need to work with data from […]
One of the most common problems we face when dealing with different data sources is data inconsistency. Data inconsistency exists when different and conflicting versions of the same data appear in different places. It makes information unreliable, because it is difficult to determine which version is correct. As we all […]
Apache Hive is open source data warehouse software used for processing large amounts of data. It supports different storage types and ensures fast data management. Sometimes it is necessary to analyze the data stored in Hive with other powerful tools, for example, Keras – an open source high-level API for Deep Learning […]
Customers using Dremio with Power BI often need to move to Power BI Gateway. To set this up, they will need to follow a few steps to get live connections with Dremio. Dremio Cluster. System for PBI Gateway to run from (must be on 24×7 for live queries). Power BI […]
As data consumers, we all want to start gaining value from our data as soon as we get our hands on it. TIBCO Spotfire is an analytics tool designed to answer any question, easy or challenging, based on your data as fast as possible. It is a BI platform that can be used across organizations […]
The Lightweight Directory Access Protocol (LDAP) is an open protocol used to store and retrieve data from a hierarchical directory structure. In this tutorial, we will show you the steps needed to integrate LDAP authentication in Dremio. Then we will demonstrate how we can control access for different users and groups in Dremio […]
Amazon Redshift is a powerful data warehouse service in the cloud. It is a simple means of analyzing data across your data warehouse and data lake. Moreover, it is cost-effective. Redshift delivers ten times faster performance than other data warehouses because of the techniques it uses, such as machine learning, massively parallel query execution, and columnar storage […]
When using standard SQL, we can create a table by issuing the CREATE TABLE statement; however, this creates an empty table that we then have to insert data into. Dremio provides the formal ability to export virtual datasets (VDS) to their respective sources (S3, ADLS, HDFS, and NAS) using standard SQL-based commands. Create table […]
Dremio’s semantic layer is an integrated, searchable catalog that indexes all of your metadata, so business users can easily make sense of your data. Virtual datasets and spaces make up the semantic layer, and are all indexed and searchable. We understand that searching for data in organizations is usually more complicated than it should be; these […]
Note: Dremio supports only one data governance policy manager at a time, so you can use either Dremio or Ranger as a policy manager, but not both at the same time. In this tutorial, we will give an overview of the Ranger-based policy enforcement procedures. We will also exercise the different permissions that you […]
Apache Hive is a modern and convenient tool built on top of Apache Hadoop. It is used for processing large amounts of data, stored in a distributed file system, using SQL. Sometimes we need to get data from sources like Hive and analyze it with the help of different tools […]
This install will be done using Docker on macOS High Sierra. The first thing that we need to do is verify that Docker is up and running. There are different ways to do this: If you are working with Docker-for-desktop, the icon on the desktop toolbar should provide a general status. Normally, I would […]
Power BI is a powerful business intelligence platform. It is known for its ability to connect to various data sources, its tools for aggregating and analyzing data, and its rich library of visualizations with many styling options. Today we will show you how to use Power BI with Azure Data Lake Store (ADLS) which is […]
We continue to explore how the most popular data sources connect with your favorite BI tools. Today we will show you the integration of Azure Data Lake Store and Tableau with the help of Dremio. Azure Data Lake Store is a single repository from which you can easily extract data of […]