Predicting TV Tune-In Using PySpark, MLlib & Delta Lakehouse

At MiQ Digital India Pvt. Ltd., we collect and process high-volume TV viewing data and apply machine learning models to help TV networks get the most value from their ad slots.

We use Apache Spark MLlib for modeling and PySpark for data wrangling and feature engineering, within a Kafka-based, event-driven microservices architecture. The surrounding data engineering ecosystem is a lakehouse architecture built on top of Delta Engine.

This talk will cover how MiQ’s TV product was scaled to market across more than 50 advertisers, the pipeline optimizations needed for data at TB scale, and the cost optimizations applied to model generation and prediction.

