March 3, 2022

Predicting TV Tune-In Using PySpark, MLlib & Delta Lakehouse

At MiQ Digital India Pvt. Ltd. we collect and process high-volume TV viewing data and apply machine learning models to help TV networks get the maximum value out of their ad slots.

We use Apache Spark MLlib for modeling and PySpark for data wrangling and feature engineering, within a Kafka-based, event-driven microservices architecture. This stack runs on a well-defined data engineering ecosystem: a lakehouse architecture built on top of Delta Engine.

This talk will cover scaling MiQ’s TV product to market across more than 50 advertisers, pipeline optimization for data at TB scale, and cost optimizations for model generation and prediction.


Speakers

Rohit Srivastava

Rohit Srivastava is a Senior Engineering Lead with expertise working in the programmatic media buying space for the Ad-Tech domain. He is skilled in application development and building highly scalable big data analytics platforms. He is a data pipeline builder responsible for optimizing data stores and building them from the ground up. Rohit is experienced in performing root cause analysis and optimizing data pipelines at scale with a cost-effective approach.

Bitanshu Das

Bitanshu is the lead data engineer at MiQ, managing the Data Engineering team responsible for building scalable big data solutions for programmatic media buying in the ad tech space. He loves exploring data and building optimised data pipelines from scratch.
