How to Build an IoT Data Lake

Session Abstract

The Internet of Things (IoT) is one of the driving forces behind today’s growth in data volume and diversity. In response, IoT platforms must enable users to leverage incoming data both for immediate analysis and action and for long-term historical analytics. For the latter, the data needs to be stored efficiently and made available for large-scale analytics.

Cloud data lakes are cost-efficient and scale almost infinitely. However, data lakes provide only low-level primitives, which must be used appropriately to realize their full potential.

This talk discusses the challenges of using data lakes as essential building blocks for long-term storage of IoT data, along with solutions to them: how data is moved from the IoT platform to the data lake, how data is organized, and how efficient querying by various consumers (e.g., the IoT platform user interface, business intelligence tools, and machine learning applications) is achieved.