The Internet of Things (IoT) is one of the driving forces for the increase in today’s data volume and diversity. In response, IoT platforms must enable users to leverage incoming data for immediate analysis/action as well as long-term historical analytics. For the latter, the data needs to be stored efficiently and made available for large-scale analytics.
Cloud data lakes are cost-efficient and scale almost infinitely. However, data lakes only provide low-level primitives that must be used appropriately to unleash their full potential.
This talk discusses both the challenges and solutions of using data lakes as the essential building blocks for long-term storage of IoT data, such as how data is moved from the IoT platform to the data lake, how data is organized, and how efficient querying by various consumers (e.g., IoT platform user interface, business intelligence tools and machine learning applications) is achieved.
Tim Doernemann graduated with a degree in computer science from the University of Marburg, Germany, in 2006. During his doctoral program at the Distributed Systems Research Group of the University of Marburg, he worked on various topics around scheduling and quality of service for high-performance computing, grid computing and cloud computing applications. Since 2012, he has worked as a developer and architect at the intersection of IoT and data analytics.
Michael Cammert graduated with a degree in computer science from the University of Marburg, Germany, in 2002. During his time as a research assistant at the Database Research Group of the University of Marburg, he worked on data stream processing. In 2007, he co-founded the complex event processing startup RTM Realtime Monitoring GmbH, which was acquired by Software AG in 2010. Since then, he has worked as a developer and manager on various projects, currently at the intersection of IoT and data analytics. In 2014, Michael received his Ph.D. from the University of Marburg.