1 Stone, 3 Birds: Finer-Grained Encryption @ Apache Parquet
Data access restrictions, retention, and encryption-at-rest are fundamental security controls for achieving data privacy and compliance. This talk shows how we build and use open source Parquet’s finer-grained encryption feature to support all three controls in a unified way.
In particular, we will focus on the technical challenges of designing and applying encryption in a secure, reliable, and efficient manner for large-scale data. These challenges include multiple access routes, performance overhead, handling of access-denied cases, reliability, large volumes of historical data, and auto-onboarding.
We will also share our experience and recommended practices for managing the system in production at scale.
Speakers
Xinli Shang
Xinli Shang is a manager on the Uber Big Data Infra team, Apache Parquet PMC Chair, Presto committer, and Uber Open Source Committee member. He leads the Apache Parquet community and contributes to several other communities, such as Presto and Alluxio. He also leads several initiatives on data formats for storage efficiency, security, and performance, and is passionate about tuning large-scale services for performance, throughput, and reliability.
Mohammad Islam
Mohammad Islam is a Sr. Staff Engineer at Uber. He co-leads the data cost efficiency effort and leads the data security and compliance efforts. He is also an Apache Oozie and Tez PMC member.