Protect your Private Data in your Hadoop Clusters with ORC Column Encryption

Wednesday, March 20
11:50 AM - 12:30 PM
Room 127-128

Fine-grained, column-level data protection in data lake environments has become a mandatory requirement for demonstrating compliance with local and international regulations across many industries. ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads that provides optimized streaming reads along with integrated support for finding required rows quickly. In this talk, we will outline the progress made in the Apache community on adding fine-grained column-level encryption natively to the ORC format. This work will also provide capabilities to mask or redact data on write while protecting sensitive column metadata, such as statistics, to avoid information leakage. The column encryption capabilities will be fully compatible with the Hadoop Key Management Server (KMS) and will use the KMS to manage master keys, providing the additional flexibility to use and manage keys per column centrally. We will also demonstrate an end-to-end scenario showing how this capability can be used.

Speakers

Owen O'Malley
Co-founder & Technical Fellow
Hortonworks, Inc.
Owen O'Malley is a co-founder and technical fellow at Hortonworks, a rapidly growing company (25 to 1,000 employees in 5 years) that develops the completely open source Hortonworks Data Platform (HDP). HDP includes Hadoop and the large ecosystem of big data tools that enterprises need for their data analytics. Owen has been working on Hadoop since the beginning of 2006 at Yahoo, was the first committer added to the project, and used Hadoop to set the Gray sort benchmark in 2008 and 2009. In the last 8 years, he has been the architect of MapReduce, Security, and now Hive. Recently he has been driving the development of the ORC file format and adding ACID transactions to Hive. Before working on Hadoop, he worked on Yahoo Search's WebMap project, which was the original motivation for Yahoo to work on Hadoop. Prior to Yahoo, he wandered between testing (UCI), static analysis (Reasoning), configuration management (Sun), and software model checking (NASA). He received his PhD in Software Engineering from the University of California, Irvine.
Srikanth Venkat
Senior Director, Product Management
Hortonworks, Inc.
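Srikanth Venkat currently works on the security and governance portfolio of products at Hortonworks, which includes Apache Knox, Apache Ranger, Apache Atlas, platform-wide security, and Hortonworks DataPlane Service. Before joining Hortonworks, he held a variety of roles in areas such as cloud services, marketplaces, security, and business applications. Srikanth has leadership experience in areas ranging from product management to strategy and operations and technical architecture, and has worked at organizations from startups to global enterprises, including Telefonica, Salesforce, Cisco-Webex, Proofpoint, Dataguise, Trilogy Software, and Hewlett-Packard. He holds a PhD in engineering with a focus on artificial intelligence from the University of Pittsburgh, an MBA in General Management from Indiana University, and a master's degree in global management from the Thunderbird School of Global Management. In his spare time he pursues data science and machine learning and enjoys tinkering with big data technologies.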