Containers and Big Data

Wednesday, February 6
2:00 PM - 2:40 PM
Room 101/102

As containerization continues to gain momentum and become a de facto standard for application deployment, challenges around containerization of big data workloads are coming to light. Great strides have been made within the open source communities towards running big data workloads in containers, but much is left to be done.

Apache Hadoop YARN is the modern distributed operating system for big data applications. It has morphed the Hadoop compute layer into a common resource-management platform that can host a wide variety of applications. At its core, YARN has a very powerful scheduler that enforces global cluster-level invariants and helps sites manage user and operator expectations of elastic sharing, resource usage limits, SLAs, and more. YARN recently increased its support for Docker containerization and added a YARN service framework supporting long-running services.

In this session we will explore the emerging patterns and challenges related to containers and big data workloads, including running applications such as Apache Spark, Apache HBase, and Kubernetes in containers on YARN.


Hortonworks, Inc.
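Sanjay is a founder and chief architect of Hortonworks, an Apache Hadoop committer, and a member of the Apache Hadoop PMC. Prior to co-founding Hortonworks, Sanjay was the chief architect of core Hadoop at Yahoo, where he worked on the design of Hadoop. Within Hadoop, he has contributed to many areas, including HDFS, the MapReduce scheduler, the design of YARN, high availability, and compatibility. He has also held senior engineering positions at Sun Microsystems and INRIA, where he developed software for distributed systems and grid/utility computing infrastructures. Sanjay holds a PhD in computer science from the University of Waterloo in Canada.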