Sessions

    • AI and Data Science
    • Big Compute and Storage
    • Cloud and Operations
    • Cybersecurity
    • Data Warehousing and Operational Data Store
    • Enterprise Adoption
    • Governance and Security
    • IoT and Streaming
Artificial Intelligence (AI) is transforming every industry. Data science and machine learning are opening new doors in process automation, predictive analytics, and decision optimization. This track offers sessions spanning the entire data science lifecycle: development, test, and production. You’ll see examples of innovative analytics applications and systems for data visualization, statistics, machine learning, cognitive systems, and deep learning. We’ll show you how to use modern open source workbenches to develop, test, and evaluate advanced AI models before deploying them. You’ll hear from leading researchers, data scientists, analysts, and practitioners who are driving innovation in AI and data science.

Sample technologies: Apache Spark, R, Apache Livy, Apache Zeppelin, Jupyter, scikit-learn, Keras, TensorFlow, DeepLearning4J, Chainer, Lasagne/Blocks/Theano, CaffeOnSpark, Apache MXNet, and PyTorch/Torch
Apache Hadoop continues to drive data management innovation at a rapid pace. Hadoop 3.0 adds container management to YARN, an object store to HDFS, and more. This track presents these advances and describes projects in incubation and the industry initiatives driving innovation in and around the Hadoop platform.

You’ll learn about key projects like HDFS, YARN, and related technologies. You’ll interact with technical leads, committers, and experts who are driving the roadmaps, key features, and advanced technology research around what is coming next and the extended open source big compute and storage ecosystem.

Sample technologies: Apache Hadoop (YARN, HDFS, Ozone), Apache Kudu, Kubernetes, Apache BookKeeper
For a system to be “open for business,” system administrators must be able to efficiently manage and operate it. That requires a comprehensive dataflow and operations strategy. This track provides best practices for deploying and operating data lakes, streaming systems, and the extended Apache data ecosystem on premises and in the cloud. Sessions cover the full deployment lifecycle, including installation, configuration, initial production deployment, upgrading, patching, loading, moving, backup, and recovery. You’ll discover how to get started and how to operate your cluster. Speakers will show how to set up and manage high-availability configurations and how DevOps practices can help speed solutions into production. They’ll explain how to manage data across the edge, the data center, and the cloud. And they’ll offer cutting-edge best practices for large-scale deployments.

Sample technologies: Apache Ambari, Cloudbreak, HDInsight, HDCloud, Data Plane Service, AWS, Azure, and Apache Oozie
The speed and scale of recent ransomware attacks and cybersecurity breaches have taught us that threat detection and mitigation are the key to security operations in data-driven businesses. Creating cybersecurity machine learning models and deploying these models in streaming systems is becoming critical to defending against and managing these growing threats. In this track, you’ll learn how to leverage big data and stream processing to improve your cybersecurity. Experts will explain how to scale with analytics on more data and react in real time.

Sample technologies: Apache Metron, Apache Spot
Apache Hadoop YARN has transformed Hadoop into a multi-tenant data platform that enables the interaction of legacy data stores and big data. It is the foundation for multiple processing engines that let applications interact with data in the most appropriate way, from batch to interactive SQL to low-latency access with NoSQL. Sessions will cover the vast ecosystem of SQL engines and tools that enable richer enterprise data warehousing (EDW) on Hadoop. You’ll learn how NoSQL stores like Apache HBase are adding transactional capability that brings traditional operational data store (ODS) workloads to Hadoop and why data preparation is a key workload. You’ll meet Apache community rock stars and learn how these innovators are building the applications of the future.

Sample technologies: Apache Hive, Apache Tez, Apache ORC, Druid, Apache Parquet, Apache HBase, Apache Phoenix, Apache Accumulo, Apache Drill, Presto, Apache Pig, JanusGraph, Apache Impala
Enterprise business leaders and innovators are using data to transform their businesses. These modern data applications are augmenting traditional architectures and extending the reach for insights from the edge to the data center. Sessions in this track will discuss business justification and ROI for modern data architectures. You’ll hear from ISVs and architects who have created applications, frameworks, and solutions that leverage data as an asset to solve real business problems. Speakers from companies and organizations across industries and geographies will describe their data architectures, the business benefits they’ve experienced, their challenges, secrets to their successes, use cases, and the hard-fought lessons learned in their journeys.
Your data lake contains a growing volume of diverse enterprise data, so a breach could be catastrophic. Privacy violations and regulatory infractions can damage your corporate image and long-term shareholder value. Government and industry regulations demand you properly secure and govern your data to assure compliance and mitigate risks. But as Hadoop and streaming applications emerge as a critical foundation of a modern data architecture, enterprises face new requirements for protection and governance.

In this track, you’ll learn about the key enterprise requirements for governance and security of the extended data plane. You’ll hear best practices, tips, tricks, and war stories on how to secure and govern your big data infrastructure.

Sample technologies: Apache Ranger, Apache Sentry, Apache Atlas, and Apache Knox
The rapid proliferation of sensors and connected devices is fueling an explosion in data. Streaming data allows algorithms to dynamically adapt to new patterns in data, which is critical in applications like fraud detection and stock price prediction. Deploying real-time machine learning models in data streams enables insights and interactions not previously possible. In this track, you’ll learn how to apply machine learning to capture perishable insights from streaming data sources and how to manage devices at the “jagged edge.” Sessions present new strategies and best practices for data ingestion and analysis. Presenters will show how to use these technologies to develop IoT solutions and how to combine historical with streaming data to build dynamic, evolving, real-time predictive systems for actionable insights.

Sample technologies: Apache NiFi, Apache Storm, Streaming Analytics Manager, Apache Flink, Apache Spark Streaming, Apache Beam, Apache Pulsar, and Apache Kafka

SUN, JUN 17
8:30 AM
Training
8 HOURS 30 MIN
MON, JUN 18
8:30 AM
HBaseCon/PhoenixCon
8 HOURS 30 MIN
8:30 AM
Training
8 HOURS 30 MIN
TUE, JUN 19
12:30 PM
Community Expo Lunch
1 HOUR 30 MIN
12:30 PM
Special Events
Women in Big Data Luncheon
4:00 PM
Breakout Session
Open source computer vision with TensorFlow, Apache MiNiFi, Apache NiFi, OpenCV, Apache Tika, and Python
Timothy Spann, Hortonworks
Grand Ballroom 220A
Technical
40 MIN
PayPal merchant ecosystem using Apache Spark, Hive, Druid, and HBase
Kasiviswanathan Natarajan, PayPal Inc
Grand Ballroom 220B
Technical
40 MIN
Uncovering an Apache Spark 2 Benchmark - Configuration, Tuning and Test Results
Mark Lochbihler, Hortonworks
Grand Ballroom 220C
Technical
40 MIN
Harnessing the Power of Big Data at Freddie Mac
Dennis Tally, Freddie Mac
Meeting Room 230A
Business
40 MIN
Cloud Storage: PUT is the new rename()
Steve Loughran, Hortonworks
Meeting Room 211A/B/C/D
Technical
40 MIN
Enabling ABAC with Accumulo and Ranger integration
John Highcock, Hortonworks
Executive Ballroom 210A/E
Technical
40 MIN
Building a Modern Data Warehouse on Microsoft Azure with Azure HDInsight and Azure Databricks
Arindam Chatterjee, Microsoft
Executive Ballroom 210B/F
Technical
40 MIN
Securing data in hybrid environments using Apache Ranger
Don Bosco Durai, Privacera
Executive Ballroom 210C/G
Business
40 MIN
Landuse Classification from Satellite Imagery using Deep Learning
Suneel Marthi, Amazon Web Services
Executive Ballroom 210D/H
Technical
40 MIN
DISCOVER with Data Steward Studio: Understanding and unlocking the value of data in hybrid enterprise data lake environments
Srikanth Venkat, Hortonworks Inc
Meeting Room 230C
Business Technical
40 MIN
San Antonio’s electric utility making big data analytics the business of the people, for the people
Rolando Vega, CPS Energy
Meeting Room 230B
Business
40 MIN
7:00 PM
Sponsor Reception
1 HOUR 30 MIN
WED, JUN 20
7:30 AM
Industry Roundtables
1 HOUR 30 MIN
12:30 PM
Community Expo Lunch
1 HOUR 30 MIN
4:00 PM
Breakout Session
ExxonMobil’s journey to unleash time-series data with open source technology
Kevin Brown, ExxonMobil
Grand Ballroom 220A
Technical
40 MIN
Securing Data in Hadoop at Uber
Mohammad Islam, Uber Inc
Grand Ballroom 220B
Technical
40 MIN
Running Hive queries fast in the cloud
Nita Dembla, Hortonworks
Grand Ballroom 220C
Technical
40 MIN
AI from your data lake: Using Solr for analytics
Cassandra Targett, Lucidworks
Meeting Room 230A
Technical
40 MIN
High Performance and Scalable Geospatial Analytics on Cloud with Open Source
Constantin Stanca, Hortonworks
Meeting Room 211A/B/C/D
Business
40 MIN
Big Data Analytics from Edge to Core
Bob Mumford, Hewlett Packard Enterprise
Executive Ballroom 210A/E
Technical
40 MIN
Leveraging advanced technologies to support critical applications in a secure, HIPAA-compliant hybrid Cloud
Sandeep Chandra, San Diego Supercomputer Center
Executive Ballroom 210B/F
Technical
40 MIN
How a major bank leveraged Apache Spark and StreamAnalytix to rapidly re-build their Insider Threat Detection application
Anand Venugopal, IMPETUS
Executive Ballroom 210C/G
Business Technical
40 MIN
Bridging the gap: achieving fast data synchronization from SAP HANA by leveraging Hadoop
John Kuchmek, American Water
Executive Ballroom 210D/H
Technical
40 MIN
Big data processing meets non-volatile memory: opportunities and challenges
Shashank Gugnani, The Ohio State University
Meeting Room 230C
Technical
40 MIN
Achieving a 360 degree view of manufacturing
Michael Ger, Hortonworks
Meeting Room 230B
Business
40 MIN
4:40 PM
Breakout Session
Running Apache Hadoop on the Google Cloud Platform
Siddharth Seth, Hortonworks
Grand Ballroom 220A
Technical
50 MIN
Detecting real-time market manipulation in decentralized cryptocurrency exchanges
Kat Petre, Hortonworks
Grand Ballroom 220B
Technical
50 MIN
Real-time Freight Visibility: How TMW Systems uses NiFi and SAM to create sub-second transportation visibility
Krishna Potluri, TMW Systems, A Trimble Company
Grand Ballroom 220C
Technical
50 MIN
How to use flash drives with Apache Hadoop 3.x: Real world use cases and proof points—better results, better economics
Saumitra Buragohain, Hortonworks
Meeting Room 230A
Technical
50 MIN
NameNode Analytics – Scouting the HDFS Metadata
Plamen Jeliazkov, PayPal
Meeting Room 211A/B/C/D
Technical
50 MIN
Breathing New Life into Apache Oozie with Apache Ambari Workflow Manager
Artem Ervits, Hortonworks
Executive Ballroom 210A/E
Technical
50 MIN
Exploiting machine learning to keep Hadoop clusters healthy
Dheeraj Kapur, Oath
Executive Ballroom 210B/F
Technical
50 MIN
Data transformation and its impact on our digital transformation
Giovanni Pizzoferrato, Canadian Tire Corporation
Executive Ballroom 210C/G
Business
50 MIN
Running Enterprise Workloads in the Cloud
Jeff Sposetti, Hortonworks
Executive Ballroom 210D/H
Technical
50 MIN
Risk listening: monitoring for profitable growth
Cindy Maike, Hortonworks
Meeting Room 230C
Business
50 MIN
Network planning automation using modern big data platform
Chaitanya Vasamsetty, Cox Communications
Meeting Room 230B
Technical
50 MIN
THU, JUN 21
9:30 AM
Breakout Session
Running secured Spark job in Kubernetes compute cluster and integrating with Kerberized HDFS
Joy Chakraborty, NA
Grand Ballroom 220A
Technical
40 MIN
Practice of large Hadoop cluster in China Mobile
Yuxuan Pan, China Mobile (Suzhou) Software Technology Co., Ltd
Grand Ballroom 220B
Business
40 MIN
Highly configurable and extensible data processing framework at PubMatic
Kunal Umrigar, PubMatic
Grand Ballroom 220C
Business Technical
40 MIN
GDPR Community Showcase for Apache Ranger and Apache Atlas
Ali Bajwa, Hortonworks
Meeting Room 230A
Technical
40 MIN
Seamless replication and disaster recovery for Apache Hive Warehouse
Sankar Hariappan, Hortonworks
Meeting Room 211A/B/C/D
Technical
40 MIN
IIoT + Predictive Analytics: Solving for Disruption in Oil & Gas and Energy & Utilities
Kenneth Smith, Hortonworks
Executive Ballroom 210A/E
Business
40 MIN
The columnar roadmap: Apache Parquet and Apache Arrow
Julien Le Dem, WeWork
Executive Ballroom 210B/F
Technical
40 MIN
Quick! Quick! Exploration!: A framework for searching a predictive model on Apache Spark
Masato Asahara, NEC System Platform Research Laboratories
Executive Ballroom 210C/G
Technical
40 MIN
YARN federation: taming a beasty fleet with global optimizations
Carlo Curino, Microsoft
Executive Ballroom 210D/H
Technical
40 MIN
Enhance User Experience & Increase Revenue by Placing Targeted Video Advertisements
Pavan Surapaneni, Cox Communications
Meeting Room 230C
Technical
40 MIN
DataSketch based aggregations and windowing in a streaming query system
Akshai Sarma, Yahoo
Meeting Room 230B
Technical
40 MIN