A Birds of a Feather (BoF) session is an informal discussion group. DataWorks will sponsor several BoF meeting groups, hosted by Apache committers, architects, tech leads, and engineers. Attendees group together based on a shared interest and carry out discussions without any pre-planned agenda. Each group has hosts who moderate the discussion.
Come join the discussion and share your experiences, challenges, future interests, and requirements for key Apache and other open source projects, and discuss what’s on the roadmap and future design options.
Date: Wednesday, May 22
Room: Check the agenda or the DataWorks Summit Mobile App
Apache Hive is the de facto standard for SQL queries in Hadoop. With the next phase of SQL in Hadoop, the Apache community has greatly improved Hive’s speed (LLAP), scale and SQL semantics. Come learn and discuss what is new in Hive 3.0.
Apache Druid is an open source, column-oriented, distributed data store designed for OLAP queries on event data. Druid supports interactive, horizontally scalable queries on real-time streams. Druid has rich client libraries and integrates with tools like Pivot and Apache Superset. Come learn about the latest developments in Druid and Hive/Druid integration.
Come learn and discuss the latest innovations and future direction in Apache Spark, Apache Zeppelin, and other ecosystem tools for machine learning operations.
Like-minded individuals within the public sector will be discussing the latest trends and innovations in open source Big Data, advanced analytics, and data science technologies: how they are being applied in government agencies today, and where they can take mission-critical initiatives to enable transformative change in the future. Bring your questions and share your thoughts on Security and Governance, AI and Machine Learning, Data Engineering & Science, Cloud & Operations, and IoT Streaming & Data Flow. Focus technologies include: Apache Spark, Apache Hadoop (YARN, HDFS, Ozone), AWS, Apache Ambari, Cloudbreak, Apache Hive, Apache Ranger, and Apache NiFi.
Apache NiFi, Apache Kafka, Apache Storm, Apache Spark Streaming, and many other real-time processing tools provide the foundation for data processing in IoT. Come learn and discuss the latest streaming and data flow innovations and future directions.
Most enterprises are on an exciting journey to complement their on-prem environment with the cloud’s flexibility, scale, and agility for big data workloads and associated data. Data management in such hybrid environments is fraught with challenges. As data is migrated or replicated between environments, it becomes difficult to manage and homogenize the data context across them throughout the replication and data movement processes. Once an enterprise is operating in a hybrid cloud environment, maintaining consistent security and governance across all environments, so that data can be managed seamlessly and uniformly, poses even bigger challenges. Come learn, discuss, and share your experience and insights on how to navigate this hybrid enterprise data cloud journey with uniform security and governance, and on how innovations in open source communities can help enterprises accelerate this journey in the age of GDPR, CCPA, and various industry regulations and standards, both today and in the future.