Lightning Fast Analytics with Hive LLAP and Druid

Lightning Fast Analytics with Hive LLAP and Druid

Thursday, May 23
2:50 PM - 3:30 PM
Marquis Salon 9

Cox Communications, one of the largest network providers in the U.S., is primarily focused on ensuring network security and providing better service to customers including:
• Real-time monitoring of IP security traffic to identify and alert the unusual network activities across interfaces within an organization
• Enrich the security team with capabilities to determine the source and destination of traffic, class of service, and the causes of congestion on NetFlow data

Data related to Network Security includes more granular streaming data. The major challenge lies in having an unified platform to perform data cleansing, transformation, analytics and reporting on this huge streaming datasets. With the growing network traffic, there is an exponential growth with the associated data. There is a need for Scalable framework to handle these datasets and derive useful information out of data. Along with data processing, data retrieval also plays a major role for better analysis. Currently Data processing was done in daily batch using manual python scripts and with implementation of custom data structures which were specific to use cases. There was a need for more generic and unified framework to provide automated real time end to end solution to obtain high performing, more granular business results.

Automation of this process has opportunities on several fronts, notably, providing consistency, repeat-ability, and modernization of OLAP analytics on enterprise big data platform. Reports can be generated easier and faster with the underlying OLAP engine.
• Modern Big Data Platform provides the necessary tool and infrastructure to land, cleanse, process Real time stream data processing and enriching data using the ecosystem components like Spark, Kafka, Hive
• Impressively faster OLAP analytics using Hive LLAP and Druid Integration
• Simple and faster reporting using Superset

All of the necessary components under one roof of Hortonworks Hadoop Platform.
An end-to-end solution using Big Data platform produced faster and repeatable results with sub second query results.
Value Additions by above solution:

• Deliver ultra-fast SQL analytics that can be consumed from the BI tool by security engineering team to get accelerated business results
• Opportunity for business users to explore and visualize real time streaming datasets with integration for various data sources and build dashboards for different slices
• Capability to run BI queries in just milliseconds over 1TB dataset
• High granular permission model on security datasets that allow intricate rules on accessibility for the datasets


Pavan Chandrashekar
Senior Big Data Engineer
Cox Communications
Experienced Software development professional with a strong exposure in various big data technologies. Skilled in Hadoop eco system components(HDFS, MapReduce, Pig, Hive, SQOOP), Cassandra, Spark, Core Java, Scala, Relational databases and Data warehousing, and also possess good skills in various SDLC methodologies.
Rohit Kuruppath
Lead Big Data Engineer
Experienced software development professional with a strong exposure across big data technologies including HDF, Spark, HBase, Pheonix and expertise in building end to end Datawarehousing/BI solutions