Counting Unique Users in Real-Time: Here's a Challenge for You!

Wednesday, March 20
11:00 AM - 11:40 AM
Room 129-130

Finding the number of unique users out of 10 billion events per day is challenging. In this session, we will describe how re-architecting our data infrastructure around Druid and ThetaSketch enables our customers to obtain these insights in real time.

To put things into context, at NMC (Nielsen Marketing Cloud) we provide our customers (marketers and publishers) with real-time analytics tools to profile their target audiences. Specifically, we give them the ability to see the number of unique users who meet a given criterion.

Historically, we used Elasticsearch to answer these types of questions; however, we encountered major scaling and stability issues.

In this presentation, we will detail the journey of rebuilding our data infrastructure, including researching, benchmarking, and productionizing a new technology, Druid, with ThetaSketch, to overcome the limitations we were facing.

We will also provide guidelines and best practices for working with Druid.

Topics include:
* The need and possible solutions
* Intro to Druid and ThetaSketch
* How we use Druid
* Guidelines and pitfalls

Speakers

Itai Yaffe
Big Data Tech Lead
Nielsen
I am a Big Data Tech Lead at the Nielsen Marketing Cloud. I have been dealing with Big Data challenges for the past 6 years, using tools like Spark, Druid, Kafka, and others. I'm keen on sharing my knowledge and have presented my real-life experience in various forums (e.g., meetups and conferences).
Yakir Buskilla
Director, Big Data
Nielsen
Yakir Buskilla is a Director of Big Data at the Nielsen Marketing Cloud. His fields of interest are Big Data solutions and large-scale machine learning.