Putting Apache Kafka to Use: Building a Real-Time Data Platform for Event Streams
March 23, 2015 @ 6:45 pm
Jay Kreps, Co-founder & CEO, Confluent
*** Bring ID (e.g. Driver’s License) for eBay Security ***
6:30 Doors Open, Food & Networking
*** Please arrive by 7 PM due to Security ***
What happens if you take everything that is happening in your company—every click, every database change, every application log—and make it all available as a real-time stream of well-structured data?
I will discuss the experience at LinkedIn and elsewhere moving from batch-oriented ETL to real-time streams using Apache Kafka. I'll talk about how the design and implementation of Kafka were driven by this goal of acting as a real-time platform for event data. I will cover some of the challenges of scaling Kafka to hundreds of billions of events per day at LinkedIn while supporting thousands of engineers, applications, and data systems in a self-service fashion.
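One property that makes Kafka work as an event-data platform is that events carrying the same key (say, a member ID) are routed to the same topic partition, preserving per-key ordering. Below is a minimal, self-contained sketch of that idea; real Kafka's default partitioner uses a murmur2 hash inside the client library, while this illustration uses `md5` from the standard library, and the topic name and partition count are made up for the example.

```python
import hashlib

NUM_PARTITIONS = 12  # hypothetical partition count for a "page_views" topic

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map an event key to a partition the way Kafka's default
    partitioner does (hash of the key modulo partition count).
    Kafka uses murmur2; md5 stands in here for illustration.
    All events with the same key land in the same partition,
    so per-key ordering is preserved."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions

# Every click from the same member routes to the same partition.
p1 = partition_for("member-42")
p2 = partition_for("member-42")
assert p1 == p2
```

The design consequence is that consumers can scale out by partition while still seeing each key's events in order.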
I will describe how real-time streams can become the source of ETL into Hadoop or a relational data warehouse, and how real-time data can supplement the role of batch-oriented analytics in Hadoop or a traditional data warehouse.
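The "streams as the source of ETL" idea can be sketched in a few lines: a consumer reads the event stream and groups records into time-bucketed batches of the shape a load job would write to HDFS or a warehouse table. This is a toy illustration, not Kafka client code; the event records and field names are invented for the example, and in production the stream would be consumed from a Kafka topic rather than an in-memory list.

```python
import json
from collections import defaultdict

# A hypothetical slice of a real-time event stream.
events = [
    {"ts": "2015-03-23T18:05:00", "type": "click", "page": "/jobs"},
    {"ts": "2015-03-23T18:40:00", "type": "click", "page": "/feed"},
    {"ts": "2015-03-23T19:10:00", "type": "view",  "page": "/jobs"},
]

def batch_by_hour(stream):
    """Group streamed events into hourly batches: the unit a
    micro-batch ETL job would flush to HDFS or a warehouse table."""
    batches = defaultdict(list)
    for event in stream:
        hour = event["ts"][:13]  # e.g. "2015-03-23T18"
        batches[hour].append(json.dumps(event))
    return dict(batches)

batches = batch_by_hour(events)
# Two hourly batches: hour 18 holds two events, hour 19 holds one.
```

The same stream can feed both this batch path and real-time consumers, which is what lets streaming supplement rather than replace batch analytics.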
I will also describe how applications and stream processing systems such as Storm, Spark, or Samza can make use of these feeds for sophisticated real-time data processing as events occur.
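What a stream processor does with such a feed can be shown with a toy stateful aggregation: a running count of page views per page, the kind of continuously updated state Storm, Spark Streaming, or Samza would maintain as events arrive. This is a plain-Python illustration of the concept, not any of those frameworks' APIs, and the click records are invented for the example.

```python
from collections import Counter

def count_views(stream):
    """A toy stateful stream computation: running page-view counts
    per page, updated incrementally as each event is consumed."""
    counts = Counter()
    for event in stream:
        counts[event["page"]] += 1
    return counts

clicks = [{"page": "/jobs"}, {"page": "/feed"}, {"page": "/jobs"}]
counts = count_views(clicks)
# counts["/jobs"] == 2, counts["/feed"] == 1
```

In a real framework the loop body would run continuously against the Kafka feed and the state would be checkpointed, but the per-event update logic is the same.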
Jay is the CEO of Confluent, a company focused on building a real-time stream platform around Apache Kafka.
Previously, he was one of the primary architects at LinkedIn, where he focused on data infrastructure and data-driven products.
He was among the original authors of a number of open source projects in the scalable data systems space, including Voldemort, Azkaban, Kafka, and Samza.