BUILDING
EVENT-DRIVEN SYSTEMS
WITH APACHE KAFKA
BRIAN RITCHIE
CTO, XEOHEALTH
2016
@brian_ritchie
brian.ritchie@gmail.com
http://www.dotnetpowered.com
EVENT-DRIVEN SYSTEMS
Definition
Event-driven architecture, also known as message-driven architecture, is
a software architecture pattern promoting the production, detection,
consumption of, and reaction to events. An event can be defined as "a
significant change in state".
https://en.wikipedia.org/wiki/Event-driven_architecture
EVENT-DRIVEN SYSTEMS ARE ABOUT UNLOCKING DATA
• Data is the driving force behind innovation
• Event-driven systems allow you to unlock the data –
and unlock the innovation.
EVENTS ARE THE “WHAT HAPPENED” DATA
• It’s about recording “what happened”, but not coupling it to the “how”
• It’s the “transactions” of your system
• Product Views
• Completed Sales
• Page Visits
• Site Logins
• Shipping Notifications
• Inventory Received
• IoT
• …and much more
EVENTS – A HEALTHCARE EXAMPLE
What happened? A patient received a set of services.
[Diagram: a Healthcare Claim event enters the Event Stream, which feeds Fraud Detection, Data Lake Archive, Disease Trending, Contract & Pricing, and more…]
You don’t need to integrate with consumers or even know about future uses of your data.
EVENT-DRIVEN SYSTEMS MAKE SCALABILITY EASIER
• Scalability of processing
• Scalability of design
• Scalability of change
EVENT-DRIVEN SYSTEMS REQUIRE INFRASTRUCTURE
• Queue / Stream
• Persistence
• Distribution
• Pub / Sub
APACHE KAFKA IS THE INFRASTRUCTURE
• Apache Kafka is publish-subscribe messaging rethought as a distributed
commit log.
• Developed by LinkedIn
• Written in Scala and Java
• Open sourced in 2011 and graduated from the Apache Incubator in 2012
• Unique features of Kafka
• Super fast
• Distributed & Replicated out of the box
• Extremely low cost
WHO USES APACHE KAFKA?
A few small companies you might have heard of…
MICROSOFT SUPPORTS KAFKA
Microsoft ♥ Linux
Microsoft ♥ Open Source
Nearly 1 in 3 VMs are Linux
Microsoft moves to GitHub
Microsoft sponsors the Kafka Summit, releases a Kafka .NET driver on GitHub, and
even buys LinkedIn. That is some Kafka love.
APACHE KAFKA – PERFORMANCE
Kafka performs amazingly well on modest hardware.
https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
Test on the LinkedIn Engineering Blog:
- 3 machines in the Kafka cluster, 3 to generate load
- 6 SATA drives each, 32 GB RAM each
- 1 Gb Ethernet
[Chart: producers and consumers simultaneously accessing the cluster]
APACHE KAFKA – PERFORMANCE
Microsoft runs one of the largest Kafka installations, called “Siphon”
http://www.confluent.io/kafka-summit-2016-users-siphon-near-rea-time-databus-using-kafka
• 1.3 million events per second at peak
• ~1 trillion events per day at peak
• 3.5 petabytes processed per day
• 1,300 production brokers
APACHE KAFKA – PERFORMANCE
Microsoft runs one of the largest Kafka installations, called “Siphon”
http://www.confluent.io/kafka-summit-2016-users-siphon-near-rea-time-databus-using-kafka
https://github.com/Microsoft/Availability-Monitor-for-Kafka
Availability & Latency monitor for Kafka using Canary messages
APACHE KAFKA – ARCHITECTURE
[Diagram: two producers publish to a Kafka cluster of three brokers; three consumers read from it; ZooKeeper sits alongside the cluster]
Producers publish messages to a Kafka topic
Consumers subscribe to topics and process messages
A Kafka cluster is made up of one or more brokers (nodes)
Kafka uses ZooKeeper for configuration
APACHE KAFKA – ROLE OF ZOOKEEPER
What is ZooKeeper?
ZooKeeper is a centralized service for maintaining
configuration information, naming, providing distributed
synchronization, and providing group services to
distributed applications.
Role of ZooKeeper in Kafka
In the Kafka 0.8.x releases used in this deck, it is responsible for maintaining
consumer offsets and topic lists, leader election, and general state information.
Apache ZooKeeper
zk-web: Web UI for ZooKeeper
https://github.com/qiuxiafei/zk-web
Or get the Docker container
APACHE KAFKA – TOPICS
[Diagram: a Kafka topic with three partitions (Partition 0, Partition 1, Partition 2), each an ordered sequence of offsets starting at 0; two producers write to the partitions and two consumers read from them]
Producers write messages to the end of a partition
• Messages can be round-robin load balanced across partitions or assigned by a function.
Consumers read from the lowest offset to the highest
• Unlike most queuing systems, state is not maintained on the server. Each consumer tracks its own offset.
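The write and read behavior described on this slide can be sketched with a toy in-memory model. This is not Kafka client code: the `Topic` and `Consumer` classes and their methods are hypothetical names invented for illustration, and the sketch assumes round-robin placement only.

```python
from itertools import cycle


class Topic:
    """Toy in-memory topic: each partition is an append-only list."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]
        self._round_robin = cycle(range(num_partitions))

    def send(self, message):
        """Append to the end of the next partition in round-robin order."""
        partition = next(self._round_robin)
        self.partitions[partition].append(message)
        offset = len(self.partitions[partition]) - 1
        return partition, offset


class Consumer:
    """Reads one partition from the lowest offset upward and keeps its own position."""

    def __init__(self, topic, partition):
        self.topic = topic
        self.partition = partition
        self.offset = 0  # tracked by the consumer, not by the topic

    def poll(self):
        """Return everything after the last position read, then advance."""
        log = self.topic.partitions[self.partition]
        batch = log[self.offset:]
        self.offset = len(log)
        return batch


topic = Topic(num_partitions=3)
placements = [topic.send(f"event-{i}") for i in range(6)]
reader = Consumer(topic, partition=0)
first_batch = reader.poll()
second_batch = reader.poll()
```

Because the position lives in the consumer, a second reader could start at offset 0 on the same partition and see the same messages again without affecting the first.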
APACHE KAFKA – MORE ON PARTITIONS
Partitions for scalability
• More partitions let more consumers read in parallel, so consuming throughput grows with the partition count.
• Each partition must fit entirely on a single server.
Partitions for ordering
• Kafka only guarantees message order within the same partition.
• If you need strong ordering, make sure that related data is pinned to a single partition based on a key.
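A minimal sketch of pinning data to a partition by key. It assumes a CRC32 hash as a stand-in for whatever partitioning function a real client uses; `partition_for`, `route`, and the patient keys are hypothetical names chosen for illustration.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash (CRC32 is a stand-in here)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events, num_partitions):
    """Append (key, value) events to per-partition lists, preserving arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


events = [
    ("patient-17", "admitted"),
    ("patient-42", "admitted"),
    ("patient-17", "lab-ordered"),
    ("patient-42", "discharged"),
    ("patient-17", "discharged"),
]
partitions = route(events, num_partitions=3)
```

Every event for a given key lands in the same partition, so a consumer of that partition sees that key's events in the order they were produced. Events for different keys may sit in different partitions, and no ordering holds between them.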
APACHE KAFKA – PERSISTENCE
[Diagram: a Kafka topic with Partition 0, Partition 1, and Partition 2, each holding messages at offsets 0 and up]
All messages are written to disk and replicated.
Messages are not removed from Kafka when they are read from a topic.
A cleanup process removes old messages based on a sliding time window (the retention period).
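The retention behavior can be sketched as follows. This is a simplified model, not Kafka's implementation: `RetainedLog` is a hypothetical class, timestamps are plain numbers in seconds, and messages are dropped one by one, whereas Kafka removes whole log segments.

```python
class RetainedLog:
    """Toy partition log that keeps messages until they age out."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.entries = []  # (timestamp, message) in append order

    def append(self, timestamp, message):
        self.entries.append((timestamp, message))

    def read_all(self):
        """Reading returns messages but never deletes them."""
        return [message for _, message in self.entries]

    def cleanup(self, now):
        """Drop entries older than the retention window ending at `now`."""
        cutoff = now - self.retention_seconds
        self.entries = [(ts, msg) for ts, msg in self.entries if ts >= cutoff]


log = RetainedLog(retention_seconds=60)
log.append(0, "a")
log.append(30, "b")
log.append(90, "c")
before = log.read_all()
again = log.read_all()   # a second read sees the same messages
log.cleanup(now=100)     # only entries from time 40 onward survive
after = log.read_all()
```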
APACHE KAFKA – CONSUMER GROUPS
[Diagram: a Kafka topic with Partition 0 through Partition 3, read by the consumers of Consumer Group 1 and Consumer Group 2]
Each consumer group is a “logical subscriber”.
Messages are processed in parallel by the consumers in a group.
Within a consumer group, each partition is assigned to only one consumer.
Note: consumers are responsible for handling duplicate
messages. These could be caused by failures of another
consumer in the group.
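A rough sketch of both ideas on this slide: one owner per partition inside a group, and a consumer that tolerates duplicate deliveries. The round-robin assignment, the `IdempotentHandler` class, and the message ids are assumptions made for illustration; a real group coordinator and real message identifiers will differ.

```python
def assign_partitions(partitions, consumers):
    """Give each partition to exactly one consumer in the group (round-robin)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


class IdempotentHandler:
    """Skips messages whose id was already processed."""

    def __init__(self):
        self.seen_ids = set()
        self.processed = []

    def handle(self, message_id, payload):
        if message_id in self.seen_ids:
            return False  # duplicate delivery, ignore it
        self.seen_ids.add(message_id)
        self.processed.append(payload)
        return True


group = assign_partitions([0, 1, 2, 3], ["consumer-1", "consumer-2"])
# If consumer-1 fails, its partitions move to the surviving consumer.
after_failure = assign_partitions([0, 1, 2, 3], ["consumer-2"])

handler = IdempotentHandler()
deliveries = [(1, "claim-A"), (2, "claim-B"), (2, "claim-B"), (3, "claim-C")]
results = [handler.handle(mid, body) for mid, body in deliveries]
```

When partitions move after a failure, the new owner may re-read messages the failed consumer had already handled but not yet recorded, which is why the handler checks ids before processing.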
APACHE KAFKA – SERIALIZATION
Pick a format!
• JSON
• BSON
http://bsonspec.org/implementations.html
• PROTOCOL BUFFERS
https://github.com/google/protobuf
• BOND
https://github.com/Microsoft/bond
• AVRO
https://avro.apache.org/index.html
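Whichever format you pick, producer and consumer have to agree on it, because the message body travels as bytes. A minimal sketch using JSON from the Python standard library; the event fields (`type`, `user`, `timestamp`) are hypothetical and not taken from the sample code.

```python
import json


def serialize(event):
    """Encode an event dict as UTF-8 JSON bytes, the form a producer would send."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def deserialize(payload):
    """Decode UTF-8 JSON bytes back into a dict on the consumer side."""
    return json.loads(payload.decode("utf-8"))


event = {"type": "SiteLogin", "user": "alice", "timestamp": 1467331200}
payload = serialize(event)
restored = deserialize(payload)
```

JSON is easy to inspect but carries no schema; the binary formats in the list above trade readability for smaller payloads and explicit schemas.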
APACHE KAFKA – GETTING STARTED
Install Kafka & ZooKeeper
https://dzone.com/articles/running-apache-kafka-on-windows-os
• Install JDK
• Install ZooKeeper
• Install Kafka
Start Kafka & ZooKeeper
Start ZooKeeper
C:\bin\zookeeper-3.4.8\bin>zkServer.cmd
Start Kafka
C:\bin\kafka_2.11-0.8.2.2>.\bin\windows\kafka-server-start.bat .\config\server.properties
APACHE KAFKA – GETTING STARTED
Create a topic
kafka-topics.bat --create --zookeeper localhost:2181
--replication-factor 1 --partitions 1 --topic SampleTopic1
Other Useful Topic Commands
List Topics
• kafka-topics.bat --list --zookeeper localhost:2181
Describe Topics
• kafka-topics.bat --describe --zookeeper localhost:2181 --topic [Topic Name]
KAFKA MANAGER
https://github.com/yahoo/kafka-manager
A tool for managing Apache Kafka
created by Yahoo.
Or get the Docker container
DEMO
Producing and consuming messages in C#
Sample code:
https://github.com/dotnetpowered/StreamProcessingSample
APACHE SPARK
• Apache Spark is a fast and general engine for large-scale data
processing. It runs programs up to 100x faster than Hadoop MapReduce in
memory, or 10x faster on disk.
• Spark Streaming makes it easy to build scalable fault-tolerant streaming
applications.
https://spark.apache.org/streaming/
• Supports streaming directly from Apache Kafka.
http://spark.apache.org/docs/latest/streaming-kafka-integration.html
APACHE SPARK - FIRING UP THE CLUSTER
Spark is made up of master and slave processes…
• Start the master
spark-class org.apache.spark.deploy.master.Master
• Start one or more slaves
spark-class org.apache.spark.deploy.worker.Worker spark://spark-master:7077
• Access the Spark cluster via browser
http://spark-master:8080
APACHE SPARK WITH MOBIUS
Mobius is a .NET language binding for Spark. It is a Java wrapper for building
workers in C# and other CLR-based languages.
• Reference the Microsoft.SparkCLR NuGet package
• Build a console application utilizing the API
• Submit your program to Spark using the following script
sparkclr-submit.cmd
--master spark://spark-master:7077
--jars <path>\runtime\dependencies\spark-streaming-kafka-assembly_2.10-1.6.1.jar
--exe StreamingRulesEngineHost.exe
C:\src\StreamProcessing\StreamProcessingHost\bin\Debug
https://github.com/Microsoft/Mobius
DEMO
Consuming messages in C# using Spark
Sample code:
https://github.com/dotnetpowered/StreamProcessingSample
USING THE ELK STACK FOR INTEGRATION & VISUALIZATION
Use Logstash to ingest and/or consume events. It allows for “ETL” and
integration with tools such as Elasticsearch.
[Diagram: a Logstash Shipper (for non-Kafka enabled producers) feeds Kafka; a Logstash Indexer consumes from Kafka and writes to Elasticsearch]
https://www.elastic.co/blog/just-enough-kafka-for-the-elastic-stack-part1
CONNECTING KAFKA TO ELASTICSEARCH
For consumers: Configure a Kafka input
input {
  kafka {
    zk_connect => "kafka:2181"
    group_id => "logstash"
    topic_id => "apache_logs"
    consumer_threads => 16
  }
}
Don’t forget to select a codec for serialization!
Putting it all together:
C:\bin\logstash-2.3.2\bin>logstash -e "input { kafka { topic_id => 'SampleTopic2' } } output { elasticsearch { index => 'sample-%{+YYYY.MM.dd}' document_id => '%{docid}' } }"
LET’S REVIEW
• Event-driven systems are a key ingredient to
unlocking your organization’s potential. Make data
available to current and future apps, improve
scalability, and decrease complexity.
• Kafka is foundational infrastructure for event-driven
systems and is battle tested at scale.
• The ecosystem building around Kafka is rich,
allowing you to connect using various tools.
QUESTIONS?
THANK YOU!
BRIAN RITCHIE
CTO, XEOHEALTH
2016
@brian_ritchie
brian.ritchie@gmail.com
http://www.dotnetpowered.com
Sample code:
https://github.com/dotnetpowered/StreamProcessingSample

