Apache Kafka
3/1/2018 1
What is Apache Kafka?
What is a Messaging System?
What is Apache Kafka? (1/2)
• A unified platform for handling real-time data feeds
• A distributed publish-subscribe messaging system
• A robust queue that can handle a high volume of data
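The publish-subscribe idea can be sketched with a toy in-memory model. This is not Kafka's API: `ToyBroker`, `publish`, `poll`, and the topic and group names are hypothetical, chosen only to show that each subscriber group keeps its own read position in a shared, append-only topic.

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for a broker: one append-only list per topic."""

    def __init__(self):
        self._topics = defaultdict(list)   # topic name -> list of messages
        self._offsets = defaultdict(int)   # (group, topic) -> next index to read

    def publish(self, topic, message):
        """Append a message to the topic and return its offset."""
        log = self._topics[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, group, topic):
        """Return the messages this group has not seen yet, then advance its offset."""
        log = self._topics[topic]
        start = self._offsets[(group, topic)]
        self._offsets[(group, topic)] = len(log)
        return log[start:]


broker = ToyBroker()
broker.publish("page-views", "user=1 path=/home")
broker.publish("page-views", "user=2 path=/cart")

# Two independent subscriber groups each receive every message.
analytics = broker.poll("analytics", "page-views")
billing = broker.poll("billing", "page-views")
```

Because messages stay in the topic after being read, adding a new subscriber group does not affect the existing ones.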
What is Apache Kafka? (2/2)
• Integrates well with Apache Spark for real-time streaming data analysis
• Lets you pass messages from one endpoint to another
• Built on top of the Apache ZooKeeper synchronization service (newer Kafka releases can also run without ZooKeeper, in KRaft mode)
Benefits of Apache Kafka (1/4)
• Kafka offers several benefits:
• Reliability − Kafka is distributed, partitioned, replicated, and fault tolerant.
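A minimal sketch of what "partitioned and replicated" can mean in practice. The broker names, partition count, replication factor, and the round-robin placement below are assumptions made for illustration; Kafka's real partitioner and replica assignment differ in detail.

```python
import zlib

# Illustrative cluster layout (assumed values, not Kafka defaults).
NUM_PARTITIONS = 3
REPLICATION_FACTOR = 2
BROKERS = ["broker-0", "broker-1", "broker-2"]


def partition_for(key):
    """Map a key to a partition with a stable hash (CRC32 is this sketch's choice)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def replicas_for(partition):
    """Place a partition on consecutive brokers; the first one acts as leader."""
    return [BROKERS[(partition + i) % len(BROKERS)]
            for i in range(REPLICATION_FACTOR)]


def surviving_replicas(partition, failed_broker):
    """Replicas still available for a partition after one broker fails."""
    return [b for b in replicas_for(partition) if b != failed_broker]


partition = partition_for("user-42")
remaining = surviving_replicas(partition, replicas_for(partition)[0])
```

With a replication factor of 2, losing any single broker still leaves one copy of every partition, which is the fault-tolerance property the slide refers to.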
Benefits of Apache Kafka (2/4)
• Scalability − The Kafka messaging system scales easily without downtime.
Benefits of Apache Kafka (3/4)
• Durability − Kafka uses a distributed commit log, which means messages are persisted to disk as quickly as possible and are therefore durable.
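The commit-log idea can be illustrated with a small append-only file. `FileCommitLog` and the one-message-per-line layout are hypothetical, and forcing every append to disk with `os.fsync` is this sketch's choice, not a description of how Kafka schedules its disk writes.

```python
import os
import tempfile


class FileCommitLog:
    """Append-only log: one message per line, offset = line number.

    Messages must not contain newlines in this simplified format.
    """

    def __init__(self, path):
        self._path = path
        open(path, "a", encoding="utf-8").close()  # create the file if missing
        # Recover the next offset from whatever is already on disk.
        with open(path, "r", encoding="utf-8") as f:
            self._next_offset = sum(1 for _ in f)

    def append(self, message):
        """Write a message, force it to disk, and return its offset."""
        with open(self._path, "a", encoding="utf-8") as f:
            f.write(message + "\n")
            f.flush()
            os.fsync(f.fileno())
        offset = self._next_offset
        self._next_offset += 1
        return offset

    def read_from(self, offset):
        """Return every message at or after the given offset."""
        with open(self._path, "r", encoding="utf-8") as f:
            return [line.rstrip("\n") for line in f][offset:]


log_path = os.path.join(tempfile.mkdtemp(), "topic-0.log")
log = FileCommitLog(log_path)
log.append("order-created")
log.append("order-paid")

# A fresh instance (for example after a restart) can replay the same messages.
reopened = FileCommitLog(log_path)
replayed = reopened.read_from(0)
```

Since the log is only ever appended to, a reader can resume from any offset without coordinating with the writer.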
Benefits of Apache Kafka (4/4)
• Performance − Kafka has high throughput for both publishing and subscribing. It maintains stable performance even when many terabytes of messages are stored.
• Kafka is very fast and, with appropriate replication and acknowledgment settings, can be operated with no downtime and no data loss.
Apache Kafka: When to use (1/5)
• Metrics − Kafka is often used for operational monitoring data. This involves aggregating statistics from distributed applications to produce centralized feeds of operational data.
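As a rough illustration of aggregating statistics into a centralized feed, the sketch below summarizes metric records from several hosts. The record fields, host names, and values are invented for the example, and a plain list stands in for a Kafka topic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical metric records, as several application instances might publish them.
records = [
    {"host": "web-1", "metric": "latency_ms", "value": 120.0},
    {"host": "web-2", "metric": "latency_ms", "value": 80.0},
    {"host": "web-1", "metric": "errors", "value": 1.0},
    {"host": "web-2", "metric": "errors", "value": 3.0},
]


def aggregate(metric_records):
    """Group records by metric name and summarize the values across all hosts."""
    by_metric = defaultdict(list)
    for record in metric_records:
        by_metric[record["metric"]].append(record["value"])
    return {
        name: {"count": len(values), "mean": mean(values), "max": max(values)}
        for name, values in by_metric.items()
    }


# One centralized summary built from per-host measurements.
feed = aggregate(records)
```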
Apache Kafka: When to use (2/5)
• Log Aggregation Solution − Kafka can be used across an organization to collect logs from multiple services and make them available in a standard format to multiple consumers.
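The "standard format" step might look like the sketch below, which converts two differently shaped log lines into one JSON layout. Both input formats, the service names, and the output fields are assumptions for illustration only.

```python
import json


def from_web(line):
    """Parse a hypothetical web-server format: '<LEVEL> <message>'."""
    level, _, message = line.partition(" ")
    return {"service": "web", "level": level.upper(), "message": message}


def from_db(line):
    """Parse a hypothetical database format: 'level=<level>;msg=<message>'."""
    fields = dict(part.split("=", 1) for part in line.split(";"))
    return {"service": "db", "level": fields["level"].upper(), "message": fields["msg"]}


raw_logs = [
    (from_web, "WARN cache miss rate high"),
    (from_db, "level=error;msg=connection refused"),
]

# Every service's log line ends up as the same JSON structure,
# so any consumer can read it without knowing the original format.
standardized = [json.dumps(parse(line), sort_keys=True) for parse, line in raw_logs]
```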
Apache Kafka: When to use (3/5)
Apache Kafka: When to use (4/5)
• Stream Processing − Popular frameworks such as Storm and Spark Streaming read data from a topic, process it, and write the processed data to a new topic, where it becomes available to users and applications. Kafka’s strong durability is also very useful in the context of stream processing.
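The read, process, write pattern described above can be sketched without any framework. The topic names and the cleaning rule are hypothetical, and plain Python lists stand in for Kafka topics.

```python
# Plain lists stand in for topics; the names are made up for this example.
topics = {
    "raw-clicks": ["  /Home ", "/cart", "", "/HOME"],
    "clean-clicks": [],
}


def process(record):
    """Normalize one record; return None to drop empty records."""
    cleaned = record.strip().lower()
    return cleaned or None


def run_once(source, sink):
    """Read every record from the source topic and write results to the sink topic."""
    for record in topics[source]:
        result = process(record)
        if result is not None:
            topics[sink].append(result)


run_once("raw-clicks", "clean-clicks")
```

The source topic is left unchanged, so another job could read the same raw data and write to a different output topic.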
Apache Kafka: When to use (5/5)
Apache Kafka: Architecture (1/2)
• Broker
• ZooKeeper
• Producers
• Consumers
Apache Kafka: Architecture (2/2)

Editor's Notes

  • #3 Kafka: a message queue system that serves as a stream ingestion system (collecting input data)
  • #5 A unified platform for handling all real-time data feeds; a distributed publish-subscribe messaging system; a robust queue that can handle a large volume of data
  • #7 Reliability: dependability
  • #8 Reliability: dependability
  • #9 Durability: endurance
  • #10 Reliability: dependability
  • #11 Metric: indicator (measurement)
  • #12 Log Aggregation Solution: a solution for aggregating logs
  • #13 https://techtalk.vn/he-thong-xu-ly-du-lieu-luong-va-kien-truc.html
  • #14 https://techtalk.vn/he-thong-xu-ly-du-lieu-luong-va-kien-truc.html
  • #15 https://techtalk.vn/he-thong-xu-ly-du-lieu-luong-va-kien-truc.html
  • #16 https://www.ibm.com/support/knowledgecenter/en/SSPFMY_1.3.5/com.ibm.scala.doc/config/iwa_cnf_scldc_apche_con_c.html