Here’s a list of high-priority Kafka topics that interview questions commonly draw on,
especially for Big Data / Spark / ETL / Streaming Engineer roles and CDAC module exams.
🎯 High-Yield Kafka Interview Topics
🔹 1. Kafka Basics
• What is Kafka and why is it used?
• Kafka architecture: Producer, Broker, Consumer, ZooKeeper
• Core components: Topics, Partitions, Offsets, Brokers
🔹 2. Kafka Topics
• What is a Kafka topic?
• Partitions and replication in topics
• Topic configuration (retention, cleanup.policy)
• Commands: kafka-topics.sh (create, list, describe, delete)
🔹 3. Producers
• Role of Kafka producer
• Producer acknowledgements (acks=0, 1, all)
• Batching, compression
• Serialization (String, Avro, JSON, Protobuf)
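One producer detail that often comes up alongside these points is how a keyed record ends up in a partition. The following is a minimal sketch, not Kafka's actual partitioner: Kafka's default Java partitioner hashes keys with murmur2, while this example substitutes `zlib.crc32` as a stand-in. The function name `choose_partition` is hypothetical.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition so the same key always lands together."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Assumption: crc32 stands in for the murmur2 hash used by Kafka's Java client.
    return zlib.crc32(key) % num_partitions


# Records with the same key go to the same partition, which preserves per-key order.
for key in (b"user-1", b"user-2", b"user-1"):
    print(key, "->", choose_partition(key, 6))
```

The point to take away is that ordering is guaranteed only within a partition, so choosing the key decides which records stay ordered relative to each other.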
🔹 4. Consumers
• Consumer groups and consumer offsets
• Auto commit vs manual commit
• Consumer rebalancing
• Offset management and lag
• Commands: kafka-consumer-groups.sh
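Consumer lag is simple arithmetic once the offsets are known: for each partition it is the log end offset minus the group's committed offset. The sketch below computes it from two plain dictionaries; the function name and the choice to treat a missing committed offset as 0 are assumptions made for illustration (in a real cluster the starting position depends on `auto.offset.reset`).

```python
from typing import Dict


def consumer_lag(log_end_offsets: Dict[int, int],
                 committed_offsets: Dict[int, int]) -> Dict[int, int]:
    """Return per-partition lag = log end offset - committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no committed offset counts from 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


lag = consumer_lag({0: 100, 1: 50}, {0: 90, 1: 50})
print(lag)                # per-partition lag
print(sum(lag.values()))  # total lag for the group on this topic
```

These are the same quantities that `kafka-consumer-groups.sh --describe` reports in its CURRENT-OFFSET, LOG-END-OFFSET, and LAG columns.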
🔹 5. Kafka Broker & Cluster
• Role of broker
• Partitioning and replication factor
• Leader election among partition replicas
• Fault tolerance in Kafka
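To make leader election and fault tolerance concrete, here is a simplified model of what happens to one partition when its leader's broker fails: a new leader is taken from the in-sync replicas (ISR). This is a sketch under assumptions, not the controller's real code; `elect_leader` is a hypothetical name, and unclean leader election is assumed to be disabled.

```python
from typing import List, Optional


def elect_leader(replicas: List[int], isr: List[int],
                 failed_broker: int) -> Optional[int]:
    """Pick the first assigned replica that is still alive and in the ISR."""
    for broker in replicas:
        if broker != failed_broker and broker in isr:
            return broker
    # No in-sync replica survives: the partition stays offline in this model.
    return None


# Replication factor 3, broker 2 has fallen out of sync, leader (broker 1) fails.
print(elect_leader(replicas=[1, 2, 3], isr=[1, 3], failed_broker=1))
# Only the failed leader was in sync, so no safe leader exists.
print(elect_leader(replicas=[1, 2, 3], isr=[1], failed_broker=1))
```

The second case is why the replication factor alone does not guarantee availability: what matters is how many replicas are in sync at the moment of failure.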
🔹 6. ZooKeeper
• Role of ZooKeeper in Kafka (older versions)
• What happens if ZooKeeper goes down?
• Kafka without ZooKeeper (KRaft mode)
🔹 7. Kafka Connect
• Purpose and architecture of Kafka Connect
• Source and Sink connectors (e.g., MySQL to Kafka, Kafka to HDFS)
• Use cases in ETL pipelines
🔹 8. Kafka Streams
• Kafka Streams API vs Kafka Consumer
• Stateless vs stateful operations
• Windowing, join, aggregation
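Windowed aggregation is easier to explain with a tiny model. The sketch below is plain Python, not the Kafka Streams API (which is a Java library): it counts events per key in tumbling windows, where each event belongs to exactly one fixed-size window. The function name and the `(key, timestamp_ms)` event shape are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def tumbling_window_counts(events: Iterable[Tuple[str, int]],
                           window_ms: int) -> Dict[Tuple[str, int], int]:
    """Count events per (key, window start) for fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for key, timestamp_ms in events:
        # Align the timestamp down to the start of its window.
        window_start = (timestamp_ms // window_ms) * window_ms
        counts[(key, window_start)] += 1
    return dict(counts)


events = [("a", 0), ("a", 999), ("a", 1000), ("b", 1500)]
print(tumbling_window_counts(events, window_ms=1000))
```

The `counts` dictionary plays the role of the state store here, which is what makes this a stateful operation; a `map` or `filter` over the same events would need no such state.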
🔹 9. Kafka with Spark Streaming / Structured Streaming
• Integration with Spark Structured Streaming
• Kafka as source and sink in Spark
• Exactly-once semantics with Spark Kafka integration
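A common way to reason about exactly-once output in this setting is a replayable source combined with an idempotent sink: if a micro-batch is retried after a failure, the sink recognises its batch id and skips it. The sketch below models only that idea in plain Python; it does not call Spark or Kafka, and `IdempotentSink` is a hypothetical class. In Spark, a `foreachBatch` function receives a batch id that could be used this way.

```python
from typing import List, Set


class IdempotentSink:
    """In-memory sink that applies each batch id at most once."""

    def __init__(self) -> None:
        self.rows: List[str] = []
        self.committed_batches: Set[int] = set()

    def write_batch(self, batch_id: int, rows: List[str]) -> bool:
        """Write the batch unless it was already applied; return True if written."""
        if batch_id in self.committed_batches:
            return False  # replayed batch, ignore it
        self.rows.extend(rows)
        self.committed_batches.add(batch_id)
        return True


sink = IdempotentSink()
sink.write_batch(0, ["order-1", "order-2"])
sink.write_batch(0, ["order-1", "order-2"])  # retry after a simulated failure
sink.write_batch(1, ["order-3"])
print(sink.rows)
```

In a real sink the rows and the batch id would have to be stored atomically (for example, in one database transaction); the in-memory version above sidesteps that.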
🔹 10. Message Delivery Semantics
• At most once
• At least once
• Exactly once
• How does Kafka provide each of these?
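On the consumer side, the difference between at-most-once and at-least-once largely comes down to whether the offset is committed before or after a record is processed. The simulation below is a sketch with hypothetical function names: it crashes the consumer between the two steps for one record, restarts from the committed offset, and shows what was processed overall. Exactly-once is not modeled; it typically needs idempotent processing or transactions on top of this.

```python
from typing import List, Optional, Tuple


def run_consumer(records: List[str], start: int, commit_first: bool,
                 crash_at: Optional[int] = None) -> Tuple[List[str], int]:
    """Process records from `start`; crash between commit and process at `crash_at`."""
    processed: List[str] = []
    committed = start
    for offset in range(start, len(records)):
        if commit_first:
            committed = offset + 1          # commit, then process
            if offset == crash_at:
                break                       # record committed but never processed
            processed.append(records[offset])
        else:
            processed.append(records[offset])  # process, then commit
            if offset == crash_at:
                break                       # record processed but not committed
            committed = offset + 1
    return processed, committed


def run_with_one_crash(records: List[str], commit_first: bool,
                       crash_at: int) -> List[str]:
    """Run until the crash, then restart from the last committed offset."""
    first, committed = run_consumer(records, 0, commit_first, crash_at)
    second, _ = run_consumer(records, committed, commit_first)
    return first + second


print(run_with_one_crash(["a", "b", "c"], commit_first=True, crash_at=1))   # "b" is lost
print(run_with_one_crash(["a", "b", "c"], commit_first=False, crash_at=1))  # "b" appears twice
```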
🔹 11. Data Retention and Cleanup
• Retention by size/time
• Log compaction vs log deletion
• Cleanup policies
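The difference between the two cleanup policies can be sketched on a small in-memory log. This is a simplified model with hypothetical function names: real Kafka deletes whole log segments rather than single records, and it keeps tombstones (records with a null value) for a configurable period, whereas this sketch drops them immediately.

```python
from typing import Dict, List, Optional, Tuple

# (key, value, timestamp_ms); a value of None models a tombstone.
Record = Tuple[str, Optional[str], int]


def apply_delete(log: List[Record], now_ms: int, retention_ms: int) -> List[Record]:
    """cleanup.policy=delete (time-based): drop records older than the retention period."""
    return [record for record in log if now_ms - record[2] <= retention_ms]


def apply_compact(log: List[Record]) -> List[Record]:
    """cleanup.policy=compact: keep only the latest record per key."""
    latest_index: Dict[str, int] = {}
    for index, (key, _value, _timestamp) in enumerate(log):
        latest_index[key] = index
    kept = [log[i] for i in sorted(latest_index.values())]
    # Simplification: tombstones are removed right away.
    return [record for record in kept if record[1] is not None]


log = [("k1", "a", 0), ("k2", "b", 10), ("k1", "c", 20), ("k2", None, 30)]
print(apply_delete(log, now_ms=35, retention_ms=20))
print(apply_compact(log))
```

Deletion bounds how far back the log reaches in time, while compaction bounds it by the number of distinct keys, which is why compacted topics suit changelog or "latest state" data.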
🔹 12. Security in Kafka
• SSL/TLS encryption
• SASL authentication
• ACLs for topic-level authorization
🔹 13. Monitoring & Metrics
• How to monitor Kafka?
• Tools: JMX, Kafka Manager (now CMAK), Prometheus, Grafana
• Key metrics: consumer lag, ISR, throughput
🔹 14. Kafka Configuration & Tuning
• Key broker config options
• Tuning producers/consumers for performance
• Threading model and batching
✅ Suggested Preparation Strategy
• Start with Architecture & Core Concepts
• Deep-dive into Producers, Consumers, Topics
• Understand Kafka + Spark Integration
• Cover exactly-once semantics, offsets, Kafka Connect
• Know some real-world use cases
• Review common commands for CLI operations
Let me know if you want:
• ✅ MCQs or coding-based Kafka interview questions
• ✅ Kafka + Spark Structured Streaming mini project idea
• ✅ Mindmap / PDF notes
• ✅ Command cheat sheet for Kafka CLI