Kafka for Administrators - Bespoke Certificate for Pradeep Kumar
Introduction (Brief)
Apache Kafka vs. traditional message brokers (Brief)
Overview of Kafka Features and Ecosystem (Brief)
Apache Kafka On-premise vs. in the Cloud (Brief)
Apache Kafka variants and Kafka in Docker/Kubernetes (Brief)
Setup
Installing and Configuring Apache Kafka
Setting up ZooKeeper to Manage the Kafka Cluster
Testing the Cluster
Testing IDE and API integration with Kafka
- Note: A lab with a distributed Kafka cluster (Kafka with ZooKeeper) will already be set up
Deep Dive into Kafka
Understanding Kafka internals & architecture
- Cluster formation & Membership
- ZooKeeper and its role
- Leader and follower roles for Kafka brokers & ZooKeeper (load balancing)
- The Controller: as per newer versions where ZooKeeper is not used (brief introduction)
- KRaft mode: as per newer versions where ZooKeeper is not used (brief introduction)
- Topics, partitions & segments
- Messages & batches
- Producers & consumers
- Consumer groups, offsets and fault tolerance
- Brokers & replication
- Fault tolerance & semantics
- Important configurations
- Physical storage & understanding underlying log/index files
- Log retention & compaction
Understanding Kafka APIs
- Kafka Java Client APIs
- Kafka Producer Java API
- Kafka Consumer Java API
- Kafka AdminClient Java API
- Kafka Streams Java API
- Kafka Connect Java API
- Terminology
Working with Kafka programmatically & understanding reliability
Kafka Producers
Constructing a Kafka producer, publishing messages, producer configurations, understanding serializers, interceptors, headers, partitions, consistency, retries, compression, quotas/throttling, etc.
Related configurations for optimized performance.
Kafka Consumers
Constructing a consumer, working with consumers and consumer groups.
Subscribing & consuming from topics.
Understanding the polling & heartbeat threads, commits & offsets, fetch behaviour, auto offset reset or preferred read replica, rebalance listeners, serializers & deserializers.
Related configurations for optimized performance.
Reliable Data Delivery
Reliability Guarantees & Validating System Reliability
Understanding semantics & ensuring data durability & retention
Important considerations
Kafka Streams API
Kafka Connect API
Building Data Pipelines
Considerations
Kafka Connect Versus Producer and Consumer
Kafka Connect
Managing Kafka Programmatically
AdminClient lifecycle (creating, configuring and closing), configuration management, consumer group management, cluster metadata & testing
Managing, Administering & Monitoring Kafka
Cross-Cluster Data Mirroring
Use Cases, Hub-and-Spoke Architecture, Active-Active Architecture, Active-Standby Architecture and Apache Kafka’s MirrorMaker.
Administering Kafka
Topic Operations, Consumer & Consumer Groups, Dynamic Configuration Changes
Partition Management
Monitoring Kafka
Using tools to monitor the Kafka cluster
Understanding metrics emitted by Kafka & ZooKeeper
Client, performance & lag monitoring
Kafka logs
Known issues and optimizing Kafka and its components
Troubleshooting
Summary & conclusion