Kafka Basics Recap
Consumer group → a set of consumers working together to read from a topic.
Topic → logical channel where producers write data.
Partition → a topic is split into multiple partitions for parallelism.
Offset → a sequential ID for messages within a partition.
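As a rough mental model only (this is an in-memory toy, not the Kafka client API, and `topic` and `append` are hypothetical names), a topic can be pictured as a set of append-only lists, one per partition, where a message's offset is simply its position in its own partition:

```python
# Toy model: partition id -> append-only list of messages.
topic = {0: [], 1: []}

def append(topic, partition, message):
    """Append a message to one partition and return the offset it received."""
    topic[partition].append(message)
    return len(topic[partition]) - 1

first = append(topic, 0, "a")   # offset 0 in partition 0
second = append(topic, 0, "b")  # offset 1 in partition 0
other = append(topic, 1, "c")   # offset 0 in partition 1: offsets restart per partition
print(first, second, other)
```

Note that offsets are only meaningful together with a partition; the same offset number exists in every non-empty partition.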
Scenario A: One Consumer, One Topic
- Single consumer reads all partitions.
- All data from the topic is delivered to that consumer.
- Use case: debugging, low throughput requirements.
Scenario B: Multiple Consumers in Same Consumer Group
- Kafka balances partitions across consumers within a group.
- Rule: One partition → at most one consumer in a group.
- Example: Topic has 4 partitions, Consumer group has 2 consumers → each consumer gets 2 partitions.
- Good for: scaling horizontally, increasing throughput.
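The balancing rule above can be sketched with a simple round-robin assignment. This is a simplified simulation, not Kafka's actual assignor implementation; the function name `assign_round_robin` and the consumer names are assumptions made for illustration:

```python
def assign_round_robin(partitions, consumers):
    """Give every partition exactly one owner, cycling through the consumers."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment

assignment = assign_round_robin([0, 1, 2, 3], ["consumer-1", "consumer-2"])
print(assignment)  # {'consumer-1': [0, 2], 'consumer-2': [1, 3]}
```

Each partition appears in exactly one consumer's list, which is the "at most one consumer per partition in a group" rule.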
Scenario C: Multiple Consumers in Different Groups
- Each group works independently and gets a full copy of the data.
- Example: Group `analytics` and Group `billing` consume the same topic independently.
- Good for: fan-out, multiple independent applications consuming the same data.
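A minimal sketch of why groups do not interfere with each other: each group keeps its own position in the log. The class `GroupCursor` and the sample records are hypothetical, and the list stands in for a single partition:

```python
log = ["order-1", "order-2", "order-3"]  # one partition of a hypothetical topic

class GroupCursor:
    """Tracks one consumer group's position in a partition."""

    def __init__(self, group_id):
        self.group_id = group_id
        self.offset = 0

    def poll(self, log, max_records=10):
        # Return the next records for this group and advance only this group's offset.
        records = log[self.offset:self.offset + max_records]
        self.offset += len(records)
        return records

analytics = GroupCursor("analytics")
billing = GroupCursor("billing")

analytics_seen = analytics.poll(log)              # reads all three records
billing_seen = billing.poll(log, max_records=1)   # reads only the first record
print(analytics.offset, billing.offset)  # 3 1
```

The `billing` group being behind has no effect on `analytics`, and both will eventually see every record.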
Scenario D: More Consumers than Partitions (in the same group)
- Some consumers stay idle because a partition can only be assigned to one consumer.
- Example: Topic has 3 partitions, Group has 5 consumers → 2 consumers idle.
- Lesson: keep the number of consumers in a group ≤ the number of partitions; extra consumers add no throughput and only act as standbys.
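The 3-partition, 5-consumer example can be checked with a small simulation. It assumes a simple round-robin assignment (Kafka's real assignors differ in detail), and `idle_consumers` is a hypothetical helper:

```python
def idle_consumers(num_partitions, consumers):
    """Return the consumers left without a partition under round-robin assignment."""
    owners = {consumers[p % len(consumers)] for p in range(num_partitions)}
    return [c for c in consumers if c not in owners]

consumers = [f"c{i}" for i in range(1, 6)]  # c1 .. c5
idle = idle_consumers(3, consumers)
print(idle)  # ['c4', 'c5']
```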
Scenario E: Offset Management
- Consumers track position (offset) in each partition.
- Auto-commit (`enable.auto.commit=true`) → the consumer client commits offsets periodically in the background.
- Manual commit → Application decides when to commit.
Scenarios:
- Auto-commit → simple but may reprocess messages on crash.
- Manual commit after processing → safer (at-least-once).
- Manual commit before processing → risk of message loss.
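The difference between committing before and after processing shows up when a consumer crashes between the two steps. The following is an in-memory simulation under that assumption, not real client code; `consume_with_crash` and its parameters are hypothetical:

```python
def consume_with_crash(messages, commit_first, crash_at):
    """Consume messages, crash once between commit and processing at offset
    crash_at, then restart from the last committed offset."""
    committed = 0
    processed = []
    crashed = False
    while committed < len(messages):
        # (Re)start reading from the last committed offset.
        for offset in range(committed, len(messages)):
            crash_now = offset == crash_at and not crashed
            if commit_first:
                committed = offset + 1          # commit, then process
                if crash_now:
                    crashed = True
                    break                       # crash: message never processed
                processed.append(messages[offset])
            else:
                processed.append(messages[offset])  # process, then commit
                if crash_now:
                    crashed = True
                    break                       # crash: commit never happened
                committed = offset + 1
    return processed

messages = ["m0", "m1", "m2"]
at_most_once = consume_with_crash(messages, commit_first=True, crash_at=1)
at_least_once = consume_with_crash(messages, commit_first=False, crash_at=1)
print(at_most_once)   # ['m0', 'm2']
print(at_least_once)  # ['m0', 'm1', 'm1', 'm2']
```

Committing first loses `m1`; committing afterwards processes `m1` twice, which is why at-least-once consumers are usually paired with duplicate-tolerant processing.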
Scenario F: Multiple Consumers Reading Same Partition
- Within a group → ❌ Not possible.
- Across groups → ✅ Possible, each group consumes independently.
What Can Happen vs What Cannot Happen
✔ What can happen:
- Multiple consumer groups can consume the same topic independently.
- Within a group, each partition goes to one consumer.
- Consumers can rewind offsets (seek).
- Consumers can commit offsets manually.
❌ What cannot happen:
- Two consumers in the same group reading the same partition simultaneously.
- Kafka pushing messages to consumers; consumers must poll actively.
- Groups sharing offsets; each group tracks and commits its own.
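The pull model and offset rewinding can be illustrated with a toy reader: nothing arrives unless the consumer asks for it, and seeking back simply changes where the next poll starts. `PartitionReader` is a hypothetical class written for this sketch, not part of any Kafka client:

```python
class PartitionReader:
    """Pull-based reader over one partition, with an explicit position."""

    def __init__(self, log):
        self.log = log
        self.position = 0

    def poll(self, max_records=2):
        # Records are returned only when the consumer asks for them.
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def seek(self, offset):
        # Move the read position; the next poll starts from here.
        if not 0 <= offset <= len(self.log):
            raise ValueError(f"offset {offset} out of range")
        self.position = offset

reader = PartitionReader(["a", "b", "c"])
first = reader.poll()      # ['a', 'b']
reader.seek(0)             # rewind to the beginning
replayed = reader.poll()   # ['a', 'b'] again
rest = reader.poll()       # ['c']
empty = reader.poll()      # [] when nothing new is available
print(first, replayed, rest, empty)
```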
Common Real-World Patterns
- At-least-once delivery (default): commit after processing → possible duplicates.
- At-most-once delivery: commit before processing → risk of loss.
- Exactly-once delivery: requires Kafka transactions + idempotent logic.
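One common way to get "idempotent logic" is to deduplicate on a stable message identity before applying a side effect. The sketch below assumes the pair (partition, offset) is used as that identity and keeps the seen-set in memory; a real application would need to persist it together with the side effect. `apply_idempotently` and the sample data are hypothetical:

```python
def apply_idempotently(deliveries):
    """Sum amounts, ignoring redelivered messages identified by (partition, offset)."""
    seen = set()
    balance = 0
    for partition, offset, amount in deliveries:
        key = (partition, offset)
        if key in seen:
            continue  # duplicate delivery: skip the side effect
        seen.add(key)
        balance += amount
    return balance

# The message at partition 0, offset 1 is delivered twice.
deliveries = [(0, 0, 10), (0, 1, 5), (0, 1, 5), (0, 2, 7)]
total = apply_idempotently(deliveries)
print(total)  # 22
```

With this in place, an at-least-once consumer produces the same result as if every message had been delivered exactly once.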
