The Core Concept: Delivery Guarantees
In any messaging system like Kafka, “delivery guarantees” refer to the promises the system makes about whether a message sent by a producer will be received by a consumer, and how many times it might be received. There are three main types:
- At Most Once: A message will be delivered either once or not at all. Data loss is possible, but you’ll never get duplicates.
- At Least Once: A message will always be delivered, but it might be delivered more than once. No data loss, but duplicates are possible.
- Exactly Once: A message is delivered once and only once. No data loss and no duplicates.
Default Scenario in Kafka: At Least Once
By default, Kafka is configured for an at-least-once delivery guarantee. This is a balance between performance and durability. Here’s how the default settings of the producer and consumer work together to achieve this:
Producer Default Settings:
- acks=1: When a producer sends a message, it waits for an acknowledgment (ack) from only the leader broker of the topic partition, not from the follower replicas. This is faster than waiting for all replicas, but if the leader broker crashes right after sending the ack and before the followers have replicated the message, the message can be lost. (Note that since Kafka 3.0 the producer default is acks=all with idempotence enabled; acks=1 was the default in earlier versions.)
- retries > 0: If the producer doesn’t receive an acknowledgment (due to a transient network issue, for example), it will automatically try to send the message again. This is key to preventing data loss and ensures the message is delivered at least once.
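As a concrete sketch, these producer settings can be written as a plain configuration dict in the style of a Python Kafka client such as confluent-kafka (the property names are real Kafka settings; the dict itself is illustrative):

```python
# Producer settings behind Kafka's historical at-least-once default.
# Kafka 3.0+ instead defaults to acks="all" with enable.idempotence=True.
producer_config = {
    "acks": "1",            # wait for the partition leader only, not the replicas
    "retries": 2147483647,  # any value > 0 enables automatic redelivery on failure
}
```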
Consumer Default Settings:
- enable.auto.commit=true: The consumer is configured to automatically commit the offsets of the messages it has polled from the topic at regular intervals (auto.commit.interval.ms).
- isolation.level=read_uncommitted: The consumer will read all messages that have been written to the leader broker, including those that might be part of an incomplete transaction.
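The matching consumer defaults, again as an illustrative dict of real Kafka property names:

```python
# Consumer defaults: offsets are committed automatically in the background,
# and uncommitted transactional messages are visible to the consumer.
consumer_config = {
    "enable.auto.commit": True,
    "auto.commit.interval.ms": 5000,  # the default auto-commit interval is 5 seconds
    "isolation.level": "read_uncommitted",
}
```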
How the Default “At Least Once” Scenario Plays Out:
- A producer sends a message. The leader broker writes it and sends back an ack.
- If the producer doesn’t get the ack, it retries, which could lead to the same message being written to the broker twice.
- The consumer polls for new messages and receives a batch.
- The consumer’s auto.commit feature commits the offset of the last message in the batch.
- If the consumer application crashes after committing the offset but before it has finished processing all the messages in the batch, the next time the consumer starts up, it will begin reading from the last committed offset, effectively skipping the unprocessed messages from the previous batch. However, due to producer retries, it’s also possible for the consumer to receive duplicate messages.
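The producer-side half of this story can be simulated in a few lines of plain Python (all names here are hypothetical, not Kafka APIs): the broker durably writes the message, the ack is lost in transit, and the retry produces a duplicate.

```python
# Toy simulation of at-least-once delivery: a lost ack triggers a retry,
# and the broker ends up with two copies of the same message.
broker_log = []

def send_with_retry(message, ack_lost_on_first_try=True):
    attempts, acked = 0, False
    ack_lost = ack_lost_on_first_try
    while not acked:
        attempts += 1
        broker_log.append(message)  # the broker durably writes the message...
        if ack_lost:
            ack_lost = False        # ...but the ack never reaches the producer
        else:
            acked = True            # this attempt's ack gets through
    return attempts

attempts = send_with_retry("order-42")
# Two attempts were made, so "order-42" now appears twice in the broker log.
```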
All Configuration Scenarios
Now, let’s look at how you can configure the producer and consumer to achieve all three delivery guarantees.
At Most Once: “Fire and Forget” 💨
This scenario prioritizes low latency over durability. You’re willing to risk losing messages to get the fastest possible throughput.
Producer Configuration:
- acks=0: The producer sends the message and doesn’t wait for any acknowledgment from the broker. It’s a “fire and forget” approach. If the broker is down or there’s a network issue, the message is lost.
- retries=0: The producer will not attempt to resend a message if it fails.
Consumer Configuration:
- enable.auto.commit=true: To achieve “at most once” on the consumer side, you’d typically commit the offset before you process the message. If your application crashes after the commit but before processing, you’ve lost that message for good.
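Put together, an at-most-once setup looks roughly like this (illustrative dicts of real Kafka property names):

```python
# "Fire and forget": fastest possible sends, message loss is accepted.
producer_config = {
    "acks": "0",                  # don't wait for any broker acknowledgment
    "retries": 0,                 # never resend a failed message
    "enable.idempotence": False,  # idempotence requires acks=all, so it must be off
}
consumer_config = {
    "enable.auto.commit": True,   # offsets may be committed before processing finishes
}
```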
When to use it: This is suitable for non-critical data where occasional loss is acceptable, such as collecting metrics or logging.
At Least Once: “Durable Delivery” 📨
This is the most common scenario, ensuring no data is lost, even if it means getting some duplicates.
Producer Configuration:
- acks=all (or acks=-1): This is the most durable setting. The producer will wait for an acknowledgment from the leader broker and all of the in-sync replicas. This ensures that even if the leader broker crashes, another replica has the message and can take over as the new leader.
- retries > 0: The producer will retry sending the message if it fails.
Consumer Configuration:
- enable.auto.commit=false: You take control of when to commit offsets. You would typically commit the offset after you have successfully processed the message. If the consumer crashes before committing, the next time it starts, it will re-read and re-process the last batch of messages, which can lead to duplicates.
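The at-least-once configuration, sketched the same way (illustrative dicts of real Kafka property names):

```python
# Durable delivery: no message loss, duplicates possible.
producer_config = {
    "acks": "all",          # wait for the leader and all in-sync replicas
    "retries": 2147483647,  # retry aggressively on failure
}
consumer_config = {
    "enable.auto.commit": False,  # commit manually, only after processing succeeds
}
```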
When to use it: This is the go-to for most use cases where data loss is not an option, such as financial transactions, e-commerce orders, and critical event tracking. Your application must be designed to handle potential duplicate messages (a concept known as idempotence).
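Idempotent processing on the consumer side can be as simple as remembering which message IDs have already been handled. This is a minimal stdlib-only sketch (the handler and IDs are hypothetical, not a Kafka API):

```python
# Idempotent handler: a redelivered duplicate, identified by its message ID,
# produces no second side effect, making at-least-once delivery safe.
processed_ids = set()
results = []

def handle(message_id, payload):
    if message_id in processed_ids:
        return False              # duplicate delivery: skip side effects
    processed_ids.add(message_id)
    results.append(payload)       # first delivery: apply the side effect
    return True

handle("m1", "charge $10")
handle("m1", "charge $10")  # redelivered duplicate is ignored
```

In production the "already processed" set would live in durable storage (a database unique key, for instance), but the principle is the same.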
Exactly Once: “The Holy Grail” 🎯
This is the strongest guarantee, ensuring each message is processed once and only once. It requires more complex configuration and has a slight performance overhead.
Producer Configuration:
- enable.idempotence=true: An idempotent producer will not create duplicate messages in the topic. The broker keeps track of a sequence number for each producer, and if the producer tries to send a message with a sequence number that has already been received, the broker will discard it. This handles producer-side duplicates.
- acks=all: This is required for an idempotent producer.
- transactional.id: To achieve exactly-once semantics across multiple partitions, you use Kafka’s transactional API. The producer is configured with a transactional.id, which allows it to group a series of messages into a single atomic transaction.
Consumer Configuration:
- enable.auto.commit=false: Manual offset commits are necessary.
- isolation.level=read_committed: The consumer will only read messages that are part of a committed transaction. It will not see messages from transactions that were aborted or are still in progress.
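The exactly-once configuration in the same dict style (the property names are real Kafka settings; the transactional.id value is a hypothetical example):

```python
# Exactly-once: idempotent, transactional producer plus a read_committed consumer.
producer_config = {
    "enable.idempotence": True,
    "acks": "all",                       # required when idempotence is enabled
    "transactional.id": "orders-txn-1",  # example id; must be unique per producer instance
}
consumer_config = {
    "enable.auto.commit": False,
    "isolation.level": "read_committed",  # hide aborted and in-flight transactions
}
```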
The Transactional Flow:
- A consumer reads a message from an input topic.
- It processes the message.
- The results are written to an output topic by a producer.
- The offset of the initial message is committed.
All of these steps (producing new messages and committing the original offset) are done within a single Kafka transaction. If any step fails, the entire transaction is rolled back. This ensures that the message is either fully processed and its offset committed, or not at all, thus achieving an “exactly once” end-to-end guarantee.
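The all-or-nothing behavior of this flow can be illustrated with a toy stdlib-only simulation (everything here is hypothetical; the real mechanism is Kafka's transactional producer API): output writes and the offset advance either both happen or neither does.

```python
# Toy simulation of a transactional read-process-write step: on failure the
# transaction aborts, nothing is published, and the offset does not advance,
# so the message is reprocessed on the next attempt.
output_topic = []
committed_offset = [0]

def process_transactionally(offset, message, fail=False):
    staged_output = [f"processed:{message}"]  # writes buffered inside the transaction
    if fail:
        return                            # abort: staged writes are discarded
    output_topic.extend(staged_output)    # commit: publish the output...
    committed_offset[0] = offset + 1      # ...and advance the offset atomically

process_transactionally(0, "a")
process_transactionally(1, "b", fail=True)  # aborted: "b" will be reprocessed
process_transactionally(1, "b")             # retry succeeds; "b" appears exactly once
```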
When to use it: This is ideal for critical applications where data accuracy is paramount and duplicates are intolerable, such as financial ledgers or systems that require strict state management.