Rob Golder

Summary

Kafka Producers can be configured with different acknowledgment settings (acks) to balance performance with message durability, with options for no acknowledgment (0), acknowledgment from the lead replica only (1), or from all in-sync replicas (all).

Abstract

The acks configuration parameter for Kafka Producers determines the number of replicas that must acknowledge a message write before it is considered successful. Setting acks to 0 results in a fire-and-forget approach with the highest performance but also the highest risk of message loss. Configuring acks to 1 waits for the lead replica to acknowledge the write, reducing the risk of message loss with minimal performance impact. Setting acks to all ensures that all in-sync replicas acknowledge the write, providing the strongest guarantee against message loss, albeit with a slight performance cost. The trade-off between performance and durability is a key consideration when configuring this setting, as it affects the system's behavior during message consumption and publishing, as well as the updating of consumer offsets.

Opinions

  • The author suggests that configuring acks to all provides the best guarantee against message loss, which is crucial in scenarios where data integrity is paramount.
  • There is an opinion that the performance cost associated with setting acks to all is generally insignificant for most use cases, implying that the trade-off for increased durability is often worth it.
  • The article implies that the min.insync.replicas setting is important in conjunction with acks set to all to ensure that a minimum number of replicas are in-sync before writes are accepted, thus preventing data loss.
  • The author emphasizes the importance of understanding the behavior of the acks parameter, as it impacts not only durability but also the performance of the Kafka system.
  • It is conveyed that the choice of acks configuration can significantly affect the outcome of message processing, especially in failure scenarios where message loss is a risk.

Kafka Producer Acks

Kafka Producers can be configured to determine how many replicas must acknowledge the write of a message to a topic partition before the message is considered successfully written.

It is important to understand the behaviour of this parameter and the trade-offs being made when configuring this setting, as it impacts durability and performance.

Acks Configuration

The Producer can be configured to wait for 0, 1, or all replicas to acknowledge a message write using the acks configuration parameter.

Configuring as 0 means the write is simply a fire and forget, as the Producer does not await any acknowledgement, and does not know whether the write succeeded or failed.

Configuring as 1 means that the Producer waits for the lead replica of the topic partition to acknowledge the write.

Configuring as all ensures that the Producer only receives acknowledgement of a successful message write once all the current in-sync replicas have received the message. The partition itself will only accept such writes if at least the minimum required number of replicas are in-sync, as configured by the min.insync.replicas setting. If too few replicas are in-sync then the write is rejected with an error, and the Producer can retry the write if configured to do so.
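The broker-side check described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not Kafka's actual implementation: the function name broker_accepts_write and its parameters are hypothetical and exist only for illustration. It also highlights that min.insync.replicas is only enforced for writes sent with acks set to all.

```python
def broker_accepts_write(acks: str, in_sync_replicas: int, min_insync_replicas: int) -> bool:
    """Simplified model of whether a partition leader accepts a write.

    Hypothetical helper for illustration only, not part of any Kafka client.
    in_sync_replicas counts the leader itself, as the in-sync replica set does.
    """
    if acks not in ("0", "1", "all"):
        raise ValueError(f"unsupported acks value: {acks!r}")
    if in_sync_replicas < 1:
        # No leader is available, so nothing can accept the write.
        return False
    if acks == "all":
        # The write is rejected when too few replicas are in-sync.
        return in_sync_replicas >= min_insync_replicas
    # For acks=0 and acks=1 the min.insync.replicas check is not applied.
    return True


# Three replicas, min.insync.replicas=2, one follower has fallen behind.
print(broker_accepts_write("all", in_sync_replicas=2, min_insync_replicas=2))
# A second follower falls behind, leaving only the leader in-sync.
print(broker_accepts_write("all", in_sync_replicas=1, min_insync_replicas=2))
# The same degraded partition still accepts a write sent with acks=1.
print(broker_accepts_write("1", in_sync_replicas=1, min_insync_replicas=2))
```

In this model a partition with only its leader in-sync rejects acks=all writes but still accepts acks=1 writes, which is why the two settings need to be considered together.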

This then results in a trade-off between performance and durability. Requiring fewer replicas to acknowledge the message leads to improved performance at the expense of the risk of message loss during failure scenarios.
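As a rough illustration of the three options, the snippet below builds Producer settings as plain dictionaries. The keys bootstrap.servers and acks are standard Kafka Producer property names, while the helper name producer_config and the broker address localhost:9092 are assumptions made for this example. The dictionaries are not passed to a real client, so the sketch runs without a broker.

```python
def producer_config(acks: str, bootstrap_servers: str = "localhost:9092") -> dict:
    """Build Producer settings for a given acks level.

    Hypothetical helper: the property names are standard Kafka Producer
    settings, but the broker address is a placeholder assumption.
    """
    allowed = {"0", "1", "all"}
    if acks not in allowed:
        raise ValueError(f"acks must be one of {sorted(allowed)}, got {acks!r}")
    return {
        "bootstrap.servers": bootstrap_servers,
        "acks": acks,
    }


fire_and_forget = producer_config("0")   # best performance, highest risk of loss
leader_only = producer_config("1")       # waits for the lead replica only
most_durable = producer_config("all")    # waits for all in-sync replicas

for config in (fire_and_forget, leader_only, most_durable):
    print(config)
```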

Acks Behaviour

For the purposes of demonstrating the differences in behaviour based on the configuration of the Producer acks property, consider the flow where a message is consumed by a service and a resulting outbound event is published by the Producer. When the outbound message is successfully published then the original message that triggered the flow is marked as consumed by updating the consumer offsets.

Figure 1: consume, produce & update offsets
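The flow in Figure 1 can be sketched as follows. The publish and commit_offset callables stand in for a real Producer and consumer, and every name here is hypothetical. The point to note is that the offset is only committed once the publish reports success, so what counts as success depends on the acks setting.

```python
from typing import Callable


def handle_message(
    inbound: str,
    publish: Callable[[str], bool],
    commit_offset: Callable[[], None],
) -> bool:
    """Consume one message, publish the resulting event, then commit.

    publish returns True when the Producer considers the write successful.
    Returns True if the inbound message was marked as consumed.
    """
    outbound = f"processed:{inbound}"
    if not publish(outbound):
        # Leave the offset uncommitted so the message can be redelivered.
        return False
    commit_offset()
    return True


published = []
committed = []


def fake_publish(event: str) -> bool:
    # Stand-in for a Producer send that was acknowledged.
    published.append(event)
    return True


def fake_commit() -> None:
    # Stand-in for updating the consumer offsets.
    committed.append("offset committed")


print(handle_message("order-1", fake_publish, fake_commit))
print(published, committed)
```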

When Producer acks is configured as 0, the Producer does not seek any acknowledgement of a successful write. This is the most performant option, but it carries the highest risk of message loss. If the lead replica dies before the message is replicated, the Producer still considers the message successfully sent, so the consumer offsets are updated, marking the original message as consumed despite the outbound event being lost.

Figure 2: acks = 0, resulting in message loss

If acks is configured as 1, then only the lead replica of the partition needs to acknowledge the message for the write to be considered successful. This greatly reduces the chance of message loss for a minimal performance cost. The risk is that the lead replica could die after acknowledging the write but before the message has been replicated to the other broker nodes. The Producer would not then re-publish the message, resulting in the message being lost.

Figure 3: acks = 1, resulting in message loss
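The failure scenario in which the lead replica dies before replication can be reproduced with a toy model. ToyPartition is a hypothetical class written for this illustration. Real Kafka followers fetch from the leader asynchronously, which this sketch collapses into a single explicit step. It compares what the Producer believes with what actually survives for each acks setting.

```python
class ToyPartition:
    """Toy model of one partition with a lead replica and followers.

    Hypothetical illustration only, not how Kafka is implemented.
    """

    def __init__(self, follower_count: int = 2) -> None:
        self.leader_log = []
        self.follower_logs = [[] for _ in range(follower_count)]

    def write(self, message: str, acks: str, leader_dies_before_replication: bool) -> bool:
        """Return True if the Producer treats the write as successful."""
        self.leader_log.append(message)
        if acks in ("0", "1"):
            # acks=0 never waits; acks=1 is acknowledged by the leader alone.
            acked = True
        else:
            # acks=all is acknowledged only once replication has completed.
            acked = not leader_dies_before_replication
        if leader_dies_before_replication:
            self.leader_log.clear()  # the leader's copy is lost with the broker
        else:
            for log in self.follower_logs:
                log.append(message)
        return acked

    def survives(self, message: str) -> bool:
        """True if any remaining replica still holds the message."""
        return message in self.leader_log or any(
            message in log for log in self.follower_logs
        )


for acks in ("0", "1", "all"):
    partition = ToyPartition()
    acked = partition.write("event-1", acks, leader_dies_before_replication=True)
    lost_silently = acked and not partition.survives("event-1")
    print(f"acks={acks}: acked={acked}, lost silently={lost_silently}")
```

In this model only acks=all avoids a silent loss: the write is not acknowledged, so the Producer knows it has to retry or fail the processing.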

With the same failure scenario as above, but with acks configured as all, the lead replica will not have acknowledged receipt of the message from the Producer, because it must first replicate the message across the in-sync replicas before it does so. As the attempt to write the message times out, the Producer is able to continue as required, be it to retry the publish or to fail the message processing. Typically the retry will happen automatically in the Kafka client library unless it is configured not to do so.

Figure 4: acks = all, resulting in message retry
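The retry behaviour can be sketched as a simple loop. Kafka client libraries handle this internally, so the code below is only a model: send_with_retries, send_once and max_attempts are hypothetical names, and Python's built-in TimeoutError stands in for the write timing out without an acknowledgement.

```python
from typing import Callable


def send_with_retries(send_once: Callable[[], None], max_attempts: int = 3) -> int:
    """Call send_once until it succeeds; return the number of attempts used.

    send_once raises TimeoutError when no acknowledgement arrives in time.
    The last TimeoutError is re-raised once max_attempts is exhausted, so the
    caller can fail the message processing instead of updating the offsets.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            send_once()
            return attempt
        except TimeoutError:
            if attempt == max_attempts:
                raise
    raise AssertionError("unreachable")


# Simulated broker: the first attempt times out because the lead replica died,
# the second attempt is acknowledged.
outcomes = iter([TimeoutError("no acknowledgement from lead replica"), None])


def flaky_send() -> None:
    outcome = next(outcomes)
    if outcome is not None:
        raise outcome


print(send_with_retries(flaky_send))
```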

With acks configured as all, once the lead replica has received acknowledgement of receipt of the message from all of its in-sync replicas, it is able to send acknowledgement of receipt of the message back to the Producer. Even if the lead replica now dies, the message will not be lost as it has been safely replicated.

Figure 5: acks = all, with successful replication

Configuring acks to all therefore provides the strongest available guarantee of avoiding message loss. It does come at a small performance cost, although this is usually considered insignificant for the majority of use cases.

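Configuration Summary

  • acks = 0: the Producer does not await any acknowledgement. Most performant option, with the highest risk of message loss.
  • acks = 1: the Producer awaits acknowledgement from the lead replica only. Minimal performance cost, but the message is lost if the lead replica dies before it is replicated.
  • acks = all: the Producer awaits acknowledgement that all current in-sync replicas have received the message. Small performance cost, with the strongest available guarantee against message loss.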

More On Kafka

Head over to Lydtech Consulting to read this article and many more on Kafka and other interesting areas of software development.

