Ordering of Events in Apache Kafka

Now it's time to understand the order in which events are stored in a Kafka topic. Let's assume we have a topic called profile updated event topic. Every time a user updates their profile, a new event is published to this topic. Let's assume that a user changed the spelling of their first name and clicked on the update button. A new event was published and persisted into partition zero. With this change, the spelling of the user's first name has changed to navab. Then this user changed their name again and clicked on the update button again. A new event was published, and this time it was stored in partition one. The user changed their mind again and updated the spelling of their first name again. A new event was published, and this time it was stored in partition two.

And here's a very important detail. If you remember from previous lessons, a Kafka message is a key-value pair, where the message key can be a string value and the message value can be the event details in the form of a JSON payload. When the message key is not provided, the event can be stored in any partition: partition zero, partition one, or partition two.
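To make the key-value structure concrete, here is a minimal Python sketch of such a message. The Message class, the user id user-42, and the payload field names are hypothetical and chosen only for illustration; they are not part of any Kafka client API.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    key: Optional[str]  # None models "no message key provided"
    value: str          # event details serialized as a JSON string


# Hypothetical event payload for a profile update.
payload = {"userId": "user-42", "firstName": "Alex"}

keyed = Message(key="user-42", value=json.dumps(payload))
keyless = Message(key=None, value=json.dumps(payload))

print(keyed.key, json.loads(keyed.value)["firstName"])
print(keyless.key)
```

The only difference between the two messages is the key, and that difference is what decides how a partition gets chosen.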

Kafka will make this decision for us. When the message key is not provided, Kafka will try to load balance events and store each new event in one of the available partitions. Another important detail is that the consumer microservice will consume events from topic partitions in parallel. Because the Kafka consumer reads from partitions in parallel, there is no guarantee of the order in which the consumer microservice will receive events that are stored in different partitions.
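As a rough illustration of keyless events being spread across partitions, the sketch below uses a simple round-robin over three partitions. This is a simplification and an assumption: real Kafka producers may use a different strategy depending on the client version (for example, sticking to one partition per batch). The labels A, B, and C stand for the three profile updates in the order they were published.

```python
from itertools import cycle

NUM_PARTITIONS = 3

# Each partition is modeled as an append-only list.
partitions = {p: [] for p in range(NUM_PARTITIONS)}

# Simplified stand-in for the producer's keyless load balancing.
next_partition = cycle(range(NUM_PARTITIONS))

for event in ["A", "B", "C"]:
    partitions[next(next_partition)].append(event)

print(partitions)  # {0: ['A'], 1: ['B'], 2: ['C']}
```

Each event lands in a different partition, so nothing ties the three updates together once they are stored.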


If we call these three events A, B, and C in the order they were published, the consumer could read event A first, then C, then B. It could read them in the order B, A, C, or in the order C, B, A. This can be a problem if the order in which these events are processed is important. Here we want the events to be processed in the order A, B, C, and this is very important: if the order is not followed, we will update the user's first name in the database with an incorrect value.
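To see why the order matters, here is a small sketch that applies the three updates in every possible order and reports the name that ends up in the database. The names attached to A, B, and C are hypothetical, and the database is modeled as a single variable where the last write wins.

```python
from itertools import permutations

# Hypothetical first-name values carried by each event.
updates = {"A": "Jon", "B": "John", "C": "Jonathan"}


def final_name(order):
    """Apply updates in the given order; the last one applied wins."""
    name = None
    for event in order:
        name = updates[event]
    return name


for order in permutations("ABC"):
    print("".join(order), "->", final_name(order))
```

Only the orders that finish with C leave the latest spelling in place. Any order that ends with A or B overwrites the newest value with a stale one.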

So in those situations, when the order in which events are processed is important, we should always provide a message key. If a message key is provided, Kafka will use it to determine which partition of the topic the event should be written to.

When the same message key is used for different events, those events will be stored in the same topic partition. If I use the user id as a message key, for example, then all events that I send with this message key will be persisted in the same partition.

For example, if I send the first event with a message key, it might get stored in the first partition, with index zero. If I publish one more event with exactly the same message key, Kafka will store this event in the same partition as the other events that have exactly the same message key. It will take the message key, hash it, and then use this hash value to determine which partition to use to store this event. It always stores messages with the same message key in the same partition, and this is how Kafka helps us to achieve ordering of events. If we send another event with exactly the same message key, it will again go to the partition where the events with that message key are stored.
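The idea of hashing the key to pick a partition can be sketched in a few lines. This is not Kafka's actual partitioner: Kafka's default uses a murmur2 hash of the serialized key, while this sketch substitutes CRC32 from the Python standard library as a stand-in. The function name partition_for and the key user-42 are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index with a stable hash (CRC32 stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


first = partition_for("user-42", 3)
second = partition_for("user-42", 3)
third = partition_for("user-42", 3)

# The same key always maps to the same partition index.
print(first == second == third)  # True
```

Because the result depends on the number of partitions, adding partitions to a topic later can map the same key to a different partition than before.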

In this case, when the consumer microservice reads events from topic partitions, even if it reads events from other partitions in parallel, events with the same message key will be processed in exactly the same order in which they were persisted.
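The following sketch simulates this guarantee under simplifying assumptions. Partitions are plain lists, the key-to-partition mapping uses CRC32 as a stand-in for Kafka's hash, and the consumer is modeled as picking a random partition for each read while always reading that partition front to back. The keys user-1 and user-2 and the event labels are hypothetical.

```python
import random
import zlib

NUM_PARTITIONS = 3


def partition_for(key, num_partitions=NUM_PARTITIONS):
    # Stable hash of the key, used as a stand-in for Kafka's partitioner.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# (key, value) pairs in the order they were published.
published = [
    ("user-1", "A"), ("user-2", "X"), ("user-1", "B"),
    ("user-2", "Y"), ("user-1", "C"),
]

partitions = {p: [] for p in range(NUM_PARTITIONS)}
for key, value in published:
    partitions[partition_for(key)].append((key, value))


def consume(partitions, rng):
    """Interleave partitions at random, reading each one in stored order."""
    cursors = {p: 0 for p in partitions}
    received = []
    while True:
        ready = [p for p in partitions if cursors[p] < len(partitions[p])]
        if not ready:
            return received
        p = rng.choice(ready)
        received.append(partitions[p][cursors[p]])
        cursors[p] += 1


received = consume(partitions, random.Random())
print([value for key, value in received if key == "user-1"])  # ['A', 'B', 'C']
```

However the reads are interleaved across partitions, the events for user-1 always arrive as A, B, C, because they all live in one partition and that partition is read in order.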