Setting Up Your Local Event-Driven Environment Using Kafka Docker
Spin up a Kafka cluster in a Docker container and learn how to create topics and produce and consume messages
Event-driven architecture is a modern architectural style used in many applications today. Many tools have emerged over the past few years to support it, for example, AWS SNS, SQS, RabbitMQ, and Apache Kafka.
The main purpose of an event-driven architecture is to decouple your services: producers publish messages to one or more queues, and consumers poll or consume from them. Because the two sides only share the queue, we can replace either the producer or the consumer fairly easily.
In this piece, we’re going to set up Kafka locally with Docker and use its CLI tools to perform basic operations: creating a topic, publishing messages, and consuming them.
The Docker images we will be using are as follows:
confluentinc/cp-zookeeper:5.4.0
confluentinc/cp-server:5.4.0
confluentinc/cp-kafka:5.4.0
The first thing we’ll do is create a docker-compose.yml file. The advantage of a compose file is that it groups the services that make up our application in one place. We can then spin up all the containers defined in that file with one command: docker-compose up.
An alternative would be to use the docker run command, but then we would need to execute it three times, once for each Docker image we want to spin up a container from.
As mentioned earlier, we will have three Docker containers running. So, our compose file will consist of three services with the following names: zookeeper, broker, and kafka-tools.
Our local Kafka cluster is made up of the zookeeper and broker containers. broker is Apache Kafka, and it relies on Apache ZooKeeper, which is why we declare that dependency in the compose file.
Lastly, we have the kafka-tools container, which contains the command-line tools that we can use to interact with the broker.
Notice that for the kafka-tools container, we added network_mode: "host". This makes the container share the network of the Docker host, so localhost inside the container refers to our machine.
As a result, we can connect to the broker easily, because it exposes itself on localhost:9092. We’ll see more about this later.
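The compose file described above might look like the following sketch. The image tags and container names are taken from the docker ps output shown later in this piece; the environment variables are assumptions modelled on typical Confluent single-broker settings, so adjust them to your setup. It is written as a shell heredoc so that it can be pasted into a terminal from the directory where the file should live.

```shell
# Writes docker-compose.yml into the current directory (overwrites an existing file).
# The environment values below are assumptions, not the author's exact settings.
cat > docker-compose.yml <<'EOF'
version: '3'
services:
  zookeeper:
    image: confluentinc/cp-zookeeper:5.4.0
    hostname: zookeeper
    container_name: zookeeper
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  broker:
    image: confluentinc/cp-server:5.4.0
    hostname: broker
    container_name: broker
    # The broker needs ZooKeeper, so start zookeeper first.
    depends_on:
      - zookeeper
    ports:
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      # Clients on the Docker host reach the broker through localhost:9092.
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      # Single-broker cluster, so internal topics can only have one replica.
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_CONFLUENT_LICENSE_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0

  kafka-tools:
    image: confluentinc/cp-kafka:5.4.0
    hostname: kafka-tools
    container_name: kafka-tools
    # Keep the container alive so we can exec into it and use the CLI tools.
    command: ["tail", "-f", "/dev/null"]
    network_mode: "host"
EOF
```

The kafka-tools service runs no broker of its own; it only idles so that its bundled command-line tools stay available.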
Start the Containers
We are going to start the three containers we defined in the docker-compose.yml file. Go to your terminal, change directory to where you created the docker-compose.yml file, and execute the following command.
~/demo/kafka-local ❯ ls -l
total 8
-rw-r--r-- 1 billyde staff 1347 12 Feb 23:06 docker-compose.yml
~/demo/kafka-local ❯ docker-compose up -d
Creating network "kafka-local_default" with the default driver
Creating kafka-tools ... done
Creating zookeeper ... done
Creating broker ... done
Finally, to check whether all the containers are running, run this command.
~/demo/kafka-local ❯ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
748c5da81038 confluentinc/cp-server:5.4.0 "/etc/confluent/dock…" 5 seconds ago Up 4 seconds 0.0.0.0:9092->9092/tcp broker
5044ae334235 confluentinc/cp-zookeeper:5.4.0 "/etc/confluent/dock…" 5 seconds ago Up 4 seconds 2888/tcp, 0.0.0.0:2181->2181/tcp, 3888/tcp zookeeper
ff1f9070dc42 confluentinc/cp-kafka:5.4.0 "tail -f /dev/null" 5 seconds ago Up 4 seconds 9092/tcp kafka-tools
If all of the containers are listed and their STATUS is Up xx seconds, then we’re good. Happy days!
Create a Kafka Topic
Like other message brokers, Kafka organises messages into topics. Each topic usually represents a particular kind of event, for example, a user-activity topic whose messages relate to user activities.
Let’s see how we can create a topic using the CLI tools.
First, we need to get into the kafka-tools container because that’s where our Kafka CLI tools reside. To do this, execute the following command from your terminal.
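A minimal sketch of that step, assuming the container is named kafka-tools as in the docker ps output above. The function name enter_kafka_tools is a hypothetical helper for illustration; running the docker exec line directly works just as well.

```shell
# Run on the Docker host: open an interactive bash session
# inside the running kafka-tools container.
enter_kafka_tools() {
  docker exec -it kafka-tools /bin/bash
}

# Usage (from your terminal):
#   enter_kafka_tools
```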
If you see root@kafka-tools:/#, you’re in! Let’s create a topic, and we’ll call it to-do-list because it will contain the list of things we need to do.
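The topic creation can be sketched as follows, using the kafka-topics tool that ships with the Confluent images and the options discussed in this section. The wrapper function create_todo_topic is a hypothetical name used only for illustration.

```shell
# Run inside the kafka-tools container: create the to-do-list topic
# with 2 partitions and a replication factor of 1.
create_todo_topic() {
  kafka-topics --create \
    --bootstrap-server localhost:9092 \
    --replication-factor 1 \
    --partitions 2 \
    --topic to-do-list
}

# Usage:
#   create_todo_topic
```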
The command actually returns nothing, and nothing means good. To check whether the topic was actually created or not, we can run the following command.
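One way to run that check is to list the topics known to the broker, sketched below. The function name list_topics is a hypothetical helper; if the topic was created, to-do-list should appear among the names printed.

```shell
# Run inside the kafka-tools container: list all topics on the broker.
list_topics() {
  kafka-topics --list --bootstrap-server localhost:9092
}

# Usage:
#   list_topics
```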
Alright, before moving on to the next section, let’s discuss a few things here.
--bootstrap-server localhost:9092: Remember this from the earlier section? If we didn’t add network_mode: "host" to the kafka-tools service, it would not be able to connect to the broker. Instead, this is what we would see.
[2020-02-12 12:40:27,990] WARN [AdminClient clientId=adminclient-1] Connection to node -1 (localhost/127.0.0.1:9092) could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
--replication-factor 1: This specifies how many replicas of each topic partition we want. Replication is for fault tolerance. For instance, with a factor of 3, three brokers will each hold a copy of the same partition data. For each partition, one of the three brokers will be the leader and the remaining two will be the followers.
--partitions 2: This sets how many partitions the topic has. Whenever a Kafka producer sends a message (or record) to this topic, the record is written to exactly one of the partitions. The partition is chosen based on the record’s key, so records with the same key land in the same partition.
--topic to-do-list: As you might have guessed, this argument specifies the name of the topic we want to create.
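To see how these options ended up on the broker, we could describe the topic. This is an optional extra step, not part of the original walkthrough, and describe_todo_topic is a hypothetical helper name. The output should report the partition count, the replication factor, and the leader for each partition.

```shell
# Run inside the kafka-tools container: show partitions, replicas,
# and leaders for the to-do-list topic.
describe_todo_topic() {
  kafka-topics --describe \
    --bootstrap-server localhost:9092 \
    --topic to-do-list
}

# Usage:
#   describe_todo_topic
```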
Send Messages to the Topic
Our topic is ready, which means we can put some messages into it. To do this, we are going to use kafka-console-producer. Let’s take a look at how we can send key-value messages.
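A sketch of the producer invocation, assuming the Kafka version bundled with the 5.4.0 images, where the console producer takes --broker-list (newer Kafka releases accept --bootstrap-server instead). The function name start_producer and the sample records in the comments are hypothetical and only illustrate the key:value format.

```shell
# Run inside the kafka-tools container: start an interactive producer
# that reads key:value records from the terminal.
start_producer() {
  kafka-console-producer \
    --broker-list localhost:9092 \
    --topic to-do-list \
    --property parse.key=true \
    --property "key.separator=:"
}

# Usage:
#   start_producer
# Then type one record per line at the > prompt, for example (made-up items):
#   1:Buy groceries
#   2:Pay the bills
#   3:Walk the dog
```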
The option --property parse.key=true tells the producer that we want to send a message with a key. Additionally, we also need to tell the producer what separator will be used to separate the key from the value (or the message itself), which is specified by this option --property "key.separator=:".
We can close the producer once we finish sending messages. Just press Ctrl+C to terminate it.
Consume Messages from the Topic
We published some messages to the topic in the previous section, so let’s see if we can consume them. To do this, we are going to use kafka-console-consumer.
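A sketch of the consumer invocation, using the options explained in the notes that follow. The function name start_consumer is a hypothetical helper; the consumer keeps running and prints each record it reads from the topic.

```shell
# Run inside the kafka-tools container: read the to-do-list topic
# from the earliest offset and print each record's key with its value.
start_consumer() {
  kafka-console-consumer \
    --bootstrap-server localhost:9092 \
    --topic to-do-list \
    --from-beginning \
    --property "print.key=true"
}

# Usage:
#   start_consumer
```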
Perfect! We can see all three messages we produced in the previous section.
A few things to note:
--from-beginning: A Kafka consumer tracks its progress through a topic by offset. This option tells the consumer to start from the earliest offset available if it does not already have an established offset (which, in this case, it doesn’t).
--property "print.key=true": This option tells the consumer to print the key of each message to the console. If it is not specified, the consumer prints only the message values.
As with the producer, just press Ctrl+C to terminate the consumer.
Wrap Up
In this tutorial, we learnt how to spin up a local Kafka cluster with Docker. We also performed some basic operations, including creating a topic and producing and consuming messages, all via the Kafka command-line tools.
By now, we should all have a basic understanding of how Kafka works. I really hope that this inspires you to explore more and build event-driven applications with Kafka.
It’s not uncommon for a Kafka application to have DynamoDB coupled with it. For this reason, you might be interested in setting up a local instance of DynamoDB as well. You can check out this tutorial for that.
Also, check out Avro and Schema Registry. They are awesome tools that will allow you to define schemas that tell your producers and consumers how to serialise and deserialise (respectively) Kafka messages from and to your POJOs or data classes.