Apache Kafka is undoubtedly one of the most prominent pieces of software in today’s distributed architectures, supported by cloud providers and on-premise setups alike. It can store and process millions of messages per second, which makes it a perfect fit for distributed real-time data processing solutions. Although it provides advanced features like stream processing and horizontal scalability, it is very easy to use: you send your messages to it (produce) and later pick them up (consume) for further processing. It has all the features you would expect from a modern distributed data store: a communication interface, persistence, scalability, streaming and management. It was originally developed at LinkedIn for data processing and was later donated to the Apache Software Foundation. I highly recommend having a look at its documentation.

k6 is a free and open-source load testing tool built around the idea of unit testing, but for performance. While testing a system or platform, it can produce hundreds of result messages per second. These messages can be sent to a number of different outputs, such as a JSON file or Apache Kafka; the Kafka output works out of the box with no dependency other than the Apache Kafka setup itself.

To send messages from k6 to Apache Kafka, we’ll go with a very simple setup consisting of only one instance. In our scenario, k6 acts as a simple producer of JSON messages, and we’ll consume those messages in a terminal using Kafka’s built-in console consumer. I will use the official binaries provided by Apache and leave the Docker setup for you to try. To give you a clue, with Lenses.io’s fast-data-dev you can have a complete Docker environment with Kafka, ZooKeeper, Schema Registry, Kafka Connect, Landoop tools and more than 20 connectors in an easy-to-use package.
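If you want to try that route, a single command brings up the whole stack. The command below is a sketch: the image name is the one in use at the time of writing, and only the ports needed for this article are mapped (ADV_HOST tells the container which address to advertise to clients):

$ docker run --rm -p 2181:2181 -p 9092:9092 -e ADV_HOST=127.0.0.1 lensesio/fast-data-dev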

Since we’re using the binary package, no special installation is needed, and you can directly invoke the binary from the command line. I have used Debian GNU/Linux 10.2 to run these commands. Simply follow the instructions below:

1. Download and extract Apache Kafka platform binaries

The current version as of this writing is 2.3.0, but you can choose your own desired version.

$ wget http://apache.mirrors.spacedump.net/kafka/2.3.0/kafka_2.12-2.3.0.tgz

$ tar xvf kafka_2.12-2.3.0.tgz

$ cd kafka_2.12-2.3.0

2. Start Apache ZooKeeper and Kafka

First, start ZooKeeper. It will listen on all interfaces (0.0.0.0) on port 2181.

$ bin/zookeeper-server-start.sh config/zookeeper.properties

Now start the Kafka server in a new terminal. It will connect to your local ZooKeeper instance on port 2181 and start listening for new connections on port 9092.

$ bin/kafka-server-start.sh config/server.properties

[2019-11-15 12:43:54,672] INFO Connecting to zookeeper on localhost:2181 (kafka.server.KafkaServer)

...

[2019-11-15 12:43:55,366] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.Acceptor)

...

[2019-11-15 12:43:55,666] INFO [KafkaServer id=0] started (kafka.server.KafkaServer)

3. Create a topic to send k6 data to

In another terminal, create a topic, for example k6-output, to receive the k6 messages. This will be a single-partition topic with no extra replication.

$ bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic k6-output
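To double-check that the topic exists before pointing k6 at it, you can describe it using the same script and bootstrap server:

$ bin/kafka-topics.sh --describe --bootstrap-server localhost:9092 --topic k6-output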

4. Start k6 with Kafka output

Using a load test script saved in a file named scenario.js, we’ll start k6 and send its output to Kafka:
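In case you need one, a minimal scenario.js could look like the following sketch; the target URL and the check are just placeholders for your own test:

```
// scenario.js - a minimal k6 test script (the URL below is an example)
import http from "k6/http";
import { check, sleep } from "k6";

export default function () {
  // Each iteration produces metric samples that k6 forwards to Kafka
  const res = http.get("https://test.k6.io");
  check(res, { "status is 200": (r) => r.status === 200 });
  sleep(1);
}
```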

$ k6 run --logformat=raw --out kafka=brokers=localhost:9092,topic=k6-output,format=json scenario.js

5. Consume messages in a terminal

To view the k6 output messages in your terminal, use the built-in console consumer:

$ bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic k6-output

As you can see, we’ve sent our messages to Kafka in JSON format, so you can write a simple JSON consumer in your own desired programming language (e.g. Python) and start consuming and using the data provided by k6. There are other possibilities too, such as platforms that can import or consume data from Apache Kafka, like Apache Spark or Kafka Connect from the Confluent Platform, to name a few.
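As a starting point, here is a sketch of such a consumer in Python. The parsing helper uses only the standard library; the consumer loop assumes the third-party kafka-python package (and a running broker), which is why its import is kept local. The sample line mirrors the shape of a "Point" sample as k6 writes it with format=json:

```python
import json

def parse_k6_sample(line):
    """Extract the metric name and value from one k6 JSON output line."""
    sample = json.loads(line)
    data = sample.get("data", {})
    return sample.get("metric"), data.get("value")

def consume(topic="k6-output", brokers="localhost:9092"):
    # Assumes the kafka-python package; imported locally so the parsing
    # helper above works even without it installed.
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(topic, bootstrap_servers=brokers)
    for record in consumer:
        print(parse_k6_sample(record.value.decode("utf-8")))

# An example line in the shape k6's JSON output produces:
line = '{"type":"Point","metric":"http_req_duration","data":{"time":"2019-11-15T12:45:00Z","value":205.32,"tags":{"status":"200"}}}'
metric, value = parse_k6_sample(line)
```

From here you can aggregate the values, forward them elsewhere, or filter by metric name.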

You can also use visualization platforms that have support for Kafka as an input source. I recommend reading the following articles about it:

Just Enough Kafka for the Elastic Stack, Part 1

Just Enough Kafka for the Elastic Stack, Part 2

Guest Blog Post: How the k6 Load Testing Tool Is Leveraging Grafana

I hope you’ve found this article helpful and were able to send your k6 output to Apache Kafka, an industry-proven technology.

If you have any suggestions or comments, I would really like to hear from you.