KSQL is the SQL streaming engine for Apache Kafka®. Using SQL statements, you can build powerful stream processing applications. In this article, we'll see how to troubleshoot some of the common issues that people encounter with KSQL. More advanced topics, such as digging deeper into KSQL's internals, will be covered in a subsequent article.

You can use this article as a reference, or you can follow along with the code examples and try them out as we go. Using Docker and Docker Compose, we can easily provision an environment in which to explore the different techniques and tools. The environment includes a data generator sending a continuous stream of events into a Kafka topic that we will use for testing. You can find all the necessary code on GitHub.

So, now let’s dive in and start exploring what to do when things aren’t working…

Probably the most common question in the Confluent Community Slack group's #ksql channel is:

"Why isn't my KSQL query returning any data?"

That is, you've run a CREATE STREAM, but when you go to query it…

…nothing happens. And because KSQL queries are continuous, your KSQL session appears to "hang." That's because KSQL is continuing to wait for any new messages to show you. So if you run a KSQL SELECT and get no results back, what could be the reasons for that?

The answer usually comes down to one of five main reasons (the first four of which are variations on a theme):

1. There is no data in the source topic
2. No new data is arriving in the topic
3. KSQL is reading from the latest offset of the topic, and nothing new is arriving
4. Data is read from the topic, but none of it matches the predicate specified
5. Deserialization errors in reading the data

Let's look at each of these in turn and see how to diagnose them.

To start with, we believe we've got data in a topic called ratingz, so we register it with KSQL as a STREAM and run a SELECT against it.

No data comes back—time to do some detective work. (To exit from the running SELECT statement, press Ctrl-C.)

First, we need to confirm which Kafka topic we're using to drive the stream. We think we know (we just ran the CREATE STREAM, right?), but as with any good troubleshooting approach, the key is methodically working through the possibilities. So, using DESCRIBE EXTENDED, we can verify the topic from which the stream is sourced: our source topic is ratingz.

Now, we will step away from KSQL and use another Kafka consumer to verify whether there is data in the topic. My preferred tool here is kafkacat, but you can use other tools, such as kafka-console-consumer, if you like. Invoking kafkacat shows:

$ docker run --network ksql-troubleshooting_default --tty --interactive --rm \
    confluentinc/cp-kafkacat \
    kafkacat -b kafka:29092 \
    -C -t ratingz \
    -o beginning
% Reached end of topic ratingz [0] at offset 0

The -o beginning argument tells kafkacat to go back to the beginning of the topic, and -C to read (consume) the messages. Reached end of topic shows that there's no data to read. No data means that KSQL isn't going to be showing anything in the output of a SELECT.

Let's fix our stream and use the ratings topic (with an s this time). Register it with KSQL, dropping the previous version first.

As before, try to query the stream, and you'll find that there's still no data being returned.

Like last time, first verify the source topic for the stream that we're querying, and use kafkacat to check whether there's any data in it. This time, it turns out that there are thousands of messages in the topic. But, by default, KSQL reads from the end of a topic, and no new messages are being written to it. As soon as new messages are sent to the topic, the SELECT returns results.

Kafka is an immutable log of events, and data is persisted according to the retention settings. When an application reads data from a Kafka topic, the data remains in place; what gets recorded is the offset in the log up to which that particular application has read. Another application can read the same data from the same topic with complete independence from the first. The main thing is that there is a log of data, and consuming applications choose the point on the log at which they want to start reading.

When KSQL reads data from a topic, it will by default read from the latest offset—that is to say, only new messages arriving in the topic after the topic is registered in KSQL will be read.

Oftentimes—and particularly in testing and development—you'll want to read the data that already exists in a topic. To tell KSQL to do this, change the offset property:

SET 'auto.offset.reset'='earliest';

Now when you run a SELECT, KSQL will return the data from the beginning of the topic. The SELECT will still run continuously, so if there is new data arriving, you'll see that—and if there isn't, the SELECT will just hang and wait for new data (or for you to cancel the query).
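KSQL's default of reading from the latest offset, and the effect of setting 'auto.offset.reset' to 'earliest', can be sketched as a toy model in Python. This is purely illustrative: the log list and the read_from helper are invented for the sketch and are not part of any Kafka client API.

```python
# Toy model of Kafka consumer offsets (illustrative only, not the Kafka API).
# The topic is an immutable log; each consumer independently chooses the
# offset at which it starts reading.

log = ["rating-1", "rating-2", "rating-3"]  # messages already in the topic

def read_from(log, auto_offset_reset):
    """Return the messages a brand-new consumer would see, given the policy."""
    # 'latest' starts at the end of the log, so only future messages show up;
    # 'earliest' starts at offset 0 and replays everything already there.
    start = len(log) if auto_offset_reset == "latest" else 0
    return log[start:]

print(read_from(log, "latest"))    # [] -> like a SELECT that appears to hang
print(read_from(log, "earliest"))  # all three existing messages
```

With 'latest', a consumer sees nothing until a producer writes new messages, which is exactly why a SELECT against a quiet topic appears to hang.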

4. Data is read from the topic, but none of it matches the predicate specified

Since KSQL is based on SQL, you can restrict the data you want to return using predicates. These include WHERE and HAVING, as well as, by implication, JOIN.

If your query is not returning data when you think it should, and you've already followed the other advice in this article, then take the standard troubleshooting course of methodically stripping back the query's components until it does.

Consider this example:

SELECT * FROM RATINGS WHERE STARS > 5

But if no star ratings are above 5, no data will be returned. To debug this, perhaps remove the predicate and bring in the STARS column to verify the data:

ksql> SELECT STARS FROM RATINGS;
3
3
2
1

Or, use KSQL’s aggregate functionality to look at the maximum value:

ksql> SELECT CHANNEL, MAX(STARS) FROM RATINGS GROUP BY CHANNEL;
android | 2
iOS | 4
android | 2

From just a sample of the data, it looks like the issue here could be that STARS is never greater than 5, and thus the SELECT in the example above will never return data.
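The same stripping-back technique can be mimicked outside KSQL. Here is a small Python sketch using hypothetical ratings values consistent with the sample output above (the rows and field names are invented for illustration, not pulled from the real ratings topic):

```python
# Hypothetical sample data matching the STARS/CHANNEL output shown above.
ratings = [
    {"channel": "android", "stars": 3},
    {"channel": "iOS", "stars": 4},
    {"channel": "android", "stars": 2},
    {"channel": "iOS", "stars": 1},
]

# Equivalent of SELECT * FROM RATINGS WHERE STARS > 5 -- it can never match:
matching = [r for r in ratings if r["stars"] > 5]
print(matching)  # []

# Strip the predicate back and inspect the column instead:
print(max(r["stars"] for r in ratings))  # 4 -- STARS is never greater than 5
```

The data is present the whole time; the predicate simply excludes all of it, which is why removing the predicate is the quickest way to see what is really in the stream.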

5. Deserialization errors in reading the data

Data in Kafka is just bytes. It’s up to the producer how it serializes the source message; the consumer (which is KSQL here) needs to deserialize using the same method. Common serialization formats include Avro, JSON, etc.

If KSQL cannot deserialize message data, it will not write anything to the SELECT results. If this happens, you could have checked the three situations above and ruled them out but still not have any data returned to your SELECT .

Here’s a simple example using one of the existing internal topics called _confluent-metrics . Let’s register it using a fictional schema that we believe to be correct for the purposes of this example, and declare the serialization format of the message values to be JSON (Tip: It’s not JSON!).

CREATE STREAM METRICS (col1 int, col2 int, col3 varchar) \
  WITH (KAFKA_TOPIC='_confluent-metrics', VALUE_FORMAT='JSON');

Taking the lesson from above, set the offset to earliest so that we will definitely pull all the messages, and run a SELECT :

ksql> SET 'auto.offset.reset'='earliest';
Successfully changed local property 'auto.offset.reset' from 'earliest' to 'earliest'
ksql> SELECT * FROM METRICS;

So … no results coming back. Let’s go through the checklist. We can check off the offset already, as we’ve specifically set that.

What topic are we querying?

ksql> DESCRIBE EXTENDED METRICS;
[...]
Kafka topic : _confluent-metrics (partitions: 12, replication: 1)

Is there any data in it?

$ docker run --network ksql-troubleshooting_default --tty --interactive --rm \
    confluentinc/cp-kafkacat \
    kafkacat -b kafka:29092 \
    -C -t _confluent-metrics \
    -o beginning -c 1    (1)
���,� kafka.logSizeLog"$partition.9.topic.__consumer_offsets*Akafka.log:type=Log,name=Size,topic=__consumer_offsets,partition=90�
 kafka.logSizeLog"$partition.8.topic.__consumer_offsets*Akafka.log:type=Log,name=Size,topic=__consumer_offsets,partition=80�
 kafka.logSizeLog"$partition.7.topic.__consumer_offsets*Akafka.log:type=Log,name=Size,topic=__consumer_offsets,partition=70�
 kafka.logSizeLog"$partition.6.topic.__consumer_offsets*Akafka.log:type=Log,name=Size,topic=__consumer_offsets,partition=60�
[...]

(1) The -c 1 argument tells kafkacat to just return the one message and then exit.

Per the checklist, there is data, we’re querying the correct topic and have set the offset back to the beginning … but why isn’t KSQL returning data?

Well, the data we can see from the output of kafkacat clearly isn't JSON, which is what we declared in the CREATE STREAM command. If you go to the KSQL server log file, you'll see a whole bunch of deserialization errors like this:

[2018-09-17 12:29:09,929] WARN task [0_10] Skipping record due to deserialization error. topic=[_confluent-metrics] partition=[10] offset=[70] (org.apache.kafka.streams.processor.internals.RecordDeserializer:86)
org.apache.kafka.common.errors.SerializationException: KsqlJsonDeserializer failed to deserialize data for topic: _confluent-metrics
Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected character ((CTRL-CHAR, code 127)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: (byte[])�����,� � [...] [truncated 1544 bytes]; line: 1, column: 2]
    at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1804)
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:669)
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:567)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2624)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:826)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:723)
    at com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4042)
    at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2571)
    at io.confluent.ksql.serde.json.KsqlJsonDeserializer.getGenericRow(KsqlJsonDeserializer.java:88)
    at io.confluent.ksql.serde.json.KsqlJsonDeserializer.deserialize(KsqlJsonDeserializer.java:77)
    at io.confluent.ksql.serde.json.KsqlJsonDeserializer.deserialize(KsqlJsonDeserializer.java:45)
    at org.apache.kafka.common.serialization.ExtendedDeserializer$Wrapper.deserialize(ExtendedDeserializer.java:65)
[...]

You can see from the stack trace that it's using the JSON deserializer (as you'd expect, given our VALUE_FORMAT configuration). You can also see from the sample message that the JsonParseException cites in the log output ( [Source: (byte[])�����,� � ) that the message clearly isn't JSON.

ℹ️ If you hit this problem, then you need to synchronize your serialization and deserialization formats. KSQL supports delimited (CSV), JSON or Avro. Protobuf users can check out KLIP-0, which proposes adding Protobuf support to KSQL.
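The mismatch can be reproduced in miniature with Python's standard json module. The byte strings below are fabricated for the example; the point is only that a JSON deserializer fails outright on bytes written in some other format.

```python
import json

# A producer serializes a message to bytes; data in Kafka is just bytes.
produced = json.dumps({"col1": 1, "col2": 16000}).encode("utf-8")

# A consumer deserializing with the same format gets the record back cleanly...
assert json.loads(produced) == {"col1": 1, "col2": 16000}

# ...but bytes written in some other (binary) format fail to parse as JSON,
# much like the CTRL-CHAR error KSQL logged for _confluent-metrics above.
not_json = b"\x7fkafka.logSizeLog"  # made-up binary-style payload
try:
    json.loads(not_json)
except json.JSONDecodeError as err:
    print("deserialization error:", err)
```

The fix is never on the consuming side alone: the serialization format chosen by the producer and the deserialization format declared to KSQL have to agree.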

Not all of the messages from my topic are shown in KSQL

Our previous example referred to cases where no messages are being returned, but you may also see cases where only some of the messages are shown. It could be the same root cause—serialization—but with a different slant. Instead of simply getting your serialization format wrong, maybe you chose the right serialization format, but some malformed messages exist on the topic.

In this simple example, we'll put some data onto a new topic, using JSON but with some malformed messages:

docker run --interactive --rm --network ksql-troubleshooting_default \
    confluentinc/cp-kafkacat \
    kafkacat -b kafka:29092 \
    -t dummy_topic \
    -P <<EOF
{"col1":1,"col2":16000}
{"col1":2,"col2:42000}
{"col1":3,"col2":94000}
EOF

Note that the second message is invalid JSON, as it's missing a " after the field name (col2).

Next, register the topic in KSQL:

ksql> CREATE STREAM DUMMY (COL1 INT, COL2 VARCHAR) \
      WITH (KAFKA_TOPIC='dummy_topic', VALUE_FORMAT='JSON');

 Message
----------------
 Stream created
----------------

Set the offset to earliest so that we will definitely pull all the messages, and then run a SELECT:

ksql> SET 'auto.offset.reset'='earliest';
Successfully changed local property 'auto.offset.reset' from 'none' to 'earliest'
ksql> SELECT * FROM DUMMY;
1537186945005 | null | 1 | 16000
1537186945005 | null | 3 | 94000

Note that we only get two messages, even though there are three on the topic. If you check out the KSQL server log, you'll see:

[2018-09-17 13:03:13,662] WARN task [0_0] Skipping record due to deserialization error. topic=[dummy_topic] partition=[0] offset=[1] (org.apache.kafka.streams.processor.internals.RecordDeserializer:86)
org.apache.kafka.common.errors.SerializationException: KsqlJsonDeserializer failed to deserialize data for topic: dummy_topic
Caused by: com.fasterxml.jackson.core.io.JsonEOFException: Unexpected end-of-input in field name
 at [Source: (byte[])"{"col1":2,"col2:42000}"; line: 1, column: 45]
    at com.fasterxml.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:594)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.parseEscapedName(UTF8StreamJsonParser.java:1956)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.slowParseName(UTF8StreamJsonParser.java:1861)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._parseName(UTF8StreamJsonParser.java:1645)
    at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextFieldName(UTF8StreamJsonParser.java:999)
    at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:247)
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:68)
    at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:15)
[...]

Note the partition and offset shown in the error message (partition=[0] offset=[1]). Head back to the ever-versatile kafkacat and run the following:

docker run --network ksql-troubleshooting_default \
    --tty --interactive --rm \
    confluentinc/cp-kafkacat \
    kafkacat -b kafka:29092 -C -K: \
    -f '
Key: %k\t
Value: %s
\Partition: %p\tOffset: %o
--
' \                  (1)
    -t dummy_topic \
    -o 1 \           (2)
    -p 0 \           (3)
    -c 1             (4)

(1) -f to format the output and show some nice metadata
(2) -o 1 start at offset 1
(3) -p 0 read from partition 0
(4) -c 1 consume just one message

The output is:

Key:
Value: {"col1":2,"col2:42000}
Partition: 0	Offset: 1
--

This shows us, if we were in any doubt, that the message value is not valid JSON and thus can't be consumed by KSQL.
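KSQL's skip-and-log behavior can be mimicked in a few lines of Python. The loop below replays the three dummy_topic messages from above and skips the malformed one, just as the KsqlJsonDeserializer warning in the server log shows (this is a sketch of the behavior, not KSQL's actual code):

```python
import json

# The three messages produced to dummy_topic above; offset 1 is malformed
# (missing the closing quote after the field name col2).
raw_messages = [
    b'{"col1":1,"col2":16000}',
    b'{"col1":2,"col2:42000}',
    b'{"col1":3,"col2":94000}',
]

rows, skipped = [], []
for offset, value in enumerate(raw_messages):
    try:
        rows.append(json.loads(value))
    except json.JSONDecodeError:
        # KSQL logs "Skipping record due to deserialization error" here
        skipped.append(offset)

print(len(rows))  # 2 -- only the valid records are returned
print(skipped)    # [1] -- matches offset=[1] in the KSQL server log
```

The query keeps running on the valid records rather than failing outright, which is exactly why a topic with a few bad messages shows up as "some rows missing" rather than as an error in the KSQL CLI.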