{question}
How do I tell if there is pipeline lag?
{question}
{answer}
This article explains how to estimate the lag in a pipeline.
Approximate pipeline lag can be identified from the table information_schema.PIPELINES_CURSORS.
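Lag is the difference between the cursor_offset and latest_offset columns of that table:
cursor_offset - latest_offset = Lag
A negative value means the pipeline is behind by that many messages; 0 means it has consumed every message it knows about.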
Lag information can be gathered with the following SQL statement:
select database_name, pipeline_name, sum(cursor_offset - latest_offset) from information_schema.pipelines_cursors group by 1,2;
Note that this is only an approximate indicator of lag, because more messages may be available than the number reported. We check for available messages at periodic intervals (controlled by the batch_interval clause of create pipeline), so recently uploaded messages that we haven't yet found out about can't be reported in pipelines_cursors.
Example:
> select database_name, pipeline_name, sum(cursor_offset - latest_offset) from information_schema.pipelines_cursors group by 1,2
+---------------+-----------------+-------------------------------+
| database_name | pipeline_name   | sum(cursor_offset - latest... |
+---------------+-----------------+-------------------------------+
| db_1          | pipeline_load_1 |                          -250 |
| db_2          | pipeline_load_2 |                             0 |
+---------------+-----------------+-------------------------------+
The above output means that pipeline_load_1 had at least 250 messages in Kafka that we hadn't yet consumed at the time of the lag check query, while pipeline_load_2 had consumed every message we knew about.
As mentioned above, 250 is only a lower bound: messages uploaded since the last periodic check (controlled by batch_interval) are not yet reflected in pipelines_cursors, so more than 250 may be available.
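If the lag check is run from a monitoring script, the same arithmetic can be applied to the rows on the client side. The following is a minimal Python sketch, not an official tool. It assumes the rows have already been fetched as (database_name, pipeline_name, cursor_offset, latest_offset) tuples. The function names, the threshold parameter, and the per-partition sample offsets are hypothetical, made up so that the totals match the example above. It flips the sign of the lag so that a pipeline that is behind reports a positive backlog.

```python
from collections import defaultdict


def backlog_by_pipeline(cursor_rows):
    """Sum the known backlog for each (database_name, pipeline_name).

    Each row is (database_name, pipeline_name, cursor_offset, latest_offset),
    the columns the lag query reads from pipelines_cursors.
    Backlog is latest_offset - cursor_offset: the article's lag with the sign
    flipped, so a pipeline that is behind reports a positive number.
    """
    totals = defaultdict(int)
    for database_name, pipeline_name, cursor_offset, latest_offset in cursor_rows:
        totals[(database_name, pipeline_name)] += latest_offset - cursor_offset
    return dict(totals)


def lagging_pipelines(cursor_rows, threshold=0):
    """Return pipelines whose known backlog exceeds threshold, largest first."""
    backlog = backlog_by_pipeline(cursor_rows)
    behind = [(key, count) for key, count in backlog.items() if count > threshold]
    return sorted(behind, key=lambda item: item[1], reverse=True)


# Hypothetical per-partition rows; only the totals come from the example.
sample_rows = [
    ("db_1", "pipeline_load_1", 1000, 1100),
    ("db_1", "pipeline_load_1", 2000, 2150),
    ("db_2", "pipeline_load_2", 500, 500),
]

print(lagging_pipelines(sample_rows))
```

Because the offsets are only refreshed at each periodic check, the numbers this sketch reports are a lower bound on the real backlog, just like the SQL result.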
To learn more, see the documentation on the information_schema tables related to pipelines.
{answer}