Understanding Run Count Calculation in Workload Monitoring – SingleStore Support

{question}

What does the run count calculation in workload monitoring signify?

{question}

{answer}

Overview

In SingleStore Studio’s Workload Monitoring, the Run Count is an important metric that helps track the execution of activities. When expanding an activity, users can see Run Count per node, and in some cases, a higher Run Count may appear on specific Leaf nodes.

This article explains how the Run Count is calculated and offers insights into the query used for retrieval data.

How is Run Count Calculated?

The Run Count is determined by summing the following values:

• run_count

• success_count

• failure_count

These values are retrieved from the mv_activities_extended_cumulative table in the information_schema database.

Formula:

SELECT SUM(run_count + success_count + failure_count) 
FROM information_schema.mv_activities_extended_cumulative;

How is the Data Retrieved?

Two queries are executed against mv_activities_extended_cumulative:
One at the start of the recording.
One at the end of the recording.
A delta calculation is conducted between the results to ascertain the final values.
The final Run Count represents the most recent snapshot of execution, while Success Count and Failure Count are calculated using:

finalAct.successCount = monus(finalAct.successCount, startAct.successCount);
finalAct.failureCount = monus(finalAct.failureCount, startAct.failureCount);

Query for Retrieving the Run Count

The following query is executed to collect activity data, including the Run Count:

SELECT
    a.NODE_ID AS nodeId,
    a.ACTIVITY_TYPE AS activityType,
    a.ACTIVITY_NAME AS activityName,
    a.AGGREGATOR_ACTIVITY_NAME AS aggregatorActivityName,
    a.DATABASE_NAME AS databaseName,
    a.PARTITION_ID AS partitionId,
    q.QUERY_TEXT AS queryText,

    IFNULL(a.RUN_COUNT, 0) AS runCount,
    IFNULL(a.SUCCESS_COUNT, 0) AS successCount,
    IFNULL(a.FAILURE_COUNT, 0) AS failureCount,

    IFNULL(a.CPU_TIME_MS, 0) AS cpuTimeMs,
    IFNULL(a.CPU_WAIT_TIME_MS, 0) AS cpuWaitTimeMs,
    IFNULL(a.ELAPSED_TIME_MS, 0) AS elapsedTimeMs,

    IFNULL(a.MEMORY_BS, 0) AS memoryBs,
    IFNULL(a.MEMORY_MAJOR_FAULTS, 0) AS memoryMajorFaults,

    IFNULL(a.LOCK_TIME_MS, 0) AS lockTimeMs,
    IFNULL(a.LOCK_ROW_TIME_MS, 0) AS lockRowTimeMs,

    IFNULL(a.LOG_BUFFER_TIME_MS, 0) AS logBufferTimeMs,
    IFNULL(a.LOG_FLUSH_TIME_MS, 0) AS logFlushTimeMs,
    IFNULL(a.LOG_BUFFER_WRITE_B, 0) AS logBufferWriteB,

    ( IFNULL(a.NETWORK_LOGICAL_RECV_B, 0) + IFNULL(a.NETWORK_LOGICAL_SEND_B, 0) ) AS networkB,
    IFNULL(a.NETWORK_TIME_MS, 0) AS networkTimeMs,
    IFNULL(a.NETWORK_LOGICAL_RECV_B, 0) AS networkLogicalRecvB,
    IFNULL(a.NETWORK_LOGICAL_SEND_B, 0) AS networkLogicalSendB,

    ( IFNULL(a.DISK_LOGICAL_READ_B, 0) + IFNULL(a.DISK_LOGICAL_WRITE_B, 0) + IFNULL(a.LOG_BUFFER_WRITE_B, 0) ) AS diskB,
    IFNULL(a.DISK_TIME_MS, 0) AS diskTimeMs,
    IFNULL(a.DISK_LOGICAL_READ_B, 0) AS diskLogicalReadB,
    IFNULL(a.DISK_LOGICAL_WRITE_B, 0) AS diskLogicalWriteB,
    IFNULL(a.DISK_PHYSICAL_READ_B, 0) AS diskPhysicalReadB,
    IFNULL(a.DISK_PHYSICAL_WRITE_B, 0) AS diskPhysicalWriteB
FROM
    information_schema.mv_activities_extended_cumulative a
LEFT JOIN information_schema.mv_queries q
    ON a.ACTIVITY_NAME = q.ACTIVITY_NAME;

After retrieving the data, the calculation for the Run Count is performed as follows:

return row.runCount
    .plus(row.successCount)
    .plus(row.failureCount)
    .toNumber();

Why Does One Leaf Node Show a Higher Run Count?

Since the Run Count is derived from the final snapshot and encompasses all executions, one node may display a higher Run Count due to:

More queries are executed on that node.
Uneven workload distribution among leaf nodes.
Increased retry counts because of temporary failures.

Recap:

The Run Count in SingleStore Studio’s Workload Monitoring is calculated from the mv_activities_extended_cumulative table and represents the total of runs, successes, and failures in the final snapshot. The variation in Run Count across nodes can be attributed to workload distribution and retries.

{answer}

Articles in this section

Overview

How is Run Count Calculated?

Why Does One Leaf Node Show a Higher Run Count?

Related articles