{question}
What are the possible causes to get a 'maximum_memory' error while executing a query on a large dataset?
{question}
{answer}
At times, the engine may run out of memory while executing a complex query and throw a 'maximum_memory' error such as:
ERROR 1712 ER_MEMSQL_OOM: Leaf Error (<>): Leaf Error (<>): Leaf Error (<>): Memory used by MemSQL (<> Mb) has reached the 'maximum_memory' setting (<> Mb) on this node.
Possible causes include (1) available query execution memory has been used up for table memory (in use table memory: <> Mb) and
(2) the query is large and complex and requires more query execution memory than is available (in use query execution memory <> Mb).
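To see how a node's memory is split between table memory and query execution memory, you can inspect the engine's memory counters. A minimal sketch (the exact set of counters returned varies by SingleStore version):
-- Engine memory counters on the current node (table memory,
-- query execution memory, buffer manager, and so on).
SHOW STATUS EXTENDED LIKE '%memory%';
-- 'maximum_memory' itself is an engine variable, in megabytes.
SHOW VARIABLES LIKE 'maximum_memory';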
This issue can be resolved in multiple ways, all of which reduce the memory consumed during query execution. Consider trying some of the following:
- Ensure the table type is columnstore, since columnstore tables write data to disk rather than keeping it all in memory (see the table definition sketch after this list, which also shows matching shard and sort keys).
- Enable spilling to reduce the memory used by 'HashGroupBy' operations (if any) by setting the global variable:
SET GLOBAL enable_spilling = ON;
You can control when spilling starts with the variables spilling_node_memory_threshold_ratio and spilling_query_operator_memory_threshold (see the configuration sketch after this list). This feature is available from SingleStore version 7.6 (partially 7.5) and above. Please note that since spilling is a global feature, other queries may also spill when the condition is met, which can result in longer execution times.
- Change the shard key to match the 'group by' columns. This helps the 'group by' use up to #num_leaves times less memory, since each group is then aggregated on a single leaf instead of partially on every leaf.
- Change the sort key to match the 'group by' columns. This can make the 'group by' streaming, so it uses constant memory.
- Change SUM(IF(..., 1, 0)) to COUNT(IF(..., 1, NULL)), and change SUM(X) to SUM(X :> double) if the precision loss is acceptable. Integer SUM uses 2.5x more memory than a floating-point SUM and 5x more memory than COUNT (see the rewrite sketch after this list).
- It's also possible to split the query into multiple queries, each computing a subset of the aggregates at a time (see the last sketch after this list). This needs to be done carefully, though, as it may come with a performance overhead.
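For the spilling option, a minimal configuration sketch. The two threshold values below (and their assumed units) are illustrative assumptions only, not recommendations; check the defaults on your version before changing them:
-- Allow memory-intensive operations such as 'HashGroupBy' to spill to disk.
SET GLOBAL enable_spilling = ON;
-- Assumed semantics: consider spilling once the node has used this fraction
-- of its memory...
SET GLOBAL spilling_node_memory_threshold_ratio = 0.75;
-- ...and once a single operator's memory use crosses this threshold
-- (assumed to be in bytes; 104857600 = 100 MB).
SET GLOBAL spilling_query_operator_memory_threshold = 104857600;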
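For the columnstore, shard key, and sort key items, a minimal table definition sketch (the table and column names are hypothetical) for a query that groups by (customer_id, event_date):
-- Columnstore table whose keys match the 'group by' columns.
-- (On recent SingleStore versions, columnstore is the default table type.)
CREATE TABLE events (
  customer_id BIGINT NOT NULL,
  event_date DATE NOT NULL,
  status VARCHAR(16),
  amount DECIMAL(18, 2),
  SORT KEY (customer_id, event_date),  -- lets the 'group by' stream in sorted order
  SHARD KEY (customer_id)              -- keeps each group on a single leaf
);
-- The matching aggregation:
SELECT customer_id, event_date, SUM(amount)
FROM events
GROUP BY customer_id, event_date;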
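A sketch of the aggregate rewrites, using the same hypothetical events table:
-- Before: integer SUMs, the most memory-hungry form.
SELECT customer_id,
       SUM(IF(status = 'error', 1, 0)) AS error_cnt,
       SUM(amount) AS total
FROM events
GROUP BY customer_id;

-- After: COUNT skips NULLs, so it counts exactly the same rows, and the
-- ':>' cast turns the SUM into a cheaper floating-point sum (only do this
-- if the precision loss is acceptable).
SELECT customer_id,
       COUNT(IF(status = 'error', 1, NULL)) AS error_cnt,
       SUM(amount :> DOUBLE) AS total
FROM events
GROUP BY customer_id;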
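And a sketch of splitting one wide aggregation into narrower ones; each query builds a smaller hash-table entry per group, at the cost of scanning the table twice:
-- Instead of computing every aggregate in a single pass:
--   SELECT customer_id, COUNT(*), SUM(amount :> DOUBLE),
--          MIN(event_date), MAX(event_date)
--   FROM events GROUP BY customer_id;
-- compute a subset of the aggregates per query:
SELECT customer_id, COUNT(*) AS cnt, SUM(amount :> DOUBLE) AS total
FROM events
GROUP BY customer_id;

SELECT customer_id, MIN(event_date) AS first_seen, MAX(event_date) AS last_seen
FROM events
GROUP BY customer_id;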
You can use the following query to identify the 10 queries that consume the most memory per execution:
select act.activity_name,
       act.database_name,
       q.query_text,
       act.memory_bs,
       act.memory_bs / (act.run_count + act.success_count) as memory_adjusted,
       act.network_b,
       act.disk_b,
       act.cpu_time_ms,
       act.last_finished_timestamp,
       act.run_count,
       act.success_count
from INFORMATION_SCHEMA.mv_activities_cumulative act
left join INFORMATION_SCHEMA.mv_queries q
  on act.activity_name = q.activity_name
where act.activity_type = 'Query'
order by (act.memory_bs / (act.run_count + act.success_count)) desc
limit 10;
{answer}