{question}
How can I harden my cluster against mmap failures during query code generation?
{question}
{answer}
Overview
On Linux, mmap() creates new mappings in a process's virtual address space. SingleStore calls mmap during query code generation. If the operating system cannot provide the requested memory, the call fails and the node may crash or hit an out-of-memory (OOM) condition.
To check whether a node crash may have been caused by an mmap failure during query code generation, inspect the memsql.log file of the crashed node.
For example:
6551550134535 2022-03-01 11:54:26.584 WARN: BackgroundMerger[0] Failed to allocate 123456 bytes of memory from the operating system (Error 12: Cannot allocate memory). This is usually due to a misconfigured operating system or virtualization technology. See https://docs.memsql.com/troubleshooting/latest/memory-errors.
6551550134584 2022-03-01 11:54:26.584 ERROR: BackgroundMerger[0] Nonfatal buffer manager memory allocation failure.
[libmemtrack.so (0x7f9bb900d446)] backtrace 0x56
[memsqld (0x446de15)] PrintCallStack(_IO_FILE*) 0x25
[memsqld (0x247ca17)] RegisterCrashReport 0xD7
/path/to/memsql-server/memsqld() [0x1df752a]
[libpthread.so.0 (0x7f9bb87cf630)] 0xF630
[libpthread.so.0 (0x7f9bb87cf4fb)] raise 0x2B
[memsqld (0x4499795)] MemsqlTracer::VTraceIfSettingsAllow(TraceLevel, TraceLevel, TraceLevel, char const*, char const*, __va_list_tag*) 0xF5
[memsqld (0x4499698)] MemsqlTracer::VTrace(TraceLevel, char const*, __va_list_tag*) 0x38
[memsqld (0x1df8e14)] TracerTrace(MemsqlTracer*, TraceLevel, char const*, ...) 0x74
[memsqld (0x25d720d)] LLVMEngine::CompiledUnit::ReserveAllocationSpace(unsigned long, unsigned long, unsigned long) 0x7D
[memsqld (0x37e7b11)] llvm::RuntimeDyldImpl::loadObjectImpl(llvm::object::ObjectFile const&) 0xAB1
[memsqld (0x37f9a87)] llvm::RuntimeDyldELF::loadObject(llvm::object::ObjectFile const&) 0x57
[memsqld (0x37e49f2)] llvm::RuntimeDyld::loadObject(llvm::object::ObjectFile const&) 0x42
[memsqld (0x25d7a2b)] LLVMEngine::CompiledUnit::Load(Mbc::Unit const*, bool, bool) 0x16B
[memsqld (0x25f4a18)] Mbc::Module::Load() 0x1D8
[memsqld (0x25f68e0)] Mbc::Module::LoadPersistentModule(char const*, char const*, char const*, InterpreterMode, CodeGenAllocator&, bool, Mbc::ModuleHandle&, IntrusivePtr<RCStringLock, std::allocator, std::default_delete<RCStringLock> >&) 0xB30
[memsqld (0x289384b)] GenMbcPlanInternal(GenMbcPlanContext&, bool, Mbc::Context*, AutoRefc<CompiledPlan, (Ref)1, false>&, PhaseTimer&) 0x70B
[memsqld (0x2895c4b)] MbcPersistentPlanCacheLookup(GenMbcPlanContext&, AutoRefc<CompiledPlan, (Ref)1, false>&) 0x3B
[memsqld (0x27f3eed)] MbcCompileInsert(QueryContext&, int, InsertQuery*, char const*, char const*, PlancacheParameterizedValues const&, AutoRefc<CompiledPlan, (Ref)1, false>&) 0x44D
[memsqld (0x2355fd6)] MemSqlCompileQuery(int, QueryContext&, MemSqlQuery*&, char const*, PlancacheParameterizedValues const&, bool, AutoRefc<CompiledPlan, (Ref)1, false>&, std::function<void ()>, PhaseTimer&, std::vector<TwoPartName, std::allocator<TwoPartName> >&) 0x2D6
[memsqld (0x2357540)] MemSqlParseAndCodeGen(int, QueryContext&, char const*, char const*, unsigned long, char const*, PlancacheParameterizedValues const&, MemSqlArrayParam*, unsigned long, bool, AutoRefc<CompiledPlan, (Ref)1, false>&, int, QueryType, std::function<void ()>, Vector<bool, TempIncrementalAllocator, 8u, false, int>*, PhaseTimer&) 0x2A0
[memsqld (0x2360f86)] MemsqlAutoParamExecute(QueryContext&, char*, unsigned int, char*&, EOQ_PACKET_MODE, QueryStats&, ConnectionTask&, int, int, bool&, bool&) 0x12A6
[memsqld (0x23f9398)] MemSqlExecute(RPC::Request::Accessor, char*, unsigned int, int, int, EOQ_PACKET_MODE, QueryContext&, bool) 0x3B8
[memsqld (0x233e87c)] HandleRequest(ConnectionContext*, char*, unsigned long
6551550134602 2022-03-01 11:59:50.388 INFO: Log opened
Note:
- The warning "Failed to allocate xxxxxx bytes of memory from the operating system (Error 12: Cannot allocate memory). This is usually due to a misconfigured operating system or virtualization technology." indicates that SingleStore could not allocate memory from the operating system. It is followed by the crash stack from query code generation (note the LLVMEngine and MemSqlParseAndCodeGen frames).
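The log check above can be scripted. The following is a minimal sketch, assuming log lines follow the layout of the excerpt above (numeric ID, timestamp, level, message). The function name `find_alloc_failures` and the regular expression are illustrative and are not part of SingleStore.

```python
import re

# Pattern modeled on the WARN line in the excerpt above (an assumption:
# other SingleStore versions may word or space the message differently).
ALLOC_FAILURE = re.compile(
    r"^\d+\s+(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+WARN:\s+"
    r".*?Failed to allocate (?P<bytes>\d+) bytes of memory "
    r"from the operating system \(Error 12: Cannot allocate memory\)"
)


def find_alloc_failures(lines):
    """Return (timestamp, byte_count) for each OS allocation failure line."""
    hits = []
    for line in lines:
        match = ALLOC_FAILURE.search(line)
        if match:
            hits.append((match.group("ts"), int(match.group("bytes"))))
    return hits
```

To scan a real log, pass an open file object, for example `find_alloc_failures(open(path, errors="replace"))`, where `path` points at the memsql.log of the crashed node. A hit only shows that an allocation failed. Confirm that the stack that follows contains code generation frames before attributing the crash to mmap.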
How to Harden a Cluster Against mmap Failures
1. Set vm.overcommit_memory to 0
SingleStore manages its own memory usage through the maximum_memory setting. As long as the sum of maximum_memory across all nodes on a host does not exceed 90% of the total available system memory, SingleStore will not over-allocate memory, and the kernel should not terminate it due to out-of-memory (OOM) conditions.
Allowing heuristic memory overcommit on the host (vm.overcommit_memory=0) is safe under these conditions.
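The 90% rule is simple arithmetic and can be checked per host. This is a minimal sketch, assuming you have already collected each node's maximum_memory value and the host's total memory in the same unit (MB here). The function name and the sample numbers are hypothetical.

```python
def maximum_memory_budget(node_maximum_memory_mb, host_total_memory_mb, limit_percent=90):
    """Compare the summed maximum_memory of all nodes on a host with the budget.

    Returns (total_mb, budget_mb, within_budget). Integer arithmetic keeps the
    comparison exact.
    """
    total_mb = sum(node_maximum_memory_mb)
    budget_mb = host_total_memory_mb * limit_percent // 100
    return total_mb, budget_mb, total_mb <= budget_mb


# Hypothetical host: 64000 MB of memory running two nodes.
print(maximum_memory_budget([28000, 28000], 64000))  # (56000, 57600, True)
print(maximum_memory_budget([30000, 30000], 64000))  # (60000, 57600, False)
```

If the check fails, lower maximum_memory on one or more nodes before relying on heuristic overcommit.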
The virtual memory footprint of the memsqld process includes plan cache files, blobs, and other mmap-ed files. These files can be evicted from memory when unchanged, so virtual memory usage is not equivalent to physical memory pressure.
Allowing overcommit helps prevent mmap from failing due to virtual address space exhaustion.
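For contrast, under strict accounting (vm.overcommit_memory=2) the kernel refuses allocations once committed memory would exceed a fixed limit. The sketch below follows the formula in the Linux kernel documentation, swap plus a percentage of RAM excluding huge pages. It assumes vm.overcommit_kbytes is unset and ignores per-page rounding. The function name and the sample sizes are illustrative.

```python
def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio, hugetlb_kb=0):
    """Approximate CommitLimit (in kB) under vm.overcommit_memory=2."""
    return (mem_total_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_total_kb


# Hypothetical host with 64,000,000 kB of RAM.
# No swap and the kernel default ratio of 50: only half of RAM can be committed.
print(commit_limit_kb(64_000_000, 0, 50))           # 32000000
# Ratio 99 with swap larger than RAM.
print(commit_limit_kb(64_000_000, 70_000_000, 99))  # 133360000
```

Because the large virtual footprint of memsqld counts toward this limit, a low limit can make mmap fail even when physical memory is available.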
2. Tune the Following vm Settings
- Set vm.max_map_count = 1000000000
- Set vm.min_free_kbytes to at least 1% of total physical memory
- Set vm.swappiness between 1 and 10
- Set vm.overcommit_memory = 0
  If vm.overcommit_memory is set to 2 (not recommended), you must also:
  - Set vm.overcommit_ratio = 99
  - Ensure that swap space is larger than total physical memory
Regardless of overcommit settings, swap should be at least 10% of total physical memory (unless you are intentionally setting vm.overcommit_memory to a non-zero value, which is generally discouraged).
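The recommendations above can be turned into an automated check. This is a minimal sketch, not an official SingleStore tool. The helper names are hypothetical, the thresholds are copied from the list above, and applying the 10% swap rule only when vm.overcommit_memory is 0 is one reading of the last sentence.

```python
from pathlib import Path

VM_KEYS = ("max_map_count", "min_free_kbytes", "swappiness",
           "overcommit_memory", "overcommit_ratio")


def read_vm_settings(root="/proc/sys/vm"):
    """Read the current vm.* values; each file under root holds one integer."""
    return {key: int((Path(root) / key).read_text()) for key in VM_KEYS}


def check_vm_settings(vm, mem_total_kb, swap_total_kb):
    """Return a list of findings; an empty list means every check passed."""
    findings = []
    if vm["max_map_count"] < 1_000_000_000:
        findings.append("vm.max_map_count is below 1000000000")
    if vm["min_free_kbytes"] * 100 < mem_total_kb:
        findings.append("vm.min_free_kbytes is below 1% of physical memory")
    if not 1 <= vm["swappiness"] <= 10:
        findings.append("vm.swappiness is outside the range 1 to 10")
    mode = vm["overcommit_memory"]
    if mode != 0:
        findings.append("vm.overcommit_memory is not 0")
    if mode == 2:
        if vm["overcommit_ratio"] != 99:
            findings.append("vm.overcommit_ratio is not 99")
        if swap_total_kb <= mem_total_kb:
            findings.append("swap is not larger than physical memory")
    # Assumption: the 10% swap guideline is enforced only in mode 0.
    if mode == 0 and swap_total_kb * 10 < mem_total_kb:
        findings.append("swap is below 10% of physical memory")
    return findings
```

On a Linux host, `check_vm_settings(read_vm_settings(), mem_total_kb, swap_total_kb)` reports any deviations. The two totals can be taken from the MemTotal and SwapTotal fields of /proc/meminfo. To make a corrected value survive a reboot, set it through the system's sysctl configuration rather than only at runtime.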
Documentation
Kernel
- vm.overcommit
- mmap
Red Hat
- Configuring System Memory Capacity
- What is vm.overcommit_memory parameter?
SingleStore
- System Requirements and Recommendations
{answer}