{question}
Why is it useful to enable core files on a healthy cluster?
{question}
{answer}
A core file is a recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminated abnormally.
Core dumps are useful for understanding why a SingleStore node crashed. A core dump often contains complete information about the cause of the crash, and some crash scenarios cannot be debugged without a core file. For this reason, SingleStore Support might ask whether a core file was generated during a recent node crash so it can be used for further investigation. By default, core files are written to the current working directory of the process that crashed, but this behavior is controlled by the /proc/sys/kernel/core_pattern kernel parameter; a quick way to check it is sketched below.
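To confirm where the kernel will place core files on a given host, a minimal sketch like the following (assuming a Linux host where /proc/sys/kernel/core_pattern is readable) prints the current setting:

```python
# Minimal sketch: print where the kernel will write core files on this host.
# Assumes a Linux system where /proc/sys/kernel/core_pattern is readable.
from pathlib import Path

def current_core_pattern() -> str:
    """Return the kernel's core_pattern setting as a string."""
    return Path("/proc/sys/kernel/core_pattern").read_text().strip()

if __name__ == "__main__":
    pattern = current_core_pattern()
    print(f"core_pattern: {pattern}")
    # A leading '|' means core dumps are piped to a helper program
    # (e.g. systemd-coredump or abrt) instead of being written to a file.
    if pattern.startswith("|"):
        print("Core dumps are piped to a handler, not written directly to disk.")
```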
The amount of disk space used by a core file is roughly equal to the value of the Total_server_memory database variable at the time the core dump is generated. SingleStore nodes can use hundreds of gigabytes of memory, so it is worth confirming that there is adequate disk space to accommodate a core file before enabling core dumps. If there is not enough free disk space, writing a core dump can cause the host to run out of disk space; a rough pre-check is sketched below.
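As a rough pre-check, a sketch along these lines compares the free space in the directory where core files would land against the node's expected memory footprint. The directory path and the memory figure below are assumptions for illustration; in practice you would derive the path from core_pattern and take the memory figure from Total_server_memory on the node in question.

```python
# Rough pre-check: is there enough free disk space for a core file?
# Assumptions for illustration: CORE_DIR is where core files would be written,
# and total_server_memory_bytes is the node's Total_server_memory in bytes.
import shutil

CORE_DIR = "/var/lib/memsql/cores"         # hypothetical path
total_server_memory_bytes = 200 * 1024**3  # e.g. a node using ~200 GB

def has_room_for_core(core_dir: str, expected_core_bytes: int,
                      headroom: float = 1.2) -> bool:
    """Return True if free space in core_dir exceeds the expected core size
    plus some headroom (a core file is roughly the size of server memory)."""
    free_bytes = shutil.disk_usage(core_dir).free
    return free_bytes > expected_core_bytes * headroom

if __name__ == "__main__":
    if has_room_for_core(CORE_DIR, total_server_memory_bytes):
        print("Enough free space for a core file of the expected size.")
    else:
        print("WARNING: a core dump of this size could fill the disk.")
```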
To enable core file generation on the database side, add core_file = ON to each memsql.cnf file (core files are enabled on a per-node basis) and restart the node so the change takes effect. Please refer to the core_file engine variable entry in our documentation for all accepted values. A sketch for verifying the setting across nodes follows.
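As a convenience, a minimal sketch like the one below reports whether an uncommented core_file = ON line is present in each memsql.cnf. The config paths are hypothetical; substitute the memsql.cnf locations used by your installation.

```python
# Minimal sketch: verify that core_file = ON appears in each node's memsql.cnf.
# The paths below are hypothetical; substitute your installation's layout.
import re
from pathlib import Path

CNF_PATHS = [
    "/var/lib/memsql/nodes/master/memsql.cnf",   # hypothetical node paths
    "/var/lib/memsql/nodes/leaf-1/memsql.cnf",
]

def core_file_enabled(cnf_path: str) -> bool:
    """Return True if an uncommented 'core_file = ON' line is present."""
    text = Path(cnf_path).read_text()
    return any(
        re.match(r"^\s*core_file\s*=\s*ON\s*$", line, re.IGNORECASE)
        for line in text.splitlines()
        if not line.lstrip().startswith(("#", ";"))
    )

if __name__ == "__main__":
    for path in CNF_PATHS:
        state = "enabled" if core_file_enabled(path) else "NOT enabled"
        print(f"{path}: core_file {state}")
```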
On the OS side, the core file size limit in effect for the SingleStore processes should be set to unlimited so the operating system allows them to generate core files; a sketch for inspecting this limit is shown below.
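To check the core file size limit a process actually runs with, a minimal sketch like the following inspects RLIMIT_CORE. Run it on the host, ideally under the same user and shell environment that launches the SingleStore processes, so the reported limit is comparable.

```python
# Minimal sketch: report the core file size limit (RLIMIT_CORE) in effect for
# this process. Run it under the same user/environment that launches the
# SingleStore processes to see a comparable limit.
import resource

def describe(limit: int) -> str:
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"

if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print(f"RLIMIT_CORE soft limit: {describe(soft)}")
    print(f"RLIMIT_CORE hard limit: {describe(hard)}")
    if soft != resource.RLIM_INFINITY:
        print("Core files may be truncated or suppressed; "
              "set the limit to unlimited (e.g. via ulimit -c or limits.conf).")
```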
{answer}