{question}
How is OPTIMIZE TABLE different on a rowstore table from a columnstore table?
{question}
{answer}
The OPTIMIZE TABLE operation on a table helps manage table memory by compressing and releasing the unused data segments. It also sorts the data in the memory/disk, which helps in improving the performance of the queries.
SingleStore DB periodically runs optimization routines on tables, but they can also be executed manually. In addition, it supports the online OPTIMIZE TABLE, allowing concurrent read and writes operations while the table is being optimized. The online OPTIMIZE TABLE waits till all DML running on the table are complete.
The above functionalities are the same between Rowstore and Columnstore tables, but there are differences as well, listed below:
COLUMNSTORE TABLES:
Syntax : OPTIMIZE TABLE <table_name> [FULL | FLUSH];
Types of optimizations that can be performed on Columnstore tables as listed below:
- Without any arguments, i.e., if executed as ‘OPTIMIZE TABLE <table_name>;’, the manual merge operation is triggered. Manual merge is a foreground optimization, which uses multiple threads to complete the process as fast as possible, unlike automatic background optimization, which uses a single thread and is slow.
- With the FLUSH argument, Manual flush optimization is triggered. Though a regular background flush operation flushes rows in the in-memory rowstore-format row segment group into one or more columnstore-format row segment groups, invoking a manual flush releases memory sooner.
- With the FULL argument, Manual merge-full and manual flush operations will be triggered in the same order.
Manual merge-full optimization sorts the entire table data into a single row segment.
ROWSTORE TABLES:
Syntax : OPTIMIZE TABLE <table_name>;
- Rowstore tables can be optimized manually with only one syntax given above, i.e., without arguments.
- The FLUSH flag is not a valid option for rowstore tables.
- The FULL flag is not a valid option for rowstore tables; the syntax is accepted, but SingleStore DB will ignore it.
Note :
- Running frequent OPTIMIZE TABLE statements on a table, which has a lot of long-running queries on that table, disturbs the normal workload. It may experience some delay since it blocks other queries from starting while it waits for the completion of long-running queries.
{answer}