{question}
Why isn’t my auto_increment column sequential?
{question}
{answer}
The auto_increment
values in sharded tables in SingleStore are range-partitioned by the aggregator, in order to ensure fast performance.
One of the inherent challenges in designing a high-performance distributed database is ensuring the uniqueness of data (such as primary keys or auto-incrementing values). This is because uniqueness is a global property
: a unique data value on one particular node in the database that cannot overlap with other data on any node. When inserting unique values, this would create potentially significant performance degradation if every node had to query every other node in the cluster in order to guarantee uniqueness.
SingleStore avoids this issue with auto_increment
data by keeping track of the unique auto_increment
value on the aggregator node on which the insert is performed. This saves the leaf nodes from having to do any work to ensure uniqueness, because it is guaranteed by the aggregator beforehand. However, since a cluster may have multiple aggregators that "do not know" about each other's auto_increment
values, the range of auto_increment
values used by each aggregator is partitioned to eliminate the possibility of inserts on different aggregators attempting to generate the same auto_increment
value.
In practice, this is implemented by requiring a BIGINT data type for auto_increment
tables, and then using the high 14 bits to encode an aggregator ID and the bottom 50 bits for the actual auto_increment
value. The latter still allows 1,125,899,906,842,622 (about 10^15) unique values for each aggregator, and is why the value will appear to "jump around" when using multiple aggregators for inserts (the master aggregator will use values from 1 to 1125899906842623, the first child aggregator from 1125899906842624 to 2251799813685246, etc).
A notable exception to this behavior is in Reference tables, which are not sharded and so exhibit "normal" auto_increment
behavior.
{answer}