SingleStore 7.3 release announcement

SingleStore 7.3 was released on December 15th, 2020. Below you can find the release highlights. You can see the full feature list on our Release Notes page.

Release Highlights

The SingleStore DB 7.3 release is focused on storage, query processing, and programmability. As Universal Storage evolves, useful features such as columnstore as a default and support for upserts into columnstore tables have been added. Other highlights include data definition language (DDL) forwarding from child to master aggregator, a new, safer command for promoting a child to master aggregator, support for large queries that join over 30 tables, and a variety of new engine variables.

System of Record

Added three information schema views (mv_aggregated_replication_status, mv_replication_status, and lmv_replication_status) for monitoring the progress of DR replication. Users can now view the aggregated replication status of each database, including partition-level details, and the replication links between the primary and secondary clusters, to check for replication lag and view lag-related statistics.
Added a new column, type, to the Backup History Table. This column shows the type of backup and can be accessed by querying the information_schema.MV_BACKUP_HISTORY table.
Tables that define a unique key using UNENFORCED now have an INDEX_TYPE of NONE in the information schema. The INDEX_TYPE was listed as BTREE in previous versions.

Storage

Implemented forwarding of data definition language (DDL) commands from child to master aggregator. Previously, these commands could only be run on a master aggregator. See Node Requirements for SingleStore DB Commands for more information about how to enable this feature.
Database-level DDL and clustering operations are now allowed to run in parallel across databases.
Added new command REBALANCE ALL DATABASES, which rebalances the partitions on all databases in the cluster.
Added the FULL option to REBALANCE PARTITIONS, which takes effect when the number of partitions in the database is not divisible by the number of leaves. The extra partitions are placed on the leaves that contain the fewest partitions.
Added new command PROMOTE AGGREGATOR … TO MASTER for all use cases that require promotion of a child aggregator to master, aside from permanent loss of the master aggregator.
Added the information schema view information_schema.MV_BACKUP_STATUS for monitoring backup progress.
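
The placement rule described for the FULL option above can be pictured with a small Python sketch. This is a toy model under stated assumptions, not the engine's rebalancing algorithm: the function name place_extra_partitions, the leaf names, and the tie-breaking order are all hypothetical and chosen only for illustration.

```python
import heapq

def place_extra_partitions(leaf_counts, extra):
    """Toy model: hand out `extra` partitions one at a time,
    always to the leaf that currently holds the fewest partitions."""
    # Min-heap of (partition count, leaf name); ties break on leaf name.
    heap = [(count, leaf) for leaf, count in leaf_counts.items()]
    heapq.heapify(heap)
    result = dict(leaf_counts)
    for _ in range(extra):
        count, leaf = heapq.heappop(heap)
        result[leaf] = count + 1
        heapq.heappush(heap, (count + 1, leaf))
    return result

# 8 partitions over 3 leaves: 2 per leaf evenly, plus 2 extra partitions.
balanced = place_extra_partitions({"leaf1": 2, "leaf2": 2, "leaf3": 2}, extra=2)

# An uneven starting point: the extra partition goes to the emptier leaf.
uneven = place_extra_partitions({"leaf1": 3, "leaf2": 1}, extra=1)
```

In this model no leaf ever ends up more than one partition ahead of another when the starting counts are even, which is the intuition behind placing leftovers on the least-loaded leaves.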

Universal Storage

Added support for INSERT … ON DUPLICATE KEY UPDATE, INSERT … IGNORE, and REPLACE on columnstore tables.
Added the columnstore as default feature, which allows you to create a columnstore table using standard CREATE TABLE syntax.
Added support for the LOAD DATA ... [REPLACE | IGNORE | SKIP { ALL | CONSTRAINT | DUPLICATE KEY } ERRORS] semantics for ingesting data into columnstore tables with unique keys. These semantics allow duplicate keys to be handled without returning an error to the client application. See example 10 in LOAD DATA.
Added support for upserts on columnstore tables using Pipelines.
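
The three statements now supported on columnstore tables differ in what happens when a row with the same unique key already exists. The Python sketch below models that difference with a dictionary keyed by the unique key. It is a conceptual illustration of the commonly documented duplicate-key behavior of these statements, not SingleStore internals; the function names and sample rows are hypothetical.

```python
def insert_ignore(table, key, row):
    """INSERT ... IGNORE: keep the existing row if the key is already present."""
    table.setdefault(key, dict(row))

def replace(table, key, row):
    """REPLACE: discard any existing row and store the new row in full."""
    table[key] = dict(row)

def insert_on_duplicate_key_update(table, key, row, updates):
    """INSERT ... ON DUPLICATE KEY UPDATE: insert the row, or apply
    only the listed column updates to the row that already exists."""
    if key in table:
        table[key].update(updates)
    else:
        table[key] = dict(row)

table = {}
insert_ignore(table, 1, {"name": "a", "hits": 1})
insert_ignore(table, 1, {"name": "b", "hits": 1})   # duplicate key: ignored
after_ignore = dict(table[1])

insert_on_duplicate_key_update(table, 1, {"name": "c", "hits": 1}, {"hits": 2})
after_upsert = dict(table[1])                        # only "hits" changed

replace(table, 1, {"name": "d", "hits": 0})
after_replace = dict(table[1])                       # whole row swapped
```

In all three cases the duplicate is handled without an error reaching the caller, which is the property the LOAD DATA options above also provide.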

Query Optimization

Improved optimization of queries that join large numbers of tables. Join optimization is now significantly faster and adapts to very large join sizes, producing good execution plans while keeping query optimization time low. The engine variable distributed_optimizer_max_join_size is now deprecated and replaced by three new variables: distributed_optimizer_unrestricted_search_threshold, distributed_optimizer_min_join_size_run_initial_heuristics, and singlebox_optimizer_cost_based_threshold. See their descriptions in the Sync Variables List for details on configuring them.
Added a new engine variable, profile_for_debug, which enables collection of additional data with PROFILE. This data can be displayed using SHOW PROFILE JSON and is useful for troubleshooting query optimizer issues. For more information, see PROFILE.
Improved selectivity estimation by using sampling and histograms together, when both are available. This improvement only applies when cardinality_estimation_level is set to 7.3 or higher. By default, cardinality_estimation_level is set to 7.1.
Decreased the query optimization cost of looking up query plans in the on-disk plancache.

Query Execution

Decreased the in-memory size of query plans by up to 80%.
Implemented optimizations for system information schema queries, resulting in significant performance increases for tables such as index_statistics, column_statistics, and columnar_segments in particular.
Added support for EXPLAIN and PROFILE queries in stored procedures.

Usability and Programmability

Added a new aggregate function APPROX_PERCENTILE that calculates the approximate percentile and is about 10 times faster than the PERCENTILE_DISC and PERCENTILE_CONT functions.
The USING clause of query text is no longer captured as part of audit logging. It is now included in the output of SHOW PROCESSLIST in a new column titled RPC Info.
Added a new JSON function, JSON_AGG, that aggregates values as a JSON array.
- Added a new variable, json_agg_max_len, which sets the maximum string length, in bytes, that JSON_AGG can return. For more information, see the Non-Sync Variables List.
The ALTER permission is no longer strictly required for ANALYZE. ANALYZE now requires SELECT plus either ALTER or INSERT.
Added support for defining user-defined variables via SELECT INTO @varname.
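
For context on what APPROX_PERCENTILE approximates, the sketch below computes the two exact percentiles it is compared against, using the definitions commonly given for PERCENTILE_DISC (first ordered value whose cumulative position reaches the requested fraction) and PERCENTILE_CONT (linear interpolation between neighboring values). It is written in plain Python for illustration and assumes those standard SQL definitions; it does not show the approximation algorithm, which this announcement does not describe.

```python
import math

def percentile_disc(values, p):
    """Discrete percentile: the first value in sorted order whose
    cumulative position (rank / n) is at least p. Always an input value."""
    ordered = sorted(values)
    rank = max(math.ceil(p * len(ordered)), 1)  # 1-based rank
    return ordered[rank - 1]

def percentile_cont(values, p):
    """Continuous percentile: linear interpolation at position p * (n - 1)
    in the sorted values. May return a value not present in the input."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lower, upper = math.floor(pos), math.ceil(pos)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

sample = [40, 10, 30, 20]
median_disc = percentile_disc(sample, 0.5)  # picks an existing value
median_cont = percentile_cont(sample, 0.5)  # interpolates between 20 and 30
```

Both exact functions need the values in sorted order, which is one reason an approximate aggregate can be much cheaper on large inputs.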

Ingest

Added support for publishing data to Google Cloud Storage (GCS) via SELECT … INTO GCS.
Added support for specifying the chunk size while uploading data to an Amazon S3 bucket via SELECT … INTO S3 to enable output of very large files.
SingleStore Pipelines now support new Avro schema evolution capabilities. The hostname or IP address of the schema registry can be specified when the Pipeline is created, so that the Pipeline can easily see any changes to the schema. If fields are added to the Avro schema, the Pipeline can be modified without stopping it or losing any offsets.
Added a new option, max_retries_per_batch_partition, for Filesystem Pipeline Syntax and ALTER PIPELINE. When set, it determines how many times writing batch partition data to the destination table is retried. Specifying fewer retries can conserve resources when there is a large amount of data to load; conversely, more retries may suit smaller tables where performance is less of a concern.
Added a new option, process_zero_byte_files, to the CONFIG clause of CREATE PIPELINE for Filesystem pipelines. Enabling this option ensures that zero-byte files are processed; by default, they are skipped.
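
To see why a configurable chunk size matters for very large files written by SELECT … INTO S3, the following back-of-the-envelope sketch may help. It assumes the chunk size corresponds to the part size of an Amazon S3 multipart upload, which this announcement does not state explicitly; the 10,000-part limit is Amazon S3's documented cap on parts per multipart upload. The function names are hypothetical.

```python
S3_MAX_PARTS = 10_000  # Amazon S3 limit on parts per multipart upload

def max_object_bytes(chunk_size_bytes):
    """Largest object that fits in one upload at a given chunk (part) size."""
    return chunk_size_bytes * S3_MAX_PARTS

def min_chunk_bytes(object_size_bytes):
    """Smallest chunk size that keeps an object within the part limit
    (integer ceiling division, so no floating-point rounding)."""
    return -(-object_size_bytes // S3_MAX_PARTS)

MIB = 1024 ** 2
ceiling_at_5_mib = max_object_bytes(5 * MIB)  # about 52.4 GB at 5 MiB chunks
needed_for_1_tb = min_chunk_bytes(10 ** 12)   # chunk size for a 1 TB output
```

Under this assumption, a small chunk size caps the output at tens of gigabytes, so a larger chunk size is what allows a single very large result file.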
