{question}
I do not see an error when a column from the SELECT list is missing in the GROUP BY clause. How can I enable an error or warning to prevent incorrect aggregation results?
{question}
{answer}
By default, SingleStore DB does not issue an error or warning when columns in the list are not included in the clause. This behaviour can potentially result in incorrect or misleading aggregated results.
To enforce stricter SQL standards and ensure the accuracy of aggregation queries, you can enable the ONLY_FULL_GROUP_BY mode using the SQL_MODE system variable.
By default,SQL_MODEis set asSTRICT_ALL_TABLES.
STRICT_ALL_TABLES: always enabled, regardless of whether other modes are enabled or not.ONLY_FULL_GROUP_BY: throws an error when fields in theSELECTlist,HAVINGcondition, orORDER BYlist are not in aGROUP BYclause, or are outside of an aggregate function. The result set could include multiple values for the non-grouped, non-aggregated field.
Below is an example of the scenario.
CREATE TABLE tab_details_groupby(tab_name varchar(30), col_name varchar(30), datatype varchar(15));
INSERT INTO tab_details_groupby values('tab1','col1','INT'),
('tab2','col2','CHAR'),
('tab1','col1','VARCHAR'),
('tab2','col2','VARCHAR'),
('tab3','col3','INT');
No issues when all the columns in the SELECT list are part of the GROUP BY clause.
SELECT tab_name, col_name, datatype
FROM tab_details_groupby
GROUP BY tab_name, col_name, datatype
ORDER By 1;
We still get the results without any error/warning when:
Scenario 1: A column name from the SELECT list is missed in the GROUP BY clause.
SELECT tab_name, col_name, datatype
FROM tab_details_groupby
GROUP BY tab_name, col_name
ORDER By 1;
Scenario 2: When the column name in the ORDER BY clause is not part of the GROUP BY clause.
SELECT tab_name, col_name
FROM tab_details_groupby
GROUP BY tab_name, col_name
ORDER By datatype;
SQL_MODEtoONLY_FULL_GROUP_BYglobally, we get the required warning as follows:ERROR 2295 ER_NONAGGREGATED_NONGROUPED_FIELD: Column '<table_name.column_name>' from project list is used outside an aggregate function and does not appear in the GROUP BY clause. This is illegal because there may be multiple possible values for the field. If it is in fact uniquely determined, or if you wish to use one of the many possible values, you can use the ANY_VALUE aggregate function.
Below are the above-mentioned scenarios:
Scenario 1: A column name from the SELECT list is missed in the GROUP BY clause.
SELECT tab_name, col_name, datatype
FROM tab_details_groupby
GROUP BY tab_name, col_name
ORDER By 1;
Scenario 2: When the column name in the ORDER BY clause is not part of the GROUP BY clause.
SELECT tab_name, col_name
FROM tab_details_groupby
GROUP BY tab_name, col_name
ORDER By datatype;
{answer}