DISABLE_STREAMING_PREAGGREGATIONS Query Option (Impala 2.5 or higher only)
Turns off the "streaming preaggregation" optimization that is available in Impala 2.5 and higher. This optimization reduces unnecessary work performed by queries that perform aggregation operations on columns with few or no duplicate values, for example DISTINCT id_column or GROUP BY unique_column. If the optimization causes regressions in existing queries that use aggregation functions, you can turn it off as needed by setting this query option.
Type: Boolean; recognized values are 1 and 0, or true and false; any other value interpreted as false
Default: false (shown as 0 in output of SET statement)
Note: In some releases, the value true is not recognized, so use 1 to enable the option. This limitation is tracked by the issue IMPALA-3334, which shows the releases where the problem is fixed.
Usage notes:
Typically, queries that would require enabling this option involve very large numbers of aggregated values, such as a billion or more distinct keys being processed on each worker node.
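As a minimal sketch of how the option might be used in an impala-shell session, the statements below turn the optimization off for one query and then restore the default. The table name big_table is hypothetical, and id_column is borrowed from the description above; the value 1 is used rather than true because true is not recognized in some releases.

```sql
-- Disable the streaming preaggregation optimization for this session.
SET DISABLE_STREAMING_PREAGGREGATIONS=1;

-- Hypothetical aggregation over a column with few or no duplicate values.
SELECT COUNT(DISTINCT id_column) FROM big_table;

-- Restore the default so later queries use the optimization again.
SET DISABLE_STREAMING_PREAGGREGATIONS=0;
```

Because the option applies to the whole session, resetting it after the affected query keeps other aggregation queries on the default behavior.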
Added in: Impala 2.5.0