DataStore Aggregation Functions
DataStore provides comprehensive aggregation and window function support, leveraging ClickHouse's powerful SQL aggregation capabilities.
Basic Aggregations
Built-in Methods
| Method | SQL Equivalent | Description |
|---|---|---|
sum() | SUM() | Sum of values |
mean() | AVG() | Average/mean |
count() | COUNT() | Count non-null values |
min() | MIN() | Minimum value |
max() | MAX() | Maximum value |
median() | MEDIAN() | Median value |
std() | stddevPop() | Standard deviation |
var() | varPop() | Variance |
nunique() | COUNT(DISTINCT) | Count unique values |
Examples:
GroupBy Aggregations
Single Aggregation
Multiple Aggregations
Named Aggregations
Multiple GroupBy Keys
Statistical Aggregations
| Method | SQL Equivalent | Description |
|---|---|---|
quantile(q) | quantile(q) | q-th quantile (0-1) |
skew() | skewPop() | Skewness |
kurt() | kurtPop() | Kurtosis |
corr() | corr() | Correlation |
cov() | covar() | Covariance |
sem() | - | Standard error of mean |
Examples:
Conditional Aggregations
ClickHouse-specific conditional aggregation functions.
| Function | ClickHouse | Description |
|---|---|---|
sum_if(cond) | sumIf() | Sum where condition |
count_if(cond) | countIf() | Count where condition |
avg_if(cond) | avgIf() | Average where condition |
min_if(cond) | minIf() | Min where condition |
max_if(cond) | maxIf() | Max where condition |
Examples:
Collection Aggregations
ClickHouse-specific functions that collect values.
| Function | ClickHouse | Description |
|---|---|---|
group_array() | groupArray() | Collect into array |
group_uniq_array() | groupUniqArray() | Collect unique into array |
group_concat(sep) | groupConcat() | Concatenate strings |
top_k(n) | topK(n) | Top K frequent values |
any() | any() | Any value |
any_last() | anyLast() | Last value |
first_value() | first_value() | First value in order |
last_value() | last_value() | Last value in order |
Examples:
Window Functions
Ranking Functions
| Function | SQL | Description |
|---|---|---|
row_number() | ROW_NUMBER() | Sequential row number |
rank() | RANK() | Rank with gaps |
dense_rank() | DENSE_RANK() | Rank without gaps |
ntile(n) | NTILE(n) | Divide into n buckets |
percent_rank() | PERCENT_RANK() | Percentile rank (0-1) |
cume_dist() | CUME_DIST() | Cumulative distribution |
Examples:
Value Functions
| Function | SQL | Description |
|---|---|---|
lag(n) | LAG(col, n) | Previous row value |
lead(n) | LEAD(col, n) | Next row value |
first_value() | FIRST_VALUE() | First value in window |
last_value() | LAST_VALUE() | Last value in window |
nth_value(n) | NTH_VALUE(col, n) | Nth value in window |
Examples:
Cumulative Functions
| Method | Description |
|---|---|
cumsum() | Cumulative sum |
cummax() | Cumulative maximum |
cummin() | Cumulative minimum |
cumprod() | Cumulative product |
diff(n) | Difference from n rows back |
pct_change(n) | Percent change from n rows back |
Examples:
Rolling Windows
F Namespace
The F namespace provides access to ClickHouse functions.
Import
Using F Functions
F with Window Functions
Common Aggregation Patterns
Top N per Group
Running Total
Moving Average
Year-over-Year Comparison
Percentile Ranking
Aggregation Methods Summary
| Category | Methods |
|---|---|
| Basic | sum, mean, count, min, max, median |
| Statistical | std, var, quantile, skew, kurt, corr, cov |
| Conditional | sum_if, count_if, avg_if, min_if, max_if |
| Collection | group_array, group_uniq_array, group_concat, top_k |
| Ranking | row_number, rank, dense_rank, ntile, percent_rank |
| Value | lag, lead, first_value, last_value, nth_value |
| Cumulative | cumsum, cummax, cummin, cumprod, diff, pct_change |
| Rolling | rolling().mean/sum/std/..., expanding().mean/sum/... |