Skip to main content
The Aggregate transformation component lets you summarize multiple rows of data in one output row by grouping or aggregating the input data. You can aggregate one or more columns at the same time. By default, the aggregation is output in a new column with a name that combines the source column name and the aggregation type—for example, avg_Cost. You can use a Rename component immediately after the Aggregate component to change these names if necessary.

Use case

This component can be used to create useful metrics and KPIs. For example, you can use it to:
  • Calculate a total. For example, use the Sum aggregation to calculate the total sales per product type from a dataset containing information about all purchases.
  • Calculate averages. For example, use the Average aggregation to find the average transaction amount per customer account from a dataset containing all customer transactions.
  • Determine minimum and maximum values. For example, use the Max aggregation type to find users’ most recent logins from a dataset containing all logins in the past month.

Properties

Name
string
required
A human-readable name for the component.

Groupings
dual listbox
Select one or more columns from the source table that will form the groupings. The output flow will contain one row for each distinct combination of values within the grouping columns.

Aggregations
column editor
required
Select the input (source) columns and the aggregation types that should be applied to each. Note that certain aggregation types expect certain data types. For example, you cannot average non-numeric data.
  • Column Name: Select the input (source) column for the summary function.
  • Aggregate Type: The summary function to use for the column. Available functions include:
    • Any Value
    • Approximate Count
    • Approximate Count Distinct (HLL)
    • Array Aggregate
    • Array Aggregate Distinct
    • Average
    • Bit AND Aggregate
    • Bit OR Aggregate
    • Bit XOR Aggregate
    • Bool AND Aggregate
    • Bool OR Aggregate
    • Bool XOR Aggregate
    • Count
    • Count Distinct
    • Hash Aggregate
    • List Aggregate
    • List Aggregate Distinct
    • Max
    • Median
    • Min
    • Mode
    • Skew
    • Standard Deviation - Population
    • Standard Deviation - Sample
    • Sum
    • Variance - Population
    • Variance - Sample
For more information on the Aggregation Types, read the Snowflake Aggregate Functions documentation.

Grouping Type
drop-down
required
Select one of the following methods for grouping rows:
  • GROUP_BY: Groups rows with the same group-by-item expressions and computes aggregate functions for the resulting group. A Snowflake GROUP_BY expression can be a column name, a number referencing a position in the SELECT list, or a general expression.
  • GROUP_BY_CUBE: The Snowflake GROUP_BY_CUBE clause is an extension of the GROUP_BY clause similar to GROUP_BY_ROLLUP. In addition to producing all the rows of a GROUP_BY_ROLLUP, GROUP_BY_CUBE adds all the “cross-tabulations” rows. Subtotal rows are rows that further aggregate, whose values are derived by computing the same aggregate functions that were used to produce the grouped rows.
  • GROUP_BY_GROUPING_SETS: A Snowflake GROUP_BY_GROUPING_SETS clause is a powerful extension of the GROUP_BY clause that allows computing multiple group-by clauses in a single statement. The group set is a set of dimension columns.
  • GROUP_BY_ROLLUP: A Snowflake GROUP_BY_ROLLUP clause is an extension of the GROUP_BY clause that produces subtotal rows (in addition to the grouped rows). Subtotal rows are rows that further aggregate, whose values are derived by computing the same aggregate functions that were used to produce the grouped rows.

Examples

Alice and Zoe are managing a merchandise booth on different days. We have some data for their sales on each day.
Input data

+------------+----------+----------+---------+
|    DATE    | PRODUCT  | QUANTITY | MANAGER |
+------------+----------+----------+---------+
| 2023-04-01 | Mugs     |       10 | Alice   |
| 2023-04-01 | Stickers |       20 | Alice   |
| 2023-04-02 | Mugs     |       15 | Alice   |
| 2023-04-02 | Stickers |        5 | Alice   |
| 2023-04-03 | Mugs     |       20 | Zoe     |
| 2023-04-03 | Stickers |       25 | Zoe     |
| 2023-04-04 | Mugs     |       15 | Zoe     |
| 2023-04-04 | Stickers |        5 | Zoe     |
+------------+----------+----------+---------+

Find the total number of each product sold.One of the simplest and most useful things to do would be to aggregate the sales numbers, grouped by product.Aggregate component properties:
  • Groupings: PRODUCT
  • Aggregations:
    • Column Name: QUANTITY
    • Aggregation Type: Sum
  • Grouping Type: Group By
Output data

+--------------+----------+
| sum_Quantity | Product  |
+--------------+----------+
|           60 | Mugs     |
|           55 | Stickers |
+--------------+----------+