avg_Cost. You can use a Rename component immediately after the Aggregate component to change these names if necessary.
Use case
This component can be used to create useful metrics and KPIs. For example, you can use it to:- Calculate a total. For example, use the
Sumaggregation to calculate the total sales per product type from a dataset containing information about all purchases. - Calculate averages. For example, use the
Averageaggregation to find the average transaction amount per customer account from a dataset containing all customer transactions. - Determine minimum and maximum values. For example, use the
Maxaggregation type to find users’ most recent logins from a dataset containing all logins in the past month.
Properties
- Snowflake
- Databricks
- Amazon Redshift
A human-readable name for the component.
Select one or more columns from the source table that will form the groupings. The output flow will contain one row for each distinct combination of values within the grouping columns.
Select the input (source) columns and the aggregation types that should be applied to each. Note that certain aggregation types expect certain data types. For example, you cannot average non-numeric data.
- Column Name: Select the input (source) column for the summary function.
- Aggregate Type: The summary function to use for the column. Available functions include:
- Any Value
- Approximate Count
- Approximate Count Distinct (HLL)
- Array Aggregate
- Array Aggregate Distinct
- Average
- Bit AND Aggregate
- Bit OR Aggregate
- Bit XOR Aggregate
- Bool AND Aggregate
- Bool OR Aggregate
- Bool XOR Aggregate
- Count
- Count Distinct
- Hash Aggregate
- List Aggregate
- List Aggregate Distinct
- Max
- Median
- Min
- Mode
- Skew
- Standard Deviation - Population
- Standard Deviation - Sample
- Sum
- Variance - Population
- Variance - Sample
Select one of the following methods for grouping rows:
- GROUP_BY: Groups rows with the same group-by-item expressions and computes aggregate functions for the resulting group. A Snowflake GROUP_BY expression can be a column name, a number referencing a position in the SELECT list, or a general expression.
- GROUP_BY_CUBE: The Snowflake GROUP_BY_CUBE clause is an extension of the GROUP_BY clause similar to GROUP_BY_ROLLUP. In addition to producing all the rows of a GROUP_BY_ROLLUP, GROUP_BY_CUBE adds all the “cross-tabulations” rows. Subtotal rows are rows that further aggregate, whose values are derived by computing the same aggregate functions that were used to produce the grouped rows.
- GROUP_BY_GROUPING_SETS: A Snowflake GROUP_BY_GROUPING_SETS clause is a powerful extension of the GROUP_BY clause that allows computing multiple group-by clauses in a single statement. The group set is a set of dimension columns.
- GROUP_BY_ROLLUP: A Snowflake GROUP_BY_ROLLUP clause is an extension of the GROUP_BY clause that produces subtotal rows (in addition to the grouped rows). Subtotal rows are rows that further aggregate, whose values are derived by computing the same aggregate functions that were used to produce the grouped rows.
Examples
Alice and Zoe are managing a merchandise booth on different days. We have some data for their sales on each day.- Example 1
- Example 2
Find the total number of each product sold.One of the simplest and most useful things to do would be to aggregate the sales numbers, grouped by product.Aggregate component properties:
- Groupings: PRODUCT
- Aggregations:
- Column Name: QUANTITY
- Aggregation Type: Sum
- Grouping Type: Group By

