- Snowflake window functions
- Databricks window functions
- Amazon Redshift window functions
- Google BigQuery window functions
Use case
This component can be used to highlight the highest and lowest values in your data, identify duplicate data, and rank values by percentile. For example, you can use it to:
- Identify top-performing ads using RANK or DENSE RANK.
- Remove duplicate data from your dataset by partitioning and sorting data, then using ROW NUMBER = 1.
- Segment your data by percentile, to analyze data from different customer demographics.
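The deduplication pattern above can be sketched as follows. This is a minimal illustration, not the SQL the component generates: it uses Python's built-in sqlite3 module as a stand-in for a cloud data warehouse, the `customers` table and its columns are hypothetical, and it assumes the bundled SQLite is version 3.25 or later (the release that added window functions).

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (email TEXT, name TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("ana@example.com", "Ana", "2024-01-05"),
        ("ana@example.com", "Ana M.", "2024-03-10"),  # newer duplicate
        ("bo@example.com", "Bo", "2024-02-01"),
    ],
)

# Partition by the duplicate key, sort newest first within each partition,
# then keep only the row numbered 1 in each partition.
dedup_sql = """
SELECT email, name, updated_at
FROM (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY email
               ORDER BY updated_at DESC
           ) AS rn
    FROM customers
)
WHERE rn = 1
ORDER BY email
"""
deduped = conn.execute(dedup_sql).fetchall()
for row in deduped:
    print(row)
```

The sort order inside the partition decides which duplicate survives. Here the most recently updated row is kept; sorting ascending would keep the oldest instead.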
When sampling data from a Rank component, if the input dataset is very large, the sample output may not display rows in the correct order. This is a display issue only—the data in this component has been ranked correctly.
Properties
A human-readable name for the component.
Defines whether the component passes all input columns into the output.
Defines how the input data is partitioned to perform the rank calculation. The calculation is then performed on each partition.
To use grid variables, toggle Use Grid Variable on at the top of the dialog. For more information, read Grid variables.
Order input columns within the partitioned data. Drag to reorder, then choose from the following:
- Ascending
- Descending
- Nulls First
- Nulls Last
Nulls First and Nulls Last are not available for Google BigQuery.
Click the Text mode toggle at the bottom of the dialog to open a multi-line editor that lets you add items in a single block. For more information, read Text mode.
To use grid variables, toggle Use Grid Variable on at the bottom of the dialog. For more information, read Grid variables.
Select a window function:
- Rank: Determines the rank of a value in a group of values. Tied rows receive the same rank, and the ranks that follow are skipped accordingly, leaving a gap in the sequence.
- Dense Rank: Determines the rank of a value in a group of values. Dense Rank differs from Rank in one respect: if two or more rows tie, there is no gap in the sequence of ranked values.
- Cumulative Distribution: Determines the cumulative distribution of a value within a window or partition.
- Percent Rank: Calculates the percent rank of a given row, that is, (rank - 1) divided by (number of rows in the partition - 1).
- Row Number: Determines the ordinal number of the current row within a group of rows, counting from 1.
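To compare the five functions side by side, the sketch below runs them over a small dataset containing a tie. It is an illustration under stated assumptions rather than the component's generated SQL: Python's built-in sqlite3 module stands in for the warehouse, the `ads` table and its columns are hypothetical, and SQLite 3.25 or later is assumed for window function support.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ads (campaign TEXT, ad TEXT, clicks INTEGER)")
conn.executemany(
    "INSERT INTO ads VALUES (?, ?, ?)",
    [
        ("x", "A", 100),
        ("x", "B", 100),  # ties with ad A
        ("x", "C", 80),
        ("x", "D", 50),
        ("y", "E", 70),   # a second partition with a single row
    ],
)

# Window w partitions by campaign and orders by clicks, highest first.
# ROW_NUMBER adds the ad name as a tie-breaker so its output is deterministic.
rank_sql = """
SELECT campaign, ad, clicks,
       RANK()         OVER w AS rnk,
       DENSE_RANK()   OVER w AS dense_rnk,
       ROW_NUMBER()   OVER (PARTITION BY campaign
                            ORDER BY clicks DESC, ad) AS row_num,
       CUME_DIST()    OVER w AS cume,
       PERCENT_RANK() OVER w AS pct
FROM ads
WINDOW w AS (PARTITION BY campaign ORDER BY clicks DESC)
ORDER BY campaign, row_num
"""
ranked = conn.execute(rank_sql).fetchall()
for row in ranked:
    print(row)
```

In campaign `x`, ads A and B tie on clicks, so both receive rank 1. Rank then assigns 3 to ad C, while Dense Rank assigns 2. Row Number never repeats a value within a partition, and every function restarts for campaign `y`.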
