The Transpose Columns transformation component rotates a table by transforming columns from the input dataset into rows in the output data. This reshapes data by outputting multiple rows for each individual input row. Each set of input columns is mapped to an output column. The output rows are labelled to determine which column the value originated from. This component effectively performs the reverse of a pivot operation on the data. Consider the Unpivot component for an alternative way of obtaining similar results.
There is no fixed limit on the number of columns you can transpose. The maximum depends on the structure of your source table and the query complexity limits of your data warehouse.
Use case
This component is useful when you have “wide” data (many columns) that you want to make “long” or normalized to help with analytics, modeling, and reporting. Some typical uses are:
- Unpivoting time-based data to allow time-series analysis. For example, convert a table with one column per year into a table with one row per year.
- Converting multiple response columns from a survey into a normalized format for easier filtering and aggregation.
Properties
- Snowflake
- Amazon Redshift
Name: A human-readable name for the component.
Ordinary Columns: Choose the ordinary columns, those that are not going to be transposed but are still required in the output. These are effectively a set of grouping columns that are passed to the output unchanged. To use grid variables, select the Use Grid Variable checkbox at the bottom of the dialog. For more information, read Grid variables.
Row Label Name: Provide the name of a new column here. It will contain the constants you enter into the Column to Row Mapping property, which identify the original column that each new row originated from.
Output Columns:
- Name: A new column name to hold the output of multiple input columns.
- Type: Specify the data type for the column. It should be compatible with all input columns that will be mapped into this column, and is used to validate that the input columns all conform to the type of the output column. Choose from the following data types:
  - VARCHAR: This type is suitable for numbers and letters. A varchar or Variable Character field is a set of character data of indeterminate length.
  - NUMBER: This type is suitable for numeric types, with or without decimals.
  - FLOAT: This type is suitable for approximate numeric values with fractional components.
  - BOOLEAN: This type is suitable for data that is either “true” or “false”.
  - DATE: This type is suitable for dates without times.
  - TIMESTAMP: This type is a timestamp left unformatted (exists as Unix/Epoch Time).
  - TIME: This type is suitable for time, independent of a specific date and timezone.
  - VARIANT: Variant is a tagged universal type that can hold up to 16 MB of any data type supported by Snowflake.
Column to Row Mapping:
- Row Label Name: This editor column appears under the label provided in the Row Label Name property. Enter an identifier to specify what each row represents.
- Output Column-1: Each defined output column will appear as a column in this mapping. Add a row to this grid for each input column you want to map into an output column.
- Output Column-n: As above, if you are mapping multiple sets of input columns. When you map data into multiple output columns, there should be a set of similar input columns for each output column. For example, you may have a set of input columns for each quarterly revenue amount, and another set of input columns for quarterly profits.
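To make the mapping grid with several output columns more concrete, here is a minimal Python sketch of the same reshaping logic. It is an illustration only, not the component's implementation: the function name `transpose_columns`, the sample column names (`region`, `revenue_q1`, `profit_q1`, and so on), and the values are all hypothetical.

```python
def transpose_columns(rows, ordinary_columns, row_label_name, mapping):
    """Emit one output row per mapping entry for every input row.

    mapping is a list of (label, {output_column: input_column}) pairs,
    mirroring one row of the mapping grid per pair.
    """
    output = []
    for row in rows:
        for label, sources in mapping:
            # Ordinary columns pass through unchanged.
            out = {col: row[col] for col in ordinary_columns}
            # The label records which set of input columns the row came from.
            out[row_label_name] = label
            for output_column, input_column in sources.items():
                out[output_column] = row[input_column]
            output.append(out)
    return output


# Hypothetical wide input: one revenue and one profit column per quarter.
wide_rows = [
    {"region": "north", "revenue_q1": 100, "revenue_q2": 120,
     "profit_q1": 10, "profit_q2": 15},
]

# Two output columns (revenue, profit), each fed by a similar set of inputs.
mapping = [
    ("Q1", {"revenue": "revenue_q1", "profit": "profit_q1"}),
    ("Q2", {"revenue": "revenue_q2", "profit": "profit_q2"}),
]

long_rows = transpose_columns(wide_rows, ["region"], "quarter", mapping)
for r in long_rows:
    print(r)
```

Each mapping entry corresponds to one row of the grid: a label value plus one input column per output column, which is why every output column needs a matching set of input columns.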
Example
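We have a table of products and yearly sales information, with one sales column per year (sales_2023 and sales_2024). However, the format doesn’t allow us to easily do analysis by year. The component is configured with the following properties:
- Ordinary Columns: product_id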
- Row Label Name: year
- Output Columns:
  - Name: sales
  - Type: NUMBER
- Columns To Row Mapping:
  - year: 2023, sales: sales_2023
  - year: 2024, sales: sales_2024
For each product_id in the input, data from the original sales_2023 column goes into an output row with “2023” in the year column, while data from the original sales_2024 column goes into a separate output row with “2024” in the year column.
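The result of this example can be sketched in plain Python. The column names come from the example above, but the sales figures are made up for illustration, and the code only mimics the reshaping rather than showing how the component runs in the warehouse.

```python
# Hypothetical input rows; the sales figures are invented for illustration.
products = [
    {"product_id": 1, "sales_2023": 500, "sales_2024": 650},
    {"product_id": 2, "sales_2023": 300, "sales_2024": 280},
]

# Mapping from the example: year label -> input column feeding "sales".
year_to_column = {"2023": "sales_2023", "2024": "sales_2024"}

# One output row per (input row, mapped column) pair.
sales_by_year = [
    {"product_id": row["product_id"], "year": year, "sales": row[column]}
    for row in products
    for year, column in year_to_column.items()
]

for r in sales_by_year:
    print(r)
```

Two input rows with two mapped columns produce four output rows, each carrying product_id unchanged alongside the year label and the corresponding sales value.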