Transpose Rows - Maia Documentation

The Transpose Rows transformation component lets you combine multiple input rows into a single output row. One row will be produced for each value of the columns specified in the Groupings parameter. This is especially helpful when you want to restructure long datasets into a wide format for reporting, comparisons, or aggregations. The output of this component differs slightly depending on your cloud data warehouse environment:

In a Snowflake environment, this component concatenates each value into a delimited string with the format [ value, value, value ]. You could then use a Split Field component after the Transpose Rows component to convert the delimited strings into a set of new columns, if required.
In a Databricks environment, this component returns values as Databricks ARRAY <String> types. These values can’t be manipulated by a Split Field component.
In Amazon Redshift and Google BigQuery environments, this component concatenates each value into a delimited string with the format value,value,value. You could then use a Split Field component after the Transpose Rows component to convert the delimited strings into a set of new columns, if required.

For an alternative approach to transforming rows into columns, consider the Pivot component instead.
If you want to map columns from the input dataset into rows in the output data, use the Transpose Columns component.

Use case

This component can be used to restructure long datasets into a wide format for reporting, comparisons, or aggregations. Some typical uses of this are:

Pivoting long-form time-based data into a wide format for period-over-period comparisons.
Condensing repeated rows (for example, in survey responses or log files) into a single record.

Properties

Snowflake
Databricks
Amazon Redshift
Google BigQuery

Name

string

required

A human-readable name for the component.

Aggregations

dual listbox

required

The input column to aggregate.To use grid variables, toggle Use Grid Variable on at the top of the dialog. For more information, read Grid variables.

Sort order

dual listbox

Rows included in the “groupings” are ordered using this Sort order parameter. If you’re performing multiple transpositions, ordering will ensure that the items are in a consistent order.To use grid variables, toggle Use Grid Variable on at the top of the dialog. For more information, read Grid variables.

Groupings

dual listbox

One or more source columns that form the groupings. The output will have one row for every combination of grouping column values.To use grid variables, toggle Use Grid Variable on at the top of the dialog. For more information, read Grid variables.

Example

This example uses a Snowflake project. Other cloud data warehouses will give slightly different output formats.

We have a table of products and yearly sales information. However, the format gives us a lot of repeated data:

Input Data

+------------+------+---------+
| product_id | year | sales   |
+------------+------+---------+
| P123       | 2023 | 1000.00 |
| P124       | 2023 | 2500.00 |
| P125       | 2023 | 1800.00 |
| P126       | 2023 | 3500.00 |
| P123       | 2024 | 1500.00 |
| P124       | 2024 | 3000.00 |
| P125       | 2024 | 2200.00 |
| P126       | 2024 | 4000.00 |
+------------+------+---------+

We want to transpose the data into a condensed format that puts the data for each product into a single row:

Output Data

+------------+--------------------+----------------+
| product_id | year               | sales          |
+------------+--------------------+----------------+
| P126       | [ "2023", "2024" ] | [ 3500, 4000 ] |
| P123       | [ "2023", "2024" ] | [ 1000, 1500 ] |
| P124       | [ "2023", "2024" ] | [ 2500, 3000 ] |
| P125       | [ "2023", "2024" ] | [ 1800, 2200 ] |
+------------+--------------------+----------------+

To achieve this transformation, we connect the input table to a Transpose Rows component, which we will configure as follows:

Aggregations: year, sales
Sort Order: Left blank. In a long dataset, sorting on product_id might be appropriate.
Groupings: product_id

​Use case

​Properties

​Example

Use case

Properties

Example