# Extract Structured Data

export const ComponentMetadata = ({warehouses, unsupportedWarehouses = [], componentType, connectionInputs, connectionOutputs}) => {
  const allWarehouses = [...warehouses.map(w => ({
    name: w,
    supported: true
  })), ...unsupportedWarehouses.map(w => ({
    name: w,
    supported: false
  }))];
  return <div style={{
    background: 'var(--colors-background-light, #f9fafb)',
    border: '1px solid var(--colors-border-default, #e5e7eb)',
    borderRadius: '12px',
    padding: '20px 28px',
    marginBottom: '28px',
    boxShadow: '0 1px 4px rgba(0,0,0,0.10)'
  }}>
      <table style={{
    width: '100%',
    borderCollapse: 'collapse'
  }}>
        <tbody>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle',
    width: '180px'
  }}>Project Availability</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>
              <div style={{
    display: 'flex',
    flexWrap: 'wrap',
    gap: '8px'
  }}>
                {allWarehouses.map((w, i) => <span key={i} style={{
    background: w.supported ? '#dcfce7' : '#fee2e2',
    color: w.supported ? '#15803d' : '#b91c1c',
    border: `1px solid ${w.supported ? '#bbf7d0' : '#fca5a5'}`,
    borderRadius: '9999px',
    padding: '3px 12px',
    fontSize: '0.85rem',
    fontWeight: '500',
    whiteSpace: 'nowrap'
  }}>
                    {w.name} {w.supported ? '✅' : '❌'}
                  </span>)}
              </div>
            </td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Component Type</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{componentType}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Inputs</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{connectionInputs}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Outputs</td>
            <td style={{
    verticalAlign: 'middle'
  }}>{connectionOutputs}</td>
          </tr>
        </tbody>
      </table>
    </div>;
};

<ComponentMetadata warehouses={["Databricks", "Google BigQuery"]} unsupportedWarehouses={["Snowflake", "Amazon Redshift"]} componentType="Transformation" connectionInputs="One" connectionOutputs="Unlimited" />

The **Extract Structured Data** transformation component unpacks arrays of structured data, such as values in [STRUCT](https://docs.databricks.com/en/sql/language-manual/data-types/struct-type.html) or [ARRAY](https://docs.databricks.com/en/sql/language-manual/data-types/array-type.html) data types, into the columns and rows of a table. The component is especially useful when you use a [Custom Connector](/docs/guides/custom-connector-overview) to access an API that returns data in a structured format.

The input to this component should include one or more variant-type columns containing a STRUCT or ARRAY that is to be unpacked. The component will operate on every suitable column in the input.

Each element in the source structured data can be mapped to a different column in your target table. For example, consider the following array of three elements:

```
"name", "png", "alt"
```

When data in this format is extracted, each of the three array elements is mapped to a different column in the target table. Sampling the output of the transformation shows something like the following table:

| Name      | Flag                                                               | Alt\_Text              |
| --------- | ------------------------------------------------------------------ | ---------------------- |
| Cyprus    | [https://flagcdn.com/w320/cy.png](https://flagcdn.com/w320/cy.png) | The flag of Cyprus.    |
| Somalia   | [https://flagcdn.com/w320/so.png](https://flagcdn.com/w320/so.png) | The flag of Somalia.   |
| Venezuela | [https://flagcdn.com/w320/ve.png](https://flagcdn.com/w320/ve.png) | The flag of Venezuela. |

Use the **Columns** property to define which elements in the source array will be mapped to columns in the target data.
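
As a rough mental model of this mapping, the following Python sketch treats each struct as a dictionary and produces one output column per selected element. It is a conceptual illustration only, not the SQL the component generates. The `country` column name and the `extract_elements` helper are hypothetical; the values are taken from the sample table above.

```python
# Conceptual model only: each input row holds one struct, modeled as a dict.
# The "country" column name is a hypothetical placeholder.
rows = [
    {"country": {"name": "Cyprus",
                 "png": "https://flagcdn.com/w320/cy.png",
                 "alt": "The flag of Cyprus."}},
    {"country": {"name": "Somalia",
                 "png": "https://flagcdn.com/w320/so.png",
                 "alt": "The flag of Somalia."}},
    {"country": {"name": "Venezuela",
                 "png": "https://flagcdn.com/w320/ve.png",
                 "alt": "The flag of Venezuela."}},
]


def extract_elements(rows, struct_column, elements):
    """Return one output row per input row, with one column per selected element."""
    return [
        {element: row[struct_column].get(element) for element in elements}
        for row in rows
    ]


# Selecting all three elements mirrors ticking all three checkboxes in Columns.
extracted = extract_elements(rows, "country", ["name", "png", "alt"])
```

Passing a shorter list, such as `["name"]`, models selecting only some of the elements in the **Columns** dialog.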

<Note>
  * This component won't unpack semi-structured data formats such as JSON. For that, you must use the [Extract Nested Data](/docs/components/extract-nested-data) component.
  * For an alternative method of extracting semi-structured data in a Snowflake project, you can use the [Flatten Variant](/docs/components/flatten-variant) component.
</Note>

### Use case

The Extract Structured Data component flattens and extracts fields from structured data such as an ARRAY or STRUCT. Common uses include:

* Taking source data where a column contains an array, and putting the array elements into separate table columns to allow further transformations.
* Handling output from the [Convert String To Struct](/docs/components/convert-string-to-struct) component. When you use Convert String To Struct to parse a JSON string into a struct, you can then use Extract Structured Data to make each field in that struct accessible as its own column for filtering, joining, or writing to a table.

***

## Properties

<ResponseField name="Name" type="string" required>
  A human-readable name for the component.
</ResponseField>

{/* <!-- param-start:[columns.selected] | warehouses: [databricks, bigquery] --> */}

<ResponseField name="Columns" type="data structure" required>
  Use this property to select which elements from the structured input will be mapped to columns in the output. The **Columns** dialog shows a graphical representation of every addressable element in the input. If the input has multiple columns of structured data, all will be included here. Each element has a corresponding checkbox. Select an element's checkbox to include that element in the output. No elements are selected by default.

  * To select every element, click **Select all**.
  * To clear every element, click **Clear all** or **Reset**.

  Click **Save** when you have finished editing and selecting elements.
</ResponseField>

{/* <!-- param-start:[aliases] | warehouses: [databricks, bigquery] --> */}

<ResponseField name="Aliases" type="column editor">
  By default, the output columns will have the same names as the input elements. You can rename the output columns by specifying aliases in this dialog.

  * **Source column:** Select the source element that you wish to provide an alias for.
  * **Target column:** Provide a name for the output column.

  You can provide aliases for any or all input elements. This property is optional: leave it empty if you don't want to rename any columns.

  Click the **Text mode** toggle at the bottom of the dialog to open a multi-line editor that lets you add items in a single block. For more information, read [Text mode](/docs/guides/components-overview#text-mode).

  To use grid variables, toggle **Use Grid Variable** on at the bottom of the dialog. For more information, read [Grid variables](/docs/guides/grid-variables).
</ResponseField>
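
The effect of aliases can be pictured with a short Python sketch. This is only an illustration of the renaming behavior described above, under the assumption that elements without an alias keep their original names; the `apply_aliases` helper is hypothetical.

```python
def apply_aliases(row, aliases):
    """Rename columns that have an alias; leave all other column names unchanged."""
    return {aliases.get(name, name): value for name, value in row.items()}


# One extracted row, using the element names from the source struct.
extracted_row = {
    "name": "Cyprus",
    "png": "https://flagcdn.com/w320/cy.png",
    "alt": "The flag of Cyprus.",
}

# Source column -> Target column. "name" has no alias, so it is kept as-is.
aliases = {"png": "Flag", "alt": "Alt_Text"}

renamed = apply_aliases(extracted_row, aliases)
```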

{/* <!-- param-start:[includeInputColumns] | warehouses: [databricks, bigquery] --> */}

<ResponseField name="Include input columns" type="boolean" required>
  Choose whether to include input columns in the output.
</ResponseField>

{/* <!-- param-start:[inputColumnPrefix] | warehouses: [bigquery] --> */}

<ResponseField name="Input column prefix" type="string">
  The prefix that is prepended to all input column names when **Include input columns** is set to **Yes**. Defaults to `input_`.
</ResponseField>
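
The interaction between **Include input columns** and **Input column prefix** can be sketched in Python as follows. This is a conceptual model under stated assumptions: the `build_output_row` helper and the `id` and `country` column names are hypothetical, and the column order shown is not guaranteed to match the real output.

```python
def build_output_row(input_row, extracted, include_input_columns, prefix="input_"):
    """Combine extracted columns with (optionally) prefixed input columns."""
    if not include_input_columns:
        # Only the extracted columns appear in the output.
        return dict(extracted)
    # Prefix every input column name, then add the extracted columns.
    prefixed = {f"{prefix}{name}": value for name, value in input_row.items()}
    return {**prefixed, **extracted}


input_row = {"id": 1, "country": {"name": "Cyprus"}}  # hypothetical input columns
extracted = {"name": "Cyprus"}

with_inputs = build_output_row(input_row, extracted, include_input_columns=True)
without_inputs = build_output_row(input_row, extracted, include_input_columns=False)
```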

{/* <!-- param-start:[outerJoin] | warehouses: [databricks, bigquery] --> */}

<ResponseField name="Outer join" type="boolean" required>
  Determines how to handle input rows that can't be expanded (for example, because they have no fields to expand, or because they can't be accessed). Select **No** to completely omit these rows from the output, or **Yes** to generate an output row with `NULL` values.
</ResponseField>
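
The difference between the two **Outer join** settings can be illustrated with a small Python sketch. It models a row that can't be expanded as a missing or empty struct, which is an assumption made for illustration; the `extract` helper and the `country` column name are hypothetical.

```python
def extract(rows, struct_column, elements, outer_join):
    """Expand each row's struct; handle rows that can't be expanded per outer_join."""
    output = []
    for row in rows:
        struct = row.get(struct_column)
        if isinstance(struct, dict) and struct:
            # Expandable row: one column per selected element.
            output.append({element: struct.get(element) for element in elements})
        elif outer_join:
            # Outer join = Yes: keep the row, filling the columns with NULL (None).
            output.append({element: None for element in elements})
        # Outer join = No: the row is omitted from the output.
    return output


rows = [
    {"country": {"name": "Cyprus"}},
    {"country": None},  # nothing to expand
]

omitted = extract(rows, "country", ["name"], outer_join=False)
kept = extract(rows, "country", ["name"], outer_join=True)
```

With **No**, only the expandable row survives; with **Yes**, the output has the same number of rows as the input.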
