# AI Extract

export const ComponentMetadata = ({warehouses, unsupportedWarehouses = [], componentType, connectionInputs, connectionOutputs}) => {
  const allWarehouses = [...warehouses.map(w => ({
    name: w,
    supported: true
  })), ...unsupportedWarehouses.map(w => ({
    name: w,
    supported: false
  }))];
  return <div style={{
    background: 'var(--colors-background-light, #f9fafb)',
    border: '1px solid var(--colors-border-default, #e5e7eb)',
    borderRadius: '12px',
    padding: '20px 28px',
    marginBottom: '28px',
    boxShadow: '0 1px 4px rgba(0,0,0,0.10)'
  }}>
      <table style={{
    width: '100%',
    borderCollapse: 'collapse'
  }}>
        <tbody>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle',
    width: '180px'
  }}>Project Availability</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>
              <div style={{
    display: 'flex',
    flexWrap: 'wrap',
    gap: '8px'
  }}>
                {allWarehouses.map((w, i) => <span key={i} style={{
    background: w.supported ? '#dcfce7' : '#fee2e2',
    color: w.supported ? '#15803d' : '#b91c1c',
    border: `1px solid ${w.supported ? '#bbf7d0' : '#fca5a5'}`,
    borderRadius: '9999px',
    padding: '3px 12px',
    fontSize: '0.85rem',
    fontWeight: '500',
    whiteSpace: 'nowrap'
  }}>
                    {w.name} {w.supported ? '✅' : '❌'}
                  </span>)}
              </div>
            </td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Component Type</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{componentType}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Inputs</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{connectionInputs}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Outputs</td>
            <td style={{
    verticalAlign: 'middle'
  }}>{connectionOutputs}</td>
          </tr>
        </tbody>
      </table>
    </div>;
};

<ComponentMetadata warehouses={["Databricks"]} unsupportedWarehouses={["Snowflake", "Amazon Redshift", "BigQuery"]} componentType="Transformation" connectionInputs="One" connectionOutputs="Unlimited" />

<Info>
  Production use of this feature is available for specific editions only. [Contact our sales team](https://www.matillion.com/contact) for more information.
</Info>

The **AI Extract** transformation component uses the Databricks [ai\_extract()](https://docs.databricks.com/en/sql/language-manual/functions/ai_extract.html) function to extract entities from a given text input according to labels you provide. This function uses a Databricks chat model serving endpoint made available by [Databricks Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/index.html).

The output is a STRUCT in which each field is a string containing the extracted entity for the corresponding label. For example, if you supply the label `email` and the input contains an identifiable email address, the output STRUCT is `{"email": "sales@example.com"}`.

If the input is `NULL`, the output result is `NULL`.
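
To make the output shape concrete, the following Python sketch mimics the behavior described above for a single `email` label. It is a toy stand-in and not the Databricks model: `mock_extract_email` and its regular expression are hypothetical, and returning `None` for a label with no match is an assumption that this page does not confirm.

```python
import re

# Deliberately simple pattern, for illustration only; the real function uses
# a chat model, not a regular expression.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def mock_extract_email(text):
    """Toy stand-in for extracting the label 'email' from one text value."""
    if text is None:
        # NULL input produces NULL output.
        return None
    match = EMAIL_PATTERN.search(text)
    # Assumption: a label with no match becomes a null field.
    return {"email": match.group(0) if match else None}


print(mock_extract_email("Write to sales@example.com for a quote."))
# {'email': 'sales@example.com'}
print(mock_extract_email(None))
# None
```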

<Note>
  Make sure you have read and understood the [Requirements](https://docs.databricks.com/en/sql/language-manual/functions/ai_extract.html#requirements) set out by Databricks before using this component.
</Note>

### Use case

AI Extract pulls structured data (such as names, dates, entities, amounts, or custom fields) out of unstructured or semi-structured text, which makes it well suited to tasks that are too complex for traditional parsing or regular expressions. Typical uses include:

* Invoice or receipt parsing, extracting fields like vendor name, invoice number, amount, or due date from scanned text or an email body.
* Customer feedback highlighting, extracting specific themes or attributes like mentioned features, issues, or locations from open-ended survey responses.
* Email or message field extraction, pulling structured fields like sender name, intent, dates, or product IDs from freeform email or chat content.

***

## Properties

<ResponseField name="Name" type="string" required>
  A human-readable name for the component.
</ResponseField>

{/* <!-- param-start:[column] | warehouses: [databricks] --> */}

<ResponseField name="Column" type="drop-down" required>
  Select an input column that contains the source text.

  The component operates on a single column of the input. To extract from multiple columns of a source table, use one AI Extract component per column.
</ResponseField>

{/* <!-- param-start:[extractLabels] | warehouses: [databricks] --> */}

<ResponseField name="Extract Labels" type="column editor" required>
  Enter the labels that define which entities to extract from the text. For example, `name`, `email`.

  Enter one label per row in the **Extract Labels** dialog. Click **+** to add a new row.
</ResponseField>
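
The **Column** and **Extract Labels** properties presumably map onto the two arguments of the Databricks `ai_extract()` function: the text to analyze and an array of label strings. The sketch below assembles such a call as a string. `ai_extract_expr` is a hypothetical helper written for illustration, the restriction on label characters is an assumption made to avoid SQL escaping, and the SQL that the component actually generates may differ.

```python
import re

# Assumption: restrict labels to simple names so they can be placed inside
# SQL string literals without any escaping.
_LABEL = re.compile(r"[A-Za-z0-9_ ]+")


def ai_extract_expr(column: str, labels: list[str]) -> str:
    """Return an ai_extract() call for one column and a list of labels."""
    if not labels:
        raise ValueError("at least one label is required")
    for label in labels:
        if not _LABEL.fullmatch(label):
            raise ValueError(f"unsupported label: {label!r}")
    # Backtick-quote the column name, doubling any embedded backticks.
    quoted_column = "`" + column.replace("`", "``") + "`"
    label_array = ", ".join(f"'{label}'" for label in labels)
    return f"ai_extract({quoted_column}, array({label_array}))"


print(ai_extract_expr("email_body", ["name", "email"]))
# ai_extract(`email_body`, array('name', 'email'))
```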

{/* <!-- param-start:[includeInputColumns] | warehouses: [databricks] --> */}

<ResponseField name="Include Input Columns" type="boolean" required>
  * **Yes:** Outputs all of your source input columns *and* the extracted data column. This includes input columns that are *not* selected in **Column**.
  * **No:** Only includes the extracted data column.
</ResponseField>
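
The effect of **Include Input Columns** on a single row can be modeled as follows. This is a minimal sketch that uses plain Python dictionaries in place of table rows; `shape_output` and the output column name `extracted` are hypothetical and are not taken from the component.

```python
from typing import Optional


def shape_output(row: dict, extracted: Optional[dict], include_input_columns: bool) -> dict:
    """Model one output row for the Include Input Columns setting."""
    # Yes: keep every input column, including those not chosen in Column.
    # No: start from an empty row so only the extracted data remains.
    out = dict(row) if include_input_columns else {}
    # "extracted" is a hypothetical output column name used for illustration.
    out["extracted"] = extracted
    return out


row = {"id": 7, "body": "Reach me at sales@example.com"}
extracted = {"email": "sales@example.com"}

print(shape_output(row, extracted, include_input_columns=True))
# {'id': 7, 'body': 'Reach me at sales@example.com', 'extracted': {'email': 'sales@example.com'}}
print(shape_output(row, extracted, include_input_columns=False))
# {'extracted': {'email': 'sales@example.com'}}
```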
