# AI Mask

export const ComponentMetadata = ({warehouses, unsupportedWarehouses = [], componentType, connectionInputs, connectionOutputs}) => {
  const allWarehouses = [...warehouses.map(w => ({
    name: w,
    supported: true
  })), ...unsupportedWarehouses.map(w => ({
    name: w,
    supported: false
  }))];
  return <div style={{
    background: 'var(--colors-background-light, #f9fafb)',
    border: '1px solid var(--colors-border-default, #e5e7eb)',
    borderRadius: '12px',
    padding: '20px 28px',
    marginBottom: '28px',
    boxShadow: '0 1px 4px rgba(0,0,0,0.10)'
  }}>
      <table style={{
    width: '100%',
    borderCollapse: 'collapse'
  }}>
        <tbody>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle',
    width: '180px'
  }}>Project Availability</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>
              <div style={{
    display: 'flex',
    flexWrap: 'wrap',
    gap: '8px'
  }}>
                {allWarehouses.map((w, i) => <span key={i} style={{
    background: w.supported ? '#dcfce7' : '#fee2e2',
    color: w.supported ? '#15803d' : '#b91c1c',
    border: `1px solid ${w.supported ? '#bbf7d0' : '#fca5a5'}`,
    borderRadius: '9999px',
    padding: '3px 12px',
    fontSize: '0.85rem',
    fontWeight: '500',
    whiteSpace: 'nowrap'
  }}>
                    {w.name} {w.supported ? '✅' : '❌'}
                  </span>)}
              </div>
            </td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Component Type</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{componentType}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Inputs</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{connectionInputs}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Outputs</td>
            <td style={{
    verticalAlign: 'middle'
  }}>{connectionOutputs}</td>
          </tr>
        </tbody>
      </table>
    </div>;
};

<ComponentMetadata warehouses={["Databricks"]} unsupportedWarehouses={["Snowflake", "Amazon Redshift", "BigQuery"]} componentType="Transformation" connectionInputs="One" connectionOutputs="Unlimited" />

<Info>
  Production use of this feature is available for specific editions only. [Contact our sales team](https://www.matillion.com/contact) for more information.
</Info>

The **AI Mask** transformation component uses the Databricks [ai\_mask()](https://docs.databricks.com/en/sql/language-manual/functions/ai_mask.html) function to invoke generative AI to identify and mask specified entities in unstructured text. This function uses a Databricks chat model serving endpoint made available by [Databricks Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/index.html).

The input to this component is a single column of string data. Because the component operates on one column at a time, masking several text columns requires a separate instance of the AI Mask component for each column, with the outputs combined downstream in your pipeline.

You must also specify the types of data to mask (for example, name or email). To do this, use the `Mask Labels` property.

The output is a column of string data with the labelled data masked.
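As a rough illustration of how a column and a set of mask labels map onto the underlying Databricks function, the sketch below builds an `ai_mask(content, labels)` SQL expression as a string. The helper name `ai_mask_expression` and the column name `comments` are hypothetical, and the SQL the component actually generates may differ. The sketch only accepts simple identifiers and labels so that no SQL escaping is needed.

```python
import re

# Conservative patterns (an assumption of this sketch) so the generated
# SQL never needs escaping.
_COLUMN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_LABEL_RE = re.compile(r"[A-Za-z0-9_ ]+")


def ai_mask_expression(column: str, labels: list[str]) -> str:
    """Return a Databricks SQL expression that masks `labels` in `column`."""
    if not _COLUMN_RE.fullmatch(column):
        raise ValueError(f"unsupported column name: {column!r}")
    if not labels:
        raise ValueError("at least one mask label is required")
    for label in labels:
        if not _LABEL_RE.fullmatch(label):
            raise ValueError(f"unsupported mask label: {label!r}")
    # ai_mask() takes the text expression and an array of label strings.
    label_array = ", ".join(f"'{label}'" for label in labels)
    return f"ai_mask({column}, array({label_array}))"


print(ai_mask_expression("comments", ["person", "email"]))
# prints: ai_mask(comments, array('person', 'email'))
```

Each call covers one column, mirroring the one-column-per-component behavior described above.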

<Note>
  Make sure you have read and understood the [Requirements](https://docs.databricks.com/en/sql/language-manual/functions/ai_mask.html#requirements) set out by Databricks before using this component.
</Note>

### Example

A simple example of how masking works is as follows.

The input string is: "These comments were made by customer John Doe. For further clarification contact him at john.doe@company.com."

We want to redact the name and email before we share this data. To accomplish this, we run the AI Mask component with the labels `person` and `email`.

The output string is: "These comments were made by customer \[MASKED]. For further clarification contact him at \[MASKED]."
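Because masking relies on a generative model, it can be worth spot-checking output before sharing it. The sketch below is one possible check, not part of the component: the hypothetical helper `find_unmasked` reports any known sensitive values that still appear in a masked string. It uses the output string from the example above.

```python
def find_unmasked(masked_text: str, sensitive_values: list[str]) -> list[str]:
    """Return the sensitive values that still appear in the masked text."""
    lowered = masked_text.lower()
    # Case-insensitive substring check against each known sensitive value.
    return [value for value in sensitive_values if value.lower() in lowered]


masked = (
    "These comments were made by customer [MASKED]. "
    "For further clarification contact him at [MASKED]."
)
leaks = find_unmasked(masked, ["John Doe", "john.doe@company.com"])
print(leaks)  # prints: []
```

An empty list means none of the listed values survived masking. It does not prove that other sensitive values are absent.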

### Use case

The AI Mask component masks sensitive information in any type of input text. Typical uses include:

* Automatically detect and redact Personally Identifiable Information (PII) or Protected Health Information (PHI) from text fields before storage, sharing, or analysis, to facilitate compliance with privacy standards such as GDPR and HIPAA.
* Anonymize customer data (like account numbers, names, and emails) in support logs before sharing for training or analytics.
* Mask sensitive data after loading from source connectors. Data loading components (such as Salesforce Load and HubSpot Load) don't include built-in masking, so you can use the AI Mask component in a downstream transformation pipeline to redact sensitive information before it's written to target tables.

***

## Properties

<ResponseField name="Name" type="string" required>
  A human-readable name for the component.
</ResponseField>

{/* <!-- param-start:[column] | warehouses: [databricks] --> */}

<ResponseField name="Column" type="drop-down" required>
  Select the column that holds the data you want to mask.
</ResponseField>

{/* <!-- param-start:[maskLabels] | warehouses: [databricks] --> */}

<ResponseField name="Mask Labels" type="column editor" required>
  Add mask labels as text strings. Each label represents a type of information to be masked. For example, adding a mask label of `name` prompts the component to mask any names in a given row.
</ResponseField>

{/* <!-- param-start:[includeInputColumns] | warehouses: [databricks] --> */}

<ResponseField name="Include Input Columns" type="boolean" required>
  * **Yes:** Outputs every input column, including columns *not* selected in **Column**, together with the newly created masked column.
  * **No:** Outputs only the new, masked column.
</ResponseField>
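The effect of **Include Input Columns** on the output schema can be sketched as follows. The function `output_columns`, the masked column name `comments_masked`, and the column ordering are all assumptions made for illustration. The component's actual output column name and ordering are not specified here.

```python
def output_columns(
    input_columns: list[str],
    masked_column: str,
    include_input_columns: bool,
) -> list[str]:
    """Return the output column names for a given Include Input Columns setting."""
    if include_input_columns:
        # Yes: all input columns are kept and the masked column is added.
        return [*input_columns, masked_column]
    # No: only the new masked column is returned.
    return [masked_column]


inputs = ["id", "comments", "created_at"]
print(output_columns(inputs, "comments_masked", True))
# prints: ['id', 'comments', 'created_at', 'comments_masked']
print(output_columns(inputs, "comments_masked", False))
# prints: ['comments_masked']
```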
