Production use of this feature is available for specific editions only. Contact our sales team for more information.
Mask Labels property.
The output is a column of string data with the labelled data masked.
Make sure you have read and understood the Requirements set out by Databricks before using this component.
Example
A simple example of how masking works is as follows. The input string is: “These comments were made by customer John Doe. For further clarification contact him at john.doe@company.com.” We want to redact the name and email before we share this data. To accomplish this, we run the AI Mask component with the labels person and email.
The output string is: “These comments were made by customer [MASKED]. For further clarification contact him at [MASKED].”
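The shape of this transformation can be sketched in plain Python. This is only a toy stand-in: the component detects the labelled entities itself, whereas the sketch below is handed the sensitive substrings directly. The function name mask_values and the MASK_TOKEN constant are hypothetical and are not part of the component.

```python
MASK_TOKEN = "[MASKED]"  # placeholder shown in the example output above


def mask_values(text: str, sensitive_values: list[str]) -> str:
    """Replace each supplied sensitive substring with the mask token.

    Assumption: the caller already knows which substrings are sensitive.
    The real component finds them from the mask labels instead.
    """
    # Replace longer values first so that a short value contained in a
    # longer one cannot split it before it is masked.
    for value in sorted(sensitive_values, key=len, reverse=True):
        text = text.replace(value, MASK_TOKEN)
    return text


text = (
    "These comments were made by customer John Doe. "
    "For further clarification contact him at john.doe@company.com."
)
masked = mask_values(text, ["John Doe", "john.doe@company.com"])
print(masked)
```

The printed string matches the output string shown above, with both the name and the email address replaced by the same placeholder.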
Use case
The AI Mask component is used to mask sensitive information in any type of input text. Some typical uses of this include:
- Automatically detect and redact Personally Identifiable Information (PII) or Protected Health Information (PHI) from text fields before storage, sharing, or analysis, to facilitate compliance with privacy standards such as GDPR and HIPAA.
- Anonymize customer data (like account numbers, names, and emails) in support logs before sharing for training or analytics.
- Mask sensitive data after loading from source connectors. Data loading components (such as Salesforce Load, HubSpot Load, etc.) don’t include built-in masking, so you can use Databricks AI Mask in a downstream transformation pipeline to redact sensitive information before it’s written to target tables.
Properties
A human-readable name for the component.
Select the column that holds data you wish to mask.
Add mask labels as text strings. Each label represents a type of information to be masked. For example, adding a mask label of name prompts the component to mask any names in a given row.
- Yes: Includes both your input column and the newly created masked column. This will also include those input columns not selected in Column.
- No: Only includes the new, masked column.
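The difference between the Yes and No settings can be sketched in plain Python, with each row modelled as a dictionary. The column names customer_id, comment, and comment_masked are hypothetical, as is the build_output helper. The name the component actually gives the masked column is not stated here.

```python
def build_output(rows, masked_values, masked_column, include_input_columns):
    """Return output rows for the Yes (True) or No (False) setting."""
    output = []
    for row, masked in zip(rows, masked_values):
        # Yes: start from every input column. No: start from an empty row.
        new_row = dict(row) if include_input_columns else {}
        new_row[masked_column] = masked
        output.append(new_row)
    return output


# One input row with the column to mask plus an unrelated column.
rows = [{"customer_id": 1, "comment": "Call John Doe."}]
masked_values = ["Call [MASKED]."]

with_inputs = build_output(rows, masked_values, "comment_masked", True)
masked_only = build_output(rows, masked_values, "comment_masked", False)
print(with_inputs)
print(masked_only)
```

With Yes, the unselected customer_id column is carried through alongside the original and masked text. With No, only the masked column remains.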

