> ## Documentation Index
> Fetch the complete documentation index at: https://docs.maia.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# AI Similarity

Production use of this feature is available for specific editions only. [Contact our sales team](https://www.matillion.com/contact) for more information.

The **AI Similarity** transformation component uses the Databricks [ai\_similarity()](https://docs.databricks.com/en/sql/language-manual/functions/ai_similarity.html) function to invoke generative AI to compare two strings and compute a semantic similarity score. The function uses a chat model serving endpoint made available by [Databricks Foundation Model APIs](https://docs.databricks.com/en/machine-learning/foundation-models/index.html). Because the chat model takes meaning, context, and phrasing into account, the comparison goes beyond simple string matching.

The input is two columns of text data to be compared. Both columns must be in the same input table. To compare data from different tables, first apply additional transformations, such as a [Join](/docs/components/join), to bring the data into a single table.

The output is a float value representing the semantic similarity between the two input strings. The score is relative and should only be used for ranking. A score of 1 indicates that the two texts are equal.

Make sure you have read and understood the [Requirements](https://docs.databricks.com/en/sql/language-manual/functions/ai_similarity.html#requirements) set out by Databricks before using this component.

### Use case

Some typical use cases for this component include:

* Deduplication of text data, by identifying and grouping duplicate or near-duplicate entries in datasets like product descriptions, survey responses, or user comments. For example, "iPhone 14 Pro Max 256GB" and "Apple iPhone 14 Pro Max, 256 GB" are non-matching strings but have a high similarity score, so they can be considered duplicates.
* Record linking through semantic joins on datasets where the matching field contains slightly different wording.
* Detecting content overlaps to check whether content is reworded or copied from other sources.

***

## Properties

A human-readable name for the component.

**Base Column:** The column containing the base text for the comparison.

**Comparison Column:** The column to compare against your base column.

* **Yes:** Includes both your input columns *and* the new semantic similarity scores column. This also includes any input columns *not* selected in **Columns**.
* **No:** Only includes the new semantic similarity scores column.
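
To relate the properties above to the underlying Databricks function, the following Python sketch builds a SQL statement of the kind this component could plausibly correspond to. It is an illustration only: the SQL the component actually generates is not documented here. The helper names (`quote_ident`, `build_similarity_query`), the `similarity_score` alias, and the example table and column names are all hypothetical. Only the `ai_similarity(expr1, expr2)` call shape is taken from the Databricks documentation.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier with backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def build_similarity_query(
    table: str,
    base_column: str,
    comparison_column: str,
    include_input_columns: bool = True,
    score_alias: str = "similarity_score",  # hypothetical output column name
) -> str:
    """Build a SELECT that scores two text columns of the same table.

    Assumes `table` is a simple, unqualified table name. The two booleans
    mirror the Yes/No choice described above: keep all input columns
    alongside the score, or return the score column only.
    """
    score = (
        f"ai_similarity({quote_ident(base_column)}, "
        f"{quote_ident(comparison_column)}) AS {quote_ident(score_alias)}"
    )
    select_list = f"*, {score}" if include_input_columns else score
    return f"SELECT {select_list} FROM {quote_ident(table)}"


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    print(build_similarity_query("products", "listing_title", "catalog_title"))
    print(build_similarity_query("products", "listing_title", "catalog_title",
                                 include_input_columns=False))
```

Because the score is relative, a query like this would typically be followed by an `ORDER BY` on the score column to rank candidate pairs, rather than by a comparison against a fixed absolute threshold.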