# Distinct

export const ComponentMetadata = ({warehouses, unsupportedWarehouses = [], componentType, connectionInputs, connectionOutputs}) => {
  const allWarehouses = [...warehouses.map(w => ({
    name: w,
    supported: true
  })), ...unsupportedWarehouses.map(w => ({
    name: w,
    supported: false
  }))];
  return <div style={{
    background: 'var(--colors-background-light, #f9fafb)',
    border: '1px solid var(--colors-border-default, #e5e7eb)',
    borderRadius: '12px',
    padding: '20px 28px',
    marginBottom: '28px',
    boxShadow: '0 1px 4px rgba(0,0,0,0.10)'
  }}>
      <table style={{
    width: '100%',
    borderCollapse: 'collapse'
  }}>
        <tbody>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle',
    width: '180px'
  }}>Project Availability</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>
              <div style={{
    display: 'flex',
    flexWrap: 'wrap',
    gap: '8px'
  }}>
                {allWarehouses.map((w, i) => <span key={i} style={{
    background: w.supported ? '#dcfce7' : '#fee2e2',
    color: w.supported ? '#15803d' : '#b91c1c',
    border: `1px solid ${w.supported ? '#bbf7d0' : '#fca5a5'}`,
    borderRadius: '9999px',
    padding: '3px 12px',
    fontSize: '0.85rem',
    fontWeight: '500',
    whiteSpace: 'nowrap'
  }}>
                    {w.name} {w.supported ? '✅' : '❌'}
                  </span>)}
              </div>
            </td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Component Type</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{componentType}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Inputs</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{connectionInputs}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Outputs</td>
            <td style={{
    verticalAlign: 'middle'
  }}>{connectionOutputs}</td>
          </tr>
        </tbody>
      </table>
    </div>;
};

<ComponentMetadata warehouses={["Snowflake", "Databricks", "Amazon Redshift", "Google BigQuery"]} componentType="Transformation" connectionInputs="One" connectionOutputs="Unlimited" />

The **Distinct** transformation component removes duplicate rows from a dataset. A row is a duplicate if its values in **all** of the selected columns match the values of another row in those same columns.

This component is equivalent to writing a `SELECT DISTINCT` statement over the selected columns.
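The duplicate rule described above can be sketched in a few lines of Python. This is only an illustration of the semantics, not how the component is implemented: the function name `distinct_rows` and the sample rows are hypothetical, and the sketch keeps first-seen order, which a warehouse query does not guarantee.

```python
from typing import Iterable, Sequence


def distinct_rows(rows: Iterable[dict], columns: Sequence[str]) -> list[dict]:
    """Keep only `columns`, then drop rows whose values in those columns repeat.

    Hypothetical helper for illustration; it mirrors the idea of
    SELECT DISTINCT <columns>, not the component's actual implementation.
    """
    seen = set()
    result = []
    for row in rows:
        # The comparison key is the tuple of values in the selected columns only.
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


# Made-up sample rows: the first and third are identical.
rows = [
    {"ID": "00001", "DEPARTMENT": "Marketing"},
    {"ID": "00002", "DEPARTMENT": "Sales"},
    {"ID": "00001", "DEPARTMENT": "Marketing"},
]

all_columns = distinct_rows(rows, ["ID", "DEPARTMENT"])
departments_only = distinct_rows(rows, ["DEPARTMENT"])
print(all_columns)
print(departments_only)
```

Selecting both columns removes only the exact repeat, while selecting `DEPARTMENT` alone also drops the `ID` column from the result.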

### Use case

This component can be used to return every distinct value that appears in a given column, or to remove duplicate records from a dataset. For example, you can use it to:

* Remove duplicate customer records from your dataset to avoid issues with customer analytics.
* Clean up transaction logs that contain duplicate records due to retries or system errors.
* Remove duplicate rows before aggregation, so that counts and sums are not inflated.

***

## Properties

<ResponseField name="Name" type="string" required>
  A human-readable name for the component.
</ResponseField>

{/* <!-- param-start:[columns] | warehouses: [snowflake, databricks, redshift, bigquery] --> */}

<ResponseField name="Columns" type="dual listbox" required>
  Only the selected columns are kept and passed to the next component. Rows with identical values across all of the selected columns are collapsed into a single row, leaving only distinct combinations of values.

  To use grid variables, toggle **Use Grid Variable** on at the top of the dialog. For more information, read [Grid variables](/docs/guides/grid-variables).
</ResponseField>

## Examples

We have some data about our employees, which we're identifying by ID. Let's see what the Distinct component can do for our data.

```
Input data

+-------+-------------+----------+
|  ID   | DEPARTMENT  | POSITION |
+-------+-------------+----------+
| 00001 | Marketing   | Junior   |
| 00002 | Sales       | Senior   |
| 00003 | Marketing   | Senior   |
| 00004 | Engineering | Junior   |
| 00001 | Marketing   | Junior   |
| 00005 | Sales       | Junior   |
+-------+-------------+----------+
```

<Tabs>
  <Tab title="Example 1">
    **Cleaning up duplicates**

    It looks like we've somehow got a duplicate record for ID 00001. That's not going to help our later transformations, so let's clean it up.

    If we select every column in the Distinct component, we get every row back with the duplicates removed.

    **Distinct component properties:**

    * **Columns:**
      * ID
      * DEPARTMENT
      * POSITION

    ```
    Output data

    +-------+-------------+----------+
    |  ID   | DEPARTMENT  | POSITION |
    +-------+-------------+----------+
    | 00001 | Marketing   | Junior   |
    | 00002 | Sales       | Senior   |
    | 00003 | Marketing   | Senior   |
    | 00004 | Engineering | Junior   |
    | 00005 | Sales       | Junior   |
    +-------+-------------+----------+
    ```

    Excellent. We now only have one entry per ID, as expected.
  </Tab>

  <Tab title="Example 2">
    **Finding all departments**

    We'd just like a nice clean list of the departments our organization has. If we select only the DEPARTMENT column, the Distinct component will return each unique value in that column.

    **Distinct component properties:**

    * **Columns:** DEPARTMENT

    ```
    Output data

    +-------------+
    | DEPARTMENT  |
    +-------------+
    | Marketing   |
    | Sales       |
    | Engineering |
    +-------------+
    ```
  </Tab>
</Tabs>
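Because the component is equivalent to `SELECT DISTINCT`, both examples can be reproduced with plain SQL. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a cloud data warehouse. The table name `employees` is an assumption made for illustration, and the SQL the component actually generates may differ.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE employees ("ID" TEXT, "DEPARTMENT" TEXT, "POSITION" TEXT)')
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("00001", "Marketing", "Junior"),
        ("00002", "Sales", "Senior"),
        ("00003", "Marketing", "Senior"),
        ("00004", "Engineering", "Junior"),
        ("00001", "Marketing", "Junior"),  # duplicate of the first row
        ("00005", "Sales", "Junior"),
    ],
)

# Example 1: all three columns selected, so only the exact duplicate is removed.
# ORDER BY is added here only to make the printed order predictable.
example_1 = conn.execute(
    'SELECT DISTINCT "ID", "DEPARTMENT", "POSITION" FROM employees ORDER BY "ID"'
).fetchall()

# Example 2: only DEPARTMENT selected, so one row per department remains.
example_2 = conn.execute('SELECT DISTINCT "DEPARTMENT" FROM employees').fetchall()

conn.close()

for row in example_1:
    print(row)
for row in example_2:
    print(row)
```

Without an `ORDER BY` clause, SQL does not guarantee the order of the returned rows, so the departments in the second query may not appear in the same order as in the table above.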
