# Sampling output

export const maia = "Maia";

export const designer = "Designer";

export const maia_agents = "Maia AI Agents";

In {designer}, many components let you sample their output data. Sampling shows you the data that is available once the component has performed its task, and confirms that the component is set up correctly before you use it in a live pipeline.

Before you can sample a component's output, you must first validate the pipeline. Read [The pipeline canvas](/docs/guides/designer-ui-basics#the-pipeline-canvas) for more information.

***

## Sampling data

To sample the data output by a component, select the component on the canvas and either:

* Click the **Sample data** tab at the bottom of the canvas, then click **Sample**.
* Click **Sample data** in the top right of the component properties panel.

The **Sample data** tab opens at the bottom of the canvas and displays a table of the data that the component will output when the pipeline runs. If you haven't validated your pipeline yet, click **Validate now** in the **Sample data** tab to validate it.

### Sample more data

The sample limit drop-down menu in the top left of the **Sample data** tab shows how many rows will be sampled. To change this number, click the drop-down and select one of the available options, which range from 1 to 1000 rows. The default is 25 rows.

### Sort sampled data

To sort the data in the table, click the header of the column you want to sort by. Click once to sort by that column's values in ascending order, and click again to sort in descending order. The arrow icon in the column header shows which column is being used to sort the data, and in what order.

### Resize columns

To resize columns in the table, click and drag the border of a column to change its width.

### Refresh sampled data

To refresh the sampled data, click the **Refresh sample** icon in the top right of the **Sample data** tab.

### Export sampled data

To download the sampled data as a CSV file, click the **Download CSV** icon in the bottom right of the **Sample data** tab.

***

## Metadata

To view metadata for a component's output, select the component on the canvas and click the **Metadata** tab at the bottom of the canvas. The **Metadata** tab contains a read-only table showing the name, data type, and size of each column in the component's output.

Use the toggle to switch to text mode. You can use the metadata information shown in text mode as the basis for creating a table using the [Create Table](/docs/components/create-table) component.

To change the data type of a column in your component's output, use the [Convert Type](/docs/components/convert-type) transformation component.

***

## Filtering sampled data

You can filter the rows displayed in the table in the **Sample data** tab using a filter query. {maia} sends this query to your warehouse as an SQL `WHERE` clause, and returns only the rows that match the conditions in your query.

To filter the data shown in the **Sample data** table, either:

* Click **Ask Maia** next to the **Filter** field and tell {maia_agents} how you want to filter the sampled data. {maia_agents} will create and apply a filter query for you.
* Enter a filter query in the **Filter** field at the top of the **Sample data** tab, then press Enter.

When writing your filter query, follow these rules:

* For Snowflake, enter column names either in UPPERCASE, e.g. `ORDER_QUANTITY`, or surrounded by double quotes, e.g. `"order_quantity"`.
* For Amazon Redshift and Databricks, enter column names without any quote marks, e.g. `order_quantity`.
* Use single quotes around string values.
* Do not use quotes around number values.
* Enter date values in the format `YYYY-MM-DD` surrounded by single quotes, e.g. `'2024-12-31'`.
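
As an illustration of these rules, the following Python sketch assembles a single filter condition for each warehouse. The helper names (`quote_column`, `quote_value`, `build_condition`) and the warehouse keys are hypothetical and are not part of any product API. For Snowflake, the sketch uses the double-quote option rather than the UPPERCASE option. It rejects names and values that contain quote characters, because escaping rules differ between warehouses.

```python
from datetime import date

# Hypothetical helpers for illustration only; not part of any product API.


def quote_column(name: str, warehouse: str) -> str:
    """Format a column name for the given warehouse."""
    if '"' in name:
        raise ValueError("column names containing double quotes are not handled here")
    if warehouse == "snowflake":
        # Double quotes keep the column name exactly as written.
        return f'"{name}"'
    if warehouse in ("redshift", "databricks"):
        # No quote marks around column names.
        return name
    raise ValueError(f"unknown warehouse: {warehouse}")


def quote_value(value) -> str:
    """Format a literal: numbers bare, strings and dates in single quotes."""
    if isinstance(value, bool):
        raise TypeError("boolean values are not covered by this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, date):
        # Dates are written as YYYY-MM-DD inside single quotes.
        return f"'{value.strftime('%Y-%m-%d')}'"
    if isinstance(value, str):
        if "'" in value:
            raise ValueError("string values containing single quotes are not handled here")
        return f"'{value}'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def build_condition(column: str, operator: str, value, warehouse: str) -> str:
    """Join a column, an operator, and a value into one filter condition."""
    return f"{quote_column(column, warehouse)} {operator} {quote_value(value)}"


print(build_condition("order_quantity", ">", 5, "snowflake"))
print(build_condition("customer_surname", "=", "Smith", "redshift"))
print(build_condition("order_date", "<", date(2025, 1, 1), "databricks"))
```

The three printed conditions follow the Snowflake, Amazon Redshift, and Databricks conventions respectively, and each could be entered in the **Filter** field as written.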

### Filter query examples

The following examples show how you can create filter queries for different value types and combine query clauses. These filters are written for Snowflake, which is why the column names are in double quotes.

* Filter for string values: `"customer_surname" = 'Smith'` will display rows where the customer's surname is "Smith".
* Filter for number values: `"order_quantity" > 5` will display rows where the order quantity is greater than 5.
* Filter for date values: `"order_date" < '2025-01-01'` will display rows where the order date is before January 1, 2025.

You can combine query clauses using `AND` and `OR`, and search for the opposite of a condition using `NOT`.

* Using `AND`: `"customer_organization" = 'Matillion' AND "order_date" > '2025-04-01'` will display rows for all orders placed by Matillion after April 1, 2025. The data displayed must meet both of these conditions.
* Using `OR`: `"country" = 'UK' OR "customer_organization" = 'Matillion'` will display rows for all orders placed in the UK, regardless of the customer's organization, and all orders placed by Matillion, regardless of the country. The data displayed only needs to meet one of the conditions.
* Using `NOT`: `NOT "country" = 'UK'` will display rows for all orders placed in a country other than the UK.
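
The effect of these clauses can be checked on a small data set. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a warehouse, which is an assumption made only so that the example runs anywhere. The `orders` table, its rows, and the `sample` helper are hypothetical. The column names are unquoted, as you would enter them for Amazon Redshift or Databricks.

```python
import sqlite3

# SQLite stands in for the warehouse here; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_organization TEXT, country TEXT, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Matillion", "UK", "2025-05-10"),
        ("Matillion", "US", "2025-03-01"),
        ("Acme", "UK", "2025-06-15"),
        ("Acme", None, "2025-02-20"),  # country is NULL
    ],
)


def sample(filter_query: str) -> list:
    """Apply the filter query as a WHERE clause and return the matching rows."""
    sql = f"SELECT * FROM orders WHERE {filter_query} ORDER BY order_date"
    return conn.execute(sql).fetchall()


both = sample("customer_organization = 'Matillion' AND order_date > '2025-04-01'")
either = sample("country = 'UK' OR customer_organization = 'Matillion'")
negated = sample("NOT country = 'UK'")

print(both)     # one row: the Matillion order placed on 2025-05-10
print(either)   # three rows: every order except the Acme order with no country
print(negated)  # one row: the US order
```

The row with a null `country` is not returned by the `NOT` filter. In SQL, a comparison against `NULL` is neither true nor false, so use `IS NULL` or `IS NOT NULL` to match those rows explicitly.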

### Operators

You can use the following operators in your query:

* Comparison operators
  * `=`: equals
  * `!=` or `<>`: does not equal
  * `<` and `>`: less than, greater than
  * `<=` and `>=`: less than or equal to, greater than or equal to
  * `IS NULL`: is a null value
  * `IS NOT NULL`: is a non-null value

* Logical operators
  * `AND`: filter results that meet more than one condition
  * `OR`: filter results that meet at least one of a number of conditions
  * `NOT`: filter results that meet the opposite of a condition

* Set, range, and pattern operators
  * `IN`: matches a value within a list or subquery
  * `NOT IN`: does not match a value within a list or subquery
  * `BETWEEN`: is within a specified range
  * `NOT BETWEEN`: is not within a specified range
  * `LIKE`: matches a pattern (case-sensitive)
  * `NOT LIKE`: does not match a pattern (case-sensitive)
  * `ILIKE`: matches a pattern (case-insensitive)

* Other operators
  * `||`: string concatenation
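
The following sketch tries several of these operators against a small, made-up table. It assumes Python's built-in `sqlite3` module as a stand-in for a warehouse; the `orders` table and the `matching_ids` helper are hypothetical. `LIKE` is left out because SQLite's `LIKE` is case-insensitive by default, unlike the behavior described above.

```python
import sqlite3

# SQLite stands in for the warehouse here; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_surname TEXT, "
    "order_quantity INTEGER, country TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "Smith", 3, "UK"),
        (2, "Jones", 5, "US"),
        (3, "Patel", 10, None),  # country is NULL
        (4, "Smith", 12, "FR"),
    ],
)


def matching_ids(filter_query: str) -> list:
    """Return the order_id of every row that passes the filter query."""
    sql = f"SELECT order_id FROM orders WHERE {filter_query} ORDER BY order_id"
    return [row[0] for row in conn.execute(sql)]


filters = [
    "country IN ('UK', 'FR')",
    "country NOT IN ('UK', 'FR')",
    "order_quantity BETWEEN 5 AND 10",
    "country IS NULL",
    "customer_surname || '-' || country = 'Smith-UK'",
]
results = {f: matching_ids(f) for f in filters}
for f, ids in results.items():
    print(f"{f} -> order_id {ids}")
```

In this run, `BETWEEN` includes both ends of the range, and `NOT IN` skips the row whose `country` is null, as does the `||` comparison.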

### Warehouse-specific operators

In addition to the operators listed above, each cloud data warehouse has some operators of its own.

#### Snowflake

* Comparison and logical operators
  * `<=>`: NULL-safe equals (useful for comparing `NULL` values)
  * `RLIKE`: matches a regular expression
  * `NOT RLIKE`: does not match a regular expression
  * `NOT ILIKE`: does not match a pattern (case-insensitive)

* Bitwise operators
  * `&`: Bitwise AND
  * `|`: Bitwise OR
  * `^`: Bitwise XOR
  * `~`: Bitwise NOT

#### Amazon Redshift

* Comparison and logical operators
  * `SIMILAR TO`: matches a specified pattern with SQL regular expressions
  * `NOT SIMILAR TO`: does not match a specified pattern with SQL regular expressions

* Arithmetic operators
  * `%`: modulo

#### Databricks

* Comparison and logical operators
  * `RLIKE`: matches a regular expression
  * `NOT RLIKE`: does not match a regular expression

* Arithmetic operators
  * `DIV`: integer division
  * `MOD` or `%`: modulo
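
The operator lists on this page can be captured as a simple lookup, for example to check a filter query before reusing it with a different warehouse. The sketch below is a hypothetical Python helper built only from those lists; the names `COMMON_OPERATORS`, `WAREHOUSE_OPERATORS`, and `is_listed` are illustrative and not part of any product API.

```python
# Operators listed on this page for every warehouse.
COMMON_OPERATORS = {
    "=", "!=", "<>", "<", ">", "<=", ">=", "IS NULL", "IS NOT NULL",
    "AND", "OR", "NOT",
    "IN", "NOT IN", "BETWEEN", "NOT BETWEEN", "LIKE", "NOT LIKE", "ILIKE",
    "||",
}

# Additional operators listed on this page for each warehouse.
WAREHOUSE_OPERATORS = {
    "snowflake": {"<=>", "RLIKE", "NOT RLIKE", "NOT ILIKE", "&", "|", "^", "~"},
    "redshift": {"SIMILAR TO", "NOT SIMILAR TO", "%"},
    "databricks": {"RLIKE", "NOT RLIKE", "DIV", "MOD", "%"},
}


def is_listed(operator: str, warehouse: str) -> bool:
    """Report whether this page lists the operator for the given warehouse.

    This reflects only the lists above, not everything a warehouse accepts.
    """
    op = " ".join(operator.upper().split())  # normalize case and spacing
    extra = WAREHOUSE_OPERATORS[warehouse.lower()]
    return op in COMMON_OPERATORS or op in extra


print(is_listed("rlike", "snowflake"))       # True
print(is_listed("RLIKE", "redshift"))        # False
print(is_listed("similar  to", "redshift"))  # True
print(is_listed("DIV", "snowflake"))         # False
```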

***

## Enabling and disabling sampling for a project

You can enable or disable sampling at the project level. Disabling sampling is useful if your data contains personal information that must not be viewed outside your region. Only users with project admin permissions can change this setting.

To enable or disable sampling for a project:

1. On the **Your projects** page, click the three dots **...** next to the intended project.
2. Click **Enable sampling** or **Disable sampling** as required.
