Sampling data
To sample the data output by a component, select the component on the canvas and either:
- Click the Sample data tab at the bottom of the canvas, then click Sample.
- Click Sample data in the top right of the component properties panel.
Sample more data
The sample limit drop-down menu in the top left of the Sample data tab shows how many rows will be sampled. Click this drop-down to increase or decrease the number of rows to sample by choosing from the available options between 1 and 1000 rows. The default is 25 rows.
Sort sampled data
To sort the data in the table, click the header of the column you want to sort the data by. Click a column once to sort the data by the values in that column in ascending order. Click the column again to sort in descending order. The arrow icons in each column header indicate which column is being used to sort the data, and in what order.
Resize columns
To resize columns in the table, click and drag the border of a column to change its width.
Refresh sampled data
To refresh the sampled data, click the Refresh sample icon in the top right of the Sample data tab.
Export sampled data
To download the sampled data as a CSV file, click the Download CSV icon in the bottom right of the Sample data tab.
Metadata
To view the metadata about the output of a component, select the component on the canvas and click the Metadata tab at the bottom of the canvas. The Metadata tab shows the name, data type, and size of each column in the component’s output. Use the toggle to switch to text mode. You can use the metadata information shown in text mode as the basis for creating a table using the Create Table component.
Filtering sampled data
You can filter the rows displayed in the table in the Sample data tab using a filter query. The query is sent to your warehouse as a SQL WHERE clause, and only data that matches the conditions you have specified is returned.
To filter the data shown in the Sample data table, either:
- Click Ask Maia next to the Filter field and tell Maia how you want to filter the sampled data. Maia will create and apply a filter query for you.
- Enter a filter query in the Filter field at the top of the Sample data tab, then press Enter.
  - For Snowflake, enter column names either in UPPERCASE, e.g. ORDER_QUANTITY, or surrounded by double quotes, e.g. "order_quantity".
  - For Amazon Redshift and Databricks, enter column names without any quote marks, e.g. order_quantity.
  - Use single quotes around string values.
  - Do not use quotes around number values.
  - Enter date values in the format YYYY-MM-DD surrounded by single quotes, e.g. '2024-12-31'.
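The quoting guidance above can be sketched as a small Python helper that assembles a filter query string. This is only an illustration: the function names `format_column`, `format_value`, and `build_filter`, and the warehouse labels passed to them, are hypothetical and not part of any product API. Values containing quote characters are rejected rather than escaped, because escaping rules are not covered here.

```python
from datetime import date

# Hypothetical helpers for illustration only; not part of any product API.
BARE_NAME_WAREHOUSES = {"redshift", "databricks"}


def format_column(name: str, warehouse: str) -> str:
    """Write a column name following the quoting guidance for each warehouse."""
    if warehouse == "snowflake":
        if '"' in name:
            raise ValueError("column names containing double quotes are not covered")
        return f'"{name}"'  # double quotes around the column name
    if warehouse in BARE_NAME_WAREHOUSES:
        return name  # no quote marks
    raise ValueError(f"unknown warehouse: {warehouse!r}")


def format_value(value) -> str:
    """Write a literal: numbers bare, strings and dates in single quotes."""
    if isinstance(value, bool):
        raise TypeError("boolean values are not covered by this sketch")
    if isinstance(value, (int, float)):
        return str(value)  # no quotes around number values
    if isinstance(value, date):
        # YYYY-MM-DD surrounded by single quotes
        return f"'{value.year:04d}-{value.month:02d}-{value.day:02d}'"
    if isinstance(value, str):
        if "'" in value:
            raise ValueError("string values containing single quotes are not covered")
        return f"'{value}'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def build_filter(column: str, operator: str, value, warehouse: str) -> str:
    """Combine a column, an operator, and a value into one filter condition."""
    return f"{format_column(column, warehouse)} {operator} {format_value(value)}"


print(build_filter("order_quantity", ">", 5, "snowflake"))
# "order_quantity" > 5
print(build_filter("order_date", "<", date(2024, 12, 31), "redshift"))
# order_date < '2024-12-31'
```

The resulting string is what you would type into the Filter field; the helper only shows where quotes belong for each value type.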
Filter query examples
The following examples show how you can create filter queries for different value types and combine query clauses. These filters are written for Snowflake, which is why the column names are in double quotes.
- Filter for string values: "customer_surname" = 'Smith' will display rows where the customer’s surname is “Smith”.
- Filter for number values: "order_quantity" > 5 will display rows where the order quantity is greater than 5.
- Filter for date values: "order_date" < '2025-01-01' will display rows where the order date is before January 1, 2025.
You can combine conditions using AND and OR, and search for the opposite of a condition using NOT.
- Using AND: "customer_organization" = 'Matillion' AND "order_date" > '2025-04-01' will display rows for all orders placed by Matillion after April 1, 2025. The data displayed must meet both of these conditions.
- Using OR: "country" = 'UK' OR "customer_organization" = 'Matillion' will display rows for all orders placed in the UK, regardless of the customer’s organization, and all orders placed by Matillion, regardless of the country. The data displayed only needs to meet one of the conditions.
- Using NOT: NOT "country" = 'UK' will display rows for all orders placed in a country other than the UK.
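To see how the AND, OR, and NOT examples above select rows, the following sketch runs them as WHERE clauses against a small in-memory table. It uses Python's built-in `sqlite3` module as a stand-in for a cloud data warehouse, which is an assumption made for illustration: the `orders` table, its four rows, and the `sample` helper are invented, and dates are stored as YYYY-MM-DD text so that string comparison orders them correctly.

```python
import sqlite3

# SQLite stands in for the warehouse here; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE orders ("customer_organization" TEXT, "country" TEXT, "order_date" TEXT)'
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Matillion", "UK", "2025-05-10"),  # row 1
        ("Matillion", "US", "2025-03-15"),  # row 2
        ("Acme", "UK", "2025-06-01"),       # row 3
        ("Acme", "US", "2025-02-20"),       # row 4
    ],
)


def sample(filter_query: str) -> list[int]:
    """Return the row numbers that match the filter, applied as a WHERE clause."""
    sql = f"SELECT rowid FROM orders WHERE {filter_query} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]


# AND: both conditions must hold.
both = sample("\"customer_organization\" = 'Matillion' AND \"order_date\" > '2025-04-01'")
# OR: at least one condition must hold.
either = sample("\"country\" = 'UK' OR \"customer_organization\" = 'Matillion'")
# NOT: the condition must not hold.
outside_uk = sample("NOT \"country\" = 'UK'")

print(both)        # [1]
print(either)      # [1, 2, 3]
print(outside_uk)  # [2, 4]
```

Row 2 illustrates the difference between the first two filters: it is a Matillion order placed before April 1, 2025, so it fails the AND filter but passes the OR filter.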
Operators
You can use the following operators in your query:
- Comparison operators
  - =: equals
  - != or <>: does not equal
  - < and >: less than, greater than
  - <= and >=: less than or equal to, greater than or equal to
  - IS NULL: is a null value
  - IS NOT NULL: is a non-null value
- Logical operators
  - AND: filter results that meet more than one condition
  - OR: filter results that meet at least one of a number of conditions
  - NOT: filter results that meet the opposite of a condition
- Set operators
  - IN: matches a value within a list or subquery
  - NOT IN: does not match a value within a list or subquery
  - BETWEEN: is within a specified range
  - NOT BETWEEN: is not within a specified range
  - LIKE: matches a pattern (case-sensitive)
  - NOT LIKE: does not match a pattern (case-sensitive)
  - ILIKE: matches a pattern (case-insensitive)
- Other operators
  - ||: string concatenation
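As a rough illustration of a few of the operators listed above, the sketch below applies IN, NOT IN, BETWEEN, IS NULL, and || as WHERE clauses to an in-memory table. It again assumes Python's built-in `sqlite3` module as a stand-in for a warehouse, and the `orders` table, its rows, and the `matching` helper are invented for the example. LIKE and ILIKE are left out because SQLite's pattern matching does not follow the case-sensitivity rules described above.

```python
import sqlite3

# SQLite stands in for the warehouse here; the table and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE orders ("first_name" TEXT, "customer_surname" TEXT, '
    '"country" TEXT, "order_quantity" INTEGER)'
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        ("Ann", "Smith", "UK", 3),
        ("Bob", "Jones", "US", 7),
        ("Cy", "Lee", None, 12),  # country is NULL
        ("Di", "Smith", "FR", 5),
    ],
)


def matching(filter_query: str) -> list[str]:
    """Return the first names of rows that match the filter."""
    sql = f'SELECT "first_name" FROM orders WHERE {filter_query} ORDER BY rowid'
    return [row[0] for row in conn.execute(sql)]


in_list = matching("\"country\" IN ('UK', 'FR')")
not_in_list = matching("\"country\" NOT IN ('UK', 'FR')")
in_range = matching('"order_quantity" BETWEEN 5 AND 10')
no_country = matching('"country" IS NULL')
full_name = matching("\"first_name\" || ' ' || \"customer_surname\" = 'Ann Smith'")

print(in_list)      # ['Ann', 'Di']
print(not_in_list)  # ['Bob']
print(in_range)     # ['Bob', 'Di']
print(no_country)   # ['Cy']
print(full_name)    # ['Ann']
```

Two details are worth noticing: BETWEEN includes both ends of the range, and the row with a NULL country matches neither IN nor NOT IN, so IS NULL is needed to find it.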
Warehouse-specific operators
Additionally, there are some operators available for use with each cloud data warehouse.
Snowflake
- Comparison and logical operators
  - <=>: NULL-safe equals (useful for comparing NULL values)
  - RLIKE: matches a regular expression
  - NOT RLIKE: does not match a regular expression
  - NOT ILIKE: does not match a pattern (case-insensitive)
- Bitwise operators
  - &: Bitwise AND
  - |: Bitwise OR
  - ^: Bitwise XOR
  - ~: Bitwise NOT
Amazon Redshift
- Comparison and logical operators
  - SIMILAR TO: matches a specified pattern with SQL regular expressions
  - NOT SIMILAR TO: does not match a specified pattern with SQL regular expressions
- Arithmetic operators
  - %: modulo
Databricks
- Comparison and logical operators
  - RLIKE: matches a regular expression
  - NOT RLIKE: does not match a regular expression
- Arithmetic operators
  - DIV: integer division
  - MOD or %: modulo
Enabling and disabling sampling for a project
If necessary, you can enable and disable sampling at the project level. This is useful if your data contains personal information that cannot be viewed outside your region. Only users with project admin permissions can change this setting. To enable or disable sampling for a project:
- In the Your projects page, click the three dots … next to the intended project.
- Click Enable sampling or Disable sampling as required.
