> ## Documentation Index
> Fetch the complete documentation index at: https://docs.maia.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Create External Table

export const ComponentMetadata = ({warehouses, unsupportedWarehouses = [], componentType, connectionInputs, connectionOutputs}) => {
  const allWarehouses = [...warehouses.map(w => ({
    name: w,
    supported: true
  })), ...unsupportedWarehouses.map(w => ({
    name: w,
    supported: false
  }))];
  return <div style={{
    background: 'var(--colors-background-light, #f9fafb)',
    border: '1px solid var(--colors-border-default, #e5e7eb)',
    borderRadius: '12px',
    padding: '20px 28px',
    marginBottom: '28px',
    boxShadow: '0 1px 4px rgba(0,0,0,0.10)'
  }}>
      <table style={{
    width: '100%',
    borderCollapse: 'collapse'
  }}>
        <tbody>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle',
    width: '180px'
  }}>Project Availability</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>
              <div style={{
    display: 'flex',
    flexWrap: 'wrap',
    gap: '8px'
  }}>
                {allWarehouses.map((w, i) => <span key={i} style={{
    background: w.supported ? '#dcfce7' : '#fee2e2',
    color: w.supported ? '#15803d' : '#b91c1c',
    border: `1px solid ${w.supported ? '#bbf7d0' : '#fca5a5'}`,
    borderRadius: '9999px',
    padding: '3px 12px',
    fontSize: '0.85rem',
    fontWeight: '500',
    whiteSpace: 'nowrap'
  }}>
                    {w.name} {w.supported ? '✅' : '❌'}
                  </span>)}
              </div>
            </td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Component Type</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{componentType}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Inputs</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{connectionInputs}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Outputs</td>
            <td style={{
    verticalAlign: 'middle'
  }}>{connectionOutputs}</td>
          </tr>
        </tbody>
      </table>
    </div>;
};

<ComponentMetadata warehouses={["Snowflake", "Amazon Redshift"]} unsupportedWarehouses={["Databricks", "BigQuery"]} componentType="Orchestration, Test" connectionInputs="One" connectionOutputs="Unlimited" />

Create External Table is an orchestration component that lets you define a table that references data stored outside of your data warehouse. These external tables don't store data themselves, but can be queried similarly to regular tables. External tables are read-only.

## Cloud platform information

If you're a Snowflake or Amazon Redshift user, familiarize yourself with the corresponding information about this component.

### Snowflake

When using Snowflake, external tables are useful for querying large datasets without physically loading them into Snowflake storage. The data is typically returned in a single column named VALUE, containing variant-type values. Each row stores the source data as a set of properties within this variant.

<Note>
  * Due to the structure of the returned data, transformation (e.g using the [Flatten Variant](/docs/components/flatten-variant) component) is often required to convert it into a usable format.
  * External tables must be refreshed after any changes to the underlying data. Use the [Refresh External Table](/docs/components/refresh-external-table) component to update the metadata before querying.
</Note>

### Amazon Redshift

When using Amazon Redshift, the Create External Table component references data stored in an Amazon S3 bucket. Like Snowflake, the external table doesn't store the data internally.

If the component requires access to a cloud provider (AWS, Azure, or Google Cloud), it will use credentials as follows:

* If using [Matillion Full SaaS](/docs/guides/runner-overview#matillion-full-saas): The component will use the [cloud credentials](/docs/guides/cloud-credentials) associated with your environment to access resources.
* If using [Hybrid SaaS](/docs/guides/runner-overview#hybrid-saas): By default the component will inherit the agent's execution role (service account role). However, if there are [cloud credentials](/docs/guides/cloud-credentials) associated to your environment, these will overwrite the role.

Referencing external data in this way provides an efficient method for querying large datasets without consuming storage on the Amazon Redshift cluster.

<Note>
  * External tables are part of [Amazon Redshift Spectrum](https://docs.aws.amazon.com/redshift/latest/dg/c-getting-started-using-spectrum.html), and may not be available in all regions. For a list of supported regions, read [Amazon Redshift endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/redshift-service.html#redshift_region).
  * For more information about working with external tables, read [Creating external tables for Redshift Spectrum](https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-external-tables.html).
</Note>

***

## Properties

<Tabs>
  <Tab title="Snowflake">
    <ResponseField name="Name" type="string" required>
      A human-readable name for the component.
    </ResponseField>

    {/* <!-- param-start:[createReplace] | warehouses: [snowflake] --> */}

    <ResponseField name="Create/Replace" type="drop-down" required>
      Select one of:

      * **Create:** (default). This option creates a new external table. The component will fail if an external table with the same name already exists.
      * **Create if not exists:** This option will only create a new external table when an external table of the same name does **not** already exist. If the schema of the existing external table does **not** match the schema defined in this component, no attempt is made to fix or correct the schema, which could lead to errors later on in the pipeline if:
        * Users did not expect an external table to already exist.
        * Users did not expect to have a different schema to the one defined in this component.
      * **Replace:** This option **drops** any existing external table of the same name and then creates a new one. Selecting this option guarantees that after the component succeeds, the external table matches the schema defined in this component.

      import CopyGrantsSnowflake from '/snippets/content/copy-grants-snowflake.mdx';

      <CopyGrantsSnowflake />
    </ResponseField>

    {/* <!-- param-start:[database] | warehouses: [snowflake] --> */}

    <ResponseField name="Database" type="drop-down" required>
      import SnowflakeDatabase from '/snippets/content/snowflake-database.mdx';

      <SnowflakeDatabase />
    </ResponseField>

    {/* <!-- param-start:[schema] | warehouses: [snowflake] --> */}

    <ResponseField name="Schema" type="drop-down" required>
      import SnowflakeSchema from '/snippets/content/snowflake-schema.mdx';

      <SnowflakeSchema />
    </ResponseField>

    {/* <!-- param-start:[newTableName] | warehouses: [snowflake] --> */}

    <ResponseField name="New Table Name" type="string" required>
      The name of the external table to be created or used.
    </ResponseField>

    {/* <!-- param-start:[partitionColumns] | warehouses: [snowflake] --> */}

    <ResponseField name="Partition Columns" type="column editor">
      A partition is defined by an expression applied to a set of data, resulting in a partition that only contains rows that satisfy the expression. For example, partitioning a dataset by the value in the 'year' column.

      For more information about setting up partitions with Snowflake's external tables, read [Managing Regular Data Loads](https://docs.snowflake.net/manuals/user-guide/data-load-considerations-manage.html#partitioning-staged-data-files).

      * **Name:** The name of the new partition column.
      * **Type:** The data type of the partition column.
      * **Size:** The data size of the new column values.
      * **Precision:** The precision of the new column values.
      * **Partition Expression:** The expression used to partition the data.

      import TextMode from '/snippets/content/text-mode.mdx';

      <TextMode />

      import UseGridVariable from '/snippets/content/use-grid-variable.mdx';

      <UseGridVariable />
    </ResponseField>

    {/* <!-- param-start:[stageDatabase] | warehouses: [snowflake] --> */}

    <ResponseField name="Stage Database" type="drop-down" required>
      Select a Snowflake database for the external stage.
    </ResponseField>

    {/* <!-- param-start:[stageSchema] | warehouses: [snowflake] --> */}

    <ResponseField name="Stage Schema" type="drop-down" required>
      Select a Snowflake schema for the external stage.
    </ResponseField>

    {/* <!-- param-start:[stage] | warehouses: [snowflake] --> */}

    <ResponseField name="Stage" type="drop-down" required>
      Select an external stage for the data. Staging areas can be created through Snowflake using the [CREATE STAGE](https://docs.snowflake.net/manuals/sql-reference/sql/create-stage.html) command.
    </ResponseField>

    {/* <!-- param-start:[relativePath] | warehouses: [snowflake] --> */}

    <ResponseField name="Relative Path" type="string">
      The directory path to follow to the target data. File names cannot be specified. The path is relative to the storage location given in the external stage setup.
    </ResponseField>

    {/* <!-- param-start:[pattern] | warehouses: [snowflake] --> */}

    <ResponseField name="Pattern" type="string">
      Specify, using regular expression (RegEx) pattern syntax, files to be matched on the external stage.

      For example: `'.*flight.*[.]csv'`
    </ResponseField>

    {/* <!-- param-start:[format] | warehouses: [snowflake] --> */}

    <ResponseField name="Format" type="drop-down" required>
      Select a pre-made file format that will automatically set many of the component properties accordingly. These formats can be created through the [Create File Format](/docs/components/create-file-format) component.

      The list of file formats will come from the database and schema that the specified tables exist in.
    </ResponseField>

    {/* <!-- param-start:[fileType] | warehouses: [snowflake] --> */}

    <ResponseField name="File Type" type="drop-down" required>
      The type of expected data to load. Some data may require additional formatting, explained in [Preparing to Load Data](https://docs.snowflake.net/manuals/user-guide/data-load-prepare.html).

      Available options are: AVRO, CSV, JSON, ORC, PARQUET, and XML.

      Component properties will change to reflect the choice made here and give options based on the specific file type.
    </ResponseField>

    {/* <!-- param-start:[compression] | warehouses: [snowflake] --> */}

    <ResponseField name="Compression" type="drop-down" required>
      (AVRO, CSV, JSON, PARQUET, XML) Select whether the input file is compressed in GZIP format, BROTLI, BZ2, DEFLATE, RAW\_DEFLATE, ZSTD, or not compressed at all. The default is AUTO.

      When AUTO is selected, the compression algorithm is detected automatically, except for Brotli-compressed files, which cannot currently be detected automatically.
    </ResponseField>

    {/* <!-- param-start:[recordDelimiter] | warehouses: [snowflake] --> */}

    <ResponseField name="Record Delimiter" type="string">
      (CSV) The delimiter to be used that separates records (rows) in the file. Defaults to newline.

      `\` can also signify a newline. `\\r` can signify a carriage return.
    </ResponseField>

    {/* <!-- param-start:[fieldDelimiter] | warehouses: [snowflake] --> */}

    <ResponseField name="Field Delimiter" type="string">
      (CSV) The delimiter that separates columns. The default is a comma. A \[TAB] character can be specified as `\`.
    </ResponseField>

    {/* <!-- param-start:[skipHeader] | warehouses: [snowflake] --> */}

    <ResponseField name="Skip Header" type="integer">
      (CSV) The number of rows at the top of the file to ignore. The default setting is **0** and the property is left blank.
    </ResponseField>

    {/* <!-- param-start:[skipBlankLines] | warehouses: [snowflake] --> */}

    <ResponseField name="Skip Blank Lines" type="boolean" required>
      (CSV) When true, skip any blank lines encountered in the CSV files. When false, blank lines produce an end-of-record error. Default is false.
    </ResponseField>

    {/* <!-- param-start:[escape] | warehouses: [snowflake] --> */}

    <ResponseField name="Escape" type="string">
      (CSV) When this option is specified, the backslash character `\` in input data is treated as an escape character.
    </ResponseField>

    {/* <!-- param-start:[escapeUnenclosedField] | warehouses: [snowflake] --> */}

    <ResponseField name="Escape Unenclosed Field" type="string">
      (CSV) Single character string used as the escape character for unenclosed field values. Accepts common escape sequences, octal values, or hex values. Also accepts a value of NONE (default). If a character is specified in the Escape field, it will override this field.
    </ResponseField>

    {/* <!-- param-start:[fieldOptionallyEnclosed] | warehouses: [snowflake] --> */}

    <ResponseField name="Field Optionally Enclosed" type="string">
      (CSV) A character that is used to enclose strings. Can be a single quote `'` or a double quote `"` or NONE (default). Escape the character with an instance of the same character.
    </ResponseField>

    {/* <!-- param-start:[errorOnColumnCountMismatch] | warehouses: [snowflake] --> */}

    <ResponseField name="Error On Column Count Mismatch" type="boolean" required>
      (CSV) Generate an error if the number of delimited columns in the input does not match that of the table. If false, extra columns are not loaded into the table and missing columns are recorded as **NULL** in the table. Default is false.
    </ResponseField>

    {/* <!-- param-start:[emptyFieldAsNull] | warehouses: [snowflake] --> */}

    <ResponseField name="Empty Field As Null" type="boolean" required>
      (CSV) When set to true, empty columns in the input file will become **NULL**. Default is true.
    </ResponseField>

    {/* <!-- param-start:[replaceInvalidCharacters] | warehouses: [snowflake] --> */}

    <ResponseField name="Replace Invalid Characters" type="boolean" required>
      (CSV) When true, Snowflake replaces invalid UTF-8 characters with the Unicode replacement character. When false, the load operation produces an error when invalid UTF-8 character encoding is detected. Default is false.
    </ResponseField>

    {/* <!-- param-start:[encodingType] | warehouses: [snowflake] --> */}

    <ResponseField name="Encoding Type" type="drop-down">
      (CSV) The type of encoding that has been applied to the data. The default setting is UTF-8. Consult the [Snowflake documentation](https://docs.snowflake.com/en/sql-reference/sql/create-file-format#:~:text=TRUE-,ENCODING,-%3D%20%27string) for more information.
    </ResponseField>

    {/* <!-- param-start:[enableOctal] | warehouses: [snowflake] --> */}

    <ResponseField name="Enable Octal" type="boolean" required>
      (JSON) Enables parsing of octal numbers. Default is false.
    </ResponseField>

    {/* <!-- param-start:[allowDuplicates] | warehouses: [snowflake] --> */}

    <ResponseField name="Allow Duplicates" type="boolean" required>
      (JSON) When you pass in a JSON with a duplicate property, choose whether to allow the duplicate or not. Default is false.
    </ResponseField>

    {/* <!-- param-start:[stripOuterArray] | warehouses: [snowflake] --> */}

    <ResponseField name="Strip Outer Array" type="boolean" required>
      (JSON) When true, enables removal of outer square brackets from JSONs. Default is false.
    </ResponseField>

    {/* <!-- param-start:[stripNullValues] | warehouses: [snowflake] --> */}

    <ResponseField name="Strip Null Values" type="boolean" required>
      (JSON) When true, enables removal of **NULL** values. Default is false.
    </ResponseField>

    {/* <!-- param-start:[ignoreUtf8Errors, ignoreUtf8Errors1] | warehouses: [snowflake] --> */}

    <ResponseField name="Ignore UTF-8 Errors" type="boolean" required>
      (JSON, XML) When true, replaces invalid UTF-8 sequences with the Unicode replacement character (U+FFFD), instead of throwing an error. Default is false.
    </ResponseField>

    {/* <!-- param-start:[preserveSpace] | warehouses: [snowflake] --> */}

    <ResponseField name="Preserve Space" type="boolean" required>
      (XML) When set to true, leading and trailing spaces in XML element values are preserved during parsing. Default is false.
    </ResponseField>

    {/* <!-- param-start:[stripOuterElement] | warehouses: [snowflake] --> */}

    <ResponseField name="Strip Outer Element" type="boolean" required>
      (XML) When set to true, when parsing XML files, this will strip the outermost XML element, exposing the second-level elements as separate documents. Default is false.
    </ResponseField>

    {/* <!-- param-start:[disableSnowflakeData] | warehouses: [snowflake] --> */}

    <ResponseField name="Disable Snowflake Data" type="boolean" required>
      (XML) When set to true, the parser will ignore Snowflake [semi-structured data](https://docs.snowflake.net/manuals/user-guide/semistructured-intro.html) tags when processing XML files. Default is false.
    </ResponseField>

    {/* <!-- param-start:[disableAutoConvert] | warehouses: [snowflake] --> */}

    <ResponseField name="Disable Auto Convert" type="boolean" required>
      (XML) When set to true, numeric and Boolean values in XML text elements are automatically converted to their native types during parsing. Default is false.
    </ResponseField>

    {/* <!-- param-start:[nullIf, nullIf1, nullIf2, nullIf3, nullIf4] | warehouses: [snowflake] --> */}

    <ResponseField name="Null If" type="string">
      (AVRO, CSV, JSON, ORC, PARQUET) This option replaces the specified string with **NULL** in the output table. Use this if your data has a particular representation of missing data.
    </ResponseField>

    {/* <!-- param-start:[trimSpace, trimSpace1, trimSpace2, trimSpace3, trimSpace4] | warehouses: [snowflake] --> */}

    <ResponseField name="Trim Space" type="boolean" required>
      (AVRO, CSV, JSON, ORC, PARQUET) When true, removes trailing and leading whitespace from the input data. Default is false.
    </ResponseField>
  </Tab>

  <Tab title="Amazon Redshift">
    <ResponseField name="Name" type="string" required>
      A human-readable name for the component.
    </ResponseField>

    {/* <!-- param-start:[externalSchema] | warehouses: [redshift] --> */}

    <ResponseField name="External Schema" type="drop-down" required>
      import RedshiftTableSchema from '/snippets/content/redshift-table-schema.mdx';

      <RedshiftTableSchema />
    </ResponseField>

    {/* <!-- param-start:[newTableName] | warehouses: [redshift] --> */}

    <ResponseField name="New Table Name" type="string" required>
      The name of the external table to be created or used.
    </ResponseField>

    {/* <!-- param-start:[method] | warehouses: [redshift] --> */}

    <ResponseField name="Method" type="drop-down" required>
      * **Create:** Create the new table with the given name. The pipeline will fail if a table of that name already exists.
      * **Create if not exists:** Create the new table with the given name unless one already exists. The pipeline will succeed and continue in either case. If the schema of the existing table does not match the schema defined in this component, no attempt is made to fix or correct it, which could lead to errors later in the pipeline if you did not expect an existing table to exist, or to have a different schema to the one defined in this component.
      * **Replace:** Create the new table, potentially overwriting any existing table of the same name.

      Since other database objects depend upon this table, `DROP ...` cascade is used, which may actually remove many other database objects.
    </ResponseField>

    {/* <!-- param-start:[tableMetadata] | warehouses: [redshift] --> */}

    <ResponseField name="Table Metadata" type="column editor" required>
      * **Name:** The name of the new column.
      * **Type:** Select the data type.
        * [Text:](https://docs.aws.amazon.com/redshift/latest/dg/r_Character_types.html#r_Character_types-text-and-bpchar-types) Can hold any type of data, subject to a maximum size.
        * [Integer:](https://docs.aws.amazon.com/redshift/latest/dg/r_Numeric_types201.html#r_Numeric_types201-integer-types) Suitable for whole-number types (no decimals).
        * [Numeric:](https://docs.aws.amazon.com/redshift/latest/dg/r_Numeric_types201.html#r_Numeric_types201-decimal-or-numeric-type) Suitable for numeric types, with or without decimals.
        * [Real:](https://docs.aws.amazon.com/redshift/latest/dg/r_Numeric_types201.html) Suitable for data of a single precision floating-point number.
        * [Double Precision:](https://docs.aws.amazon.com/redshift/latest/dg/r_Numeric_types201.html) Suitable for data of a double precision floating-point number.
        * [Boolean:](https://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html) Suitable for data that is either true or false.
        * [Date:](https://docs.aws.amazon.com/redshift/latest/dg/r_Datetime_types.html#r_Datetime_types-date) Suitable for dates without times.
        * [DateTime:](https://docs.aws.amazon.com/redshift/latest/dg/r_Datetime_types.html) Suitable for dates, times, or timestamps (both date and time).
        * [Super:](https://docs.aws.amazon.com/redshift/latest/dg/r_SUPER_type.html) Suitable for storing semi-structured data or documents as values.
      * **Size:** For text types, this is the maximum length. This is a limit on the number of bytes, not characters. For Amazon Redshift, since all data is stored using UTF-8, any non-ASCII character will count as 2 or more bytes. For numeric types, this is the total number of digits allowed, whether before or after the decimal point.
      * **Scale:** Scale is the number of digits to the right of the decimal point in a Float. For example, the number 123.45 has a scale of 2.

      <TextMode />

      <UseGridVariable />
    </ResponseField>

    {/* <!-- param-start:[partition] | warehouses: [redshift] --> */}

    <ResponseField name="Partition" type="column editor" required>
      Assign one or more columns in this table as potential partitions.

      Partition columns allow queries on large datasets to be optimized when that query is made against the columns chosen as partition columns. When a partition is created, values for that column become distinct Amazon S3 storage locations, allowing rows of data in a location that is dependent on their partition column value.

      For example, it is common for a date column to be chosen as a partition column, thus storing all other data according to the date it belongs to. When creating partitioned data using the [Add Partition](/docs/components/add-partition) component, it is vital that those partitioned columns have already been marked using this property.

      <UseGridVariable />
    </ResponseField>

    {/* <!-- param-start:[location] | warehouses: [redshift] --> */}

    <ResponseField name="Location" type="column editor" required>
      The Amazon S3 bucket location for the external table data.

      If the component requires access to a cloud provider (AWS, Azure, or Google Cloud), it will use credentials as follows:

      * If using [Matillion Full SaaS](/docs/guides/runner-overview#matillion-full-saas): The component will use the [cloud credentials](/docs/guides/cloud-credentials) associated with your environment to access resources.
      * If using [Hybrid SaaS](/docs/guides/runner-overview#hybrid-saas): By default the component will inherit the agent's execution role (service account role). However, if there are [cloud credentials](/docs/guides/cloud-credentials) associated to your environment, these will overwrite the role.
    </ResponseField>

    {/* <!-- param-start:[format] | warehouses: [redshift] --> */}

    <ResponseField name="Format" type="drop-down" required>
      Choose a file format for the source file.
    </ResponseField>

    {/* <!-- param-start:[fieldTerminator] | warehouses: [redshift] --> */}

    <ResponseField name="Field Terminator" type="string">
      (TEXTFILE) The delimiter to be used that separates fields (columns) in the file. Defaults to `\\A`.
    </ResponseField>

    {/* <!-- param-start:[lineTerminator] | warehouses: [redshift] --> */}

    <ResponseField name="Line Terminator" type="string">
      (TEXTFILE) The delimiter to be used that separates records (rows) in the file. Defaults to newline.

      * `\` can also signify a newline.
      * `\\r` can signify a carriage return.
    </ResponseField>

    {/* <!-- param-start:[skipHeaderRows] | warehouses: [redshift] --> */}

    <ResponseField name="Skip Header Rows" type="string">
      The number of rows at the top of the file to skip. The default setting is an empty field.
    </ResponseField>

    {/* <!-- param-start:[stripOuterArray] | warehouses: [redshift] --> */}

    <ResponseField name="Strip Outer Array" type="drop-down" required>
      (JSON) Strips the outer array from the JSON file, enabling JSON files that contain a single, anonymous array to be loaded without error. The default setting is **No**.
    </ResponseField>
  </Tab>
</Tabs>
