The Google Cloud Storage Load orchestration component loads data stored in the Google Cloud Storage service into an existing Snowflake table. This component requires working Google Cloud Storage credentials with "read" access to the source data files. If the component requires access to a cloud provider (AWS, Azure, or GCP), it will use the cloud credentials associated with your environment to access resources.

Properties

Name
string
required
A human-readable name for the component.
Stage
drop-down
required
Select a staging area for the data. Staging areas can be created through Snowflake using the CREATE STAGE command. Internal stages can be set up this way to store staged data within Snowflake. Selecting [Custom] will allow the user to specify a custom staging area.
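As a sketch, an internal stage can be created in Snowflake with the CREATE STAGE command. The stage name below is a placeholder:

```sql
-- Hypothetical example: create a named internal stage to hold staged data files.
CREATE STAGE my_internal_stage
  COMMENT = 'Internal stage for load jobs';
```
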
Storage Integration
drop-down
required
Select the storage integration. Storage integrations are required to permit Snowflake to read from and write to a cloud storage location. Integrations must be set up in advance and configured to support Google Cloud Storage.
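A minimal sketch of a storage integration configured for Google Cloud Storage; the integration name and bucket path are placeholders, and the integration's service account must still be granted access to the bucket in GCP:

```sql
-- Hypothetical example: a storage integration for a GCS bucket.
CREATE STORAGE INTEGRATION gcs_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'GCS'
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://my-bucket/load/');
```
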
Google Storage URL Location
file explorer
required
To retrieve the intended files, use the file explorer to enter the path to the Google Cloud Storage bucket, or select from the list of GCS buckets. This must have the format GS://<bucket>/<path>.
Pattern
string
required
A regular expression pattern string that specifies the file names and/or paths to match. For more information on pattern matching, read the Snowflake documentation.
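The pattern is a regular expression matched against the staged file paths. A sketch using a placeholder table, stage, and path:

```sql
-- Hypothetical example: load only .csv files under a given prefix.
-- PATTERN is a regular expression applied to the full file path.
COPY INTO my_table
  FROM @my_stage
  PATTERN = '.*sales/2024/.*[.]csv';
```
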
Warehouse
drop-down
required
The Snowflake warehouse used to run the queries. The special value [Environment Default] uses the warehouse defined in the environment. Read Overview of Warehouses to learn more.
Database
drop-down
required
The Snowflake database. The special value [Environment Default] uses the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.
Schema
drop-down
required
The Snowflake schema. The special value [Environment Default] uses the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.
Target Table
string
required
Select an existing table to load data into. The tables available for selection depend on the chosen schema.
Load Columns
dual listbox
required
Choose the columns to load. If you leave this parameter empty, all columns will be loaded.
Format
drop-down
required
Select a pre-made file format that will automatically set many of the Google Cloud Storage Load component properties. These formats can be created through the Create File Format component. Select [Custom] to specify a custom format using the properties available in this component.
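A named file format can also be created directly in Snowflake. A minimal sketch with illustrative settings:

```sql
-- Hypothetical example: a reusable named file format for CSV loads.
CREATE FILE FORMAT my_csv_format
  TYPE = CSV
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1
  COMPRESSION = AUTO;
```
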
File Type
drop-down
required
Select the type of data to load. Available data types are: AVRO, CSV, JSON, ORC, PARQUET, and XML. For additional information on file type options, read the Snowflake documentation. Component properties will change to reflect the selected file type. Click one of the tabs below for properties applicable to that file type.
Compression
drop-down
required
Select the compression method of your source data files. If the data is not compressed, select NONE. The default setting is AUTO, which detects the compression method automatically.

Trim Space
boolean
required
When set to Yes, removes leading and trailing whitespace from fields. Default setting is No.

Null If
editor
required
Specify one or more strings (one string per row in the dialog) to convert to NULL values. When one of these strings is encountered in the file, it is replaced with an SQL NULL value for that field in the loaded table. Click + to add a string.
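The same behavior can be sketched as a NULL_IF copy option; table, stage, and the strings below are placeholders:

```sql
-- Hypothetical example: treat these strings as SQL NULL during the load.
COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (TYPE = CSV NULL_IF = ('NULL', 'N/A', ''));
```
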
On Error
drop-down
required
Decide how to proceed when an error is encountered.
  • Abort Statement: Abort the load if any error is encountered.
  • Continue: Continue loading the file.
  • Skip File: Skip the file if any errors are encountered in it.
  • Skip File When n Errors: Skip the file when the number of errors in it is equal to or greater than the number specified in the next property, n.
  • Skip File When n% Errors: Skip the file when the percentage of errors in it exceeds the specified percentage, n.
Default setting is Abort Statement.
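These options map onto Snowflake's ON_ERROR copy option. A sketch with placeholder names and illustrative values of n:

```sql
-- Hypothetical examples of the ON_ERROR copy option.
COPY INTO my_table FROM @my_stage ON_ERROR = ABORT_STATEMENT;  -- default
COPY INTO my_table FROM @my_stage ON_ERROR = CONTINUE;         -- Continue
COPY INTO my_table FROM @my_stage ON_ERROR = SKIP_FILE_10;     -- Skip File When n Errors, n = 10
COPY INTO my_table FROM @my_stage ON_ERROR = 'SKIP_FILE_5%';   -- Skip File When n% Errors, n = 5
```
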
n
integer
required
Specify the number of errors or the percentage of errors required for the load to skip the file. Only used when On Error is set to Skip File When n Errors or Skip File When n% Errors. This property only accepts integers. Specify percentages as a number only, without the % symbol.
Size Limit (B)
integer
required
Specify the maximum size, in bytes, of data to be loaded for a given COPY statement. If the maximum is exceeded, the COPY operation discontinues loading files. For more information, read the Snowflake documentation.
Purge Files
boolean
required
Select Yes to purge data files after the data is successfully loaded. Default setting is No.
Truncate Columns
boolean
required
  • Yes: The component will automatically truncate strings to the target column length.
  • No: The COPY statement produces an error if a loaded string exceeds the target column length.
Default setting is No.
Force Load
boolean
required
Select Yes to load all files, regardless of whether they have been loaded previously and haven't changed since they were loaded. This option reloads files and can lead to duplicated data in a table. Default setting is No.
Metadata Fields
dual listbox
required
Snowflake metadata columns available to include in the load. Snowflake automatically generates metadata for files in internal stages (i.e., within Snowflake) and external stages (Google Cloud Storage, Microsoft Azure, or Amazon S3). This metadata is stored in virtual columns. These metadata columns are added to the staged data, but are only added to the table when included in a query of the table. For more information, read Querying Metadata for Staged Files.
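The metadata virtual columns can be inspected directly against a stage. A sketch with a placeholder stage and file format name:

```sql
-- Hypothetical example: query metadata virtual columns for staged files.
SELECT METADATA$FILENAME,
       METADATA$FILE_ROW_NUMBER,
       t.$1, t.$2
FROM @my_stage (FILE_FORMAT => 'my_csv_format') t;
```
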