Creating grid variables
These instructions explain how to create a grid variable. The steps to create project and pipeline variables are different. For instructions on creating project or pipeline variables, read Creating project and pipeline variables.Your project role determines how you can work with grid variables within your project:
- Owners and Contributors can create, edit, and delete grid variables.
- Viewers can see grid variables, but cannot create, edit, or delete them.
- Open the Variables panel at the bottom of the pipeline canvas.
- Select Pipeline in the top left of the panel.
- Click Add in the top right of the panel, then select Grid from the drop-down.
- Complete the following fields, and then click Next:
-
Name: A name for the variable.
All pipeline variables and grid variables must have unique names within their pipeline, although variables with the same name can be defined in a different pipeline. The name must follow these rules:
- Names must only include letters, digits, and underscores.
- Names must not begin with a digit. For example,
my_table_02and_02_my_tableare valid, but02_my_tableis not. - Names must not be a JavaScript reserved word (e.g.
varorconst). - Names of variables used in Python or Bash scripts must not be a reserved word in those scripts.
- Behavior: Select Shared or Copied. For more information, read Variable behavior. The default behavior is Shared.
- Visibility: Select Public or Private. Read Grid variable visibility, below, for a full explanation.
-
Default: Click into this field to open a table showing the default values for this grid variable. For more information, read Columns and Row values.
- To add a new column, click the + icon in the top right of the panel, and give this column a name.
- To add row values, click the + icon under the table, and enter the default values for this row in the table. You can only add rows after you have added at least one column.
- To delete a column, hover over the column name and click the Delete icon that appears in the cell containing the name.
- To delete a row, hover over any cell in the row and click the Delete icon that appears to the right of the row.
- Description: A description to inform other users of the purpose of the variable.
-
Name: A name for the variable.
Managing grid variables
Users with the Owner or Contributor role can edit and delete grid variables.Text mode
Text mode is a feature that lets you rapidly add column definitions and default values to your grid variable. Instead of typing or selecting values in the individual fields for these items, you can click the Text mode toggle to open a multi-line editor that lets you add all the items in a single block, one item to a line, with the “fields” separated by tabs. For example, to define three columns in text mode you can type the following:Cost Numbor, the Column type field for “Cost” would reject the invalid type “Numbor” and remain blank. You must correct the type, either directly in the field or in text mode again, before you can proceed with variable creation.
You can go into text mode to edit the column definitions or default values of an existing grid variable, regardless of whether it was originally created in text mode or not.
An advantage of text mode is that it will allow you to paste in values from other text sources, as long as the source has items correctly separated by tabs and new lines. You can also rapidly create a copy of an existing grid variable:
- Open the existing grid variable in text mode.
- Copy the text containing the column definitions.
- Create the new grid variable.
- Paste the copied text into the text mode field of the new grid variable.
Columns
A grid variable is a two-dimensional array, or table, consisting of one or more named columns with one or more values in each column. The columns in a grid variable are defined and fixed when the variable is created, but the values they contain can be changed dynamically at runtime. Each column in the grid can be a different variable type, so you could have a column of text and another column of numbers, for example. To define the columns, enter the following details when the grid variable is created:- Column Name: A name to identify the column. This must be unique within the grid variable, but can be the same as column names defined in other grid variables. The name can only contain letters, numbers, underscores
_and dollar symbols$. It can’t start with a number. - Column Type: The data type that the column will contain. Select Text or Number. The values stored in each column must be suitable for the data type of the column.
Row values
Each row of a grid variable holds a single value for each of the defined columns. This allows you to construct a “table” of related information in a single grid variable. For example, a grid variable with three columns defined as “Name” (text), “Salary” (number) and “Location” (text) could have three rows of values as follows:| Name | Salary | Location |
|---|---|---|
| Alice | 35000 | New York |
| Bob | 40000 | San Francisco |
| Charlie | 30000 | Chicago |
Grid variables are limited to a maximum of 5,000 rows.
Grid variable visibility
Setting the correct visibility for a pipeline grid variable is important when you want to call a pipeline from another pipeline, using the Run Orchestration or Run Transformation components. A Private grid variable is only visible to the pipeline it is defined in. If the pipeline is called from another pipeline, the calling pipeline can’t “see” the Private variable and so can’t use its value, reset its value, or otherwise interact with it in any way. A Public grid variable is visible outside the pipeline it is defined in, so it can be “seen” and used by any pipeline that calls the pipeline where it is set.Using grid variables
The following components are specifically for use with grid variables:- Grid Iterator implements a loop over the rows of a grid variable, running an attached component multiple times, each time with a different row of values from the grid.
- JDBC Table Metadata to Grid takes the metadata from a JDBC table and uses this data to populate a grid variable.
- Query Result to Grid queries a table and returns rows of data that are loaded into a predefined grid variable.
- Remove from Grid removes rows of data (but not columns) from a grid.
- Table Metadata to Grid takes the metadata from a table and uses this data to populate a grid variable.
Data selection
Query components have a Data Selection property that lets you select the columns you want from the data source. Instead of manually selecting the columns when you configure the query component, you can specify that a grid variable will provide the column names. By using a variable instead of hard-configuring the component, you can re-use the same column names across all components in the pipeline, making it easier to consistently set the property and make global changes to all such properties. To use a grid variable in a Data Selection property:- Create a grid variable with a single column of type Text.
-
Populate the default values of the grid variable with the names of the columns you want to include in the Data Selection property.
These default values could be set or changed dynamically at runtime by use of other components such as Python Script.
- In the Data Selection property of your query component, select Use Grid Variable. This will change the dialog to display Grid Variable and Grid Column drop-downs instead of the column selection listboxes.
- Use the drop-downs to select the grid variable and the name of the grid variable column.
- Click Save.
Populate metadata
In the Create Table and Create External Table components, the metadata required to define table columns in the Columns property can be assigned from a grid variable, as follows:- Create a grid variable with one column for each piece of metadata you want to populate (Column Name, Data Type, Size, Precision, etc.).
-
Populate the default values of each column with the values of the metadata.
These default values could be set or changed dynamically at runtime by use of other components such as Python Script.
- In the Columns dialog, select Use Grid Variable. This will change the dialog to display Grid variable and Column Mapping drop-downs instead of the column definition fields.
- In the Grid variable drop-down, select which grid variable to use for metadata.
- In each Column Mapping drop-down, select the grid variable column that will be used to provide a value to each piece of column metadata.
- Click Save.
Populate load options
Query components have a Load Options property that configures different aspects of data loading. The Load Options dialog contains five drop-downs and a text field, which you set individually. Instead of manually selecting the setting of each drop-down, you can use a grid variable to make the selections. By using a variable, you can keep your preferred load options consistent across all query components in the pipeline, and easily make global changes to all such properties. To use a grid variable in a Load Options property:-
Create new grid variable with two columns defined as follows:
Column name Column type Key Text Value Text - Click Next.
-
Add six rows to the variable, one for each of the six options in the Load Options dialog. In each row, the Key column must contain the name of one of the load options, and the corresponding Value column must contain a valid value for that load option. For example:
Key Value Clean Staged Files On String Null is Null Off Recreate Target Table On File Prefix Trim String Columns On Compression type Gzip The values provided for the drop-downs must exactly match an option in the corresponding drop-down. For example, Clean Staged Files has a choice ofOnandOff, and one of those words must be typed exactly in the value column, including the capitalization. - In the Load Options dialog, select Use Grid Variable. This will change the dialog to display Grid variable and Column Mapping drop-downs instead of the load option drop-down fields.
- In the Grid variable drop-down, select the grid variable you created for load options.
- In the Column Mapping drop-down, select Value.
- Click Save.
Passing variables between pipelines
Grid variable values can be set when you call one pipeline from another using the Run Orchestration and Run Transformation components, which allows you to pass variable values from the “parent” to the “child” pipeline it is running. Use the Set Grid Variables property on those components to configure what values will be set in the “child” grid variables and what will be passed to them from the “parent”.Using grid variables in scripts
Python script
Grid variables are visible to Python scripts created in the Python Script component, and in those scripts they will act as any other Python variable. In the Python script component, you can use Python’scontext object to read and write grid variable values. Because grid variables are arrays, you can use Python arrays to hold the grid variable data.
To get the values from a grid variable, use:
context object.
A full discussion of Python scripting is beyond the scope of this article. Refer to any good Python reference for further information.
Python Pushdown
Grid variables are visible to Python scripts created in the Python Pushdown component, and in those scripts they will act as any other Python variable. In Python Pushdown, you can use Python’sprint function to read grid variable values. Python Pushdown does not require the context object to read grid variables. You can just refer to it directly as shown in the examples provided.
To assign the values from a grid variable to a Python variable, use:
Key-based grid variable lookup
You can use Python to perform a key-based lookup of information in a grid variable, allowing you to search for a particular value in one column of the grid and return the corresponding value from another column in the same row. For example, if your grid variable has two columns, “Key” and “Value”, you can define the following Python function to perform a lookup based on the “Key” column and return the corresponding value from the “Value” column:LoadOptions with the following values:
| Key | Value |
|---|---|
| File Format | CSV |
| Compression | GZIP |
context object in the first statement to get the grid variable rows, as described in the preceding sections:
