Skip to main content
Grid variables are a special type of pipeline variable. A grid variable is a two-dimensional array that holds multiple values in named columns. You cannot use a grid variable as a project variable.

Creating grid variables

These instructions explain how to create a grid variable. The steps to create project and pipeline variables are different. For instructions on creating project or pipeline variables, read Creating project and pipeline variables.
Your project role determines how you can work with grid variables within your project:
  • Owners and Contributors can create, edit, and delete grid variables.
  • Viewers can see grid variables, but cannot create, edit, or delete them.
  1. Open the Variables panel at the bottom of the pipeline canvas.
  2. Select Pipeline in the top left of the panel.
  3. Click Add in the top right of the panel, then select Grid from the drop-down.
  4. Complete the following fields, and then click Next:
    • Name: A name for the variable.
      All pipeline variables and grid variables must have unique names within their pipeline, although variables with the same name can be defined in a different pipeline. The name must follow these rules:
      • Names must only include letters, digits, and underscores.
      • Names must not begin with a digit. For example, my_table_02 and _02_my_table are valid, but 02_my_table is not.
      • Names must not be a JavaScript reserved word (e.g. var or const).
      • Names of variables used in Python or Bash scripts must not be a reserved word in those scripts.
    • Behavior: Select Shared or Copied. For more information, read Variable behavior. The default behavior is Shared.
    • Visibility: Select Public or Private. Read Grid variable visibility, below, for a full explanation.
    • Default: Click into this field to open a table showing the default values for this grid variable. For more information, read Columns and Row values.
      • To add a new column, click the + icon in the top right of the panel, and give this column a name.
      • To add row values, click the + icon under the table, and enter the default values for this row in the table. You can only add rows after you have added at least one column.
      • To delete a column, hover over the column name and click the Delete icon that appears in the cell containing the name.
      • To delete a row, hover over any cell in the row and click the Delete icon that appears to the right of the row.
    • Description: A description to inform other users of the purpose of the variable.

Managing grid variables

Users with the Owner or Contributor role can edit and delete grid variables.

Text mode

Text mode is a feature that lets you rapidly add column definitions and default values to your grid variable. Instead of typing or selecting values in the individual fields for these items, you can click the Text mode toggle to open a multi-line editor that lets you add all the items in a single block, one item to a line, with the “fields” separated by tabs. For example, to define three columns in text mode you can type the following:
Name    Text
Descr   Text
Cost    Number
Once you have entered details in text mode, you must switch out of text mode by clicking the Text mode toggle once more, to verify the details are valid before you continue. For example, if you mistyped Cost Numbor, the Column type field for “Cost” would reject the invalid type “Numbor” and remain blank. You must correct the type, either directly in the field or in text mode again, before you can proceed with variable creation. You can go into text mode to edit the column definitions or default values of an existing grid variable, regardless of whether it was originally created in text mode or not. An advantage of text mode is that it will allow you to paste in values from other text sources, as long as the source has items correctly separated by tabs and new lines. You can also rapidly create a copy of an existing grid variable:
  1. Open the existing grid variable in text mode.
  2. Copy the text containing the column definitions.
  3. Create the new grid variable.
  4. Paste the copied text into the text mode field of the new grid variable.
Similarly, you can rapidly copy default values from one grid variable to another, assuming they have matching columns.

Columns

A grid variable is a two-dimensional array, or table, consisting of one or more named columns with one or more values in each column. The columns in a grid variable are defined and fixed when the variable is created, but the values they contain can be changed dynamically at runtime. Each column in the grid can be a different variable type, so you could have a column of text and another column of numbers, for example. To define the columns, enter the following details when the grid variable is created:
  • Column Name: A name to identify the column. This must be unique within the grid variable, but can be the same as column names defined in other grid variables. The name can only contain letters, numbers, underscores _ and dollar symbols $. It can’t start with a number.
  • Column Type: The data type that the column will contain. Select Text or Number. The values stored in each column must be suitable for the data type of the column.

Row values

Each row of a grid variable holds a single value for each of the defined columns. This allows you to construct a “table” of related information in a single grid variable. For example, a grid variable with three columns defined as “Name” (text), “Salary” (number) and “Location” (text) could have three rows of values as follows:
NameSalaryLocation
Alice35000New York
Bob40000San Francisco
Charlie30000Chicago
Grid variables are limited to a maximum of 5,000 rows.
When a pipeline run begins, all of its grid variables are initialized with the default values that were specified when the variable was created. Variable values can be assigned or reassigned dynamically while the pipeline runs. Values set during the run are valid only for the current run of the pipeline, and for the next run of the pipeline the variable will revert to its default values, or a new value must be assigned. If you construct a branching pipeline and update a grid variable value in one branch, the value only changes in that branch; the other branch will use the original value of the variable. In effect, each branch of a complex pipeline has its own “local copy” of a grid variable, each of which can be updated independently. If the branches later rejoin, for example through the use of an And or Or component, the grid variable will revert to its default values, regardless of any updates made in either branch.

Grid variable visibility

Setting the correct visibility for a pipeline grid variable is important when you want to call a pipeline from another pipeline, using the Run Orchestration or Run Transformation components. A Private grid variable is only visible to the pipeline it is defined in. If the pipeline is called from another pipeline, the calling pipeline can’t “see” the Private variable and so can’t use its value, reset its value, or otherwise interact with it in any way. A Public grid variable is visible outside the pipeline it is defined in, so it can be “seen” and used by any pipeline that calls the pipeline where it is set.

Using grid variables

The following components are specifically for use with grid variables:
  • Grid Iterator implements a loop over the rows of a grid variable, running an attached component multiple times, each time with a different row of values from the grid.
  • JDBC Table Metadata to Grid takes the metadata from a JDBC table and uses this data to populate a grid variable.
  • Query Result to Grid queries a table and returns rows of data that are loaded into a predefined grid variable.
  • Remove from Grid removes rows of data (but not columns) from a grid.
  • Table Metadata to Grid takes the metadata from a table and uses this data to populate a grid variable.
Grid variables may also be used to populate property values in many different components. Where grid variables can be used, the dialog which sets the property value will include a checkbox labelled Use Grid Variable. Some common use cases for grid variables in components are described below.

Data selection

Query components have a Data Selection property that lets you select the columns you want from the data source. Instead of manually selecting the columns when you configure the query component, you can specify that a grid variable will provide the column names. By using a variable instead of hard-configuring the component, you can re-use the same column names across all components in the pipeline, making it easier to consistently set the property and make global changes to all such properties. To use a grid variable in a Data Selection property:
  1. Create a grid variable with a single column of type Text.
  2. Populate the default values of the grid variable with the names of the columns you want to include in the Data Selection property.
    These default values could be set or changed dynamically at runtime by use of other components such as Python Script.
  3. In the Data Selection property of your query component, select Use Grid Variable. This will change the dialog to display Grid Variable and Grid Column drop-downs instead of the column selection listboxes.
  4. Use the drop-downs to select the grid variable and the name of the grid variable column.
  5. Click Save.

Populate metadata

In the Create Table and Create External Table components, the metadata required to define table columns in the Columns property can be assigned from a grid variable, as follows:
  1. Create a grid variable with one column for each piece of metadata you want to populate (Column Name, Data Type, Size, Precision, etc.).
  2. Populate the default values of each column with the values of the metadata.
    These default values could be set or changed dynamically at runtime by use of other components such as Python Script.
  3. In the Columns dialog, select Use Grid Variable. This will change the dialog to display Grid variable and Column Mapping drop-downs instead of the column definition fields.
  4. In the Grid variable drop-down, select which grid variable to use for metadata.
  5. In each Column Mapping drop-down, select the grid variable column that will be used to provide a value to each piece of column metadata.
  6. Click Save.

Populate load options

Query components have a Load Options property that configures different aspects of data loading. The Load Options dialog contains five drop-downs and a text field, which you set individually. Instead of manually selecting the setting of each drop-down, you can use a grid variable to make the selections. By using a variable, you can keep your preferred load options consistent across all query components in the pipeline, and easily make global changes to all such properties. To use a grid variable in a Load Options property:
  1. Create new grid variable with two columns defined as follows:
    Column nameColumn type
    KeyText
    ValueText
  2. Click Next.
  3. Add six rows to the variable, one for each of the six options in the Load Options dialog. In each row, the Key column must contain the name of one of the load options, and the corresponding Value column must contain a valid value for that load option. For example:
    KeyValue
    Clean Staged FilesOn
    String Null is NullOff
    Recreate Target TableOn
    File Prefix
    Trim String ColumnsOn
    Compression typeGzip
    The values provided for the drop-downs must exactly match an option in the corresponding drop-down. For example, Clean Staged Files has a choice of On and Off, and one of those words must be typed exactly in the value column, including the capitalization.
  4. In the Load Options dialog, select Use Grid Variable. This will change the dialog to display Grid variable and Column Mapping drop-downs instead of the load option drop-down fields.
  5. In the Grid variable drop-down, select the grid variable you created for load options.
  6. In the Column Mapping drop-down, select Value.
  7. Click Save.

Passing variables between pipelines

Grid variable values can be set when you call one pipeline from another using the Run Orchestration and Run Transformation components, which allows you to pass variable values from the “parent” to the “child” pipeline it is running. Use the Set Grid Variables property on those components to configure what values will be set in the “child” grid variables and what will be passed to them from the “parent”.

Using grid variables in scripts

Python script

Grid variables are visible to Python scripts created in the Python Script component, and in those scripts they will act as any other Python variable. In the Python script component, you can use Python’s context object to read and write grid variable values. Because grid variables are arrays, you can use Python arrays to hold the grid variable data. To get the values from a grid variable, use:
context.getGridVariable('<GridName>')
To write values into a grid variable, use:
context.updateGridVariable('<GridName>', <values>)
The following Python script example takes data from one grid variable, “people”, puts the data into a Python array called “p_array”, then copies it to different a grid variable, “names”:
p_array = context.getGridVariable('people')
context.updateGridVariable('names', p_array)
New values written to a grid variable by Python will be used in all subsequent components that use that grid variable as the pipeline continues to run. This gives you a powerful tool for creating complex yet flexible pipelines, as some very simple Python scripts can dynamically change the behavior of the pipeline at runtime by manipulating variable values. Note that you can manipulate the variable array in any desired way within the Python script, but the value of the variable within the pipeline itself won’t change unless you pass it back with the context object.
Python will allow you to change the type of a variable, for example by assigning a string to a variable that previously held an integer. When you intend to pass the variable back to the pipeline using the context object, you must take care that you do not change the variable type in your Python script. Doing so will have consequences in your pipeline and will likely cause it to fail.
A full discussion of Python scripting is beyond the scope of this article. Refer to any good Python reference for further information.

Python Pushdown

Grid variables are visible to Python scripts created in the Python Pushdown component, and in those scripts they will act as any other Python variable. In Python Pushdown, you can use Python’s print function to read grid variable values. Python Pushdown does not require the context object to read grid variables. You can just refer to it directly as shown in the examples provided. To assign the values from a grid variable to a Python variable, use:
rows = GridName
To print all the values from a grid variable, use:
print(GridName)
To print the values of a single element from a grid variable, specify the row and column of the element. For example, to print the value in row 1, column 2, use:
print(GridName[1][2])
To extract the values of a single element from a grid variable, specify the row and column of the element. For example, to extract the value in row 1, column 2, use:
x = GridName[1][2]
To update values into a grid variable, use:
context.updateGridVariable('GridName',[['list','of'],['lists','!']])

Key-based grid variable lookup

You can use Python to perform a key-based lookup of information in a grid variable, allowing you to search for a particular value in one column of the grid and return the corresponding value from another column in the same row. For example, if your grid variable has two columns, “Key” and “Value”, you can define the following Python function to perform a lookup based on the “Key” column and return the corresponding value from the “Value” column:
def get_value_by_key(key):
    for row in rows:
        if row[0] == key:    # column 0 = Key, column 1 = Value
            return row[1]
    return None
Given a grid variable called LoadOptions with the following values:
KeyValue
File FormatCSV
CompressionGZIP
The following Python code would allow you to look up the value of any of the keys in the “Key” column:
rows = LoadOptions  # Returns a 2D Python list

def get_value_by_key(key):
    for row in rows:
        if row[0] == key:
            return row[1]
    return None

# Look up values by key and assign them to Python variables
file_format = get_value_by_key('File Format')           # will return 'CSV'
compression = get_value_by_key('Compression')           # will return 'GZIP'
This example script assumes Python Pushdown is being used, but the same principle applies in a Python Script component, except that you would need to use the context object in the first statement to get the grid variable rows, as described in the preceding sections:
rows = context.getGridVariable('LoadOptions')  # Returns a 2D Python list