- If using Matillion Full SaaS: The component will use the cloud credentials associated with your environment to access resources.
- If using Hybrid SaaS: By default, the component inherits the agent’s execution role (service account role). However, if cloud credentials are associated with your environment, they override the role.
Properties
A human-readable name for the component.
Select the remote file system to search. Available data types are:
- Azure Blob Storage
- Google Cloud Storage
- FTP
- SFTP
- Amazon S3
- Windows Fileshare.
The URL of the source files, including the full path to the folder you wish to iterate over. You can further refine the filenames to be iterated over using the Filter regex property.
Clicking this property will open the Input data URL dialog. This displays a list of all existing storage accounts. Select a storage account, then a container, and then a subfolder if required. This constructs a URL with the following format:
You can also type the URL directly into the Storage Accounts path field, instead of selecting listed elements. This is particularly useful when using project and pipeline variables in the URL, for example:
Special characters used in this field must be URL-safe.
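As an illustration of the URL-safe requirement, the following minimal Python sketch percent-encodes a folder path before it is used in the URL. The folder name `reports 2024/q1 #final` is hypothetical and chosen only because it contains a space and a `#`.

```python
from urllib.parse import quote

# Hypothetical folder path; the space and '#' are not URL-safe.
raw_path = "reports 2024/q1 #final"

# Percent-encode every unsafe character but keep '/' as the path separator.
safe_path = quote(raw_path, safe="/")
print(safe_path)
```

Only the path portion is encoded here; the scheme and account or container names are left for you to supply.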
Enter your connection domain.
Provide a valid username for the connection.
The secret definition denoting the password for the connection. Your password should be saved as a secret definition before using this component.
The secret definition denoting your SFTP key for the connection. Your SFTP key should be saved as a secret definition before using this component.
This parameter is optional and will only be used if the data source requests it.
This must be the complete private key, beginning with “-----BEGIN RSA PRIVATE KEY-----” and conforming to the same structure as an RSA private key.
The following private key formats are currently supported:
- DSA
- RSA
- ECDSA
- Ed25519
The flags used in this command are:
| Flag | Description |
|---|---|
| -p | Changes the passphrase of a private key file. Use this to trigger a format conversion without setting a new passphrase. |
| -f | Specifies the filename of the private key file to convert. Replace YOUR_PRIVATE_KEY with the path to your key file. |
| -m | Specifies the output format. The pem value converts the key to PEM format, which is compatible with AWS Secrets Manager and Azure Key Vault. |
Read this page to know more about supported flags.
- No: The URL path is from the server root.
- Yes: The URL path is relative to the user’s home directory. Default setting is Yes.
- No: Only search for files within the folder identified by the Input data URL.
- Yes: Consider files in subdirectories when searching for files.
Set the maximum recursion depth into subdirectories. This property is only available when Recursive is set to Yes.
- No: Include hidden files.
- Yes: Ignore hidden files, even if they otherwise match the filter. This is the default setting.
The maximum number of times the attached component runs. The maximum value cannot exceed 5000.
The value you set interacts with the number of matching files as follows:
- If Max iterations is lower than the number of matching files, the component iterates over only the first N files, where N is the value you set. For example, setting this to 25 means only the first 25 matching files are iterated over, even if more files exist in the remote file system.
- If Max iterations is higher than the number of matching files, the component cycles through the available files repeatedly until it reaches the set number of iterations. For example, setting this to 1000 when only 10 files match means each file is iterated 100 times.
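The two cases above can be modelled with a short Python sketch. This is an illustration of the described behavior only, not the component's implementation; the function name `plan_iterations` is hypothetical, and the order in which files are visited is an assumption.

```python
from itertools import cycle, islice

MAX_ALLOWED = 5000  # upper bound stated for Max iterations

def plan_iterations(matching_files, max_iterations):
    """Return the files visited, in order, for a given Max iterations value."""
    if max_iterations > MAX_ALLOWED:
        raise ValueError("Max iterations cannot exceed 5000")
    if not matching_files or max_iterations <= 0:
        return []
    # cycle() repeats the file list indefinitely; islice() stops after
    # max_iterations items. This covers both cases: truncation when there
    # are more files than iterations, and repetition when there are fewer.
    return list(islice(cycle(matching_files), max_iterations))

files = [f"file_{i}.csv" for i in range(10)]
first_three = plan_iterations(files, 3)    # only the first 3 files
repeated = plan_iterations(files, 1000)    # each file visited 100 times
```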
The Java-standard regular expression used to test against each candidate file’s full path. If you want all files, specify .*
In the following example, the Filter regex starts with a variable that represents the folder name, with /.* as the suffix. The forward slash tells the component to look within the folder, and .* is the wildcard that returns all files in that folder.
Example: ${jv_folder}/.*
If the Filter regex includes a folder structure such as ${jv_folder}/.*, you must set Recursive to Yes to find folders beyond the Input data URL path DataType://${jv_blobStorageAccount}/${jv_containerName}/.
- Concurrent: Iterations are run concurrently.
- Sequential: Iterations are done in sequence, waiting for each to complete before starting the next. This is the default setting.
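To illustrate how a Filter regex such as ${jv_folder}/.* selects files, here is a minimal Python sketch. It makes several assumptions: Python's `re` module stands in for Java regular expressions (the two agree for simple patterns like this one), the pattern must match the whole candidate path, and the folder value and candidate paths are hypothetical.

```python
import re

def filter_files(candidate_paths, filter_regex):
    """Keep only the paths that the pattern matches in full (assumed semantics)."""
    pattern = re.compile(filter_regex)
    return [path for path in candidate_paths if pattern.fullmatch(path)]

jv_folder = "invoices"  # hypothetical value of the ${jv_folder} variable
filter_regex = f"{re.escape(jv_folder)}/.*"

candidates = [
    "invoices/jan.csv",
    "invoices/2024/feb.csv",
    "receipts/jan.csv",
]

matched = filter_files(candidates, filter_regex)
everything = filter_files(candidates, ".*")  # '.*' keeps every file
```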
Select project variables that will hold the values of file attributes. This will allow you to use the matching file’s metadata (such as its filename) in the component attached to the File Iterator. The project variables must have been defined before using them in this component. Read Project and pipeline variables for more information.
Use + to add a variable, and specify the following:
- Variable: Select an existing project variable to hold a given file attribute.
- File attribute: For each matched file, the project variable will be populated with the attribute selected here. The attributes which can be used are:
- Base Folder.
- Subfolder. Useful when recursing.
- Filename.
- Last modified. A date formatted as ISO8601, with a UTC indicator. For example: 2021-01-04T10:45:15.123Z.
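If a downstream script needs to work with the Last modified value, it can be parsed as a timezone-aware timestamp. A minimal Python sketch, assuming the value arrives as a string in exactly the format of the example above:

```python
from datetime import datetime, timezone

last_modified = "2021-01-04T10:45:15.123Z"

# Replace the trailing 'Z' (UTC indicator) with an explicit offset so that
# datetime.fromisoformat() accepts it on older Python 3 versions as well.
parsed = datetime.fromisoformat(last_modified.replace("Z", "+00:00"))

# The result is timezone-aware, so it compares safely with other UTC values.
is_before_2022 = parsed < datetime(2022, 1, 1, tzinfo=timezone.utc)
```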
If a failure occurs during any iteration, the failure link is followed. This parameter controls whether it’s followed immediately or after all iterations have been attempted.
- No: Attempt to run the attached component for each iteration, regardless of success or failure. This is the default setting.
- Yes: If the attached component doesn’t run successfully, fail immediately.
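The difference between the two settings can be sketched in Python as follows. This models sequential iteration only and is not the component's implementation; `run_iterations` and `flaky_task` are hypothetical names.

```python
def run_iterations(items, task, break_on_failure):
    """Run task for each item; return (items attempted, whether any failed)."""
    attempted = []
    any_failed = False
    for item in items:
        attempted.append(item)
        try:
            task(item)
        except Exception:
            any_failed = True
            if break_on_failure:
                break  # Yes: stop at the first failure
            # No: carry on with the remaining iterations
    return attempted, any_failed

def flaky_task(name):
    # Hypothetical attached component that fails for one particular file.
    if name == "b.csv":
        raise RuntimeError(f"could not process {name}")

files = ["a.csv", "b.csv", "c.csv"]
continue_result = run_iterations(files, flaky_task, break_on_failure=False)
stop_result = run_iterations(files, flaky_task, break_on_failure=True)
```

In both cases the failure is reported; the setting only changes how many iterations are attempted first.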
Select Yes to stop the iteration based on a condition specified in the Condition property. The default setting is No.
For this property to be available, set Concurrency to Sequential.
Select the method for creating the stop condition.
- Simple: A no-code condition editor opens, where you specify an Input Variable, Qualifier, Comparator, and Value. This is the default setting.
- Advanced: A code editor opens, where you write the condition manually using SQL.
Click the gear icon to open the Condition dialog. Use + and - to add or remove conditions. Each condition has the following columns:
Input variable: An input variable to form a condition around.
Qualifier: Select whether the condition should be applied (Is, the default) or reversed (Not). Selecting Not reverses the comparator, so Equal to becomes “not equal to”, Less than becomes “greater than or equal to”, and so on.
Comparator: Select from:
- Less than: Value of the input variable must be less than the specified value.
- Less than or equal to: Value of the input variable must be less than or equal to the specified value.
- Equal to: Value of the input variable must be equal to the specified value.
- Greater than or equal to: Value of the input variable must be greater than or equal to the specified value.
- Greater than: Value of the input variable must be greater than the specified value.
- Blank: Checks whether the input variable is empty.
Enter the condition manually in the code editor using SQL.
This property is only available when Stop on condition mode is set to Advanced.
When multiple conditions are present, they can be separated by And or Or.
- And: All the conditions must be true.
- Or: Any of the conditions must be true.
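The way Qualifier, Comparator, and the And/Or combiner interact in Simple mode can be sketched in Python. This is an illustrative model under stated assumptions, not the product's evaluation logic: the function names are hypothetical, and Blank is assumed to mean an empty or missing value.

```python
import operator

COMPARATORS = {
    "Less than": operator.lt,
    "Less than or equal to": operator.le,
    "Equal to": operator.eq,
    "Greater than or equal to": operator.ge,
    "Greater than": operator.gt,
}

def evaluate(value, qualifier, comparator, target=None):
    """Evaluate one condition row; qualifier is 'Is' or 'Not'."""
    if comparator == "Blank":
        result = value is None or value == ""  # assumed meaning of Blank
    else:
        result = COMPARATORS[comparator](value, target)
    # 'Not' reverses the outcome, so Less than behaves as
    # "greater than or equal to", and so on.
    return (not result) if qualifier == "Not" else result

def combine(results, mode):
    """Combine row results: 'And' needs all true, 'Or' needs any true."""
    return all(results) if mode == "And" else any(results)

rows = [
    evaluate(5, "Is", "Less than", 10),    # 5 < 10
    evaluate(10, "Not", "Less than", 10),  # not (10 < 10), i.e. 10 >= 10
    evaluate("x", "Is", "Blank"),          # "x" is not empty
]
stop_with_and = combine(rows, "And")
stop_with_or = combine(rows, "Or")
```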
Use this to set a maximum limit on the number of concurrent iterations that will be attempted. This is important to ensure that the workload is orchestrated to accommodate any source and target constraints.
If you leave this property blank, no upper limit will be placed on the number of concurrent tasks that can be attempted.
If you stack iterators, their concurrency limits multiply. For example, if you stack an iterator with Maximum concurrent iterations set to 10 on top of another iterator with Maximum concurrent iterations set to 10, the component could attempt 100 concurrent iterations.
This property is only available when Concurrency is set to Concurrent.
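As a quick way to estimate the worst case for stacked iterators, the limits of each level can be multiplied together. A minimal Python sketch (the function name is hypothetical):

```python
from math import prod

def worst_case_concurrency(limits):
    """Upper bound on concurrent iterations for a stack of iterators."""
    return prod(limits)

two_levels = worst_case_concurrency([10, 10])       # the example above
three_levels = worst_case_concurrency([10, 10, 5])  # one more stacked level
```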
If this is set to Yes, the Task history tab and Observability dashboard will show each iterator variable as a name/value pair in the component result message for each iteration. The default is Yes.
Counting the number of iterations
The File Iterator determines the number of files to iterate at runtime, based on the matching files found. You can count the number of iterations using System variables.
Checking whether a file exists
There is no dedicated method, component, or function for checking whether a file exists in a remote location. However, you can use the File Iterator component as a practical alternative. If the File Iterator matches the file, it runs at least one iteration; if no files match, it runs zero iterations. A common pattern is to initialize a Text project variable (for example, file_found) to false before the File Iterator, then set it to true inside the iterator. If the variable remains false after the iterator completes, no matching file was found.
This approach works for any remote file system supported by the File Iterator, including Azure Blob Storage, Google Cloud Storage, FTP, SFTP, Amazon S3, and Windows Fileshare.
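The flag pattern described above can be modelled in a few lines of Python. This only simulates the logic; in a pipeline the loop is the File Iterator and the assignment is done by the attached component. The function name and sample paths are hypothetical, and full-path regex matching is an assumption.

```python
import re

def check_file_exists(remote_paths, filter_regex):
    """Simulate the file_found flag pattern; returns 'true' or 'false' as text."""
    file_found = "false"  # Text variable initialized before the iterator
    for path in remote_paths:  # stands in for the File Iterator
        if re.fullmatch(filter_regex, path):
            file_found = "true"  # set inside the iterator on any match
    return file_found

remote_paths = ["landing/orders.csv", "landing/customers.csv"]
found = check_file_exists(remote_paths, r"landing/orders\.csv")
missing = check_file_exists(remote_paths, r"landing/returns\.csv")
```

After the iterator completes, a branch on the variable's value decides which path the pipeline follows.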