Production use of this feature is available for specific editions only. Contact our sales team for more information.
Currently, this component only supports provisioned OpenSearch services, not serverless.
Prerequisites
Before you use the Amazon OpenSearch Upsert component, you’ll need to add AWS cloud credentials.
Permissions
You’ll need to ensure you have permissions for Amazon OpenSearch Service. If you’re using Amazon Bedrock for your embeddings, you’ll also need permission to invoke the model.
Amazon OpenSearch Service
To use the Amazon OpenSearch Upsert component, ensure that your IAM role or user has the necessary permissions to interact with Amazon OpenSearch. Grant these permissions through an IAM policy attached to that role or user.
If you are using fine-grained access control in Amazon OpenSearch Service, you’ll also need OpenSearch user/role permissions (for example, a role mapped to the appropriate OpenSearch index with write access).
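As a rough illustration of an IAM policy for Amazon OpenSearch Service, the Python sketch below builds a policy document and prints it as JSON. The region, account ID, and domain name are hypothetical placeholders, and the list of `es:ESHttp*` actions is an assumption about what an upsert workload needs, not a confirmed requirement of the component. Narrow it to match your own security policy.

```python
import json

# Hypothetical placeholders: replace with your own values.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"
DOMAIN_NAME = "my-domain"


def opensearch_policy(region: str, account_id: str, domain: str) -> dict:
    """Build an IAM policy document allowing HTTP calls to one OpenSearch domain."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed set of HTTP-method actions; trim to what you need.
                "Action": [
                    "es:ESHttpGet",
                    "es:ESHttpHead",
                    "es:ESHttpPost",
                    "es:ESHttpPut",
                ],
                # Scope the policy to a single domain and everything under it.
                "Resource": f"arn:aws:es:{region}:{account_id}:domain/{domain}/*",
            }
        ],
    }


policy = opensearch_policy(REGION, ACCOUNT_ID, DOMAIN_NAME)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console policy editor once the placeholders are replaced.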
Amazon Bedrock
If you are using Amazon Bedrock as your embedding provider, ensure that your IAM role or user has the necessary permissions to invoke the model. Grant these permissions through an IAM policy attached to that role or user.
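A minimal sketch of such a policy, again built in Python and printed as JSON. The region and the model ID `amazon.titan-embed-text-v1` are illustrative assumptions; substitute the embedding model you actually use.

```python
import json

# Hypothetical placeholders: replace with your own region and model ID.
REGION = "us-east-1"
MODEL_ID = "amazon.titan-embed-text-v1"


def bedrock_invoke_policy(region: str, model_id: str) -> dict:
    """Build an IAM policy document allowing invocation of one Bedrock model."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                # Foundation model ARNs have an empty account ID segment.
                "Resource": f"arn:aws:bedrock:{region}::foundation-model/{model_id}",
            }
        ],
    }


bedrock_policy = bedrock_invoke_policy(REGION, MODEL_ID)
print(json.dumps(bedrock_policy, indent=2))
```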
Properties
Reference material is provided below for the Source, Configure, and Destination properties.
A human-readable name for the component.
Source
- Snowflake
- Databricks
- Amazon Redshift
The Snowflake database. The special value [Environment Default] uses the database defined in the environment. Read Databases, Tables and Views - Overview to learn more.
The Snowflake source schema. The special value [Environment Default] uses the schema defined in the environment. Read Database, Schema, and Share DDL to learn more.
Select the table that contains the data you want to upsert into Amazon OpenSearch.
This column is used to uniquely identify each row in the table. It is used to ensure that the data is not duplicated when it is loaded into the destination. An example use case would be a column of product IDs that you want to use to identify each product in the table.
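To illustrate why a unique key prevents duplicates, here is a small conceptual sketch of upsert semantics using an in-memory dictionary. It is not how the component is implemented; the `upsert` helper and the product IDs are hypothetical and only show the insert-or-overwrite behaviour.

```python
from typing import Dict, Iterable, Tuple


def upsert(index: Dict[str, str], rows: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Insert each row by key, overwriting any existing entry with the same key."""
    for key, text in rows:
        index[key] = text
    return index


index: Dict[str, str] = {}

# First load: two new products.
upsert(index, [("P001", "Great kettle"), ("P002", "Leaky lid")])

# Second load: P002 is updated in place, P003 is added.
upsert(index, [("P002", "Leaky lid, replaced under warranty"), ("P003", "Fast delivery")])

print(len(index))     # 3 entries, not 4
print(index["P002"])  # the updated text
```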
This column is used to generate vectors for the text data in the table, which are then upserted as embeddings to Amazon OpenSearch. An example use case of this column would be a column of product reviews that you want to convert into vectors for semantic search or to perform sentiment analysis on.
Set the Limit to control the maximum number of records (rows) to load from the table. The default is 1000.
Configure
The embedding provider is the API service used to convert the search term into a vector. Choose either OpenAI or Amazon Bedrock. The embedding provider receives a search term (e.g. “How do I log in?”) and returns a vector.
- OpenAI
- Amazon Bedrock
Use the drop-down menu to select the corresponding secret definition that denotes the value of your OpenAI API key.
Read Secrets and secret definitions to learn how to create a new secret definition.
To create a new OpenAI API key:
- Log in to OpenAI.
- Click your avatar in the top-right of the UI.
- Click View API keys.
- Click + Create new secret key.
- Give a name for your new secret key and click Create secret key.
- Copy your new secret key and save it. Then click Done.
Select an OpenAI embedding model. The following models are currently supported:
- text-embedding-ada-002
- text-embedding-3-small
- text-embedding-3-large
Set the number of rows sent to the embedding provider per API call. The default size is 10. When set to 10, 1000 rows would therefore require 100 API calls.
You may wish to reduce this number if each row contains a high volume of data and, conversely, increase it for rows with a low data volume.
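The relationship between row count, batch size, and API calls can be sketched as follows. The helper names `api_calls_needed` and `batches` are hypothetical and are only used to show the arithmetic; the component's internal batching may differ.

```python
import math
from typing import Iterator, List, Sequence


def api_calls_needed(row_count: int, batch_size: int) -> int:
    """Number of API calls needed to send row_count rows in batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    return math.ceil(row_count / batch_size)


def batches(rows: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of rows, each at most batch_size long."""
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


rows = [f"review {i}" for i in range(1000)]

print(api_calls_needed(len(rows), 10))    # 100
print(sum(1 for _ in batches(rows, 10)))  # 100
print(api_calls_needed(len(rows), 25))    # 40
```

A larger batch size means fewer calls but a larger payload per call, which is why rows with a lot of text may call for a smaller value.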
Destination
The URL of the Amazon OpenSearch domain endpoint to upsert your vector embeddings to. To find your endpoint URL:
- Log in to the Amazon OpenSearch Service console.
- Navigate to the Domains page.
- Click on the domain you want to use.
- Copy the Domain Endpoint URL from the domain details page.
The name of an existing Amazon OpenSearch index where the vector embeddings will be upserted. An index in Amazon OpenSearch is similar to a table in a relational database.
If you don't have an index yet, you can create one by sending a create-index request from the OpenSearch Dev Tools console or a REST API client such as curl or Postman.
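As a hedged sketch of what an index definition for vector embeddings might look like, the Python below builds a k-NN mapping and prints it in the form accepted by the Dev Tools console. The index name `my-vector-index` and the field names `embedding` and `text` are hypothetical; check which field names your pipeline expects before creating the index.

```python
import json

# Output dimensions of the supported OpenAI embedding models.
MODEL_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def index_body(model: str, vector_field: str = "embedding", text_field: str = "text") -> dict:
    """Build a create-index body with a knn_vector field sized for the model."""
    return {
        # Enable the k-NN plugin for this index.
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                vector_field: {
                    "type": "knn_vector",
                    "dimension": MODEL_DIMENSIONS[model],
                },
                text_field: {"type": "text"},
            }
        },
    }


def dev_tools_request(index_name: str, model: str) -> str:
    """Render the request as it would be typed into the Dev Tools console."""
    return f"PUT /{index_name}\n{json.dumps(index_body(model), indent=2)}"


print(dev_tools_request("my-vector-index", "text-embedding-3-small"))
```

Looking the dimension up from the model name avoids a mismatch between the mapping and the vectors the embedding model returns.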
The dimension value in the index mapping must match the output dimension of the embedding model you have chosen:
- text-embedding-ada-002 and text-embedding-3-small output vectors of dimension 1536.
- text-embedding-3-large outputs vectors of dimension 3072.
If you are using the text-embedding-3-large model, set "dimension": 3072 instead of "dimension": 1536 in the index mapping.
Select the AWS region of your Amazon OpenSearch Service domain.

