# Snowflake Vector Upsert

export const ComponentMetadata = ({warehouses, unsupportedWarehouses = [], componentType, connectionInputs, connectionOutputs}) => {
  const allWarehouses = [...warehouses.map(w => ({
    name: w,
    supported: true
  })), ...unsupportedWarehouses.map(w => ({
    name: w,
    supported: false
  }))];
  return <div style={{
    background: 'var(--colors-background-light, #f9fafb)',
    border: '1px solid var(--colors-border-default, #e5e7eb)',
    borderRadius: '12px',
    padding: '20px 28px',
    marginBottom: '28px',
    boxShadow: '0 1px 4px rgba(0,0,0,0.10)'
  }}>
      <table style={{
    width: '100%',
    borderCollapse: 'collapse'
  }}>
        <tbody>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle',
    width: '180px'
  }}>Project Availability</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>
              <div style={{
    display: 'flex',
    flexWrap: 'wrap',
    gap: '8px'
  }}>
                {allWarehouses.map((w, i) => <span key={i} style={{
    background: w.supported ? '#dcfce7' : '#fee2e2',
    color: w.supported ? '#15803d' : '#b91c1c',
    border: `1px solid ${w.supported ? '#bbf7d0' : '#fca5a5'}`,
    borderRadius: '9999px',
    padding: '3px 12px',
    fontSize: '0.85rem',
    fontWeight: '500',
    whiteSpace: 'nowrap'
  }}>
                    {w.name} {w.supported ? '✅' : '❌'}
                  </span>)}
              </div>
            </td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Component Type</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{componentType}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    paddingBottom: '14px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Inputs</td>
            <td style={{
    paddingBottom: '14px',
    verticalAlign: 'middle'
  }}>{connectionInputs}</td>
          </tr>
          <tr>
            <td style={{
    fontWeight: '600',
    paddingRight: '32px',
    whiteSpace: 'nowrap',
    verticalAlign: 'middle'
  }}>Connection Outputs</td>
            <td style={{
    verticalAlign: 'middle'
  }}>{connectionOutputs}</td>
          </tr>
        </tbody>
      </table>
    </div>;
};

<ComponentMetadata warehouses={["Snowflake"]} unsupportedWarehouses={["Databricks", "Amazon Redshift"]} componentType="Orchestration" connectionInputs="One" connectionOutputs="Unlimited" />

<Info>
  Production use of this feature is available for specific editions only. [Contact our sales team](https://www.matillion.com/contact) for more information.
</Info>

The Snowflake Vector Upsert component converts data stored in a Snowflake cloud data warehouse into [vector embeddings](https://docs.snowflake.com/en/user-guide/snowflake-cortex/vector-embeddings), and then stores these embeddings in a destination Snowflake table. This lets you use an alternative embedding model (for example, from OpenAI or Amazon Bedrock) instead of Snowflake's Cortex embedding.

The destination table must already exist; this component won't create it. The destination table must have a key column, a column to hold a copy of the source text, and a column to hold the vector embeddings.

<Note>
  Create the destination table with a SQL query rather than the [create table component](/docs/components/create-table). Make sure the dimension of the vector column matches the dimension of the embedding model you intend to use.

  Example SQL query to create a table with a vector column:

  `CREATE TABLE "destination-table" ("id" NUMBER, "text" TEXT, "embedding_result" VECTOR(FLOAT, 1536));`

  Each embedding model produces vectors of a fixed dimension. To find the value, see the **Model** property, below.
</Note>
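
As a minimal sketch of matching the vector column to the model, the Python snippet below builds the `CREATE TABLE` statement from a lookup of the dimensions listed under the **Model** property on this page. The function name `create_table_sql` and the dictionary are hypothetical helpers for illustration, not part of the component, and the column names simply mirror the example query above.

```python
# Dimensions copied from the Model tables on this page.
MODEL_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "Titan Embeddings G1 - Text": 1536,
}


def create_table_sql(table: str, model: str) -> str:
    """Build a CREATE TABLE statement whose vector column fits the given model."""
    dimension = MODEL_DIMENSIONS[model]  # raises KeyError for a model not listed above
    return (
        f'CREATE TABLE "{table}" '
        f'("id" NUMBER, "text" TEXT, "embedding_result" VECTOR(FLOAT, {dimension}));'
    )


print(create_table_sql("destination-table", "text-embedding-3-large"))
```

Run the generated statement in Snowflake before configuring the component. The sketch does not escape quotation marks in the table name, so treat it as a starting point only.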

## Properties

<ResponseField name="Name" type="string" required>
  A human-readable name for the component.
</ResponseField>

{/* <!-- param-start:[source.snowflake.database] | warehouses: [snowflake] --> */}

<ResponseField name="Database" type="drop-down" required>
  The Snowflake database. The special value `[Environment Default]` uses the database defined in the environment. Read [Databases, Tables and Views - Overview](https://docs.snowflake.com/en/guides-overview-db) to learn more.
</ResponseField>

{/* <!-- param-start:[source.snowflake.schema] | warehouses: [snowflake] --> */}

<ResponseField name="Schema" type="drop-down" required>
  The Snowflake schema. The special value `[Environment Default]` uses the schema defined in the environment. Read [Database, Schema, and Share DDL](https://docs.snowflake.com/en/sql-reference/ddl-database.html) to learn more.
</ResponseField>

{/* <!-- param-start:[source.snowflake.table] | warehouses: [snowflake] --> */}

<ResponseField name="Table" type="string" required>
  The Snowflake table that holds your source data.
</ResponseField>

{/* <!-- param-start:[source.snowflake.keyColumn] | warehouses: [snowflake] --> */}

<ResponseField name="Key Column" type="drop-down" required>
  The column that uniquely identifies each row in the source table. The component uses it to ensure that data is not duplicated when it is loaded into the destination. For example, a column of product IDs that identifies each product in the table.
</ResponseField>
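
To illustrate why a unique key prevents duplicates, here is a rough Python sketch of general upsert semantics: a row whose key already exists is overwritten, and any other row is inserted. This page does not document how the component performs the upsert internally, so the `upsert` function and the in-memory dictionary are assumptions made purely for illustration.

```python
def upsert(destination: dict, rows: list, key_column: str) -> dict:
    """Insert each row, or overwrite the existing row that has the same key."""
    for row in rows:
        destination[row[key_column]] = row
    return destination


destination = {}
upsert(
    destination,
    [
        {"product_id": 1, "review": "Great"},
        {"product_id": 2, "review": "Poor"},
    ],
    "product_id",
)
# Loading product 2 again replaces its row instead of adding a second one.
upsert(destination, [{"product_id": 2, "review": "Better after the update"}], "product_id")

print(len(destination))  # 2
```

If the key column contained repeated values, later rows would overwrite earlier ones in this sketch, which is why the column should identify each row uniquely.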

{/* <!-- param-start:[source.snowflake.textColumn] | warehouses: [snowflake] --> */}

<ResponseField name="Text Column" type="drop-down" required>
  The column that contains the text to convert. The component generates a vector for each value in this column and upserts the vectors as embeddings to your Snowflake vector database. For example, a column of product reviews that you want to convert into vectors for semantic search or sentiment analysis.
</ResponseField>

{/* <!-- param-start:[source.snowflake.limit] | warehouses: [snowflake] --> */}

<ResponseField name="Limit" type="integer">
  The maximum number of rows to load from the source table. The default is 1000.
</ResponseField>

{/* <!-- param-start:[embeddingGenerator.embeddingProviderType] | warehouses: [snowflake] --> */}

<ResponseField name="Embedding Provider" type="drop-down" required>
  The embedding provider is the API service used to convert the source text into vectors. Choose either [OpenAI](https://platform.openai.com/docs/guides/embeddings) or [Amazon Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/titan-embedding-models.html). The embedding provider receives text from the source **Text Column** (for example, a product review) and returns a vector.

  Choose your provider:
</ResponseField>

<Tabs>
  <Tab title="OpenAI">
    {/* <!-- param-start:[embeddingGenerator.openAI.apiKey] | warehouses: [snowflake] --> */}

    <ResponseField name="API Key" type="drop-down" required>
      Choose the secret definition that represents your OpenAI API key.

      If you have not already saved your credentials for this connector as a secret definition, click **Add secret** to create a secret definition representing these credentials. Read [Secrets and secret definitions](/docs/guides/secrets-and-secret-definitions) for details about creating a secret definition.

      To create a new OpenAI API key:

      1. Log in to [OpenAI](https://platform.openai.com/).
      2. Click your avatar in the top-right of the UI.
      3. Click **View API keys**.
      4. Click **+ Create new secret key**.
      5. Enter a name for your new secret key and click **Create secret key**.
      6. Copy your new secret key and save it. Then click **Done**.
    </ResponseField>

    {/* <!-- param-end:[embeddingGenerator.openAI.apiKey] --> */}

    {/* <!-- param-start:[embeddingGenerator.openAI.model] | warehouses: [snowflake] --> */}

    <ResponseField name="Model" type="drop-down" required>
      Select an [embedding model](https://platform.openai.com/docs/guides/embeddings).

      Currently supported models:

      | Model                  | Dimension |
      | ---------------------- | --------- |
      | text-embedding-ada-002 | 1536      |
      | text-embedding-3-small | 1536      |
      | text-embedding-3-large | 3072      |
    </ResponseField>

    {/* <!-- param-end:[embeddingGenerator.openAI.model] --> */}

    {/* <!-- param-start:[embeddingGenerator.embeddingBatchSize] | warehouses: [snowflake] --> */}

    <ResponseField name="API Batch Size" type="integer" required>
      Set the [number of items sent in each API call](https://platform.openai.com/docs/api-reference/embeddings/create#embeddings-create-input). The default is 10. With a batch size of 10, loading 1000 rows requires 100 API calls.

      Consider reducing this number if each row contains a high volume of data, and increasing it if rows contain little data.
    </ResponseField>

    {/* <!-- param-end:[embeddingGenerator.embeddingBatchSize] --> */}
  </Tab>

  <Tab title="Amazon Bedrock">
    {/* <!-- param-start:[embeddingGenerator.aws.region] | warehouses: [snowflake] --> */}

    <ResponseField name="Region" type="drop-down" required>
      Select the [AWS region](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Regions).
    </ResponseField>

    {/* <!-- param-end:[embeddingGenerator.aws.region] --> */}

    {/* <!-- param-start:[embeddingGenerator.aws.model] | warehouses: [snowflake] --> */}

    <ResponseField name="Model" type="drop-down" required>
      Select an embedding model.

      Currently supported models:

      | Model                                                                                                                     | Dimension |
      | ------------------------------------------------------------------------------------------------------------------------- | --------- |
      | [Titan Embeddings G1 - Text](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-titan-embed-text.html) | 1536      |
    </ResponseField>

    {/* <!-- param-end:[embeddingGenerator.aws.model] --> */}
  </Tab>
</Tabs>
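
The relationship between the **Limit** and **API Batch Size** properties can be sketched in a few lines of Python. The helper names `api_calls` and `batches` are hypothetical, and the sketch assumes that the limit is applied before batching and that a final partial batch is sent as its own call; this page does not confirm either detail.

```python
import math


def api_calls(row_count: int, batch_size: int = 10, limit: int = 1000) -> int:
    """Number of embedding requests needed when rows are sent batch_size at a time."""
    rows_to_load = min(row_count, limit)  # assumption: Limit caps the rows first
    return math.ceil(rows_to_load / batch_size)


def batches(texts: list, batch_size: int = 10):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


print(api_calls(1000))               # 100
print(api_calls(1005, limit=2000))   # 101
```

A larger batch size lowers the number of calls but makes each request larger, which is the trade-off described under **API Batch Size**.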

<ResponseField name="Database" type="drop-down" required>
  The Snowflake *destination* database. The special value `[Environment Default]` uses the database defined in the environment. Read [Databases, Tables and Views - Overview](https://docs.snowflake.com/en/guides-overview-db) to learn more.
</ResponseField>

{/* <!-- param-start:[destination.schema] | warehouses: [snowflake] --> */}

<ResponseField name="Schema" type="drop-down" required>
  The Snowflake *destination* schema. The special value `[Environment Default]` uses the schema defined in the environment. Read [Database, Schema, and Share DDL](https://docs.snowflake.com/en/sql-reference/ddl-database.html) to learn more.
</ResponseField>

{/* <!-- param-start:[destination.table] | warehouses: [snowflake] --> */}

<ResponseField name="Table" type="drop-down" required>
  Select the destination table.
</ResponseField>

{/* <!-- param-start:[destination.keyColumn] | warehouses: [snowflake] --> */}

<ResponseField name="Key Column" type="drop-down" required>
  The column in the destination table to use as the key column.
</ResponseField>

{/* <!-- param-start:[destination.textColumn] | warehouses: [snowflake] --> */}

<ResponseField name="Text Column" type="drop-down" required>
  The column in the destination table that will hold the copied source data.
</ResponseField>

{/* <!-- param-start:[destination.embeddingColumn] | warehouses: [snowflake] --> */}

<ResponseField name="Embedding Column" type="drop-down" required>
  The column in the destination table that will hold the vector embeddings.
</ResponseField>
