> ## Documentation Index
> Fetch the complete documentation index at: https://docs.maia.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Configure your Microsoft SQL Server database

export const s_runner = "Streaming runner";

export const maia = "Maia";

For a {s_runner} to capture change events from Microsoft SQL Server tables, a Microsoft SQL Server administrator with the necessary privileges must first run a query to enable change data capture on the database. The administrator must then enable change data capture for each table that you want the {s_runner} to capture. These steps are based on advice in the [Microsoft documentation](https://learn.microsoft.com/en-us/sql/relational-databases/track-changes/enable-and-disable-change-data-capture-sql-server?view=sql-server-ver16).

Streaming captures all `INSERT`, `UPDATE`, and `DELETE` operations committed to the tables for which change data capture is enabled, once it's deployed. The connector can then record and broadcast these events.

***

## Video example

Expand this box to watch our video about configuring Microsoft SQL Server for a {s_runner}.

<iframe width="560" height="315" src="https://www.youtube.com/embed/sCOF7jOCbZo?si=5NX_Ccug_MgcGHUK&enablejsapi=1" title="YouTube video player" frameBorder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share; fullscreen" referrerPolicy="strict-origin-when-cross-origin" allowFullScreen />

***

## Prerequisites

Before the {s_runner} can be used to load data from Microsoft SQL Server, the database must be configured. To perform these steps, the following is required:

* A `sysadmin` user must enable change data capture for the Microsoft SQL Server database (or Azure SQL Managed Instance).
  * In an Azure SQL Database, the `db_owner` role is required to enable change data capture.
* The Microsoft SQL Server agent must be running.

If the agent isn't running, start the Microsoft SQL Server agent. For more information, read [Start, Stop, or Pause the SQL Server Agent Service](https://learn.microsoft.com/en-us/sql/ssms/agent/start-stop-or-pause-the-sql-server-agent-service?view=sql-server-ver16). If it's not running, you will see an error similar to the following in the server logs:

```
2017-01-08 15:40:24,596 @ -ERROR cached5
com.webaction.source.tm.MSSqlTransactionManager.getStartPosition
(MSSqlTransactionManager.java:389) 2522 :
Could not position at EOF, its equivalent LSN is NULL
```

***

## Enabling change data capture on the database

As noted in [Prerequisites](#prerequisites), a privileged user must first enable change data capture on the database before a capture instance for specific tables can be established.

1. To determine if a database already has change data capture enabled, run the following command:

   ```sql theme={null}
   SELECT name, is_cdc_enabled
   FROM sys.databases
   WHERE name = '<DATABASE_NAME>'
   ```

   If the value for `is_cdc_enabled` is `1`, then change data capture is enabled for the database. Otherwise, it's not enabled.

2. To enable change data capture on your database, run the following stored procedure:

   ```sql theme={null}
   USE <DATABASE_NAME>;
   EXEC sys.sp_cdc_enable_db
   ```

   Enabling the database for change data capture will trigger the creation of several database-specific resources: a schema with the name `cdc`, a change data capture user, metadata tables, and other system objects.

***

## Creating a change data capture user role

1. Create a Microsoft SQL Server user to be used by the Microsoft SQL Server Reader. The user should have `sysadmin`, a fixed server role, and must use the Microsoft SQL Server authentication mode, which must be enabled in Microsoft SQL Server.

2. Grant the **Microsoft SQL Server Reader** user the `db_owner` role. This must be done for each database that you want to read. Use the following commands, substituting your own details:

   ```sql theme={null}
   USE <database name>;
   EXEC sp_addrolemember @rolename=db_owner, @membername=<user_name>
   ```

3. For example, to enable change data capture on the database `mydb`, create a user `streaming_user`, and give that user the `db_owner` role on `mydb`:

   ```sql theme={null}
   USE mydb
   EXEC sys.sp_cdc_enable_db
   CREATE LOGIN streaming_user WITH PASSWORD = 'passwd'
   CREATE USER streaming_user FOR LOGIN cdc
   EXEC sp_addrolemember @rolename=db_owner, @membername=streaming_user
   ```

***

## Enabling change data capture on a table

Once your database has been enabled for change data capture, you must then enable it for all tables that you wish to capture data from. A capture instance for each source table must be created. To do so, the creating member must have the `db_owner` fixed database role. The stored procedure `sys.sp_cdc_enable_table` can then be used as described below.

1. Use the following command to retrieve all the source tables that have been enabled for change data capture:

   ```sql theme={null}
   SELECT name, is_tracked_by_cdc, SCHEMA_NAME(schema_id) AS 'schema_name'
   FROM sys.tables
   WHERE is_tracked_by_cdc = 1
   ```

2. Use the following command to enable change data capture on source tables you want to capture:

   ```sql theme={null}
   USE MyDB
   GO
   EXEC sys.sp_cdc_enable_table
   @source_schema = N'dbo',
   @source_name = N'MyTable',
   @role_name = N'MyRole',
   @filegroup_name = N'MyDB_CT',
   @supports_net_changes = 0
   GO
   ```

The following fields are used in the above script:

| Field name              | Description                                                                                                                                                                                                                                                                                                                                                     |
| ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| @source\_name           | Specifies the name of the table that you want to capture.                                                                                                                                                                                                                                                                                                       |
| @role\_name             | Specifies a role you can add users to, who you want to grant `SELECT` permission on the captured columns of the source table. `sysadmin` or `db_owner` roles also have access to the specified change tables. Set the value of `@role_name` to `NULL`, to allow only members in the `sysadmin` or `db_owner` roles to have full access to captured information. |
| @filegroup\_name        | Specifies the `filegroup` where SQL Server places the change table for the captured table. The named `filegroup` must already exist. It's best not to locate change tables in the same `filegroup` that you use for source tables.                                                                                                                              |
| @supports\_net\_changes | For more information, read [Microsoft @supports\_net\_changes](https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sys-sp-cdc-enable-table-transact-sql?view=sql-server-ver16#----supports_net_changes).                                                                                                                        |

<Note>
  When `@supports_net_changes` is set to `1`, an additional non-clustered index is created on the change table, and the net changes query function is created. Because this index needs to be maintained, enabling net changes can have a negative effect on streaming performance.
</Note>

***

## Verify user access to the streaming table

To retrieve change data capture configuration information from a database or table, a Microsoft SQL Server administrator can use a system stored procedure. This type of administrator with the necessary privileges must do the following:

* Make sure all the captured columns in the change data capture instance have `SELECT` permissions. Members of the `db_owner` database role have access to all the defined capture instances' information.
* Have membership in any gating roles that are defined for the table information that the query includes.
* Run the `sys.sp_cdc_help_change_data_capture` stored procedure to query the table.

Queries shouldn't return empty results.

The following example runs the stored procedure `sys.sp_cdc_help_change_data_capture` on the database `MyDB` to query a table for change data capture configuration information:

```sql theme={null}
USE MyDB
EXEC sys.sp_cdc_help_change_data_capture
```

The query returns configuration information for each streaming-enabled table in the database that includes change data that the caller is authorized to see. If the result is empty, make sure the user has access to the capture instance as well as the streaming tables.

***

## Microsoft SQL Server always on

The Microsoft SQL Server connector can capture changes from an **Always On** read-only replica.

### Prerequisites

* Change data capture is configured and enabled on the primary node. Microsoft SQL Server doesn't support streaming directly on replicas.
* The read-only replica must be configured with **Readable Secondary** as either `Read-intent only` or `Yes`. For more information, read [Microsoft's documentation](https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/configure-read-only-access-on-an-availability-replica-sql-server?view=sql-server-ver16), including the detailed steps in the article.
* Ensure the **Read Replica** toggle is on when creating your pipeline.

***

## Schema drift

Schema drift occurs when your data source changes its metadata, by adding, removing, or changing fields, columns, and types. When not handled correctly, schema drift will cause a difference between the data source, and your streaming pipeline configuration, resulting in the pipeline failing or processing incorrect information.

Due to Microsoft SQL Server limitations, {maia} can't automatically handle schema drift on source tables in Microsoft SQL Server databases. A Microsoft SQL Server connector with change data capture enabled does not automatically propagate any schema changes, such as existing column data type or size changes, renamed columns, new tables, or new columns, meaning you may have to reconfigure your pipeline in response to changes in the source metadata.

To accommodate offline or online schema updates, follow the steps outlined in the following Debezium links to ensure that the changes will be reflected in your streaming pipeline:

* [Offline schema updates](https://debezium.io/documentation/reference/stable/connectors/sqlserver.html#offline-schema-updates)
* [Online schema updates](https://debezium.io/documentation/reference/stable/connectors/sqlserver.html#online-schema-updates)

***

## Limitations

* To build a change data capture instance in Microsoft SQL Server, the base object must be a table. As a result, Microsoft SQL Server doesn't enable collecting changes from indexed views (also known as materialized views).
* If the Microsoft SQL Server agent service isn't running, Microsoft SQL Server change data capture jobs won't execute.
* Streaming doesn't support the values for computed columns even if the computed column is defined as persisted. Computed columns that are included in a capture instance always have a value of NULL.

For additional limitations, read the [Microsoft documentation](https://learn.microsoft.com/en-us/sql/relational-databases/track-changes/about-change-data-capture-sql-server?view=sql-server-ver16#known-limitations-and-issues).

### Large object (LOB) data types

When streaming tables that include large object (LOB) or unstructured data types, pipelines can fail if the data exceeds server configuration limits.

Failures may include errors indicating incomplete responses from the database, for example:

```
SQL Server returned an incomplete response
```

Microsoft SQL Server change data capture uses a replication-based mechanism. If a LOB column exceeds the configured `max text repl size` limit, the result set can be truncated, which may cause incomplete data to be returned. This is a common cause of failures when LOB columns are present.

To check the current setting, run:

```sql theme={null}
EXEC sp_configure 'max text repl size';
```

To increase the limit (a value of -1 sets it to unlimited), run:

```sql theme={null}
EXEC sp_configure 'max text repl size', -1;
RECONFIGURE;
```

For more information, see [Server Configuration: max text repl size - SQL Server](https://learn.microsoft.com/en-us/sql/database-engine/configure-windows/configure-the-max-text-repl-size-server-configuration-option).

***

## Administration and monitoring of change capture tables

Each user setup will have nuances, so it's critical that cleanup and retention is agreed with your Microsoft SQL Server administrator or database administrator, and is appropriate for your use case. Appropriate monitoring should also be configured accordingly. For more information, read [Administer and monitor change data capture - SQL Server](https://learn.microsoft.com/en-us/sql/relational-databases/track-changes/administer-and-monitor-change-data-capture-sql-server?view=sql-server-ver16#Cleanup).

The Debezium documentation, [Debezium connector for SQL Server](https://debezium.io/documentation/reference/stable/connectors/sqlserver.html#_effect_of_sql_server_capture_job_agent_configuration_on_server_load_and_latency), documents how change data capture can impact your server, and what actions you can take to help mitigate this issue.
