Installation steps

Setup of Gluesync CDC connector for MS SQL Server


To have Gluesync MS SQL Server CDC connector working on your database instance you will need to have:

  • valid user credentials with permissions to read, and write to the source tables and respective database, Gluesync requires that role to activate CDC for needed tables and create its table structure under the GLUESYNC schema (automatically created by Gluesync);

  • valid user with administrative privileges to access the host where MS SQL Server is running, this will be then used to turn on SQL Server Agent;

To create a valid user for Gluesync on your MS SQL Server database you can run the following query:
    WITH PASSWORD = 'YouDecide!';

--- db_owner role is required for executing Microsoft's SQL Server CDC functions
Given user gsuser present in the example above is not a mandatory user name for Gluesync, you can define whatever user name you’d like.

MS SQL user-permissions

Enabling SQL Agent

Gluesync requires SQL Agent to be enabled and running on the source MS SQL Server database. This task is intended to be carried on by a DBA. To execute this task DBA should access MS SQL Server with sysadmin role permissions.

If you’d like instead to have Gluesync running this task on your behalf you can assign to Gluesync the following permission level:

--- where "gsuser" is the given user name belonging to the Gluesync user

Enabling CDC on the source database

Gluesync requires CDC to be enabled at the database level. To enable CDC at a specific database level Microsoft SQL Server comes with a handy function called sys.sp_cdc_enable_db that by passing to it the db name will enable CDC over it.

If Gluesync’s user is given sysadmin role permissions it can automatically turn that on, otherwise, this task needs to be carried on by the DBA with the same user privileges.

To learn more about that stored procedure we suggest you take a look at Microsoft’s documentation through the following link.

Enabling and executing CDC tasks on tables

Gluesync requires privileges to execute a set of Microsoft-provided functions that are part of the MS SQL Server database package. A complete list of functions involved in the CDC process is listed under the following section in Microsoft MS SQL Server documentation.

Every function requires at least a db_owner role to be given to the Gluesync user to perform its duties.

Enabling CDC agent (SQL Server Agent)

CDC on MS SQL Server is served through a service called SQL Server Agent, you can read more about it by following this link pointing directly to the Microsoft SQL Server website.

Enabling the agent is a mandatory step that needs to be carried on by a DBA. Here we have collected steps for enabling the agent both for MS SQL Server Windows-based and Linux deployments.

If you’re considering running MS SQL Server under a container environment (like docker for instance) consider following the Linux section.

Turn on CDC agent under Linux

With a user with administrative privileges on the machine where MS SQL Server is running, you need to issue the following commands:

/opt/mssql/bin/mssql-conf set sqlagent.enabled true

After having successfully performed this step you’ll be requested to restart your MS SQL Server database instance.

If you need to enter inside the container to run the given bash command you can issue that from your host: docker exec -it --user root CONTAINER_NAME bash.

Turn on CDC agent under Windows

Microsoft has a detailed guide on how to enable SQL Server Agent when running MS SQL Server under Windows-based environments.

The detailed guide can be found here at this link.

Checking if SQL Server Agent is running properly

To check if the SQL Server Agent has been properly set and is currently running you can issue the following commands:

# this will enable the sp_configure to display advanced options
EXEC sp_configure 'show advanced options', 1;

# this will print out the full status of the SQL Server Agent

The resulting output will look like the one in the following picture.

Query output showing the two columns

Basic configuration example

The SQL to NoSQL module can be customized by using a configuration file, in JSON format. The file name to use must be specified as a parameter when launching the app, with the -f or --file tokens. The file should be composed of a union of common configuration files (see here Installation steps) and source/destination specific configuration:

  "mssql": {
      "dbo.SUPPLIERS": ["NAME", "SURNAME", "EMAIL"]

Microsoft SQL Server-specific configurations are listed under the mssql property:

  • tableKeys: this object is mandatory if you’re considering sourcing data from table(s) where there are NO primary key columns defined. You can learn more about the tableKeys object and how to leverage it by reading its deep dive at this link.

Looking for data modeling features or other options?

For more detailed configuration options, including the ability to perform data modeling, please have a look at the dedicated Data modeling section when sourcing from an RDBMS.