Azure Synapse Setup
Gather the basic connection details for Azure Synapse:
- Database host/servername
- User information
Azure Synapse Prerequisites:
- Update the firewall rules on your Azure Synapse workspace to whitelist the DataLakeHouse.io IP addresses required for network access.
- Create a specific Azure Synapse role and user for the DLH.io integration. A new, dedicated user is required; do not use the default service administrator, because that account is fixed to a small memory resource class that cannot be changed, which limits your performance.
- Create your dedicated SQL Pool / SQL Warehouse or capture an existing SQL Pool's details. In a dedicated SQL Pool, the database name is usually the same name as the dedicated SQL Pool itself.
- Capture the server name of the SQL pool you just created; be sure to get the fully qualified domain name (FQDN) of the SQL pool server, also known as the Workspace SQL Endpoint.
- This is usually of the form mysynapseaccount.sql.azuresynapse.net
- Update the Azure firewall settings to allow incoming connections from the DLH.io IP addresses to your Azure Synapse instance if you restrict access for security reasons; otherwise, leave it open to all public traffic.
- Capture your Azure Synapse Analytics connection details: navigate to Manage > Analytics Pool, then click the SQL pool link on the left side (for example, "Built-in" or the server name of the SQL pool you created in the previous step).
- Create the DLH.io user with your default Synapse Analytics administrator role or similar, using your favorite IDE or Synapse Studio. You can also use a basic SQL script in Azure Synapse Analytics: click Develop in the left menu, then click the "+" icon to add a "SQL Script", and run the script against your Synapse workspace SQL pool. For the username we suggest dlh_user_svc or similar, and use the same name for the LOGIN username:
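A minimal sketch of such a script, assuming the suggested dlh_user_svc username (the password placeholder and the db_owner role grant are illustrative assumptions; substitute your own strong password and the narrowest role your security policy allows):

```sql
-- Run against the master database of your Synapse workspace:
CREATE LOGIN dlh_user_svc WITH PASSWORD = '<your_strong_password>';

-- Then run against your dedicated SQL pool database:
CREATE USER dlh_user_svc FOR LOGIN dlh_user_svc;
EXEC sp_addrolemember 'db_owner', 'dlh_user_svc';
```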
- Assign the new Synapse workspace user to a resource class role that allows greater memory for any workload service level. Run the following statement, replacing <resource_class_name> with staticrc20 (our basic recommendation for memory-to-service-level alignment) and <username> with the username created in the previous step:
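Assuming the same dlh_user_svc user and the recommended staticrc20 resource class, the statement might look like:

```sql
-- Run against your dedicated SQL pool database:
EXEC sp_addrolemember 'staticrc20', 'dlh_user_svc';
```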
- (Optional) Create a new Azure Blob Storage source connector in DLH.io, which will allow any transient data from the synchronization to run through your cloud storage instead of the default DLH.io storage.
Scroll down to ensure you review all steps, as needed...
Remember DLH.io connects to your instance with credentials supplied by you. We may store your credentials securely with our bank-grade protocols to maintain speed and consistency of your data pipelines.
- Get all the information from the pre-requisite steps.
- Enter your Credentials and Other Information in the Fields
- In the Name/Alias field, enter the name you'll use within DLH.io to differentiate this connection from others.
- In the Target Schema Prefix field, enter the prefix for the schema(s) created in your destination target connection database for each schema or connector you load into this target connection. For example, if the source connection to be synchronized is a database with a schema named 'dbo', then when the data is synced the target database will contain a schema named the value of this field + '_dbo'.
- Alphanumeric characters only. The prefix must start and end with a letter but can contain an underscore (_).
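Once a sync has run, you can verify the prefixed schemas in the target database; a sketch, assuming a prefix of DLH (so a source schema dbo becomes DLH_dbo):

```sql
-- Lists schemas created with the DLH_ prefix (the [_] escapes the wildcard underscore):
SELECT name FROM sys.schemas WHERE name LIKE 'DLH[_]%';
```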
- In the Server/Host field, enter the workspace server name (the FQDN from the prerequisites).
- In the Port field, enter the port on which this database is accessible and open through the firewall (typically 1433 for Azure Synapse).
- In the Database field, enter the name of the database to connect to.
- In the Username/Alias field, enter the username of the user you created in the steps above to grant DataLakeHouse.io access.
- Leave the Auth Type field as-is. It is set to Password because DLH.io uses TLS and requires username and password credentials to access the database.
- In the Password field, enter the password for the user you created in the steps above.
- Click Save & Test
- This attempts to connect to your Synapse database with the credentials provided.
- A message of success or failure will be shown.
- If successful, you'll be prompted with the schema objects of the database and will need to complete the final steps for configuration shown below.
- If the test connection fails, the connection is still saved, but you will need to correct the failure based on the information provided within the error message.
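If the test fails, a quick sanity check from Synapse Studio or your IDE, connected as the DLH.io user, can confirm which login and database the connection actually resolves to; for example:

```sql
-- Confirms the login and database the current connection resolved to:
SELECT SUSER_SNAME() AS login_name, DB_NAME() AS database_name;
```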
This section covers other important steps required on your database side and in DataLakeHouse.io once you have completed the test connection above successfully.
Remember DataLakeHouse.io connects to your database instance with credentials supplied by you. We store your credentials securely with our bank-grade protocols.
DLH.io will connect through an SSH tunnel (TCP/IP) or via standard internet traffic. You must enable access to your Azure Synapse workspace(s) and ensure that you have whitelisted/grantlisted the DataLakeHouse.io IP addresses that will access your Azure Synapse database(s).