Redshift
Amazon Redshift is a well-established cloud data warehouse platform. It is capable of handling the analytical workloads typically used for data warehouse models and other data structures for reporting and beyond.
- Be sure to safelist our IP addresses (see Our IP Grantlist / Whitelist) for cross-network access to your database server
- Identify the Redshift cluster and capture its details for later use
- Consult with your Redshift administrator as needed on the setup
- We suggest creating a new read-only user and a new role to delineate this DLH.io service account from any other user access to your database; however, using an existing user and/or role is acceptable.
- Create a Master or Limited user. A Master user already has the necessary permissions; alternatively, create a DLH.io-specific Limited user granted all the required privileges.
- Create a read-only DLH.io user in your instance
- We suggest creating a new user specific to DLH.io for read-only access to your database(s); if you already have an existing user, you can skip this step (see the SQL sketch after this list):
- CREATE USER datalakehouse_user PASSWORD 'tmp!Password';
- GRANT CREATE, TEMPORARY ON DATABASE public TO datalakehouse_user;
- GRANT CREATE, TEMPORARY ON DATABASE datalakehouse_raw TO datalakehouse_user;
- Identify the Redshift user name that will be used and save it for use in the new connector form
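A minimal sketch of the optional role and read-only grants, assuming the datalakehouse_user created above; the role name dlh_readonly and the schema name public are placeholders, so adjust both to your environment:

-- Sketch only: an optional role (Redshift RBAC) to isolate DLH.io access
CREATE ROLE dlh_readonly;
GRANT ROLE dlh_readonly TO datalakehouse_user;

-- Read-only access on an illustrative schema named public
GRANT USAGE ON SCHEMA public TO datalakehouse_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO datalakehouse_user;

-- Optionally extend SELECT to tables the granting user creates later in this schema
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO datalakehouse_user;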
- Create a DataLakeHouse.io user on AWS IAM / Redshift
- Log in with an account having administrator role or similar access
- Connect to your Redshift instance and find the cluster information and instance endpoint details
- Determine the database name, or (recommended) create a new database called DATALAKEHOUSE_RAW (see the SQL sketch after this list)
- Update the VPC Security Group inbound rules to safelist the DataLakeHouse.io IP addresses (see Our IP Grantlist / Whitelist)
- In Configuration > Workload Management, we suggest turning on Automatic WLM: in the Workload Management window, click the Parameter Group, click Switch WLM Mode, choose Automatic WLM, then click Save
- You could instead use the manual method by updating Workload Queues > Concurrency on Main (query_concurrency in the JSON) to a value greater than 5, but we recommend Automatic WLM unless you have an advanced use case for your data
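If you choose to create the new database, a one-statement sketch matching the recommended name:

CREATE DATABASE datalakehouse_raw;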
- On the Connection Form :: Enter your Credentials and Other Information
- Enter in the Name/Alias field, the name you'll use within datalakehouse.io to differentiate this connection from others
- Enter in the Server/Host field, the endpoint server name:
- Use the full endpoint, for example, examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com (a placeholder; copy the actual endpoint from your Redshift console)
- Enter in the Port field, the port where this database is accessible and firewall restrictions are open. For Redshift we always assume port 5439, which is the standard default, but the field is here for future-proofing.
- Enter in the Database field, the name of the database to connect
- In most cases this is the DATALAKEHOUSE_RAW database
- Enter in the Username/Alias field, the username of the user you created in the steps above to give access to DataLakeHouse.io
- In most cases this is the DATALAKEHOUSE_USER
- Leave the Auth Type field as-is. It is set to Password because DataLakeHouse currently uses only SSL/TLS and requires username and password credentials to access the database
- Enter in the Password field, the password for the user you created in the steps above
- Click on Save & Test to save the connection and test that we can connect.
- If updating the form, click Save & Test, or just Test
- Clicking Save & Test will save any changes, such as a password change. You will not be able to change the prefix of the schema that will be the target in the destination. Any test of the connection will attempt to connect to your database with the credentials and info provided.
- A message of success or failure will be shown:
- If successful, you'll be presented with the schema objects of the database and will need to complete the final configuration steps shown below.
- If the test connection fails, the connection is still saved, but you will need to correct the issue based on the failure reason provided in the message
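As an optional sanity check outside of DLH.io, you can connect to the cluster as the new user and run a quick query; a minimal sketch:

-- Confirms the database and user the session resolved to
SELECT current_database(), current_user;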