SQL Transformations

5min

Transformations allow logic to be executed against a target connection based on a scheduled frequency or triggered event of new data on tables updated via DataLakeHouse.io (DLH.io):

  • Add a Transformation
  • Pre-SQL Transformations
  • SQL Transformations

Add a Transformation

  1. Click on Transformation from the header menu
  2. Click on New Transformation or Add a Transformation to show the Transformation creation form.
  3. Update the form and determine what options to select.

Overview

Adding a new transformation, be sure to complete all the required details

Add a New Transformation
Add a New Transformation


Pre-SQL Transformations

Pre-SQL transformations are unique to DLH.io where provide the ability to reference the in stream data (data streaming per each synchronization frequency) against the existing raw data landing zone.

This is a perfect use case for conceptual ideas such as refunds, chargebacks, employee punches, etc.

This option is available by selecting the Transformation Type in the transformation area.

See also Employee Pay Summary for example of using Pre-SQL logic.

SQL Transformations

SQL transformations are unique to DLH.io where provide the ability to reference the in stream data (data streaming per each synchronization frequency) against the existing raw data landing zone. This type of Transformations runs as the step in the Sync Bridge execution.

While a tool such as dbt Cloud, Dataform or Coalesce.io can do these light transformations, DataLakeHouse.io can as well.

In the example below, a Sync Bridge is syncing a Google Sheet into Snowflake. In order to control the column data types in Snowflake, a Transformation is created to drop the CONTACTS_NEW table at the end of each Sync Bridge run and then a CTAS is executed to create the CONTACTS_NEW table with the desired column data types.

Document image


The text of the Script in the above image is below for easy reference:

---------------------------------------------------------------- ---------- Drop the table ----------------------- ---------------------------------------------------------------- DROP TABLE DLH_DEMO_SANDBOX.PEOPLEEXTRACT_GOOGLE_SHEETS.CONTACTS_NEW; ---------------------------------------------------------------- ------ Create the table with the desired data types ---- ---------------------------------------------------------------- CREATE TABLE DLH_DEMO_SANDBOX.PEOPLEEXTRACT_GOOGLE_SHEETS.CONTACTS_NEW AS SELECT DOB::DATE AS DOB, FIRST_NAME::VARCHAR(25) AS FIRST_NAME, MIDDLE_NAME::VARCHAR(15) AS MIDDLE_NAME, LAST_NAME::VARCHAR(50) AS LAST_NAME, PREFIX::VARCHAR(10) AS PREFIX, SUFFIX::VARCHAR(10) AS SUFFIX, TITLE::VARCHAR(75) AS TITLE, DETAILS::VARCHAR(16777216) AS DETAILS, COMPANY::VARCHAR(50) AS COMPANY, COMPANY_ID::NUMBER AS COMPANY_ID, OWNED_BY::VARCHAR(50) AS OWNED_BY, STREET::VARCHAR(100) AS STREET, CITY::VARCHAR(30) AS CITY, STATE::VARCHAR(2) AS STATE, POSTAL_CODE::VARCHAR(10) AS POSTAL_CODE, COUNTRY::VARCHAR(50) AS COUNTRY, __ROW_MD5::VARCHAR(50) AS __ROW_MD5, __DLH_IS_DELETED::BOOLEAN AS __DLH_IS_DELETED, __DLH_SYNC_TS::TIMESTAMP_TZ(9) AS __DLH_SYNC_TS, __DLH_START_TS::TIMESTAMP_TZ(9) AS __DLH_START_TS, __DLH_FINISH_TS::TIMESTAMP_TZ(9) AS __DLH_FINISH_TS, __DLH_IS_ACTIVE::BOOLEAN AS __DLH_IS_ACTIVE FROM DLH_DEMO_SANDBOX.PEOPLEEXTRACT_GOOGLE_SHEETS.CONTACTS;