Google Drive

Google Drive is Google's file storage and productivity service. It is part of the Google Business and Google Workspace offerings used by millions of users. Files of many types can be stored in Google Drive, synchronized with your desktop, and shared with others.

DLH.io provides this connector as a direct way to work with data and files, mainly as a source to synchronize or load into the cloud data warehouse destination of your choice.

Prerequisites:

  • Access to files in your Google Drive

Setup Instructions

DataLakeHouse.io connects securely to the Google Drive account you have access to, using Google authentication. Using the form in the DataLakeHouse.io portal, complete the following steps.

  1. Enter a name or alias for this connection in the 'Name/Alias' field. It must be unique among your connectors.
  2. Enter a 'Target Schema Prefix', which will be the prefix of the schema at the target into which your data files are synced.
  3. Enter a Folder Path, which is the path on the root bucket from which the desired files will be retrieved.
    • File Pattern is a regular expression (RegEx) used to isolate only certain files to be retrieved.
    • File Type selects a predetermined file extension to be retrieved.
  4. Enter your Service Account Key, which should be a JSON string.
    • Paste the entire Service Account Key (JSON). If using this bucket as process storage for BigQuery, add this Service Account email, dlh-global-bq-data-sync-svc@stg-datalakehouse.iam.gserviceaccount.com, as a principal on your GCP project and assign it the Storage Admin role.
  5. Click the Save & Test button. Once your credentials are accepted, you should see a successful connection.
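
Before filling in the form, it can help to check the File Pattern and the Service Account Key locally. The sketch below is illustrative only and is not part of DLH.io: the function names and sample file names are hypothetical, and it assumes the pattern is applied with search semantics, which this page does not specify. The required field names are those found in a standard Google Cloud service account key file.

```python
import json
import re

# Fields found in a standard Google Cloud service account key file.
REQUIRED_KEY_FIELDS = {"type", "project_id", "private_key", "client_email"}


def filter_files(file_names, file_pattern):
    """Return the file names that the File Pattern regex matches.

    Assumption: search semantics. Anchor the pattern with ^ and $ when the
    whole file name must match.
    """
    compiled = re.compile(file_pattern)
    return [name for name in file_names if compiled.search(name)]


def check_service_account_key(raw_key):
    """Parse the pasted key and return a list of problems (empty if none)."""
    try:
        key = json.loads(raw_key)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["key must be a JSON object"]
    missing = sorted(REQUIRED_KEY_FIELDS - key.keys())
    problems = [f"missing field: {field}" for field in missing]
    if key.get("type") != "service_account":
        problems.append('"type" should be "service_account"')
    return problems


if __name__ == "__main__":
    # Hypothetical file names, used only to try out a pattern.
    names = ["orders_2024-01.csv", "orders_2024-02.csv", "notes.txt"]
    print(filter_files(names, r"^orders_\d{4}-\d{2}\.csv$"))
    print(check_service_account_key('{"type": "service_account"}'))
```

An empty list from `check_service_account_key` only means the key is well-formed JSON with the expected fields. The Save & Test button remains the actual check that the credentials are accepted.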

Control Each Column Data Type

SQL Transformations allow logic to be executed against a target connection, either on a scheduled frequency or when triggered by new data arriving in tables updated via DataLakeHouse.io (DLH.io). This is especially helpful for controlling column data types in your Target Connection, because all columns from this connector are set as VARCHAR(16777216).
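
As a rough illustration of the kind of statement a SQL Transformation might run, the sketch below builds a view that casts each VARCHAR column to a chosen type. It is a minimal example under stated assumptions, not DLH.io's own method: the helper name, the table and view names, and the column types are hypothetical, and the `CREATE OR REPLACE VIEW` and type syntax should be checked against your target warehouse.

```python
import re

# Plain identifiers only; this keeps arbitrary text out of the generated SQL.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_typed_view_sql(source_table, view_name, column_types):
    """Build a CREATE VIEW statement that casts each column to a target type.

    column_types maps a column name to a SQL type name. Type names are passed
    through unchanged, so they must be valid for the target warehouse.
    """
    identifiers = [*source_table.split("."), *view_name.split("."), *column_types]
    for name in identifiers:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    select_list = ",\n".join(
        f"    CAST({column} AS {sql_type}) AS {column}"
        for column, sql_type in column_types.items()
    )
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT\n{select_list}\n"
        f"FROM {source_table};"
    )


if __name__ == "__main__":
    # Hypothetical schema, table, and columns for illustration.
    print(
        build_typed_view_sql(
            "gdrive_files.orders",
            "orders_typed",
            {"order_id": "INTEGER", "order_date": "DATE"},
        )
    )
```

Casting in a view leaves the loaded VARCHAR table untouched, so a value that fails to convert surfaces when the view is queried and does not affect the sync itself.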