
What is a Data Vault?

DataLakeHouse.io views a Data LakeHouse as a framework and a way of thinking about structure and scale. We support the guidance of the Data Vault 2.0 framework as part of the DataLakeHouse concept, and our roadmap continues to add Data Vault concepts and supporting features to DataLakeHouse.io.

Data Vault 2.0 is a hybrid data modeling methodology, architecture, and framework that works with data of all types (structured, semi-structured, and unstructured) and is designed to be resilient to environmental changes. At its core, it is a modern, agile way of designing and building efficient, effective Data Warehouses. Dan Linstedt, the creator of the Data Vault methodology, once described the Data Vault as "A detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3NF and Star Schemas. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise."

Data Vault 2.0, the newest iteration of the Data Vault methodology, improves substantially on its predecessor by taking advantage of advances in technology and compute. For example, it removes the need for lookup surrogate keys by using a Hash Key concept, which we at DataLakeHouse.io use in all of our data modeling designs.
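To illustrate the Hash Key idea, here is a minimal Python sketch that derives a deterministic key from one or more business key values. The function name `hash_key`, the choice of MD5, the `||` delimiter, and the trim-and-uppercase normalization are assumptions made for this example; they are not taken from DataLakeHouse.io's own models.

```python
import hashlib


def hash_key(*business_keys, delimiter="||"):
    """Return a deterministic hash key for one or more business key values.

    Assumed conventions (illustrative only): values are trimmed and
    uppercased, joined with a delimiter, then hashed with MD5.
    """
    normalized = [str(value).strip().upper() for value in business_keys]
    joined = delimiter.join(normalized)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# The same business key always yields the same key, so no lookup
# against a surrogate-key table is needed at load time.
customer_key = hash_key("C-100")
order_line_key = hash_key("O-9", "3")  # composite business key
```

Because the key is computed from the data itself, independent load processes can produce matching keys without coordinating through a shared sequence.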

The Data Vault Hub, Link, and Satellite concepts allow multiple data sources to be integrated into single or multiple structures that help define and democratize data across the enterprise. In so doing, the Data Warehouse becomes a near-idempotent system of data in a Raw Vault, with the ability to layer on more business-centric transformed data in a Business or Information Vault. Ultimately, a traditional dimensional model Data Warehouse or Data Mart can still be built, with the consistency of a reliable and massively scalable DV 2.0 Raw Vault at its core.
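As a rough sketch of how Hubs, Links, and Satellites fit together, the Python example below models a tiny in-memory Raw Vault. The class `RawVault`, its method names, the column names, and the hashing conventions are all hypothetical and chosen for illustration; a real Raw Vault would be a set of warehouse tables rather than Python objects.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def md5_hex(parts):
    """Hash a list of values joined by an assumed '||' delimiter."""
    return hashlib.md5("||".join(str(p) for p in parts).encode("utf-8")).hexdigest()


def business_hash(business_key):
    """Hash key for a business key; trim + uppercase is an assumed convention."""
    return md5_hex([str(business_key).strip().upper()])


@dataclass
class RawVault:
    hubs: dict = field(default_factory=dict)        # hub hash key -> hub row
    links: dict = field(default_factory=dict)       # link hash key -> link row
    satellites: list = field(default_factory=list)  # append-only attribute history

    def load_hub(self, business_key, source):
        """Register a business key once; reloading the same key changes nothing."""
        hub_key = business_hash(business_key)
        self.hubs.setdefault(hub_key, {
            "business_key": business_key,
            "record_source": source,
            "load_date": datetime.now(timezone.utc),
        })
        return hub_key

    def load_link(self, hub_keys, source):
        """Record a relationship between hubs, keyed by the hash of their keys."""
        link_key = md5_hex(hub_keys)
        self.links.setdefault(link_key, {
            "hub_keys": list(hub_keys),
            "record_source": source,
            "load_date": datetime.now(timezone.utc),
        })
        return link_key

    def load_satellite(self, hub_key, attributes, source):
        """Append descriptive attributes only when they differ from the latest row."""
        hash_diff = md5_hex([attributes[name] for name in sorted(attributes)])
        latest = next(
            (row for row in reversed(self.satellites) if row["hub_key"] == hub_key),
            None,
        )
        if latest is not None and latest["hash_diff"] == hash_diff:
            return False  # unchanged: nothing is written
        self.satellites.append({
            "hub_key": hub_key,
            "hash_diff": hash_diff,
            "attributes": dict(attributes),
            "record_source": source,
            "load_date": datetime.now(timezone.utc),
        })
        return True


vault = RawVault()
customer = vault.load_hub("C-100", "crm")
order = vault.load_hub("O-9", "shop")
vault.load_link([customer, order], "shop")
vault.load_satellite(customer, {"name": "Ada", "city": "Paris"}, "crm")
```

In this sketch, reloading the same source rows leaves the vault unchanged, which is one way to read the near-idempotent behavior described above; changed attributes add a new Satellite row instead of overwriting history.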



Updated 03 Mar 2023