Data Warehousing

Data Warehousing is a means for an organization to store and retrieve curated business decision making information, quickly, accurately, and with high levels of confidence, usually for reporting or visualization. It’s pre-defined structure allows for cross-departmental analysis of business data, that is usually pre-aggregated with already applied business logic and rules, with security provisioned profiles to enable a more data governed approach for scaling and enabling self-service analytics across the organization. Lastly, it makes any analysis a team already does in an Excel Spreadsheet or elsewhere exponentially faster and more consistent. A Data Warehouse is commonly referred to as the “single source of truth” for the organization. We recommend that it is always part of an organizations data-value-chain regardless of an organization’s understanding of a ‘Data Lake’. An “Enterprise Data Warehouse” (EDW) is a common term, and highest level of achievement, for most Data Warehouse implementations as it implies a holistic representation of the organization’s information.

  • Snowflake Data Cloud : Snowflake
  • Our Open Source Data Warehouse Models : Open Source DW Models

Why have a Data Warehouse?

In our opinion, and as a fundamental approach to building a DataLakeHouse, every organization should have a Data Warehouse as part of the data-value-chain, regardless of the idea or understanding of a ‘Data Lake’. A (Enterprise) Data Warehouse enables:

  • A single-source-of-the-truth for operational and managerial information and data, with curated perspectives on the data that represent the organization, including at a departmental owner perspective
  • Well-defined Key Performance Indicators (KPIs) that drive, and provide the pulse of the business, to track how the organization/business is doing and enables alerting, thresholds, and other measurement abilities
  • Corporate and business hierarchy representation of information – this can can be a direct abstract from operational systems (i.e.: ERP, CRM, POS) to codify and provide taxonomy that represents the organization not the default vendor representation of the transactional data. A common example for this is org structure, alternative chart of accounts or P&L structure, alternative business calendars, etc.
  • Historical data storage and trend analytics (thus how the organization is performing over time?)
  • Master Data Management (MDM) integration and validation, even potentially a “poor-man’s” MDM solution (depending on the organization’s business case requirements)
  • Consolidation of multiple data source systems into a single data repository for analytics (for example, multiple GL systems, and/or HR systems) to gain a single source holistic view of the organization (perfect for acquisitions or mergers where decisions need to be made regarding the new formed organization as a whole)

Updated 17 Feb 2022
Did this page help you?