Security
DataLakeHouse.io is committed to security and focused on keeping you and your data safe. DataLakeHouse.io adheres to industry-leading standards while connecting, replicating, and loading data from all of your data sources. Contact security@datalakehouse.io if you have any questions or comments.

Web Portal Connectivity

All connections to DataLakeHouse.io's web portal are encrypted by default using industry-standard cryptographic protocols (TLS 1.2+). Any attempt to connect over an unencrypted channel (HTTP) is redirected to an encrypted channel (HTTPS). To take advantage of HTTPS, your browser must support encryption protection (all versions of Google Chrome, Firefox, and Safari).

Communication & Encryption

All connections to DataLakeHouse.io are encrypted by default, in both directions, using modern ciphers and cryptographic systems. We encrypt data in transit using TLS 1.2+. Any attempt to connect over HTTP is redirected to HTTPS. We use HSTS to ensure browsers interact with DataLakeHouse.io only over HTTPS. We use AES-256 for all data encrypted at rest.
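Both the redirect and the HSTS behavior described above can be verified from the outside. The following is a minimal sketch using the third-party requests library; the portal hostname is an assumption for illustration, not an official DataLakeHouse.io verification tool:

```python
import requests

# Illustrative check of the transport guarantees described above.
PORTAL = "datalakehouse.io"  # assumed hostname, adjust for your portal URL

# 1. A plain-HTTP request should be answered with a redirect to HTTPS.
resp = requests.get(f"http://{PORTAL}/", allow_redirects=False, timeout=10)
assert resp.status_code in (301, 302, 307, 308)
assert resp.headers["Location"].startswith("https://")

# 2. HTTPS responses should carry Strict-Transport-Security, so browsers
#    refuse to downgrade future connections to plain HTTP.
resp = requests.get(f"https://{PORTAL}/", timeout=10)
assert "Strict-Transport-Security" in resp.headers
```

Penetration Testing

DataLakeHouse.io undergoes annual penetration testing by an outside provider and regularly installs the latest, secure versions of all underlying software.

Compliance

SOC 2

A SOC 2 examination, performed by an independent certified public accounting (CPA) firm, is an assessment of a service provider's security control environment against the Trust Services principles and criteria set forth by the American Institute of Certified Public Accountants (AICPA). The result of the examination is a report that contains the service auditor's opinion, a description of the system that was examined, management's assertion regarding the description, and the testing procedures performed by the auditor. DataLakeHouse.io received its SOC 2 accreditation on December 15, 2023. Our SOC 2 proof of active examination is available for review under MNDA upon request for existing customers and special requests.

GDPR

DataLakeHouse.io is fully GDPR compliant. DataLakeHouse.io's Terms of Service include a Data Processing Addendum that enacts the Standard Contractual Clauses set forth by the European Commission to establish a legal basis for cross-border data transfers from the EU.

PCI

Before granting DataLakeHouse.io access to data subject to PCI requirements, please contact support at support@datalakehouse.io.

HIPAA

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) is a federal law that required the creation of national standards to protect sensitive patient health information from being disclosed without the patient's consent or knowledge. DataLakeHouse.io has been assessed against relevant HIPAA security criteria as part of our SOC 2 report, which is available for review under MNDA upon request.

Connectors

Connections to customers' database sources and destinations are SSL-encrypted by default. DataLakeHouse.io can support multiple connectivity channels. Connections to customers' software-as-a-service (SaaS) tool sources are encrypted through HTTPS.

Permissions

Databases and API Cloud Applications

DataLakeHouse.io only requires read permissions. For data sources that by default grant permissions beyond read-only, DataLakeHouse.io will never make use of those permissions.

Destinations

DataLakeHouse.io requires the CREATE permission. This permission allows DataLakeHouse.io to create a schema within your destination, create tables within that schema, and write to those tables. DataLakeHouse.io is then able to read only the data it has written.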
Retention of Customer Data

All customer data, besides what is listed in the Exceptions section (https://docs.datalakehouse.io/security#odmoa), is purged from DataLakeHouse.io's system as soon as it is successfully written to the destination. For normal syncs, this means data exists in our system for no more than eight hours. There are some cases where the retention period may be longer, as described below. In the following two cases, customer data is automatically purged after 30 days using object lifecycle management (a configuration sketch follows the two cases):

- Destination outage: DataLakeHouse.io maintains data that has been read from your source if the destination is down, so we can resume the sync without losing progress once the issue is resolved.
- Retrieving schema information for column blocking or hashing purposes: for newly created connectors, if you choose to review your connector schema before syncing in order to use column blocking or hashing, we queue your data while we read the full schema and only write it to the destination once you approve.
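As an illustration of the permission model above, the sketch below provisions a read-only user on a source database and a user holding only CREATE on a destination database. It assumes PostgreSQL on both ends and the psycopg2 driver; the role names, hostnames, and credentials are placeholders, not DataLakeHouse.io requirements. Note sslmode="require", matching the SSL-by-default connectivity described under Connectors.

```python
import psycopg2

# Hypothetical example: least-privilege provisioning for a PostgreSQL
# source and destination. All names and credentials are placeholders.

# Source: a read-only role, matching the "read permissions only" model.
with psycopg2.connect(host="source.example.com", dbname="app",
                      user="admin", password="...", sslmode="require") as conn:
    with conn.cursor() as cur:
        cur.execute("CREATE ROLE dlh_reader LOGIN PASSWORD 'change-me'")
        cur.execute("GRANT USAGE ON SCHEMA public TO dlh_reader")
        cur.execute("GRANT SELECT ON ALL TABLES IN SCHEMA public TO dlh_reader")

# Destination: CREATE on the database lets the pipeline create its own
# schema and tables, then read back only the data it has written.
with psycopg2.connect(host="warehouse.example.com", dbname="analytics",
                      user="admin", password="...", sslmode="require") as conn:
    with conn.cursor() as cur:
        cur.execute("CREATE ROLE dlh_writer LOGIN PASSWORD 'change-me'")
        cur.execute("GRANT CREATE ON DATABASE analytics TO dlh_writer")
```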
Exceptions

DataLakeHouse.io retains subsets of a customer's data that are required to provide and maintain DataLakeHouse.io's solution. This only includes the following data:

- Customer access keys: DataLakeHouse.io retains customer database credentials and SaaS OAuth tokens in order to securely and continuously extract data and troubleshoot customer issues. These credentials are securely stored in a key management system. The key management system is backed by a hardware security module that is managed by our cloud provider.
- Customer metadata: DataLakeHouse.io retains configuration details and data points (such as table and column names) for each connector so that this information can be shown to your organization in your DataLakeHouse.io dashboard.
- Temporary data: some data integration or replication processes may use ephemeral data specific to a data source. This stream of data is essential to the integration process and is deleted as soon as possible, though it may briefly exceed 24 hours in rare instances. Examples of this temporary data include binary logs for MySQL or SQL Server.

Solution Infrastructure

Access to DataLakeHouse.io production infrastructure is only allowed via hardened bastion hosts, which require an active account protected by MFA (multi-factor authentication) to authenticate. Further access to the environment and enforcement of least privilege is controlled by IAM (identity and access management) policies. Privileged actions taken from bastion hosts are captured in audit logs for review and anomalous-behavior detection.

Physical and Environmental Safeguards

Physical and environmental security is handled entirely by our cloud service providers. Each of our cloud service providers offers an extensive list of compliance and regulatory assurances, including SOC 1/2/3, PCI DSS, and ISO 27001.

- Google: see the Google Cloud Platform compliance (https://cloud.google.com/security/compliance/), security (https://cloud.google.com/security/overview/), and data center security (https://cloud.google.com/security/overview/whitepaper#state-of-the-art_data_centers) documentation for more detailed information.
- Amazon: see the Amazon Web Services compliance (https://aws.amazon.com/compliance/), security (https://aws.amazon.com/security/), and data center security (https://aws.amazon.com/compliance/data-center/controls/) documentation for more detailed information.

Your Organization

Permissions

Users can use single sign-on (SSO) with SAML 2.0. Officially supported identity providers:

- Azure ADFS
- Google Workspace (https://workspace.google.com/)
- Okta (https://www.okta.com/)

Only users of your organization registered within DataLakeHouse.io and DataLakeHouse.io operations staff have access to your organization's DataLakeHouse.io dashboard. Your organization's DataLakeHouse.io dashboard provides visibility into the status of each integration, the aforementioned metadata for each integration, and the ability to pause or delete the integration connection, not organization data. Organization administrators can request that DataLakeHouse.io revoke an organization member's access at any point; these requests will be honored within 24 hours or less.

Company Policies

DataLakeHouse.io requires that all employees comply with security policies designed to keep any and all customer information safe and to address multiple security compliance standards, rules, and regulations:

- Two-factor authentication and strong password controls are required for administrative access to systems.
- Security policies and procedures are documented and reviewed on a regular basis.
- Current and future development follows industry-standard secure coding guidelines, such as those recommended by OWASP.
- Networks are strictly segregated according to security level. Modern, restrictive firewalls protect all connections between networks.

HIPAA

Under the HIPAA Security Rule, DataLakeHouse.io does comply with HIPAA requirements for protected health information (PHI) and will sign a Business Associate Agreement (BAA) with customers who are subject to HIPAA mandates (typically, HIPAA-covered entities). DataLakeHouse.io is not a covered entity under HIPAA rules, and therefore cannot be "HIPAA compliant," since HIPAA itself applies to covered entities (that is, those entities that are subject to regulation by the HHS). DataLakeHouse.io serves as a data pipeline, which means that PHI traversing the DataLakeHouse.io environment is never permanently stored. All transmissions are encrypted using industry best practices (at present, TLS 1.2+). Temporary storage may occur when the amount of data transmitted exceeds the capacity for real-time processing and, as a result, requires short-term caching. Such temporary storage is encrypted. All customer data, including PHI, is purged from DataLakeHouse.io's system as soon as it is successfully written to the destination.

In the Event of a Data Breach

To date, DataLakeHouse.io has not experienced a breach in security of any kind. In the event of such an occurrence, DataLakeHouse.io protocol is such that customers would be made aware as soon as the compromise is confirmed.

Responsible Disclosure Policy

At DataLakeHouse.io, we are committed to keeping our systems, data, and product(s) secure. Despite the measures we take, security vulnerabilities will always be possible. If you believe you've found a security vulnerability, please send it to us by emailing security@datalakehouse.io. Please include the following details with your report:

- A description of the location and potential impact of the vulnerability.
- A detailed description of the steps required to reproduce the vulnerability (POC scripts, screenshots, and compressed screen captures are all helpful to us).

Please make a good-faith effort to avoid privacy violations as well as destruction, interruption, or segregation of services and/or data. We will respond to your report within 5 business days of receipt. If you have followed the above instructions, we will not take any legal action against you regarding the report.

Diagnostic Data Access

Important: DataLakeHouse.io cannot access your data without
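The 30-day purge described above is the kind of rule that object lifecycle management expresses natively, so expiration is enforced by the storage service itself rather than by application code. As a minimal sketch, assuming an S3-compatible staging bucket managed with the boto3 client (the bucket name and prefix are hypothetical):

```python
import boto3

# Hypothetical sketch: a lifecycle rule that auto-deletes staged sync
# data after 30 days. Bucket name and prefix are placeholders.
s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="dlh-staging-example",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "purge-staged-sync-data",
                "Filter": {"Prefix": "queued-syncs/"},
                "Status": "Enabled",
                # Objects older than 30 days are removed by the storage
                # service, independent of any pipeline process.
                "Expiration": {"Days": 30},
            }
        ]
    },
)
```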
your approval.

When working on a support ticket, we may need to access your data to troubleshoot or fix your broken connector or destination. In that case, we will ask you to grant DataLakeHouse.io access to your data for a limited time. You can allow or deny data access. If you grant or downgrade access to support, you can revoke it at any moment.