Sync Bridge (Data Pipelines)

Historical Re-sync

During each run, a Sync Bridge returns the records that are new or updated since the last run. But what happens after the first run if not all records were replicated from the Source Connector? When historical data needs to be loaded, each Source Connector has certain capabilities that depend on how the vendor has built its API.
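The difference between a normal run and a historical re-sync can be pictured as a choice of starting timestamp. The following is only a minimal sketch under stated assumptions: `select_sync_start` and its parameters are hypothetical names used for illustration and are not part of DataLakeHouse.io, and the rule that a first run falls back to the historical start date is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def select_sync_start(
    last_run: Optional[datetime],
    historical_start: Optional[datetime],
    resync_requested: bool,
) -> Optional[datetime]:
    """Return the timestamp from which records should be fetched.

    Hypothetical helper for illustration only.
    None means "no lower bound", i.e. whatever the source API exposes.
    """
    if resync_requested or last_run is None:
        # A historical re-sync (or, by assumption, a first run) ignores
        # the last-run cursor and starts from the historical start date.
        return historical_start
    # A normal run only fetches records new or updated since the last run.
    return last_run


if __name__ == "__main__":
    last = datetime(2024, 5, 1, tzinfo=timezone.utc)
    hist = datetime(2020, 1, 1, tzinfo=timezone.utc)
    print(select_sync_start(last, hist, resync_requested=False))
    print(select_sync_start(last, hist, resync_requested=True))
```

In this sketch a regular run starts from the last-run timestamp, while a requested re-sync starts from the configured historical date.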

Option 1 - Re-sync All Historical Records

When you need to sync historical data, DataLakeHouse.io aligns with the vendor's API capabilities and provides the ability to re-sync this historical data. For the Source Connectors that allow you to set the date for the historical sync, the Historical Load Start Date field is available under the Options tab.

Before setting a date, ask the source system administrator how far back data is available and whether there are any API limits for your source system. For example, your Salesforce contract will specify the amount of data that you are allowed to pull via the API on a monthly basis.
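A rough back-of-the-envelope check can help decide how far back to set the date. The sketch below is illustrative only: it assumes the limit is counted in API calls and that each call returns a fixed page of records, which may not match how your vendor meters usage. The function names and all numbers are hypothetical.

```python
import math


def estimate_api_calls(record_count: int, page_size: int) -> int:
    """Estimate the calls needed to pull record_count records (assumes fixed-size pages)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(record_count / page_size)


def fits_in_budget(
    record_count: int,
    page_size: int,
    monthly_call_limit: int,
    calls_already_used: int = 0,
) -> bool:
    """Return True if the estimated calls fit in the remaining monthly allowance."""
    remaining = monthly_call_limit - calls_already_used
    return estimate_api_calls(record_count, page_size) <= remaining


if __name__ == "__main__":
    # Hypothetical figures: 1,000,000 records, 2,000 records per call.
    print(estimate_api_calls(1_000_000, 2_000))
    print(fits_in_budget(1_000_000, 2_000, monthly_call_limit=400))
```

If the estimate does not fit, a later Historical Load Start Date reduces the volume pulled in a single historical sync.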

Once you know the historical date, set the Historical Load Start Date field accordingly in the Source Connection.

Next, go to your Sync Bridge and click Actions...Re-Sync All History. The bridge will execute a historical sync based on the Historical Load Start Date. Please note that the Sync Bridge may take significantly longer to run, depending on how much data is replicated during the historical sync.

Option 2 - Individual Entity Historical Sync

In certain situations, you may want to perform a historical sync on only certain entities.

Once you know the historical date, set the Historical Load Start Date field accordingly in the Source Connection.

Next, while in the Source Connection, click on the Source Schema tab. A list of all the entities available to replicate will be displayed. If the source system vendor has provided the appropriate API, a Re-Sync icon will appear to the right of the Table/Entity, as shown at #1 in the image below. Clicking this icon tells the Sync Bridge to run a historical sync on that entity during its next scheduled run.

If a historical sync is not available for the Table/Entity, the sync icon is grayed out, as shown at #2 in the image below. This may be because the vendor restricts the API to ensure data integrity, which is enforced by allowing a historical sync only on the higher-level Table/Entity. In the image below, Employee is the main entity: when a historical sync is activated on the Employee entity, all the other entities whose names begin with EMPLOYEE_ will also have a historical sync.

Document image
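
The parent/child behavior described above can be pictured as a simple name-prefix match. This is a minimal sketch, not how DataLakeHouse.io or any vendor API actually resolves related entities: `entities_to_resync` is a hypothetical helper, and every entity name other than the EMPLOYEE_ prefix is made up for illustration.

```python
from typing import List


def entities_to_resync(parent: str, all_entities: List[str]) -> List[str]:
    """Return the parent entity plus every entity named <PARENT>_*.

    Hypothetical helper: matching is case-insensitive and purely by prefix.
    """
    parent_upper = parent.upper()
    prefix = parent_upper + "_"
    return [
        name
        for name in all_entities
        if name.upper() == parent_upper or name.upper().startswith(prefix)
    ]


if __name__ == "__main__":
    # Illustrative entity list; only the EMPLOYEE_ prefix comes from the text.
    schema = ["EMPLOYEE", "EMPLOYEE_ADDRESS", "EMPLOYEE_PAY", "DEPARTMENT"]
    print(entities_to_resync("Employee", schema))
```

In this sketch, requesting a re-sync on the main entity also selects its prefixed child entities, while unrelated entities are left to their normal incremental runs.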