"
User Story
You have deployed your new MDM solution into production and it has been running smoothly. You have already started new development sprints in your Dev environment. A month later, you have new enrichers, tweaked existing match rules, configured the survivorship rules. You are ready to deploy the new model changes to production.
Once you close your model in Dev, test it in QA, and deploy the new model to Production, you’re ready for the changes to the model to be reflected in all of the records in Production.
Solution
The key is that you need to reload the production data to trigger the reprocessing of your records so new rules (enrichers, validation rules) work against existing data or replay matching rules.
The quick answer is: you must reload your data in production so it goes through certification again. To do this: take the MI (Master Integration) records and reload them into the SD (Source Data) table in Production.
Note: For quick guidance on how to replay matching, here is our Integration documentation on how to replay the matching process. If you need to replay matching and don’t need to reset user decisions, you can skip directly to the section called “Re-load the data from the latest master values.” This similar script can be used to reprocess records for new enrichers and other non-matching rules. |
Here is the skeleton of the script you need to reload data. This only covers the insert statement. You will need the additional steps of getting a new load ID and submitting the load (or a continuous load if you choose to delegate the overhead to xDM).
INSERT into SEMARCHY_DEMO_MDM.SD_CUSTOMER (
B_LOADID /* Semarchy specific data fields */
,B_CLASSNAME
,B_PUBID
,B_SOURCEID /* Customer system fields */
,CUSTOMER_NAME /* Data fields */
,TOTAL_REVENUE
,INPADDRESS
,INPPOSTAL_CODE
,INPCITY
,INPCOUNTRY
,F_ACCOUNT_MANAGER
)
SELECT
v_load_id /* B_LOADID */
,'Customer' /* B_CLASSNAME - Entity name. Found in the Application Builder. */
,B_PUBID /* B_PUBID - Publisher ID mapped to B_PUBID in SD table to track source systems. */
,B_SOURCEID /* B_SOURCEID - mapped to B_SOURCEID in SD table. */
,CUSTOMER_NAME /* Data fields */
,TOTAL_REVENUE
,INPADDRESS
,INPPOSTAL_CODE
,INPCITY
,INPCOUNTRY
,F_ACCOUNT_MANAGER
FROM SEMARCHY_DEMO_MDM.MI_CUSTOMER /* Add a clause to filter records */
-- WHERE MI_CUSTOMER.CUSTOMER_ID = ??
Wrapping this insert into a stored procedure could look something like the following, using Oracle:
create or replace procedure SEMARCHY_DEMO_MDM.RELOAD_CUSTOMER_LOAD
AUTHID CURRENT_USER AS
v_load_id int;
v_batchid int;
v_submit_load_ok boolean := TRUE;
BEGIN
v_load_id := SEMARCHY_DEMO_REPOSITORY.INTEGRATION_LOAD.GET_NEW_LOADID (
'CustomerMDM' /* Data Location Name. Case sensitive. Found in the Application Builder*/
,'MANUAL_ETL' /* Informational. Describes the process or program doing the load. */
,'Test to reload master records' /* Informational. Describes the load. */
,'MANUAL_ETL_USER' /* User initializing the load.*/
);
BEGIN
INSERT into SEMARCHY_DEMO_MDM.SD_CUSTOMER (
B_LOADID /* Semarchy specific data fields */
,B_CLASSNAME
,B_PUBID
,B_SOURCEID /* Customer system fields */
,CUSTOMER_NAME /* Data fields */
,TOTAL_REVENUE
,INPADDRESS
,INPPOSTAL_CODE
,INPCITY
,INPCOUNTRY
,F_ACCOUNT_MANAGER
)
SELECT
v_load_id /* B_LOADID - v_load_id is a variable declared earlier */
,'Customer' /* B_CLASSNAME - Entity name. Found in the Application Builder */
,B_PUBID /* B_PUBID - Publisher ID mapped to B_PUBID in SD table to track source systems. */
,B_SOURCEID /* B_SOURCEID - mapped to B_SOURCEID in SD table. */
,CUSTOMER_NAME /* Data fields */
,TOTAL_REVENUE
,INPADDRESS
,INPPOSTAL_CODE
,INPCITY
,INPCOUNTRY
,F_ACCOUNT_MANAGER
FROM SEMARCHY_DEMO_MDM.MI_CUSTOMER
COMMIT;
EXCEPTION
WHEN DUP_VAL_ON_INDEX THEN
dbms_output.put_line('Encountered error ORA-00001. You attempted to insert non-unique records into the SD table.');
v_submit_load_ok := FALSE;
WHEN OTHERS THEN
dbms_output.put_line('Encountered an error while inserting into the SD table');
v_submit_load_ok := FALSE;
END;
IF v_submit_load_ok THEN
v_batchid := SEMARCHY_DEMO_REPOSITORY.INTEGRATION_LOAD.SUBMIT_LOAD (
v_load_id
,'INTEGRATE_DATA' /* Must exactly match a defined integration job. Found in the Application Builder. */
,'MANUAL_ETL_USER' /* Must exactly match the value of the User used in GET_NEW_LOADID above. */
);
ELSE
SEMARCHY_DEMO_REPOSITORY.INTEGRATION_LOAD.CANCEL_LOAD( v_load_id, 'MANUAL_ETL_USER' );
END IF;
END;
/* Run the procedure like this (it takes no parameters) */
exec SEMARCHY_DEMO_MDM.CUSTOMER_LOAD;
Performance concerns
It can make sense to add a where clause on your selection statement to strategically reload only the records you want to reprocess. It's OK to reload all the records as well. But you may want better performance by dividing your records into smaller batches and loading them in a piecemeal fashion.
Matching concerns
xDM does not automatically replay match because it’s possible you don’t want all of your records to undergo matching again. Consider which records to reload can be important. Test in QA to ensure that it is safe to reload all the records you intend to reprocess and ensure you do not get unexpected bad match results.
"