This article addresses performance issues encountered in Semarchy xDM during the 'Find and Match Duplicate' step within integration jobs, where matching rules cause extended delays in processing. This guide provides solutions to optimize match rules and improve processing times. 


Issue

Customers may experience delays in their integration jobs, particularly during the 'Find and Match Duplicate' step, where even small batches of records (1-5) may take over an hour to process. This issue may arise from inefficient use of functions within match conditions and the absence of optimized indexing. 

Solution

Improving match rules performance involves optimizing binning and matching conditions by removing functions, creating enriched calculated attributes, and adding indexes to the database. This allows for more efficient matching and indexing strategies.

Detailed Steps for Resolution

  1. Optimize Binning Conditions

    • Avoid Functions in Binning Conditions: Functions within binning conditions can slow down performance significantly.
    • Solution:
      • Create a calculated attribute that captures your current binning condition expression without functions.
      • Use this new calculated attribute as the binning field in the match rule instead of the original calculation.
  2. Optimize Matching Conditions

    • Avoid Functions Inside Match Rule Conditions: Functions like trim or upper should not be used directly in match rule conditions, as they can degrade performance.
    • Solution:
      • Create or update attributes with the necessary transformations (e.g., trimmed and/or uppercase versions) and store them as calculated attributes.
      • Use these newly created attributes in your match rule conditions instead of function calls within the rules.
  3. Update Existing Records with New Enrichers

    • Objective: To avoid re-importing all records, perform a manual update on the existing records directly in the MI table.
    • Solution:
      • Extract the SQL code for the enrichers from your task definitions.
      • Execute this SQL on the relevant MI_ table to update existing records so they match the newly created attributes.
    • Note: This approach ensures that the next load will enrich incoming records and compare them against updated attributes in the MI table.
  4. Add Indexes for Matching Attributes

    • Adding indexes can enhance the efficiency of matching tasks significantly, especially on attributes frequently used in match rules.
    • Suggested Indexes:
    • CREATE INDEX usr_1 ON mi_m_health_prfssnl_hprf (hprf_full_gbl_nm_trim, b_classname, b_pubid, b_sourceid);