Repeatability
High
The matching logic is fully specified: domain matching, fuzzy string similarity, and phone normalization are all deterministic, so the same inputs always produce the same candidate pairs. The same pipeline can be re-run on any future data export with no structural changes.
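As a rough illustration of why these three steps are repeatable, here is a minimal sketch of each as a pure function. It is an assumption-laden sketch, not the pipeline itself: the function names are hypothetical, the phone rule assumes 10-digit national numbers with an optional leading US country code, and the standard-library `difflib` stands in for rapidfuzz so that the example has no third-party dependencies.

```python
import re
from difflib import SequenceMatcher


def email_domain(email: str) -> str:
    """Return the lowercased domain part of an email address, or '' if there is no '@'."""
    _, sep, domain = email.strip().lower().rpartition("@")
    return domain if sep else ""


def normalize_phone(raw: str) -> str:
    """Keep digits only; drop a leading '1' from 11-digit numbers (assumed US country code)."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two lowercased, whitespace-collapsed names.

    difflib is a stand-in here; a real run would likely use a rapidfuzz scorer.
    """
    def clean(s: str) -> str:
        return " ".join(s.lower().split())

    return SequenceMatcher(None, clean(a), clean(b)).ratio()
```

Because none of these functions depends on randomness, time, or external state, running them twice on the same export yields identical results.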
Ambiguity Tolerance
High
Success is clearly defined: a CSV of candidate duplicate pairs with match scores. The human team makes the final merge decisions, so the agent doesn't need to resolve edge cases; it only needs to surface them with their match scores.
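The deliverable described above could look something like the following sketch. The column names, the record IDs, and the two-decimal score format are all hypothetical choices made for illustration; the source only specifies a CSV of candidate pairs with match scores.

```python
import csv
import io


def write_candidate_pairs(pairs, out):
    """Write (id_a, id_b, score) tuples as CSV rows, highest score first."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["record_a", "record_b", "match_score"])
    # Sort by descending score, then by IDs so the output order is stable.
    for id_a, id_b, score in sorted(pairs, key=lambda p: (-p[2], p[0], p[1])):
        writer.writerow([id_a, id_b, f"{score:.2f}"])


# Hypothetical pairs; a real run would write to a file instead of a buffer.
buffer = io.StringIO()
write_candidate_pairs(
    [("pd-17", "hs-203", 0.71), ("pd-4", "sql-88", 0.96)],
    buffer,
)
```

Sorting by score puts the most likely duplicates at the top of the file, which is one plausible way to make the human review pass faster.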
Data & Tool Availability
Medium
The user needs to export and provide all three data sources (Pipedrive, HubSpot, SQL export) in a usable format. Once they are provided, standard Python libraries (pandas, rapidfuzz, recordlinkage) cover everything needed, and no special API access is required.
Error Cost
Low
The output is a review-ready CSV, not an automated merge. No customer data is modified. A false positive just means a human reviewer skips a pair; a false negative means a duplicate goes unflagged, which is recoverable.
Human Judgment Required
Low
The agent's job is purely algorithmic matching and scoring, not deciding which records to merge. Edge cases like 'Acme Inc' vs 'Acme LLC' being different legal entities are explicitly deferred to the human review step.
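One way to defer an edge case rather than resolve it is to attach a note to the pair for the reviewer. The sketch below is only an assumption about how that might be done: the suffix list is illustrative and far from exhaustive, and the function names and note wording are hypothetical.

```python
import re

# Illustrative, non-exhaustive list of legal-entity suffixes (an assumption).
LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}


def split_legal_suffix(name: str) -> tuple[str, str]:
    """Split a company name into (base name, legal suffix), both lowercased."""
    tokens = re.sub(r"[^\w\s]", "", name.lower()).split()
    if tokens and tokens[-1] in LEGAL_SUFFIXES:
        return " ".join(tokens[:-1]), tokens[-1]
    return " ".join(tokens), ""


def review_note(name_a: str, name_b: str) -> str:
    """Return a note for the human reviewer; never decide the merge itself."""
    base_a, suffix_a = split_legal_suffix(name_a)
    base_b, suffix_b = split_legal_suffix(name_b)
    if base_a == base_b and suffix_a != suffix_b:
        return "same base name, different legal suffix"
    return ""
```

For 'Acme Inc' and 'Acme LLC' this produces a note instead of a verdict, which keeps the merge decision with the human team.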