Why Duplicates Happen: Process Failures, Not Fraud
The instinct when AP discovers a duplicate payment is to look for a bad actor. In practice, 85–90% of duplicate payments trace back to three categories of process and data quality failure:
- Vendor master duplication: the same legal entity appearing under multiple vendor IDs, often from different invoice routing channels, acquisitions, or entity-code migrations. Payments to the same supplier route through different IDs, making automated duplicate detection in the ERP blind to the collision.
- Invoice reference inconsistencies: the same invoice received through different channels (email, EDI, supplier portal, manual entry) with slightly different reference formats. "INV-2025-0441" and "2025-0441" may be the same invoice in different systems.
- Credit/rebill timing gaps: a vendor credits an invoice and reissues a corrected version. If the credit posts before the original payment clears, or the rebill arrives before the reversal is visible in the AP queue, both get paid. The mechanics are process timing failures, not intent.
Understanding the root cause matters because it determines the right control: fraud requires investigation workflows and escalation. Process failures require detection patterns, pre-payment scoring, and operator corrections.
The 6 High-Signal Detection Patterns
These patterns should be applied against the AP transaction history and run against the payment queue before each payment run. Listed in order of detection confidence and typical dollar exposure.
The highest-confidence pattern. Same normalized invoice reference number, amount within a configurable tolerance (typically 0.5%), paid to the same golden vendor ID within a lookback window (typically 90 days). Catches simple re-entries and system integration double-feeds.
Match: InvRef_Norm = InvRef_Norm AND ABS(Amt - Amt) / Amt < 0.005 AND GoldenVendorID = GoldenVendorID AND PayDate within 90 days
Invoice numbers that are not exact matches but are highly similar (Levenshtein distance ≤ 2, or same numeric core with different prefix/suffix formatting). Same unified vendor, amounts within 2%, payments within 30 days. Catches the "INV-441" vs "2025-441" vs "INV2025-441" problem.
Match: FuzzyMatch(InvRef, InvRef) >= 0.88 AND GoldenVendorID = GoldenVendorID AND PayDate within 30 days AND ABS(Amt - Amt) / Amt < 0.02
Payments to the same golden vendor from different legal entities or company codes within the ERP, with the same invoice reference or highly similar amounts in the same period. Requires a golden vendor ID layer that unifies the vendor across company codes, which standard ERP duplicate checks cannot see.
Match: GoldenVendorID = GoldenVendorID AND CompanyCode != CompanyCode AND InvRef_Norm match OR (Amount match AND InvoiceDate within 15 days)
A credit note and a corrected invoice arrive from the same vendor in close sequence. The credit posts correctly but the original invoice and the rebill both enter the payment queue before the net position is visible in AP reporting. This pattern detects cases where the sum of payments to a vendor in a period exceeds the net invoice total by more than the tolerance threshold.
Flag: SUM(PaymentsToVendor, 30 days) > SUM(NetInvoiceTotal, 30 days) + Tolerance AND VendorHasCreditNote IN period
Two or more payments to the same golden vendor for the same or highly similar amounts within a 7-day window. Catches manual re-runs, ERP batch processing errors, and system integration retry loops. Lower precision than pattern 1; requires controller review rather than automatic hold.
Flag: COUNT(Payments WHERE GoldenVendorID = X AND ABS(Amt - Amt) / Amt < 0.01 AND PayDate within 7 days) >= 2
A vendor has payments routing to more than one bank account number within the same entity, and the bank account records do not have a documented multi-account justification. This is the pattern most often associated with fraud (bank account hijacking), but also catches legitimate address-of-record changes that bypassed the update control workflow.
Flag: GoldenVendorID has payments to BankAcct_A AND BankAcct_B in same 90-day period AND no approved multi-account flag in vendor record
Control Design: Three Layers
Effective duplicate payment control is not a single tool: it is three detection layers that operate at different points in the AP cycle:
Layer 1: Historical Recovery Scan (Retrospective)
Run the six patterns against 24 months of historical payment transactions. This is the one-time "find what got through" scan. It produces a ranked recovery candidate list: duplicate payments that have already been made. The output is a controller action queue with the evidence record for each candidate: both payment documents, the vendor collision details, and a recommended recovery action (credit request, stop-payment, dispute).
Layer 2: Pre-Payment Risk Scoring (Before Each Payment Run)
Before each payment run, run patterns 1–4 against the items queued for payment and flag high-risk matches. Items above the risk score threshold are held for controller review before the payment processes. This is the most valuable layer; it prevents the duplicate rather than recovering it. Requires a lookup against the payment history database that must be updated with each completed run.
Layer 3: Tolerance Bands and Vendor-Type Calibration
Not all vendors have the same risk profile. High-frequency, low-value vendors (utilities, subscriptions) generate many legitimate same-amount, same-vendor payments. Calibrate pattern thresholds by vendor category and spend band. A $42 utility payment matching pattern 5 is noise. A $42,000 professional services payment matching pattern 5 requires review. Miscalibrated tolerances produce alert fatigue; calibrated tolerances produce actionable queues.
Design principle: The goal of control design is a controller action queue that is small enough to be reviewed before every payment run. If the queue is too large to review, the thresholds are miscalibrated. Start conservative (high-confidence patterns only, high-spend vendors only) and expand the net as the team builds confidence in the system.
Implementation Sequence
This sequence is designed for AP teams working without ERP admin access; all detection runs on exports.
1
Week 1–2: Build the Golden Vendor ID Layer
Before any detection patterns can run cross-entity, a unified golden vendor ID must exist. Export the vendor master, run deduplication, produce a golden vendor mapping table. This is the prerequisite that makes patterns 3 and 6 possible.
2
Week 2–3: Run Historical Recovery Scan (Patterns 1–6)
Apply all six patterns to 24 months of payment history against the golden vendor layer. Produce a ranked recovery queue. Conduct a brief controller review session to validate findings and remove false positives before escalation.
3
Week 3–4: Calibrate Thresholds and Build the Review Dashboard
Review false positive rates from the historical scan. Adjust amount tolerances and lookback windows by vendor category. Build the controller dashboard that surfaces the pre-payment queue, shows the evidence for each flagged item, and records resolution actions.
4
Week 4–6: Deploy Pre-Payment Scoring (Layer 2)
Connect the detection engine to the AP payment queue. This requires a data pull from the ERP before each payment run, typically a scheduled export that triggers a Fabric refresh and surfaces the flagged items to the controller before the batch processes. Items above the threshold are held; items below proceed. Document the SLA for controller review to avoid payment delays.
5
Ongoing: Monthly Calibration and Pattern Expansion
Review false positive rates and missed detections monthly. Adjust thresholds as vendor patterns stabilize. Add patterns 5 and 6 to the automated queue once the team is confident in the pattern 1–4 results. Track the five AP quality KPIs from the supplier master guide to measure control effectiveness over time.
How DataQubi Embeds This in the AP Workflow
DataQubi deploys this detection framework as a Microsoft Fabric-based control layer (no ERP modifications required). The architecture:
- Bronze layer: AP transaction exports and vendor master snapshots landed on a scheduled basis, with full historical retention and immutable lineage.
- Silver layer: golden vendor ID mapping applied, vendor names normalized, invoice references standardized, payment records linked to unified vendor identities.
- Detection layer: the six patterns run as SQL logic against the silver tables, producing a scored flagging table refreshed before each payment run cycle.
- Controller dashboard (Power BI): the pre-payment queue displayed by risk score and dollar value. Each item shows: both payment candidates, the specific pattern that flagged them, the vendor identity record, and a resolution action button. Completed actions are logged to the audit table.
The full stack from first data extract to live controller dashboard is typically deployed in 3–5 weeks for a company with a single ERP source. Multi-entity and multi-ERP deployments add 2–4 weeks for the golden vendor ID layer.