Fuzzy Similarity-Based Identity Matching Framework for Duplicate Detection in Financial Databases

Naga Charan Nandigama

Published: Mar 2, 2026

Keywords:

Fuzzy Logic, String Similarity, Identity Matching, Duplicate Detection, Financial Databases, Customer Identity Resolution, Jaro–Winkler, Levenshtein Distance, Data Quality, Banking Systems

Naga Charan Nandigama

Independent Researcher, Tampa, Florida, USA

Abstract

Duplicate and inconsistent customer records are a significant challenge in modern financial databases, affecting Know Your Customer (KYC) compliance, risk assessment, fraud detection, and operational efficiency. Traditional deterministic matching techniques often fail because of spelling variations, typographical errors, incomplete data, and inconsistent formatting of customer information. To address these limitations, this research proposes a Fuzzy Similarity-Based Identity Matching Framework that detects and reconciles duplicate customer profiles in banking and financial systems. The framework combines fuzzy logic with string similarity measures such as Jaro–Winkler, Levenshtein distance, cosine similarity, and phonetic encoding to produce weighted similarity scores. A multi-layer fuzzy inference model then evaluates these scores and classifies each record pair as a match, a probable match, or a non-match. Experimental validation shows significantly higher accuracy and recall, and fewer false positives, than conventional rule-based and deterministic methods. The proposed approach improves data quality, strengthens compliance processes, and supports secure, real-time identity resolution across heterogeneous financial databases, making it a scalable and robust option for large banking environments.
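
The scoring pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the weights (0.4, 0.4, 0.2), the cut-offs (0.85 and 0.65), and the function names are hypothetical, phonetic encoding is left out for brevity, and crisp thresholds stand in for the multi-layer fuzzy inference model.

```python
from collections import Counter
from math import sqrt


def levenshtein_sim(a: str, b: str) -> float:
    """Levenshtein distance normalized to a similarity in [0, 1]."""
    if not a and not b:
        return 1.0
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            # deletion, insertion, substitution (cost 0 if characters agree)
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return 1.0 - prev[-1] / max(len(a), len(b))


def jaro_winkler(a: str, b: str, p: float = 0.1) -> float:
    """Jaro similarity with the Winkler common-prefix bonus (prefix capped at 4)."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    matches = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ca:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    a_seq = [c for c, hit in zip(a, a_hit) if hit]
    b_seq = [c for c, hit in zip(b, b_hit) if hit]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) / 2
    jaro = (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3
    prefix = 0
    for x, y in zip(a[:4], b[:4]):
        if x != y:
            break
        prefix += 1
    return jaro + prefix * p * (1 - jaro)


def bigram_cosine(a: str, b: str) -> float:
    """Cosine similarity over character-bigram count vectors."""
    if a == b:
        return 1.0
    va = Counter(a[i:i + 2] for i in range(len(a) - 1))
    vb = Counter(b[i:i + 2] for i in range(len(b) - 1))
    dot = sum(va[g] * vb[g] for g in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0


# Hypothetical weights and cut-offs, chosen only for illustration.
WEIGHTS = {"jw": 0.4, "lev": 0.4, "cos": 0.2}
MATCH_CUTOFF, PROBABLE_CUTOFF = 0.85, 0.65


def weighted_score(a: str, b: str) -> float:
    """Weighted combination of the three similarity measures."""
    a, b = a.strip().lower(), b.strip().lower()
    return (WEIGHTS["jw"] * jaro_winkler(a, b)
            + WEIGHTS["lev"] * levenshtein_sim(a, b)
            + WEIGHTS["cos"] * bigram_cosine(a, b))


def classify(a: str, b: str) -> str:
    """Crisp three-way decision standing in for the fuzzy inference step."""
    score = weighted_score(a, b)
    if score >= MATCH_CUTOFF:
        return "match"
    if score >= PROBABLE_CUTOFF:
        return "probable match"
    return "non-match"
```

In use, `classify("Jonathan Smith", "jonathan smith ")` returns `"match"`, because normalization makes the two strings identical. A real deployment would tune the weights and cut-offs on labeled record pairs and compare several fields (name, address, date of birth) rather than a single string.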

Issue

Vol. 4 No. 1 (2026)

Section

Articles

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
