Linking the Comparison and Graphical Approaches to Bipartite Matching

Redivo, Edoardo

doi:10.1111/insr.70038

Bipartite record linkage aims to identify observations that refer to the same individual, called coreferent observations, across two distinct datasets, each free of duplicates. The two main approaches to this task are the Fellegi–Sunter model, which relies on pairwise comparisons of observations, and the graphical record linkage model, which directly models the data and groups together coreferent observations. In this paper, we investigate the similarities between these two methods. We show that both models can be expressed in terms of a latent binary matrix indicating coreferent record pairs, that they can be framed as particular latent class analysis models, and that they admit a direct relationship between their parameters under a common data model. Moreover, we propose a unified estimation framework based on a classification expectation–maximization algorithm. The proposed estimation method properly incorporates the problem constraints while still allowing for a computationally efficient implementation. It also allows the same distributional assumptions on the linkage distribution to be used interchangeably between the two models. Empirical results using the proposed estimation method demonstrate satisfactory and mostly equivalent performance for the two models, both on simulations and on a real dataset commonly used as a benchmark for record linkage.
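
As a rough illustration of the two objects named in the abstract, the sketch below builds Fellegi–Sunter-style binary comparison vectors for every cross-file record pair and checks that a candidate latent binary matrix respects the bipartite constraint (at most one match per record in either file). It is a minimal sketch and not the paper's implementation: the function names `comparison_vectors` and `is_valid_bipartite_matching`, the exact-agreement comparison rule, and the toy records are all assumptions made here for illustration.

```python
from itertools import product


def comparison_vectors(file_a, file_b):
    """Binary agreement vector for every cross-file record pair.

    Assumption: a field "agrees" only on exact equality; real applications
    often use graded or string-similarity comparisons instead.
    """
    return {
        (i, j): tuple(int(a == b) for a, b in zip(rec_a, rec_b))
        for (i, rec_a), (j, rec_b) in product(enumerate(file_a), enumerate(file_b))
    }


def is_valid_bipartite_matching(z):
    """Check the bipartite constraint on a latent binary matrix.

    z[i][j] == 1 means record i of file A and record j of file B are
    coreferent. Because neither file contains duplicates, every row and
    every column may contain at most one 1.
    """
    rows_ok = all(sum(row) <= 1 for row in z)
    cols_ok = all(sum(col) <= 1 for col in zip(*z))
    return rows_ok and cols_ok


# Hypothetical toy records: (first name, surname, birth year).
file_a = [("anna", "rossi", 1980), ("luca", "bianchi", 1975)]
file_b = [("luca", "bianchi", 1975), ("anna", "rosi", 1980), ("marco", "verdi", 1990)]

gamma = comparison_vectors(file_a, file_b)

# Candidate linkage: A[0] <-> B[1] and A[1] <-> B[0]; B[2] stays unlinked.
z = [[0, 1, 0],
     [1, 0, 0]]

print(gamma[(0, 1)], gamma[(1, 0)], is_valid_bipartite_matching(z))
```

In this toy setting the comparison vectors are the input a comparison-based model would work with, while the matrix `z` plays the role of the latent linkage structure that both approaches can be expressed through; any estimation step that updates `z` would need to keep it within the constraint checked above.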

Redivo, E. (2026). Linking the Comparison and Graphical Approaches to Bipartite Matching. INTERNATIONAL STATISTICAL REVIEW, NA, 1-26 [10.1111/insr.70038].