What is “Referential Matching”? A Primer

What is patient matching? 

Patient matching is what enables healthcare organizations to link all of a patient’s data together—or to put it another way, to link all of their health data to the correct patients. While patient matching can happen across organizations (such as finding and sending the correct patient’s clinical data during health information exchange), it is typically talked about in the context of a single organization linking each patient’s data across its many facilities and technology systems—from its billing system, to clinical data in its various EHRs, to its labs and imaging and pharmacy management systems. More recently, consumer data, including CRM data, marketing data, and third-party data, has become an increasingly important piece of the puzzle. 

Why is patient matching important? 

Inaccurate patient matching has far-reaching consequences throughout a healthcare organization. It impacts: 

  • Quality of care: On average, 18% of a health system’s medical records are duplicates [1] (records that belong to the same patient but have not been linked), which means that the health histories of nearly one in five patients are incomplete at the point of care. 
  • Patient safety: 86% of nurses, physicians, and IT practitioners say they have witnessed or know of a medical error that was the result of patient misidentification. [2]
  • Revenue cycle: One third of denied claims are due to inaccurate patient matching, costing the average hospital $1.5M annually. [1]
  • Reputation: 88% of consumers directly blame the provider for their dissatisfaction with the lack of portability of their health records. [1]
  • Strategy: Inaccurate matching diminishes quality performance and inhibits value-based care and patient engagement initiatives. 
  • Business and labor efficiency: To overcome inaccurate patient matching, organizations often invest in large teams that manually review and merge patient data that has not been accurately linked. 
  • Patient outreach and engagement: Reaching new populations and engaging existing patients requires knowledge not only about their health but also their contact preferences, home situation, economic factors, and existing provider relationships. 

How does patient matching actually happen? 

Patient matching is achieved by technologies, people, and processes. For example, it is essential that registration staff look for an existing medical record for a patient in order to prevent a duplicate record from being created. But if a duplicate record is created anyway (for example, if the registration staff can’t find an existing medical record for a patient because the patient has changed her last name and address after a recent marriage), then it is essential that the provider has technologies and processes in place to discover that a duplicate has been created and to correct it. 

The foundational technologies that match and link all of a patient’s data together are called master patient index (MPI) technologies. Just about every EHR has one built in, and its job is to automatically link all of a patient’s data across the EHR and to flag any duplicate records that may have been created. Some organizations will also invest in enterprise master patient index (EMPI) technologies that help link each patient’s data not just within an EHR, but also across multiple EHRs, facilities, and other technology systems. 

What is provider matching? 

Provider data includes both healthcare practitioner and healthcare organization data. Similar to patient matching, it is important to have all available data about one practitioner linked together in one place. Is the Dr. Jane Smith providing care at a family care location the same person as Dr. J. Smith who has privileges at a hospital in the same network? Knowing this information is vital to managing providers and offering comprehensive care to patients. In addition to names, addresses, and contact information, data used to identify and match practitioners can include National Provider Identifier (NPI) numbers, state license IDs, Drug Enforcement Administration (DEA) IDs, and other Medicare and Medicaid IDs. Data used to manage organizations can include legal names of the entities, addresses, NPI numbers, and Tax ID numbers (TIN). 
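
As a rough illustration of why a shared identifier such as the NPI is useful, the sketch below groups practitioner records by NPI so that "Dr. Jane Smith" and "Dr. J. Smith" end up linked. The record layout, the `npi` and `site` field names, and the sample NPI value are assumptions made for this example only; real provider matching would also weigh names, licenses, and addresses.

```python
from collections import defaultdict


def link_providers_by_npi(records: list[dict]) -> dict[str, list[dict]]:
    """Group practitioner records that share the same NPI (hypothetical record layout)."""
    linked = defaultdict(list)
    for record in records:
        npi = record.get("npi")
        if npi:  # records without an NPI are left for other matching methods
            linked[npi].append(record)
    return dict(linked)


# Placeholder data: the NPI below is a made-up 10-digit value.
records = [
    {"name": "Dr. Jane Smith", "site": "family care", "npi": "1234567890"},
    {"name": "Dr. J. Smith", "site": "hospital", "npi": "1234567890"},
    {"name": "Dr. John Roe", "site": "hospital", "npi": None},
]
linked = link_providers_by_npi(records)
```

Here both Smith records land under the same key, while the record without an NPI is simply skipped rather than guessed at.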

How do MPI and EMPI technologies work? 

All MPI and EMPI technologies use fundamentally the same approach to patient matching. This is true whether we’re talking about the built-in MPI module in an EHR from Epic®, Cerner®, or another vendor, or whether we’re talking about an EMPI product from a vendor like IBM®. 

Basically, all conventional patient matching technologies use algorithms to compare the demographic data (name, address, birthdate, etc.) from two patient records to determine if those records belong to the same person—in other words, if they match. If the demographic data is the same or very close, the technology determines that the records match. 

The most basic of these algorithms are called “deterministic” algorithms, and they look for a perfect match between each data element in order to determine that two records belong to the same person. 
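
A minimal sketch of the deterministic idea follows. The `PatientRecord` fields and the `normalize` helper are assumptions chosen for illustration and are not taken from any particular MPI product; the point is only that every compared field must agree exactly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical demographic fields, for illustration only.
    first_name: str
    last_name: str
    birthdate: str  # ISO format, e.g. "1985-03-14"
    address: str


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(value.lower().split())


def deterministic_match(a: PatientRecord, b: PatientRecord) -> bool:
    """Return True only if every compared field is identical after normalization."""
    fields = ("first_name", "last_name", "birthdate", "address")
    return all(normalize(getattr(a, f)) == normalize(getattr(b, f)) for f in fields)
```

Under this rule a patient who changes her last name or address no longer matches her own earlier record, which is the weakness the rest of this primer returns to.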

The most sophisticated of these algorithms are called “probabilistic” algorithms, and they use statistics, weights, thresholds, rules, and complicated math to calculate the probability that two records belong to the same person. Because of this, probabilistic algorithms can overcome minor data errors like misspelled names and mistyped birthdates, and they can understand that two records with the last name “Rumpelstiltskin” are more likely to belong to the same person than two with the last name “Smith.” 
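
The sketch below shows one common way such scoring can be set up: each field contributes a log-likelihood weight, and the total is compared with thresholds. All probabilities, surname frequencies, and thresholds here are invented for illustration and are not calibrated values from any vendor's product; the fuzzy comparison uses Python's standard `difflib`.

```python
import math
from difflib import SequenceMatcher

# Assumed chance that a field agrees when two records are truly the same person.
M_PROB = {"last_name": 0.95, "first_name": 0.95, "birthdate": 0.97}
# Assumed chance that a field agrees by coincidence for two different people.
U_PROB = {"first_name": 0.01, "birthdate": 0.001}
# Assumed surname frequencies: a rare surname agreeing is stronger evidence.
SURNAME_FREQ = {"smith": 0.01, "rumpelstiltskin": 0.000001}
DEFAULT_SURNAME_FREQ = 0.0005


def agrees(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two values as agreeing if they are similar enough to tolerate small typos."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def match_score(rec_a: dict, rec_b: dict) -> float:
    """Sum per-field weights: positive when a field agrees, negative when it does not."""
    score = 0.0
    for field, m in M_PROB.items():
        if field == "last_name":
            # Simplification: look up the frequency of the first record's surname.
            u = SURNAME_FREQ.get(rec_a[field].lower(), DEFAULT_SURNAME_FREQ)
        else:
            u = U_PROB[field]
        if agrees(rec_a[field], rec_b[field]):
            score += math.log2(m / u)
        else:
            score += math.log2((1 - m) / (1 - u))
    return score


def classify(score: float, upper: float = 20.0, lower: float = 10.0) -> str:
    """Map a score to a decision; the middle band goes to manual review."""
    if score >= upper:
        return "match"
    if score >= lower:
        return "review"
    return "non-match"
```

With these assumed numbers, two fully agreeing "Rumpelstiltskin" records score higher than two fully agreeing "Smith" records, because agreement on a rare surname is less likely to be a coincidence.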

What is wrong with existing MPI and EMPI technologies? 

Even the most sophisticated probabilistic algorithms found in today’s best-in-class MPI and EMPI technologies have been around in one form or another since the 1970s, and they have seen little innovation since then. Because of this, they are struggling in today’s increasingly complex healthcare ecosystem, where there is much more patient data that must be linked across increasingly large provider organizations, which in turn must use this data to provide much more valuable and cost-effective care. 

Consider these matches that no conventional MPI or EMPI technology could reliably make: 

  • Out-of-date data: An existing medical record contains a patient’s old address and maiden name, and a newly created duplicate contains that same patient’s current address and married name. 
  • Sparse data: A lab record contains very little demographic data for a patient, such as just the patient’s name and birthdate. 
  • Different data attributes: One medical record from a hospital contains a patient’s name, address, and SSN, while a record from a clinic owned by the hospital contains that patient’s name, phone number, and birthdate. 

How is Referential Matching different? 

Referential Matching is a completely different approach to patient matching technology. Rather than directly comparing the demographic data from two patient records to see if they match, Referential Matching technologies instead match the demographic data from each record to a comprehensive, continuously updated, and highly curated reference database of identities. Verato Carbon, Verato’s referential database, contains identities spanning the entire U.S. population, and each identity contains a complete profile of demographic data spanning a 30-year history. Verato Carbon also contains 9 million identities of healthcare practitioners, including licensing information. Such databases are essentially pre-built answer keys for all patient and provider data. 

By matching records to a reference database instead of to each other, Referential Matching technologies can make matches that deterministic and probabilistic algorithms could never make, even for patient records containing demographic data that is out of date, incomplete, incorrect, or different. 
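
The idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not Verato's implementation: the `ReferenceIdentity` structure, the `resolve` rule (a known name plus one corroborating attribute), and all sample data are hypothetical. What it shows is that two records with no overlapping values can still be linked when both resolve to the same reference identity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReferenceIdentity:
    # Hypothetical reference entry holding current and historical demographics.
    identity_id: str
    names: frozenset
    addresses: frozenset
    phones: frozenset
    birthdate: str
    ssn: str


def resolve(record: dict, reference_db: list) -> Optional[str]:
    """Return the ID of the first reference identity the record is consistent with.

    Assumed rule: the name must be a known name for the identity, and at least
    one other attribute must corroborate it. A real system would score candidates.
    """
    for identity in reference_db:
        if record.get("name") not in identity.names:
            continue
        corroborated = (
            record.get("address") in identity.addresses
            or record.get("phone") in identity.phones
            or record.get("birthdate") == identity.birthdate
            or record.get("ssn") == identity.ssn
        )
        if corroborated:
            return identity.identity_id
    return None


def same_person(a: dict, b: dict, reference_db: list) -> bool:
    """Two records match if both resolve to the same reference identity."""
    id_a, id_b = resolve(a, reference_db), resolve(b, reference_db)
    return id_a is not None and id_a == id_b


# Made-up reference entry covering a maiden name and a previous address.
reference_db = [
    ReferenceIdentity(
        identity_id="ref-001",
        names=frozenset({"jane doe", "jane smith"}),
        addresses=frozenset({"12 oak st", "98 elm ave"}),
        phones=frozenset({"555-0100"}),
        birthdate="1985-03-14",
        ssn="000-00-0000",
    )
]
old_record = {"name": "jane doe", "address": "12 oak st"}
new_record = {"name": "jane smith", "address": "98 elm ave"}
```

In this toy data, `old_record` and `new_record` share no field values, so a direct comparison has nothing to agree on, yet `same_person(old_record, new_record, reference_db)` links them through the reference entry.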

Importantly, Referential Matching isn’t simply a better algorithm. It is a completely new approach that represents a quantum leap in accuracy over existing matching technologies and the next generation of patient matching technology. 

Just as importantly, because of the nature of Referential Matching technologies, they are only available as cloud-based software-as-a-service (SaaS) solutions—but this means they are typically cost-effective, and often are highly scalable and secured in HIPAA-compliant and HITRUST-certified cloud infrastructures. This also means they can be accessed using modern APIs. 

Referential Matching is such a powerful new technology that The Pew Charitable Trusts, in its landmark patient matching report “Enhanced Patient Matching Is Critical to Achieving Full Promise of Digital Health Records,” highlighted Referential Matching technology as one of four opportunities to improve patient matching nationwide, saying, “healthcare organizations—including health information networks—should consider incorporating referential matching into their processes given that this approach has generated among the highest match rates currently published.” 

How can Referential Matching be used? 

Referential Matching technologies can typically be used in five ways: 

  1. X-Ray: Organizations can gain an “X-Ray” of their EHR or EMPI technology using a cloud-based Referential Matching service. This X-Ray will give the organization insight into how many duplicate records are in the EHR or EMPI, and how many matches the EHR or EMPI has missed. It can also give insight into what kinds of data issues are causing the duplicate records and missed matches to occur. 
  2. Automated improvement: Organizations can “plug in” a cloud-based Referential Matching service to their EHR or EMPI to improve that technology’s patient matching accuracy and reduce its duplicate records. For example, a Referential Matching service could automatically find and resolve an EHR’s or EMPI’s duplicate records and missed matches, even the toughest matches that the EHR or EMPI has flagged as “potential duplicate records” requiring manual review. Such plug-ins work with any technology system from any vendor, like Epic®, Cerner®, eClinicalWorks®, Allscripts®, IBM®, Mirth®, and others. 
  3. Referential Matching EMPI: Organizations can deploy a cloud-based EMPI that uses Referential Matching to achieve the highest levels of patient matching accuracy across the organization’s many facilities, EHRs, and other systems. 
  4. Statewide or Nationwide MPI: Referential Matching technology is the only patient matching technology that can scale to the statewide or nationwide level without making great sacrifices in accuracy. Because of this, it is the only technology that can be the backbone of a statewide or nationwide master patient index. 
  5. Inter-organizational interoperability: Referential Matching technology is the only patient matching technology that allows organizations to automatically discover common patients they share with each other, because each organization’s patients are matched to the same set of universal and unique reference identities. Because of this, it is also the only patient matching technology that can be used as a “cross-reference” for each organization’s unique patient identifiers, allowing providers to discover common patients even if one provider uses a fingerprint biometric to identify its patients, another uses an iris scan, and a third uses driver’s license numbers. (This is all assuming, of course, that each organization already has the proper privacy, exchange, and patient consent agreements in place.) 
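
The “cross-reference” idea in the last item above can be sketched as a simple crosswalk. The organization names, local identifier formats, and reference IDs below are all hypothetical, and the sketch assumes each organization has already resolved its own patients to reference identities and holds one local identifier per identity.

```python
from collections import defaultdict


def find_common_patients(org_crosswalks: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Find reference identities that appear in more than one organization.

    org_crosswalks maps organization name -> {local patient ID -> reference ID}.
    Returns reference ID -> {organization name -> local patient ID}.
    """
    by_reference = defaultdict(dict)
    for org, crosswalk in org_crosswalks.items():
        for local_id, reference_id in crosswalk.items():
            # Assumes one local ID per reference identity within an organization.
            by_reference[reference_id][org] = local_id
    return {ref: orgs for ref, orgs in by_reference.items() if len(orgs) > 1}


# Made-up crosswalks: each organization uses a different kind of local identifier.
crosswalks = {
    "hospital_a": {"fingerprint-17": "ref-001", "fingerprint-18": "ref-002"},
    "clinic_b": {"iris-9": "ref-001"},
    "lab_c": {"license-55": "ref-003"},
}
common = find_common_patients(crosswalks)
```

Because the comparison happens on reference IDs, the organizations never need to reconcile their fingerprint, iris, or license-number schemes with one another directly.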

Does anyone currently use Referential Matching technology? 

Some of the largest providers, payers, and health information exchanges (HIEs) in the country are already using Referential Matching technology to manage, match, and link their patient data with unprecedented ease and accuracy—either by gaining an X-Ray of their existing EHR’s or EMPI’s matching accuracy, by improving their EHR’s or EMPI’s matching and reducing its duplicate records, by leveraging a Referential Matching EMPI to link their patient data across their enterprise with the highest accuracy rates, or by deploying a statewide Referential Matching MPI. 


[1] Black Book Market Research, Mid-Year EHR Consumer Satisfaction Survey, 2018 

[2] Ponemon Institute, National Patient Misidentification Report, 2016