Sep 23, 2019 | Jim Shalaby
The idea of combining clinical data with financial/claims data to improve analytics is not new. The biggest barrier has typically been access to the clinical data, because of both policy/security and technical challenges: stringent regulatory and institutional policy barriers, proprietary EMR formats, lag in adoption of standards, etc. However, over the past 6 years or so, we’ve seen a steady increase in clinical data usage due to the adoption of interoperability standards (e.g., CDA, FHIR) as well as maturing data access policies and regulations. Having access to a richer set of data also comes with some new challenges; reconciliation and the added value/usefulness of clinical data are at the top of the list.
How is the Addition of Clinical Data to Claims Data Useful?
At a high level, it would seem obvious that clinical data is more complete and is the raw source for all claims data. Having access to clinical data, in theory, would obviate the need for claims data. However, some of the most difficult challenges with the use of clinical data alone today are inconsistency and loss of context.
Data inconsistency is partially attributable to the large range of variability we see when looking at sources such as C-CDA/CCD or FHIR (whether the data is accessed directly or through aggregation in HIEs, data warehouses or registries). The reasons range from lack of proper coding and structure (improper use of the standard) to the wide spectrum of semantic variability, even when conforming to standards (many valid ways of saying the same thing). One example of structural inconsistency is medication entry in a CDA document. Meds can be found in the narrative section and in the coded entry section. This makes it very difficult to trust that a medication list is complete without having to parse or process the free text section of a CDA. As for semantic variability, the representation of blood pressure is a good example. Believe it or not, there are many variables to consider when normalizing around a standard set of blood pressure measurements: the measurements themselves (systolic and diastolic); the position (sitting, standing, lying down); the exertional state (at rest, not at rest); and the body site (upper left arm, lower left arm, upper right arm, lower right arm). All of these variables contribute to semantic variability.
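To make the blood pressure example concrete, here is a minimal sketch of a semantic "normalizer" in Python. The field names, the synonym tables and the `normalize_bp` function are all hypothetical and chosen only for illustration; a real implementation would map against coded vocabularies rather than free-text strings.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical synonym tables (assumptions for illustration only).
POSITION_SYNONYMS = {
    "sitting": "sitting", "seated": "sitting",
    "standing": "standing",
    "lying down": "lying", "lying": "lying", "supine": "lying",
}
EXERTION_SYNONYMS = {
    "at rest": True, "resting": True,
    "not at rest": False, "post-exercise": False,
}
SITE_SYNONYMS = {
    "upper left arm": "left upper arm", "l upper arm": "left upper arm",
    "upper right arm": "right upper arm", "r upper arm": "right upper arm",
    "lower left arm": "left lower arm", "lower right arm": "right lower arm",
}


@dataclass(frozen=True)
class BloodPressure:
    systolic: int
    diastolic: int
    position: Optional[str]   # None when the source value is missing or unknown
    at_rest: Optional[bool]
    body_site: Optional[str]


def _lookup(table: dict, value) -> Optional[object]:
    """Case-insensitive lookup that returns None for unrecognized values."""
    return table.get(str(value or "").strip().lower())


def normalize_bp(raw: dict) -> BloodPressure:
    """Map one source-specific reading onto a single canonical shape."""
    # Some sources send "120/80"; others send separate components.
    if "value" in raw:
        systolic, diastolic = raw["value"].split("/")
    else:
        systolic, diastolic = raw["systolic"], raw["diastolic"]
    return BloodPressure(
        systolic=int(systolic),
        diastolic=int(diastolic),
        position=_lookup(POSITION_SYNONYMS, raw.get("position")),
        at_rest=_lookup(EXERTION_SYNONYMS, raw.get("exertion")),
        body_site=_lookup(SITE_SYNONYMS, raw.get("site")),
    )
```

Two readings that say the same thing in different ways, such as `{"value": "120/80", "position": "Seated"}` and `{"systolic": 120, "diastolic": 80, "position": "sitting"}`, normalize to equal `BloodPressure` values. Unrecognized terms become `None` rather than being guessed at, so downstream rules can tell "unknown" apart from a real value.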
Loss of context occurs when information about where the data came from is not captured. It can be very challenging since it can sometimes be related to where in a clinical workflow the data was captured or may require complex rules to recreate the proper context. Here are some examples of different context losses:
- A simple context is understanding whether a procedure was actually performed or only planned. In claims data, we can infer that a procedure was performed if it was billed (most of the time). However, in clinical data, the source could be a claims record of a performed procedure, a scheduling system, or a clinical care plan (the latter two both found in typical EMRs), and it would be very difficult to differentiate among them if that context was not captured.
- A more complex context is trying to compute medication adherence for complex regimens such as multi-drug chemotherapy or HIV regimens. Understanding which regimen a patient is on as well as which step in the regimen (e.g., which cycle in chemotherapy) can be very challenging if that context is not explicitly captured. In fact, that is a huge challenge with just using claims data without clinical data.
- Contextual loss can occur in workflow as well. An example is the reporting of a medication drug level. The reporting lab will state it as a peak, trough or random level, but sometimes that information is not accurate or present. Applying rules to identify the time at which the blood was drawn vs. the time a medication was administered is the most reliable way to determine context.
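The drug level example can be sketched as a simple timing rule. The snippet below is an illustration only: the `classify_drug_level` function and, in particular, the time windows are assumptions made for this sketch. Real windows depend on the specific drug and route of administration and would need clinical input.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# Assumed, drug-agnostic windows for illustration only.
DEFAULT_TROUGH_WINDOW = timedelta(minutes=30)                    # before the next dose
DEFAULT_PEAK_WINDOW = (timedelta(minutes=30), timedelta(hours=2))  # after the last dose


def classify_drug_level(
    drawn_at: datetime,
    last_dose_at: Optional[datetime],
    next_dose_at: Optional[datetime],
    trough_window: timedelta = DEFAULT_TROUGH_WINDOW,
    peak_window: Tuple[timedelta, timedelta] = DEFAULT_PEAK_WINDOW,
) -> str:
    """Infer peak/trough/random from draw time vs. administration times."""
    # Trough: blood drawn shortly before the next scheduled dose.
    if next_dose_at is not None:
        until_next = next_dose_at - drawn_at
        if timedelta(0) <= until_next <= trough_window:
            return "trough"
    # Peak: blood drawn within the expected window after the last dose.
    if last_dose_at is not None:
        since_last = drawn_at - last_dose_at
        if peak_window[0] <= since_last <= peak_window[1]:
            return "peak"
    # Anything else cannot be tied to a dose and is treated as random.
    return "random"
```

The inferred label can then be compared with whatever the reporting lab stated, and disagreements flagged for review rather than silently overwritten.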
Value of Combining Clinical and Claims Data
Even with these challenges, there is clear value in combining clinical data with claims data. We can think of clinical data as providing context to claims data and claims data as providing confirmation to clinical data. This is especially true when thinking of claims data as confirming what has actually been done or performed and clinical data as representing what was planned. Some good examples are performed procedures, medication adherence and problems addressed in an encounter:
- With performed procedures such as an appendectomy, clinical data reveals the plan for an appendectomy through scheduling data. However, knowing whether an appendectomy was actually performed requires a much deeper analysis of the health record. By reconciling clinical plans with claims data we get a better picture of what was actually performed (for example, the scheduled procedure may have been an appendectomy, but the surgeon may have had to do a partial colon resection).
- With medication adherence, the clinical data shows what the physician had ordered/prescribed but the claims data shows what the pharmacy actually dispensed. Conversely, if a patient is on a complex chemotherapy regimen, claims data alone is typically not enough to know with certainty which regimen is being used. The clinical data can confirm the regimen which in turn would make it possible to calculate adherence of the self-administered drug components of the regimen.
- With problems addressed in an encounter, a patient may have 10 or more coded/structured problems on their problem list, but in any one encounter, only some subset is likely addressed. Without having to do NLP on narrative encounter notes or depend upon coded data capture from clinicians, the billing diagnoses from claims data can indicate which problems in the clinical record were actually addressed, and when.
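The performed-procedure example can be sketched as a small matching rule between scheduling data and claim lines. Everything below is hypothetical: the record shapes, the `reconcile_procedure` function, the one-day matching window, and the placeholder procedure labels (real data would carry standard procedure codes).

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass
class ScheduledProcedure:      # from the clinical/scheduling side
    patient_id: str
    code: str
    scheduled_on: date


@dataclass
class ClaimLine:               # from the claims side
    patient_id: str
    code: str
    service_date: date


def reconcile_procedure(
    plan: ScheduledProcedure,
    claims: List[ClaimLine],
    window_days: int = 1,      # assumed tolerance between schedule and service date
) -> Tuple[str, Optional[ClaimLine]]:
    """Use claims to confirm (or correct) what the clinical plan says."""
    window = timedelta(days=window_days)
    nearby = [
        c for c in claims
        if c.patient_id == plan.patient_id
        and abs(c.service_date - plan.scheduled_on) <= window
    ]
    # Same code billed near the scheduled date: the plan was carried out.
    for claim in nearby:
        if claim.code == plan.code:
            return "performed_as_planned", claim
    # Something else was billed instead: the plan changed.
    if nearby:
        return "different_procedure_billed", nearby[0]
    # Nothing billed: the plan alone does not show the procedure happened.
    return "no_claim_found", None
```

For a plan coded `"APPENDECTOMY"` and a claim line coded `"PARTIAL_COLECTOMY"` on the same day for the same patient, the function returns `"different_procedure_billed"` along with that claim line, which mirrors the appendectomy scenario described above.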
Enabling Reconciliation
Combining clinical and claims data requires a more complex set of data reconciliation rules. The medication adherence and performed procedures discussions above are examples. The value that reconciliation yields is a much richer picture at both the patient and population levels. If we think of claims data as echoes or shadows of detailed clinical encounters, with the added benefit of clarifying what was actually performed on a patient, the value becomes apparent.
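As one illustration of such a rule, the sketch below reconciles a prescription from the clinical record with pharmacy dispensing claims. Adherence is expressed here as proportion of days covered (PDC), a commonly used metric; this article does not prescribe a particular measure, so that choice, along with the dictionary field names and function names, is an assumption.

```python
from datetime import date, timedelta
from typing import Iterable, List, Tuple


def proportion_of_days_covered(
    fills: Iterable[Tuple[date, int]],   # (fill_date, days_supply) from claims
    period_start: date,
    period_end: date,
) -> float:
    """Fraction of days in the period covered by at least one fill."""
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            # Count each calendar day once, and only inside the period.
            if period_start <= day <= period_end:
                covered.add(day)
    total_days = (period_end - period_start).days + 1
    return len(covered) / total_days


def adherence_for_order(order: dict, pharmacy_claims: List[dict]) -> float:
    """Clinical order says what was prescribed; claims say what was dispensed."""
    fills = [
        (c["fill_date"], c["days_supply"])
        for c in pharmacy_claims
        if c["drug"] == order["drug"]    # hypothetical normalized drug name
    ]
    return proportion_of_days_covered(fills, order["start"], order["end"])
```

The clinical order supplies the context (which drug, over which period) and the claims supply the confirmation (what was actually dispensed). Matching on a drug name is only as good as the semantic normalization applied to both sources beforehand.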
In summary, there’s a great deal of value in combining clinical and claims data, especially given today’s state of clinical data. With richer data comes the need for “helper” services ranging from semantic “normalizers” to reconciliators (supported by the former). As the inconsistency and context issues with clinical data get addressed, the need for rules to determine context or perform reconciliation will not go away; the rules will just become more sophisticated and less error-prone. With the barriers to accessing clinical data quickly disappearing, we’re now at a point where claims data alone is no longer sufficient to gain accurate clinical insights.