Randomized controlled trials (RCTs) are considered the gold-standard study design for comparative effectiveness research, which involves directly comparing the effectiveness of one treatment to another. Despite their many benefits, RCTs have important limitations that can reduce their utility for certain types of comparative effectiveness research and limit the external validity of their findings. For this reason, real-world evidence—data about outcomes in actual patients who are receiving a treatment in a usual care setting—is gaining traction as a key source of evidence for comparative effectiveness research.
In this post, we review how evidence generated by RCTs compares to real-world evidence, discuss when and why these two study types may yield different outcomes for comparative effectiveness analyses, and examine why real-world evidence is particularly useful as a complement to RCT data.
Efficacy Versus Effectiveness
The US health care system’s shift from volume- to value-based payment models has heightened stakeholders’ focus on measuring care value accurately and reliably. Care value is, fundamentally, a function of quality and costs of care for real-world patients, not patients in clinical trials. Thus, policy makers, payers, and health system leaders have grown increasingly interested in finding ways to estimate reliably how given treatments will affect care quality and spending in real-world patient populations.
Both RCTs and real-world evidence can help define a treatment’s absolute value and its value relative to alternative interventions, but they do so in different ways. RCTs are the gold-standard study design for evaluating treatment efficacy—how a drug performs in a controlled clinical environment. Stakeholders have traditionally used treatment efficacy data to support new drug development and approval, as well as policy and clinical decision making for new drugs and devices.
Treatment effectiveness can be thought of as a treatment’s effect on outcomes in real-world clinical practice and is typically assessed using data from retrospective or prospective cohort studies of patients who are receiving a treatment. The difference between a treatment’s efficacy and its effectiveness, called the “efficacy-effectiveness gap,” reflects the difference between a treatment’s health effects in RCTs and real-world clinical practice.
What Creates The Efficacy-Effectiveness Gap?
In some situations, the efficacy-effectiveness gap may be quite small; in others, however, it is large. To illuminate the circumstances in which real-world evidence may be most helpful for assessing a treatment's comparative effectiveness, it is important to understand the factors that create the efficacy-effectiveness gap and determine its size, that is, the circumstances under which RCT data and real-world evidence diverge.
Characteristics Of The Treated Disease
The efficacy-effectiveness gap may be largest for treatments whose benefits manifest over many years. For these treatments, RCTs, which rarely last longer than five to seven years, may fail to capture a treatment's full long-term benefits. Many long-term maintenance therapies for chronic diseases, such as diabetes, fall into this category because they continue to produce benefits a decade or more after treatment is initiated. In contrast, the efficacy-effectiveness gap is likely to be smaller for conditions such as hepatitis C infection, where treatment courses are short and the clinical benefits of adequate therapy manifest rapidly.
Patients’ Clinical Characteristics
Patients included in RCTs are usually healthier, younger, and more homogeneous than the individuals who ultimately receive a treatment in the real world. For this reason, the results of RCTs may not generalize to a broader group of patients, and the efficacy-effectiveness gap may be particularly large for older and more severely ill patient populations. There is ample evidence to support this assertion. For example, rates of hypoglycemia in RCTs of new treatments for type 2 diabetes mellitus are lower than the rates observed among real-world patients who are eligible for, and receiving, the same treatments. One potential explanation for this difference is that certain comorbidities, including chronic kidney disease and congestive heart failure, are both more common in real-world populations than in RCT populations and are associated with a higher risk of severe hypoglycemia. In a parallel example, asthmatic smokers, who have a higher risk of severe asthma exacerbations than asthmatic non-smokers, are typically excluded from RCTs of asthma treatments.
One critical consequence of enrolling healthier patients (or patients with fewer comorbidities) in RCTs is that adverse clinical outcomes often occur less frequently in RCTs than they do in the real world. The less frequently these outcomes occur, the fewer events a trial accrues and the harder it becomes to demonstrate statistically that a treatment prevents them. RCTs may therefore underestimate a treatment's ability to prevent adverse outcomes in real-world patient populations.
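To make the arithmetic behind this point concrete, the short Python sketch below applies the standard two-proportion sample-size approximation. The baseline event rates, relative risk reduction, power, and significance level are illustrative assumptions, not figures from any particular trial.

    # Why rarer outcomes make treatment benefits harder to demonstrate:
    # standard two-proportion sample-size approximation.
    Z_ALPHA = 1.96   # two-sided significance level of 0.05
    Z_BETA = 0.84    # statistical power of 0.80

    def n_per_arm(control_rate, relative_risk_reduction):
        """Approximate number of participants needed per arm to detect the effect."""
        treated_rate = control_rate * (1 - relative_risk_reduction)
        variance = control_rate * (1 - control_rate) + treated_rate * (1 - treated_rate)
        effect = control_rate - treated_rate
        return round((Z_ALPHA + Z_BETA) ** 2 * variance / effect ** 2)

    # The same 25 percent relative risk reduction requires roughly three times
    # as many patients when the baseline event rate falls from 15 to 5 percent.
    for rate in (0.15, 0.05):
        print(f"event rate {rate:.0%}: ~{n_per_arm(rate, 0.25)} patients per arm")

Under these assumptions, a trial population with one-third the event rate of a real-world population needs roughly three times the enrollment to demonstrate the same relative benefit, which is one reason uncommon adverse outcomes are difficult to study within RCTs.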
Patients’ Sociodemographic Characteristics
Because socioeconomically disadvantaged patients, racial and ethnic minorities, the elderly, and patients with limited health literacy are frequently underrepresented in RCTs, estimates of treatment efficacy derived from RCTs also may not generalize to these patient populations. An analysis of pivotal RCTs of novel therapeutics approved by the Food and Drug Administration (FDA) between 2011 and 2013 found that racial and ethnic minorities were consistently underrepresented in these trials. Underrepresentation of these patient cohorts in RCTs introduces substantial uncertainty into estimates of the efficacy-effectiveness gap for them.
Methodological Differences And Treatment Adherence
RCTs prioritize internal validity, or the ability to characterize accurately and precisely how an intervention affects one or more outcomes among study participants. As a result, they often lack generalizability, or external validity. Real-world studies may aim for greater generalizability but typically have less internal validity than RCTs. For example, selection bias is a critical issue in real-world studies because patients are not randomized to treatment, and this lack of randomization can lead to treatment effectiveness being either under- or overestimated. If a treatment is commonly used in extremely sick patients whose disease has progressed beyond the point at which treatment can improve outcomes, then a study of outcomes among patients who receive this treatment as part of routine clinical care would likely understate its impact. Alternatively, effectiveness estimates derived from a study of patients who stand to benefit greatly from treatment and have high rates of treatment adherence might not generalize to populations of high-risk patients with lower rates of adherence.
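To illustrate how this kind of confounding by indication can distort a naive real-world comparison, here is a small simulation sketch in Python. The severity distribution, treatment-assignment rule, and effect size are hypothetical assumptions chosen only to show the direction of the bias, not estimates from any real study.

    import random

    random.seed(0)

    # Toy simulation of confounding by indication: sicker patients are more likely
    # to receive treatment, so a naive treated-versus-untreated comparison hides
    # the drug's true benefit. All parameters are illustrative assumptions.
    TRUE_BENEFIT = -0.10   # treatment lowers the probability of a bad outcome by 10 points

    treated_outcomes, untreated_outcomes = [], []
    for _ in range(200_000):
        severity = random.random()                          # 0 = mild, 1 = severe
        treated = random.random() < 0.2 + 0.6 * severity    # sicker patients treated more often
        risk = 0.1 + 0.5 * severity + (TRUE_BENEFIT if treated else 0.0)
        outcome = random.random() < risk                    # True = bad outcome occurred
        (treated_outcomes if treated else untreated_outcomes).append(outcome)

    naive_difference = (sum(treated_outcomes) / len(treated_outcomes)
                        - sum(untreated_outcomes) / len(untreated_outcomes))
    print(f"true effect: {TRUE_BENEFIT:+.2f}  naive observed effect: {naive_difference:+.2f}")

In this contrived example the naive comparison suggests the treatment does roughly nothing, even though it prevents one bad outcome for every ten patients treated; randomization, or careful adjustment for disease severity, is what recovers the true effect.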
Importantly, RCTs of drug treatments often include supporting services designed to improve health and to facilitate and measure treatment adherence. These services may deliver benefits that are independent of, or synergistic with, the treatment being studied. Many RCTs include patient education interventions that, while not formally part of the intervention being studied, are more robust and expansive than the education programs normally available to patients outside of the study. Studies indicate that patients enrolled in RCTs evaluating treatments for asthma, hypertension, and diabetes have benefited independently from the guidance offered by providers delivering care in these trials. Efforts to replicate these support services in routine clinical practice have frequently failed. Thus, the inclusion of these support services can further limit the generalizability of an RCT's findings. And because these support services usually focus on facilitating high rates of treatment adherence, adherence rates in RCTs are often much higher than those in routine clinical practice, which can influence outcomes.
A recent study by Ginger S. Carls and colleagues underscores this outcomes gap: poor medication adherence accounted for approximately three-fourths of the difference in outcomes observed between RCTs and real-world studies of type 2 diabetes treatments. Moreover, the challenge of lower real-world adherence rates is compounded by the limitations of current adherence measures. "Good adherence" is often defined as taking a medication at least 80 percent of the time, but this threshold may be inappropriate for some medications. For statins, for example, most of the clinical benefit is thought to be achieved once adherence reaches roughly 80 percent, and additional adherence beyond that level may not generate large incremental benefits. In contrast, patients with type 1 diabetes who take their insulin only 80 percent of the time remain at high risk for life-threatening complications of their disease, including diabetic ketoacidosis. More generally, the adequacy of a given level of treatment adherence depends on the disease being treated, how the treatment works, and the consequences of less-than-perfect adherence.
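As one illustration of how the commonly used 80 percent threshold is operationalized in real-world data, the sketch below computes a proportion of days covered (PDC) from pharmacy fill records. The fill dates and days' supply are hypothetical, and real analyses would also need to handle overlapping fills, hospitalizations, and similar details.

    from datetime import date, timedelta

    def proportion_of_days_covered(fills, period_start, period_end):
        """Share of days in the observation window with medication on hand.
        `fills` is a list of (fill_date, days_supply) tuples from pharmacy claims."""
        covered = set()
        for fill_date, days_supply in fills:
            for offset in range(days_supply):
                day = fill_date + timedelta(days=offset)
                if period_start <= day <= period_end:
                    covered.add(day)
        total_days = (period_end - period_start).days + 1
        return len(covered) / total_days

    # Hypothetical 30-day fills, with a roughly two-week gap before the March refill.
    fills = [(date(2023, 1, 1), 30), (date(2023, 2, 5), 30), (date(2023, 3, 20), 30)]
    pdc = proportion_of_days_covered(fills, date(2023, 1, 1), date(2023, 4, 18))
    print(f"PDC = {pdc:.0%}; counted as adherent at the 80 percent threshold: {pdc >= 0.80}")

A patient like this one would be labeled adherent under the usual threshold despite a two-week gap in therapy, a gap that, as the insulin example above suggests, may matter a great deal for some conditions and very little for others.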
Implications
The existence and determinants of the efficacy-effectiveness gap have several implications for optimizing the quality of research on comparative effectiveness and value. First, the source of data used to carry out comparative effectiveness research can profoundly impact results. An understanding of the efficacy-effectiveness gap, and circumstances in which it is likely to be smaller or larger, can help guide policy makers, payers, and researchers in choosing the data sources that most accurately reflect outcomes in routine clinical practice. Recognizing that both real-world studies and RCTs provide evidence that is valuable for making policy and regulatory decisions, the FDA recently published guidance about how to incorporate real-world evidence into regulatory submissions.
Second, users of comparative effectiveness research should extrapolate results of RCTs to real-world populations conservatively and with the knowledge that RCT results may not mirror outcomes among broad populations of patients treated in routine practice. In particular, the results of RCTs may lack external validity among the elderly, the more severely ill, the socioeconomically disadvantaged, and racial and ethnic minorities, groups that are consistently and significantly underrepresented in clinical trials. Thus, a treatment's absolute and relative risk-benefit profiles, as portrayed in a trial setting, may be quite different in real life. Researchers must make concerted efforts to include these underrepresented populations in RCTs to mitigate this critical limitation of clinical trials and narrow the efficacy-effectiveness gap for these groups.
Third, the development and adoption of standardized, trusted methods for gathering real-world evidence across different patient populations, geographic regions, and health systems may help increase confidence in the accuracy of this data type and comparative effectiveness assessments that use it.
Fourth, reimbursement models that rely on RCT-based estimates of comparative and absolute treatment value should account for uncertainty related to the efficacy-effectiveness gap. For example, value-based drug pricing agreements should incorporate mechanisms that allow for additional performance-based rewards (and penalties) based on real-world treatment outcomes.
Looking Ahead: Closing The Gap
Researchers, clinicians, and policy makers can take a number of steps to close the efficacy-effectiveness gap. For one, we need RCTs with greater external validity; in practical terms, this means that we cannot shy away from enrolling sicker patients in clinical trials. Furthermore, we must make concerted efforts to recruit greater numbers of racial and ethnic minority patients, the elderly, and socioeconomically disadvantaged patients to improve the representation of these important populations in RCTs. In addition, leaders of RCTs of chronic disease treatments should incorporate plans for long-term patient follow-up; even if many participants are lost to follow-up, these studies can still be quite useful for understanding the long-term benefits and risks of a treatment in the real world.
We also need to take a more holistic and comprehensive approach to addressing the root causes of nonadherence in real-world settings—a key driver of the efficacy-effectiveness gap. To date, most efforts to increase medication adherence have focused on trying to improve patient behavior through education, medication reminders, and financial incentives; while some of these interventions have succeeded, on the whole they have not yielded sustainable, long-term benefits across large populations. To achieve long-term improvements in adherence, we need comprehensive strategies that simultaneously address multiple drivers of nonadherence, including social determinants of health, poor health literacy, inconsistent access to care, and lack of trust in the health care system. Additionally, innovative treatment delivery systems, including implantable devices and combined medication reminder-dispenser technologies, may help reduce the burden on patients to organize and remember to take their medications. We must continue to search for and embrace these and other outside-the-box approaches to closing the efficacy-effectiveness gap if we hope to capture the full value of our collective investments in health care innovation.
Authors’ Note
Financial support provided by Intarcia Therapeutics, Inc., to Precision Health Economics. Drs. Yu-Isenberg and Yee are employees of Intarcia Therapeutics. In addition to their academic and clinical positions, Drs. Blumenthal and Jena are consultants to Precision Health Economics, a life sciences consulting firm.