Ever since the U.S. Congress enacted the 21st Century Cures Act in 2016 and the U.S. Food and Drug Administration (FDA) began encouraging sponsors to utilize real-world evidence (RWE) with the intent of increasing innovation and accelerating product development, medical technology innovators have been eager to respond. The burgeoning use of RWE, clinical evidence derived from the analysis of real-world data (i.e., routinely collected data relating to patient health status and/or the delivery of healthcare), should lead to improved safety data monitoring that better captures everyday healthcare practices and the diverse nature of patient populations.
However, before product developers can more confidently design RWE studies and move away from the traditional path of lengthy prospective studies, industry must overcome several challenges. This article provides insight into these barriers and perspective on how industry might fully realize the promise of RWE in the near term to advance healthcare.
RWE is still in the early stages of adoption. Thus, FDA and industry are continuing to learn together how best to extract real-world data (RWD) and design robust and reliable RWE studies. As part of that process, FDA expectations on how industry can appropriately harness high-quality RWD to derive RWE require greater clarity.
The U.S. Congress enacted the Cures Act to accelerate patient access to innovative and necessary therapies and medical devices. At the time, the act marked a clear expansion, directing FDA to consider RWE in regulatory decision-making. Prior to this, RWE had typically been used as part of pharmacovigilance and post-market surveillance to assess treatment patterns and monitor for safety and adverse events.
Since then, FDA has continued to provide updated guidance on the use of RWE, which, while helpful, continues to leave room for ambiguity. This includes the 2021 report, “Examples of Real-World Evidence (RWE) Used in Medical Device Regulatory Decisions,” a much-anticipated follow-on to the 2018 release of “The Framework for FDA’s Real-World Evidence Program.” The report encourages the continued use of RWE, asks sponsors to engage early with FDA regarding any future RWE plans, and provides examples of RWE that led to pre- and post-market regulatory decisions spanning 2012 to 2019. Of these 90 successful examples, 23 studies used RWE as the primary source of clinical evidence. However, the high-level discussion of each example does not provide sponsors with the specific criteria FDA used to determine whether each RWE study was suitable to support a regulatory decision.
While part of that ambiguity stems from the novelty of using RWE in regulatory decision-making, part of it also stems from a lack of information on unsuccessful submissions. Without a clear understanding of which factors or design elements lead FDA to conclude that an RWE-based submission cannot be approved or cleared, it remains challenging for sponsors to confidently design regulatory-grade RWE studies.
The COVID-19 pandemic hit a few years after the Cures Act was enacted and quickly shifted the priorities of FDA and other regulatory bodies toward vaccine and diagnostic test review and prevention efforts. Still, fostering the use of RWE in these cases remained a priority, and FDA hosted several workshops shortly before and during the pandemic to enable faster transition of Emergency Use Authorization (EUA) products to full marketing authorization. However, scientific hurdles to the objective use of RWD remained.
The good news is that scientific progress is being made. FDA is once again accepting all pre-submissions, which means industry and FDA can resume working together to clarify the design and implementation of RWE studies in product development and approval.
RWD can be collected from a myriad of sources, including registries, electronic health records (EHRs), medical claims and billing data, patient-reported outcomes (PROs), and mobile and wearable devices. These sources may comprise a mix of structured, semi-structured and unstructured data.
The amalgamation of structured, semi-structured and unstructured data may be required to obtain regulatory-grade data, i.e., complete and high-quality data obtained through a transparent and robust study design, including provenance, veracity and traceability of the data. Because the way healthcare providers record data can vary from site to site, structured data may not consistently capture all the relevant data points, which can bias a study; this is a question of data interoperability, encompassing both syntactic and semantic interoperability. For example, to initiate a study, predetermined inclusion and exclusion criteria must be set. Using structured data alone can handicap the ability to apply these criteria appropriately if, for example, characteristic disease features were not consistently captured. Structured data may also lack information on the type of procedure performed, surgical history, medical imaging examinations or prior treatments. If this information is inconsistent in the structured data across a cohort, patients can be inappropriately included in or excluded from a study, introducing bias.
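To make the point concrete, the minimal sketch below (in Python, with entirely hypothetical column names and codes) shows how applying inclusion and exclusion criteria against structured fields alone can silently misclassify patients whose records are incomplete; such records are better flagged for abstraction from unstructured sources than quietly filtered.

```python
import pandas as pd

# Hypothetical structured EHR extract; column names and codes are illustrative only.
cohort = pd.DataFrame({
    "patient_id": [101, 102, 103, 104],
    "age": [67, 54, 71, 60],
    "diagnosis_code": ["I50.9", None, "I50.9", "I50.9"],   # None = not captured at this site
    "prior_device_implant": [False, False, None, True],    # exclusion criterion
})

# Inclusion/exclusion criteria applied to structured fields alone.
included = (cohort["age"] >= 18) & (cohort["diagnosis_code"] == "I50.9")
excluded = cohort["prior_device_implant"] == True

eligible = cohort[included & ~excluded]

# Missing structured fields cause patients to be silently included or excluded
# (patient 102 is dropped, patient 103 is kept despite an unknown exclusion field);
# flagging them for chart review or NLP abstraction avoids building bias into the cohort.
needs_review = cohort[cohort[["diagnosis_code", "prior_device_implant"]].isna().any(axis=1)]

print(eligible["patient_id"].tolist())       # [101, 103]
print(needs_review["patient_id"].tolist())   # [102, 103]
```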
Unstructured data can help fill in missing details. However, because unstructured data are difficult to analyze, it can be challenging for sponsors to use this information effectively to support regulatory submissions. Key patient information may not be organized consistently across sites or providers, and it can be scattered across multiple sources, such as clinician notes and pathology reports. Abstracting unstructured data and consolidating it into structured variables has historically required significant manual work, which is time-consuming, costly and demands extensive quality control.
The increasing use of artificial intelligence (AI) methods, e.g., natural language processing (NLP) and machine learning (ML), allows sponsors to utilize these data-rich sources more readily by making the abstraction process substantially more efficient. Despite the clear advantage AI provides when dealing with unstructured data, the initial manual confirmation, verification and validation involved in setting up NLP and/or ML for a study remain cumbersome, requiring significant time and expertise. Additionally, care must be taken to develop the AI algorithm in a manner that does not introduce bias through non-random patient selection.
Although the available structured and unstructured data often overlap, including unstructured data in a regulatory-grade RWE study remains important for data completeness, and rigorous standards and approaches are required to ensure the data are reliable and of the highest quality. Fortunately, continued developments in NLP and ML are reducing the burden of unstructured data abstraction.
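As a simple illustration of what abstraction can look like, the sketch below uses plain regular expressions as a stand-in for a full NLP/ML pipeline; the note text, variable names and rules are hypothetical, and a production workflow would add negation handling, synonym mapping and human validation of a sample of abstracted records.

```python
import re

# Hypothetical clinician note; in practice, notes vary widely in wording and structure.
note = """
Patient seen for follow-up. Ejection fraction 35% on most recent echo.
History notable for prior CABG in 2018. No current anticoagulation.
"""

def abstract_variables(text: str) -> dict:
    """Abstract a few structured variables from free text using simple rules."""
    variables = {}

    # Ejection fraction: capture the first percentage following "ejection fraction" or "EF".
    ef = re.search(r"(?:ejection fraction|EF)\D{0,10}(\d{1,2})\s*%", text, re.IGNORECASE)
    variables["ejection_fraction_pct"] = int(ef.group(1)) if ef else None

    # Prior CABG: naive keyword match; real pipelines must handle negation ("no prior CABG").
    variables["prior_cabg"] = bool(re.search(r"\bCABG\b", text, re.IGNORECASE))

    return variables

print(abstract_variables(note))
# {'ejection_fraction_pct': 35, 'prior_cabg': True}
```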
When working with clinical sites to obtain RWD from EHRs, it is not uncommon to see a fragmented approach to protecting patient information. Even when the same protocol and contracts are used with every clinical site, what is considered acceptable under them may vary from site to site. For example, some sites possess a strong understanding of the de-identification process and cybersecurity requirements, whereas other sites may be new to the process, which can raise privacy concerns if it is not fully understood.
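For illustration, the sketch below shows one common de-identification step: dropping direct identifiers and replacing the medical record number with a keyed hash so records can still be linked longitudinally without exposing the identifier. The field names are hypothetical, and any real workflow would need to satisfy the de-identification standard agreed with each site (e.g., HIPAA Safe Harbor or expert determination).

```python
import hashlib
import hmac

# Hypothetical record as exported from a site's EHR; field names are illustrative.
record = {
    "patient_id": "MRN-0042871",
    "name": "Jane Doe",
    "date_of_birth": "1954-03-17",
    "zip_code": "30309",
    "procedure_code": "0JH604Z",
    "outcome_at_90_days": "no reintervention",
}

DIRECT_IDENTIFIERS = {"name", "date_of_birth", "zip_code"}
SECRET_KEY = b"site-specific-secret"  # held by the site, never shared with the sponsor

def deidentify(rec: dict) -> dict:
    """Drop direct identifiers and replace the patient ID with a keyed hash."""
    out = {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS and k != "patient_id"}
    out["pseudo_id"] = hmac.new(SECRET_KEY, rec["patient_id"].encode(), hashlib.sha256).hexdigest()[:16]
    return out

print(deidentify(record))
```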
This variability in approach to patient information protection can lead to scenarios where some clinical sites are comfortable securely transferring de-identified data, including clinical notes, to sponsors for use in RWE studies, while others may not be comfortable transferring such data or may want to do so only with specific provisions. When information is provided in different forms, bias and site-specific variability in data quality and completeness can be introduced. In such cases, a statistical analysis to determine whether all sites can appropriately be pooled is prudent.
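One simple pre-pooling check for a binary endpoint is a chi-square test of homogeneity of event rates across sites, sketched below with made-up counts; in practice, poolability assessments may also compare covariate distributions, missingness patterns and site-by-treatment interactions.

```python
from scipy.stats import chi2_contingency

# Hypothetical site-level counts for a binary endpoint: (events, non-events).
site_counts = {
    "Site A": (12, 188),
    "Site B": (9, 141),
    "Site C": (25, 175),
}

table = [list(counts) for counts in site_counts.values()]
chi2, p_value, dof, expected = chi2_contingency(table)

print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")
# A small p-value suggests event rates differ across sites, arguing against naive pooling.
```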
This is an area where standards should be created to develop a common framework for assessing patient and data protections, while also fostering the use of RWD in clinical research. As noted above, these barriers are largely due to the novelty of using RWD and RWE in, or complementary to, clinical studies. Medical product developers should continue to work with investigators, clinical sites and ethics committees to standardize appropriate methods for patient data protection and to use the opportunity to educate stakeholders on the intended scientific applications of RWE. By developing a standardized framework, all involved can better ensure sufficient patient protections are consistently met while expanding the use of RWE to support innovation.
With RWE still in its infancy, all involved are concurrently learning how best to utilize it for regulatory decision-making. The steep learning curve sponsors face when leveraging RWE for the first time can lead to wildly unpredictable timelines, but it can also yield great benefits. As the use of RWE grows and AI continues to develop to extract high-quality and reliable information from unstructured data more efficiently, regulatory approvals and product claims will increase and expand.