Single Arm Studies and their role in providing pivotal evidence for Regulatory Approval

tranScrip Oncology Commentary: June 2023

In April 2023, the European Medicines Agency (EMA) published a draft reflection paper for public consultation (EMA/CHMP/564424/2021) on ‘Establishing efficacy based on single arm trials (SATs) submitted as pivotal evidence in a marketing authorisation (MA)’ (EMA, 2023). This concept is different from the Food and Drug Administration (FDA) draft recommendation on ‘Clinical trial considerations to support accelerated approval of oncology therapeutics’ published in March 2023 (FDA, 2023).

The EMA paper outlines important considerations in the design of SATs that might allow the treatment effect to be estimated and efficacy to be established for any indication without the need for further confirmatory studies. It does not aim to define conditions under which SATs may be considered acceptable as pivotal evidence for an MA. The applicant should justify that the SAT can provide clear pivotal evidence of efficacy by addressing the adequacy of the SAT, as well as the limitations and remaining uncertainties.

The FDA paper clarifies the current FDA thinking on the accelerated approval pathway that is commonly used for approval of oncology drugs in areas of high unmet medical need based on surrogate clinical endpoints. The initial study may be a SAT or a randomised controlled trial (RCT), typically followed by a post-marketing confirmatory trial to verify and quantify the anticipated clinical benefit.

Some of the concepts in the EMA paper may be relevant for any SAT, although they need to be stringently applied in the case of a pivotal study. Therefore, these important considerations regarding the SAT design are presented first below:

The principal recurring theme in the EMA paper is bias due to lack of double-blind randomisation and a concurrent control arm. External knowledge must be used to estimate the average outcome of the trial population if they had not received the experimental treatment and to allow a causal interpretation of the effect of the treatment.
The primary efficacy endpoint should be able to isolate treatment effects, in that the desired outcome would occur rarely in the absence of an active treatment. Issues include selection bias and possible over-representation of patients with a higher likelihood of remission in the absence of treatment, an episodic disease that waxes and wanes, and measurement error or misclassification without a comparator that is equally affected.
Time-to-event endpoints (e.g. progression-free survival [PFS]) can occur in the absence of active treatment and are usually not suitable for use in SATs. Furthermore, the starting point of being at risk for the endpoint (‘time 0’) is usually different from the start of the trial. Prognostic factors can impact the disease course, and prognostic factors cannot be separated from predictive factors based on the results of a SAT. This is also relevant for selection of the trial population.
Continuous endpoints allow for a precise measurement of changes experienced by patients during the trial, but this change cannot be attributed to treatment when there is within-patient variability, a fluctuating natural disease course or measurement error. ‘Regression to the mean’ is a common phenomenon, whereby patients with a low value for a specific endpoint will tend to improve at a later timepoint, regardless of whether an effective treatment is administered.
Binary endpoints may be considered to isolate the treatment effect in certain circumstances, typically where a ‘cure’ is achieved that would not be possible without active treatment (e.g. hepatitis C infection).
Similarly, a treatment effect could be concluded with time-to-event or continuous endpoints if values are measured that could not be achieved without treatment. In effect, these endpoints are dichotomised by setting a threshold in advance that no patient could cross without treatment, even after accounting for potential sources of bias.
The trial population should reflect the intended target population (external validity). It also determines the plausibility of assumptions (known and unknown patient characteristics) about the disease course of a hypothetical control group or comparability with external data. High patient or disease heterogeneity makes interpretation of the results more challenging. Details of patients who were not selected at screening are needed to provide reassurance that the magnitude of effect is not due to selection of a favourable population. Interpretation of SAT results in a biomarker-defined population is challenging as historical data are often limited and the biomarker may be prognostic for the natural course of the disease as well as predictive for the treatment effect. Subgroup heterogeneity caused by prognostic or predictive factors may be impossible to differentiate based on SAT data.
External information (general knowledge about the disease course or external clinical data) is critical for interpretation of the results and should be pre-specified in the study protocol. External information may be used to establish a clinically valid threshold for efficacy in advance that can support isolating a treatment effect. In exceptional cases, assessment of efficacy may be informed by a direct comparison against external clinical data. This adds complexity to pre-specification and relies on additional assumptions, so the approach should be carefully evaluated on a case-by-case basis.
Statistically, predefinition of all aspects is essential as any changes (e.g. to endpoints, sample size, dosing regimen, eligibility criteria or unplanned interim analyses) are considered potentially data driven. The analysis model for estimation of the treatment effect should be pre-specified. A critical unplanned change is the post hoc designation of an exploratory Phase II trial as pivotal confirmatory evidence once data are available.
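
The ‘regression to the mean’ phenomenon described above for continuous endpoints can be illustrated with a small simulation. The following is a minimal sketch, not taken from the EMA paper: the score scale, the entry cut-off of 40 and the variability values are all hypothetical assumptions chosen for illustration. Each simulated patient has a stable underlying level and receives no treatment; only patients whose noisy baseline measurement falls below the cut-off are enrolled.

```python
import random
from statistics import mean


def simulate_untreated_change(n_screened=20000, entry_cutoff=40.0, seed=0):
    """Mean change from baseline in enrolled patients who receive NO treatment.

    All numeric values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    changes = []
    for _ in range(n_screened):
        # Stable underlying disease level for this patient (no true change).
        true_level = rng.gauss(50.0, 10.0)
        # Baseline measurement = true level + within-patient variability.
        baseline = true_level + rng.gauss(0.0, 10.0)
        if baseline >= entry_cutoff:
            continue  # not enrolled: baseline value is not low enough
        # Follow-up measurement with fresh, independent variability.
        follow_up = true_level + rng.gauss(0.0, 10.0)
        changes.append(follow_up - baseline)
    return mean(changes), len(changes)


mean_change, n_enrolled = simulate_untreated_change()
print(f"Enrolled {n_enrolled} patients; mean untreated change = {mean_change:.1f}")
```

Under these assumptions the enrolled group shows an apparent average improvement even though no patient was treated, because selection on a low baseline value preferentially enrols patients whose baseline measurement happened to fall below their usual level.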

In summary, causal inference of a treatment effect from a SAT involves careful trial planning and understanding of the outcomes in the population without treatment. There are multiple sources of bias (15 are tabulated in the EMA paper) which can potentially be mitigated through the SAT design, conduct and analysis.

The FDA paper highlights that SATs with a primary endpoint of response rate (an accepted marker of drug activity) have commonly been used in oncology. Limitations of SATs are detailed. In addition to those mentioned in the EMA paper, these include the small safety database and the fact that clinical benefit may not be predicted for products with low response rates (e.g. immunotherapy).

An RCT is the preferred approach to support accelerated approval (AA) by the FDA. Direct comparison to a concurrent control arm allows a more robust efficacy and safety assessment. This is particularly relevant for biomarker-selected populations that lack historical trial data. It allows for assessment of potential regional differences in multinational trials, and an RCT may allow evaluation in an earlier treatment setting where more patients may benefit. Sponsors can conduct two RCTs, one with an early endpoint for AA and the second powered for a longer-term clinical endpoint (PFS or overall survival [OS]). The confirmatory study should be well underway, if not fully enrolled, by the time of AA. To facilitate patient accrual, the confirmatory study can be in a different line of therapy (earlier disease setting).

Alternatively, follow-up in the same trial adequately powered for the longer-term clinical endpoint could fulfil the post-marketing requirement to verify clinical benefit in a timely fashion; this is the ‘one-trial’ approach.

A ‘one-trial’ approach requires careful selection of the endpoint that is appropriate to evaluate earlier for AA. Depending on the disease course, endpoints other than response rate may be selected if supported by a strong rationale and discussed in advance with the FDA, with subsequent evaluation of clinical benefit endpoints. The effect on the surrogate/intermediate endpoint must be reasonably likely to predict clinical benefit and provide a meaningful advantage over available therapy.

Trial integrity is critical and the risk that regulatory action on the AA application introduces bias should be considered. Blinding of data for the clinical benefit endpoint should be maintained until the protocol-specified analysis time point is reached to ensure a robust assessment. The FDA safety assessment may include evaluating the potential for harm from the investigational treatment (e.g. detrimental effect on OS). Sponsors should specify a plan to maintain the study blind if a summary analysis of survival data is requested.

Type I error should be strongly controlled, and the sample size should ensure adequate power for both early and late endpoints. The trial design can incorporate adaptive design elements (e.g. sample size re-estimation). For a response-based endpoint, analysis to support AA could be based on a pre-specified number of initially randomised patients, while for a time-to-event endpoint it is appropriate to pre-specify the number of events. Efficacy analyses to support AA should be avoided until the trial is close to full enrolment to mitigate challenges in accrual if AA is granted.

For SATs, the adequacy of the primary response rate endpoint to support AA should be based on the magnitude and duration of response. Statistical inferential procedures are not needed. Stable disease or clinical benefit rate should not be used as these largely reflect the natural history of the disease rather than a direct therapeutic effect. Usually, a minimum of 6 months post response is needed to characterise durability, but the FDA may request additional data during review.
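
Although formal inferential testing is not required, the magnitude of a response rate is commonly summarised descriptively as a point estimate with an exact (Clopper-Pearson) confidence interval to convey its precision. The sketch below is an illustration only and is not prescribed by the FDA paper; the function names and the example of 15 responders among 50 treated patients are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target):
    """Bisection for f(p) = target on [0, 1], where f is decreasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(responders, n, conf_level=0.95):
    """Exact two-sided confidence interval for a binomial proportion."""
    alpha = 1 - conf_level
    if responders == 0:
        lower = 0.0
    else:
        # Lower limit p_L solves P(X >= responders | p_L) = alpha / 2.
        lower = _solve_decreasing(
            lambda p: binom_cdf(responders - 1, n, p), 1 - alpha / 2
        )
    if responders == n:
        upper = 1.0
    else:
        # Upper limit p_U solves P(X <= responders | p_U) = alpha / 2.
        upper = _solve_decreasing(lambda p: binom_cdf(responders, n, p), alpha / 2)
    return lower, upper


# Hypothetical example: 15 responders among 50 treated patients.
low, high = clopper_pearson(15, 50)
print(f"Response rate {15 / 50:.0%}, 95% CI {low:.1%} to {high:.1%}")
```

The interval describes the precision of the estimate only; it says nothing about the durability of responses, which needs to be characterised separately.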

The sample size and analysis population (expected to be the entire trial population of patients who have received at least one dose of study drug) should be pre-specified. Multiple sample size increases with repeated looks at the data in the absence of a pre-specified plan should be avoided. Further, to reduce the potential for bias and to mitigate variance, blinded independent central review (BICR) of the response assessment should be performed in line with a BICR charter.
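
The concern about repeated, unplanned looks at accumulating data can be illustrated with a simulation. This is a minimal sketch under stated assumptions, not an analysis from the FDA paper: the benchmark response rate of 20%, the looks at 20, 40, 60 and 80 patients, and the one-sided 2.5% criterion applied naively at every look are all hypothetical. The drug is simulated as having no effect beyond the benchmark, so every ‘success’ is a false positive.

```python
import random
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def critical_count(n, p0, alpha):
    """Smallest responder count whose one-sided exact p-value is <= alpha."""
    for k in range(n + 2):
        if upper_tail(k, n, p0) <= alpha:
            return k


def false_positive_rates(p0, looks, alpha, n_trials, seed):
    """Return (rate with a single final analysis, rate with success at any look)."""
    rng = random.Random(seed)
    cutoffs = {n: critical_count(n, p0, alpha) for n in looks}
    final_n = looks[-1]
    any_look = final_only = 0
    for _ in range(n_trials):
        responders = 0
        crossed = False
        for patient in range(1, final_n + 1):
            # True response rate equals the benchmark: no real treatment effect.
            responders += rng.random() < p0
            if patient in cutoffs and responders >= cutoffs[patient]:
                crossed = True  # unadjusted criterion met at this look
        any_look += crossed
        final_only += responders >= cutoffs[final_n]
    return final_only / n_trials, any_look / n_trials


single_rate, repeated_rate = false_positive_rates(
    p0=0.20, looks=[20, 40, 60, 80], alpha=0.025, n_trials=10000, seed=1
)
print(f"Single pre-specified analysis: {single_rate:.3f}")
print(f"Success claimed at any of four looks: {repeated_rate:.3f}")
```

Because every trial that succeeds at the final analysis also counts as a success under the ‘any look’ rule, the repeated-look false positive rate can only be equal or higher; the simulation shows how much higher under these particular assumptions.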

The FDA thinking on AA will be relevant for EMA conditional marketing authorisations, which also require comprehensive data post-authorisation. Both papers highlight the agencies’ willingness to be flexible where trials are appropriately designed given the disease setting and intended patient population. EMA scientific advice and discussion with the FDA are strongly recommended to agree the key aspects of the trial design in advance.

References

EMA. (2023). Reflection paper on establishing efficacy based on single-arm trials submitted as pivotal evidence in a marketing authorisation. https://www.ema.europa.eu/en/documents/scientific-guideline/reflection-paper-establishing-efficacy-based-single-arm-trials-submitted-pivotal-evidence-marketing_en.pdf

FDA. (2023). Clinical trial considerations to support accelerated approval of oncology therapeutics: guidance for industry. https://www.fda.gov/regulatory-information/search-fda-guidance-documents/clinical-trial-considerations-support-accelerated-approval-oncology-therapeutics
