Statistics in Biopharmaceutical Research
Christy Chuang-Stein

Introduction
"Statistics in Biopharmaceutical Research" is the title of the online journal launched by the American Statistical Association in 2009. There are at least two other international peer-reviewed journals completely dedicated to the use of statistics in biopharmaceutical research: the Journal of Biopharmaceutical Statistics (Taylor & Francis Group) and Pharmaceutical Statistics (John Wiley & Sons). There are also many books devoted to this area of statistical applications, e.g. Senn (2008), Dmitrienko, Chuang-Stein and D'Agostino (2007) and Dmitrienko et al (2005). In the United States (US), the pharmaceutical, biotech and device industries together employ thousands of statisticians, either directly or indirectly. Statisticians support the discovery, development and commercialization of valuable medicines and devices, which have made substantial contributions to a longer life expectancy and a better quality of life in the past 50 years. In this article, we briefly discuss these contributions, from preclinical research through health technology assessment. The latter has become increasingly important as it directly affects patients' access to pharmaceutical products in many parts of the world.

Preclinical Research of Drugs and Biologics
The development of a new medicine is a long and high-risk proposition. It takes an average of 15 years for a new compound to be discovered and eventually submitted to regulators for approval. As of 2008, the cost of developing a new drug was estimated to be between $800 million and $2 billion US dollars (Masia, 2008). Biopharmaceutical research generally begins in the laboratory, where medicinal chemists synthesize compounds and biologists screen the compounds for activity. Because of the large number of compounds, it is often necessary to develop an efficient algorithm-based process to conduct high-throughput screening for promising compounds. Once a compound is judged to meet the required level of potency, it needs to go through formulation development so that the active ingredient can be delivered to the target site of action in the intended subjects. The first test subjects are laboratory animals, which are used to evaluate the effect of the compound on cardiovascular function, reproductive function, tumour development and the general wellbeing of offspring of animals exposed to the compound. Most of the animal experiments are conducted according to the International Conference on Harmonisation (ICH) guidance M3(R2) (2009) on non-clinical safety studies. The need to use the smallest number of animals for preclinical testing has led to the use of efficient experimental designs with repeated measures on each animal. In addition, data mining techniques are used widely to search for chemical and physical properties that tend to be associated with compounds that turn out to be successful (Johnson and Rayens, 2007).

Clinical Development of Drugs and Biologics
The majority of statistical support in biopharmaceutical research takes place during clinical testing in humans. In the US, this support has grown substantially since the 1962 Kefauver-Harris (K-H) Amendment (Krantz, 1966), which required drug sponsors to prove a product's safety and efficacy in controlled clinical trials before receiving marketing authorization. It is usually thought that the first properly randomised and documented controlled trial in the 20th century involved the use of streptomycin for the treatment of pulmonary tuberculosis (Medical Research Council, 1948). Since that time, the number of clinical trials (both randomized and non-randomized) has skyrocketed, as evidenced by the number of trials registered at www.clinicaltrials.gov in the United States. Since the K-H Amendment, the number of statisticians working in the pharmaceutical industry has greatly increased. This increase took another jump when the quality of manufactured products came under close scrutiny. Moving into the 21st century, the lure and the promise of genomics and proteomics in product development, especially in the search for personalized medicine, will further intensify scientists' reliance on statistics to estimate, predict and confirm the role of these biomarkers in determining the optimal treatment for individual patients (Chuang-Stein and D'Agostino, 2007). The latter has also led to a significant increase in statistical support for nonclinical research over the past 10 years.

Clinical development of a new drug or biologic is often divided into 3 phases. All clinical trials need to follow the ICH E6 guidance on good clinical practice (1996) and the 1964 Declaration of Helsinki. In Phase 1 trials, healthy volunteers are randomized to receive a single dose or multiple doses of an investigational product or a placebo to study the tolerance and pharmacokinetics of the new product. The objective is to decide on an acceptable dose range.
For cytotoxic agents, Phase 1 trials are conducted in cancer patients with the objective of estimating the maximum tolerated dose. Because of the small number of subjects (e.g. 40-100) at this stage, safety evaluation focuses on identifying common side effects of the new product. If the tolerance and pharmacokinetic profiles based on the limited data are judged to be acceptable, testing will proceed to Phase 2.

In Phase 2, the new investigational product is compared against a concurrent comparator (placebo, an approved product or the standard of care) in patients with the target disease (or condition). The primary objective is to gather safety and efficacy data in patients. Different dose strengths are typically used in these trials to help estimate the dose-response relationship. The latter may involve fitting an Emax model or a logistic model. For oncology trials, Phase 2 may consist of single-arm studies using the maximum tolerated dose estimated from Phase 1. Some researchers further divide Phase 2 into Phase 2a proof-of-concept and Phase 2b dose-ranging studies. The former often includes a single dose strength to verify the hypothesized mechanism, while the latter uses different dose strengths and clinically relevant endpoints. Phase 1 and Phase 2 trials are designed to help a sponsor learn about the new treatment (Sheiner, 1997). They are exploratory in nature and the analysis focuses on estimation instead of hypothesis testing. This is a critical phase of product development. It is during this stage that important information on dose(s), dosing schedule(s), endpoints and the target population is evaluated and decided upon. Statistics is heavily used to design trials, to analyse the results and to make Go or No-Go decisions.
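The Emax model mentioned above for dose-response estimation can be written down compactly. The following is a minimal Python sketch of the standard hyperbolic Emax curve, E(d) = E0 + Emax * d / (ED50 + d). The function name and all parameter values are hypothetical and chosen only for illustration, and the sketch evaluates the curve rather than fitting it to trial data.

```python
def emax_response(dose, e0, emax, ed50):
    """Mean response at `dose` under a hyperbolic Emax model.

    e0   : response at dose 0 (placebo response)
    emax : maximum achievable effect over placebo
    ed50 : dose producing half of the maximum effect
    """
    if dose < 0:
        raise ValueError("dose must be non-negative")
    if ed50 <= 0:
        raise ValueError("ed50 must be positive")
    return e0 + emax * dose / (ed50 + dose)


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    e0, emax, ed50 = 1.0, 10.0, 25.0
    for dose in [0, 10, 25, 50, 100]:
        print(dose, round(emax_response(dose, e0, emax, ed50), 2))
```

At a dose equal to ED50 the curve sits halfway between E0 and E0 + Emax, which is one reason the parameterization is convenient when discussing candidate doses for later phases.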
If data from the Phase 2 development offer good reasons to believe that the new investigational product has a positive benefit-to-risk balance and can bring value to patients, development will move into the confirmatory stage, or Phase 3. In general, Phase 3 trials are double-blind randomized trials if blinding is at all possible. These trials are typically large (hundreds to thousands of patients) and of longer duration. The primary objective is to confirm the presence of a treatment effect and to collect additional safety information in a more diverse population. In terms of efficacy assessment, the goal could be to show either superiority over a comparator or non-inferiority to an active control (ICH E10, 2000). For life-threatening conditions, interim analyses are often conducted so that, for ethical reasons, a trial can be stopped early for efficacy (positive outcome) or futility (negative outcome). Interim analyses are typically conducted by individuals independent of the study and reviewed by an Independent Data Monitoring Committee (Food and Drug Administration DMC guidance, 2006). Except for those pre-specified in the protocol as possible mid-trial adaptations, changes are strongly discouraged at this stage. When multiple comparisons (multiple doses, multiple endpoints, multiple subgroups, interim efficacy analysis etc.) are conducted for inferential purposes, the significance levels for the comparisons need to be properly adjusted so that the family-wise Type I error rate is strongly controlled. The adjustment method needs to be pre-specified and cannot be changed once the trial results become known. In short, the design and analysis of Phase 3 trials need to be carefully planned and rigorously executed. Statistical principles, as articulated in ICH E9 (1998), should be followed with very few deviations. Deviations, when they occur, need to be justified, and sensitivity analyses should be conducted to evaluate the impact of the deviations on conclusions.
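As one illustration of a pre-specified adjustment that strongly controls the Type I error rate across a family of comparisons, the sketch below implements the Holm step-down procedure in Python. Holm's method is not named in this article; it is shown here only as a well-known example, and the function name and p-values are hypothetical.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure.

    Returns a list of booleans, in the original order of `p_values`,
    indicating which null hypotheses are rejected at overall level `alpha`.
    """
    m = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # Compare the (rank+1)-th smallest p-value with alpha / (m - rank).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            # Once one hypothesis is retained, all remaining ones are retained.
            break
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for three comparisons, e.g. three doses vs. placebo.
    print(holm_reject([0.010, 0.040, 0.030]))  # [True, False, False]
```

Because the thresholds depend only on the number of hypotheses and the overall level, the rule can be written into the protocol before any data are seen, in keeping with the pre-specification requirement described above.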
Statistics is the basis for inferential conclusions in these trials.

Increasing Attention for Preventive Vaccines
Vaccination was considered one of the 10 greatest public health achievements of the 20th century. Even in the 21st century, emphasis on more effective disease management will likely lead to the development of more vaccines, most of which will likely be used for prevention. Vaccines are typically administered as a single series with a potential booster shot. Vaccines use an antigen or attenuated live virus to trigger immune responses for disease prevention. The level of antibody can be measured by several different serological assays. Some assays are more expensive and laborious, but provide greater sensitivity. Assay validation (e.g. precision, reproducibility, intra-assay variability, inter-assay variability, range of antibody where the assay is the most robust) is an important part of vaccine development (Chan et al, 2003).

Clinical development of a vaccine is similar to that of a drug or a biologic. Phase 1 is conducted in healthy subjects, while Phase 2 explores different doses for safety and efficacy in the target population. Phase 3 aims to confirm the safety and efficacy of an investigational vaccine. There are at least 3 unique features of a Phase 3 vaccine trial. First, vaccine efficacy (VE) in a placebo-controlled trial, measured as 1 minus the ratio of the incidence of the endpoint in the vaccinated group to that in the control group, needs to exceed a threshold. Here, the endpoint of interest is the condition (e.g. disease) the vaccine is designed to prevent. The threshold requirement becomes one on the lower bound of a one-sided 97.5% confidence interval for VE. Since the condition is often rare, a requirement of, say, 25% on the lower bound will lead to a large sample size. Second, the manufacturing of a vaccine (a biologic) is subject to more variability than that of a chemical compound.
To prove consistency in the manufacturing process, a Phase 3 trial is often set up not only to confirm the safety and efficacy of an investigational vaccine, but also to study lot consistency by randomizing subjects to vaccines produced from 3 different manufacturing lots. Consistency in manufacturing needs to be shown via consistency in subjects' immunogenic response to the vaccines from different lots (Lachenbruch et al, 2004). Third, the threshold for vaccine safety is typically high because a vaccine could be administered to millions of healthy subjects. These three factors combined mean that the sample size for a Phase 3 vaccine trial is characteristically large. Even with large Phase 3 trials, there are often large-scale post-licensure studies to collect additional safety data.

There are many statistical issues uniquely associated with vaccine development. These include the need to establish a range of potency for manufacturing and product shelf-life, the use of multiple co-primary measurements to assess vaccine efficacy, the handling of missing booster shot data, the identification of immune markers that correlate with efficacy, the identification of the protective level etc. The entire 4th issue of the Journal of Biopharmaceutical Statistics in 2006 is dedicated to preventive vaccines, starting with a guest editorial by Horn (2006) and a statistical primer by Mehrotra (2006). All articles in the special issue will be of interest to readers who want to learn more about vaccine development. Another interesting article, on the use of a group sequential trial to evaluate the safety of a rotavirus vaccine, is given by Heyse et al (2008).

Development of Medical Devices
A medical device is an item for treating or diagnosing a health condition whose purpose is not achieved primarily by chemical or biological action within the body (Yue, 2008). In essence, a medical device is any medical item that is not a drug or biological product. A medical device could be as simple as a tongue depressor or as sophisticated as a drug-eluting coronary stent. There are many thousands of medical device firms in the US. About 30-35% of medical devices are diagnostic tests (Campbell, 2008). They range from tests for blood type, the presence of HIV infection and the existence of a genetic variant to in vitro multivariate index assays such as the GeneSearch Breast Lymph Node (BLN) assay approved by the US FDA in July 2007 for detecting breast cancer metastasis. Establishing the limit of detection of the assay, technical validation and quality control of the assay, sensitivity/specificity of the test and agreement of a new test with an existing reference test are among the major considerations in developing new diagnostic tests. In the following, we focus on non-diagnostic medical devices.

The development of a medical device follows a different pathway from that of a drug or a biologic (Wittes, 2001; Campbell, 2008). While some medical devices such as implants require considerable bench and animal testing for reliability and biocompatibility, for devices there is usually no analogue of Phase 1 trials or of animal toxicity studies. Pilot and feasibility device studies can serve as first-in-man studies. Besides, not all devices need to undergo controlled clinical trials to gain regulatory approval. Even for devices that require a confirmatory study to support a premarket approval (PMA) application, the confirmatory study may rely on historical controls instead of a randomized concurrent control, since the basis for granting a PMA is "valid scientific evidence that there is reasonable assurance that the device is safe and effective".
In the latter case, a single confirmatory trial is often sufficient to support a PMA application. When a randomized control is not used in the confirmatory trial, causal inferential methods such as propensity scores have been used to compare the new device against a historical control. Unique features of device development as well as the associated statistical issues are discussed in great detail in the first issue of the Journal of Biopharmaceutical Statistics in 2008.

In general, devices go through continuous improvement at short intervals. It is not uncommon for a clinical trial to start with one device and end with an improved version of the device. Because of the knowledge accumulated over the years on some devices (e.g. pacemakers), it is possible to establish an objective performance criterion that is then imposed on a new device for the same purpose. In addition, the accumulated experience with the control has led many device companies to propose hierarchical models when designing and analyzing a device trial. The latter has led to the FDA guidance on the use of Bayesian statistics in medical device clinical trials (2010). Perhaps the most distinct feature of a randomized device trial is the fact that the effect of a device may depend on the experience of a center (clinic, investigator or surgeon), leading to the potential for a treatment-by-center interaction, especially for devices such as implants. Thus, if a treatment-by-center interaction is observed in a device trial, there needs to be a careful investigation into the source of the interaction. If necessary, extensive training may be required of healthcare professionals before they use the device in patient care.

Post-marketing Safety and Benefit/Risk Assessment
Following several highly visible product withdrawals in the late 1990s in the US, the safety of pharmaceutical products has attracted much public and congressional attention. Consequently, the US Food and Drug Administration (FDA) convened a Drug Safety and Risk Management Advisory Committee to advise the FDA Commissioner on risk management, risk communication and quantitative evaluation of spontaneous reports for drugs for human use and for any other product for which the FDA has regulatory responsibility. In collaboration with this committee, the FDA has been holding public meetings on product safety since 2002. Data from clinical trials, spontaneous reports of adverse reactions collected in pharmacovigilance databases and longitudinal patient information from healthcare or claims databases have been used to explore possible product-induced injuries. Statistical techniques based on the concept of proportionality (Almenoff et al, 2007) have been developed and applied extensively to look for possible safety signals. Reconciling potentially conflicting findings between meta-analyses of randomized clinical trials and observational studies can be a challenge for both regulators and pharmaceutical sponsors, as articulated by Michele et al (2010) in the case of tiotropium for chronic obstructive pulmonary disease. The story surrounding regulatory actions on rosiglitazone further testifies to this difficulty (Woodcock et al, 2010).

This increasing scrutiny of product safety has also reinforced the notion that the safety of a product needs to be evaluated with respect to the potential benefit from the product, the target disorder and the available treatment options for the disorder. Benefit/risk assessment has long been a subject of interest to many researchers and regulators (e.g. Mussen, Salek, Walker, 2007a, 2007b; Temple, 2007; Chuang-Stein, Entsuah, Pritchett, 2008; O'Neill, 2008; Pritchett, Tamura, 2008).
Benefit/risk assessment supplements separate safety and efficacy evaluations and should be routinely considered when a new product delivers greater efficacy, possibly at the expense of a higher incidence of clinically meaningful adverse product reactions. We have started to see applications of quantitative benefit/risk assessment by the FDA (e.g. Cardiovascular and Renal Advisory Committee, Feb 3 2009, http://www.fda.gov/AdvisoryCommittees/CommitteesMeetingMaterials/). Benefit/risk assessment consists of summarizing all relevant information, articulating the relative importance of various factors, identifying a sensible way to quantitatively combine factors while taking into account their relative importance, settling on a decision rule and identifying situations where a decision could be clear and unequivocal. If the assessment reaches a decision, the decision should be communicated to individuals who have an interest in and a need to know the outcome. This process is highly quantitative, involving data extraction, data summarization, choice of metrics and quantitative decision rules. Statisticians, in partnership with others involved in the assessment, have much to contribute to this endeavour.

Comparative Effectiveness Research and Health Technology Assessment
After a product receives marketing authorization, testing often continues for additional uses of the product. A new product could also be included in a head-to-head comparison against another marketed product for differentiation or comparative effectiveness research (CER). In a report from the US Congressional Budget Office titled "Research on the Comparative Effectiveness of Medical Treatments" (2007), comparative effectiveness is defined as a rigorous evaluation of the impact of different options that are available for treating a given medical condition for a particular set of patients. More recently, the Federal Coordinating Council for Comparative Effectiveness Research (2009) in the US defined CER as research comparing interventions in real-world settings. Various stakeholders have offered opinions on how CER should be implemented and how findings from such research should be used to determine treatment options. The entire October 2010 issue of the journal PharmacoEconomics is dedicated to CER. As pointed out by Birnbaum and Greenberg (2010) in an editorial in that issue, it remains to be seen whether CER will be a catalyst for conflict or a driver of positive change in the US healthcare system. Despite a call to generate evidence for CER using more pragmatic randomized clinical trials (Mullin et al, 2010), evidence will likely come from observational studies, at least in the short term. Thus, regardless of the role CER will ultimately play, there are many methodological issues underpinning CER that need to be resolved for the research findings to be credible (Brookhart et al, 2010; Tunis et al, 2010). Additional discussions on CER can be found in the October 2010 issue of the journal Health Affairs, which is completely dedicated to CER.

In regions where a government board decides if a new product is eligible for reimbursement and sets the price of the product under a national healthcare system, an approved product needs to undergo cost-effectiveness evaluation.
The latter is often referred to as health technology assessment (HTA) in these regions. The assessment involves pooling data from multiple studies and can rely on endpoints different from those used to make marketing authorization decisions. The work requires statisticians to collaborate closely with health economists, health care providers, third-party payers and patients. Systematic review (including meta-analysis) is often the basis for such efforts. Bayesian approaches have been proposed to make indirect and mixed-treatment comparisons between treatments (Lu and Ades, 2004; Sutton et al, 2008; Vanness, 2010). Looking for subgroups of patients for whom a new intervention is the most cost-effective is an integral part of the assessment. Several guidelines, prepared by groups associated with policy-making bodies, are available (e.g. National Institute for Health and Clinical Excellence, Methods Guide, 2008; Institute for Quality and Efficiency in Health Care, General Methods Guide, 2008). Understanding the advantages and potential pitfalls of the proposed methods is an important area where pharmaceutical statisticians can make substantial contributions to assist with treatment decisions. A Special Interest Group of the professional statistical association "Statisticians in the Pharmaceutical Industry" (PSI) in the United Kingdom has put together a PSI Health Technology Assessment Handbook (Fletcher et al, 2010). The Handbook articulates additional statistical challenges in HTA, such as the use of surrogate endpoints to demonstrate long-term clinical benefit in the absence of mortality and/or morbidity endpoints, extrapolating data obtained from clinical trials to estimate life-time benefits and costs, and handling uncertainty in economic modelling. While many statisticians have started to pay attention to the basis for drawing inferences in HTA, the field could benefit from some of the statistical rigor and maturity enjoyed by randomized clinical trials.
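To make the idea of an indirect comparison concrete, here is a minimal Python sketch of a simple frequentist adjusted indirect comparison through a common comparator. This is deliberately not the Bayesian mixed-treatment approach cited above; it is a simpler stand-in that assumes the two sets of trials are independent and similar enough to combine. The function name and all inputs are hypothetical.

```python
import math


def indirect_comparison(d_ab, se_ab, d_cb, se_cb, z=1.96):
    """Indirect estimate of treatment A versus treatment C via comparator B.

    d_ab, se_ab : estimated effect of A vs. B and its standard error
    d_cb, se_cb : estimated effect of C vs. B and its standard error
    Effects must be on an additive scale (e.g. log odds ratios or
    mean differences), and the two estimates are assumed independent.
    Returns (estimate, standard error, approximate 95% interval).
    """
    d_ac = d_ab - d_cb
    # Variances add because the two estimates come from separate trials.
    se_ac = math.sqrt(se_ab ** 2 + se_cb ** 2)
    interval = (d_ac - z * se_ac, d_ac + z * se_ac)
    return d_ac, se_ac, interval


if __name__ == "__main__":
    # Made-up log odds ratios versus a common comparator B.
    estimate, se, (low, high) = indirect_comparison(-0.5, 0.3, -0.2, 0.4)
    print(round(estimate, 3), round(se, 3), round(low, 3), round(high, 3))
```

The standard error of the indirect estimate is always larger than either input standard error, which is one reason indirect evidence is usually weaker than a head-to-head trial of comparable size.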
Trend toward Academia, Industry and Government Collaborations
The decrease in overall productivity, as measured by the number of new molecular entities approved each year, has led industry and regulators to look for better ways to conduct biopharmaceutical research. Examples include the FDA's Critical Path Initiative (2004, http://www.fda.gov/ScienceResearch/SpecialTopics/CriticalPathInitiative/) and the European Union's Innovative Medicines Initiative (2007) (http://www.imi.europa.eu/index_en.html). One outcome of this emphasis is the extensive statistical research on adaptive trial designs over the last 5 years (e.g. Journal of Biopharmaceutical Statistics 15(4), 2005; Gallo et al, 2006; Biometrical Journal 48(4), 2006; Drug Information Journal 40(4), 2006; Pharmaceutical Statistics 5(2), 2006; Mehta and Patel, 2006; Bornkamp et al, 2007; Wang et al, 2007; Bauer, 2008; Gaydos et al, 2009; Bretz et al, 2009a; Bretz et al, 2009b; Brannath et al, 2009). Research on adaptive design includes designs for dose-ranging studies and confirmatory studies. The interest of pharmaceutical sponsors in using adaptive designs has prompted regulators to issue two regulatory guidance documents on this subject (CHMP reflection paper, 2007; FDA draft guidance, 2010). Some recent research has been directed towards adaptations that include strategies for allocating resources to the exploratory and confirmatory stages for better program-level decisions (Julious and Swank, 2005).

Central to the concept of adaptive designs is a more efficient use of data and a more agile response to accumulated evidence on the effect of a new treatment. Statistical research on adaptive designs has focused heavily on Type I error control and bias in estimation. Experience from conducting adaptive trials has made it clear that such trials require intense collaboration and rigorous upfront planning among the multiple disciplines supporting the trials.
The need to control the dissemination of interim results and to watch out for potential operational bias should always be on the mind of trialists conducting adaptive trials. As experience accumulates from conducting adaptive trials, sharing what worked and what did not across academia, industry and government will benefit the entire clinical trial community.

Another area of ongoing academia-industry-government collaboration is the use of graphics to display safety data and to support better safety data review. Unlike efficacy evaluations, which focus on central tendency, safety assessment needs to look at the entire response distribution, especially the extreme responses. Effective graphics that display all safety data can help increase the likelihood of detecting potential safety signals. In 2009, an FDA/Industry/Academia working group was formed with the objective of developing a core set of graphics for visualizing clinical trial safety data (Soukup, 2010). The plan is to make this core set publicly available, including code and sample data for illustration. In addition, the working group plans to engage in outreach activities to educate and engage stakeholders. It is hoped that repeated use of the shared graphics will result in public familiarity with the graphic types and better appreciation of safety data, which in turn will encourage more effective safety assessment.

A Dynamic Environment Calling for Statistical Leadership and Innovations
In biopharmaceutical research, statisticians are at the heart of evidence collection, synthesis and communication. Statisticians have enormous opportunities and face probably an equal number of challenges. We can expect both opportunities and challenges to increase in the 21st century. Statisticians need to be in tune with the dynamic environment, to help meet the needs of multiple customers, to seize the opportunities, to create opportunities for academia-industry-government collaboration and to rise to the challenges (Chuang-Stein et al, 2010)!
Based on an article from Lovric, Miodrag (2011), International Encyclopedia of Statistical Science. Heidelberg: Springer Science+Business Media, LLC


