When a single genetic discovery forced oncologists to rethink how they tested cancer drugs, it revealed a fundamental challenge in medical research that affects us all.
Lumping: testing in broad populations
Splitting: focusing on specific subgroups
Imagine two patients arrive at a hospital with what appears to be the same cancer. They receive the same treatment, but one improves dramatically while the other shows no benefit. A decade ago, this mystery plagued oncologists treating colorectal cancer—until they discovered that what looked like one disease was actually multiple diseases with different genetic drivers. This realization created a fundamental dilemma for clinical researchers: should they test treatments in broad, mixed populations or focus on specific subgroups? This question lies at the heart of "lumping and splitting" in clinical trials—a debate that shapes which treatments reach which patients.
In clinical research, "lumping" refers to combining diverse patient populations into a single study, while "splitting" means separating them into distinct subgroups based on specific characteristics like genetic markers 1 .
The evolution of colorectal cancer treatment illustrates the splitting approach's transformative potential. In the mid-2000s, the drugs cetuximab and panitumumab were approved to treat metastatic colorectal cancer by targeting the epidermal growth factor receptor (EGFR) 1 .
Initially, these drugs were tested in "lumped" populations of all colorectal cancer patients. The results were modest at best.
Then, through retrospective analysis, researchers discovered that these EGFR-targeting drugs worked only for patients with a specific genetic profile: those whose tumors had a KRAS wild-type genotype 1 .
Newer trials focused exclusively on KRAS wild-type patients, where the drugs demonstrated substantially greater effectiveness. Meanwhile, researchers could redirect their efforts to find effective treatments for KRAS-mutant patients 1 .
| Time Period | Trial Population | Subgroup Analysis | Result |
|---|---|---|---|
| Pre-2009 | Mixed KRAS status patients | None | Modest treatment effects overall |
| 2009 | Retrospective analysis of previous trials | KRAS wild-type vs. mutant | Dramatic benefit in wild-type only |
| Post-2009 | KRAS wild-type only | Not applicable | Strong treatment effects demonstrated |
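The dilution described above can be sketched numerically. In a lumped trial, the overall effect is roughly an average of the subgroup effects, weighted by how many patients fall into each subgroup. The sketch below assumes invented numbers (a 60/40 split and a 5-month benefit in responders) that do not come from any actual trial, and the function name `pooled_effect` is a hypothetical helper.

```python
def pooled_effect(subgroups):
    """Average treatment effect across subgroups, weighted by subgroup size.

    `subgroups` is a list of (number_of_patients, effect) pairs.
    """
    total_patients = sum(n for n, _ in subgroups)
    return sum(n * effect for n, effect in subgroups) / total_patients


# Hypothetical illustration only: 60 KRAS wild-type patients who gain
# 5 months on average, and 40 KRAS-mutant patients who gain nothing.
mixed_trial = [(60, 5.0), (40, 0.0)]
wild_type_only = [(100, 5.0)]

lumped = pooled_effect(mixed_trial)    # benefit diluted by non-responders
split = pooled_effect(wild_type_only)  # benefit seen in responders alone

print(f"Lumped trial average benefit: {lumped:.1f} months")
print(f"Wild-type-only trial benefit: {split:.1f} months")
```

Under these assumed numbers, the lumped estimate understates the benefit for the patients who actually respond, which matches the pattern in the table above.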
The mixing of different study designs in medical research creates a significant challenge for evidence synthesis. How can researchers combine results from studies that used different approaches to lumping and splitting?
This is particularly problematic for meta-analysis, a statistical method that combines results from multiple studies to arrive at more reliable conclusions 1 . Traditional meta-analysis assumes that the populations across studies are comparable—an assumption that breaks down when some trials enroll mixed populations while others focus on specific subgroups.
Several approaches have been developed to address this problem:
Aggregate data (AD) methods: combining summary results from published trials using sophisticated modeling that accounts for different population compositions 1
Individual participant data (IPD) meta-analysis: using raw data from each participant in all studies, which is considered the gold standard but is more resource-intensive 1
Meta-regression: examining how treatment effects vary based on study-level characteristics like the percentage of biomarker-positive patients in a trial 1
| Method | Data Required | Advantages | Limitations |
|---|---|---|---|
| Aggregate Data (AD) Methods | Published summary statistics | Easier to implement, less resource-intensive | Limited ability to adjust for differences between studies |
| Individual Participant Data (IPD) Meta-Analysis | Raw data for each participant | Can standardize analyses across studies, examine subgroup effects | Requires significant resources and collaboration |
| Hybrid Approaches | Both aggregate and individual data | Balances practical considerations with statistical rigor | Complex methodology, requires access to some IPD |
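To make the contrast concrete, here is a minimal sketch of two aggregate-data calculations: a conventional inverse-variance pooled estimate, and a simple meta-regression of each trial's effect on its share of biomarker-positive patients. It is an illustration under stated assumptions, not the method used in the cited work. The trial data are invented, effects are assumed to be log hazard ratios (more negative is better), the weights are fixed-effect (no between-study variance term), and both function names are hypothetical.

```python
def inverse_variance_pool(effects, std_errors):
    """Fixed-effect pooled estimate: each trial weighted by 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def meta_regression(covariate, effects, std_errors):
    """Weighted least squares fit of effect = intercept + slope * covariate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, covariate)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, effects)) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, covariate, effects))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, covariate))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical trials: fraction of biomarker-positive patients enrolled,
# estimated log hazard ratio, and its standard error.
fraction_positive = [0.4, 0.6, 1.0]
log_hazard_ratio = [-0.16, -0.24, -0.40]
standard_error = [0.1, 0.1, 0.2]

pooled = inverse_variance_pool(log_hazard_ratio, standard_error)
intercept, slope = meta_regression(
    fraction_positive, log_hazard_ratio, standard_error
)
# Predicted effect in a trial that enrolls only biomarker-positive patients.
predicted_all_positive = intercept + slope * 1.0

print(f"Naive pooled log HR: {pooled:.3f}")
print(f"Predicted log HR at 100% biomarker-positive: {predicted_all_positive:.3f}")
```

With these made-up inputs, the naive pooled estimate sits between the mixed and enriched trials, while the regression line suggests a larger effect in a fully biomarker-positive population. A real analysis would also report uncertainty for the slope and would usually allow for between-study heterogeneity.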
The lumping and splitting dilemma extends beyond patient populations to what researchers measure—the endpoints. How we categorize and combine endpoints significantly influences trial conclusions 6 .
Clinical endpoints: these directly capture how a patient feels, functions, or survives 6 .
Surrogate endpoints: laboratory measures or other indicators that substitute for clinical endpoints but don't directly measure patient benefit 6 .
| Endpoint Category | Description | Examples | Considerations |
|---|---|---|---|
| Clinician-Reported (ClinRO) | Based on clinical judgment or interpretation | Cancer remission, ulcer healing | May involve subjectivity despite clinical expertise |
| Patient-Reported (PRO) | Directly reported by patients | Pain scores, quality of life measures | Captures the patient experience directly |
| Performance-Based (PerfO) | Standardized task assessment | 6-minute walk test, cognitive assessments | Objective but may not reflect daily functioning |
| Surrogate Endpoints | Laboratory or biomarker measures | Blood pressure, cholesterol levels | Often faster to measure but may not predict clinical benefit |
The choice between these endpoints involves similar lumping/splitting considerations. Composite endpoints (lumping multiple outcomes together) can increase statistical efficiency but may obscure effects on individual components 6 .
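A small sketch shows how a composite endpoint can hide what happens to its components. The two components used here (death and hospitalization), the arm sizes, and the event counts are all hypothetical, chosen only to illustrate the point; `make_arm` and `event_rate` are illustrative helpers.

```python
def make_arm(n_patients, deaths, hospitalizations):
    """Build per-patient records; each patient has at most one event here."""
    arm = []
    for i in range(n_patients):
        arm.append({
            "death": i < deaths,
            "hospitalization": deaths <= i < deaths + hospitalizations,
        })
    return arm


def event_rate(patients, component=None):
    """Share of patients with one component event, or with any event
    (the composite) when `component` is None."""
    if component is None:
        hits = sum(1 for p in patients if any(p.values()))
    else:
        hits = sum(1 for p in patients if p[component])
    return hits / len(patients)


# Hypothetical arms of 10 patients each.
control = make_arm(10, deaths=2, hospitalizations=4)
treated = make_arm(10, deaths=2, hospitalizations=1)

for label in (None, "death", "hospitalization"):
    name = label or "composite"
    print(f"{name}: control {event_rate(control, label):.1f}, "
          f"treated {event_rate(treated, label):.1f}")
```

In this invented example the composite event rate is halved, yet the entire improvement comes from fewer hospitalizations, and the death rate is unchanged. Reporting only the lumped composite would not reveal that.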
Regulatory agencies like the FDA face the challenge of setting standards that ensure drug safety and efficacy while adapting to the complexities of precision medicine. The traditional "two-trial paradigm"—requiring two significant pivotal trials for drug approval—is being reexamined in light of these challenges 2 .
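The statistical logic behind requiring two trials can be sketched with simple arithmetic. The sketch assumes the trials are independent, that each is tested at a one-sided significance level of 0.025, and that each has 90% power; these values are common conventions used here as assumptions, not figures taken from the source, and the function name is hypothetical.

```python
def all_trials_significant(per_trial_probability, n_trials=2):
    """Probability that every one of `n_trials` independent trials is
    significant, given the same per-trial probability."""
    return per_trial_probability ** n_trials


# Assumed values for illustration.
alpha_one_sided = 0.025  # chance an ineffective drug "wins" one trial
power = 0.90             # chance an effective drug wins one trial

false_positive_both = all_trials_significant(alpha_one_sided)
true_positive_both = all_trials_significant(power)

print(f"Ineffective drug passes both trials: {false_positive_both:.6f}")
print(f"Effective drug passes both trials:   {true_positive_both:.2f}")
```

Under these assumptions, requiring two wins makes a false approval far less likely, but it also lowers the chance that a truly effective drug clears the bar. That trade-off becomes harder to accept when splitting leaves only a small eligible subgroup to enroll.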
As medicine continues to evolve toward greater personalization, the tension between lumping and splitting will likely intensify. The ideal approach isn't uniformly choosing one over the other but rather strategically applying each based on the specific research context and current understanding of the disease.
Adaptive trial designs offer one path forward: they can respond to accumulating evidence during a trial, potentially starting with broader populations and then focusing on responsive subgroups 7 .
Better biomarker discovery and validation will help researchers split populations along biologically meaningful lines rather than arbitrary distinctions.
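The idea of starting broad and then narrowing can be sketched as a toy interim decision rule. Everything here is hypothetical: the function name, the threshold, and the interim estimates are invented for illustration. Real adaptive designs pre-specify their rules and use statistical adjustments to control false-positive rates, which this sketch does not attempt.

```python
def subgroups_to_keep_enrolling(interim_effects, futility_threshold):
    """Return the subgroups whose interim estimated benefit meets the
    threshold; the others would stop enrolling in the next stage.

    `interim_effects` maps subgroup name -> estimated benefit
    (larger means better).
    """
    return sorted(
        name for name, effect in interim_effects.items()
        if effect >= futility_threshold
    )


# Hypothetical interim look at a trial that began with a mixed population.
interim = {"KRAS wild-type": 4.8, "KRAS-mutant": 0.2}
continuing = subgroups_to_keep_enrolling(interim, futility_threshold=1.0)

print("Continue enrolling:", continuing)
```

In this invented example, the trial would begin lumped and then split at the interim look, continuing only in the subgroup that appears to benefit.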
What remains constant is the fundamental challenge: generating evidence that is both statistically reliable and relevant to the patients who will ultimately receive treatments. As the KRAS story demonstrates, getting the nosology right—both for populations and endpoints—isn't just academic; it directly translates into more effective, more targeted therapies for patients who need them.