This article provides a comprehensive resource for researchers, scientists, and drug development professionals on the analytical validation of PD-L1 assays for clinical use. It covers the foundational biology of the PD-1/PD-L1 axis and its critical role as a predictive biomarker in immuno-oncology. The content details the current methodological landscape, including FDA-approved companion diagnostics, laboratory-developed tests, and emerging liquid biopsy approaches. It addresses key challenges in pre-analytical variables, assay standardization, and tumor heterogeneity, while providing evidence-based strategies for troubleshooting and optimization. Furthermore, the article systematically compares assay performance and validation frameworks, including interchangeability studies and regulatory requirements, offering a complete guide for implementing robust PD-L1 testing in clinical and research settings.
The programmed death protein 1 (PD-1) and its ligand PD-L1 form a critical immune checkpoint pathway that tumors exploit to evade host immune surveillance [1] [2]. Under normal physiological conditions, the PD-1/PD-L1 axis maintains immune homeostasis by preventing excessive immune responses and autoimmunity [2] [3]. However, cancer cells subvert this pathway for immune escape: PD-L1 expressed on tumor cells binds to PD-1 on activated T cells, transmitting an inhibitory signal that suppresses T cell effector functions, promotes T cell exhaustion, and creates an immunosuppressive tumor microenvironment (TME) [2] [4] [5]. Blocking this mechanism represents one of the most significant breakthroughs in cancer immunotherapy, with inhibitors targeting the PD-1/PD-L1 axis achieving remarkable success across various cancers [1] [3]. Understanding the molecular intricacies of this signaling pathway and the analytical methods for detecting PD-L1 expression is fundamental for optimizing patient selection and therapeutic outcomes.
PD-1 is a transmembrane protein belonging to the CD28/CTLA-4 superfamily, expressed on activated T cells, B cells, natural killer (NK) cells, and monocytes [2] [5]. Structurally, PD-1 consists of an extracellular Immunoglobulin variable (IgV)-like domain, a transmembrane domain, and a cytoplasmic tail containing both an immunoreceptor tyrosine-based inhibition motif (ITIM) and an immunoreceptor tyrosine-based switch motif (ITSM) [2] [3]. Its primary ligand, PD-L1 (B7-H1; CD274), is broadly expressed on antigen-presenting cells (APCs), non-hematopoietic cells, and various tumor cells [2] [3].
The binding of PD-L1 to PD-1 initiates a cascade of intracellular events that ultimately inhibit T cell activation. Upon engagement, the ITSM motif in PD-1's cytoplasmic tail becomes phosphorylated and recruits the tyrosine phosphatases SHP-1 and SHP-2 [2] [3]. Activated SHP-2 then dephosphorylates key signaling molecules downstream of the T cell receptor (TCR), including CD3ζ, ZAP70, and PKCθ, effectively attenuating TCR signaling [2] [5]. This phosphatase activity also targets the co-stimulatory receptor CD28, further dampening T cell activation [2]. The resulting inhibition disrupts critical activation pathways such as PI3K/Akt, leading to reduced T cell proliferation, cytokine production (e.g., IL-2, IFN-γ), and cytotoxic activity [2] [3] [5].
Figure 1: PD-1/PD-L1 Signaling Pathway in T Cell Inhibition. The binding of PD-L1 to PD-1 recruits SHP-2, which dephosphorylates key TCR signaling molecules (ZAP70, PKCθ) and the co-stimulatory receptor CD28, ultimately suppressing T cell effector functions.
Cancer cells dynamically regulate PD-L1 expression through multiple mechanisms in response to TME pressures. Key regulatory pathways include:
Additionally, post-translational modifications, particularly ubiquitination, critically control PD-L1 stability. Several E3 ubiquitin ligases target PD-L1 for proteasomal degradation, while deubiquitinating enzymes can enhance PD-L1 stability, representing a promising therapeutic avenue to modulate PD-L1 levels [1] [2].
The immunohistochemical (IHC) detection of PD-L1 expression has emerged as a critical companion diagnostic for immune checkpoint inhibitor therapies. However, the existence of multiple validated assays using different antibody clones and platforms presents significant challenges for clinical implementation and interpretation [6] [7].
Table 1: Comparison of FDA-Approved PD-L1 IHC Assays and Their Performance Characteristics
| Assay (Clone) | Platform | Primary Target | Scoring Algorithm | Key Cancer Indications | Concordance with 22C3 (CPS) |
|---|---|---|---|---|---|
| 22C3 pharmDx | Dako Link 48 | PD-L1 | CPS (≥10) | UC, NSCLC, Gastric, HNSCC | Reference [7] |
| SP263 | Ventana Benchmark | PD-L1 | CPS (≥10) / TC (≥25%) | NSCLC, UC | OPA: 89.6% [7] |
| SP142 | Ventana Benchmark | PD-L1 | IC (≥5%) / TC (≥50%) | UC, TNBC | Low PPA (CPS) [7] |
| 28-8 | Dako Link 48 | PD-L1 | TC (≥1%) | NSCLC, RCC | Not directly compared |
| SP263 (Lab Validation) | Ventana Platform | PD-L1 | TC (≥1%) | NSCLC | Concordance: 76% [6] |
Abbreviations: CPS: Combined Positive Score; TC: Tumor Cell Score; IC: Immune Cell Score; OPA: Overall Percent Agreement; PPA: Positive Percent Agreement; UC: Urothelial Carcinoma; NSCLC: Non-Small Cell Lung Cancer; HNSCC: Head and Neck Squamous Cell Carcinoma; TNBC: Triple-Negative Breast Cancer; RCC: Renal Cell Carcinoma.
Multiple studies have demonstrated that while some assays show strong analytical concordance, others yield substantially different results. In urothelial carcinoma, the SP263 and 22C3 assays demonstrate high overall percent agreement (OPA: 89.6%) when using the combined positive score (CPS) algorithm, suggesting potential interchangeability in clinical practice [7]. In contrast, the SP142 assay consistently shows lower positivity rates and poor positive percent agreement (PPA) compared to both 22C3 and SP263, regardless of scoring method [7]. This discrepancy was confirmed in non-small cell lung cancer (NSCLC), where a laboratory-developed test using SP142 clone showed only moderate concordance (76%) with the validated SP263 assay for tumor cell staining, and even lower concordance (61%) for immune cell staining [6].
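The agreement statistics quoted above (OPA and PPA, together with negative percent agreement, NPA) are simple ratios over paired positive/negative calls. The following is a minimal sketch of how they could be computed; the function name, the `Agreement` container, and the example calls are hypothetical and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class Agreement:
    opa: float  # overall percent agreement
    ppa: float  # positive percent agreement (relative to the reference assay)
    npa: float  # negative percent agreement (relative to the reference assay)


def percent_agreement(reference, comparator):
    """Compare paired binary calls (True = PD-L1 positive) from two assays."""
    if len(reference) != len(comparator):
        raise ValueError("paired calls must have equal length")
    pairs = list(zip(reference, comparator))
    ref_pos = sum(1 for r, _ in pairs if r)
    ref_neg = len(pairs) - ref_pos
    if ref_pos == 0 or ref_neg == 0:
        raise ValueError("reference needs at least one positive and one negative case")
    both_pos = sum(1 for r, c in pairs if r and c)
    both_neg = sum(1 for r, c in pairs if not r and not c)
    return Agreement(
        opa=100.0 * (both_pos + both_neg) / len(pairs),
        ppa=100.0 * both_pos / ref_pos,
        npa=100.0 * both_neg / ref_neg,
    )


# Hypothetical calls for ten cases, already dichotomized at an assay-specific cutoff.
reference_calls = [True, True, True, True, False, False, False, False, False, False]
comparator_calls = [True, True, True, False, False, False, False, False, False, True]
result = percent_agreement(reference_calls, comparator_calls)
print(f"OPA={result.opa:.1f}%  PPA={result.ppa:.1f}%  NPA={result.npa:.1f}%")
```

Because PPA and NPA are computed relative to the assay chosen as reference, swapping the two assays can change both values while OPA stays the same.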
The complexity of PD-L1 assessment is compounded by different scoring algorithms validated in clinical trials:
These scoring methods are not directly comparable, and their predictive value varies across cancer types and specific immune checkpoint inhibitors [1] [7].
Table 2: Comparison of PD-L1 Scoring Algorithms and Clinical Utility
| Scoring Algorithm | Calculation Method | Clinical Cutoffs | Associated Therapies | Advantages | Limitations |
|---|---|---|---|---|---|
| Tumor Proportion Score (TPS) | % of positive tumor cells | ≥1%, ≥50% | Pembrolizumab (NSCLC) | Simple, reproducible | Ignores immune cell staining |
| Combined Positive Score (CPS) | (PD-L1+ cells / viable tumor cells) × 100 | ≥1, ≥10 | Pembrolizumab (UC, Gastric) | Captures immune landscape | Complex counting required |
| Immune Cell (IC) Score | % area of immune cells | IC0/1/2/3 (0-10%) | Atezolizumab (UC) | Focus on immune contexture | Challenging in low-infiltrate tumors |
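To make the difference between TPS and CPS concrete, the two formulas in the table can be written as short functions. This is an illustrative sketch only: the function names and cell counts are hypothetical, and capping CPS at 100 is an assumption based on common reporting convention rather than something stated in this article.

```python
def tumor_proportion_score(pos_tumor_cells, viable_tumor_cells):
    """TPS: percentage of viable tumor cells that stain positive for PD-L1."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    return 100.0 * pos_tumor_cells / viable_tumor_cells


def combined_positive_score(pos_tumor_cells, pos_immune_cells, viable_tumor_cells):
    """CPS: all PD-L1-positive cells (tumor plus immune) per 100 viable tumor cells."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    raw = 100.0 * (pos_tumor_cells + pos_immune_cells) / viable_tumor_cells
    # Assumption: the score is reported on a 0-100 scale, so larger values are capped.
    return min(raw, 100.0)


# Hypothetical counts from one slide.
tps = tumor_proportion_score(pos_tumor_cells=12, viable_tumor_cells=200)
cps = combined_positive_score(pos_tumor_cells=12, pos_immune_cells=8, viable_tumor_cells=200)
print(f"TPS = {tps:.1f}%, CPS = {cps:.1f}")
```

With these counts the same slide has a TPS of 6% but a CPS of 10, so it would meet a CPS ≥10 cutoff while falling well short of a TPS ≥50% cutoff, which illustrates why the scores cannot be substituted for one another.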
The following protocol details the validated methodology for PD-L1 IHC using the Ventana SP263 assay, as employed in clinical trials and comparative studies [6] [7]:
Tissue Preparation:
Staining Procedure (Ventana Benchmark Platform):
Quality Control:
For CPS scoring in urothelial carcinoma [7]:
Figure 2: PD-L1 IHC Experimental Workflow. Standardized protocol for PD-L1 immunohistochemical staining and analysis using the Ventana SP263 assay.
Table 3: Key Research Reagent Solutions for PD-1/PD-L1 Investigation
| Reagent/Platform | Specific Function | Application Context | Key Characteristics |
|---|---|---|---|
| Anti-PD-L1 Clone SP263 | Rabbit monoclonal antibody targeting intracellular PD-L1 domain | IHC on Ventana platforms; companion diagnostic for durvalumab | Detects epitope corresponding to amino acids 284-290 [6] |
| Anti-PD-L1 Clone 22C3 | Mouse monoclonal antibody against PD-L1 | IHC on Dako platforms; companion diagnostic for pembrolizumab | Validated for CPS scoring in multiple cancers [7] |
| Anti-PD-L1 Clone SP142 | Rabbit monoclonal antibody against PD-L1 intracellular domain | IHC on Ventana platforms; complementary diagnostic for atezolizumab | Lower sensitivity for tumor cells, higher for immune cells [6] [7] |
| Ventana Benchmark Series | Automated IHC/ISH staining platforms | Standardized PD-L1 staining for clinical trials | Ensures reproducibility across laboratories [6] [7] |
| Dako Autostainer Link 48 | Automated IHC staining system | PD-L1 staining with 22C3 and 28-8 assays | Platform-specific optimization required [7] |
| OptiView DAB Detection Kit | Amplification system for IHC signal | Enhanced detection sensitivity for low-abundance targets | Reduced background staining with proper optimization [6] |
| TMA Construction Systems | High-throughput tissue microarray technology | Parallel analysis of multiple tumor samples | Enables comparative studies across cancer types [7] |
The PD-1/PD-L1 signaling axis represents a sophisticated immune evasion mechanism that cancers exploit through multiple molecular strategies. While immune checkpoint inhibitors blocking this pathway have revolutionized oncology, their optimal use depends on reliable PD-L1 detection assays. Current evidence indicates that while some assays (particularly SP263 and 22C3) show strong analytical concordance and may be interchangeable, others (notably SP142) demonstrate significant divergence in staining patterns and positivity rates [6] [7]. These differences have direct clinical implications for patient selection, particularly as regulatory requirements evolve. Future directions should focus on greater harmonization of scoring systems, validation of laboratory-developed tests against reference standards, and integration of complementary biomarkers such as tumor mutational burden and microbiome signatures to improve patient stratification [1] [8]. The analytical validation of PD-L1 assays remains a critical component in the broader framework of precision immuno-oncology, ensuring that transformative immunotherapies reach the patients most likely to benefit.
The programmed death-ligand 1 (PD-L1) serves as a critical mechanism for tumor immune evasion. The binding of PD-L1, expressed on tumor or immune cells, to its receptor PD-1 on activated T cells, inhibits T-cell effector function, enabling tumors to escape host immune surveillance [9] [10]. This biological axis became a prime target for immune checkpoint inhibitors (ICIs), revolutionizing cancer treatment. Consequently, PD-L1 protein expression emerged as the first major predictive biomarker for patient selection in anti-PD-1/PD-L1 therapy.
However, the journey of PD-L1 from a biological concept to a clinically validated biomarker is complex. An analysis of the initial 45 FDA approvals for immune checkpoint inhibitors revealed that PD-L1 expression was predictive of response in only 28.9% of cases, was not predictive in 53.3%, and was not tested in 17.8% of the approvals [11]. This indicates that while PD-L1 is a crucial component of the immunotherapy landscape, its utility as a standalone biomarker is limited and nuanced, varying significantly across tumor types, assay platforms, and scoring methodologies.
The predictive power of PD-L1 is not universal but is context-dependent. A 2024 meta-analysis of biliary tract cancer (BTC) demonstrated that while PD-L1 positivity did not significantly correlate with objective response rate (ORR) or disease control rate (DCR), it was associated with significantly improved progression-free survival (PFS) and overall survival (OS). The pooled hazard ratios were 0.54 for PFS and 0.58 for OS for PD-L1-positive patients compared to PD-L1-negative patients [12]. This survival benefit underscores PD-L1's prognostic value in specific cancers, even in the absence of a strong correlation with immediate tumor response rates.
Conversely, in clear cell renal cell carcinoma (ccRCC), PD-L1 expression assessed by various FDA-approved assays (22C3, 28-8, SP142, SP263) showed remarkably low positivity in tumor cells across all assays. Positivity in immune cells was approximately 15% for most assays, except for SP142, which showed only 2.1% positivity [13]. This highlights not only tumor-type specificity but also the impact of the assay itself on biomarker prevalence.
Table 1: Predictive Value of PD-L1 Across Different Cancers
| Cancer Type | Predictive Value for ORR/DCR | Predictive Value for Survival | Key Findings |
|---|---|---|---|
| Biliary Tract Cancer | Not significant (ORR OR: 1.56) [12] | Significant (OS HR: 0.58; PFS HR: 0.54) [12] | PD-L1 positive associated with longer PFS and OS. |
| Non-Small Cell Lung Cancer | Variable by assay and cutoff [14] [11] | Significant for PFS at ≥50% cutoff (HR: 0.67) [14] | Combined with TILs enhances predictive power. |
| Clear Cell RCC | Low tumor cell positivity limits utility [13] | Shorter cancer-specific survival with PD-L1+ in ICs [13] | Prognostic rather than predictive value. |
Given the limitations of PD-L1 as a standalone biomarker, research has shifted towards combination biomarkers. A 2025 systematic review in NSCLC found that while PD-L1 expression (at a ≥50% cutoff) was associated with longer PFS (HR: 0.67), and tumor-infiltrating lymphocytes (TILs) alone were not significantly predictive, the combination of PD-L1 and CD8+ TILs provided the strongest predictive value. The pooled hazard ratio for PFS was 0.39 and for OS was 0.42 for patients positive for both biomarkers [14]. This synergistic effect underscores that the functional immune response is multi-faceted and cannot be fully captured by a single metric.
A primary challenge in standardizing PD-L1 testing is the existence of multiple FDA-approved companion diagnostic assays, each developed alongside specific therapeutic agents. These assays employ different antibody clones, staining platforms, and scoring criteria, leading to potential discordance.
A 2025 comparative study in ccRCC evaluated four FDA-approved assays (22C3, 28-8, SP142, and SP263) on tissue microarrays. The results revealed significant differences in PD-L1 detection rates and concordance [13]. While the 28-8 assay showed the highest pairwise concordance with others (kappa statistics: 0.52 with 22C3, 0.46 with SP263), the SP142 assay consistently demonstrated markedly lower PD-L1 positivity in both tumor and immune cells, making it an outlier [13]. This lack of perfect interchangeability necessitates strict adherence to the specific companion diagnostic assay linked to the intended therapy.
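The kappa statistics cited above correct raw agreement for the agreement expected by chance. As a minimal sketch, Cohen's kappa for two assays with binary calls could be computed as follows; the function name and the example calls are hypothetical, and the cited study may have used a different software implementation.

```python
def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two raters/assays with paired binary calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty call lists of equal length")
    n = len(calls_a)
    a = [bool(x) for x in calls_a]
    b = [bool(x) for x in calls_b]
    # Observed agreement: fraction of cases where both assays give the same call.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each assay's marginal positivity rate.
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical paired calls for ten cores on a tissue microarray.
assay_a = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
assay_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(f"kappa = {cohens_kappa(assay_a, assay_b):.2f}")
```

In this example the raw agreement is 80%, yet kappa is only about 0.58, because two assays that are each positive in 40% of cases would already agree on roughly half of them by chance.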
Table 2: Comparison of FDA-Approved PD-L1 Immunohistochemistry Assays
| Assay (Antibody Clone) | Staining Platform | Example Companion Drug | Typical Scoring Method(s) | Key Considerations |
|---|---|---|---|---|
| 22C3 pharmDx | Dako/Agilent | Pembrolizumab | TPS, CPS | Common cutoff: TPS ≥1% or ≥50% [11] |
| 28-8 pharmDx | Dako/Agilent | Nivolumab | TPS | Demonstrated high concordance with other assays in RCC [13] |
| SP263 | Ventana/Roche | Durvalumab | TPS, TC/IC | Comparable performance to 22C3 and 28-8 in some studies [13] |
| SP142 | Ventana/Roche | Atezolizumab | TC/IC (IC key) | Noted for significantly lower positivity rates, especially in ICs [13] |
The development of new assays continues with a focus on harmonization and improved performance. A 2025 feasibility study introduced the novel PD-L1 CAL10 assay (Leica Biosystems) and compared it to the established SP263 assay on NSCLC samples. The study met its pre-specified target, with the lower bound of the 95% confidence interval for overall percent agreement (OPA) being 86.2% at TPS ≥50% and 94.0% at TPS ≥1% [9]. This demonstrates that new assays can achieve high concordance with existing standards, potentially offering more options for pathology laboratories.
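Acceptance criteria of this kind are usually framed around the lower confidence bound of an agreement proportion rather than its point estimate. The sketch below shows one common way to obtain such a bound, the Wilson score interval. The choice of the Wilson method is an assumption (the interval method used in the cited study is not stated here), and the counts and the 85% threshold are hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z=1.959964 gives ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical example: 95 of 100 paired cases received concordant calls.
lower, upper = wilson_interval(successes=95, n=100)
target = 0.85  # hypothetical pre-specified acceptance threshold
print(f"OPA 95% CI: {lower:.3f} to {upper:.3f}; meets target: {lower >= target}")
```

Judging a study on the lower bound means that small sample sizes are penalized: the same 95% observed agreement on only 20 cases would give a much wider interval and a lower bound that may miss the target.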
Diagram 1: PD-L1 assay validation workflow. OPA: Overall Percent Agreement, PPA: Positive Percent Agreement, NPA: Negative Percent Agreement.
While PD-L1 IHC is the most widely used biomarker, its limitations have driven the search for alternatives and complementary biomarkers. A network meta-analysis comparing predictive assays for anti-PD-1/PD-L1 monotherapy found that multiplex immunohistochemistry/immunofluorescence (mIHC/IF) exhibited the highest sensitivity (0.76), while microsatellite instability (MSI) had the highest specificity (0.90) and diagnostic odds ratio (6.79) [15]. This suggests that different biomarkers may be optimal for different clinical contexts.
Furthermore, the combination of biomarkers is a promising frontier. The same analysis revealed that when PD-L1 IHC was combined with tumor mutational burden (TMB), the sensitivity for predicting response improved significantly to 0.89 [15]. This aligns with the understanding that a comprehensive view of the tumor-immune microenvironment, incorporating genomic and proteomic data, is likely more informative than any single parameter.
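Sensitivity, specificity, and the diagnostic odds ratio (DOR) reported in such analyses all derive from a 2×2 table of biomarker status against treatment response. A minimal sketch of the calculation is shown below; the function name and the counts are hypothetical and are not data from the cited meta-analysis.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and diagnostic odds ratio from a 2x2 table.

    tp: biomarker-positive responders     fn: biomarker-negative responders
    fp: biomarker-positive non-responders tn: biomarker-negative non-responders
    """
    if min(tp, fn, fp, tn) < 0:
        raise ValueError("counts must be non-negative")
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one responder and one non-responder")
    if fn == 0 or fp == 0:
        raise ValueError("DOR is undefined when fn or fp is zero")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # DOR: odds of a positive test in responders over odds in non-responders.
    dor = (tp * tn) / (fp * fn)
    return sensitivity, specificity, dor


# Hypothetical cohort: 50 responders and 100 non-responders.
sens, spec, dor = diagnostic_metrics(tp=38, fn=12, fp=20, tn=80)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} DOR={dor:.2f}")
```

A biomarker can score well on one axis and poorly on another, which is why the highest sensitivity and the highest specificity in the analysis above belong to different assays.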
Diagram 2: Multi-modal biomarker integration for improved prediction.
Table 3: Key Research Reagent Solutions for PD-L1 Biomarker Investigation
| Reagent/Resource | Function/Application | Example Specifics |
|---|---|---|
| FDA-Approved IHC Assays | Validated companion diagnostics for therapeutic selection. | 22C3, 28-8, SP263, SP142 clones on specified staining platforms [13] [11]. |
| Novel Antibody Clones | Development and validation of new diagnostic assays. | CAL10 clone for use on BOND-III staining systems [9]. |
| Tissue Microarrays (TMAs) | High-throughput validation of IHC assays across multiple tumor samples under standardized conditions. | Used for concordance studies across hundreds of patient samples [13]. |
| Whole Slide Scanners | Enables digital pathology, archiving, and computational analysis of stained samples. | Aperio GT 450 scanner for creating whole slide images [9]. |
| Automated Staining Systems | Ensure reproducible and standardized IHC staining protocols. | BOND-III (Leica), Benchmark Ultra (Ventana) [9] [13]. |
PD-L1 expression remains a cornerstone predictive biomarker in immuno-oncology, with proven utility in guiding therapy for specific cancers like NSCLC and biliary tract cancer. However, the evidence clearly demonstrates that it is an imperfect biomarker, characterized by tumor-type heterogeneity, technical variability between assays, and a lack of universal predictive power.
The future of predictive biomarkers lies in integrated approaches. Combining PD-L1 IHC with assessments of the tumor immune contexture, such as CD8+ TIL density, or with genomic markers like TMB, creates a more robust predictive model [15] [14]. Furthermore, the ongoing harmonization of existing assays and the development of novel, highly concordant tests are critical for standardizing PD-L1 testing across clinical laboratories. As precision medicine advances, moving beyond a one-dimensional view of PD-L1 towards a multi-analyte diagnostic strategy will be essential to accurately identify patients most likely to benefit from costly and potentially toxic immunotherapies.
The programmed cell death ligand 1 (PD-L1) serves as a critical immunoinhibitory molecule within the tumor microenvironment, where its interaction with the PD-1 receptor on T cells leads to T cell exhaustion and facilitates immune escape of cancer cells [16]. The assessment of PD-L1 expression has evolved into a cornerstone of cancer immunotherapy, not only as a predictive biomarker for response to immune checkpoint inhibitors but also as a significant prognostic factor across various malignancies [17]. However, the prognostic value of PD-L1 expression demonstrates considerable variability among different cancer types, with associations ranging from poor to favorable clinical outcomes depending on the specific cancer and tumor microenvironment context [16]. This comprehensive review examines the multifaceted prognostic significance of PD-L1 expression across diverse cancer types, explores the technical challenges in its detection, and discusses emerging technologies and methodologies that are shaping the future of PD-L1 as a clinical biomarker.
PD-L1 expression carries distinct prognostic implications across different cancer types, reflecting the complex interplay between tumors and the host immune system. The following table summarizes the association between PD-L1 expression and clinical outcomes in various malignancies:
Table 1: Prognostic Value of PD-L1 Expression Across Different Cancer Types
| Cancer Type | Prognostic Association | Key Supporting Evidence |
|---|---|---|
| Gastric Cancer | Poor clinical outcome [16] | Overexpression suppresses T-cell activation, promoting tumor progression [16] |
| Hepatocellular Carcinoma | Poor clinical outcome [16] [18] | Associated with immune evasion mechanisms in the liver microenvironment [16] [18] |
| Renal Cell Carcinoma | Poor clinical outcome [16] | Creates immunosuppressive microenvironment [16] |
| Esophageal Cancer | Poor clinical outcome [16] [19] | Negative predictor of overall survival in advanced ESCC treated with chemotherapy [19] |
| Pancreatic Cancer | Poor clinical outcome [16] | Correlates with worse outcome independent of MMR status and TILs [16] |
| Ovarian Cancer | Poor clinical outcome [16] | Generates immunosuppressive tumor microenvironment [16] |
| Bladder Cancer | Poor clinical outcome [16] | Overexpression linked to tumor progression [16] |
| Breast Cancer | Better clinical outcome [16] [17] | Significant association with better overall survival (56.6% of cases) [17] |
| Merkel Cell Carcinoma | Better clinical outcome [16] | Inverse correlation with poor prognosis [16] |
| Non-Small Cell Lung Cancer | Controversial [16] [17] | Predictive value when combined with other indicators like CD8+/Foxp3+ T cell ratio [16] |
| Colorectal Cancer | Controversial [16] [17] | Varies by study; stromal vs. tumor cell expression impacts interpretation [17] |
| Melanoma | Controversial [16] | Inconsistent prognostic value across different studies [16] |
The differential prognostic significance of PD-L1 across cancer types highlights the biological complexity of the PD-1/PD-L1 axis. In cancers where PD-L1 expression correlates with poor outcomes, it primarily functions as a mechanism of immune evasion, where tumor cells upregulate PD-L1 to suppress T-cell mediated antitumor immunity [16]. Conversely, in cancers like breast cancer and Merkel cell carcinoma, PD-L1 expression may represent a marker of robust immune infiltration, where the presence of tumor-infiltrating lymphocytes drives compensatory PD-L1 upregulation as part of an active immune response [16] [17]. This "reactive" PD-L1 expression pattern is associated with better clinical outcomes and potentially enhanced response to immunotherapy.
The controversial prognostic role of PD-L1 in lung cancer, colorectal cancer, and melanoma underscores additional layers of complexity. In these malignancies, the prognostic value may depend on specific histological subtypes, compartmental expression patterns (tumor cells versus immune cells), and the interplay with other biomarkers in the tumor microenvironment [16] [17]. For instance, in NSCLC, PD-L1 expression alone may not be prognostic but gains significant predictive value when combined with other indicators such as CD8+/Foxp3+ T-cell ratio [16].
Table 2: Factors Contributing to Controversial Prognostic Value of PD-L1
| Factor | Impact on Prognostic Interpretation |
|---|---|
| Tumor Heterogeneity | Spatial and temporal variations in PD-L1 expression within tumors [16] |
| Detection Timing | Differences between primary diagnosis and metastatic progression [17] |
| Compartmental Expression | Distinct implications of tumor cell vs. immune cell PD-L1 expression [17] |
| Technical Variability | Different antibodies, platforms, and scoring systems [20] |
| Tumor Microenvironment | Interaction with other immune cells and checkpoint molecules [16] |
The primary method for PD-L1 detection in clinical practice is immunohistochemistry (IHC), with several validated assays utilized across different cancer types. The following table compares the performance characteristics of major PD-L1 IHC assays:
Table 3: Comparison of PD-L1 IHC Assays and Their Clinical Implementation
| Assay/Clone | Staining Platform | Diagnostic Status | Key Characteristics | Tumor Positivity Rate* |
|---|---|---|---|---|
| 22C3 | Dako | Companion diagnostic for pembrolizumab [20] | Highest tumor proportion score (TPS) with strong membranous/cytoplasmic staining [20] | 35% (at ≥1% cutoff) [20] |
| SP263 | Ventana | Complementary diagnostic | Similar staining intensity to E1L3N [20] | 34% (at ≥1% cutoff) [20] |
| SP142 | Ventana | Complementary diagnostic for atezolizumab [20] | Lowest TPS with punctate and discontinuous membranous staining [20] | 16% (at ≥1% cutoff) [20] |
| E1L3N | Multiple platforms | Laboratory-developed test [20] [21] | Cost-effective alternative with high concordance to 22C3 [21] | 24% (at ≥1% cutoff) [20] |
*Data based on a study of 97 NSCLC cases [20].
The substantial variability in PD-L1 positivity rates among different assays, particularly the consistently lower rates observed with the SP142 assay, highlights critical challenges in assay standardization and interpretation [20]. Despite this variability, when assay-specific clinical cut-offs are applied, the concordance between assays, particularly between 22C3 and SP263, can be remarkably high, with reported κ values of >0.7 for cut-offs of 1-25% [20].
Standardized experimental protocols are essential for reliable PD-L1 assessment in clinical and research settings. The following section outlines key methodologies cited in the literature:
IHC Protocol for PD-L1 Detection (E1L3N Clone)
Circulating Tumor Cell (CTC) Analysis for PD-L1 Detection
The intrinsic limitations of tissue biopsies, including spatial and temporal heterogeneity, have motivated the development of liquid biopsy approaches for PD-L1 assessment. Quantitative microscopic evaluation of PD-L1 expression on circulating tumor cells (CTCs) from patients with non-small cell lung cancer represents a promising technological advancement [22]. This methodology enables:
The analytical validation of this approach has demonstrated high precision and accuracy using control materials, confirming its readiness for clinical laboratory implementation [22]. Notably, preliminary testing in NSCLC patients has revealed substantial heterogeneity in PD-L1 and HLA I expression on CTCs, with promising clinical value in predicting progression-free survival in response to PD-L1 targeted therapies [22].
Recent advances in understanding the regulatory mechanisms of PD-L1 expression have opened new avenues for therapeutic interventions. PD-L1 expression is regulated at multiple levels, including transcription, post-transcription (mRNA processing), and post-translation (protein modifications) [23]. This understanding has enabled the development of novel combination strategies, such as the repurposing of FK228 (romidepsin), an FDA-approved histone deacetylase inhibitor, as a PD-L1 pathway sensitizer [24].
FK228 demonstrates multifaceted effects on the tumor immune microenvironment:
The combined use of FK228 and a PD-L1 inhibitor has shown significant tumor growth delay and extended survival in tumor-bearing mice, providing preclinical rationale for this combination approach in solid tumors [24].
The following table outlines key reagents and methodologies essential for PD-L1 research in clinical and laboratory settings:
Table 4: Essential Research Reagents and Methodologies for PD-L1 Investigation
| Reagent/Methodology | Specific Application | Research Utility |
|---|---|---|
| Anti-PD-L1 Antibody Clones (22C3, SP263, SP142, E1L3N) | IHC-based PD-L1 detection [20] [21] | Standardized detection of PD-L1 expression in FFPE tissue sections; companion diagnostics for immunotherapy [20] |
| Exclusion-Based Sample Preparation (ESP) | Circulating tumor cell isolation [22] | High-yield retention of rare CTCs for downstream PD-L1 and HLA I expression analysis [22] |
| Quantitative Microscopy | Protein expression quantification on rare cells [22] | Objective quantification of PD-L1 and HLA I expression on CTCs; enables longitudinal monitoring [22] |
| Recombinant PD-L1 and HLA I Proteins | Assay validation and standardization [22] | Generation of calibration curves and quality control materials for quantitative assays [22] |
| Single-Cell RNA Sequencing | Tumor immune microenvironment characterization [24] | Comprehensive analysis of immune cell populations and PD-L1 expression patterns in response to therapeutic modulators [24] |
The PD-1/PD-L1 axis represents a critical immunosuppressive pathway in the tumor microenvironment. The following diagram illustrates key components and regulatory relationships in this pathway:
Diagram Title: PD-1/PD-L1-Mediated T Cell Inhibition
The binding of PD-L1 to PD-1 leads to the formation of a PD-1/TCR inhibitory microcluster that recruits SHP1/2 molecules, resulting in the dephosphorylation of multiple members of the TCR signaling pathway [16]. This ultimately shuts off T cell activation through induction of apoptosis, reduction of proliferation, and inhibition of cytokine secretion [16]. Beyond its role in immune checkpoint regulation, PD-L1 can also serve as a receptor transmitting antiapoptotic signals to tumor cells and may possess intrinsic oncogenic functions during colon cancer carcinogenesis [16].
Resistance to PD-1/PD-L1 immunotherapy involves multiple mechanisms, including tumor antigen deletion, T cell dysfunction, increased immunosuppressive cells, and alterations in PD-L1 expression within tumor cells [25]. Additional factors such as altered metabolism, microbiota influences, and DNA methylation also contribute to resistance patterns [25]. Understanding these resistance mechanisms is critical for developing effective combination strategies and overcoming treatment limitations.
The prognostic significance of PD-L1 expression varies substantially across different cancer types, reflecting the biological complexity of tumor-immune interactions. While PD-L1 overexpression consistently correlates with poor clinical outcomes in cancers such as hepatocellular carcinoma, pancreatic cancer, and renal cell carcinoma, it associates with better prognosis in breast cancer and Merkel cell carcinoma [16] [17]. Technical challenges in PD-L1 assessment, including assay variability, tumor heterogeneity, and sampling limitations, continue to pose significant obstacles to standardized clinical implementation [20]. Emerging technologies such as circulating tumor cell analysis and novel therapeutic combinations targeting PD-L1 regulatory mechanisms hold promise for advancing the field [22] [24]. Future research directions should focus on multi-parametric biomarker approaches that integrate PD-L1 expression with other immune parameters, such as HLA I expression and tumor-infiltrating lymphocyte profiles, to develop more comprehensive predictive models for immunotherapy response and patient prognosis [22].
Companion and complementary diagnostics represent pivotal tools in precision medicine, enabling the stratification of patients for targeted therapies. These in vitro diagnostic (IVD) devices provide critical information for optimizing therapeutic decisions, particularly in oncology. The fundamental distinction lies in their regulatory status and clinical application: while a companion diagnostic is essential for the safe and effective use of a corresponding drug, a complementary diagnostic aids in benefit-risk decision-making without being strictly required for drug access [26] [27]. The first companion diagnostic, the HercepTest for HER2 detection, was approved simultaneously with trastuzumab (Herceptin) in 1998, establishing a new paradigm for drug-diagnostic co-development [26] [28] [27]. In contrast, the first complementary diagnostic, the PD-L1 IHC 28-8 assay for nivolumab, gained FDA approval in 2015 [26] [27]. This guide objectively compares the regulatory and analytical frameworks governing these diagnostic classes, with a specific focus on PD-L1 assays for immune checkpoint inhibitors, providing researchers and drug development professionals with experimental data and validation methodologies critical for clinical implementation.
A companion diagnostic is a medical device, often an in vitro diagnostic (IVD) device, that provides information deemed essential for the safe and effective use of a corresponding drug or biological product [26] [29] [30]. The U.S. Food and Drug Administration (FDA) mandates that these tests must be used if the corresponding drug is to be administered, as they identify a specific patient population that qualifies for treatment based on biomarker status [27]. For example, the Dako 22C3 PharmDx assay is a companion diagnostic required to identify non-small cell lung cancer (NSCLC) patients with PD-L1 expression (TPS ≥1% or ≥50%) for treatment with pembrolizumab [29] [31]. The drug's efficacy is intrinsically linked to the diagnostic result, and its use is stipulated in the therapeutic product labeling [29].
A complementary diagnostic is a test that aids in benefit-risk decision-making about the use of a therapeutic product, where the difference in benefit-risk is clinically meaningful but does not restrict drug access based on test results [26] [27]. The FDA includes complementary IVD information in the therapeutic product labeling, but unlike companion diagnostics, these tests are not mandatory before treatment [26]. For instance, the PD-L1 IHC 28-8 PharmDx assay is a complementary diagnostic for nivolumab (OPDIVO) in NSCLC and melanoma; the drug can be used even if PD-L1 detection is negative, though the test provides valuable prognostic information [27]. This distinction creates different clinical and regulatory pathways for these diagnostic classes.
Table 1: Key Differences Between Companion and Complementary Diagnostics
| Feature | Companion Diagnostic (CDx) | Complementary Diagnostic (CoDx) |
|---|---|---|
| Definition | Biomarker-specific test essential for safe/effective drug use | Biomarker-specific test that aids benefit-risk assessment |
| Regulatory Requirement | Required for drug administration | Not required for drug access |
| Patient Population | Restricts treatment to test-positive patients | All patients may be eligible regardless of test result |
| Drug Access | Conditional on test result | Not conditional on test result |
| Clinical Utility | Identifies patients who will benefit | Informs on degree of benefit |
| Example | 22C3 for pembrolizumab in NSCLC | 28-8 for nivolumab in NSCLC |
Robust analytical validation is fundamental for both companion and complementary diagnostics to ensure reliability across laboratories. For PD-L1 immunohistochemistry (IHC) assays, validation requires demonstrating analytical precision, accuracy, specificity, and sensitivity using standardized control materials [32] [21]. Recent approaches utilize Index Tissue Microarrays (TMAs) containing isogenic cell lines expressing predetermined PD-L1 levels to objectively compare assay performance across institutions [32]. One validated protocol involves constructing a TMA with 10 isogenic cell lines in triplicate, with formalin-fixed, paraffin-embedded (FFPE) cell pellets prepared in independent batches to assess batch-to-batch concordance [32].
Quantitative assessment employs both chromogenic IHC and quantitative immunofluorescence (QIF). For chromogenic IHC, slides are scanned using platforms like Aperio ScanScope XT, with PD-L1 expression quantified using open-source software such as QuPath, which provides optical density (OD) measurements and the percentage of PD-L1+ cells [32]. For QIF, slides are stained with PD-L1 antibodies (e.g., E1L3N, SP142, SP263), incubated with EnVision reagent, amplified with Cy5-Tyramide, and counterstained with DAPI. The Automated Quantitative Analysis (AQUA) method then generates scores by dividing the summed target pixel intensities by the area of the molecularly defined compartment, with the result normalized for operational variables [32].
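As a rough illustration of the compartment-normalized score described above, the sketch below sums target-channel intensities inside a compartment mask and divides by the mask area. It is not the AQUA implementation: the function name `compartment_score`, the nested-list image representation, and the omission of the normalization for operational variables are all simplifying assumptions made for this example.

```python
from typing import Sequence


def compartment_score(target: Sequence[Sequence[float]],
                      mask: Sequence[Sequence[int]]) -> float:
    """Sum of target-channel intensities inside the mask, divided by mask area.

    `target` holds per-pixel signal (for example the Cy5 channel); `mask`
    holds 1 for pixels inside the compartment of interest and 0 elsewhere.
    Both names are hypothetical and used only for this illustration.
    """
    same_rows = len(target) == len(mask)
    if not same_rows or any(len(t) != len(m) for t, m in zip(target, mask)):
        raise ValueError("target and mask must have the same shape")

    total = 0.0
    area = 0
    for t_row, m_row in zip(target, mask):
        for value, inside in zip(t_row, m_row):
            if inside:
                total += value
                area += 1

    if area == 0:
        raise ValueError("compartment mask is empty")
    return total / area


# Toy 2x2 image: three of the four pixels belong to the compartment.
toy_score = compartment_score([[10, 20], [30, 40]], [[1, 0], [1, 1]])
print(round(toy_score, 2))
```

Dividing by compartment area rather than by image area keeps the score comparable between fields of view that contain different amounts of tumor.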
Multiple studies have evaluated the concordance between different PD-L1 assays, particularly comparing companion and complementary diagnostics. A 2022 retrospective study compared the E1L3N antibody (potential laboratory-developed test) with the FDA-approved 22C3 companion diagnostic in 46 NSCLC patients receiving pembrolizumab [21]. Using tumor proportion score (TPS) cutoffs of ≥1% and ≥50%, the assays demonstrated high concordance with a correlation coefficient of 0.925 (p<0.0001) [21]. The study also found that patients with E1L3N TPS ≥50% had significantly higher objective response rates than those with TPS<1% (p=0.047), mirroring the predictive performance of 22C3 [21].
A multi-institutional study analyzing five PD-L1 IHC assays (FDA-approved and LDTs) across 12 sites demonstrated that assays for 22C3-FDA, 28-8-FDA, SP263-FDA, and E1L3N-LDT were highly similar, while the SP142-FDA assay failed to detect low PD-L1 levels distinguished by other assays [32]. This comprehensive evaluation employed statistical measures including linear regression coefficients (R²) and Bland-Altman plots to assess correlation and concordance, with Levey-Jennings plots evaluating measurement consistency over time [32].
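The two agreement statistics named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the cited study's analysis code; the function names are hypothetical, and the 1.96 multiplier is the usual convention for approximate 95% limits of agreement.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def _check_paired(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 values")


def bland_altman(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements."""
    _check_paired(a, b)
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # approximate 95% limits of agreement
    return bias, bias - spread, bias + spread


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination for a simple linear regression of y on x."""
    _check_paired(x, y)
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical paired scores for the same cores stained with two assays.
assay_a = [5, 20, 45, 60, 90]
assay_b = [4, 22, 40, 63, 88]
print(bland_altman(assay_a, assay_b))
print(r_squared(assay_a, assay_b))
```

R² measures how tightly the two assays track each other, while the Bland-Altman bias shows whether one assay reads systematically higher, so a high R² alone does not establish interchangeability.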
Table 2: Performance Characteristics of Major PD-L1 Assays
| Assay (Clone) | Regulatory Status | Therapeutic Partner | Key Tumor Indications | Concordance with 22C3 | Notable Characteristics |
|---|---|---|---|---|---|
| 22C3 PharmDx (Dako) | Companion Diagnostic | Pembrolizumab | NSCLC, HNSCC, Gastric, Esophageal | Reference | Gold standard for multiple indications |
| 28-8 PharmDx (Dako) | Complementary Diagnostic | Nivolumab | NSCLC, HNSCC, Gastric | High (R²>0.90) | Broadly applicable across tumor types |
| SP263 (Ventana) | Companion Diagnostic | Durvalumab, Atezolizumab | NSCLC, Bladder | High (R²>0.90) | Interchangeable with 22C3 in multiple studies |
| SP142 (Ventana) | Complementary/Companion* | Atezolizumab | NSCLC, TNBC, Bladder | Lower sensitivity | Detects immune cell staining; different scoring algorithm |
| E1L3N (LDT) | Laboratory Developed Test | Investigational | NSCLC (evaluated) | High (correlation coefficient 0.925) | Cost-effective alternative with similar performance |
*SP142 is a complementary diagnostic for atezolizumab in NSCLC but a companion diagnostic in urothelial cancer [32] [31].
The PD-1/PD-L1 axis represents a critical immune checkpoint pathway exploited by cancers to evade host immunity. The following diagram illustrates the molecular interactions and therapeutic intervention points:
Diagram 1: PD-1/PD-L1 Signaling and Therapeutic Blockade
This pathway illustrates how tumor cell-expressed PD-L1 engages with PD-1 receptors on T-cells, transmitting an inhibitory signal that suppresses T-cell activation and effector functions, enabling immune evasion [22] [21]. Monoclonal antibodies targeting either PD-1 or PD-L1 disrupt this interaction, restoring antitumor immunity [32] [21]. PD-L1 immunohistochemistry assays detect the presence of the PD-L1 ligand in tumor tissues, serving as predictive biomarkers for response to these inhibitors [32] [21].
The analytical validation of PD-L1 assays follows a structured workflow encompassing sample preparation, staining, quantification, and analysis:
Diagram 2: PD-L1 IHC Assay Workflow
This workflow highlights the standardized procedures for PD-L1 IHC testing, with variations in antibody clones, detection systems, and scoring algorithms contributing to differences between companion and complementary diagnostic assays [32] [21]. The tumor proportion score (TPS) is the percentage of viable tumor cells showing partial or complete membrane staining, while the combined positive score (CPS) counts PD-L1-positive tumor and immune cells relative to the total number of viable tumor cells [31].
Implementing robust PD-L1 testing requires specific reagents and platforms validated for clinical or research use. The following table details essential materials and their functions:
Table 3: Essential Research Reagents for PD-L1 Assay Validation
| Reagent/Platform | Function | Example Products | Application Notes |
|---|---|---|---|
| PD-L1 Antibody Clones | Specific detection of PD-L1 epitopes | 22C3, 28-8, SP263, SP142, E1L3N | Clones show varying sensitivity for tumor vs. immune cell staining [32] [21] [31] |
| Automated IHC Stainers | Standardized staining protocols | Dako Autostainer Link 48, Ventana Benchmark Ultra, Leica BOND-MAX | Platform choice affects staining intensity and background [32] [21] |
| Index TMAs | Analytical standardization | Custom TMA with isogenic cell lines (Horizon Dx) | Enables inter-laboratory and inter-assay comparison [32] |
| Image Analysis Software | Quantitative assessment of staining | QuPath, Aperio ImageScope, AQUA | Automated analysis reduces subjectivity in TPS/CPS scoring [32] |
| Detection Systems | Signal amplification and visualization | EnVision (Dako), OptiView (Ventana), Cy5-Tyramide | Impact assay sensitivity and dynamic range [32] |
| Cell Line Controls | Assay performance monitoring | FFPE pellets with known PD-L1 expression | Essential for daily quality control and validation [32] |
The current regulatory landscape for companion and complementary diagnostics presents several challenges for implementation and innovation. The proliferation of multiple PD-L1 assays with different scoring algorithms and cut-offs for various drugs creates complexity for clinical laboratories [31]. For example, a single NSCLC patient may require PD-L1 testing using different cut-offs (TPS ≥1%, ≥50%, or IC ≥10%) depending on the therapeutic context [31]. This multiplicity strains laboratory resources and creates confusion for pathologists and clinicians [31].
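To make the effect of multiple cut-offs concrete, the following sketch bins a TPS value using the ≥1% and ≥50% thresholds mentioned above. The function name and category labels are hypothetical, and which cut-off is clinically relevant depends on the drug, indication, and assay labeling, none of which this example encodes.

```python
def tps_category(tps: float) -> str:
    """Bin a tumor proportion score (0 to 100) by the >=1% and >=50% cut-offs."""
    if not 0 <= tps <= 100:
        raise ValueError("TPS must lie between 0 and 100")
    if tps >= 50:
        return "TPS >=50%"
    if tps >= 1:
        return "TPS 1-49%"
    return "TPS <1%"


def meets_cutoff(score: float, cutoff: float) -> bool:
    """True when a score reaches a given positivity cut-off (inclusive)."""
    return score >= cutoff


# The same specimen can be positive at one cut-off and negative at another.
example_tps = 12.0
print(tps_category(example_tps))
print(meets_cutoff(example_tps, 1), meets_cutoff(example_tps, 50))
```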
Additionally, the "one-drug/one-test" model can create barriers to diagnostic innovation once a drug is approved with a specific companion diagnostic. Current regulations make it challenging to incorporate emerging data about new assay formats or biomarkers without conducting new prospective clinical trials [31]. This is particularly problematic as evidence accumulates that laboratory-developed tests (LDTs) like E1L3N can perform equivalently to FDA-approved companion diagnostics at lower cost [21] [31].
Future directions likely include increased regulatory flexibility, with potential for assay harmonization and recognition of interchangeability between analytically validated tests [32] [21] [31]. The advent of comprehensive genomic profiling tests like FoundationOne CDx, which consolidate multiple companion diagnostic indications into a single platform, represents another evolution in this landscape [30] [31]. Such approaches could address current challenges while maintaining the rigorous analytical and clinical validation standards necessary for patient safety.
Immunohistochemistry (IHC) remains the gold standard and primary detection method for assessing Programmed Death-Ligand 1 (PD-L1) expression in tumor tissues to guide immunotherapy selection. This comprehensive analysis examines the analytical validation of PD-L1 IHC assays, comparing FDA-cleared companion diagnostics and laboratory-developed tests. We evaluate performance characteristics including analytical sensitivity, specificity, and reproducibility across different platforms, antibodies, and scoring systems. Quantitative data from recent multicenter comparisons reveal significant inter-assay variability, with concordance rates between major assays ranging from 51% to 78%. Emerging methodologies including quantitative microscopy and circulating tumor cell analysis demonstrate potential to address current limitations in tissue-based PD-L1 assessment. Standardization through metrological traceability to NIST Standard Reference Material 1934 represents a crucial advancement toward improving assay harmonization and clinical reliability in the era of precision immuno-oncology.
Immunohistochemistry has established itself as the cornerstone method for PD-L1 detection in clinical practice and research settings. The clinical utility of PD-L1 as a predictive biomarker for immune checkpoint inhibitor therapy has necessitated the development of robust, analytically validated IHC assays [33]. Companion diagnostic IHC tests are developed and performed without incorporating the tools and principles of laboratory metrology, leaving basic analytic assay parameters such as lower limit of detection (LOD) and dynamic range unknown to both assay developers and end users [34]. This review examines the current landscape of PD-L1 IHC testing, focusing on analytical validation, comparative performance of different assays, and emerging methodologies that aim to address existing limitations.
The PD-1/PD-L1 axis plays a critical role in cancer immune evasion. PD-L1, a transmembrane protein expressed on tumor cells and immune cells, interacts with PD-1 receptor on T-cells, inhibiting T-cell activation and effector functions [33]. Blockade of this interaction using monoclonal antibodies has revolutionized cancer treatment, with five anti-PD-1/PD-L1 agents currently approved by the FDA [35]. However, only approximately 30% of patients benefit from these therapies, highlighting the critical need for reliable predictive biomarkers [33].
Four FDA-cleared companion diagnostic IHC assays are currently utilized for PD-L1 detection, each developed in conjunction with specific immune checkpoint inhibitors [34] [35]. These assays employ different primary antibodies, detection systems, and scoring algorithms, creating a complex diagnostic landscape.
The clinical cutoffs for positivity vary between assays and cancer types, creating additional complexity in test interpretation and application [35].
PD-L1 expression is evaluated using different scoring systems depending on the specific assay and clinical context:
Tumor Proportion Score (TPS): Calculated as the number of PD-L1 positive tumor cells divided by the total number of viable tumor cells × 100 [35]. Only membranous staining on tumor cells is considered, with any partial or complete membranous staining counted as positive regardless of intensity.
Combined Positive Score (CPS): Calculated as the number of PD-L1 positive cells (tumor cells, lymphocytes, macrophages) divided by the total number of viable tumor cells × 100 [35]. Because immune cells are counted in the numerator, the calculated value can exceed 100; CPS is a unitless score rather than a percentage, and by convention the reported maximum is 100.
Immune Cell Score (% IC): The proportion of tumor area occupied by PD-L1 expressing tumor-infiltrating immune cells [35].
Proper interpretation requires evaluation of at least 100 viable tumor cells and correlation with H&E staining to distinguish tumor cells from macrophages and other immune cells that may express PD-L1 [35].
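The TPS and CPS formulas above translate directly into code. The sketch below is illustrative only: the function and parameter names are hypothetical, the 100-cell minimum follows the adequacy requirement stated above, and capping CPS at 100 is a reporting convention assumed here rather than something the formulas themselves impose.

```python
MIN_VIABLE_TUMOR_CELLS = 100  # adequacy threshold stated in the text


def _check_counts(viable_tumor_cells: int, positive_tumor_cells: int) -> None:
    if viable_tumor_cells < MIN_VIABLE_TUMOR_CELLS:
        raise ValueError("at least 100 viable tumor cells are required")
    if not 0 <= positive_tumor_cells <= viable_tumor_cells:
        raise ValueError("positive tumor cells must be between 0 and the total")


def tumor_proportion_score(positive_tumor_cells: int,
                           viable_tumor_cells: int) -> float:
    """TPS = PD-L1 positive tumor cells / viable tumor cells x 100."""
    _check_counts(viable_tumor_cells, positive_tumor_cells)
    return 100.0 * positive_tumor_cells / viable_tumor_cells


def combined_positive_score(positive_tumor_cells: int,
                            positive_immune_cells: int,
                            viable_tumor_cells: int,
                            cap: bool = True) -> float:
    """CPS = (positive tumor + positive immune cells) / viable tumor cells x 100.

    With cap=True the result is limited to 100 (assumed reporting convention).
    """
    _check_counts(viable_tumor_cells, positive_tumor_cells)
    if positive_immune_cells < 0:
        raise ValueError("immune cell count cannot be negative")
    raw = 100.0 * (positive_tumor_cells + positive_immune_cells) / viable_tumor_cells
    return min(raw, 100.0) if cap else raw


# Hypothetical counts from one slide: 200 viable tumor cells, 60 of them
# PD-L1 positive, plus 30 PD-L1 positive lymphocytes and macrophages.
print(tumor_proportion_score(60, 200))
print(combined_positive_score(60, 30, 200))
```

Both scores share the same denominator, so for the same specimen CPS is always at least as large as TPS.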
Recent advances in IHC standardization have enabled direct quantitative comparison of PD-L1 assays using calibrators with units of measure traceable to NIST Standard Reference Material 1934 [34]. A survey of 41 laboratories across North America and Europe quantified previously unknown analytical parameters:
Table 1: Analytical Performance Characteristics of PD-L1 IHC Assays
| Assay | Lower Limit of Detection (LOD) | Dynamic Range | Analytic Sensitivity | Key Characteristics |
|---|---|---|---|---|
| 22C3 | Intermediate | Broad | Intermediate | Balanced performance for tumor cell staining |
| 28-8 | Higher | Moderate | Lower | Requires higher PD-L1 expression for detection |
| SP142 | Lower | Broad | Higher | Enhanced detection of immune cell staining |
| SP263 | Intermediate | Broad | Intermediate | Similar to 22C3 with minor variations |
The data revealed that the four FDA-cleared PD-L1 assays represent three distinct levels of analytic sensitivity, explaining why some patients' tissue samples test positive by one assay and negative by another [34]. These differences in LOD and dynamic range also clarify why previous attempts to harmonize certain PD-L1 assays were unsuccessful, as their dynamic ranges were too disparate and did not overlap sufficiently.
Multiple studies have evaluated the concordance between different PD-L1 IHC assays to determine their interchangeability in clinical practice. A direct comparison of the 22C3 and SP142 assays in 135 NSCLC samples revealed significant disparities:
Table 2: Concordance Between 22C3 and SP142 IHC Assays in NSCLC
| Concordance Metric | 22C3 vs. SP142 | SP142 vs. 22C3 |
|---|---|---|
| Overall Concordance | 77.78% (105/135 samples) | 51.11% (69/135 samples) |
| Kappa Value | 0.481 (p < 0.001) | 0.324 (p < 0.001) |
| Staining Pattern | Stronger tumor cell membrane staining | Weaker tumor cell staining, fewer positive tumor cells |
| Immune Cell Detection | Moderate | Enhanced immune cell detection |
The SP142 assay typically resulted in underestimation of PD-L1 expression in tumor cells compared to the 22C3 assay, while showing more robust detection in immune cells [36]. This fundamental difference in staining patterns and scoring emphasis contributes to the relatively poor concordance between these assays and highlights why they cannot be used interchangeably without proper validation.
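Overall concordance and Cohen's kappa, the two metrics reported in Table 2, can both be derived from a 2×2 table of paired positive/negative calls. The sketch below shows the standard formulas; the function names are hypothetical, and the example counts are invented so that 105 of 135 samples agree. They are not the study's actual cross-tabulation, so the resulting kappa will not match the published value.

```python
def overall_agreement(both_pos: int, a_only: int, b_only: int, both_neg: int) -> float:
    """Fraction of samples on which the two assays give the same call."""
    n = both_pos + a_only + b_only + both_neg
    if n == 0:
        raise ValueError("no samples")
    return (both_pos + both_neg) / n


def cohens_kappa(both_pos: int, a_only: int, b_only: int, both_neg: int) -> float:
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = both_pos + a_only + b_only + both_neg
    observed = overall_agreement(both_pos, a_only, b_only, both_neg)
    a_pos = (both_pos + a_only) / n   # positivity rate of assay A
    b_pos = (both_pos + b_only) / n   # positivity rate of assay B
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented counts: 40 positive by both, 25 by assay A only,
# 5 by assay B only, 65 negative by both (135 samples in total).
counts = (40, 25, 5, 65)
print(round(overall_agreement(*counts), 4))
print(round(cohens_kappa(*counts), 4))
```

Kappa is lower than raw agreement because part of the agreement would occur by chance given each assay's positivity rate, which is why both figures are usually reported together.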
The following detailed methodology represents the standard approach for PD-L1 IHC testing in clinical and research settings:
Tissue Preparation and Sectioning
Deparaffinization and Antigen Retrieval
Immunostaining Procedure
Controls and Validation
Emerging methodologies enable PD-L1 detection on circulating tumor cells (CTCs) using exclusion-based sample preparation and quantitative microscopy:
CTC Enrichment and Staining
Image Acquisition and Analysis
This methodology demonstrates high precision and accuracy, with coefficient of variation <10% for intra-assay imprecision measurements, enabling reliable detection of PD-L1 expression heterogeneity [22].
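The intra-assay imprecision criterion above is a coefficient of variation (CV) check. A minimal sketch follows, assuming replicate measurements of the same sample within one run; the function names and the replicate values are hypothetical, and the 10% limit is taken from the figure quoted above.

```python
from statistics import mean, stdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """CV in percent: sample standard deviation divided by the mean, x 100."""
    if len(values) < 2:
        raise ValueError("at least two replicates are required")
    centre = mean(values)
    if centre == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return 100.0 * stdev(values) / centre


def within_imprecision_limit(values: Sequence[float], limit_percent: float = 10.0) -> bool:
    """True when the replicate CV is below the stated limit."""
    return coefficient_of_variation(values) < limit_percent


# Hypothetical replicate intensity readings from a single control sample.
replicates = [98, 100, 102]
print(coefficient_of_variation(replicates))
print(within_imprecision_limit(replicates))
```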
Essential reagents and materials for PD-L1 IHC research and clinical testing:
Table 3: Key Research Reagents for PD-L1 Detection
| Reagent/Material | Function | Examples/Specifications |
|---|---|---|
| Primary Antibodies | Bind specifically to PD-L1 epitopes | 22C3, 28-8, SP142, SP263 clones; specific to intracellular or extracellular domains |
| Detection Systems | Amplify and visualize antibody binding | Dako EnVision FLEX, Ventana OptiView; enzyme-based chromogenic detection |
| Antigen Retrieval Buffers | Unmask epitopes altered by fixation | Citrate buffer (pH 6.0), EDTA/TRIS (pH 8.0/9.0) |
| Reference Standards | Calibrate assays and ensure reproducibility | NIST SRM 1934; Boston Cell Standards calibrators with traceable units [34] |
| Control Materials | Monitor assay performance | Tonsil, placenta tissue; cell lines with defined PD-L1 expression; polymer beads with recombinant protein [22] [35] |
| Automated Platforms | Standardize staining conditions | Dako Autostainer Link 48, Ventana Benchmark series; ensure consistent timing and temperatures |
| Image Analysis Software | Quantify staining objectively | Automated algorithms for tumor cell identification and membrane staining quantification |
Diagram 1: PD-1/PD-L1 Signaling Pathway and Therapeutic Intervention. This diagram illustrates the interaction between PD-L1/PD-L2 on tumor cells and PD-1 on T-cells, leading to T-cell inhibition and immune evasion. Immune checkpoint blockers (anti-PD-1/PD-L1 antibodies) disrupt this interaction, restoring T-cell function and anti-tumor immunity [33] [35].
Diagram 2: IHC Workflow for PD-L1 Detection. This comprehensive workflow details the pre-analytical, analytical, and post-analytical phases of PD-L1 IHC testing, highlighting critical quality control measures including positive and negative controls and reference calibrators to ensure assay validity [34] [35] [36].
Beyond conventional IHC, several innovative approaches are emerging for PD-L1 detection:
PD-L1 Binding Peptides
Liquid Biopsy Approaches
Multiplexed Immunofluorescence
Significant efforts are underway to standardize PD-L1 testing across platforms and institutions:
Metrological Traceability
Analytical Validation Frameworks
Harmonization Studies
Immunohistochemistry maintains its position as the gold standard method for PD-L1 detection in clinical practice and research. The analytical validation of PD-L1 IHC assays has revealed significant differences in performance characteristics between the four FDA-cleared companion diagnostics, explaining observed discordances in patient classification. Quantitative approaches using NIST-traceable calibrators represent a crucial advancement toward standardization, enabling precise measurement of previously undefined analytical parameters including limit of detection and dynamic range.
Emerging methodologies including quantitative microscopy of circulating tumor cells and novel detection reagents like PD-L1 binding peptides show promise in addressing current limitations related to tumor heterogeneity and tissue availability. The continued evolution of PD-L1 detection technologies, coupled with rigorous analytical validation and standardization efforts, will enhance the reliability of this critical predictive biomarker and optimize patient selection for immune checkpoint inhibitor therapies. As the field advances, integration of PD-L1 assessment with complementary biomarkers such as tumor mutational burden and HLA expression will likely provide more comprehensive predictive models for immunotherapy response.
The advent of immune checkpoint inhibitors (ICIs) targeting the programmed death-1 (PD-1)/programmed death-ligand 1 (PD-L1) axis has fundamentally transformed the therapeutic landscape for non-small cell lung cancer (NSCLC) and other malignancies [38] [39]. PD-L1 immunohistochemistry (IHC) has emerged as a critical, yet imperfect, companion diagnostic tool for identifying patients most likely to benefit from these therapies. The current landscape is characterized by a "one-drug, one-assay" paradigm, wherein specific therapeutics are paired with dedicated diagnostic assays [40]. This framework has led to the widespread clinical use of four primary PD-L1 IHC assays: Dako 22C3 (pembrolizumab), VENTANA SP263 (durvalumab), VENTANA SP142 (atezolizumab), and Dako 28-8 (nivolumab) [40].
Each assay employs a unique antibody clone, detection platform, and scoring algorithm, raising legitimate questions about their interchangeability and creating practical challenges for pathology laboratories, which may not have access to all platforms [41]. This guide provides a detailed, evidence-based comparison of these assays, focusing on their analytical performance, clinical predictive value, and technical characteristics to inform their use in clinical research and drug development.
The foundational differences between the assays lie in their respective components and scoring systems.
Table 1: Key Characteristics of FDA-Approved PD-L1 Assays
| Assay Clone | Associated Therapeutic(s) | Platform | Scoring Method | Cell Types Scored |
|---|---|---|---|---|
| 22C3 | Pembrolizumab [40] | Dako Autostainer [41] | Tumor Proportion Score (TPS) [38] | Tumor Cells [38] |
| SP263 | Durvalumab [40], Atezolizumab (early-stage NSCLC) [38] | VENTANA BenchMark [41] | Tumor Cell (TC) Percentage [38] | Tumor Cells [38] |
| SP142 | Atezolizumab [38] [42] | VENTANA BenchMark [38] | TC and IC (Immune Cell) Score [38] [42] | Tumor Cells & Immune Cells [38] [42] |
| 28-8 | Nivolumab [40] | Dako Autostainer [40] | Tumor Proportion Score (TPS) [40] | Tumor Cells [40] |
A critical distinction is the scoring algorithm. The 22C3, SP263, and 28-8 assays primarily employ a Tumor Proportion Score (TPS), defined as the percentage of viable tumor cells exhibiting partial or complete membranous staining of any intensity [38]. In contrast, the SP142 assay utilizes a composite score that incorporates both the percentage of tumor cells (TC) and the percentage of tumor-infiltrating immune cells (IC) that stain positive for PD-L1 [38] [42]. This fundamental difference in scoring contributes to the unique patient populations identified by each test.
Numerous studies have investigated the analytical concordance between these assays to determine their potential interchangeability. The evidence indicates that while the 22C3, SP263, and 28-8 assays show high agreement, the SP142 assay often appears as an outlier.
Multiple studies demonstrate a high degree of analytical correlation between the 22C3 and SP263 clones.
The SP142 assay consistently identifies fewer PD-L1-positive tumor cells compared to other assays, though it retains clinical predictive power.
Assay performance can be significantly affected by preanalytical variables. A 2022 study highlighted that the concordance between 22C3 and SP263 is influenced by the age of FFPE blocks and slide storage conditions.
Diagram 1: The impact of preanalytical conditions on PD-L1 assay performance, showing that 22C3 is more susceptible to degradation under suboptimal storage conditions than SP263 [40].
For researchers designing studies to compare PD-L1 assays, the following methodological details, drawn from the cited literature, provide a framework for robust experimental design.
Table 2: Key Clinical Concordance Findings from Major Studies
| Comparison | Clinical Context | Concordance at ≥1% | Concordance at ≥50% | Source |
|---|---|---|---|---|
| SP263 vs 22C3 | Early-stage NSCLC (IMpower010) | 83% | 92% | [38] |
| SP263 vs 22C3 | Lung Adenocarcinoma (Multicenter) | 80% | 99% | [41] |
| SP142 vs 22C3 | Metastatic NSCLC (OAK Trial) | Overlapping but distinct populations identified | (see previous column) | [42] |
Table 3: Key Reagents and Materials for PD-L1 Assay Comparison Studies
| Item | Function/Description | Example |
|---|---|---|
| FFPE Tissue Microarrays (TMAs) | Contain multiple patient samples in a single block, enabling high-throughput, simultaneous staining of many specimens under identical conditions. | [41] |
| Anti-PD-L1 Antibody Clones | Primary antibodies that specifically bind the PD-L1 protein. Different clones (22C3, SP263, etc.) may recognize different epitopes. | 22C3, SP263, SP142, 28-8 [41] [21] |
| Automated IHC Staining Platforms | Ensure standardized, reproducible staining runs by automating dewaxing, retrieval, antibody incubation, and detection steps. | Dako Autostainer Link 48, VENTANA BenchMark Ultra [40] [41] |
| Validated Positive Control Tissues | Tissues with known PD-L1 expression levels used to validate each staining run. | Tonsil (for 22C3), Placenta (for SP263) [40] |
| IHC Detection Kits | Visualization systems that generate a chromogenic signal at the site of antibody binding. | Manufacturer-specific kits (e.g., OptiView DAB on Ventana) [41] |
The comparative analysis of FDA-approved PD-L1 assays reveals a complex picture. The 22C3, SP263, and 28-8 assays demonstrate a high degree of analytical and clinical concordance, particularly at the ≥50% cutoff, suggesting potential interchangeability in well-controlled settings [38] [41]. In contrast, the SP142 assay remains distinct in its scoring algorithm and sensitivity, identifying a different patient population, yet still effectively predicting response to its corresponding therapy, atezolizumab [42].
For researchers and drug developers, these findings are critically important. While the harmonization of assays is a desirable goal to simplify clinical testing, the unique characteristics of each assay must be respected. Future efforts should focus on rigorous standardization, especially of preanalytical variables, and the development of sophisticated, multi-feature predictive models that integrate PD-L1 with other biomarkers like Ki-67 [43] or genomic signatures to better stratify patients for optimal immunotherapy outcomes [39].
Laboratory-developed tests (LDTs) are in vitro diagnostic tests that are developed, validated, and performed within a single laboratory [44]. Unlike commercially manufactured in vitro diagnostic (IVD) tests, which undergo rigorous premarket review by regulatory bodies like the FDA, LDTs have traditionally been subject to less centralized oversight. However, this regulatory landscape is undergoing transformative change. In May 2024, the FDA issued a final rule establishing comprehensive oversight of LDTs, phasing out the discretionary enforcement that has been in place for decades [45]. This shift represents the most significant change in LDT regulation in history and creates new complexities for pathologists, researchers, and drug development professionals, particularly in fast-moving fields like immuno-oncology where tests for biomarkers like PD-L1 are critical for patient selection [46] [47].
The clinical necessity for LDTs often arises when commercially available IVDs cannot address specialized testing needs [45]. This is particularly relevant for PD-L1 biomarker testing, a cornerstone of immunotherapy selection for cancers like non-small cell lung cancer (NSCLC). While several FDA-approved companion diagnostic (CDx) PD-L1 assays exist (e.g., Agilent's 22C3 and Roche's SP263), their concordance is not perfect, and clinical laboratories may need to develop LDTs due to equipment limitations, cost considerations, or the need for protocol modifications [48] [13] [47]. This guide objectively compares the validation requirements and performance of LDTs against regulated IVDs within the critical context of analytical validation for PD-L1 assays.
The decision to implement an LDT or an IVD has direct implications for test performance and clinical utility. A 2022 study provides a direct quantitative comparison of PD-L1 testing for NSCLC using IVDs versus LDTs, modeling outcomes within the German healthcare system [44].
Table 1: Performance and Outcomes of IVD vs. LDT for PD-L1 Testing in NSCLC
| Parameter | In Vitro Diagnostic (IVD) | Laboratory-Developed Test (LDT) |
|---|---|---|
| Diagnostic Accuracy | 93% | 73% |
| Risk of Misdiagnosis | 7% | 27% (20 percentage points higher than IVD) |
| Impact on Treatment | Lower risk of incorrect therapy | ~1 in 4 patients could receive incorrect treatment |
| Cost vs. Benefit | +0.4% cost difference, +19% chance of improved patient outcomes | Lower diagnostic cost, but significantly worse patient outcomes |
The data indicates that while LDTs may offer lower upfront costs, IVDs are 19% more effective in achieving a successful diagnosis and aligning PD-L1 positive NSCLC patients with effective immunotherapy [44]. The superior accuracy of IVDs (93% vs. 73%) translates to a substantial reduction in overall healthcare costs associated with disease progression, management of adverse events, and end-of-life care, demonstrating that the minimal additional diagnostic cost of IVDs is offset by improved therapeutic outcomes.
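The accuracy figures in Table 1 imply both an absolute and a relative difference in misdiagnosis risk, and the two are easy to confuse. The short sketch below works through the arithmetic using only the 93% and 73% accuracy values from the table; the variable and function names are hypothetical, and the calculation says nothing about the modeled cost or outcome figures.

```python
IVD_ACCURACY_PCT = 93  # from Table 1
LDT_ACCURACY_PCT = 73  # from Table 1


def misdiagnosis_pct(accuracy_pct: int) -> int:
    """Misdiagnosis risk in percent, taken as the complement of accuracy."""
    if not 0 <= accuracy_pct <= 100:
        raise ValueError("accuracy must be between 0 and 100")
    return 100 - accuracy_pct


ivd_risk = misdiagnosis_pct(IVD_ACCURACY_PCT)
ldt_risk = misdiagnosis_pct(LDT_ACCURACY_PCT)

# Absolute difference, in percentage points.
absolute_difference_points = ldt_risk - ivd_risk

# Relative difference, as a ratio of the two risks.
risk_ratio = ldt_risk / ivd_risk

print(ivd_risk, ldt_risk)
print(absolute_difference_points)
print(round(risk_ratio, 2))
```

The 20-point gap is an absolute difference; expressed relatively, the LDT misdiagnosis risk in this model is between three and four times the IVD risk.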
For a predictive biomarker test like a PD-L1 assay to be considered "fit-for-purpose," it must undergo rigorous validation across multiple spheres.
The workflow illustrates that the validation pathway diverges after a successful clinical trial. A laboratory using an FDA-approved CDx assay need only perform verification to demonstrate proper use. However, any modification to a CDx, whether a technical change (e.g., altering antibody incubation time in an IHC assay) or a change in intended use, automatically transforms it into an LDT, triggering the requirement for indirect clinical validation (ICV) [47].
The methodology for ICV depends on the biomarker's biological and clinical characteristics, categorized into three groups [47]:
For a PD-L1 LDT, a robust ICV protocol is essential. The following methodology, derived from recent guidelines and bridging studies, provides a framework [48] [47]:
The new FDA regulatory framework for LDTs is being implemented through a phased, five-stage process. Laboratories must adhere to strict deadlines to maintain compliance and continue offering LDTs [46].
Table 2: FDA LDT Rule Implementation Timeline and Key Requirements
| Phase | Deadline | Key Compliance Requirements |
|---|---|---|
| 1 | May 6, 2025 | Implement Medical Device Reporting (MDR) systems, complaint file management, and procedures for corrections and removals. |
| 2 | May 6, 2026 | Complete laboratory registration and device listing with the FDA. Implement labeling requirements. |
| 3 | May 6, 2027 | Implement comprehensive Quality System requirements, adhering to good manufacturing practices. |
| 4 | November 6, 2027 | Complete premarket review requirements (e.g., 510(k), PMA) for all high-risk LDTs. |
| 5 | May 6, 2028 | Complete premarket review requirements for all moderate and low-risk LDTs. |
The timeline underscores the urgency for laboratories to act. The first deadline in May 2025 requires establishing systems for MDR and complaint files, which many laboratories may need to modify from existing policies to meet the specific FDA requirements for LDTs as medical devices [45].
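For compliance planning, the phased deadlines in Table 2 can be held in a small lookup structure. The following is a minimal sketch, not an official tool: the dates and requirement summaries are copied from Table 2, while the names `LDT_PHASES`, `phases_in_effect`, and `next_deadline` are hypothetical and introduced only for illustration.

```python
from datetime import date

# Deadlines and requirement summaries copied from Table 2.
# The structure and names are illustrative assumptions.
LDT_PHASES = [
    (1, date(2025, 5, 6), "MDR, complaint files, corrections and removals"),
    (2, date(2026, 5, 6), "Registration, device listing, labeling"),
    (3, date(2027, 5, 6), "Quality System requirements"),
    (4, date(2027, 11, 6), "Premarket review for high-risk LDTs"),
    (5, date(2028, 5, 6), "Premarket review for moderate- and low-risk LDTs"),
]


def phases_in_effect(today):
    """Return the phase numbers whose deadline falls on or before `today`."""
    return [phase for phase, deadline, _ in LDT_PHASES if deadline <= today]


def next_deadline(today):
    """Return (phase, deadline, requirement) for the next upcoming phase, or None."""
    upcoming = [row for row in LDT_PHASES if row[1] > today]
    return upcoming[0] if upcoming else None
```

A laboratory could call `next_deadline(date.today())` to see which requirement set falls due next.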
The successful development and validation of a PD-L1 LDT rely on a suite of critical reagents and materials. The selection of these components directly impacts the test's analytical performance and must be carefully controlled.
Table 3: Key Research Reagents for PD-L1 LDT Development
| Reagent / Material | Function in PD-L1 LDT | Examples & Considerations |
|---|---|---|
| Primary Antibody Clones | Binds specifically to the PD-L1 epitope on tumor and/or immune cells. | Key clones: 22C3, 28-8, SP263, SP142. Note: Different clones have varying binding affinities and specificities, contributing to assay discordance [13]. |
| IHC Detection System | Visualizes the antibody-antigen binding through a chromogenic reaction. | Includes detection kits, visualization substrates, and counterstains (e.g., hematoxylin). Must be optimized for the specific platform and antibody. |
| Cell Line and Tissue Controls | Serves as reference materials for assay validation, daily runs, and proficiency testing. | Cell lines with known PD-L1 expression levels or well-characterized FFPE tissue controls are essential for maintaining consistency and monitoring performance [47]. |
| Platinum-Doublet Chemotherapy | Used in clinical trial bridging studies to establish clinical validity and efficacy endpoints. | Not a laboratory reagent, but critical for validating the LDT against clinical outcomes like Overall Survival (OS) and Progression-Free Survival (PFS) [48]. |
The implementation of PD-L1 LDTs presents a complex balance of scientific rigor and evolving regulatory compliance. While LDTs offer flexibility and can address unmet needs in precision oncology, evidence shows that validated IVD tests currently demonstrate superior diagnostic accuracy and patient outcomes for PD-L1 testing in indications like NSCLC [44]. The decision to develop and implement an LDT must be justified by a clear need and supported by a robust indirect clinical validation protocol that demonstrates diagnostic equivalence to the relevant CDx gold standard [47]. With the FDA's new final rule, laboratories must now navigate a structured, multi-year compliance timeline, making it imperative to integrate regulatory planning seamlessly with analytical validation processes [46] [45]. For researchers and drug developers, this new era of LDT oversight demands a proactive, evidence-based approach to ensure that laboratory-developed tests meet the highest standards of safety and effectiveness, ultimately supporting their critical role in advancing personalized cancer care.
The analytical validation of PD-L1 assays is a critical step in optimizing immunotherapy for cancer patients. While traditional immunohistochemistry (IHC) on tissue biopsies remains the gold standard for assessing PD-L1 expression, this approach faces significant challenges including tumor heterogeneity, the invasive nature of tissue sampling, and inability to perform dynamic monitoring [49]. These limitations have spurred the development of novel diagnostic approaches that can provide complementary information for clinical decision-making.
This guide objectively compares three emerging analytical approaches for PD-L1 assessment: liquid biopsy-based circulating tumor cell (CTC) analysis, quantification of soluble PD-L1 (sPD-L1) in blood, and cerebrospinal fluid (CSF) testing for leptomeningeal metastases. For researchers and drug development professionals, understanding the technical specifications, performance characteristics, and appropriate contexts for implementing these technologies is essential for advancing personalized immunotherapy strategies.
Table 1: Analytical and Clinical Performance Characteristics of Novel PD-L1 Detection Methods
| Assay Characteristic | Liquid Biopsy (CTC-based) | Soluble PD-L1 (sPD-L1) | CSF-Based Testing |
|---|---|---|---|
| Sample Type | Peripheral blood | Blood plasma/serum | Cerebrospinal fluid |
| Analytical Target | Cell surface PD-L1 on captured CTCs | Soluble PD-L1 protein | Cell surface PD-L1 on CSF tumor cells |
| Primary Technology | Aptamer-modified carbon quantum dots with magnetic electrochemical detection [50] | Enzyme-linked immunosorbent assay (ELISA) [51] [52] | ThinPrep liquid-based cytology with immunocytochemistry [53] |
| Sensitivity/LOD | PD-L1 detection limit: 2 ng/mL [50] | Varies by cancer type; Cutoff for prognosis: 11.0 pg/μL in advanced cancer [52] | Requires ≥20 tumor cells on slide; optimized for low cellularity [53] |
| Key Clinical Correlations | Elevated CTC counts & reduced PD-L1 levels associated with disease progression in NSCLC [50] | High levels correlate with progressive disease, worse PFS and OS in multiple cancers [51] [52] | PD-L1 positivity associated with higher response to intrathecal immunotherapy (61.9% vs 33.3%) [53] |
| Tissue Concordance | Captures heterogeneity through dual EpCAM/Vimentin aptamers [50] | Poor correlation with tissue IHC in some studies [52] | Poor agreement with paired extracranial lesions (κ=0.175-0.179) [53] |
| Dynamic Monitoring | Enables continuous monitoring during immunotherapy [50] | Levels can increase post-ICI treatment; patterns vary by cancer type [52] | Suitable for monitoring CNS-specific disease progression |
Table 2: Advantages and Limitations in Research and Clinical Applications
| Application Context | Liquid Biopsy (CTC-based) | Soluble PD-L1 (sPD-L1) | CSF-Based Testing |
|---|---|---|---|
| Early Therapy Screening | Limited due to low CTC counts in early disease | Moderate potential; levels elevated in advanced disease [52] | Not applicable for early disease |
| Therapy Response Monitoring | Excellent for longitudinal tracking of evolving PD-L1 expression [50] [54] | Good for systemic response monitoring; levels change with therapy [52] | Excellent for CNS-specific response assessment [53] |
| Prognostic Stratification | High CTC counts predict worse prognosis [50] | High sPD-L1 independently predicts worse PFS and OS [51] [52] | Emerging prognostic value for CNS metastases |
| Technical Complexity | High (nanomaterial synthesis, electrochemical detection) | Low (standard ELISA protocols) | Moderate (cell enrichment, ICC optimization) |
| Sample Requirements | Standard blood draw | Standard blood draw | Lumbar puncture or Ommaya reservoir |
| Implementation Barriers | Specialized equipment and expertise | Commercially available kits; requires validation | Specialized cytology expertise; low cellularity challenges |
Protocol Overview: This methodology enables highly efficient capture of circulating tumor cells followed by reagent-less electrochemical detection of PD-L1 expression [50].
Key Workflow Steps:
Validation Parameters: This assay was validated in 41 NSCLC patients, demonstrating capability to measure PD-L1 concentrations as low as 2 ng/mL with excellent specificity and sensitivity [50].
Protocol Overview: Standardized procedure for measuring circulating sPD-L1 levels in blood plasma or serum using commercial ELISA kits [51] [52].
Key Workflow Steps:
Technical Notes: Multiple ELISA kits are commercially available with sensitivity ranging from 0.60 pg/mL to 1.14 pg/mL [51]. Studies typically use matched healthy controls to establish baseline levels.
Protocol Overview: Robust methodology for detecting PD-L1 expression in cerebrospinal fluid for patients with leptomeningeal metastases [53].
Key Workflow Steps:
Validation Parameters: This method shows high concordance between 22C3 and SP263 clones (κ=0.815-0.881) and requires at least 20 tumor cells for reliable assessment [53].
Figure 1. Comparative Workflows for Novel PD-L1 Detection Approaches. This diagram illustrates the parallel sample processing pathways for liquid biopsy (sPD-L1 and CTCs) and cerebrospinal fluid testing methodologies.
Figure 2. PD-L1 Biology and Detection Pathways. This diagram outlines the cellular origins, molecular forms, and detection methodologies for PD-L1, highlighting the relationship between membrane-bound and soluble forms.
Table 3: Key Research Reagent Solutions for Novel PD-L1 Assays
| Reagent Category | Specific Examples | Research Function | Application Context |
|---|---|---|---|
| Capture Agents | EpCAM/Vimentin dual-aptamers [50] | Simultaneous capture of epithelial and mesenchymal CTC subpopulations | CTC enrichment from whole blood |
| Detection Probes | PD-L1-aptamer conjugated gold nanoparticles [50] | Reagent-less electrochemical detection of surface PD-L1 | Portable CTC PD-L1 quantification |
| Antibody Clones | 22C3, SP263, SP142, 28-8 [53] [55] | IHC/ICC detection of PD-L1 with varying specificities | Tissue, cell block, and cytology applications |
| ELISA Kits | Human PD-L1 ELISA (BMS2327), Human PD-1 ELISA (BMS2214) [51] | Quantitative measurement of soluble checkpoint proteins | Serum/plasma sPD-L1 quantification |
| Nucleic Acid Assays | PD-L1 TaqMan assays for ddPCR [56] | Absolute quantification of PD-L1 mRNA expression | Liquid biopsy RNA analysis |
| Sample Preservation | PreservCyt solution [53] | Cellular preservation for liquid-based cytology | CSF sample stabilization |
| Reference Genes | GUSB, RPLP0, TBP [56] | Normalization of quantitative RNA assays | Gene expression standardization |
The novel approaches to PD-L1 testing reviewed in this guide offer distinct advantages that complement traditional tissue-based IHC. Liquid biopsy for CTC analysis provides a dynamic window into tumor heterogeneity and enables serial monitoring of PD-L1 expression during therapy [50] [54]. Soluble PD-L1 quantification offers a technically accessible method for prognostic stratification and treatment response monitoring across multiple cancer types [51] [52]. CSF-based testing addresses the critical need for biomarker assessment in leptomeningeal disease, where tissue biopsy is not feasible [53].
For researchers and drug development professionals, the strategic implementation of these technologies should be guided by specific research questions and clinical contexts. Each method contributes unique insights into the dynamic interplay between tumors and the immune system, potentially enabling more precise patient selection and response monitoring in immunotherapy trials. As validation efforts continue, these novel approaches are poised to expand our analytical capabilities beyond the limitations of traditional tissue-based PD-L1 assessment.
The advent of immune checkpoint inhibitors (ICIs) targeting the programmed death protein 1 (PD-1) and its ligand (PD-L1) has fundamentally reshaped cancer treatment, offering new hope for patients with advanced malignancies [1]. The interaction between PD-1 on T cells and PD-L1 on tumor cells suppresses T-cell activation, enabling tumors to evade immune surveillance [1]. Blocking this interaction with ICIs restores anti-tumor immunity, making the PD-1/PD-L1 axis a critical therapeutic target. However, response to these therapies is not universal, highlighting the urgent need for reliable predictive biomarkers to identify patients most likely to benefit [57].
PD-L1 expression detected by immunohistochemistry (IHC) has emerged as a primary biomarker for predicting ICI response. Two principal scoring algorithms have been developed to quantify PD-L1 expression: the Tumor Proportion Score (TPS) and the Combined Positive Score (CPS) [58]. The analytical validation of these assays according to established guidelines ensures accuracy and reduces variation in laboratory practices, forming a crucial foundation for their clinical application [59]. This guide provides a comprehensive comparison of TPS and CPS systems, examining their methodologies, clinical performance data, and implications for drug development and personalized cancer therapy.
TPS is a scoring method that evaluates PD-L1 expression exclusively on viable tumor cells. It is defined as the percentage of viable tumor cells exhibiting partial or complete membrane staining of any intensity [58]. The calculation formula is:
TPS (%) = (Number of PD-L1-positive tumor cells / Total number of viable tumor cells) × 100 [58] [60]
Key characteristics of TPS scoring include:
TPS is primarily used in non-small cell lung cancer (NSCLC) to determine eligibility for PD-1/PD-L1 inhibitors such as pembrolizumab, with cut points of ≥1% and ≥50% guiding treatment decisions [58].
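The TPS formula and its two cut points can be expressed in a few lines. This is a minimal sketch assuming cell counts have already been obtained by a pathologist or an image-analysis tool; the function names and category labels are hypothetical and chosen only for illustration.

```python
def tumor_proportion_score(positive_tumor_cells, viable_tumor_cells):
    """TPS (%) = PD-L1-positive tumor cells / total viable tumor cells x 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    if not 0 <= positive_tumor_cells <= viable_tumor_cells:
        raise ValueError("positive_tumor_cells must lie between 0 and viable_tumor_cells")
    return 100.0 * positive_tumor_cells / viable_tumor_cells


def tps_category(tps):
    """Map a TPS value onto the >=1% and >=50% cut points (labels are illustrative)."""
    if tps >= 50:
        return "high (>=50%)"
    if tps >= 1:
        return "low (1-49%)"
    return "negative (<1%)"
```

For example, 60 positive cells among 200 viable tumor cells gives a TPS of 30%, which falls between the two cut points.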
CPS provides a more comprehensive assessment by evaluating PD-L1 expression on both tumor cells and surrounding immune cells within the tumor microenvironment. It is defined as the number of PD-L1-staining cells (tumor cells, lymphocytes, macrophages) relative to the total number of viable tumor cells [58]. The calculation formula is:
CPS = (Number of PD-L1-positive cells [tumor cells, lymphocytes, macrophages] / Total number of viable tumor cells) × 100 [58]
Key characteristics of CPS scoring include:
CPS is utilized for multiple cancer types including head and neck squamous cell carcinoma (HNSCC), gastric or gastroesophageal junction (GEJ) adenocarcinoma, esophageal carcinoma, cervical cancer, and triple-negative breast cancer (TNBC), with various clinically relevant cut points (e.g., ≥1, ≥10) depending on the specific indication [58].
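Because the CPS numerator includes immune cells while the denominator counts only viable tumor cells, the raw ratio can exceed 100 and is capped at that maximum. The sketch below illustrates this under the assumption that the three positive-cell counts are available separately; the function name and parameters are hypothetical.

```python
def combined_positive_score(pos_tumor, pos_lymphocytes, pos_macrophages, viable_tumor_cells):
    """CPS = PD-L1-positive cells (tumor + lymphocytes + macrophages)
    / total viable tumor cells x 100, capped at 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    if min(pos_tumor, pos_lymphocytes, pos_macrophages) < 0:
        raise ValueError("cell counts must be non-negative")
    raw = 100.0 * (pos_tumor + pos_lymphocytes + pos_macrophages) / viable_tumor_cells
    # The numerator may exceed the denominator, so the score is capped at 100.
    return min(raw, 100.0)
```

With 10 positive tumor cells, 5 positive lymphocytes, and 5 positive macrophages over 200 viable tumor cells, the score is 10, which meets a ≥10 cut point.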
Table 1: Fundamental Comparison Between TPS and CPS Scoring Systems
| Parameter | Tumor Proportion Score (TPS) | Combined Positive Score (CPS) |
|---|---|---|
| Cells Assessed | Viable tumor cells only | Tumor cells, lymphocytes, and macrophages |
| Scoring Formula | (PD-L1+ tumor cells / Total viable tumor cells) × 100 | (PD-L1+ cells [tumor, lymphocytes, macrophages] / Total viable tumor cells) × 100 |
| Score Range | 0% to 100% | 0 to 100 (capped at a maximum of 100) |
| Primary Cancer Applications | Non-small cell lung cancer (NSCLC) | Head and neck squamous cell carcinoma (HNSCC), gastric/GEJ adenocarcinoma, esophageal carcinoma, cervical cancer, triple-negative breast cancer |
| Key Clinical Cut Points | ≥1%, ≥50% | ≥1, ≥10 (varies by cancer type) |
| Inclusion of Immune Microenvironment | No | Yes |
While TPS remains the standard biomarker in NSCLC, emerging evidence suggests CPS may offer improved predictive value. A 2023 retrospective real-world study directly compared the predictive value of CPS and TPS in 187 patients with advanced NSCLC treated with ICI monotherapy [57] [61].
Table 2: Clinical Performance of TPS vs. CPS in Advanced NSCLC (n=187)
| Biomarker Category | PD-L1 Positivity Rate | Overall Survival (OS) Comparison | Statistical Significance |
|---|---|---|---|
| TPS+ (≥1%) | 112 patients (59.9%) | No significant difference vs. TPS- | p = 0.20 |
| CPS+ (≥1) | 135 patients (72.2%) | Significantly longer vs. CPS- | p = 0.006 |
| TPS-/CPS+ | Subgroup of CPS+ population | Superior OS vs. TPS-/CPS- | p = 0.018 |
| TPS+/CPS+ | Subgroup of CPS+ population | Superior OS vs. TPS-/CPS- | p = 0.015 |
This study revealed that CPS differentiated overall survival better than TPS, with the remarkable finding that the TPS-/CPS+ subgroup drove this superior performance [57] [61]. These patients, who would have been classified as PD-L1 negative by traditional TPS scoring but positive by CPS, experienced significantly longer survival with ICI treatment, indicating that CPS captures a biologically relevant immune response that TPS misses.
Figure 1: Comparative Clinical Validation Workflow of TPS vs. CPS in NSCLC. This diagram illustrates the study design and key findings from the comparative analysis of TPS and CPS in 187 patients with advanced NSCLC [57] [61].
Robust analytical validation of IHC assays is fundamental to reliable PD-L1 scoring. The College of American Pathologists (CAP) recently updated guidelines to ensure accuracy and reduce variation, with specific recommendations for predictive markers with distinct scoring systems like PD-L1 [59].
Key Methodological Considerations:
Sample Preparation: Baseline tumor biopsies are stained with hematoxylin and eosin (H&E) and PD-L1 using validated laboratory-developed tests or FDA-approved kits (e.g., PD-L1 clone 22C3 on Dako Autostainer) [57]. For bone metastases, ethylenediamine tetra-acetic acid (EDTA)-based decalcification is used without affecting PD-L1 IHC results [57].
Scoring Validation: Laboratories must separately validate/verify each assay-scoring system combination, especially when the same antibody is used with different scoring algorithms for different cancer types [59].
Tissue Requirements: Specimens must contain at least 100 viable tumor cells in the PD-L1-stained slide to be considered adequate for evaluation [58].
Scoring Methodology: TPS and CPS are typically assessed by pathologists who first review a test cohort together to establish consensus, then score independently with discussion of discrepant cases [57]. Cohen's kappa coefficients are used to evaluate interobserver agreement [57].
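Cohen's kappa corrects the observed agreement between two raters for the agreement expected by chance. A minimal pure-Python sketch follows; the function name is hypothetical, and the example labels in the note below are invented for illustration rather than taken from the cited study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical label")
    return (observed - expected) / (1.0 - expected)
```

For instance, if one pathologist scores four cases as positive, positive, negative, negative and a second scores them positive, negative, negative, negative, observed agreement is 0.75, chance agreement is 0.5, and kappa is 0.5.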
Automated Scoring Systems: To address challenges with manual scoring subjectivity and time consumption, automated systems using deep learning are being developed. One study created an Automated Tumor Proportion Scoring System (ATPSS) that combines image processing with deep learning to segment tumor areas, detect positive membranes, and count nuclei [60]. This system achieved a Mean Absolute Error of 8.65 and Pearson Correlation Coefficient of 0.9436 compared to subspecialty pathologists, potentially improving consistency in PD-L1 assessment [60].
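The two agreement metrics reported for automated scoring, mean absolute error and the Pearson correlation coefficient, can be computed as sketched below. This is an illustrative implementation using only the standard library, not the evaluation code of the cited study; the function names are assumptions.

```python
import math


def mean_absolute_error(predicted, reference):
    """Average absolute difference between automated and reference scores."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

The two metrics answer different questions: a high correlation shows that automated scores track the pathologists' ranking, while the mean absolute error shows how far individual scores deviate in TPS percentage points.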
Liquid Biopsy Approaches: Circulating tumor cells (CTCs) offer a minimally invasive alternative to tissue biopsies for serial monitoring of PD-L1 expression. Exclusion-based sample preparation (ESP) technology enables high-yield capture of CTCs with gentle magnetic movement of antibody-labeled cells through virtual barriers of surface tension [22]. When combined with quantitative microscopy for PD-L1 and HLA I expression, this approach shows promise for predicting progression-free survival in NSCLC patients receiving PD-L1 targeted therapies [22].
Table 3: Key Research Reagents for PD-L1 IHC Assay Development and Validation
| Reagent / Assay Component | Function | Examples / Specifications |
|---|---|---|
| Primary Antibodies | Specific detection of PD-L1 protein | Clones 22C3, 28-8, SP263; FDA-approved companion diagnostics |
| Detection System | Visualization of antibody binding | Dako Autostainer Link 48; laboratory-developed tests validated against pharmDx kits |
| Tissue Processing | Sample preservation and preparation | Formalin-fixed, paraffin-embedded (FFPE) samples; EDTA decalcification for bone metastases |
| Control Materials | Assay validation and quality control | Cell line calibrators; polymer beads coated with recombinant PD-L1 protein |
| Image Analysis | Quantitative assessment of staining | Automated whole-slide imaging scanners; deep learning algorithms for tumor segmentation |
| Cell Capture Technology | Isolation of circulating tumor cells | Exclusion-based sample preparation (ESP); antibody-coated paramagnetic particles |
The field of PD-L1 biomarker testing continues to evolve with several promising developments. Artificial intelligence and machine learning are demonstrating transformative potential in cancer diagnosis, prognosis, and treatment [1]. AI-driven analytics can improve precision medicine by revealing essential biomarker characteristics from diverse datasets, potentially enhancing the discovery of more effective immune checkpoint inhibitors and combinatorial drug strategies [1].
Novel scoring approaches are also emerging, such as the Tumor Area Positivity (TAP) score, which measures PD-L1 expression across defined tumor areas. Recent studies in gastric/esophageal cancers have shown significant agreement between TAP and CPS at various cutoffs (Cohen's κ: 0.64-0.85), with similar overall survival outcomes between TAP score- and CPS-defined PD-L1-positive subgroups [62].
In conclusion, both TPS and CPS provide valuable, complementary approaches to PD-L1 assessment with distinct advantages for different clinical contexts. TPS offers a focused evaluation of tumor-specific PD-L1 expression and remains the standard in NSCLC, while CPS provides a more comprehensive assessment of the tumor immune microenvironment and demonstrates superior predictive value in certain settings [57]. The choice between scoring systems depends on cancer type, therapeutic context, and clinical validation for specific indications. As biomarker science advances, integration of novel technologies including automated scoring, liquid biopsy approaches, and AI-driven analytics will further refine patient selection for immunotherapy, ultimately enhancing treatment outcomes through precision medicine.
The analytical validation of PD-L1 immunohistochemistry (IHC) assays represents a critical prerequisite for accurate patient selection in cancer immunotherapy. While significant attention has focused on analytical factors such as assay interpretation and scoring, pre-analytical variables introduce substantial variability that can compromise test reliability and clinical utility. This guide systematically evaluates the impact of tissue fixation, processing, and antigen preservation on PD-L1 assay performance, providing objective comparisons of different methodologies and their effects on biomarker integrity. Evidence demonstrates that pre-analytical factors significantly influence PD-L1 immunoreactivity, with implications for both clinical trial outcomes and routine diagnostic accuracy [63] [64] [65]. Understanding and standardizing these variables is therefore essential for optimizing PD-L1 as a predictive biomarker for immune checkpoint inhibitor therapy.
Prolonged storage of FFPE tissue blocks significantly impacts PD-L1 antigen preservation, potentially leading to false-negative results and affecting patient eligibility for immunotherapy.
Table 1: Impact of FFPE Block Storage Duration on PD-L1 Immunoreactivity
| Storage Duration | Percentage of Cases with Decreased Staining | Statistical Significance | Clinical Implications |
|---|---|---|---|
| <1 year | 0% | Reference | Optimal preservation |
| 1-2 years | 11% | p = 0.015 | Mild antigen degradation |
| 2-3 years | 13% | p = 0.015 | Moderate antigen degradation |
| ≥3 years | 50% | p = 0.015 | Severe antigen degradation; high risk of false negatives |
A 2025 study on triple-negative breast cancer analyzed 63 cases with PD-L1 testing repeated after varying storage durations. PD-L1 positivity was defined as a Combined Positive Score (CPS) ≥10 using the 22C3 pharmDx assay. The research found a statistically significant decline in PD-L1 immunoreactivity, particularly in blocks stored for three or more years, where half of the previously positive cases showed decreased staining. This highlights the risk of using archived tissues for PD-L1 testing and underscores the necessity of using recent tissue specimens to ensure accurate diagnosis and optimal immune checkpoint inhibitor treatment selection [63].
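A laboratory could use the Table 1 bins to flag archived blocks before retesting. The following is a minimal sketch: the percentages are copied from Table 1, while the handling of exact boundaries (for example, a block aged exactly 2 years falling into the 2-3 year bin) and the names `STORAGE_BINS` and `storage_bin` are assumptions made for illustration.

```python
# (exclusive upper bound in years, label, % of cases with decreased staining),
# with percentages copied from Table 1. Boundary handling is an assumption.
STORAGE_BINS = [
    (1.0, "<1 year", 0),
    (2.0, "1-2 years", 11),
    (3.0, "2-3 years", 13),
    (float("inf"), ">=3 years", 50),
]


def storage_bin(block_age_years):
    """Return (bin label, reported % of cases with decreased staining) for a block age."""
    if block_age_years < 0:
        raise ValueError("block_age_years must be non-negative")
    for upper, label, pct_decreased in STORAGE_BINS:
        if block_age_years < upper:
            return label, pct_decreased
```

A block stored for 3.5 years, for example, maps to the bin in which half of the cases showed decreased staining, which would argue for requesting a more recent specimen where one exists.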
The storage time and temperature of unstained paraffin sections critically affect PD-L1 antigen stability, with room temperature storage leading to rapid degradation.
Table 2: PD-L1 (SP142) Positivity Rate in Unstained Sections Over Time at Different Temperatures
| Storage Time | Room Temperature | 4°C | -20°C | -80°C |
|---|---|---|---|---|
| 1 week | 97.18% | 97.18% | 98.59% | 98.59% |
| 2 weeks | 83.10% | 80.28% | 92.96% | 85.92% |
| 4 weeks | 71.83% | 76.06% | 83.10% | 76.06% |
| 8 weeks | 61.97% | 64.79% | 61.97% | 63.38% |
| 12 weeks | 54.93% | - | - | - |
| 24 weeks | 32.93% | - | - | - |
A 2023 study on invasive breast cancer demonstrated that PD-L1 antigenicity diminishes as storage time increases. While all storage temperatures showed declining positivity rates over 24 weeks, cold storage at 4°C or -20°C significantly slowed this degradation compared to room temperature storage. The study recommended that unstained sections should not be stored for more than 4 weeks, even under cold storage conditions, to maintain reliable PD-L1 (SP142) expression results. For room temperature storage, the reliability window is even shorter, with significant antigen loss observed after just 2 weeks [65].
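The size of the decline is easier to compare across conditions when expressed as a percentage-point drop from the 1-week value. The sketch below transcribes the positivity rates from Table 2 (time points without a reported value are simply omitted); the dictionary keys and the function name are hypothetical.

```python
# Positivity rates (%) transcribed from Table 2, keyed by storage condition
# and weeks of storage. Time points with no reported value are omitted.
POSITIVITY = {
    "room temperature": {1: 97.18, 2: 83.10, 4: 71.83, 8: 61.97, 12: 54.93, 24: 32.93},
    "4C": {1: 97.18, 2: 80.28, 4: 76.06, 8: 64.79},
    "-20C": {1: 98.59, 2: 92.96, 4: 83.10, 8: 61.97},
    "-80C": {1: 98.59, 2: 85.92, 4: 76.06, 8: 63.38},
}


def drop_from_week1(condition, week):
    """Percentage-point fall in positivity rate between week 1 and `week`."""
    rates = POSITIVITY[condition]
    if week not in rates:
        raise KeyError(f"no value reported for {condition} at {week} weeks")
    return round(rates[1] - rates[week], 2)
```

Calling `drop_from_week1("room temperature", 24)` returns the fall between the first and last room-temperature time points in the table.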
Standardized fixation in 10% neutral buffered formalin (NBF) is crucial for reliable PD-L1 testing. Current guidelines recommend fixation durations between 6 and 72 hours, but real-world practices vary considerably. In non-small cell lung cancer (NSCLC), specimens from core needle biopsies or surgical resections should be fixed in 10% NBF for 10-72 hours prior to paraffin embedding to ensure optimal antigen preservation [63]. A nationwide study on cytology specimens revealed substantial variation in fixatives used across pathology laboratories, with alcohol-based fixatives demonstrating negative effects on PD-L1 immunoreactivity compared to formalin-based fixatives. Correcting for differences in fixative and cell block method reduced the number of laboratories with significantly divergent PD-L1 positivity rates from 42.1% to 26.3%, indicating these pre-analytical factors substantially contribute to interlaboratory variation [64].
The choice between biopsy and surgical specimens introduces variability due to tumor heterogeneity. A 2025 study comparing PD-L1 expression between preoperative biopsy and surgical specimens in NSCLC found only 57.6% concordance across three expression categories (negative: <1%, low: 1-49%, high: ≥50%). This discrepancy underscores how PD-L1 expression evaluated using small biopsy specimens may be significantly influenced by sampling chance due to intra-tumoral heterogeneity. This has direct implications for perioperative immunotherapy decisions, where biomarker assessment on small samples must guide treatment planning [66].
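Category-level concordance of this kind is the fraction of paired specimens that land in the same expression category. A minimal sketch follows, assuming paired TPS values are available for each patient; the function names are hypothetical, and the values in the usage note are invented for illustration and are not study data.

```python
def expression_category(tps):
    """Assign a TPS value to the negative (<1%), low (1-49%), or high (>=50%) category."""
    if tps < 1:
        return "negative"
    if tps < 50:
        return "low"
    return "high"


def category_concordance(biopsy_tps, surgical_tps):
    """Fraction of paired cases whose biopsy and surgical specimens share a category."""
    if len(biopsy_tps) != len(surgical_tps) or not biopsy_tps:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        expression_category(b) == expression_category(s)
        for b, s in zip(biopsy_tps, surgical_tps)
    )
    return matches / len(biopsy_tps)
```

With hypothetical biopsy scores of 0, 10, 60, and 45 paired with surgical scores of 0.5, 55, 70, and 20, three of the four pairs share a category, giving a concordance of 0.75.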
Diagram Title: Pre-analytical Factors Affecting PD-L1 Testing Reliability
The following experimental approach was used to evaluate FFPE block storage impact on PD-L1 immunoreactivity:
Study Population and Sample Preparation: Researchers retrospectively analyzed 63 TNBC cases with PD-L1 testing using the 22C3 pharmDx assay at diagnosis. The same FFPE blocks stored at room temperature were re-evaluated after varying storage durations (<1, 1-2, 2-3, ≥3 years). All specimens had been fixed in 10% neutral buffered formalin for 10-72 hours prior to paraffin embedding [63].
Immunohistochemistry Protocol: PD-L1 staining was performed using the PD-L1 IHC 22C3 pharmDx kit on the Dako Autostainer Link 48 platform. Four-micrometer-thick FFPE tissue sections were deparaffinized, and antigen retrieval was performed using EnVision FLEX Target Retrieval Solution (Low pH). After quenching endogenous peroxidase activity, sections were incubated with mouse monoclonal anti-PD-L1 antibody (clone 22C3). Visualization was achieved using the EnVision FLEX visualization system with hematoxylin counterstaining [63].
Assessment and Statistical Analysis: PD-L1 expression was quantified using the Combined Positive Score. PD-L1 positivity was defined as CPS ≥10. Associations with clinicopathologic features were evaluated using appropriate statistical tests, with p-values <0.05 considered significant [63].
This protocol assessed the effect of storage time and temperature on unstained sections:
Sample Preparation and Storage Conditions: The study included 71 PD-L1 (SP142)-positive invasive breast cancer cases. Unstained paraffin sections were stored at room temperature (20-25°C), 4°C, -20°C, and -80°C. PD-L1 staining was performed at 1, 2, 3, 4, 8, 12, and 24 weeks of storage [65].
Immunohistochemistry and Scoring: All sections were stained with PD-L1 (clone SP142) using the OptiView DAB IHC detection kit on a Benchmark XT automatic IHC platform. PD-L1 was scored using the immune cell (IC) positivity score, defined as the percentage of PD-L1-stained immune cells within the tumor area. Expression was considered positive if tumor stromal infiltrating immune cells were ≥1% [65].
Statistical Analysis: A two-way mixed consistency intraclass correlation coefficient (ICC) evaluated the consistency of PD-L1 expression in paraffin sections stored under different conditions compared with fresh sections. ICC values were interpreted as poor (0-0.5), moderate (0.5-0.75), good (0.75-0.9), and excellent (0.9-1.0) [65].
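A two-way mixed, consistency-type ICC can be computed from the mean squares of a two-way ANOVA without replication. The sketch below uses the single-measure form, ICC(3,1) = (MS_rows - MS_error) / (MS_rows + (k - 1) MS_error); whether the cited study used single or average measures is not stated, so that choice is an assumption, as are the function names and the assignment of boundary values (for example, exactly 0.75) to the higher band.

```python
def icc_consistency(ratings):
    """Single-measure, two-way mixed, consistency ICC.

    `ratings` is a list of rows, one per case, each holding k scores
    (one per storage condition or rater).
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 cases and 2 scores per case, all rows equal length")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    denominator = ms_rows + (k - 1) * ms_error
    if denominator == 0:
        raise ValueError("ICC is undefined when all scores are identical")
    return (ms_rows - ms_error) / denominator


def interpret_icc(icc):
    """Apply the poor / moderate / good / excellent bands quoted above."""
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.9:
        return "good"
    return "excellent"
```

Because this is a consistency rather than absolute-agreement ICC, a constant offset between conditions does not lower the coefficient: scores of (1, 2), (2, 3), and (3, 4) yield an ICC of 1.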
Artificial intelligence platforms show promise in mitigating pre-analytical variability in PD-L1 assessment. A 2025 study evaluated an automated pan-organ CPS AI algorithm across multiple tumor types and staining protocols. AI assistance improved interobserver agreement among pathologists, increasing the intraclass correlation coefficient from 62% to 74%. The improvement was particularly pronounced in challenging cases with CPS <20, where ICC improved from 19% to 62%, demonstrating AI's value in reducing variability near critical clinical decision thresholds [67].
Compared to routine manual scoring, AI-based scoring demonstrated superior accuracy (88% versus 75%) and sensitivity (96% versus 78%) while maintaining comparable positive predictive value (88% versus 87%). This enhanced detection capability suggests AI could partially compensate for suboptimal antigen preservation by providing more consistent scoring, particularly in borderline cases [67].
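Accuracy, sensitivity, and positive predictive value all derive from the same 2×2 confusion matrix against a reference standard. A minimal sketch of those definitions follows; the function name is hypothetical, and the counts used in the note below are invented for illustration rather than taken from the cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, and positive predictive value from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tp + fp) == 0:
        raise ValueError("counts must include reference-positive and test-positive cases")
    return {
        "accuracy": (tp + tn) / total,   # all correct calls over all cases
        "sensitivity": tp / (tp + fn),   # reference-positive cases detected
        "ppv": tp / (tp + fp),           # test-positive calls that are truly positive
    }
```

With hypothetical counts of 40 true positives, 10 false positives, 40 true negatives, and 10 false negatives, all three metrics equal 0.8. A method can raise sensitivity while PPV stays flat, as in the comparison above, when it recovers additional true positives and adds false positives in roughly the same proportion.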
Table 3: Key Research Reagents for PD-L1 Pre-analytical Studies
| Reagent/Resource | Specific Example | Research Application | Function |
|---|---|---|---|
| Anti-PD-L1 Antibodies | 22C3 pharmDx (Agilent) | Companion diagnostic for pembrolizumab | PD-L1 detection in IHC |
| Anti-PD-L1 Antibodies | SP263 (Ventana) | Companion diagnostic for durvalumab | PD-L1 detection in IHC |
| Anti-PD-L1 Antibodies | SP142 (Ventana) | Companion diagnostic for atezolizumab | PD-L1 detection in IHC, primarily on IC |
| IHC Platforms | Dako Autostainer Link 48 | Automated IHC staining | Standardized assay execution |
| IHC Platforms | Benchmark XT/ULTRA (Ventana) | Automated IHC staining | Standardized assay execution |
| Digital Pathology Tools | Whole Slide Scanners | Digital image acquisition | Enables AI analysis and remote review |
| AI Scoring Software | DiaKwant PD-L1 algorithm | Automated CPS quantification | Reduces interobserver variability |
Pre-analytical factors including specimen storage duration, storage conditions, fixation methods, processing techniques, and tissue heterogeneity significantly impact PD-L1 antigen preservation and assay performance. The evidence demonstrates that prolonged storage of FFPE blocks beyond three years and unstained sections beyond 2-4 weeks substantially reduces PD-L1 immunoreactivity, potentially leading to false-negative results and affecting patient eligibility for immunotherapy. Standardization of pre-analytical protocols is essential for reliable PD-L1 testing, particularly in the context of clinical trials and companion diagnostic development. Emerging technologies such as AI-assisted scoring show promise in mitigating some variability, but cannot replace proper specimen handling and storage practices. For translational researchers and drug developers, rigorous attention to these pre-analytical variables is fundamental to ensuring accurate biomarker assessment and optimizing patient selection for immunotherapy.
Programmed Death-Ligand 1 (PD-L1) expression serves as a critical biomarker for predicting responses to immune checkpoint inhibitors across multiple cancer types. However, its assessment is significantly complicated by substantial spatial and temporal heterogeneity within tumors. Spatial heterogeneity refers to the variations in PD-L1 expression across different geographical regions of the same tumor, while temporal heterogeneity encompasses changes in expression patterns over time and in response to therapeutic interventions. This heterogeneity presents substantial challenges for biomarker-driven patient selection, as biopsy samples may not accurately represent the overall PD-L1 status of the entire tumor mass. Understanding these variations is therefore essential for developing accurate diagnostic approaches and optimizing immunotherapy outcomes.
The tumor microenvironment (TME) plays a pivotal role in shaping PD-L1 expression patterns through dynamic interactions between tumor cells, immune cells, and stromal components. Immune checkpoint receptors and ligands are expressed on diverse cell types within the TME, including tumor cells, macrophages, T cells, and endothelial cells, creating a complex network of immunosuppressive signals [68]. This review systematically examines the spatial and temporal dimensions of PD-L1 heterogeneity, compares currently available PD-L1 assays, details experimental methodologies for comprehensive assessment, and discusses emerging strategies to address heterogeneity challenges in clinical practice and research.
Spatial heterogeneity in PD-L1 expression manifests as varying distribution patterns across different regions of the same tumor, between primary and metastatic sites, and even within individual tumor cells. Research in esophageal squamous cell carcinoma (ESCC) has demonstrated significant intratumor spatial heterogeneity in PD-L1 expression when sampling multiple distinct tumor regions using endoscopic biopsy forceps [69]. This variability can lead to substantial sampling bias when relying on limited biopsy specimens, potentially misclassifying patients who might benefit from immunotherapy.
The clinical implications of spatial heterogeneity are profound. In ESCC, studies have found that spatial heterogeneity was reduced when the tumor's combined positive score (CPS) was sufficiently high, suggesting that tumors with robust PD-L1 expression may be more uniformly positive [69]. Multi-region sampling assessment revealed that the maximum CPS derived from three distinct regions provided a more accurate approximation of the bulk tumor's PD-L1 status than single-region biopsies [69]. This finding highlights the importance of comprehensive sampling strategies to overcome spatial heterogeneity challenges in clinical practice.
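To make the maximum-CPS approach concrete, the sketch below computes the combined positive score per region and takes the maximum across regions. It uses the standard CPS definition (PD-L1-staining tumor cells, lymphocytes, and macrophages divided by viable tumor cells, multiplied by 100 and capped at 100). The region names and cell counts are hypothetical.

```python
def combined_positive_score(pos_tumor_cells, pos_immune_cells, viable_tumor_cells):
    """CPS = 100 * (PD-L1+ tumor cells + PD-L1+ lymphocytes/macrophages)
    / viable tumor cells, capped at 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable tumor cell count must be positive")
    raw = 100.0 * (pos_tumor_cells + pos_immune_cells) / viable_tumor_cells
    return min(100.0, raw)


# Hypothetical counts per region:
# (PD-L1+ tumor cells, PD-L1+ immune cells, viable tumor cells)
regions = {
    "region_A": (12, 8, 400),
    "region_B": (40, 20, 300),
    "region_C": (2, 1, 300),
}

cps_by_region = {
    name: combined_positive_score(*counts) for name, counts in regions.items()
}
max_cps = max(cps_by_region.values())

print(cps_by_region)  # {'region_A': 5.0, 'region_B': 20.0, 'region_C': 1.0}
print(max_cps)        # 20.0
```

In this example a single biopsy from region C alone would report a CPS of 1, while the three-region maximum is 20, which illustrates how sampling location can move a case across a clinical cutoff.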
Spatial heterogeneity in PD-L1 expression is closely linked to the composition and distribution of immune cells within the tumor microenvironment. In ESCC, PD-L1 expression positively correlated with the density of infiltrating T cells, particularly CD8+ and CD4+ T cells [69]. This relationship suggests that PD-L1 expression is often induced by local immune pressure, creating geographically distinct immunologically "hot" and "cold" regions within the same tumor.
Pan-cancer analyses have revealed that immune checkpoint receptors and ligands exhibit cell-specific expression patterns within the TME [70] [68]. For instance, PD-L1 is highly expressed on macrophages and tumor cells, while immune checkpoint receptors such as LAG3 and TIGIT are predominantly found on CD8+ T cells [68]. This cellular compartmentalization of immune checkpoint molecules adds another layer of complexity to spatial heterogeneity, as the functional significance of PD-L1 expression may depend on which cell type is expressing it and its spatial relationship with complementary receptors on immune cells.
Table 1: Factors Contributing to Spatial Heterogeneity of PD-L1 Expression
| Factor | Impact on PD-L1 Heterogeneity | Clinical Implications |
|---|---|---|
| Regional Immune Infiltration | Varying densities of T cells across tumor regions create mosaic expression patterns | Sampling limited to immune-cell poor areas may underestimate PD-L1 status |
| Tumor Microenvironment Architecture | Distinct expression patterns in invasive margin vs. tumor center | Biopsy location significantly influences PD-L1 assessment |
| Cellular Source | Differential expression on tumor cells vs. immune cells | Scoring algorithms must account for cellular compartmentalization |
| Hypoxic Gradients | Perinecrotic and hypoxic regions often show elevated PD-L1 expression | Geographic sampling bias may over- or under-estimate overall expression |
Temporal heterogeneity in PD-L1 expression refers to the changes that occur over time, both naturally during disease progression and in response to therapeutic interventions. The dynamic nature of the tumor immune microenvironment means that PD-L1 expression is not static but can evolve under selective pressures, including prior treatments. Longitudinal studies that track PD-L1 changes over time are not covered in this review, but this aspect represents a critical dimension of heterogeneity with significant clinical implications.
Therapies themselves can profoundly influence PD-L1 expression patterns. Radiation, chemotherapy, and targeted therapies have been shown to modulate the tumor immune microenvironment, potentially altering PD-L1 expression on both tumor and immune cells. These treatment-induced changes may explain discrepancies in PD-L1 status between initial diagnostic specimens and samples taken after disease progression or between primary and recurrent tumors. Understanding these temporal dynamics is essential for determining the optimal timing for biomarker assessment and for interpreting PD-L1 status in the context of prior therapies.
Temporal heterogeneity poses significant challenges for biomarker-driven treatment decisions, particularly when therapeutic selection relies on historical specimens that may not reflect current tumor biology. This is especially relevant in the advanced disease setting, where biopsies are often obtained at initial diagnosis but treatment decisions for later-line therapies must account for potential changes in the immune microenvironment during disease progression.
The emergence of novel immune checkpoint receptors and ligands beyond PD-1/PD-L1 adds further complexity to temporal dynamics. Pan-cancer analyses have identified numerous co-inhibitory receptors (LAG3, TIGIT, TIM-3) and their corresponding ligands that exhibit distinct expression patterns across different cell types in the TME [68]. The relative expression of these alternative immune checkpoints may change over time and in response to selective pressures, potentially driving resistance to PD-1/PD-L1 blockade. Comprehensive temporal mapping of the broader immune checkpoint landscape will be essential for developing effective combination strategies and sequencing approaches.
Multiple PD-L1 immunohistochemical (IHC) assays have been developed as companion diagnostics for immune checkpoint inhibitors, each utilizing different antibody clones, staining platforms, and scoring algorithms. A recent comprehensive evaluation of four FDA-approved PD-L1 assays (22C3, 28-8, SP142, and SP263) in clear cell renal cell carcinoma (ccRCC) revealed significant differences in detection rates and concordance [13]. These disparities highlight the impact of technical factors on PD-L1 assessment and underscore the challenges posed by tumor heterogeneity in achieving consistent results across different assay platforms.
The study demonstrated substantial variability in PD-L1 detection rates depending on the cellular compartment assessed. For tumor cells, PD-L1 positivity was extremely low across all four assays. In contrast, PD-L1 positivity in tumor-infiltrating immune cells was approximately 15% for 22C3, 28-8, and SP263 assays, but only 2.1% for the SP142 assay [13]. This finding indicates that the SP142 assay has fundamentally different detection characteristics, particularly for immune cell PD-L1 expression, which could significantly impact patient classification for immunotherapy.
Table 2: Comparison of FDA-Approved PD-L1 Assays in Clear Cell Renal Cell Carcinoma
| Assay | Tumor Cell Positivity | Immune Cell Positivity | Pairwise Concordance with 28-8 (κ statistics) | Prognostic Significance |
|---|---|---|---|---|
| 22C3 | Very low | 14.7% | 0.52 | Worse cancer-specific survival with IC positivity |
| 28-8 | 2.1% | 16.1% | Reference | Worse cancer-specific survival with IC positivity |
| SP142 | 2.1% | 2.1% | 0.16 | Limited prognostic value |
| SP263 | Very low | 15.0% | 0.46 | Worse cancer-specific survival with combined TC/IC scoring |
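The pairwise κ values in the table are agreement statistics corrected for chance. As a minimal sketch, Cohen's kappa for two assays scored on the same cases can be computed as below; the function name and the positive/negative calls are hypothetical and are not data from the cited study.

```python
def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two raters (here, two assays) on the same cases."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("both assays must score the same non-empty set of cases")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    labels = set(calls_a) | set(calls_b)
    # Agreement expected by chance from each assay's marginal call rates
    expected = sum(
        (calls_a.count(label) / n) * (calls_b.count(label) / n) for label in labels
    )
    if expected == 1.0:
        return 1.0  # both assays give one identical label for every case
    return (observed - expected) / (1.0 - expected)


# Hypothetical immune-cell PD-L1 calls (1 = positive, 0 = negative) on ten cases
assay_28_8 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
assay_22c3 = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

print(round(cohen_kappa(assay_28_8, assay_22c3), 2))  # 0.52
```

Raw agreement here is 80%, yet kappa is about 0.52 because most cases are negative for both assays. This is why kappa rather than percent agreement is typically reported when positivity rates are low.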
In response to the challenges posed by assay variability and tumor heterogeneity, professional organizations have developed guidelines to standardize PD-L1 testing approaches. The College of American Pathologists (CAP), in collaboration with several professional societies, has published evidence-based recommendations for PD-L1 testing in patients with non-small cell lung cancer (NSCLC) [71]. These guidelines emphasize the use of validated PD-L1 IHC assays, appropriate technical validation for different specimen types, and standardized reporting using percent expression scores.
The CAP guideline recommends that pathologists use clinically validated PD-L1 IHC assays as intended by their regulatory approvals whenever feasible [71]. However, recognizing practical constraints related to cost and access, the guideline also endorses the use of laboratory-developed tests (LDTs) provided they undergo proper technical validation against one or more approved companion diagnostic assays. This balanced approach seeks to maintain testing quality while ensuring broad patient access to essential biomarker assessment.
Robust assessment of PD-L1 heterogeneity requires specialized experimental approaches designed to capture spatial and temporal variations. Multi-region sampling represents a key strategy for addressing spatial heterogeneity, as demonstrated in ESCC research where four distinct tumor regions were sampled using endoscopic biopsy forceps [69]. This approach enables comprehensive mapping of PD-L1 distribution patterns and provides insights into the relationship between PD-L1 expression and local immune contexture.
The experimental workflow for multi-region PD-L1 assessment typically involves several key steps: (1) identification of geographically separate tumor regions for sampling, (2) collection of multiple specimens using biopsy forceps or core needles, (3) individual processing and embedding of each sample, (4) PD-L1 immunohistochemical staining using validated assays, and (5) standardized scoring by qualified pathologists. In research settings, additional analyses such as immune cell density quantification, genomic characterization, and transcriptomic profiling may be performed on each region to correlate PD-L1 expression with other features of the TME.
Spatio-Temporal Assessment Workflow
Robust analytical validation is essential for ensuring accurate PD-L1 assessment in the context of tumor heterogeneity. The PD-L1 IHC 22C3 pharmDx assay protocol exemplifies a standardized approach for PD-L1 evaluation [72]. This protocol specifies detailed methodologies for sample preparation, staining conditions, and interpretation criteria, with membranous PD-L1 expression on tumor cells quantified using tumor proportion scores (TPS) with established cutoffs (≥50% = strong positive; 1-49% = weak positive; <1% = negative) [72].
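The TPS cutoffs above translate directly into a small classification routine. The sketch below is illustrative only: the function names are hypothetical, and treating any value from 1% up to (but not including) 50% as weak positive is an assumption about how fractional scores are handled.

```python
def tumor_proportion_score(pos_tumor_cells, viable_tumor_cells):
    """TPS = percentage of viable tumor cells with membranous PD-L1 staining."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable tumor cell count must be positive")
    return 100.0 * pos_tumor_cells / viable_tumor_cells


def classify_tps(tps):
    """Map a TPS percentage onto the three reporting categories."""
    if not 0.0 <= tps <= 100.0:
        raise ValueError("TPS must lie between 0 and 100")
    if tps >= 50.0:
        return "strong positive"
    if tps >= 1.0:
        return "weak positive"
    return "negative"


# Hypothetical counts: 130 PD-L1-positive cells among 400 viable tumor cells
tps = tumor_proportion_score(130, 400)
print(tps, classify_tps(tps))  # 32.5 weak positive
```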
For comprehensive heterogeneity assessment, validation protocols should address several key elements: (1) pre-analytical factors including sample collection, fixation, and processing; (2) analytical consistency across multiple tumor regions; (3) scoring reproducibility between observers; and (4) integration with other biomarker data. Tissue microarrays (TMAs) constructed from multiple tumor regions represent a valuable tool for standardized evaluation of PD-L1 expression across different assays under controlled conditions [13]. This approach facilitates direct comparison of assay performance and enhances our understanding of how different platforms detect heterogeneous PD-L1 expression.
Table 3: Essential Research Reagents for PD-L1 Heterogeneity Studies
| Reagent Category | Specific Examples | Research Application | Technical Considerations |
|---|---|---|---|
| FDA-Approved PD-L1 IHC Assays | 22C3 pharmDx, 28-8 pharmDx, SP142, SP263 | Companion diagnostic validation; assay comparison studies | Different staining intensities and cellular localization patterns |
| Laboratory-Developed Test Reagents | Optimized antibody clones on automated platforms | Development of validated LDTs when approved assays unavailable | Require extensive validation against clinical outcome data |
| Immune Cell Markers | CD8, CD4, CD68, FoxP3 | Correlation of PD-L1 expression with immune contexture | Multiplex IHC enables spatial relationship analysis |
| Digital Pathology Tools | Image analysis algorithms for quantitative scoring | Objective assessment of PD-L1 expression heterogeneity | Reduce inter-observer variability in complex staining patterns |
| Spatial Biology Platforms | Multiplex immunofluorescence, CODEX, GeoMx | Comprehensive mapping of immune checkpoint topography | Enable correlation of PD-L1 with multiple TME parameters simultaneously |
The spatial and temporal heterogeneity of PD-L1 expression represents a fundamental challenge in immuno-oncology, with significant implications for patient selection, response prediction, and therapeutic outcomes. Current evidence indicates that multi-region sampling approaches and maximum CPS scoring from three regions can provide more accurate assessment of tumor PD-L1 status compared to single biopsies [69]. Furthermore, the limited concordance among different FDA-approved PD-L1 assays highlights the need for continued standardization efforts and assay-specific validation [13].
Emerging technologies offer promising avenues for addressing heterogeneity challenges. Digital pathology and artificial intelligence-based image analysis algorithms are being increasingly employed to provide more consistent and quantitative assessment of PD-L1 expression patterns [73]. These tools can help identify complex heterogeneity patterns that may be difficult to discern through conventional manual scoring. Additionally, multiplex immunohistochemistry and spatial transcriptomics enable comprehensive profiling of the immune microenvironment, allowing researchers to correlate PD-L1 heterogeneity with other features of the TME.
The growing understanding of immune checkpoint biology beyond PD-1/PD-L1 suggests that future biomarker strategies will need to account for the complex interplay between multiple inhibitory and stimulatory pathways [68]. Pan-cancer analyses have revealed that various immune checkpoint receptors and ligands exhibit distinct expression patterns across different cell types in the TME, creating a complex regulatory network [70] [68]. Comprehensive mapping of this network, with its inherent spatial and temporal heterogeneity, will be essential for developing next-generation biomarkers and combination therapies that can overcome resistance mechanisms and benefit more patients.
Research Landscape and Clinical Implications
The analytical validation of PD-L1 assays is a critical step in ensuring accurate patient selection for immunotherapy. The pre-analytical phase, particularly sample type selection, introduces significant variability that can impact assay performance and subsequent treatment decisions. This guide objectively compares the performance of biopsy specimens, surgical resection specimens, and cytology specimens for PD-L1 immunohistochemistry (IHC) testing in non-small cell lung cancer (NSCLC), providing researchers and drug development professionals with consolidated experimental data and methodologies.
Surgical resections provide large tissue volumes for analysis but are often unavailable for patients with advanced disease. Small biopsies remain the primary diagnostic material, though they are subject to tumor heterogeneity influences.
Table 1: Concordance Between Biopsy and Surgical Resection Specimens for PD-L1 Expression in NSCLC
| PD-L1 Cutoff | Relative Risk (RR) | 95% Confidence Interval | P-value | Conclusion | Study Details |
|---|---|---|---|---|---|
| 1% | 0.89 | 0.70-1.12 | P=0.33 | No significant difference in detection rate | Meta-analysis of 12 studies (n=877 patients) [74] |
| 50% | 0.69 | 0.58-0.83 | P<0.01 | Significantly lower detection rate in biopsies | Meta-analysis of 12 studies (n=877 patients) [74] |
| Three Categories (<1%, 1-49%, ≥50%) | - | - | - | 57.6% concordance rate | Recent study (2025) of 33 patients [66] |
A recent 2025 study underscored this challenge, reporting only 57.6% concordance in three-category PD-L1 classification (negative: <1%, low: 1-49%, high: ≥50%) between preoperative biopsies and subsequent surgical specimens [66]. The study concluded that PD-L1 expression evaluated using small biopsy specimens may be largely influenced by chance due to intra-tumoral heterogeneity [66].
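A concordance rate of this kind is simply the fraction of paired specimens assigned to the same category. For scale, 57.6% in 33 patients is arithmetically consistent with 19 concordant pairs. The sketch below shows the calculation and also tallies the direction of each disagreement; the paired categories are hypothetical and the function name is illustrative.

```python
from collections import Counter


def category_concordance(biopsy, resection):
    """Return (concordance rate, Counter of discordant (biopsy, resection) pairs)."""
    if len(biopsy) != len(resection) or not biopsy:
        raise ValueError("need one resection category per biopsy category")
    pairs = list(zip(biopsy, resection))
    agree = sum(b == r for b, r in pairs)
    discordant = Counter(pair for pair in pairs if pair[0] != pair[1])
    return agree / len(pairs), discordant


# Hypothetical three-category calls for five paired specimens
biopsy = ["<1%", "1-49%", "1-49%", ">=50%", "<1%"]
resection = ["<1%", ">=50%", "1-49%", ">=50%", "1-49%"]

rate, discordant = category_concordance(biopsy, resection)
print(rate)              # 0.6
print(dict(discordant))  # which category shifts occurred, and how often
```

Tabulating the discordant pairs shows whether biopsies tend to under-call relative to resections, which is the pattern the meta-analysis reports at the 50% cutoff.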
Cytology specimens, including cell blocks from fine-needle aspiration (FNA) and endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA), are often the only available material from advanced NSCLC patients. Evidence supports their validity for PD-L1 testing.
Table 2: Diagnostic Accuracy of Cytologic vs. Paired Histologic Specimens for PD-L1 Testing
| PD-L1 Cutoff | Sensitivity (Pooled) | Specificity (Pooled) | Study Details |
|---|---|---|---|
| ≥1% | 0.84 (95% CI: 0.77-0.89) | 0.88 (95% CI: 0.82-0.93) | Meta-analysis of 26 articles (1,064 specimen pairs) [75] |
| ≥50% | 0.78 (95% CI: 0.69-0.86) | 0.94 (95% CI: 0.91-0.96) | Meta-analysis of 26 articles (1,064 specimen pairs) [75] |
This meta-analysis confirms that cytologic specimens provide an accurate assessment of PD-L1 expression at standard clinical cutoffs [75]. The International Association for the Study of Lung Cancer (IASLC) states that all cytologic preparations, including cell blocks, ethanol-fixed, and air-dried slides, can be used for immunocytochemistry (ICC) [76].
PD-L1 expression demonstrates spatial heterogeneity not only within a single tumor but also between primary and metastatic sites.
Table 3: PD-L1 Expression in Primary Lung vs. Extrathoracic Metastatic Sites
| Sample Site | PD-L1 Positive Rate (TPS ≥1%) | Average TPS | Statistical Significance vs. Primary NSCLC |
|---|---|---|---|
| Primary NSCLC (Reference) | 53.50% | 17.87% | - |
| All Extrathoracic Metastases | 61.83% | 26.24% | P=0.03 [77] |
| Liver Metastases | 85.71% | Not specified | P<0.05 [77] |
| Adrenal Metastases | 77.78% | Not specified | P<0.05 [77] |
| Lymph Node Metastases | 60.00% | Not specified | Not significant [77] |
| Brain, Bone, Soft Tissue, Pleural Metastases | 40.00%-66.67% | Not specified | Not significant [77] |
This study demonstrated that PD-L1 expression is frequently higher in metastatic lesions, with significant variation across different organ sites [77]. This has profound implications for biomarker development, as the sampling site may influence the PD-L1 score obtained.
The following diagram outlines a standardized workflow for studies comparing PD-L1 expression across different specimen types:
Specimen Processing and IHC Protocol (Based on [66]):
Quality Control Measures (Based on [78]):
Table 4: Essential Reagents and Materials for PD-L1 Assay Validation Studies
| Item | Function/Application | Examples/Specifications |
|---|---|---|
| PD-L1 IHC Assays | Companion diagnostics for specific immune checkpoint inhibitors | 22C3 pharmDx (Dako, for pembrolizumab), 28-8 (Dako, for nivolumab), SP263 (Ventana, for durvalumab), SP142 (Ventana, for atezolizumab) [79] [66] |
| Automated IHC Platforms | Standardized staining conditions to minimize inter-laboratory variability | Dako Autostainer Link48, Ventana Benchmark series [79] |
| Cell Block Preparation Kits | Processing cytologic samples into FFPE-like blocks for IHC | Various commercial kits; formalin fixation recommended for optimal results [76] [80] |
| Positive Control Tissues | Ensuring staining protocol performance in each run | Placental tissue (strong positive), Tonsil tissue (variable positive patterns) [78] |
| Digital Image Analysis Software | Objective quantification of PD-L1 expression, reducing inter-observer variability | Platforms like Leica Aperio Imagescope can be used for research purposes [81] |
For researchers designing clinical trials for novel immunotherapies, these findings highlight critical considerations:
The consistent message across studies is that while cytology and small biopsy specimens are generally adequate for PD-L1 testing, understanding their limitations is crucial for appropriate analytical validation and clinical interpretation.
The advent of immune checkpoint inhibition therapy has established programmed death ligand 1 (PD-L1) immunohistochemistry (IHC) as a critical predictive biomarker in oncology. However, the analytical landscape for PD-L1 detection is characterized by significant complexity, with multiple assays utilizing different antibody clones, staining platforms, and scoring methodologies. This variability presents substantial challenges for clinical implementation and data interpretation across different research studies and diagnostic laboratories. The harmonization of pre-treatment assessments is essential, as the growing use of checkpoint inhibitors demands greater standardization to ensure appropriate patient selection for therapy [82].
Multiple approved PD-1/PD-L1 inhibitor drugs are accompanied by different diagnostic methods, each with distinct characteristics. Within these methods, staining platforms may vary, but the primary antibody differs in every case, creating a fragmented diagnostic landscape [82]. Furthermore, diagnostic methods may assess PD-L1 levels either throughout the tumor tissue or within infiltrating immune cells, and PD-L1 positive thresholds differ across studies and clinical trials, complicating cross-comparison of research findings and clinical outcomes [82]. Understanding the sources and magnitude of this inter-assay variability is therefore fundamental to both clinical research and diagnostic practice.
The primary antibody clone represents one of the most significant sources of variability in PD-L1 IHC assays. Different clones demonstrate marked differences in analytical sensitivity and staining characteristics, which can directly impact patient classification as PD-L1 positive or negative.
Table 1: Comparison of Common Anti-PD-L1 Antibody Clones
| Antibody Clone | Associated Drug | Staining Platform | Relative Sensitivity | Key Characteristics |
|---|---|---|---|---|
| 22C3 | Pembrolizumab | Dako Autostainer | Moderate | FDA-approved companion diagnostic; harmonizes well with 28-8 and SP263 |
| 28-8 | Nivolumab | Dako Autostainer | Moderate | FDA-approved complementary diagnostic; shows high concordance with 22C3 and SP263 |
| SP263 | Durvalumab | Ventana Benchmark | Moderate | FDA-approved companion diagnostic; staining pattern similar to 22C3 and 28-8 |
| SP142 | Atezolizumab | Ventana Benchmark | Lower | FDA-approved companion diagnostic; typically stains fewer tumor cells |
| 73-10 | Investigational | Dako Autostainer | Higher | Not FDA-approved; demonstrates higher tumor cell staining intensity |
| E1L3N | Research Use | Various | Variable | Commonly used in research settings (11.5% of studies) |
The most commonly used anti-PD-L1 antibody clones in research and clinical practice include 22C3 (30.8%), SP142 (19.2%), SP263 (15.4%), and E1L3N (11.5%) [83]. International comparison studies such as the Blueprint Programmed Death Ligand 1 Immunohistochemistry Comparability Project have revealed that while three of the major assay systems (22C3, 28-8, and SP263) generate largely consistent staining results, others show significant deviations [82]. Specifically, the SP142 antibody consistently demonstrates lower sensitivity, staining fewer tumor cells than other assays, which would result in fewer patients being designated as PD-L1 positive [82]. Conversely, the 73-10 antibody shows higher sensitivity, staining more tumor cells and potentially leading to more positive designations if approved for clinical use [82].
The underlying mechanisms for these sensitivity differences are multifactorial, with epitope binding variation being a significant contributor [82]. Research comparing antibody clones for PD-L1 IHC detection has noted extensive differences in epitope recognition, with some antibodies (73-10 and SP142) binding to intracellular epitopes and others (28-8) to extracellular epitopes of PD-L1 [82]. However, the relationship between epitope location and staining sensitivity is not straightforward, as SP142 (intracellular binding) provides the least sensitivity in PD-L1 detection while 73-10 (also intracellular binding) provides the most [82]. Other aspects of antigen binding that likely affect assay performance include antibody affinity, on-off kinetics, and the interaction between primary and secondary antibodies [82].
Beyond antibody clones, the technical platform and scoring approach contribute substantially to inter-assay variability. The most common IHC platforms for PD-L1 detection include the Dako Autostainer and Ventana Benchmark systems, each with proprietary detection chemistry and amplification systems that can influence staining intensity and background [82].
Scoring methodologies represent another critical source of variability, with significant differences in how PD-L1 expression is quantified:
The inter-observer consistency in scoring also varies significantly depending on the cell type being assessed. Studies have shown that pathologists demonstrate remarkable consistency in scoring stained tumor cells (intraclass correlation coefficient of 0.88-0.93), which is considered good to excellent agreement [82]. However, there is substantially more variation between pathologists in scoring the staining of immune cells, with consistency scores reflecting a low level of agreement between pathologists (Fleiss kappa statistic 0.11-0.28) [82]. This finding underscores that scoring PD-L1 staining of infiltrating immune cells remains particularly challenging and contributes significantly to overall assay variability.
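Fleiss' kappa extends chance-corrected agreement to more than two raters. The following is a minimal sketch of the standard formula, assuming every case is scored by the same number of pathologists; the function name and the ratings are hypothetical.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa. ratings: list of cases, each a list of category labels,
    one label per rater; every case must have the same number of raters."""
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_cases < 1 or n_raters < 2 or any(len(c) != n_raters for c in ratings):
        raise ValueError("need >=2 raters and the same number of raters per case")

    categories = sorted({label for case in ratings for label in case})
    counts = [[case.count(cat) for cat in categories] for case in ratings]

    # Per-case agreement: proportion of rater pairs that agree on that case
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of ratings in each category
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(s * s for s in shares)
    if p_expected == 1.0:
        return 1.0  # every rating falls in a single category
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical immune-cell calls (1 = positive) from three pathologists on four cases
ic_calls = [[1, 1, 1], [1, 1, 0], [0, 0, 0], [1, 0, 0]]
print(round(fleiss_kappa(ic_calls), 3))  # 0.333
```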
The fundamental question for clinical laboratories is whether different PD-L1 assays can be used interchangeably for specific clinical purposes. A comprehensive meta-analysis addressing this question evaluated the diagnostic accuracy of various PD-L1 assays at specific clinical cut-points defined for specific immunotherapies [84]. This analysis incorporated 376 assay comparisons from 22 published studies, providing substantial evidence for evaluating inter-assay performance.
Table 2: Diagnostic Accuracy of PD-L1 Assays for Different Clinical Purposes
| Assay Type | Clinical Purpose | Sensitivity Range | Specificity Range | Interchangeability Recommendation |
|---|---|---|---|---|
| FDA-approved Companion Diagnostic | Pembrolizumab (TPS ≥1%) | 85-98% | 92-97% | Reference standard for intended use |
| FDA-approved Companion Diagnostic | Nivolumab (TPS ≥5%) | 82-95% | 88-96% | Reference standard for intended use |
| Laboratory Developed Tests | Various purposes | 45-99% | 50-98% | Highly variable; requires rigorous validation |
| Alternate FDA-approved Assays | Cross-purpose application | 65-92% | 70-94% | Not recommended without validation |
The meta-analysis revealed that laboratory-developed tests (LDTs) show wide variability in diagnostic accuracy, with sensitivity ranging from 45-99% and specificity from 50-98% depending on the specific validation protocols used [84]. This variability stems from differences in IHC protocol conditions across laboratories, even when using the same primary antibody on the same automated instrument with the same detection system. These differences can include variations in antigen retrieval methods, primary antibody dilution, incubation time, and amplification steps [84].
When applying clinically acceptable diagnostic accuracy thresholds (both sensitivity and specificity ≥90%), the evidence suggests that replacing an FDA-approved companion diagnostic developed for a specific purpose with another FDA-approved companion diagnostic developed for a different purpose generally does not maintain sufficient diagnostic accuracy [84]. This finding highlights the importance of the "fit-for-purpose" approach to test development and validation, which establishes explicit links between Disease, Drug, and Diagnostic assay (the "3D" concept) [84].
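The acceptance rule described above can be expressed as a simple check of a candidate assay against the reference companion diagnostic at one clinical cut-point. This is a sketch under stated assumptions: the function names and paired calls are hypothetical, and the 0.90 default mirrors the threshold quoted in the text rather than any regulatory requirement.

```python
def sensitivity_specificity(reference, candidate):
    """Sensitivity and specificity of candidate calls against reference calls.
    Both inputs are lists of booleans (True = PD-L1 positive at the cut-point)."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must cover the same cases")
    tp = sum(r and c for r, c in zip(reference, candidate))
    fn = sum(r and not c for r, c in zip(reference, candidate))
    tn = sum((not r) and (not c) for r, c in zip(reference, candidate))
    fp = sum((not r) and c for r, c in zip(reference, candidate))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need both reference-positive and reference-negative cases")
    return tp / (tp + fn), tn / (tn + fp)


def meets_accuracy_threshold(sensitivity, specificity, threshold=0.90):
    """True only when both sensitivity and specificity reach the threshold."""
    return sensitivity >= threshold and specificity >= threshold


# Hypothetical paired calls: 20 reference-positive and 20 reference-negative cases
reference = [True] * 20 + [False] * 20
candidate = [True] * 19 + [False] + [True] * 3 + [False] * 17

sens, spec = sensitivity_specificity(reference, candidate)
print(sens, spec)                            # 0.95 0.85
print(meets_accuracy_threshold(sens, spec))  # False
```

The example candidate fails despite high sensitivity because its specificity falls below 90%, which is the kind of asymmetry that makes cross-purpose substitution unreliable without validation.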
Beyond conventional IHC, emerging methodologies offer alternative approaches for assessing the PD-1/PD-L1 axis. Multiplex immunohistochemistry/immunofluorescence (mIHC/IF) has demonstrated superior performance in predicting response to anti-PD-1/PD-L1 therapy, exhibiting the highest sensitivity (0.76) and second-highest diagnostic odds ratio (5.09) among various biomarker testing modalities [15]. This enhanced performance likely stems from the ability to simultaneously evaluate multiple cell types and spatial relationships within the tumor microenvironment.
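For readers less familiar with the diagnostic odds ratio, it summarizes a 2x2 table as the odds of a positive test in responders divided by the odds of a positive test in non-responders. The sketch below uses hypothetical counts and is not a reconstruction of the pooled estimate above, which comes from meta-analytic modeling.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN) for a single 2x2 table."""
    if fp == 0 or fn == 0:
        raise ValueError("DOR is undefined when FP or FN is zero "
                         "(a continuity correction would be needed)")
    return (tp * tn) / (fp * fn)


# Hypothetical cohort: 40 responders (30 test-positive), 60 non-responders (15 test-positive)
print(diagnostic_odds_ratio(tp=30, fp=15, fn=10, tn=45))  # 9.0
```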
Other biomarker approaches include:
For soluble PD-L1 detection, enzyme-linked immunosorbent assay (ELISA) and electrochemiluminescent immunoassay methodologies have been developed and popularized in recent years (2019-2021), offering advantages of easy accessibility, non-invasiveness (using blood samples), quantitative outputs, and relatively rapid turnaround times [83].
The Blueprint Programmed Death Ligand 1 Immunohistochemistry Comparability Project represents a systematic international effort to assess the concordance of different PD-L1 IHC assays. The second phase of this project (BP2) involved 24 pathologists examining 81 different lung cancer cases representing various cancer subtypes, all collected during routine clinical practice to enhance real-world applicability [82].
The experimental methodology included:
This rigorous experimental design provided comprehensive data on both inter-observer reproducibility and inter-assay variability, establishing a benchmark for PD-L1 assay comparison studies.
The meta-analysis of PD-L1 assay diagnostic accuracy followed a structured methodology to ensure comprehensive evidence synthesis [84]:
This methodology enabled quantitative comparison of assay performance across multiple studies while accounting for between-study variability and potential biases.
The following table details key research reagents and their applications in PD-L1 assay development and validation:
Table 3: Essential Research Reagents for PD-L1 Assay Development
| Reagent Category | Specific Examples | Research Application | Validation Considerations |
|---|---|---|---|
| Primary Antibodies | 22C3, 28-8, SP142, SP263, 73-10, E1L3N | PD-L1 detection in IHC | Epitope specificity, sensitivity, cross-reactivity |
| Detection Systems | Dako EnVision FLEX, Ventana OptiView | Signal amplification and detection | Amplification efficiency, background staining |
| Staining Platforms | Dako Autostainer, Ventana Benchmark | Automated IHC staining | Protocol standardization, reproducibility |
| Control Materials | Cell lines with known PD-L1 expression, tissue microarrays | Assay validation controls | Expression level verification, stability |
| Validation Reagents | CRISPR knockout cells, recombinant protein | Specificity confirmation | Target verification, off-target effects |
Antibody validation should follow comprehensive approaches such as the Hallmarks of Antibody Validation, which includes six complementary strategies: genetic validation (knockout/CRISPR), orthogonal comparison with non-antibody methods, independent antibody correlation, expression of tagged proteins, immunoprecipitation followed by mass spectrometry, and biological validation across diverse cell lines and tissues [85]. Critically, no single assay is sufficient to validate an antibody, including knockout validation alone, as antibody performance can vary significantly across different applications and experimental conditions [85].
The inter-assay variability in PD-L1 detection stems from multiple technical factors including antibody clones, detection platforms, and scoring methodologies. Evidence from systematic comparisons indicates that while some assays (22C3, 28-8, and SP263) show reasonable concordance, others (particularly SP142 and 73-10) demonstrate significant differences in analytical sensitivity [82]. This variability has direct implications for patient classification and therapeutic decision-making.
For clinical laboratories, the choice between FDA-approved companion diagnostics and laboratory-developed tests requires careful consideration of diagnostic accuracy for specific clinical purposes [84]. The meta-analysis evidence suggests that proper validation is essential, and that replacing an FDA-approved companion diagnostic with another assay developed for a different purpose may not maintain sufficient diagnostic accuracy. Rather, developing a properly validated laboratory-developed test for the same purpose as the original FDA-approved companion diagnostic represents a more reliable approach [84].
As the field of immunotherapy continues to evolve, emerging technologies such as multiplex IHC/IF and combined biomarker approaches offer promising avenues for improved predictive accuracy [15]. However, these advances must be balanced against the practical need for standardization and harmonization across laboratories to ensure consistent patient care and reliable research outcomes.
The analytical validation of PD-L1 assays is a critical component in the paradigm of precision immuno-oncology. For researchers and drug development professionals, ensuring that these assays yield reliable, reproducible, and clinically actionable data is paramount. Quality assurance (QA) encompasses a broad spectrum of activities, from the initial analytical validation of a test to its ongoing performance monitoring through external proficiency testing (EPT). The fundamental goal is to minimize pre-analytical, analytical, and post-analytical variables that can confound the accurate measurement of PD-L1 expression, a key predictive biomarker for response to immune checkpoint inhibitors [83] [32]. The challenges in this field are significant, driven by the diversity of available assays, including different antibody clones, scoring algorithms, and diagnostic platforms. This guide objectively compares the performance of various PD-L1 testing methodologies and QA approaches, providing a foundational resource for robust biomarker development.
A comprehensive understanding of the relative strengths and weaknesses of different biomarker testing modalities is essential for selecting the right analytical tool for clinical research and drug development.
A recent network meta-analysis (NMA) compared the diagnostic accuracy of seven major biomarker testing assays for predicting response to anti-PD-1/PD-L1 monotherapy. The analysis incorporated 144 diagnostic index tests from 49 studies, encompassing data from 5,322 patients [15].
Table 1: Diagnostic Performance of Biomarker Assays for Predicting Immunotherapy Response [15]
| Assay | Sensitivity (95% CI) | Specificity (95% CI) | Diagnostic Odds Ratio (95% CI) | Superiority Index |
|---|---|---|---|---|
| Multiplex IHC/IF (mIHC/IF) | 0.76 (0.57 - 0.89) | - | 5.09 (1.35 - 13.90) | 2.86 |
| Microsatellite Instability (MSI) | - | 0.90 (0.85 - 0.94) | 6.79 (3.48 - 11.91) | - |
| PD-L1 IHC + TMB (Combined) | 0.89 (0.82 - 0.94) | - | - | - |
The data reveal that mIHC/IF exhibited the highest sensitivity, making it a powerful tool for identifying potential responders, while MSI testing demonstrated the highest specificity, effectively ruling out non-responders. Notably, combining PD-L1 IHC with TMB significantly improved sensitivity, underscoring the value of multi-analyte approaches in overcoming the limitations of single-analyte tests [15]. The performance of these assays also varied by tumor type. For instance, mIHC/IF and other IHC & H&E-based methods showed high predictive efficacy in non-small cell lung cancer (NSCLC), whereas PD-L1 IHC and MSI were particularly effective in gastrointestinal tumors [15].
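As a reading aid for the diagnostic odds ratio column, the sketch below shows how a DOR follows from a single sensitivity/specificity pair. The function name and the input values are illustrative assumptions, not figures from [15]; the pooled DORs reported by a network meta-analysis come from the fitted model and need not equal this direct calculation.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """Return the diagnostic odds ratio (DOR) for one sensitivity/specificity pair.

    DOR = [sens / (1 - sens)] / [(1 - spec) / spec]
    """
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    # Odds of a positive test among true responders.
    odds_positive_in_responders = sensitivity / (1.0 - sensitivity)
    # Odds of a positive test among true non-responders.
    odds_positive_in_non_responders = (1.0 - specificity) / specificity
    return odds_positive_in_responders / odds_positive_in_non_responders


# Hypothetical values chosen only to illustrate the arithmetic.
example_dor = diagnostic_odds_ratio(sensitivity=0.80, specificity=0.90)
print(round(example_dor, 2))
```

A DOR of 1 means the test does not discriminate between responders and non-responders; larger values indicate better discrimination.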
The heart of PD-L1 QA lies in understanding the performance characteristics of various immunohistochemistry (IHC) assays. Multiple FDA-approved and laboratory-developed tests (LDTs) are in use, each with its own profile.
Table 2: Comparative Analysis of PD-L1 IHC Assays [15] [83] [32]
| Assay / Clone | Regulatory Status | Key Characteristics | Performance Notes |
|---|---|---|---|
| 22C3 (Dako/Agilent) | FDA-approved CDx | Companion diagnostic for pembrolizumab in NSCLC. | High similarity and potential interchangeability with 28-8 and SP263 assays demonstrated in multi-institutional studies [86] [32]. |
| SP263 (Ventana) | FDA-approved CDx | Complementary diagnostic. | Shows high concordance with 22C3 and 28-8; used for emerging scores like Tumor Area Positivity (TAP) [32] [62]. |
| 28-8 (Dako/Agilent) | FDA-approved CDx | Complementary diagnostic. | Highly similar to 22C3 and SP263 in quantitative comparisons [32]. |
| SP142 (Ventana) | FDA-approved CDx | Complementary diagnostic; lower sensitivity. | Consistently fails to detect low PD-L1 levels distinguished by other assays; shows lower sensitivity in multi-institutional settings [32]. |
| E1L3N (LDT) | Laboratory Developed Test | Used in various LDTs. | High consistency with 22C3, 28-8, and SP263 FDA assays when properly validated [32]. |
A critical finding from multi-institutional studies is that the assays for 22C3, 28-8, SP263, and the E1L3N LDT are highly similar, whereas the SP142 assay consistently demonstrates lower detection sensitivity for PD-L1 expression [32]. Furthermore, emerging scoring methods like the Tumor Area Positivity (TAP) score show significant agreement with established scores such as the Combined Positive Score (CPS), with Cohen's κ ranging from 0.64 to 0.85 across different cutoffs in clinical trials for gastric and esophageal cancers [62].
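For readers less familiar with the agreement statistic quoted above, the following is a minimal sketch of Cohen's κ for two scoring methods applied to the same cases. The function name and the ten example calls are hypothetical and are not data from [62].

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two sets of categorical calls on the same cases."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both inputs must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of cases where both methods give the same call.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over categories.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both methods used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical positive/negative calls by two scoring methods on ten cases.
tap_calls = ["pos", "pos", "pos", "neg", "neg", "neg", "pos", "neg", "pos", "neg"]
cps_calls = ["pos", "pos", "neg", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]
print(round(cohens_kappa(tap_calls, cps_calls), 3))
```

Unlike raw percent agreement, κ discounts the agreement expected by chance, which is why it is lower than the 80% raw agreement in this toy example.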
External Proficiency Testing (EPT) is an indispensable tool for objectively measuring a laboratory's testing accuracy, turnaround time, and reporting clarity against peer institutions.
A novel, comprehensive external quality assessment (EQA) program conducted by the Canadian Pathology Quality Assurance (CPQA) provided stark insights into real-world laboratory performance. In this exercise, 13 laboratories processed three challenging NSCLC cases with the goal of delivering a complete biomarker report, with performance measured on accuracy, report clarity, and turnaround time [87].
This study highlights that accuracy alone is insufficient; timely and clear reporting are equally critical for enabling precision oncology in clinical practice.
The use of standardized, quantitative control materials is an advanced strategy for normalizing PD-L1 measurement across platforms and sites. Research has demonstrated the utility of a standardized PD-L1 Index Tissue Microarray (TMA) constructed from a panel of 10 isogenic cell lines with varying levels of PD-L1 expression [32].
Experimental Protocol: Quantitative PD-L1 Assay Comparison Using Index TMA [32]
This methodology allowed for an objective, quantitative comparison that isolated the analytical performance of the assay from the subjective interpretation of the pathologist, confirming the lower sensitivity of the SP142 assay and the high concordance of the others in a multi-institutional setting [32].
The following table details essential materials and their functions as derived from the experimental protocols cited in this guide.
Table 3: Essential Research Reagents and Materials for PD-L1 Assay Validation [22] [32]
| Item | Function in QA/Validation | Specific Examples / Clones |
|---|---|---|
| Isogenic Cell Line FFPE Blocks | Serves as a reproducible, standardized control material with a defined dynamic range of PD-L1 expression for cross-assay and cross-laboratory comparison. | Horizon Discovery PD-L1 isogenic cell line panel [32]. |
| Index Tissue Microarray (TMA) | High-throughput platform for analyzing multiple standardized cell lines or tissues simultaneously on a single slide, reducing staining variability and resource consumption. | Custom TMA with 10 cell lines in triplicate [32]. |
| Anti-PD-L1 Antibody Clones | Key reagents for IHC detection; different clones have distinct binding affinities and epitopes, influencing staining performance. | 22C3, 28-8, SP263, SP142, E1L3N [83] [32]. |
| Quantitative Image Analysis Software | Enables objective, reproducible quantification of biomarker expression, moving beyond subjective pathologist scoring. | QuPath (for chromogenic IHC), AQUA/NavigateBP (for QIF) [32]. |
| Recombinant Protein-Coated Beads | Synthetic controls used to validate antibody specificity and for constructing standard curves for quantitative microscopy. | ELISA beads coated with recombinant PD-L1 or HLA I [22]. |
| Circulating Tumor Cell (CTC) Enrichment Kits | Facilitates liquid biopsy approaches for serial monitoring of PD-L1 and other biomarkers (e.g., HLA I) from patient blood. | Exclusion-based sample preparation (ESP) technology, e.g., ExtractMax system [22]. |
The advent of immune checkpoint inhibitors targeting the programmed cell death protein 1 (PD-1) and its ligand (PD-L1) has transformed cancer treatment paradigms for various solid tumors, including non-small cell lung cancer (NSCLC) and head and neck squamous cell carcinoma (HNSCC) [88] [89]. PD-L1 immunohistochemistry (IHC) has emerged as a critical predictive biomarker to identify patients most likely to benefit from these therapies [84]. Consequently, multiple commercially available PD-L1 IHC assays have been developed and approved as companion or complementary diagnostics for specific immune checkpoint inhibitors [88] [90].
The clinical utility of PD-L1 testing depends fundamentally on the analytical validation of these assays, with sensitivity, specificity, and reproducibility representing essential performance metrics [84] [91]. However, the landscape of PD-L1 testing is complicated by the existence of multiple standardized assays (22C3, 28-8, SP142, and SP263) developed on different staining platforms with distinct scoring algorithms [92]. This complexity is further compounded by the widespread use of laboratory-developed tests (LDTs) and the challenges inherent in pathologist interpretation [88] [93].
This review systematically compares the analytical performance of PD-L1 IHC assays, focusing on their sensitivity, specificity, and reproducibility profiles. By synthesizing evidence from method comparison studies, meta-analyses, and reproducibility assessments, we aim to provide researchers and drug development professionals with a comprehensive evaluation of PD-L1 assay performance characteristics essential for robust biomarker implementation in both clinical trials and practice.
Substantial evidence demonstrates that not all PD-L1 assays exhibit equivalent diagnostic performance. A comprehensive network meta-analysis comparing predictive biomarker testing assays for PD-1/PD-L1 inhibitors found that multiplex immunohistochemistry/immunofluorescence (mIHC/IF) displayed the highest sensitivity (0.76, 95% CI: 0.57-0.89) among various testing modalities, while microsatellite instability (MSI) showed the highest specificity (0.90, 95% CI: 0.85-0.94) and diagnostic odds ratio (6.79, 95% CI: 3.48-11.91) [15]. When focusing specifically on PD-L1 IHC assays, this analysis revealed that performance varied significantly by tumor type, with PD-L1 IHC demonstrating particularly high predictive efficacy in gastrointestinal tumors [15].
A critical meta-analysis addressing PD-L1 assay interchangeability based on diagnostic accuracy examined 376 assay comparisons from 22 studies [84]. This analysis established that for clinical application, PD-L1 IHC assays should demonstrate both sensitivity and specificity ≥90% relative to their reference standards. The findings indicated that properly validated LDTs could achieve this performance threshold, whereas attempts to use an FDA-approved companion diagnostic for a purpose other than its intended clinical application frequently resulted in suboptimal diagnostic accuracy [84].
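The acceptance rule described above can be sketched as a small check of a candidate assay against its reference standard. This is an illustration under stated assumptions: the function names and the simulated positive/negative calls are hypothetical, and only the 90% threshold on both metrics comes from the text.

```python
def diagnostic_accuracy(candidate_calls, reference_calls):
    """Return (sensitivity, specificity) of a candidate assay versus a reference.

    Both inputs are sequences of booleans (True = PD-L1 positive at the
    clinical cut-point), one entry per case, in the same case order.
    """
    if len(candidate_calls) != len(reference_calls):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(candidate_calls, reference_calls))
    tp = sum(c and r for c, r in pairs)
    fn = sum((not c) and r for c, r in pairs)
    tn = sum((not c) and (not r) for c, r in pairs)
    fp = sum(c and (not r) for c, r in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both positive and negative cases")
    return tp / (tp + fn), tn / (tn + fp)


def is_clinically_acceptable(sensitivity, specificity, threshold=0.90):
    """Apply the 'both metrics at or above the threshold' rule."""
    return sensitivity >= threshold and specificity >= threshold


# Hypothetical cohort: 20 reference-positive and 30 reference-negative cases.
reference = [True] * 20 + [False] * 30
assay_a = [True] * 19 + [False] * 1 + [False] * 28 + [True] * 2   # misses 1, overcalls 2
assay_b = [True] * 17 + [False] * 3 + [False] * 30                 # misses 3, overcalls 0

for name, calls in (("assay_a", assay_a), ("assay_b", assay_b)):
    sens, spec = diagnostic_accuracy(calls, reference)
    print(name, round(sens, 3), round(spec, 3), is_clinically_acceptable(sens, spec))
```

In this toy cohort the second assay has perfect specificity but still fails, because the rule requires both metrics to clear the threshold.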
Table 1: Diagnostic Accuracy of PD-L1 IHC Assays Across Tumor Types
| Assay | Sensitivity (Range) | Specificity (Range) | Optimal Tumor Types | Interchangeability Recommendations |
|---|---|---|---|---|
| 22C3 | High (≥90% when properly validated) | High (≥90% when properly validated) | NSCLC, HNSCC | Interchangeable with 28-8 and SP263 for NSCLC |
| 28-8 | High (≥90% when properly validated) | High (≥90% when properly validated) | NSCLC, Melanoma | Interchangeable with 22C3 and SP263 for NSCLC |
| SP263 | High (≥90% when properly validated) | High (≥90% when properly validated) | NSCLC, Urothelial Carcinoma | Interchangeable with 22C3 and 28-8 for NSCLC |
| SP142 | Lower than other assays | Variable | NSCLC (especially immune cell scoring) | Not interchangeable with other assays |
The concept of sensitivity in PD-L1 testing encompasses both clinical diagnostic sensitivity and analytical sensitivity. A groundbreaking survey of 41 laboratories across North America and Europe utilizing NIST-traceable PD-L1 calibrators revealed that the four FDA-cleared PD-L1 assays actually represent three distinct levels of analytical sensitivity [91]. These differences in lower limit of detection (LOD) explain why some patient tissue samples test positive by one assay but negative by another, highlighting a critical challenge in assay harmonization.
This calibrated approach demonstrated that previous attempts to harmonize certain PD-L1 assays were unsuccessful because their dynamic ranges were too disparate and non-overlapping [91]. Furthermore, the calibration clarified the exact performance characteristics of LDTs relative to FDA-cleared commercial assays, with some LDTs showing nearly indistinguishable analytic response curves from their predicate FDA-cleared assays when properly optimized and validated [91].
The reproducibility of PD-L1 assessment represents a significant challenge in clinical practice, with studies demonstrating variable concordance among pathologists. A comprehensive study evaluating ten surgical pathologists assessing 108 NSCLC samples reported overall percent agreement (OPA) for intra-observer reproducibility of 89.7% at the 1% cut-off and 91.3% at the 50% cut-off [90]. Averaged across the two cut-offs, approximately 9.5% of intra-observer assessments were therefore irreproducible, potentially leading to different treatment decisions for nearly 1 in 10 patients.
Inter-observer reproducibility presents even greater challenges, with OPA of 84.2% at the 1% cut-off and 81.9% at the 50% cut-off [90]. Notably, pathologist variability was highest for samples with PD-L1 tumor proportion scores (TPS) between 30% and 80%, particularly concerning given that the 50% cut-off determines first-line treatment eligibility for pembrolizumab in metastatic NSCLC [90]. Training interventions demonstrated limited impact, with only slight improvements in concordance after a 1-hour training session [90].
Table 2: Reproducibility Metrics for PD-L1 Assessment in NSCLC
| Reproducibility Metric | 1% Cut-off | 50% Cut-off | Key Findings |
|---|---|---|---|
| Intra-observer Agreement (OPA) | 89.7% | 91.3% | Mean of 9.5% irreproducible assessments |
| Inter-observer Agreement (OPA) | 84.2% | 81.9% | Mean of 17% irreproducible assessments between observers |
| Impact of Training | Minimal improvement | Slight improvement (78.3% to 81.7%) | Brief training sessions insufficient to substantially improve concordance |
| Most Problematic Range | - | 30-80% TPS | Highest variability around clinical decision point |
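The overall percent agreement (OPA) values in Table 2 can be understood through a short sketch that dichotomizes two sets of TPS reads at a cut-off and counts matching calls. The function name and the example TPS values are hypothetical; only the reported OPA figures used in the final lines come from [90].

```python
def overall_percent_agreement(tps_read_1, tps_read_2, cutoff):
    """OPA (%) between two sets of TPS reads after dichotomizing at a cut-off."""
    if len(tps_read_1) != len(tps_read_2) or len(tps_read_1) == 0:
        raise ValueError("both inputs must be non-empty and of equal length")
    # A case counts as concordant when both reads fall on the same side of the cut-off.
    concordant = sum(
        (a >= cutoff) == (b >= cutoff) for a, b in zip(tps_read_1, tps_read_2)
    )
    return 100.0 * concordant / len(tps_read_1)


# Hypothetical TPS values (%) for ten cases read twice.
first_read = [0, 2, 10, 45, 55, 60, 80, 0, 1, 49]
second_read = [0, 0, 15, 52, 50, 70, 75, 0, 3, 50]

print(overall_percent_agreement(first_read, second_read, cutoff=1))
print(overall_percent_agreement(first_read, second_read, cutoff=50))

# Mean irreproducibility implied by the reported intra-observer OPA values [90].
reported_opa = {"1% cut-off": 89.7, "50% cut-off": 91.3}
mean_irreproducible = sum(100.0 - v for v in reported_opa.values()) / len(reported_opa)
print(round(mean_irreproducible, 1))
```

Note that in the toy data the discordant cases cluster near the cut-offs (for example 45 versus 52, and 49 versus 50), which mirrors the finding that variability concentrates around clinical decision points.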
Recent studies have investigated technological solutions to enhance PD-L1 scoring reproducibility. A sophisticated approach comparing single PD-L1 IHC (S-IHC) with double IHC (D-IHC) combining PD-L1 staining with tumor nuclear markers demonstrated excellent to good inter- and intra-pathologist agreements for both TPS and combined positive score (CPS) [93]. The D-IHC method, which facilitates distinction between tumor cells and immune cells, yielded slightly higher intraclass correlation coefficients (ICC > 0.9 for TPS and > 0.75 for CPS) than conventional S-IHC [93].
Automated image analysis represents another promising approach to reduce variability. A study developing a computer-aided automated image analysis with customized PD-L1 scoring algorithm demonstrated high concordance with pathologist scores (F1 scores ranging from 0.8 to 0.9 across varying PD-L1 cut-offs) [92]. This quantitative comparison confirmed previous findings indicating high concordance between the Ventana SP263 and Dako 22C3 and 28-8 PD-L1 IHC assays across a broad range of cut-offs, while the Ventana SP142 assay showed distinct characteristics [92].
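To make the F1 comparison concrete, the sketch below scores hypothetical algorithm calls against hypothetical pathologist calls at a chosen TPS cut-off. The function name and all values are illustrative assumptions and do not reproduce the analysis in [92].

```python
def f1_at_cutoff(algorithm_tps, pathologist_tps, cutoff):
    """F1 score of algorithm calls, treating pathologist calls as ground truth."""
    if len(algorithm_tps) != len(pathologist_tps):
        raise ValueError("inputs must have the same length")
    tp = fp = fn = 0
    for alg, path in zip(algorithm_tps, pathologist_tps):
        alg_pos, path_pos = alg >= cutoff, path >= cutoff
        if alg_pos and path_pos:
            tp += 1
        elif alg_pos and not path_pos:
            fp += 1
        elif path_pos and not alg_pos:
            fn += 1
    if 2 * tp + fp + fn == 0:
        raise ValueError("no positive calls by either method at this cut-off")
    # Equivalent to the harmonic mean of precision and recall.
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical TPS values (%) for eight cases.
pathologist = [0, 5, 20, 55, 60, 90, 1, 0]
algorithm = [0, 3, 30, 48, 65, 85, 0, 2]

for cutoff in (1, 50):
    print(cutoff, round(f1_at_cutoff(algorithm, pathologist, cutoff), 3))
```

Because F1 ignores true negatives, it is best read alongside a metric such as overall percent agreement when most cases fall below the cut-off.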
The analytical validation of PD-L1 assays requires strict adherence to standardized staining and scoring protocols. In comparative studies, consecutive sections from tumor samples are typically stained with multiple PD-L1 assays using their respective automated platforms [94]. For example, the PD-L1 IHC 22C3 pharmDx assay is performed on the Dako platform, while Ventana SP142 and SP263 assays run on the Ventana Benchmark series [94]. This platform-specific requirement necessitates careful protocol design in comparability studies.
Scoring methodologies must align with the specific requirements of each assay. The tumor proportion score (TPS) quantifies the percentage of viable tumor cells showing partial or complete membrane staining, while the combined positive score (CPS) includes both tumor cells and immune cells in its calculation [89]. Studies consistently show that pathologists demonstrate higher reliability in scoring TPS compared to CPS, particularly when using the SP142 assay [94]. Up to 18% of samples may be misclassified by individual pathologists compared to consensus scores at the CPS ≥1 cut-off [94].
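The two scores can be written out as simple ratios of cell counts. The sketch below uses the commonly cited definitions (TPS as a percentage of viable tumor cells; CPS as PD-L1-positive tumor and immune cells relative to viable tumor cells, capped at 100). The function names and cell counts are hypothetical, and assay-specific interpretation manuals remain the authority on which cells and staining patterns qualify.

```python
def tumor_proportion_score(positive_tumor_cells, viable_tumor_cells):
    """TPS (%): PD-L1-positive viable tumor cells over all viable tumor cells."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    if not 0 <= positive_tumor_cells <= viable_tumor_cells:
        raise ValueError("positive_tumor_cells must be between 0 and viable_tumor_cells")
    return 100.0 * positive_tumor_cells / viable_tumor_cells


def combined_positive_score(positive_tumor_cells, positive_immune_cells, viable_tumor_cells):
    """CPS: PD-L1-positive tumor plus immune cells per 100 viable tumor cells.

    The denominator counts tumor cells only, so the raw ratio can exceed 100;
    by convention the reported score is capped at 100.
    """
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    if positive_tumor_cells < 0 or positive_immune_cells < 0:
        raise ValueError("cell counts must be non-negative")
    raw = 100.0 * (positive_tumor_cells + positive_immune_cells) / viable_tumor_cells
    return min(raw, 100.0)


# Hypothetical counts from one scored region.
print(tumor_proportion_score(positive_tumor_cells=30, viable_tumor_cells=200))
print(combined_positive_score(positive_tumor_cells=30, positive_immune_cells=20,
                              viable_tumor_cells=200))
```

The same case therefore scores higher on CPS than on TPS whenever PD-L1-positive immune cells are present, which is one reason cut-offs are not transferable between the two scores.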
The integration of digital pathology and automated image analysis represents a significant advancement in PD-L1 assay validation methodologies. The typical workflow involves:
Whole Slide Digitization: PD-L1-stained slides are scanned using high-resolution slide scanners (e.g., Aperio Scanscope at 20x magnification) to create whole slide images [92].
Image Co-registration: Consecutive sections stained with different assays are digitally aligned to ensure analysis is restricted to comparable tissue areas [92].
Automated Image Analysis: Customized algorithms segment and classify tumor cells, immune cells, and PD-L1-positive cells [92] [89].
Quantitative Scoring: The algorithm calculates TPS and CPS based on predefined thresholds [92].
This automated approach facilitates more quantitative comparisons between assays and reduces inter-observer variability, providing a more objective assessment of PD-L1 expression [92]. Studies utilizing open-source bioimage analysis platforms like QuPath have demonstrated the ability to systematically evaluate PD-L1 expression across different specimen types, including preoperative biopsies, surgical resections, and metastatic lymph nodes [89].
Diagram 1: PD-L1 Assay Validation Workflow. This diagram illustrates the integrated experimental, digital, and analytical phases of comprehensive PD-L1 assay validation, highlighting critical steps from tissue processing to quantitative performance metrics.
Table 3: Essential Research Reagents and Platforms for PD-L1 Assay Validation
| Category | Specific Products/Platforms | Research Application |
|---|---|---|
| PD-L1 IHC Assays | 22C3 pharmDx (Agilent), 28-8 (Agilent), SP142 (Ventana), SP263 (Ventana) | Companion diagnostics for specific immune checkpoint inhibitors; comparison studies for assay harmonization |
| Automated Staining Platforms | Dako Autostainer Link 48, Ventana Benchmark Series | Platform-specific assay performance; essential for standardized staining conditions |
| Digital Pathology Systems | Aperio Scanscope (Leica), Philips Intellisite, 3DHistech Pannoramic | Whole slide imaging for quantitative analysis; enables pathologist consensus review and automated image analysis |
| Image Analysis Software | QuPath, HALO, Aperio Image Analysis Toolbox | Automated cell segmentation and classification; quantitative assessment of TPS and CPS with reduced inter-observer variability |
| Reference Materials | NIST-traceable PD-L1 calibrators, cell line controls, tissue microarrays | Assay standardization and harmonization; enables comparison of analytical sensitivity across platforms |
| Tumor Tissue Models | Commercial NSCLC tissue sections, cell line xenografts, tissue microarrays | Analytical validation studies; assessment of inter-assay concordance and reproducibility |
The analytical validation of PD-L1 IHC assays remains challenging due to the complex interplay of multiple factors including assay sensitivity, scoring methodologies, pathologist expertise, and tissue heterogeneity. Evidence from multiple studies indicates that while the 22C3, 28-8, and SP263 assays demonstrate high concordance and may be interchangeable for NSCLC testing, the SP142 assay shows distinct characteristics with generally lower sensitivity for tumor cell staining [88] [92] [94]. This supports the approach of using properly validated LDTs that demonstrate comparable performance to FDA-approved companion diagnostics for their intended purposes [84].
Reproducibility challenges, particularly around critical clinical cut-offs (1% and 50% for TPS), highlight the need for improved training methodologies and decision support tools [93] [90]. The development of automated image analysis systems and dual IHC approaches showing enhanced reproducibility offers promising avenues for more consistent PD-L1 scoring [93] [92]. Furthermore, the introduction of NIST-traceable calibrators represents a significant advancement in standardizing PD-L1 measurement across platforms, potentially transforming the landscape of companion diagnostic testing [91].
Future efforts should focus on standardizing pre-analytical factors, validating novel technological approaches across diverse tumor types, and establishing more robust reference standards for PD-L1 quantification. As PD-L1 testing expands to additional cancer types and combination immunotherapy approaches, the principles of rigorous analytical validation, encompassing sensitivity, specificity, and reproducibility, will remain fundamental to ensuring accurate patient selection for these transformative therapies.
Diagram 2: Core Analytical Validation Metrics for PD-L1 Assays. This diagram illustrates the relationship between fundamental validation metrics (sensitivity, specificity, reproducibility) and their critical impacts on clinical research and patient care, highlighting their interconnected nature in comprehensive assay evaluation.
The advent of immune checkpoint inhibitors has established PD-L1 immunohistochemistry (IHC) as a critical predictive biomarker for immunotherapy response in multiple cancer types, including non-small cell lung cancer (NSCLC) [95]. However, the development of distinct PD-L1 IHC assays by different pharmaceutical companies, each with unique antibodies, platforms, and scoring criteria, has created significant challenges for diagnostic standardization [84]. This landscape has prompted extensive research into the interchangeability of these assays, that is, whether one FDA-approved companion diagnostic can be reliably substituted for another when the intended clinical purpose remains the same [96].
The clinical necessity for interchangeability stems from practical healthcare constraints. In publicly funded health systems, it is often challenging to maintain multiple dedicated testing platforms for a single biomarker [84]. Laboratories seeking to implement PD-L1 testing thus face a critical decision: whether to adopt the specific FDA-approved companion diagnostic for each drug, develop a properly validated laboratory-developed test (LDT), or use an alternative FDA-approved assay that was validated for a different clinical context [84]. This review synthesizes evidence from meta-analyses and clinical studies to evaluate the diagnostic accuracy and interchangeability of various PD-L1 assays, providing evidence-based guidance for clinical laboratories and researchers.
A comprehensive meta-analysis published in Modern Pathology established a rigorous purpose-based framework for evaluating PD-L1 assay interchangeability [84] [96]. This approach contends that interchangeability should be assessed not merely through analytical comparison, but through diagnostic accuracy for specific clinical purposes defined by drug-indication pairs [84]. The analysis employed modified GRADE and QUADAS-2 criteria for evaluating published evidence and designed data abstraction templates for independent extraction by multiple reviewers [96]. PRISMA guidelines directed the systematic review reporting, while STARD 2015 standards guided the diagnostic accuracy assessment [96].
The meta-analysis accumulated data from 22 studies, providing 376 assay comparisons for analysis [84] [96]. Most evaluations focused on NSCLC, resulting in 337 test comparisons, with smaller numbers in urothelial carcinoma (20 comparisons), mesothelioma (9 comparisons), and thymic carcinoma (9 comparisons) [84]. The primary outcome measure was diagnostic accuracy (sensitivity and specificity) of various PD-L1 assays at specific clinical cut-points, with assays considered clinically acceptable only if both sensitivity and specificity reached ≥90% for the stated clinical purpose [84].
Table 1: Diagnostic Accuracy of PD-L1 Assays for Pembrolizumab Selection in NSCLC
| Assay Type | Clinical Purpose | TPS Cut-point | Sensitivity | Specificity | Interchangeability |
|---|---|---|---|---|---|
| FDA-approved CDx (22C3) | Pembrolizumab selection | 1% | Reference | Reference | Reference standard |
| FDA-approved CDx (28-8) | Nivolumab (complementary) | 1% | 93% | 94% | Acceptable |
| FDA-approved CDx (SP263) | Durvalumab (bladder cancer) | 1% | 91% | 95% | Acceptable |
| Laboratory Developed Tests | Various | 1% | Variable | Variable | Highly variable |
| FDA-approved CDx (22C3) | Pembrolizumab selection | 50% | Reference | Reference | Reference standard |
| FDA-approved CDx (28-8) | Nivolumab (complementary) | 50% | 94% | 96% | Acceptable |
| FDA-approved CDx (SP263) | Durvalumab (bladder cancer) | 50% | 92% | 97% | Acceptable |
| Laboratory Developed Tests | Various | 50% | Variable | Variable | Highly variable |
The meta-analysis revealed that for NSCLC, the 22C3, 28-8, and SP263 assays demonstrated sufficient diagnostic accuracy to be considered interchangeable at both the 1% and 50% tumor proportion score (TPS) cut-points [84]. In contrast, the SP142 assay consistently showed lower sensitivity, identifying fewer PD-L1 positive cases compared to other assays, thus limiting its interchangeability for pembrolizumab selection [32] [84]. This finding aligns with earlier analytical studies, including the Blueprint Project, which also noted the lower sensitivity of the SP142 assay [32].
A critical conclusion from the meta-analysis was that when a laboratory cannot implement the specific FDA-approved companion diagnostic for a clinical purpose, developing a properly validated laboratory-developed test (LDT) for that specific purpose represents a better alternative than substituting an FDA-approved companion diagnostic validated for a different purpose [84] [96]. However, the performance of LDTs was highly variable between laboratories, even when using the same primary antibody, underscoring the importance of rigorous validation [84].
Recent clinical evidence has further strengthened the case for interchangeability between specific PD-L1 assays. A 2025 bridging study from the EMPOWER-Lung 1 trial provided compelling data on the interchangeability of the 22C3 and SP263 assays for selecting NSCLC patients with PD-L1 TPS ≥50% for first-line cemiplimab therapy [97]. In this novel analysis, 871 patient samples were retrospectively tested using both the 22C3 and SP263 assays, including 481 enrolled patients and 390 screening failures [97].
The study demonstrated an overall percent agreement of 88% between the two assays in classifying patients as above or below the 50% TPS threshold [97]. More importantly, clinical efficacy outcomes were nearly identical between the populations defined by each assay. In the 22C3+/SP263+ population (n=324), the hazard ratio for overall survival was 0.52 (95% CI: 0.34-0.80) for cemiplimab versus chemotherapy, closely mirroring the efficacy in the original 22C3+ population (n=563) [97]. Sensitivity analyses of the overall SP263+ population showed results consistent with the primary analysis, leading the authors to conclude that efficacy was similar and that the two assays are interchangeable for selecting patients with PD-L1 ≥50% for first-line cemiplimab monotherapy [97].
Table 2: Performance Comparison of Pathologists vs. AI Algorithms in PD-L1 Scoring
| Assessment Method | TPS <1% Agreement (Fleiss' Kappa) | TPS ≥50% Agreement (Fleiss' Kappa) | Intraobserver Consistency (Cohen's Kappa) | Key Limitations |
|---|---|---|---|---|
| Pathologists (Light Microscopy) | 0.558 (Moderate) | 0.873 (Almost Perfect) | 0.726-1.0 (High) | Reference standard |
| Pathologists (Whole Slide Images) | Similar to light microscopy | Similar to light microscopy | Similar to light microscopy | Comparable to conventional methods |
| uPath Software (Roche) | Not reported | 0.354 (Fair) | Not reported | Requires manual tumor area selection |
| Visiopharm Application | Not reported | 0.672 (Substantial) | Not reported | Less consistent than pathologists |
A 2025 study evaluating the comparative effectiveness of pathologists versus artificial intelligence algorithms in scoring PD-L1 expression in NSCLC provides additional context for interchangeability considerations [95]. The study revealed that pathologists demonstrated moderate interobserver agreement (Fleiss' kappa 0.558) for TPS <1% and almost perfect agreement (Fleiss' kappa 0.873) for TPS ≥50% [95]. Intraobserver consistency was high, with Cohen's kappa ranging from 0.726 to 1.0 [95].
When compared to the median pathologist scores, AI algorithms showed less consistent performance, with fair agreement for uPath (Fleiss' kappa 0.354) and substantial agreement for the Visiopharm application (Fleiss' kappa 0.672) at the 50% TPS cutoff [95]. This highlights that while AI tools show promise for augmenting pathologist workflow, they require further refinement to match the reliability of expert human evaluation, particularly in critical clinical decision-making contexts [95].
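Fleiss' kappa extends chance-corrected agreement to more than two raters. The following minimal sketch assumes every case is scored by the same number of raters; the function name and the four example cases are hypothetical and are not data from [95].

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of categorical calls.

    Every case must be rated by the same number of raters (at least two).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")
    categories = sorted({label for row in ratings for label in row})
    counts = [Counter(row) for row in ratings]
    # Per-case agreement: proportion of rater pairs that agree on that case.
    per_case = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    mean_agreement = sum(per_case) / n_cases
    # Chance agreement from the overall share of calls in each category.
    shares = [sum(c[cat] for c in counts) / (n_cases * n_raters) for cat in categories]
    chance_agreement = sum(p ** 2 for p in shares)
    if chance_agreement == 1.0:
        return 1.0  # all raters used one category for every case
    return (mean_agreement - chance_agreement) / (1.0 - chance_agreement)


# Hypothetical calls at the 50% TPS cut-off: four cases, three raters each.
calls = [
    ["high", "high", "high"],
    ["low", "low", "low"],
    ["high", "high", "low"],
    ["high", "high", "high"],
]
print(round(fleiss_kappa(calls), 3))
```

A single split case among four lowers kappa well below the raw pairwise agreement, which illustrates why kappa values can look modest even when raters mostly agree.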
Significant efforts have been made to develop standardization tools that facilitate objective comparison between PD-L1 assays. One innovative approach involved creating a standardized PD-L1 Index Tissue Microarray (TMA) containing a panel of 10 isogenic cell lines with predetermined PD-L1 expression levels [32]. This TMA was used to objectively compare five PD-L1 chromogenic IHC assays (both FDA-approved and LDTs) across 12 sites in the United States [32].
The study confirmed previous subjective assessments quantitatively, demonstrating that the SP142 assay failed to detect low PD-L1 levels in cell lines distinguished by the other four assays [32]. Conversely, the 22C3, 28-8, SP263, and E1L3N assays showed high similarity across sites, with all laboratories demonstrating consistent performance over time when using the Index TMA [32]. This approach enables commercial use of standardized materials as a mechanism to compare results between institutions and identify abnormalities during routine clinical testing.
Beyond traditional tissue-based IHC, emerging technologies offer alternative approaches to PD-L1 assessment. Circulating tumor cell (CTC) analysis represents a promising liquid biopsy method that captures heterogeneity across multiple metastatic sites and enables serial monitoring [22]. Recent studies have developed exclusion-based sample preparation technology combined with quantitative microscopy to quantify PD-L1 and HLA I expression on CTCs from NSCLC patients [22].
Analytical validation of these methodologies demonstrated high precision and accuracy using diverse control materials, with preliminary clinical testing showing heterogeneity in PD-L1 and HLA I expression and potential value in predicting progression-free survival in response to PD-L1 targeted therapies [22]. Similarly, commercial CTC platforms like the RarePlex system have demonstrated high recovery rates (96%) and accuracy in PD-L1 detection on CTCs, providing robust research tools for biomarker expression analysis [98].
The evidence supporting PD-L1 assay interchangeability derives from several sophisticated experimental approaches:
Meta-Analysis Protocol: The comprehensive meta-analysis followed a rigorous systematic review process, searching MEDLINE via PubMed from January 2015 to August 2018 using "PD-L1" as the primary search term [84] [96]. From 2,515 initially identified abstracts, 57 studies comparing two or more PD-L1 assays were fully reviewed, with 22 publications ultimately selected for meta-analysis [84]. Additional data were requested from authors of 20 studies to enable construction of 2×2 contingency tables for diagnostic accuracy calculations [96]. Data were pooled using random-effects models, with Cochran's heterogeneity statistics (Q and I²) used to examine heterogeneity among studies [84].
Clinical Bridging Study Design: The EMPOWER-Lung 1 bridging study retrospectively tested 871 patient samples using the SP263 assay, including both enrolled patients and screening failures [97]. This design enabled calculation of overall percent agreement between assays and, crucially, comparison of clinical efficacy outcomes (overall survival and progression-free survival) between populations defined by each assay, providing direct evidence of therapeutic interchangeability [97].
Multi-Institutional Standardization Testing: The PD-L1 Index TMA study involved twelve 5-µm sections cut from a single TMA block and distributed to 12 institutions for staining weekly during six consecutive weeks [32]. Each site used their clinical PD-L1 assays according to standard protocols, with subsequent digital image analysis performed using QuPath software to objectively quantify PD-L1 expression through cell segmentation and DAB intensity quantification [32].
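The pooling step in the meta-analysis protocol above combines per-study estimates under a random-effects model and reports Cochran's Q and I². The sketch below shows one common way to do this, assuming the DerSimonian-Laird estimator for the between-study variance; the source does not name the estimator used, and the function name and inputs are illustrative. Effects would typically be entered on a transformed scale (for example logit sensitivity) with their within-study variances.

```python
import math


def random_effects_pool(effects, variances):
    """Pool study effects; return Q, I^2 (%), tau^2, pooled effect and its SE.

    Uses the DerSimonian-Laird moment estimator (an assumption, see lead-in).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effect (inverse-variance) mean, needed to compute Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance tau^2, floored at 0.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau_squared = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau_squared) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {"Q": q, "I2": i_squared, "tau2": tau_squared,
            "pooled": pooled, "se": se}


# Hypothetical effects from two studies with equal within-study variance.
print(random_effects_pool([0.0, 2.0], [0.5, 0.5]))
```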
Diagram 1: Methodological Workflow for PD-L1 Assay Interchangeability Assessment
Table 3: Key Research Reagent Solutions for PD-L1 Assay Development
| Reagent/Platform | Manufacturer | Primary Function | Application in PD-L1 Testing |
|---|---|---|---|
| PD-L1 IHC 22C3 pharmDx | Agilent Technologies | Companion diagnostic | FDA-approved for pembrolizumab in NSCLC |
| VENTANA PD-L1 (SP263) | Roche Diagnostics | Companion diagnostic | FDA-approved for durvalumab in urothelial cancer |
| PD-L1 (SP142) Assay | Ventana Medical Systems | Complementary diagnostic | FDA-approved for atezolizumab in NSCLC/urothelial |
| PD-L1 (28-8) Assay | Dako | Complementary diagnostic | FDA-approved for nivolumab in multiple cancers |
| uPath PD-L1 Software | Roche | Digital image analysis | AI-based TPS scoring (IVDD certified) |
| Visiopharm PD-L1 TME | Visiopharm | Digital image analysis | AI-based tumor microenvironment analysis |
| RarePlex CTC Assays | RareCyte | Circulating tumor cell analysis | PD-L1 expression on CTCs from blood samples |
| PD-L1 Index TMA | Yale University | Assay standardization | Multi-institutional performance monitoring |
| Quantitative Microscopy | Various | Protein quantification | Objective PD-L1 expression measurement |
The accumulated evidence from meta-analyses and clinical studies indicates that specific PD-L1 assays demonstrate sufficient diagnostic accuracy to be considered interchangeable for defined clinical purposes in NSCLC, particularly at the critical 1% and 50% TPS thresholds [84] [97]. The 22C3, 28-8, and SP263 assays show strong concordance, while the SP142 assay demonstrates consistently lower sensitivity, limiting its interchangeability [32] [84].
For clinical laboratories, this evidence supports two validated approaches when the specific FDA-approved companion diagnostic is unavailable: implementing a properly validated laboratory-developed test designed for the same clinical purpose, or substituting with an alternative FDA-approved assay that has demonstrated sufficient diagnostic accuracy for that purpose [84] [96]. The latter approach received strong recent support from the EMPOWER-Lung 1 bridging study, which showed nearly identical clinical efficacy when using either 22C3 or SP263 assays for patient selection [97].
Standardization tools such as the PD-L1 Index TMA and quantitative digital image analysis provide objective methods for comparing assay performance across institutions [32]. Meanwhile, emerging technologies including AI scoring algorithms and CTC-based PD-L1 assessment offer promising avenues for further refinement of PD-L1 as a predictive biomarker, though they require additional validation before routine clinical implementation [95] [22] [98]. As the field evolves, continued emphasis on evidence-based interchangeability assessments will be crucial for ensuring equitable patient access to predictive biomarker testing without compromising therapeutic efficacy.
The analytical validation of PD-L1 assays is a critical prerequisite for their successful application as companion diagnostics in immuno-oncology. A fundamental aspect of this validation is assessing the concordance of staining patterns between tumor cells (TCs) and immune cells (ICs). The spatial distribution and relative abundance of PD-L1 expression across these cellular compartments exhibit significant heterogeneity across cancer types, which directly impacts the selection of appropriate scoring algorithms and the predictability of response to immune checkpoint inhibitors (ICIs) [1] [89]. This guide systematically compares the performance of various PD-L1 immunohistochemistry (IHC) assays and scoring methods, providing researchers and drug development professionals with consolidated experimental data and methodological insights to inform assay selection and validation in clinical research settings.
The evaluation of PD-L1 assay concordance requires rigorously controlled experimental conditions to ensure meaningful comparisons. Representative studies employ formalin-fixed, paraffin-embedded (FFPE) tissue samples sectioned at standardized thicknesses (typically 4 μm) and stained using automated platforms with manufacturer-specified reagents and protocols [89] [94]. For instance, the PD-L1 IHC 22C3 pharmDx assay is typically run on the Dako platform, while VENTANA assays (SP263, SP142) utilize the BenchMark ULTRA system [89] [94]. To control for pre-analytical variables, tissue microarrays (TMAs) constructed from well-characterized patient samples enable parallel evaluation of multiple assays under identical conditions [13].
Two principal scoring systems are employed for PD-L1 assessment: the Tumor Proportion Score (TPS), which calculates the percentage of viable tumor cells displaying partial or complete membrane staining, and the Combined Positive Score (CPS), which incorporates both tumor and immune cell staining by dividing the total number of PD-L1-positive cells (tumor cells, lymphocytes, macrophages) by the total number of viable tumor cells, multiplied by 100 [9] [89]. An emerging metric, the Tumor Area Positivity (TAP) score, provides an alternative measurement approach [62]. To minimize inter-observer variability, studies increasingly utilize digital pathology platforms and bioimage analysis software such as QuPath for objective cell classification and enumeration [89]. These tools enable manual annotation of distinct cell populations (tumor cells, immune cells, PD-L1-expressing tumor cells, and PD-L1-expressing immune cells), followed by automated scoring across entire tissue sections [89].
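The two scoring definitions above reduce to simple ratios once cells have been counted. The sketch below is a minimal illustration with hypothetical function and parameter names. It assumes cell counts are already available (for example from manual annotation or image analysis), and it caps CPS at 100, a reporting convention that is not stated in the text above.

```python
def tumor_proportion_score(pdl1_positive_tumor_cells, viable_tumor_cells):
    """TPS: percentage of viable tumor cells with membrane PD-L1 staining."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    return 100.0 * pdl1_positive_tumor_cells / viable_tumor_cells


def combined_positive_score(pdl1_positive_tumor_cells,
                            pdl1_positive_lymphocytes,
                            pdl1_positive_macrophages,
                            viable_tumor_cells):
    """CPS: all PD-L1-positive cells over viable tumor cells, times 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    positive = (pdl1_positive_tumor_cells
                + pdl1_positive_lymphocytes
                + pdl1_positive_macrophages)
    raw = 100.0 * positive / viable_tumor_cells
    # Assumption: the reported score is capped at 100, because the numerator
    # includes immune cells and can exceed the tumor-cell denominator.
    return min(raw, 100.0)


# Hypothetical counts from one annotated region.
print(tumor_proportion_score(30, 200))           # 15.0
print(combined_positive_score(30, 10, 10, 200))  # 25.0
```

The same specimen can therefore fall below a TPS cutoff yet exceed a CPS cutoff when most PD-L1 staining sits on immune cells, which is why the choice of scoring algorithm matters for tumor types with immune-dominant expression.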
A feasibility study evaluating the novel PD-L1 CAL10 assay (Leica Biosystems) demonstrated strong concordance with the established VENTANA PD-L1 (SP263) assay in NSCLC samples. The overall percent agreement (OPA) between the assays reached 86.2% at the clinically relevant TPS cutoff of ≥50%, and 94.0% at the TPS cutoff of ≥1% [9]. This high concordance was maintained between manual glass slide reads and whole slide images scanned with the Aperio GT 450 platform, supporting the utility of digital pathology in PD-L1 assessment [9].
Table 1: PD-L1 Assay Concordance in NSCLC (CAL10 vs. SP263)
| TPS Cutoff | Overall Percent Agreement (OPA) | Predefined Acceptance Target | Sample Size (N) |
|---|---|---|---|
| ≥50% | 86.2% | ≥85% | 136 cases |
| ≥1% | 94.0% | ≥85% | 136 cases |
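Overall percent agreement and its confidence bound, as used in the comparison above, can be computed directly from paired calls at a given cutoff. The following sketch assumes a Wilson score interval for the 95% CI; the source does not state which interval method was used. The counts are hypothetical and are not the study data.

```python
import math


def overall_percent_agreement(concordant_positive, concordant_negative, total):
    """OPA: share of cases where both assays give the same call, in percent."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (concordant_positive + concordant_negative) / total


def wilson_interval(successes, n, z=1.959964):
    """Two-sided Wilson score interval for a proportion (default 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


def meets_target(successes, n, target=0.85):
    """True when the lower 95% bound of agreement is at or above the target."""
    lower, _ = wilson_interval(successes, n)
    return lower >= target


# Hypothetical example: 90 concordant calls out of 100 paired cases.
print(overall_percent_agreement(40, 50, 100))
print(wilson_interval(90, 100))
print(meets_target(90, 100))
```

A point estimate above the target does not by itself guarantee that the lower confidence bound clears it, so the acceptance criterion should state explicitly which quantity is compared with the target.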
Significant discrepancies in PD-L1 expression patterns occur across different specimen types within HNSCC. A comprehensive study of 68 cases analyzing matched preoperative biopsy, surgical resection, and metastatic lymph node samples revealed substantial heterogeneity in both CPS and TPS [89]. Statistical comparisons using the Kruskal-Wallis test showed significant differences between biopsy and resection specimens (p<0.01), and between resection and metastatic lymph node samples (p<0.01) [89]. This heterogeneity underscores the context-dependent nature of PD-L1 expression and highlights the importance of standardized specimen selection for reliable companion diagnostic results.
Table 2: PD-L1 Expression Heterogeneity Across HNSCC Specimen Types
| Specimen Comparison | CPS Discrepancy | TPS Discrepancy | Statistical Significance |
|---|---|---|---|
| Biopsy vs. Surgical Resection | Significant | Significant | p < 0.01 |
| Resection vs. Metastatic Lymph Node | Significant | Significant | p < 0.01 |
| Biopsy vs. Metastatic Lymph Node | Not Significant | Not Significant | Not Provided |
A detailed evaluation of four FDA-approved PD-L1 assays in ccRCC revealed notably different expression patterns compared to NSCLC and HNSCC. While PD-L1 expression in tumor cells was consistently low across all assays, expression in immune cells varied significantly by assay type [13]. The SP142 assay demonstrated markedly lower sensitivity, detecting PD-L1 expression in only 2.1% of immune cells, compared to approximately 15% for the 22C3, 28-8, and SP263 assays [13]. Pairwise concordance assessed using kappa statistics showed moderate agreement between the 28-8 assay and others (κ=0.52 with 22C3, κ=0.46 with SP263), but poor agreement with SP142 (κ=0.16) [13].
Table 3: PD-L1 Expression Patterns in Clear Cell Renal Cell Carcinoma
| PD-L1 Assay | Positivity in Tumor Cells | Positivity in Immune Cells | Prognostic Impact on CSS |
|---|---|---|---|
| 22C3 | Low | 14.7% | Significantly worse |
| 28-8 | Low | 16.1% | Significantly worse |
| SP142 | Low | 2.1% | Not significant |
| SP263 | Low | 15.0% | Significantly worse |
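Pairwise kappa values such as those reported for the ccRCC comparison summarize agreement between two assays beyond what chance alone would produce. A minimal sketch of Cohen's kappa for paired categorical calls follows; the function name and the example calls are hypothetical and do not reproduce the study data.

```python
def cohens_kappa(calls_a, calls_b):
    """Cohen's kappa for two equally long sequences of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(calls_a)
    categories = set(calls_a) | set(calls_b)

    # Observed agreement: fraction of cases with identical calls.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n

    # Chance agreement from each assay's marginal category frequencies.
    expected = sum(
        (list(calls_a).count(c) / n) * (list(calls_b).count(c) / n)
        for c in categories
    )
    if expected == 1.0:
        return float("nan")  # undefined when both assays use one category only
    return (observed - expected) / (1.0 - expected)


# Hypothetical positive (1) / negative (0) calls for ten cases on two assays.
assay_x = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
assay_y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]
print(round(cohens_kappa(assay_x, assay_y), 3))  # 0.6
```

Because kappa depends on prevalence, an assay that calls very few cases positive (as SP142 does for immune cells here) can show low kappa against other assays even when raw percent agreement looks high.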
A comparability study in hepatocellular carcinoma evaluated four standardized PD-L1 assays (22C3, 28-8, SP142, and SP263) with assessment by five pathologists. The 22C3, 28-8, and SP263 assays demonstrated comparable sensitivity in detecting PD-L1 expression, while the SP142 assay was consistently the least sensitive [94]. Inter-assay agreement, measured by intraclass correlation coefficients (ICC), was 0.646 for TPS and 0.780 for CPS [94]. Pathologists showed good to excellent inter-rater agreement (ICC 0.946 for TPS and 0.809 for CPS), though reliability was lower for CPS assessment, particularly with the SP142 assay, where up to 18% of samples were misclassified by individual pathologists compared to consensus scoring at the CPS ≥1 cutoff [94].
Table 4: Essential Research Materials for PD-L1 Concordance Studies
| Reagent/Platform | Type/Model | Primary Research Application |
|---|---|---|
| PD-L1 IHC 22C3 pharmDx | Antibody Clone | PD-L1 detection on Dako platforms; companion diagnostic for pembrolizumab [89] [13] |
| PD-L1 IHC 28-8 pharmDx | Antibody Clone | PD-L1 detection on Dako platforms; companion diagnostic for nivolumab [13] [94] |
| VENTANA PD-L1 (SP263) | Antibody Clone | PD-L1 detection on Ventana platforms; used with durvalumab [9] [13] |
| VENTANA PD-L1 (SP142) | Antibody Clone | PD-L1 detection on Ventana platforms; used with atezolizumab [13] [94] |
| BOND-III Staining System | Instrument | Automated IHC staining platform for PD-L1 CAL10 assay development [9] |
| BenchMark ULTRA | Instrument | Automated IHC staining platform for SP263 and SP142 assays [89] [94] |
| Aperio GT 450 | Instrument | Whole slide imaging for digital pathology integration [9] |
| QuPath | Software | Open-source bioimage analysis for objective PD-L1 scoring [89] |
The cumulative data from these comparative studies highlight several critical considerations for researchers developing and validating PD-L1 assays. First, assay concordance is highly tissue-type dependent, with distinct staining patterns observed in NSCLC, HNSCC, ccRCC, and hepatocellular carcinoma [9] [89] [13]. Second, the SP142 assay consistently demonstrates lower sensitivity across multiple tumor types, particularly in detecting PD-L1 expression in immune cells [13] [94]. Third, specimen type significantly influences PD-L1 scoring in HNSCC, with notable differences between biopsy, resection, and metastatic samples [89]. Finally, the emerging TAP score shows promising concordance with CPS in predicting clinical outcomes for gastric/esophageal cancers treated with tislelizumab, suggesting its potential utility as a complementary scoring metric [62].
These findings underscore the necessity of context-specific assay validation that accounts for both tumor histology and intended scoring algorithm. For researchers engaged in analytical validation of PD-L1 assays, these data support the implementation of digital pathology solutions to improve scoring consistency and the establishment of tissue-specific reference standards that reflect the unique distribution of PD-L1 expression across tumor and immune cell compartments in different cancer types.
The analytical validation of PD-L1 immunohistochemistry (IHC) assays is a critical prerequisite for their successful implementation in clinical trials and routine diagnostics. As immune checkpoint inhibitors continue to transform cancer treatment, the need for reliable, reproducible, and accessible companion diagnostics has intensified. This comparison guide objectively evaluates the performance characteristics of FDA-approved assays from Dako (22C3) and Ventana (SP263) alongside laboratory-developed tests (LDTs) using clones such as E1L3N and CAL10. The focus on non-small cell lung cancer (NSCLC) provides a clinically relevant context for assessing analytical performance across different testing platforms, staining methodologies, and interpretation criteria. Understanding the concordance, limitations, and appropriate applications of each platform empowers researchers and drug development professionals to make informed decisions regarding biomarker strategy in clinical trials and translational research.
Table 1: Analytical Concordance of PD-L1 IHC Assays in NSCLC
| Assay Comparison | Clinical Context | Concordance Metric | TPS ≥1% | TPS ≥50% | Reference |
|---|---|---|---|---|---|
| CAL10 (LDT) vs. SP263 (Ventana) | NSCLC tissue samples | Overall Percent Agreement (OPA) | 94.0% | 86.2% | [9] [99] |
| E1L3N (LDT) vs. 22C3 (Dako) | Advanced NSCLC | Correlation Coefficient | 0.925 (p<0.0001) | 0.925 (p<0.0001) | [21] |
| E1L3N (LDT) vs. 22C3 (Dako) | Advanced NSCLC | Positive Rate (TPS ≥1%) | 67.4% vs. 73.9% | N/A | [21] |
| E1L3N (LDT) vs. 22C3 (Dako) | Advanced NSCLC | Positive Rate (TPS ≥50%) | 26.1% vs. 30.4% | N/A | [21] |
| 22C3 vs. 28-8 vs. SP263 | Multi-institutional study | Qualitative Assessment | Interchangeable | Interchangeable | [32] |
Table 2: Predictive Performance of Alternative PD-L1 Assays
| Assay | Therapeutic Context | Clinical Endpoint | Performance Outcome | Reference |
|---|---|---|---|---|
| E1L3N (LDT) | First-line pembrolizumab in NSCLC | Objective Response Rate (ORR) | TPS>50% vs <1%: p=0.047 | [21] |
| 22C3 (Dako) | First-line pembrolizumab in NSCLC | Objective Response Rate (ORR) | TPS>50% vs <1%: p=0.051 | [21] |
| E1L3N (LDT) | First-line pembrolizumab in NSCLC | Progression-Free Survival (PFS) | Longer PFS for TPS ≥50% and 1-49% vs. <1% | [21] |
| CAL10 (LDT) | NSCLC tissue samples | Digital vs Manual Reading | Comparable concordance with SP263 | [9] |
The quantitative data demonstrate that carefully validated LDTs can achieve high analytical concordance with FDA-approved assays. The CAL10 assay developed on the BOND-III platform met its predefined performance targets, with the lower bound of the 95% CI for OPA exceeding 85% at both the ≥1% and ≥50% TPS cutoffs relative to the SP263 assay [9]. Similarly, the E1L3N assay exhibited exceptional correlation (r=0.925) with the 22C3 pharmDx test across the dynamic range of PD-L1 expression [21]. Most importantly, the E1L3N assay demonstrated comparable predictive performance for pembrolizumab response, with a separation in ORR between TPS categories (p=0.047) that mirrored the pattern observed with the 22C3 assay (p=0.051) [21].
While this guide focuses primarily on PD-L1 assays, the comparative framework for diagnostic assays extends to other biomarkers like HER2. Studies comparing HercepTest (Dako) and PATHWAY anti-HER2 (4B5) from Ventana in breast carcinoma reveal important methodological considerations. In one study, the 4B5 assay significantly reduced equivocal results (74.1% of cases equivocal by HercepTest were negative by 4B5), potentially streamlining testing algorithms [100]. However, the 4B5 assay failed to detect three FISH-positive cases identified by HercepTest, highlighting the risk of false negatives [100]. A more recent study of the next-generation HercepTest mAb pharmDx demonstrated 98.2% concordance with PATHWAY 4B5 for HER2-negative and HER2-positive categorization, though the HercepTest mAb showed higher sensitivity for detecting HER2-low cases [101]. These findings underscore that apparent "performance differences" between platforms must be interpreted within the clinical context and therapeutic implications.
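Experimental Protocol: A robust validation methodology employed a standardized PD-L1 Index Tissue Microarray (TMA) containing 10 isogenic cell lines with predetermined PD-L1 expression levels spanning the dynamic range [32]. Sections cut from a single TMA block were distributed to 12 institutions and stained weekly for six consecutive weeks with each site's clinical PD-L1 assay, and PD-L1 expression was then quantified by digital image analysis in QuPath [32].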
This standardized approach demonstrated that assays for 22C3 (Dako), 28-8 (Dako), SP263 (Ventana), and E1L3N (LDT) were highly similar across sites, with all laboratories showing high consistency over time [32]. The SP142 assay, however, failed to detect low levels of PD-L1 distinguished by other assays, confirming previous subjective assessments now quantified in a multi-institutional setting [32].
Experimental Protocol: The validation of E1L3N as an alternative to 22C3 followed a comprehensive clinical correlation protocol:
This validation framework established not only analytical concordance but also clinical equivalence, demonstrating that E1L3N TPS >50% predicted significantly higher ORR (p=0.047) similar to the 22C3 assay (p=0.051) [21].
Figure 1: PD-L1 Assay Validation Workflow. This diagram illustrates the standardized approach for multi-institutional assay validation using Index Tissue Microarrays (TMAs), digital image analysis, and statistical concordance assessment across different testing platforms.
Figure 2: PD-1/PD-L1 Signaling Pathway and Therapeutic Intervention. This diagram illustrates the immune checkpoint mechanism whereby tumor cell PD-L1 engages T-cell PD-1 to inhibit anti-tumor immunity, and how checkpoint blockade antibodies restore T-cell function.
The PD-1/PD-L1 axis represents a critical immune checkpoint pathway that tumors exploit to evade host immunity. PD-1 is expressed on the surface of T-cells, while its ligand PD-L1 is expressed on antigen-presenting cells and various immune cells [9]. Tumor cells can also express PD-L1, and when PD-L1 binds to PD-1 on T-cells, it induces T-cell death and downregulates the T-cell response, enabling tumor immune escape [9]. Immune checkpoint inhibitors targeting this pathway block the PD-1/PD-L1 interaction, preserving T-cell mediated anti-tumor immunity [9]. PD-L1 IHC assays detect PD-L1 on tumor and immune cells, and the resulting expression score serves as a predictive biomarker for response to these therapies.
Table 3: Essential Research Reagents and Platforms for PD-L1 Assay Development
| Reagent/Platform | Specific Example | Research Application | Performance Characteristics |
|---|---|---|---|
| Anti-PD-L1 Antibody Clones | 22C3 (Dako) | Companion diagnostic for pembrolizumab in NSCLC | FDA-approved; reference standard for multiple clinical trials [21] |
| Anti-PD-L1 Antibody Clones | SP263 (Ventana) | Identifying NSCLC patients for atezolizumab/cemiplimab | Qualitative IHC assay; detects PD-L1 in tumor cells and immune cells [102] |
| Anti-PD-L1 Antibody Clones | E1L3N (Cell Signaling) | Laboratory-developed test alternative | High concordance with 22C3 (r=0.925); cost-effective option [21] |
| Anti-PD-L1 Antibody Clones | CAL10 (Leica Biosystems) | Novel assay development under validation | Comparable to SP263 (94.0% OPA at ≥1% TPS) [9] [99] |
| Automated Staining Platforms | Dako Autostainer Link 48 | 22C3 and 28-8 pharmDx assays | Standardized staining for companion diagnostics [32] |
| Automated Staining Platforms | Ventana BenchMark ULTRA/XT | SP263 and SP142 assays | Integrated staining with OptiView DAB detection [21] [102] |
| Automated Staining Platforms | Leica BOND-MAX/BOND-III | E1L3N and CAL10 LDTs | Automated IHC/ISH with flexible protocol options [21] [9] |
| Digital Analysis Tools | QuPath (Open Source) | Quantitative image analysis | Cell segmentation, DAB OD quantification, TPS/CPS calculation [32] [89] |
| Digital Analysis Tools | Aperio ScanScope XT | Whole slide imaging | High-resolution digital pathology for archiving/analysis [32] |
| Standardization Materials | Index TMA (Isogenic Cell Lines) | Inter-assay comparison | Pre-characterized PD-L1 expression range for normalization [32] |
| Standardization Materials | FFPE Cell Pellet Controls | Run-to-run quality control | Positive (MDA-453 for HER2) and negative controls [101] |
This toolkit highlights the essential components for developing, validating, and implementing PD-L1 IHC assays in research and potential diagnostic applications. The selection of appropriate antibody clones must be guided by the specific clinical or research context, considering factors such as therapeutic companion diagnostic status, platform availability, and cost constraints. The integration of automated staining platforms ensures reproducibility, while digital analysis tools provide objective quantification of biomarker expression. Standardization materials like indexed TMAs and control cell lines are indispensable for multi-institutional studies and quality assurance programs.
The comparative data presented in this guide demonstrate that well-validated LDTs can achieve performance characteristics comparable to FDA-approved assays for PD-L1 detection. The high concordance between E1L3N and 22C3 (r=0.925) and between CAL10 and SP263 (OPA 94.0% at ≥1% TPS) suggests that standardized laboratory-developed tests represent viable alternatives, particularly in resource-limited settings [21] [9]. However, the choice of platform must consider multiple factors beyond analytical concordance, including regulatory requirements, therapeutic context, and infrastructure considerations.
The emerging field of digital pathology and computational biomarker analysis presents new opportunities for standardizing PD-L1 assessment across platforms. Studies have demonstrated comparable concordance between manual slide reading and whole slide images for the CAL10 assay, supporting the integration of digital pathology into biomarker development workflows [9]. Furthermore, open-source bioimage analysis tools like QuPath enable standardized quantification of PD-L1 expression across different specimen types, potentially reducing inter-observer variability [89].
Future assay development must also address the challenges of tumor heterogeneity and temporal changes in PD-L1 expression. Studies in head and neck squamous cell carcinoma have revealed significant discrepancies in PD-L1 expression between biopsy specimens, surgical resections, and metastatic lymph nodes [89]. Similar heterogeneity has been observed in NSCLC, where neoadjuvant therapy can alter PD-L1 expression patterns in 36-57% of patients [22]. Liquid biopsy approaches using circulating tumor cells (CTCs) to quantify PD-L1 and HLA I expression represent promising complementary strategies that capture heterogeneity across metastatic sites and enable serial monitoring [22].
As therapeutic paradigms evolve toward combination immunotherapies and novel antibody-drug conjugates (evidenced by the HER2-low concept in breast cancer [101]), biomarker assays must demonstrate robust performance across the entire dynamic range of expression. The successful validation of LDTs for PD-L1 detection, following rigorous analytical and clinical validation frameworks as outlined in this guide, provides a roadmap for future biomarker development in the era of precision immuno-oncology.
The clinical implementation of programmed death-ligand 1 (PD-L1) immunohistochemistry (IHC) assays requires rigorous analytical validation to ensure accurate identification of patients who may benefit from immune checkpoint inhibitor therapy. PD-L1 expression serves as a critical predictive biomarker for multiple cancer types, including non-small cell lung cancer (NSCLC), with various approved companion and complementary diagnostic assays available [103]. The validation landscape is complicated by the existence of multiple FDA-approved assays, each developed alongside specific therapeutic agents, utilizing different antibody clones, platforms, and scoring criteria [104]. This complexity necessitates standardized validation approaches to ensure reliable clinical performance across laboratory settings while addressing challenges related to cost, accessibility, and technical variability.
Regulatory validation ensures that PD-L1 assays demonstrate consistent performance characteristics including sensitivity, specificity, precision, and reproducibility. The College of American Pathologists (CAP) emphasizes that laboratories should use validated PD-L1 IHC expression assays in conjunction with other biomarker assays where appropriate to optimize patient selection for immune checkpoint inhibitors [71]. Furthermore, pathologists must ensure appropriate validation has been performed on all specimen types and fixatives, with laboratory-developed tests (LDTs) requiring validation according to accrediting body requirements when clinically validated assays are not feasible [71].
Regulatory agencies classify PD-L1 assays based on their clinical utility and relationship to specific therapeutics:
Companion Diagnostics: The US Food and Drug Administration (FDA) defines a companion diagnostic as "a medical device, often an in vitro device, which provides information that is essential for the safe and effective use of a specific drug or biological product" within its approved labeling [104]. These assays are mandatory for treatment decisions, with the PD-L1 IHC 22C3 pharmDx assay for pembrolizumab in NSCLC representing the first companion diagnostic in immuno-oncology [104] [103].
Complementary Diagnostics: These assays aid in therapeutic decision-making but are not strictly required when prescribing the associated drug. The PD-L1 IHC 28-8 pharmDx assay became the first complementary diagnostic when the FDA approved nivolumab for second-line treatment of non-squamous NSCLC [104]. Treatment may be considered even in the absence of test results or if results are negative, though testing is highly recommended.
Table 1: FDA-Approved PD-L1 Assays and Their Characteristics
| PD-L1 Antibody | Platform | Detection System | Therapeutic Agent | Scoring Method | Key Cutoffs |
|---|---|---|---|---|---|
| 22C3 | Dako Autostainer Link 48 | EnVision FLEX visualization system | Pembrolizumab | TPS | ≥1%, ≥50% (NSCLC) |
| 28-8 | Dako Autostainer Link 48 | EnVision FLEX visualization system | Nivolumab | TPS | ≥1%, ≥5% (NSCLC) |
| SP263 | Ventana BenchMark Ultra | OptiView DAB IHC Detection Kit | Durvalumab | TC | ≥25% (NSCLC) |
| SP142 | Ventana BenchMark Ultra | OptiView DAB IHC Detection Kit | Atezolizumab | IC/TC | IC ≥5%, TC ≥10% (NSCLC) |
| 73-10 | Dako Autostainer Link 48 | EnVision FLEX visualization system | Avelumab | TPS | ≥1%, ≥50%, ≥80% (NSCLC) |
TPS: Tumor Proportion Score; TC: Tumor Cells; IC: Immune Cells [104]
Comprehensive analytical validation of PD-L1 assays must establish multiple performance characteristics to ensure clinical reliability:
Precision: Both intra-assay and inter-assay imprecision must be quantified. For novel ELISA assays targeting soluble PD-1, PD-L1 and PD-L2, intra-assay imprecision measurements with three patient pools demonstrated coefficients of variation (CV) not exceeding 10% for all three assays (PD-1: 6.4-7.8%; PD-L1: 4.2-7.1%; PD-L2: 4.5-10.0%) [105]. Inter-assay imprecision should be assessed through repetitive measurement of sample pools across different plates and days.
Limit of Detection (LOD) and Quantification: The LOD should be based on background signals using multiple blank values from different days, with the standard deviation multiplied according to statistical standards [105]. Assays must demonstrate precise measurements down to the pg/mL range for soluble markers [105].
Dilution Linearity: Experiments should demonstrate good linearity in both buffer and relevant matrices like heparin plasma. Serial dilution series (e.g., 1:2, 1:4, 1:8, 1:16) must maintain a linear response [105].
Selectivity: Analytical selectivity should be demonstrated through cross-reactivity experiments with possibly confounding markers at concentrations up to at least 15 ng/mL [105].
Dynamic Range: Assays should demonstrate a broad dynamic range capable of measuring clinically relevant concentrations across patient populations [105].
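Several of the performance characteristics listed above reduce to short calculations on replicate measurements. The sketch below illustrates three of them: the coefficient of variation for imprecision, a blank-based limit of detection, and percent recovery across a dilution series. The multiplier `k=3` for the LOD is an assumption, since the text only says the standard deviation is multiplied "according to statistical standards", and all numeric inputs are hypothetical.

```python
import statistics


def coefficient_of_variation(replicates):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def limit_of_detection(blank_signals, k=3.0):
    """Blank-based LOD: mean of blanks plus k sample standard deviations.

    k=3 is an assumed multiplier, not a value taken from the source.
    """
    return statistics.mean(blank_signals) + k * statistics.stdev(blank_signals)


def dilution_recovery(neat_concentration, measured_by_factor):
    """Percent recovery per dilution factor.

    measured_by_factor maps a dilution factor (2 for 1:2) to the
    concentration measured in the diluted sample. Back-calculated values
    near 100% indicate a linear response over the series.
    """
    return {
        factor: 100.0 * measured * factor / neat_concentration
        for factor, measured in measured_by_factor.items()
    }


# Hypothetical replicate readings of one sample pool.
print(coefficient_of_variation([9.0, 10.0, 11.0]))        # 10.0
# Hypothetical blank signals collected on different days.
print(limit_of_detection([1.0, 2.0, 3.0]))                # 5.0
# Hypothetical 1:2 and 1:4 dilutions of a 100-unit sample.
print(dilution_recovery(100.0, {2: 50.0, 4: 24.0}))       # {2: 100.0, 4: 96.0}
```

For intra-assay imprecision the replicates come from one plate, while inter-assay imprecision uses the same calculation on values collected across plates and days.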
For novel assay development, comprehensive protocols must be established:
Assay Development Workflow:
Sample Processing Protocol:
The interchangeability of PD-L1 assays remains a significant challenge in clinical practice. Meta-analyses of diagnostic accuracy have evaluated whether different assays can be used interchangeably for specific clinical purposes [84]. For clinical laboratories not able to use FDA-approved companion diagnostics, properly validated laboratory-developed tests represent a viable alternative [84].
Table 2: Analytical Comparison of PD-L1 Assay Performance
| Assay Comparison | Tumor Type | Concordance Level | Key Metrics | Notes and Limitations |
|---|---|---|---|---|
| 22C3 vs. SP263 | NSCLC | High concordance | OPA: >85% at ≥50% cutoff [9] | SP263 may show slightly higher sensitivity |
| 22C3 vs. 28-8 | NSCLC | High concordance | ICC for TPS: 0.646 [94] | Good inter-rater agreement (ICC: 0.946) |
| SP142 vs. others | Multiple | Lower sensitivity | Reduced PD-L1 detection [94] | Poorer performance in low expression ranges |
| CAL10 vs. SP263 | NSCLC | Comparable performance | OPA lower bound 95% CI: 86.2% at ≥50% cutoff [9] | Similar staining pattern and intensity |
| Laboratory-Developed Tests vs. FDA-approved | Multiple | Variable | Sensitivity/specificity ≥90% target [84] | Dependent on validation rigor |
Performance Standards: According to meta-analyses, PD-L1 assays are considered acceptable for clinical applications if both sensitivity and specificity for the stated clinical purpose are ≥90% [84]. The 22C3, 28-8, and SP263 assays demonstrate high concordance in PD-L1 scoring, suggesting potential interchangeability [94].
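To make the ≥90% acceptance criterion concrete, the sketch below dichotomizes paired TPS readings at a cutoff and computes sensitivity, specificity, and overall percent agreement (OPA) of a candidate assay against a reference assay. The function names, the paired readings, and the choice of a 50% cutoff are illustrative assumptions and do not come from the cited studies.

```python
def binarize(tps_values, cutoff):
    """Call a case PD-L1 positive when its TPS is at or above the cutoff."""
    return [tps >= cutoff for tps in tps_values]


def agreement_metrics(reference, candidate):
    """Sensitivity, specificity, and OPA of candidate calls against reference calls.

    Assumes the reference contains at least one positive and one negative case.
    """
    pairs = list(zip(reference, candidate))
    tp = sum(1 for r, c in pairs if r and c)
    tn = sum(1 for r, c in pairs if not r and not c)
    fp = sum(1 for r, c in pairs if not r and c)
    fn = sum(1 for r, c in pairs if r and not c)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "opa": (tp + tn) / len(pairs),
    }


def meets_criterion(metrics, threshold=0.90):
    """Acceptance rule from the text: sensitivity and specificity both >= 90%."""
    return metrics["sensitivity"] >= threshold and metrics["specificity"] >= threshold


# Hypothetical paired TPS readings (%) for ten cases; illustration only.
reference_tps = [0, 0, 5, 10, 30, 45, 55, 60, 80, 90]
candidate_tps = [0, 1, 3, 15, 25, 52, 50, 70, 75, 95]

metrics = agreement_metrics(binarize(reference_tps, 50), binarize(candidate_tps, 50))
acceptable = meets_criterion(metrics)

print(metrics)
print("meets >=90% criterion:", acceptable)
```

In this toy data set a single case near the cutoff (45% versus 52%) is enough to pull specificity below 90% even though OPA is 90%, which illustrates why sensitivity and specificity are evaluated separately rather than relying on overall agreement alone.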
Emerging technologies are transforming PD-L1 assessment:
Digital Scoring Algorithms: Studies comparing pathologists versus artificial intelligence algorithms in scoring PD-L1 expression in NSCLC reveal moderate interobserver agreement among pathologists (Fleiss' kappa 0.558) for TPS <1% and almost perfect agreement (Fleiss' kappa 0.873) for TPS ≥50% [95]. AI algorithms show fair to substantial agreement with median pathologist scores (Fleiss' kappa 0.354-0.672) [95].
Quantitative Continuous Scoring (QCS): Computer vision systems enable granular cell-level quantification of PD-L1 staining intensity in digitized whole slide images [106]. The PD-L1 QCS-PMSTC (percentage of tumor cells with medium to strong staining intensity) classifier at >0.575% identifies patient populations with comparable hazard ratios to visual scoring (0.62 vs. 0.69) but increased biomarker-positive prevalence (54.3% vs. 29.7%) [106].
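The Fleiss' kappa values quoted for digital scoring above measure agreement among multiple raters beyond what chance alone would produce. A minimal sketch of the standard calculation follows; the rating table is hypothetical, and the three TPS categories (<1%, 1-49%, ≥50%) are chosen here only for illustration.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned case i to category j. Every case must have the same number
    of raters, and at least two categories must be used overall."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_category = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per case: fraction of rater pairs that agree.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_category)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical ratings: 5 cases, 3 raters, categories TPS <1%, 1-49%, >=50%.
ratings = [
    [3, 0, 0],
    [2, 1, 0],
    [0, 3, 0],
    [0, 1, 2],
    [0, 0, 3],
]

kappa = fleiss_kappa(ratings)
print(f"Fleiss' kappa: {kappa:.3f}")
```

Values are conventionally read on the Landis and Koch scale (for example, 0.41-0.60 moderate and 0.81-1.00 almost perfect), which matches the descriptive labels used in the text.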
The PD-1/PD-L1 axis represents a critical immune checkpoint pathway that cancers exploit to evade host immunity. Understanding this biological context is essential for appropriate assay implementation and interpretation.
Figure 1: PD-1/PD-L1 Signaling Pathway and Therapeutic Intervention. The binding of PD-1 (expressed on T-cells) to PD-L1 (expressed on tumor cells and antigen-presenting cells) initiates an immunosuppressive signaling cascade through SHP2 recruitment, leading to inhibition of T-cell receptor signaling and co-stimulatory pathways. This results in impaired T-cell effector functions and tumor immune evasion. Immune checkpoint inhibitors (therapeutic antibodies) block this interaction, restoring anti-tumor immunity [2].
Table 3: Essential Research Reagents for PD-L1 Assay Development
| Reagent Category | Specific Examples | Function and Application | Technical Considerations |
|---|---|---|---|
| Primary Antibodies | Clone 22C3, 28-8, SP263, SP142, 73-10, CAL10 | Target-specific binding to PD-L1 epitopes | Different clones recognize different epitopes with varying sensitivity [104] [9] |
| Detection Systems | EnVision FLEX, OptiView DAB, SULFO-TAG Streptavidin | Signal amplification and visualization | Platform-specific compatibility requirements [105] [104] |
| Platform Instruments | Dako Autostainer, Ventana BenchMark, BOND-III, MESO QuickPlex | Automated staining and processing | Standardized protocols essential for reproducibility [105] [9] |
| Validation Controls | Recombinant protein standards, patient serum pools, multi-tissue blocks | Assay calibration and quality assurance | Should cover dynamic range; store at -80°C in aliquots [105] |
| Digital Pathology Tools | uPath software, Visiopharm applications, Aperio GT 450 scanner | Automated quantification and analysis | Require validation against manual scoring [95] [9] |
Regulatory validation of PD-L1 assays requires a comprehensive, evidence-based approach that addresses analytical performance, clinical utility, and practical implementation challenges. The validation framework must establish rigorous performance criteria including precision, sensitivity, specificity, and reproducibility across specimen types. While multiple FDA-approved assays exist with demonstrated clinical utility, properly validated laboratory-developed tests represent a necessary alternative for many laboratories facing resource constraints. Emerging technologies including digital pathology and artificial intelligence show promise for enhancing quantification accuracy and reducing inter-observer variability, though further refinement is needed to match the reliability of expert pathologist assessment. As the immuno-oncology landscape continues to evolve, maintaining rigorous validation standards remains paramount for ensuring accurate patient selection and optimal therapeutic outcomes.
The analytical validation of PD-L1 assays represents a critical component in the precision medicine paradigm for cancer immunotherapy. Successful implementation requires thorough understanding of the biological context, methodological rigor in assay selection and optimization, proactive management of pre-analytical and analytical variables, and comprehensive validation against established clinical benchmarks. Future directions must focus on harmonizing scoring systems across platforms, validating novel liquid biopsy approaches for difficult-to-access malignancies, and developing integrated biomarker models that combine PD-L1 with other predictive factors such as tumor mutational burden and microsatellite instability. As immunotherapy continues to evolve, robust analytical validation of PD-L1 assays will remain fundamental to identifying patients most likely to benefit from these transformative treatments and advancing drug development strategies.