Proteotoxic stress response drives T cell exhaustion and immune evasion – Nature



Cell lines

The MC38 cell line was purchased from Kerafast (ENH204-FP). The MB49 cell line was purchased from Sigma-Aldrich (SCC148). The HEK293T cell line was purchased from the American Type Culture Collection (CRL-3216). The MB49-gp33 cell line was shared by W. Cui (Northwestern University). The B16-OVA cell line was generated as previously described64 and shared by L. Deng (Memorial Sloan Kettering Cancer Center). HEK293T, MC38 and MB49 cells were cultured in Dulbecco’s modified Eagle medium (DMEM; Gibco, 11965-092) with 10% FBS (Gibco, 10082-147) and 1% penicillin–streptomycin (Gibco, 15140-122) at 37 °C and 5% CO2. B16-OVA cells were cultured in RPMI-1640 (Gibco, 11875-093) with 10% FBS and 1% penicillin–streptomycin. Cell lines were regularly tested for mycoplasma contamination.

Mice

WT C57BL/6J mice (strain 000664) were purchased from The Jackson Laboratory. CD8-specific gp96-deficient mice were generated by crossing E8i-Cre mice (The Jackson Laboratory, strain 008766) and Hsp90b1flox/flox mice, previously generated and described by our group65. The P14 mouse strain was a gift from W. Cui (Northwestern University). OT-1 (strain 003831) and Rag2–/– (strain 033526) mice were purchased from The Jackson Laboratory. These mice were maintained in the animal facility at the Ohio State University under standard conditions (ambient temperature of 20–24 °C, relative humidity of 30–70% and a 12-h dark–light cycle (lights on from 6:00 to 18:00)). Mice aged 6–8 weeks were used for experiments. All procedures were performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health (NIH). The protocol was approved by the Committee on the Ethics of Animal Experiments of the Ohio State University.

T cell isolation, stimulation and drug treatment

Spleens were isolated from C57BL/6J mice and minced into single-cell suspensions. CD8+ T cells were isolated using an immunomagnetic negative selection kit (Stemcell, 19853). Isolated CD8+ T cells were first stimulated with 3 μg ml–1 plate-bound anti-CD3 (BioLegend, 100359) and 1 μg ml–1 anti-CD28 (BioLegend, 102121) antibodies in T cell medium made with RPMI-1640 with 10% FBS, 1% penicillin–streptomycin, 1 mM sodium pyruvate (Gibco, 11360-070), 1× MEM NEAA (Gibco, 11140-050), 10 mM HEPES (Gibco, 15630-080) and 50 μM 2-mercaptoethanol (Gibco, 21985-023) supplemented with 100 U ml–1 recombinant human IL-2 (acquired from the Biological Resources Branch at the NIH) in 12-well plates at a density of 10^6 cells per well for 48 h at 37 °C and 5% CO2. For chronic stimulation, CD8+ T cells were re-stimulated every 2 days by passaging to new plates with plate-bound anti-CD3 in T cell medium with IL-2. For acute stimulation, CD8+ T cells were passaged every 2 days and maintained in T cell medium with IL-2. In some experiments, cells were treated with MK2206 (Cayman, 11593), LY294002 (Sigma-Aldrich, 440202) or rapamycin (Sigma-Aldrich, 553210) 2 days after initial activation and replenished concurrently with cell passage.

To measure cytokine production, activated cells were collected, plated and re-stimulated with 0.5× cell stimulation cocktail (Thermo Fisher, 00-4970-93) in T cell medium for 3 h at 37 °C and 5% CO2.

Tumour challenge and TIL isolation

For the MC38 tumour model, 1 × 10^6 cells were subcutaneously injected into the right flank of shaved C57BL/6J mice. Mice were euthanized for tumour collection 16 days after tumour implantation for cell sorting. For the MB49 tumour model, 5 × 10^5 cells were subcutaneously injected into the right flank of shaved C57BL/6J mice. Tumours were collected 13 days after tumour implantation. To prepare single-cell suspensions, isolated tumours were chopped and washed with PBS before incubation with collagenase I (200 U ml–1, Worthington, LS004196) in serum-free RPMI-1640 for 30 min at 37 °C with gentle agitation. After digestion, 2% BSA in PBS was added to cell suspensions to neutralize collagenase. Cell suspensions were washed with PBS and filtered through a 70 μm nylon filter. Single-cell suspensions were centrifuged and resuspended in PBS for downstream assays. For cell sorting, immune cells were enriched using a mouse TIL CD45 positive selection kit (Stemcell, 100-0350).

Flow cytometry

Cells were washed with PBS twice. Dead cells were stained using Live/Dead fixable blue (Invitrogen, L23105) or Zombie UV (BioLegend, 423108) at 4 °C for 15 min. Cells were washed with FACS buffer twice and a surface molecule staining antibody cocktail was applied for 30 min at 4 °C. After incubation, cells were washed twice with FACS buffer and then fixed and permeabilized using a FOXP3 fixation and permeabilization kit (eBioscience, 00-5523-00) overnight. After overnight fixation, cells were washed twice in permeabilization buffer and an intracellular staining antibody cocktail was added to the cells. After 2 h of incubation at room temperature, cells were washed twice with FACS buffer and analysed using Cytek Aurora. Acquired data were analysed with FlowJo software (v.10.10, BD Life Sciences) or OMIQ (Dotmatics) for high dimensional analysis. The gating strategy for TIL analysis is provided in Supplementary Fig. 2. A list of antibodies used for the multispectral flow cytometry study is provided in Supplementary Table 1.

For protein aggregation staining, cells were washed with HBSS (Sigma-Aldrich, H6648) twice and stained with 100 nM NIAD-4 (Cayman, 18520) or 50 μM CRANAD-2 (Cayman, 19814) in HBSS for 30 min at 37 °C and 5% CO2. Cells were stained using Live/Dead fixable Near IR (Invitrogen, L34975) at 4 °C for 15 min, followed by fixation (BD Biosciences, 554655) for 15 min and DAPI staining for 5 min at room temperature. Cells were then analysed by ImageStream for acquiring fluorescent images or Cytek Aurora for quantification.

For SG analysis, cells were collected and stained using Live/Dead fixable NIR, followed by fixation in BD Cytofix fixation buffer (BD Biosciences, 554655) for 15 min and permeabilization using a FOXP3 fixation and permeabilization kit for 30 min at room temperature. Cells were then stained with anti-G3BP1 antibody (Proteintech, 13057-2-AP) in permeabilization buffer for 1 h at room temperature and then FITC-conjugated anti-rabbit antibody for 30 min. DAPI was added to the cell suspension and incubated for 5 min. Data were collected by ImageStream and analysed using IDEAS (v.6.2). Live cells were gated for SG analysis. Cells with SG loci were determined by gating on the Bright Detail Intensity feature high population on the FITC–G3BP1 channel.

Protein synthesis rate measurement

Nascent proteins were labelled using a Click-iT HPG Alexa Fluor 488 Protein Synthesis Assay kit (Thermo Fisher, C10428). Cells were incubated with 50 μM HPG (Thermo Fisher, C10186) in T cell medium made with methionine-free RPMI (Gibco, A14517-01) for 30 min at 37 °C and 5% CO2. Cycloheximide (Sigma-Aldrich, 239763) was added to the negative control group at 50 μg ml–1 to inhibit translation. In some experiments, 2.5 μM MG132 (Sigma-Aldrich, M7449-1ML) or 10 nM bafilomycin A1 (Sigma-Aldrich, SML1661) was added to cells after HPG incubation. Cells were then labelled following the manufacturer’s protocol and analysed using Cytek Aurora.

For measuring translation in TIL subsets in vivo, 50 mg kg–1 OPP (Vector Laboratories, CCT-1407-25) was administered into tumour-bearing mice by intraperitoneal injection. Mice were killed exactly 1 h after injection. Tumours were isolated and processed into single-cell suspensions. Cells were stained with surface markers and OPP was labelled using a Click-iT reaction kit following the manufacturer’s protocol (Thermo Fisher, C10457).

Cell sorting

Single-cell suspensions were stained using Live/Dead fixable blue (Invitrogen, L23105) at 4 °C for 15 min. Cells were then washed twice with FACS buffer after viability dye staining. Tumour cells were enriched for CD45+ lymphocytes using a mouse TIL positive selection kit (Stemcell, 100-0350) and spleen samples from mice infected with LCMV were enriched for CD8+ T cells with a negative selection kit (Stemcell, 19853) before viability staining. Cells were then incubated with a surface staining antibody cocktail for 30 min at 4 °C. Cells were washed twice with FACS buffer and filtered through a 70 μm nylon filter immediately before loading into a Cytek Aurora CS for sorting. For sorting, a 100 μm nozzle was used for tumour-derived samples and a 70 μm nozzle for spleen-derived samples.

LCMV infection model

For acute LCMV infection, 8–10-week-old male mice were intraperitoneally inoculated with 2 × 10^5 p.f.u. LCMV Armstrong. For chronic LCMV infection, 8–10-week-old male mice were intravenously inoculated with 2 × 10^6 p.f.u. LCMV clone 13 in 400 µl RPMI-1640. Mice were euthanized on day 8 and day 30 after infection.

Gene editing in T cells by CRISPR–Cas9

The sgRNAs targeting each candidate were designed and purchased from IDT. The sequences of sgRNAs are provided in Supplementary Table 2. Two days before electroporation, splenic CD8+ T cells were isolated and activated with 3 μg ml–1 plate-bound anti-CD3 and 1 μg ml–1 anti-CD28 antibodies in T cell medium supplemented with 100 U ml–1 IL-2. On the day of electroporation, RNPs were assembled by mixing 1.5 μl sgRNA and 1 μg Cas9 nuclease V3 (IDT, 1081059) and incubated at room temperature for 20 min. Electroporation was prepared using a P4 Primary Cell 4D-Nucleofector kit (Lonza, V4XP-4032). The activated T cells were washed with PBS twice and resuspended with P4 nucleofector solution with supplement provided by the kit. RNPs and 1 μl HDR Enhancer (IDT, 10007921) were added to the cell suspensions. The reaction mix was loaded into a Nucleocuvette after incubation at room temperature for 2 min. 4D-Nucleofector and program CMT137 were used for electroporation. Cells were rested in T cell medium with 50 U ml–1 IL-2 for 2 days and received re-stimulation every 2 days afterwards. At 8 days after electroporation, cells were collected for downstream analyses.

Protein electrophoresis and western blotting

Cells were pelleted and lysed in NP-40 buffer (50 mM Tris 7.4, 150 mM NaCl, 1% NP-40 and 0.1% sodium deoxycholate) supplemented with protease and phosphatase inhibitor cocktail (Thermo Fisher, 78440) and incubated on a roller for 30 min at 4 °C. Samples were centrifuged at 18,000g, 4 °C for 15 min and supernatant was transferred to fresh tubes as the detergent-soluble fraction. The detergent-insoluble fraction was resuspended in NP-40 buffer supplemented with 4% SDS. The protein concentration was quantified using a BCA assay (Pierce, 23227).

Native samples were diluted with native sample buffer (Thermo Fisher, NP) and run on 3–8% Tris-acetate gels (Thermo Fisher, EA0378) with Tris-glycine native running buffer (Thermo Fisher, LC2672). Samples were electrophoresed at 150 V for 3 h at 4 °C. SDS–PAGE samples were boiled in NuPAGE LDS sample buffer (Thermo Fisher, NP0007) and resolved on 4–12% Bis-Tris gels (Thermo Fisher, NP0335) with MOPS SDS running buffer (Thermo Fisher, NP0001). Samples were electrophoresed at 150 V for 1 h at room temperature. A list of antibodies used for western blot analyses is provided in Supplementary Table 1.

Retrovirus packaging and T cell transduction

The retroviral EV plasmid pMIG and pMIG-myrAKT were purchased from Addgene (52107, 65063). The open-reading frame for CFTRΔF508 was synthesized and cloned into the pMIG plasmid for this study. To generate retrovirus for mouse T cell transduction, HEK293T cells were transfected with pMIG and pCL-Eco in Opti-MEM. The cell culture supernatant was collected 48 h after transfection and concentrated overnight with Retro-X Concentrator (Takara, 631456). Concentrated retrovirus was added onto plates coated with RetroNectin (Takara, T100B) and spun at 1,800g at 32 °C for 2 h. Virus supernatant was removed after centrifugation and the plates were washed with PBS twice. Polyclonal, P14 and OT-1 CD8+ T cells that had been activated for 16–48 h were added to the virus-coated plate and cultured for 24 h. Cells were washed twice and plated into new plates for another 3–6 days for downstream analyses. For the generation of retrovirus for human T cell transduction, a similar approach to that used for mouse cells was used, with the key modification of using the Plat-A cell line for virus packaging. To transduce human CD8+ T cells, CD8+ T cells were magnetically isolated from peripheral blood mononuclear cells (Stemcell, 17953) and activated with Dynabeads (Gibco, 11131D) for 1 day. After activation, the cells were transduced with the indicated virus. In brief, the cells were spinoculated at 1,000g in a RetroNectin-virus-coated plate. After 24 h, the virus was removed, and subsequent analyses were performed after an additional 6–8 days of activation and maintenance.

ACT experiment

P14 cells were isolated from the spleens of P14 mice and activated with 1 μg ml–1 gp33 peptide. Two days after activation, cells were edited by CRISPR–Cas9 as described above and expanded for another 2 days with 100 U ml–1 IL-2. Next, 1 × 10^6 P14 cells were intravenously transferred per mouse. Then 5 × 10^5 MB49-gp33 cells were subcutaneously injected into the right flank of shaved WT C57BL/6J mice or Rag2–/– mice. WT mice were lymphodepleted using 5 Gray of total body irradiation on the day before cell transfer and randomized for treatment groups. OT-1 cells were activated and transduced with retroviral vector as described above. Transduced OT-1 cells were purified by cell sorting on the basis of positive GFP expression. In total, 2.5 × 10^5 OT-1 cells were intravenously transferred to B16-OVA tumour-bearing Rag2–/– mice. For OT-1 ACT experiments, 5 × 10^5 B16-OVA cells were subcutaneously injected into the right flank of Rag2–/– mice 8 days before adoptive transfer, and mice were randomized into treatment groups.

Immunofluorescence analysis by confocal microscopy

T cells were collected and spun onto glass coverslips in a 12-well plate. For protein aggregation staining, cells were stained with NIAD-4 and fixed as described above. For CFTR staining, cells were fixed with fixation buffer (BD, 554655) for 15 min, permeabilized with 0.5% Triton X-100 in PBS for 20 min and blocked with 2% BSA for 1 h. Cells were stained with primary anti-CFTR antibody (Proteintech, 20738-1-AP) and then Alexa Fluor 647-conjugated goat anti-rabbit IgG antibody (Thermo Fisher, A-21244). After staining, coverslips were mounted onto glass slides with mountant and DAPI (Thermo Fisher, P36962). Images were taken using an Olympus FV3000 microscope with ×60 magnification and processed with Olympus OlyVIA (v.4.2). For analysis, images were imported into ImageJ as .tiff files and adjusted to RGB stack format for downstream processing. Thresholds for positive detection of aggregates were determined through normalized autodetection and maintained across all images with a lower threshold of 100 and an upper threshold of 255 to generate binary image masks. The area, average size per particle, percentage of area and mean fluorescence intensity were analysed using the Analyze Particles function selected for area, area fraction, fluorescence intensity, particle count and average particle size.

MS sample processing

Cell samples were collected and washed with PBS once. Cell pellets were frozen at −80 °C if not immediately processed. Cells were lysed in lysis buffer made with 5% SDS (Thermo Fisher, AM9820), 50 mM TEAB (Thermo Fisher, 90114) and 2 mM MgCl2 (Thermo Fisher, AM9530G) with HALT protease inhibitor cocktail (Thermo Fisher, 78441). Lysates were homogenized using either a probe sonicator or a Bioruptor. DNA was removed by centrifugation at 13,000g for 10 min and the pellet discarded. For in vitro cell samples, the protein concentration was quantified using a BCA assay (Pierce, 23227) and 50 μg protein of each sample was used for subsequent steps. For in vivo samples, total lysates were used assuming accurate FACS cell counts. Cell lysates were then treated with 20 mM DTT (Sigma-Aldrich, 10197777001) at 95 °C for 10 min, followed by the addition of 40 mM iodoacetamide (Pierce, A39271) at room temperature for 30 min in the dark and then quenched with 20 mM DTT for 15 min at room temperature. Phosphoric acid (1.2%; Sigma-Aldrich, 345245) was used to acidify proteins. Binding buffer with 100 mM TEAB in methanol (Thermo Fisher, A4581) was added to samples that were then loaded onto S-traps (ProtiFi, C01-micro-80) and washed with binding buffer 3 times. Proteins were digested with trypsin (Pierce, 90058) at 47 °C for 2 h. Digested peptides were eluted from S-traps with 0.2% formic acid (Thermo Fisher, TS-28905) followed by a second elution with 50% acetonitrile (Sigma-Aldrich, T7408) in 0.2% formic acid. Eluates were pooled and lyophilized for storage at −80 °C.

MS acquisition

Peptides were reconstituted with 2% acetonitrile in 0.1% formic acid and separated using either an Easy-nLC 1200 coupled to a Thermo Exploris 480 tandem mass spectrometer (Thermo Fisher) or an UltiMate 3000 UHPLC coupled to a Thermo Fusion tandem mass spectrometer (Thermo Fisher). In both setups, peptides were first desalted online using an Acclaim PepMap 100 Trap column (75 μm inner diameter, 150 mm length, 3 μm C18 packing) and then separated and ionized using either a 50 cm (Easy-nLC) or 25 cm (UltiMate 3000) Easy-Spray HPLC column (75 μm inner diameter, 2 μm C18 packing) with a 90-min linear gradient.

All data-independent acquisition (DIA) measurements were configured in a staggered window pattern using boundaries optimized to place window boundaries in forbidden zones. The Thermo Fusion was configured to use two DIA injections (covering peptide precursors from 400 to 700 m/z and from 700 to 1,000 m/z) of 38 × 8 m/z-wide windows in a staggered window pattern. These windows were configured to have 17,500 resolution and an automatic gain control (AGC) target of 4 × 10^5. Precursor spectra were placed every 38 scans (1 per cycle) using 35,000 resolution and an AGC target of 4 × 10^5. Similarly, the Thermo Exploris 480 was configured to use single-injection DIA measurements (covering peptide precursors from 400 to 1,000 m/z) of 38 × 16 m/z-wide windows. These windows were configured to have 30,000 resolution and an AGC target of 1 × 10^6. Precursor spectra were placed every 38 scans (1 per cycle) using 60,000 resolution and an AGC target of 1 × 10^6.

For each dataset, a sample pool was made from subaliquots and used for library generation. We used gas-phase fractionation (GPF) DIA following the chromatogram library approach66,67. For this, we injected each peptide pool 6 times using different 100 m/z regions (400–500 m/z, 500–600 m/z, 600–700 m/z, 700–800 m/z, 800–900 m/z and 900–1,000 m/z). Each injection was configured to use 4 m/z staggered DIA windows and appropriate precursor windows. Otherwise, all measurements were performed as for normal DIA above on their respective instrument.

Proteomic data analysis

Raw files were demultiplexed using MSConvert in the Proteowizard package (v.3.0.20169)68 and then searched using EncyclopeDIA (v.2.12.31). EncyclopeDIA was configured with the default settings for Orbitraps: 10 ppm precursor, fragment and library tolerances. EncyclopeDIA was allowed to consider both B and Y ions, and trypsin digestion was assumed. Searches were performed using a two-step procedure. First, the GPF-DIA injections were searched using a Prosit69,70 predicted spectrum library to generate a chromatogram library based on the Mus musculus UniProt FASTA database (downloaded on 22 October 2019, containing 17,025 entries). All z = +2 or z = +3 peptides from 396.4 to 1002.7 m/z (with a maximum of one missed cleavage) were predicted assuming a normalized collision energy of 33. Peptides detected in the six GPF-DIA injections at a 1% peptide-level false discovery rate (FDR) were compiled into the chromatogram library. Quantitative DIA injections were searched against this chromatogram library, again filtered to a 1% peptide-level FDR. A normalized protein expression matrix for all proteomics generated in this study is provided in Supplementary Table 3. Bubble plots of protein expression were generated using the R package tidyverse (v.1.3.1)71 based on z score-normalized protein expression values. Gene set enrichment analysis for protein clusters was performed using Enrichr72,73,74.

Bulk RNA-seq sample preparation and data analysis

Acutely and chronically stimulated T cells were collected on day 8 after initial activation. Cells were washed with PBS twice and pelleted. RNA was first extracted using TRIzol and chloroform and then cleaned up using an RNeasy Micro kit (Qiagen, 74004). Sample library preparation and sequencing were performed by Azenta Life Sciences. Poly(A) selection was used for library preparation. Sequencing was performed using an Illumina NovaSeq platform with a depth of 50 million reads per sample. The raw bulk sequences were checked, trimmed and filtered using Fastp (v.0.23.4)75. The filtered reads were mapped to the mouse reference genome mm10 using HISAT2 (v.2.2.1)76, and samtools (v.1.17)77 was used to convert and sort BAM files. Last, the subread tool (v.2.0.6)78 was used for gene quantification and generating the raw expression matrix. Raw expression data were first log-normalized, and the R package Limma (v.3.56.2)79 was used to fit the model and perform differential expression analysis. To avoid NA values, a pseudo count of 1 was added to the raw count matrix. Genes with an absolute log[fold change] value greater than 1.5 and an FDR-adjusted P value smaller than 0.05 were considered differentially expressed.
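
The thresholding step at the end of this paragraph can be illustrated with a minimal Python sketch. It is not the Limma workflow used in the study: the function names are hypothetical, the input P values and log[fold change] values are assumed to come from an upstream model fit, and Benjamini-Hochberg is assumed as the FDR adjustment.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: BH is the FDR adjustment; the text only says "FDR-adjusted".
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def call_de_genes(genes, log_fc, p_values, lfc_cutoff=1.5, fdr_cutoff=0.05):
    """Keep genes with |log fold change| > cutoff and adjusted P < cutoff."""
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, lfc, q in zip(genes, log_fc, fdr)
        if abs(lfc) > lfc_cutoff and q < fdr_cutoff
    ]


# Toy input (made-up values, for illustration only).
toy_genes = ["A", "B", "C", "D"]
toy_lfc = [2.0, -1.8, 0.4, 1.6]
toy_p = [0.001, 0.01, 0.02, 0.5]
toy_hits = call_de_genes(toy_genes, toy_lfc, toy_p)
```

In this toy input, gene D passes the fold-change cut-off but not the FDR cut-off, and gene C passes the FDR cut-off but not the fold-change cut-off, so only A and B are retained.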

Statistical comparison of protein expression and gene expression

To accurately compare protein and gene expression levels, we created a hash table (Supplementary Table 4) that included the protein accession number, protein name, gene name and Mouse Genome Informatics (MGI) number. Each protein and RNA matrix needed to match the hash table, and only the overlapped proteins and genes were kept.

We compared the normalized and log-transformed protein expression and gene expression levels in samples of the same condition (for example, day 8 Tex samples). Only proteins and genes that overlapped in both protein and RNA data were retained for comparison. A Pearson’s correlation test was applied to calculate the correlation coefficient between protein expression and gene expression levels. We also compared the log[fold change] of proteins and genes between different conditions. The log[fold change] values of proteins and genes were calculated in the analysis of differentially expressed genes described above.
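
A minimal Python sketch of the overlap-then-correlate comparison is given here for illustration. It assumes both data sets are already keyed by a shared gene name (in the study this mapping comes from the hash table in Supplementary Table 4); the function names and toy values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlate_overlap(protein, rna):
    """Keep only genes present in both dicts, then correlate their values."""
    shared = sorted(set(protein) & set(rna))
    r = pearson_r([protein[g] for g in shared], [rna[g] for g in shared])
    return shared, r


# Toy log-transformed expression values (made up for illustration).
toy_protein = {"Cd8a": 1.0, "Pdcd1": 2.0, "Tox": 3.0, "OnlyProt": 9.0}
toy_rna = {"Cd8a": 2.0, "Pdcd1": 4.0, "Tox": 6.0, "OnlyRna": 0.5}
shared_genes, r_value = correlate_overlap(toy_protein, toy_rna)
```

Entries found in only one data set (here OnlyProt and OnlyRna) are dropped before the coefficient is computed, mirroring the overlap rule stated above.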

We generated a functional gene list to further evaluate the expression level of proteins and genes undergoing specific cell functions, including 13 gene ontology terms, one EIF2A-dependent and one EIF2A-independent gene list. Specifically, the EIF2A-dependent and EIF2A-independent genes were determined according to the EIF2A-regulated upstream open reading frames35. As previously described35, EIF2A-regulated upstream open reading frames were defined as the ratio of 5′ untranslated region (UTR) translation in control/5′ UTR translation in Eif2a KO > 4. The remaining mRNAs with a ratio <4 were defined as non-EIF2A regulated (EIF2A-independent). The 5′ UTR translation rate was quantified for mRNAs with an average of more than 16 reads over all replicates. Genes in each of the 26 lists are highlighted on the scatter plot to compare the protein and gene expression/log[fold change].
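
The ratio rule for EIF2A dependence can be written as a short Python sketch. The function name, the labels and the handling of edge cases (a ratio of exactly 4, or zero translation in the knockout) are assumptions made for illustration; the text does not specify them.

```python
def classify_eif2a(utr_control, utr_ko, mean_reads,
                   ratio_cutoff=4.0, min_reads=16):
    """Classify one mRNA from its 5' UTR translation in control and Eif2a KO.

    Assumptions: mRNAs at or below the read threshold are left unquantified,
    and a zero KO value or a ratio exactly at the cut-off is left undefined.
    """
    if mean_reads <= min_reads:
        return "not quantified"
    if utr_ko <= 0:
        return "undefined"
    ratio = utr_control / utr_ko
    if ratio > ratio_cutoff:
        return "EIF2A-dependent"
    if ratio < ratio_cutoff:
        return "EIF2A-independent"
    return "undefined"


# Toy values (made up for illustration).
label_high = classify_eif2a(utr_control=8.0, utr_ko=1.0, mean_reads=20)
label_low = classify_eif2a(utr_control=3.0, utr_ko=1.0, mean_reads=20)
label_sparse = classify_eif2a(utr_control=8.0, utr_ko=1.0, mean_reads=10)
```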

Gene signature score analysis

For each of the gene lists mentioned above, we also calculated a gene signature score based on the single-sample gene set enrichment analysis (ssGSEA) method. An in-house script was used to perform the ssGSEA analysis. The R package heatmaply (v.1.4.2)80 or Morpheus (https://software.broadinstitute.org/morpheus) was used to draw the heatmap. For gene signature score analysis for scRNA-seq data, the raw expression matrix of LCMV scRNA-seq data was downloaded from GSM3701181 (ref. 31). Cells were divided into three categories on the basis of gene expression levels: progenitor state (Slamf6 > 0 and Cx3cr1 = 0); intermediate state (Cx3cr1 > 0); and terminal state (Slamf6 = 0 and Cx3cr1 = 0). Cells in each category were randomly divided into three equal subgroups. Pseudo bulk gene expression was defined by the average expression of genes in each cell subgroup. Then, the same ssGSEA method was performed on the pseudo bulk expression data to calculate the gene signature scores and to generate the heatmap.
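
The marker-based state assignment and pseudo bulk averaging described above can be sketched in plain Python. This is an illustration, not the in-house script: the function names are hypothetical, cells are represented as simple gene-to-expression dicts, and interleaved slicing is assumed as the way to form three near-equal random subgroups.

```python
import random


def assign_state(slamf6, cx3cr1):
    """Assign a cell to one of the three states from two marker genes."""
    if cx3cr1 > 0:
        return "intermediate"
    return "progenitor" if slamf6 > 0 else "terminal"


def pseudo_bulk(cells, n_groups=3, seed=0):
    """Randomly split cells of one state into subgroups and average each.

    cells: list of dicts mapping gene name -> expression value.
    Subgroup sizes differ by at most one cell when the split is uneven.
    """
    if len(cells) < n_groups:
        raise ValueError("need at least one cell per subgroup")
    rng = random.Random(seed)
    shuffled = list(cells)
    rng.shuffle(shuffled)
    groups = [shuffled[i::n_groups] for i in range(n_groups)]
    genes = sorted(cells[0])
    return [
        {g: sum(cell[g] for cell in group) / len(group) for g in genes}
        for group in groups
    ]


# Toy data: six cells with one gene each (made-up values).
toy_cells = [{"Tox": float(v)} for v in range(1, 7)]
toy_bulks = pseudo_bulk(toy_cells)
```

Each resulting pseudo bulk profile can then be passed to the same signature-scoring step as a bulk sample.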

Pan-cancer scRNA-seq data collection

To construct a comprehensive pan-cancer scRNA-seq dataset, we compiled transcriptomic profiles from 346 tumour samples derived from 251 individuals across 20 publicly available scRNA-seq datasets81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100 (Supplementary Table 5). To ensure data consistency and to minimize platform-related biases, only datasets generated using the 10x Genomics droplet-based platform were included for our analyses.

Quality control and preprocessing of the pan-cancer scRNA-seq data

We applied rigorous quality control measures using the package Scanpy (v.1.9.5)101 to filter and preprocess single-cell transcriptomic data. The following inclusion criteria were applied: (1) each cell expressed at least 200 genes; and (2) mitochondrial gene content remained below 20% of total counts. Further filtering steps removed the following data: (1) low-quality barcodes indicative of debris (<400 detected genes, <500 unique molecular identifiers); and (2) potential duplicate cells (>5,500 detected genes or >30,000 unique molecular identifiers). After quality control, raw count matrices and AnnData objects were concatenated, and counts were normalized to transcripts per million using sc.pp.normalize_total, followed by log-transformation with sc.pp.log1p. Non-tumour cells were excluded before normalization, which produced 1,030,968 high-quality single cells and 14,090 genes for downstream analyses.
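
For illustration, the per-cell filters and the normalization listed above can be expressed as a small, dependency-free Python sketch rather than as Scanpy calls. The function names are hypothetical. Two points are assumptions: the low-quality thresholds are treated as alternatives (either one removes the cell), and a target sum of 10^6 is assumed for the per-cell scaling.

```python
import math


def passes_qc(n_genes, n_umis, pct_mito):
    """Return True if a cell survives all stated quality-control filters."""
    # Inclusion criteria: at least 200 genes and mitochondrial content < 20%.
    if n_genes < 200 or pct_mito >= 20.0:
        return False
    # Debris-like barcodes (assumption: either threshold triggers removal).
    if n_genes < 400 or n_umis < 500:
        return False
    # Cells with unusually high gene or UMI counts.
    if n_genes > 5500 or n_umis > 30000:
        return False
    return True


def normalize_log1p(counts, target_sum=1e6):
    """Scale one cell's counts to target_sum, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cell has no counts")
    return [math.log1p(c * target_sum / total) for c in counts]


# Toy cells (made-up values).
kept = passes_qc(n_genes=1500, n_umis=4000, pct_mito=5.0)
dropped_mito = passes_qc(n_genes=1500, n_umis=4000, pct_mito=25.0)
toy_norm = normalize_log1p([1, 1])
```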

Batch correction and data integration

To harmonize datasets across studies while preserving biological signals, we used the Python package scVI (scvi-tools v.1.0.4)102 for batch-effect correction and data integration. The scVI model was trained with sample identity as a covariate, mitigating inter-sample technical variability while ensuring robust integration of multiple datasets. The efficiency of batch correction was assessed by quantifying the reduction in batch-specific effects while maintaining key biological variance. After correction, downstream analyses—including clustering, differential gene expression and trajectory inference—were performed on the integrated dataset. UMAP was used for visualization, depicting cellular heterogeneity across batches, datasets, sex, organ origins and cancer types.

Cell-type annotation of pan-cancer scRNA-seq data

To annotate cell populations, we leveraged the scANVI algorithm (scVI-tools v.1.0.4), which provided pre-labelled reference annotations for epithelial, endothelial, fibroblast, lymphoid, myeloid and plasma cells. Initial clustering was performed in the scANVI latent space, followed by Leiden clustering to assign cell identities. The scANVI model was trained with max_epochs=20, and cluster annotations were transferred with n_samples_per_label=100. For detailed characterization of T cell subpopulations, we further integrated corresponding AnnData objects and applied scVI-based batch correction.

Functional signature calculation for scRNA-seq data

We used the scanpy.tl.score_genes function from the Python package Scanpy (v.1.9.5) to compute gene set scores across individual cells, which enabled the quantification of functional signatures in the scRNA-seq dataset.

RNA velocity and trajectory inference

RNA velocity analysis was performed to infer the directionality of cellular state transitions using spliced and unspliced transcript counts. Velocities were computed using the scVelo toolkit (v.0.3.3)103,104, which estimates transcriptional dynamics across single cells. The resulting velocity vectors were projected onto the UMAP embedding to visualize the flow of differentiation. To infer developmental trajectories, the Slingshot algorithm was applied to the UMAP coordinates, incorporating RNA velocity information to identify lineage structures. Slingshot fit smooth curves (principal curves) through the data and assigned pseudotime values along each inferred lineage. Two dominant lineages were identified: one progressing towards a Tex cell phenotype (lineage 1) and the other towards an effector-like phenotype (lineage 2). Signature scores for naive, exhaustion and Tex-PSR gene modules were calculated across pseudotime for each lineage using averaged normalized expression of predefined marker genes.

Validation of the Tex-PSR signature in CD8+ T cells and its prognostic impact

To assess the clinical significance of the Tex-PSR signature in CD8+ T cells, we analysed public processed scRNA-seq data from 116 liver cancer samples obtained from 94 male patients105. Survival analyses were restricted to primary tumours and metastatic samples. After quality filtering, batch correction and cell-type annotation using the established preprocessing pipeline, CD8+ T cells were isolated and Tex-PSR signature scores were computed using the scanpy.tl.score_genes function from the Scanpy package (v.1.9.5).

Tex-PSR signature expression in CD8+ T cells and its impact on patient survival

To evaluate the prognostic significance of Tex-PSR expression levels in CD8+ T cells, we performed survival analyses using Kaplan–Meier curves, with statistical comparisons conducted using the log-rank test and univariate Cox proportional hazards (Cox PH) models, as specified in each figure. Two additional multivariable Cox PH models were fitted to account for potential confounders. The hazard ratio and 95% confidence intervals were reported on the basis of these models. Kaplan–Meier survival curves were generated to compare high versus low Tex-PSR expression in liver cancer scRNA-seq datasets, with P values computed using univariate Cox PH models. To determine the optimal cut-off value for Tex-PSR signature expression in relation to survival outcomes, we used the surv_cutpoint function from the R package survminer. This approach uses maximally selected rank statistics from the R package maxstat106 to stratify patients into low-risk and high-risk groups. Moreover, continuous variables included in the Cox PH107 models were assessed for linearity to ensure model validity.
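
As a reminder of what a Kaplan–Meier curve computes, the product-limit estimator can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the cut-off is supplied by hand rather than chosen by maximally selected rank statistics, and no log-rank or Cox PH test is implemented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up times; events: 1 for an observed event, 0 for censoring.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone who leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve


def split_by_cutoff(scores, cutoff):
    """Return index lists for the low and high signature-score groups."""
    low = [i for i, s in enumerate(scores) if s <= cutoff]
    high = [i for i, s in enumerate(scores) if s > cutoff]
    return low, high


# Toy follow-up data (made up): the second patient is censored at t = 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
toy_low, toy_high = split_by_cutoff([0.1, 0.9, 0.4, 0.7], cutoff=0.5)
```

A curve would be computed separately for the low and high groups and the two compared with the tests named above.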

Tex-PSR expression in immunotherapy-treated patients

We further investigated Tex-PSR expression in responders and non-responders across independent scRNA-seq datasets from patients receiving diverse immunotherapy treatments, including CAR T cell therapy for refractory B cell lymphoma61, anti-PD1 therapy for lung cancer and advanced renal cell carcinoma (RCC)62,63, and anti-CTLA-4 with anti-PD1 combination therapy for RCC64,108. For each dataset, we applied the same preprocessing pipeline, including quality filtering, batch correction and cell-type annotation, as described for the pan-cancer scRNA-seq dataset.

Statistical analysis

Statistical analyses were performed using GraphPad Prism (v.10). A two-tailed unpaired Student’s t-test was used for comparisons between two groups. One-way ANOVA was used for comparisons among three or more groups. Two-way ANOVA was used to compare curves of time-course studies, including cell and tumour growth curves. P < 0.05 was considered significant.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.


