Mapping isoforms and regulatory mechanisms from spatial transcriptomics data with SPLISOSM

Designing isoform-level SV and association tests

Spatial gene variability arises from both cell-type distribution patterns and location-specific variation within cell types. While several methods exist for identifying spatially variably expressed (SVE) genes27,28,29,30,31, they prioritize different spatial patterns, resulting in limited consensus32. Moreover, these approaches cannot be easily adapted for multivariate isoform analysis due to their reliance on specific model assumptions (Supplementary Note). Drawing on nonparametric statistics’ effectiveness in modeling total expression from sparse ST data30, we reformulated spatial variability (SV) detection and differential usage (DU) analysis as multivariate independence testing. Specifically, SPLISOSM transforms isoform compositions, spatial coordinates and other covariates of interest into feature spaces, capturing their complex relationships through the kernel-based Hilbert–Schmidt independence criterion (HSIC)33,34 (Fig. 1c). Significant patterns are identified by comparing against the null hypothesis that a gene’s isoform preference is independent of either spatial locations or a given covariate such as RBP expression (Methods).

Our approach introduces two key technical innovations: first, we mathematically prove that all low-rank approximations of kernel tests—including those in SPARK-X30—inevitably sacrifice statistical power (Theorem 1 in Supplementary Note). This insight led us to develop enhanced spatial kernels35 using spectral graph theory that prioritize signals by their spatial frequency, limiting power losses to high-frequency fluctuations representing noise when processing large-scale data. Second, we established a framework for sparse isoform data with a new compositional kernel that handles undefined ratios (Theorem 2 in Supplementary Note), optimizing efficiency while maintaining test validity through a mean replacement strategy that uses all spatial observations.

Based on these theoretical advances, SPLISOSM implements three complementary SV tests: HSIC-GC for total gene expression, HSIC-IR for isoform usage ratios and HSIC-IC for isoform counts. For genes showing SV in isoform preference, we apply conditional DU tests to identify potential RBP regulators (Methods). Using SV, SPLISOSM enables cell-type-agnostic discovery (Fig. 1d), complementing existing single-cell isoform studies36,37,38 while offering unique advantages in contexts where cell typing is challenging, such as in tumor microenvironments.

SPLISOSM yields well-calibrated P values in simulation

We first validated SPLISOSM through simulations reflecting both gene expression and isoform preference variation (Fig. 2a and Methods). Our SV tests target distinct biological signals with well-calibrated statistics (Fig. 2b). Among them, HSIC-IR correctly identified spatial patterns in isoform usage while ignoring changes in total gene expression (scenario 4 versus 5). The modified linear kernel with mean replacement provided optimal power without introducing artifacts (Supplementary Fig. 1a,b). Complementarily, HSIC-IC detected both isoform usage (scenario 1 versus 3) and gene expression spatial variation (scenario 1 versus 4). We also observed improvements in the gene-level HSIC-GC compared to SPARK-X through the optimized spatial kernel (scenarios 4–6), validating our theoretical predictions.

Fig. 2: SPLISOSM produces well-calibrated and permutation-free P values in simulation.

a, Six simulation scenarios with SV at different layers. Regional variability was generated from two binary and two continuous spatial covariates Iso, isoform. b, QQ plots of SV test P values across 1,000 simulated genes per scenario. Regional gene expression (scenarios 4–6) was included to introduce artifacts in the observed ratios. c, QQ plots of DU test P values aggregated from 2,000 tests (1,000 genes × 2 covariates) per covariate type. Variable isoform usage independent of covariates (scenarios 2 and 5) was sampled from a Gaussian process (GP) to mimic spurious differential associations. For binary covariates, two-sided t-tests were performed on individual isoforms with P values combined at the gene level using Fisher’s method. For GLM and GLMM, null models with zero effect size were fitted and tested using the score statistic. Red text indicates components with spatial variability.

For association analysis, we focus on the challenge of correlated noise, which creates patterns that falsely align with covariates (scenarios 1 versus 2). Conventional approaches, including discrete t-tests, continuous multivariate correlation and generalized linear models (GLMs), all exhibited considerable P value inflation under spatial confounding (Fig. 2c, scenarios 1 versus 2 and 4 versus 5). SPLISOSM overcomes this limitation by introducing spatial conditioning within existing differential analysis frameworks (Methods). Simulations confirmed that our nonparametric conditional HSIC maintains proper null calibration while preserving sensitivity to subtle covariate effects (Fig. 2c and Supplementary Fig. 1c–e), and that our parametric generalized linear mixed model (GLMM) partially controlled false positives compared to standard GLMs.

Spatial splicing in adult mouse brain targets synaptic and membrane trafficking genes

Recent studies17,18 have generated spatial isoform maps of postnatal and adult mouse brains by combining 10X Visium spatial barcoding with Oxford Nanopore (ONT) sequencing. While able to resolve full-length transcripts, these long-read ST technologies are limited by throughput and data sparsity, providing a natural testbed to demonstrate SPLISOSM’s utility in biological discovery. We reanalyzed two SiT18 coronal brain section samples (CBS1, CBS2) from adult mouse (Fig. 3a). SPLISOSM detected 150 spatially variably processed (SVP) genes shared by both replicates (adjusted P < 0.05, Fig. 3b and Supplementary Figs. 2 and 3)—more than double the originally reported 61 genes—without relying on region annotations. Our SV tests demonstrated strong consistency between replicates (Spearman’s ρ of P values = 0.52, 0.39, 0.43 for HSIC-IR, HSIC-GC and HSIC-IC, respectively), while SPARK-X biased toward the sample with higher sequencing coverage (Spearman’s ρ = 0.26). These SVP genes were enriched in membrane trafficking and synaptic signaling (Clta, Cltb, Snap25, Stxbp1, Ap2a1, Atp6v1e1, Napa), with many implicated in neurodegenerative diseases (Fig. 3c). We also observed widespread isoform variability in ribosomal genes (Rps24, Rps6, Rpl13a, Rpl5, Rps9), aligning with previous studies showing cell-type and tissue-specific ribosome composition39. Transcript usage variability alone proved sufficient to define brain regions with comparable spatial resolution to expression-based clusters (Fig. 3d and Methods), and isoform preference could be grouped into region-specific programs by gene-wise clustering (Fig. 3e and Methods).

Fig. 3: Integrative analysis reveals spatial alternative splicing programs in adult mouse brain enriched for synaptic and membrane trafficking functions.
figure 3

a, Workflow overview. CA, cornu ammonis; DG, dentate gyrus. b, Comparison of SV in gene expression (x axis) versus isoform usage (y axis) in the CBS2 sample. Points colored by expression perplexity (effective number of isoforms per gene). Inset: number of events significant in both CBS1 and CBS2 replicates (adjusted P < 0.05), categorized as SVP or spatially variably expressed but not variably processed (SVENP) genes. c, Pathway enrichment analysis comparing SVP versus SVENP genes, with terms arranged by precision differences between the two groups. d, Unsupervised clustering based on gene-level expression of SVE genes (left) or relative isoform ratios of SVP genes (right). Isoform counts were locally smoothed to compensate for data sparsity. e, Selected isoform usage programs identified through hierarchical clustering of SVP genes. f, Distributions of AS event types among SVP and SVE genes. g, Selected exon features of 214 skipping or mutually exclusive exons in SVP and 1,349 exons in SVENP genes, showing per-exon median relative position (left) and median exon number of transcripts containing each exon. Boxplots show median (center line), interquartile range (box) and 1.5× interquartile range (whiskers). P values from two-sided t-test. h, Unsupervised clustering based on 1,028 SVE RBP expression from SR sequencing data. i, Top RBPs ranked by number of significantly associated targets, color-coded by availability of documented motifs (CISBP-RNA) or CLIP data (POSTAR3). Right: log-normalized expression of selected RBPs. j, Differential isoform usage P values for Rbfox3 (left) and Celf5 (right) in two 50-μm spaced adjacent technical replicates. Red points indicate validated targets with RBP binding near regulated events in brain-CLIP data. k, Gnas transcript structure and spatial log-normalized expression patterns. HSPC, hematopoietic stem and progenitor cell. Panel a created using BioRender.com.

At the structural level, SVP genes showed enrichment for exon skipping particularly near the 3′ end, as well as alternative first exons (Fig. 3f,g). To identify potential modulators behind these events, we systematically searched for differential association with spatially variably expressed RBPs, which collectively reproduced major brain compartments (Fig. 3h and Supplementary Fig. 4) and may act through direct binding or as indirect cofactors to drive transcript regional specificity. Top RBPs prioritized in our analysis included well-established neural splicing regulators such as QKI, RBFOX and CELF families and less-studied ones such as ARPP21 (Fig. 3i). To evaluate these relationships, we examined available brain-CLIP datasets40, confirming four of nine significant Rbfox3 associations. Although CELF5-specific CLIP data remain unavailable, eight of ten of its associated target genes showed binding by other CELF family members, supporting CELF5’s role as a major neural splicing modulator in adult mouse brain (Fig. 3j).

Among SVP genes, Gnas showed particularly intriguing associations (Fig. 3k). Its exon 3, which is spliced out specifically in midbrain and fiber tracts, generates a hyperactive G-alpha protein driving differentiation defects and myeloid malignancies when included in SRSF2 and U2AF1 mutants41. Our analysis reveals that RBFOX and CELF families potentially regulate Gnas exon 3 skipping in the brain, raising the possibility that neural RBP dysregulation in nonneural tissues could contribute to disease pathogenesis.

Cross-platform validation of brain transcript variability

The substantial spatial transcript processing identified from long-read data prompted us to examine whether these patterns could be detected across different ST platforms, each recognizing distinct aspects of molecular variation. In addition to alternative splicing (AS), mammalian nervous systems exhibit active alternative polyadenylation (APA)42 that generates transcript diversity near the 3′ end (transcriptome 3′ end diversity, TREND43), which can be captured by 3′-based ST sequencing protocols. In a separate 10X Visium sample44 of adult mouse brain, we identified 815 SVP genes with spatially variable TREND usage (adjusted P < 0.01, Fig. 4a,b and Methods). The increased detection power reflected the higher sequencing depth of SR Visium, as confirmed through down-sampling (Supplementary Fig. 5a). Analysis of a near-single-cell Slide-seqV2 dataset of mouse hippocampus16 revealed similar albeit fewer TREND patterns due to its sparsity (Extended Data Fig. 1a). In total, 95 SVP genes overlapped between long-read and SR datasets despite their technical differences (Extended Data Fig. 2a,b). Furthermore, genes using variable isoforms across space also show significantly higher SV in their 3′ end regions detectable by SR sequencing (Extended Data Fig. 2c and Supplementary Fig. 6), suggesting coordinated transcript processing across the full transcript length45.

Fig. 4: Spatial 3′ end transcript diversity in adult mouse brain extends beyond alternative polyadenylation and shows functional convergence on signaling pathways.
figure 4

a, Workflow overview. b, Comparison of SV in gene expression (x axis) versus TREND usage (y axis) in the SR coronal brain section sample. Points colored by expression perplexity (effective number of isoforms per gene). Inset: significant events (adjusted P < 0.01). c, TREND event annotation with overlapping with genomic features. Alternative exons (Alt-exon) are defined as exons located within introns of other transcript variants. Bold numbers are the absolute number of events. CDS, coding sequence. De novo motif enrichment results shown at bottom. P value computed using a one-sided chi-square test. d, Pathway enrichment analysis comparing SVP and SVENP genes, arranged by precision. e, Spatial log-normalized expression of the neuronal (chr. 9:110005229) and glial (chr. 9:110083783) Map4 variants. f, Unsupervised clustering based on gene-level expression of top 200 SVE genes (left), relative TREND ratios of top 200 SVP genes (middle) and gene expression of all SVE RBPs (right). g, Gnao1 transcript structure (left) and spatial log-normalized expression patterns (middle). Right: the likely protein sequence variation, color-coded by solvent-accessible surface area. FDR, false discovery rate. h, Arpp21 transcript structure and spatial log-normalized expression patterns. aa, amino acid. i, Top RBPs associated with TREND usage in Arpp21, Celf2 and Pcbp2. Dashed line indicates P = 0.01. Panel a created using BioRender.com.

To obtain orthogonal validation, we reanalyzed a 10X Xenium Prime 5K dataset46 of adult mouse brain with subcellular spatial resolution through multiplexed in situ hybridization (Methods). Although most predesigned Xenium probes do not target sequencing-based variable regions, we observed strong gene-level agreement: 55.6% (20 of 36) of ONT-detected and 39.8% (100 of 251) of SR-detected SVP genes showed significant spatial exon usage variability, both far exceeding the baseline rate of 14.9% (Extended Data Fig. 2d–g). At single-cell resolution, we could pinpoint cellular origins of spatial transcript patterns. For example, Xenium confirmed that Dtnbp1 isoform switching detected by long-read in specific brain regions was driven by distinct neuronal subpopulations: Nptxr+/Slc17a7+ excitatory neurons in olfactory and hippocampus-CA3 and GABAergic interneurons in isocortex (Extended Data Fig. 3). Similarly, we validated the Zdhhc8 3′ end variation and its isocortex and hippocampus-dentate gyrus specificity (Supplementary Fig. 7).

This cross-platform convergence demonstrates that spatial transcript diversity represents a fundamental organizing principle of brain architecture, detectable across diverse technical approaches (Extended Data Fig. 4 and Supplementary Fig. 8). It also establishes SPLISOSM as a robust and cell-type-agnostic framework for uncovering transcript patterns invisible at gene level from ST data.

Spatial 3′ end transcript diversity in adult mouse brain shows functional convergence on signaling pathways

Although over 80% of spatially variable TREND events mapped to 3′ untranslated regions (UTRs), we found significant localization at junctions and alternative exons (Fig. 4c). Some non-UTR events represented internal poly(A) priming sites, supported by A-rich motif enrichment, allowing us to detect upstream diversity deeper into transcripts. The functional importance of TREND became evident as 25% of events overlapped coding regions. Of seven previously reported protein-altering poly(A) site shifts between neurons and glia47, we recapitulated six (Map4, Atp2a2, Cdc42, Klc1, Itsn1, Ptpn2) while providing new insights into their spatial distribution (Fig. 4e and Extended Data Fig. 1b). SVP genes converged on key neural signaling pathways, including glutamatergic and adrenergic synaptic transmission and Rap1, Ca2+, cAMP and MAPK cascades (Fig. 4d). We observed additional protein-coding changes in genes involved in diverse cellular functions (Extended Data Fig. 1c), including proteostasis (Hsp90aa1, Hsp90ab1), membrane trafficking and secretion (Septin8, Septin11, Ap1s2, Olfm1, Lgi3) and enzymatic activities (Kalrn, Rexo2, Ppp2r5c). Finally, major brain regions could be defined using TREND preference alone (Fig. 4f and Supplementary Fig. 9), in agreement with our long-read analysis.

To understand the spatial regulation of TREND, we performed association analysis between event usage and RBP expression. Contrary to conventional thinking, SV in core alternative polyadenylation machinery components was insufficient to explain the observed 3′ end diversity patterns (Extended Data Fig. 1d). Instead, neural splicing regulators from our ONT analysis such as QKI, RBFOX and CELF families also emerged as top potential regulators for APA, supporting their central role in shaping the adult mouse brain transcriptome48. A prominent RBFOX target is Gnao1, encoding a subunit of the heterotrimeric G proteins. Our analysis uncovered previously unrecognized spatial divergence in Gnao1 last exon usage between cerebral nuclei and fiber tracts versus other brain regions, generating two protein isoforms with different C termini (Fig. 4g). Using public CLIP40 and knockout data49, we confirmed that RBFOX binds and promotes Gnao1 downstream exon usage. Regional isoform specificity in Gnas and Gnao1 extends to other G proteins (Gnai2, Gnas, Gnal, Gnaq) and G-protein-coupled receptors (Gabbr1, Grm5, Adgrl1), indicating a role of alternative transcript processing in fine-tuning signal transduction for spatially diverse physiological functions.

Our catalog also brought to light self-regulatory mechanisms driving transcript diversity in RBP genes themselves. For instance, the last exon usage of Arpp21 correlated with its total expression (Fig. 4h), consistent with a regulatory model where miR-1l.,28-2, produced from the longer isoform’s second-to-last intron, inhibits Arpp21 expression50. We also verified self-regulation in the 3′ UTR of CELF family51 (Fig. 4i) and discovered additional spatially variable 3′ UTR lengths across RBPs (Elavl2, Elavl3, Qk, Rbfox1, Hnrnpa2b1, Hnrnpk). One particularly intriguing new finding involves Pcbp2, which showed enriched usage of a minor upstream last exon in isocortex and thalamus. This alternative exon encodes a protein variant with an extended disordered C terminus (Extended Data Fig. 1e), whose usage inversely correlated with overall Pcbp2 expression in both mouse brain and across human tissues (Extended Data Fig. 1f). CLIP data52 revealed stronger human PCBP2 binding to this minor last exon compared to the canonical one, suggesting selective auto-regulatory feedback contributing to spatial specialization.

RBFOX and partners cooperatively regulate RNA processing

Causal inference from observational studies faces drawbacks from unknown confounding factors. Since SPLISOSM’s differential test captures both direct and indirect associations in the general form, it is not immune to such confounding effects. By comparing spatial RBP associations with perturbation studies in specific cellular environments, we can nevertheless extract meaningful biological insights.

A revealing example is the spatial regulation of Clta isoforms (Extended Data Fig. 5a). Knockout studies demonstrated that RBFOX proteins suppress Clta exon 5 and 6 inclusion in developing motor neurons49, yet our spatial test revealed a surprising positive correlation between Rbfox3 expression and the Clta-205 isoform containing these exons. Analysis of exon inclusion in a single-cell neocortex atlas initially confirmed this positive relationship across brain cell types (Extended Data Fig. 5b). The perceived contradiction resolved, however, when examining only neuronal subtypes: within GABAergic and far-projecting glutamatergic neurons, Rbfox3 negatively correlates with exon inclusion, aligning with the knockout results. This context-dependent relationship implies the presence of other cell-type-specific factors, countering RBFOX, that drive up Clta exon 5 usage in neurons. We found several additional genes (Myl6, Cltb, Tln1, Klc1; Supplementary Fig. 10) where RBFOX knockout effects contradicted spatial associations. Supporting our hypothesis, previous experiments have shown that repressive CELF binding can override RBFOX enhancement at MYL6 exon 6 in human T cells53.

To systematically identify potential RBFOX coregulators, we assessed agreement between RBP DU results and spatial coexpression patterns (Extended Data Fig. 5c). The top candidates included RBPs with varying reported RBFOX interactions: CELF family proteins, known to act antagonistically with RBFOX in muscle and heart tissue53; ELAVL2, sharing targets with RBFOX1 in human neurons54 and QKI, recently confirmed to affect RBFOX function by minigene assay55. Expression of these collaborating RBPs remained stable after RBFOX depletion (Supplementary Fig. 10a), pointing toward direct cooperation rather than cross-regulation feedback. Sequence motif analysis provided further validation, showing significant enrichment of ELAVL, CELF and QKI binding motifs near Rbfox-associated variable exons detected in the ONT data, while KHDRBS3 (a QKI homolog) and ELAVL motifs were enriched near Rbfox-associated TREND regions in SR data (Extended Data Fig. 5d). Consistently, we observed position-dependent overrepresentation of CELF and ELAVL motifs around exons responding to RBFOX triple knockout in the perturbation data49, supporting the collaborative regulation between multiple RBPs on the same targets (Extended Data Fig. 5e).

Conserved synaptic gene transcript diversity in human prefrontal cortex

Given the critical role of splicing and APA in brain functions, we hypothesized that spatial transcript diversity patterns would be highly preserved throughout mammalian evolution. To investigate this, we reanalyzed 12 10X Visium samples56 from healthy human brain (Fig. 5a and Methods), identifying 861 SVP genes in the dorsolateral prefrontal cortex (DLPFC) (adjusted P < 0.05). Power to detect SV has not yet saturated with respect to sequencing depth (Fig. 5b and Extended Data Fig. 6a), so improved RNA capture per spot would likely reveal additional variable events. When evaluating performance across replicates, SPLISOSM’s expression-based tests (HSIC-GC and HSIC-IC) demonstrated better power and consistency compared to SPARK-X, which was hampered by suboptimal kernel design (Fig. 5c). The HSIC-IR test showed weaker concordance, primarily due to data sparsity and noise, although this improved significantly for genes with higher expression.

Fig. 5: Evolutionarily conserved transcript diversity in synaptic genes across the human prefrontal cortex.
figure 5

a, Workflow overview. b, Relationship between total read coverage in TREND regions per spot (x axis) and number of significant SVP (top, HSIC-IR adjusted P < 0.01) or variably expressed (SVE, bottom, HSIC-GC adjusted P < 0.01) genes across datasets. Line and shading indicate fitted linear model and 95% confidence interval. c, P value correlation distributions between technical replicates (left), different sections from same donor (middle) and different donors (right). Genes are stratified by average TREND read counts per spot and group means are compared using the Kruskal–Wallis test. Boxplots show median (center line), interquartile range (box) and 1.5× interquartile range (whiskers). d, SVP genes ranked by recurrence (number of significant samples, x axis) and the minimal HSIC-IR P values across samples (y axis). Highlighted genes have mouse homologs that are also spatially variably spliced (HSIC-IR adjusted P < 0.01). e, Distributions of average conservation scores in TREND regions across gene categories: nonvariable controls (n = 12,107), human-specific SVENP (n = 8,044) or SVP (n = 1,319) and human-mouse shared SVENP (n = 4,022) or SVP (n = 401). Boxplots show median (center line), interquartile range (box) and 1.5× interquartile range (whiskers). Pairwise group means are compared using two-sided t-test. f, Pathway enrichment analysis comparing SVP (orange) versus SVENP (dark blue) genes conserved between human and mouse. g, RBP-SVP associations ranked by recurrence (x axis) and minimal GLMM P values across samples (y axis). Top potential regulators of SEPTIN8 are highlighted and colored by whether the association is conserved in mouse (P < 0.01 for both GLMM and HSIC-based DU tests). h, SEPTIN8 transcript structure and spatial log-normalized expression in mouse (top) and human (bottom, sample 151673). i, SEPTIN8-long and SEPTIN8-short protein variants with potentially different cellular functions. Panels a and i created using BioRender.com.

Despite these technical limitations, we found strong evidence for cross-species conservation at both individual gene and functional levels. Of the human SVP genes, 178 had mouse homologs that also exhibited SV (Fig. 5d and Extended Data Fig. 6b,c). Moreover, conserved TREND regions displayed significantly higher sequence conservation than invariant regions (Fig. 5e). Functional enrichment analyses in both species highlighted the importance of isoform-level regulation in synaptic signaling and chaperone-mediated autophagy, processes implicated in various neurodegenerative disorders (Fig. 5f and Extended Data Fig. 6d).

The molecular regulation governing spatial isoform diversity also showed remarkable cross-species similarities. MAP4, which organizes microtubules in muscles and neurons, exhibited patterns corresponding to neuronal-versus-glial usage in both species (Fig. 4e and Extended Data Fig. 6e). While PTBP2 was previously identified as a regulator of MAP4 last exon usage in embryonic brain cortex47, its limited SV in adult brain suggested the involvement of additional factors in region-specific regulation (Extended Data Fig. 6f). Our DU analyses pointed to RBFOX1, QKI and CELF families as likely cofactors, with CLIP data revealing conserved binding sites supporting this hypothesis (Extended Data Fig. 6g).

SEPTIN8 provides another compelling example of evolutionarily conserved regulation. This gene encodes a GTP-binding protein that polymerizes into filaments essential for cellular architecture. Our spatial analysis revealed a distinct usage of SEPTIN8 upstream last exon, presumably producing a shorter isoform predominating in mouse fiber tracts and human DLPFC white matter (Fig. 5h). This truncated protein lacks the extra C-terminal region that interacts with VAMP2 to facilitate SNARE-complex assembly57, which also features a filopodia-inducing motif that enhances dendritic arborization when palmitoylated58 (Fig. 5i). The downstream Septin8 last exon is promoted by NOVA2 in mouse motor neurons58. Our analysis indicates that NOVA2 likely cooperates with QKI and CELF proteins jointly to achieve the precise spatial patterning in the brain (Fig. 5g).

Microenvironment shapes glioma spatial transcript diversity

The capability to detect patterns without cell-type or region annotations makes SPLISOSM especially valuable for complex tissues with ambiguous cellular states. Glioblastomas are characterized by their infiltrative nature and heterogeneous spatial structures59. To explore isoform distributions in this challenging context, we systematically mapped spatial transcript usage across 24 human glioma samples (Fig. 6a and Methods), identifying 306 SVP genes in 11 ONT samples60 and 2,828 SVP genes in 13 SR samples59 (adjusted P < 0.05, Fig. 6b). The most frequently recurring SR-SVP genes overlapped substantially with those found in healthy brain tissue, reflecting compositional changes of nontumor cells (Fig. 6c and Extended Data Fig. 7a,b). As in healthy brain, approximately 25% of all variable transcript events in glioma had potential implications for protein structure, with alternating exon accounting for much of this variation (Extended Data Fig. 7d). Beyond RNA processing, glioma-specific transcript diversity also emerged from genomic alterations. In one pediatric sample, we detected variable immunoglobulin transcripts resulting from variable–diversity–joining recombination, revealing clonality in B cell infiltration (Extended Data Fig. 7c).

Fig. 6: ST diversity in human glioma is shaped by microenvironment composition and immune infiltration.
figure 6

a, Workflow overview. b, SVP genes ranked by recurrence and the minimal HSIC-IR P values across samples in ONT (left) and SR (right) cohorts. Orange indicates glioma-specific genes not variable (HSIC-IR adjusted P ≥ 0.05) in all healthy DLPFC samples. c, Distributions of recurrent and glioma-specific SVP genes. d, Pathway enrichment analysis comparing recurrent SVP (HSIC-IR adjusted P < 0.05 in ≥2 samples) genes versus recurrent SVENP (non-SVP and HSIC-GC adjusted P < 0.05 in ≥4 samples) genes. e, Spatial distribution of HLA-DRB1 isoforms in the ONT sample GBM1. f, B2M and CD74 isoform expression in GBM1, comparing tumor (n = 3,706) versus immune-infiltrated (n = 898) spots. Three novel CD74 intron-retention isoforms are pooled together for visualization. Boxplots show median (center line), interquartile range (box) and 1.5× interquartile range (whiskers). P values from two-sided t-test. g, TPM3 transcript structure, RNA-seq coverage, and spatial patterns in the SR sample ZH916bulk. Event counts were locally smoothed when computing ratios, same in l,m. h, Glioblastoma metaprogram distribution in regions with high or low TPM3-event 1 usage. P value from one-sided chi-square test. Right: log-normalized expression (logExpr) of selected markers: SNAP25 (neuron), ACTB (glial), IGKC (B cell), NDRG1 (hypoxia). AC, astrocyte-like; MES, mesenchymal-like; MES.Ast, astrocytic-like mesenchymal; MES.Hyp, MES-hypoxia; Vasc, endothelia cells and pericytes. i, FTH1 transcript structure in ONT sample DMG3 and its RNA-seq coverage in SR sample ZH8811Abulk. j, Hypoxia marker expression and association with FTH1-211 usage in DMG3. P value from one-sided chi-square test. k, Histology (left), metaprogram distribution (middle, P value from one-sided chi-square test) and hemoglobin expression (right) in spots with high (n = 438) or low (n = 1,945) FTH1-event 2 usage in ZH8811Abulk. Mac, macrophage/microglia; OPC, oligodendrocyte progenitor-like; NPC, neural progenitor-like. Boxplots show median (center line), interquartile range (box) and 1.5× interquartile range (whiskers), with P values from two-sided t-test. l, Spatial distribution of FTH1 isoforms in DMG3. m, Spatial distribution of FTH1 TREND usage in ZH8811Abulk. Panel a created using BioRender.com.

The importance of immune interactions was further underscored by consistent variability in antigen presentation components across samples, often correlated with immune infiltration patterns (Fig. 6d and Extended Data Fig. 7e–g). This included human leukocyte antigen (HLA) class I (HLA-A, B, C) and class II (HLA-DRA, DRB1, DPB1) genes, whose sequence variation likely stems from both alternative splicing and germline variability61,62 (Fig. 6e). For nonpolymorphic elements such as B2M and CD74, we found preferential usage of intron-retaining, potentially nonfunctional transcripts in tumor regions (Fig. 6f). While the SR data could not capture identical variability events due to its 3′ end bias, it demonstrated similar functional convergence toward immune-related pathways, particularly in infectious disease response and viral infection pathways (Fig. 6d and Extended Data Fig. 7g).

Furthermore, we detected SV in signal transduction, focal adhesion and cytoskeleton regulation pathways (Extended Data Fig. 7h), reflecting the intricate cellular interactions and extracellular matrix remodeling within tumors. TPM3, a tropomyosin family member with well-documented cancer-associated splicing patterns across tumor types63, exhibited complex and distinct splicing variation (Fig. 6g). We identified an upstream TPM3 event (event 4) that partially overlapped with neuronal and hypoxia signatures, while the canonical form (event 1) associated with glial programs and increased actin expression while negatively correlating with B cell infiltration (Fig. 6h). These results resonate with recent reports of differential splicing of actin cytoskeleton components between peripheral and core glioblastoma cells64. In addition, we observed spatially variable usage of the last exon of GFAP, encoding two variants of an astrocytic intermediate filament protein whose ratio is linked to glioma grade and malignancy65,66 (Extended Data Fig. 7i). The location-specific distribution of these cytoskeleton gene isoforms indicates dynamic responses to local microenvironmental cues, modulating migration potentials and driving glioblastoma invasion.

Microenvironmental influences extended to ribosomal proteins as well. We observed SV in transcripts encoding ribosomal proteins (including RPS8, RPS24, RPL5, RPLP1, RPS6, RPS12) associated with tumor infiltration, frequently involving retained introns or truncated 3′ ends (Supplementary Fig. 11). While some such as RPS8 also exhibited variability in normal brain, the altered read distributions clearly displayed glioblastoma-specific changes (Supplementary Fig. 11a), aligning with previous findings that ribosomal isoform switching occurs in response to microenvironmental cues and contributes to metabolic reprogramming phenotypic plasticity in glioblastoma67.

FTH1 offers another prominent example of microenvironment-driven transcript usage (Fig. 6i–m). This iron-storage protein, upregulated under hypoxic conditions, protects cells from ferroptosis by reducing reactive iron68. We discovered that FTH1 upregulation in hypoxic tumor regions coincided with increased expression of 3′ truncated transcripts lacking the final exon–exon junction (Fig. 6j). In SR data, we identified a minor event spanning FTH1 exons 2 and 3, predominantly in regions with erythrocytes and overlapped with the chromatin regulation metaprogram (Fig. 6k). Our findings suggest coordination between oxygen-sensing, iron metabolism and stress response pathways through transcript-level modulation of FTH1.

Leave a Comment