Tissue and cellular spatiotemporal dynamics in colon aging

Image and ST data preprocessing

Hematoxylin and eosin (H&E) images were processed with SpoTteR (version 0.0.1)156. Briefly, original H&E images were scaled to approximately 500 × 500 pixels. Then, the tissue section was masked generously from the image through 10% quantile thresholding in a user-defined color channel. To detect probable spot centers, the image Hessian was computed. The spot centers then acted as potential grid points that were likely part of a regular grid structure and were selected by calculating the x and y distances between all detected centers. A regular grid was then fitted to these potential grid points using a custom optimizer based on the ‘nlminb’ function of the R package stats, which minimizes the distance of potential grid points to the suggested regular grid while assuming angles of 90° and 42 starting grid points per row and column. Through an iterative process, in which the 0.1% potential grid points that least fit the grid were removed in each iteration, the number of grid points per row and column was updated and a new grid was fitted until the target number of grid points per row (here, 35) and column (here, 33) was reached. Finally, those grid points that overlapped the tissue sections were identified by building a mask that represented the tissue area and registering all grid points that were present in this mask. In case a sectioning artifact was present, the corresponding ST spot was removed from all subsequent analyses.

ST spot annotation

To assign each ST spot with a single (mutually exclusive) corresponding histological tag, a cloud-based interface33 was used to assign each spot (x, y) with one or more regional tags. A total of 14 tags (MROIs) based on established major gross morphology were used: crypt apex, subcrypt, crypt mid, crypt base, CM, epithelium and muscularis mucosae (EMM), EMM and submucosa (EMMSUB), epithelium and muscle and submucosa, ME, MEI, MI, muscularis mucosae and interna, muscle and submucosa, and PP.

Detection of tissue sections in histology images

In most cases, more than one tissue section was placed on the active area of one ST array. To distinguish between different tissue sections, a two-dimensional integer lattice was assumed so that labeled ST spots that were connected were assigned the same tissue section. Next, ST spots were filtered on the basis of their sequencing data quality, such that tissue sections labeled with fewer than five (ages 0 days–3 weeks) or ten (ages 4 weeks–2 years) spots in total were discarded from further analysis. ST spots with fewer than 800 unique molecular identifiers (UMIs) were also discarded from further analysis. To account for spots without four neighbors, each spot was mapped after filtering to match the same two-dimensional integer lattice [[0,1,0], [1,1,1], [0,1,0]] and spots not matching this patterns were also discarded.

Training a cell density classifier for segmentation

To train a cell density classifier to segment individual objects in the histology images, each whole-slide image (WSI) was first subset into smaller patches while retaining patches at the same resolution for training, selected such that each patch would contain all the major colon layers from at least one tissue cross section from the original WSI. Overall, ~200 patches were selected for training, with at least ten replicate patches from each of the different ages. To count the number of cell segments present in each ST spot, a density classifier was first trained using ilastik (version 1.4.1)157. This workflow estimated the density of blob-like structures usually present as overlapping instances, decreasing the chance of underestimating the number of objects because of undersegmentation, which we reasoned was the most appropriate approach for counting cell-dense areas in the colon. To ensure reproducibility across all density conditions in the dataset, in each training patch, at least three separate tissue areas (that is, training squares) were used. In each training square, two classes of objects were labeled: cells and background.

Processing segments per ST spot

Each WSI processed with the SpoTteR ST processing tool156 was split into image patches of 200 × 220 pixels representing the size of an ST spot capture area. The cell-counting workflow described above was then used to extract density predictions for each ST spot. The following image processing and segmentation steps were performed with skimage158 (version 0.18.1). First, an ellipse shape (radius = 100 pixels) was used to mask the true ST capture area in each patch. If no cells were left in the patch (mean image intensity < 0.05), the patch was discarded from further processing. Next, the multi-Otsu159 thresholding algorithm (cutoff > 50) was used to separate objects detected in the patch. Local maxima were found for each object and used to estimate distances between the same. These were then approximated by the watershed algorithm160 into segments that were further labeled into individual objects used in all downstream analyses.

Training an object classifier to obtain superclass cell type labels from histology images

MROI-specific object classifiers were trained using ilastik (version 1.4.1)157, with binary segmentation images and their corresponding H&E patches as input. In this way, each segment in the H&E patch was assigned a cell type superclass label. In each classifier, depending on the cells present in the corresponding MROI, up to five superclasses were labeled: colonocyte, immune, interstitial, muscle and epithelial. MROI-specific classifiers (colonocyte, PP, muscle, noncolonocyte and epithelium) with corresponding cell type superclass labels (colonocyte, immune, interstitial, muscle and epithelial), MROI labels and snRNA-seq cell type labels are presented in Supplementary Table 3. Because of differences in cell and tissue morphology, separate superclass classifiers were created for juvenile (0–4 weeks) and adult/aged (6 weeks–2 years) groups, for a total of ten image-based classifiers representing five MROI-specific classifiers across two age groups. To train each classifier, ~150 patches were randomly selected from all three regions of the colon (proximal, middle and distal) and from each of the following MROIs: CM, EMMSUB, subcrypt, crypt mid, crypt apex, MEI and PP. The object classifiers took into consideration object-level characteristics, such as object shape and work, to predict similar objects in the nearby space. Algorithm features used in training included two-dimensional convex hull and skeleton descriptors in a neighborhood size of 30 × 30 pixels for each object and used a simple threshold (0.5) with a small smoothing factor (1.0). Properties attributed to standard object features such as shape, size, channel intensity and location were also selected in the training process. In total, 1,540 patches and 83,721 segments were labeled during training.

Processing cell type superclass from histology images per ST spot

Each H&E image patch (200 × 220 pixels) and corresponding segmentation predictions were used in ilastik batch processing to predict cell type superclasses using the object classifier described above. Cell label predictions were used in the following image processing workflow implemented using skimage (version 0.18.1)158. Each pixel class in the image was assigned one of the cell type superclasses. Then, small objects (<50 pixels) were removed from each patch and the remaining small segments abutting each other were merged if belonging to the same cell type class. The fractions of foreground pixels belonging to each object class were used as estimates of the abundance of each cell type in each patch.

Testing the object classifier used for obtaining superclass cell type labels from histology images

To evaluate the performance of the cell superclass classifiers, a test set of 781 patches (50% size of training set) spanning the five adult/aged (colonocyte, immune, interstitial, muscle and epithelial) and five juvenile (colonocyte, immune, interstitial, muscle and epithelial) cell superclasses was set aside. Foreground objects were detected using the binary segmentation workflow, after which all objects were manually assigned to one of the five superclasses (colonocyte, immune, interstitial, muscle and epithelial). The same images were then input into the respective MROI-specific object classifiers and performance was evaluated by calculating the confusion (fraction of pixels labeled class A but predicted class B) for all pairs of cell types.

Morphology-informed deconvolution using SPOTlight

The SPOTlight model was used for ‘bottom-up’ deconvolution of ST data48, which takes as input two matrices of count data: \(V\), a (genes × cells) matrix containing the snRNA-seq count data (in which each cell is assigned to one of \({k}_{\mathrm{sn}}\) types), and \(V{\prime}\), a (genes × spots) matrix containing the ST count data. Expression matrices were preprocessed in the following manner: (1) genes were subset to the set shared across both modalities; (2) data were depth-normalized to 10,000 UMI counts per cell or spot; and (3) data were scaled gene-wise to unit variance. Next, genes were further subset to cellular marker genes (log2(fold change) > 1, Benjamini–Hochberg false discovery rate (FDR) < 0.05; likelihood ratio test) and balanced across the \({k}_{\mathrm{sn}}\) cell types, selecting the top \(m=23\) genes by FDR for each cell type, where \(m\) was chosen as the minimum number of significant marker genes across all cell types (23 for T cells). This resulted in a total of 334 unique genes used in deconvolution (Supplementary Table 4).

\(V\) was factored into component matrices \(W\) (genes × topics) and \(H\) (topics, cells) by NMF:

where the number of topics is assumed to be equal to the number of snRNA-seq cell types \({k}_{\mathrm{sn}}\). Before NMF, all values of \({W}_{g,t}\) were initialized to the probability of gene \(g\) being a marker gene for cell type \(t\) (quantified as Benjamini–Hochberg-corrected \({P}_{\mathrm{adj}}\) of t-test on log(count) data) and \(H\) was initialized as a binary matrix denoting the class assignment for each cell in the dataset. These initialization conditions—in which topics were treated equivalently to cell types—are meant to bias the optimization toward the discovery of biologically meaningful topic profiles.

Next, topic profiles \(W\) were fixed, and the following equation was minimized over \(H{\prime}\) (topics, spots) by non-negative least squares (NNLS):

$$V{\prime} =W\times H{\prime}$$

(2)

In this manner, the expression profile of each spot in the ST data was mapped to a combination of topics inferred from snRNA-seq.

Third, \(Q\), a (topics, \({k}_{\mathrm{sn}}\)) matrix from \(H\) was derived by selecting all cells from the same cell type and computing the median of each topic for a consensus cell-type-specific topic signature. This topic matrix was used in a final NNLS minimization to find \(P\), the (\({k}_{\mathrm{sn}}\), spots) matrix denoting the inferred cellular composition of each ST spot:

$$H{\prime} =Q\times P$$

(3)

A modification to Eq. 3 was implemented that allows the incorporation of morphology-informed composition information derived from the image segmentation workflow, by providing two additional inputs: \(L\), a (\({k}_{\mathrm{morph}}\), spots) matrix containing the composition of each ST spot in terms of the \({k}_{\mathrm{morph}}\) morphological cell types defined in the segmentation model, and \(S\), a (\({k}_{\mathrm{morph}}\), \({k}_{\mathrm{sn}}\)) binary matrix mapping each expression cell type to a morphological cell type. Any proposed compositional matrix \(P\) should additionally satisfy the following to reconstruct morphology-informed compositional data:

Morphology-aware SPOTlight decomposition was then achieved by solving the following optimization problem:

$$\mathop{\min }\limits_{P\ge 0}(H{\prime} -Q\times P)+\alpha (L-S\times P)$$

(5)

where \(\alpha\) controls the relative importance of the morphological composition loss (second term) and the expression loss (first term). This optimization problem was solved using the PyTorch implementation of the Adam optimizer with a learning rate of 0.01 run for 100,000 iterations from a random initialization.

cSplotch model specification

Genes i, tissue sections j and spots k were indexed as follows: \(i\in [1,\cdots ,{N}_{\text{genes}}],j\in [1,\cdots ,{N}_{\text{tissues}}],k\in [1,\cdots ,{{N}^{(j)}}_{\text{spots}}]\). Each tissue \(j\) was registered to a common coordinate system, such that each spot k was assigned to one of \({N}_{\mathrm{MROI}}\) distinct MROIs, denoted \({D}_{k}^{(j)}\in [1,\ldots ,{N}_{\text{MROI}}]\), as described above. In the compositional mode, each spot was additionally assigned a simplex vector \({E}_{k}^{\,(j)}\in {{\mathbb{R}}}_{\ge 0}^{{N}_{\mathrm{cell}\,\mathrm{types}}}\), \({\sum }_{x}{E}_{k,x}^{\,(j)}=1\forall j,k\), which describes its proportional composition in terms of all \({N}_{{\text{cell}}\,{\text{types}}}\) unique cell types across the dataset.

For each gene and each spot, the observed counts \({y}_{i,\,j,k}\) were considered to be realizations of random variable with an expected value equal to \({s}_{j,k}{\lambda }_{i,\,j,k}\), where \({s}_{j,k}\) is a size factor (total number of UMIs observed at spot \(k\)) and \({\lambda }_{i,\,j,k}\) is the rate of expression of gene i (events per exposure), such that gene expression is modeled independently of sequencing depth. In practice, \({s}_{j,k}\) was further normalized by the median depth across all spots in the dataset to facilitate comparisons of results across analyses. Thus, cSplotch offers the user two choices for modeling count data: the Poisson distribution or the negative binomial (NB) distribution. Either may be supplemented with zero inflation to account for dropout events (technical zeros), yielding the zero-inflated Poisson (ZIP) or zero-inflated NB (ZINB) distributions:

$$\begin{array}{rcl}{y}_{i,j,k}\sim \{\begin{array}{ll}\text{Pois}({s}_{j,k}{\lambda }_{i,j,k}) & \text{nb}=0,\text{zi}=0\\ \text{ZIP}({s}_{j,k}{\lambda }_{i,j,k},{\theta }_{i}^{\,p}) & \text{nb}=0,\text{zi}=1\\ \text{NB}({s}_{j,k}{\lambda }_{i,j,k},{\phi }_{i}) & \text{nb}=1,\text{zi}=0\\ \text{ZINB}({s}_{j,k}{\lambda }_{i,j,k}{\phi }_{i},{\theta }_{i}^{\,p}) & \text{nb}=1,\text{zi}=1\end{array}\end{array}$$

(6)

where \({s}_{j,k}{\lambda }_{i,\,j,k}\) represents the expected mean of all distributions, \({\phi }_{i}\) represents the gene-specific overdispersion parameter for the NB family and \({\theta }_{i}^{\,p}\) represents the gene-specific probability of technical zeros or dropout. The zero-inflated model accounts for an overabundance of zeros by introducing a second zero-generating process gated by a Bernoulli random variable:

$$\begin{array}{l}{y}_{i,j,k}\sim \left\{\begin{array}{l}0\\ \text{Pois}\left({s}_{j,k}{\lambda }_{i,\,j,k}\right)\end{array}\begin{array}{l}\text{if}\;{\theta }_{i}=0\\ \text{if}\;{\theta }_{i}=1\end{array}\right.\\ {\theta }_{i}\sim \text{Bernoulli}\left({\theta }_{i}^{\,p}\right),{\theta }_{i}^{p}\sim \beta \left(1,2\right)\end{array}$$

(7)

where the Poisson process can be replaced by NB without loss of generality. This mixture model allows for ‘true’ biological zeros to be generated by the Poisson or NB process describing the expression model while ‘shunting’ technical zeros into a separate, technical process, preventing abundant dropout events from lowering the estimated mean expression \({\lambda }_{i,\,j,k}\). Because the Poisson process does not allow for overdispersion (variance exceeding the mean), ZIP should be preferred to Poisson in most situations, whereas the use of NB or ZINB may depend on data quality.

While cSplotch considered a separate random variable to describe gene expression in each spot, the rate parameters \({\lambda }_{i,\,j,k}\) were described in terms of a GLM that separates variation into shared and individual components. Specifically, the rate of gene expression was informed by three components:

$$\log ({\lambda }_{i,\,j,k})={B}_{i,\,j,k}+{\psi }_{i,\,j,k}+{\epsilon }_{i,\,j,k},$$

(8)

where \({B}_{i,\,j,k}\) describes the characteristic expression of gene i within the tissue context of spot \(k\), \({\psi }_{i,\,j,k}\) describes the neighborhood effects and \({\epsilon }_{i,j,k}\) describes spot-specific effects. \({B}_{i,j,k}\) is calculated as a weighted sum of cellular expression rates \({\beta }_{i}\) in proportions \({{E}_{k}}^{(j)}\). Cellular expression was allowed to vary both across MROIs and sample conditions. As such, a characteristic expression matrix \({\beta }_{i}\in {{\mathbb{R}}}^{{N}_{\text{MROI}}\times {N}_{{\rm{c}}{\rm{e}}{\rm{l}}{\rm{l}}{\rm{t}}{\rm{y}}{\rm{p}}{\rm{e}}{\rm{s}}}}\) was defined (when compositional data are unavailable, each MROI may be treated as composed of a single ‘average’ cell type and \({\beta }_{i}\in {{\mathbb{R}}}^{{N}_{\text{MROI}}}\) is defined instead). Inferring a posterior over \({\beta }_{i}\) allowed the quantification of expression changes across regions or cell types by comparing relevant entries.

Because characteristic expression is expected to vary across conditions (for example, age, colon region and sex), region-specific expression \({\beta }_{i}\) was modeled in a hierarchical fashion defined by sample covariates. Up to three levels were explicitly modeled in the hierarchy, each of which split the sample to distinct groups along some covariate. At the top level, the dataset was split along an important covariate (for example, age) and a separate \({{\beta }_{i}}^{({l}_{1})}\) was modeled for each unique set \({l}_{1}\in \{1,\cdots {L}_{1}\}\). At the next level, each set was further partitioned along another covariate (for example, colon region). \({{\beta }_{i}}^{({l}_{1},{l}_{2})}\) was assumed to be centered around its corresponding top-level estimate \({{\beta }_{i}}^{({l}_{1})}\), with some additional variance associated with the new covariate \({\sigma }_{i}^{({l}_{2})}\). This encoded knowledge about the experimental system and separated out sources of variation associated with each covariate (without propagation of uncertainty, as all levels are inferred simultaneously). A three-level hierarchical model for \({\beta }_{i}\) was, thus, specified as:

$$\begin{array}{l}{\beta }_{i,{l}_{1}}{\mathscr{ \sim }}{\mathcal{N}}\left({\mu }_{i}^{\left({l}_{1}\right)},{\left({\sigma }_{i}^{\left({l}_{1}\right)}\right)}^{2}{\bf{{I}}}\right)\\ \begin{array}{l}{\beta }_{i,{l}_{1},{l}_{2}}{\mathscr{ \sim }}{\mathcal{N}}\left({\beta }_{{i,l}_{1}},{\left({\sigma }_{i}^{\left({l}_{2}\right)}\right)}^{2}{\bf{{I}}}\right)\\ {\beta }_{i,{l}_{1},{l}_{2},{l}_{3}}{\mathscr{ \sim }}{\mathcal{N}}\left({\beta }_{{i,l}_{1},{l}_{2}},{\left({\sigma }_{i}^{\left({l}_{3}\right)}\right)}^{2}{\bf{{I}}}\right)\\ {\sigma }_{i}^{\left({l}_{2}\right)},{\sigma }_{i}^{\left({l}_{3}\right)} \sim {{\mathcal{N}}}_{\ge 0}(0,1)\end{array}\end{array}$$

(9)

where, in the compositional mode, prior hyperparameters \({{\mu }_{i}}^{({l}_{1})}(\cdot ,m)\) and \({{\sigma }_{i}}^{({l}_{1})}(\cdot ,m)\) are set to the empirical mean and s.d., respectively, over the expression of gene i in cell type m in the snRNA-seq data for all MROIs; in the noncompositional mode (one ‘average’ cell type per MROI), \({{\mu }_{i}}^{({l}_{1})}(\cdot )\) and \({{\sigma }_{i}}^{({l}_{1})}(\cdot )\) are set to 0 and 2, respectively, for all MROIs. Variation parameters \({\sigma }_{i}^{\left({l}_{2}\right)},{\sigma }_{i}^{\left({l}_{3}\right)}\) are assumed to have truncated Gaussian priors reflecting our limited knowledge of the effects of covariate-driven variation and are inferred separately for each level 2 and 3 covariate group. For convenience, because each tissue \(j\) belongs to one covariate group at each level, the inverse mapping function \({\rho }^{-1}(j)\) was introduced to map \(j\) to the appropriate \({l}_{1},{l}_{2},{l}_{3}\) indices for \({\beta }_{i}\). \({B}_{i,\,j,k}\) was formally defined in the noncompositional model:

$${B}_{i,\,j,k}=\left\{\begin{array}{l}{x}_{j,k}^{T}{\beta }_{i,{\rho }^{-1}\left(j\right)}{E}_{k}^{\,\left(j\right)}\\ {x}_{j,k}^{T}{\beta }_{i,{\rho }^{-1}\left(j\right)}\end{array}\begin{array}{l}\text{if compositional}\\ \text{else}\end{array}\right.$$

(10)

where \({x}_{j,k}^{T}\) is a one-hot encoding of \({{D}_{k}}^{(j)}\), the MROI annotation for spot k. With this framework for integrating multiple sections or experiments, the posterior distributions of the latent parameters \({\beta }_{i}\) were studied at different levels of the hierarchical experimental design and expression changes were quantified across conditions, tissue contexts or individual cell types.

The second component of Eq. 8, \({\psi }_{i,j,k}\), describes the effects of the local neighborhood of spot k on the expression of gene i. This was modeled using the conditional autoregressive (CAR) prior, which assumes that the value at a given location (spot) is conditional on the values of neighboring locations (spots). \({\psi }_{i,j}\) was defined as a Markov random field over the spots on each array:

$$\begin{array}{l}{\psi }_{i,k}|{a}_{i},{\tau }_{i},{{\bf{W}}}_{j}{\mathscr{ \sim }}{\mathcal{N}}\left({\bf{0}},{\left({\tau }_{i}{{\bf{K}}}_{j}\left({\bf{I}}-{{a}_{i}{{\bf{K}}}_{j}^{-1}{\bf{W}}}_{j}\right)\right)}^{-1}\right)\\ {a}_{i}{\mathscr{ \sim }}{\mathcal{U}}\left(0,1\right)\\ {\tau }_{i} \sim {\Gamma }^{-1}\left(1,1\right)\end{array}$$

(11)

where \({a}_{i}\) is a spatial autocorrelation parameter, \({\tau }_{i}\) is a conditional precision parameter, \({{\bf{K}}}_{j}\) is a diagonal matrix containing the number of neighbors for each spot in tissue \(j\) and \({{\bf{W}}}_{j}\) is the adjacency matrix (with zero diagonal). If the classic ST methodology of using cartesian arrays is applied, each spot is assumed to have a four-spot neighborhood; if the Visium platform using hexagonal arrays is applied, each spot is assumed to have a six-spot neighborhood. The level of spatial autocorrelation (\({a}_{i}\)) and conditional precision (\({\tau }_{i}\)) was inferred separately for each gene, although weak hyperpriors on both parameters are shared across all genes to enforce the same physiological constraints (Eq. 11). Taken together, the \(B\) and \(\psi\) terms capture spatial autocorrelation on two different scales: tissue context (across samples) and local neighborhood (within samples).

The final component of Eq. 8, \({\epsilon }_{i,\,j,k}\), captures variation at the level of individual spots. This variation was assumed to be independent and identically distributed for each gene:

$$\begin{array}{l}{\epsilon }_{i,\,j,k}{\mathscr{\sim }}{\mathcal{N}}\left(0,{\sigma }_{i}^{2}\right)\\ {\sigma }_{i}\sim {{\mathcal{N}}}_{\ge 0}\left({0,0.3}^{2}\right)\end{array}$$

(12)

where \({\sigma }_{i}\) is the inferred level of variability for gene i. The degree of variance in the truncated Gaussian prior on \({\sigma }_{i}\) is set to 0.3 following the work of Maniatis et al.161, who chose the value to reflect the relatively low levels of spot-specific variation (relative to other model terms).

BF estimation for cSplotch DE analysis

To quantify difference in expression between two conditions using cSplotch, the BF between posterior distributions over characteristic expression coefficients \(\beta\) estimated by the model was examined. Without loss of generality, the difference in expression was quantified between conditions represented by \({\beta }^{(1)}\) and \({\beta }^{(2)}\), which may differ across any combination of genes, sample covariates (for example, distal versus proximal colon), tissue regions (for example, crypt apex versus muscle) or cell types (for example, neuron versus myocyte). A random variable \({\Delta }_{\beta }={\beta }^{(1)}-{\beta }^{(2)}\) was defined, which captures the difference between \({\beta }^{(1)}\) and \({\beta }^{(2)}\). If \({\Delta }_{\beta }\) is tightly centered around zero, then the two distributions are very similar to each other and the null hypothesis of identical expression cannot be rejected. To quantify this similarity, the posterior distribution \({\Delta }_{\beta }|{\mathcal{D}}\) (where \({\mathcal{D}}\) represents the data used to train the model) was compared to the prior distribution \({\Delta }_{\beta }\) using the Savage–Dickey density ratio that estimates the BF between the conditions:

$$\text{BF}\approx \frac{p\left({\Delta }_{\beta }=0\right)}{p\left({\Delta }_{\beta }=0|{\mathcal{D}}\right)}$$

(13)

where the probability density functions were evaluated at zero. If expression is different between the two conditions, then the posterior \({\Delta }_{\beta }|{\mathcal{D}}\) will have very little mass at 0 and the estimated BF will be large (by convention, \(\text{BF} > 5\) indicates substantial support). Conversely, for similar expression regimes, the posterior will place a mass equal to or greater than that of the prior at zero and the BF will be \(\le 1\). While \(p({\Delta }_{\beta })\) can be derived analytically (the prior distributions over all \(\beta\) are normally distributed and the difference between two normally distributed random variables is in turn normally distributed), \(p({\Delta }_{\beta }|{\mathcal{D}}{\mathscr{)}}\) must be approximated using the posterior samples obtained below. When we executed a comparison between sets of conditions (for example, neurons versus all other cells), we pooled the posterior samples from all component conditions together.

Parameter inference for cSplotch

cSplotch was implemented in Stan162. For all analyses, Bayesian inference was performed over the parameters using Stan’s adaptive Hamiltonian Monte Carlo (HMC) sampler with default parameters. Four independent chains were sampled, each with 250 warmup iterations and 250 sampling iterations, and convergence was monitored using the \(\hat{R}\) statistic. On average, sampling of a given gene was completed in ~3 h parallelized over four CPUs (<2 GB of total RAM used) and the sampling time for each chain was linear in both the number of spots in the dataset and the number of samples drawn, while multiple genes can be run in parallel. For high-resolution technologies (for example, Visium HD), the number of spots per tissue is dramatically increased and users may consider using variational inference to initialize HMC sampling (provided as ‘splotch-vinit’ executable) to reduce sampling time and/or suppressing the CAR model component (accounting for ~60% of computational time) with the ‘–no-car’ option.

Simulated ST data generation

Simulated ST data were generated from the snRNA-seq profiles in the two regimes described in the subsequent sections. For all simulation studies, 12 ST arrays were generated, each containing 2,000 spots. For data in which distinct MROIs were simulated, the two regions were considered to exist in a 1:1 ratio. Cell clusters comprising each region are detailed in Supplementary Table 5.

Cluster-based simulation

Average expression profiles for unique mouse colon cell types obtained from snRNA-seq \(G\in {{\mathbb{Z}}}_{\ge 0}^{{N}_{\text{genes}}\times {N}_{{\rm{c}}{\rm{e}}{\rm{l}}{\rm{l}}{\rm{t}}{\rm{y}}{\rm{p}}{\rm{e}}{\rm{s}}}}\) (\({N}_{\text{genes}}=22,986,{N}_{\text{cell}\,\text{types}}=30\)) were normalized column-wise, such that the total expression within each cluster summed to 1. For each spot \(k\) from tissue \(j\), a true composition vector \({E}_{k}^{(j)}\in {{\mathbb{R}}}_{\ge 0}^{{N}_{\text{celltypes}}}\) was drawn such that a random subset of cell types present in the current region (\({D}_{k}^{(j)}\); Supplementary Table 5) were represented in uniformly random proportions. For each cell type \(t\), reads \({y}_{k,t}\) were drawn from a multinomial distribution:

$${y}_{k,t}\sim \text{Mult}\left({M}_{k}^{\,\left(j\right)}{E}_{k,t}^{\,\left(j\right)},{G}_{.,t}\right)$$

(14)

where \({M}_{k}^{\,\left(j\right)}\) is the total number of reads in the current spot and \({G}_{.,t}\) is the expression profile for cell type \(t\). In practice, \({{M}_{k}}^{(j)}=1,000\) was used for all \(j,k\). Read counts were then pooled across all cell types, yielding spot-level reads \({y}_{k}={\sum }_{t}{y}_{k,t}\). Composition vectors were then (optionally) pooled within high-level annotation categories (such as the histological superclasses in Supplementary Table 3) to simulate cases of limited observability. Finally, Gaussian noise was (optionally) added to \({E}_{k}^{\,(j)}\) to generate ‘observed’ composition vectors \({\widetilde{E}}_{k}^{\,(j)}\). Resulting negative entries were removed by element-wise maximization against the zero vector (\({\widetilde{E}}_{k}^{\,(j)}\leftarrow \max ({\widetilde{E}}_{k}^{\,\left(j\right)},0)\)), following which \({\widetilde{E}}_{k}^{\,(j)}\) was renormalized to produce a valid simplex. \(y,D,\widetilde{E}\) served as inputs to cSplotch. Randomized compositions (highest level of noise) were generated by drawing from an \({N}_{\mathrm{cell}\,\mathrm{types}}\)-dimensional uniform (0, 1) distribution and then unit-normalizing sums to produce valid simplexes.

For all cluster-based data, cSplotch was run with a Poisson likelihood (without zero inflation) to meet the expected form of the marginals over the generating distribution (multinomial). To focus on the compositional module of cSplotch, the effects of local spatial autocorrelation were removed by simulating each spot independently and, as such, suppressed the ψi term of the GLM. Local neighborhood effects can be simulated in this regimen by passing a spatial smoothing filter over the simulated array, blending the transcriptome of each spot with those of its neighbors in a defined proportion.

Cell-based simulation

Individual snRNA-seq cell profiles \(G\in {{\mathbb{Z}}}_{\ge 0}^{{N}_{\text{genes}}\times {N}_{{\rm{c}}{\rm{e}}{\rm{l}}{\rm{l}}{\rm{t}}{\rm{y}}{\rm{p}}{\rm{e}}{\rm{s}}}}\) (\({N}_{\text{genes}}=22,986,{N}_{\text{cells}}=419,334\)), where each cell was assigned to one of \({N}_{\mathrm{cell}\,\mathrm{types}}=30\) cell types, were normalized cell-wise to a target depth of 1,000 counts using Scanpy’s (version 1.11.1)163 ‘normalize_total’ function. Each spot \(k\) on tissue \(j\) comprised 30 cells, partitioned uniformly at random among the cell types present within the current region (\({D}_{k}^{(j)}\)). The integer vector containing the number of cells belonging to each type was normalized to produce a simplex serving as the true composition vector \({E}_{k}^{\,(j)}\). For each cell type \(t\), \(30{E}_{k,t}^{\,(j)}\) cells were drawn at random (without replacement) from \({G}^{
(15)

such that \({||}{w}_{i}{||}\le 1,{\rho }_{i}\left({w}_{i}\right) < {c}_{i}\forall i\), where \({\rho }_{i}({w}_{i})\) represent least absolute shrinkage and selection operator (LASSO) regression penalties and \({c}_{i}\) controls the degree of sparsity. These tuning parameters were chosen by R’s MultiCCA.permute function108. For each pair of cell types i and j,

$${\text{argmax}}_{{w}_{i},{w}_{j}}{w}_{i}\bar{{\beta }_{i}}{\bar{\beta }}_{j}^{T}{w}_{j}={\text{argmax}}_{{w}_{i},wj}\text{corr}\left({\bar{\beta }}_{i}^{T}{w}_{i},{\bar{\beta }}_{j}^{T}{w}_{j}\right)$$

(16)

Therefore, the canonical covariates identified a space in which each MCP signal in cell type i is highly correlated with the corresponding signal in all other cell types. Next, given canonical variates from PMD, each of the \(M\) MCP signals was characterized by identifying at most n = 250 genes across all cell types that showed the strongest positive or negative contribution as measured by \(\mathrm{abs}(w) > 0.1\). The list of correlated genes for each MCP, both ‘up’ genes with positive weight and ‘down’ genes with negative weight, was then used as input to a Fisher’s exact test to identify Kyoto Encyclopedia of Genes and Genomes (KEGG) functional gene sets enriched in the MCP. Each MROI was allowed at most \(k\) MCPs under the constraint that the bases for each MCP are orthogonal. As MCCA will always output programs in the same order, the maximum of \(k\) MCPs was calculated for a given spatial niche and latter programs that show low self-correlation among member genes (mean Pearson’s r < 0.3) or high cross-correlation with genes from other programs (mean Pearson’s r > 0.05) we optionally removed. Total MCP activity across conditions was calculated as a weighted sum of member gene activity (\({y}_{m}={\sum }_{i}^{k}({\sum }_{g}^{G}({\bar{\,\beta }}_{i}[g,\bullet ]{w}_{i}[g,m])\,)\)), whereas cellular contributions to each MCP were calculated by binning positive and negative weights by cellular origin.

Spatiotemporal cellular composition analysis

To identify significant differences in cellular composition, each tissue section was treated as an independent sample and compositional estimates of all spots for a given MROI were pooled from the same section. Statistical significance across groups (for example, proximal–middle–distal samples) was assessed using Welch’s t-test. A crypt gradient gene was defined as one that showed a significant (BF > 2, log2 fold change > 0.5) difference in expression between crypt base and apex and a monotonic change in expression from base to base and mid, mid and apex.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Leave a Comment