• Overview
  • FASTQ+
  • PISA
  • Yano
  • My GitHub Page
  • Discussions
  1. Workflows & Short cases
  2. Cell trajectory analysis
  • Workflows & Short cases
    • From FASTQ to counts
    • Annotate various features
    • Select cells from reduction maps
    • Alternative splicing analysis
    • Allele-specific gene expression analysis
    • Annotating genetic variants
    • Analysis multiple Visium samples
    • Cell trajectory analysis

On this page

  • Prepare raw data
  • Cell clustering and annotation
  • Cell development trajetory construction
  • Perform alternative splicing analysis along the cell trajetory
  • Perform alternative splicing analysis for all cell space
  • Perform alternative splicing on a supervised two dimension space
  1. Workflows & Short cases
  2. Cell trajectory analysis

Spatial dissimilarity analysis for single-cell trajectories and supervised embeddings

In the previous vignette, we performed cell clustering using unsupervised methods. Typically, a PCA or integrated space (e.g., Harmony) is generated during this process, and we then calculate the cell weight matrix directly from the cell embedding space. This approach is convenient for running the pipeline, and provide the general result in all cell population. Furthermore, for specific signaling or biological pathways, it is often more informative to focus on a ‘supervised’ cell ordering. This vignette demonstrates how to perform spatial dissimilarity test in low-dimensional spaces, such as cell trajectory.

Prepare raw data

We used the human testis single-cell atlas dataset for this analysis. The full dataset is publicly available at GSE112013. For this vignette, we only used replicate 1.

## We first download the BAM files and GTF file
wget -c https://sra-pub-src-1.s3.amazonaws.com/SRR6860519/14515X1.bam.1 -O donor1_rep1.bam
wget -c https://sra-pub-src-1.s3.amazonaws.com/SRR6860520/14606X1.bam.1 -O donor1_rep2.bam

## Please Note: we use a different annotation file from previos vignette, because the BAM file use NCBI style chromosome name here.
wget https://ftp.ensembl.org/pub/release-114/gtf/homo_sapiens/Homo_sapiens.GRCh38.114.chr.gtf.gz

## Index the bam files for track plots
samtools index donor1_rep1.bam
samtools index donor1_rep2.bam

## Then we use PISA to generate feature counts
PISA anno -gtf Homo_sapiens.GRCh38.114.chr.gtf.gz -exon -psi -o d1_rep2_anno.bam donor1_rep2.bam
PISA anno -gtf Homo_sapiens.GRCh38.114.chr.gtf.gz -exon -psi -o d1_rep1_anno.bam donor1_rep1.bam

mkdir exp1
mkdir exp2
mkdir exon1
mkdir exon2
mkdir junction1
mkdir junction2
mkdir exclude1
mkdir exclude2

PISA count -tag CB -umi UB -anno-tag GN -outdir exp1 d1_rep1_anno.bam
PISA count -tag CB -umi UB -anno-tag GN -outdir exp2 d1_rep2_anno.bam
PISA count -tag CB -umi UB -anno-tag EX -outdir exon1 d1_rep1_anno.bam
PISA count -tag CB -umi UB -anno-tag EX -outdir exon2 d1_rep2_anno.bam
PISA count -tag CB -umi UB -anno-tag JC -outdir junction1 d1_rep1_anno.bam
PISA count -tag CB -umi UB -anno-tag JC -outdir junction2 d1_rep2_anno.bam

# Note: Although we generate the excluded reads here, we do not use them in this demonstration for the sake of simplicity. However, it is recommended to test them in real cases, so I’ve kept the data available for you to use in your own analyses.
PISA count -tag CB -umi UB -anno-tag ER -outdir exclude1 d1_rep1_anno.bam
PISA count -tag CB -umi UB -anno-tag ER -outdir exclude2 d1_rep2_anno.bam

Cell clustering and annotation

require(Yano)

exp1 <- ReadPISA("exp1/", prefix = "d1-rep1-")
exp2 <- ReadPISA("exp2/", prefix = "d1-rep2-")

exp <- mergeMatrix(exp1, exp2)

obj <- QuickRecipe(exp, min.features = 800, min.cells = 100, resolution=0.2)
Warning: Feature names cannot have underscores ('_'), replacing with dashes
('-')
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 4400
Number of edges: 134553

Running Louvain algorithm...
Maximum modularity in 10 random starts: 0.9639
Number of communities: 13
Elapsed time: 0 seconds
Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session
rm(exp1, exp2, exp)

obj$replicate <- gsub("d1-(.*)-.*-1", "\\1", colnames(obj))
DimPlot(obj, label = TRUE, label.size = 5)

FeaturePlot(obj, features = c("CD14", "VIM", "VWF", "ACTA2", "DLK1", "DAZL", "MAGEA4", "UTF1", "ID4", "FGFR3", "KIT", "DMRT1", "DMRTB1", "STRA8", "SYCP3", "SPO11", "MLH3", "ZPBP", "TNP1", "PRM2"), ncol=5, order=TRUE) & NoAxes() & NoLegend()
Warning: The `slot` argument of `FetchData()` is deprecated as of SeuratObject 5.0.0.
ℹ Please use the `layer` argument instead.
ℹ The deprecated feature was likely used in the Seurat package.
  Please report the issue at <https://github.com/satijalab/seurat/issues>.

new.cluster.ids <- c("Sperm", "Early primary S'cytes", "Leydig cells", "Round S'tids", "Differentiating S'gonia", "Elongated S'tids", "Endothelial cells", "SSCs", "Myoid cells", "Sertoli cells", "Macrophages", "Later primary S'cytes", "Sertoli-like")
names(new.cluster.ids) <- levels(obj)
obj <- RenameIdents(obj, new.cluster.ids)
DimPlot(obj, reduction = "umap", label = TRUE, pt.size = 0.5) + NoLegend()

Cell development trajetory construction

We now select germline cells from the dataset and construct the cell development trajectory using a principal curve. Although there are many well-documented methods for constructing linear trajectories, here we use a straightforward and perhaps the simplest approach purely for demonstration purposes.

sel <- DimSelector(obj)
obj.sel <- obj[,sel]

require(princurve)
Loading required package: princurve
emb <- Embeddings(obj.sel, 'umap')
emb0 <- as.data.frame(emb)
emb0$idents <- Idents(obj)[rownames(emb0)]
emb0 %>% group_by(idents) %>% summarize(x=median(umap_1),y=median(umap_2)) -> emb1
emb2 <- as.matrix(emb1[,c("x","y")])
rownames(emb2) <- emb1$idents
emb2 <- emb2[c("SSCs", "Differentiating S'gonia", "Early primary S'cytes", "Later primary S'cytes", "Round S'tids", "Elongated S'tids", "Sperm"),]
pc <- princurve::principal_curve(x = emb, start = emb2)
s <- as.data.frame(pc$s)
colnames(s) <- c("x", "y")
emb0 <- cbind(emb0, s)
emb0 <- emb0[pc$ord, ]

ggplot(emb0)+ geom_point(aes(umap_1, umap_2, color = idents)) + geom_path(aes(x,y), color = "blue", linewidth=1)+ geom_point(data = emb2, aes(x, y), size=2) 

Perform alternative splicing analysis along the cell trajetory

cells <- colnames(obj.sel)

exon1 <- ReadPISA("exon1/", prefix = "d1-rep1-", cells = cells)
exon2 <- ReadPISA("exon2/", prefix = "d1-rep2-", cells = cells)
exon <- mergeMatrix(exon1, exon2)

junction1 <- ReadPISA("junction1/", prefix = "d1-rep1-", cells = cells)
junction2 <- ReadPISA("junction2/", prefix = "d1-rep2-", cells = cells)
junction <- mergeMatrix(junction1, junction2)

obj.sel[['exon']] <- CreateAssayObject(exon, min.cells = 100)
obj.sel[['junction']] <- CreateAssayObject(junction, min.cells = 100)
rm(exon, exon1,exon2, junction, junction1, junction2)

order.cells <- rownames(emb0)

DefaultAssay(obj.sel) <- "exon"
obj.sel <- NormalizeData(obj.sel)
Warning: The `slot` argument of `SetAssayData()` is deprecated as of SeuratObject 5.0.0.
ℹ Please use the `layer` argument instead.
ℹ The deprecated feature was likely used in the Seurat package.
  Please report the issue at <https://github.com/satijalab/seurat/issues>.
Warning: The `slot` argument of `GetAssayData()` is deprecated as of SeuratObject 5.0.0.
ℹ Please use the `layer` argument instead.
ℹ The deprecated feature was likely used in the Seurat package.
  Please report the issue at <https://github.com/satijalab/seurat/issues>.
obj.sel <- ParseExonName(obj.sel)

When setup order cells, the weight matrix name will be named to “trajectory_wm”.

obj.sel <- RunAutoCorr(obj.sel, order.cells = order.cells, wm.name = "trajectory_wm")
Working on assay : exon
Run autocorrelation test for 72346 features.
Runtime : 9.748158 secs
58448 autocorrelated features.
obj.sel <- RunSDT(obj.sel, wm.name = "trajectory_wm", bind.name = "gene_name", bind.assay = "RNA")
Working on assay exon.
Working on binding assay RNA.
Use predefined weight matrix "trajectory_wm".
Processing 58448 features.
Processing 12417 binding features.
Retrieve binding data from assay RNA.
Use "data" layer for test features and binding features.
Using 7 threads.
Runtime : 6.944135 mins.
DefaultAssay(obj.sel) <- "junction"
obj.sel <- NormalizeData(obj.sel)
obj.sel <- ParseExonName(obj.sel)
Working on assay junction
obj.sel <- RunAutoCorr(obj.sel, order.cells = order.cells, wm.name = "trajectory_wm")
Working on assay : junction
Run autocorrelation test for 7583 features.
Runtime : 1.508061 secs
6466 autocorrelated features.
obj.sel <- RunSDT(obj.sel, wm.name = "trajectory_wm", bind.name = "gene_name", bind.assay = "RNA")
Working on assay junction.
Working on binding assay RNA.
Use predefined weight matrix "trajectory_wm".
Processing 6466 features.
Processing 3403 binding features.
Retrieve binding data from assay RNA.
Use "data" layer for test features and binding features.
Using 7 threads.
Runtime : 46.74309 secs.
FbtPlot(obj.sel, val = "gene_name.padj", assay = c("exon", "junction"), shape.by = "assay", color.by = "assay")

Meta(obj.sel, assay = "exon") %>% filter(gene_name.padj <1e-8) %>% knitr::kable()
chr start end gene_name strand moransi.pval moransi autocorr.variable gene_name.D gene_name.t gene_name.pval gene_name.padj
1:211398036-211399150/+/LINC00467 1 211398036 211399150 LINC00467 + 0.0000000 0.0816324 TRUE 0.6269152 7.875978 0 0
1:211398036-211399220/+/LINC00467 1 211398036 211399220 LINC00467 + 0.0000000 0.0797745 TRUE 0.6330680 8.161452 0 0
1:211398036-211399145/+/LINC00467 1 211398036 211399145 LINC00467 + 0.0000000 0.0804612 TRUE 0.6270202 7.833554 0 0
10:84371014-84372469/+/CCSER2 10 84371014 84372469 CCSER2 + 0.0000466 0.0459559 TRUE 0.6213102 8.800596 0 0
3:52414496-52414844/+/PHF7 3 52414496 52414844 PHF7 + 0.0000000 0.1310723 TRUE 0.6558172 7.956239 0 0
7:4984910-4985075/+/RBAK-RBAKDN 7 4984910 4985075 RBAK-RBAKDN + 0.0000000 0.1330786 TRUE 0.6593649 8.421093 0 0
7:74196642-74197050/+/EIF4H 7 74196642 74197050 EIF4H + 0.0000000 0.1397753 TRUE 0.7475285 14.535688 0 0
7:96645724-96645936/-/SEM1 7 96645724 96645936 SEM1 - 0.0000145 0.0494136 TRUE 0.6470772 10.301505 0 0
7:96645390-96645932/-/SEM1 7 96645390 96645932 SEM1 - 0.0000000 0.0720516 TRUE 0.6593465 10.585368 0 0
7:96645391-96645932/-/SEM1 7 96645391 96645932 SEM1 - 0.0000000 0.0748368 TRUE 0.6561704 10.424636 0 0
7:96645588-96645936/-/SEM1 7 96645588 96645936 SEM1 - 0.0000000 0.0796474 TRUE 0.6639002 10.867209 0 0
7:96645724-96645932/-/SEM1 7 96645724 96645932 SEM1 - 0.0002720 0.0407584 TRUE 0.6266415 9.058837 0 0
7:127591409-127591951/+/FSCN3 7 127591409 127591951 FSCN3 + 0.0000000 0.2950965 TRUE 0.8446530 13.659938 0 0
9:128500067-128501292/+/ODF2 9 128500067 128501292 ODF2 + 0.0023921 0.0331474 TRUE 0.6216168 8.123203 0 0
9:128500067-128501260/+/ODF2 9 128500067 128501260 ODF2 + 0.0023921 0.0331474 TRUE 0.6216168 8.123203 0 0
9:128500067-128500960/+/ODF2 9 128500067 128500960 ODF2 + 0.0020902 0.0336515 TRUE 0.6187131 7.971053 0 0
Meta(obj.sel, assay = "junction") %>% filter(gene_name.padj <1e-8) %>% knitr::kable()
chr start end gene_name strand moransi.pval moransi autocorr.variable gene_name.D gene_name.t gene_name.pval gene_name.padj
3:52414048-52414496/+/PHF7 3 52414048 52414496 PHF7 + 0 0.133177 TRUE 0.6908519 8.943853 0 0
FeaturePlot(obj.sel, features = c("3:52413996-52414048/+/PHF7", "3:52414048-52414496/+/PHF7", "PHF7"), ncol=3, order=TRUE)
Warning: Could not find 3:52413996-52414048/+/PHF7 in the default search
locations, found in 'exon' assay instead
Warning: Could not find PHF7 in the default search locations, found in 'RNA'
assay instead

Perform alternative splicing analysis for all cell space

To use all cell space, we do not need to specify a name for the weight matrix here, because it will default to using the PCA reduction map to generate the weight matrix. The resulting matrix will be named “reduction_wm”.

DefaultAssay(obj.sel) <- "exon"
obj.sel <- RunAutoCorr(obj.sel)
Working on assay : exon
Run autocorrelation test for 72346 features.
Runtime : 20.01246 secs
70882 autocorrelated features.

Now we have two weight matrices. If no name is specified in RunSDT(), the first matrix will always be used by default, so we need to explicitly specify “pca_wm” to avoid misusing the weight matrix here.

grep("_wm", names(obj.sel), value=TRUE)
[1] "trajectory_wm" "pca_wm"       
obj.sel <- RunSDT(obj.sel, bind.name = "gene_name", bind.assay = "RNA", wm.name = "pca_wm", prefix="all")
Working on assay exon.
Working on binding assay RNA.
Use predefined weight matrix "pca_wm".
Processing 70882 features.
Processing 14088 binding features.
Retrieve binding data from assay RNA.
Use "data" layer for test features and binding features.
Using 7 threads.
Runtime : 11.06989 mins.
DefaultAssay(obj.sel) <- "junction"
obj.sel <- RunAutoCorr(obj.sel)
Working on assay : junction
Run autocorrelation test for 7583 features.
Runtime : 2.407756 secs
7545 autocorrelated features.
obj.sel <- RunSDT(obj.sel, bind.name = "gene_name", bind.assay = "RNA", wm.name = "pca_wm", prefix="all")
Working on assay junction.
Working on binding assay RNA.
Use predefined weight matrix "pca_wm".
Processing 7545 features.
Processing 3800 binding features.
Retrieve binding data from assay RNA.
Use "data" layer for test features and binding features.
Using 7 threads.
Runtime : 1.231873 mins.

By comparing the use of cell trajectory ordering with the PCA space, we found that using the PCA approach can detect more alternatively expressed genes. This is because forcing all cells into a 1D trajectory ordering preserves only the rank of each cell, but loses much of the neighborhood information about cells in the broader context. Since cell trajectories are typically constructed based on cell–cell distances in high-dimensional space, regardless of the algorithm used to build the trajectory, I personally recommend using the cell space directly rather than relying solely on 1D cell ordering, unless you have specific reasons or detailed knowledge.

FbtPlot(obj.sel, val = "all.padj", assay = c("exon", "junction"), shape.by = "assay", color.by = "assay")

Meta(obj.sel, assay = "exon") %>% filter(all.padj <1e-20) %>% knitr::kable()
chr start end gene_name strand moransi.pval moransi autocorr.variable gene_name.D gene_name.t gene_name.pval gene_name.padj all.D all.t all.pval all.padj
1:25901491-25901682/-/STMN1 1 25901491 25901682 STMN1 - 0 0.5365729 TRUE 0.5138080 -0.6083112 0.7278129 1.0000000 0.5050367 15.15135 0 0
1:25901407-25901682/-/STMN1 1 25901407 25901682 STMN1 - 0 0.4763702 TRUE 0.4974101 -1.2050008 0.8844622 1.0000000 0.4729623 14.75713 0 0
1:244854973-244855044/-/HNRNPU 1 244854973 244855044 HNRNPU - 0 0.1066201 TRUE 0.4263039 -5.3421718 0.9999997 1.0000000 0.4608809 15.76613 0 0
1:244855424-244855608/-/HNRNPU 1 244855424 244855608 HNRNPU - 0 0.1499542 TRUE 0.4614148 -2.8587774 0.9974071 1.0000000 0.5291103 17.48624 0 0
1:244855424-244855597/-/HNRNPU 1 244855424 244855597 HNRNPU - 0 0.1448451 TRUE 0.4657325 -2.5138832 0.9932233 1.0000000 0.5180136 17.42260 0 0
1:244855424-244856096/-/HNRNPU 1 244855424 244856096 HNRNPU - 0 0.1290502 TRUE 0.4635919 -2.2568029 0.9868908 1.0000000 0.4924957 15.99049 0 0
10:35179134-35179769/+/CREM 10 35179134 35179769 CREM + 0 0.2795176 TRUE 0.6571513 6.5368376 0.0000000 0.0000038 0.6140893 20.52634 0 0
10:35179134-35179847/+/CREM 10 35179134 35179847 CREM + 0 0.2965156 TRUE 0.6671544 6.6315722 0.0000000 0.0000026 0.6322654 20.79021 0 0
10:35179134-35179843/+/CREM 10 35179134 35179843 CREM + 0 0.2963525 TRUE 0.6664554 6.6308385 0.0000000 0.0000026 0.6310836 20.75944 0 0
10:84371014-84372469/+/CCSER2 10 84371014 84372469 CCSER2 + 0 0.0694474 TRUE 0.6213102 8.8005958 0.0000000 0.0000000 0.4298537 14.91362 0 0
13:27622875-27623449/+/POLR1D 13 27622875 27623449 POLR1D + 0 0.1353060 TRUE 0.5669509 3.8114291 0.0001201 0.1754453 0.5305367 17.51706 0 0
13:27622841-27623452/+/POLR1D 13 27622841 27623452 POLR1D + 0 0.1352448 TRUE 0.5677941 3.8753581 0.0000957 0.1434632 0.5308687 17.44830 0 0
15:80131180-80131293/+/ZFAND6 15 80131180 80131293 ZFAND6 + 0 0.1415461 TRUE 0.5294575 2.4872262 0.0072731 1.0000000 0.4809282 15.51713 0 0
15:80129600-80131293/+/ZFAND6 15 80129600 80131293 ZFAND6 + 0 0.1130790 TRUE 0.5055989 1.6040236 0.0559472 1.0000000 0.4281787 16.85202 0 0
15:80137480-80137678/+/ZFAND6 15 80137480 80137678 ZFAND6 + 0 0.1506118 TRUE 0.5313132 2.8853403 0.0023996 1.0000000 0.4767508 15.08482 0 0
15:80137480-80137811/+/ZFAND6 15 80137480 80137811 ZFAND6 + 0 0.1501969 TRUE 0.5089926 1.1308509 0.1304250 1.0000000 0.4761354 14.67508 0 0
16:1883196-1883296/-/MEIOB 16 1883196 1883296 MEIOB - 0 0.1565362 TRUE 0.5540380 4.4186253 0.0000127 0.0218663 0.4252987 14.58586 0 0
16:1884097-1884294/-/MEIOB 16 1884097 1884294 MEIOB - 0 0.1760430 TRUE 0.5573449 4.7002546 0.0000042 0.0074443 0.4450364 16.37583 0 0
17:1345258-1345499/-/YWHAE 17 1345258 1345499 YWHAE - 0 0.0589847 TRUE 0.6035363 7.2019361 0.0000000 0.0000002 0.4505080 14.75010 0 0
17:75778810-75778963/-/H3-3B 17 75778810 75778963 H3-3B - 0 0.6555098 TRUE 0.4775407 -2.4017911 0.9909098 1.0000000 0.4657620 14.05098 0 0
17:75778810-75779184/-/H3-3B 17 75778810 75779184 H3-3B - 0 0.6255431 TRUE 0.4735932 -2.5603642 0.9940165 1.0000000 0.4537018 13.30349 0 0
17:82615718-82616025/-/WDR45B 17 82615718 82616025 WDR45B - 0 0.1523974 TRUE 0.4596537 -1.4446807 0.9241474 1.0000000 0.4633808 13.58817 0 0
18:49482152-49482410/-/RPL17-C18orf32 18 49482152 49482410 RPL17-C18orf32 - 0 0.1626891 TRUE 0.5976995 2.9204659 0.0021642 1.0000000 0.5137507 16.14814 0 0
18:49482176-49482410/-/RPL17-C18orf32 18 49482176 49482410 RPL17-C18orf32 - 0 0.1629240 TRUE 0.5960931 2.8858653 0.0023959 1.0000000 0.5137443 16.25708 0 0
18:49483584-49483771/-/RPL17-C18orf32 18 49483584 49483771 RPL17-C18orf32 - 0 0.2033912 TRUE 0.5729379 2.4115739 0.0088634 1.0000000 0.4885709 14.70143 0 0
2:61141592-61143945/-/C2orf74-AS1 2 61141592 61143945 C2orf74-AS1 - 0 0.1129809 TRUE 0.6125111 6.8532884 0.0000000 0.0000010 0.4952707 15.55161 0 0
20:35774488-35775058/+/PHF20 20 35774488 35775058 PHF20 + 0 0.0693745 TRUE 0.5246998 3.7053294 0.0001739 0.2479685 0.4130324 15.00582 0 0
3:45007420-45007575/+/EXOSC7 3 45007420 45007575 EXOSC7 + 0 0.1241484 TRUE 0.4985161 1.2424687 0.1084993 1.0000000 0.4331207 14.18691 0 0
3:45011235-45011393/+/EXOSC7 3 45011235 45011393 EXOSC7 + 0 0.1563919 TRUE 0.5202833 1.9727128 0.0256586 1.0000000 0.4716648 14.73913 0 0
3:45011235-45011471/+/EXOSC7 3 45011235 45011471 EXOSC7 + 0 0.1772089 TRUE 0.5317377 2.4737189 0.0075368 1.0000000 0.5021152 16.29195 0 0
3:45011235-45011470/+/EXOSC7 3 45011235 45011470 EXOSC7 + 0 0.1777901 TRUE 0.5333368 2.5716212 0.0058044 1.0000000 0.5041374 16.46894 0 0
3:52413996-52414048/+/PHF7 3 52413996 52414048 PHF7 + 0 0.1410946 TRUE 0.6335731 6.2777400 0.0000000 0.0000122 0.4988230 14.70155 0 0
3:52414496-52414844/+/PHF7 3 52414496 52414844 PHF7 + 0 0.1455321 TRUE 0.6558172 7.9562386 0.0000000 0.0000000 0.5357904 17.07624 0 0
5:28810519-28810897/+/LINC02109 5 28810519 28810897 LINC02109 + 0 0.1095984 TRUE 0.5522362 4.8399018 0.0000024 0.0045116 0.4643932 14.33751 0 0
7:2234198-2235564/-/ENSG00000286192 7 2234198 2235564 ENSG00000286192 - 0 0.1514527 TRUE 0.5234723 2.4383464 0.0082683 1.0000000 0.4600698 15.10529 0 0
7:4984910-4985075/+/RBAK-RBAKDN 7 4984910 4985075 RBAK-RBAKDN + 0 0.1729539 TRUE 0.6593649 8.4210932 0.0000000 0.0000000 0.5476979 19.67029 0 0
7:5779985-5780292/-/RNF216 7 5779985 5780292 RNF216 - 0 0.0640424 TRUE 0.5246596 3.5873547 0.0002606 0.3384448 0.4067393 14.01626 0 0
7:74196642-74197050/+/EIF4H 7 74196642 74197050 EIF4H + 0 0.1499599 TRUE 0.7475285 14.5356884 0.0000000 0.0000000 0.6529201 26.76167 0 0
7:92244002-92244149/-/ENSG00000285953 7 92244002 92244149 ENSG00000285953 - 0 0.2155620 TRUE 0.5857502 4.7302274 0.0000037 0.0068079 0.5361302 16.49559 0 0
7:92244902-92245134/-/ENSG00000285953 7 92244902 92245134 ENSG00000285953 - 0 0.1772488 TRUE 0.5634043 3.8898895 0.0000909 0.1398010 0.4787132 13.47767 0 0
7:96645724-96645936/-/SEM1 7 96645724 96645936 SEM1 - 0 0.0720968 TRUE 0.6470772 10.3015054 0.0000000 0.0000000 0.5103109 17.74491 0 0
7:96645390-96645932/-/SEM1 7 96645390 96645932 SEM1 - 0 0.0827188 TRUE 0.6593465 10.5853680 0.0000000 0.0000000 0.5265752 18.97728 0 0
7:96645391-96645932/-/SEM1 7 96645391 96645932 SEM1 - 0 0.0817727 TRUE 0.6561704 10.4246355 0.0000000 0.0000000 0.5217723 18.72397 0 0
7:96645588-96645936/-/SEM1 7 96645588 96645936 SEM1 - 0 0.0857761 TRUE 0.6639002 10.8672094 0.0000000 0.0000000 0.5333738 19.05890 0 0
7:96645724-96645932/-/SEM1 7 96645724 96645932 SEM1 - 0 0.0615317 TRUE 0.6266415 9.0588370 0.0000000 0.0000000 0.4757020 16.33801 0 0
7:127591409-127591951/+/FSCN3 7 127591409 127591951 FSCN3 + 0 0.3332084 TRUE 0.8446530 13.6599377 0.0000000 0.0000000 0.8517817 28.91322 0 0
8:101198726-101199239/-/ZNF706 8 101198726 101199239 ZNF706 - 0 0.1575763 TRUE 0.5652140 2.8498588 0.0026610 1.0000000 0.5134903 15.38145 0 0
9:79571773-79572835/+/TLE4 9 79571773 79572835 TLE4 + 0 0.3314505 TRUE 0.4805687 -1.5151168 0.9335354 1.0000000 0.4487088 14.19573 0 0
9:79571941-79572835/+/TLE4 9 79571941 79572835 TLE4 + 0 0.3425540 TRUE 0.5324464 1.8857681 0.0311289 1.0000000 0.4993045 16.19725 0 0
9:79571965-79572835/+/TLE4 9 79571965 79572835 TLE4 + 0 0.3129983 TRUE 0.5223707 1.5287304 0.0647601 1.0000000 0.4732105 15.54075 0 0
9:124858458-124858537/-/RPL35 9 124858458 124858537 RPL35 - 0 0.3502975 TRUE 0.5369936 2.1858733 0.0155911 1.0000000 0.4623329 13.77950 0 0
9:128500067-128501292/+/ODF2 9 128500067 128501292 ODF2 + 0 0.0466525 TRUE 0.6216168 8.1232027 0.0000000 0.0000000 0.4261405 13.24331 0 0
9:128500067-128501260/+/ODF2 9 128500067 128501260 ODF2 + 0 0.0466525 TRUE 0.6216168 8.1232027 0.0000000 0.0000000 0.4261405 13.24331 0 0
Meta(obj.sel, assay = "junction") %>% filter(all.padj <1e-20) %>% knitr::kable()
chr start end gene_name strand moransi.pval moransi autocorr.variable gene_name.D gene_name.t gene_name.pval gene_name.padj all.D all.t all.pval all.padj
1:244855044-244855424/-/HNRNPU 1 244855044 244855424 HNRNPU - 0 0.1372512 TRUE 0.4599386 -3.751276 0.9998517 1.0000000 0.4784554 15.26365 0 0
15:80131293-80137480/+/ZFAND6 15 80131293 80137480 ZFAND6 + 0 0.1457399 TRUE 0.5318035 3.241519 0.0008102 0.7483565 0.4457714 15.09220 0 0
3:45007575-45011235/+/EXOSC7 3 45007575 45011235 EXOSC7 + 0 0.1611300 TRUE 0.5440651 3.856204 0.0001025 0.1325229 0.4670253 13.95661 0 0
3:52412920-52413996/+/PHF7 3 52412920 52413996 PHF7 + 0 0.1512534 TRUE 0.6429861 7.607239 0.0000000 0.0000000 0.5163198 16.13420 0 0
3:52414048-52414496/+/PHF7 3 52414048 52414496 PHF7 + 0 0.1885870 TRUE 0.6908519 8.943853 0.0000000 0.0000000 0.5930035 18.84522 0 0
7:4983773-4984910/+/RBAK-RBAKDN 7 4983773 4984910 RBAK-RBAKDN + 0 0.1228286 TRUE 0.6280600 7.217475 0.0000000 0.0000001 0.4648469 14.47706 0 0
9:124858067-124858458/-/RPL35 9 124858067 124858458 RPL35 - 0 0.3729386 TRUE 0.5382251 1.941759 0.0275044 1.0000000 0.4623314 13.51239 0 0

Perform alternative splicing on a supervised two dimension space

We can also perform the analysis using a supervised two-dimensional cell space. As an example, we utilize Seurat’s built-in cell cycle scoring scheme for demonstration purposes.

DefaultAssay(obj.sel) <- "RNA"

s.genes <- cc.genes$s.genes
g2m.genes <- cc.genes$g2m.genes

obj.sel[['Group']] <- Idents(obj.sel)
obj.sel <- CellCycleScoring(obj.sel, s.features = s.genes, g2m.features = g2m.genes, set.ident = TRUE)
Warning: The following features are not present in the object: MLF1IP, E2F8,
not searching for symbol synonyms
Warning: The following features are not present in the object: FAM64A, AURKB,
HN1, not searching for symbol synonyms
df <- obj.sel[[]]

p1 <- ggplot(df) + geom_point(aes(x=S.Score, y = G2M.Score, color = Phase))
p2 <- ggplot(df) + geom_point(aes(x=S.Score, y = G2M.Score, color = Group))
p1 + p2

# Create cell cycle coordinate space with the module scores
ccc <- as.matrix(df[, c("S.Score", "G2M.Score")])
head(ccc)
                              S.Score   G2M.Score
d1-rep1-GACAGAGTCATACGGT-1 0.37908301  0.27462393
d1-rep1-GCTCCTATCATCGCTC-1 0.40969930  0.24794880
d1-rep1-GTAGTCATCCGAACGC-1 0.06002289  0.09029104
d1-rep1-TTTCCTCCACTATCTT-1 0.37392161  0.02509706
d1-rep1-TACGGTAGTAAGGGCT-1 0.51073399  0.28205031
d1-rep1-AGGGTGAAGATATGCA-1 0.24597679 -0.02650419
colnames(ccc) <- c("CCC_1", "CCC_2")

obj.sel[['ccc']] <- CreateDimReducObject(embeddings = ccc, assay = "RNA")

DefaultAssay(obj.sel) <- "exon"
obj.sel <- RunAutoCorr(obj.sel, reduction = "ccc", wm.name = "ccc_wm", dims=1:2)
Working on assay : exon
Run autocorrelation test for 72346 features.
Runtime : 15.69235 secs
60820 autocorrelated features.
obj.sel <- RunSDT(obj.sel, wm.name = "ccc_wm", bind.name = "gene_name", bind.assay = "RNA", prefix = "ccc")
Working on assay exon.
Working on binding assay RNA.
Use predefined weight matrix "ccc_wm".
Processing 60820 features.
Processing 12689 binding features.
Retrieve binding data from assay RNA.
Use "data" layer for test features and binding features.
Using 7 threads.
Runtime : 8.745614 mins.
DefaultAssay(obj.sel) <- "junction"
obj.sel <- RunAutoCorr(obj.sel, reduction = "ccc", wm.name = "ccc_wm", dims=1:2)
Working on assay : junction
Run autocorrelation test for 7583 features.
Runtime : 1.545865 secs
6884 autocorrelated features.
obj.sel <- RunSDT(obj.sel, wm.name = "ccc_wm", bind.name = "gene_name", bind.assay = "RNA", prefix = "ccc")
Working on assay junction.
Working on binding assay RNA.
Use predefined weight matrix "ccc_wm".
Processing 6884 features.
Processing 3549 binding features.
Retrieve binding data from assay RNA.
Use "data" layer for test features and binding features.
Using 7 threads.
Runtime : 53.14298 secs.
FbtPlot(obj.sel, val = "ccc.padj", assay = c("exon", "junction"), shape.by = "assay", color.by = "assay")

Meta(obj.sel, assay = "exon") %>% filter(ccc.padj<1e-30) %>% knitr::kable()
chr start end gene_name strand moransi.pval moransi autocorr.variable gene_name.D gene_name.t gene_name.pval gene_name.padj all.D all.t all.pval all.padj ccc.D ccc.t ccc.pval ccc.padj
10:35179134-35179769/+/CREM 10 35179134 35179769 CREM + 0 0.1660249 TRUE 0.6571513 6.536838 0.0000000 3.8e-06 0.6140893 20.52634 0 0 0.6498339 21.80302 0 0
10:35179134-35179847/+/CREM 10 35179134 35179847 CREM + 0 0.1783511 TRUE 0.6671544 6.631572 0.0000000 2.6e-06 0.6322654 20.79021 0 0 0.6774472 23.44110 0 0
10:35179134-35179843/+/CREM 10 35179134 35179843 CREM + 0 0.1765705 TRUE 0.6664554 6.630839 0.0000000 2.6e-06 0.6310836 20.75944 0 0 0.6755427 23.55242 0 0
3:45011235-45011471/+/EXOSC7 3 45011235 45011471 EXOSC7 + 0 0.1252058 TRUE 0.5317377 2.473719 0.0075368 1.0e+00 0.5021152 16.29195 0 0 0.5183262 19.60756 0 0
3:45011235-45011470/+/EXOSC7 3 45011235 45011470 EXOSC7 + 0 0.1252657 TRUE 0.5333368 2.571621 0.0058044 1.0e+00 0.5041374 16.46894 0 0 0.5193649 19.73390 0 0
7:74196642-74197050/+/EIF4H 7 74196642 74197050 EIF4H + 0 0.0903061 TRUE 0.7475285 14.535688 0.0000000 0.0e+00 0.6529201 26.76167 0 0 0.6038284 19.55842 0 0
7:127591409-127591951/+/FSCN3 7 127591409 127591951 FSCN3 + 0 0.1644589 TRUE 0.8446530 13.659938 0.0000000 0.0e+00 0.8517817 28.91322 0 0 0.8107162 23.39077 0 0
FeaturePlot(obj.sel, reduction =  "ccc", features = c("7:74196642-74197050/+/EIF4H", "EIF4H"), order = TRUE)
Warning: Could not find 7:74196642-74197050/+/EIF4H in the default search
locations, found in 'exon' assay instead
Warning: Could not find EIF4H in the default search locations, found in 'RNA'
assay instead

gtf <- gtf2db("./Homo_sapiens.GRCh38.114.chr.gtf.gz")
[2025-06-19 18:42:26] GTF loading..
[2025-06-19 18:42:52] Load 78686 genes.
bamfiles <- list(rep1 = "./donor1_rep1.bam", rep2 = "./donor1_rep2.bam")
cell.list <- split(obj.sel$Phase, obj.sel$replicate)

cell.list1 <- lapply(cell.list, function(x) {
  nm <- names(x)
  names(x) <- gsub(".*rep[12]-(.*)","\\1",nm)
  x
})
str(cell.list1)
List of 2
 $ rep1: Named chr [1:954] "S" "S" "G2M" "S" ...
  ..- attr(*, "names")= chr [1:954] "GACAGAGTCATACGGT-1" "GCTCCTATCATCGCTC-1" "GTAGTCATCCGAACGC-1" "TTTCCTCCACTATCTT-1" ...
 $ rep2: Named chr [1:2023] "S" "S" "S" "S" ...
  ..- attr(*, "names")= chr [1:2023] "GGCTCGACATGTCTCC-1" "CCTATTAGTCCAGTAT-1" "GTAACGTCACTTACGA-1" "AAGGCAGCACGTCTCT-1" ...
TrackPlot(bamfile = bamfiles, cell.group = cell.list1, gtf = gtf, gene = "EIF4H", highlights = c(74196642,74197050), junc=TRUE)

Command(obj.sel)
 [1] "NormalizeData.RNA"            "FindVariableFeatures.RNA"    
 [3] "ScaleData.RNA"                "RunPCA.RNA"                  
 [5] "FindNeighbors.RNA.pca"        "FindClusters"                
 [7] "RunUMAP.RNA.pca"              "NormalizeData.exon"          
 [9] "ParseExonName.exon"           "NormalizeData.junction"      
[11] "ParseExonName.junction"       "RunAutoCorr.exon.pca"        
[13] "RunAutoCorr.junction.pca"     "RunAutoCorr.exon.ccc"        
[15] "SetAutoCorrFeatures.exon"     "RunSDT.exon"                 
[17] "RunAutoCorr.junction.ccc"     "SetAutoCorrFeatures.junction"
[19] "RunSDT.junction"             
sessionInfo()
R version 4.5.0 (2025-04-11)
Platform: aarch64-apple-darwin24.4.0
Running under: macOS Sequoia 15.5

Matrix products: default
BLAS:   /opt/homebrew/Cellar/openblas/0.3.29/lib/libopenblasp-r0.3.29.dylib 
LAPACK: /opt/homebrew/Cellar/r/4.5.0/lib/R/lib/libRlapack.dylib;  LAPACK version 3.12.1

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Asia/Shanghai
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] princurve_2.1.6    future_1.49.0      dplyr_1.1.4        Seurat_5.3.0      
[5] SeuratObject_5.1.0 sp_2.2-0           ggplot2_3.5.2      Yano_1.2          

loaded via a namespace (and not attached):
  [1] deldir_2.0-4           pbapply_1.7-2          gridExtra_2.3         
  [4] rlang_1.1.6            magrittr_2.0.3         RcppAnnoy_0.0.22      
  [7] spatstat.geom_3.3-6    matrixStats_1.5.0      ggridges_0.5.6        
 [10] compiler_4.5.0         png_0.1-8              vctrs_0.6.5           
 [13] reshape2_1.4.4         stringr_1.5.1          pkgconfig_2.0.3       
 [16] fastmap_1.2.0          labeling_0.4.3         promises_1.3.2        
 [19] rmarkdown_2.29         purrr_1.0.4            xfun_0.52             
 [22] jsonlite_2.0.0         goftest_1.2-3          later_1.4.2           
 [25] spatstat.utils_3.1-4   irlba_2.3.5.1          parallel_4.5.0        
 [28] cluster_2.1.8.1        R6_2.6.1               ica_1.0-3             
 [31] stringi_1.8.7          RColorBrewer_1.1-3     spatstat.data_3.1-6   
 [34] reticulate_1.42.0      spatstat.univar_3.1-3  parallelly_1.44.0     
 [37] lmtest_0.9-40          scattermore_1.2        Rcpp_1.0.14           
 [40] knitr_1.50             tensor_1.5             future.apply_1.11.3   
 [43] zoo_1.8-14             R.utils_2.13.0         sctransform_0.4.2     
 [46] httpuv_1.6.16          Matrix_1.7-3           splines_4.5.0         
 [49] igraph_2.1.4           tidyselect_1.2.1       viridis_0.6.5         
 [52] abind_1.4-8            yaml_2.3.10            spatstat.random_3.3-3 
 [55] codetools_0.2-20       miniUI_0.1.2           spatstat.explore_3.4-2
 [58] listenv_0.9.1          lattice_0.22-6         tibble_3.2.1          
 [61] plyr_1.8.9             withr_3.0.2            shiny_1.10.0          
 [64] ROCR_1.0-11            evaluate_1.0.3         Rtsne_0.17            
 [67] fastDummies_1.7.5      survival_3.8-3         polyclip_1.10-7       
 [70] fitdistrplus_1.2-2     pillar_1.10.2          KernSmooth_2.23-26    
 [73] plotly_4.10.4          generics_0.1.4         RcppHNSW_0.6.0        
 [76] scales_1.4.0           gtools_3.9.5           globals_0.18.0        
 [79] xtable_1.8-4           glue_1.8.0             lazyeval_0.2.2        
 [82] tools_4.5.0            data.table_1.17.2      RSpectra_0.16-2       
 [85] RANN_2.6.2             dotCall64_1.2          cowplot_1.1.3         
 [88] grid_4.5.0             tidyr_1.3.1            nlme_3.1-168          
 [91] patchwork_1.3.0        cli_3.6.5              spatstat.sparse_3.1-0 
 [94] spam_2.11-1            viridisLite_0.4.2      uwot_0.2.3            
 [97] gtable_0.3.6           R.methodsS3_1.8.2      digest_0.6.37         
[100] progressr_0.15.1       ggrepel_0.9.6          htmlwidgets_1.6.4     
[103] farver_2.1.2           htmltools_0.5.8.1      R.oo_1.27.1           
[106] lifecycle_1.0.4        httr_1.4.7             mime_0.13             
[109] MASS_7.3-65           
Back to top
Analysis multiple Visium samples