
This vignette introduces the coFAST workflow for the analysis of the NSCLC CosMx spatial transcriptomics dataset. In this vignette, the workflow of coFAST consists of three steps

Downstream analysis

Clustering and assess the cluster significance

First, we show how to use the spot embedding for spatial domain segmentation. Users can set the number of clusters to a fixed value, e.g., K=10, or set K=NULL to use the default resolution. The cluster results are stored in the cofast.cluster column of the meta.data slot (a data.frame). The Idents of the Seurat object are also set to cofast.cluster.

CosMx_subset <- AddCluster(CosMx_subset)
print(head(CosMx_subset))
table(Idents(CosMx_subset))
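To make the fixed-K option concrete, the following is a minimal sketch on simulated data. It does not reproduce AddCluster: base R kmeans is swapped in as a stand-in, and toy_embed, K and toy_cluster are hypothetical names used only to show what fixing the number of clusters means and how the resulting labels are tabulated.

```r
set.seed(1)
# Simulated 2-D embedding: three well-separated groups of 50 "cells" (toy data)
centers <- rbind(c(0, 0), c(5, 5), c(-5, 5))
toy_embed <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(50, centers[i, 1], 0.5), rnorm(50, centers[i, 2], 0.5))
}))

# Fixed number of clusters; kmeans is only a stand-in, not the method behind AddCluster
K <- 3
fit <- kmeans(toy_embed, centers = K, nstart = 10)
toy_cluster <- factor(fit$cluster)

# Cluster sizes, analogous to table(Idents(CosMx_subset))
cluster_sizes <- table(toy_cluster)
print(cluster_sizes)
```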

We visualize the clusters on spatial coordinates.

CosMx_subset <- Addcoord2embed(CosMx_subset, coord.name = c("x", "y"))
print(CosMx_subset)
cols_cluster <- PRECAST::chooseColors(palettes_name = 'Classic 20', n_colors = 20, plot_colors = TRUE)
# DimPlot(CosMx_subset, reduction='Spatial', cols=cols_cluster, pt.size = 2)
DimPlot(CosMx_subset, reduction = 'Spatial', group.by = c("cofast.cluster", 'cell_type'), cols = cols_cluster, pt.size = 1.5)
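Besides comparing the two panels by eye, a cross-tabulation shows which cell type dominates each cluster. The sketch below uses a made-up metadata table (meta is a hypothetical stand-in for the meta.data of the Seurat object; the values are not from the CosMx data).

```r
# Toy metadata: cluster labels and annotated cell types for 8 cells (made-up values)
meta <- data.frame(
  cofast.cluster = factor(c(1, 1, 2, 2, 8, 8, 8, 2)),
  cell_type = c("T cell", "T cell", "fibroblast", "fibroblast",
                "tumor", "tumor", "tumor", "tumor")
)

# Rows: clusters; columns: cell types
tab <- table(meta$cofast.cluster, meta$cell_type)
print(tab)

# Most frequent cell type within each cluster
dominant <- colnames(tab)[apply(tab, 1, which.max)]
names(dominant) <- rownames(tab)
print(dominant)
```

With the real object, the same table could be built from the cofast.cluster and cell_type columns of the metadata.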

Next, we assess the aggregation scores of each cluster in the spatial space and in the embedding space. We calculate the spatial aggregation scores and find that cluster 8 (corresponding to tumor) has the highest spatial aggregation score, which is consistent with the spatial clustering map.

dat.spa.score <- AggregationScore(CosMx_subset, reduction.name = 'Spatial')
print(dat.spa.score)

Next, we calculate the embedding aggregation scores and find that cluster 8 also has the highest embedding aggregation score.

dat.embd.score <- AggregationScore(CosMx_subset, reduction.name = 'cofast')
print(dat.embd.score)
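For intuition about what an aggregation-type score captures, the sketch below computes a simple neighbourhood purity on toy coordinates. This is a hypothetical proxy and is not the formula used by AggregationScore: for each cell we take the fraction of its k nearest neighbours that share its label, then average within each cluster. The function knn_purity and all data are assumptions made for illustration.

```r
# Toy coordinates: cluster "a" is a compact 3 x 3 grid, cluster "b" is a wide ring around it
a <- expand.grid(x = c(-0.1, 0, 0.1), y = c(-0.1, 0, 0.1))
b <- expand.grid(x = c(-3, 0, 3), y = c(-3, 0, 3))
b <- b[!(b$x == 0 & b$y == 0), ]          # drop the centre, leaving 8 ring points
coords <- rbind(a, b)
labels <- rep(c("a", "b"), c(nrow(a), nrow(b)))

# Hypothetical proxy: mean fraction of the k nearest neighbours sharing a cell's label
knn_purity <- function(coords, labels, k = 3) {
  d <- as.matrix(dist(coords))
  diag(d) <- Inf                           # exclude the cell itself
  same <- sapply(seq_len(nrow(d)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]
    mean(labels[nn] == labels[i])
  })
  tapply(same, labels, mean)
}

purity <- knn_purity(coords, labels)
print(purity)
```

The compact cluster scores higher than the scattered one, which is the qualitative behaviour we expect from a cluster that is well aggregated in a given space.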

In the following, we show how to find the signature genes based on the coembeddings. First, we calculate the distance matrix.

CosMx_subset <- pdistance(CosMx_subset, reduction = "cofast")
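Conceptually, the distance matrix relates every gene to every cell in the shared coembedding space. The following is a minimal sketch on made-up coordinates; the Euclidean metric and the names cell_embed, gene_embed and gene_cell_dist are assumptions for illustration and may differ from what pdistance computes internally.

```r
# Toy coembedding: 4 cells and 2 genes in a shared 2-D space (made-up coordinates)
cell_embed <- rbind(c1 = c(0, 0), c2 = c(0, 1), c3 = c(4, 4), c4 = c(5, 4))
gene_embed <- rbind(geneA = c(0, 0.5), geneB = c(4.5, 4))

# Euclidean distance from every gene (rows) to every cell (columns);
# the choice of metric is an assumption for this illustration
gene_cell_dist <- sapply(seq_len(nrow(cell_embed)), function(j) {
  sqrt(rowSums(sweep(gene_embed, 2, cell_embed[j, ])^2))
})
colnames(gene_cell_dist) <- rownames(cell_embed)
print(gene_cell_dist)
```

Here geneA lies close to cells c1 and c2, while geneB lies close to c3 and c4.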

Next, we find the signature genes for each cell type

print(table(Idents(CosMx_subset)))
# Idents(CosMx_subset) <- CosMx_subset$cell_type
df_sig_list <- find.signature.genes(CosMx_subset)
str(df_sig_list)
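To illustrate the kind of per-cluster summary that a signature-gene search relies on, the sketch below collapses a toy gene-by-cell distance matrix into one row per cluster and gene. The use of the mean distance, the definition of the expression proportion as the share of cells with a nonzero count, and all object names are assumptions; the actual output of find.signature.genes may be computed differently.

```r
# Toy inputs (all values made up): gene-by-cell distances, counts, and cluster labels
cluster <- c(c1 = "1", c2 = "1", c3 = "8", c4 = "8")
dist_mat <- rbind(geneA = c(0.5, 0.5, 5.3, 6.1),
                  geneB = c(6.0, 5.4, 0.5, 0.5))
counts <- rbind(geneA = c(3, 1, 0, 0),
                geneB = c(0, 0, 2, 5))
colnames(dist_mat) <- colnames(counts) <- names(cluster)

# One row per (cluster, gene): mean distance and proportion of expressing cells
sig <- do.call(rbind, lapply(unique(cluster), function(k) {
  in_k <- cluster == k
  data.frame(
    distance  = rowMeans(dist_mat[, in_k, drop = FALSE]),
    expr.prop = rowMeans(counts[, in_k, drop = FALSE] > 0),
    label     = k,
    gene      = rownames(dist_mat),
    row.names = NULL
  )
}))
print(sig)
```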

Then, we obtain the top two signature genes of each cluster and organize them into a data.frame. The column distance is the distance between a gene (e.g., IL7R) and the cells of a specific cluster (e.g., cluster 1), which is calculated from the coembeddings of genes and cells in the coembedding space. The smaller the distance, the stronger the association between the gene and the cell type. The column expr.prop is the expression proportion of the gene (e.g., IL7R) within the cell type (e.g., tumor). The column label is the cluster, and the column gene is the gene name. From this data.frame, we know that IL7R is one of the top signature genes of cluster 1.

dat <- get.top.signature.dat(df_sig_list, ntop = 2, expr.prop.cutoff = 0.1)
head(dat)
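The selection logic described above (filter by expression proportion, then rank by distance within each cluster) can be sketched in base R as follows. The table is made up, and the strict inequality used for the cutoff is an assumption; get.top.signature.dat may treat the boundary differently.

```r
# Toy signature table with the four columns described above (values are made up)
sig <- data.frame(
  distance  = c(0.2, 0.4, 0.1, 0.9, 0.3, 0.5),
  expr.prop = c(0.6, 0.3, 0.05, 0.8, 0.7, 0.2),
  label     = c("1", "1", "1", "8", "8", "8"),
  gene      = c("g1", "g2", "g3", "g4", "g5", "g6")
)
ntop <- 2
expr_prop_cutoff <- 0.1

# Drop rarely expressed genes, then keep the ntop closest genes within each cluster
kept <- sig[sig$expr.prop > expr_prop_cutoff, ]
top <- do.call(rbind, lapply(split(kept, kept$label), function(d) {
  head(d[order(d$distance), ], ntop)
}))
rownames(top) <- NULL
print(top)
```

Note that g3 has the smallest distance in cluster 1 but is excluded because too few cells express it, which is the purpose of the expression-proportion cutoff.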

Next, we calculate the UMAP projections of the coembeddings of cells and the selected signature genes.

CosMx_subset <- coembedding_umap(
  CosMx_subset, reduction = "cofast", reduction.name = "UMAP",
  gene.set = unique(dat$gene))
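The key idea of a joint projection is that cells and the selected genes are stacked into one matrix before the dimension reduction, so both receive coordinates in the same 2-D map. The sketch below shows this on simulated embeddings; PCA (base R prcomp) is swapped in as a stand-in for UMAP so that the example needs no extra packages, and all object names are hypothetical.

```r
set.seed(3)
# Toy 5-D coembedding: 20 cells and 3 genes (simulated; names are hypothetical)
cell_embed <- matrix(rnorm(20 * 5), nrow = 20,
                     dimnames = list(paste0("cell", 1:20), NULL))
gene_embed <- matrix(rnorm(3 * 5), nrow = 3,
                     dimnames = list(c("geneA", "geneB", "geneC"), NULL))

# Stack cells and selected genes so that both are projected into the same 2-D map
joint <- rbind(cell_embed, gene_embed)

# PCA is only a stand-in for UMAP in this illustration
proj <- prcomp(joint, center = TRUE, scale. = FALSE)$x[, 1:2]

# Rows belonging to genes can be recovered by name
is_gene <- rownames(proj) %in% rownames(gene_embed)
print(proj[is_gene, ])
```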

Furthermore, we visualize the cells and the top two signature genes of tumor in the UMAP space of the coembedding. We observe that the UMAP projections of the two signature genes are close to the tumor cells, which indicates that these genes are enriched in tumor.

## choose beautiful colors
cols_cluster2 <- c("black", cols_cluster)
p1 <- coembed_plot(
  CosMx_subset, reduction = "UMAP",
  gene_txtdata = subset(dat, label == '8'),
  cols = cols_cluster2, pt_text_size = 3)
p1

Then, we visualize the cells and the top two signature genes of all involved cell types in the UMAP space of the coembedding. We observe that the UMAP projections of the signature genes are close to the corresponding cell type, which indicates that these genes are enriched in the corresponding cells.

p2 <- coembed_plot(
  CosMx_subset, reduction = "UMAP",
  gene_txtdata = dat,
  cols = cols_cluster2, pt_text_size = 3, alpha = 0.2)
p2
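The visual impression that a gene lies close to its cell type can also be checked numerically. The sketch below assigns each gene to the group whose centroid is nearest in a 2-D map. The coordinates are made up, and the centroid rule is an illustrative assumption rather than part of coFAST.

```r
# Toy 2-D coordinates for cells of two groups and for two genes (made-up values)
cells <- data.frame(
  x = c(0, 1, 0, 10, 11, 10),
  y = c(0, 0, 1, 10, 10, 11),
  label = c("tumor", "tumor", "tumor", "T cell", "T cell", "T cell")
)
genes <- data.frame(gene = c("geneA", "geneB"), x = c(0.5, 10.2), y = c(0.4, 10.5))

# Centroid of each group in the 2-D map (both vectors share the same group order)
cx <- tapply(cells$x, cells$label, mean)
cy <- tapply(cells$y, cells$label, mean)

# For each gene, report the group whose centroid is closest
genes$nearest <- sapply(seq_len(nrow(genes)), function(i) {
  d <- sqrt((cx - genes$x[i])^2 + (cy - genes$y[i])^2)
  names(cx)[which.min(d)]
})
print(genes)
```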

In addition, we can take full advantage of the visualization functions in the Seurat package. The following example visualizes the cell types in the UMAP space.

DimPlot(CosMx_subset, reduction = 'UMAP', cols = cols_cluster)

As another example, we plot the first two signature genes of Tumor 5 in the UMAP space, where we observe high expression in tumor in contrast to the other cell types.

FeaturePlot(CosMx_subset, reduction = 'UMAP', features = c("PSCA", "CEACAM6"))

Session Info

sessionInfo()
#> R version 4.4.1 (2024-06-14 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26100)
#>
#> Matrix products: default
#>
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=Chinese (Simplified)_China.utf8
#> [3] LC_MONETARY=Chinese (Simplified)_China.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=Chinese (Simplified)_China.utf8
#>
#> time zone: Asia/Shanghai
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.37     R6_2.5.1          fastmap_1.2.0     xfun_0.47
#>  [5] cachem_1.1.0      knitr_1.48        htmltools_0.5.8.1 rmarkdown_2.28
#>  [9] lifecycle_1.0.4   cli_3.6.3         sass_0.4.9        jquerylib_0.1.4
#> [13] compiler_4.4.1    rstudioapi_0.16.0 tools_4.4.1       evaluate_1.0.0
#> [17] bslib_0.8.0       yaml_2.3.10       rlang_1.1.4       jsonlite_1.8.9


coFAST: NSCLC CosMx data coembedding

Wei Liu

2025-12-14

Load and view data

Preprocessing

Coembedding using coFAST
