The principal steps to build the SingleCellExperiment (SCE) object required by the ASTEC-sc shiny application are detailed in this section.
Step 1: Creation of SingleCellExperiment object
Let X, Xnorm and Xlog be matrices of raw counts, normalized counts and log-normalized counts, respectively. The SCE object is then initialized as follows:
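The initialization can be sketched as below. This is a minimal sketch, not necessarily the exact call expected by the application: the toy matrices and the library-size scaling are assumptions made for illustration, and the assay names "counts", "normcounts" and "logcounts" are chosen because the example later in this section reads those assays.

```r
library(SingleCellExperiment)

# Toy data (assumption): 10 features in rows, 5 cells in columns.
set.seed(1)
X <- matrix(rpois(50, lambda = 5), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("cell", 1:5)))

# Illustrative normalization only: scale each cell by its library size.
Xnorm <- t(t(X) / colSums(X)) * mean(colSums(X))
Xlog <- log2(Xnorm + 1)

# One assay per matrix; all matrices must share the same dimensions.
SCE <- SingleCellExperiment(
  assays = SimpleList(counts = X, normcounts = Xnorm, logcounts = Xlog)
)
```

The three matrices can then be retrieved with counts(SCE), normcounts(SCE) and logcounts(SCE).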
If you want to add another transformed data matrix Xtranf:
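One possible way to do this, assuming the standard assay() setter of SummarizedExperiment, is sketched below. The square-root transformation and the assay name "sqrtcounts" are hypothetical and only serve the illustration.

```r
library(SingleCellExperiment)

# Toy SCE object (assumption): 10 features x 5 cells.
set.seed(1)
X <- matrix(rpois(50, lambda = 5), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("cell", 1:5)))
SCE <- SingleCellExperiment(assays = SimpleList(counts = X))

# Hypothetical transformation; any matrix with the same dimensions works.
Xtranf <- sqrt(X)

# Store it as an additional named assay.
assay(SCE, "sqrtcounts") <- Xtranf
```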
Step 2: Dimensionality reduction methods
Let RedDim1, RedDim2, … be the coordinate matrices associated with several dimensionality reduction methods. You may enter them into the SCE object in the following way:
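A minimal sketch of this step, mirroring the reducedDims() call used in the example later in this section. The coordinates are random toy values and the names "Method1" and "Method2" are placeholders; each matrix is assumed to have one row per cell.

```r
library(SingleCellExperiment)

# Toy SCE object (assumption): 10 features x 5 cells.
set.seed(1)
X <- matrix(rpois(50, lambda = 5), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("cell", 1:5)))
SCE <- SingleCellExperiment(assays = SimpleList(counts = X))

# Coordinate matrices: cells in rows, dimensions in columns.
RedDim1 <- matrix(rnorm(10), nrow = 5,
                  dimnames = list(colnames(X), c("Dim1", "Dim2")))
RedDim2 <- matrix(rnorm(15), nrow = 5,
                  dimnames = list(colnames(X), c("Dim1", "Dim2", "Dim3")))

# Placeholder names; they identify each reduction inside the object.
reducedDims(SCE) <- SimpleList(Method1 = RedDim1, Method2 = RedDim2)
```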
At least one dimensionality reduction method is required in the SCE object.
Step 3: Cell clusterings
Let Clust1, Clust2, … be vectors, each containing one cell clustering. To integrate them into the SCE object, you must store a list of these cell clusterings in SCE@metadata[["clustering"]]:
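The assignment can be sketched as follows, using the same pattern as the example later in this section. The cluster labels and the list names "Clustering1" and "Clustering2" are toy values chosen for illustration.

```r
library(SingleCellExperiment)

# Toy SCE object (assumption): 10 features x 5 cells.
set.seed(1)
X <- matrix(rpois(50, lambda = 5), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("cell", 1:5)))
SCE <- SingleCellExperiment(assays = SimpleList(counts = X))

# One label per cell, in the same order as the columns of the SCE object.
Clust1 <- factor(c(1, 1, 2, 2, 3))
Clust2 <- c("a", "b", "a", "b", "a")

# Named list of clusterings stored under the "clustering" metadata entry.
SCE@metadata[["clustering"]] <- list(Clustering1 = Clust1,
                                     Clustering2 = Clust2)
```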
At the end of this step, your SCE object is ready to be used in the application.
Step 4: Add cell / feature information (optional)
You may add supplementary information about cells and/or features in the SCE object.
If you have qualitative cell information (available in vectors vector1, vector2, …), you may save it in SCE@metadata. Be careful: each vector must have a length equal to the total number of cells in the SCE object (length(vector1) == ncol(SCE)).
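As a sketch, cell information can be stored with the same pattern used for "tissue", "level1class" and "sex" in the example later in this section. The entry names "NameInfo1" and "NameInfo2" and the values below are hypothetical.

```r
library(SingleCellExperiment)

# Toy SCE object (assumption): 10 features x 5 cells.
set.seed(1)
X <- matrix(rpois(50, lambda = 5), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("cell", 1:5)))
SCE <- SingleCellExperiment(assays = SimpleList(counts = X))

# Hypothetical qualitative annotations, one value per cell.
vector1 <- c("tissueA", "tissueA", "tissueB", "tissueB", "tissueA")
vector2 <- c("F", "M", "M", "F", "F")

# Check the length constraint before storing anything.
stopifnot(length(vector1) == ncol(SCE), length(vector2) == ncol(SCE))

SCE@metadata[["NameInfo1"]] <- vector1
SCE@metadata[["NameInfo2"]] <- vector2
```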
If you have qualitative feature information (available in vectors vectorfeature1, vectorfeature2, …), you may save it in SCE@int_elementMetadata. You may also integrate biological function information for some features in knownfunc, which must be a binary data frame with some features in rows and functions in columns.
SCE@int_elementMetadata$NameType1 <- vectorfeature1
SCE@int_elementMetadata$NameType2 <- vectorfeature2
SCE@int_elementMetadata$KnownFunc <- knownfunc
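To illustrate the expected shape of knownfunc, the sketch below builds a binary data frame from gene sets. The feature names, the function names "FunctionA" and "FunctionB" and the gene sets themselves are hypothetical; only the layout (features in rows, functions in columns, 0/1 entries) follows the description above.

```r
# Hypothetical gene sets: each function lists the features annotated with it.
func_sets <- list(
  FunctionA = c("gene1", "gene3"),
  FunctionB = c("gene3", "gene5", "gene6")
)

# Features that appear in at least one function.
annotated <- sort(unique(unlist(func_sets)))

# Entry is 1 when the feature belongs to the function, 0 otherwise.
membership <- sapply(func_sets, function(s) as.integer(annotated %in% s))
knownfunc <- as.data.frame(membership, row.names = annotated)
```

Here knownfunc has four rows (gene1, gene3, gene5, gene6) and two columns, with gene3 flagged in both functions.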
To use your SCE object, you have to save it in an RData file:
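A minimal sketch using base R's save(). The file name "SCE.RData" is hypothetical, a temporary directory is used only to keep the example self-contained, and whether the application expects a particular object name inside the file is not stated here.

```r
library(SingleCellExperiment)

# Toy SCE object (assumption): 10 features x 5 cells.
set.seed(1)
X <- matrix(rpois(50, lambda = 5), nrow = 10,
            dimnames = list(paste0("gene", 1:10), paste0("cell", 1:5)))
SCE <- SingleCellExperiment(assays = SimpleList(counts = X))

# Hypothetical output path; replace with the location you want to use.
out_file <- file.path(tempdir(), "SCE.RData")
save(SCE, file = out_file)

# Reading the file back restores an object named "SCE".
restored <- new.env()
load(out_file, envir = restored)
```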
The construction of our example SCE object is detailed here. It is based on single-cell RNA-seq data available in the scRNAseq package and studied in Zeisel et al. (2015).
Firstly, the object is initialized with the raw counts.
library(scRNAseq)
ZeiselBrain <- ZeiselBrainData()
library(SingleCellExperiment)
SCE <- SingleCellExperiment(assays=SimpleList(counts=counts(ZeiselBrain)))
genes <- rownames(SCE)
We remove genes whose expression count is below 25 in every cell.
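The filtering code is not shown here. One reading of the rule, sketched on toy data, is to keep a gene only if at least one cell has a count of 25 or more; the exact criterion used for the example object is an assumption.

```r
library(SingleCellExperiment)

# Toy counts (assumption): 3 genes x 4 cells.
X <- matrix(c( 0,  3, 30,  1,   # geneA: reaches 30 in one cell, kept
               2,  5,  7,  1,   # geneB: below 25 in every cell, removed
              40, 26,  0, 90),  # geneC: several cells at 25 or more, kept
            nrow = 3, byrow = TRUE,
            dimnames = list(c("geneA", "geneB", "geneC"),
                            paste0("cell", 1:4)))
SCE <- SingleCellExperiment(assays = SimpleList(counts = X))

# Keep genes with a count of at least 25 in at least one cell.
keep <- rowSums(counts(SCE) >= 25) > 0
SCE <- SCE[keep, ]
genes <- rownames(SCE)
```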
We use the scater package to compute (log-transformed) normalized expression values:
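The normalization call is not shown here, and the scater API has changed across releases. The sketch below assumes a recent version in which logNormCounts() is available after library(scater) and accepts the transform argument; it runs on toy data and fills the "logcounts" and "normcounts" assays that the following steps read.

```r
library(scater)

# Toy SCE object (assumption): 20 features x 6 cells.
set.seed(1)
X <- matrix(rpois(120, lambda = 5), nrow = 20,
            dimnames = list(paste0("gene", 1:20), paste0("cell", 1:6)))
SCE <- SingleCellExperiment(assays = SimpleList(counts = X))

# Log2-transformed normalized values, stored in the "logcounts" assay.
SCE <- logNormCounts(SCE)

# Normalized values without log transformation, stored in "normcounts".
SCE <- logNormCounts(SCE, transform = "none")
```

With the default settings, the "logcounts" assay equals log2 of the "normcounts" assay plus a pseudo-count of 1.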
The scater package makes it easy to compute several dimensionality reductions for a SingleCellExperiment object. See the scater vignette for more details.
# PCA
pcalogcounts <- runPCA(SCE, ncomponents = 10, exprs_values = "logcounts")
pcanormcounts <- runPCA(SCE, ncomponents = 10, exprs_values = "normcounts")
# tSNE
tsnelogcounts <- runTSNE(SCE, exprs_values = "logcounts")
tsnenormcounts <- runTSNE(SCE, exprs_values = "normcounts")
# UMAP
umaplogcounts <- runUMAP(SCE, exprs_values = "logcounts")
umapnormcounts <- runUMAP(SCE, exprs_values = "normcounts")
You can enter them into the SCE in the following way.
reducedDims(SCE) <- SimpleList(
  PCAlogcounts = reducedDim(pcalogcounts, "PCA"),
  tSNElogcounts = reducedDim(tsnelogcounts, "TSNE"),
  UMAPlogcounts = reducedDim(umaplogcounts, "UMAP"),
  PCAnormcounts = reducedDim(pcanormcounts, "PCA"),
  tSNEnormcounts = reducedDim(tsnenormcounts, "TSNE"),
  UMAPnormcounts = reducedDim(umapnormcounts, "UMAP"))
We use the clustering available in ZeiselBrain; other cell clusterings are computed using Seurat, pcaReduce and SC3.
library(pcaReduce)
library(SC3)
library(Seurat)
# Clustering provided with the Zeisel data
Clust1 <- ZeiselBrain@colData$`group #`
# pcaReduce (expects cells in rows, hence the transpose)
pca.red <- PCAreduce(t(logcounts(SCE)), nbt = 3, q = 20, method = 'S')[[1]]
# Column 13 of the allocation matrix is used as the clustering
Clust2 <- pca.red[, 13]
# SC3 (works on a copy with dense assay matrices and a feature_symbol column)
sceaux <- SCE
logcounts(sceaux) <- as.matrix(logcounts(sceaux))
counts(sceaux) <- as.matrix(counts(sceaux))
normcounts(sceaux) <- as.matrix(normcounts(sceaux))
rowData(sceaux)$feature_symbol <- rownames(sceaux)
ressc3 <- sc3(sceaux, ks = 9, gene_filter = TRUE, n_cores = 2)
Clust3 <- colData(ressc3)[, "sc3_9_clusters"]
# Seurat: convert the SCE object, then run the standard clustering workflow
ZeiselBrainSeurat <- as.Seurat(SCE, counts = "counts", data = "logcounts", assay = "RNA", project = "SingleCellExperiment")
ZeiselBrainSeurat <- FindVariableFeatures(ZeiselBrainSeurat, selection.method = "vst", nfeatures = 2000)
all.genes <- rownames(ZeiselBrainSeurat)
ZeiselBrainSeurat <- ScaleData(ZeiselBrainSeurat, features = all.genes)
ZeiselBrainSeurat <- RunPCA(ZeiselBrainSeurat, features = VariableFeatures(object = ZeiselBrainSeurat))
ZeiselBrainSeurat <- FindNeighbors(ZeiselBrainSeurat, dims = 1:10)
ZeiselBrainSeurat <- FindClusters(ZeiselBrainSeurat, resolution = 0.5)
# Cluster identity of each cell
Clust4 <- Idents(ZeiselBrainSeurat)
We integrate these cell clusterings into our SCE object as follows:
SCE@metadata[["clustering"]] <- list(ClustZeisel = Clust1,
                                     ClustPcared = Clust2,
                                     ClustSC3 = Clust3,
                                     ClustSeurat = Clust4)
We integrate some cell information available in ZeiselBrain:
SCE@metadata[["tissue"]]<-ZeiselBrain@colData$tissue
SCE@metadata[["level1class"]]<-ZeiselBrain@colData$level1class
SCE@metadata[["sex"]]<-ZeiselBrain@colData$sex
We integrate biological function information for some features in knownfunc:
Zeisel, Amit, Ana B Muñoz-Manchado, Simone Codeluppi, Peter Lönnerberg, Gioele La Manno, Anna Juréus, Sueli Marques, et al. 2015. “Cell Types in the Mouse Cortex and Hippocampus Revealed by Single-Cell RNA-Seq.” Science 347 (6226): 1138–42.