Build your SCE object

This section details the main steps to build the SingleCellExperiment (SCE) object required by the ASTEC-sc shiny application.

library(SingleCellExperiment)

Step 1: Creation of SingleCellExperiment object
Let X, Xnorm and Xlog be the matrices of raw counts, normalized counts and log-normalized counts, respectively, with features in rows and cells in columns. The SCE object is then initialized as follows:

SCE <- SingleCellExperiment(assays=SimpleList(counts=X,
                                             normcounts=Xnorm,
                                             logcounts=Xlog)) 

If you want to add another transformed data matrix Xtranf:

assay(SCE, "mytransf") <- Xtranf  # optional
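As an illustration of Step 1, here is a minimal sketch on simulated data. The toy matrix, the library-size normalization and the log2 transformation with a pseudocount of 1 are assumptions made for the example; any normalization of your choice can be used instead.

```r
library(SingleCellExperiment)

set.seed(1)
# Hypothetical toy data: 100 features (rows) x 20 cells (columns)
X <- matrix(rpois(100 * 20, lambda = 5), nrow = 100,
            dimnames = list(paste0("gene", 1:100), paste0("cell", 1:20)))

# Library-size normalization (an assumption, not a requirement of the application)
libSize <- colSums(X)
Xnorm <- sweep(X, 2, libSize / mean(libSize), "/")

# Log-transformation with a pseudocount of 1
Xlog <- log2(Xnorm + 1)

SCE <- SingleCellExperiment(assays = SimpleList(counts = X,
                                                normcounts = Xnorm,
                                                logcounts = Xlog))
assayNames(SCE)
```

The three matrices must share the same dimensions and the same row and column names.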

Step 2: Dimensionality reduction methods

Let RedDims1, RedDims2, … be the coordinate matrices (cells in rows) produced by several dimensionality reduction methods. You may store them in the SCE object as follows:

reducedDims(SCE)<-
  SimpleList(NameRedDims1 = RedDims1,
             NameRedDims2 = RedDims2,
             ...)

At least one dimensionality reduction method is required in the SCE object.
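As a minimal sketch of Step 2, a coordinate matrix can be obtained with base R's prcomp() and stored under a name of your choice. The simulated matrix and the name "PCA" are assumptions made for the example.

```r
library(SingleCellExperiment)

set.seed(1)
# Hypothetical toy log-expression matrix: 100 features x 20 cells
Xlog <- matrix(rnorm(100 * 20), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("cell", 1:20)))
SCE <- SingleCellExperiment(assays = SimpleList(logcounts = Xlog))

# prcomp() expects observations (here, cells) in rows, hence the transpose
pca <- prcomp(t(Xlog), rank. = 2)

# The coordinate matrix has one row per cell and one column per component
reducedDims(SCE) <- SimpleList(PCA = pca$x)
dim(reducedDim(SCE, "PCA"))
```

Each coordinate matrix must have as many rows as there are cells in the SCE object.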

Step 3: Cell clusterings

Let Clust1, Clust2, … be vectors, each containing one cell clustering (one cluster label per cell). To integrate them in the SCE object, store a list of these cell clusterings in SCE@metadata[["clustering"]]:

SCE@metadata[["clustering"]]<-list(nameClust1=Clust1,
                                  nameClust2=Clust2,
                                  ...)

At the end of this step, your SCE object is ready to be used in the application.
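As a minimal sketch of Step 3, a clustering can be computed with base R's kmeans() and stored in the list. The simulated data, the choice of 3 clusters and the name "kmeans3" are assumptions made for the example; the final check assumes that each clustering holds one label per cell.

```r
library(SingleCellExperiment)

set.seed(1)
# Hypothetical toy log-expression matrix: 100 features x 20 cells
Xlog <- matrix(rnorm(100 * 20), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("cell", 1:20)))
SCE <- SingleCellExperiment(assays = SimpleList(logcounts = Xlog))

# kmeans() clusters rows, so cells must be in rows
km <- kmeans(t(Xlog), centers = 3)

SCE@metadata[["clustering"]] <- list(kmeans3 = km$cluster)

# Every clustering should provide exactly one label per cell
stopifnot(all(lengths(SCE@metadata[["clustering"]]) == ncol(SCE)))
```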

Step 4: Add cell / feature information (optional)

You may add supplementary information about cells and/or features in the SCE object.

If you have qualitative cell information (available in vectors vector1, vector2, …), you may save it in SCE@metadata. Each vector must have one entry per cell, i.e. length(vector1) == ncol(SCE).

SCE@metadata[["cellType1"]] <- vector1
SCE@metadata[["cellType2"]] <- vector2 

If you have qualitative feature information (available in vectors vectorfeature1, vectorfeature2, …), you may save it in SCE@int_elementMetadata. You may also add biological function information for some features through knownfunc, which must be a binary data frame with features in rows and functions in columns.

SCE@int_elementMetadata$NameType1 <- vectorfeature1
SCE@int_elementMetadata$NameType2 <- vectorfeature2
SCE@int_elementMetadata$KnownFunc <- knownfunc
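To make the expected shape of knownfunc concrete, here is a minimal sketch that builds a binary data frame from a gene-to-function annotation. The gene names, the function names and the annot list are hypothetical. Whether the application also expects a gene name column, as in the example object built later in this document, is not confirmed here.

```r
# Hypothetical feature names of the SCE object
genes <- paste0("gene", 1:6)

# Hypothetical annotation: for each function, the genes known to carry it
annot <- list(Neuronal = c("gene1", "gene3"),
              Glial    = c("gene3", "gene5"))

# One row per feature, one column per function, 1 if the gene has the function
knownfunc <- data.frame(sapply(annot, function(g) as.integer(genes %in% g)),
                        row.names = genes)
knownfunc
```

Features without any known function simply get a row of zeros.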

To use your SCE object in the application, save it in an RData file:

save(SCE,file="mySCE.RData")
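Before uploading the file, it can be useful to check that it loads back correctly. The following sketch saves a toy object to a temporary file (the path and the toy matrix are assumptions made for the example) and reloads it in a separate environment.

```r
library(SingleCellExperiment)

# Hypothetical minimal object, only used to illustrate the round trip
SCE <- SingleCellExperiment(assays = SimpleList(counts = matrix(1:6, nrow = 3)))

path <- file.path(tempdir(), "mySCE.RData")
save(SCE, file = path)

# load() restores objects under the names they were saved with
# and returns those names
env <- new.env()
loaded <- load(path, envir = env)
loaded
class(env$SCE)
```

Note that load() restores the object under the name it had when saved, here SCE.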

Our SCE object example

This section details how our example SCE object was built. It is based on the single-cell RNA-seq data studied in Zeisel et al. (2015) and available in the scRNAseq package.

First, the object is initialized with the raw counts.

library(scRNAseq)
ZeiselBrain <- ZeiselBrainData()
library(SingleCellExperiment)
SCE <- SingleCellExperiment(assays=SimpleList(counts=counts(ZeiselBrain)))
genes <- rownames(SCE)

We remove the genes whose count is below 25 in every cell, i.e. whose maximum count across cells is below 25.

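# Keep genes with a count of at least 25 in at least one cell
# (logical indexing also behaves correctly when no gene is removed)
keep <- apply(counts(SCE),1,max) >= 25
genes <- genes[keep]
SCE <- SCE[keep,]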
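The filtering rule can be checked on a small matrix. The toy counts below are hypothetical and chosen so that exactly one gene is removed.

```r
# Hypothetical toy counts: 3 genes (rows) x 3 cells (columns)
toy <- matrix(c( 0, 30,  2,
                 5, 10, 24,
                40,  0,  0),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("g1", "g2", "g3"), c("c1", "c2", "c3")))

# A gene is kept if its maximum count across cells reaches 25
keep <- apply(toy, 1, max) >= 25
rownames(toy)[keep]
```

Here g2 is removed because its largest count is 24, whereas g1 and g3 each reach 25 or more in at least one cell.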

We use the scater package to compute (log-transformed) normalized expression values:

library(scater)
SCE<-logNormCounts(SCE,log=FALSE) # normcounts
SCE<-logNormCounts(SCE,log=TRUE)  # logcounts

The scater package makes it easy to compute several dimensionality reductions for a SingleCellExperiment object. See the scater vignette for more details.

# PCA
pcalogcounts <- runPCA(SCE,ncomponents=10,exprs_values="logcounts") 
pcanormcounts <- runPCA(SCE,ncomponents=10,exprs_values="normcounts")
# tSNE
tsnelogcounts<-runTSNE(SCE,exprs_values = "logcounts")
tsnenormcounts<-runTSNE(SCE,exprs_values = "normcounts")
# UMAP
umaplogcounts<-runUMAP(SCE,exprs_values = "logcounts")
umapnormcounts<-runUMAP(SCE,exprs_values ="normcounts")

The resulting coordinate matrices are then stored in the SCE object as follows:

reducedDims(SCE)<-
  SimpleList(PCAlogcounts=reducedDim(pcalogcounts,"PCA"),
             tSNElogcounts=reducedDim(tsnelogcounts,"TSNE"),
             UMAPlogcounts=reducedDim(umaplogcounts,"UMAP"),
             PCAnormcounts=reducedDim(pcanormcounts,"PCA"),
             tSNEnormcounts=reducedDim(tsnenormcounts,"TSNE"),
             UMAPnormcounts=reducedDim(umapnormcounts,"UMAP"))

We use the clustering available in ZeiselBrain, and compute other cell clusterings using pcaReduce, SC3 and Seurat.

Clust1 <- ZeiselBrain@colData$`group #`
# pcaReduce
library(pcaReduce)
pca.red <- PCAreduce(t(logcounts(SCE)), nbt = 3, q = 20, method = 'S')[[1]]
Clust2<-pca.red[,13]
# SC3 (the assays are first converted to dense matrices and feature_symbol is set)
library(SC3)
sceaux<-SCE
logcounts(sceaux)<-as.matrix(logcounts(sceaux))
counts(sceaux)<-as.matrix(counts(sceaux))
normcounts(sceaux)<-as.matrix(normcounts(sceaux))
rowData(sceaux)$feature_symbol<-rownames(sceaux)
ressc3<-sc3(sceaux,ks=9,gene_filter=TRUE,n_cores=2)
Clust3<-colData(ressc3)[,"sc3_9_clusters"]
# Seurat
library(Seurat)
ZeiselBrainSeurat<-as.Seurat(SCE,counts = "counts",data = "logcounts",assay = "RNA",project = "SingleCellExperiment")
ZeiselBrainSeurat<-FindVariableFeatures(ZeiselBrainSeurat, selection.method = "vst", nfeatures = 2000)
all.genes <- rownames(ZeiselBrainSeurat)
ZeiselBrainSeurat <- ScaleData(ZeiselBrainSeurat, features = all.genes)
ZeiselBrainSeurat <- RunPCA(ZeiselBrainSeurat, features = VariableFeatures(object = ZeiselBrainSeurat))
ZeiselBrainSeurat <- FindNeighbors(ZeiselBrainSeurat, dims = 1:10)
ZeiselBrainSeurat <- FindClusters(ZeiselBrainSeurat, resolution = 0.5)
Clust4<-Idents(ZeiselBrainSeurat)

We integrate these cell clusterings in our SCE object as follows:

SCE@metadata[["clustering"]]<-list(ClustZeisel=Clust1,
                                  ClustPcared=Clust2,
                                  ClustSC3=Clust3,
                                  ClustSeurat=Clust4)

We integrate some cell information available in ZeiselBrain:

SCE@metadata[["tissue"]]<-ZeiselBrain@colData$tissue
SCE@metadata[["level1class"]]<-ZeiselBrain@colData$level1class
SCE@metadata[["sex"]]<-ZeiselBrain@colData$sex

We add the known biological functions of some features through knownFunc, then save the object:

library(readxl)
knownFunc <- read_excel("GenesFunctionZeisel.xlsx")
knownFunc <- dplyr::left_join(data.frame(genes), knownFunc, by=c("genes"="Names"))
knownFunc[is.na(knownFunc)] <- 0

SCE@int_elementMetadata$Name <- genes
SCE@int_elementMetadata$KnownFunc <- knownFunc
save(SCE,file="SCE-ZeiselBrain-example.RData")

Reference

Zeisel, Amit, Ana B Muñoz-Manchado, Simone Codeluppi, Peter Lönnerberg, Gioele La Manno, Anna Juréus, Sueli Marques, et al. 2015. “Cell Types in the Mouse Cortex and Hippocampus Revealed by Single-Cell RNA-Seq.” Science 347 (6226): 1138–42.