
Seurat package study notes (9): Differential expression testing - Figure 2


    library(Seurat)
    # Load the PBMC dataset used in the earlier analyses
    pbmc <- readRDS(file = "../data/pbmc3k_final.rds")



    # List the identity classes (the annotated cell clusters) available for differential expression
    levels(pbmc)
    # [1] "Naive CD4 T"  "Memory CD4 T" "CD14+ Mono"   "B"            "CD8 T"
    # [6] "FCGR3A+ Mono" "NK"           "DC"           "Platelet"

    # Find differentially expressed features between CD14+ and FCGR3A+ monocytes
    # using the default test
    monocyte.de.markers <- FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono")
    # View the first few marker genes
    head(monocyte.de.markers)

[Figure 3]

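  • p_val: the raw (unadjusted) p-value from the hypothesis test.
  • avg_logFC: the log fold change of average expression between the two groups. Positive values indicate that the gene is more highly expressed in the first group.
  • pct.1: the proportion of cells in the first group in which the gene is detected.
  • pct.2: the proportion of cells in the second group in which the gene is detected.
  • p_val_adj: the adjusted p-value after Bonferroni correction for multiple testing.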
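To make the p_val_adj column concrete, here is a minimal base R sketch of a Bonferroni correction. The p-values and the feature count `n_features` are made-up illustration values, not output from the PBMC data, and the sketch assumes the correction is taken over all features in the dataset, as the Seurat documentation describes.

```r
# Hypothetical raw p-values for three genes (illustration only)
p_val <- c(geneA = 1e-8, geneB = 2e-4, geneC = 0.03)

# Assumed total number of features in the dataset (hypothetical value)
n_features <- 1000

# Bonferroni: multiply each p-value by the number of tests and cap at 1
p_val_adj <- pmin(p_val * n_features, 1)

# The same calculation via stats::p.adjust, passing the number of tests explicitly
p_val_adj_check <- p.adjust(p_val, method = "bonferroni", n = n_features)

print(p_val_adj)
print(p_val_adj_check)
```

Because the correction scales with the number of features, a gene with a modest raw p-value (geneC here) can end up with an adjusted p-value of 1.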


    # Find differentially expressed features between CD14+ monocytes and all other cells,
    # searching only for positive markers
    monocyte.de.markers <- FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = NULL, only.pos = TRUE)
    # View the results
    head(monocyte.de.markers)

[Figure 4]



    # Pre-filter on detection rate: with min.pct = 0.5, a feature is tested only if it is
    # detected in at least 50% of cells in either CD14+ monocytes or FCGR3A+ monocytes
    head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", min.pct = 0.5))

[Figure 5]

    # Pre-filter features that have less than a two-fold change in average expression
    # between CD14+ monocytes and FCGR3A+ monocytes.
    # logfc.threshold = log(2) uses the natural log, matching the scale of avg_logFC here
    head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", logfc.threshold = log(2)))

[Figure 6]

    # Pre-filter features whose detection rates in the two groups are similar:
    # with min.diff.pct = 0.25, features whose detection rates differ by less than 0.25 are skipped
    head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", min.diff.pct = 0.25))

[Figure 7]
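The three pre-filters (min.pct, logfc.threshold, min.diff.pct) can be sketched in base R on a toy table. This only illustrates the filtering logic described in the comments above and is not Seurat's internal code; the data frame `gene_stats` and all of its values are hypothetical.

```r
# Hypothetical per-gene summary statistics (illustration only)
gene_stats <- data.frame(
  gene      = c("g1", "g2", "g3", "g4"),
  pct.1     = c(0.90, 0.40, 0.60, 0.10),   # detection rate in group 1
  pct.2     = c(0.20, 0.45, 0.55, 0.05),   # detection rate in group 2
  avg_logFC = c(1.50, 0.10, -0.90, 0.80),  # natural-log fold change
  stringsAsFactors = FALSE
)

min.pct         <- 0.5
min.diff.pct    <- 0.25
logfc.threshold <- log(2)

# min.pct: keep genes detected in at least min.pct of cells in either group
keep_pct <- pmax(gene_stats$pct.1, gene_stats$pct.2) >= min.pct

# min.diff.pct: keep genes whose detection rates differ by at least min.diff.pct
keep_diff <- abs(gene_stats$pct.1 - gene_stats$pct.2) >= min.diff.pct

# logfc.threshold: keep genes with at least a two-fold change in either direction
keep_fc <- abs(gene_stats$avg_logFC) >= logfc.threshold

print(gene_stats$gene[keep_pct])   # genes passing min.pct
print(gene_stats$gene[keep_diff])  # genes passing min.diff.pct
print(gene_stats$gene[keep_fc])    # genes passing logfc.threshold
```

Each filter removes genes before any statistical test is run, which is why stricter thresholds save time but can hide weaker signals.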

    # Increasing min.pct, logfc.threshold, and min.diff.pct speeds up DE testing,
    # but features removed by the pre-filter are never tested and can therefore be missed.
    # Subsample each group to a maximum of 200 cells. This can be very useful for large clusters
    # or computationally intensive DE tests.
    head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", max.cells.per.ident = 200))

[Figure 8]
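The idea behind max.cells.per.ident can be sketched as a simple random downsampling step. The helper `downsample_cells` and the cell names below are hypothetical and only illustrate the concept; they are not Seurat functions or real barcodes.

```r
set.seed(42)  # make the random subsample reproducible

# Hypothetical cell names for two groups of different sizes
cells_group1 <- paste0("cellA_", seq_len(480))
cells_group2 <- paste0("cellB_", seq_len(150))

# Downsample a vector of cells to at most max_cells; smaller groups are left untouched
downsample_cells <- function(cells, max_cells) {
  if (length(cells) > max_cells) {
    sample(cells, size = max_cells)
  } else {
    cells
  }
}

sub1 <- downsample_cells(cells_group1, max_cells = 200)
sub2 <- downsample_cells(cells_group2, max_cells = 200)

print(length(sub1))  # 200: the larger group is cut down to the cap
print(length(sub2))  # 150: the smaller group is already below the cap
```

Only groups larger than the cap lose cells, so the cost of the test is bounded while small clusters keep all of their data.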



The differential expression test is selected with the test.use parameter. The following options are available:

  • “wilcox” : Wilcoxon rank sum test (default)
  • “bimod” : Likelihood-ratio test for single cell feature expression (McDavid et al., Bioinformatics, 2013)
  • “roc” : Standard AUC classifier
  • “t” : Student’s t-test
  • “poisson” : Likelihood ratio test assuming an underlying Poisson distribution. Use only for UMI-based datasets
  • “negbinom” : Likelihood ratio test assuming an underlying negative binomial distribution. Use only for UMI-based datasets
  • “LR” : Uses a logistic regression framework to determine differentially expressed genes. Constructs a logistic regression model predicting group membership based on each feature individually and compares this to a null model with a likelihood ratio test
  • “MAST” : GLM framework that treats cellular detection rate as a covariate (Finak et al., Genome Biology, 2015)
  • “DESeq2” : DE based on a model using the negative binomial distribution (Love et al., Genome Biology, 2014)
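As a rough illustration of what the two simplest options do for a single gene, the base R sketch below runs a Wilcoxon rank sum test and a Student's t-test on simulated expression values. The data are simulated for illustration only, and the calls show the underlying statistical tests rather than Seurat's own implementation.

```r
set.seed(1)

# Simulated log-normalized expression of one gene in two groups (illustration only)
expr_group1 <- rnorm(100, mean = 2.0, sd = 0.5)
expr_group2 <- rnorm(100, mean = 1.0, sd = 0.5)

# Wilcoxon rank sum test: compares the ranks of the values, not the values themselves
wilcox_res <- wilcox.test(expr_group1, expr_group2)

# Student's t-test: var.equal = TRUE requests the classical equal-variance test
# (R's default, var.equal = FALSE, is the Welch variant)
t_res <- t.test(expr_group1, expr_group2, var.equal = TRUE)

print(wilcox_res$p.value)
print(t_res$p.value)
```

The rank-based test makes no assumption about the shape of the expression distribution, which is one reason it is a common default for single-cell data.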


    # Test for DE features using the MAST package by setting test.use = "MAST"
    head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", test.use = "MAST"))

[Figure 9]

    # Test for DE features using the DESeq2 package by setting test.use = "DESeq2".
    # Throws an error if DESeq2 has not already been installed.
    # Note that the DESeq2 workflows can be computationally intensive for large datasets
    # and are incompatible with some feature pre-filtering options.
    # We therefore suggest initially limiting the number of cells used for testing.
    head(FindMarkers(pbmc, ident.1 = "CD14+ Mono", ident.2 = "FCGR3A+ Mono", test.use = "DESeq2", max.cells.per.ident = 50))

[Figure 10]
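Since the MAST and DESeq2 options depend on separately installed packages, one possible guard is to check for the package first and fall back to the default test otherwise. The helper `choose_de_test` below is a hypothetical convenience function, not part of Seurat; it relies only on base R's `requireNamespace`.

```r
# Return the preferred test if its backing package is installed,
# otherwise fall back to the default Wilcoxon test.
# choose_de_test is a hypothetical helper, not a Seurat function.
choose_de_test <- function(preferred, pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) preferred else "wilcox"
}

test_mast   <- choose_de_test("MAST", "MAST")
test_deseq2 <- choose_de_test("DESeq2", "DESeq2")

print(test_mast)
print(test_deseq2)
```

The chosen value could then be passed on as the test.use argument of FindMarkers, so the analysis still runs on machines where the optional packages are missing.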
