edgeR: Empirical Analysis of Digital Gene Expression Data in R
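edgeR is an R package for differential analysis of data from RNA-seq, DNA methylation assays, and related technologies.
It works by fitting a negative binomial generalized linear model to each gene or methylation site and then testing the model's coefficients (or contrasts between them) directly.
Input requirement: the raw read counts supporting each gene or site, not values that have already been normalized.
For RNA-seq, this can be the output of htseq-count.
For methylation analysis, this can be the output of Bismark, which reports methylated and unmethylated read counts per site.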
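To make the "raw counts" requirement concrete, here is a minimal Python sketch of pulling methylated and unmethylated counts out of Bismark coverage lines. It assumes the six tab-separated columns of a Bismark coverage file (chromosome, start, end, methylation percentage, methylated count, unmethylated count). The function name `parse_bismark_cov` and the example lines are made up for illustration. Within edgeR itself, reading these files is what `readBismark2DGE` is for.

```python
def parse_bismark_cov(lines):
    """Return {(chrom, position): (methylated, unmethylated)} from coverage lines.

    Assumes six tab-separated columns per line:
    chrom, start, end, methylation %, methylated count, unmethylated count.
    """
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chrom, start, _end, _pct, me, un = line.split("\t")
        # Keep only the raw counts; the percentage column is deliberately ignored.
        counts[(chrom, int(start))] = (int(me), int(un))
    return counts


# Hypothetical example lines, not real data.
example_lines = [
    "chr1\t3020877\t3020877\t100\t5\t0",
    "chr1\t3020891\t3020891\t50\t2\t2",
]
counts = parse_bismark_cov(example_lines)
```

The percentage column is dropped on purpose: the model needs the two counts, not the ratio derived from them.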
Background: analyzing BS-seq data with edgeR
A DNA methylation study often involves comparing methylation levels at CpG loci between different experimental groups. Differential methylation analyses can be performed in edgeR for both whole genome bisulfite sequencing (WGBS) and reduced representation bisulfite sequencing (RRBS). This is done by considering the observed read counts of both methylated and unmethylated CpGs across all the samples. Extra coefficients are added to the design matrix to represent the methylation levels and the differences of the methylation levels between groups.
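The "extra coefficients" idea can be sketched as follows. Each sample contributes two libraries (methylated and unmethylated). Both rows share a per-sample column, and only the methylated row carries the group columns. This is a plain-Python illustration under my own assumptions about row and column ordering, and `methylation_design` is a hypothetical helper. In edgeR the expanded matrix is built by `modelMatrixMeth`, whose exact layout may differ from this sketch.

```python
def methylation_design(sample_design):
    """Expand a per-sample design into a per-library design.

    sample_design: one row of design values per sample.
    Returns two rows per sample (methylated first, then unmethylated).
    Columns: one indicator per sample, followed by the original design columns.
    """
    n_samples = len(sample_design)
    n_coefs = len(sample_design[0])
    rows = []
    for i, design_row in enumerate(sample_design):
        sample_cols = [1 if j == i else 0 for j in range(n_samples)]
        # Methylated library: sample effect plus the group (methylation) coefficients.
        rows.append(sample_cols + list(design_row))
        # Unmethylated library: sample effect only.
        rows.append(sample_cols + [0] * n_coefs)
    return rows


# Two groups with two samples each, coded as one-hot group columns (hypothetical).
design = [[1, 0], [1, 0], [0, 1], [0, 1]]
full_design = methylation_design(design)
```

With this layout the sample columns absorb each sample's overall coverage, so the group columns describe the methylated-to-unmethylated balance, which is the quantity being compared between groups.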
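Tutorial
The edgeR manual
From the edgeR User's Guide: https://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf
Section 4.7 of the User's Guide contains a detailed worked example of differential methylation analysis, titled "Bisulfite sequencing of mouse oocytes".
One passage in it is worth noting:
A key difference between BS-seq and other sequencing data is that the pair of libraries holding the methylated and unmethylated reads for a particular sample are treated as a unit. To ensure that the methylated and unmethylated reads for the same sample are treated on the same scale, we need to set the library sizes to be equal for each pair of libraries. We set the library sizes for each sample to be the average of the total read counts for the methylated and unmethylated libraries.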
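The library-size rule in that passage is simple arithmetic, sketched here in Python. The function name `paired_library_sizes` and the totals are hypothetical. The output order (methylated, then unmethylated, sample by sample) is my own assumption about how the libraries are arranged.

```python
def paired_library_sizes(methylated_totals, unmethylated_totals):
    """Give both libraries of each sample the same size.

    The shared size is the average of the sample's methylated and
    unmethylated total read counts. Output order is Me, Un per sample.
    """
    sizes = []
    for me_total, un_total in zip(methylated_totals, unmethylated_totals):
        shared = (me_total + un_total) / 2
        sizes.extend([shared, shared])
    return sizes


# Hypothetical totals for two samples.
lib_sizes = paired_library_sizes([1000, 800], [3000, 1200])
```

Here the first sample has totals of 1000 and 3000, so both of its libraries get size 2000. The second sample has totals of 800 and 1200, so both get size 1000.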
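Differential methylation analysis of BS-seq data with edgeR
A published article already shows that edgeR can be used for differential methylation analysis of BS-seq data (RRBS in particular).
The article is written as an example workflow.
Title: Differential methylation analysis of reduced representation bisulfite sequencing experiments using edgeR
Available at: https://f1000research.com/articles/6-2055/v2
I downloaded the paper at the time and have attached it here: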
edgeR使用示例—REVISED Differential methylation analysis of reduced representation bisulfite sequencing experiments using edgeR.pdf
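Limitations
At present edgeR can only test methylation changes at individual CpG sites or over predefined genomic regions (supplied as a file or through arguments to edgeR functions). Unlike tools such as the DSS package, it has no sliding-window style detection of differentially methylated regions (DMRs).
Is there a way to port it to differential analysis of methylation levels over arbitrary genomic regions?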
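One possible route for the predefined-region case is to sum the CpG counts falling inside each region before modelling, so that each region is treated like a single "site". The sketch below is plain Python under my own assumptions: `aggregate_by_region`, the region names, and the coordinates are hypothetical, and the intervals are treated as inclusive. It does not discover DMRs de novo.

```python
def aggregate_by_region(cpg_counts, regions):
    """Sum methylated/unmethylated counts of CpGs inside each region.

    cpg_counts: {(chrom, position): (methylated, unmethylated)}
    regions:    {name: (chrom, start, end)} with inclusive coordinates
    Returns     {name: (methylated_total, unmethylated_total)}
    """
    totals = {name: [0, 0] for name in regions}
    for (chrom, pos), (me, un) in cpg_counts.items():
        for name, (r_chrom, start, end) in regions.items():
            if chrom == r_chrom and start <= pos <= end:
                totals[name][0] += me
                totals[name][1] += un
    return {name: tuple(pair) for name, pair in totals.items()}


# Hypothetical CpG counts and regions, not real data.
cpgs = {("chr1", 100): (5, 1), ("chr1", 150): (2, 2), ("chr1", 900): (0, 4)}
regions = {"promoterA": ("chr1", 50, 200), "promoterB": ("chr1", 800, 1000)}
region_counts = aggregate_by_region(cpgs, regions)
```

The nested loop is fine for a handful of regions. For genome-scale data an interval index or sorted sweep would be the better choice.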
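Appendix:
The paper describing how the DSS package handles differential analysis under general experimental designs was published in Bioinformatics in 2016, under the title "Differential methylation analysis for BS-seq data under general experimental design".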