Because of rendering issues that affect the reading experience, please read this post on the blog~
GitPage address of this post
WGCNA
Lecture by Steve Horvath 2013, UCLA
Although WGCNA was designed for microarray data, it has also been applied to DNA methylation, RNA-seq, miRNA-seq, and peptide count data…
- Unweighted network: the connection between two nodes is either 0 or 1.
- Weighted network: the connection between two nodes takes a continuous value (the connection is weighted).
Unsigned Network and Signed Network
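- Unsigned network:
  - The correlation ranges from -1 to 1. Both -1 and 1 give an adjacency of 1 (strongly connected), because the adjacency is based on the absolute value of the correlation.
  - As a result, strongly positive and strongly negative correlations are both treated as strong connections.
- Signed network:
  - The correlation is rescaled to the range 0 to 1, so strongly negative correlations end up with an adjacency near 0.
From unsigned correlation to signed network: the correlation is mapped from [-1, 1] to [0, 1] as (1 + cor) / 2.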
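The difference between the network types can be sketched in a few lines. This is a minimal illustration in plain Python rather than the WGCNA R API; the function name `soft_adjacency` and its arguments are hypothetical, and the formulas follow the usual definitions (absolute correlation for unsigned, `(1 + cor) / 2` for signed, negative correlations set to 0 for signed hybrid), each raised to a soft-thresholding power.

```python
def soft_adjacency(cor, power, network_type="unsigned"):
    """Turn one correlation value into a weighted-network adjacency.

    Hypothetical helper for illustration only; not part of WGCNA.
    """
    if not -1.0 <= cor <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    if network_type == "unsigned":
        base = abs(cor)              # sign is discarded
    elif network_type == "signed":
        base = (1.0 + cor) / 2.0     # [-1, 1] is rescaled to [0, 1]
    elif network_type == "signed hybrid":
        base = max(cor, 0.0)         # negative correlations become 0
    else:
        raise ValueError("unknown network type: " + network_type)
    return base ** power


# Compare how each network type treats a negative, zero and positive correlation.
for cor in (-0.9, 0.0, 0.9):
    print(
        cor,
        round(soft_adjacency(cor, 6, "unsigned"), 4),
        round(soft_adjacency(cor, 12, "signed"), 4),
        round(soft_adjacency(cor, 6, "signed hybrid"), 4),
    )
```

In the unsigned case a correlation of -0.9 is as strongly connected as +0.9, while in the signed and signed hybrid cases it is close to or exactly 0.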
Q&A
Reference: Peter 2017
Data Analysis Questions
How many samples do I need?
At least 15 samples. More samples give more robust and refined results; with fewer samples the correlations are noisy and the noise may be impossible to remove.
Should I filter probesets or genes?
- Filtering genes by differential expression (DEGs) is not recommended, since it completely invalidates the scale-free topology assumption, which is the criterion used to choose the soft-thresholding power.
- In addition, filtering genes by differential expression leads to a set of correlated genes that will essentially form a single module (or a few highly correlated modules).
What argument (option) settings are recommended?
In general, the defaults are chosen to suit many applications and to keep results highly reproducible.
For new calculations, however, you can customize the arguments.
- Signed network
  The choice between a signed and an unsigned network is complicated, but in general we prefer the signed network.
  To construct a signed network, use `type = "signed"` or `type = "signed hybrid"` in functions such as `accuracyMeasures`, `adjacency`, `chooseOneHubInEachModule`, `chooseTopHubInEachModule`, `nearestNeighborConnectivity`, `nearestNeighborConnectivityMS`, `orderBranchesUsingHubGenes`, `softConnectivity`. Some functions use `networkType` instead to select the signed network.
- Robust correlation
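  Generally, unless you have good reason to believe that there are no outlier measurements, we recommend a robust correlation such as the biweight midcorrelation (`bicor`) rather than the default Pearson correlation. Many functions accept the `corFnc` argument, so you can pass `cor`, `bicor`, etc. to choose the correlation function yourself. For more, please see the Tutorial.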
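To see why a robust correlation matters when outliers are present, here is a rough sketch in plain Python that compares Pearson correlation with a simplified biweight midcorrelation. The helper names are hypothetical, the constant 9 and the median/MAD weighting follow the textbook definition, and the sketch does not try to reproduce the options of WGCNA's `bicor`.

```python
import math
from statistics import median


def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def _biweight_scaled(x):
    """Centre on the median, down-weight points far from it, scale to unit norm."""
    med = median(x)
    mad = median([abs(v - med) for v in x])
    if mad == 0:
        raise ValueError("MAD is zero; this sketch does not handle that case")
    weighted = []
    for v in x:
        u = (v - med) / (9.0 * mad)
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0  # outliers get weight 0
        weighted.append((v - med) * w)
    norm = math.sqrt(sum(t * t for t in weighted))
    return [t / norm for t in weighted]


def biweight_midcorrelation(x, y):
    """Simplified biweight midcorrelation (illustrative, not WGCNA's bicor)."""
    return sum(a * b for a, b in zip(_biweight_scaled(x), _biweight_scaled(y)))


# Made-up data: y tracks x closely except for one outlier in the last sample.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9, 7.0, 8.1, 9.0, -40.0]
print("pearson:", round(pearson(x, y), 3))
print("biweight:", round(biweight_midcorrelation(x, y), 3))
```

With this toy data the single outlier is enough to flip the sign of the Pearson correlation, while the biweight version still reports a strong positive relationship.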
Can WGCNA be used to analyze RNA-Seq data?
Peter: Yes. As far as WGCNA is concerned, working with (properly normalized) RNA-seq data isn’t really any different from working with (properly normalized) microarray data.
Suggestions:
- Remove low-count transcripts.
  Low-count transcripts tend to reflect noise (for example, remove all features that have a count of less than, say, 10 in more than 90% of the samples).
- Normalization
  `varianceStabilizingTransformation` from DESeq2 is really useful.
  RPKM and FPKM are helpful, too.
  We can also use `log2(x+1)`.

Notes:
- Different normalization methods have a large impact on expression comparisons, but only a limited effect on WGCNA.
- If the data come from different batches, we can use `ComBat` (example: 木头的博客) for batch effect removal.
- Finally, we usually check quantile scatterplots to make sure there are no systematic shifts between samples; if sample quantiles show correlations (which they usually do), quantile normalization can be used to remove this effect.
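The low-count filter and the `log2(x+1)` transform can be sketched as follows. This is a plain Python illustration on a made-up count table, not DESeq2 or WGCNA code; the function names, the thresholds as default arguments, and the toy gene names are assumptions for the example.

```python
import math


def filter_low_counts(counts, min_count=10, max_low_fraction=0.9):
    """Keep features unless their count is below min_count in more than
    max_low_fraction of the samples (hypothetical helper)."""
    kept = {}
    for feature, values in counts.items():
        n_low = sum(1 for v in values if v < min_count)
        if n_low / len(values) <= max_low_fraction:
            kept[feature] = values
    return kept


def log2_plus_one(counts):
    """Apply log2(x + 1) to every count."""
    return {
        feature: [math.log2(v + 1) for v in values]
        for feature, values in counts.items()
    }


# Toy table: feature -> counts across 10 samples (invented numbers).
counts = {
    "geneA": [120, 98, 143, 110, 87, 130, 101, 95, 140, 115],
    "geneB": [0, 2, 1, 0, 3, 0, 1, 0, 0, 2],    # below 10 in every sample
    "geneC": [0, 1, 0, 2, 0, 0, 1, 0, 48, 55],  # below 10 in 8 of 10 samples
}

filtered = filter_low_counts(counts)
transformed = log2_plus_one(filtered)
print(sorted(transformed))
```

Here `geneB` is dropped because it is low in all samples, while `geneC` is kept because it is low in only 80% of them.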
Data heterogeneity
Data heterogeneity can affect any statistical analysis. (Skip)
Soft-thresholding power
I can't get a good scale-free topology index no matter how high I set the soft-thresholding power.
First, the user should ensure that variables (probesets, genes etc.) have not been filtered by differential expression with respect to a sample trait.
Possible cause: check the sample clustering tree (example); strong clusters in the tree indicate globally different groups of samples. This may be caused by batch effects or by heterogeneity. Carefully adjust the samples before fitting the scale-free topology index.
If the source of the heterogeneity is something you do not want to remove, you can still choose the soft-thresholding power based on the number of samples, using the table below.
Number of samples | Unsigned and signed hybrid networks | Signed networks |
---|---|---|
Less than 20 | 9 | 18 |
20-30 | 8 | 16 |
30-40 | 7 | 14 |
More than 40 | 6 | 12 |
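The table can be turned into a small lookup. The sketch below is plain Python with a hypothetical function name; because the ranges in the table overlap at 30 and 40, it assumes that exactly 30 samples fall in the "20-30" row and exactly 40 samples in the "30-40" row.

```python
def fallback_power(n_samples, network_type="unsigned"):
    """Soft-thresholding power from the table above, chosen by sample count.

    Hypothetical helper; boundary handling at 30 and 40 is an assumption.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be positive")
    if network_type in ("unsigned", "signed hybrid"):
        column = 0
    elif network_type == "signed":
        column = 1
    else:
        raise ValueError("unknown network type: " + network_type)

    if n_samples < 20:
        row = (9, 18)
    elif n_samples <= 30:
        row = (8, 16)
    elif n_samples <= 40:
        row = (7, 14)
    else:
        row = (6, 12)
    return row[column]


print(fallback_power(25, "signed"))
```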
Enjoy~
GitHub: Karobben
Blog:Karobben
BiliBili:史上最不正經的生物狗