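The Rfam database is a collection of RNA families, each represented by a multiple sequence alignment, a consensus secondary structure, and a covariance model.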

https://rfam.org/

http://eddylab.org/infernal/

    conda install -c bioconda infernal

https://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/

    # Download the Rfam clan information file and the covariance models (Rfam.cm)
    wget -c https://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.clanin
    wget -c https://ftp.ebi.ac.uk/pub/databases/Rfam/CURRENT/Rfam.cm.gz
    gunzip Rfam.cm.gz
    # Compress and index the models so that cmscan can use them
    cmpress Rfam.cm
    ## Working... done.
    ## Pressed and indexed 4178 CMs and p7 HMM filters (4178 names and 4178 accessions).
    ## Covariance models and p7 filters pressed into binary file: Rfam.cm.i1m
    ## SSI index for binary covariance model file: Rfam.cm.i1i
    ## Optimized p7 filter profiles (MSV part) pressed into: Rfam.cm.i1f
    ## Optimized p7 filter profiles (remainder) pressed into: Rfam.cm.i1p
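As a quick sanity check after cmpress, one could confirm that the four index files named in the output above exist next to Rfam.cm. This is a minimal sketch; the helper name check_cmpress_index is hypothetical and not part of Infernal.

```shell
# Hypothetical helper (not part of Infernal): check that the four index
# files written by cmpress exist and are non-empty next to the CM file.
check_cmpress_index() {
    cm="$1"
    missing=0
    for ext in i1m i1i i1f i1p; do
        if [ ! -s "${cm}.${ext}" ]; then
            echo "missing or empty: ${cm}.${ext}" >&2
            missing=1
        fi
    done
    # Exit status 0 when all four files are present, 1 otherwise
    return "$missing"
}
```

It could be called as `check_cmpress_index Rfam.cm && echo "index OK"` before starting a long cmscan run.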
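    # Scan the genome against Rfam from a separate working directory
    mkdir infernal
    cd infernal
    genome=../../genome.renamed.fa
    # --cut_ga     report hits using the Rfam GA (gathering) thresholds
    # --rfam       use the fast, Rfam-level heuristic filters
    # --nohmmonly  never fall back to HMM-only mode, even for models with no base pairs
    # --fmt 2      write the hit table in format 2, which annotates overlapping hits
    # --clanin     clan information file, used when annotating overlaps
    cmscan \
        --cut_ga \
        --rfam \
        --nohmmonly \
        --fmt 2 \
        --tblout genome.tblout \
        --clanin ../Rfam.clanin \
        ../Rfam.cm \
        "${genome}"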
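The cmscan command above does not set -Z, the search space size in megabases that cmscan uses when computing E-values. The Rfam documentation suggests setting it to the total number of nucleotides in the genome, multiplied by 2 (both strands are searched) and divided by 1,000,000. A minimal sketch of that calculation follows; the function name rfam_z_value is hypothetical.

```shell
# Hypothetical helper: print a value for cmscan -Z (search space in Mb)
# computed as: total nucleotides in the FASTA file * 2 strands / 1,000,000.
rfam_z_value() {
    awk '!/^>/ { gsub(/[ \t\r]/, ""); n += length($0) }
         END   { printf "%.6f\n", n * 2 / 1000000 }' "$1"
}

# Possible use with the command above (other options unchanged):
#   cmscan -Z "$(rfam_z_value "${genome}")" --cut_ga --rfam ...
```

Header lines (starting with ">") are skipped, so only sequence characters are counted.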

https://github.com/nawrockie/jiffy-infernal-hmmer-scripts/blob/master/infernal-tblout2gff.pl
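Before converting to GFF, the hit table can be reduced to the best hit per region. In the `--fmt 2` table, the olp column (field 20) is "=" when a hit overlaps another hit with a lower E-value, so those rows can be dropped. The sketch below assumes the `--fmt 2` column layout (field 2 family name, field 3 Rfam accession, field 4 sequence name, fields 10 to 12 start, end and strand, fields 17 and 18 score and E-value); the function name tblout_best_hits is hypothetical.

```shell
# Hypothetical helper: keep cmscan --fmt 2 hits that are not marked "="
# in the olp column (field 20) and print a tab-separated summary:
# sequence, start, end, strand, family name, Rfam accession, score, E-value.
tblout_best_hits() {
    awk 'BEGIN { OFS = "\t" }
         /^#/ || NF < 20 { next }   # skip header, footer and blank lines
         $20 == "="      { next }   # skip hits overlapped by a better hit
         { print $4, $10, $11, $12, $2, $3, $17, $18 }' "$1"
}
```

For example, `tblout_best_hits genome.tblout > genome.best_hits.tsv` would write the summary to a file; the output file name is only an illustration. For the GFF conversion itself, check the usage message of the linked script for the options that match cmscan `--fmt 2` output.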

References

Rfam website: https://rfam.xfam.org/

About Rfam: https://docs.rfam.org/en/latest/about-rfam.html

Annotating non-coding RNA (ncRNA) in a genome with the Rfam database

ncRNA annotation