References
https://ftp.ncbi.nih.gov/pub/taxonomy/
Tools | R | taxonomizr
Tools | R | taxize
https://cran.r-project.org/web/packages/myTAI/vignettes/Taxonomy.html
SQL
1. Introduction
taxonomizr provides some simple functions to parse NCBI taxonomy files and accession dumps and to use them efficiently to assign taxonomy to accession numbers or taxonomic IDs. This is useful, for example, for assigning taxonomy to BLAST results. Everything is done locally after downloading the appropriate files from NCBI using the included functions (see below).
2. Commands
2.1 Main functions
prepareDatabase: download data from NCBI and prepare SQLite database
accessionToTaxa: convert accession numbers to taxonomic IDs
getTaxonomy: convert taxonomic IDs to taxonomy
library(taxonomizr)
# Note: building the database requires a lot of hard drive space, bandwidth and time to process all the data from NCBI
# prepareDatabase('accessionTaxa.sql')
blastAccessions <- c("Z17430.1","Z17429.1","X62402.1")
# Look up the taxonomic ID for each accession
ids <- accessionToTaxa(blastAccessions,'/home/xmyan/TPS_SynNet/20221031/TPS_identified/NR/Species/accessionTaxa.sql')
# Expand the taxonomic IDs into a ranked taxonomy (one row per ID)
getTaxonomy(ids,'/home/xmyan/TPS_SynNet/20221031/TPS_identified/NR/Species/accessionTaxa.sql')
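Since the introduction mentions BLAST results as a typical use case, here is a minimal sketch of how the two calls above might be combined to annotate a table of hits. The helper name `annotateBlastHits`, the column name `sseqid`, and the relative database path are assumptions made for illustration; the sketch also assumes the subject IDs are plain accessions such as `Z17430.1`.

```r
library(taxonomizr)

# Hypothetical helper (not part of taxonomizr): append taxonomy columns
# to a data frame of BLAST hits. `hits` is assumed to hold the subject
# accessions in a column named `sseqid`.
annotateBlastHits <- function(hits, sqlFile) {
  # Taxonomic ID for each accession
  ids <- accessionToTaxa(hits$sseqid, sqlFile)
  # Matrix with one row per ID and one column per rank
  taxa <- getTaxonomy(ids, sqlFile)
  # Drop row names so that repeated or missing IDs cannot clash
  rownames(taxa) <- NULL
  cbind(hits, taxid = ids, as.data.frame(taxa, stringsAsFactors = FALSE))
}

# Assumed location of a database built earlier with prepareDatabase()
sqlFile <- "accessionTaxa.sql"
if (file.exists(sqlFile)) {
  hits <- data.frame(sseqid = c("Z17430.1", "Z17429.1", "X62402.1"))
  print(annotateBlastHits(hits, sqlFile))
}
```

Keeping the lookup in one function makes it easy to reuse on any hit table, as long as the accession column is renamed to match.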
2.2 Other functions
getId: convert a biological name to taxonomic ID
getRawTaxonomy: find all taxonomic ranks for a taxonomic ID
normalizeTaxa: combine raw taxonomies with different taxonomic ranks
condenseTaxa: condense a set of taxa to their most specific common branch
makeNewick: generate a Newick formatted tree from taxonomic output
getAccessions: find accessions for a given taxonomic ID
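# `a` is assumed to be a data frame loaded earlier whose column V1 holds the biological names to look up
taxaId <- getId(a$V1,'/home/xmyan/TPS_SynNet/20221031/TPS_identified/NR/Species/accessionTaxa.sql')
print(taxaId)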
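Two of the helpers listed above, condenseTaxa and makeNewick, work on a taxonomy matrix rather than on the database, so they can be tried without the large download. The sketch below uses a small hand-typed matrix in the shape that getTaxonomy returns (one row per taxon, one column per rank). The matrix contents are illustrative values typed in by hand, not output from a real database query.

```r
library(taxonomizr)

# Hand-written stand-in for getTaxonomy() output (illustrative values)
ranks <- c("superkingdom", "phylum", "class", "order",
           "family", "genus", "species")
taxa <- matrix(
  c("Eukaryota", "Chordata", "Mammalia", "Primates",
    "Hominidae", "Homo", "Homo sapiens",
    "Eukaryota", "Chordata", "Mammalia", "Primates",
    "Hominidae", "Pan", "Pan troglodytes",
    "Eukaryota", "Chordata", "Mammalia", "Artiodactyla",
    "Bovidae", "Bos", "Bos taurus"),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("9606", "9598", "9913"), ranks)
)

# Most specific branch shared by all three taxa;
# ranks on which the taxa disagree come back as NA
common <- condenseTaxa(taxa)
print(common)

# Same idea, restricted to the two primates
apes <- condenseTaxa(taxa[1:2, ])
print(apes)

# Newick-formatted tree built from the same matrix
newick <- makeNewick(taxa)
print(newick)
```

With a real database, the matrix would instead come from getTaxonomy, and getRawTaxonomy followed by normalizeTaxa could be used when the default set of ranks is too coarse.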