Introduction
https://github.com/Nextomics/NextPolish
https://github.com/Nextomics/NextPolish2
conda install nextpolish -c bioconda
[General]
job_type = local
job_prefix = nextPolish
task = best
rewrite = yes
rerun = 3
parallel_jobs = 6
multithread_jobs = 5
genome = ./raw.genome.fasta #genome file
genome_size = auto
workdir = ./01_rundir
polish_options = -p {multithread_jobs}
[lgs_option]
lgs_fofn = ./lgs.fofn
lgs_options = -min_read_len 1k -max_depth 100
lgs_minimap2_options = -x map-ont
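The config above points lgs_fofn at ./lgs.fofn, a plain-text list of long-read files with one path per line. Below is a minimal sketch of preparing that list and launching the pipeline. The helper name make_fofn and the config file name run.cfg are assumptions for illustration, and the run step is skipped when nextPolish or run.cfg is not available.

```shell
#!/usr/bin/env bash
# Write the absolute paths of the given read files to a fofn, one per line.
# make_fofn is a hypothetical helper, not part of NextPolish.
make_fofn() {
    local out="$1"; shift
    local f
    : > "$out"                                   # create or truncate the list
    for f in "$@"; do
        # Resolve the directory to an absolute path so the list is valid from any workdir.
        printf '%s/%s\n' "$(cd "$(dirname "$f")" && pwd)" "$(basename "$f")" >> "$out"
    done
}

# Only run the pipeline when the tool and the config (assumed name: run.cfg) exist.
if command -v nextPolish >/dev/null 2>&1 && [ -f run.cfg ]; then
    make_fofn lgs.fofn input.lgs.reads.fq.gz
    nextPolish run.cfg
fi
```

Writing absolute paths avoids surprises when the pipeline changes into workdir (./01_rundir in the config above).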
threads=20
genome=input.genome.fa # assembled genome
lgsreads=input.lgs.reads.fq.gz # third-generation long reads
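# The minimap2 preset must match the read type passed to nextpolish2.py (-r):
# map-pb for CLR, map-hifi for HiFi (minimap2 >= 2.19), map-ont for ONT.
minimap2 \
-ax map-hifi \
-t ${threads} \
${genome} \
${lgsreads} | \
samtools sort - -m 2g --threads ${threads} -o genome.lgs.bam
samtools index genome.lgs.bam
# nextpolish2.py expects a list of sorted, indexed BAM files, one path per line.
ls "$(pwd)/genome.lgs.bam" > pb.map.bam.fofn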
python NextPolish/lib/nextpolish2.py \
-g ${genome} \
-l pb.map.bam.fofn \
-r hifi \
-p 20 \
-a \
-o genome.lgspolish.fa
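The minimap2 preset used for mapping and the -r value given to nextpolish2.py should describe the same read type. One way to keep them in sync is to derive the preset from the read type. This is a sketch under assumptions: the function name minimap2_preset is hypothetical, and map-hifi requires minimap2 2.19 or newer (older versions lack it).

```shell
#!/usr/bin/env bash
# Map a nextpolish2.py read type (-r) to a minimap2 preset (-ax).
# Assumption: minimap2 >= 2.19, which provides the map-hifi preset.
minimap2_preset() {
    case "$1" in
        clr)  echo "map-pb" ;;      # PacBio continuous long reads
        hifi) echo "map-hifi" ;;    # PacBio HiFi reads
        ont)  echo "map-ont" ;;     # Nanopore reads
        *)    echo "unknown read type: $1" >&2; return 1 ;;
    esac
}

read_type=hifi
preset=$(minimap2_preset "$read_type")
# Print the pairing instead of running the tools, so the sketch has no side effects.
echo "minimap2 -ax ${preset} ... | nextpolish2.py -r ${read_type} ..."
```

With a single read_type variable, switching data types only requires changing one line.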
usage: nextpolish2.py [-h] -g FILE -l FILE -r {clr,hifi,ont} [-b FILE] [-i BLOCK_INDEX] [-o FILE] [-p INT] [-u] [-w STR] [-a] [-sp] [-id FLOAT] [-as FLOAT]
nextpolish2.py:
  correct structural & base errors in the genome with long reads using multi-processor.
exmples:
  nextpolish2.py -g genome.fa -l lgs.sort.bam.list -r ont -p 10
options:
  -h, --help            show this help message and exit
  -g FILE, --genome FILE
                        genome file, the reference of bam alignments. (default: None)
  -l FILE, --bam_list FILE
                        sorted bam file list of long reads, one file one line, require index file. (default: None)
  -r {clr,hifi,ont}, --read_type {clr,hifi,ont}
                        reads type, clr=PacBio continuous long read, hifi=PacBio highly accurate long reads, ont=NanoPore 1D reads (default: None)
  -b FILE, --block FILE
                        genome block file, each line includes [seq_id, index]. (default: None)
  -i BLOCK_INDEX, --block_index BLOCK_INDEX
                        index of seqs need to be corrected in genome block file. (default: all)
  -o FILE, --out FILE   output file, corrected seqs in output file will be skipped. (default: stdout)
  -p INT, --process INT
                        number of processes used for correcting. (default: 10)
  -u, --uppercase       output uppercase sequences. (default: False)
  -w STR, --window STR  size of window (>=5M) to split super-long contigs, shorter size requires less memory and more CPU time. (default: 5M)
  -a, --auto            automatically adjust window size (-w) and processes (-p). (default: True)
  -sp, --split          split the corrected contig with un-corrected regions. (default: True)
  -id FLOAT, --alignment_identity_ratio FLOAT
                        split the corrected contig if alignment_identity/median_alignment_identity < $identity_ratio, co-use with --split. (default: 0.8)
  -as FLOAT, --alignment_score_ratio FLOAT
                        split the corrected contig if alignment_score/max_alignment_score < $alignment_score_ratio, co-use with --split. (default: 0.8)
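The -l description above says each listed BAM needs an index file. A small pre-flight check can catch a missing BAM or index before a long polishing run starts. This is only a sketch: check_bam_fofn is a hypothetical helper, and it assumes indexes are named as samtools index writes them (the BAM path plus .bai or .csi).

```shell
#!/usr/bin/env bash
# Return 0 only if every BAM listed in the fofn exists and has a .bai or .csi index.
# check_bam_fofn is a hypothetical helper, not part of NextPolish.
check_bam_fofn() {
    local fofn="$1" bam status=0
    # The extra test keeps a final line that lacks a trailing newline.
    while IFS= read -r bam || [ -n "$bam" ]; do
        [ -z "$bam" ] && continue                       # skip blank lines
        if [ ! -f "$bam" ]; then
            echo "missing BAM: $bam" >&2
            status=1
        elif [ ! -f "$bam.bai" ] && [ ! -f "$bam.csi" ]; then
            echo "missing index for: $bam" >&2
            status=1
        fi
    done < "$fofn"
    return "$status"
}
```

For example, `check_bam_fofn pb.map.bam.fofn && python NextPolish/lib/nextpolish2.py ...` would start polishing only when the list passes the check.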
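With the conda installation, nextpolish2.py is located under the environment's share directory, for example:
/proj/nobackup/hpc2nstor2024-021/shwzhao/bin/miniconda3/share/nextpolish-1.4.1/lib/nextpolish2.py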
References
GitHub repository: https://github.com/Nextomics/NextPolish
WeChat official account | 生信媛 | Polishing a third-generation assembly with NextPolish (v1.2.2)