Introduction
Download
Usage
RepeatScout -h
## RepeatScout Version 1.0.6
##
## Usage:
## RepeatScout -sequence <seq> -output <out> -freq <freq> -l <l> [opts]
## -L # size of region to extend left or right (10000)
## -match # reward for a match (+1)
## -mismatch # penalty for a mismatch (-1)
## -gap # penalty for a gap (-5)
## -maxgap # maximum number of gaps allowed (5)
## -maxoccurrences # cap on the number of sequences to align (10,000)
## -maxrepeats # stop work after reporting this number of repeats (10000)
## -cappenalty # cap on penalty for exiting alignment of a sequence (-20)
## -tandemdist # of bases that must intervene between two l-mers for both to be counted (500)
## -minthresh # stop if fewer than this number of l-mers are found in the seeding phase (3)
## -minimprovement # amount that a the alignment needs to improve each step to be considered progress (3)
## -stopafter # stop the alignment after this number of no-progress columns (100)
## -goodlength # minimum required length for a sequence to be reported (50)
## -maxentropy # entropy (complexity) threshold for an l-mer to be considered (-.7)
## -v[v[v[v]]] How verbose do you want it to be? -vvvv is super-verbose.
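-sequence
: Input sequence.
-output
: Output file.
-freq
: l-mer frequency table file.
-l
: Length of the l-mers.
-L
: Maximum distance to extend the seed sequence to the left or right. [10000]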
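-match
: Reward for a match. [+1]
-mismatch
: Penalty for a mismatch. [-1]
-gap
: Penalty for a gap. [-5]
-maxgap
: Maximum number of gaps allowed. [5]
-maxoccurrences
: Maximum number of sequences to align. [10000]
-maxrepeats
: Stop after reporting this number of repeats. [10000]
-cappenalty
: Cap on the penalty for exiting the alignment of a sequence. [-20]
-tandemdist
: Number of bases that must lie between two l-mers for both to be counted. [500]
-minthresh
: Stop if fewer than this number of l-mers are found in the seeding phase. [3]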
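-minimprovement
: Amount by which the alignment must improve at each step to count as progress. [3]
-stopafter
: Stop the alignment after this number of columns without progress. [100]
-goodlength
: Minimum length for a repeat sequence to be reported. [50]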
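-maxentropy
: Entropy (complexity) threshold for an l-mer to be considered; used to filter out low-complexity sequence. [-0.7]
-v[v[v[v]]]
: Verbosity of the output; -vvvv is the most verbose.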
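To make the scoring options concrete, the following minimal sketch scores an already-aligned pair of sequences using the default `-match`, `-mismatch`, and `-gap` values listed above. It is plain arithmetic for illustration, not RepeatScout's own extension algorithm. The function name `score_alignment` and the example sequences are hypothetical.

```python
def score_alignment(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -5) -> int:
    """Score two equal-length gapped strings column by column ('-' marks a gap).

    Illustrative only: the defaults mirror the -match, -mismatch and -gap
    defaults from the help text, but this is not RepeatScout's internal code.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap        # any column containing a gap pays the gap penalty
        elif x == y:
            score += match      # identical bases earn the match reward
        else:
            score += mismatch   # differing bases pay the mismatch penalty
    return score


print(score_alignment("ACGTAC", "ACGTTC"))  # 5 matches and 1 mismatch: 4
print(score_alignment("ACG-AC", "ACGTAC"))  # 5 matches and 1 gap: 0
```

With these defaults a single gap cancels five matches, which suggests why raising the `-gap` penalty toward zero would make gapped extensions easier to accept.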
Run
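A run can be scripted by assembling the command line from the options in the usage text. The sketch below only builds the argument list, so it works without RepeatScout installed. The helper name `repeatscout_command` and the file names `genome.fa`, `repeats.fa`, and `genome.freq` are hypothetical. The `-freq` file is assumed to exist already; the RepeatScout distribution ships a `build_lmer_table` program for producing it.

```python
import shlex


def repeatscout_command(sequence, output, freq, l=None, **options):
    """Build a RepeatScout argument list from the flags shown in `RepeatScout -h`.

    Hypothetical helper: keyword names in `options` are passed through as
    "-name value", so they must match flags from the help text (e.g. goodlength).
    """
    cmd = ["RepeatScout", "-sequence", sequence, "-output", output, "-freq", freq]
    if l is not None:
        cmd += ["-l", str(l)]
    for name, value in options.items():
        cmd += [f"-{name}", str(value)]
    return cmd


# Example file names are placeholders, not files that ship with the tool.
cmd = repeatscout_command("genome.fa", "repeats.fa", "genome.freq", l=15, goodlength=50)
print(shlex.join(cmd))
```

Once RepeatScout is on the `PATH` and the frequency table exists, the list can be passed to `subprocess.run(cmd, check=True)`.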
References
- https://github.com/mmcco/RepeatScout
- https://biocontainer-doc.readthedocs.io/en/latest/source/repeatscout/repeatscout.html