Introduction

Download

Usage

  RepeatScout -h
  ## RepeatScout Version 1.0.6
  ##
  ## Usage:
  ## RepeatScout -sequence <seq> -output <out> -freq <freq> -l <l> [opts]
  ## -L # size of region to extend left or right (10000)
  ## -match # reward for a match (+1)
  ## -mismatch # penalty for a mismatch (-1)
  ## -gap # penalty for a gap (-5)
  ## -maxgap # maximum number of gaps allowed (5)
  ## -maxoccurrences # cap on the number of sequences to align (10,000)
  ## -maxrepeats # stop work after reporting this number of repeats (10000)
  ## -cappenalty # cap on penalty for exiting alignment of a sequence (-20)
  ## -tandemdist # of bases that must intervene between two l-mers for both to be counted (500)
  ## -minthresh # stop if fewer than this number of l-mers are found in the seeding phase (3)
  ## -minimprovement # amount that a the alignment needs to improve each step to be considered progress (3)
  ## -stopafter # stop the alignment after this number of no-progress columns (100)
  ## -goodlength # minimum required length for a sequence to be reported (50)
  ## -maxentropy # entropy (complexity) threshold for an l-mer to be considered (-.7)
  ## -v[v[v[v]]] How verbose do you want it to be? -vvvv is super-verbose.
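  • -sequence: Input sequence file.
  • -output: Output file.
  • -freq: l-mer frequency table for the input sequence, as produced by build_lmer_table.
  • -l: Length of the l-mers; use the same value that was used to build the frequency table.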
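  • -L: Maximum length to extend to the left or right of a seed. [10000]
  • -match: Score for a match. [+1]
  • -mismatch: Penalty for a mismatch. [-1]
  • -gap: Penalty for a gap. [-5]
  • -maxgap: Maximum number of gaps allowed. [5]
  • -maxoccurrences: Maximum number of sequences to align. [10000]
  • -maxrepeats: Stop after reporting this number of repeats. [10000]
  • -cappenalty: Cap on the penalty for exiting the alignment of a sequence. [-20]
  • -tandemdist: Number of bases that must lie between two l-mers for both to be counted. [500]
  • -minthresh: Stop if fewer than this number of l-mers are found in the seeding phase. [3]
  • -minimprovement: Amount by which the alignment must improve at each step to count as progress. [3]
  • -stopafter: Stop the alignment after this number of columns without progress. [100]
  • -goodlength: Minimum length for a repeat sequence to be reported. [50]
  • -maxentropy: Entropy (complexity) threshold for an l-mer to be considered; used to filter out low-complexity sequence. [-0.7]
  • -v[v[v[v]]]: Verbosity of the output; -vvvv is the most verbose.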
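
To make the -match, -mismatch and -gap defaults concrete, the following Python sketch scores a fixed, already-gapped pairwise alignment with those three values. It is only an illustration of how the scores combine, assuming a simple linear gap penalty; it is not RepeatScout's extension algorithm, and the function name alignment_score and the example sequences are hypothetical.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-5):
    """Score two equal-length aligned strings; '-' marks a gap.

    Defaults mirror the -match, -mismatch and -gap defaults listed above.
    Assumption: every gap column costs the same (linear gap penalty).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap        # gap column
        elif x == y:
            score += match      # identical bases
        else:
            score += mismatch   # substitution
    return score


# Six matches, one mismatch and one gap: 6 * 1 - 1 - 5 = 0
print(alignment_score("ACGTACGT", "ACGAAC-T"))  # prints 0
```

With the defaults, a single gap cancels five matches, so raising -gap toward zero would make gapped extensions less costly in a scheme like this one.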

Running
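
A run can be sketched in Python by assembling the command lines from the usage string shown earlier. The RepeatScout flags come from that help output; the build_lmer_table step and its flags follow the upstream README as commonly described and should be checked against that tool's own help. The file names (genome.fa, genome.freq, repeats.fa) and the l-mer length 15 are placeholders, not recommended values, and both programs are assumed to be on PATH.

```python
import shlex


def lmer_table_cmd(sequence, freq, l):
    """Argument list for build_lmer_table (flags assumed from the upstream README)."""
    return ["build_lmer_table", "-sequence", sequence, "-freq", freq, "-l", str(l)]


def repeatscout_cmd(sequence, output, freq, l, **opts):
    """Argument list for RepeatScout, following the usage line in its help output.

    Extra keyword arguments become optional flags, e.g. goodlength=100
    is appended as '-goodlength 100'.
    """
    cmd = ["RepeatScout", "-sequence", sequence, "-output", output,
           "-freq", freq, "-l", str(l)]
    for name, value in opts.items():
        cmd += [f"-{name}", str(value)]
    return cmd


# Placeholder inputs; replace with real paths and an l-mer length suited to the genome.
steps = [
    lmer_table_cmd("genome.fa", "genome.freq", 15),
    repeatscout_cmd("genome.fa", "repeats.fa", "genome.freq", 15, goodlength=50),
]
for step in steps:
    print(shlex.join(step))  # show the command; use subprocess.run(step, check=True) to execute
```

The sketch only prints the commands, so it can be inspected without either program installed; the same -l value is passed to both steps so that the frequency table matches the l-mer length RepeatScout expects.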

References

  1. https://github.com/mmcco/RepeatScout
  2. https://biocontainer-doc.readthedocs.io/en/latest/source/repeatscout/repeatscout.html