Any TAB-delimited file that contains chromosome positions can be indexed, e.g. VCF, GFF, BED, SAM …
The input file must be compressed with bgzip.
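    bgzip -@ 4 input.vcf                # compress with 4 threads (writes input.vcf.gz)
    tabix input.vcf.gz                  # build the index
    tabix -l input.vcf.gz               # list chromosome names
    tabix input.vcf.gz X                # query a whole chromosome
    tabix input.vcf.gz 2:1000-1000      # query a single position
    tabix input.vcf.gz 2:1000-2000      # query an interval (1-based, inclusive)
    tabix -T target.bed input.vcf.gz    # region file; streams through the input
    tabix -R region.bed input.vcf.gz    # region file; jumps via the index
    # For both -R and -T, a file named *.bed is read as 0-based BED;
    # any other TAB-delimited region file (CHROM, POS[, POS_TO]) is read as 1-based.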
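The two coordinate conventions mentioned in the comments above are easy to mix up, so here is a minimal sketch of converting 0-based, half-open BED intervals into the 1-based, inclusive `chrom:start-end` strings that tabix accepts on the command line. The helper name `bed_to_regions` is made up for this note and is not part of htslib; it only assumes a POSIX `awk`.

```shell
#!/usr/bin/env bash
# bed_to_regions (hypothetical helper, not part of tabix/htslib):
# convert 0-based, half-open BED intervals into 1-based, inclusive
# region strings of the form chrom:start-end.
bed_to_regions() {
    awk 'BEGIN { FS = "\t" }
         /^#/ || /^track/ || /^browser/ { next }   # skip comment and header lines
         NF >= 3 { printf "%s:%d-%d\n", $1, $2 + 1, $3 }' "$@"
}

# Demo: the BED interval [999, 2000) covers 1-based positions 1000..2000.
printf '2\t999\t2000\nX\t0\t100\n' | bed_to_regions
```

The output lines can be passed straight to a query, for example `tabix input.vcf.gz $(bed_to_regions region.bed)`. Only the start shifts by one: the exclusive 0-based end and the inclusive 1-based end are the same number.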
Arbitrary TSV files
    $ tabix

    Version: 1.9
    Usage:   tabix [OPTIONS] [FILE] [REGION [...]]

    Indexing Options:
       -0, --zero-based           coordinates are zero-based
       -b, --begin INT            column number for region start [4]
       -c, --comment CHAR         skip comment lines starting with CHAR [null]
       -C, --csi                  generate CSI index for VCF (default is TBI)
       -e, --end INT              column number for region end (if no end, set INT to -b) [5]
       -f, --force                overwrite existing index without asking
       -m, --min-shift INT        set minimal interval size for CSI indices to 2^INT [14]
       -p, --preset STR           gff, bed, sam, vcf
       -s, --sequence INT         column number for sequence names (suppressed by -p) [1]
       -S, --skip-lines INT       skip first INT lines [0]

    Querying and other options:
       -h, --print-header         print also the header lines
       -H, --only-header          print only the header lines
       -l, --list-chroms          list chromosome names
       -r, --reheader FILE        replace the header with the content of FILE
       -R, --regions FILE         restrict to regions listed in the file
       -T, --targets FILE         similar to -R but streams rather than index-jumps
Interval files (e.g. BED files, UCSC database files)
    #chrom  start  end  name  score
    1       1      2    aaa   1

    tabix -c '#' -s 1 -b 2 -e 3 input.tsv.gz
    # add -0 if the start column is 0-based, as in BED
Position files (e.g. VCF, ANNOVAR annotation output)
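    chrom  pos  anno
    1      111  hello

    tabix -S 1 -s 1 -b 2 -e 2 input.tsv.gz
    # -c expects a single character, so the one-line header is skipped with -S 1 instead;
    # start and end point at the same column because each record is a single position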
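Besides being bgzip-compressed, the file has to be sorted by sequence name and then by start position before tabix can index it. The sketch below sorts a position file with a one-line header and then compresses and indexes it. The helper name `sort_keep_header` and the temporary file are illustrative assumptions, and the bgzip/tabix step only runs when both tools are on the `PATH`.

```shell
#!/usr/bin/env bash
# sort_keep_header (hypothetical helper): print the first line unchanged,
# then the remaining lines sorted by column 1 (text) and column 2 (numeric).
sort_keep_header() {
    local file=$1
    head -n 1 "$file"
    tail -n +2 "$file" | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2n
}

# Demo on a small, unsorted position file (made-up data).
tmp=$(mktemp)
printf 'chrom\tpos\tanno\n2\t500\tb\n1\t111\thello\n1\t20\ta\n' > "$tmp"
sort_keep_header "$tmp" > "$tmp.sorted"
cat "$tmp.sorted"

# Compress and index only if the htslib tools are installed.
if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    bgzip -f "$tmp.sorted"                          # writes $tmp.sorted.gz
    tabix -f -S 1 -s 1 -b 2 -e 2 "$tmp.sorted.gz"   # skip header, index chrom/pos
    tabix "$tmp.sorted.gz" 1:111-111                # query a single position
fi
```

Sorting under `LC_ALL=C` keeps the chromosome order byte-wise and reproducible across locales. Any consistent order works as long as all records for one sequence name stay together.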
