1. Introduction to BWA
Objective:
After quality control, BWA-MEM is used to align the quality-controlled reads to the reference genome.
Algorithm:
BWA is a fast alignment tool based on the Burrows-Wheeler Transform (BWT). It consists of three algorithms: BWA-backtrack, BWA-SW, and BWA-MEM. Among them, BWA-MEM is the latest; it is generally faster and more accurate, and better suited to human resequencing data analysis.
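To give a feel for the transform that BWA's index is built on, here is a toy sketch of the BWT in shell. It lists every rotation of the input (with a `$` end marker appended), sorts the rotations, and reads off the last column. The function name `bwt` is made up for illustration, and this naive approach only suits short strings; it is not how bwa itself constructs its index.

```shell
# Toy Burrows-Wheeler Transform (illustrative only; bwa uses far more
# efficient index construction algorithms).
bwt() {
    printf '%s\n' "$1" |
        # Append the end marker and print every rotation of the string.
        awk '{ s = $0 "$"; n = length(s)
               for (i = 1; i <= n; i++) print substr(s, i) substr(s, 1, i - 1) }' |
        # Sort in byte order so that "$" sorts before the letters.
        LC_ALL=C sort |
        # Concatenate the last character of each sorted rotation.
        awk '{ printf "%s", substr($0, length($0)) } END { print "" }'
}

bwt banana   # prints annb$aa
```

Runs of identical characters in the output (such as `nn` and `aa` here) are what make the transformed text compressible and searchable with an FM-index.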
Installation:
conda install -c bioconda bwa
2. BWA workflow (see the help file for details):
Index building
# Build an FM-index for a large genome
bwa index -a bwtsw $reference
# Build an index for a small genome (faster, but uses more memory)
bwa index -a is ref.fasta
# Index the 2019-nCoV reference used in the following steps (without -a, bwa chooses the algorithm automatically)
bwa index ./2019-nCoV.fasta
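Since indexing only needs to happen once per reference, it can be handy to skip it when the index already exists. The following is a minimal sketch, assuming the five index files that `bwa index` writes next to the FASTA (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`); the helper name `bwa_index_ready` is hypothetical.

```shell
# Succeed only if all five bwa index files exist (and are non-empty)
# next to the given reference FASTA.
bwa_index_ready() {
    ref="$1"
    for ext in amb ann bwt pac sa; do
        [ -s "$ref.$ext" ] || return 1
    done
    return 0
}

# Usage sketch: build the index only when it is missing.
# bwa_index_ready ./2019-nCoV.fasta || bwa index ./2019-nCoV.fasta
```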
3. Aligning a single paired-end sample (the input is the trimmed data):
bwa mem -t 8 ./bwa_index/2019-nCoV.fasta out.R1.fq.gz out.R2.fq.gz > 2nd.sam
4. Batch alignment of paired-end samples:
INDEX=./bwa_index/2019-nCoV.fasta
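# $CONFIG is a placeholder for a text file listing the trimmed FASTQ files,
# one sample per line: sample_name fq1 fq2
cat "$CONFIG" | while read id
do
# Split the line into sample name, read 1 file, and read 2 file
arr=($id)
sample=${arr[0]}
fq1=${arr[1]}
fq2=${arr[2]}
echo $sample $fq1 $fq2
# Align with BWA-MEM and pipe straight into samtools to write a sorted BAM
bwa mem -t 20 $INDEX /home/kelly/wesproject/4_clean/wes/$fq1 /home/kelly/wesproject/4_clean/wes/$fq2 | samtools sort -@ 20 -o $sample.bam -
done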
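The loop above reads a list file with three columns per line: sample name, read 1 file, and read 2 file. One way to produce such a list is sketched below. It assumes the trimmed files follow a `<sample>.R1.fq.gz` / `<sample>.R2.fq.gz` naming pattern, which may not match your data; the function name `make_sample_list` and the paths in the usage line are placeholders.

```shell
# Print "sample fq1 fq2" for every <sample>.R1.fq.gz in a directory that has
# a matching <sample>.R2.fq.gz (the naming convention is an assumption).
make_sample_list() {
    dir="$1"
    for fq1 in "$dir"/*.R1.fq.gz; do
        [ -e "$fq1" ] || continue          # no matches: the glob stays literal
        name=$(basename "$fq1")
        sample=${name%.R1.fq.gz}           # strip the suffix to get the sample name
        fq2="$sample.R2.fq.gz"
        if [ -e "$dir/$fq2" ]; then
            printf '%s %s %s\n' "$sample" "$name" "$fq2"
        else
            echo "warning: no R2 file for $sample" >&2
        fi
    done
}

# Usage sketch (paths are placeholders):
# make_sample_list /path/to/trimmed > samples.txt
```

Only file names are written, not full paths, because the loop prepends the data directory to each FASTQ name itself.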
5. Result:
The output is a SAM file (or, for the batch loop, one sorted BAM file per sample) that serves as the input to samtools in the next step.