1. SAM file
SAM (Sequence Alignment/Map) is a text format that stores alignment information for short reads mapped against reference sequences.
##---------------View SAM file
$ cat P3-VERO-P3-1-vero_L4.sam | head -n 10
# or
$ samtools view -h P3-VERO-P3-1-vero_L4.sam | head -n 10
A SAM file usually starts with a header section, followed by the alignment section, in which each alignment is written as one tab-separated line.
* Header section
Each header line begins with the character @ followed by one of the two-letter header record type codes defined in the SAM specification (for example @HD, @SQ, @RG, @PG, @CO).
@SQ SN:NC_045512.2 LN:29903
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem -t 4 -M /public/home/ykk/SARS_CoV_2/mapping/ref/Wuhan-Hu-1.fasta /public/home/ykk/SARS_CoV_2/clean_datP3-1-vero_L4_1.fq.gz /public/home/ykk/SARS_CoV_2/clean_data/P3-VERO-P3-1-vero_L4_2.fq.gz
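To make the header layout concrete, here is a minimal Python sketch that splits one header line into its record type and its TAG:VALUE pairs. The function name `parse_header_line` is hypothetical and chosen only for illustration, and the sketch assumes the fields are separated by real TAB characters, as they are in an actual SAM file.

```python
def parse_header_line(line):
    """Split one SAM header line into (record_type, {tag: value})."""
    if not line.startswith("@"):
        raise ValueError("header lines must start with '@'")
    fields = line.rstrip("\n").split("\t")
    record_type = fields[0][1:]  # e.g. "SQ" from "@SQ"
    if record_type == "CO":
        # @CO lines carry a free-text comment rather than TAG:VALUE pairs.
        return record_type, {"text": "\t".join(fields[1:])}
    tags = {}
    for field in fields[1:]:
        # Split on the first ':' only, because values may contain ':' too.
        key, _, value = field.partition(":")
        tags[key] = value
    return record_type, tags


record_type, tags = parse_header_line("@SQ\tSN:NC_045512.2\tLN:29903")
print(record_type, tags)
```

For the @SQ line shown above, this yields the record type `SQ` with the reference name under `SN` and its length (still a string) under `LN`.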
* Tab-delimited read alignment information lines
Each alignment line typically represents the linear alignment of a segment. Each line consists of 11 or more TAB-separated fields.
A00821:275:HWMMWDSXX:4:1101:1298:1016 83 NC_045512.2 4209 60 150M = 4179 -180 AAGCTTTGAGAAAAGTGCCAACAGACAATTATATAACCACTAGGGTTTAAATGGTTACACTGTAGAGGAGGCAAAGACAGTGCTTAAAAAGTGTAAAAGTGCCTTTTACATTCTACCATCTATTATCTCTAATGAGAAGC FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF::FFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFF NM:i:0 MD:Z:150 MC:Z:150M AS:i:150
……
A00821:275:HWMMWDSXX:4:1101:23484:2519 77 * 0 0 * * 0 0 ACGGATTGTACGCAAGTACAGTGGTAGGGGAGCGTTCCAAGGGTGATGATAAGGACTGGTGGAGCGCTTGGAAGTGATTATGCCGGCATGAGTAACGTTTGGAAGTGAGAATCTTCCATGCCGTTTGACCAAGGTTTCCT FFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFF,FFFFFFFFFF:FFFFFFFFFF:FFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFF AS:i:0 XS:i:0
The first eleven fields are always present and in the order shown below; if the information represented by any of these fields is unavailable, that field’s value will be a placeholder, either ‘0’ or ‘*’ as determined by the field’s type.
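As an illustration of the eleven mandatory fields plus optional tags, the following Python sketch splits one alignment line into named values. The names `Alignment` and `parse_alignment_line` are hypothetical, and the record used at the end is a shortened, made-up line modeled on the example above (5 bases instead of 150), not real data.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: dict


def parse_alignment_line(line):
    """Parse one TAB-separated SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("an alignment line needs at least 11 fields")
    tags = {}
    for optional in fields[11:]:
        # Optional fields have the form TAG:TYPE:VALUE, e.g. NM:i:0.
        tag, value_type, value = optional.split(":", 2)
        if value_type == "i":
            value = int(value)
        elif value_type == "f":
            value = float(value)
        tags[tag] = value
    return Alignment(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], tags,
    )


# Shortened, made-up record for illustration only.
example = "read1\t83\tNC_045512.2\t4209\t60\t5M\t=\t4179\t-180\tAAGCT\tFFFFF\tNM:i:0\tMD:Z:5"
record = parse_alignment_line(example)
print(record.rname, record.pos, record.cigar, record.tags)
```

Placeholders such as `*` and `0` are kept as they appear in the file, so callers can decide how to treat unavailable values.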
* The Meaning of Each Column
- QNAME: Read Name
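- FLAG: Combination of bitwise FLAGs. Each bit has the following meaning in the SAM specification:
    - 0x1 (1): template having multiple segments in sequencing (paired reads)
    - 0x2 (2): each segment properly aligned according to the aligner
    - 0x4 (4): segment unmapped
    - 0x8 (8): next segment in the template unmapped
    - 0x10 (16): SEQ being reverse complemented
    - 0x20 (32): SEQ of the next segment in the template being reverse complemented
    - 0x40 (64): the first segment in the template
    - 0x80 (128): the last segment in the template
    - 0x100 (256): secondary alignment
    - 0x200 (512): not passing filters, such as platform/vendor quality controls
    - 0x400 (1024): PCR or optical duplicate
    - 0x800 (2048): supplementary alignment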
- RNAME: Reference sequence NAME of the alignment.
- POS: 1-based leftmost mapping position of the first CIGAR operation that “consumes” a reference base (see the CIGAR field below). The first base in a reference sequence has coordinate 1. POS is set to 0 for an unmapped read without coordinates.
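- MAPQ: Mapping quality, defined as −10 log10 Pr{mapping position is wrong}, rounded to the nearest integer. A value of 255 indicates that the mapping quality is not available.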
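- CIGAR: A string of operations, each written as a length followed by an operation code (set ‘*’ if unavailable). The operations defined in the SAM specification are:
    - M: alignment match, which can be a sequence match or mismatch (consumes query: yes; consumes reference: yes)
    - I: insertion to the reference (consumes query: yes; consumes reference: no)
    - D: deletion from the reference (consumes query: no; consumes reference: yes)
    - N: skipped region from the reference (consumes query: no; consumes reference: yes)
    - S: soft clipping, clipped sequence present in SEQ (consumes query: yes; consumes reference: no)
    - H: hard clipping, clipped sequence not present in SEQ (consumes query: no; consumes reference: no)
    - P: padding, silent deletion from the padded reference (consumes query: no; consumes reference: no)
    - =: sequence match (consumes query: yes; consumes reference: yes)
    - X: sequence mismatch (consumes query: yes; consumes reference: yes)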
- RNEXT: Reference sequence name of the primary alignment of the NEXT read in the template. This field is set as ‘*’ when the information is unavailable, and set as ‘=’ if RNEXT is identical to RNAME.
- PNEXT: 1-based Position of the primary alignment of the NEXT read in the template. Set as 0 when the information is unavailable.
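- TLEN: Signed observed template length. For segments mapped to the same reference, the leftmost segment has a positive value and the rightmost a negative one (for example, -180 in the first record above). Set to 0 for a single-segment template or when the information is unavailable.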
- SEQ: Read Sequence
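- QUAL: Read quality, stored as the ASCII character of each base's Phred-scaled quality plus 33 (set ‘*’ if unavailable).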
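The FLAG, CIGAR and QUAL columns are encoded values, so a short Python sketch may help show how they decode. The bit values, the set of reference-consuming CIGAR operations and the Phred+33 offset come from the SAM specification; the short labels such as `proper_pair` and the function names are informal and hypothetical, chosen only for this example.

```python
import re

# (bit, informal label); the labels are illustrative, not official names.
FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_template"),
    (0x80, "last_in_template"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_REFERENCE = set("MDN=X")


def decode_flag(flag):
    """Return the labels of all bits set in a FLAG value."""
    return [label for bit, label in FLAG_BITS if flag & bit]


def reference_span(cigar):
    """Number of reference bases covered by a CIGAR string (0 for '*')."""
    if cigar == "*":
        return 0
    return sum(int(length) for length, op in CIGAR_OP.findall(cigar)
               if op in CONSUMES_REFERENCE)


def phred_scores(qual):
    """Convert a Phred+33 QUAL string into integer quality scores."""
    return [ord(char) - 33 for char in qual]


print(decode_flag(83))   # ['paired', 'proper_pair', 'reverse', 'first_in_template']
print(decode_flag(77))   # ['paired', 'unmapped', 'mate_unmapped', 'first_in_template']
print(4209 + reference_span("150M") - 1)  # 4358, last reference position covered
print(phred_scores("F:,"))  # [37, 25, 11]
```

Applied to the two example records above, FLAG 83 describes a properly paired first read mapped to the reverse strand, while FLAG 77 describes a first read whose own alignment and mate are both unmapped.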
2. BAM file
A BAM file (*.bam) is the compressed binary version of a SAM file.
* View BAM file
Because BAM is binary, it cannot be read directly with cat; use the samtools view command to print it as SAM text.
$ samtools view -h P3-VERO-P3-1-vero_L4_sort.bam | head -n 10
@HD VN:1.6 SO:coordinate
@SQ SN:NC_045512.2 LN:29903
@PG ID:bwa PN:bwa VN:0.7.17-r1188 CL:bwa mem -t 4 -M /public/home/ykk/SARS_CoV_2/mapping/ref/Wuhan-Hu-1.fasta /public/home/ykk/SARS_CoV_2/clean_data/P3-VERO-P3-1-vero_L4_1.fq.gz /public/home/ykk/SARS_CoV_2/clean_data/P3-VERO-P3-1-vero_L4_2.fq.gz
@PG ID:samtools PN:samtools PP:bwa VN:1.11 CL:samtools view -b -F 12 /public/home/ykk/SARS_CoV_2/mapping/P3-VERO-P3-1-vero_L4.sam
@PG ID:samtools.1 PN:samtools PP:samtools VN:1.11 CL:samtools sort -@ 4 -l 9 -o /public/home/ykk/SARS_CoV_2/sortedbam/P3-VERO-P3-1-vero_L4_sort.bam /public/home/ykk/SARS_CoV_2/bam/P3-VERO-P3-1-vero_L4.bam
@PG ID:samtools.2 PN:samtools PP:samtools.1 VN:1.12 CL:samtools view -h P3-VERO-P3-1-vero_L4_sort.bam
A00821:275:HWMMWDSXX:4:1116:11957:9580 163 NC_045512.2 1 60 92M = 1 92 ATTAAAGGTTTATACCTTCCCAGGTAACAAACCAACCAACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGC ,FFFF:F:FFFF:FFF:FFFFFF:FFFFFFFFFFFFFFFF:FFFFFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F NM:i:0 MD:Z:92 MC:Z:92M AS:i:92 XS:i:0
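To illustrate what "compressed binary" means in practice: BAM uses BGZF, a gzip-compatible block compression, and the decompressed data starts with the four magic bytes `BAM\1`. The sketch below relies only on that fact and on Python's standard `gzip` module. The function name `looks_like_bam` is hypothetical, and the check is a quick sanity test, not a full BAM parser or a replacement for samtools.

```python
import gzip


def looks_like_bam(path):
    """Return True if the file decompresses to data starting with b'BAM\\x01'."""
    try:
        with gzip.open(path, "rb") as handle:
            return handle.read(4) == b"BAM\x01"
    except (gzip.BadGzipFile, EOFError):
        # Not gzip/BGZF data (for example a plain-text SAM file) or truncated.
        return False


if __name__ == "__main__":
    import sys

    # Usage: python check_bam.py file1.bam file2.sam ...
    for path in sys.argv[1:]:
        print(path, looks_like_bam(path))
```

A plain-text SAM file fails the check because it is not gzip-compressed, which is also why it can be inspected with cat while a BAM file cannot.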
3. More information:
- http://samtools.github.io/hts-specs/SAMv1.pdf
- https://blog.csdn.net/zhu_si_tao/article/details/53436351
- https://blog.csdn.net/genome_denovo/article/details/78712972
