Syntax-rendering problems may affect readability here; please read this post on the blog~
GitPage address of this post

miRNA pipeline

1 fastq-dump

fastq-dump *

Read length distribution

from the internet
Python script

from collections import Counter

seq_read = "fastp_cut.fq"
output = "fastp.distribution"

# Count read lengths. In FASTQ, the sequence is the 2nd line of every 4-line record.
lengths = Counter()
with open(seq_read) as fq:
    for i, line in enumerate(fq):
        if i % 4 == 1:
            # strip the trailing newline so it is not counted as a base
            lengths[len(line.strip())] += 1

# Write one "length<TAB>count" row per read length, sorted by length
with open(output, "w") as out:
    for length, count in sorted(lengths.items()):
        out.write(f"{length}\t{count}\n")

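R script

# Plot the read length distribution table written by the Python script
save_name <- "fastp.pdf"
table_name <- "fastp.distribution"
library(ggplot2)
# V1 = read length, V2 = number of reads
A <- read.table(table_name)
ggplot(A, aes(x = V1, y = V2)) + geom_bar(stat = "identity")
ggsave(save_name)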

2 FastQC

for i in *.fastq; do
    mkdir -p "QC_$i"
    ~/Biosoft/FastQC/fastqc -o "QC_$i" -t 7 "$i"
done

mkdir 2-QC
mv QC* 2-QC/
mkdir 1-reads
mv E* 1-reads/

3 alignment

mkdir 3-align
cd 3-align
# align to the genome; $i already holds the ../1-reads/ path, so write the SAM into 3-align by basename
for i in ../1-reads/ERR219785*.fastq; do
    bowtie2 -p 8 -x /media/ken/Data/CrippsLab/DB_D.melanogaster/Genome -U "$i" -S "$(basename "$i").Genome.sam"
done
cd ..

# align to miRNA hairpins; the reads were moved to 1-reads/ in step 2
for i in 1-reads/ERR219785*.fastq; do
    bowtie2 -p 8 -x /media/ken/Data/DB/miRNA/hairpin -U "$i" -S "3-align/$(basename "$i").hairpin.sam"
done

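# align to mature miRNAs; name the output .mature.sam so it does not overwrite the hairpin results
for i in 1-reads/ERR219785*.fastq; do
    bowtie2 -p 8 -x /media/ken/Data/DB/miRNA/mature -U "$i" -S "3-align/$(basename "$i").mature.sam"
done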

4 counts

samtools view  -SF 4 2.sam |perl -alne '{$h{$F[2]}++}END{print "$_\t$h{$_}" foreach sort keys %h }'  > 2-hairpin.counts
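For readers less familiar with Perl, the one-liner above can be sketched in Python. This is only an illustration, not the pipeline's own code: the function names `count_mapped_references` and `write_counts` are made up here, and the sketch assumes an uncompressed text SAM file. It skips header lines and unmapped reads (flag bit 0x4, the same filter as `-F 4`) and tallies the reference name in column 3.

```python
from collections import Counter


def count_mapped_references(sam_path):
    """Count mapped reads per reference name (SAM column 3, RNAME)."""
    counts = Counter()
    with open(sam_path) as sam:
        for line in sam:
            if line.startswith("@"):      # SAM header line
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 3:           # skip malformed or blank lines
                continue
            if int(fields[1]) & 4:        # flag bit 0x4: read is unmapped
                continue
            counts[fields[2]] += 1
    return counts


def write_counts(counts, out_path):
    """Write "reference<TAB>count" rows, sorted by reference name."""
    with open(out_path, "w") as out:
        for name in sorted(counts):
            out.write(f"{name}\t{counts[name]}\n")
```

Under these assumptions, `write_counts(count_mapped_references("2.sam"), "2-hairpin.counts")` should give the same kind of table as the samtools/perl command.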

Enjoy~

This post is updated automatically by a Python script (GitHub/Yuque)


GitHub: Karobben
Blog: Karobben
BiliBili: 史上最不正經的生物狗