Monday, January 11, 2016

NGS: simple pipeline for variant calling

Data flow by file type: fastq -> sam/bam -> vcf/bcf

1. Align NGS short reads to the reference genome
1). Build the index for the reference genome (ref/ref is the index basename, so only the ref/ directory has to exist):
mkdir -p ref
bowtie2-build /reference genome/*.fa ref/ref

2). Align the reads to the reference genome (-U marks the file as unpaired reads):
bowtie2 -x ref/ref -U sample.fastq -S sample.bt2.sam
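
Before trusting the alignment summary that bowtie2 prints to stderr (its first line reads "N reads; of these:"), it can help to count the input reads independently. The following is a minimal sketch, not part of the original pipeline: count_fastq_reads is a hypothetical helper name, and it assumes an uncompressed FASTQ with exactly four lines per record.

```shell
# Count reads in an uncompressed FASTQ stream read from stdin.
# Assumption: every record is exactly 4 lines (no wrapped sequence lines).
count_fastq_reads() {
    awk 'END { print int(NR / 4) }'
}
```

Usage would look like `count_fastq_reads < sample.fastq`; the number should match the read total in the bowtie2 summary.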

3). Convert the SAM alignment to BAM:
samtools view -b -T ref.fa sample.bt2.sam > sample.bt2.bam

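4). Sort and index the aligned BAM file (samtools 1.3 and later take the output file via -o; older releases took an output prefix as the second argument instead):
samtools sort -o sample.bt2.sorted.bam sample.bt2.bam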
samtools index sample.bt2.sorted.bam
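
A quick way to confirm that sorting took effect is to look at the SO tag on the @HD line of the BAM header. The helper below is a sketch under assumptions: sam_sort_order is a hypothetical name, and it only parses header text piped into it, so it needs nothing beyond awk.

```shell
# Print the SO (sort order) value from the @HD line of a SAM header on stdin.
# Prints "unknown" when there is no @HD line or no SO tag.
sam_sort_order() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
        }
        END { if (!found) print "unknown" }
    '
}
```

For example, `samtools view -H sample.bt2.sorted.bam | sam_sort_order` should report coordinate for a coordinate-sorted file.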

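5). Run mpileup to write per-site genotype likelihoods as uncompressed VCF (mpileup emits likelihoods only; a caller such as bcftools call turns them into variant calls):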
samtools mpileup -v -u -f /reference genome/*.fa sample.bt2.sorted.bam > sample.vcf
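
The five steps can be parameterized by reference file and sample name. The sketch below only prints the commands rather than running them, so it can be inspected first. print_pipeline is a hypothetical helper, and it assumes bowtie2's -U flag for unpaired reads, the samtools 1.3+ form of sort (-o), a samtools release whose mpileup still accepts -v and -u, and paths without spaces.

```shell
# Print the pipeline commands for one sample without executing them.
# Arguments: <reference.fa> <sample name>; both are placeholders.
# Assumes samtools >= 1.3 sort syntax and paths that contain no spaces.
print_pipeline() {
    ref_fa=$1
    sample=$2
    echo "mkdir -p ref"
    echo "bowtie2-build ${ref_fa} ref/ref"
    echo "bowtie2 -x ref/ref -U ${sample}.fastq -S ${sample}.bt2.sam"
    echo "samtools view -b -T ${ref_fa} ${sample}.bt2.sam > ${sample}.bt2.bam"
    echo "samtools sort -o ${sample}.bt2.sorted.bam ${sample}.bt2.bam"
    echo "samtools index ${sample}.bt2.sorted.bam"
    echo "samtools mpileup -v -u -f ${ref_fa} ${sample}.bt2.sorted.bam > ${sample}.vcf"
}
```

Calling `print_pipeline ref.fa sample` lists the commands; once they look right, the output could be piped to `sh -e` so that execution stops at the first failing step.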
