
Step 2.4 Calling the variants

No single approach captures all genetic variation. For germline variants, Pabinger et al. [1] suggest using a consensus of results from three tools:

  1. CRISP [2],
  2. HaplotypeCaller [3] from the GATK, and
  3. mpileup from SAMtools (a tutorial is available on the ANGUS site, Michigan State University: http://ged.msu.edu/angus/tutorials-2012/snp_tutorial.html).
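One simple way to form such a consensus is to keep only the variants reported by at least two of the three callers. The sketch below is a minimal illustration, not part of any of the tools above: the function names `read_variant_keys` and `consensus_calls`, the `min_callers` threshold, and the toy records are all assumptions made for the example. It matches variants on chromosome, position, reference allele, and alternate allele, and it assumes the callers already represent each variant identically (in practice indels often need to be normalized first).

```python
from collections import Counter


def read_variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from the lines of one VCF file."""
    keys = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines, meta-information (##...) and the header (#CHROM...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alt_field = fields[0], int(fields[1]), fields[3], fields[4]
        # A record may list several comma-separated alternate alleles.
        for alt in alt_field.split(","):
            keys.add((chrom, pos, ref, alt))
    return keys


def consensus_calls(call_sets, min_callers=2):
    """Return the sorted variants found in at least `min_callers` call sets."""
    counts = Counter()
    for keys in call_sets:
        counts.update(keys)
    return sorted(key for key, n in counts.items() if n >= min_callers)


# Toy records standing in for the output of the three callers (hypothetical data).
header = ["##fileformat=VCFv4.2", "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"]
crisp_lines = header + ["chr1\t100\t.\tA\tG\t50\tPASS\t.",
                        "chr1\t200\t.\tC\tT\t40\tPASS\t."]
hc_lines = header + ["chr1\t100\t.\tA\tG\t60\tPASS\t.",
                     "chr2\t300\t.\tG\tA,C\t30\tPASS\t."]
mpileup_lines = header + ["chr1\t100\t.\tA\tG\t55\tPASS\t.",
                          "chr1\t200\t.\tC\tT\t35\tPASS\t.",
                          "chr2\t300\t.\tG\tC\t25\tPASS\t."]

call_sets = [read_variant_keys(lines)
             for lines in (crisp_lines, hc_lines, mpileup_lines)]
majority = consensus_calls(call_sets, min_callers=2)
unanimous = consensus_calls(call_sets, min_callers=3)
print(majority)
print(unanimous)
```

With real data, each list of lines would come from opening the corresponding VCF file. Raising `min_callers` to 3 keeps only the calls on which all three tools agree, trading sensitivity for specificity.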

MuTect2 was recently added to the GATK as a variant discovery tool specifically for cancer variants. It calls somatic SNPs and indels by combining the original MuTect [4] with the HaplotypeCaller. The HaplotypeCaller assumes a diploid genome, whereas MuTect2 allows a different allelic fraction for each variant, which makes it useful for tumor variant discovery. Joint calling (GVCF generation) is not available in MuTect2.
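To see why the diploid assumption matters, consider the fraction of reads that support the alternate allele at a site. In a diploid germline sample this fraction is expected to be close to 0, 0.5, or 1, whereas a somatic variant in an impure or heterogeneous tumor can sit at almost any value. The sketch below only illustrates that arithmetic; it does not reproduce the statistical model of either caller, and the function names and read counts are hypothetical.

```python
# Alternate-allele fractions expected under a diploid model:
# homozygous reference, heterozygous, homozygous alternate.
DIPLOID_EXPECTED = (0.0, 0.5, 1.0)


def allele_fraction(ref_reads, alt_reads):
    """Fraction of reads at a site that support the alternate allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def nearest_diploid_fraction(fraction):
    """Closest allele fraction that a diploid genotype would predict."""
    return min(DIPLOID_EXPECTED, key=lambda expected: abs(expected - fraction))


# Hypothetical site: 92 reference reads and 8 alternate reads.
subclonal = allele_fraction(92, 8)
print(subclonal, nearest_diploid_fraction(subclonal))

# Hypothetical germline heterozygous site: 48 reference and 52 alternate reads.
germline_het = allele_fraction(48, 52)
print(germline_het, nearest_diploid_fraction(germline_het))
```

In the first site the observed fraction is far from 0.5, so the closest diploid explanation is homozygous reference and the alternate reads look like noise. A caller that lets the allelic fraction vary per variant can still treat such a site as a candidate somatic mutation.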

Variant calls are usually produced as VCF files [5], which are much smaller than the BAM files from which they are generated.

Bibliography

  1. Pabinger, S. et al. A survey of tools for variant analysis of next-generation genome sequencing data. Brief. Bioinformatics 15, 256–278 (2014). 

  2. Bansal, V. A statistical method for the detection of variants from next-generation resequencing of DNA pools. Bioinformatics 26, i318-24 (2010). 

  3. Van der Auwera, G. A. et al. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr. Protoc. Bioinformatics 11, 11.10.1-11.10.33 (2013). 

  4. Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013). 

  5. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).