Explain the process of variant calling for SNPs and indels from NGS data.
Answer
Variant calling identifies positions where aligned NGS reads differ from the reference genome, covering both SNPs and short indels. The process involves:
1) Read preprocessing: quality filtering, duplicate marking, and base quality score recalibration (BQSR in GATK).
2) Pileup generation: reads are stacked at each genomic position and the alleles they support are counted.
3) Variant detection: statistical models distinguish true variants from sequencing errors. GATK HaplotypeCaller performs local de novo assembly of the reads in regions that show evidence of variation, which helps with indels. FreeBayes uses a haplotype-based Bayesian model.
4) Filtering: calls are filtered on QUAL score, depth, strand bias, and mapping quality, using either variant quality score recalibration (VQSR) or hard filters.
5) Annotation: the functional impact of each variant is predicted with VEP, ANNOVAR, or SnpEff.
Challenges include low-frequency variants, repetitive regions, indel alignment ambiguity, and systematic sequencing errors.
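To make steps 2 and 3 concrete, here is a minimal sketch of allele counting followed by a simple likelihood comparison of the three diploid genotypes at one site. It is a toy model, not the algorithm used by GATK HaplotypeCaller or FreeBayes. The function names are hypothetical, and the single fixed error rate of 0.01 is an assumption; real callers use per-base quality scores and priors instead.

```python
import math
from collections import Counter


def count_alleles(pileup_bases):
    """Count the bases observed in the reads covering one position (step 2)."""
    return Counter(pileup_bases)


def genotype_log_likelihoods(n_ref, n_alt, error_rate=0.01):
    """Return log10 likelihoods of the read counts for 0/0, 0/1 and 1/1 (step 3).

    Assumption: every read has the same error probability. A genotype carrying
    k alt copies (k = 0, 1, 2) yields an alt read with probability
    p_alt = (k / 2) * (1 - e) + (1 - k / 2) * e.
    The binomial coefficient is omitted because it is identical for all three
    genotypes and does not change their ranking.
    """
    if not 0.0 < error_rate < 1.0:
        raise ValueError("error_rate must be strictly between 0 and 1")
    likelihoods = {}
    for genotype, alt_copies in (("0/0", 0), ("0/1", 1), ("1/1", 2)):
        alt_fraction = alt_copies / 2
        p_alt = alt_fraction * (1 - error_rate) + (1 - alt_fraction) * error_rate
        likelihoods[genotype] = (
            n_alt * math.log10(p_alt) + n_ref * math.log10(1 - p_alt)
        )
    return likelihoods


def call_genotype(pileup_bases, ref_base, alt_base, error_rate=0.01):
    """Pick the genotype with the highest likelihood at a single site."""
    counts = count_alleles(pileup_bases)
    likelihoods = genotype_log_likelihoods(
        counts[ref_base], counts[alt_base], error_rate
    )
    best = max(likelihoods, key=likelihoods.get)
    return best, likelihoods


if __name__ == "__main__":
    # Ten reads, half supporting the alternate allele: heterozygous call.
    print(call_genotype("AAAAAGGGGG", "A", "G")[0])  # 0/1
    # A single mismatching read is better explained as a sequencing error.
    print(call_genotype("AAAAAAAAAG", "A", "G")[0])  # 0/0
```

The second call shows why a statistical model is needed at all: one alternate read in ten is more likely under a homozygous-reference genotype with a 1% error rate than under a heterozygous one. Production callers build on the same idea but add genotype priors and per-read base and mapping qualities, and for indels they realign or reassemble the reads first.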