Stratton, M. R., Campbell, P. J. & Futreal, P. A. The cancer genome. Nature 458, 719–724 (2009).
Alexandrov, L. B. et al. The repertoire of mutational signatures in human cancer. Nature 578, 94–101 (2020).
Alexandrov, L. B. & Stratton, M. R. Mutational signatures: the patterns of somatic mutations hidden in cancer genomes. Curr. Opin. Genet. Dev. 24, 52–60 (2014).
Perera-Bel, J. et al. From somatic variants towards precision oncology: evidence-driven reporting of treatment options in molecular tumor boards. Genome Med. 10, 18 (2018).
Garcia-Prieto, C. A., Martínez-Jiménez, F., Valencia, A. & Porta-Pardo, E. Detection of oncogenic and clinically actionable mutations in cancer genomes critically depends on variant calling tools. Bioinformatics 38, 3181–3191 (2022).
Farswan, A. et al. Branching clonal evolution patterns predominate mutational landscape in multiple myeloma. Am. J. Cancer Res. 11, 5659–5679 (2021).
Li, W. & Freudenberg, J. Mappability and read length. Front. Genet. 5, 381 (2014).
Larson, D. E. et al. SomaticSniper: identification of somatic point mutations in whole genome sequencing data. Bioinformatics 28, 311–317 (2012).
Koboldt, D. C. et al. VarScan 2: Somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
Wilm, A. et al. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res. 40, 11189–11201 (2012).
Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat. Biotechnol. 31, 213–219 (2013).
Kim, S. et al. Strelka2: fast and accurate calling of germline and somatic variants. Nat. Methods 15, 591–594 (2018).
Sahraeian, S. M. E. et al. Deep convolutional neural networks for accurate somatic mutation detection. Nat. Commun. 10, 1041 (2019).
Krishnamachari, K. et al. Accurate somatic variant detection using weakly supervised deep learning. Nat. Commun. 13, 4248 (2022).
Musunuri, R. L. et al. Lancet2: improved and accelerated somatic variant calling with joint multi-sample local assembly graphs. Preprint at bioRxiv https://doi.org/10.1101/2025.02.18.638852 (2025).
Fang, L. T. et al. Establishing community reference samples, data and call sets for benchmarking cancer mutation detection using whole-genome sequencing. Nat. Biotechnol. 39, 1151–1160 (2021).
Logsdon, G. A., Vollger, M. R. & Eichler, E. E. Long-read human genome sequencing and its applications. Nat. Rev. Genet. 21, 597–614 (2020).
Damaraju, N., Miller, A. L. & Miller, D. E. Long-read DNA and RNA sequencing to streamline clinical genetic testing and reduce barriers to comprehensive genetic testing. J. Appl. Lab. Med. 9, 138–150 (2024).
Kolesnikov, A. et al. Local read haplotagging enables accurate long-read small variant calling. Nat. Commun. 15, 5907 (2024).
Zheng, Z. et al. Symphonizing pileup and full-alignment for deep learning-based long-read variant calling. Nat. Comput. Sci. 2, 797–803 (2022).
Poplin, R. et al. A universal SNP and small-indel variant caller using deep neural networks. Nat. Biotechnol. 36, 983–987 (2018).
Shafin, K. et al. Haplotype-aware variant calling with PEPPER-Margin-DeepVariant enables high accuracy in nanopore long-reads. Nat. Methods 18, 1322–1332 (2021).
Kolmogorov, M. et al. Scalable Nanopore sequencing of human genomes provides a comprehensive view of haplotype-resolved variation and methylation. Nat. Methods 20, 1483–1492 (2023).
Zheng, Z. et al. ClairS: a deep-learning method for long-read somatic small variant calling. Preprint at bioRxiv https://doi.org/10.1101/2023.08.17.553778 (2023).
Kolmogorov, M. & Gokce, A. CASTLE-Panel/castle. Datasets. GitHub https://github.com/CASTLE-Panel/castle (2025).
Keskus, A. G. et al. Severus detects somatic structural variation and complex rearrangements in cancer genomes using long-read sequencing. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02618-8 (2025).
Díaz-Gay, M. et al. Assigning mutational signatures to individual samples and individual somatic mutations with SigProfilerAssignment. Bioinformatics 39, btad756 (2023).
Vasimuddin, M., Misra, S., Li, H. & Aluru, S. Efficient architecture-aware acceleration of BWA-MEM for multicore systems. In Proc. 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 314–324 (IEEE, 2019).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Bergstrom, E. N. et al. SigProfilerMatrixGenerator: a tool for visualizing and exploring patterns of small mutational events. BMC Genomics 20, 685 (2019).
Lansdon, L. A. et al. Successful classification of clinical pediatric leukemia genetic subtypes via structural variant detection using HiFi long-read sequencing. Preprint at medRxiv https://doi.org/10.1101/2024.11.05.24316078 (2024).
Kim, R. rkimoakbioinformatics/oakvar. Source code. GitHub https://github.com/rkimoakbioinformatics/oakvar/ (2025).
Steiert, T. A. et al. A critical spotlight on the paradigms of FFPE-DNA sequencing. Nucleic Acids Res. 51, 7143–7162 (2023).
Xiao, W. et al. Toward best practice in cancer mutation detection with whole-genome and whole-exome sequencing. Nat. Biotechnol. 39, 1141–1150 (2021).
Koboldt, D. C. Best practices for variant calling in clinical sequencing. Genome Med. 12, 91 (2020).
Keskus, A. G. et al. Severus detects somatic structural variation and complex rearrangements in cancer genomes using long-read sequencing. Nat. Biotechnol. https://doi.org/10.1038/s41587-025-02618-8 (2025).
Cohen, A. S. A. et al. Genomic answers for children: Dynamic analyses of >1000 pediatric rare disease genomes. Genet. Med. 24, 1336–1348 (2022).
Monlong, J., Lorig-Roach, R., Meredith, M. & Negi, S. nanoporegenomics/wambam. Source code. GitHub https://github.com/nanoporegenomics/wambam (2025).
Bushnell, B. BioInfoTools/BBMap. Source code. GitHub https://github.com/BioInfoTools/BBMap/blob/master/sh/reformat.sh (2025).
Baid, G. et al. An extensive sequence dataset of gold-standard samples for benchmarking and development. Preprint at bioRxiv https://doi.org/10.1101/2020.12.11.422022 (2020).
The 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (2012).
Lake, J. A. & Consortium of Long Read Sequencing (CoLoRS). Consortium of Long Read Sequencing Database (CoLoRSdb). Zenodo https://doi.org/10.5281/zenodo.11511513 (2024).
Chen, N.-C. et al. Improving variant calling using population data and deep learning. BMC Bioinf. 24, 197 (2023).
Sherry, S. T. et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 29, 308–311 (2001).
Karczewski, K. J. et al. The mutational constraint spectrum quantified from variation in 141,456 humans. Nature 581, 434–443 (2020).
Auton, A. et al. A global reference for human genetic variation. Nature 526, 68–74 (2015).
Szegedy, C. et al. Rethinking the inception architecture for computer vision. In Proc. IEEE Conference on Computer Vision and Pattern Recognition 2818–2826 (2016); https://doi.org/10.1109/CVPR.2016.308
Poplin, R. et al. google/deepvariant. Source code. GitHub https://github.com/google/deepvariant (2025).
Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at https://arxiv.org/abs/1412.6980 (2017).
Ahmad, T. KolmogorovLab/Wakhan. Source code. GitHub https://github.com/KolmogorovLab/Wakhan (2025).
Bergstrom, E. N. et al. AlexandrovLab/SigProfilerAssignment. Source code. GitHub https://github.com/AlexandrovLab/SigProfilerAssignment (2025).
Díaz-Gay, M. et al. AlexandrovLab/SigProfilerMatrixGenerator. Source code. GitHub https://github.com/AlexandrovLab/SigProfilerMatrixGenerator (2025).
CASTLE panel: Cancer Standards Long-read Evaluation. Datasets. Sequence Read Archive https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJNA1086849 (2025).
Childhood Cancer Data Initiative (CCDI): Comprehensive Genomic Sequencing of Pediatric Cancer Cases (CMRI/KUCC). Datasets. dbGaP https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs002529.v2.p1 (2025).
DeepSomatic: Accurate Somatic Small Variant Discovery for Multiple Sequencing Technologies. Datasets. dbGaP https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs004188.v1.p1 (2025).
Park, J. Supporting data for: Accurate somatic small variant discovery for multiple sequencing technologies with DeepSomatic. Zenodo https://doi.org/10.5281/zenodo.16595168 (2025).
Park, J. et al. google/deepsomatic. Source code. GitHub https://github.com/google/deepsomatic (2025).