yallHap: Modern Y-chromosome haplogroup inference with probabilistic scoring and ancient DNA support

Dec 29, 2025 · 9:20
Bioinformatics

Abstract

The human Y chromosome enables detailed reconstruction of paternal lineages through haplogroup classification. Existing tools for this purpose typically rely on outdated phylogenies, lack ancient DNA handling, or provide limited confidence metrics. Here I present yallHap, a Y-chromosome haplogroup classifier that integrates the YFull phylogenetic tree (185,780 SNPs) with probabilistic scoring, built-in ancient DNA damage filtering, and parallel processing for population-scale studies. Validation on 1,231 high-coverage gnomAD samples achieved 99.9% accuracy (95% CI: 99.5-100%) on GRCh38, and 1,233 samples from 1000 Genomes Phase 3 achieved 99.8% accuracy (95% CI: 99.3-100%). For ancient DNA with moderate variant density (4-10%), Bayesian ancient mode achieves +19.3 pp improvement over heuristic mode (+12 to +24 pp at 1% increments; see Supplementary Table S3), reaching 60-86% accuracy. On full AADR ancient DNA validation (7,333 samples spanning ~45,000 years), this translates to 90.7% overall accuracy (95% CI: 90.0-91.3%) versus 88.3% for heuristic transversions-only mode. At variant densities ≥10%, both modes reach 97-99% accuracy. yallHap supports multiple reference genomes (GRCh37, GRCh38, T2T-CHM13v2.0), provides detailed quality metrics including optional ISOGG nomenclature output, and offers multi-threaded batch processing for large-scale studies. The tool is designed for integration into modern bioinformatics pipelines, with example wrappers for nf-core/eager [16,17] and Snakemake [18] workflows. The software is open source, available at https://github.com/trianglegrrl/yallHap, and distributed via pip, Bioconda, and Docker.
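
The abstract refers to a "transversions-only" mode for ancient DNA. As a rough illustration of the general idea, and not of yallHap's actual implementation (which is not described here), the sketch below drops transitions (purine to purine or pyrimidine to pyrimidine changes such as C>T and G>A, the substitutions most affected by post-mortem deamination) and keeps only transversions. The function names and the `(position, ref, alt)` record layout are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: names and record layout are assumptions,
# not part of yallHap's API.

# A transition stays within one base class (purine or pyrimidine);
# a transversion crosses between the two classes.
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def is_transversion(ref: str, alt: str) -> bool:
    """Return True if the ref->alt substitution is a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        return False  # not a single-base A/C/G/T substitution
    return (ref in PURINES) != (alt in PURINES)


def keep_transversions(snps):
    """Filter (position, ref, alt) tuples down to transversions only."""
    return [s for s in snps if is_transversion(s[1], s[2])]


# Hypothetical example records: (position, ref, alt)
example_snps = [
    (1000, "C", "T"),  # transition, typical deamination signature
    (2000, "G", "A"),  # transition, same damage seen on the other strand
    (3000, "A", "C"),  # transversion, kept
    (4000, "G", "T"),  # transversion, kept
]
filtered = keep_transversions(example_snps)
```

Filtering this way trades informative sites for robustness to damage, which is consistent with the abstract's report that the Bayesian ancient mode outperforms the heuristic mode at moderate variant densities.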

Cite This Paper

Year: 2025
Category: bioinformatics
APA

Hardie, A. (2025). yallHap: Modern Y-chromosome haplogroup inference with probabilistic scoring and ancient DNA support. arXiv preprint arXiv:10.64898/2025.12.28.696719.

MLA

Hardie, A. "yallHap: Modern Y-chromosome haplogroup inference with probabilistic scoring and ancient DNA support." arXiv preprint arXiv:10.64898/2025.12.28.696719 (2025).