Imputation accuracy across global human populations.

  • Additional Information
    • Source:
      Publisher: Cell Press Country of Publication: United States NLM ID: 0370475 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1537-6605 (Electronic) Linking ISSN: 0002-9297 NLM ISO Abbreviation: Am J Hum Genet Subsets: MEDLINE
    • Publication Information:
      Publication: 2008- : [Cambridge, MA] : Cell Press
      Original Publication: Baltimore, American Society of Human Genetics.
    • Abstract:
      Genotype imputation is now fundamental for genome-wide association studies but remains inequitable because reference panels underrepresent individuals of non-European ancestry. The state-of-the-art imputation reference panel released by the Trans-Omics for Precision Medicine (TOPMed) initiative improved the imputation of admixed African-ancestry and Hispanic/Latino samples, but imputation for populations primarily residing outside of North America may still fall short in performance due to persisting underrepresentation. To illustrate this point, we imputed the genotypes of over 43,000 individuals across 123 populations around the world and identified numerous populations where imputation accuracy was markedly lower than that of European-ancestry populations. For instance, the mean imputation r-squared (Rsq) values for variants with minor allele frequencies between 1% and 5% in Saudi Arabians (n = 1,061), Vietnamese (n = 1,264), Thai (n = 2,435), and Papua New Guineans (n = 776) were 0.79, 0.78, 0.76, and 0.62, respectively, compared to 0.90-0.93 for comparable European populations matched in sample size and SNP array content. Outside of Africa and Latin America, Rsq appeared to decrease as genetic distance to the European-ancestry reference increased, as predicted. Using sequencing data as ground truth, we also showed that Rsq may over-estimate imputation accuracy for non-European populations more than for European populations, suggesting further disparity in accuracy between populations. Using 1,496 sequenced individuals from Taiwan Biobank as a second reference panel alongside TOPMed, we also assessed a strategy to improve imputation for non-European populations with meta-imputation, but this design did not improve accuracy across frequency spectra. Taken together, our analyses suggest that we must ultimately strive to increase the diversity and size of reference panels to promote equity within genetics research.
      Competing Interests: The authors declare no competing interests.
      (Copyright © 2024 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.)
    • Comments:
      Update of: bioRxiv. 2023 Oct 26:2023.05.22.541241. doi: 10.1101/2023.05.22.541241. (PMID: 37292811)
    • Grant Information:
      R35 GM142783 United States GM NIGMS NIH HHS
    • Publication Date:
      Date Created: 20240411 Date Completed: 20240503 Latest Revision: 20241103
    • Accession Number:
      PMCID: PMC11080279
      DOI: 10.1016/j.ajhg.2024.03.011
      PMID: 38604166
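
The abstract compares the imputation software's own Rsq estimate against accuracy measured with sequencing data as ground truth. One common way to measure the latter is the squared Pearson correlation between sequenced genotypes and imputed dosages at a variant. The sketch below illustrates that calculation only; it is not the authors' pipeline, the function name `squared_correlation` is hypothetical, and the toy genotypes and dosages are invented for illustration.

```python
from statistics import mean


def squared_correlation(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between sequenced genotypes (0, 1, 2)
    and imputed allele dosages (0.0 to 2.0) at a single variant.

    Hypothetical helper for illustration; returns NaN when either input
    has no variance, because the correlation is undefined in that case.
    """
    if len(true_genotypes) != len(imputed_dosages):
        raise ValueError("inputs must have the same length")

    mean_true = mean(true_genotypes)
    mean_imputed = mean(imputed_dosages)

    # Sums of squares and cross-products around the means.
    cross = sum(
        (t - mean_true) * (d - mean_imputed)
        for t, d in zip(true_genotypes, imputed_dosages)
    )
    ss_true = sum((t - mean_true) ** 2 for t in true_genotypes)
    ss_imputed = sum((d - mean_imputed) ** 2 for d in imputed_dosages)

    if ss_true == 0 or ss_imputed == 0:
        return float("nan")
    return cross ** 2 / (ss_true * ss_imputed)


# Toy data (invented): six individuals at one variant.
true_genotypes = [0, 0, 1, 2, 0, 1]
imputed_dosages = [0.1, 0.0, 0.9, 1.7, 0.3, 1.2]
print(round(squared_correlation(true_genotypes, imputed_dosages), 3))
```

Averaging this quantity over variants in a frequency bin (for example, minor allele frequency between 1% and 5%) and comparing it with the software-reported Rsq for the same variants would show whether the reported value over-estimates accuracy, which is the kind of gap the abstract describes.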