qc3C: Reference-free quality control for Hi-C sequencing data.

  • Author(s): DeMaere MZ; Darling AE
  • Source:
    PLoS computational biology [PLoS Comput Biol] 2021 Oct 11; Vol. 17 (10), pp. e1008839. Date of Electronic Publication: 2021 Oct 11 (Print Publication: 2021).
  • Publication Type:
    Journal Article; Research Support, Non-U.S. Gov't
  • Language:
    English
  • Additional Information
    • Source:
      Publisher: Public Library of Science Country of Publication: United States NLM ID: 101238922 Publication Model: eCollection Cited Medium: Internet ISSN: 1553-7358 (Electronic) Linking ISSN: 1553734X NLM ISO Abbreviation: PLoS Comput Biol Subsets: MEDLINE
    • Publication Information:
      Original Publication: San Francisco, CA : Public Library of Science, [2005]-
    • Subject Terms:
    • Abstract:
      Hi-C is a sample preparation method that enables high-throughput sequencing to capture genome-wide spatial interactions between DNA molecules. The technique has been successfully applied to solve challenging problems such as 3D structural analysis of chromatin, scaffolding of large genome assemblies and more recently the accurate resolution of metagenome-assembled genomes (MAGs). Despite continued refinements, however, preparing a Hi-C library remains a complex laboratory protocol. To avoid costly failures and maximise the odds of successful outcomes, diligent quality management is recommended. Current wet-lab methods provide only a crude assay of Hi-C library quality, while key post-sequencing quality indicators used have, thus far, relied upon reference-based read-mapping. When a reference is accessible, this reliance introduces a concern for quality, where an incomplete or inexact reference skews the resulting quality indicators. We propose a new, reference-free approach that infers the total fraction of read-pairs that are a product of proximity ligation. This quantification of Hi-C library quality requires only a modest amount of sequencing data and is independent of other application-specific criteria. The algorithm builds upon the observation that proximity ligation events are likely to create k-mers that would not naturally occur in the sample. Our software tool (qc3C) is to our knowledge the first to implement a reference-free Hi-C QC tool, and also provides reference-based QC, enabling Hi-C to be more easily applied to non-model organisms and environmental samples. We characterise the accuracy of the new algorithm on simulated and real datasets and compare it to reference-based methods.
      Competing Interests: The authors have declared that no competing interests exist.
    • References:
      Genome Res. 2017 May;27(5):801-812. (PMID: 27940952)
      Am J Hum Genet. 2002 Aug;71(2):439-41. (PMID: 12111669)
      Nucleic Acids Res. 2018 Jul 2;46(W1):W11-W16. (PMID: 29901812)
      Bioinformatics. 2018 Sep 1;34(17):i884-i890. (PMID: 30423086)
      Methods. 2018 Jun 1;142:47-58. (PMID: 29723572)
      Methods. 2012 Nov;58(3):268-76. (PMID: 22652625)
      Bioinformatics. 2016 Oct 1;32(19):3047-8. (PMID: 27312411)
      PLoS Comput Biol. 2019 Aug 21;15(8):e1007273. (PMID: 31433799)
      Nat Biotechnol. 2017 Apr 11;35(4):316-319. (PMID: 28398311)
      Methods. 2017 Jul 1;123:56-65. (PMID: 28435001)
      Nucleic Acids Res. 2020 Jul 2;48(W1):W177-W184. (PMID: 32301980)
      Science. 2009 Oct 9;326(5950):289-93. (PMID: 19815776)
      Genome Biol. 2019 Feb 26;20(1):46. (PMID: 30808380)
      Gigascience. 2020 Jan 1;9(1):. (PMID: 31919520)
      Nat Biotechnol. 2013 Dec;31(12):1119-25. (PMID: 24185095)
      F1000Res. 2015 Nov 20;4:1310. (PMID: 26835000)
      Genome Biol. 2015 Dec 01;16:259. (PMID: 26619908)
      Gigascience. 2018 Feb 1;7(2):. (PMID: 29149264)
    • CAS Registry Number:
      9007-49-2 (DNA)
    • Publication Date:
      Date Created: 20211011 Date Completed: 20211213 Latest Revision: 20211214
    • Update Code:
      20240829
    • PubMed Central ID:
      PMC8530316
    • DOI:
      10.1371/journal.pcbi.1008839
    • PMID:
      34634030