DeepFold: enhancing protein structure prediction through optimized loss functions, improved template features, and re-optimized energy function.

  • Additional Information
    • Source:
      Publisher: Oxford University Press; Country of Publication: England; NLM ID: 9808944; Publication Model: Print; Cited Medium: Internet; ISSN: 1367-4811 (Electronic); Linking ISSN: 1367-4803; NLM ISO Abbreviation: Bioinformatics; Subsets: MEDLINE
    • Publication Information:
      Original Publication: Oxford : Oxford University Press, c1998-
    • Abstract:
      Motivation: Predicting protein structures with high accuracy is a critical challenge for the broad community of life sciences and industry. Despite progress made by deep neural networks like AlphaFold2, there is a need for further improvements in the quality of detailed structures, such as side-chains, along with protein backbone structures.
      Results: Building on the success of AlphaFold2, we modified the losses for side-chain torsion angles and frame-aligned point error, added loss functions for side-chain confidence and secondary structure prediction, and replaced template feature generation with a new alignment method based on conditional random fields. We also performed re-optimization by conformational space annealing using a molecular mechanics energy function that integrates the potential energies obtained from distogram and side-chain prediction. In the CASP15 blind test for single protein and domain modeling (109 domains), DeepFold ranked fourth among 132 groups, with improvements in structural detail in terms of backbone, side-chain, and MolProbity scores. In terms of protein backbone accuracy, DeepFold achieved a median GDT-TS score of 88.64, compared with 85.88 for AlphaFold2. For TBM-easy/hard targets, DeepFold ranked at the top based on Z-scores for GDT-TS. This shows its practical value to the structural biology community, which demands highly accurate structures. In addition, a thorough analysis of 55 domains from 39 targets with publicly available structures indicates that DeepFold shows superior side-chain accuracy and MolProbity scores among the top-performing groups.
      Availability and Implementation: DeepFold tools are open-source software available at https://github.com/newtonjoo/deepfold.
      (© The Author(s) 2023. Published by Oxford University Press.)
    • Comments:
      Erratum in: Bioinformatics. 2023 Dec 1;39(12):. (PMID: 38140709)
    • Grant Information:
      Institute of Information & communications Technology Planning & Evaluation; National Research Foundation of Korea
    • Accession Number:
      0 (Proteins)
    • Publication Date:
      Date Created: 20231123 Date Completed: 20231211 Latest Revision: 20231229
    • Publication Date:
      20231229
    • PubMed Central ID:
      PMC10699847
    • DOI:
      10.1093/bioinformatics/btad712
    • PMID:
      37995286