Comprehensive assessment of protein loop modeling programs on large-scale datasets: prediction accuracy and efficiency.

  • Additional Information
    • Source:
      Publisher: Oxford University Press; Country of Publication: England; NLM ID: 100912837; Publication Model: Print; Cited Medium: Internet; ISSN: 1477-4054 (Electronic); Linking ISSN: 1467-5463; NLM ISO Abbreviation: Brief Bioinform; Subsets: MEDLINE
    • Publication Information:
      Publication: Oxford : Oxford University Press
      Original Publication: London ; Birmingham, AL : H. Stewart Publications, [2000-
    • Abstract:
      Protein loops play a critical role in protein dynamics and are essential for numerous biological functions, and various computational approaches to loop modeling have been proposed over the past decades. However, a comprehensive understanding of the strengths and weaknesses of each method is lacking. In this work, we constructed two high-quality datasets (the General dataset and the CASP dataset) and systematically evaluated the accuracy and efficiency of 13 commonly used loop modeling approaches with respect to loop length, protein class and residue type. The results indicate that the knowledge-based method FREAD outperforms the other tested programs in most cases, but encounters difficulties when predicting loops longer than 15 residues on the CASP dataset and longer than 30 residues on the General dataset. The ab initio method Rosetta NGK demonstrated exceptional modeling accuracy for short loops of four to eight residues and achieved the highest success rate on the CASP dataset. The well-known AlphaFold2 and RoseTTAFold require more resources for better performance, but they show promise for predicting loops longer than 16 residues in the CASP dataset and longer than 30 residues in the General dataset. These observations can provide valuable insights for selecting suitable methods for specific loop modeling tasks and contribute to future advancements in the field.
      (© The Author(s) 2024. Published by Oxford University Press.)
    • Grant Information:
      2021YFE0206400 National Key Research and Development Program of China; 22220102001 National Natural Science Foundation of China; 226-2022-00220 Fundamental Research Funds for the Central Universities
    • Contributed Indexing:
      Keywords: AlphaFold2; artificial intelligence; deep learning; loop modeling; protein loop
    • Accession Number:
      0 (Proteins)
    • Publication Date:
      Date Created: 20240103 Date Completed: 20240105 Latest Revision: 20240122
    • Publication Date:
      20240122
    • Accession Number:
      PMCID: PMC10764206; DOI: 10.1093/bib/bbad486; PMID: 38171930