Abstract
Critical Assessment of Structure Prediction (CASP) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on 10 proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).
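The GDT_TS scores quoted above can be illustrated with a minimal sketch. GDT_TS is conventionally the average, over distance cutoffs of 1, 2, 4, and 8 Å, of the percentage of Cα atoms in the model that lie within the cutoff of their positions in the experimental structure. The full measure searches over many superpositions to maximize the count at each cutoff. The sketch below assumes a single, already-applied superposition and residue-matched coordinates, so it gives only a lower bound. The function name and input layout are illustrative assumptions and are not taken from the paper or from any CASP tool.

```python
import math

# Distance cutoffs (in angstroms) conventionally used for GDT_TS.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts_fixed_superposition(model_ca, reference_ca):
    """Simplified GDT_TS for already-superposed, residue-matched CA coordinates.

    Both arguments are sequences of (x, y, z) tuples, one per residue,
    in the same residue order. This is a hypothetical helper: it skips the
    superposition search that the full GDT_TS procedure performs.
    """
    if len(model_ca) != len(reference_ca) or not reference_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")

    # Per-residue deviation between model and reference CA atoms.
    distances = [math.dist(m, r) for m, r in zip(model_ca, reference_ca)]

    # Fraction of residues within each cutoff, averaged over the four cutoffs.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in CUTOFFS
    ]
    return 100.0 * sum(fractions) / len(CUTOFFS)
```

On this scale a model identical to the reference scores 100, and the score drops as more residues deviate beyond the larger cutoffs.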
Original language | English (US) |
---|---|
Pages (from-to) | 1987-1996 |
Number of pages | 10 |
Journal | Proteins: Structure, Function and Bioinformatics |
Volume | 89 |
Issue number | 12 |
DOIs | https://doi.org/10.1002/prot.26231 |
State | Published - Dec 2021 |
Keywords
- CASP
- COVID
- EMA
- SARS-CoV-2
- model accuracy
- protein structure prediction
ASJC Scopus subject areas
- Structural Biology
- Biochemistry
- Molecular Biology
Modeling SARS-CoV-2 proteins in the CASP-commons experiment. / AlphaFold team; CASP-COVID participants.
In: Proteins: Structure, Function and Bioinformatics, Vol. 89, No. 12, 12.2021, p. 1987-1996.
Research output: Contribution to journal › Article › peer-review
TY - JOUR
T1 - Modeling SARS-CoV-2 proteins in the CASP-commons experiment
AU - AlphaFold team
AU - CASP-COVID participants
AU - Kryshtafovych, Andriy
AU - Moult, John
AU - Billings, Wendy M.
AU - Della Corte, Dennis
AU - Fidelis, Krzysztof
AU - Kwon, Sohee
AU - Olechnovič, Kliment
AU - Seok, Chaok
AU - Venclovas, Česlovas
AU - Won, Jonghun
AU - Adhikari, Badri
AU - Adiyaman, Recep
AU - Aguirre-Plans, Joaquim
AU - Anishchenko, Ivan
AU - Baek, Minkyung
AU - Baker, David
AU - Baldassarre, Frederico
AU - Barger, Jacob
AU - Bhattacharya, Sutanu
AU - Bhattacharya, Debswapna
AU - Bitton, Mor
AU - Cao, Renzhi
AU - Cheng, Jianlin
AU - Christoffer, Charles
AU - Czaplewski, Cezary
AU - Du, Zongyang
AU - Elofsson, Arne
AU - Faraggi, Eshel
AU - Feig, Michael
AU - Fernandez-Fuentes, Narcis
AU - Grishin, Nick
AU - Grudinin, Sergei
AU - Guo, Zhiye
AU - Hanazono, Yuya
AU - Hassabis, Demis
AU - Hedelius, Bryce
AU - Heo, Lim
AU - Hiranuma, Naozumi
AU - Hunt, Cassandra
AU - Igashov, Ilia
AU - Ishida, Takashi
AU - Jernigan, Robert L.
AU - Jones, David
AU - Jumper, John
AU - Kadukova, Maria
AU - Kandathil, Shaun
AU - Keasar, Chen
AU - Kihara, Daisuke
AU - Kinch, Lisa
AU - Kiyota, Yasuomi
N1 - Funding Information: The CASP experiment is supported by the US National Institute of General Medical Sciences (NIGMS/NIH), grant number GM100482. CS was supported by the National Research Foundation of Korea (NRF) grants funded by the Korea government (Nos. 2020M3A9G7103933 and 2019M3E5D4066898). ČV and KO were in part supported by the Research Council of Lithuania (grants S-MIP-17-60 and S-MIP-21-35). The predictions made by MULTICOM predictors were partially supported by two NSF grants (DBI 1759934 and IIS1763246), one NIH grant (GM093123), and two DOE grants (DE-SC0020400 and DE-SC0021303) to JC. KT, YT, and YY were partially supported by Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research [BINDS]) from AMED under Grant Number JP20am0101110. Computational resource of AI Bridging Cloud Infrastructure (ABCI) provided by National Institute of Advanced Industrial Science and Technology (AIST) was used. NFF, AM, and JAP acknowledge support received from the UK Biotechnology and Biological Science Research Council (BBS/E/W/0012843D). The work by LJM and RA was supported by the Biotechnology and Biological Sciences Research Council (BBSRC), grant number BB/T018496/1. CC, MK, AL, and EL are supported by grants UMO-2017/26/M/ST4/00044 (to CC), UMO-2017/25/B/ST4/01026 (to AL) from the National Science Centre (NCN), Poland. UNRES group used computer resources: CI TASK, Technical University of Gdańsk; ICM, University of Warsaw (grant: GA76-11); Cyfronet, AGH University of Science and Technology, Cracow (grant: unres19). The authors thank Anna Antoniak, Artur Giełdoń, Sergey A. Samsonov, Adam K. Sieradzan, Rafał Ślusarz (Faculty of Chemistry, University of Gdańsk) for assistance in solving part of the targets and background work. DK is partially supported by the National Institutes of Health (R01GM133840 and R01GM123055) and the National Science Foundation (CMMI1825941, MCB1925643, and DBI2003635). CC is supported by the National Institute of General Medical Sciences-funded predoctoral fellowship to C.C. (T32 GM132024). Funding Information: Biotechnology and Biological Sciences Research Council, Grant/Award Numbers: BB/T018496/1, BBS/E/W/0012843D; Japan Agency for Medical Research and Development, Grant/Award Number: JP20am0101110; Narodowe Centrum Nauki, Grant/Award Numbers: UMO-2017/25/B/ST4/01026, UMO-2017/26/M/ST4/00044; National Institute of General Medical Sciences, Grant/Award Numbers: GM100482, T32 GM132024; National Institutes of Health, Grant/Award Numbers: GM093123, R01GM133840, R01GM123055; National Science Foundation, Grant/Award Numbers: DBI 1759934, IIS1763246, CMMI1825941, MCB1925643, DBI2003635; U.S. Department of Energy, Grant/Award Numbers: DE-SC0020400, DE-SC0021303; Cyfronet, AGH University of Science and Technology, Cracow, Grant/Award Number: unres19; ICM, University of Warsaw, Grant/Award Number: GA76-11; Research Council of Lithuania, Grant/Award Numbers: S-MIP-21-35, S-MIP-17-60; National Research Foundation of Korea, Grant/Award Numbers: 2019M3E5D4066898, 2020M3A9G7103933. Publisher Copyright: © 2021 Wiley Periodicals LLC.
PY - 2021/12
Y1 - 2021/12
N2 - Critical Assessment of Structure Prediction (CASP) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on 10 proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).
KW - CASP
KW - COVID
KW - EMA
KW - SARS-CoV-2
KW - model accuracy
KW - protein structure prediction
UR - http://www.scopus.com/inward/record.url?scp=85116474145&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85116474145&partnerID=8YFLogxK
U2 - 10.1002/prot.26231
DO - 10.1002/prot.26231
M3 - Article
C2 - 34462960
AN - SCOPUS:85116474145
VL - 89
SP - 1987
EP - 1996
JO - Proteins: Structure, Function and Bioinformatics
JF - Proteins: Structure, Function and Bioinformatics
SN - 0887-3585
IS - 12
ER -