Improvement of comparative model accuracy by free-energy optimization along principal components of natural structural variation
|Authors:||Qian, Bin; Ortiz, Ángel R. ; Baker, David|
|Keywords:||Protein structure models|
|Publisher:||National Academy of Sciences (U.S.)|
|Citation:||PNAS, 2004 vol. 101 no. 43 15346-15351|
|Abstract:||Accurate high-resolution refinement of protein structure models is a formidable challenge because of the delicate balance of forces in the native state, the difficulty in sampling the very large number of alternative tightly packed conformations, and the inaccuracies in current force fields. Indeed, energy-based refinement of comparative models generally leads to degradation rather than improvement in model quality, and, hence, most current comparative modeling procedures omit physically based refinement. However, despite their inaccuracies, current force fields do contain information that is orthogonal to the evolutionary information on which comparative models are based, and, hence, refinement might be able to improve comparative models if the space that is sampled is restricted sufficiently so that false attractors are avoided. Here, we use the principal components of the variation of backbone structures within a homologous family to define a small number of evolutionarily favored sampling directions and show that model quality can be improved by energy-based optimization along these directions.
With the progression of structural genomics initiatives (1–3), comparative modeling has become an increasingly important method for building protein structure models (4, 5). After a suitable structure template is chosen, accurate comparative modeling requires a correct alignment between the target protein sequence and the template sequence, an accurate method for modeling the loops (the insertions and deletions in an alignment) and side chains, and, finally, a method for refining the coordinates derived from the template structure toward those of the true native structure (6–8). In this study, we focus on this last model-refinement step. Improvement of the accuracy of comparative models is very important because accurate comparative models potentially can be used for many applications, such as virtual drug scanning (9), molecular replacement (10), and function prediction (11). Refinement is particularly important when the sequence identity between a target protein and the template protein is <30% (12), because models built by using current methods generally have rms deviations (rmsd) of >1.5 Å (13).|
However, high-resolution refinement is as formidable as it is important. This difficulty is due to both the large size of conformational space and the delicate balance of forces in the native state. Indeed, in the recent CASP5 experiment (the 5th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction), most refined structures had larger rmsd to the native structure than the starting template backbone conformation (7). High-resolution refinement is thus a very stringent test of accuracy that perhaps no current force field satisfies. Progress on this important but challenging problem may be facilitated by focusing on more constrained and thus more tractable refinement problems. We were led to think about such problems by the observation that a refinement protocol that did not markedly improve de novo structure-prediction models was far more successful on the more constrained rigid-body protein–protein docking problem (14). The greatly reduced number of backbone degrees of freedom in the protein–protein docking problem significantly reduces the number of false attractors in the free-energy landscape: as illustrated in ref. 14, the docking free-energy landscapes typically are funneled strongly into the native minimum. The conceptual step forward in this paper is to use evolutionary information to reduce the number of degrees of freedom in the monomeric protein-refinement problem to mimic the situation in the protein–protein docking problem. We accomplish this goal by restricting sampling to the subspace defined by the largest principal components (PCs) of the variation in the structural core of homologous proteins. This strategy greatly enhances the sampling of near-native backbone conformations, and the low-energy models identified by using the Rosetta high-resolution energy function (15, 16) usually have lower rmsd to the native backbones than the starting templates.
This restricted refinement problem can provide a testing ground for evaluating and improving potential functions for the unrestricted comparative-modeling refinement problem. More practically, the refinement of structure cores by energy-based sampling along evolutionarily preferred directions can serve as the first step toward improving a model structure built from a template. After a more accurate structure core is obtained, the rest of the structure can be built by using loop modeling and side-chain repacking (6).
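The idea of restricting sampling to the largest principal components of structural variation within a homologous family can be illustrated with a minimal sketch. It is not the authors' protocol: the function names (`principal_components`, `refine_along_pcs`, `rmsd`), the greedy search, the synthetic coordinates, and the toy harmonic energy are all assumptions made for illustration, and the toy energy merely stands in for a real scoring function such as the Rosetta energy. The sketch also assumes that all structures are already superimposed on a common core.

```python
import numpy as np

def principal_components(structures, n_components=2):
    """PCA of pre-superimposed structures, shape (n_structures, n_atoms, 3)."""
    X = structures.reshape(len(structures), -1)
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions of structural variation,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def refine_along_pcs(start, pcs, energy, step=0.1, n_steps=200):
    """Greedy search: move along +/- each PC, keep only moves that lower the energy."""
    coords = start.reshape(-1).copy()
    best = energy(coords.reshape(-1, 3))
    for _ in range(n_steps):
        improved = False
        for pc in pcs:
            for sign in (1.0, -1.0):
                trial = coords + sign * step * pc
                e = energy(trial.reshape(-1, 3))
                if e < best:
                    coords, best, improved = trial, e, True
        if not improved:
            step *= 0.5          # shrink the step once no direction helps
            if step < 1e-4:
                break
    return coords.reshape(-1, 3), best

def rmsd(a, b):
    """Coordinate rmsd without re-superposition (structures assumed aligned)."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))

# --- Synthetic demonstration (all data below are made up) ---
rng = np.random.default_rng(0)
n_atoms = 20
base = rng.normal(size=n_atoms * 3)
# Two hidden orthonormal "evolutionary" modes of variation.
modes = np.linalg.qr(rng.normal(size=(n_atoms * 3, 2)))[0].T
family = (base + rng.normal(size=(10, 2)) @ modes).reshape(10, n_atoms, 3)
native = (base + np.array([0.8, -0.5]) @ modes).reshape(n_atoms, 3)
template = family[0]

# Toy energy: harmonic distance to the native state (stand-in for a force field).
toy_energy = lambda xyz: float(((xyz - native) ** 2).sum())

_, pcs = principal_components(family, n_components=2)
refined, _ = refine_along_pcs(template, pcs, toy_energy)
rmsd_before = rmsd(template, native)
rmsd_after = rmsd(refined, native)
```

In this toy setting the native structure lies within the subspace spanned by the family's principal components, so the search can approach it closely. With real proteins the energy landscape is rugged and the native structure is only approximately reachable along these directions, which is why the choice of energy function and the number of components retained both matter.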
|Publisher version (URL):||http://dx.doi.org/10.1073/pnas.0404703101|
|Appears in Collections:||(CBM) Artículos|