Molecular Dynamics Simulations

Journal THE JOURNAL OF PHYSICAL CHEMISTRY, June 9, 2009 Title Active Site Cysteine Is Protonated in the PAD4 Michaelis Complex: Evidence from Born-Oppenheimer Ab Initio QM/MM Molecular Dynamics Simulations Grant Funding This work has been supported by (R01-GM079223). Y.Z. is also grateful for support from the National Institute of Health National Science Foundation (CHE-CAREER-0448156, CHE-MRI-0420870), (James D. Watson Young Investigator Award), and NYU (Whitehead Fellowship for Junior Faculty in NYSTAR Biomedical and Biological Sciences). Ackno wledg : ments "We thank NYU ITS for providing computational resources and support." -the authors

How does a protein fold? How does an enzyme work?
To answer these questions (and many others) scientists use appropriate experimental techniques, e.g., single-molecule spectroscopy, NMR spectroscopy , site-directed mutagenesis etc.

Introduction
• In Molecular Dynamics (MD) simulations the laws of classical physics are used to approximate the interactions and motion of (macro)molecules. • MD simulations are a multidisciplinary field. The underlying laws and theories stem from mathematics, physics and chemistry; algorithms from various areas of applied mathematics and computer science are implemented in a wide variety of MD software packages.
Understanding biomolecular structure, dynamics and function • Instead of expressing and purifying the protein of interest ….. we supply the computer with the 3D coordinates of the desired protein(s). • For proteins and small organic compounds the atomic coordinates can be downloaded from the PROTEIN DATA BANK (www.rcsb.org). • In the following slides the molecule of interest will always be protein, keep in mind though that for other types of (macro)molecules everything works analogously.
• Furthermore, we need a model to describe the geometry and the physical interactions of the atoms of our protein. • There are two main approaches: • In this course we will discuss the model based on classical mechanics.

Molecular Mechanics Representation
• Each atom is represented as a sphere.

•
The biomolecular force field is what shapes the collection of spheres into things that look like molecules through so-called bonded interactions.

•
The force field also describes the atom-atom interactions of distal parts within a flexible molecule as well as the interactions between molecules through nonbonded interactions.
• Sometimes all-atom representation is simplified to so-called united-atoms (e.g.: a methyl group is a single sphere).
• Electronic structure is in general coarse-grained out and expected to be captured by the force field. For the CHARMM paper <click here> Abstract: CHARMM (Chemistry at HARvard Molecular Mechanics) is a highly versatile and widely used molecular simulation program. It has been developed over the last three decades with a primary focus on molecules of biological interest, including proteins, peptides, lipids, nucleic acids, carbohydrates, and small molecule ligands, as they occur in solution, crystals, and membrane environments. For the study of such systems, the program provides a large suite of computational tools that include numerous conformational and path sampling methods, free energy estimators, molecular minimization, dynamics, and analysis techniques, and model-building capabilities. The CHARMM program is applicable to problems involving a much broader class of many-particle systems. Calculations with CHARMM can be performed using a number of different energy functions and models, from mixed quantum mechanical-molecular mechanical force fields, to allatom classical potential energy functions with explicit solvent and various boundary conditions, to implicit solvent and membrane models. The program has been ported to numerous platforms in both serial and parallel architectures. This article provides an overview of the program as it exists today with an emphasis on developments since the publication of the original CHARMM article in 1983. J Comput Chem 30: 2009 Key words: biomolecular simulation; CHARMM program; molecular mechanics; molecular dynamics; molecular modeling; biophysical computation; energy function • In modern force fields: • Let's go through the single terms.
November 2019 MD simulations 17 • approximates the energy needed to stretch a covalent bond between two atoms by Hooke's law for the potential energy stored in a spring: where i and j are two covalently connected atoms and is the distance between them.
is the equilibrium bond length and is a constant representing the stiffness of the bond.
November 2019 MD simulations 18 • models the energy needed to bend the angle formed by two covalent bonds. This term is represented by the following formula: where i, j and k are three covalently connected atoms and is the angle between them. is the equilibrium angle width and represents the rigidity of the angle.

November 2019 MD simulations 19
• models the energy needed to bend the dihedral angle formed by three covalent bonds. This term is represented by the following formula: where i, j, k and l are four covalently connected atoms and is the dihedral angle between them.
November 2019 MD simulations 20 • In , the parameter stands for the rigidity of the dihedral, while the parameters and determine the value(s) where the energy has a minimum. For example for the ω dihedral angle of the protein backbone the two minima should always be close to 180°or 0°. • So far we have treated covalently bonded atoms. What about the interactions between non covalently bonded atoms, i.e., pairs of atoms that are separated by more than two covalent bonds in the same protein or pairs of atoms indifferent proteins? • In the classical force field representation, these interactions are pairwise and modeled with the following formula: • In principle, should be calculated for every pair of atoms in the system. In practice, a distance threshold is used, i.e., nonbonding interactions are actually calculated only for atom pairs whose distance is smaller than a threshold (usually about 1.2 nm).  • Explicit solvent representation: Water molecules and ions are modeled considering all of their nuclei as interaction centers and degrees of freedom.
• Implicit solvent representation: The interactions of the examined protein with the surrounding solvent are described by a mean field framework.

An experiment on a computer
Explicit Water Representation • The description of the water molecules is based on the same principles aforementioned for the protein.
• Water molecule's geometry is described with spheres connected by rigid and unbreakable bonds. • The interactions of the water molecules with each other and with the examined protein are given by the following term of the force field: An experiment on a computer • Once we have our protein properly solvated in a simulation box, we can start our simulation(s). • In our approximation, the evolution in time of our system will be governed by Newton's equation of motion An experiment on a computer • At t = 0, the positions of each atom are given from the PDB structure.
• An initial velocity, taken from a Gaussian distribution, is assigned to each atom. • The force acting on each atom i at a certain time t is the gradient of the energy, i.e., the vector of first derivatives of the force field: • Knowing the acceleration, the velocity and the position at time t for all the atoms, we can calculate the positions at time (t + Δt): while the velocities are given by the equation: Usually, the time step is chosen to be Δt = 2 fs.
Changing the ensemble

•
The system we have put together has constant number of particles, constant volume and constant energy. The simulation samples then what in statistical mechanics is called a microcanonical ensemble (NVE).

•
Often, it will be necessary to match the conditions of a simulation to that of an experiment or to something that is physically more suitable to the problem.
• NVE → NVT: Add a thermostat; in essence an auxiliary process that modifies particle velocities (common: Berendsen, Nose-Hoover, Andersen, Bussi-Parrinello) • NVE → NPT: Add a manostat and a thermostat; in essence, a manostat is an auxiliary process that rescales the volume of the system in response to the difference between internal and external pressure (common: Berendsen, Parrinello-Rahman, Langevin piston) • NVE → μ m VT: Add a thermostat and a chemostat (particle bath); in essence, a chemostat is a reservoir of "bath" particles that allows the concentration of particles in the explicitly represented volume to fluctuate (remember aquaporins).
Newton's equation is "extended" by two terms: a random force relying on a stationary and normalized random (Wiener) process, and a friction term that is velocity-dependent.

•
The rate of kinetic energy dissipation (friction) and the magnitude of the random force resulting from equilibrium fluctuations are fundamentally linked because they are caused by the same underlying process: assumed collisions with "bath" particles.
• Because the system is coupled explicitly to an (assumed) bath, energy is not conserved and one naturally obtains a canonical ensemble (NVT) when using Langevin's equations of motion to propagate a dynamical system.

November 2019 MD simulations 40
An experiment on a computer • Since modelling all atoms explicitly is computationally expensive, representations with lower resolution (coarse graining) have been developed to simulate processes that take place on longer time scales. • Coarse graining can reduce the statistical error (i.e., can reach longer time scales because of less degrees of freedom and longer time step) but increases the systematic error. • The simulation packages produce an output file called trajectory, that contains the coordinates of all the atoms at specific time intervals, e.g. every 2-20 ps depending on the type of simulation and on the studied system. • From the trajectory, we can calculate time evolution or averages over time of quantities like radius of gyration, internal energy, secondary structure propensity, interaction energy, root mean square deviation (RMSD) of the protein(s).

An experiment on a computer
An experiment on a computer Simulations of PDZ3 of PSD-95/SAP90 in native state: Over the complete MD trajectory we can compute the root mean square fluctuations (RMSF) for specific atoms.
The formula for the RMSF of an atom i is the following : where T is the time over which one wants to average, is the position of atom i at time and is its reference position (usually its average position over a time interval of 1 ns to 5 ns).  Since the MD trajectory allow us to follow the position of every single atom over time, we can also monitor individual interactions over time.
The distance of the donor and acceptor atoms involved in a hydrogen bond reports on the stability of that interaction (for the time series in panel A, note that the optimal hydrogen bond distance is around 3 Å). • Developed over the last three decades with a primary focus on molecules of biological interest, including proteins, peptides, lipids, nucleic acids, carbohydrates and small molecule ligands.
• Energy minimization, normal mode analysis, molecular dynamics, Monte Carlo sampling, umbrella sampling, free energy perturbation, large set of analysis techniques, and modelbuilding capabilities.
• Different energy functions and models including classical potential energy (force field), mixed quantum mechanical-molecular mechanical approach, and interface to ab initio quantum chemistry.
• Explicit solvent and various boundary conditions, several implicit solvent models (surface based, generalized Born, finite-difference Poisson).