Hydration free energy#
In this tutorial you’ll learn how to use BioSimSpace to set up a set of alchemical free energy simulations to compute the relative free energy of hydration ethane and methanol. Alchemical free energy calculations employ unphysical (“alchemical”) intermediates to estimate the free energies of various physical processes. Here, we will exploit a perturbation pathway between ethane and methanol which will allow us to estimate the relative free energy difference in solution, \({\Delta G_{\mathrm{sol}}}\), and vacuum, \({\Delta G_{\mathrm{vac}}}\). The free energy of hydration, \({\Delta\Delta G_{\mathrm{hyd}}}\) can then be computed from the difference between the free energy of the solvation and vacuum simulation legs, i.e.
\({\Delta\Delta G_{\mathrm{hyd}} = \Delta G_{\mathrm{sol}} - \Delta G_{\mathrm{vac}}}\)
To get started, let’s import BioSimSpace and create ethane and methanol molecules parameterised with the Generalised AMBER Force-Field (GAFF).
>>> import BioSimSpace as BSS
>>> ethane = BSS.Parameters.gaff("CC").getMolecule()
>>> methanol = BSS.Parameters.gaff("CO").getMolecule()
In order to create a perturbation between the two molecules we need to define
a mapping. This mapping defines the maximum common substructure (MCS) between
the molecules, mapping indices from one to the matching indices in the other.
The BioSimSpace.Align
package provides tools for calculating the
mappings between molecules and for aligning and merging molecules based on a
mapping. The user can request a multiple mappings, which can be scored and
aligned in various ways. Here we’ll simply use the default options and compute
the highest scoring mapping between the molecules:
>>> mapping = BSS.Align.matchAtoms(ethane, methanol)
The mapping is just a dictionary mapping atoms in the ethane molecule to those in the methanol. To see what atoms have been mapped we can print the corresponding atoms from each molecule:
>>> for idx0, idx1 in mapping.items():
... print(ethane.getAtoms()[idx0], "<-->", methanol.getAtoms()[idx1])
...
<BioSimSpace.Atom: name='H', molecule=2 index=5> <--> <BioSimSpace.Atom: name='H3', molecule=3 index=3>
<BioSimSpace.Atom: name='C', molecule=2 index=1> <--> <BioSimSpace.Atom: name='C', molecule=3 index=0>
<BioSimSpace.Atom: name='H', molecule=2 index=6> <--> <BioSimSpace.Atom: name='H1', molecule=3 index=1>
<BioSimSpace.Atom: name='H', molecule=2 index=7> <--> <BioSimSpace.Atom: name='H2', molecule=3 index=2>
<BioSimSpace.Atom: name='C', molecule=2 index=0> <--> <BioSimSpace.Atom: name='OH', molecule=3 index=4>
<BioSimSpace.Atom: name='H', molecule=2 index=2> <--> <BioSimSpace.Atom: name='HO', molecule=3 index=5>
When working running interactively, e.g. within an IPython, console or Jupyter notebook, it’s possible to visualise the mapping:
>>> BSS.Align.viewMapping(ethane, methanol, mapping)
The two molecules are displayed side-by-side with numbers used to indicate which atoms map to each other. For eample, the carbon indexed 0 in the ethane molecule maps to a hydrogen indexed 4 in the methanol.
Next we need to align the ethane molecule to the methane based on the mapping.
The BioSimSpace.Align.rmsdAlign
function will align them based on a root
mean square displacement (RMSD) scoring function, whereas
BioSimSpace.Align.flexAlign
will flexibly align ethane to methanol using the
fkcombu program,
which can be useful if you want both molecules to be aligned to a specific pose.
For simplicity we’ll use the RMSD alignment:
>>> ethane = BSS.Align.rmsdAlign(ethane, methanol, mapping)
Note
If we were to omit the mapping argument above, then the rmsdAlign
function would compute the best mapping for us automatically. This means
that we can choose to compute the mapping then align separately, or do
the whole thing in one step.
Having successfully aligned the ethane molecule, we now need to merge the two molecules together to create a merged (or perturbable molecule). This contains all of the properties, e.g. bonds, angles, dihedrals, needed to define the two molecules at either end state of the perturbation, \({\lambda=0}\) and \({\lambda=1}\). The end states represent the two molecules (ethane in one state, methanol in the other) plus dummy (or virtual) atoms for any atoms from the other molecule that aren’t part of the MCS.
>>> merged = BSS.Align.merge(ethane, methanol, mapping)
Next we need to solvate our merged molecule in a box of water. Here we’ll use the TIP3P water model and a cubic box with a base length of 5 nanometers:
>>> solvated = BSS.Solvent.tip3p(molecule=merged, box=3 * [5 * BSS.Units.Length.nanometer])
Finally we can create objects to perform the perturbations for each leg of the calculation, which will automatically configure everything that is needed to run the simulations. First we will use our built in, GPU optimised, molecular dynamics engine called SOMD. (This is a wrapper around the excellent OpenMM package and is the default engine if no other packages are present.)
>>> protocol = BSS.Protocol.FreeEnergyProduction()
>>> free_somd = BSS.FreeEnergy.Relative(solvated, protocol, engine="somd", work_dir="freenrg_somd/free")
>>> vac_somd = BSS.FreeEnergy.Relative(merged.toSystem(), protocol, engine="somd", work_dir="freenrg_somd/vacuum")
Note
For GROMACS, you can use BioSimSpace.Protocol.FreeEnergyMinimisation
and
BioSimSpace.Protocol.FreeEnergyEquilibration
to minimise and equilibrate at each
\({\lambda}\) value prior to setting up to the production run above. For SOMD this
isn’t possible, so you can either use GROMACS to prepare production input for each window,
or use the regular BioSimSpace.Protocol.Minimisation
and
BioSimSpace.Protocol.Equilibration
protocols to minimise and equilibrate
the \({\lambda=0}\) and \({\lambda=1}\) states only using any supported
engine from BioSimSpace.Process
.
Note
It’s possible to use a different protocol or molecular dynamics engine for each leg, e.g. if you want to use a different \({\lambda}\) schedule.
When complete, BioSimSpace will have set up a folder hierarchy containing everything that is needed to run the hydration free energy calculation using SOMD. Let’s examine the work_dir for the free (solvated) leg specified above:
$ ls freenrg_somd
free vacuum
Inside the top-level directory are two sub-directories called free and vacuum. These correspond the the solvated and vacuum legs of the simulation. Let’s further examine the free directory to see what’s inside:
$ ls freenrg_somd/free
lambda_0.0000 lambda_0.3000 lambda_0.6000 lambda_0.9000
lambda_0.1000 lambda_0.4000 lambda_0.7000 lambda_1.0000
lambda_0.2000 lambda_0.5000 lambda_0.800
Inside this are 11 further sub-directories, one for each of the \({\lambda}\) windows of the leg. Within each of these directories are all of the files needed to run an individual simulation, e.g.:
$ ls freenrg_somd/free/lambda_0.0000
somd.cfg somd.err somd.out somd.pert somd.prm7 somd.rst7
The BioSimSpace.FreeEnergy.Relative
object can also automatically run
all of the simulations for you and analyse the output that is generated. However,
since these simulations will take a long time we won’t run them here.
By specifying a different molecular dynamics engine, we can use the exact same code to set up an identical set of simulations with GROMACS:
>>> free_gmx = BSS.FreeEnergy.Relative(solvated, protocol, engine="gromacs", work_dir="freenrg_gmx/free")
>>> vac_gmx = BSS.FreeEnergy.Relative(merged.toSystem(), protocol, engine="gromacs", work_dir="freenrg_gmx/vacuum")
Let’s examine the directory for the \({\lambda=0}\) window of the free leg:
$ ls freenrg_gmx/free/lambda_0.0000
gromacs.err gromacs.mdp gromacs.out.mdp gromacs.tpr
gromacs.gro gromacs.out gromacs.top
Similarly, we can also set up the same simulations using AMBER:
free_amber = BSS.FreeEnergy.Relative(solvated, protocol, engine="amber", work_dir="freenrg_amber/free")
vac_amber = BSS.FreeEnergy.Relative(merged.toSystem(), protocol, engine="amber", work_dir="freenrg_amber/vacuum")
Let’s examine the directory for the \({\lambda=0}\) window of the free leg:
$ ls freenrg_amber/free/lambda_0.0000
amber.cfg amber.err amber.out amber.prm7 amber_ref.rst7 amber.rst7
There you go! This tutorial has shown you how BioSimSpace can be used to easily set
up everything that is needed for complex alchemical free energy simulations. Please
visit the API documentation
for further information.