A. Introduction
In the last practical, we calculated free energies between two states of a system. Today we will calculate the free energy along a pathway. This means along a so-called reaction coordinate, which describes the transition of interest and provides a low-dimensional description of the relevant degrees of freedom of the system. To calculate the free energy profile, or more precisely the potential of mean force (PMF), sufficient sampling along the reaction coordinate is needed. However, it can be challenging to sample in energetically unfavored regions of the reaction coordinate. To overcome this sampling issue, it is common to use umbrella sampling. In this method several simulations are performed in which the system is restrained to different points along the reaction coordinate using a harmonic biasing potential. These simulations are typically called “umbrella windows”.

By restraining the system, the entire range of interest can be sufficiently sampled, even when the respective region is energetically unfavored. To get the PMF, the results of the simulations are combined and the bias is removed to estimate the unbiased probability distribution. This is often done using WHAM (Weighted Histogram Analysis Method), which optimally weights the histograms of the umbrella windows.
B. Dissociation of two methanol
First, we will look at the dissociation of two methanol molecules molecules and calculate the corresponding PMF. The reaction coordinate is the distance between the center of mass of the two methanol molecules. To get the starting structures for the umbrella windows, we first perform a pulling simulation that increases the distance over time.
The files needed for this part of the practicals can be downloaded. To download and extract directly in the terminal use:
wget https://biophys.uni-saarland.de/storage/courses/pract/p13/umbrella-pract/two-methanol.tar.gz
tar xzvf two-methanol.tar.gz
You can have a look at the specific parameters that are used for the pulling simulation:
grep pull pull.mdp
Now start the simulation:
source /home/gromacs/softwarerc
gmx grompp -f pull.mdp -c two-methanol-box.gro -p topol.top -n index.ndx -o pull.tpr
gmx mdrun -v -s pull.tpr
If you want to look at the simulation, it is useful to remove rotation and translation of the system:
gmx trjconv -f traj_comp.xtc -s pull.tpr -pbc mol -o whole_traj.xtc
gmx trjconv -f whole_traj.xtc -s pull.tpr -fit rot+trans -o fit_traj.xtc
When ask for a group type 0.
Then you can have a look with your favorite tool, e.g. vmd
vmd confout.gro fit_traj.xtc
Now we can use this pulling to extract starting configurations with different distances between the methanol molecules for our umbrella sampling.
To extract the configurations and start the simulations automatically, we provide you a script do-umbrella.sh that you can find in the folder umbrella. The script will run 15 windows, each for 1 ns. The restraints are placed at equally spaced distances between 0.3 nm and 1.5 nm.
cd umbrella
bash do-umbrella.sh
While the simulations take a few minutes, you can have a look at the pullf.xvg files that are created in the folders run-00, run-01, .... These report on the force derived from the biasing potential that is applied.
When the simulations are finished, we can use gmx wham to calculate the PMF. For this we need to provide the .tpr and pullf.xvg files for each window listed in two files:
ls run*/topol.tpr > tpr-files.dat
ls run*/pullf.xvg > pullf-files.dat
Then we can run the calculation.
gmx wham -it tpr-files.dat -if pullf-files.dat -o profile.xvg -hist histo.xvg -temp 300 -zprof0 1
First you can have a look at the histograms that are created.
xmgrace -nxy histo.xvg
Each color represents the histogram of a single umbrella window.
Question: Do the form and position of the histograms look like you would expect?
Question: WHAM needs a good overlap between the histograms to work. Is this the case?
Now have a look at the calculated PMF
xmgrace profile.xvg
Question: At large distances, would you expect the PMF to decrease, increase or plateau to a constant value? Does the calculated PMF match your expectation?
💡 Think about entropic effects.
Question/exercise: click when thought about last question
The entropic contribution goes with log(r2/r0) compared to some reference r0.Question: How could you remove this entropic contribution in the PMF and recover the underlying potential?
Let us use awk to correct the PMF. For this replace the ??? in the following line with the corrected energy. You can use the uncorrected PMF w, the distance r and kT.
awk -v kT=2.494 '/^[#@]/ {print; next} {r = $1; w = $2 ; print r, ??? "}' profile.xvg > profile_corrected.xvg
Question: Solution:
awk -v kT=2.494 '/^[#@]/ {print; next} {r = $1; w = $2 ; print r, w + kT*log((r*r)/r0 "}' profile.xvg > profile_corrected.xvg
To see whether 1 ns is long enough for the PMF simulation to converge, we can divide the umbrella sampling simulation into time blocks and calculate the PMF of these blocks.
block=250
total=1000
for b in $(seq 0 "$block" $((total - block))); do
e=$((b + block))
gmx wham -it tpr-files.dat -if pullf-files.dat \
-o profile_${b}ps-${e}ps.xvg \
-hist histo_${b}ps-${e}ps.xvg \
-temp 300 \
-b "$b" -e "$e" -zprof0 1
done
Question: Do you think 1ns for each umbrella window simulation is long enough?
C. Permeation of small molecules through a lipid membrane
Often systems that are studied are much more complex compared to the two methanol system. For example the permeation of small molecules through a lipid membrane. The PMF along the permeation is influenced by the changing chemical environment along the membrane normal, from water, over the polar headgroups to the hydrophobic core. Studying this process can help to understand how drugs and other compounds passively cross cell membranes, which is an important factor in pharmacokinetics and drug design.
Here, you will calculate the PMF for methanol and ibuprofen. The reaction coordinate is the center of mass position of the ligand along the membrane normal.
For the umbrella sampling, five methanol molecules are placed in a lipid membrane. For this umbrella sampling, we already prepared the starting configurations and the other files that are needed to run the simulation. For the 15 umbrella windows the methanol molecules are placed at different positions along the membrane normal spanning from water over the membrane again to water.
The files for the methanol system can be downloaded.
wget https://biophys.uni-saarland.de/storage/courses/pract/p13/umbrella-pract/methanol-umbrella.tar.gz
tar xzvf methanol-umbrella.tar.gz
To run the gmx grompp and mdrun command for all the file do
bash run-all-grompp.sh
cd umbrealla-runs
bash run-all-mdrun.sh
This will take a while. In the meantime, we can calculate the PMF in another way: We can pull a methanol molecule through the membrane along the membrane normal and can calculate the PMF by integrating the pull forces over the reaction coordinate.
First, we will run the pulling simulations. We want to pull 7nm in 1ns. What is the pulling rate in nm/ps? Change the parameter for the pulling rate pull-coord1-rate in the mdp file pull.mdp and the paramenter for the number of steps nsteps accordingly (the timestep is given by the parameter dt in ps).
gmx grompp -f pull.mdp -c two-methanol-box.gro -p topol.top -n index.ndx -o pull.tpr
gmx mdrun -v -nice 0 -s pull.tpr
The pull forces are written to the pullf.xvg file and the positions of the methanol in pullx.xvg.
These can be used to calculate the PMF by integrating the forces over the reaction coordinate, i.e. the position of methanol. If we integrate over bins, we can get the PMF along the reaction coordinate.
We prepared a script that calculates the PMF from the pullf.xvg and pullx.xvg by integration
In the top of the file pmf-from-pulling.sh, specify the pullf.xvg and pullf.xvg files to be analyzed.
bash pmf-from-pulling.sh
To look at the file with xmgrace, we have to include the following header at the top of the created file pmf_1000ps.txt.
{
cat << 'EOF'
@ title "PMF"
@ xaxis label "Distance (nm)"
@ yaxis label "PMF (kJ/mol)"
@TYPE xy
@ s0 legend "PMF"
EOF
cat pmf_1000ps.txt
} > pmf_1000ps.xvg
Then you can look at the PMF with xmgrace.
xmgrace pmf_1000ps.xvg
Question: How do your results compare to the results of your colleagues?
Question: Do you think that this PMF is converged?
To get better convergence of the pulling simulations, we can pull with a slower rate. Because these simulations would take too much time in the practicals, we provide you here with pullf.xvg and pullf.xvg files of longer simulations of 1ns, 50ns and 100ns.
wget https://biophys.uni-saarland.de/storage/courses/pract/p13/umbrella-pract/pullf_pullx.tar.gz
tar xzvf pullf_pullx.tar.gz
Use again the script pmf-from-pulling.sh with the downloaded files and calculate again the PMF.
Question: As the simulation time increases, the PMF becomes increasingly converged. What do you think is the reason the pulling simulations systematically overestimate the PMF?
Now we calculate the PMF from the umbrella sampling. Do the next steps for the umbrella sampling that you started earlier as soon as they are finished.
To calculate the PMF with WHAM you can use the following script that trims a bit at the boundaries with poor overlap of the umbrella histograms.
bash do-wham.sh
have a look at the PMF profile.
xmgrace profile.xvg
Question: Can you see the different regions of the membrane system in the PMF (water, the polar headgroups and hydrophobic core)?
Question: Is the PMF for low values of the reaction coordinate and high values the same?
At both ends of the PMF, the reaction coordinate corresponds to the positions where methanol is placed in water. Thus, we expect the same energies at the beginning and end of the PMF.
So a “good” PMF should converge such that the PMF is the same at the ends, but in practice you’ll always see some residual mismatch from finite-time sampling. Therefore it is common to also calculate a cyclic PMF profile where the reaction coordinate is treated as periodic. You can have a look at the cyclic profile.
xmgrace profile_cycl.xvg
Question: Do you think 100ps of umbrella window simulation is enough for the system to be converged? You can also compare your PMF with your colleagues. What could be reasons for an unconverged simulation?
The convergence of the simulations can be improved by extending the simulation time in the windows. Here we provide you a set finished umbrella sampling simulations where the simulations of the windows were run for 10ns.
wget https://biophys.uni-saarland.de/storage/courses/pract/p13/umbrella-pract/methanol-umbrella-long.tar.gz
tar xzvf methanol-umbrella-long.tar.gz
Calculate again the PMF and look at the profiles.
Question: How do the results of the shorter umbrella windows compare to the longer ones?
Question: How do the results compare to the PMF calculated by integration?
Because of more degrees of freedom of Ibuprofen, it is even harder to get converged simulations. Thus, we provided you directly with converged umbrella sampling simulations, which you can download.
wget https://biophys.uni-saarland.de/storage/courses/pract/p13/umbrella-pract/ibuprofen.tar.gz
tar xzvf ibuprofen.tar.gz
Repeat the PMF calculation for ibuprofen.