Week 06 Lab: Single Particle Langevin Dynamics
Contents
Week 06 Lab: Single Particle Langevin Dynamics¶
Student Name: YOUR NAME HERE¶
Today we’re studying a very simple system: a single particle on a one-dimensional energy surface. While this might not seem like the most exciting (or biologically-relevant) system, it is complex enough to demonstrate some essential principles of MD simulation (and even gain some insight into real biomolecular processes).
While there are many software packages that implement MD simulations, in this lab we will be writing all of the code to do this in our Jupyter notebook! In later weeks we will move on to use the OpenMM program to conduct biomolecular simulations, but for this week and the next we will get down and dirty implementing MD with functions we write ourselves.
Forces¶
First let’s write functions for the internal energy (\(U(x)\)) and the net force (\(F(x) = -dU(x)/dx\)).
We get to design our own energy function here and we will start with the following:
Let’s code it up and then see how it looks!
Task: Finish the following function:
def potential(x):
U = None # CORRECT THIS LINE
return U
Now run the cell below to plot it on the range (-1,4):
import numpy as np
import matplotlib.pyplot as plt
x_vals = np.linspace(-1,4,100)
plt.plot(x_vals,potential(x_vals))
plt.xlabel("x")
plt.ylabel("U(x)")
You should see a double-well potential with the deepest well around \(x=0\) and a secondary well at \(x=3\).
We often refer to these wells as “local minima”. The \(x=0\) well here is the “global minimum”.
Task: Make another function called force
that takes in x
, then computes and returns -dU/dx
. (Dust off those calculus skills!)
This should automatically work both when x
is a float and when x
is a numpy array.
# your code here
# Confirm it is working with this test
if force(2) == 4.0:
print("Test passed!")
else:
print("Test failed!")
if (force(np.array([1.0,2.0])) == np.array([-6,4])).all():
print("Test passed!")
else:
print("Test failed!")
Integrator¶
Now let’s write a Langevin Velocity Verlet integrator.
As a reminder here are the Langevin-VV update equations:
Remember: the \(R(t)\) function is a random noise term! It generates a random variable for each timestep that is drawn from a Gaussian distribution with mean=0 and variance=1. This can be easily done here using the np.random.normal()
function.
The following VV_langevin
function will implement this integration scheme. It will take in positions (\(x\)) velocities (\(v\)), masses (\(m\)) and a function to compute the forces. It also has an argument for a parameter dictionary that must contain the timestep (\(dt\)), the friction coefficient (\(\gamma\)) and the thermal energy (\(k_B T\)). Note that even though we only have one particle (\(N = 1\)), this function will be written to be general for any value of \(N\).
Task: complete the missing lines below.
def VV_langevin(x, v, m, force_alg, params):
"""
x : positions
np.array, shape = (N,ndim)
v : velocities
np.array, shape = (N,ndim)
m : masses
np.array, shape = (N)
force_alg : force function
callable function that takes in x, returns a np.array with same shape
params : parameter dict
dict (must contain dt, gamma, kT [all floats])
"""
dt = params['dt']
gamma = params['gamma']
kT = params['kT']
# define arrays for new positions and new velocities
new_x = np.zeros_like(x)
new_v = np.zeros_like(x)
N = x.shape[0]
# compute f(t)
f = force_alg(x)
# use f to update x
for i in range(N):
# new_x[i] is the (new) position for particle i
new_x[i] = None # CORRECT THIS LINE
# compute f(t+dt)
new_f = force_alg(new_x)
# use f(t+dt) and f(t) to update v
for i in range(N):
# define Langevin prefactor
prefac = np.sqrt(2*gamma*kT/m[i])
new_v[i] = None # CORRECT THIS LINE
return new_x, new_v
Great! Now let’s test this integrator out by calling it with our force function.
test_x = np.array([[2.]])
test_v = np.array([[0.]])
m = np.array([1.0])
params = {'dt' : 0.01,
'gamma' : 1.0,
'kT' : 0} # use zero thermal energy for now (testing)
updated_x, updated_v = VV_langevin(test_x, test_v, m, force, params)
# confirm it is working with this test
if updated_x[0,0] == 2.0002:
print("Test passed!")
else:
print("Test failed!")
if np.abs(updated_v[0,0] - 0.040008) < 1e-6:
print("Test passed!")
else:
print("Test failed!")
One more test: turn on the random force (set kT = 1000
) and then call the function 10000
times (each time using the same test_x
and test_v
). Save of all of the updated_v
values that you get in an array called all_vs
.
# your code here
Here’s a histogram of the distribution:
counts, bins, _ = plt.hist(all_vs,bins=20)
And we can test the code using a fit to a normal distribution using scipy.stats.norm
:
from scipy.stats import norm
mean,std = norm.fit(all_vs)
if np.abs(mean) < 0.05:
print("Test passed!")
else:
print("Test failed!")
if np.abs(std - np.sqrt(2*params['gamma']*params['kT']/m[0])*params['dt']) < 0.01:
print("Test passed!")
else:
print("Test failed!")
print("The data fits to a normal distribution with mean", mean, "and width", std)
Are these results expected? Why or why not?
/ your answer here /
Simulation manager¶
We’re all ready to run some simulations! First let’s make things a little easier on ourselves by defining a simulation manager. This will be a function that takes in initial positions, velocities, masses, an integrator and a set of parameters, and returns a trajectory.
def sim_manager(init_pos, init_vel, m, force_fn, integrator, params):
"""
init_pos : initial positions
np.array, shape = (N,ndim)
init_vel : initial velocities
np.array, shape = (N,ndim)
m : masses
np.array, shape = (N)
force_fn : force function
callable function that takes in x, returns a np.array with same shape
integrator : integrator function
callable function that takes in x, v, m, force_alg, params
params : parameter dict
dict (must contain dt, gamma, kT [floats],
n_steps, report_freq [integers])
"""
# define empty arrays for (periodically) saving x and v
n_steps = params['n_steps']
report_freq = params['report_freq']
n_reports = int(n_steps/report_freq) + 1
N = init_pos.shape[0]
ndim = init_pos.shape[1]
traj_x = np.zeros((n_reports, N, ndim))
traj_v = np.zeros((n_reports, N, ndim))
current_x = init_pos
current_v = init_vel
for step in range(n_steps):
current_x, current_v = integrator(current_x, current_v, m, force_fn, params)
if step % report_freq == 0:
# time to report
report_idx = int(step/report_freq)
traj_x[report_idx] = current_x
traj_v[report_idx] = current_v
return traj_x, traj_v
Run three trajectories starting at x=0, v=0:
params = {'dt' : 0.01,
'gamma' : 1.0,
'kT' : 100,
'n_steps' : 10000,
'report_freq' : 100
}
all_trajs = []
n_traj = 3
for i in range(n_traj):
print("Running trajectory",i+1,"of",n_traj)
pos_traj, _ = sim_manager(np.array([[0.]]),
np.array([[0.]]),
np.array([1.]),
force,
VV_langevin,
params)
all_trajs.append(pos_traj)
Here’s a nice plot of our trajectories, along with the potential:
def traj_plot(traj_list,x_vals,potential,n_steps):
fig, ax = plt.subplots(nrows=2,ncols=1, figsize=(4,8),sharex=True)
for i, traj in enumerate(traj_list):
ax[0].plot(traj[:,0,0],np.linspace(0,int(n_steps),pos_traj.shape[0]), label=f"Traj {i+1}")
ax[0].set_ylabel("Time")
ax[0].legend()
ax[1].plot(x_vals,potential(x_vals))
ax[1].set_xlabel("x")
ax[1].set_ylabel("U(x)")
traj_plot(all_trajs,x_vals,potential,params['n_steps'])
Run this again, using a larger number of trajectories and a larger number of steps. What is happening here? Why?
/your answer here/
Try starting at x=3
instead. Run a set of trajectories below and describe their behavior.
# your code here
/your description here/
Free energy plots¶
A great way to analyze your trajectories is to show the probability of visiting a given value of \(x\), which we denote \(P(x)\). Probabilities are in the range \([0,1]\), but sometimes it can be difficult to see the probabilities of rarely visited states in your trajectory when the probability is plotted directly.
For this reason it is customary to instead plot \(f(x) = -\log(P(x))\).
Since \(x\) is continuous, we can first use the np.histogram
function to bin it. Let’s use the all_trajs[0]
array for an example:
counts, bins = np.histogram(all_trajs[0],bins=20)
cent_bins = 0.5*(bins[1:] + bins[:-1]) # get the center of each bin
Now the probability can be computed using the counts:
P = counts/counts.sum()
And the f
can be computed using np.log
, although since \(\log(0) \rightarrow \infty\), we need to handle the non-zero values appropriately:
# the "where" argument only computes the log where the P values are greater than 0
f = -params['kT']*np.log(P,where=P>0)
# the P>0 is a "mask" used to select only the values of f where P > 0
f -= f[P>0].min()
fmax = 10000
# again, we use a P==0 mask to select only elements of f where P = 0,
# and then we set those to fmax
f[P==0] = fmax
This f
array is actually the free energy of the system in units of kT
. This can be plotted along with the potential as follows:
zeroed_potential = potential(x_vals)
zeroed_potential -= zeroed_potential.min()
fig, ax = plt.subplots(1,1)
ax.plot(x_vals,zeroed_potential, color='blue',label='U(x)')
ax.set_xlabel('x')
ax.set_ylim([-1,18]) # set the same range of y-values
ax.set_ylabel('U(x)',color='blue')
ax2 = ax.twinx()
ax2.plot(cent_bins[f < fmax], f[f < fmax], color='red', label='Free energy (kT)')
ax2.set_ylim([-100,1800]) # set the same range of y-values
ax2.set_ylabel('Free energy (kT)',color='red')
If we want to combine multiple trajectories together, we can’t just average the free energy! Instead we need to average the probabilities, or, equivalently, just sum the counts. Let’s make a function that does this:
def free_energy(coord_list, kT, nbins=20, fmax=10000):
# get global min and max
c_min = np.array(coord_list).min()
c_max = np.array(coord_list).max()
# define a total counts array
tot_counts = np.zeros((nbins))
for coord in coord_list:
counts, bins = np.histogram(coord,bins=nbins,range=(c_min,c_max))
tot_counts += counts
cent_bins = 0.5*(bins[1:] + bins[:-1]) # get the center of each bin
P = tot_counts/tot_counts.sum()
f_nrg = -kT*np.log(P,where=P>0)
f_nrg -= f_nrg[P>0].min() # subtract the free energy minimum, so that
# the lowest value is zero
f_nrg[P==0] = fmax # for all unvisited bins, assign them the maximum
# free energy
return cent_bins, f_nrg, P
And use it to plot a global free energy distribution for the whole all_trajs
list:
bins, f, P = free_energy(all_trajs, params['kT'])
fig, ax = plt.subplots(1,1)
ax.plot(x_vals,zeroed_potential, color='blue',label='U(x)')
ax.set_xlabel('x')
ax.set_ylim([-1,18]) # set the same range of y-values
ax.set_ylabel('U(x)',color='blue')
ax2 = ax.twinx()
ax2.plot(bins[f < fmax], f[f < fmax], color='red', label='Free energy (kT)')
ax2.set_ylim([-100,1800]) # set the same range of y-values
ax2.set_ylabel('Free energy (kT)',color='red')
Turning up the heat¶
To finish off our lab today, run sets of 10 trajectories (starting at x=0
, each with 50000 steps, for each kT
value in the following list: 100, 150, 200, 250, 300
.
Save each trajectory set separately, then plot their P(x)
curves (averaged over all the trajectories in the set). Set the plt.yscale('log')
and make sure to mask out the P=0
values!
# your code here