Introducing Observables¶
Observables are quantities calculated from simulation data that provide insights into the system’s behavior. A common observable is RMSD (Root Mean Square Deviation). In this guide, we’ll walk through the process of defining and using observables within a simulation.
import mdtraj as mdj
# Load the PDB file
pdb_path = 'path/to/your/file.pdb'
pdb = mdj.load_pdb(pdb_path)
To calculate observables, we often need to select specific residues or atoms. Here’s how to select residues using MDTraj.
# Select specific residues based on segment names and residue numbers
active_1_res = pdb.top.select('(segname PROA and (residue 399)) or (segname PROB and (residue 366))')
active_2_res = pdb.top.select('(segname PROB and (residue 399)) or (segname PROC and (residue 366))')
active_3_res = pdb.top.select('(segname PROC and (residue 399)) or (segname PROA and (residue 366))')
CARA_active_res = pdb.top.select('segname CARA and (residue 24 25)')
CARB_active_res = pdb.top.select('segname CARB and (residue 24 25)')
CARC_active_res = pdb.top.select('segname CARC and (residue 24 25)')
An observable function takes simulation data as input and computes a specific property. You can define your own functions based on your research needs.
Here’s an example of how you might define an observable function to compute centroid distances:
import numpy as np
def centroid_distance(fields_d, *args, **kwargs):
centroid_distances = []
for i in range(len(fields_d['positions'])):
active_1_centroid = np.mean(fields_d['positions'][i][args[0]['active_1_res']], axis=0)
active_2_centroid = np.mean(fields_d['positions'][i][args[0]['active_2_res']], axis=0)
active_3_centroid = np.mean(fields_d['positions'][i][args[0]['active_3_res']], axis=0)
CARA_active_centroid = np.mean(fields_d['positions'][i][args[0]['CARA_active_res']], axis=0)
CARB_active_centroid = np.mean(fields_d['positions'][i][args[0]['CARB_active_res']], axis=0)
CARC_active_centroid = np.mean(fields_d['positions'][i][args[0]['CARC_active_res']], axis=0)
centroid_distances.append(np.array([
np.linalg.norm(active_1_centroid - CARA_active_centroid),
np.linalg.norm(active_3_centroid - CARB_active_centroid),
np.linalg.norm(active_2_centroid - CARC_active_centroid)
]))
return np.array(centroid_distances)
Once we have the observable function, we can use it to compute properties during the simulation. Wepy provides tools to manage simulation data and compute observables.
from wepy.hdf5 import WepyHDF5
# Path to the WEPY results files
outputs_dir = ['path/to/output1', 'path/to/output2']
wepy_results = [WepyHDF5(output_dir + '/wepy.results.h5', mode='r+') for output_dir in outputs_dir]
for wepy_result in wepy_results:
with wepy_result:
args = [{
'active_1_res': active_1_res,
'active_2_res': active_2_res,
'active_3_res': active_3_res,
'CARA_active_res': CARA_active_res,
'CARB_active_res': CARB_active_res,
'CARC_active_res': CARC_active_res
}]
obs = wepy_result.compute_observable(
centroid_distance,
['positions'], # Specify the required fields
args=(args), # Pass custom arguments
save_to_hdf5='centroid', # Save results to the HDF5 file
return_results=True
)
Now you have centroid distance data that you can use to analyze the system’s behavior. You can also use the data to visualize the system’s behavior over time.