Unbinding Simulation
====================

In the previous step we covered how to prepare your data for running a
Wepy simulation. In this part, we will cover how to set up the
simulation with that data.

Which files you need depends on whether you are running a rebinding or
an unbinding simulation. Since this tutorial focuses on an unbinding
simulation, we need the following files:

- system.pkl : Generated in the Building System and Topology Pickle Files section.
- topology.pkl : Generated in the Building System and Topology Pickle Files section.
- \*.rst : A restart file from the simulation you want to continue from.
  This is needed to obtain the positions and periodic box vectors of the
  system. E.g. ``step5_10.rst`` in this case.
- \*.pdb : A PDB file of the system. E.g. ``step3_input.pdb`` in this case.

If you do not have these files, please visit the
:any:`How to Prepare Your Data ` section of this tutorial. After
confirming that you have the necessary files, you can proceed to the
next step.

Let's start by importing the necessary modules.

.. code:: python

    import os
    import os.path as osp
    import pickle as pkl

    import simtk.openmm.app as omma
    import simtk.openmm as omm
    import simtk.unit as unit

    import mdtraj as mdj

    from wepy.sim_manager import Manager
    from wepy.resampling.resamplers.revo import REVOResampler
    from wepy.resampling.distances.receptor import UnbindingDistance
    from wepy.runners.openmm import OpenMMGPUWalkerTaskProcess, OpenMMRunner, OpenMMWalker, OpenMMState, gen_sim_state
    from wepy.boundary_conditions.receptor import UnbindingBC
    from wepy.reporter.hdf5 import WepyHDF5Reporter
    from wepy.work_mapper.task_mapper import TaskMapper
    from wepy.util.mdtraj import mdtraj_to_json_topology
    from wepy.reporter.dashboard import DashboardReporter
    from wepy.reporter.openmm import OpenMMRunnerDashboardSection

    # Custom pickle reporter shipped alongside this tutorial
    from walker_pkl_reporter import WalkersPickleReporter

Great! Now that we have imported the necessary modules, we can proceed
to set up the simulation parameters and file paths. In this tutorial,
we will use 8 walkers and run 2 cycles of 5 steps each. These
parameters are chosen for a short demonstration and can be adjusted to
suit your needs.

.. code:: python

    # Typical sim details
    num_walkers = 8
    run = 1
    n_steps = 5
    n_cycles = 2

    #### Paths: Set it to your preference
    inp_path = './'
    pdb_path = f'{inp_path}/step3_input.pdb'
    rst_path = f'{inp_path}/step5_10.rst'
    system_path = f'{inp_path}/system.pkl'
    topology_path = f'{inp_path}/topology.pkl'

    outputs_dir = f'simdata_run{run}_steps{n_steps}_cycs{n_cycles}'
    os.makedirs(outputs_dir, exist_ok=True)

The next step is to use the files we have prepared to set up the
simulation. We will start by loading the system and topology files and
then read the positions and periodic box vectors from the restart file.

.. code:: python

    with open(system_path, 'rb') as f:
        system = pkl.load(f)

    with open(topology_path, 'rb') as f:
        omm_top = pkl.load(f)

    # Get positions and box vectors from an rst file
    # This is typically a restart file post nvt/npt from openmm
    # However, you can build it from scratch using omma.PDBFile() too
    with open(rst_path, 'r') as f:
        simtk_state = omm.XmlSerializer.deserialize(f.read())

    bv = simtk_state.getPeriodicBoxVectors()
    pos = simtk_state.getPositions()

    system.setDefaultPeriodicBoxVectors(bv[0], bv[1], bv[2])
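If you do not have a restart file, the comment in the block above notes
that the initial coordinates can be built from a PDB instead. Below is a
minimal sketch of that alternative using ``omma.PDBFile``; it assumes
your PDB carries box information (a CRYST1 record), otherwise
``getPeriodicBoxVectors()`` returns ``None``.

.. code:: python

    # Alternative sketch: no restart file available, so take positions
    # and box vectors directly from the PDB (assumes the PDB has a
    # CRYST1 record so the box vectors are defined)
    pdb_file = omma.PDBFile(pdb_path)
    pos = pdb_file.getPositions()

    bv = pdb_file.topology.getPeriodicBoxVectors()
    if bv is not None:
        system.setDefaultPeriodicBoxVectors(bv[0], bv[1], bv[2])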
Now we need an integrator and a runner for our simulation. An
integrator is a computational method that updates the positions and
velocities of particles over time, calculating their motion from the
forces acting on them according to Newton's laws of motion. In this
tutorial, we will use the Langevin integrator. This simulation will run
on a GPU, so we pass the ``platform='CUDA'`` option when setting up the
OpenMMRunner.

.. code:: python

    integrator = omm.LangevinIntegrator(300*unit.kelvin,
                                        1/unit.picosecond,
                                        0.002*unit.picoseconds)

    runner = OpenMMRunner(system, omm_top, integrator, platform='CUDA')

Next, we will build the walkers for the simulation. For this, we need
to generate a state from the system and the positions we loaded above.
All of the initial walkers will have the same state and weight at the
start of the simulation.

.. code:: python

    # Generate a new simtk "state"
    new_simtk_state = gen_sim_state(pos, system, integrator)

    # Set up parameters for running the simulation
    init_weight = 1.0 / num_walkers

    # Generate the walker state in wepy format
    walker_state = OpenMMState(new_simtk_state)

    # Make a list of the initial walkers
    init_walkers = [OpenMMWalker(walker_state, init_weight)
                    for i in range(num_walkers)]

Now that we have the walkers, we need a distance metric for measuring
differences between walker states. In this case, we will use the
UnbindingDistance class from the wepy library; however, you can define
your own distance metric if needed. It is constructed from the indices
of the ligand atoms, the indices of the binding site atoms, and the
initial walker state.

.. code:: python

    pdb = mdj.load_pdb(pdb_path)
    json_top = mdtraj_to_json_topology(pdb.top)

    # Save some relevant indices
    # Fill in mdtraj selection strings for your own system,
    # e.g. the ligand residue name for lig_idxs and 'protein' for protein_idxs
    lig_idxs = pdb.top.select('')
    protein_idxs = pdb.top.select('')

    # Binding site: protein atoms within 0.5 nm of the ligand
    binding_selection_idxs = mdj.compute_neighbors(pdb, 0.5, lig_idxs,
                                                   haystack_indices=protein_idxs,
                                                   periodic=True)[0]

    # Distance metric to be used in REVO
    unb_distance = UnbindingDistance(lig_idxs,
                                     binding_selection_idxs,
                                     walker_state)

Next, we will set up the boundary conditions for the simulation. In
this case, we will use the UnbindingBC class from the wepy library. It
checks each walker against the cutoff distance and warps any walker
that has crossed the unbinding boundary back to the initial state.

.. code:: python

    ubc = UnbindingBC(cutoff_distance=1.0,  # nm
                      initial_state=walker_state,
                      topology=json_top,
                      ligand_idxs=lig_idxs,
                      receptor_idxs=protein_idxs)

Next, we will use the REVOResampler class from the wepy library to set
up the resampler. It clones and merges walkers each cycle based on the
distance metric we defined above.

.. code:: python

    # Set up the REVO Resampler with the parameters
    resampler = REVOResampler(distance=unb_distance,
                              init_state=walker_state,
                              weights=True,
                              pmax=0.1,
                              dist_exponent=4,
                              merge_dist=0.25,
                              char_dist=0.1)

And finally, we need to set up the reporters for recording the
simulation data, define the task mapper, and build the simulation
manager, as shown in the code block below. An important point in this
step is to define the number of workers and the device ids for the
GPUs. In this case, we will use 2 workers and the GPU device ids 0 and
1. If you are using more GPUs, you can increase the number of workers
and device ids accordingly.
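Since both the runner and the work mapper request the ``CUDA``
platform, it can be worth a quick optional check that OpenMM actually
sees it on your machine. A minimal sketch:

.. code:: python

    # Optional sanity check: list the OpenMM platforms available on this
    # machine and make sure 'CUDA' is among them before starting the workers
    available_platforms = [omm.Platform.getPlatform(i).getName()
                           for i in range(omm.Platform.getNumPlatforms())]
    print('Available OpenMM platforms:', available_platforms)

    if 'CUDA' not in available_platforms:
        raise RuntimeError('CUDA platform not found; check your OpenMM installation')

With ``CUDA`` available, the reporters, work mapper, and simulation
manager are set up as follows.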
.. code:: python

    # Set up the HDF5 reporter
    hdf5_reporter = WepyHDF5Reporter(save_fields=('positions', 'box_vectors'),
                                     file_path=osp.join(outputs_dir, 'wepy.results.h5'),
                                     resampler=resampler,
                                     boundary_conditions=ubc,
                                     topology=json_top)

    # Set up the pickle reporter (essential for restarts)
    out_folder_pkl = osp.join(outputs_dir, 'pkls')
    pkl_reporter = WalkersPickleReporter(save_dir=out_folder_pkl,
                                         freq=1,
                                         num_backups=2)

    # Set up the dashboard reporter
    dashboard_path = osp.join(outputs_dir, 'wepy.dash.org')
    openmm_dashboard_sec = OpenMMRunnerDashboardSection(runner)
    dashboard_reporter = DashboardReporter(file_path=dashboard_path,
                                           runner_dash=openmm_dashboard_sec)

    # Create a work mapper for NVIDIA GPUs for a GPU cluster
    mapper = TaskMapper(walker_task_type=OpenMMGPUWalkerTaskProcess,
                        num_workers=2,
                        platform='CUDA',
                        device_ids=[0, 1])

    # Build the simulation manager
    sim_manager = Manager(init_walkers,
                          runner=runner,
                          resampler=resampler,
                          boundary_conditions=ubc,
                          work_mapper=mapper,
                          reporters=[hdf5_reporter, pkl_reporter, dashboard_reporter])

    print('Running the simulation...')

    # Run a simulation with the manager for 'n_cycles', with 'n_steps'
    # of integrator steps in each cycle
    steps_list = [n_steps for i in range(n_cycles)]

    # and..... go!
    sim_manager.run_simulation(n_cycles, steps_list)

And that is it! Now you can start the simulation by running
``python wepy_run.py``. The simulation will run for the specified
number of cycles and steps, and the simulation data will be saved in
the ``outputs_dir`` directory. You can use the saved data to analyze
the simulation results.
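As a quick first look at the results, you can inspect the layout of the
HDF5 file written by the WepyHDF5Reporter. The sketch below only uses
``h5py`` to walk the file's group tree, and assumes it runs in the same
script or session so that ``outputs_dir`` is defined; for full-featured
analysis, wepy provides the ``wepy.hdf5.WepyHDF5`` class for reading
runs, trajectories, and resampling records.

.. code:: python

    import os.path as osp

    import h5py

    # Minimal sketch: walk the group tree of the results file to see
    # what was recorded (assumes the simulation above has finished and
    # wrote 'wepy.results.h5' into outputs_dir)
    results_path = osp.join(outputs_dir, 'wepy.results.h5')

    with h5py.File(results_path, 'r') as f:
        # print every group/dataset path in the file
        f.visit(print)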