Introduction & Features
There is an academic paper describing various aspects of the design and
usage of wepy. Peer review is in process.
Weighted Ensemble (WE)
The weighted ensemble (WE) algorithm is a general strategy for simulating rare or long-timescale events in stochastic systems. It works by running an ensemble of simulations (individually called ‘walkers’); at specific times during the simulations all walkers are stopped and examined in order to identify any behaviors of interest that may have occurred. Each such round of simulation and examination is called a ‘cycle’. Walkers which are “interesting” will have more simulation effort put into them, while those of less interest will be dropped or simulated less. This is typically achieved by “cloning” high-value walkers into many copies, thus giving them more chances to exhibit otherwise rare behaviors. Cloning is usually accompanied by the removal (typically called merging or squashing) of other walkers so that computational resources are not diluted.
This is similar to importance sampling methods; however, we are interested not only in the behaviors of the walker simulations but also in estimates of their probability. To achieve this, the process of cloning and merging must always preserve the original weight distribution. Such a process is said to be a “resampling” process, and the cloning and merging approach does this by construction. The outcome is that the “weights” of walkers are always accounted for, and in the limit of a fully converged simulation these weights correspond to the stationary probabilities.
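The weight bookkeeping behind cloning and merging can be illustrated with a minimal sketch. This is not wepy's API: the `(state, weight)` tuple representation and the `clone` and `merge` helpers are assumptions made purely for illustration. A clone splits a walker's weight evenly among its children, and a merge keeps one of the two states (chosen with probability proportional to weight) and gives it the combined weight, so the total weight never changes.

```python
import random

# Hypothetical representation for illustration: a walker is a (state, weight) pair.


def clone(walkers, index, n_copies=2):
    """Replace one walker with n_copies children that split its weight evenly."""
    state, weight = walkers[index]
    children = [(state, weight / n_copies)] * n_copies
    return walkers[:index] + children + walkers[index + 1:]


def merge(walkers, index_a, index_b, rng=random):
    """Replace two walkers with one that carries their combined weight.

    The surviving state is chosen with probability proportional to weight.
    """
    (state_a, weight_a), (state_b, weight_b) = walkers[index_a], walkers[index_b]
    total = weight_a + weight_b
    survivor = state_a if rng.random() < weight_a / total else state_b
    others = [w for i, w in enumerate(walkers) if i not in (index_a, index_b)]
    return others + [(survivor, total)]


rng = random.Random(0)
walkers = [("A", 0.25), ("B", 0.25), ("C", 0.25), ("D", 0.25)]
walkers = clone(walkers, 3)          # "D" becomes two walkers of weight 0.125
walkers = merge(walkers, 0, 1, rng)  # "A" and "B" become one walker of weight 0.5
total_weight = sum(weight for _, weight in walkers)
```

After one clone and one merge the ensemble again has four walkers and `total_weight` is still 1.0, which is the property a resampling step has to maintain.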
While the computational effort needed to reach convergence may be
excessive for systems with high-dimensional state spaces, WE has other
advantages when compared to so-called biased methods. Namely, the
corresponding “laws” of propagation (e.g. Hamiltonians, force fields,
etc.) are never modified, which means that the individual trajectories
are always “correct”. This particular feature may not be of interest to
many, but for fields where the specific details of the “mechanism” of a
process are of interest this is a major advantage. For instance, a major
use case of WE (and wepy itself) is to observe “transition states”
(kinetic bottleneck structures in biomolecular processes) at all-atom
resolution. This is useful for scientists who are designing drugs that
affect the structure in these particular states.
Features
- State-of-the-art WE resamplers: WExplore and REVO
- Super fast molecular dynamics via OpenMM
- Purpose-built HDF5 storage format for WE data with an extensive API: WepyHDF5
- Analysis routines for:
  - free energy profiles
  - rate calculations
  - computing trajectory observables
  - extracting linear trajectories from clone-merge trees
  - conformation state networks such as Markov State Models (MSMs)
  - aggregating multiple runs
- Orchestration framework for managing large numbers of simulations with simulation checkpointing, recovery, and continuations.
- Expert friendly: fully featured framework for building and customizing simulations for exactly what you need.
- Leverage the entire Python ecosystem; you’re never limited to an old version of an embedded interpreter.
- No complex ad hoc configuration files; everything is Python.
Getting Started
Once you have wepy installed you can check out the quickstart to get a rough idea of how it works.
Then you can either read the user's guide or head on to the tutorials or execute the examples.
For a complete description of the modules and their components check out the API documentation.
Compatibility
Tested with the following combinations:
| Python | OpenMM | Pass |
|---|---|---|
| 3.6 | 7.4.1 | ✓ |
| 3.6 | 7.3.1 | ✓ |
| 3.7 | 7.4.1 | ✓ |
| 3.7 | 7.3.1 | ✓ |
| 3.7 | 7.5.1 | ✓ |
| 3.8 | 7.4.1 | ✗ |
| 3.8 | 7.3.1 | ✗ |
See the noxfile.py for the full test matrix.
Contributed wepy libraries and other useful resources
Here is a list of packages that are not in the main wepy repository
but may be of interest to users of wepy.
These include:
- distance metrics and boundary conditions for different kinds of systems
- resampler and runner prototypes
- related analysis or utility libraries
They are:
- geomm
Purely functional library for common numerical routines in computational biology and chemistry, with no dependency on specific file or topology formats.
- wepy-developer-resources
Unofficial and miscellaneous materials related to wepy, including talks, workshops, contributed tutorials etc. May be out of date.
- mastic
Library for doing general purpose “profiling” of intermolecular interactions. Useful for computing observables an experimental chemist understands. Also useful for building distance metrics.
- mdtraj
Excellent library with optimized code for numerical routines of interest in computational biology and chemistry. Differs from geomm in that it relies on its own topology format. The WepyHDF5 JSON topology format is borrowed from this library. Used in wepy as a utility writer of commonly used formats like PDBs, DCDs, etc.
- openmmtools
Contributed components for OpenMM. Contains some ready-made test systems that are very convenient for testing and prototyping components in wepy.
- openmm-systems
A friendly fork of openmmtools that just provides the test systems for ease of installation. We depend on this for our examples and testing.
- CSNAnalysis
Small library for aiding in the analysis of conformation state networks (CSNs), which can be generated from wepy data.
Alternatives
wepy is not the only WE framework package. Other packages have
different scopes and features. I have tried to provide a fair comparison
of wepy to them to help potential users make an informed decision.
If you feel a package is misrepresented, contact the wepy devs or
submit a pull request with your desired changes.
Weighted ensemble package in Python 2.7. It is more reliant on and integrated with Unix-like operating systems, providing modularity through shell scripting and Python modules.
As an older project it has support for more MD engines (and non-MD stochastic sampling engines, e.g. BioNetGen) and is currently better suited for running simulations on large numbers of CPUs in a clustered environment.
Support for WE algorithms closer to the original paper by Huber and Kim, with a focus on static tessellation of conformational space.
Has some support for adaptive binning algorithms like WExplore, but it is a little more challenging to develop radically different resamplers like REVO, which have no concept of bins at all.
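For readers unfamiliar with the distinction, "static tessellation" means the bins are fixed before the simulation starts, and each cycle begins by sorting walkers into them. The sketch below shows only that assignment step, under strong simplifying assumptions: a one-dimensional progress coordinate, and hypothetical names (`assign_bins`, `edges`) that do not come from any of the packages discussed here. In bin-based schemes, cloning and merging would then typically be applied within each bin to approach a target walker count; a binless resampler has no equivalent step.

```python
from bisect import bisect_right
from collections import defaultdict


def assign_bins(walkers, edges):
    """Group (coordinate, weight) walkers into fixed bins along one coordinate.

    Bin k holds walkers with edges[k - 1] <= coordinate < edges[k]; bin 0 lies
    below the first edge and bin len(edges) lies at or above the last edge.
    """
    bins = defaultdict(list)
    for coordinate, weight in walkers:
        bins[bisect_right(edges, coordinate)].append((coordinate, weight))
    return dict(bins)


edges = [1.0, 2.0, 3.0]  # fixed in advance, never updated during the run
walkers = [(0.4, 0.5), (1.2, 0.25), (1.7, 0.125), (3.5, 0.125)]
bins = assign_bins(walkers, edges)

# Total weight per occupied bin; bin 2 is empty and therefore absent.
bin_weights = {k: sum(w for _, w in members) for k, members in bins.items()}
```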
AWE: Accelerated Weighted Ensemble
Another Python 2 library with a focus on the Accelerated WE resampling algorithm and integration with a Work Queue library for distributed jobs.