wepy.analysis.parents module
Routines for converting resampling data to parental lineages.
The core mechanism of weighted ensemble (WE) is to resample cohorts of parallel simulations. Principally this means ‘cloning’ and ‘merging’ different walkers which gives rise to branching ‘family’ tree structures.
The routines in this module are concerned with utilizing raw resampling records data from a resampler and extracting these lineages in useful easy to query structures.
Cloning and merging is performed on a cohort of walkers at every cycle. A walker has both a state and weight. The state is related to the dynamical state such as the positions, velocities, etc. that describe the state of a simulation. The weight corresponds to the probability normalized with the other walkers of the cohort.
An n-clone constitutes copying a state n times to n walkers with w/n weights where w is the weight of the cloned walker. The cloned walker is said to be the parent of the new child walkers.
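The weight bookkeeping of an n-clone can be illustrated with a minimal sketch. The function name `clone_walker` and the representation of a walker as a `(state, weight)` tuple are assumptions made for illustration only; they are not part of the wepy API.

```python
def clone_walker(state, weight, n):
    """Return n child walkers, each a (state, weight / n) tuple.

    Hypothetical helper: walkers are modeled as plain tuples here.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    child_weight = weight / n
    # Every child copies the parent's state and takes an equal share of its weight.
    return [(state, child_weight) for _ in range(n)]


children = clone_walker("state_A", 0.3, 3)
total_weight = sum(w for _, w in children)  # equals the parent's weight up to rounding
```

The total weight of the children equals the weight of the parent, so cloning does not change the normalization of the cohort.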
A k-merge constitutes combining k walkers into a single walker with a weight that is the sum of the weights of all k walkers. The state of this walker is chosen by sampling from the discrete distribution of the k walkers. The walker that has its state chosen to persist is referred to as the ‘kept’ walker and the other walkers are said to be ‘squashed’, as they are compressed into a single walker, preserving their weight but not their state. Squashed walkers are leaves of the tree and leave no children. The kept walker and walkers for which no cloning or merging was performed will have a single child.
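A k-merge can be sketched in the same style. This is an illustration under stated assumptions, not the resampler's actual code: `merge_walkers` is a hypothetical name, walkers are `(state, weight)` tuples, and the kept walker is sampled with probability proportional to its weight.

```python
import random


def merge_walkers(walkers, rng=random):
    """Merge (state, weight) walkers into one.

    Returns (merged_walker, kept_idx, squashed_idxs). Hypothetical helper.
    """
    weights = [w for _, w in walkers]
    # Sample the index of the kept walker from the discrete weight distribution.
    kept_idx = rng.choices(range(len(walkers)), weights=weights, k=1)[0]
    squashed_idxs = [i for i in range(len(walkers)) if i != kept_idx]
    # The merged walker keeps the sampled state and carries the summed weight.
    merged = (walkers[kept_idx][0], sum(weights))
    return merged, kept_idx, squashed_idxs


walkers = [("A", 0.1), ("B", 0.2), ("C", 0.3)]
merged, kept_idx, squashed_idxs = merge_walkers(walkers, rng=random.Random(0))
```

In lineage terms, the walker at `kept_idx` is the parent of the merged walker, while the walkers in `squashed_idxs` leave no children.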
Routines
- resampling_panel: Compiles an unordered collection of resampling records into a structured array of records.
- parent_panel: Using a parental relationship, reduces the records in a resampling panel to a structured array of parent indices.
- net_parent_table: Reduces the full parent panel (with multiple steps per cycle) to the net results of resampling per cycle.
- parent_table_discontinuities: Using an interpretation of warping records, assigns a special value in a parent table for discontinuous warping events.
- ancestors: Generates the lineage trace of a given walker back through history. This is used to get logically contiguous trajectories from walker slot trajectories (not necessarily contiguous in storage).
- sliding_window: Generates non-redundant sliding window traces over the parent forest (tree) imposed over a parent table.
- ParentForest: Class that imposes the forest (tree) structure over the parent table. Valid for a single contig in the contig tree (forest).
- wepy.analysis.parents.DISCONTINUITY_VALUE = -1
Special value indicating that a parent-child relationship is dynamically discontinuous. Functions in this module write this value into parent tables.
- wepy.analysis.parents.resampling_panel(resampling_records, is_sorted=False)
Converts an unordered collection of resampling records into a structured array (lists) corresponding to cycles and resampling steps within cycles.
It is like pivoting the step indices into an extra dimension. The result can be thought of as a list of tables indexed by cycle, hence the name panel.
- Parameters:
- Returns:
resampling_panel – The panel (list of tables) of resampling records in order (cycle, step, walker)
- Return type:
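The pivot described for resampling_panel can be sketched as follows. This is an illustration only: the `Record` namedtuple and its field names (`cycle_idx`, `step_idx`, `walker_idx`, `decision`) are assumptions, and `build_panel` is a hypothetical stand-in rather than the library function.

```python
from collections import namedtuple

# Hypothetical record layout used only for this sketch.
Record = namedtuple("Record", ["cycle_idx", "step_idx", "walker_idx", "decision"])


def build_panel(records):
    """Arrange unordered records into panel[cycle][step][walker]."""
    ordered = sorted(records, key=lambda r: (r.cycle_idx, r.step_idx, r.walker_idx))
    panel = []
    for rec in ordered:
        # Grow the outer (cycle) and inner (step) lists on demand.
        while len(panel) <= rec.cycle_idx:
            panel.append([])
        cycle = panel[rec.cycle_idx]
        while len(cycle) <= rec.step_idx:
            cycle.append([])
        cycle[rec.step_idx].append(rec)
    return panel


records = [
    Record(1, 0, 1, "NOTHING"),
    Record(0, 0, 1, "CLONE"),
    Record(1, 0, 0, "NOTHING"),
    Record(0, 0, 0, "NOTHING"),
]
panel = build_panel(records)
```

After the pivot, `panel[cycle_idx][step_idx]` is the table of records for one resampling step, ordered by walker index.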
- wepy.analysis.parents.parent_panel(decision_class, resampling_panel)
Using the parental interpretation of resampling records given by the decision_class, convert resampling records in a resampling panel to parent indices.
- Parameters:
- Returns:
parent_panel – A structured list of the same form as the resampling panel, with parent indices swapped for resampling records.
- Return type:
- wepy.analysis.parents.net_parent_table(parent_panel)
Reduces a full parent panel to get parent indices on a cycle basis.
The full parent panel has parent indices for every step in each cycle. This computes the net parent relationships for each cycle, thus reducing the list of tables (panel) to a single table. The rows of the table need not have the same length (i.e. it may be a ragged array), since the number of walkers can differ from cycle to cycle.
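The reduction from stepwise to net parents amounts to composing the parent mappings of the steps within a cycle. The sketch below assumes that each step is a list in which entry `w` is the slot index, before that step, of the parent of walker `w` after that step. The function names are hypothetical and this is not the library's implementation.

```python
def net_parents(cycle_steps):
    """Compose the stepwise parent indices of one cycle into net parent indices."""
    # Start from the last step and follow each parent back through earlier steps.
    net = list(cycle_steps[-1])
    for step in reversed(cycle_steps[:-1]):
        net = [step[parent_idx] for parent_idx in net]
    return net


def net_parent_table_sketch(parent_panel):
    """Reduce a panel (cycles of steps) to one row of net parents per cycle."""
    return [net_parents(cycle_steps) for cycle_steps in parent_panel]


panel = [
    [[0, 0, 1], [2, 1, 1]],  # cycle 0: two resampling steps
    [[0, 1, 2]],             # cycle 1: a single step
]
table = net_parent_table_sketch(panel)
```

For cycle 0, walker 0 after the second step came from slot 2 of the first step, which in turn came from slot 1 at the start of the cycle, so its net parent is 1.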
- wepy.analysis.parents.parent_table_discontinuities(boundary_condition_class, parent_table, warping_records)
Given a parent table and warping records, returns a new parent table with the parents of discontinuous warping events set to a special value (-1).
- Parameters:
boundary_condition_class (class implementing BoundaryCondition interface) – The boundary condition class that interprets whether each warping record is discontinuous.
warping_records (list of namedtuple records) – The unordered collection of warping events from the simulation.
- Returns:
parent_table – Same shape as input parent_table but with discontinuous relationships inserted as -1.
- Return type:
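The effect on the parent table can be sketched without the boundary condition machinery. In this illustration the warping events are assumed to have already been judged discontinuous and are given as hypothetical `(cycle_idx, walker_idx)` pairs; in the documented function that judgment is delegated to the boundary condition class.

```python
DISCONTINUITY = -1  # mirrors the documented special value


def mark_discontinuities(parent_table, discontinuous_warps):
    """Return a copy of parent_table with warped entries set to DISCONTINUITY."""
    # Copy row by row so the input table is left unchanged.
    new_table = [list(row) for row in parent_table]
    for cycle_idx, walker_idx in discontinuous_warps:
        new_table[cycle_idx][walker_idx] = DISCONTINUITY
    return new_table


table = [[0, 1, 2], [0, 0, 1]]
marked = mark_discontinuities(table, [(1, 2)])
```

The output has the same shape as the input, and the original table is not modified.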
- wepy.analysis.parents.ancestors(parent_table, cycle_idx, walker_idx, ancestor_cycle=0)
Returns the lineage of ancestors as walker indices leading up to the given walker.
- Parameters:
- Returns:
ancestor_trace – Contig walker trace of the ancestors leading up to the queried walker. The contig is the sequence of cycles in the parent table.
- Return type:
list of tuples of ints (traj_idx, cycle_idx)
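Tracing a lineage is a walk backwards through the parent table. The sketch below is a hypothetical re-implementation for illustration, and its indexing convention is an assumption: row `c` of the table gives, for each walker at cycle `c`, the index of its parent at cycle `c - 1`. The library's own convention may differ.

```python
DISCONTINUITY = -1  # mirrors the documented special value


def trace_ancestors(parent_table, cycle_idx, walker_idx, ancestor_cycle=0):
    """Return [(walker_idx, cycle_idx), ...] from the ancestor to the query walker."""
    trace = [(walker_idx, cycle_idx)]
    current = walker_idx
    for c in range(cycle_idx, ancestor_cycle, -1):
        current = parent_table[c][current]
        if current == DISCONTINUITY:
            # The lineage is not dynamically continuous past this point.
            break
        trace.insert(0, (current, c - 1))
    return trace


table = [[0, 1, 2], [0, 0, 1], [1, 1, 2]]
lineage = trace_ancestors(table, cycle_idx=2, walker_idx=2)
```

Each tuple names the walker slot and cycle from which to read a frame, so concatenating those frames yields a logically contiguous trajectory.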
- wepy.analysis.parents.sliding_window(parent_table, window_length)
Return contig walker traces of sliding windows of given length over the parent forest imposed over the contig given by the parent table.
Windows are given in no particular order and are nonredundant for trees.
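One way to obtain non-redundant windows is to trace back from every node that can serve as the end of a window: since each node has exactly one parent, each end node yields exactly one window. The sketch below is a hypothetical illustration of that idea, not the library's algorithm, and it assumes that row `c` of the table gives, for each walker at cycle `c`, the index of its parent at cycle `c - 1`.

```python
def sliding_windows(parent_table, window_length):
    """Return all traces of window_length (walker_idx, cycle_idx) nodes."""
    if window_length < 1:
        raise ValueError("window_length must be at least 1")
    windows = []
    for end_cycle in range(window_length - 1, len(parent_table)):
        for walker_idx in range(len(parent_table[end_cycle])):
            window = [(walker_idx, end_cycle)]
            current = walker_idx
            # Walk back window_length - 1 generations from the end node.
            for c in range(end_cycle, end_cycle - window_length + 1, -1):
                current = parent_table[c][current]
                if current < 0:
                    break  # discontinuity: no full window ends here
                window.insert(0, (current, c - 1))
            if len(window) == window_length:
                windows.append(window)
    return windows


table = [[0, 1, 2], [0, 0, 1], [1, 1, 2]]
windows = sliding_windows(table, 2)
```

Windows that would cross a discontinuity are dropped in this sketch, which is a design choice of the illustration rather than documented behavior.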
- class wepy.analysis.parents.ParentForest(contig=None, parent_table=None)
Bases: object
A tree abstraction to a contig representing the family trees of walkers.
Uses a directed graph (networkx.DiGraph) to represent parent-child relationships; i.e. for edge (A, B) node A is a parent of node B.
Constructs a parent forest from either a Contig object or parent table.
Either a contig or parent_table must be given but not both.
The underlying data structure used is a parent table. However, if a contig is given, a reference to it will be kept.
- Parameters:
- Raises:
ValueError – If neither parent_table nor contig is given, or if both are given.
- WEIGHT = 'weight'
Key for weight node attribute.
- FREE_ENERGY = 'free_energy'
Key for free energy node attribute.
- ROOT_CYCLE_IDX = -1
Special value for the cycle index of the root nodes, which represent the initial conditions of the simulation.
- DISCONTINUITY_VALUE = -1
Special value indicating that a parent-child relationship is dynamically discontinuous. This class tests for this value in the parent table.
- CONTINUITY_VALUE = 0
- _make_child_parent_edges(step_idx, parent_idxs)
Generate edge_ids and edge attributes for an array of parent indices.
- Parameters:
- Returns:
edges (list of 2-tuple of node_id)
edge_attributes (list of dict)
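The edge construction can be sketched with plain tuples. The node id convention `(step_idx, walker_idx)`, the function name, and the choice to give discontinuous children no incoming edge are all assumptions of this illustration, and edge attribute dicts are omitted. Edges are ordered (parent, child), matching the edge direction described for the class.

```python
DISCONTINUITY = -1  # mirrors the documented special value


def child_parent_edges(step_idx, parent_idxs):
    """Return (parent_node, child_node) edges for one row of parent indices."""
    edges = []
    for walker_idx, parent_idx in enumerate(parent_idxs):
        if parent_idx == DISCONTINUITY:
            # Sketch-only choice: a discontinuous child gets no incoming edge.
            continue
        parent_node = (step_idx - 1, parent_idx)
        child_node = (step_idx, walker_idx)
        edges.append((parent_node, child_node))
    return edges


edges = child_parent_edges(0, [0, 0, 1])
```

With `step_idx = 0` the parent nodes carry a cycle index of -1, which coincides with the documented ROOT_CYCLE_IDX for the initial conditions.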
- property contig
Underlying contig if given during construction.
- property parent_table
Underlying parent table data structure.
- property graph
Underlying networkx.DiGraph object.
- property roots
Returns the roots of all the trees in this forest.
- property trees
Returns a list of the subtrees from each root in this forest, in no particular order.
- property n_steps
Number of steps of resampling in the parent forest.
- set_node_attributes(attribute_key, node_attribute_dict)
Set attributes for all nodes for a single key.
- set_attrs_by_array(attribute_key, values)
Set node attributes on a stepwise basis using structural indices.
Expects an array/list that is n_steps long and has the appropriate number of values for the number of walkers at each step.
- Parameters:
attribute_key (str) – Key to set for a node attribute.
values (array_like of dim (n_steps, n_walkers) or list of list of values) – Either an array_like if there is a constant number of walkers or a list of lists of the values.
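The stepwise layout of `values` maps naturally onto a per-node dictionary. The sketch below assumes node ids of the form `(step_idx, walker_idx)`, and the helper name is hypothetical. It only illustrates the index bookkeeping, not how the class stores attributes.

```python
def node_attribute_dict(values):
    """Map values[step_idx][walker_idx] to {(step_idx, walker_idx): value}."""
    return {
        (step_idx, walker_idx): value
        for step_idx, step_values in enumerate(values)
        for walker_idx, value in enumerate(step_values)
    }


# Ragged input: the number of walkers differs between steps.
weights = [[0.5, 0.5], [0.25, 0.25, 0.5]]
attrs = node_attribute_dict(weights)
```

A dictionary of this shape is the kind of per-node mapping that a method such as set_node_attributes(attribute_key, node_attribute_dict) takes as its second argument.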