wepy.analysis.parents module
Routines for converting resampling data to parental lineages.
The core mechanism of weighted ensemble (WE) is to resample cohorts of parallel simulations. Principally, this means ‘cloning’ and ‘merging’ different walkers, which gives rise to branching ‘family’ tree structures.
The routines in this module take the raw resampling records produced by a resampler and extract these lineages into useful, easy-to-query structures.
Cloning and merging are performed on a cohort of walkers at every cycle. A walker has both a state and a weight. The state is the dynamical state of a simulation, such as the positions, velocities, etc. The weight is the walker's probability, normalized over the walkers of the cohort.
An n-clone copies a state to n walkers, each with weight w/n, where w is the weight of the cloned walker. The cloned walker is said to be the parent of the new child walkers.
A k-merge combines k walkers into a single walker whose weight is the sum of the weights of all k walkers. The state of this walker is chosen by sampling from the discrete distribution of the k walkers. The walker whose state is chosen to persist is referred to as the ‘kept’ walker, and the other walkers are said to be ‘squashed’, as they are compressed into a single walker, preserving their weight but not their state. Squashed walkers are leaves of the tree and leave no children. The kept walker, and any walker for which no cloning or merging was performed, will have a single child.
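The weight bookkeeping described above can be illustrated with a minimal sketch. The helper names `clone` and `merge` are hypothetical and are not part of wepy. The sketch also assumes that the kept walker is drawn with probability proportional to its weight, which is one reading of "sampling from the discrete distribution of the k walkers".

```python
import random


def clone(weight, n):
    """Hypothetical helper: split one walker's weight evenly among n children."""
    return [weight / n] * n


def merge(weights, rng=None):
    """Hypothetical helper: merge k walkers into one.

    Returns the index of the kept walker and the merged weight.
    Assumption: the kept walker is sampled proportionally to weight.
    """
    rng = rng or random.Random()
    kept_idx = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return kept_idx, sum(weights)


# A 3-clone of a walker with weight 0.3 gives three children of weight 0.1.
child_weights = clone(0.3, 3)

# A 3-merge keeps one state and carries the summed weight of all three.
kept_idx, merged_weight = merge([0.1, 0.2, 0.05], rng=random.Random(0))
```

In both operations the total weight of the cohort is conserved; only the number of walkers and which states survive change.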
Routines
- resampling_panel: Compiles an unordered collection of resampling records into a structured array of records.
- parent_panel: Using a parental relationship, reduces the records in a resampling panel to a structured array of parent indices.
- net_parent_table: Reduces the full parent panel (with multiple steps per cycle) to the net results of resampling per cycle.
- parent_table_discontinuities: Using an interpretation of warping records, assigns a special value in a parent table for discontinuous warping events.
- ancestors: Generates the lineage trace of a given walker back through history. This is used to get logically contiguous trajectories from walker slot trajectories (not necessarily contiguous in storage).
- sliding_window: Generates non-redundant sliding window traces over the parent forest (tree) imposed over a parent table.
- ParentForest: Class that imposes the forest (tree) structure over the parent table. Valid for a single contig in the contig tree (forest).
-
wepy.analysis.parents.
DISCONTINUITY_VALUE
= -1¶ Special value marking a parent-child relationship that is dynamically discontinuous. Functions in this module set this value in parent tables.
-
wepy.analysis.parents.
resampling_panel
(resampling_records, is_sorted=False)[source]¶ Converts an unordered collection of resampling records into a structured array (lists) corresponding to cycles and resampling steps within cycles.
This is like pivoting the step indices into an extra dimension. The result can be thought of as a list of tables indexed by cycle, hence the name panel.
- Parameters
resampling_records (list of namedtuple records) – A list of resampling records.
is_sorted (bool) – If True, the resampling_records are assumed to be presorted; otherwise they will be sorted.
- Returns
resampling_panel – The panel (list of tables) of resampling records in order (cycle, step, walker)
- Return type
list of list of list of namedtuple records
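The pivot into (cycle, step, walker) order can be sketched in plain Python. This is an illustration of the idea only, not the module's implementation. The `Record` type and its field names (`cycle_idx`, `step_idx`, `walker_idx`, `decision`) are assumptions made for the example, as is the helper name `build_panel`.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical record type; real resampling records may carry other fields.
Record = namedtuple("Record", ["cycle_idx", "step_idx", "walker_idx", "decision"])


def build_panel(records):
    """Group records into a list (cycles) of lists (steps) of records (walkers)."""
    ordered = sorted(records, key=lambda r: (r.cycle_idx, r.step_idx, r.walker_idx))
    panel = []
    for _, cycle_records in groupby(ordered, key=lambda r: r.cycle_idx):
        # Within one cycle, split the records again by resampling step.
        table = [list(step_records)
                 for _, step_records in groupby(cycle_records, key=lambda r: r.step_idx)]
        panel.append(table)
    return panel


records = [
    Record(1, 0, 1, "NOTHING"),
    Record(0, 0, 1, "CLONE"),
    Record(1, 0, 0, "NOTHING"),
    Record(0, 0, 0, "NOTHING"),
]
panel = build_panel(records)
```

Indexing `panel[cycle][step][walker]` then retrieves a single record.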
-
wepy.analysis.parents.
parent_panel
(decision_class, resampling_panel)[source]¶ Using the parental interpretation of resampling records given by the decision_class, convert resampling records in a resampling panel to parent indices.
- Parameters
decision_class (class implementing Decision interface) – The class that interprets resampling records for parental relationships.
resampling_panel (list of list of list of namedtuple records) – Panel of resampling records.
- Returns
parent_panel – A structured list of the same form as the resampling panel, with parent indices substituted for the resampling records.
- Return type
list of list of list of int
-
wepy.analysis.parents.
net_parent_table
(parent_panel)[source]¶ Reduces a full parent panel to get parent indices on a cycle basis.
The full parent panel has parent indices for every step in each cycle. This computes the net parent relationships for each cycle, thus reducing the list of tables (panel) to a single table. The rows of the table need not have the same length (i.e. it may be a ragged array), since the number of walkers can differ from cycle to cycle.
- Parameters
parent_panel (list of list of list of int) – The full panel of parent relationships.
- Returns
parent_table – Net parent relationships for each cycle.
- Return type
list of list of int
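One way to picture the reduction is as a composition of per-step parent mappings. The following is a minimal sketch under the assumption that each step lists, for every walker after that step, the index of its parent before that step. The helper names `net_parents` and `reduce_panel` are hypothetical, and the module's own implementation may differ.

```python
def net_parents(cycle_steps):
    """Compose the parent indices of all steps in one cycle into net parents."""
    net = list(cycle_steps[0])
    for step in cycle_steps[1:]:
        # Each walker after this step inherits the net parent of its step parent.
        net = [net[parent_idx] for parent_idx in step]
    return net


def reduce_panel(parent_panel):
    """Reduce a panel (cycles x steps x walkers) to a table (cycles x walkers)."""
    return [net_parents(cycle_steps) for cycle_steps in parent_panel]


parent_panel = [
    [[0, 0, 1], [2, 1, 0]],  # cycle 0: two resampling steps
    [[0, 1, 2]],             # cycle 1: a single step with no changes
]
parent_table = reduce_panel(parent_panel)
```

For cycle 0, the second step reorders the walkers produced by the first step, so the net parents are `[1, 0, 0]`.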
-
wepy.analysis.parents.
parent_table_discontinuities
(boundary_condition_class, parent_table, warping_records)[source]¶ Given a parent table and warping records returns a new parent table with the discontinuous warping events for parents set to a special value (-1).
- Parameters
boundary_condition_class (class implementing BoundaryCondition interface) – The boundary condition class that interprets whether warping records are discontinuous.
parent_table (list of list of int) –
warping_records (list of namedtuple records) – The unordered collection of warping events from the simulation.
- Returns
parent_table – Same shape as input parent_table but with discontinuous relationships inserted as -1.
- Return type
list of list of int
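The marking step can be sketched as follows. Several things here are assumptions made for illustration: the `Warp` record type and its fields, the `discontinuous` flag (which stands in for the decision that the boundary condition class makes), and the choice to mark every child of the warped walker in the row for the same cycle. The real function may index the table differently.

```python
from collections import namedtuple
from copy import deepcopy

DISCONTINUITY_VALUE = -1

# Hypothetical warp record; the flag replaces the boundary condition's judgment.
Warp = namedtuple("Warp", ["cycle_idx", "walker_idx", "discontinuous"])


def mark_discontinuities(parent_table, warps):
    """Return a copy of parent_table with discontinuous parents set to -1."""
    new_table = deepcopy(parent_table)
    for warp in warps:
        if not warp.discontinuous:
            continue
        # Assumption: children of the warped walker in this cycle's row are marked.
        for child_idx, parent_idx in enumerate(parent_table[warp.cycle_idx]):
            if parent_idx == warp.walker_idx:
                new_table[warp.cycle_idx][child_idx] = DISCONTINUITY_VALUE
    return new_table


table = [[0, 0, 2], [0, 1, 2]]
warps = [Warp(cycle_idx=1, walker_idx=1, discontinuous=True),
         Warp(cycle_idx=0, walker_idx=2, discontinuous=False)]
marked = mark_discontinuities(table, warps)
```

The input table is left untouched, matching the description that a new parent table is returned.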
-
wepy.analysis.parents.
ancestors
(parent_table, cycle_idx, walker_idx, ancestor_cycle=0)[source]¶ Returns the lineage of ancestors as walker indices leading up to the given walker.
- Parameters
- Returns
ancestor_trace – Contig walker trace of the ancestors leading up to the queried walker. The contig is the sequence of cycles in the parent table.
- Return type
list of tuples of ints (traj_idx, cycle_idx)
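Tracing a lineage amounts to following parent indices backwards through the table. The sketch below is illustrative rather than the module's code. It assumes that row c of the parent table holds, for each walker in cycle c + 1, the index of its parent in cycle c, and that a trace stops when it meets the discontinuity value. The name `trace_ancestors` is hypothetical.

```python
DISCONTINUITY_VALUE = -1


def trace_ancestors(parent_table, cycle_idx, walker_idx, ancestor_cycle=0):
    """Return [(walker_idx, cycle_idx), ...] from the oldest ancestor to the query."""
    trace = [(walker_idx, cycle_idx)]
    current = walker_idx
    for cycle in range(cycle_idx - 1, ancestor_cycle - 1, -1):
        # Assumed convention: row `cycle` gives the parents of walkers in cycle + 1.
        current = parent_table[cycle][current]
        if current == DISCONTINUITY_VALUE:
            break  # the lineage is not dynamically continuous past this point
        trace.append((current, cycle))
    trace.reverse()
    return trace


parent_table = [[0, 0, 1], [2, 0, 1], [0, 1, 2]]
lineage = trace_ancestors(parent_table, cycle_idx=2, walker_idx=0)
```

Reading the frames named by `lineage` in order yields one logically contiguous trajectory, even though the walker moved between slots.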
-
wepy.analysis.parents.
sliding_window
(parent_table, window_length)[source]¶ Return contig walker traces of sliding windows of given length over the parent forest imposed over the contig given by the parent table.
Windows are given in no particular order and are nonredundant for trees.
- Parameters
parent_table (list of list of int) – Parent table defining parent relationships and contig.
window_length (int) – Length of window to use. Must be greater than 1.
- Returns
windows – List of contig walker traces.
- Return type
list of list of tuples of ints (traj_idx, cycle_idx)
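Because every walker has exactly one parent, each walker determines at most one window ending at it, so enumerating windows by their final walker gives no duplicates. The sketch below illustrates this with a plain parent table. It is not the module's implementation, which works over the parent forest. It assumes that row c lists the parents in cycle c of the walkers in cycle c + 1 and that the table contains no discontinuity values. The name `sliding_windows_sketch` is hypothetical.

```python
def sliding_windows_sketch(parent_table, window_length):
    """Enumerate all ancestor traces of exactly window_length cycles."""
    if window_length < 2:
        raise ValueError("window_length must be greater than 1")
    windows = []
    for end_cycle in range(window_length - 1, len(parent_table) + 1):
        # Walkers present in end_cycle are the entries of the preceding row.
        for walker_idx in range(len(parent_table[end_cycle - 1])):
            trace = [(walker_idx, end_cycle)]
            current = walker_idx
            for cycle in range(end_cycle - 1, end_cycle - window_length, -1):
                current = parent_table[cycle][current]
                trace.append((current, cycle))
            windows.append(trace[::-1])
    return windows


parent_table = [[0, 0, 1], [2, 0, 1]]
windows = sliding_windows_sketch(parent_table, window_length=2)
```

Each window is a list of (traj_idx, cycle_idx) tuples ordered from oldest to newest.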
-
class
wepy.analysis.parents.
ParentForest
(contig=None, parent_table=None)[source]¶ Bases:
object
A tree abstraction to a contig representing the family trees of walkers.
Uses a directed graph (networkx.DiGraph) to represent parent-child relationships; i.e. for edge (A, B) node A is a parent of node B.
Constructs a parent forest from either a Contig object or parent table.
Either a contig or parent_table must be given but not both.
The underlying data structure used is a parent table. However, if a contig is given a reference to it will be kept.
- Parameters
contig (Contig object, optional conditional on parent_table) –
parent_table (list of list of int, optional conditional on contig) – Must not contain the discontinuity values. If you want to include metadata on discontinuities, use the contig input, which is preferable.
- Raises
ValueError – If neither parent_table nor contig is given, or if both are given.
-
WEIGHT
= 'weight'¶ Key for weight node attribute.
-
FREE_ENERGY
= 'free_energy'¶ Key for free energy node attribute.
-
ROOT_CYCLE_IDX
= -1¶ Special value for the cycle index of the root nodes, which represent the initial conditions of the simulation.
-
DISCONTINUITY_VALUE
= -1¶ Special value marking a parent-child relationship that is dynamically discontinuous. This class tests for this value in the parent table.
-
CONTINUITY_VALUE
= 0¶
-
_make_child_parent_edges
(step_idx, parent_idxs)[source]¶ Generate edge_ids and edge attributes for an array of parent indices.
- Parameters
step_idx (int) – Index of step, just sets the value to put into the node_ids
parent_idxs (list of int) – For element i in this list, the value j is the index of the child in slot i in step_idx+1. Thus j must be between 0 and len(parent_idxs)-1.
- Returns
edges (list of 2-tuple of node_id)
edge_attributes (list of dict)
-
property
contig
¶ Underlying contig if given during construction.
-
property
parent_table
¶ Underlying parent table data structure.
-
property
graph
¶ Underlying networkx.DiGraph object.
-
property
roots
¶ Returns the roots of all the trees in this forest.
-
property
trees
¶ Returns a list of the subtrees from each root in this forest, in no particular order.
-
property
n_steps
¶ Number of steps of resampling in the parent forest.
-
step
(step_idx)[source]¶ Get the nodes at the step (level of the tree).
- Parameters
step_idx (int) –
- Returns
nodes
- Return type
list of node_id
-
steps
()[source]¶ Returns the nodes ordered by the step and walker indices.
- Returns
node_steps
- Return type
list of list of node_id
-
walker
(walker_idx)[source]¶ Get the nodes for this walker for the whole tree.
- Parameters
walker_idx (int) –
- Returns
nodes
- Return type
list of node_id
-
set_node_attributes
(attribute_key, node_attribute_dict)[source]¶ Set attributes for all nodes for a single key.
- Parameters
attribute_key (str) – Key to set for a node attribute.
node_attribute_dict (dict of node_id: value) – Dictionary mapping nodes to the values that will be set for the attribute_key.
-
set_attrs_by_array
(attribute_key, values)[source]¶ Set node attributes on a stepwise basis using structural indices.
Expects an array/list that is n_steps long and has the appropriate number of values for the number of walkers at each step.
- Parameters
attribute_key (str) – Key to set for a node attribute.
values (array_like of dim (n_steps, n_walkers) or list of list of values) – Either an array_like if there is a constant number of walkers or a list of lists of the values.
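The relationship between the stepwise layout and a per-node dictionary can be sketched as below. The example assumes that a node_id is a (step_idx, walker_idx) tuple, which is not confirmed by the text above, and the helper name `values_to_node_dict` is hypothetical.

```python
def values_to_node_dict(values):
    """Flatten stepwise values into a {node_id: value} mapping.

    Assumption: node ids are (step_idx, walker_idx) tuples.
    """
    return {
        (step_idx, walker_idx): value
        for step_idx, step_values in enumerate(values)
        for walker_idx, value in enumerate(step_values)
    }


# A ragged list of lists: two walkers at step 0, three at step 1.
weights = [[0.5, 0.5], [0.25, 0.25, 0.5]]
node_weights = values_to_node_dict(weights)
```

A dictionary of this shape is the kind of input described for set_node_attributes, which is why a ragged list of lists is accepted alongside a rectangular array.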