wepy.analysis.parents module

Routines for converting resampling data to parental lineages.

The core mechanism of weighted ensemble (WE) is to resample cohorts of parallel simulations. Principally this means ‘cloning’ and ‘merging’ different walkers which gives rise to branching ‘family’ tree structures.

The routines in this module take the raw resampling records produced by a resampler and extract these lineages into useful, easy-to-query structures.

Cloning and merging are performed on a cohort of walkers at every cycle. A walker has both a state and a weight. The state is the dynamical state of a simulation, such as the positions, velocities, etc. The weight is the walker's probability, normalized over the walkers of the cohort.

An n-clone constitutes copying a state n times to n walkers with w/n weights where w is the weight of the cloned walker. The cloned walker is said to be the parent of the new child walkers.
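The weight bookkeeping of an n-clone can be illustrated with a minimal sketch. The helper `clone_weights` is hypothetical and is not part of this module; it only shows that the children's weights sum to the parent's weight.

```python
def clone_weights(weight, n):
    """Split one walker's weight evenly among n child walkers.

    Hypothetical helper for illustration only.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [weight / n] * n


# A 3-clone of a walker with weight 0.3 gives three children
# whose weights together equal the parent's weight.
children = clone_weights(0.3, 3)
```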

A k-merge constitutes combining k walkers into a single walker with a weight that is the sum of the weights of all k walkers. The state of this walker is chosen by sampling from the discrete distribution of the k walkers. The walker that has its state chosen to persist is referred to as the ‘kept’ walker and the other walkers are said to be ‘squashed’, as they are compressed into a single walker, preserving their weight but not their state. Squashed walkers are leaves of the tree and leave no children. The kept walker and walkers for which no cloning or merging was performed will have a single child.
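A k-merge can be sketched in a few lines. The function `merge_walkers` below is hypothetical and not part of this module, and it assumes that the "discrete distribution of the k walkers" means sampling the kept walker with probability proportional to its weight.

```python
import random


def merge_walkers(weights, rng=random):
    """Merge k walkers given by their weights (hypothetical helper).

    Returns the index of the kept walker, the indices of the squashed
    walkers, and the weight of the merged walker.
    """
    merged_weight = sum(weights)
    # Assumption: the kept walker is drawn proportionally to its weight.
    kept_idx = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    squashed_idxs = [i for i in range(len(weights)) if i != kept_idx]
    return kept_idx, squashed_idxs, merged_weight


# Seeded generator so the sketch is reproducible.
kept_idx, squashed_idxs, merged_weight = merge_walkers(
    [0.1, 0.2, 0.05], random.Random(0)
)
```

The squashed walkers contribute only their weight to the result, which is why they appear as leaves in the family tree.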

Routines

resampling_panel – Compiles an unordered collection of resampling records into a structured array of records.

parent_panel – Using a parental relationship, reduces the records in a resampling panel to a structured array of parent indices.

net_parent_table – Reduces the full parent panel (with multiple steps per cycle) to the net results of resampling per cycle.

parent_table_discontinuities – Using an interpretation of warping records, assigns a special value in a parent table for discontinuous warping events.

ancestors – Generates the lineage trace of a given walker back through history. This is used to get logically contiguous trajectories from walker slot trajectories (not necessarily contiguous in storage).

sliding_window – Generates non-redundant sliding window traces over the parent forest (tree) imposed over a parent table.

ParentForest – Class that imposes the forest (tree) structure over the parent table. Valid for a single contig in the contig tree (forest).

wepy.analysis.parents.DISCONTINUITY_VALUE = -1

Special value used to determine if a parent-child relationship has discontinuous dynamical continuity. Functions in this module use this value to mark such relationships.

wepy.analysis.parents.resampling_panel(resampling_records, is_sorted=False)

Converts an unordered collection of resampling records into a structured array (lists) corresponding to cycles and resampling steps within cycles.

It is like doing a pivot on the step indices into an extra dimension. Hence it can be thought of as a list of tables indexed by the cycle, hence the name panel.

Parameters
  • resampling_records (list of namedtuple records) – A list of resampling records.

  • is_sorted (bool) – If this is True it will be assumed that the resampling_records are presorted, otherwise they will be sorted.

Returns

resampling_panel – The panel (list of tables) of resampling records in order (cycle, step, walker)

Return type

list of list of list of namedtuple records
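The pivot described above can be sketched as follows. This is not the module's implementation: the `Record` type and its field names (`cycle_idx`, `step_idx`, `walker_idx`, `decision`) are assumptions made for illustration, and the sketch assumes cycle and step indices start at 0 without gaps.

```python
from collections import namedtuple

# Hypothetical record type; the field names are assumptions for this sketch.
Record = namedtuple("Record", ["cycle_idx", "step_idx", "walker_idx", "decision"])


def panel_sketch(records):
    """Group records into nested lists ordered as (cycle, step, walker)."""
    ordered = sorted(
        records, key=lambda r: (r.cycle_idx, r.step_idx, r.walker_idx)
    )
    panel = []
    for rec in ordered:
        # Grow the outer list until the record's cycle has a table.
        while len(panel) <= rec.cycle_idx:
            panel.append([])
        cycle_table = panel[rec.cycle_idx]
        # Grow the cycle's table until the record's step has a row.
        while len(cycle_table) <= rec.step_idx:
            cycle_table.append([])
        cycle_table[rec.step_idx].append(rec)
    return panel


records = [
    Record(1, 0, 1, "nothing"),
    Record(0, 0, 1, "nothing"),
    Record(0, 1, 0, "nothing"),
    Record(0, 0, 0, "nothing"),
    Record(1, 0, 0, "nothing"),
    Record(0, 1, 1, "nothing"),
]
panel = panel_sketch(records)
```

Indexing `panel[cycle][step][walker]` then returns a single record, which is the access pattern the panel layout is meant to support.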

wepy.analysis.parents.parent_panel(decision_class, resampling_panel)

Using the parental interpretation of resampling records given by the decision_class, convert resampling records in a resampling panel to parent indices.

Parameters
  • decision_class (class implementing Decision interface) – The class that interprets resampling records for parental relationships.

  • resampling_panel (list of list of list of namedtuple records) – Panel of resampling records.

Returns

parent_panel – A structured list of the same form as the resampling panel, with parent indices swapped for resampling records.

Return type

list of list of list of int
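How a single step of records might be turned into parent indices can be sketched as below. Everything here is an assumption for illustration: the `StepRecord` type, its fields, the decision names, and the rule that a non-squashed walker becomes the parent of every slot listed in its `target_idxs`. The actual interpretation is whatever the given decision_class defines.

```python
from collections import namedtuple

# Hypothetical record type and decision names, for illustration only.
StepRecord = namedtuple("StepRecord", ["walker_idx", "decision", "target_idxs"])


def step_parents_sketch(step_records):
    """Return, for each slot after the step, the index of its parent.

    Assumes the number of walkers is unchanged by the step.
    """
    parents = [None] * len(step_records)
    for rec in step_records:
        if rec.decision == "squash":
            # Squashed walkers leave no children.
            continue
        for target_idx in rec.target_idxs:
            parents[target_idx] = rec.walker_idx
    return parents


step = [
    StepRecord(0, "clone", (0, 1)),       # walker 0 is cloned into slots 0 and 1
    StepRecord(1, "squash", (2,)),        # walker 1 is squashed into walker 2
    StepRecord(2, "keep_merge", (2,)),    # walker 2 is kept and stays in slot 2
]
parents = step_parents_sketch(step)
```

Applying such a function to every step of every cycle would give a nested list with the same shape as the resampling panel.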

wepy.analysis.parents.net_parent_table(parent_panel)

Reduces a full parent panel to get parent indices on a cycle basis.

The full parent panel has parent indices for every step in each cycle. This computes the net parent relationships for each cycle, thus reducing the list of tables (panel) to a single table. The rows of a table need not have the same length (i.e. it may be a ragged array), since there can be different numbers of walkers each cycle.

Parameters

parent_panel (list of list of list of int) – The full panel of parent relationships.

Returns

parent_table – Net parent relationships for each cycle.

Return type

list of list of int
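The reduction amounts to composing the per-step parent mappings within each cycle. The following is a minimal sketch rather than the module's implementation; it assumes every cycle has at least one step and that the entry at position i of a step is the slot, before that step, of the parent of the walker in slot i after it.

```python
def net_parents_sketch(parent_panel):
    """Compose the parent indices of all steps in each cycle."""
    parent_table = []
    for cycle_steps in parent_panel:
        # Start from the walkers that exist after the last step...
        net = list(cycle_steps[-1])
        # ...and follow each of them back through the earlier steps.
        for step_parents in reversed(cycle_steps[:-1]):
            net = [step_parents[parent_idx] for parent_idx in net]
        parent_table.append(net)
    return parent_table


panel = [
    [[0, 0, 1], [2, 1, 0]],  # cycle 0: two resampling steps
    [[1, 1, 2]],             # cycle 1: a single step
]
table = net_parents_sketch(panel)
```

In cycle 0, slot 0 after the second step descends from slot 2 after the first step, which in turn descends from slot 1 at the start of the cycle, so its net parent is 1.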

wepy.analysis.parents.parent_table_discontinuities(boundary_condition_class, parent_table, warping_records)

Given a parent table and warping records returns a new parent table with the discontinuous warping events for parents set to a special value (-1).

Parameters
  • boundary_condition_class (class implementing BoundaryCondition interface) – The boundary condition class that interprets warping records to determine whether they are discontinuous.

  • parent_table (list of list of int) –

  • warping_records (list of namedtuple records) – The unordered collection of warping events from the simulation.

Returns

parent_table – Same shape as input parent_table but with discontinuous relationships inserted as -1.

Return type

list of list of int
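The marking step can be sketched as follows. The `WarpRecord` type, its fields, and the `is_discontinuous` predicate are hypothetical stand-ins for the warping records and the boundary condition class. Which table entry a warping record maps to is also an assumption here: the sketch marks the entry at the record's own cycle and walker index.

```python
from collections import namedtuple

DISCONTINUITY_VALUE = -1

# Hypothetical warping record; the field names are assumptions.
WarpRecord = namedtuple("WarpRecord", ["cycle_idx", "walker_idx", "discontinuous"])


def mark_discontinuities_sketch(parent_table, warp_records, is_discontinuous):
    """Return a copy of the parent table with discontinuous warps marked."""
    # Copy each row so the input table is left untouched.
    new_table = [list(row) for row in parent_table]
    for rec in warp_records:
        if is_discontinuous(rec):
            new_table[rec.cycle_idx][rec.walker_idx] = DISCONTINUITY_VALUE
    return new_table


table = [[0, 1, 2], [0, 0, 2]]
warps = [WarpRecord(1, 2, True), WarpRecord(0, 1, False)]
marked = mark_discontinuities_sketch(table, warps, lambda rec: rec.discontinuous)
```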

wepy.analysis.parents.parent_cycle_discontinuities(parent_idxs, discontinuities)

wepy.analysis.parents.ancestors(parent_table, cycle_idx, walker_idx, ancestor_cycle=0)

Returns the lineage of ancestors as walker indices leading up to the given walker.

Parameters
  • parent_table (list of list of int) –

  • cycle_idx (int) – Cycle of walker to query.

  • walker_idx (int) – Index of the walker to query, used along with cycle_idx.

  • ancestor_cycle (int) – Index of cycle in history to go back to. Must be less than cycle_idx.

Returns

ancestor_trace – Contig walker trace of the ancestors leading up to the queried walker. The contig is the sequence of cycles in the parent table.

Return type

list of tuples of ints (traj_idx, cycle_idx)
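Tracing a lineage is a walk backwards through the parent table. The sketch below is not the module's implementation and fixes one indexing convention as an assumption: `parent_table[c][w]` is the slot, in cycle `c - 1`, of the parent of walker `w` in cycle `c`. It stops early if it meets the discontinuity value.

```python
DISCONTINUITY_VALUE = -1


def ancestors_sketch(parent_table, cycle_idx, walker_idx, ancestor_cycle=0):
    """Return the lineage as (walker_idx, cycle_idx) tuples, oldest first."""
    trace = [(walker_idx, cycle_idx)]
    current = walker_idx
    for cycle in range(cycle_idx, ancestor_cycle, -1):
        current = parent_table[cycle][current]
        if current == DISCONTINUITY_VALUE:
            # The lineage is not dynamically continuous past this point.
            break
        trace.append((current, cycle - 1))
    trace.reverse()
    return trace


parent_table = [
    [0, 1, 2],
    [0, 0, 2],
    [1, 1, 0],
]
lineage = ancestors_sketch(parent_table, cycle_idx=2, walker_idx=0)
```

Reading trajectory frames in the order given by such a trace yields a logically contiguous trajectory even though the walker changed slots between cycles.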

wepy.analysis.parents.sliding_window(parent_table, window_length)

Return contig walker traces of sliding windows of given length over the parent forest imposed over the contig given by the parent table.

Windows are given in no particular order and are nonredundant for trees.

Parameters
  • parent_table (list of list of int) – Parent table defining parent relationships and contig.

  • window_length (int) – Length of window to use. Must be greater than 1.

Returns

windows – List of contig walker traces.

Return type

list of list of tuples of ints (traj_idx, cycle_idx)
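Because every walker has exactly one parent, a window is fully determined by its last node, so tracing back from each possible end node gives non-redundant windows. The sketch below illustrates that idea and is not the module's implementation. It assumes the parent table contains no discontinuity values and that `parent_table[c][w]` is the slot in cycle `c - 1` of the parent of walker `w` in cycle `c`.

```python
def sliding_windows_sketch(parent_table, window_length):
    """Return all windows as lists of (walker_idx, cycle_idx) tuples."""
    if window_length < 2:
        raise ValueError("window_length must be greater than 1")
    windows = []
    # Only cycles with enough history behind them can end a window.
    for end_cycle in range(window_length - 1, len(parent_table)):
        for end_walker in range(len(parent_table[end_cycle])):
            window = [(end_walker, end_cycle)]
            current = end_walker
            # Step back window_length - 1 times through the parents.
            for cycle in range(end_cycle, end_cycle - window_length + 1, -1):
                current = parent_table[cycle][current]
                window.append((current, cycle - 1))
            windows.append(window[::-1])
    return windows


parent_table = [
    [0, 1, 2],
    [0, 0, 2],
    [1, 1, 0],
]
windows = sliding_windows_sketch(parent_table, window_length=2)
```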

class wepy.analysis.parents.ParentForest(contig=None, parent_table=None)

Bases: object

A tree abstraction to a contig representing the family trees of walkers.

Uses a directed graph (networkx.DiGraph) to represent parent-child relationships; i.e. for edge (A, B) node A is a parent of node B.

Constructs a parent forest from either a Contig object or parent table.

Either a contig or parent_table must be given but not both.

The underlying data structure used is a parent table. However, if a contig is given a reference to it will be kept.

Parameters
  • contig (Contig object, optional conditional on parent_table) –

  • parent_table (list of list of int, optional conditional on contig) – Must not contain the discontinuity values. If you want to include metadata on discontinuities, use the contig input, which is preferable.

Raises

ValueError – If neither parent_table nor contig is given, or if both are given.

WEIGHT = 'weight'

Key for weight node attribute.

FREE_ENERGY = 'free_energy'

Key for free energy node attribute.

ROOT_CYCLE_IDX = -1

Special value for the cycle index of the root nodes, which are the initial conditions of the simulation.

DISCONTINUITY_VALUE = -1

Special value used to determine if a parent-child relationship has discontinuous dynamical continuity. This class tests for this value in the parent table.

CONTINUITY_VALUE = 0

_make_child_parent_edges(step_idx, parent_idxs)

Generate edge_ids and edge attributes for an array of parent indices.

Parameters
  • step_idx (int) – Index of step, just sets the value to put into the node_ids

  • parent_idxs (list of int) – For element i in this list, the value j is the index of the parent of the child in slot i in step_idx+1. Thus j must be between 0 and len(parent_idxs)-1.

Returns

  • edges (list of 2-tuple of node_id)

  • edge_attributes (list of dict)
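The edge construction can be sketched as follows. This is an illustration, not the method's implementation: it assumes node ids are `(step_idx, walker_idx)` tuples, follows the class's stated convention that an edge (A, B) means A is a parent of B, and omits the edge attributes.

```python
def child_parent_edges_sketch(step_idx, parent_idxs):
    """Build (parent_node, child_node) edges for one row of parent indices."""
    edges = []
    for child_idx, parent_idx in enumerate(parent_idxs):
        parent_node = (step_idx, parent_idx)     # node at the current step
        child_node = (step_idx + 1, child_idx)   # node at the next step
        edges.append((parent_node, child_node))
    return edges


# Walker 0 is the parent of slots 0 and 1; walker 2 is the parent of slot 2.
edges = child_parent_edges_sketch(0, [0, 0, 2])
```

A list of edges in this form could be passed to a directed graph's edge-adding routine to build up the forest one step at a time.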

property contig

Underlying contig if given during construction.

property parent_table

Underlying parent table data structure.

property graph

Underlying networkx.DiGraph object.

property roots

Returns the roots of all the trees in this forest.

property trees

Returns a list of the subtrees from each root in this forest, in no particular order.

property n_steps

Number of steps of resampling in the parent forest.

step(step_idx)

Get the nodes at the step (level of the tree).

Parameters

step_idx (int) –

Returns

nodes

Return type

list of node_id

steps()

Returns the nodes ordered by the step and walker indices.

Returns

node_steps

Return type

list of list of node_id

walker(walker_idx)

Get the nodes for this walker for the whole tree.

Parameters

walker_idx (int) –

Returns

nodes

Return type

list of node_id

set_node_attributes(attribute_key, node_attribute_dict)

Set attributes for all nodes for a single key.

Parameters
  • attribute_key (str) – Key to set for a node attribute.

  • node_attribute_dict (dict of node_id: value) – Dictionary mapping nodes to the values that will be set for the attribute_key.

set_attrs_by_array(attribute_key, values)

Set node attributes on a stepwise basis using structural indices.

Expects an array/list that is n_steps long and has the appropriate number of values for the number of walkers at each step.

Parameters
  • attribute_key (str) – Key to set for a node attribute.

  • values (array_like of dim (n_steps, n_walkers) or list of list of values) – Either an array_like if there is a constant number of walkers or a list of lists of the values.
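The stepwise layout of values maps naturally onto a per-node dictionary like the one set_node_attributes accepts. The sketch below shows that mapping under the assumption that node ids are `(step_idx, walker_idx)` tuples; the helper name is hypothetical and this is not the method's implementation.

```python
def attrs_by_array_sketch(values):
    """Map a (possibly ragged) steps-by-walkers array to node ids."""
    return {
        (step_idx, walker_idx): value
        for step_idx, step_values in enumerate(values)
        for walker_idx, value in enumerate(step_values)
    }


# Ragged input: two walkers at step 0, three walkers at step 1.
weights = [[0.5, 0.5], [0.25, 0.25, 0.5]]
node_weights = attrs_by_array_sketch(weights)
```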