Specification Package

Submodules

causal_testing.specification.causal_dag module

class causal_testing.specification.causal_dag.CausalDAG(dot_path: Optional[str] = None, **attr)

Bases: networkx.classes.digraph.DiGraph

A causal DAG is a directed acyclic graph in which nodes represent random variables and edges represent causality between a pair of random variables. We implement a CausalDAG as a networkx DiGraph with an additional check that ensures it is acyclic. A CausalDAG must be specified as a dot file.

add_edge(u_of_edge: Union[str, int], v_of_edge: Union[str, int], **attr)

Add an edge to the causal DAG.

Overrides the default networkx method to prevent users from adding a cycle.

Parameters
  • u_of_edge – From node.

  • v_of_edge – To node.

  • attr – Attributes.
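To make the cycle check concrete, here is a minimal, dependency-free sketch. It is not the CausalDAG implementation: the TinyDAG class and its _reaches helper are hypothetical names used only for illustration, and the check runs before the edge is inserted, which the real method may or may not do.

```python
from collections import defaultdict


class TinyDAG:
    """Hypothetical stand-in for a DAG; not the CausalDAG class itself."""

    def __init__(self):
        self.successors = defaultdict(set)

    def _reaches(self, start, target):
        # Iterative depth-first search along directed edges.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.successors[node])
        return False

    def add_edge(self, u, v):
        # Adding u -> v closes a cycle exactly when u is already reachable from v.
        if self._reaches(v, u):
            raise ValueError(f"Edge {u} -> {v} would create a cycle")
        self.successors[u].add(v)
        self.successors.setdefault(v, set())


dag = TinyDAG()
dag.add_edge("A", "B")
dag.add_edge("B", "C")
try:
    dag.add_edge("C", "A")  # would close the cycle A -> B -> C -> A
    cycle_rejected = False
except ValueError:
    cycle_rejected = True
```

The rejected edge is never stored, so the graph stays acyclic after the failed call.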

adjustment_set_is_minimal()

Given a list of treatments X, a list of outcomes Y, and an adjustment set Z, determine whether Z is the smallest possible adjustment set.

Z is the minimal adjustment set if no element of Z can be removed without breaking the constructive back-door criterion.

Reference: (Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019, Corollary 5, p.19)

Parameters
  • treatments – List of treatment variables.

  • outcomes – List of outcome variables.

  • adjustment_set – Set of adjustment variables.

Returns

True or False depending on whether the adjustment set is minimal.
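The minimality test described above can be sketched generically. The function name is_minimal and the satisfies_criterion callable are assumptions for illustration; in the library the criterion would be the constructive back-door criterion rather than the toy predicate used here.

```python
def is_minimal(adjustment_set, satisfies_criterion):
    """Return True if no single element can be dropped without breaking the criterion."""
    for variable in adjustment_set:
        smaller = set(adjustment_set) - {variable}
        if satisfies_criterion(smaller):
            # A strictly smaller set still works, so the original is not minimal.
            return False
    return True


# Toy criterion (an assumption for illustration): a set is valid if it contains "Z1".
def toy_criterion(candidate):
    return "Z1" in candidate


minimal = is_minimal({"Z1"}, toy_criterion)
not_minimal = is_minimal({"Z1", "Z2"}, toy_criterion)
```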

constructive_backdoor_criterion()

A variation of Pearl’s back-door criterion applied to a proper backdoor graph which enables more efficient computation of minimal adjustment sets for the effect of a set of treatments on a set of outcomes.

The constructive back-door criterion is satisfied for a causal DAG G, a set of treatments X, a set of outcomes Y, and a set of covariates Z, if:
  (1) No variable in Z is a descendant of any variable on a proper causal path between X and Y.
  (2) Z d-separates X and Y in the proper back-door graph relative to X and Y.

Reference: (Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019, Definition 4, p.16)

Parameters
  • proper_backdoor_graph – A proper back-door graph relative to the specified treatments and outcomes.

  • treatments – A list of treatment variables that appear in the proper back-door graph.

  • outcomes – A list of outcome variables that appear in the proper back-door graph.

  • covariates – A list of variables that appear in the proper back-door graph that we will check against the constructive back-door criterion.

Returns

True or False, depending on whether the set of covariates satisfies the constructive back-door criterion.
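The d-separation condition of the criterion can be checked with the standard ancestral moral graph construction. The sketch below is a dependency-free illustration, not the library's code; the names ancestors_of and d_separated and the edge-set representation are assumptions.

```python
def ancestors_of(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def d_separated(edges, xs, ys, zs):
    """Check whether zs d-separates xs from ys in the DAG given by `edges`."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    # 1. Restrict the graph to the ancestors of X, Y and Z.
    keep = ancestors_of(parents, xs | ys | zs)
    # 2. Moralise: link each node to its parents and connect co-parents.
    undirected = {node: set() for node in keep}
    for child in keep:
        child_parents = sorted(parents.get(child, ()))
        for i, p in enumerate(child_parents):
            undirected[p].add(child)
            undirected[child].add(p)
            for q in child_parents[i + 1:]:
                undirected[p].add(q)
                undirected[q].add(p)
    # 3. Delete Z and test whether any X can still reach any Y.
    stack, seen = list(xs - zs), set()
    while stack:
        node = stack.pop()
        if node in ys:
            return False
        if node not in seen:
            seen.add(node)
            stack.extend(undirected[node] - zs)
    return True


# Proper back-door graph of Z -> X, Z -> Y, X -> Y (the edge X -> Y is removed).
confounded = {("Z", "X"), ("Z", "Y")}
# A collider: conditioning on C opens the path between X and Y.
collider = {("X", "C"), ("Y", "C")}
```

In the confounded example, adjusting for Z blocks the only back-door path; in the collider example, the empty set separates X and Y while {C} does not.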

depends_on_outputs(node: Union[str, int], scenario: causal_testing.specification.scenario.Scenario) bool

Check whether a given node in a given scenario is or depends on a model output in the given scenario. That is, whether or not the model needs to be run to determine its value.

NOTE: The graph must be acyclic for this to terminate.

Parameters
  • node (Node) – The node in the DAG representing the variable of interest.

  • scenario (Scenario) – The modelling scenario.

Returns

Whether the given variable is or depends on an output.

Return type

bool
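The recursion implied by the description can be sketched as follows. This is an illustration only: the free function, the parents mapping, and the outputs set are assumptions standing in for the DAG and Scenario objects.

```python
def depends_on_outputs(node, parents, outputs):
    """Return True if `node` is an output or has an output among its ancestors."""
    if node in outputs:
        return True
    # Recurse into the direct causes of the node; this terminates only on acyclic graphs.
    return any(depends_on_outputs(p, parents, outputs) for p in parents.get(node, ()))


# Hypothetical graph: input -> output -> derived.
parents = {"output": {"input"}, "derived": {"output"}}
outputs = {"output"}

derived_needs_run = depends_on_outputs("derived", parents, outputs)
input_needs_run = depends_on_outputs("input", parents, outputs)
```

Here "derived" can only be known after running the model, whereas "input" can be fixed in advance.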

enumerate_minimal_adjustment_sets()

Get the smallest possible set of variables that blocks all back-door paths between all pairs of treatments and outcomes.

This is an implementation of the algorithm presented in Adjustment Criteria in Causal Diagrams: An Algorithmic Perspective, Textor and Liśkiewicz, 2012 and extended in Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019. These works use the algorithm presented by Takata in Space-optimal, backtracking algorithms to list the minimal vertex separators of a graph, 2013.

At a high level, this algorithm proceeds as follows for a causal DAG G, set of treatments X, and set of outcomes Y:
  1. Transform G to a proper back-door graph G_pbd (remove the first edge from X on all proper causal paths).
  2. Transform G_pbd to the ancestor moral graph (G_pbd[An(X union Y)])^m.
  3. Apply Takata’s algorithm to output all minimal X-Y separators in the graph.

Parameters
  • treatments – A list of strings representing treatments.

  • outcomes – A list of strings representing outcomes.

Returns

A list of strings representing the minimal adjustment set.

get_ancestor_graph()

Given a list of treatment variables and a list of outcome variables, transform a CausalDAG into an ancestor graph.

An ancestor graph G[An(W)] for a CausalDAG G is a subgraph of G consisting of only the vertices that are ancestors of the set of variables W and all edges between them. Note that a node is an ancestor of itself.

Reference: (Adjustment Criteria in Causal Diagrams: An Algorithmic Perspective, Textor and Liśkiewicz, 2012, p. 3 [Descendants and Ancestors]).

Parameters
  • treatments – A list of treatment variables to include in the ancestral graph (and their ancestors).

  • outcomes – A list of outcome variables to include in the ancestral graph (and their ancestors).

Returns

An ancestral graph relative to the set of variables X union Y.
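A small sketch of the ancestor graph construction, assuming the DAG is given as a set of (parent, child) edges. The function name ancestor_graph and its return shape are illustrative assumptions, not the method's actual signature.

```python
def ancestor_graph(edges, targets):
    """Return the nodes and edges of G[An(targets)] for a DAG given as an edge set."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    # Walk backwards from the targets; every node counts as its own ancestor.
    keep, stack = set(), list(targets)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents.get(node, ()))
    kept_edges = {(u, v) for u, v in edges if u in keep and v in keep}
    return keep, kept_edges


# Z -> X -> Y -> W: W is a descendant of Y, so it is dropped.
edges = {("Z", "X"), ("X", "Y"), ("Y", "W")}
nodes, sub_edges = ancestor_graph(edges, {"X", "Y"})
```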

get_backdoor_graph()

A back-door graph for a list of treatments is a causal DAG in which all edges leaving the treatment nodes are deleted.

Parameters

treatments – The set of treatments whose outgoing edges will be deleted.

Returns

A back-door graph corresponding to the given causal DAG and set of treatments.
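As a sketch of this transformation, assuming the DAG is represented as a set of (source, target) edges (the function name and representation are illustrative, not the library's):

```python
def backdoor_graph(edges, treatments):
    """Keep only the edges that do not leave a treatment node."""
    return {(u, v) for u, v in edges if u not in treatments}


edges = {("Z", "X"), ("Z", "Y"), ("X", "Y")}
bd_edges = backdoor_graph(edges, {"X"})
```

Only the causal edge X -> Y is removed; the confounding edges through Z remain.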

get_proper_backdoor_graph()

Convert the causal DAG to a proper back-door graph.

A proper back-door graph of a causal DAG is obtained by removing the first edge of every proper causal path from treatments to outcomes. A proper causal path from X to Y is a path of directed edges that starts from X and ends in Y.

Reference: (Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019, Definition 3, p.15)

Parameters
  • treatments – A list of treatment variables.

  • outcomes – A list of outcomes.

Returns

A CausalDAG corresponding to the proper back-door graph.
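The difference from the plain back-door graph is that only treatment edges leading onto a proper causal path are removed. The sketch below assumes the nodes on proper causal paths have already been computed and are passed in as pcp_nodes; the function name and edge-set representation are hypothetical.

```python
def proper_backdoor_graph(edges, treatments, pcp_nodes):
    """Drop every edge X -> V where X is a treatment and V lies on a proper causal path."""
    return {(u, v) for u, v in edges if not (u in treatments and v in pcp_nodes)}


# X -> M -> Y is the causal path; X -> W does not lead to the outcome Y.
edges = {("Z", "X"), ("Z", "Y"), ("X", "M"), ("M", "Y"), ("X", "W")}
pbd_edges = proper_backdoor_graph(edges, {"X"}, {"M", "Y"})
```

The edge X -> W survives because W is not on any causal path from X to Y.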

is_acyclic() bool

Checks if the graph is acyclic.

Returns

True if acyclic, False otherwise.

proper_causal_pathway()

Given a list of treatments and outcomes, compute the proper causal pathways between them.

PCP(X, Y) = (DeX^(X) - X) intersect AnX_(Y), where:
  • DeX^(X) refers to the descendants of X in the graph obtained by deleting all edges into X.
  • AnX_(Y) refers to the ancestors of Y in the graph obtained by deleting all edges leaving X.

Parameters
  • treatments – A list of treatment variables in the causal DAG.

  • outcomes – A list of outcomes in the causal DAG.

Returns

A list of the variables on the proper causal pathway between treatments and outcomes.
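The PCP definition can be evaluated directly with two reachability searches. This is a dependency-free sketch under the assumption that the DAG is a set of (parent, child) edges; the helper reachable and the set-valued return are illustrative choices rather than the method's real signature.

```python
def reachable(adjacency, sources):
    """Return the sources plus every node reachable from them via `adjacency`."""
    seen, stack = set(), list(sources)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(adjacency.get(node, ()))
    return seen


def proper_causal_pathway(edges, treatments, outcomes):
    treatments, outcomes = set(treatments), set(outcomes)
    children, parents = {}, {}
    for u, v in edges:
        if v not in treatments:  # delete edges into X before taking descendants
            children.setdefault(u, set()).add(v)
        if u not in treatments:  # delete edges leaving X before taking ancestors
            parents.setdefault(v, set()).add(u)
    descendants = reachable(children, treatments)
    ancestors = reachable(parents, outcomes)
    return (descendants - treatments) & ancestors


edges = {("Z", "X"), ("Z", "Y"), ("X", "M"), ("M", "Y"), ("X", "W")}
pcp = proper_causal_pathway(edges, {"X"}, {"Y"})
```

In this example the result contains M and Y: W is a descendant of X but not an ancestor of Y, and Z is an ancestor of Y but not a descendant of X.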

causal_testing.specification.causal_dag.close_separator()

Compute the close separator for a set of treatments in an undirected graph.

A close separator (relative to a set of variables X) is a separator whose vertices are adjacent to those in X. An X-Y separator is a set of variables which, once deleted from a graph, creates a subgraph in which X and Y are in different components.

Reference: (Space-optimal, backtracking algorithms to list the minimal vertex separators of a graph, Ken Takata, 2013, p.4, CloseSeparator procedure).

Parameters
  • graph – An undirected graph.

  • treatment_node – A label for the treatment node (parent of treatments in undirected graph).

  • outcome_node – A label for the outcome node (parent of outcomes in undirected graph).

  • treatment_node_set – The set of variables containing the treatment node ({treatment_node}).

Returns

A treatment_node-outcome_node separator whose vertices are adjacent to those in treatments.
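One way to read the definition is sketched below: remove the neighbours of the treatments, find the outcome's component, and return that component's neighbourhood. This is a hedged illustration using a plain adjacency dictionary; the simplified signature is an assumption and differs from the four-parameter function documented above.

```python
def close_separator(adjacency, treatment_set, outcome_node):
    """Return a separator whose vertices are all adjacent to the treatment set."""
    treatment_set = set(treatment_set)
    # Neighbours of the treatment set, excluding the treatments themselves.
    treatment_neighbours = set().union(*(adjacency[t] for t in treatment_set)) - treatment_set
    if outcome_node in treatment_neighbours:
        raise ValueError("Outcome is adjacent to a treatment; no separator exists")
    # Component containing the outcome once those neighbours are removed.
    component, stack = set(), [outcome_node]
    while stack:
        node = stack.pop()
        if node in component or node in treatment_neighbours:
            continue
        component.add(node)
        stack.extend(adjacency[node])
    # The separator is the neighbourhood of that component in the original graph.
    return set().union(*(adjacency[n] for n in component)) - component


# Undirected graph with paths X - A - B - Y and X - C - Y.
adjacency = {
    "X": {"A", "C"},
    "A": {"X", "B"},
    "B": {"A", "Y"},
    "C": {"X", "Y"},
    "Y": {"B", "C"},
}
separator = close_separator(adjacency, {"X"}, "Y")
```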

causal_testing.specification.causal_dag.list_all_min_sep()

A backtracking algorithm for listing all minimal treatment-outcome separators in an undirected graph.

Reference: (Space-optimal, backtracking algorithms to list the minimal vertex separators of a graph, Ken Takata, 2013, p.5, ListMinSep procedure).

Parameters
  • graph – An undirected graph.

  • treatment_node – The node corresponding to the treatment variable we wish to separate from the output.

  • outcome_node – The node corresponding to the outcome variable we wish to separate from the input.

  • treatment_node_set – Set of treatment nodes.

  • outcome_node_set – Set of outcome nodes.

Returns

A list of minimal-sized sets of variables which separate treatment and outcome in the undirected graph.
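To illustrate what the output looks like, the sketch below lists inclusion-minimal separators by brute force over subsets. This is deliberately not Takata's backtracking algorithm (it is exponential in the number of vertices) and only serves to show the expected result on a small graph; all names are hypothetical.

```python
from itertools import combinations


def connected(adjacency, source, target, removed):
    """Return True if target is reachable from source after deleting `removed`."""
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen or node in removed:
            continue
        seen.add(node)
        stack.extend(adjacency[node])
    return False


def minimal_separators(adjacency, source, target):
    candidates = sorted(set(adjacency) - {source, target})
    found = []
    # Visit subsets by increasing size so smaller separators are found first.
    for size in range(len(candidates) + 1):
        for subset in map(set, combinations(candidates, size)):
            if any(sep <= subset for sep in found):
                continue  # contains a smaller separator, so it cannot be minimal
            if not connected(adjacency, source, target, subset):
                found.append(subset)
    return found


# Undirected graph with paths X - A - B - Y and X - C - Y.
adjacency = {
    "X": {"A", "C"},
    "A": {"X", "B"},
    "B": {"A", "Y"},
    "C": {"X", "Y"},
    "Y": {"B", "C"},
}
separators = minimal_separators(adjacency, "X", "Y")
```

Every separator must contain C (to cut X - C - Y) and one of A or B (to cut X - A - B - Y).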

causal_testing.specification.causal_specification module

class causal_testing.specification.causal_specification.CausalSpecification(scenario: causal_testing.specification.scenario.Scenario, causal_dag: causal_testing.specification.causal_dag.CausalDAG)

Bases: abc.ABC

causal_testing.specification.scenario module

class causal_testing.specification.scenario.Scenario

Bases: object

A scenario defines the setting by listing the endogenous variables, their datatypes, distributions, and any constraints over them. This is a common practice in causal inference (CI) and is analogous to an investigator specifying, for example, “we are interested in individuals over 40 who regularly eat cheese”. A scenario, here, is not a specific test case; it just defines the population of interest, in our case “runs of the model with parameters meeting the constraints”. The model may have other inputs/outputs which the investigator may choose to leave out. These are then exogenous variables and behave accordingly.

Parameters
  • variables ({Variable}) – The set of endogenous variables.

  • constraints ({ExprRef}) – The set of constraints relating the endogenous variables.

Attr variables

Attr constraints

add_variable(v: causal_testing.specification.variable.Variable) None

inputs()

Get the set of scenario inputs.

Returns

The scenario inputs.

Return type

{Input}

metas()

Get the set of scenario metavariables.

Returns

The scenario metavariables.

Return type

{Meta}

outputs()

Get the set of scenario outputs.

Returns

The scenario outputs.

Return type

{Output}

setup_treatment_variables() None

Create a mirror of the current variable set with “primed” variables to represent the treatment values. Corresponding constraints are added to the constraint set such that the “primed” variables are constrained in the same way as their unprimed counterparts.

variables_of_type()

Get the set of scenario variables of a particular type, e.g. Inputs.

Parameters

t (type) – The type of variable to return, where t extends Variable.

Returns

A set of scenario variables of the supplied type.

Return type

{Variable}
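Filtering by type amounts to an isinstance check over the scenario's variables. The classes below are simplified stand-ins for the real Variable, Input, and Output classes (their constructors take more arguments), and the free function is a hypothetical illustration of the method.

```python
class Variable:
    """Simplified stand-in for causal_testing's Variable (assumption)."""

    def __init__(self, name):
        self.name = name


class Input(Variable):
    pass


class Output(Variable):
    pass


def variables_of_type(variables, t):
    """Return the subset of `variables` that are instances of type `t`."""
    return {v for v in variables if isinstance(v, t)}


scenario_variables = {Input("width"), Input("height"), Output("area")}
input_names = {v.name for v in variables_of_type(scenario_variables, Input)}
all_names = {v.name for v in variables_of_type(scenario_variables, Variable)}
```

Passing the base class returns every variable, since each subclass extends Variable.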

causal_testing.specification.variable module

class causal_testing.specification.variable.Input(name: str, datatype: causal_testing.specification.variable.T, distribution: Optional[scipy.stats._distn_infrastructure.rv_generic] = None)

Bases: causal_testing.specification.variable.Variable

An extension of the Variable class representing inputs.

copy(name=None) causal_testing.specification.variable.Input

Return a new instance of the Variable with the given name, or with the original name if no name is supplied.

Parameters

name (str) – The variable name.

Returns

A new Variable instance.

Return type

Variable

typestring() str

Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).

Returns

A string representing the variable Type.

Return type

str

class causal_testing.specification.variable.Meta(name: str, datatype: causal_testing.specification.variable.T, populate: Callable[[pandas.core.frame.DataFrame], pandas.core.frame.DataFrame])

Bases: causal_testing.specification.variable.Variable

An extension of the Variable class representing metavariables. These are variables which are relevant to the _causal_ structure and properties we may want to test, but are not directly related to the computational model either as inputs or outputs.

Parameters
  • name (str) – The name of the variable.

  • datatype (T) – The datatype of the variable.

  • populate (Callable[[DataFrame], DataFrame]) – Populate a given dataframe containing runtime data with the metavariable values as calculated from model inputs and outputs.

Attr populate

copy(name=None) causal_testing.specification.variable.Meta

Return a new instance of the Variable with the given name, or with the original name if no name is supplied.

Parameters

name (str) – The variable name.

Returns

A new Variable instance.

Return type

Variable

populate: Callable[[pandas.core.frame.DataFrame], pandas.core.frame.DataFrame]

typestring() str

Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).

Returns

A string representing the variable Type.

Return type

str

class causal_testing.specification.variable.Output(name: str, datatype: causal_testing.specification.variable.T, distribution: Optional[scipy.stats._distn_infrastructure.rv_generic] = None)

Bases: causal_testing.specification.variable.Variable

An extension of the Variable class representing outputs.

copy(name=None) causal_testing.specification.variable.Output

Return a new instance of the Variable with the given name, or with the original name if no name is supplied.

Parameters

name (str) – The variable name.

Returns

A new Variable instance.

Return type

Variable

typestring() str

Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).

Returns

A string representing the variable Type.

Return type

str

class causal_testing.specification.variable.Variable(name: str, datatype: causal_testing.specification.variable.T, distribution: Optional[scipy.stats._distn_infrastructure.rv_generic] = None)

Bases: abc.ABC

An abstract class representing causal variables.

Parameters
  • name (str) – The name of the variable.

  • datatype (T) – The datatype of the variable.

  • distribution (rv_generic) – The expected distribution of the variable values.

Attr type z3

The Z3 mirror of the variable.

Attr name

Attr datatype

Attr distribution

cast(val: any) causal_testing.specification.variable.T

Cast the supplied value to the datatype T of the variable.

Parameters

val (any) – The value to cast.

Returns

The supplied value as an instance of T.

Return type

T

abstract copy(name: Optional[str] = None) causal_testing.specification.variable.Variable

Return a new instance of the Variable with the given name, or with the original name if no name is supplied.

Parameters

name (str) – The variable name.

Returns

A new Variable instance.

Return type

Variable

datatype: causal_testing.specification.variable.T
distribution: scipy.stats._distn_infrastructure.rv_generic
name: str

sample(n_samples: int) [T]

Generate a Latin Hypercube Sample of size n_samples according to the Variable’s distribution.

Parameters

n_samples (int) – The number of samples to generate.

Returns

A list of samples.

Return type

List[T]
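For readers unfamiliar with Latin Hypercube Sampling in one dimension, the idea is to split [0, 1) into n_samples equal strata, draw one point per stratum, and map the points through the distribution's inverse CDF. The sketch below uses only the standard library; the function name and the ppf callable are assumptions (a scipy distribution's ppf method would play that role), and this is not the library's implementation.

```python
import random


def latin_hypercube_sample(ppf, n_samples, rng=None):
    """Draw n_samples values via one-dimensional Latin Hypercube Sampling."""
    rng = rng or random.Random()
    # One uniform draw from each of n_samples equal-width strata of [0, 1).
    points = [(i + rng.random()) / n_samples for i in range(n_samples)]
    rng.shuffle(points)
    # Map the stratified probabilities through the inverse CDF.
    return [ppf(p) for p in points]


# Uniform distribution on [0, 10): its inverse CDF is ppf(p) = 10 * p.
samples = latin_hypercube_sample(lambda p: 10 * p, 5, random.Random(0))
```

With five samples from a uniform distribution on [0, 10), exactly one value falls in each interval of width 2, which is the stratification property that distinguishes this from plain random sampling.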

abstract typestring() str

Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).

Returns

A string representing the variable Type.

Return type

str

Module contents