Specification Package¶
Submodules¶
causal_testing.specification.causal_dag module¶
- class causal_testing.specification.causal_dag.CausalDAG(dot_path: Optional[str] = None, **attr)¶
Bases:
networkx.classes.digraph.DiGraph
A causal DAG is a directed acyclic graph in which nodes represent random variables and edges represent causality between a pair of random variables. We implement a CausalDAG as a networkx DiGraph with an additional check that ensures it is acyclic. A CausalDAG must be specified as a dot file.
- add_edge(u_of_edge: Union[str, int], v_of_edge: Union[str, int], **attr)¶
Add an edge to the causal DAG.
Overrides the default networkx method to prevent users from adding a cycle.
- Parameters
u_of_edge – From node.
v_of_edge – To node.
attr – Attributes.
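The cycle check described for add_edge can be illustrated with a minimal sketch on a plain dictionary graph. This is not the CausalDAG implementation; the names `reachable` and `add_edge_acyclic` are hypothetical, and the graph is assumed to map each node to the set of its children.

```python
def reachable(graph, start):
    """Return every node reachable from start by following directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def add_edge_acyclic(graph, u, v):
    """Add the edge u -> v, refusing it if it would close a cycle."""
    # A new cycle appears exactly when u is already reachable from v (or u == v).
    if u == v or u in reachable(graph, v):
        raise ValueError(f"Adding {u} -> {v} would create a cycle")
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set())


dag = {}
add_edge_acyclic(dag, "A", "B")
add_edge_acyclic(dag, "B", "C")
try:
    add_edge_acyclic(dag, "C", "A")  # would close the cycle A -> B -> C -> A
    rejected = False
except ValueError:
    rejected = True
```

Checking reachability before inserting the edge leaves the graph unchanged when the edge is refused.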
- adjustment_set_is_minimal()¶
Given a list of treatments X, a list of outcomes Y, and an adjustment set Z, determine whether Z is a minimal adjustment set.
Z is a minimal adjustment set if no element of Z can be removed without breaking the constructive back-door criterion.
Reference: (Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019, Corollary 5, p.19)
- Parameters
treatments – List of treatment variables.
outcomes – List of outcome variables.
adjustment_set – Set of adjustment variables.
- Returns
True or False depending on whether the adjustment set is minimal.
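The minimality test described for adjustment_set_is_minimal can be sketched generically as follows. Both `is_minimal` and the predicate `satisfies_criterion` are hypothetical names; the predicate stands in for a real check of the constructive back-door criterion, and the toy predicate below is only a placeholder so that the example runs.

```python
def is_minimal(adjustment_set, satisfies_criterion):
    """True if the set satisfies the criterion and no single element can be dropped."""
    z = set(adjustment_set)
    if not satisfies_criterion(z):
        return False
    # Z is minimal when removing any one element breaks the criterion.
    return not any(satisfies_criterion(z - {v}) for v in z)


def toy_criterion(z):
    # Placeholder assumption: pretend only sets containing "Z1" are valid.
    return "Z1" in z


minimal = is_minimal({"Z1"}, toy_criterion)
not_minimal = is_minimal({"Z1", "Z2"}, toy_criterion)
```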
- constructive_backdoor_criterion()¶
A variation of Pearl’s back-door criterion applied to a proper backdoor graph which enables more efficient computation of minimal adjustment sets for the effect of a set of treatments on a set of outcomes.
The constructive back-door criterion is satisfied for a causal DAG G, a set of treatments X, a set of outcomes Y, and a set of covariates Z, if: (1) No element of Z is a descendant of any variable on a proper causal path between X and Y. (2) Z d-separates X and Y in the proper back-door graph relative to X and Y.
Reference: (Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019, Definition 4, p.16)
- Parameters
proper_backdoor_graph – A proper back-door graph relative to the specified treatments and outcomes.
treatments – A list of treatment variables that appear in the proper back-door graph.
outcomes – A list of outcome variables that appear in the proper back-door graph.
covariates – A list of variables that appear in the proper back-door graph that we will check against the constructive back-door criterion.
- Returns
True or False, depending on whether the set of covariates satisfies the constructive back-door criterion.
- depends_on_outputs(node: Union[str, int], scenario: causal_testing.specification.scenario.Scenario) bool¶
Check whether a given node in a given scenario is or depends on a model output in the given scenario. That is, whether or not the model needs to be run to determine its value.
NOTE: The graph must be acyclic for this to terminate.
- Parameters
node (Node) – The node in the DAG representing the variable of interest.
scenario (Scenario) – The modelling scenario.
- Returns
Whether the given variable is or depends on an output.
- Return type
bool
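The recursion behind this check, and the reason acyclicity is needed for termination, can be sketched without the library. The sketch assumes a hypothetical `parents` dictionary mapping each node to its direct predecessors and a plain set of output names in place of a Scenario.

```python
def depends_on_outputs(parents, outputs, node):
    """True if node is an output or has an output among its ancestors."""
    if node in outputs:
        return True
    # Recurse into the parents; this terminates only because the graph is acyclic.
    return any(depends_on_outputs(parents, outputs, p) for p in parents.get(node, ()))


# x -> y -> m and x -> w, where y is the only model output.
parents = {"x": set(), "y": {"x"}, "m": {"y"}, "w": {"x"}}
outputs = {"y"}
```

Here `m` depends on an output (through `y`), while `w` and `x` do not.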
- enumerate_minimal_adjustment_sets()¶
Enumerate the minimal sets of variables that block all back-door paths between all pairs of treatments and outcomes.
This is an implementation of the algorithm presented in Adjustment Criteria in Causal Diagrams: An Algorithmic Perspective, Textor and Liśkiewicz, 2012 and extended in Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019. These works use the algorithm presented by Takata in the work entitled: Space-optimal, backtracking algorithms to list the minimal vertex separators of a graph, 2013.
At a high level, this algorithm proceeds as follows for a causal DAG G, set of treatments X, and set of outcomes Y:
1. Transform G to a proper back-door graph G_pbd (remove the first edge from X on all proper causal paths).
2. Transform G_pbd to the ancestor moral graph (G_pbd[An(X union Y)])^m.
3. Apply Takata’s algorithm to output all minimal X-Y separators in the graph.
- Parameters
treatments – A list of strings representing treatments.
outcomes – A list of strings representing outcomes.
- Returns
A list of the minimal adjustment sets, each containing variable names as strings.
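The moralisation used in the second step of the algorithm can be sketched on a dictionary graph as below. This is an illustrative assumption, not the library's code: `moralize` is a hypothetical helper, and the input maps each node to the set of its children.

```python
from itertools import combinations


def moralize(graph):
    """Return the moral graph: marry co-parents, then drop edge directions."""
    undirected = {node: set() for node in graph}
    parents = {node: set() for node in graph}
    for node, children in graph.items():
        for child in children:
            # Keep every original edge, without its direction.
            undirected.setdefault(node, set()).add(child)
            undirected.setdefault(child, set()).add(node)
            parents.setdefault(child, set()).add(node)
    for shared in parents.values():
        # Connect every pair of parents that share a child.
        for a, b in combinations(sorted(shared), 2):
            undirected[a].add(b)
            undirected[b].add(a)
    return undirected


# A -> C <- B: moralisation adds the edge A - B.
moral = moralize({"A": {"C"}, "B": {"C"}, "C": set()})
```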
- get_ancestor_graph()¶
Given a list of treatment variables and a list of outcome variables, transform a CausalDAG into an ancestor graph.
An ancestor graph G[An(W)] for a CausalDAG G is a subgraph of G consisting of only the vertices that are ancestors of the set of variables W and all edges between them. Note that a node is an ancestor of itself.
Reference: (Adjustment Criteria in Causal Diagrams: An Algorithmic Perspective, Textor and Liśkiewicz, 2012, p. 3 [Descendants and Ancestors]).
- Parameters
treatments – A list of treatment variables to include in the ancestral graph (and their ancestors).
outcomes – A list of outcome variables to include in the ancestral graph (and their ancestors).
- Returns
An ancestral graph relative to the set of variables X union Y.
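As an illustration of the definition of G[An(W)], the following sketch builds the ancestor graph of a dictionary graph for W = X union Y. The helper names are hypothetical and the representation (node mapped to its set of children) is an assumption made for the example.

```python
def ancestors_of(graph, targets):
    """Return the targets plus every node with a directed path to one of them."""
    parents = {}
    for node, children in graph.items():
        for child in children:
            parents.setdefault(child, set()).add(node)
    seen, stack = set(targets), list(targets)
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ancestor_graph(graph, treatments, outcomes):
    """Keep only ancestors of the treatments and outcomes, and the edges between them."""
    keep = ancestors_of(graph, set(treatments) | set(outcomes))
    return {node: {c for c in graph.get(node, ()) if c in keep} for node in keep}


# C is a descendant of Y only, so it is dropped from the ancestor graph.
dag = {"Z": {"X", "Y"}, "X": {"M"}, "M": {"Y"}, "Y": {"C"}, "C": set()}
anc = ancestor_graph(dag, ["X"], ["Y"])
```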
- get_backdoor_graph()¶
A back-door graph for a list of treatments is a causal DAG in which all edges leaving the treatment nodes are deleted.
- Parameters
treatments – The set of treatments whose outgoing edges will be deleted.
- Returns
A back-door graph corresponding to the given causal DAG and set of treatments.
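A minimal sketch of this transformation, assuming a dictionary graph that maps each node to its set of children (the function name is hypothetical and this is not the library's code):

```python
def backdoor_graph(graph, treatments):
    """Copy the graph, deleting every edge that leaves a treatment node."""
    treatments = set(treatments)
    return {
        node: (set() if node in treatments else set(children))
        for node, children in graph.items()
    }


# Z confounds X and Y; the causal edge X -> Y is removed, Z's edges remain.
dag = {"Z": {"X", "Y"}, "X": {"Y"}, "Y": set()}
bd = backdoor_graph(dag, ["X"])
```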
- get_proper_backdoor_graph()¶
Convert the causal DAG to a proper back-door graph.
A proper back-door graph of a causal DAG is obtained by removing the first edge of every proper causal path from treatments to outcomes. A proper causal path from X to Y is a path of directed edges that starts from X, ends in Y, and contains no other node of X.
Reference: (Separators and adjustment sets in causal graphs: Complete criteria and an algorithmic framework, Zander et al., 2019, Definition 3, p.15)
- Parameters
treatments – A list of treatment variables.
outcomes – A list of outcomes.
- Returns
A CausalDAG corresponding to the proper back-door graph.
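One way to sketch this transformation, under the assumption of a dictionary graph mapping each node to its children: an edge x -> v leaving a treatment starts a proper causal path exactly when v can reach an outcome without passing through any treatment. The function below is a hypothetical illustration of that idea rather than the library's method.

```python
def proper_backdoor_graph(graph, treatments, outcomes):
    """Delete the first edge of every proper causal path from treatments to outcomes."""
    x, y = set(treatments), set(outcomes)
    # Reverse adjacency of the graph with all treatment nodes removed.
    parents = {}
    for node, children in graph.items():
        if node in x:
            continue
        for child in children:
            if child not in x:
                parents.setdefault(child, set()).add(node)
    # Nodes that can reach an outcome without passing through a treatment.
    leads_to_y, stack = set(y - x), list(y - x)
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in leads_to_y:
                leads_to_y.add(parent)
                stack.append(parent)
    # Drop each edge x -> v whose target v lies in that set.
    return {
        node: ({c for c in children if c not in leads_to_y} if node in x else set(children))
        for node, children in graph.items()
    }


# X -> M starts the proper causal path X -> M -> Y and is removed; X -> W stays.
dag = {"Z": {"X", "Y"}, "X": {"M", "W"}, "M": {"Y"}, "W": set(), "Y": set()}
pbd = proper_backdoor_graph(dag, ["X"], ["Y"])
```

Unlike the plain back-door graph, edges leaving the treatments that do not lead to an outcome (here X -> W) are kept.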
- is_acyclic() bool¶
Checks if the graph is acyclic.
- Returns
True if acyclic, False otherwise.
- proper_causal_pathway()¶
Given a list of treatments and outcomes, compute the proper causal pathways between them.
$PCP(X, Y) = (De_{\overline{X}}(X) \setminus X) \cap An_{\underline{X}}(Y)$, where:
- $De_{\overline{X}}(X)$ refers to the descendants of X in the graph obtained by deleting all edges into X.
- $An_{\underline{X}}(Y)$ refers to the ancestors of Y in the graph obtained by deleting all edges leaving X.
- Parameters
treatments – A list of treatment variables in the causal DAG.
outcomes – A list of outcomes in the causal DAG.
- Returns
A list of the variables on the proper causal pathway between treatments and outcomes.
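The set expression for PCP(X, Y) can be evaluated directly on a small example. The sketch below assumes a dictionary graph mapping each node to its children; `reach`, `reverse` and the standalone `proper_causal_pathway` function are hypothetical helpers written for illustration.

```python
def reach(graph, starts):
    """Return the start nodes plus everything reachable from them."""
    seen, stack = set(starts), list(starts)
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def reverse(graph):
    """Flip every edge so that reachability yields ancestors."""
    rev = {node: set() for node in graph}
    for node, children in graph.items():
        for child in children:
            rev.setdefault(child, set()).add(node)
    return rev


def proper_causal_pathway(graph, treatments, outcomes):
    x, y = set(treatments), set(outcomes)
    # Graph with all edges into X deleted.
    no_edges_into_x = {n: {c for c in cs if c not in x} for n, cs in graph.items()}
    # Graph with all edges leaving X deleted.
    no_edges_out_of_x = {n: (set() if n in x else set(cs)) for n, cs in graph.items()}
    descendants = reach(no_edges_into_x, x) - x
    ancestors = reach(reverse(no_edges_out_of_x), y)
    return descendants & ancestors


dag = {"Z": {"X", "Y"}, "X": {"M", "Y"}, "M": {"Y"}, "Y": {"C"}, "C": set()}
pcp = proper_causal_pathway(dag, ["X"], ["Y"])
```

In this example the descendants term gives {M, Y, C} and the ancestors term gives {Y, Z, M}, so their intersection is {M, Y}.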
- causal_testing.specification.causal_dag.close_separator()¶
Compute the close separator for a set of treatments in an undirected graph.
A close separator (relative to a set of variables X) is a separator whose vertices are adjacent to those in X. An X-Y separator is a set of variables which, once deleted from a graph, creates a subgraph in which X and Y are in different components.
Reference: (Space-optimal, backtracking algorithms to list the minimal vertex separators of a graph, Ken Takata, 2013, p.4, CloseSeparator procedure).
- Parameters
graph – An undirected graph.
treatment_node – A label for the treatment node (parent of treatments in undirected graph).
outcome_node – A label for the outcome node (parent of outcomes in undirected graph).
treatment_node_set – The set of variables containing the treatment node ({treatment_node}).
- Returns
A treatment_node-outcome_node separator whose vertices are adjacent to those in treatments.
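A rough sketch of the idea, assuming an undirected graph stored as a dictionary from each node to its set of neighbours: remove the neighbours of the treatment set, find the component that still contains the outcome, and return the neighbours of that component. The simplified signature and helper name are assumptions for illustration and differ from the documented function.

```python
def close_separator_sketch(graph, treatment_set, outcome):
    """Return a separator made only of neighbours of the treatment set."""
    treatment_set = set(treatment_set)
    # Neighbours of the treatment set, excluding the set itself.
    frontier = set().union(*(graph[t] for t in treatment_set)) - treatment_set
    if outcome in frontier or outcome in treatment_set:
        raise ValueError("No separator exists: outcome is adjacent to a treatment")
    # Component containing the outcome once the frontier is removed.
    component, stack = {outcome}, [outcome]
    while stack:
        for neighbour in graph[stack.pop()]:
            if neighbour not in frontier and neighbour not in component:
                component.add(neighbour)
                stack.append(neighbour)
    # The separator is every node bordering that component from outside.
    return set().union(*(graph[n] for n in component)) - component


# Paths x - a - b - y and x - c - y, plus a dead end x - d.
g = {
    "x": {"a", "c", "d"},
    "a": {"x", "b"},
    "b": {"a", "y"},
    "c": {"x", "y"},
    "d": {"x"},
    "y": {"b", "c"},
}
sep = close_separator_sketch(g, {"x"}, "y")
```

The dead-end neighbour `d` is adjacent to `x` but does not border the component containing `y`, so it is left out of the separator.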
- causal_testing.specification.causal_dag.list_all_min_sep()¶
A backtracking algorithm for listing all minimal treatment-outcome separators in an undirected graph.
Reference: (Space-optimal, backtracking algorithms to list the minimal vertex separators of a graph, Ken Takata, 2013, p.5, ListMinSep procedure).
- Parameters
graph – An undirected graph.
treatment_node – The node corresponding to the treatment variable we wish to separate from the output.
outcome_node – The node corresponding to the outcome variable we wish to separate from the input.
treatment_node_set – Set of treatment nodes.
outcome_node_set – Set of outcome nodes.
- Returns
A list of minimal-sized sets of variables which separate treatment and outcome in the undirected graph.
causal_testing.specification.causal_specification module¶
- class causal_testing.specification.causal_specification.CausalSpecification(scenario: causal_testing.specification.scenario.Scenario, causal_dag: causal_testing.specification.causal_dag.CausalDAG)¶
Bases:
abc.ABC
causal_testing.specification.scenario module¶
- class causal_testing.specification.scenario.Scenario¶
Bases:
object
A scenario defines the setting by listing the endogenous variables, their datatypes, distributions, and any constraints over them. This is a common practice in causal inference (CI) and is analogous to an investigator specifying, for example, “we are interested in individuals over 40 who regularly eat cheese”. A scenario, here, is not a specific test case; it just defines the population of interest, in our case “runs of the model with parameters meeting the constraints”. The model may have other inputs/outputs which the investigator may choose to leave out. These are then exogenous variables and behave accordingly.
- Parameters
variables ({Variable}) – The set of endogenous variables.
constraints ({ExprRef}) – The set of constraints relating the endogenous variables.
- Attr variables
- Attr constraints
- add_variable(v: causal_testing.specification.variable.Variable) None¶
- inputs()¶
Get the set of scenario inputs.
- Returns
The scenario inputs.
- Return type
{Input}
- metas()¶
Get the set of scenario metavariables.
- Returns
The scenario metavariables.
- Return type
{Meta}
- outputs()¶
Get the set of scenario outputs.
- Returns
The scenario outputs.
- Return type
{Output}
- setup_treatment_variables() None¶
Create a mirror of the current variable set with “primed” variables to represent the treatment values. Corresponding constraints are added to the constraint set such that the “primed” variables are constrained in the same way as their unprimed counterparts.
- variables_of_type()¶
Get the set of scenario variables of a particular type, e.g. Inputs.
- Parameters
t (type) – The type of variable to return, where t extends Variable.
- Returns
A set of scenario variables of the supplied type.
- Return type
{Variable}
causal_testing.specification.variable module¶
- class causal_testing.specification.variable.Input(name: str, datatype: causal_testing.specification.variable.T, distribution: Optional[scipy.stats._distn_infrastructure.rv_generic] = None)¶
Bases:
causal_testing.specification.variable.Variable
An extension of the Variable class representing inputs.
- copy(name=None) causal_testing.specification.variable.Input¶
Return a new instance of the Variable with the given name, or with the original name if no name is supplied.
- Parameters
name (str) – The variable name.
- Returns
A new Variable instance.
- Return type
Input
- typestring() str¶
Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).
- Returns
A string representing the variable Type.
- Return type
str
- class causal_testing.specification.variable.Meta(name: str, datatype: causal_testing.specification.variable.T, populate: Callable[[pandas.core.frame.DataFrame], pandas.core.frame.DataFrame])¶
Bases:
causal_testing.specification.variable.Variable
An extension of the Variable class representing metavariables. These are variables which are relevant to the _causal_ structure and properties we may want to test, but are not directly related to the computational model either as inputs or outputs.
- Parameters
name (str) – The name of the variable.
datatype (T) – The datatype of the variable.
populate (Callable[[DataFrame], DataFrame]) – Populate a given dataframe containing runtime data with the metavariable values as calculated from model inputs and outputs.
- copy(name=None) causal_testing.specification.variable.Meta¶
Return a new instance of the Variable with the given name, or with the original name if no name is supplied.
- Parameters
name (str) – The variable name.
- Returns
A new Variable instance.
- Return type
Meta
- populate: Callable[[pandas.core.frame.DataFrame], pandas.core.frame.DataFrame]¶
- typestring() str¶
Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).
- Returns
A string representing the variable Type.
- Return type
str
- class causal_testing.specification.variable.Output(name: str, datatype: causal_testing.specification.variable.T, distribution: Optional[scipy.stats._distn_infrastructure.rv_generic] = None)¶
Bases:
causal_testing.specification.variable.Variable
An extension of the Variable class representing outputs.
- copy(name=None) causal_testing.specification.variable.Output¶
Return a new instance of the Variable with the given name, or with the original name if no name is supplied.
- Parameters
name (str) – The variable name.
- Returns
A new Variable instance.
- Return type
Output
- typestring() str¶
Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).
- Returns
A string representing the variable Type.
- Return type
str
- class causal_testing.specification.variable.Variable(name: str, datatype: causal_testing.specification.variable.T, distribution: Optional[scipy.stats._distn_infrastructure.rv_generic] = None)¶
Bases:
abc.ABC
An abstract class representing causal variables.
- Parameters
name (str) – The name of the variable.
datatype (T) – The datatype of the variable.
distribution (rv_generic) – The expected distribution of the variable values.
- Attr type z3
The Z3 mirror of the variable.
- Attr name
- Attr datatype
- Attr distribution
- cast(val: any) causal_testing.specification.variable.T¶
Cast the supplied value to the datatype T of the variable.
- Parameters
val (any) – The value to cast.
- Returns
The supplied value as an instance of T.
- Return type
T
- abstract copy(name: Optional[str] = None) causal_testing.specification.variable.Variable¶
Return a new instance of the Variable with the given name, or with the original name if no name is supplied.
- Parameters
name (str) – The variable name.
- Returns
A new Variable instance.
- Return type
Variable
- datatype: causal_testing.specification.variable.T¶
- distribution: scipy.stats._distn_infrastructure.rv_generic¶
- name: str¶
- sample(n_samples: int) [T]¶
Generate a Latin Hypercube Sample of size n_samples according to the Variable’s distribution.
- Parameters
n_samples (int) – The number of samples to generate.
- Returns
A list of samples
- Return type
List[T]
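For readers unfamiliar with Latin Hypercube Sampling, a one-dimensional version can be sketched with the standard library alone. This is an illustrative assumption about the technique, not the Variable.sample implementation: `latin_hypercube_1d` is a hypothetical helper, and the distribution is supplied as an inverse CDF (here a uniform distribution on [0, 10)).

```python
import random


def latin_hypercube_1d(n_samples, inverse_cdf, rng):
    """Draw one point from each of n_samples equal-probability strata."""
    # One uniform draw inside each stratum [i/n, (i+1)/n) of the unit interval.
    points = [(i + rng.random()) / n_samples for i in range(n_samples)]
    # Shuffle so the sample order carries no information about the strata.
    rng.shuffle(points)
    # Map the stratified probabilities through the distribution's inverse CDF.
    return [inverse_cdf(p) for p in points]


def uniform_0_10_inverse_cdf(p):
    # Inverse CDF of a uniform distribution on [0, 10).
    return 10 * p


samples = latin_hypercube_1d(5, uniform_0_10_inverse_cdf, random.Random(0))
```

Because each stratum contributes exactly one point, the five sorted samples fall in [0, 2), [2, 4), [4, 6), [6, 8) and [8, 10) respectively, which plain random sampling does not guarantee.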
- abstract typestring() str¶
Return the type of the Variable, e.g. INPUT, or OUTPUT. Note that this is NOT the datatype (int, str, etc.).
- Returns
A string representing the variable Type.
- Return type
str