Testing Package

Submodules

causal_testing.testing.causal_test_case module

class causal_testing.testing.causal_test_case.CausalTestCase

Bases: object

A causal test case is a triple (X, Delta, Y), where X is an input configuration, Delta is an intervention, and Y is the expected causal effect on a particular output. The goal of a causal test case is to test whether the intervention Delta made to the input configuration X causes the model-under-test to produce the expected change in Y.
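The triple described above can be pictured with a small stand-alone data structure. This is only an illustrative sketch: `SketchCausalTestCase`, its field names, and the example variables (`dose`, `age`, `recovery_time`) are hypothetical and are not part of the causal_testing package.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class SketchCausalTestCase:
    """Hypothetical stand-in for the (X, Delta, Y) triple; not the library class."""

    input_configuration: Dict[str, Any]  # X: the control input configuration
    intervention: Dict[str, Any]         # Delta: treatment variable -> treatment value
    expected_effect: Dict[str, str]      # Y: outcome variable -> expected effect

    def control_values(self) -> List[Any]:
        # Value of each treatment variable before the intervention.
        return [self.input_configuration[v] for v in self.intervention]

    def treatment_values(self) -> List[Any]:
        # Value of each treatment variable after the intervention.
        return list(self.intervention.values())


case = SketchCausalTestCase(
    input_configuration={"dose": 10, "age": 40},
    intervention={"dose": 20},
    expected_effect={"recovery_time": "negative"},
)
print(case.control_values(), case.treatment_values())
```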

get_control_values()

Return a list of the control values for each treatment variable in this causal test case.

get_outcome_variables()

Return a list of the outcome variables (as strings) for this causal test case.

get_treatment_values()

Return a list of the treatment values for each treatment variable in this causal test case.

get_treatment_variables()

Return a list of the treatment variables (as strings) for this causal test case.

causal_testing.testing.causal_test_engine module

class causal_testing.testing.causal_test_engine.CausalTestEngine(causal_test_case: causal_testing.testing.causal_test_case.CausalTestCase, causal_specification: causal_testing.specification.causal_specification.CausalSpecification, data_collector: causal_testing.data_collection.data_collector.DataCollector)

Bases: object

Overarching workflow for Causal Testing. The CausalTestEngine proceeds in four steps:

  1. Given a causal test case, specification, and (optionally) observational data, compute the causal estimand that, once estimated, yields the desired causal effect. This is essentially the recipe for a statistical procedure that, according to the assumptions encoded in the causal specification, estimates the causal effect of interest.

  2. If using observational data, check whether the data is sufficient for estimating the causal effect of interest. If the data is insufficient, identify the missing data to guide the user towards un-exercised areas of the system-under-test. Otherwise, if generating experimental data, run the model in the experimental conditions required to isolate the causal effect of interest.

  3. Using the gathered data (whether observational or experimental), implement the statistical procedure prescribed by the causal estimand. For example, apply a linear regression model which includes a term for the set of variables which block (d-separate) all back-door paths. Return the causal estimate obtained following this procedure and, optionally (depending on the estimator used), confidence intervals for this estimate. These are provided as an instance of the CausalTestResult class.

  4. Define a test oracle procedure which uses the causal test results to determine whether the intervention has had the anticipated causal effect. This should assign a pass/fail value to the CausalTestResult.

execute_test(estimator: causal_testing.testing.estimators.Estimator, estimate_type: str = 'ate') causal_testing.testing.causal_test_outcome.CausalTestResult

Execute a causal test case and return the causal test result.

Test case execution proceeds with the following steps:

  1. Check that data has been loaded using the method load_data.

  2. Check the loaded data for positivity violations and warn the user if any are found.

  3. Instantiate the estimator with the values of the causal test case.

  4. Using the estimator, estimate the average treatment effect of changing the treatment from the control value to the treatment value on the outcome of interest, adjusting for the identified adjustment set.

  5. Depending on the estimator used, compute 95% confidence intervals for the estimate.

  6. Store the results in an instance of CausalTestResult.

  7. Apply the test oracle procedure to assign a pass/fail to the CausalTestResult and return it.

Parameters
  • estimator – A reference to an Estimator class.

  • estimate_type – A string which denotes the type of estimate to return, ATE or CATE.

Return causal_test_result

A CausalTestResult for the executed causal test case.
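The execution steps listed for execute_test can be walked through in a stand-alone sketch. Everything here is an assumption made for illustration: `sketch_execute_test`, the dictionary-based result, and the oracle callable are hypothetical, the estimate is a plain difference in means (only appropriate when the adjustment set is empty, e.g. for randomised experimental data), and the interval is a normal approximation. It does not reproduce the engine's implementation.

```python
import math
import statistics


def sketch_execute_test(rows, treatment, control_value, treatment_value, outcome, oracle):
    """Hypothetical walk-through of the execution steps; not the engine itself."""
    # Check that data has been loaded.
    if not rows:
        raise ValueError("no data loaded")

    # Check positivity. The documented engine warns; this sketch raises
    # because it cannot continue without data for both treatment levels.
    control = [r[outcome] for r in rows if r[treatment] == control_value]
    treated = [r[outcome] for r in rows if r[treatment] == treatment_value]
    if len(control) < 2 or len(treated) < 2:
        raise ValueError("positivity violation: too few executions for a treatment level")

    # Estimate the ATE (assumed empty adjustment set: difference in means).
    ate = statistics.mean(treated) - statistics.mean(control)

    # 95% confidence interval via a normal approximation.
    standard_error = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(control) / len(control)
    )
    result = {
        "ate": ate,
        "ci_low": ate - 1.96 * standard_error,
        "ci_high": ate + 1.96 * standard_error,
    }

    # Apply the test oracle to assign pass/fail.
    result["passed"] = oracle(result)
    return result


rows = [
    {"dose": 10, "recovery_time": 5.0},
    {"dose": 10, "recovery_time": 6.0},
    {"dose": 10, "recovery_time": 7.0},
    {"dose": 20, "recovery_time": 2.0},
    {"dose": 20, "recovery_time": 3.0},
    {"dose": 20, "recovery_time": 4.0},
]
# Oracle: the effect is expected to be negative, with the whole interval below zero.
result = sketch_execute_test(
    rows, "dose", 10, 20, "recovery_time", oracle=lambda r: r["ci_high"] < 0
)
print(result)
```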

load_data(n_repeats: int = 1, **kwargs)

Load execution data corresponding to the causal test case into a pandas dataframe and return the minimal adjustment set.

Data can be loaded in two ways:
  1. Experimentally - the model is executed with the treatment and control input configurations under conditions that guarantee the observed change in outcome must be caused by the change in input (intervention).

  2. Observationally - previous execution data is supplied in the form of a csv which is then filtered to remove any data corresponding to executions of a different scenario (i.e. not the scenario-under-test) and checked for positivity violations.

After the data is loaded, both are treated in the same way and, provided the identifiability and modelling assumptions hold, can be used to estimate the causal effect for the causal test case.
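For the observational route, the filtering and positivity check described above might look roughly like the following. This is a simplified sketch under stated assumptions: the helper names, the representation of a scenario as a mapping from variable names to predicates, and the example data are all hypothetical, and positivity is reduced to "every required treatment value appears at least once".

```python
def filter_to_scenario(rows, constraints):
    """Keep only executions that satisfy every scenario constraint.

    `constraints` maps a variable name to a predicate (hypothetical format).
    """
    return [r for r in rows if all(pred(r[var]) for var, pred in constraints.items())]


def positivity_violations(rows, treatment, required_values):
    """Return the required treatment values that never occur in the data."""
    observed = {r[treatment] for r in rows}
    return [v for v in required_values if v not in observed]


rows = [
    {"dose": 10, "age": 30, "recovery_time": 7.0},
    {"dose": 20, "age": 35, "recovery_time": 5.0},
    {"dose": 20, "age": 80, "recovery_time": 9.0},
]
scenario = {"age": lambda age: age < 65}  # scenario-under-test: patients under 65

in_scenario = filter_to_scenario(rows, scenario)
missing = positivity_violations(in_scenario, "dose", required_values=[10, 20, 30])
print(len(in_scenario), missing)
```

A non-empty `missing` list corresponds to the un-exercised treatment values a user would need to gather data for before the effect could be estimated.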

Parameters

n_repeats – An optional int which specifies the number of times to run a causal test case in the experimental case.

Return self

Update the causal test case’s execution data dataframe.

Return minimal_adjustment_set

The smallest set of variables which can be adjusted for to obtain a causal estimate as opposed to a purely associational estimate.

causal_testing.testing.causal_test_outcome module

class causal_testing.testing.causal_test_outcome.CausalTestOutcome

Bases: abc.ABC

An abstract class representing an expected causal effect.

abstract apply(res: causal_testing.testing.causal_test_outcome.CausalTestResult) bool

class causal_testing.testing.causal_test_outcome.CausalTestResult

Bases: object

A container to hold the results of a causal test case. Every causal test case provides a point estimate of the ATE, given a particular treatment, outcome, and adjustment set. Some but not all estimators can provide confidence intervals.

ci_high()

Return the upper bound of the confidence interval.

ci_low()

Return the lower bound of the confidence interval.

summary()

Summarise the causal test result as an intuitive sentence.

class causal_testing.testing.causal_test_outcome.ExactValue(value: float)

Bases: causal_testing.testing.causal_test_outcome.CausalTestOutcome

An extension of CausalTestOutcome representing that the expected causal effect should be a specific value.

apply(res: causal_testing.testing.causal_test_outcome.CausalTestResult) bool

class causal_testing.testing.causal_test_outcome.Negative

Bases: causal_testing.testing.causal_test_outcome.CausalTestOutcome

An extension of CausalTestOutcome representing that the expected causal effect should be negative.

apply(res: causal_testing.testing.causal_test_outcome.CausalTestResult) bool

class causal_testing.testing.causal_test_outcome.NoEffect

Bases: causal_testing.testing.causal_test_outcome.CausalTestOutcome

An extension of CausalTestOutcome representing that the expected causal effect should be zero.

apply(res: causal_testing.testing.causal_test_outcome.CausalTestResult) bool

class causal_testing.testing.causal_test_outcome.Positive

Bases: causal_testing.testing.causal_test_outcome.CausalTestOutcome

An extension of CausalTestOutcome representing that the expected causal effect should be positive.

apply(res: causal_testing.testing.causal_test_outcome.CausalTestResult) bool
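The relationship between the expected-outcome classes and a test result can be illustrated with stand-alone stand-ins. The decision rules below are assumptions chosen for the sketch (for example, treating "no effect" as "the confidence interval contains zero" and giving the exact-value check a `tolerance` parameter); the actual rules used by the classes documented above are not stated in this reference, and every `Sketch*` name is hypothetical.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class SketchResult:
    """Hypothetical stand-in for CausalTestResult: point estimate plus optional interval."""

    ate: float
    confidence_intervals: Optional[Tuple[float, float]] = None


class SketchOutcome(ABC):
    @abstractmethod
    def apply(self, res: SketchResult) -> bool:
        """Return True if the result matches the expected causal effect."""


class SketchPositive(SketchOutcome):
    def apply(self, res: SketchResult) -> bool:
        return res.ate > 0


class SketchNegative(SketchOutcome):
    def apply(self, res: SketchResult) -> bool:
        return res.ate < 0


class SketchNoEffect(SketchOutcome):
    def apply(self, res: SketchResult) -> bool:
        # Assumed rule: pass when the interval contains zero; without an
        # interval, fall back to an exact comparison of the point estimate.
        if res.confidence_intervals is None:
            return res.ate == 0
        low, high = res.confidence_intervals
        return low <= 0 <= high


class SketchExactValue(SketchOutcome):
    def __init__(self, value: float, tolerance: float = 1e-9):
        self.value = value
        self.tolerance = tolerance  # hypothetical parameter, not in the documented signature

    def apply(self, res: SketchResult) -> bool:
        return math.isclose(res.ate, self.value, abs_tol=self.tolerance)


result = SketchResult(ate=2.5, confidence_intervals=(1.0, 4.0))
verdicts = {
    "positive": SketchPositive().apply(result),
    "negative": SketchNegative().apply(result),
    "no_effect": SketchNoEffect().apply(result),
    "exact_2.5": SketchExactValue(2.5).apply(result),
}
print(verdicts)
```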

causal_testing.testing.estimators module

class causal_testing.testing.estimators.CausalForestEstimator(treatment: tuple, treatment_values: float, control_values: float, adjustment_set: set, outcome: tuple, df: Optional[pandas.core.frame.DataFrame] = None, effect_modifiers: Optional[set] = None)

Bases: causal_testing.testing.estimators.Estimator

A causal random forest estimator is a non-parametric estimator which recursively partitions the covariate space to learn a low-dimensional representation of treatment effect heterogeneity. This form of estimator is best suited to the estimation of heterogeneous treatment effects i.e. the estimated effect for every sample rather than the population average.

add_modelling_assumptions()

Add any modelling assumptions to the estimator.

Return self

Update self.modelling_assumptions

estimate_ate() float

Estimate the average treatment effect.

Return ate, confidence_intervals

The average treatment effect and 95% confidence intervals.

estimate_cates() pandas.core.frame.DataFrame

Estimate the conditional average treatment effect for each sample in the data as a function of a set of covariates (X) i.e. effect modifiers. That is, the predicted change in outcome caused by the intervention (change in treatment from control to treatment value) for every execution of the system-under-test, taking into account the value of each effect modifier X. As a result, for every unique setting of the set of covariates X, we expect a different CATE.

Return results_df

A dataframe containing a conditional average treatment effect, 95% confidence intervals, and the covariate (effect modifier) values for each sample.
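To make the idea of a conditional average treatment effect concrete, the sketch below computes one effect per value of a single effect modifier using a stratified difference in means. This is deliberately not a causal forest: it is a much simpler technique that only works for a discrete modifier, and it is shown purely to illustrate why each setting of the covariates yields its own effect. The function name and the example data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def sketch_cates(rows, treatment, control_value, treatment_value, outcome, modifier):
    """Difference in mean outcome (treated minus control) within each modifier stratum."""
    strata = defaultdict(lambda: {"control": [], "treated": []})
    for row in rows:
        if row[treatment] == control_value:
            strata[row[modifier]]["control"].append(row[outcome])
        elif row[treatment] == treatment_value:
            strata[row[modifier]]["treated"].append(row[outcome])
    # Only strata with data for both treatment levels yield an estimate.
    return {
        level: mean(groups["treated"]) - mean(groups["control"])
        for level, groups in strata.items()
        if groups["control"] and groups["treated"]
    }


rows = [
    {"dose": 10, "age_group": "young", "recovery_time": 7.0},
    {"dose": 10, "age_group": "young", "recovery_time": 9.0},
    {"dose": 20, "age_group": "young", "recovery_time": 4.0},
    {"dose": 20, "age_group": "young", "recovery_time": 6.0},
    {"dose": 10, "age_group": "old", "recovery_time": 10.0},
    {"dose": 10, "age_group": "old", "recovery_time": 12.0},
    {"dose": 20, "age_group": "old", "recovery_time": 10.0},
    {"dose": 20, "age_group": "old", "recovery_time": 11.0},
]
cates = sketch_cates(rows, "dose", 10, 20, "recovery_time", "age_group")
print(cates)
```

In this made-up data the intervention shortens recovery much more for the "young" stratum than for the "old" one, which a single population-wide average would hide.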

class causal_testing.testing.estimators.Estimator(treatment: tuple, treatment_values: float, control_values: float, adjustment_set: set, outcome: tuple, df: Optional[pandas.core.frame.DataFrame] = None, effect_modifiers: Optional[set] = None)

Bases: abc.ABC

An estimator contains all of the information necessary to compute a causal estimate for the effect of changing a set of treatment variables to a set of values.

All estimators must implement the following two methods:

1) add_modelling_assumptions: The validity of a model-assisted causal inference result depends on whether the modelling assumptions imposed by a model actually hold. Therefore, for each model, it is important to state the modelling assumptions upon which the validity of the results depends. To achieve this, the estimator object maintains a list of modelling assumptions (as strings). If a user wishes to implement their own estimator, they must implement this method and add all assumptions to the list of modelling assumptions.

2) estimate_ate: All estimators must be capable of returning the average treatment effect as a minimum. That is, the average effect of the intervention (changing treatment from control to treated value) on the outcome of interest adjusted for all confounders.

abstract add_modelling_assumptions()

Add modelling assumptions to the estimator. This is a list of strings which list the modelling assumptions that must hold if the resulting causal inference is to be considered valid.

compute_confidence_intervals()

Estimate the 95% Wald confidence intervals for the effect of changing the treatment from control values to treatment values on the outcome.

Returns

95% Wald confidence intervals.

abstract estimate_ate() float

Estimate the unit effect of the treatment on the outcome. That is, the coefficient of the treatment variable in the linear regression equation.

Returns

The intercept and coefficient of the linear regression equation.
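The two-method contract described for estimators can be sketched with Python's abc module. The classes below are hypothetical stand-ins with a reduced constructor, not the library's Estimator; in particular, calling add_modelling_assumptions from the constructor is an assumption of this sketch.

```python
from abc import ABC, abstractmethod
from statistics import mean


class SketchEstimator(ABC):
    """Hypothetical base class mirroring the documented two-method contract."""

    def __init__(self, treatment, treatment_value, control_value, outcome, rows):
        self.treatment = treatment
        self.treatment_value = treatment_value
        self.control_value = control_value
        self.outcome = outcome
        self.rows = rows
        self.modelling_assumptions = []  # list of strings
        self.add_modelling_assumptions()  # assumed to run at construction time

    @abstractmethod
    def add_modelling_assumptions(self):
        """Append every assumption the estimate's validity depends on."""

    @abstractmethod
    def estimate_ate(self) -> float:
        """Return the average treatment effect."""


class SketchDifferenceInMeans(SketchEstimator):
    def add_modelling_assumptions(self):
        self.modelling_assumptions.append(
            "Treatment assignment is unconfounded, so no adjustment is required."
        )

    def estimate_ate(self) -> float:
        treated = [r[self.outcome] for r in self.rows if r[self.treatment] == self.treatment_value]
        control = [r[self.outcome] for r in self.rows if r[self.treatment] == self.control_value]
        return mean(treated) - mean(control)


rows = [
    {"dose": 10, "recovery_time": 6.0},
    {"dose": 10, "recovery_time": 8.0},
    {"dose": 20, "recovery_time": 3.0},
    {"dose": 20, "recovery_time": 5.0},
]
estimator = SketchDifferenceInMeans("dose", 20, 10, "recovery_time", rows)
ate = estimator.estimate_ate()
print(ate, estimator.modelling_assumptions)
```

Because both methods are abstract, a subclass that omits either one cannot be instantiated, which is how the "must implement" requirement is enforced in this sketch.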

class causal_testing.testing.estimators.LinearRegressionEstimator(treatment: tuple, treatment_values: float, control_values: float, adjustment_set: set, outcome: tuple, df: Optional[pandas.core.frame.DataFrame] = None, effect_modifiers: Optional[set] = None)

Bases: causal_testing.testing.estimators.Estimator

A Linear Regression Estimator is a parametric estimator which restricts the variables in the data to a linear combination of parameters and functions of the variables (note these functions need not be linear).

add_modelling_assumptions()

Add modelling assumptions to the estimator. This is a list of strings which list the modelling assumptions that must hold if the resulting causal inference is to be considered valid.

add_product_term_to_df(term_a: str, term_b: str)

Add a product term to the linear regression model and df.

This enables the user to capture interaction between a pair of variables in the model. In other words, while each covariate’s contribution to the mean is assumed to be independent of the other covariates, the pair of terms in the product term_a*term_b are restricted to vary linearly with each other.

Parameters
  • term_a – The first term of the product term.

  • term_b – The second term of the product term.

add_squared_term_to_df(term_to_square: str)

Add a squared term to the linear regression model and df.

This enables the user to capture curvilinear relationships with a linear regression model, not just straight lines, while automatically adding the modelling assumption imposed by the addition of this term.

Parameters

term_to_square – The term (column in data and variable in DAG) which is to be squared.
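A rough picture of what adding a squared term involves: a new column is derived from an existing one, and the assumption this imposes is recorded alongside it. The helper below is a hypothetical sketch operating on a list of dictionaries rather than a dataframe; the column naming scheme (`speed^2`) and the wording of the assumption are invented for illustration.

```python
def add_squared_term(rows, term, modelling_assumptions):
    """Add a squared column for `term` to every row and record the implied assumption."""
    new_column = f"{term}^2"  # hypothetical naming scheme
    for row in rows:
        row[new_column] = row[term] ** 2
    modelling_assumptions.append(
        f"The outcome is a linear combination of {term} and {new_column}."
    )
    return new_column


rows = [{"speed": 1.0}, {"speed": 2.0}, {"speed": 3.0}]
assumptions = []
column = add_squared_term(rows, "speed", assumptions)
print(column, [row[column] for row in rows], assumptions)
```

The regression stays linear in its parameters; only the inputs are transformed, which is why a curved relationship can still be fitted.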

estimate_ate()

Estimate the average treatment effect of the treatment on the outcome. That is, the change in outcome caused by changing the treatment variable from the control value to the treatment value.

Returns

The average treatment effect and the 95% Wald confidence intervals.

estimate_unit_ate() float

Estimate the unit average treatment effect of the treatment on the outcome. That is, the change in outcome caused by a unit change in treatment.

Returns

The unit average treatment effect and the 95% Wald confidence intervals.
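The link between the unit effect and the average treatment effect can be shown with a small ordinary least squares fit. This sketch assumes a model with no interaction or squared terms, in which case the ATE is the treatment coefficient multiplied by the difference between the treatment and control values. It uses only the standard library (normal equations solved by Gaussian elimination), omits confidence intervals, and is not the library's implementation; the helper names and the noise-free example data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a small square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, outcome, regressors):
    """Return [intercept, coefficient_1, ...] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + [float(r[v]) for v in regressors] for r in rows]
    y = [float(r[outcome]) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y_i for d, y_i in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Noise-free data generated from outcome = 1 + 2*dose + 3*age.
rows = [
    {"dose": d, "age": a, "outcome": 1 + 2 * d + 3 * a}
    for d, a in [(0, 1), (1, 0), (1, 2), (2, 1), (3, 3), (0, 2)]
]

# Regress the outcome on the treatment plus the adjustment set {"age"}.
coefficients = fit_ols(rows, "outcome", ["dose", "age"])
unit_ate = coefficients[1]                              # effect of a unit change in dose
control_value, treatment_value = 1, 3
ate = unit_ate * (treatment_value - control_value)      # effect of moving dose from 1 to 3
print(round(unit_ate, 6), round(ate, 6))
```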

causal_testing.testing.intervention module

class causal_testing.testing.intervention.Intervention(treatment_variables: tuple, treatment_values: tuple)

Bases: object

An intervention is an object which manipulates the input configuration of the scenario-under-test. It must define a method which takes the input configuration, does something to it, and returns a modified input configuration.

This provides a causal test case with two input configurations to compare: a control input configuration (the original) and a treatment input configuration (the modified). The causal test case then requires data for the execution of each of these input configurations to obtain the causal effect of this intervention.

apply(input_configuration: dict)

Take an input configuration and modify it in a particular way.

It is the effect of this change a causal test case will focus on.

Parameters

input_configuration – Input configuration for the scenario-under-test.

Return treatment_input_configuration

a modified input configuration.
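A minimal sketch of such an intervention is shown below. `SketchIntervention` is a hypothetical stand-in that follows the documented constructor arguments; copying the input rather than mutating it is a design choice of the sketch, made so that the control configuration remains available for comparison.

```python
class SketchIntervention:
    """Hypothetical stand-in: set each treatment variable to its treatment value."""

    def __init__(self, treatment_variables: tuple, treatment_values: tuple):
        if len(treatment_variables) != len(treatment_values):
            raise ValueError("each treatment variable needs exactly one treatment value")
        self.treatment_variables = treatment_variables
        self.treatment_values = treatment_values

    def apply(self, input_configuration: dict) -> dict:
        # Work on a copy so the control configuration is left untouched.
        treatment_input_configuration = dict(input_configuration)
        for variable, value in zip(self.treatment_variables, self.treatment_values):
            treatment_input_configuration[variable] = value
        return treatment_input_configuration


control_configuration = {"dose": 10, "age": 40}
intervention = SketchIntervention(("dose",), (20,))
treatment_configuration = intervention.apply(control_configuration)
print(control_configuration, treatment_configuration)
```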

Module contents