Data Collection Package
Submodules
causal_testing.data_collection.data_collector module
- class causal_testing.data_collection.data_collector.DataCollector(scenario: causal_testing.specification.scenario.Scenario)
Bases: abc.ABC
A data collector is a mechanism which generates or collects data from a system for a given scenario.
- abstract collect_data(**kwargs) → pandas.core.frame.DataFrame
Populate the dataframe with execution data.
- Returns
A pandas dataframe containing execution data for the system-under-test.
- filter_valid_data(data: pandas.core.frame.DataFrame, check_pos: bool = True) → pandas.core.frame.DataFrame
Check whether execution data is valid for the scenario-under-test.
Data is invalid if it does not meet the constraints specified in the scenario-under-test.
- Parameters
data – A pandas dataframe containing execution data from the system-under-test.
check_pos – Whether to check the data for positivity violations (defaults to True).
- Return satisfying_data
A pandas dataframe containing execution data that satisfy the constraints specified
in the scenario-under-test.
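The filtering idea can be illustrated with a minimal sketch in plain pandas. This is not the library's implementation: the `filter_rows` helper and the use of pandas query strings as constraints are hypothetical stand-ins for the constraints held by a `Scenario`, and the positivity check controlled by `check_pos` is not modelled here.

```python
from typing import Sequence

import pandas as pd


def filter_rows(data: pd.DataFrame, constraints: Sequence[str]) -> pd.DataFrame:
    """Hypothetical helper: keep only the rows that satisfy every constraint.

    Each constraint is a pandas query string. This representation is an
    assumption made for illustration, not how the library stores constraints.
    """
    satisfying_data = data
    for constraint in constraints:
        # Each pass drops the rows that violate the current constraint.
        satisfying_data = satisfying_data.query(constraint)
    return satisfying_data


# Toy execution data with one input (x) and one output (y).
execution_data = pd.DataFrame({"x": [1, 5, 10], "y": [2.0, 4.0, 8.0]})

# Only the row with x == 5 satisfies both constraints.
valid = filter_rows(execution_data, ["x > 2", "y < 8"])
```

Rows that violate any single constraint are dropped, so the result contains only executions that are consistent with the whole scenario.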
- class causal_testing.data_collection.data_collector.ExperimentalDataCollector(scenario: causal_testing.specification.scenario.Scenario, control_input_configuration: dict, treatment_input_configuration: dict, n_repeats: int = 1)
Bases: causal_testing.data_collection.data_collector.DataCollector
A data collector that generates data directly by running the system-under-test in the desired conditions.
Users should implement the abstract methods of this class to collect data from their system.
- abstract collect_data(**kwargs) → pandas.core.frame.DataFrame
Populate the dataframe with execution data.
- Returns
A pandas dataframe containing execution data for the system-under-test in both control and treatment
executions.
- abstract run_system_with_input_configuration(input_configuration: dict) → pandas.core.frame.DataFrame
Run the system with a given input configuration and return the resulting execution data.
- Parameters
input_configuration – A dictionary which maps a subset of inputs to values.
- Returns
A pandas dataframe containing execution data obtained by executing the system-under-test with the
specified input configuration.
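As a rough illustration of how the two abstract methods might fit together, the sketch below uses a self-contained stand-in class rather than the library itself. `ExperimentalCollectorSketch`, `DoublingSystemCollector`, and the toy system `y = 2 * x` are all hypothetical. The `scenario` argument is omitted for brevity, and repeating each run `n_repeats` times inside `run_system_with_input_configuration` is an assumption about where repetition would happen.

```python
from abc import ABC, abstractmethod

import pandas as pd


class ExperimentalCollectorSketch(ABC):
    """Stand-in mirroring the documented constructor and abstract methods.

    This is not the library class; the scenario argument is left out.
    """

    def __init__(self, control_input_configuration: dict,
                 treatment_input_configuration: dict, n_repeats: int = 1):
        self.control_input_configuration = control_input_configuration
        self.treatment_input_configuration = treatment_input_configuration
        self.n_repeats = n_repeats

    @abstractmethod
    def collect_data(self, **kwargs) -> pd.DataFrame:
        """Return execution data for both control and treatment runs."""

    @abstractmethod
    def run_system_with_input_configuration(self, input_configuration: dict) -> pd.DataFrame:
        """Run the system with the given inputs and return its execution data."""


class DoublingSystemCollector(ExperimentalCollectorSketch):
    """Hypothetical collector for a toy system that computes y = 2 * x."""

    def run_system_with_input_configuration(self, input_configuration: dict) -> pd.DataFrame:
        # One row per repeat: the inputs plus the observed output.
        rows = [
            {**input_configuration, "y": 2 * input_configuration["x"]}
            for _ in range(self.n_repeats)
        ]
        return pd.DataFrame(rows)

    def collect_data(self, **kwargs) -> pd.DataFrame:
        control = self.run_system_with_input_configuration(self.control_input_configuration)
        treatment = self.run_system_with_input_configuration(self.treatment_input_configuration)
        # Stack control and treatment executions into a single dataframe.
        return pd.concat([control, treatment], ignore_index=True)


collector = DoublingSystemCollector({"x": 1}, {"x": 3}, n_repeats=2)
results = collector.collect_data()
```

In this sketch `collect_data` simply delegates to `run_system_with_input_configuration` once for the control configuration and once for the treatment configuration, which keeps the system-specific code in a single method.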
- class causal_testing.data_collection.data_collector.ObservationalDataCollector(scenario: causal_testing.specification.scenario.Scenario, csv_path: str)
Bases: causal_testing.data_collection.data_collector.DataCollector
A data collector that extracts data that is relevant to the specified scenario from a CSV file of execution data.
- collect_data(**kwargs) → pandas.core.frame.DataFrame
Read a CSV file containing execution data for the system-under-test into a pandas dataframe, then remove any data that is invalid for the scenario-under-test.
Data is invalid if it does not meet the constraints outlined in the scenario-under-test (Scenario).
- Returns
A pandas dataframe containing execution data that is valid for the scenario-under-test.
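The read-then-filter behaviour described above might look roughly like the following sketch. It does not use the library: `collect_observational_data` is a hypothetical helper, the single pandas query string stands in for the scenario's constraints, and an in-memory buffer replaces the file at `csv_path` so the example is self-contained.

```python
import io

import pandas as pd

# Inline CSV text standing in for a file of execution data on disk.
csv_text = "x,y\n1,2.0\n5,4.0\n10,8.0\n"


def collect_observational_data(csv_source, constraint: str) -> pd.DataFrame:
    """Hypothetical helper: load execution data and drop invalid rows.

    `constraint` is a pandas query string, an assumption made for
    illustration rather than the library's constraint representation.
    """
    execution_data = pd.read_csv(csv_source)
    # Keep only the rows that satisfy the constraint, with a fresh index.
    return execution_data.query(constraint).reset_index(drop=True)


valid_data = collect_observational_data(io.StringIO(csv_text), "x >= 5")
```

`pd.read_csv` accepts either a path or a file-like object, so the same helper would work with a real CSV path in place of the buffer.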