Usage
Setup the configuration file
To train a model, compute explaination attributions and evaluation metrics on tabular data, you will need to specify a config file for each dataset. There are several examples in config/ with the following format:
path: "data/my_dataset.csv"
target_col: "class"
datetime_cols:
- "date"
cols_to_delete:
- "ID"
cleaned_data_path: "output/data/my_dataset_cleaned.csv"
task: "classification"
The different options can be described as follow:
path: path of the dataset, it can be usually placed in a folder
data/target_col: target column of interest
datetime_cols: columns with a datetime format that will be divided in several integer columns (year,month,day,hour)
cols_to_delete: columns to drop (for example ID columns)
cleaned_data_path: path to save the dataset after preprocessing, it can be directly used for repeated experiments
task: classification or regression
Other operations such as adding specific colums from columns operations or deleting specific values must be done during the instanciation of the dataset in the notebooks or scripts.
Notebooks
Several notebooks are available in the notebook section to train a model, compute explaination attributions and evaluation metrics on tabular data. It is recommended to execute the examples in the order they are presented.
Load data and train a model
from beexai.dataset.load_data import load_data
from beexai.dataset.dataset import Dataset
from beexai.training.train import Trainer
DATA_NAME = "configname"
MODEL_NAME = "NeuralNetwork"
CONFIG_PATH = f"config/{DATA_NAME}.yml"
df,target_col,task,_ = load_data(from_cleaned=False,config_path=CONFIG_PATH)
data = Dataset(df,target_col)
X_train, X_test, y_train, y_test = data.get_train_test()
NN_PARAMS = {"input_dim":X_train.shape[1],"output_dim":num_labels}
trainer = Trainer(MODEL_NAME,task,NN_PARAMS)
trainer.train(X_train, y_train)
Compute explaination attributions and evaluation metrics
from beexai.explaining import CaptumExplainer
from beexai.metrics.get_results import get_all_metrics
METHOD = "IntegratedGradients"
exp = CaptumExplainer(trainer.model,task=task,method=METHOD,sklearn=False)
exp.init_explainer()
LABEL = 0
get_all_metrics(X_test.values,LABEL,trainer.model,exp)