beexai.training package
Submodules
beexai.training.models module
Architectures for neural networks.
- class beexai.training.models.NNModel(input_dim: int, output_dim: int, task: str, n_neurons: int = 32, device: str = 'cpu', batch_norm: bool = True, use_dropout: bool = True, dropout_rate: float = 0.1, n_hidden_layers: int = 1)
Bases: NeuralNetwork
Inherits from NeuralNetwork to override the fit and predict methods.
- fit(x_train: DataFrame | ndarray | Tensor, y_train: DataFrame | ndarray | Tensor, learning_rate: float = 0.005, epochs: int = 1000, loss_file: str | None = None, x_val: DataFrame | ndarray | Tensor | None = None, y_val: DataFrame | ndarray | Tensor | None = None) → Any
- class beexai.training.models.NeuralNetwork(input_dim: int, output_dim: int, task: str, n_neurons: int = 32, batch_norm: bool = True, use_dropout: bool = True, dropout_rate: float = 0.1, n_hidden_layers: int = 1)
Bases: Module
Neural network class.
- forward(x_in: Tensor) → Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
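The note above can be illustrated without PyTorch. The following is a toy stand-in, not the implementation of torch.nn.Module: `ToyModule`, `register_hook`, and `Doubler` are hypothetical names chosen for this sketch. It shows why calling the instance runs the registered hooks while calling `forward` directly skips them.

```python
from typing import Callable, List, Tuple


class ToyModule:
    """Simplified stand-in for a module with forward hooks (hypothetical, not PyTorch)."""

    def __init__(self) -> None:
        self._hooks: List[Callable[[float, float], None]] = []

    def register_hook(self, hook: Callable[[float, float], None]) -> None:
        # Each hook receives (input, output) after the forward computation.
        self._hooks.append(hook)

    def forward(self, x: float) -> float:
        raise NotImplementedError("subclasses define the computation here")

    def __call__(self, x: float) -> float:
        # Calling the instance wraps forward() and then runs every hook.
        out = self.forward(x)
        for hook in self._hooks:
            hook(x, out)
        return out


class Doubler(ToyModule):
    def forward(self, x: float) -> float:
        return 2 * x


seen: List[Tuple[float, float]] = []
layer = Doubler()
layer.register_hook(lambda x, out: seen.append((x, out)))

layer(3.0)          # runs forward, then the hook
layer.forward(4.0)  # runs forward only; the hook is skipped
```

In PyTorch itself the corresponding registration method is `Module.register_forward_hook`; the same rule applies there, so `model(x)` is preferred over `model.forward(x)`.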
- class beexai.training.models.NeuralNetworkBlock(n_neurons: int = 32, batch_norm: bool = True, use_dropout: bool = True, dropout_rate: float = 0.1)
Bases: Module
Neural network block class.
- forward(x_in: Tensor) → Tensor
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
beexai.training.train module
Training models and evaluating their performance.
- class beexai.training.train.Trainer(model_name: str, task: str, model_params: dict | None = None, device: str = 'cpu')
Bases: object
Trainer class
- model
model object
- Type:
callable
- Parameters:
model_name (str) – Name of the model from models dict. Must be one of ‘LogisticRegression’, ‘LinearRegression’, ‘DecisionTreeClassifier’, ‘RandomForestClassifier’, ‘GradientBoostingClassifier’, ‘XGBClassifier’, ‘DecisionTreeRegressor’, ‘RandomForestRegressor’, ‘GradientBoostingRegressor’, ‘XGBRegressor’, ‘NeuralNetwork’, ‘HistGradientBoostingClassifier’, ‘HistGradientBoostingRegressor’
task (str) – “classification” or “regression”.
model_params (dict) – Parameters for the model
device (str) – device to use. Defaults to “cpu”.
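The constraints on `model_name` and `task` listed above can be checked before a Trainer is built. This is a minimal sketch: `check_trainer_args` and `make_trainer` are hypothetical helpers, not part of beexai, and whether Trainer itself raises on an unknown name is not stated here. The import path and constructor arguments are taken from the signature above; the import is deferred so the check works without beexai installed.

```python
from typing import Optional

# Names copied from the model_name parameter description above.
SUPPORTED_MODELS = {
    "LogisticRegression", "LinearRegression",
    "DecisionTreeClassifier", "RandomForestClassifier",
    "GradientBoostingClassifier", "XGBClassifier",
    "DecisionTreeRegressor", "RandomForestRegressor",
    "GradientBoostingRegressor", "XGBRegressor",
    "NeuralNetwork",
    "HistGradientBoostingClassifier", "HistGradientBoostingRegressor",
}
SUPPORTED_TASKS = {"classification", "regression"}


def check_trainer_args(model_name: str, task: str) -> None:
    """Hypothetical pre-check: raise ValueError on undocumented values."""
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(f"unknown model_name: {model_name!r}")
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unknown task: {task!r}")


def make_trainer(model_name: str, task: str,
                 model_params: Optional[dict] = None, device: str = "cpu"):
    """Hypothetical wrapper: validate, then build a Trainer."""
    check_trainer_args(model_name, task)
    # Deferred import so the validation above has no beexai dependency.
    from beexai.training.train import Trainer
    return Trainer(model_name, task, model_params=model_params, device=device)
```

With beexai installed, `make_trainer("RandomForestClassifier", "classification")` would return a Trainer whose `train`, `cross_val`, and `get_metrics` methods are described in this section.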
- cross_val(x_train: DataFrame, y_train: DataFrame, param_grid: dict | None = None, scoring: str | None = None, kfold: int | KFold = 5, search_type: str = 'grid') → Callable
Cross validation for the model
- Parameters:
x_train (pd.DataFrame) – train set
y_train (pd.DataFrame) – target
param_grid (dict, optional) – grid search parameters. Defaults to None.
scoring (str, optional) – scoring metric. Defaults to None.
kfold (Union[int, KFold], optional) – number of folds or kfold object. Defaults to 5.
search_type (str, optional) – “grid” or “random”. Defaults to “grid”.
- Returns:
best model
- Return type:
callable
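How `cross_val` performs the search internally is not documented here, so the following only sketches the general difference between the two `search_type` values, assuming `param_grid` maps parameter names to lists of candidate values. The functions `grid_candidates` and `random_candidates`, and the parameter names in the example grid, are illustrative and not part of beexai.

```python
import itertools
import random
from typing import Any, Dict, List


def grid_candidates(param_grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Every combination of the listed values (exhaustive "grid" search)."""
    names = sorted(param_grid)
    combos = itertools.product(*(param_grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def random_candidates(param_grid: Dict[str, List[Any]], n_iter: int,
                      seed: int = 0) -> List[Dict[str, Any]]:
    """A random subset of the grid, drawn without repeats ("random" search)."""
    candidates = grid_candidates(param_grid)
    rng = random.Random(seed)
    return rng.sample(candidates, k=min(n_iter, len(candidates)))


# Illustrative grid: 3 * 2 = 6 combinations in total.
param_grid = {"max_depth": [3, 5, 7], "n_estimators": [50, 100]}
all_combos = grid_candidates(param_grid)
some_combos = random_candidates(param_grid, n_iter=4)
```

A grid search scores every entry of `all_combos` on each fold, so its cost grows with the product of the list lengths; a random search scores only a fixed number of sampled candidates.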
- get_metrics(x: DataFrame, y: DataFrame) → dict
Get metrics for the model: accuracy and F1 score for classification, MSE and R2 score for regression.
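For reference, the four metrics named above can be written out in plain Python. This is a sketch of the standard definitions, not beexai's code: the function names are hypothetical, the F1 score shown is the binary form (the averaging that `get_metrics` uses for multi-class targets is not stated here), and `r2` assumes the target is not constant.

```python
from typing import Sequence


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of predictions that match the target."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_binary(y_true: Sequence, y_pred: Sequence, positive=1) -> float:
    """Binary F1: harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```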
- train(x_train: DataFrame | ndarray | Tensor, y_train: DataFrame | ndarray | Tensor, learning_rate: float = 0.005, epochs: int = 1000, loss_file: str | None = None, x_val: DataFrame | ndarray | Tensor | None = None, y_val: DataFrame | ndarray | Tensor | None = None) → Callable
Perform training on the whole training set.
- Parameters:
x_train (pd.DataFrame) – training features
y_train (pd.DataFrame) – training target
learning_rate (float, optional) – learning rate. Defaults to 0.005.
epochs (int, optional) – number of epochs. Defaults to 1000.
loss_file (str, optional) – path to save the loss plot. Defaults to None.
x_val (pd.DataFrame, optional) – validation set. Defaults to None.
y_val (pd.DataFrame, optional) – validation target. Defaults to None.
- Returns:
trained model
- Return type:
callable
- beexai.training.train.grid_search_all_models(x_train: DataFrame, y_train: DataFrame, task: str, params_dict: dict | None = None, params_grid_dict: dict | None = None, scoring: str | None = None, kfold: int | KFold = 5, search_type: str = 'grid') → Tuple[dict, dict]
Grid search for all models
- Parameters:
x_train (pd.DataFrame) – training features
y_train (pd.DataFrame) – training target
task (str) – “classification” or “regression”
params_dict (dict, optional) – parameters for each model. Defaults to None.
params_grid_dict (dict, optional) – grid search parameters for each model. Defaults to None.
scoring (str, optional) – scoring metric. Defaults to None.
kfold (Union[int, KFold], optional) – number of folds or kfold object. Defaults to 5.
search_type (str, optional) – “grid” or “random”. Defaults to “grid”.
- Returns:
best models and best parameters
- Return type: