Classifier utilities

clf_utilities.create_clf_params_product_generator(params_grid)

Generates all possible combinations of a classifier’s hyperparameter values.

Parameters

params_grid (dict) – Maps each classifier hyperparameter name to its corresponding search space

Yields

dict – A single hyperparameter configuration for the classifier
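
A minimal sketch of how such a generator could be written with itertools.product, assuming params_grid maps each hyperparameter name to a list of candidate values. The name params_product and the grid contents are hypothetical and are not taken from the library's source.

```python
from itertools import product


def params_product(params_grid):
    """Yield one dict per combination of hyperparameter values (illustrative only)."""
    names = list(params_grid)
    # product() walks the Cartesian product of the value lists, in key order.
    for values in product(*(params_grid[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical grid: 2 values of C times 2 kernels gives 4 configurations.
grid = {"C": [0.1, 1.0], "kernel": ["linear", "rbf"]}
configs = list(params_product(grid))
```

Because the combinations are yielded lazily, a large grid never has to be held in memory all at once.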

clf_utilities.create_feature_sets_generator(fold_path)

Creates a generator yielding feature set names.

Parameters

fold_path (str) – Path from which to read the feature sets

Yields

list – Pairs of X_train and X_test feature set names

clf_utilities.evaluate(y_test, y_pred)

Evaluates model predictions through a series of metrics.

Parameters
  • y_test (numpy.ndarray) – True labels

  • y_pred (numpy.ndarray) – Predicted labels

Returns

Maps each metric name to its computed value

Return type

dict
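
The exact set of metrics is not listed here, so the following is only a sketch of the return shape. It assumes two metrics, accuracy and macro-averaged F1, computed with NumPy. The name evaluate_sketch and the metric keys are hypothetical.

```python
import numpy as np


def evaluate_sketch(y_test, y_pred):
    """Return a dict of metric name -> value (the choice of metrics is an assumption)."""
    y_test = np.asarray(y_test)
    y_pred = np.asarray(y_pred)
    f1_scores = []
    # Macro F1 here averages the per-class F1 over the labels present in y_test.
    for label in np.unique(y_test):
        tp = np.sum((y_pred == label) & (y_test == label))
        fp = np.sum((y_pred == label) & (y_test != label))
        fn = np.sum((y_pred != label) & (y_test == label))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return {
        "accuracy": float(np.mean(y_test == y_pred)),
        "f1_macro": float(np.mean(f1_scores)),
    }


scores = evaluate_sketch(np.array([0, 1, 1, 2]), np.array([0, 1, 2, 2]))
```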

clf_utilities.get_top_k_predictions(model, X_test)

Uses model to make predictions on X_test.

Parameters
  • model (object) – The model to be used for predictions

  • X_test (numpy.ndarray) – The test features array

Returns

Contains predictions in (label, score) pairs

Return type

list
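
One way to obtain top-k (label, score) pairs is sketched below. It assumes the model exposes a scikit-learn style predict_proba method, that k is passed explicitly (the documented signature has no such parameter), and that one list of pairs is returned per sample. StubModel and top_k_predictions are hypothetical names used only for illustration.

```python
import numpy as np


class StubModel:
    """Hypothetical stand-in for a fitted classifier exposing predict_proba."""

    def predict_proba(self, X):
        # Fixed class scores for every row, purely for demonstration.
        return np.tile(np.array([0.1, 0.6, 0.3]), (len(X), 1))


def top_k_predictions(model, X_test, k=2):
    """Return, per sample, the k highest-scoring (label, score) pairs."""
    probs = model.predict_proba(X_test)
    # argsort is ascending, so reverse each row and keep the first k columns.
    top_idx = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    return [
        [(int(label), float(row[label])) for label in idx]
        for row, idx in zip(probs, top_idx)
    ]


preds = top_k_predictions(StubModel(), np.zeros((2, 4)), k=2)
```

The labels in this sketch are column indices of the probability matrix, that is, encoded labels rather than the original strings.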

clf_utilities.inverse_transform_labels(encoder, k_preds)

Uses encoder to transform encoded labels back to their original strings.

Parameters
  • encoder (sklearn.preprocessing.LabelEncoder) – The encoder to be utilized

  • k_preds (list) – Contains predictions in (label, score) pairs

Returns

Contains predictions in (label, score) pairs, where label is now in the original string format

Return type

list
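
A short sketch of the decoding step, assuming k_preds is a flat list of (label, score) pairs for a single sample. The name inverse_transform_sketch and the class names are hypothetical. LabelEncoder.fit and LabelEncoder.inverse_transform are the scikit-learn calls used.

```python
from sklearn.preprocessing import LabelEncoder


def inverse_transform_sketch(encoder, k_preds):
    """Map encoded labels in (label, score) pairs back to their original strings."""
    labels = [label for label, _ in k_preds]
    # inverse_transform accepts a 1-D array-like of encoded labels.
    names = encoder.inverse_transform(labels)
    return [(str(name), score) for name, (_, score) in zip(names, k_preds)]


# Hypothetical class names; LabelEncoder sorts them, so cafe=0, hotel=1, museum=2.
encoder = LabelEncoder().fit(["hotel", "cafe", "museum"])
decoded = inverse_transform_sketch(encoder, [(1, 0.6), (2, 0.3)])
```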

clf_utilities.is_valid(clf_name)

Checks whether clf_name is a valid classifier name for the experiment setup.

Parameters

clf_name (str) – Name of the classifier

Returns

True if the given classifier name is valid

Return type

bool

clf_utilities.k_accuracy_score(y_test, k_best)

Measures the defined k-accuracy metric. For each POI, a prediction counts as successful if the true label appears among the top k labels predicted by the model.

Parameters
  • y_test (numpy.ndarray) – True labels

  • k_best (numpy.ndarray) – Top k predicted labels

Returns

The k-accuracy score

Return type

float
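
The metric can be sketched in a few lines of NumPy, assuming k_best has shape (n_samples, k) with one row of predicted labels per sample. The name k_accuracy_sketch and the sample data are hypothetical.

```python
import numpy as np


def k_accuracy_sketch(y_test, k_best):
    """Fraction of samples whose true label is among their top k predicted labels."""
    y_test = np.asarray(y_test)
    k_best = np.asarray(k_best)
    # Compare each true label against every entry in its row of k_best.
    hits = (k_best == y_test[:, None]).any(axis=1)
    return float(hits.mean())


y_true = np.array([0, 1, 2, 2])
top2 = np.array([[0, 1], [2, 1], [0, 1], [2, 0]])
score = k_accuracy_sketch(y_true, top2)
```

In this example three of the four true labels appear in their row of top2, so the score is 0.75. With k = 1 the metric reduces to ordinary accuracy.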

clf_utilities.normalize_scores(scores)

Normalizes prediction scores to a probability-like format.

Parameters

scores (list) – The prediction scores produced by the model

Returns

The normalized scores

Return type

list
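
The normalization scheme is not specified here. The sketch below assumes the simplest option, dividing non-negative scores by their sum so that they add up to 1; the actual function may use a different scheme (for example a softmax). The name normalize_scores_sketch is hypothetical.

```python
def normalize_scores_sketch(scores):
    """Rescale non-negative scores so they sum to 1 (sum normalization is an assumption)."""
    total = sum(scores)
    if total == 0:
        # Avoid division by zero; fall back to a uniform distribution.
        return [1.0 / len(scores)] * len(scores) if scores else []
    return [s / total for s in scores]


normalized = normalize_scores_sketch([2.0, 1.0, 1.0])
```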

clf_utilities.train_classifier(clf_name, X_train, y_train)

Trains a classifier through grid search.

Parameters
  • clf_name (str) – Name of the classifier to be trained

  • X_train (numpy.ndarray) – Train features array

  • y_train (numpy.ndarray) – Train labels array

Returns

The trained classifier

Return type

object
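
A rough sketch of grid-search training with scikit-learn's GridSearchCV. The CLASSIFIERS registry, the cv parameter, and the toy data are hypothetical; how the experiment setup actually maps clf_name to an estimator and a search space is not described in this reference.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical registry mapping classifier names to (estimator, search grid).
CLASSIFIERS = {
    "knn": (KNeighborsClassifier(), {"n_neighbors": [1, 3]}),
}


def train_classifier_sketch(clf_name, X_train, y_train, cv=2):
    """Fit the named classifier with an exhaustive grid search and return the best one."""
    estimator, grid = CLASSIFIERS[clf_name]
    search = GridSearchCV(estimator, grid, cv=cv)
    search.fit(X_train, y_train)
    # best_estimator_ is refit on the full training set (refit=True by default).
    return search.best_estimator_


# Toy data: two well-separated one-dimensional clusters.
X = np.array([[0.0], [0.1], [0.2], [0.3], [5.0], [5.1], [5.2], [5.3]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
clf = train_classifier_sketch("knn", X, y)
```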