Classifier utilities

clf_utilities.create_clf_params_product_generator(params_grid)

Generates all possible combinations of a classifier’s hyperparameter values.

Parameters: params_grid (dict) – Maps each hyperparameter name to its search space (the values to try)
Yields: dict – One hyperparameter configuration for the classifier
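The combination step can be pictured as a Cartesian product over the search spaces. The following is a minimal sketch, assuming each value in params_grid is a list; the name params_product is hypothetical and this is not necessarily how the module implements it.

```python
from itertools import product


def params_product(params_grid):
    """Yield one dict per combination of the values in params_grid."""
    names = list(params_grid)
    # product(*lists) walks the Cartesian product of the search spaces.
    for values in product(*(params_grid[name] for name in names)):
        yield dict(zip(names, values))


# Illustrative grid; the hyperparameter names are examples only.
grid = {"C": [0.1, 1.0], "kernel": ["linear", "rbf"]}
configs = list(params_product(grid))
```

Two hyperparameters with two candidate values each give four configurations, each a dict keyed by hyperparameter name.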

clf_utilities.create_feature_sets_generator(fold_path)

Creates a generator yielding feature set names.

Parameters: fold_path (str) – Path from which to read the feature sets
Yields: list – Pairs of X_train, X_test feature set names

clf_utilities.evaluate(y_test, y_pred)

Evaluates model predictions through a series of metrics.

Parameters

y_test (numpy.ndarray) – True labels
y_pred (numpy.ndarray) – Predicted labels

Returns

Maps each metric name to its computed value

Return type

dict
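As an illustration of the returned structure, here is a minimal sketch that builds such a dict with scikit-learn metrics. The choice of metrics (accuracy, macro and weighted F1) and the name evaluate_sketch are assumptions; the metrics actually computed by evaluate are not listed here.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score


def evaluate_sketch(y_test, y_pred):
    """Return a dict mapping metric names to values (metric choice is assumed)."""
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "f1_macro": f1_score(y_test, y_pred, average="macro"),
        "f1_weighted": f1_score(y_test, y_pred, average="weighted"),
    }


y_true = np.array([0, 1, 2, 2])
y_hat = np.array([0, 1, 2, 1])
scores = evaluate_sketch(y_true, y_hat)
```

Here three of the four predictions are correct, so the accuracy entry is 0.75.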

clf_utilities.get_top_k_predictions(model, X_test)

Makes predictions over X_test using model.

Parameters

model (object) – The model to be used for predictions
X_test (numpy.ndarray) – The test features array

Returns

Contains predictions in (label, score) pairs

Return type

list
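One way to obtain (label, score) pairs is to rank the class probabilities of each test row. The sketch below assumes the model exposes predict_proba and classes_, as scikit-learn classifiers do, and takes k as an explicit argument; the documented function has no k parameter, so both the name top_k_predictions_sketch and the per-row nesting of the result are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression


def top_k_predictions_sketch(model, X_test, k=3):
    """Return, per test row, the k highest-scoring (label, score) pairs."""
    probs = model.predict_proba(X_test)  # shape: (n_samples, n_classes)
    # Sort class indices by ascending score, reverse, keep the first k.
    top_idx = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    return [
        [(model.classes_[j], float(row[j])) for j in idx]
        for row, idx in zip(probs, top_idx)
    ]


# Toy data: three classes separated along a single feature.
X = np.array([[0.0], [0.1], [1.0], [1.1], [2.0], [2.1]])
y = np.array([0, 0, 1, 1, 2, 2])
model = LogisticRegression().fit(X, y)
preds = top_k_predictions_sketch(model, X, k=2)
```

Each inner list is ordered from the most to the least likely label.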

clf_utilities.inverse_transform_labels(encoder, k_preds)

Uses encoder to transform encoded labels back to their original strings.

Parameters

encoder (sklearn.preprocessing.LabelEncoder) – The encoder to be utilized
k_preds (list) – Contains predictions in (label, score) pairs

Returns

Contains predictions in (label, score) pairs, where label is now in the original string format

Return type

list
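The decoding step maps naturally onto LabelEncoder.inverse_transform. A minimal sketch, assuming k_preds is a flat list of (encoded_label, score) pairs; the function name and the example labels are hypothetical.

```python
from sklearn.preprocessing import LabelEncoder


def inverse_transform_labels_sketch(encoder, k_preds):
    """Replace each encoded label with its original string; scores are unchanged."""
    labels = [label for label, _ in k_preds]
    decoded = encoder.inverse_transform(labels)
    return [(str(name), score) for name, (_, score) in zip(decoded, k_preds)]


# LabelEncoder assigns codes in sorted order: bar=0, cafe=1, hotel=2.
encoder = LabelEncoder().fit(["bar", "cafe", "hotel"])
k_preds = [(2, 0.6), (0, 0.3), (1, 0.1)]
decoded_preds = inverse_transform_labels_sketch(encoder, k_preds)
```

The order of the pairs and their scores is preserved; only the label component changes.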

clf_utilities.is_valid(clf_name)

Checks whether clf_name is a valid classifier name with respect to the experiment setup.

Parameters: clf_name (str) – Classifier’s name
Returns: True if the given classifier name is valid
Return type: bool

clf_utilities.k_accuracy_score(y_test, k_best)

Measures the k-accuracy metric. For each POI, a prediction counts as successful if the true label appears among the top k labels predicted by the model.

Parameters

y_test (numpy.ndarray) – True labels
k_best (numpy.ndarray) – Top k predicted labels

Returns

The k-accuracy score

Return type

float
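The metric described above reduces to a row-wise membership test. A minimal sketch, assuming k_best has shape (n_samples, k) and y_test has shape (n_samples,); the name k_accuracy_sketch is hypothetical.

```python
import numpy as np


def k_accuracy_sketch(y_test, k_best):
    """Fraction of rows whose true label is among that row's k predicted labels."""
    y_test = np.asarray(y_test)
    k_best = np.asarray(k_best)
    # Broadcast each true label against its row of k predictions.
    hits = (k_best == y_test[:, None]).any(axis=1)
    return float(hits.mean())


y_true = np.array([3, 1, 2, 0])
top2 = np.array([[3, 0], [2, 0], [1, 2], [0, 3]])
score = k_accuracy_sketch(y_true, top2)  # 0.75: the second row misses
```

With k = 1 this coincides with plain accuracy.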

clf_utilities.normalize_scores(scores)

Normalizes prediction scores to a probability-like format.

Parameters: scores (list) – The prediction scores as returned by the model
Returns: The normalized scores
Return type: list
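The normalization scheme is not specified here. One plausible reading, sketched below, divides each non-negative score by the sum so that the results add up to 1; this choice, the uniform fallback for an all-zero input, and the name normalize_scores_sketch are assumptions.

```python
def normalize_scores_sketch(scores):
    """Scale non-negative scores so they sum to 1 (sum-normalization is assumed)."""
    total = sum(scores)
    if total == 0:
        # No signal to distribute: fall back to a uniform distribution.
        return [1.0 / len(scores) for _ in scores]
    return [s / total for s in scores]


normalized = normalize_scores_sketch([2.0, 1.0, 1.0])  # [0.5, 0.25, 0.25]
```

If the model can return negative scores (for example, decision-function margins), a softmax would be a more suitable alternative to sum-normalization.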

clf_utilities.train_classifier(clf_name, X_train, y_train)

Trains a classifier through grid search.

Parameters

clf_name (str) – Name of the classifier to be trained
X_train (numpy.ndarray) – Train features array
y_train (numpy.ndarray) – Train labels array

Returns

The trained classifier

Return type

object
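Grid-search training of this kind can be sketched with scikit-learn's GridSearchCV. The CLASSIFIERS registry, its single "knn" entry, the cv argument, and the function name are all hypothetical; how the module actually maps clf_name to an estimator and search space is not documented here.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical registry: classifier name -> (estimator, search space).
CLASSIFIERS = {
    "knn": (KNeighborsClassifier(), {"n_neighbors": [1, 3]}),
}


def train_classifier_sketch(clf_name, X_train, y_train, cv=2):
    """Grid-search the named classifier and return the best fitted estimator."""
    estimator, params_grid = CLASSIFIERS[clf_name]
    search = GridSearchCV(estimator, params_grid, cv=cv)
    search.fit(X_train, y_train)
    # With the default refit=True, the best configuration is refitted
    # on the whole training set.
    return search.best_estimator_


X = np.array([[0.0], [0.1], [0.2], [0.3], [1.0], [1.1], [1.2], [1.3]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
clf = train_classifier_sketch("knn", X, y)
```

The returned object behaves like any fitted scikit-learn classifier, so clf.predict can be called on new feature arrays directly.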