tune hyperparameters

class interlinking.hyperparam_tuning.ParamTuning

This class provides the main methods for selecting and fine-tuning hyperparameters, and for training and testing the best classifier for toponym matching. The following classifiers are examined:

Support Vector Machine (SVM)

Decision Trees

Multi-Layer Perceptron (MLP)

Random Forest

Extra-Trees

eXtreme Gradient Boosting (XGBoost)

fineTuneClassifiers(X, y)

Search over specified parameter values for various estimators/classifiers and choose the best one.

This method searches over the specified values and selects the classifier that achieves the best average accuracy score across all evaluations. The supported search methods are:

GridSearchCV: Exhaustive search over specified parameter values for the supported estimators. The following variables are defined in MLConf():

MLP_hyperparameters

RandomForests_hyperparameters

XGBoost_hyperparameters

SVM_hyperparameters

DecisionTree_hyperparameters

RandomizedSearchCV: Randomized search over a continuous distribution space. max_iter sets the number of parameter settings that are sampled, trading off runtime against the quality of the solution. The following variables are defined in MLConf():

MLP_hyperparameters_dist

RandomForests_hyperparameters_dist

XGBoost_hyperparameters_dist

SVM_hyperparameters_dist

DecisionTree_hyperparameters_dist
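The two search methods can be compared with a minimal scikit-learn sketch. The synthetic data, the param_grid and param_dist values, and the choice of a random forest are all hypothetical stand-ins for the MLConf() variables listed above. The sketch also assumes that max_iter corresponds to scikit-learn's n_iter argument, which this page does not confirm.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

# Synthetic stand-in for toponym-pair features (hypothetical data).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical grid, playing the role of RandomForests_hyperparameters.
param_grid = {"n_estimators": [10, 20], "max_depth": [3, 5]}

# Hypothetical distributions, playing the role of
# RandomForests_hyperparameters_dist.
param_dist = {"n_estimators": randint(10, 30), "max_depth": randint(2, 6)}

# Exhaustive search: every combination in param_grid is evaluated (2 x 2 = 4).
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="accuracy",
    cv=3,
)
grid.fit(X, y)

# Randomized search: only n_iter settings are sampled from param_dist.
rand = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_dist,
    n_iter=4,
    scoring="accuracy",
    cv=3,
    random_state=0,
)
rand.fit(X, y)

print("grid:  ", grid.best_params_, grid.best_score_)
print("random:", rand.best_params_, rand.best_score_)
```

Raising n_iter samples more settings and usually improves the result at the cost of a longer run, whereas the cost of the grid search is fixed by the size of the grid.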

Parameters

X (array-like or sparse matrix, shape = [n_samples, n_features]) – The training input samples.

y (array-like, shape = [n_samples] or [n_samples, n_outputs]) – The target values, i.e. class labels.

Returns

out – A dictionary with the keys accuracy, i.e., the score achieved by the selected model, and classifier, i.e., the name of that model.

Return type

dict of {str: int, str: str}
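The selection step can be illustrated with a short sketch: run one search per candidate classifier and keep the one with the highest mean cross-validated accuracy. The helper pick_best, the candidates dictionary and the grids are hypothetical names and values introduced for illustration only; the actual method may track more information than is shown here.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the training data (hypothetical).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical candidates: name -> (estimator, parameter grid).
candidates = {
    "SVM": (SVC(), {"C": [0.1, 1.0]}),
    "DecisionTree": (
        DecisionTreeClassifier(random_state=0),
        {"max_depth": [2, 4]},
    ),
}


def pick_best(X, y, candidates, cv=3):
    """Return the name and score of the best-scoring candidate."""
    best = {"accuracy": -1.0, "classifier": None}
    for name, (estimator, grid) in candidates.items():
        search = GridSearchCV(estimator, grid, scoring="accuracy", cv=cv)
        search.fit(X, y)
        # best_score_ is the mean cross-validated accuracy of the best setting.
        if search.best_score_ > best["accuracy"]:
            best = {"accuracy": float(search.best_score_), "classifier": name}
    return best


out = pick_best(X, y, candidates)
print(out)
```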

trainClassifier(X_train, y_train, model)

Build a classifier from the training set (X_train, y_train).

Parameters

X_train (array-like or sparse matrix, shape = [n_samples, n_features]) – The training input samples.

y_train (array-like, shape = [n_samples] or [n_samples, n_outputs]) – The target values, i.e. class labels.

model (classifier object) – An instance of a classifier.

Returns

The trained classifier.

Return type

classifier object
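For a scikit-learn style estimator, training amounts to calling fit and returning the fitted object. The following is a minimal sketch under that assumption; train_classifier is a hypothetical stand-in rather than the library's implementation, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier


def train_classifier(X_train, y_train, model):
    """Fit the given classifier on the training set and return it."""
    model.fit(X_train, y_train)
    return model


# Synthetic binary classification data (hypothetical).
X_train, y_train = make_classification(
    n_samples=100, n_features=6, random_state=0
)

trained = train_classifier(
    X_train, y_train, ExtraTreesClassifier(n_estimators=10, random_state=0)
)
print(trained.classes_)
```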

testClassifier(X_test, y_test, model)

Evaluate a classifier on a testing set (X_test, y_test).

Parameters

X_test (array-like or sparse matrix, shape = [n_samples, n_features]) – The testing input samples.

y_test (array-like, shape = [n_samples] or [n_samples, n_outputs]) – The target values, i.e. class labels.

model (classifier object) – A trained classifier.

Returns

Returns the computed metrics, i.e., accuracy, precision, recall and f1, for the specified model on the test dataset.

Return type

tuple of (float, float, float, float)
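The four metrics can be computed with scikit-learn's metric functions, as in the sketch below. evaluate_classifier is a hypothetical helper, the data is synthetic, and the binary averaging that scikit-learn uses by default for precision, recall and f1 is an assumption about the setup, not something this page specifies.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def evaluate_classifier(X_test, y_test, model):
    """Return (accuracy, precision, recall, f1) on the test set."""
    y_pred = model.predict(X_test)
    return (
        accuracy_score(y_test, y_pred),
        precision_score(y_test, y_pred),
        recall_score(y_test, y_pred),
        f1_score(y_test, y_pred),
    )


# Synthetic binary data split into training and testing parts (hypothetical).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
metrics = evaluate_classifier(X_test, y_test, model)
print(metrics)
```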
