features

class interlinking.features.Features

This class loads the dataset and the frequent terms, and builds the features used as input to the supported classification groups:

  • basic: similarity features based on basic similarity measures.

  • basic_sorted: similarity features based on sorted versions of the basic similarity measures used in the basic group.

  • lgm: similarity features based on variations of LGM-Sim similarity measures.
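The difference between the basic and basic_sorted groups can be illustrated with a minimal sketch. This is not the library's implementation. It assumes that "sorted" means ordering the whitespace-separated tokens of each toponym alphabetically before comparing them, and it uses difflib.SequenceMatcher from the Python standard library as a stand-in for the actual similarity measures. The helper names ratio and sorted_ratio are hypothetical.

```python
from difflib import SequenceMatcher


def ratio(s1, s2):
    """Stand-in basic similarity in [0, 1]; not one of the library's measures."""
    return SequenceMatcher(None, s1.lower(), s2.lower()).ratio()


def sorted_ratio(s1, s2):
    """Same measure, applied after sorting each string's tokens alphabetically.

    The alphabetical token sort is an assumption about what "sorted" means.
    """
    a = " ".join(sorted(s1.lower().split()))
    b = " ".join(sorted(s2.lower().split()))
    return ratio(a, b)


# Two toponyms that differ only in word order.
basic = ratio("Hotel Athens Plaza", "Athens Plaza Hotel")
basic_sorted = sorted_ratio("Hotel Athens Plaza", "Athens Plaza Hotel")
print(basic, basic_sorted)
```

Under these assumptions, the sorted variant treats the two strings as identical because their tokens match once reordered, while the plain measure penalizes the different word order.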

See also

compute_features()

Details on the metrics each classification group implements.

build()

Build the features selected by the classification_method parameter and return the pair (fX, y) as ndarrays of floats.

Returns

  • fX (ndarray) – The computed features that will be used as input to ML classifiers.

  • y (ndarray) – Binary labels {True, False} to train the classifiers.

compute_features(s1, s2, sorted=True, lgm_sims=True)

Depending on the group assigned to the classification_method parameter, this method builds a vector of features from the corresponding feature groups (basic, basic_sorted, lgm).

Parameters
  • s1, s2 (str) – Input toponyms.

  • sorted (bool, optional) – If True, build features for both the basic and basic_sorted groups; if False, build features for the basic group only.

  • lgm_sims (bool, optional) – Whether to build features for the lgm group.

Returns

A list (vector) of features.

Return type

list
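The way the two flags select feature groups can be sketched as follows. This is a hypothetical illustration, not the library's code: compute_features_sketch, _sim, _sort_tokens and _token_overlap are invented names, each group is reduced to a single stand-in score (the real groups are described as using several measures each), and the token-overlap score is only a placeholder for the LGM-Sim variations, which are not reproduced here.

```python
from difflib import SequenceMatcher


def _sim(a, b):
    # Stand-in for a basic similarity measure.
    return SequenceMatcher(None, a, b).ratio()


def _sort_tokens(s):
    # Assumed meaning of "sorted": tokens in alphabetical order.
    return " ".join(sorted(s.split()))


def _token_overlap(a, b):
    # Placeholder for the lgm group (Jaccard overlap of tokens), NOT LGM-Sim.
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def compute_features_sketch(s1, s2, sorted=True, lgm_sims=True):
    """Return a list of features; the flags mirror the documented parameters."""
    s1, s2 = s1.lower(), s2.lower()
    features = [_sim(s1, s2)]  # basic group: always built
    if sorted:  # also build the basic_sorted group
        features.append(_sim(_sort_tokens(s1), _sort_tokens(s2)))
    if lgm_sims:  # also build the lgm group
        features.append(_token_overlap(s1, s2))
    return features


full = compute_features_sketch("Hotel Athens Plaza", "Athens Plaza Hotel")
basic_only = compute_features_sketch(
    "Hotel Athens Plaza", "Athens Plaza Hotel", sorted=False, lgm_sims=False
)
print(len(full), len(basic_only))
```

In this sketch the length of the returned list depends on the flags, so any consumer of the vector needs to be called with the same flag values used when the training features were built.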
