languagechange.models.change package

Computes the moving average of a timeseries.

Parameters:

ts (np.array) – a timeseries.
k (int) – the window (k timesteps to the left and k to the right).

Returns:

the moving average of the timeseries (not including endpoints).
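The windowing described above can be illustrated with a minimal sketch. It assumes a centered window of 2k + 1 values and drops the k positions at each end, where the window would be incomplete. The function name `moving_average_sketch` is hypothetical and is not part of languagechange; the library's own implementation may differ in detail.

```python
import numpy as np


def moving_average_sketch(ts, k):
    """Centered moving average over 2k + 1 values (hypothetical helper).

    Assumption: the k positions at each end are dropped because the
    window would reach outside the series there.
    """
    ts = np.asarray(ts, dtype=float)
    window = 2 * k + 1
    if k < 0 or window > ts.size:
        raise ValueError("k must be >= 0 and 2k + 1 must not exceed len(ts)")
    # A uniform kernel makes the convolution an arithmetic mean;
    # mode="valid" keeps only positions where the full window fits.
    kernel = np.ones(window) / window
    return np.convolve(ts, kernel, mode="valid")


smoothed = moving_average_sketch([1.0, 2.0, 3.0, 4.0, 5.0], k=1)
```

Under this assumption, a series of length n yields n - 2k smoothed values, so any time labels would need to be trimmed by k entries on each side to stay aligned.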

class languagechange.models.change.timeseries.TimeSeries(embs=None, series=None, change_metric=None, timeseries_type=None, k=1, time_labels=None, clustering_algorithm=None, distance_metric='cosine')

Bases: object

Parameters:

embs (List[numpy.array])
series (numpy.array)
timeseries_type (str)
time_labels (numpy.array | List)

compute_from_embeddings(embs, change_metric, timeseries_type, k=1, time_labels=None, clustering_algorithm=None, distance_metric='cosine')

Parameters:

embs ([np.array]) – a list of embeddings, each element of the list contains embeddings from one time period.
change_metric (str|object) – the metric to use when comparing embeddings from different time periods (should be one of the classes in languagechange.models.change.metrics).
timeseries_type (str) – the kind of timeseries to construct. One of [‘compare_to_first’, ‘compare_to_last’, ‘consecutive’, ‘moving_average’].
k (int) – the window size, if timeseries_type is ‘moving_average’ (k timesteps to the left and k to the right).
time_labels (np.array|list) – labels for the x axis of the timeseries.
clustering_algorithm – the clustering algorithm to use when PJSD is the change metric, e.g. one of the algorithms in scikit-learn or in languagechange.
distance_metric (str) – the distance metric to use when computing change scores.

Returns:

series (np.array) – the final timeseries.
ts (np.array) – the time values/labels for each value in the final timeseries.
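To make the timeseries_type options concrete, the sketch below pairs up time periods for ‘compare_to_first’, ‘compare_to_last’ and ‘consecutive’ and scores each pair. The pairing rules are inferred from the option names and are an assumption, not a statement of how languagechange implements them. The helpers `apd_sketch` and `build_series_sketch` are hypothetical, and an average pairwise cosine distance stands in for the change metric.

```python
import numpy as np


def apd_sketch(a, b):
    """Average pairwise cosine distance between rows of a and rows of b.

    Hypothetical stand-in for a change metric; not the library's class.
    """
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    return float(np.mean(1.0 - a_n @ b_n.T))


def build_series_sketch(embs, timeseries_type):
    """Score pairs of time periods (assumed pairing, see lead-in)."""
    last = len(embs) - 1
    if timeseries_type == "compare_to_first":
        pairs = [(0, i) for i in range(1, len(embs))]
    elif timeseries_type == "compare_to_last":
        pairs = [(i, last) for i in range(last)]
    elif timeseries_type == "consecutive":
        pairs = [(i, i + 1) for i in range(last)]
    else:
        raise ValueError(f"unsupported timeseries_type: {timeseries_type}")
    return np.array([apd_sketch(embs[i], embs[j]) for i, j in pairs])


# Three toy time periods with one 2-dimensional embedding each.
periods = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]]), np.array([[1.0, 0.0]])]
consecutive = build_series_sketch(periods, "consecutive")
```

In this sketch every type yields one value fewer than the number of time periods, because each value describes a pair of periods.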

languagechange.models.change.widid module

class languagechange.models.change.widid.WiDiD(algorithm=<class 'languagechange.models.meaning.clustering.APosterioriaffinityPropagation'>, metric='cosine', **args)

Bases: object

A class that implements WiDiD (https://github.com/FrancescoPeriti/WiDiD).

compute_scores(embs_list, timeseries_type='consecutive', k=1, change_metric='apd', time_labels=None)

Performs a-posteriori affinity propagation (APP) clustering and computes the semantic change as the APD (or another metric) between the prototype embeddings in clusters of different time periods.

Parameters:

embs_list ([np.array]) – a list of embeddings for a target word, where each element is embeddings of one time period.
timeseries_type (str) – the type of timeseries (see usage in languagechange.models.change.timeseries).
k (int) – the window size, if moving average (see usage in languagechange.models.change.timeseries).
change_metric (str) – the change metric (e.g. ‘apd’) to use (see usage in languagechange.models.change.timeseries).
time_labels (np.array|list) – labels for the x axis of the timeseries (see usage in languagechange.models.change.timeseries).

Returns:

labels ([np.array]) – the labels for each embedding in each time period.
prot_embs ([np.array]) – a list of matrices encoding the prototype (average) embedding of each cluster in each time period.
change_scores (TimeSeries) – a timeseries (languagechange.models.change.timeseries.TimeSeries) containing the degree of change between the embeddings in different time periods.
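The prototype embeddings mentioned above are described as per-cluster averages. The following minimal sketch shows how such a matrix could be derived from one time period's embeddings and cluster labels. The helper `prototype_embeddings_sketch` is hypothetical and only illustrates the averaging step; it does not perform the clustering itself and is not the library's code.

```python
import numpy as np


def prototype_embeddings_sketch(embs, labels):
    """Return one row per cluster: the mean of that cluster's embeddings.

    Hypothetical helper. Rows are ordered by sorted cluster label.
    """
    embs = np.asarray(embs, dtype=float)
    labels = np.asarray(labels)
    # np.unique returns the sorted distinct labels; a boolean mask
    # selects the rows belonging to each cluster before averaging.
    return np.stack([embs[labels == c].mean(axis=0) for c in np.unique(labels)])


# One time period: three embeddings assigned to two clusters.
period_embs = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 0.0]])
period_labels = [0, 0, 1]
prototypes = prototype_embeddings_sketch(period_embs, period_labels)
```

Applying this to each time period in turn would give a list of prototype matrices, matching the shape described for prot_embs.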
