transformers_domain_adaptation.data_selection.metrics.similarity
Similarity metrics for data selection introduced by Ruder and Plank.
The functions here were adapted and vectorized from those in the authors’ repo.
-
transformers_domain_adaptation.data_selection.metrics.similarity.
jensen_shannon_similarity
(repr1, repr2) Calculate similarity based on Jensen-Shannon divergence.
https://en.wikipedia.org/wiki/Jensen%E2%80%93Shannon_divergence
- Parameters:
repr1 (numpy.ndarray) –
repr2 (numpy.ndarray) –
- Return type:
numpy.ndarray
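
As an illustration of the quantity involved, the following is a minimal NumPy sketch and not the library's implementation. The name `js_similarity_sketch`, the base-2 logarithm, and the choice of "1 minus the divergence" as the similarity are assumptions made here; inputs are assumed to be probability distributions along the last axis.

```python
import numpy as np


def js_similarity_sketch(p, q):
    """Hypothetical helper: 1 - Jensen-Shannon divergence (base 2), last axis."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; entries with a == 0 contribute 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = np.where(a > 0, a * np.log2(a / b), 0.0)
        return terms.sum(axis=-1)

    return 1.0 - 0.5 * (kl(p, m) + kl(q, m))


same = js_similarity_sketch([0.5, 0.5], [0.5, 0.5])      # identical inputs
disjoint = js_similarity_sketch([1.0, 0.0], [0.0, 1.0])  # no shared support
```

With a base-2 logarithm the divergence lies in [0, 1], so under these assumptions identical distributions score 1 and distributions with disjoint support score 0.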
-
transformers_domain_adaptation.data_selection.metrics.similarity.
renyi_similarity
(repr1, repr2, alpha=0.99) Calculate similarity based on Rényi divergence.
https://en.wikipedia.org/wiki/R%C3%A9nyi_entropy#R.C3.A9nyi_divergence
- Parameters:
repr1 (numpy.ndarray) –
repr2 (numpy.ndarray) –
alpha (float) –
- Return type:
numpy.ndarray
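
For reference, the Rényi divergence of order alpha can be sketched as below. This is an assumption-laden illustration, not the library's code: `renyi_divergence_sketch` is a hypothetical name, it returns the raw divergence (how the library turns it into a similarity is not stated here), and it assumes `q` is strictly positive wherever `p` is.

```python
import numpy as np


def renyi_divergence_sketch(p, q, alpha=0.99):
    """Hypothetical helper: D_alpha(p || q) along the last axis (natural log)."""
    if alpha == 1:
        raise ValueError("alpha must differ from 1 (that limit is the KL divergence)")
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # sum_i p_i^alpha * q_i^(1 - alpha)
    total = np.sum(np.power(p, alpha) * np.power(q, 1.0 - alpha), axis=-1)
    return np.log(total) / (alpha - 1.0)


d_same = renyi_divergence_sketch([0.5, 0.5], [0.5, 0.5])
d_alpha2 = renyi_divergence_sketch([0.5, 0.5], [0.25, 0.75], alpha=2.0)
```

A value of alpha close to 1, such as the default 0.99, makes the result approach the Kullback-Leibler divergence.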
-
transformers_domain_adaptation.data_selection.metrics.similarity.
cosine_similarity
(repr1, repr2) Calculate cosine similarity (https://en.wikipedia.org/wiki/Cosine_similarity).
- Parameters:
repr1 (numpy.ndarray) –
repr2 (numpy.ndarray) –
- Return type:
numpy.ndarray
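
A minimal sketch of the cosine computation, assuming vectors are laid out along the last axis. `cosine_similarity_sketch` is a hypothetical stand-in and may differ from the library's function in how it treats zero vectors or batched input.

```python
import numpy as np


def cosine_similarity_sketch(a, b):
    """Hypothetical helper: cosine of the angle between a and b (last axis)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    dot = np.sum(a * b, axis=-1)
    norms = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return dot / norms  # undefined (nan) if either vector is all zeros


parallel = cosine_similarity_sketch([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity_sketch([1.0, 0.0], [0.0, 3.0])
opposite = cosine_similarity_sketch([1.0, 0.0], [-1.0, 0.0])
```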
-
transformers_domain_adaptation.data_selection.metrics.similarity.
euclidean_similarity
(repr1, repr2) Calculate similarity based on Euclidean distance.
https://en.wikipedia.org/wiki/Euclidean_distance
- Parameters:
repr1 (numpy.ndarray) –
repr2 (numpy.ndarray) –
- Return type:
numpy.ndarray
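
The underlying distance can be sketched as follows, here comparing several candidate rows against one target vector through NumPy broadcasting. `euclidean_distance_sketch` is a hypothetical name, the broadcasting behaviour is an assumption about typical vectorized use, and the sketch returns a distance; the mapping from distance to similarity used by the library is not stated here.

```python
import numpy as np


def euclidean_distance_sketch(a, b):
    """Hypothetical helper: L2 distance along the last axis."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sqrt(np.sum((a - b) ** 2, axis=-1))


candidates = np.array([[0.0, 0.0], [3.0, 4.0]])  # one row per candidate
target = np.array([3.0, 4.0])                    # broadcast against every row
distances = euclidean_distance_sketch(candidates, target)
```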
-
transformers_domain_adaptation.data_selection.metrics.similarity.
variational_similarity
(repr1, repr2) Calculate similarity based on L1 / Manhattan distance.
https://en.wikipedia.org/wiki/Taxicab_geometry
- Parameters:
repr1 (numpy.ndarray) –
repr2 (numpy.ndarray) –
- Return type:
numpy.ndarray
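
A short sketch of the L1 distance, using the hypothetical name `l1_distance_sketch`. For probability distributions, half of this value is the total variation distance, which is where the "variational" name comes from. As with the other sketches, this returns a distance and makes no claim about how the library converts it to a similarity.

```python
import numpy as np


def l1_distance_sketch(a, b):
    """Hypothetical helper: sum of absolute differences along the last axis."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.sum(np.abs(a - b), axis=-1)


p = np.array([0.5, 0.5])
q = np.array([1.0, 0.0])
l1 = l1_distance_sketch(p, q)
total_variation = 0.5 * l1  # valid when p and q each sum to 1
```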
-
transformers_domain_adaptation.data_selection.metrics.similarity.
bhattacharyya_similarity
(repr1, repr2) Calculate similarity based on Bhattacharyya distance.
https://en.wikipedia.org/wiki/Bhattacharyya_distance
- Parameters:
repr1 (numpy.ndarray) –
repr2 (numpy.ndarray) –
- Return type:
numpy.ndarray
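
The Bhattacharyya distance is the negative logarithm of the Bhattacharyya coefficient, the sum of the square roots of the elementwise products. The sketch below uses the hypothetical name `bhattacharyya_distance_sketch`, assumes probability distributions along the last axis, and returns the distance rather than whatever similarity the library derives from it.

```python
import numpy as np


def bhattacharyya_distance_sketch(p, q):
    """Hypothetical helper: -ln(sum_i sqrt(p_i * q_i)) along the last axis."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    coefficient = np.sum(np.sqrt(p * q), axis=-1)  # 1 for identical distributions
    return -np.log(coefficient)


d_identical = bhattacharyya_distance_sketch([0.5, 0.5], [0.5, 0.5])
d_partial = bhattacharyya_distance_sketch([0.5, 0.5], [1.0, 0.0])
```

Distributions with no overlapping support give a coefficient of 0 and therefore an infinite distance, so callers may want to guard against that case.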
-
transformers_domain_adaptation.data_selection.metrics.similarity.
similarity_func_factory
(metric) Return the corresponding similarity function based on the provided metric.
- Parameters:
metric (str) – Similarity metric
- Raises:
ValueError – If metric does not exist in SIMILARITY_FEATURES
- Return type:
Callable[[numpy.ndarray, numpy.ndarray], numpy.ndarray]