Type Parameters: T - type of the token

public final class TanimotoCoefficient<T> extends Object implements SetMetric<T>
similarity(a,b) = a·b / (||a|| * ||b||)
The cosine similarity is computed in the same way as the Tanimoto coefficient,
but unlike Tanimoto it takes the number of occurrences (cardinality) of each
entry into account. E.g. [hello, world] and [hello, world, hello, world] are
identical when compared with Tanimoto but dissimilar when the cosine
similarity is used.
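
As a rough illustration of the formula above, the sketch below evaluates it for plain `java.util.Set` arguments. It is a stand-alone approximation, not this class's implementation: the name `TanimotoSketch` is hypothetical, and the handling of empty sets is an assumption. It relies only on the fact that for sets viewed as binary vectors, a·b equals the size of the intersection and ||a|| equals the square root of the set size.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone sketch; not the library's implementation.
final class TanimotoSketch {

    // For sets viewed as binary vectors: a·b = |a ∩ b| and ||a|| = sqrt(|a|).
    static <T> float similarity(Set<T> a, Set<T> b) {
        // Assumption: two empty sets count as identical, and an empty set
        // shares nothing with a non-empty one.
        if (a.isEmpty() && b.isEmpty()) return 1.0f;
        if (a.isEmpty() || b.isEmpty()) return 0.0f;

        // Copy a so that neither argument is modified.
        Set<T> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        return (float) (intersection.size() / (Math.sqrt(a.size()) * Math.sqrt(b.size())));
    }

    public static void main(String[] args) {
        List<String> once = Arrays.asList("hello", "world");
        List<String> twice = Arrays.asList("hello", "world", "hello", "world");
        List<String> other = Arrays.asList("hello", "there");

        // Converting the token lists to sets discards the repeated entries.
        System.out.println(similarity(new HashSet<>(once), new HashSet<>(twice)));
        System.out.println(similarity(new HashSet<>(once), new HashSet<>(other)));
    }
}
```

Because both token lists collapse to the same two-element set, the first comparison treats them as identical, which is the behaviour the paragraph above describes.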
This class is immutable and thread-safe.
See Also: CosineSimilarity, Wikipedia: Cosine similarity

| Constructor and Description |
|---|
| TanimotoCoefficient() |

| Modifier and Type | Method and Description |
|---|---|
| float | compare(Set<T> a, Set<T> b) Measures the similarity between sets a and b. |
| String | toString() |

public float compare(Set<T> a, Set<T> b)
Description copied from interface: SetMetric
Results are undefined if a and b are sets based on
different equivalence relations (as HashSet, TreeSet, and
the keySet of an IdentityHashMap all are).
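
To see why mixed equivalence relations are a problem, consider the following sketch. It does not call this class; the helper `sharedCount` and the class name `EquivalenceMismatchDemo` are hypothetical and only illustrate how a membership-based intersection can depend on which set answers `contains`.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo; the helper below is not part of the library.
final class EquivalenceMismatchDemo {

    // Counts the elements of `left` that `right` reports as contained.
    static <T> int sharedCount(Set<T> left, Set<T> right) {
        int count = 0;
        for (T element : left) {
            if (right.contains(element)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // HashSet compares elements with equals/hashCode.
        Set<String> byEquals = new HashSet<>(Arrays.asList("hello", "world"));

        // This TreeSet compares elements with a case-insensitive comparator.
        Set<String> ignoringCase = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        ignoringCase.addAll(Arrays.asList("HELLO", "WORLD"));

        // The size of the "intersection" depends on the direction of the lookup.
        System.out.println(sharedCount(byEquals, ignoringCase)); // 2
        System.out.println(sharedCount(ignoringCase, byEquals)); // 0
    }
}
```

Since the overlap is no longer well defined, any similarity derived from it is not meaningful either; passing two sets of the same kind avoids the issue.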
Copyright © 2014–2016. All rights reserved.