Class Element<T>

  • All Implemented Interfaces:
    Matchable

    public class Element<T>
    extends Object
    implements Matchable

    This class represents the "value" against which matches are run.

    Configurable attributes

    • type - The ElementType for the value. This determines the functions applied at the different steps of the match.
    • weight - Used in the scoring function to increase the Document score for an Element. Default is 1.0 for all elements.
    • threshold - Score above which elements are considered a match. Default is 0.3.
    • neighborhoodRange - Relevant for the NEAREST_NEIGHBORS MatchType. Defines how close a value must be to be considered a match. Default is 0.9.
    • preProcessFunction - Function to pre-process the value. If this is not set, the function defined in ElementType is used.
    • tokenizerFunction - Function to break values into tokens. If this is not set, the function defined in ElementType is used.
    • matchType - MatchType used. If this is not set, the type defined in ElementType is used.
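
The fallback rule for preProcessFunction, tokenizerFunction and matchType, and the threshold check, can be illustrated with a minimal sketch. It does not use the library's classes: SketchType, resolvePreProcess and isMatch are hypothetical stand-ins invented for this illustration, and the "strictly above" comparison is an assumption based on the wording of the threshold attribute.

```java
// Hypothetical stand-ins for illustration only; these are NOT the library's classes.
final class AttributeFallbackSketch {

    /** Stand-in for ElementType: each type carries its own default pre-processing function. */
    enum SketchType {
        NAME(s -> s.trim().toLowerCase(java.util.Locale.ROOT)),
        TEXT(s -> s.trim());

        final java.util.function.Function<String, String> defaultPreProcess;

        SketchType(java.util.function.Function<String, String> defaultPreProcess) {
            this.defaultPreProcess = defaultPreProcess;
        }
    }

    /** Default threshold quoted in the attribute list above. */
    static final double DEFAULT_THRESHOLD = 0.3;

    /** Use the explicitly configured function if present, otherwise the type's default. */
    static java.util.function.Function<String, String> resolvePreProcess(
            SketchType type,
            java.util.function.Function<String, String> override) {
        return override != null ? override : type.defaultPreProcess;
    }

    /** Assumption: a score must be strictly above the threshold to count as a match. */
    static boolean isMatch(double score, double threshold) {
        return score > threshold;
    }

    public static void main(String[] args) {
        // No override set: the type's default function is applied.
        System.out.println(resolvePreProcess(SketchType.NAME, null).apply("  Alice "));
        // Override set: it takes precedence over the type's default.
        System.out.println(resolvePreProcess(SketchType.NAME, s -> s.toUpperCase(java.util.Locale.ROOT)).apply("bob"));
        System.out.println(isMatch(0.5, DEFAULT_THRESHOLD));
    }
}
```

The same "explicit value wins, otherwise ask the type" pattern would apply to the tokenizer function and the match type.
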
    • Constructor Detail

      • Element

        public Element(ElementType type,
                       String variance,
                       T value,
                       double weight,
                       double threshold,
                       double neighborhoodRange,
                       java.util.function.Function<T,T> preProcessFunction,
                       java.util.function.Function<Element<T>,java.util.stream.Stream<Token>> tokenizerFunction,
                       MatchType matchType)
    • Method Detail

      • getValue

        public T getValue()
      • getThreshold

        public double getThreshold()
      • getNeighborhoodRange

        public double getNeighborhoodRange()
      • getDocument

        public Document getDocument()
      • setDocument

        public void setDocument(Document document)
      • setPreProcessedValue

        public void setPreProcessedValue(T preProcessedValue)
      • getPreProcessFunction

        public java.util.function.Function<T,T> getPreProcessFunction()
      • getPreProcessedValue

        public T getPreProcessedValue()
      • getTokenizerFunction

        public java.util.function.Function<Element<T>,java.util.stream.Stream<Token>> getTokenizerFunction()
      • getMatchType

        public MatchType getMatchType()
      • getScore

        public double getScore(Integer matchingCount,
                               Element other)
      • getChildCount

        public long getChildCount​(Matchable other)
        Returns the maximum number of tokens present in either of the two Elements being matched. For Elements that do not have a balanced set of tokens, this can push the score down.
        Specified by:
        getChildCount in interface Matchable
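
Why an unbalanced token set lowers the score can be seen in a small sketch. It is not the library's implementation: ChildCountSketch, its whitespace tokenizer, and the formula "matching tokens divided by the larger token count" are assumptions made for illustration.

```java
// Hypothetical sketch; the tokenizer and the scoring formula are assumptions,
// not the library's actual implementation.
final class ChildCountSketch {

    /** Counts whitespace-separated tokens; an empty or blank value has zero tokens. */
    static long tokenCount(String value) {
        String trimmed = value.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** The larger of the two token counts, mirroring the "max number of tokens" description. */
    static long childCount(String a, String b) {
        return Math.max(tokenCount(a), tokenCount(b));
    }

    /** Assumed score: matching tokens divided by the larger token count. */
    static double score(int matchingCount, String a, String b) {
        long count = childCount(a, b);
        return count == 0 ? 0.0 : (double) matchingCount / count;
    }

    public static void main(String[] args) {
        // Balanced: both values have two tokens and both tokens match.
        System.out.println(score(2, "James Parker", "Parker James"));
        // Unbalanced: one shared token, but the larger value has two tokens.
        System.out.println(score(1, "James Parker", "James"));
    }
}
```

Under these assumptions, "James" against "James Parker" scores 1 / 2 = 0.5 even though every token of the shorter value matched, because the divisor is the larger token count.
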
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object