Class CharWormSet

java.lang.Object
com.carrotsearch.hppc.CharWormSet
All Implemented Interfaces:
Accountable, CharCollection, CharContainer, CharLookupContainer, CharSet, Preallocable, Cloneable, Iterable<CharCursor>

@Generated(date="2021-12-15T09:45:39+0100", value="KTypeWormSet.java") public class CharWormSet extends Object implements CharLookupContainer, CharSet, Preallocable, Cloneable, Accountable
A hash set of chars, implemented using Worm Hashing strategy.

This strategy is appropriate for a medium-sized set (fewer than 2M keys). Adding keys takes more time because the set maintains chains of keys that share the same hash. In return, lookups stay fast even when the set is heavily loaded or hashes are clustered. On average it takes slightly more memory than CharHashSet: its storage is heavier, but the load factor is higher (it varies around 80%), so the set enlarges later.

  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected int
    iterationSeed
    Seed used to ensure the hash iteration order differs from one iteration to another.
    char[]
    keys
    The array holding keys.
    byte[]
    next
    abs(next[i])=offset to next chained entry index.
    protected int
    size
    Set size (number of entries).
  • Constructor Summary

    Constructors
    Constructor
    Description
    CharWormSet()
    New instance with sane defaults.
    CharWormSet(int expectedElements)
    New instance with the provided defaults.
    CharWormSet(CharContainer container)
    Creates a new instance from all elements of another container.
  • Method Summary

    Modifier and Type
    Method
    Description
    boolean
    add(char key)
    Adds key to the set.
    final int
    addAll(char... elements)
    Adds all elements from the given list (vararg) to this set.
    int
    addAll(CharContainer container)
    Adds all elements from the given CharContainer to this set.
    int
    addAll(Iterable<? extends CharCursor> iterable)
    Adds all elements from the given iterable to this set.
    protected void
    allocateBuffers(int capacity)
     
    void
    clear()
    Removes all elements from this collection.
    CharWormSet
    clone()
    Clones this set.
    boolean
    contains(char key)
    Lookup a given element in the container.
    void
    ensureCapacity(int expectedElements)
    Ensure this container can hold at least the given number of elements without resizing its buffers.
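    boolean
    equals(Object o)
     
    <T extends CharPredicate>
    T
    forEach(T predicate)
    Applies a predicate to container elements as long as the predicate returns true.
    <T extends CharProcedure>
    T
    forEach(T procedure)
    Applies a procedure to all container elements.
    static CharWormSet
    from(char... elements)
    Create a set from a variable number of arguments or an array of char.
    int
    hashCode()
     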
    protected int
    hashKey(char key)
     
    boolean
    indexExists(int index)
     
    char
    indexGet(int index)
    Returns the exact value of the existing key.
    void
    indexInsert(int index, char key)
    Inserts a key for an index that is not present in the set.
    int
    indexOf(char key)
    Returns a logical "index" of a given key that can be used to speed up follow-up logic in certain scenarios (conditional logic).
    void
    indexRemove(int index)
    Removes a key at an index previously acquired from indexOf(char).
    char
    indexReplace(int index, char equivalentKey)
    Replaces the existing equivalent key with the given one and returns any previous value stored for that key.
    boolean
    isEmpty()
    Shortcut for size() == 0.
    Iterator<CharCursor>
    iterator()
    Returns an iterator to a cursor traversing the collection.
    protected int
    nextIterationSeed()
    Provides the next iteration seed used to build the iteration starting slot and offset increment.
    long
    ramBytesAllocated()
    Allocated memory estimation.
    long
    ramBytesUsed()
    Bytes that are actually in use.
    void
    release()
    Removes all elements from the collection and additionally releases any internal buffers.
    boolean
    remove(char key)
    An alias for the (preferred) removeAll(char).
    int
    removeAll(char key)
    Removes all occurrences of key from this collection.
    int
    removeAll(CharContainer other)
    Removes all keys present in a given container.
    int
    removeAll(CharLookupContainer c)
    Default implementation uses a predicate for removal.
    int
    removeAll(CharPredicate predicate)
    Removes all elements in this collection for which the given predicate returns true.
    int
    retainAll(CharLookupContainer c)
    Default implementation uses a predicate for retaining.
    int
    retainAll(CharPredicate predicate)
    Default implementation redirects to CharCollection.removeAll(CharPredicate) and negates the predicate.
    int
    size()
    Return the current number of elements in this container.
    char[]
    toArray()
    Default implementation of copying to an array.
    String
    toString()
    Convert the contents of this container to a human-friendly string.
    String
    visualizeKeyDistribution(int characters)
    Visually depict the distribution of keys.

    Methods inherited from class java.lang.Object

    finalize, getClass, notify, notifyAll, wait, wait, wait

    Methods inherited from interface com.carrotsearch.hppc.CharCollection

    removeAll, retainAll, retainAll

    Methods inherited from interface com.carrotsearch.hppc.CharContainer

    toArray

    Methods inherited from interface java.lang.Iterable

    forEach, spliterator
  • Field Details

    • keys

      public char[] keys
      The array holding keys.
    • next

      public byte[] next
      abs(next[i])=offset to next chained entry index.

      next[i]=0 for free bucket.

      The offset is always forward, and the array is considered circular, meaning that an entry at the end of the array may point to an entry at the beginning with a positive offset.

      Because the offset is always forward, the sign of next[i] is free to encode the head or tail of a chain: next[i] > 0 for the head-of-chain entry (within [1,WormUtil.maxOffset(int)]), next[i] < 0 for the subsequent tail-of-chain entries (within [-WormUtil.maxOffset(int),-1]). For the last entry in the chain, abs(next[i])=WormUtil.END_OF_CHAIN.
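      The encoding described for next can be made concrete with a small decoder. This is a minimal sketch, not the library's code: the constant END_OF_CHAIN = 127 is an assumed stand-in for WormUtil.END_OF_CHAIN (its real value is not stated on this page), and the class and method names are hypothetical. It treats the array as circular and follows abs(next[i]) until the end-of-chain marker.

```java
import java.util.ArrayList;
import java.util.List;

final class NextChainSketch {
    // Assumed stand-in for WormUtil.END_OF_CHAIN; the real value is not given on this page.
    static final int END_OF_CHAIN = 127;

    /** True if slot i holds no entry (next[i] == 0). */
    static boolean isFree(byte[] next, int i) {
        return next[i] == 0;
    }

    /** True if slot i is the first entry of a chain (next[i] > 0). */
    static boolean isHead(byte[] next, int i) {
        return next[i] > 0;
    }

    /** Collects the slot indexes of the chain starting at the given head slot. */
    static List<Integer> chain(byte[] next, int head) {
        if (!isHead(next, head)) {
            throw new IllegalArgumentException("not a head-of-chain slot: " + head);
        }
        List<Integer> slots = new ArrayList<>();
        int i = head;
        while (true) {
            slots.add(i);
            int offset = Math.abs(next[i]);      // the sign only marks head vs. tail
            if (offset == END_OF_CHAIN) {
                break;                           // last entry of the chain
            }
            i = (i + offset) % next.length;      // forward offset, circular array
        }
        return slots;
    }

    public static void main(String[] args) {
        // 8 slots: head at 6 (offset +3 wraps to 1), tail at 1 (offset 2 leads to 3), last entry at 3.
        byte[] next = {0, -2, 0, -END_OF_CHAIN, 0, 0, 3, 0};
        System.out.println(chain(next, 6)); // prints [6, 1, 3]
    }
}
```

      In the sample array the chain starts at slot 6, wraps around to slot 1 and ends at slot 3; all other slots are free.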

    • size

      protected int size
      Set size (number of entries).
    • iterationSeed

      protected int iterationSeed
      Seed used to ensure the hash iteration order differs from one iteration to another.
  • Constructor Details

    • CharWormSet

      public CharWormSet()
      New instance with sane defaults.
    • CharWormSet

      public CharWormSet(int expectedElements)
      New instance with the provided defaults.

      There is no load factor parameter as this set enlarges automatically. In practice the load factor varies around 80% (between 75% and 90%). The load factor is 100% for tiny sets.

      Parameters:
      expectedElements - The expected number of elements. The capacity of the set is calculated based on it.
    • CharWormSet

      public CharWormSet(CharContainer container)
      Creates a new instance from all elements of another container.
  • Method Details

    • from

      public static CharWormSet from(char... elements)
      Create a set from a variable number of arguments or an array of char. The elements are copied from the argument to the internal buffer.
    • clone

      public CharWormSet clone()
      Clones this set. The cloning operation is efficient because it copies the internal arrays directly, without having to put elements in the cloned set. The cloned set has the same elements and the same capacity as this set.
      Overrides:
      clone in class Object
      Returns:
      A shallow copy of this set.
    • size

      public int size()
      Return the current number of elements in this container. The time for calculating the container's size may take O(n) time, although implementing classes should try to maintain the current size and return in constant time.
      Specified by:
      size in interface CharContainer
    • isEmpty

      public boolean isEmpty()
      Shortcut for size() == 0.
      Specified by:
      isEmpty in interface CharContainer
    • contains

      public boolean contains(char key)
      Lookup a given element in the container. This operation has no speed guarantees (may be linear with respect to the size of this container).
      Specified by:
      contains in interface CharContainer
      Specified by:
      contains in interface CharLookupContainer
      Returns:
      Returns true if this container has an element equal to key.
    • add

      public boolean add(char key)
      Adds key to the set.
      Specified by:
      add in interface CharSet
      Returns:
      Returns true if this element was not part of the set before. Returns false if an equal element is already part of the set; in that case the existing element is not replaced with the argument.
    • addAll

      public final int addAll(char... elements)
      Adds all elements from the given list (vararg) to this set.
      Returns:
      Returns the number of elements actually added as a result of this call (not previously present in the set).
    • addAll

      public int addAll(CharContainer container)
      Adds all elements from the given CharContainer to this set.
      Specified by:
      addAll in interface CharSet
      Returns:
      Returns the number of elements actually added as a result of this call (not previously present in the set).
    • addAll

      public int addAll(Iterable<? extends CharCursor> iterable)
      Adds all elements from the given iterable to this set.
      Returns:
      Returns the number of elements actually added as a result of this call (not previously present in the set).
    • remove

      public boolean remove(char key)
      An alias for the (preferred) removeAll(char).
    • removeAll

      public int removeAll(char key)
      Removes all occurrences of key from this collection.
      Specified by:
      removeAll in interface CharCollection
      Parameters:
      key - Element to be removed from this collection, if present.
      Returns:
      The number of removed elements as a result of this call.
    • removeAll

      public int removeAll(CharContainer other)
      Removes all keys present in a given container.
      Returns:
      Returns the number of elements actually removed as a result of this call.
    • removeAll

      public int removeAll(CharPredicate predicate)
      Removes all elements in this collection for which the given predicate returns true.
      Specified by:
      removeAll in interface CharCollection
      Returns:
      Returns the number of removed elements.
    • forEach

      public <T extends CharProcedure> T forEach(T procedure)
      Applies a procedure to all container elements. Returns the argument (any subclass of CharProcedure). This lets the caller call methods of the argument by chaining the call (even if the argument is an anonymous type) to retrieve computed values, for example (IntContainer):
       int count = container.forEach(new IntProcedure() {
         int count; // this is a field declaration in an anonymous class.
       
         public void apply(int value) {
           count++;
         }
       }).count;
       
      Specified by:
      forEach in interface CharContainer
    • forEach

      public <T extends CharPredicate> T forEach(T predicate)
      Applies a predicate to container elements as long as the predicate returns true. The iteration is interrupted otherwise.
      Specified by:
      forEach in interface CharContainer
    • iterator

      public Iterator<CharCursor> iterator()
      Returns an iterator to a cursor traversing the collection. The order of traversal is not defined. More than one cursor may be active at a time. The behavior of iterators is undefined if structural changes are made to the underlying collection.

      The iterator is implemented as a cursor and it returns the same cursor instance on every call to Iterator.next() (to avoid boxing of primitive types). To read the current value (or its index), use the cursor's public fields. An example is shown below.

       for (CharCursor c : container) {
         System.out.println("index=" + c.index + " value=" + c.value);
       }
       
      Specified by:
      iterator in interface CharContainer
      Specified by:
      iterator in interface Iterable<CharCursor>
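      Because the same cursor instance is returned from every call to Iterator.next(), callers should copy the value out of the cursor rather than keep a reference to the cursor itself. The following self-contained sketch illustrates that behavior with a hypothetical stand-in Cursor class and a plain char array; it does not use the library's CharCursor or CharWormSet.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class CursorReuseSketch {
    /** Hypothetical stand-in for a primitive cursor: public mutable fields, no boxing. */
    static final class Cursor {
        public int index;
        public char value;
    }

    /** Iterates over a char array, returning the same Cursor instance from every next(). */
    static Iterable<Cursor> over(char[] data) {
        return () -> new Iterator<Cursor>() {
            private final Cursor cursor = new Cursor();
            private int pos = 0;

            @Override
            public boolean hasNext() {
                return pos < data.length;
            }

            @Override
            public Cursor next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                cursor.index = pos;
                cursor.value = data[pos++];
                return cursor; // always the same object, with updated fields
            }
        };
    }

    /** Intended use: read the primitive value from the cursor on each step. */
    static String collectValues(char[] data) {
        StringBuilder sb = new StringBuilder();
        for (Cursor c : over(data)) {
            sb.append(c.value);
        }
        return sb.toString();
    }

    /** Pitfall: cursors retained across steps are all one and the same object. */
    static int distinctCursorObjects(char[] data) {
        List<Cursor> retained = new ArrayList<>();
        for (Cursor c : over(data)) {
            retained.add(c);
        }
        // Cursor does not override equals, so distinct() compares identities.
        return (int) retained.stream().distinct().count();
    }

    public static void main(String[] args) {
        char[] data = {'a', 'b', 'c'};
        System.out.println(collectValues(data));         // prints abc
        System.out.println(distinctCursorObjects(data)); // prints 1
    }
}
```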
    • clear

      public void clear()
      Removes all elements from this collection.
      Specified by:
      clear in interface CharCollection
    • release

      public void release()
      Removes all elements from the collection and additionally releases any internal buffers. Typically, if the object is to be reused, a simple CharCollection.clear() should be a better alternative since it'll avoid reallocation.
      Specified by:
      release in interface CharCollection
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • hashKey

      protected int hashKey(char key)
    • indexOf

      public int indexOf(char key)
      Returns a logical "index" of a given key that can be used to speed up follow-up logic in certain scenarios (conditional logic). The semantics of "indexes" are not strictly defined. Indexes may not be (and typically won't be) contiguous. The index is valid only between modifications (it will not be affected by read-only operations).
      Parameters:
      key - The key to locate in the set.
      Returns:
      A non-negative value of the logical "index" of the key in the set or a negative value if the key did not exist.
    • indexExists

      public boolean indexExists(int index)
      Parameters:
      index - The index of a given key, as returned from indexOf(char).
      Returns:
      Returns true if the index corresponds to an existing key or false otherwise. This is equivalent to checking whether the index is a non-negative value (existing keys) or a negative value (non-existing keys).
    • indexGet

      public char indexGet(int index)
      Returns the exact value of the existing key. This method makes sense for sets of objects which define a custom key-equality relationship.
      Parameters:
      index - The index of an existing key.
      Returns:
      Returns the equivalent key currently stored in the set.
      Throws:
      AssertionError - If assertions are enabled and the index does not correspond to an existing key.
    • indexReplace

      public char indexReplace(int index, char equivalentKey)
      Replaces the existing equivalent key with the given one and returns any previous value stored for that key.
      Parameters:
      index - The index of an existing key.
      equivalentKey - The key to put in the set as a replacement. Must be equivalent to the key currently stored at the provided index.
      Returns:
      Returns the previous key stored in the set.
      Throws:
      AssertionError - If assertions are enabled and the index does not correspond to an existing key.
    • indexInsert

      public void indexInsert(int index, char key)
      Inserts a key for an index that is not present in the set. This method may help in avoiding double recalculation of the key's hash.
      Parameters:
      index - The index of a previously non-existing key, as returned from indexOf(char).
      Throws:
      AssertionError - If assertions are enabled and the index corresponds to an existing key.
    • indexRemove

      public void indexRemove(int index)
      Removes a key at an index previously acquired from indexOf(char).
      Parameters:
      index - The index of the key to remove, as returned from indexOf(char).
      Throws:
      AssertionError - If assertions are enabled and the index does not correspond to an existing key.
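      The index methods are meant to be combined: look a key up once with indexOf, test the result with indexExists, then act on the same index without hashing the key again. The sketch below models that protocol on a toy, fixed-capacity, linear-probing set written only for illustration. It is not worm hashing and not the library's implementation; ToyCharSet, addIfAbsent and the use of the bitwise complement to encode a free slot are assumptions (this page only states that a missing key yields a negative index).

```java
final class IndexProtocolSketch {
    /** Toy fixed-capacity, linear-probing set of chars; a stand-in, not worm hashing. */
    static final class ToyCharSet {
        private static final int CAPACITY = 16; // power of two, so "& MASK" wraps around
        private static final int MASK = CAPACITY - 1;
        private final char[] keys = new char[CAPACITY];
        private final boolean[] used = new boolean[CAPACITY];
        private int size;

        /** Non-negative slot if the key exists, otherwise the complement of a free slot. */
        int indexOf(char key) {
            int slot = (key * 31) & MASK;
            while (used[slot]) {
                if (keys[slot] == key) {
                    return slot;
                }
                slot = (slot + 1) & MASK;
            }
            return ~slot;
        }

        boolean indexExists(int index) {
            return index >= 0;
        }

        char indexGet(int index) {
            assert index >= 0 : "index must point at an existing key";
            return keys[index];
        }

        void indexInsert(int index, char key) {
            assert index < 0 : "index must come from a failed lookup";
            if (size == CAPACITY - 1) {
                // Keep one slot free so that indexOf always terminates.
                throw new IllegalStateException("toy set is full");
            }
            int slot = ~index;
            keys[slot] = key;
            used[slot] = true;
            size++;
        }

        int size() {
            return size;
        }
    }

    /** Adds the key if absent using a single lookup; returns true if it was added. */
    static boolean addIfAbsent(ToyCharSet set, char key) {
        int index = set.indexOf(key);
        if (set.indexExists(index)) {
            return false;
        }
        set.indexInsert(index, key); // reuses the slot found by indexOf
        return true;
    }

    public static void main(String[] args) {
        ToyCharSet set = new ToyCharSet();
        System.out.println(addIfAbsent(set, 'a')); // prints true
        System.out.println(addIfAbsent(set, 'a')); // prints false
        System.out.println(set.size());            // prints 1
    }
}
```

      As with the real set, an index obtained this way is only meaningful until the next modification.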
    • toString

      public String toString()
      Convert the contents of this container to a human-friendly string.
    • ensureCapacity

      public void ensureCapacity(int expectedElements)
      Ensure this container can hold at least the given number of elements without resizing its buffers.
      Specified by:
      ensureCapacity in interface Preallocable
      Parameters:
      expectedElements - The total number of elements, inclusive.
    • visualizeKeyDistribution

      public String visualizeKeyDistribution(int characters)
      Visually depict the distribution of keys.
      Specified by:
      visualizeKeyDistribution in interface CharSet
      Parameters:
      characters - The number of characters to "squeeze" the entire buffer into.
      Returns:
      Returns a sequence of characters where '.' depicts an empty fragment of the internal buffer, 'X' depicts full or nearly full capacity within the buffer's range, and digits from 1 to 9 depict fill levels in between.
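      To illustrate the kind of output described here, the following sketch squeezes an occupancy array into a given number of characters. It is only an approximation under stated assumptions: the boolean[] occupancy input, the class name and the exact thresholds (digit = tenths of the slice that are occupied, 'X' only when the slice is completely full) are hypothetical and may differ from what the library does.

```java
final class KeyDistributionSketch {
    /**
     * Maps each of the given number of characters to a slice of the buffer:
     * '.' for an empty slice, '1'..'9' for partial fill, 'X' for a full slice.
     */
    static String visualize(boolean[] occupied, int characters) {
        StringBuilder out = new StringBuilder(characters);
        for (int c = 0; c < characters; c++) {
            // Slice boundaries; long arithmetic avoids int overflow on large buffers.
            int from = (int) ((long) c * occupied.length / characters);
            int to = (int) ((long) (c + 1) * occupied.length / characters);
            int used = 0;
            for (int i = from; i < to; i++) {
                if (occupied[i]) {
                    used++;
                }
            }
            if (used == 0) {
                out.append('.'); // also covers an empty slice (to == from)
            } else {
                // Assumed scale: tenths of the slice in use, never below 1 once non-empty.
                int level = Math.max(1, used * 10 / (to - from));
                out.append(level >= 10 ? 'X' : (char) ('0' + level));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        boolean[] occupied = new boolean[20];
        for (int i = 0; i < 15; i++) {
            occupied[i] = true; // first half full, second half 50% full
        }
        System.out.println(visualize(occupied, 2)); // prints X5
    }
}
```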
    • ramBytesAllocated

      public long ramBytesAllocated()
      Allocated memory estimation.
      Specified by:
      ramBytesAllocated in interface Accountable
      Returns:
      RAM allocated, in bytes.
    • ramBytesUsed

      public long ramBytesUsed()
      Bytes that are actually in use.
      Specified by:
      ramBytesUsed in interface Accountable
      Returns:
      RAM used, in bytes.
    • allocateBuffers

      protected void allocateBuffers(int capacity)
    • nextIterationSeed

      protected int nextIterationSeed()
      Provides the next iteration seed used to build the iteration starting slot and offset increment. This method does not need to be synchronized; what matters is that each thread gets a sequence of varying seeds.
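      One way a seed can yield both a starting slot and an offset increment is sketched below. Everything in it is an assumption made for illustration: the mixing step, the way the start and increment are derived, and the power-of-two capacity are not taken from the library. The point it demonstrates is that an odd increment over a power-of-two buffer visits every slot exactly once, while a changing seed changes the order.

```java
final class IterationSeedSketch {
    private int iterationSeed = 1; // any non-zero start value works for this mixing step

    /** Assumed mixing step: multiply by an odd constant and fold the high bits down. */
    int nextIterationSeed() {
        int h = iterationSeed * 0x9E3779B9;
        iterationSeed = h ^ (h >>> 15);
        return iterationSeed;
    }

    /** Returns a seed-dependent order that visits each slot of the buffer once. */
    int[] iterationOrder(int capacity) {
        if (Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        int seed = nextIterationSeed();
        int mask = capacity - 1;
        int slot = seed & mask;                    // starting slot
        int increment = ((seed >>> 8) & mask) | 1; // odd, hence coprime with the capacity
        int[] order = new int[capacity];
        for (int i = 0; i < capacity; i++) {
            order[i] = slot;
            slot = (slot + increment) & mask;      // circular step
        }
        return order;
    }

    public static void main(String[] args) {
        IterationSeedSketch sketch = new IterationSeedSketch();
        // Two consecutive traversals of an 8-slot buffer, each with a fresh seed.
        System.out.println(java.util.Arrays.toString(sketch.iterationOrder(8)));
        System.out.println(java.util.Arrays.toString(sketch.iterationOrder(8)));
    }
}
```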
    • removeAll

      public int removeAll(CharLookupContainer c)
      Default implementation uses a predicate for removal.
      Specified by:
      removeAll in interface CharCollection
      Returns:
      Returns the number of removed elements.
    • retainAll

      public int retainAll(CharLookupContainer c)
      Default implementation uses a predicate for retaining.
      Specified by:
      retainAll in interface CharCollection
      Returns:
      Returns the number of removed elements.
    • retainAll

      public int retainAll(CharPredicate predicate)
      Default implementation redirects to CharCollection.removeAll(CharPredicate) and negates the predicate.
      Specified by:
      retainAll in interface CharCollection
      Returns:
      Returns the number of removed elements.
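      The redirection described for retainAll(CharPredicate) amounts to removing every element the predicate rejects. The sketch below shows that relationship on a toy container; CharPred and ToyChars are hypothetical stand-ins defined only for this example and are not the library's CharPredicate or CharWormSet.

```java
final class RetainAllSketch {
    /** Hypothetical stand-in for a primitive char predicate. */
    interface CharPred {
        boolean apply(char value);
    }

    /** Toy container of chars backed by a StringBuilder (no boxing). */
    static final class ToyChars {
        private final StringBuilder items = new StringBuilder();

        ToyChars(String initial) {
            items.append(initial);
        }

        /** Removes all elements for which the predicate returns true; returns the removed count. */
        int removeAll(CharPred predicate) {
            int n = items.length();
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char c = items.charAt(i);
                if (!predicate.apply(c)) {
                    items.setCharAt(kept++, c); // compact survivors to the front
                }
            }
            items.setLength(kept);
            return n - kept;
        }

        /** Keeps only matching elements by removing those the predicate rejects. */
        int retainAll(CharPred predicate) {
            return removeAll(c -> !predicate.apply(c));
        }

        @Override
        public String toString() {
            return items.toString();
        }
    }

    public static void main(String[] args) {
        ToyChars chars = new ToyChars("a1b2c3");
        int removed = chars.retainAll(Character::isDigit);
        System.out.println(removed + " removed, left: " + chars); // prints 3 removed, left: 123
    }
}
```

      Note that, as in the documented methods, the return value counts the elements that were removed, not the ones that were retained.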
    • toArray

      public char[] toArray()
      Default implementation of copying to an array.
      Specified by:
      toArray in interface CharContainer