Class RleEncoder<T extends java.lang.Comparable<T>>

  • Type Parameters:
    T - the type of the values to encode with RLE
    Direct Known Subclasses:
    IntRleEncoder, LongRleEncoder

    public abstract class RleEncoder<T extends java.lang.Comparable<T>>
    extends Encoder
    Encodes values using a combination of run length encoding and bit packing, according to the following grammar:
    
     rle-bit-packing-hybrid: <length> <bitwidth> <encoded-data>
     length := length of the <bitwidth> <encoded-data> in bytes stored as 4 bytes little endian
     bitwidth := bitwidth for all encoded data in <encoded-data>
     encoded-data := <run>*
     run := <bit-packed-run> | <rle-run>
     bit-packed-run := <bit-packed-header> <lastBitPackedNum> <bit-packed-values>
     bit-packed-header := varint-encode(<bit-pack-count> << 1 | 1)
     lastBitPackedNum := number of useful values in the last bit-packed group, which may hold
     fewer than 8 values
     bit-packed-values := the values, bit-packed using <bitwidth> bits each
     rle-run := <rle-header> <repeated-value>
     rle-header := varint-encode( (number of times repeated) << 1)
     repeated-value := value that is repeated, using a fixed-width of round-up-to-next-byte(bit-width)
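
The two run headers in the grammar above differ only in their lowest bit, which a reader can use to tell the run types apart. A minimal sketch of the header arithmetic follows. The class and method names are hypothetical, and the varint layout (7 bits per byte, low bits first, continuation bit set on every byte but the last) is an assumption, since the grammar only says "varint-encode".

```java
import java.io.ByteArrayOutputStream;

class RleHeaderSketch {

  // Assumed varint layout: 7 payload bits per byte, least significant group
  // first, high bit set on every byte except the last one.
  static void writeUnsignedVarInt(int value, ByteArrayOutputStream out) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
  }

  // bit-packed-header: <bit-pack-count> << 1 | 1 (lowest bit is 1)
  static int bitPackedHeader(int bitPackCount) {
    return (bitPackCount << 1) | 1;
  }

  // rle-header: (number of times repeated) << 1 (lowest bit is 0)
  static int rleHeader(int repeatCount) {
    return repeatCount << 1;
  }

  public static void main(String[] args) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeUnsignedVarInt(bitPackedHeader(3), out); // 7 fits in one byte
    writeUnsignedVarInt(rleHeader(100), out);     // 200 needs two bytes
    for (byte b : out.toByteArray()) {
      System.out.printf("%02X ", b & 0xFF);       // prints "07 C8 01 "
    }
    System.out.println();
  }
}
```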
     
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected int bitPackedGroupCount
      the number of bit-packed groups; it is saved in the header.
      protected int bitWidth
      the bit width used for bit-packing and rle.
      protected T[] bufferedValues
      array to buffer values temporarily.
      protected java.io.ByteArrayOutputStream byteCache
      output stream to buffer <bitwidth> <encoded-data>.
      protected java.util.List<byte[]> bytesBuffer
      bit-packed bytes are written to the OutputStream all at once.
      protected TSFileConfig config  
      protected boolean isBitPackRun
      flag indicating the encoding mode: false for RLE, true for bit-packing.
      protected boolean isBitWidthSaved  
      protected int numBufferedValues
      the number of values buffered in the array.
      protected T preValue
      previous value written, used to detect repeated values.
      protected int repeatCount
      how many times the currently buffered value has occurred.
      protected java.util.List<T> values
      all values are saved in this list so that their bit width can be calculated.
    • Constructor Summary

      Constructors 
      Modifier Constructor Description
      protected RleEncoder()
      constructor.
    • Method Summary

      All Methods Instance Methods Abstract Methods Concrete Methods 
      Modifier and Type Method Description
      protected abstract void clearBuffer()
      Clear all unused values in bufferedValues by setting them to 0.
      protected abstract void convertBuffer()  
      void encode​(boolean value, java.io.ByteArrayOutputStream out)  
      void encode​(double value, java.io.ByteArrayOutputStream out)  
      void encode​(float value, java.io.ByteArrayOutputStream out)  
      void encode​(int value, java.io.ByteArrayOutputStream out)  
      void encode​(long value, java.io.ByteArrayOutputStream out)  
      void encode​(short value, java.io.ByteArrayOutputStream out)  
      void encode​(java.math.BigDecimal value, java.io.ByteArrayOutputStream out)  
      void encode​(Binary value, java.io.ByteArrayOutputStream out)  
      protected void encodeValue​(T value)
      Encode T value using rle or bit-packing.
      protected void endPreviousBitPackedRun​(int lastBitPackedNum)
      End a bit-packing run and write all bit-packing groups to the OutputStream. Bit-packing format: [header][lastBitPackedNum][bit-packing group]+, where the [bit-packing group]+ entries are saved in List<byte[]> bytesBuffer.
      void flush​(java.io.ByteArrayOutputStream out)
      Write all values buffered in cache to OutputStream.
      protected void reset()  
      void writeOrAppendBitPackedRun()
      Start a bit-packing run: transform values to bytes and buffer them in the cache.
      protected abstract void writeRleRun()
      Write bytes to OutputStream using rle.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • values

        protected java.util.List<T> values
        all values are saved in this list so that their bit width can be calculated.
      • bitWidth

        protected int bitWidth
        the bit width used for bit-packing and rle.
      • repeatCount

        protected int repeatCount
        how many times the currently buffered value has occurred.
      • bitPackedGroupCount

        protected int bitPackedGroupCount
        the number of bit-packed groups; it is saved in the header.
      • numBufferedValues

        protected int numBufferedValues
        the number of values buffered in the array.
      • bytesBuffer

        protected java.util.List<byte[]> bytesBuffer
        bit-packed bytes are written to the OutputStream all at once; until then, they are held in this list.
      • isBitPackRun

        protected boolean isBitPackRun
        flag indicating the encoding mode: false for RLE, true for bit-packing.
      • preValue

        protected T preValue
        previous value written, used to detect repeated values.
      • bufferedValues

        protected T[] bufferedValues
        array to buffer values temporarily.
      • isBitWidthSaved

        protected boolean isBitWidthSaved
      • byteCache

        protected java.io.ByteArrayOutputStream byteCache
        output stream to buffer <bitwidth> <encoded-data>.
    • Constructor Detail

      • RleEncoder

        protected RleEncoder()
        constructor.
    • Method Detail

      • reset

        protected void reset()
      • flush

        public void flush​(java.io.ByteArrayOutputStream out)
                   throws java.io.IOException
        Write all values buffered in cache to OutputStream.
        Specified by:
        flush in class Encoder
        Parameters:
        out - the ByteArrayOutputStream to write to
        Throws:
        java.io.IOException - cannot flush to OutputStream
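
According to the class-level grammar, the encoded output starts with <length>, stored as 4 bytes little endian, followed by <bitwidth> <encoded-data>. The sketch below shows only that length-prefix framing. The class and method names are hypothetical, and treating the payload as an opaque byte array (standing in for the contents of byteCache) is an assumption rather than a description of what flush actually does.

```java
import java.io.ByteArrayOutputStream;

class RleFrameSketch {

  // Writes the payload length as 4 bytes, least significant byte first,
  // then the payload itself.
  static void writeFrame(byte[] payload, ByteArrayOutputStream out) {
    int len = payload.length;
    out.write(len & 0xFF);
    out.write((len >>> 8) & 0xFF);
    out.write((len >>> 16) & 0xFF);
    out.write((len >>> 24) & 0xFF);
    out.write(payload, 0, payload.length);
  }

  public static void main(String[] args) {
    // Placeholder payload standing in for <bitwidth> <encoded-data>.
    byte[] payload = {0x03, 0x10, 0x05};
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeFrame(payload, out);
    for (byte b : out.toByteArray()) {
      System.out.printf("%02X ", b & 0xFF); // prints "03 00 00 00 03 10 05 "
    }
    System.out.println();
  }
}
```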
      • writeRleRun

        protected abstract void writeRleRun()
                                     throws java.io.IOException
        Write bytes to OutputStream using rle. rle format: [header][value], where header is (repeat count) << 1.
        Throws:
        java.io.IOException - cannot write RLE run
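
As an illustration of the [header][value] layout, the sketch below writes one RLE run for an int value. It is not the code of any subclass: the names are hypothetical, the varint layout and the little-endian byte order of the repeated value are assumptions, and only the header formula and the round-up-to-next-byte width come from the class-level grammar.

```java
import java.io.ByteArrayOutputStream;

class RleRunSketch {

  // Assumed varint layout: 7 bits per byte, low bits first.
  static void writeUnsignedVarInt(int value, ByteArrayOutputStream out) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
  }

  // round-up-to-next-byte(bit-width): bytes needed to hold bitWidth bits.
  static int paddedByteWidth(int bitWidth) {
    return (bitWidth + 7) / 8;
  }

  static void writeRleRun(int repeatCount, int value, int bitWidth,
                          ByteArrayOutputStream out) {
    writeUnsignedVarInt(repeatCount << 1, out);  // rle-header, lowest bit 0
    int width = paddedByteWidth(bitWidth);
    for (int i = 0; i < width; i++) {
      out.write((value >>> (8 * i)) & 0xFF);     // byte order is an assumption
    }
  }

  public static void main(String[] args) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeRleRun(5, 300, 9, out); // value 300 repeated 5 times, 9-bit width
    for (byte b : out.toByteArray()) {
      System.out.printf("%02X ", b & 0xFF);      // prints "0A 2C 01 "
    }
    System.out.println();
  }
}
```

A 9-bit width rounds up to 2 bytes, so the run takes 3 bytes in total regardless of how many times the value repeats, as long as the header fits in one varint byte.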
      • writeOrAppendBitPackedRun

        public void writeOrAppendBitPackedRun()
        Start a bit-packing run: transform values to bytes and buffer them in the cache.
      • endPreviousBitPackedRun

        protected void endPreviousBitPackedRun​(int lastBitPackedNum)
        End a bit-packing run and write all bit-packing groups to the OutputStream. Bit-packing format: [header][lastBitPackedNum][bit-packing group]+, where the [bit-packing group]+ entries are saved in List<byte[]> bytesBuffer.
        Parameters:
        lastBitPackedNum - the number of useful values in the last bit-packing group, which may hold fewer than 8
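
The [header][lastBitPackedNum][bit-packing group]+ layout can be sketched as follows. Everything here is illustrative: the names are hypothetical, and the bit order inside a group (most significant bit first), the single byte used for lastBitPackedNum, and the varint layout are all assumptions. Only the header formula and the grouping of 8 values come from this page.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

class BitPackedRunSketch {

  // Assumed varint layout: 7 bits per byte, low bits first.
  static void writeUnsignedVarInt(int value, ByteArrayOutputStream out) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80);
      value >>>= 7;
    }
    out.write(value);
  }

  // Packs 8 values of bitWidth bits each into bitWidth bytes,
  // most significant bit first (the bit order is an assumption).
  static byte[] pack8(int[] values, int bitWidth) {
    byte[] packed = new byte[bitWidth];
    int bitPos = 0;
    for (int i = 0; i < 8; i++) {
      for (int b = bitWidth - 1; b >= 0; b--) {
        if (((values[i] >>> b) & 1) != 0) {
          packed[bitPos >>> 3] |= (byte) (0x80 >>> (bitPos & 7));
        }
        bitPos++;
      }
    }
    return packed;
  }

  static void writeBitPackedRun(List<byte[]> groups, int lastBitPackedNum,
                                ByteArrayOutputStream out) {
    writeUnsignedVarInt((groups.size() << 1) | 1, out); // bit-packed-header
    out.write(lastBitPackedNum);                        // one byte: an assumption
    for (byte[] group : groups) {
      out.write(group, 0, group.length);
    }
  }

  public static void main(String[] args) {
    List<byte[]> groups = new ArrayList<>();
    groups.add(pack8(new int[] {0, 1, 2, 3, 4, 5, 6, 7}, 3));
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeBitPackedRun(groups, 8, out); // all 8 values in the last group are useful
    for (byte b : out.toByteArray()) {
      System.out.printf("%02X ", b & 0xFF); // prints "03 08 05 39 77 "
    }
    System.out.println();
  }
}
```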
      • encodeValue

        protected void encodeValue​(T value)
        Encode a T value using rle or bit-packing. The value may not be written to the OutputStream immediately.
        Parameters:
        value - the value to encode
      • clearBuffer

        protected abstract void clearBuffer()
        Clear all unused values in bufferedValues by setting them to 0.
      • convertBuffer

        protected abstract void convertBuffer()
      • encode

        public void encode​(boolean value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder
      • encode

        public void encode​(short value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder
      • encode

        public void encode​(int value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder
      • encode

        public void encode​(long value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder
      • encode

        public void encode​(float value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder
      • encode

        public void encode​(double value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder
      • encode

        public void encode​(Binary value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder
      • encode

        public void encode​(java.math.BigDecimal value,
                           java.io.ByteArrayOutputStream out)
        Overrides:
        encode in class Encoder