public class RecordReaderUtils extends Object
Modifier and Type | Class and Description |
---|---|
static class | RecordReaderUtils.ByteBufferAllocatorPool |
Constructor and Description |
---|
RecordReaderUtils() |
Modifier and Type | Method and Description |
---|---|
static void | addEntireStreamToRanges(long offset, long length, DiskRangeList.CreateHelper list, boolean doMergeBuffers) |
static void | addRgFilteredStreamToRanges(OrcProto.Stream stream, boolean[] includedRowGroups, boolean isCompressed, OrcProto.RowIndex index, OrcProto.ColumnEncoding encoding, OrcProto.Type type, int compressionSize, boolean hasNull, long offset, long length, DiskRangeList.CreateHelper list, boolean doMergeBuffers) |
static DataReader | createDefaultDataReader(DataReaderProperties properties) |
static long | estimateRgEndOffset(boolean isCompressed, boolean isLast, long nextGroupOffset, long streamLength, int bufferSize) |
static boolean[] | findPresentStreamsByColumn(List<OrcProto.Stream> streamList, List<OrcProto.Type> types) |
static int | getIndexPosition(OrcProto.ColumnEncoding.Kind columnEncoding, OrcProto.Type.Kind columnType, OrcProto.Stream.Kind streamType, boolean isCompressed, boolean hasNulls): Get the offset, within the column's index positions, at which the given stream starts. |
static boolean | isDictionary(OrcProto.Stream.Kind kind, OrcProto.ColumnEncoding encoding): Is this stream part of a dictionary? |
static DiskRangeList | planIndexReading(TypeDescription fileSchema, OrcProto.StripeFooter footer, boolean ignoreNonUtf8BloomFilter, boolean[] fileIncluded, boolean[] sargColumns, OrcFile.WriterVersion version, OrcProto.Stream.Kind[] bloomFilterKinds): Plans the list of disk ranges that the given stripe needs to read the indexes. |
static String | stringifyDiskRanges(DiskRangeList range): Build a string representation of a list of disk ranges. |
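As an illustration of the last summary entry, stringifyDiskRanges, the following is a minimal, self-contained sketch of turning a linked list of disk ranges into one string. The `Range` class, the `stringify` method, and the exact output format are hypothetical stand-ins chosen for this example; they are not the ORC `DiskRangeList` type or the library's actual output.

```java
class DiskRangeStringSketch {
  // Hypothetical stand-in for a disk range node: a byte range plus a link
  // to the next range in the list. Not the real DiskRangeList class.
  static final class Range {
    final long offset;
    final long end;
    Range next;

    Range(long offset, long end) {
      this.offset = offset;
      this.end = end;
    }

    @Override
    public String toString() {
      return "range start: " + offset + " end: " + end;
    }
  }

  // Walk the list from the given head and join each range's text form.
  // The bracket and brace layout is an assumption made for this sketch.
  static String stringify(Range range) {
    StringBuilder sb = new StringBuilder("[");
    for (Range r = range; r != null; r = r.next) {
      if (r != range) {
        sb.append(", ");
      }
      sb.append('{').append(r).append('}');
    }
    return sb.append(']').toString();
  }

  public static void main(String[] args) {
    Range head = new Range(0, 100);
    head.next = new Range(200, 350);
    // prints [{range start: 0 end: 100}, {range start: 200 end: 350}]
    System.out.println(stringify(head));
  }
}
```

A null head yields `[]` in this sketch, which keeps log statements safe when no ranges were planned.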
public static DiskRangeList planIndexReading(TypeDescription fileSchema, OrcProto.StripeFooter footer, boolean ignoreNonUtf8BloomFilter, boolean[] fileIncluded, boolean[] sargColumns, OrcFile.WriterVersion version, OrcProto.Stream.Kind[] bloomFilterKinds)
fileSchema - the schema for the file
footer - the stripe footer
ignoreNonUtf8BloomFilter - should the reader ignore non-utf8 encoded bloom filters
fileIncluded - the columns (indexed by file columns) that should be read
sargColumns - true for the columns (indexed by file columns) that we need bloom filters for
version - the version of the software that wrote the file
bloomFilterKinds - (output) the stream kind of the bloom filters
public static DataReader createDefaultDataReader(DataReaderProperties properties)
public static boolean[] findPresentStreamsByColumn(List<OrcProto.Stream> streamList, List<OrcProto.Type> types)
public static void addEntireStreamToRanges(long offset, long length, DiskRangeList.CreateHelper list, boolean doMergeBuffers)
public static void addRgFilteredStreamToRanges(OrcProto.Stream stream, boolean[] includedRowGroups, boolean isCompressed, OrcProto.RowIndex index, OrcProto.ColumnEncoding encoding, OrcProto.Type type, int compressionSize, boolean hasNull, long offset, long length, DiskRangeList.CreateHelper list, boolean doMergeBuffers)
public static long estimateRgEndOffset(boolean isCompressed, boolean isLast, long nextGroupOffset, long streamLength, int bufferSize)
public static int getIndexPosition(OrcProto.ColumnEncoding.Kind columnEncoding, OrcProto.Type.Kind columnType, OrcProto.Stream.Kind streamType, boolean isCompressed, boolean hasNulls)
columnEncoding - the encoding of the column
columnType - the type of the column
streamType - the kind of the stream
isCompressed - is the file compressed
hasNulls - does the column have a PRESENT stream?
public static boolean isDictionary(OrcProto.Stream.Kind kind, OrcProto.ColumnEncoding encoding)
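The isDictionary signature takes a stream kind and a column encoding, which suggests the answer depends on both. The sketch below shows one plausible rule under stated assumptions: a DICTIONARY_DATA stream always belongs to the dictionary, and a LENGTH stream does only when the column is dictionary encoded. The nested enums are reduced, hypothetical stand-ins for `OrcProto.Stream.Kind` and `OrcProto.ColumnEncoding.Kind`, and the rule itself is an assumption for illustration, not a statement of the library's implementation.

```java
class DictionaryStreamSketch {
  // Hypothetical, reduced stand-ins for the ORC protobuf enums so the
  // example compiles without ORC on the classpath.
  enum StreamKind { PRESENT, DATA, LENGTH, DICTIONARY_DATA, SECONDARY }
  enum EncodingKind { DIRECT, DICTIONARY, DIRECT_V2, DICTIONARY_V2 }

  // Assumed rule: DICTIONARY_DATA is always dictionary content; LENGTH is
  // dictionary content only for dictionary-encoded columns.
  static boolean isDictionary(StreamKind kind, EncodingKind encoding) {
    boolean dictionaryEncoded =
        encoding == EncodingKind.DICTIONARY || encoding == EncodingKind.DICTIONARY_V2;
    return kind == StreamKind.DICTIONARY_DATA
        || (kind == StreamKind.LENGTH && dictionaryEncoded);
  }

  public static void main(String[] args) {
    // prints true
    System.out.println(isDictionary(StreamKind.LENGTH, EncodingKind.DICTIONARY_V2));
    // prints false
    System.out.println(isDictionary(StreamKind.LENGTH, EncodingKind.DIRECT_V2));
  }
}
```

Under this reading, a caller planning reads could fetch dictionary streams in full while filtering the remaining streams by row group.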
public static String stringifyDiskRanges(DiskRangeList range)
range - ranges to stringify
Copyright © 2013–2018 The Apache Software Foundation. All rights reserved.