| Package | Description |
|---|---|
| org.apache.tika.config | Tika configuration tools. |
| org.apache.tika.embedder | |
| org.apache.tika.extractor | Extraction of component documents. |
| org.apache.tika.fork | Forked parser. |
| org.apache.tika.parser | Tika parsers. |
| org.apache.tika.parser.digest | |
| org.apache.tika.parser.external | External parser process. |
| org.apache.tika.parser.external2 | |
| org.apache.tika.parser.multiple | |
| org.apache.tika.renderer | |
| org.apache.tika.sax | SAX utilities. |
| org.apache.tika.utils | Utilities. |
| Modifier and Type | Method and Description |
|---|---|
| static long | TikaTaskTimeout.getTimeoutMillis(ParseContext context, long defaultTimeoutMillis) |
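
The signature above suggests a lookup that falls back to a caller-supplied default when the context carries no timeout setting. The following is a minimal sketch of that pattern in plain Java. `MiniContext` and `TimeoutSetting` are hypothetical stand-ins written for this illustration, not Tika classes, and the real `TikaTaskTimeout` may behave differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a type-keyed context; not Tika's ParseContext.
class MiniContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    <T> T get(Class<T> key) {
        // Class.cast keeps the lookup type-safe; a missing entry yields null.
        return key.cast(entries.get(key));
    }
}

// Hypothetical per-task timeout setting stored in the context.
class TimeoutSetting {
    final long millis;

    TimeoutSetting(long millis) {
        this.millis = millis;
    }

    // Returns the timeout found in the context, or the default if none was set.
    static long getTimeoutMillis(MiniContext context, long defaultTimeoutMillis) {
        TimeoutSetting setting = context.get(TimeoutSetting.class);
        return setting == null ? defaultTimeoutMillis : setting.millis;
    }
}

class TimeoutDemo {
    public static void main(String[] args) {
        MiniContext context = new MiniContext();
        System.out.println(TimeoutSetting.getTimeoutMillis(context, 60_000L)); // prints 60000
        context.set(TimeoutSetting.class, new TimeoutSetting(5_000L));
        System.out.println(TimeoutSetting.getTimeoutMillis(context, 60_000L)); // prints 5000
    }
}
```

Keying entries by class means callers never need string names or casts, which is why a single context parameter can carry many unrelated settings.
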
| Modifier and Type | Method and Description |
|---|---|
| void | ExternalEmbedder.embed(Metadata metadata, InputStream inputStream, OutputStream outputStream, ParseContext context)<br>Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler. |
| void | Embedder.embed(Metadata metadata, InputStream originalStream, OutputStream outputStream, ParseContext context)<br>Embeds related document metadata from the given metadata object into the given output stream. |
| Set<MediaType> | ExternalEmbedder.getSupportedEmbedTypes(ParseContext context) |
| Set<MediaType> | Embedder.getSupportedEmbedTypes(ParseContext context)<br>Returns the set of media types supported by this embedder when used with the given parse context. |
| Modifier and Type | Method and Description |
|---|---|
| static EmbeddedDocumentExtractor | EmbeddedDocumentUtil.getEmbeddedDocumentExtractor(ParseContext context)<br>This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext. |
| static Parser | EmbeddedDocumentUtil.getStatelessParser(ParseContext context)<br>Utility function to get the Parser that was sent in to the ParseContext to handle embedded documents. |
| EmbeddedDocumentExtractor | ParsingEmbeddedDocumentExtractorFactory.newInstance(Metadata metadata, ParseContext parseContext) |
| EmbeddedDocumentExtractor | EmbeddedDocumentExtractorFactory.newInstance(Metadata metadata, ParseContext parseContext) |
| static Parser | EmbeddedDocumentUtil.tryToFindExistingLeafParser(Class clazz, ParseContext context)<br>Tries to find an existing parser within the ParseContext. |
| Constructor and Description |
|---|
| EmbeddedDocumentUtil(ParseContext context) |
| ParsingEmbeddedDocumentExtractor(ParseContext context) |
| Modifier and Type | Method and Description |
|---|---|
| Set<MediaType> | ForkParser.getSupportedTypes(ParseContext context) |
| void | ForkParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>This sends the objects to the server for parsing, and the server via the proxies acts on the handler as if it were updating it directly. |
| Modifier and Type | Method and Description |
|---|---|
| void | DigestingParser.Digester.digest(InputStream is, Metadata m, ParseContext parseContext)<br>Digests an InputStream and sets the appropriate value(s) in the metadata. |
| Map<MediaType,List<Parser>> | CompositeParser.findDuplicateParsers(ParseContext context)<br>Utility method that goes through all the component parsers and finds all media types for which more than one parser declares support. |
| protected Parser | DelegatingParser.getDelegateParser(ParseContext context)<br>Returns the parser instance to which parsing tasks should be delegated. |
| protected EncodingDetector | AbstractEncodingDetectorParser.getEncodingDetector(ParseContext parseContext)<br>Looks for an EncodingDetector in the ParseContext. |
| protected Parser | CompositeParser.getParser(Metadata metadata, ParseContext context) |
| Map<MediaType,Parser> | DefaultParser.getParsers(ParseContext context) |
| Map<MediaType,Parser> | CompositeParser.getParsers(ParseContext context) |
| Set<MediaType> | Parser.getSupportedTypes(ParseContext context)<br>Returns the set of media types supported by this parser when used with the given parse context. |
| Set<MediaType> | DelegatingParser.getSupportedTypes(ParseContext context) |
| Set<MediaType> | CryptoParser.getSupportedTypes(ParseContext context) |
| Set<MediaType> | ParserDecorator.getSupportedTypes(ParseContext context)<br>Delegates the method call to the decorated parser. |
| Set<MediaType> | CompositeParser.getSupportedTypes(ParseContext context) |
| Set<MediaType> | EmptyParser.getSupportedTypes(ParseContext context) |
| Set<MediaType> | RecursiveParserWrapper.getSupportedTypes(ParseContext context) |
| Set<MediaType> | ErrorParser.getSupportedTypes(ParseContext context) |
| Set<MediaType> | NetworkParser.getSupportedTypes(ParseContext context) |
| Set<MediaType> | RegexCaptureParser.getSupportedTypes(ParseContext context) |
| void | ParserPostProcessor.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>Forwards the call to the delegated parser and post-processes the results. |
| void | Parser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>Parses a document stream into a sequence of XHTML SAX events. |
| void | DelegatingParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>Looks up the delegate parser from the parsing context and delegates the parse operation to it. |
| void | CryptoParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
| void | ParserDecorator.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>Delegates the method call to the decorated parser. |
| void | CompositeParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>Delegates the call to the matching component parser. |
| void | EmptyParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
| void | RecursiveParserWrapper.parse(InputStream stream, ContentHandler recursiveParserWrapperHandler, Metadata metadata, ParseContext context) |
| void | ErrorParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
| void | AutoDetectParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
| void | NetworkParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
| void | RegexCaptureParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
| void | DigestingParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
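
The `findDuplicateParsers` entry above describes a small grouping algorithm: collect parsers per media type, then keep only the types claimed more than once. The sketch below shows one way to do that in plain Java. `NamedParser` and `DuplicateFinder` are hypothetical stand-ins that use strings in place of Tika's `Parser` and `MediaType`, so this illustrates the idea and is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a component parser: a name plus the media types it claims.
class NamedParser {
    final String name;
    final Set<String> supportedTypes;

    NamedParser(String name, Set<String> supportedTypes) {
        this.name = name;
        this.supportedTypes = supportedTypes;
    }
}

class DuplicateFinder {
    // Groups parser names by media type, then drops every type claimed by fewer than two parsers.
    static Map<String, List<String>> findDuplicates(List<NamedParser> parsers) {
        Map<String, List<String>> byType = new LinkedHashMap<>();
        for (NamedParser parser : parsers) {
            for (String type : parser.supportedTypes) {
                byType.computeIfAbsent(type, t -> new ArrayList<>()).add(parser.name);
            }
        }
        byType.values().removeIf(names -> names.size() < 2);
        return byType;
    }
}

class DuplicateDemo {
    public static void main(String[] args) {
        List<NamedParser> parsers = List.of(
                new NamedParser("pdfA", Set.of("application/pdf")),
                new NamedParser("pdfB", Set.of("application/pdf", "text/plain")),
                new NamedParser("html", Set.of("text/html")));
        System.out.println(DuplicateFinder.findDuplicates(parsers));
        // prints {application/pdf=[pdfA, pdfB]}
    }
}
```
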
| Constructor and Description |
|---|
| ParsingReader(Parser parser, InputStream stream, Metadata metadata, ParseContext context)<br>Creates a reader for the text content of the given binary stream with the given document metadata. |
| ParsingReader(Parser parser, InputStream stream, Metadata metadata, ParseContext context, Executor executor)<br>Creates a reader for the text content of the given binary stream with the given document metadata. |
| Modifier and Type | Method and Description |
|---|---|
| void | CompositeDigester.digest(InputStream is, Metadata m, ParseContext parseContext) |
| void | InputStreamDigester.digest(InputStream is, Metadata metadata, ParseContext parseContext) |
| Modifier and Type | Method and Description |
|---|---|
| Set<MediaType> | ExternalParser.getSupportedTypes(ParseContext context) |
| void | ExternalParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler. |
| Modifier and Type | Method and Description |
|---|---|
| Set<MediaType> | ExternalParser.getSupportedTypes(ParseContext context) |
| void | ExternalParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) |
| Modifier and Type | Method and Description |
|---|---|
| Set<MediaType> | AbstractMultipleParser.getSupportedTypes(ParseContext context) |
| void | AbstractMultipleParser.parse(InputStream stream, ContentHandlerFactory handlers, Metadata metadata, ParseContext context)<br>Deprecated. The ContentHandlerFactory override is still experimental and the method signature is subject to change before Tika 2.0. |
| void | AbstractMultipleParser.parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)<br>Processes the given stream through one or more parsers, resetting things between parsers as requested by policy. |
| protected abstract boolean | AbstractMultipleParser.parserCompleted(Parser parser, Metadata metadata, ContentHandler handler, ParseContext context, Exception exception)<br>Used to notify implementations that a Parser has finished or failed, and to allow them to decide whether to continue or abort further parsing. |
| protected boolean | FallbackParser.parserCompleted(Parser parser, Metadata metadata, ContentHandler handler, ParseContext context, Exception exception) |
| protected boolean | SupplementingParser.parserCompleted(Parser parser, Metadata metadata, ContentHandler handler, ParseContext context, Exception exception) |
| protected void | AbstractMultipleParser.parserPrepare(Parser parser, Metadata metadata, ParseContext context)<br>Used to allow implementations to prepare or change things before parsing occurs. |
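
The `parserCompleted` hook above lets a subclass decide, after each parser finishes or fails, whether the next parser should run. The sketch below models that control flow with plain Java types. `Step`, `CompletionPolicy`, and `MultiRunner` are hypothetical names, and the two policies are assumptions inferred from the class names `FallbackParser` and `SupplementingParser`, not behavior copied from Tika's source.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser that may fail.
interface Step {
    void run(List<String> output) throws Exception;
}

// Hypothetical policy hook modelled on parserCompleted:
// return true to continue with the next step, false to stop.
interface CompletionPolicy {
    boolean stepCompleted(Step step, Exception exception);
}

class MultiRunner {
    // Assumed policies for this sketch only.
    static final CompletionPolicy FALLBACK = (step, e) -> e != null; // stop at the first success
    static final CompletionPolicy SUPPLEMENT = (step, e) -> true;    // always run every step

    static List<String> runAll(List<Step> steps, CompletionPolicy policy) {
        List<String> output = new ArrayList<>();
        for (Step step : steps) {
            Exception failure = null;
            try {
                step.run(output);
            } catch (Exception e) {
                failure = e; // record the failure and let the policy decide what happens next
            }
            if (!policy.stepCompleted(step, failure)) {
                break;
            }
        }
        return output;
    }
}

class MultiDemo {
    public static void main(String[] args) {
        Step failing = out -> { throw new IllegalStateException("first step fails"); };
        Step second = out -> out.add("second");
        Step third = out -> out.add("third");
        List<Step> steps = List.of(failing, second, third);
        System.out.println(MultiRunner.runAll(steps, MultiRunner.FALLBACK));   // prints [second]
        System.out.println(MultiRunner.runAll(steps, MultiRunner.SUPPLEMENT)); // prints [second, third]
    }
}
```

Keeping the loop in one place and the decision in a callback is what allows different strategies to share the same driver code.
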
| Modifier and Type | Method and Description |
|---|---|
| Set<MediaType> | Renderer.getSupportedTypes(ParseContext context)<br>Returns the set of media types supported by this renderer when used with the given parse context. |
| Set<MediaType> | CompositeRenderer.getSupportedTypes(ParseContext context) |
| RenderResults | Renderer.render(InputStream is, Metadata metadata, ParseContext parseContext, RenderRequest... requests) |
| RenderResults | CompositeRenderer.render(InputStream is, Metadata metadata, ParseContext parseContext, RenderRequest... requests) |
| Modifier and Type | Method and Description |
|---|---|
| ContentHandler | ContentHandlerDecoratorFactory.decorate(ContentHandler contentHandler, Metadata metadata, ParseContext parseContext) |
| Constructor and Description |
|---|
| BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE type, int writeLimit, boolean throwOnWriteLimitReached, ParseContext parseContext) |
| WriteOutContentHandler(ContentHandler handler, int writeLimit, boolean throwOnWriteLimitReached, ParseContext parseContext)<br>The default is to throw a WriteLimitReachedException. |
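
The `writeLimit` and `throwOnWriteLimitReached` parameters above imply two ways of handling oversized output: fail loudly, or stop writing quietly. The sketch below illustrates both with a plain text sink. `LimitedWriter` and `LimitReachedException` are hypothetical classes for this example, and details such as treating a negative limit as unlimited are assumptions rather than documented Tika behavior.

```java
// Hypothetical exception for this sketch; Tika has its own WriteLimitReachedException.
class LimitReachedException extends RuntimeException {
    LimitReachedException(int limit) {
        super("write limit of " + limit + " characters reached");
    }
}

// Hypothetical stand-in for a size-capped text sink.
class LimitedWriter {
    private final StringBuilder buffer = new StringBuilder();
    private final int writeLimit; // negative means unlimited (assumption of this sketch)
    private final boolean throwOnWriteLimitReached;
    private boolean limitReached = false;

    LimitedWriter(int writeLimit, boolean throwOnWriteLimitReached) {
        this.writeLimit = writeLimit;
        this.throwOnWriteLimitReached = throwOnWriteLimitReached;
    }

    void write(String text) {
        if (limitReached) {
            return; // everything after the limit is ignored
        }
        if (writeLimit < 0 || buffer.length() + text.length() <= writeLimit) {
            buffer.append(text);
            return;
        }
        // Keep whatever still fits, then either fail loudly or truncate quietly.
        buffer.append(text, 0, writeLimit - buffer.length());
        limitReached = true;
        if (throwOnWriteLimitReached) {
            throw new LimitReachedException(writeLimit);
        }
    }

    boolean isLimitReached() {
        return limitReached;
    }

    String text() {
        return buffer.toString();
    }
}

class LimitDemo {
    public static void main(String[] args) {
        LimitedWriter quiet = new LimitedWriter(5, false);
        quiet.write("hello world");
        System.out.println(quiet.text() + " " + quiet.isLimitReached()); // prints hello true
    }
}
```

The quiet mode suits callers that only want a preview of the text, while the throwing mode makes truncation impossible to miss.
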
| Modifier and Type | Method and Description |
|---|---|
| static Document | XMLReaderUtils.buildDOM(InputStream is, ParseContext context)<br>This checks context for a user-specified DocumentBuilder. |
| static Document | XMLReaderUtils.buildDOM(Reader reader, ParseContext context)<br>This checks context for a user-specified DocumentBuilder. |
| static Future | ConcurrentUtils.execute(ParseContext context, Runnable runnable)<br>Executes a runnable using an ExecutorService from the ParseContext if possible. |
| static void | XMLReaderUtils.parseSAX(InputStream is, ContentHandler contentHandler, ParseContext context)<br>This checks context for a user-specified SAXParser. |
| static void | XMLReaderUtils.parseSAX(Reader reader, ContentHandler contentHandler, ParseContext context)<br>This checks context for a user-specified SAXParser. |
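
The `ConcurrentUtils.execute` entry above describes an "executor if available" pattern. Here is a minimal sketch of that idea using only `java.util.concurrent`. `ExecSketch` is a hypothetical class, a `null` executor stands in for "no ExecutorService found in the context", and the fallback to a fresh daemon thread is an assumption of this example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicInteger;

class ExecSketch {
    // A null executor stands in for "the context holds no ExecutorService".
    static Future<?> execute(ExecutorService executor, Runnable runnable) {
        if (executor != null) {
            return executor.submit(runnable);
        }
        // Fallback (assumption of this sketch): run the task on a fresh daemon thread.
        FutureTask<Void> task = new FutureTask<>(runnable, null);
        Thread thread = new Thread(task, "exec-sketch-fallback");
        thread.setDaemon(true);
        thread.start();
        return task;
    }
}

class ExecDemo {
    public static void main(String[] args) throws Exception {
        AtomicInteger counter = new AtomicInteger();

        // No executor supplied: the task runs on the fallback thread.
        ExecSketch.execute(null, counter::incrementAndGet).get();

        // Executor supplied: the task is submitted to it.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            ExecSketch.execute(pool, counter::incrementAndGet).get();
        } finally {
            pool.shutdown();
        }
        System.out.println(counter.get()); // prints 2
    }
}
```

Either path returns a `Future`, so callers can wait for completion or cancel the task without knowing which one was taken.
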
Copyright © 2007–2024 The Apache Software Foundation. All rights reserved.