| Package | Description |
|---|---|
| org.apache.tika |
Apache Tika.
|
| org.apache.tika.config |
Tika configuration tools.
|
| org.apache.tika.detect |
Media type detection.
|
| org.apache.tika.embedder | |
| org.apache.tika.exception |
Tika exception.
|
| org.apache.tika.extractor |
Extraction of component documents.
|
| org.apache.tika.fork |
Forked parser.
|
| org.apache.tika.io |
IO utilities.
|
| org.apache.tika.language |
Language detection.
|
| org.apache.tika.language.translate | |
| org.apache.tika.mime |
Media type information.
|
| org.apache.tika.parser |
Tika parsers.
|
| org.apache.tika.parser.external |
External parser process.
|
| org.apache.tika.sax |
SAX utilities.
|
| Modifier and Type | Method and Description |
|---|---|
String |
Tika.parseToString(File file)
Parses the given file and returns the extracted text content.
|
String |
Tika.parseToString(InputStream stream)
Parses the given document and returns the extracted text content.
|
String |
Tika.parseToString(InputStream stream,
Metadata metadata)
Parses the given document and returns the extracted text content.
|
String |
Tika.parseToString(InputStream stream,
Metadata metadata,
int maxLength)
Parses the given document and returns the extracted text content.
|
String |
Tika.parseToString(Path path)
Parses the file at the given path and returns the extracted text content.
|
String |
Tika.parseToString(URL url)
Parses the resource at the given URL and returns the extracted
text content.
|
| Constructor and Description |
|---|
TikaConfig()
Creates a default Tika configuration.
|
TikaConfig(Document document) |
TikaConfig(Document document,
ServiceLoader loader) |
TikaConfig(Element element) |
TikaConfig(Element element,
ClassLoader loader) |
TikaConfig(File file) |
TikaConfig(File file,
ServiceLoader loader) |
TikaConfig(InputStream stream) |
TikaConfig(Path path) |
TikaConfig(Path path,
ServiceLoader loader) |
TikaConfig(String file) |
TikaConfig(URL url) |
TikaConfig(URL url,
ClassLoader loader) |
TikaConfig(URL url,
ServiceLoader loader) |
| Constructor and Description |
|---|
AutoDetectReader(InputStream stream) |
AutoDetectReader(InputStream stream,
Metadata metadata) |
AutoDetectReader(InputStream stream,
Metadata metadata,
ServiceLoader loader) |
| Modifier and Type | Method and Description |
|---|---|
void |
ExternalEmbedder.embed(Metadata metadata,
InputStream inputStream,
OutputStream outputStream,
ParseContext context)
Executes the configured external command and passes the given document
stream as a simple XHTML document to the given SAX content handler.
|
void |
Embedder.embed(Metadata metadata,
InputStream originalStream,
OutputStream outputStream,
ParseContext context)
Embeds related document metadata from the given metadata object into the
given output stream.
|
| Modifier and Type | Class and Description |
|---|---|
class |
AccessPermissionException
Exception to be thrown when a document does not allow content extraction.
|
class |
EncryptedDocumentException |
| Modifier and Type | Method and Description |
|---|---|
void |
ParserContainerExtractor.extract(TikaInputStream stream,
ContainerExtractor recurseExtractor,
EmbeddedResourceHandler handler) |
void |
ContainerExtractor.extract(TikaInputStream stream,
ContainerExtractor recurseExtractor,
EmbeddedResourceHandler handler)
Processes a container file, and extracts all the embedded
resources from within it.
|
| Modifier and Type | Method and Description |
|---|---|
void |
ForkParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context) |
| Modifier and Type | Class and Description |
|---|---|
static class |
EndianUtils.BufferUnderrunException |
| Modifier and Type | Method and Description |
|---|---|
void |
TemporaryResources.dispose()
Calls the
TemporaryResources.close() method and wraps the potential
IOException into a TikaException for convenience
when used within Tika. |
| Modifier and Type | Method and Description |
|---|---|
static LanguageProfilerBuilder |
LanguageProfilerBuilder.create(String name,
InputStream is,
String encoding)
Creates a new Language profile from (preferably quite large - 5-10k of
lines) text file
|
float |
LanguageProfilerBuilder.getSimilarity(LanguageProfilerBuilder another)
Calculates a score how well NGramProfiles match each other
|
| Modifier and Type | Method and Description |
|---|---|
String |
Translator.translate(String text,
String targetLanguage)
Translate text to the given language.
|
String |
DefaultTranslator.translate(String text,
String targetLanguage)
Translate, using the first available service-loaded translator
|
String |
Translator.translate(String text,
String sourceLanguage,
String targetLanguage)
Translate text between given languages.
|
String |
DefaultTranslator.translate(String text,
String sourceLanguage,
String targetLanguage)
Translate, using the first available service-loaded translator
|
| Modifier and Type | Class and Description |
|---|---|
class |
MimeTypeException
A class to encapsulate MimeType related exceptions.
|
| Modifier and Type | Method and Description |
|---|---|
SAXParser |
ParseContext.getSAXParser()
Returns the SAX parser specified in this parsing context.
|
void |
AutoDetectParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata) |
void |
AbstractParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata)
Deprecated.
use the
Parser.parse(InputStream, ContentHandler, Metadata, ParseContext) method instead |
void |
RecursiveParserWrapper.parse(InputStream stream,
ContentHandler ignore,
Metadata metadata,
ParseContext context)
Acts like a regular parser except it ignores the ContentHandler
and it automatically sets/overwrites the embedded Parser in the
ParseContext object.
|
void |
ParserPostProcessor.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context)
Forwards the call to the delegated parser and post-processes the
results as described above.
|
void |
ParserDecorator.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context)
Delegates the method call to the decorated parser.
|
void |
Parser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context)
Parses a document stream into a sequence of XHTML SAX events.
|
void |
NetworkParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context) |
void |
ErrorParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context) |
void |
DigestingParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context) |
void |
DelegatingParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context)
Looks up the delegate parser from the parsing context and
delegates the parse operation to it.
|
void |
CryptoParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context) |
void |
CompositeParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context)
Delegates the call to the matching component parser.
|
void |
AutoDetectParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context) |
| Modifier and Type | Method and Description |
|---|---|
static void |
ExternalParsersFactory.attachExternalParsers(TikaConfig config) |
static List<ExternalParser> |
ExternalParsersFactory.create() |
static List<ExternalParser> |
ExternalParsersFactory.create(ServiceLoader loader) |
static List<ExternalParser> |
ExternalParsersFactory.create(String filename,
ServiceLoader loader) |
static List<ExternalParser> |
ExternalParsersFactory.create(URL... urls) |
void |
ExternalParser.parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context)
Executes the configured external command and passes the given document
stream as a simple XHTML document to the given SAX content handler.
|
static List<ExternalParser> |
ExternalParsersConfigReader.read(Document document) |
static List<ExternalParser> |
ExternalParsersConfigReader.read(Element element) |
static List<ExternalParser> |
ExternalParsersConfigReader.read(InputStream stream) |
| Constructor and Description |
|---|
CompositeExternalParser() |
CompositeExternalParser(MediaTypeRegistry registry) |
| Modifier and Type | Method and Description |
|---|---|
void |
SecureContentHandler.throwIfCauseOf(SAXException e)
Converts the given
SAXException to a corresponding
TikaException if it's caused by this instance detecting
a zip bomb. |
Copyright © 2007–2016 The Apache Software Foundation. All rights reserved.