org.apache.tika.parser.pdf
Class PDFParser
java.lang.Object
org.apache.tika.parser.pdf.PDFParser
- All Implemented Interfaces:
- java.io.Serializable, Parser
public class PDFParser
- extends java.lang.Object
- implements Parser
PDF parser.
This parser can also process encrypted PDF documents if the required
password is given as part of the input metadata associated with the
document (see the PASSWORD key). If no password is given, the parser tries
to decrypt the document using the empty password that is often used with PDFs.
- See Also:
- Serialized Form
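The password fallback described above can be sketched in plain Java. This is only an illustration of the documented behavior, not Tika's actual implementation: the class name, the `resolvePassword` helper, and the `PASSWORD_KEY` value are hypothetical, and a plain `Map` stands in for the Tika `Metadata` object so the sketch runs without Tika on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not Tika's actual implementation. */
final class PasswordFallbackSketch {

    // Hypothetical key name used for this sketch. Real code would use the
    // PDFParser.PASSWORD constant instead of a hard-coded string.
    static final String PASSWORD_KEY = "sketch.pdf.password";

    /**
     * Returns the password supplied in the metadata, or the empty
     * password when none was given.
     */
    static String resolvePassword(Map<String, String> metadata) {
        String password = metadata.get(PASSWORD_KEY);
        return password != null ? password : "";
    }

    public static void main(String[] args) {
        Map<String, String> withPassword = new HashMap<String, String>();
        withPassword.put(PASSWORD_KEY, "s3cret");
        Map<String, String> withoutPassword = new HashMap<String, String>();

        System.out.println("[" + resolvePassword(withPassword) + "]");    // prints "[s3cret]"
        System.out.println("[" + resolvePassword(withoutPassword) + "]"); // prints "[]"
    }
}
```

With Tika itself, a caller would typically store the password in the `Metadata` instance under `PDFParser.PASSWORD` (for example via `metadata.set(PDFParser.PASSWORD, password)`) before passing that instance to `parse`.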
Field Summary
static java.lang.String PASSWORD
- Metadata key for giving the document password to the parser.
Methods inherited from class java.lang.Object
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PASSWORD
public static final java.lang.String PASSWORD
- Metadata key for giving the document password to the parser.
- Since:
- Apache Tika 0.5
- See Also:
- Constant Field Values
PDFParser
public PDFParser()
getSupportedTypes
public java.util.Set<MediaType> getSupportedTypes(ParseContext context)
- Specified by:
- getSupportedTypes in interface Parser
parse
public void parse(java.io.InputStream stream,
                  org.xml.sax.ContentHandler handler,
                  Metadata metadata,
                  ParseContext context)
           throws java.io.IOException,
                  org.xml.sax.SAXException,
                  TikaException
- Specified by:
- parse in interface Parser
- Throws:
- java.io.IOException
- org.xml.sax.SAXException
- TikaException
parse
public void parse(java.io.InputStream stream,
                  org.xml.sax.ContentHandler handler,
                  Metadata metadata)
           throws java.io.IOException,
                  org.xml.sax.SAXException,
                  TikaException
- Deprecated. This method will be removed in Apache Tika 1.0.
- Specified by:
- parse in interface Parser
- Throws:
- java.io.IOException
- org.xml.sax.SAXException
- TikaException
Copyright © 2007-2011 The Apache Software Foundation. All Rights Reserved.