java.lang.Object
  org.apache.tika.parser.AbstractParser
    org.apache.tika.parser.pdf.PDFParser
public class PDFParser extends org.apache.tika.parser.AbstractParser
PDF parser.
This parser can also process encrypted PDF documents if the required password is given as part of the input metadata associated with the document. If no password is given, the parser tries to decrypt the document using the empty password that is often used with PDFs.
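
The password mechanism described above can be sketched roughly as follows. This is a minimal illustration written against the API as documented on this page, not a definitive usage pattern. The class name `EncryptedPdfExample`, the helper names, and the command-line argument handling are assumptions made for the example; `BodyContentHandler` is taken from `org.apache.tika.sax`.

```java
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;

// Hypothetical example class; not part of Tika.
public class EncryptedPdfExample {

    // Builds input metadata that carries the document password under the
    // PDFParser.PASSWORD key. With a null password the key is left unset,
    // so the parser falls back to trying the empty password.
    static Metadata metadataWithPassword(String password) {
        Metadata metadata = new Metadata();
        if (password != null) {
            metadata.set(PDFParser.PASSWORD, password);
        }
        return metadata;
    }

    // Parses the stream and returns the extracted body text.
    static String extractText(InputStream stream, String password) throws Exception {
        BodyContentHandler handler = new BodyContentHandler();
        new PDFParser().parse(stream, handler, metadataWithPassword(password), new ParseContext());
        return handler.toString();
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a PDF file; args[1]: optional password (both assumed).
        InputStream stream = new FileInputStream(args[0]);
        try {
            System.out.println(extractText(stream, args.length > 1 ? args[1] : null));
        } finally {
            stream.close();
        }
    }
}
```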
| Field Summary | |
|---|---|
| static String | PASSWORD: Metadata key for giving the document password to the parser. |
| Constructor Summary | |
|---|---|
| PDFParser() | |
| Method Summary | |
|---|---|
| boolean | getEnableAutoSpace() |
| boolean | getExtractAnnotationText(): If true, text in annotations will be extracted. |
| Set<org.apache.tika.mime.MediaType> | getSupportedTypes(org.apache.tika.parser.ParseContext context) |
| void | parse(InputStream stream, ContentHandler handler, org.apache.tika.metadata.Metadata metadata, org.apache.tika.parser.ParseContext context) |
| void | setEnableAutoSpace(boolean v): If true (the default), the parser should estimate where spaces should be inserted between words. |
| void | setExtractAnnotationText(boolean v): If true (the default), text in annotations will be extracted. |
| Methods inherited from class org.apache.tika.parser.AbstractParser |
|---|
| parse |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final String PASSWORD
| Constructor Detail |
|---|
public PDFParser()
| Method Detail |
|---|
public Set<org.apache.tika.mime.MediaType> getSupportedTypes(org.apache.tika.parser.ParseContext context)
public void parse(InputStream stream,
ContentHandler handler,
org.apache.tika.metadata.Metadata metadata,
org.apache.tika.parser.ParseContext context)
throws IOException,
SAXException,
org.apache.tika.exception.TikaException
Throws:
IOException
SAXException
org.apache.tika.exception.TikaException
public void setEnableAutoSpace(boolean v)
public boolean getEnableAutoSpace()
See #setEnableAutoSpace.
public void setExtractAnnotationText(boolean v)
public boolean getExtractAnnotationText()
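
The two configuration setters and their getters listed above might be used along these lines. This is a sketch under stated assumptions: the class name `PdfParserOptionsExample` and the helper `plainBodyTextParser` are hypothetical, and turning both options off is only one illustrative choice. Both options default to true according to the method summary.

```java
import org.apache.tika.parser.pdf.PDFParser;

// Hypothetical example class; not part of Tika.
public class PdfParserOptionsExample {

    // Returns a parser with both documented options switched off:
    // no estimated spaces between words, and no annotation text.
    static PDFParser plainBodyTextParser() {
        PDFParser parser = new PDFParser();
        parser.setEnableAutoSpace(false);
        parser.setExtractAnnotationText(false);
        return parser;
    }

    public static void main(String[] args) {
        PDFParser parser = plainBodyTextParser();
        // The getters report the values set above.
        System.out.println("autoSpace=" + parser.getEnableAutoSpace());
        System.out.println("annotationText=" + parser.getExtractAnnotationText());
    }
}
```

The configured parser is then passed a stream, handler, metadata, and context through `parse` in the usual way.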