org.apache.tika.parser.html
Class HtmlEncodingDetector
java.lang.Object
org.apache.tika.parser.html.HtmlEncodingDetector
- All Implemented Interfaces:
- org.apache.tika.detect.EncodingDetector
public class HtmlEncodingDetector
- extends Object
- implements org.apache.tika.detect.EncodingDetector
Character encoding detector that determines the character encoding of an
HTML document from the charset parameter, if present, of a Content-Type
http-equiv meta tag near the beginning of the document. Especially useful
for distinguishing among closely related encodings (ISO-8859-*), for which
other kinds of encoding detection are unreliable.
- Since:
- Apache Tika 1.2
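The approach described above can be illustrated with a minimal, JDK-only sketch. This is not Tika's implementation: the class name MetaCharsetSniffer, the method sniff, the 8192-byte prefix size, and the simplified regular expression are all assumptions made for illustration. The sketch reads a bounded prefix of a mark-supporting stream, looks for a charset parameter inside a meta tag, and resets the stream so a later parser sees it from the start.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Apache Tika.
class MetaCharsetSniffer {
    // Assumed limit on how many leading bytes are inspected.
    private static final int PREFIX_SIZE = 8192;

    // Simplified pattern: a charset=NAME parameter somewhere inside a <meta ...> tag.
    private static final Pattern META_CHARSET = Pattern.compile(
            "(?i)<meta\\s[^>]*?charset\\s*=\\s*[\"']?([a-z0-9_:.-]+)");

    /** Returns the declared charset, or null if none is found or it is unsupported. */
    static Charset sniff(InputStream input) throws IOException {
        if (input == null || !input.markSupported()) {
            return null;
        }
        input.mark(PREFIX_SIZE);
        byte[] buffer = new byte[PREFIX_SIZE];
        int total = 0;
        int n;
        // Read until the prefix buffer is full or the stream ends.
        while (total < PREFIX_SIZE
                && (n = input.read(buffer, total, PREFIX_SIZE - total)) != -1) {
            total += n;
        }
        input.reset(); // rewind so the caller can still read from the beginning

        // Tag names and charset labels are ASCII, so an ASCII view is enough here.
        String head = new String(buffer, 0, total, StandardCharsets.US_ASCII);
        Matcher m = META_CHARSET.matcher(head);
        if (!m.find()) {
            return null;
        }
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return null;
        }
    }
}
```

Calling MetaCharsetSniffer.sniff on a ByteArrayInputStream whose content begins with a meta tag declaring charset=iso-8859-1 would return the ISO-8859-1 charset, and a document with no such tag would yield null. Returning null rather than a default leaves the fallback decision to the caller.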
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
HtmlEncodingDetector
public HtmlEncodingDetector()
detect
public Charset detect(InputStream input, org.apache.tika.metadata.Metadata metadata) throws IOException
- Specified by:
detect in interface org.apache.tika.detect.EncodingDetector
- Throws:
IOException
Copyright © 2007-2013 The Apache Software Foundation. All Rights Reserved.