org.apache.tika.embedder
Interface Embedder

All Superinterfaces:
Serializable
All Known Implementing Classes:
ExternalEmbedder

public interface Embedder
extends Serializable

Tika embedder interface. Implementations embed document metadata into an output stream.

Since:
Apache Tika 1.3

Method Summary
 void embed(Metadata metadata, InputStream originalStream, OutputStream outputStream, ParseContext context)
          Embeds related document metadata from the given metadata object into the given output stream.
 Set<MediaType> getSupportedEmbedTypes(ParseContext context)
          Returns the set of media types supported by this embedder when used with the given parse context.
 

Method Detail

getSupportedEmbedTypes

Set<MediaType> getSupportedEmbedTypes(ParseContext context)
Returns the set of media types supported by this embedder when used with the given parse context.

The name differs from the precedent set by Parser.getSupportedTypes(ParseContext) so that parser implementations may also choose to implement this interface.

Parameters:
context - parse context
Returns:
immutable set of media types
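
The naming note above can be illustrated with a minimal sketch. It does not use the Tika API: ParserLike, EmbedderLike, ImageHandler and the immutable helper are hypothetical stand-ins, media types are plain strings, and the ParseContext argument is omitted so the example compiles without Tika on the classpath. It only shows that distinct method names let one class report a separate, read-only set for each role.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

final class DualRoleSketch {

    // Hypothetical stand-ins for Tika's Parser and Embedder interfaces.
    // Media types are plain strings and the ParseContext argument is omitted.
    interface ParserLike {
        Set<String> getSupportedTypes();
    }

    interface EmbedderLike {
        Set<String> getSupportedEmbedTypes();
    }

    // One class can fill both roles because the two method names do not collide.
    static final class ImageHandler implements ParserLike, EmbedderLike {
        private static final Set<String> PARSE_TYPES =
                immutable("image/jpeg", "image/png", "image/tiff");
        private static final Set<String> EMBED_TYPES = immutable("image/jpeg");

        @Override
        public Set<String> getSupportedTypes() {
            return PARSE_TYPES;
        }

        @Override
        public Set<String> getSupportedEmbedTypes() {
            return EMBED_TYPES;
        }
    }

    // Builds a read-only set so callers cannot alter the advertised types.
    static Set<String> immutable(String... types) {
        Set<String> set = new TreeSet<>();
        Collections.addAll(set, types);
        return Collections.unmodifiableSet(set);
    }
}
```

In this sketch the handler can read three formats but embed into only one, which a single shared method name could not express.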

embed

void embed(Metadata metadata,
           InputStream originalStream,
           OutputStream outputStream,
           ParseContext context)
           throws IOException,
                  TikaException
Embeds related document metadata from the given metadata object into the given output stream.

The given document stream is consumed but not closed by this method. Closing the stream remains the caller's responsibility.
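
The stream ownership rule above might look like this in practice. This is a sketch using only java.io, not a Tika implementation: copyWithoutClosing is a hypothetical stand-in for an embed-style method, and roundTrip is a hypothetical caller that keeps responsibility for closing through try-with-resources.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class StreamOwnershipSketch {

    // Hypothetical stand-in for an embed-style method: reads the input to the
    // end and writes it out, but closes neither stream.
    static void copyWithoutClosing(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
        }
        out.flush();
    }

    // Caller side: try-with-resources closes both streams after the call returns.
    static byte[] roundTrip(byte[] document) throws IOException {
        try (InputStream in = new ByteArrayInputStream(document);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            copyWithoutClosing(in, out);
            return out.toByteArray();
        }
    }
}
```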

Information about the parsing context can be passed in the context parameter. See the parser implementations for the kinds of context information they expect.

In general, implementations should favor preserving the source file's metadata unless an update to a field is explicitly defined in the Metadata object.
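
As a rough illustration of the preservation guidance above, the sketch below merges two plain maps. The maps are assumptions standing in for the file's existing metadata and for the fields supplied in the Metadata object, and merge is a hypothetical helper, not part of Tika. It covers only the stated rule: fields named in the update replace the source values, and every other field is kept as it was.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MetadataMergeSketch {

    // Hypothetical helper: plain maps stand in for the file's existing metadata
    // and for the fields explicitly set in the Metadata object.
    static Map<String, String> merge(Map<String, String> existing,
                                     Map<String, String> updates) {
        // Start from the source file's values so untouched fields are preserved.
        Map<String, String> result = new LinkedHashMap<>(existing);
        // Only fields explicitly present in the update overwrite source values.
        result.putAll(updates);
        return result;
    }
}
```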

Parameters:
metadata - document metadata (input and output)
originalStream - the document stream (input)
outputStream - the output stream to write the metadata embedded data to
context - parse context
Throws:
IOException - if the document stream could not be read
TikaException - if the document could not be parsed


Copyright © 2007-2013 The Apache Software Foundation. All Rights Reserved.