public interface IHTMLReader
The IHTMLReader generates events for HTML documents. Events are sent to the
IHTMLContentHandler. There can be only
one content handler per reader.
A document is parsed by first setting the input source and then calling
parse() once or parseNextEvent() repeatedly. parseNextEvent()
parses the document until the next event has been sent to the
content handler and then returns to the caller. It is not guaranteed that
exactly one event is generated per call.
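The step-wise style of parsing can be illustrated with a small, self-contained sketch. It does not use the real interface: `MiniReader`, `LineEventReader` and `pull` are hypothetical stand-ins, `HTMLException` is left out, and a plain list plays the role of the content handler. The meaning of the boolean returned by parseNextEvent() is not stated on this page; the sketch assumes that `false` means the source is exhausted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class PullParseSketch {

    // Stand-in for the subset of IHTMLReader used here. NOT the SAP interface:
    // HTMLException is omitted and events go to a plain list.
    interface MiniReader {
        void setSource(Reader input) throws IOException;
        // ASSUMPTION: false means "source exhausted"; the page does not define this.
        boolean parseNextEvent() throws IOException;
        void discard();
    }

    // Toy reader that reports each line of input as one event.
    static final class LineEventReader implements MiniReader {
        private final List<String> events; // plays the role of the content handler
        private BufferedReader source;

        LineEventReader(List<String> events) {
            this.events = events;
        }

        @Override
        public void setSource(Reader input) {
            source = new BufferedReader(input);
        }

        @Override
        public boolean parseNextEvent() throws IOException {
            String line = source.readLine();
            if (line == null) {
                return false;
            }
            events.add(line);
            return true;
        }

        @Override
        public void discard() {
            try {
                if (source != null) {
                    source.close();
                }
            } catch (IOException ignored) {
                // nothing useful to do while releasing resources
            }
            source = null;
        }
    }

    // Drives the reader step by step and stops early after maxEvents events.
    public static List<String> pull(String document, int maxEvents) {
        List<String> events = new ArrayList<>();
        MiniReader reader = new LineEventReader(events);
        try {
            reader.setSource(new StringReader(document));
            while (events.size() < maxEvents && reader.parseNextEvent()) {
                // control returns to the caller after every step
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            reader.discard(); // free resources even when stopping early
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(pull("<p>\nhello\n</p>", 2));
    }
}
```

The point of the loop is that the caller keeps control between steps and can stop before the end of the document, which a single parse() call does not allow.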
Input Sources and Encodings:
If a Reader is used as input source, no attempt is made to detect the encoding of the HTML document. getEncoding() will return null in that case.
If an InputStream together with an encoding is used, any encoding specified in meta tags of the document is ignored and the given encoding is used.
If an InputStream without an encoding is used, the reader will look into the first n octets of the HTML document to detect an HTML meta tag with Content-Type which specifies the character set to use. If no encoding is found, ISO-8859-1 is assumed. The number of octets used for encoding detection is implementation defined.
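The detection rule for an InputStream without an encoding can be sketched as follows. This is an illustration of the rule as described, not the implementation behind this interface: the class `MetaCharsetSniffer`, its method names, the regular expressions and the limit of 1024 octets are all assumptions made for the example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MetaCharsetSniffer {

    // ASSUMPTION: the real limit is implementation defined; 1024 is illustrative.
    static final int SNIFF_LIMIT = 1024;
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    private static final Pattern META_TAG =
            Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);

    // Looks for a Content-Type meta tag in the already decoded document prefix.
    public static String detectInPrefix(String prefix) {
        Matcher tags = META_TAG.matcher(prefix);
        while (tags.find()) {
            String tag = tags.group();
            if (!tag.toLowerCase(Locale.ROOT).contains("content-type")) {
                continue; // some other meta tag, e.g. keywords
            }
            Matcher charset = CHARSET.matcher(tag);
            if (charset.find()) {
                return charset.group(1);
            }
        }
        return DEFAULT_ENCODING; // nothing found: fall back to ISO-8859-1
    }

    // Reads at most SNIFF_LIMIT octets, then rewinds so parsing can start at octet 0.
    public static String detect(BufferedInputStream in) throws IOException {
        in.mark(SNIFF_LIMIT);
        byte[] head = new byte[SNIFF_LIMIT];
        int total = 0;
        int n;
        while (total < head.length
                && (n = in.read(head, total, head.length - total)) != -1) {
            total += n;
        }
        in.reset();
        // ISO-8859-1 maps every octet to one char, so decoding cannot fail.
        return detectInPrefix(new String(head, 0, total, StandardCharsets.ISO_8859_1));
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = ("<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=UTF-8\"></head></html>")
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(detect(new BufferedInputStream(new ByteArrayInputStream(doc))));
    }
}
```

Decoding the prefix as ISO-8859-1 before searching is a deliberate choice in this sketch: the meta tag itself is plain ASCII, so it can be found without knowing the real encoding yet.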
Note that implementations of this interface are not thread-safe.
Copyright (c) SAP AG 2001-2002
| Modifier and Type | Method and Description |
|---|---|
| void | discard(): Free all allocated resources. |
| IHTMLContentHandler | getContentHandler(): Get the registered content handler. |
| String | getEncoding(): Return the encoding used in the document. |
| ITextContentHandler | getRawContentHandler(): Get the registered raw content handler. |
| void | parse(): Parse the complete document, generating events, until the source has been read completely. |
| boolean | parseNextEvent(): Parse the document, generating events, and return to the caller. |
| void | setContentHandler(IHTMLContentHandler handler): Set the content handler to a new value. |
| void | setRawContentHandler(ITextContentHandler handler): Set the raw content handler to a new value. |
| void | setSource(InputStream input): Set InputStream as document source. |
| void | setSource(InputStream input, String encoding): Set InputStream as document source, use the given encoding. |
| void | setSource(Reader input): Set Reader as document source, encoding is irrelevant. |
IHTMLContentHandler getContentHandler()
Returns: null if none is installed.

ITextContentHandler getRawContentHandler()
Returns: null if none is installed.

void setContentHandler(IHTMLContentHandler handler)
null is allowed to deregister an installed handler.
Parameters: handler - the handler to register

void setRawContentHandler(ITextContentHandler handler)
null is allowed to deregister an installed handler.
Parameters: handler - the handler to register

String getEncoding() throws HTMLException, IOException
Returns: null if unknown.
Throws: HTMLException - when the document is not legal HTML
Throws: IOException - on read errors

void setSource(InputStream input) throws HTMLException, IOException
Parameters: input - stream to read the document from
Throws: HTMLException - when the document is not legal HTML
Throws: IOException - on read errors

void setSource(InputStream input, String encoding) throws HTMLException, IOException
Parameters: input - stream to read the document from
Parameters: encoding - encoding to use for the stream
Throws: HTMLException - when the document is not legal HTML
Throws: IOException - on read errors

void setSource(Reader input) throws HTMLException, IOException
Parameters: input - reader to read the document from
Throws: HTMLException - when the document is not legal HTML
Throws: IOException - on read errors

void parse() throws HTMLException, IOException
Throws: HTMLException - when the document is not legal HTML
Throws: IOException - on read errors

boolean parseNextEvent() throws HTMLException, IOException
Throws: HTMLException - when the document is not legal HTML
Throws: IOException - on read errors

void discard()
Access Rights

| SC | DC | Public Part | ACH |
|---|---|---|---|
| [sap.com] KMC-CM | [sap.com] tc/km/frwk | api | EP-KM-CM |
| [sap.com] KMC-WPC | [sap.com] tc/kmc/wpc/wpcfacade | api | EP-PIN-WPC-WCM |
Copyright 2021 SAP SE Complete Copyright Notice