com.sapportals.wcm.util.html

Interface IHTMLReader

All Known Subinterfaces:
IHTMLFilter
All Known Implementing Classes:
HTMLFilterImpl

public interface IHTMLReader

Reads HTML documents and generates events.

The IHTMLReader generates events for HTML documents. Events are sent to the IHTMLContentHandler. There can be only one content handler per reader.

A document is parsed by first setting the input source and then calling parse() once or parseNextEvent() repeatedly. parseNextEvent() parses the document until the next event has been sent to the content handler and then returns to the caller. It is not guaranteed that exactly one event is generated per call.
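The pull-style loop described above can be sketched as follows. The real SAP classes are not reproduced here: PullReader and WordReader are hypothetical stand-ins that mirror only the documented parseNextEvent() contract, and the content handler is modelled with java.util.function.Consumer because the methods of IHTMLContentHandler are not listed on this page. The toy reader treats each whitespace-separated token as one event, which is an assumption made purely for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in mirroring only the documented parseNextEvent() contract.
interface PullReader {
    boolean parseNextEvent() throws IOException;
}

// Toy implementation: reports each whitespace-separated token as one "event".
class WordReader implements PullReader {
    private final Reader source;
    private final Consumer<String> handler; // stands in for the content handler

    WordReader(Reader source, Consumer<String> handler) {
        this.source = source;
        this.handler = handler;
    }

    @Override
    public boolean parseNextEvent() throws IOException {
        StringBuilder token = new StringBuilder();
        int c;
        while ((c = source.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (token.length() > 0) {
                    break; // token complete
                }
            } else {
                token.append((char) c);
            }
        }
        if (token.length() > 0) {
            handler.accept(token.toString());
        }
        return c != -1; // false once the source is exhausted
    }
}

class PullLoopSketch {
    static List<String> collectEvents(String document) throws IOException {
        List<String> events = new ArrayList<>();
        PullReader reader = new WordReader(new StringReader(document), events::add);
        // Rely on handler callbacks, not on loop iterations: a single call
        // may deliver no event at all (for example, on trailing whitespace).
        while (reader.parseNextEvent()) {
            // The caller regains control here between parsing steps.
        }
        return events;
    }
}
```

Note that the final call may still deliver an event while returning false, so all processing belongs in the handler rather than in the loop body.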

Input Sources and Encodings:

Note that implementations of this interface are not thread-safe.

Copyright (c) SAP AG 2001-2002


Method Summary
 void discard()
          Free all allocated resources.
 IHTMLContentHandler getContentHandler()
          Get the registered content handler.
 String getEncoding()
          Return the encoding used in the document.
 ITextContentHandler getRawContentHandler()
          Get the registered raw content handler.
 void parse()
          Parse the complete document, generating events, until the source is read empty.
 boolean parseNextEvent()
          Parse the document, generating events, and return to the caller.
 void setContentHandler(IHTMLContentHandler handler)
          Set the content handler to a new value.
 void setRawContentHandler(ITextContentHandler handler)
          Set the raw content handler to a new value.
 void setSource(InputStream input)
          Set InputStream as document source.
 void setSource(InputStream input, String encoding)
          Set InputStream as document source, use the given encoding.
 void setSource(Reader input)
          Set Reader as document source; encoding is irrelevant.
 

Method Detail

getContentHandler

public IHTMLContentHandler getContentHandler()
Get the registered content handler. Returns null if none is installed.

Returns:
registered content handler

getRawContentHandler

public ITextContentHandler getRawContentHandler()
Get the registered raw content handler. Returns null if none is installed.

Returns:
registered raw content handler

setContentHandler

public void setContentHandler(IHTMLContentHandler handler)
Set the content handler to a new value. null is allowed to deregister an installed handler.

Parameters:
handler - to register
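Because getContentHandler() returns null when nothing is installed and setContentHandler(null) deregisters, a handler can be swapped in temporarily and the previous state restored afterwards. The following is a minimal sketch of that pattern: HandlerSlot is a hypothetical stand-in for a reader, and Consumer<String> stands in for IHTMLContentHandler, whose methods are not listed on this page.

```java
import java.util.function.Consumer;

// Hypothetical stand-in: holds at most one handler, as documented for the reader.
class HandlerSlot {
    private Consumer<String> handler; // null means "no handler installed"

    void setContentHandler(Consumer<String> handler) {
        this.handler = handler; // passing null deregisters
    }

    Consumer<String> getContentHandler() {
        return handler; // null if none is installed
    }
}

class HandlerSwapSketch {
    // Runs the given work with a temporary handler installed, then restores
    // whatever was registered before, which may be null.
    static void withTemporaryHandler(HandlerSlot reader,
                                     Consumer<String> temporary,
                                     Runnable work) {
        Consumer<String> previous = reader.getContentHandler();
        reader.setContentHandler(temporary);
        try {
            work.run();
        } finally {
            reader.setContentHandler(previous);
        }
    }
}
```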

setRawContentHandler

public void setRawContentHandler(ITextContentHandler handler)
Set the raw content handler to a new value. null is allowed to deregister an installed handler.

Parameters:
handler - to register

getEncoding

public String getEncoding()
                   throws HTMLException,
                          IOException
Return the encoding used in the document.

Returns:
encoding used in document or null if unknown.
Throws:
HTMLException - when document is not legal HTML
IOException - on read errors
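Since getEncoding() may return null, callers that need a java.nio.charset.Charset have to supply a default. A minimal sketch using only JDK classes follows; the helper name resolve and the choice of fallback are assumptions for illustration, not part of this API.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

class EncodingFallbackSketch {
    // Maps the encoding name reported by a reader (possibly null) to a Charset.
    static Charset resolve(String reported, Charset fallback) {
        if (reported == null) {
            return fallback; // encoding unknown
        }
        try {
            return Charset.forName(reported.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return fallback; // name is malformed or not supported by this JVM
        }
    }
}
```

A call such as resolve(reader.getEncoding(), StandardCharsets.UTF_8) would then always yield a usable charset.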

setSource

public void setSource(InputStream input)
               throws HTMLException,
                      IOException
Set InputStream as document source. Encoding will be detected.

Parameters:
input - stream to read document from
Throws:
HTMLException - when document is not legal HTML
IOException - on read errors

setSource

public void setSource(InputStream input,
                      String encoding)
               throws HTMLException,
                      IOException
Set InputStream as document source, use the given encoding.

Parameters:
input - stream to read document from
encoding - to use for stream
Throws:
HTMLException - when document is not legal HTML
IOException - on read errors

setSource

public void setSource(Reader input)
               throws HTMLException,
                      IOException
Set Reader as document source; encoding is irrelevant.

Parameters:
input - to read document from
Throws:
HTMLException - when document is not legal HTML
IOException - on read errors
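Encoding is irrelevant for a Reader because the bytes have already been decoded into characters. When the encoding of a byte stream is known in advance, one option is to decode it with the JDK and hand over the resulting Reader. The sketch below uses only JDK classes; the class and method names are hypothetical, and the call to setSource itself is omitted because the SAP classes are not available here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class ReaderSourceSketch {
    // Decodes the byte stream with the given charset; the result is the kind
    // of object that would be passed to setSource(Reader).
    static Reader asReader(InputStream bytes, Charset charset) {
        return new InputStreamReader(bytes, charset);
    }

    // Reads everything from the Reader to show that characters arrive decoded.
    static String drain(Reader reader) {
        try {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```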

parse

public void parse()
           throws HTMLException,
                  IOException
Parse the complete document, generating events, until the source is read empty.

Throws:
HTMLException - when document is not legal HTML
IOException - on read errors

parseNextEvent

public boolean parseNextEvent()
                       throws HTMLException,
                              IOException
Parse the document, generating events, and return to the caller. Returns true as long as there are more events to read.

Returns:
true if there are more events to read
Throws:
HTMLException - when document is not legal HTML
IOException - on read errors

discard

public void discard()
Free all allocated resources. Calling this method is not necessary once parsing has finished.
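The note above suggests that discard() matters mainly when parsing is abandoned before the document has been read completely. One way to handle that case is a try/finally block, sketched here under stated assumptions: DiscardableReader is a hypothetical stand-in exposing only the two documented methods involved, and the limit on the number of calls is an illustrative reason for stopping early.

```java
import java.io.IOException;

// Hypothetical stand-in exposing only the two documented methods used here.
interface DiscardableReader {
    boolean parseNextEvent() throws IOException;
    void discard();
}

class EarlyExitSketch {
    // Performs at most maxCalls parsing steps and returns how many were made.
    // Resources are released explicitly if the document was not finished,
    // whether because the limit was reached or because an exception occurred.
    static int parseAtMost(DiscardableReader reader, int maxCalls) throws IOException {
        int calls = 0;
        boolean finished = false;
        try {
            while (calls < maxCalls) {
                calls++;
                if (!reader.parseNextEvent()) {
                    finished = true; // source read empty
                    break;
                }
            }
        } finally {
            if (!finished) {
                reader.discard();
            }
        }
        return calls;
    }
}
```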



Copyright 2006 SAP AG. All rights reserved.