com.sapportals.wcm.util.html

Class HTMLFilterImpl

java.lang.Object
  extended by com.sapportals.wcm.util.html.HTMLFilterImpl
All Implemented Interfaces:
IHTMLContentHandler, IHTMLFilter, IHTMLReader, ITextContentHandler
Direct Known Subclasses:
HTMLScriptRemover

public class HTMLFilterImpl
extends Object
implements IHTMLFilter, IHTMLContentHandler, ITextContentHandler

Default Implementation of IHTMLFilter.

Provides a default implementation, which is the null filter: it forwards all events unchanged to its content handler.

Filters interested in only a subset of the events can extend this class to reduce their implementation effort.
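
The null-filter idea can be illustrated with a small, self-contained sketch. It does not use the real SAP classes: ContentHandler, PassThroughFilter, UpperCaseFilter and Recorder are hypothetical stand-ins that model only three of the events and declare no checked exceptions. The sketch shows a base filter that forwards every event and a subclass that overrides a single one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class NullFilterSketch {
    // Hypothetical, simplified stand-in for IHTMLContentHandler (three events only).
    interface ContentHandler {
        void startDocument();
        void characters(char[] ch, int start, int length);
        void endDocument();
    }

    // Null filter: forwards every event unchanged to its content handler, if one is set.
    static class PassThroughFilter implements ContentHandler {
        private ContentHandler handler;

        void setContentHandler(ContentHandler handler) { this.handler = handler; }

        public void startDocument() { if (handler != null) handler.startDocument(); }

        public void characters(char[] ch, int start, int length) {
            if (handler != null) handler.characters(ch, start, length);
        }

        public void endDocument() { if (handler != null) handler.endDocument(); }
    }

    // A filter interested in one event only: it upper-cases character data.
    static class UpperCaseFilter extends PassThroughFilter {
        @Override
        public void characters(char[] ch, int start, int length) {
            // Work on a copy, because the incoming array must not be modified.
            char[] copy = new String(ch, start, length).toUpperCase(Locale.ROOT).toCharArray();
            super.characters(copy, 0, copy.length);
        }
    }

    // Records events as strings so the result can be inspected.
    static class Recorder implements ContentHandler {
        final List<String> events = new ArrayList<>();
        public void startDocument() { events.add("start"); }
        public void characters(char[] ch, int start, int length) {
            events.add(new String(ch, start, length));
        }
        public void endDocument() { events.add("end"); }
    }

    // Pushes a tiny event sequence through the filter and returns what arrived downstream.
    static List<String> run(PassThroughFilter filter, String text) {
        Recorder recorder = new Recorder();
        filter.setContentHandler(recorder);
        filter.startDocument();
        char[] buf = text.toCharArray();
        filter.characters(buf, 0, buf.length);
        filter.endDocument();
        return recorder.events;
    }

    public static void main(String[] args) {
        System.out.println(run(new PassThroughFilter(), "abc"));
        System.out.println(run(new UpperCaseFilter(), "abc"));
    }
}
```

In this sketch the subclass touches only characters(); the start and end events still reach the downstream handler through the inherited forwarding methods.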

Copyright (c) SAP AG 2001-2002


Constructor Summary
HTMLFilterImpl()
          Empty filter with a parent reader installed.
HTMLFilterImpl(IHTMLReader reader)
          Filter which receives its events from the given reader.
 
Method Summary
 void characters(char[] ch, int start, int length)
          Notification of a character event.
 void discard()
          Free all allocated resources.
 void endDocument()
          Notification that the document is finished.
 void endElement(IHTMLElement element)
          Notification that an end tag was encountered (e.g. starting with '</').
 void endTextDocument()
          Notification that the document is finished.
 IHTMLContentHandler getContentHandler()
          Get the registered content handler.
 String getEncoding()
          Return the encoding used in the document.
 IHTMLReader getParent()
          Get the reader this filter gets its events from.
 ITextContentHandler getRawContentHandler()
          Get the registered raw content handler.
 void parse()
          Parse the complete document, generating events, until the source is read empty.
 boolean parseNextEvent()
          Parse the document, generating an event, and return to the caller.
 void setContentHandler(IHTMLContentHandler handler)
          Set the content handler to a new value.
 void setParent(IHTMLReader reader)
          Set the reader where this filter should get its events from.
 void setRawContentHandler(ITextContentHandler handler)
          Set the raw content handler to a new value.
 void setSource(InputStream input)
          Set InputStream as document source.
 void setSource(InputStream input, String encoding)
          Set InputStream as document source, use the given encoding.
 void setSource(Reader input)
          Set Reader as document source, encoding is irrelevant.
 void startDocument()
          Notification that the document is about to start.
 void startElement(IHTMLElementStart element)
          Notification that a tag was encountered.
 void startTextDocument()
          Notification that the document is about to start.
 void textCharacters(char[] buffer, int start, int len)
          Notification of a character event.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

HTMLFilterImpl

public HTMLFilterImpl()
Empty filter with a parent reader installed.


HTMLFilterImpl

public HTMLFilterImpl(IHTMLReader reader)
Filter which receives its events from the given reader.

Parameters:
reader - to get events from
Method Detail

getParent

public IHTMLReader getParent()
Description copied from interface: IHTMLFilter
Get the reader this filter gets its events from.

Specified by:
getParent in interface IHTMLFilter
Returns:
parent reader

setParent

public void setParent(IHTMLReader reader)
Description copied from interface: IHTMLFilter
Set the reader where this filter should get its events from.

Specified by:
setParent in interface IHTMLFilter
Parameters:
reader - new parent reader

getContentHandler

public IHTMLContentHandler getContentHandler()
Description copied from interface: IHTMLReader
Get the registered content handler. Returns null if none is installed.

Specified by:
getContentHandler in interface IHTMLReader
Returns:
registered content handler

getRawContentHandler

public ITextContentHandler getRawContentHandler()
Description copied from interface: IHTMLReader
Get the registered raw content handler. Returns null if none is installed.

Specified by:
getRawContentHandler in interface IHTMLReader
Returns:
registered raw content handler

setSource

public void setSource(InputStream input)
               throws HTMLException,
                      IOException
Description copied from interface: IHTMLReader
Set InputStream as document source. Encoding will be detected.

Specified by:
setSource in interface IHTMLReader
Parameters:
input - stream to read document from
Throws:
IOException - on read errors
HTMLException - when document is not legal HTML

setSource

public void setSource(InputStream input,
                      String encoding)
               throws HTMLException,
                      IOException
Description copied from interface: IHTMLReader
Set InputStream as document source, use the given encoding.

Specified by:
setSource in interface IHTMLReader
Parameters:
input - stream to read document from
encoding - to use for stream
Throws:
IOException - on read errors
HTMLException - when document is not legal HTML

setSource

public void setSource(Reader input)
               throws HTMLException,
                      IOException
Description copied from interface: IHTMLReader
Set Reader as document source, encoding is irrelevant.

Specified by:
setSource in interface IHTMLReader
Parameters:
input - to read document from
Throws:
IOException - on read errors
HTMLException - when document is not legal HTML
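
Why the encoding matters for an InputStream but not for a Reader can be shown with plain java.io classes. This is only a sketch of the general principle: how HTMLFilterImpl decodes or detects encodings internally is not documented here, and SourceDecodingSketch, readAll and decode are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SourceDecodingSketch {
    // A Reader already delivers decoded characters, so no encoding is needed here.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Raw bytes must be decoded with some encoding before they become characters.
    static String decode(byte[] bytes, String encoding) {
        try (Reader r = new InputStreamReader(new ByteArrayInputStream(bytes), encoding)) {
            return readAll(r);
        } catch (IOException e) {
            // Wrapped only to keep the sketch free of checked exceptions.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<p>caf\u00e9</p>";
        byte[] latin1 = html.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decode(latin1, "ISO-8859-1").equals(html));
        System.out.println(decode(latin1, "UTF-8").equals(html));
    }
}
```

The same bytes yield different text depending on the encoding chosen, which is why the InputStream variants either detect the encoding or accept it as a parameter.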

setContentHandler

public void setContentHandler(IHTMLContentHandler handler)
Description copied from interface: IHTMLReader
Set the content handler to a new value. null is allowed to deregister an installed handler.

Specified by:
setContentHandler in interface IHTMLReader
Parameters:
handler - to register

setRawContentHandler

public void setRawContentHandler(ITextContentHandler handler)
Description copied from interface: IHTMLReader
Set the raw content handler to a new value. null is allowed to deregister an installed handler.

Specified by:
setRawContentHandler in interface IHTMLReader
Parameters:
handler - to register

getEncoding

public String getEncoding()
                   throws HTMLException,
                          IOException
Description copied from interface: IHTMLReader
Return the encoding used in the document.

Specified by:
getEncoding in interface IHTMLReader
Returns:
encoding used in document or null if unknown.
Throws:
IOException - on read errors
HTMLException - when document is not legal HTML

parse

public void parse()
           throws HTMLException,
                  IOException
Description copied from interface: IHTMLReader
Parse the complete document, generating events, until the source is read empty.

Specified by:
parse in interface IHTMLReader
Throws:
HTMLException - when document is not legal HTML
IOException - on read errors

parseNextEvent

public boolean parseNextEvent()
                       throws HTMLException,
                              IOException
Description copied from interface: IHTMLReader
Parse the document, generating an event, and return to the caller. Will return true as long as there are more events to read.

Specified by:
parseNextEvent in interface IHTMLReader
Returns:
true if there are more events to read
Throws:
IOException - on read errors
HTMLException - when document is not legal HTML
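
The difference between parse() and parseNextEvent() is that the latter hands control back after each step, so the caller can stop early or interleave other work. The following sketch shows such a pull loop. EventReader and CountingReader are hypothetical stand-ins, not the SAP interfaces, and unlike the real method the stand-in declares no checked exceptions. How the real reader behaves on an empty source is an assumption here.

```java
public class PullLoopSketch {
    // Hypothetical stand-in for the incremental parsing part of IHTMLReader.
    interface EventReader {
        boolean parseNextEvent();
    }

    // Toy reader that "generates" a fixed number of events.
    static class CountingReader implements EventReader {
        private int remaining;
        int delivered = 0;

        CountingReader(int events) { this.remaining = events; }

        public boolean parseNextEvent() {
            if (remaining > 0) {
                remaining--;
                delivered++;
            }
            // true as long as there are more events to read
            return remaining > 0;
        }
    }

    // Calls parseNextEvent() until the reader is exhausted or maxCalls is reached.
    // Returns the number of calls made.
    static int drain(EventReader reader, int maxCalls) {
        int calls = 0;
        while (calls < maxCalls) {
            calls++;
            if (!reader.parseNextEvent()) {
                break;
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        CountingReader full = new CountingReader(3);
        System.out.println(drain(full, 10) + " calls, " + full.delivered + " events");

        CountingReader partial = new CountingReader(5);
        System.out.println(drain(partial, 2) + " calls, " + partial.delivered + " events");
    }
}
```

A caller that wants everything at once would use parse() instead; the loop form is useful when processing has to be bounded or interrupted.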

discard

public void discard()
Description copied from interface: IHTMLReader
Free all allocated resources. Not necessary to call when parsing has finished.

Specified by:
discard in interface IHTMLReader

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws HTMLException
Description copied from interface: IHTMLContentHandler
Notification of a character event. The characters of the event are found in ch at offset start. There are length number of characters.

The content of the buffer before start or after start + length is undefined. Modification of the character array is strictly forbidden. The content of the array is undefined after this method returns.

Specified by:
characters in interface IHTMLContentHandler
Parameters:
ch - array holding characters of event
start - where in the array the characters begin
length - number of characters in event
Throws:
HTMLException - to indicate error in event handling
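
Because the array content is undefined after the method returns, a handler that needs the text later has to copy the slice during the call. A minimal sketch of that pattern follows. TextCollectorSketch is a hypothetical class that only mirrors the parameter shape of characters(); it does not implement the real IHTMLContentHandler interface.

```java
import java.util.Arrays;

public class TextCollectorSketch {
    private final StringBuilder text = new StringBuilder();

    // Same parameter shape as characters(char[] ch, int start, int length).
    public void characters(char[] ch, int start, int length) {
        // append(char[], int, int) copies the slice; no reference to ch is kept,
        // and the array itself is not modified.
        text.append(ch, start, length);
    }

    public String getText() {
        return text.toString();
    }

    public static void main(String[] args) {
        TextCollectorSketch collector = new TextCollectorSketch();
        char[] buffer = "..Hello..".toCharArray();
        collector.characters(buffer, 2, 5);

        // Simulate the parser reusing its buffer after the call returned.
        Arrays.fill(buffer, '#');

        System.out.println(collector.getText());
    }
}
```

Storing the array reference together with start and length would not be safe under the contract above, since the caller may overwrite the buffer for the next event.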

endDocument

public void endDocument()
                 throws HTMLException
Description copied from interface: IHTMLContentHandler
Notification that the document is finished.

Specified by:
endDocument in interface IHTMLContentHandler
Throws:
HTMLException - to indicate error in event handling

endElement

public void endElement(IHTMLElement element)
                throws HTMLException
Description copied from interface: IHTMLContentHandler
Notification that an end tag was encountered (e.g. starting with '</'). The element parameter is only valid for the duration of the call. The contents of element are undefined when the method returns. See IHTMLElement for further information.

Specified by:
endElement in interface IHTMLContentHandler
Parameters:
element - the end tag that was encountered; valid only for the duration of the call
Throws:
HTMLException - to indicate error in event handling

startDocument

public void startDocument()
                   throws HTMLException
Description copied from interface: IHTMLContentHandler
Notification that the document is about to start.

Specified by:
startDocument in interface IHTMLContentHandler
Throws:
HTMLException - to indicate error in event handling

startElement

public void startElement(IHTMLElementStart element)
                  throws HTMLException
Description copied from interface: IHTMLContentHandler
Notification that a tag was encountered. The element parameter is only valid for the duration of the call. The contents of element are undefined when the method returns. See IHTMLElementStart for further information.

Specified by:
startElement in interface IHTMLContentHandler
Parameters:
element - the start tag that was encountered; valid only for the duration of the call
Throws:
HTMLException - to indicate error in event handling

startTextDocument

public void startTextDocument()
                       throws IOException
Description copied from interface: ITextContentHandler
Notification that the document is about to start.

Specified by:
startTextDocument in interface ITextContentHandler
Throws:
IOException - to indicate error in event handling

endTextDocument

public void endTextDocument()
                     throws IOException
Description copied from interface: ITextContentHandler
Notification that the document is finished.

Specified by:
endTextDocument in interface ITextContentHandler
Throws:
IOException - to indicate error in event handling

textCharacters

public void textCharacters(char[] buffer,
                           int start,
                           int len)
                    throws IOException
Description copied from interface: ITextContentHandler
Notification of a character event. The characters of the event are found in buffer at offset start. There are len number of characters.

The content of the buffer before start or after start + len is undefined. Modification of the character array is strictly forbidden. The content of the array is undefined after this method returns.

Specified by:
textCharacters in interface ITextContentHandler
Parameters:
buffer - array holding characters of event
start - where in the array the characters begin
len - number of characters in event
Throws:
IOException - to indicate error in event handling


Copyright 2006 SAP AG. All rights reserved.