com.sapportals.wcm.util.string

Class RemoveHTMLTags

java.lang.Object
  extended by java.io.Reader
      extended by java.io.FilterReader
          extended by com.sapportals.wcm.util.string.RemoveHTMLTags

public class RemoveHTMLTags
extends FilterReader

Title: Strip HTML tags from a Reader
Description: Works only with proper tags, so character references need to be converted before applying this filter; see ReplaceHTMLTokens.

Usage Example:

in = new BufferedReader(
 new ReplaceTokens(
 new RemoveHTMLTags(
 new CharArrayReader(content.toCharArray()))));
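
The chaining above relies on the standard java.io.FilterReader pattern, where each reader wraps another and transforms what it reads. As a rough, self-contained illustration of that pattern (not the SAP implementation), the sketch below defines a hypothetical TagStrippingReader that drops everything between '<' and '>'. The class name, the strip helper, and the skipping logic are assumptions made for this example only.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a tag-removing filter; not the SAP class.
final class TagStrippingReader extends FilterReader {

    TagStrippingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int c = in.read();
        // Skip every complete tag that starts at the current position.
        while (c == '<') {
            do {
                c = in.read();
            } while (c != -1 && c != '>');
            if (c == -1) {
                return -1; // unterminated tag: treat as end of stream
            }
            c = in.read();
        }
        return c;
    }

    @Override
    public int read(char[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            b[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    // Convenience helper for the example: filters a whole string.
    static String strip(String html) {
        StringBuilder out = new StringBuilder();
        try (Reader r = new TagStrippingReader(new StringReader(html))) {
            int c;
            while ((c = r.read()) != -1) {
                out.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<p>Hello <b>world</b></p>"));
    }
}
```

Because the sketch is itself a Reader, it can be wrapped by further readers (for example a BufferedReader) in the same way as in the usage example.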
  
Copyright (c) SAP AG 2001-2002 Company: SAP AG


Field Summary
static String newline
           
 
Fields inherited from class java.io.FilterReader
in
 
Fields inherited from class java.io.Reader
lock
 
Constructor Summary
RemoveHTMLTags(Reader in)
          
 
Method Summary
 void addEntity(String entity, String substitute)
          Defines an entity substitution set.
 void addSubstitute(String tag, String substitute)
          Defines a tag substitution set.
 void clearEntities()
          Tells the filter to clear all current entity substitutions.
 void clearSubstitutes()
          Tells the filter to clear all current tag substitutions.
 boolean getNoCR()
          You can tell the filter to remove carriage returns (\n and \r) when parsing.
static void main(String[] args)
          The main program for test
 int read()
          Reads a char.
 int read(char[] b, int off, int len)
          Reads characters into a portion of an array.
 void setCompressSpace(boolean flag)
          Define whether multiple spaces should be compressed into a single space.
 void setNoCR(boolean flag)
          You can tell the filter to remove carriage returns (\n and \r) when parsing.
 void useStandardEntities()
          Tells the filter to set up a standard entity substitution set.
 void useStandardSubstitutes()
          Tells the filter to set up a standard tag substitution set.
 
Methods inherited from class java.io.FilterReader
close, mark, markSupported, ready, reset, skip
 
Methods inherited from class java.io.Reader
read
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

newline

public static String newline
Constructor Detail

RemoveHTMLTags

public RemoveHTMLTags(Reader in)

Parameters:
in - the underlying Reader whose content is filtered
Method Detail

read

public int read(char[] b,
                int off,
                int len)
         throws IOException
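Reads characters into a portion of an array.

Parameters:
b - destination buffer
off - offset at which to start storing characters
len - maximum number of characters to read
Returns:
the number of characters read, or -1 if the end of the stream is reached.
Throws:
IOException - If an I/O error has occurred.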
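
Since this method overrides java.io.Reader.read(char[], int, int), callers would normally use it in the usual bulk-read loop. The sketch below shows that loop against any Reader; the class name BulkReadSketch and the buffer size are arbitrary choices for the example, and a StringReader stands in for the filter.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative helper; works with any Reader, including a filtering one.
final class BulkReadSketch {

    // Drains the reader by repeatedly calling read(char[], int, int).
    static String drain(Reader reader) {
        char[] buf = new char[8]; // deliberately small to force several reads
        StringBuilder out = new StringBuilder();
        try {
            int n;
            // read returns the number of chars stored, or -1 at end of stream
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain(new StringReader("text longer than the buffer")));
    }
}
```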

read

public int read()
         throws IOException
Reads a char. Will block if no input is available.

Returns:
the char read, or -1 if the end of the stream is reached.
Throws:
IOException - If an I/O error has occurred.
See Also:
InputStream.read()

addEntity

public void addEntity(String entity,
                      String substitute)
               throws NullPointerException
Defines an entity substitution. Use only the base entity name and do not include the & and ; characters.

Parameters:
entity - the entity to substitute. Example: quot, #90, copy.
substitute - the string to substitute for the entity.
Throws:
NullPointerException
See Also:
useStandardEntities()
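
To make the "base name without & and ;" convention concrete, here is a minimal standalone sketch of entity substitution driven by a map. It does not use the SAP class: EntitySubstitutionSketch and its apply method are hypothetical, and leaving unknown entities untouched is an assumption, since the original does not say how unmapped entities are handled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration of map-based entity substitution.
final class EntitySubstitutionSketch {

    private final Map<String, String> entities = new HashMap<>();

    // Registers a substitution keyed by the bare entity name (no '&' or ';').
    void addEntity(String entity, String substitute) {
        Objects.requireNonNull(entity, "entity");
        Objects.requireNonNull(substitute, "substitute");
        entities.put(entity, substitute);
    }

    // Replaces each "&name;" whose name is registered; other text is kept.
    String apply(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int end = (c == '&') ? text.indexOf(';', i + 1) : -1;
            String sub = (end > i + 1) ? entities.get(text.substring(i + 1, end)) : null;
            if (sub != null) {
                out.append(sub);
                i = end + 1;
            } else {
                out.append(c); // assumption: unknown entities pass through
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        EntitySubstitutionSketch s = new EntitySubstitutionSketch();
        s.addEntity("copy", "(C)");
        s.addEntity("#90", "Z");
        System.out.println(s.apply("&copy; 2002 &#90;"));
    }
}
```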

addSubstitute

public void addSubstitute(String tag,
                          String substitute)
                   throws NullPointerException
Defines a tag substitution. Use only the base tag name (attributes are ignored when substituting anyway). For example, addSubstitute("hr", "-=-=-=") will substitute any HR tag found (even with attributes) with the -=-=-= characters. Note: beginning and ending tags each need their own definition.

Parameters:
tag - the tag to substitute. Use the base tag without attributes.
substitute - the string to substitute for the tag.
Throws:
NullPointerException
See Also:
useStandardSubstitutes()
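
The behaviour described above (match on the base tag name, ignore attributes, treat beginning and ending tags separately) can be sketched without the SAP class as follows. TagSubstitutionSketch and apply are hypothetical names; case-insensitive matching and dropping unregistered tags are assumptions inferred from the "hr"/HR example, not documented behaviour.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration of tag substitution keyed by base tag name.
final class TagSubstitutionSketch {

    private final Map<String, String> substitutes = new HashMap<>();

    // "hr" and "/p" are separate keys: ending tags need their own entry.
    void addSubstitute(String tag, String substitute) {
        Objects.requireNonNull(tag, "tag");
        Objects.requireNonNull(substitute, "substitute");
        substitutes.put(tag.toLowerCase(Locale.ROOT), substitute);
    }

    String apply(String html) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < html.length()) {
            char c = html.charAt(i);
            int close = (c == '<') ? html.indexOf('>', i + 1) : -1;
            if (close == -1) {
                out.append(c); // ordinary text, or '<' with no closing '>'
                i++;
                continue;
            }
            String body = html.substring(i + 1, close).trim();
            int j = 0;
            while (j < body.length() && !Character.isWhitespace(body.charAt(j))) {
                j++;
            }
            // Base name only: everything after the first whitespace is attributes.
            String name = body.substring(0, j).toLowerCase(Locale.ROOT);
            if (name.length() > 1 && name.endsWith("/")) {
                name = name.substring(0, name.length() - 1); // e.g. "hr/"
            }
            // Assumption: tags without a registered substitute are removed.
            out.append(substitutes.getOrDefault(name, ""));
            i = close + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        TagSubstitutionSketch s = new TagSubstitutionSketch();
        s.addSubstitute("hr", "-=-=-=");
        s.addSubstitute("/p", "\n");
        System.out.println(s.apply("<p>one</p><HR size=\"2\">two"));
    }
}
```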

useStandardSubstitutes

public void useStandardSubstitutes()
Tells the filter to set up a standard tag substitution set.


useStandardEntities

public void useStandardEntities()
Tells the filter to set up a standard entity substitution set. This set is relatively limited; the defaults are shown below:

Original  Substitution
quot      "
amp       &
lt        <
gt        >
copy      (C)


clearSubstitutes

public void clearSubstitutes()
Tells the filter to clear all current tag substitutions.


clearEntities

public void clearEntities()
Tells the filter to clear all current entity substitutions.


setCompressSpace

public void setCompressSpace(boolean flag)
Define whether multiple spaces should be compressed into a single space. This does not affect other "white space" characters.

Parameters:
flag - true to compress multiple spaces into a single space
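
As an illustration of what "compress multiple spaces, but leave other white space alone" could mean in practice, the standalone sketch below collapses runs of the space character only. CompressSpaceSketch is a hypothetical helper written for this example and is not taken from the SAP source.

```java
// Hypothetical illustration: collapse runs of ' ' while keeping tabs and
// line breaks exactly as they are.
final class CompressSpaceSketch {

    static String compressSpaces(String text) {
        StringBuilder out = new StringBuilder(text.length());
        boolean previousWasSpace = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean isSpace = (c == ' ');
            // Emit the character unless it is a space directly after a space.
            if (!(isSpace && previousWasSpace)) {
                out.append(c);
            }
            previousWasSpace = isSpace;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compressSpaces("a   b\t\tc"));
    }
}
```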

setNoCR

public void setNoCR(boolean flag)
You can tell the filter to remove carriage returns (\n and \r) when parsing. This allows the caller to insert CRs where desired, since CRs in HTML do not confer any special meaning.

Parameters:
flag - remove carriage returns during parsing and substitute spaces
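
A minimal sketch of the described effect, assuming each \r and \n is replaced by exactly one space (the original does not say whether a \r\n pair yields one space or two). NoCrSketch is a hypothetical helper for this example only.

```java
// Hypothetical illustration: substitute a space for every '\r' and '\n'.
final class NoCrSketch {

    static String replaceLineBreaks(String text) {
        // Assumption: one space per character, so "\r\n" becomes two spaces.
        return text.replace('\r', ' ').replace('\n', ' ');
    }

    public static void main(String[] args) {
        System.out.println(replaceLineBreaks("line one\r\nline two"));
    }
}
```

Combined with space compression, the doubled space produced by a \r\n pair would collapse to a single space.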

getNoCR

public boolean getNoCR()
Returns whether the filter removes carriage returns (\n and \r) when parsing. See setNoCR(boolean).

Returns:
true if carriage returns are removed during parsing

main

public static void main(String[] args)
The main program, for testing.

Parameters:
args - The command line arguments


Copyright 2006 SAP AG. All rights reserved.