
Parsing an XML document event-based

The previous section described how to create a DOM representation of an XML document. In some cases - especially when very large documents are involved or when the DOM representation would be traversed immediately and its content copied into a different data structure (e.g. database tables) - it might be better to use the event-based parsing mode. This parsing mode does not create a complete DOM representation of the parsed document but passes data to the application in small chunks as it is detected and parsed from the XML document.

Event-based parsing of an XML document

Event-based parsing of an XML document allows you to process data from that document in smaller chunks and in the order in which it is found by the parser. Whenever the parser has found a new logical element - e.g. an element, an attribute, text data, a comment, a processing instruction etc. - it will signal the occurrence to your application in a corresponding event. With iXML the event is not just a point in time, but an interface, the if_ixml_event interface. You can use this interface to obtain the associated data - e.g. the element's name or the attribute's value etc. - and trigger the further processing in your application.

To make things clearer, here is an example:

<person status="retired">Walt Whitman</person>

The above XML fragment would - when parsed with the iXML parser - result in the following events being fired in sequence:

| SeqNo | Assoc. DOM node | PIT | Available data |
|---|---|---|---|
| 1 | cl_ixml_element | pre | if_ixml_event::get_name() = "person" |
| 2 | cl_ixml_attribute | pre | if_ixml_event::get_name() = "status" |
| 3 | cl_ixml_attribute | post | if_ixml_event::get_name() = "status"; if_ixml_event::get_value() = "retired" |
| 4 | cl_ixml_element | pre2 | if_ixml_event::get_name() = "person" |
| 5 | cl_ixml_text | pre | if_ixml_event::get_value() = null |
| 6 | cl_ixml_text | post | if_ixml_event::get_value() = "Walt Whitman" |
| 7 | cl_ixml_element | post | if_ixml_event::get_name() = "person" |

PIT in the above table stands for point in time. iXML distinguishes two points in time (three for elements) at which an event can occur: the pre-event and the post-event. The pre-event is fired as soon as the parser can tell what type of logical element comes next. Since the parser has not yet parsed the remaining parts, not all information is available at that point. The post-event is fired when the complete logical element has been parsed and consequently all information is available. For elements, the table above shows the third point in time, pre2, which is fired after the attributes of the start-tag have been reported and before the content of the element. An example: If the XML parser detects a <-character in a document and the following characters are valid in a tag name, it knows that the start-tag of an element is coming. It will parse the name of the element/start-tag and tell your application about this by sending a cl_ixml_event. You can ask the cl_ixml_event for the type of event by calling if_ixml_event::get_type(), and it will answer as honestly as possible with if_ixml_event=>co_event_element_pre.

For a detailed description of the events supported and the information available with each event please refer to the Package Event of this documentation.

Event-based parsing in detail

In event-based parsing, the parser will return events to your application one by one. Whenever a new event has been found, the parser will return from the parsing call and let you do your job. In order to keep the parser going - i.e. parsing the complete document - you have to call the parse method again once you are done with processing the previous event. I call this the "iterator-like" approach since it looks very similar to calling get_next() on an iterator.

When using this iterator-like approach, you have to tell the parser explicitly for which events you want to get control, i.e. at which events the "iterator" is supposed to stop. You can do this by calling the if_ixml_parser::set_event_subscription method and passing an OR-combined list of event types.

To distinguish this parsing approach from the DOM-based one, there is another parse method available: if_ixml_parser::parse_event(). Here's some code illustrating the operation:

data: event     type ref to if_ixml_event,
      event_sub type i,
      str       type string.

* let the parser know which events I am interested in
* (every event type handled in the case statement below must be subscribed)
event_sub = if_ixml_event=>co_event_element_pre2 +
            if_ixml_event=>co_event_element_post +
            if_ixml_event=>co_event_text_post.

parser->set_event_subscription( events = event_sub ).

do.
  event = parser->parse_event( ).
  if event is initial.
    exit. " either end reached or error (check below)
  endif.
  case event->get_type( ).
    when if_ixml_event=>co_event_element_pre2.
      str = event->get_name( ).
      write: '<', str, '>'.
    when if_ixml_event=>co_event_text_post.
      str = event->get_value( ).
      write: str.
    ...
  endcase.
enddo.

* always check for errors:
if parser->num_errors( ) ne 0.
  ...
endif.
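What goes into the error branch depends on your application. As a rough sketch only, the errors collected by the parser could be listed as shown below. It assumes that the parser reference from the loop above is still in scope and that error indices start at 0; the variable names parse_error, err_count, err_index, err_line, err_column and err_reason are merely illustrative.

```abap
data: parse_error type ref to if_ixml_parse_error,
      err_count   type i,
      err_index   type i,
      err_line    type i,
      err_column  type i,
      err_reason  type string.

* number of errors the parser has collected so far
err_count = parser->num_errors( ).

* assumption: error indices are zero-based
err_index = 0.
while err_index < err_count.
  parse_error = parser->get_error( index = err_index ).

* position in the input and a readable description of the problem
  err_line   = parse_error->get_line( ).
  err_column = parse_error->get_column( ).
  err_reason = parse_error->get_reason( ).

  write: / 'line:', err_line, 'column:', err_column, err_reason.

  err_index = err_index + 1.
endwhile.
```

Because parse_event( ) returns an initial reference both at the end of the document and on an error, this check after the loop is the only way to tell the two cases apart.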

Since creating the iXML main factory, the stream factory, the document etc. hasn't changed a bit, I will leave it as an exercise to the reader ;-)
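For readers who would rather skip the exercise, the setup might look roughly like the following sketch. It assumes that the XML is available in a string; the variable names (xml_string in particular) are illustrative and not prescribed by the library, and the declarations have to precede the parsing loop shown above.

```abap
data: ixml           type ref to if_ixml,
      stream_factory type ref to if_ixml_stream_factory,
      istream        type ref to if_ixml_istream,
      document       type ref to if_ixml_document,
      parser         type ref to if_ixml_parser,
      xml_string     type string.

* illustrative input: the fragment used in the example above
xml_string = '<person status="retired">Walt Whitman</person>'.

* the main factory creates all other iXML objects
ixml = cl_ixml=>create( ).

* wrap the XML string in an input stream
stream_factory = ixml->create_stream_factory( ).
istream = stream_factory->create_istream_string( string = xml_string ).

* the parser is still created with a document,
* exactly as in the DOM-based case
document = ixml->create_document( ).

parser = ixml->create_parser( stream_factory = stream_factory
                              istream        = istream
                              document       = document ).
```

From here on, the only difference from the DOM-based approach is that you call set_event_subscription( ) and then parse_event( ) in a loop instead of calling parse( ) once.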

 
