This SAP Manufacturing Integration and Intelligence (SAP MII) action is used to do the following:
Retrieve an HTML page
Look for text patterns in the source
Return data elements in the pattern
To retrieve a pattern that spans multiple lines, use the symbol {WS} to ignore white space, line breaks, and so on. To return a value, replace it in the pattern with a variable name surrounded by curly brackets, {}.
To scrape data from an HTML page, first load the page with the HTML Loader action. You can then link its StringContent property to the Source property of this action.
The properties for this action are listed in the following table:
Property | Data Type | Access | Description |
Source | String | In and out | The HTML page source. |
Pattern | String | In and out | The pattern used to find data in the HTML source. |
Output | String | In and out | An XML document in SAP MII XML format. |
Success | Boolean | Out | Indicates whether the action succeeded or failed. If it failed, errors are displayed in the server trace log. |
The following source HTML exists:
<TABLE ALIGN="CENTER" BORDER="5">
<TR>
<TD width="90" align="center"><B>Interface</B></TD>
<TD width="100" align="center"><B>Actual flow</B></TD>
<TD width="100" align="center"><B>Warning Level</B></TD>
<TD width="100" align="center"><B>Transfer Limit</B></TD>
</TR>
<TR>
<TD>EAST</TD>
<TD align="right">4824</TD>
<TD align="right">5473</TD>
<TD align="right">5761</TD>
</TR>
<TR>
<TD>CENTRAL</TD>
<TD align="right">3698</TD>
<TD align="right">4169</TD>
<TD align="right">4388</TD>
</TR>
<TR>
<TD>WEST</TD>
<TD align="right">5383</TD>
<TD align="right">5919</TD>
<TD align="right">6230</TD>
</TR>
<TR>
<TD>APSOUTH</TD>
<TD align="right">2902</TD>
<TD align="right">3034</TD>
<TD align="right">3194</TD>
</TR>
<TR>
<TD>BED-BLA</TD>
<TD align="right">1809</TD>
<TD align="right">1788</TD>
<TD align="right">1882</TD>
</TR>
</TABLE>
It appears in the following way in your browser:
Interface | Actual flow | Warning Level | Transfer Limit |
EAST | 4824 | 5473 | 5761 |
CENTRAL | 3698 | 4169 | 4388 |
WEST | 5383 | 5919 | 6230 |
APSOUTH | 2902 | 3034 | 3194 |
BED-BLA | 1809 | 1788 | 1882 |
To return each row of data, use the following match pattern:
<TR>{WS}<TD>{INTERFACE}</TD>{WS}<TD align="right">{ACTUAL}</TD>{WS}<TD align="right">{WARNING}</TD>{WS}<TD align="right">{LIMIT}</TD>{WS}</TR>
In the pattern, each piece of data that you want to retrieve is replaced by a variable name in curly brackets. For example, the Interface column value appears in the source as follows:
<TR>
<TD>EAST</TD>
The match pattern to return that piece of data is:
<TR>{WS}<TD>{INTERFACE}</TD>
Here, <TR> is followed by the white space symbol {WS} to ignore the line break, and the EAST value that you want returned is replaced by a variable named INTERFACE. Placing the variable name in curly brackets declares it to the action, which then places the data value EAST into the variable INTERFACE. Because the pattern names a variable rather than a literal value, it matches every data row in the table, not only the EAST row.
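The matching behavior described above can be approximated outside SAP MII to test a pattern before using it in a transaction. The following Python sketch is not the action's implementation. It assumes that {WS} matches any run of white space (including none), that every other {NAME} captures the shortest text on a single line that lets the rest of the pattern match, and that all remaining characters are matched literally. The function names pattern_to_regex and scrape are hypothetical.

```python
import re


def pattern_to_regex(pattern):
    """Translate a scraper-style pattern into a compiled regular expression.

    Assumptions (not taken from the SAP MII implementation):
    {WS} becomes optional white space, any other {NAME} becomes a named
    non-greedy capture group, and all other text is matched literally.
    """
    parts = []
    pos = 0
    for token in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between placeholders must match exactly.
        parts.append(re.escape(pattern[pos:token.start()]))
        name = token.group(1)
        if name == "WS":
            parts.append(r"\s*")
        else:
            parts.append("(?P<%s>.*?)" % name)
        pos = token.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


def scrape(source, pattern):
    """Return one dict of variable name -> value for each match in source."""
    regex = pattern_to_regex(pattern)
    return [match.groupdict() for match in regex.finditer(source)]


# Shortened copy of the sample page: the header row plus two data rows.
SOURCE = """<TABLE ALIGN="CENTER" BORDER="5">
<TR>
<TD width="90" align="center"><B>Interface</B></TD>
<TD width="100" align="center"><B>Actual flow</B></TD>
<TD width="100" align="center"><B>Warning Level</B></TD>
<TD width="100" align="center"><B>Transfer Limit</B></TD>
</TR>
<TR>
<TD>EAST</TD>
<TD align="right">4824</TD>
<TD align="right">5473</TD>
<TD align="right">5761</TD>
</TR>
<TR>
<TD>CENTRAL</TD>
<TD align="right">3698</TD>
<TD align="right">4169</TD>
<TD align="right">4388</TD>
</TR>
</TABLE>"""

PATTERN = (
    '<TR>{WS}<TD>{INTERFACE}</TD>{WS}<TD align="right">{ACTUAL}</TD>{WS}'
    '<TD align="right">{WARNING}</TD>{WS}<TD align="right">{LIMIT}</TD>{WS}</TR>'
)

rows = scrape(SOURCE, PATTERN)
for row in rows:
    print(row)
```

Under these assumptions the header row is skipped, because its cells carry width attributes and so do not match the literal <TD> that the pattern requires after <TR>.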
The resulting XML document is the output of the action and is in standard SAP MII XML format. It can be sent to an applet through a transaction variable, linked to another document, or written to a database.
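To give a rough idea of the shape of such a document, the following Python sketch builds a simplified Rowsets/Rowset/Row structure from two of the scraped rows. It is an approximation for illustration only: the real action output carries additional attributes and column metadata that are omitted here, and the function name rows_to_rowsets_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def rows_to_rowsets_xml(rows):
    """Build a simplified Rowsets document from a list of row dicts.

    Assumption: one <Column> per variable name and one <Row> per match,
    with each value stored in an element named after its variable.
    """
    rowsets = ET.Element("Rowsets")
    rowset = ET.SubElement(rowsets, "Rowset")
    columns = ET.SubElement(rowset, "Columns")
    names = list(rows[0].keys()) if rows else []
    for name in names:
        ET.SubElement(columns, "Column", Name=name)
    for row in rows:
        row_element = ET.SubElement(rowset, "Row")
        for name in names:
            ET.SubElement(row_element, name).text = row[name]
    return ET.tostring(rowsets, encoding="unicode")


# Two rows as they would be captured from the sample table.
rows = [
    {"INTERFACE": "EAST", "ACTUAL": "4824", "WARNING": "5473", "LIMIT": "5761"},
    {"INTERFACE": "CENTRAL", "ACTUAL": "3698", "WARNING": "4169", "LIMIT": "4388"},
]

xml_text = rows_to_rowsets_xml(rows)
print(xml_text)
```

Each variable declared in the match pattern becomes a column, and each match becomes one row.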