
Crawler Monitor

Use

You can use the crawler monitor to monitor and control the activity of crawlers.

 

Integration

In a load-balanced environment, crawlers run on the system to which the task queue reader of the index service is assigned. The ID of this system is displayed in the detailed view of crawling tasks.

 

Features

Every crawler carries out a crawling process on the server. The crawler monitor displays a list of these crawling tasks.

You can call up information on a crawling task while it is running or after it has completed.

You can switch between three views: Overview, Delivered, and Statistics (see below).

You can display all active, all suspended, and all previous crawling tasks. You can also restrict the display to crawling tasks that ran within the last hour, day, or week.

You can sort the list of crawling tasks according to different criteria. Select the sort criterion you want from the sort field in the upper right-hand corner. The arrow to the right of the sort field indicates whether the list is sorted in ascending or descending order. You can reverse the sort order by clicking on this arrow.

 

Note

The crawler monitor always shows the last run of a crawling task. For information on previous runs, see Application Log.

 

View: Overview

This view displays the current statistics for a crawling task.

Task

Name of the crawling task.

The name consists of the index ID and the repository name. If multiple data sources are assigned to an index, a crawling task is generated for each data source.

To call up detailed information on a crawling task, click on its name.

Starting Point

Data source that the crawling task is processing.

To open the data source, click on the link.

State

Current status of the crawler.

Inactive: Process is not yet active or was terminated.

Starting: Process is starting.

Running: Process is running.

Suspending: Process is being suspended.

Suspended: Process was suspended manually and can be continued by clicking on Resume.

Resuming: Process is being resumed.

Postprocessing: Objects are being post-processed.

Done: Process is completely finished.

Failed: Process failed.
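
The states above form a simple lifecycle. As an illustration, the transitions implied by the descriptions can be modeled as follows; the transition table is an assumption based on this documentation, not SAP's actual implementation:

```python
# Illustrative model of the crawler lifecycle described above.
# State names follow the monitor; the transition table is inferred
# from the state descriptions, not taken from SAP code.

TRANSITIONS = {
    "Inactive":       {"Starting"},
    "Starting":       {"Running", "Failed"},
    "Running":        {"Suspending", "Postprocessing", "Failed"},
    "Suspending":     {"Suspended"},
    "Suspended":      {"Resuming"},   # continued by clicking Resume
    "Resuming":       {"Running"},
    "Postprocessing": {"Done", "Failed"},
    "Done":           set(),          # terminal
    "Failed":         set(),          # terminal
}

def can_transition(current: str, target: str) -> bool:
    """Return True if the monitor could show `target` directly after `current`."""
    return target in TRANSITIONS.get(current, set())
```

Note that Done and Failed are terminal here: a finished or failed task is not resumed but restarted from the index administration area.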

Elapsed Time

Time elapsed since the crawler was started (including intentional interruptions), in hours, minutes, and seconds.

There may be a small wait before the crawler starts.

Delivered

Number of documents and folders transmitted for further processing by TREX or other applications.

Incremental

Specifies whether the update is incremental.

Errors

Number of errors that occurred.

Processing Average (ms)

Average processing time for an object in milliseconds.

This is the time that passes between calling the object and delivering it. It does not include the time taken for database operations.
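The averages shown by the monitor are plain arithmetic means. A minimal sketch of how such a value could be derived from per-object timings (the data layout is hypothetical; database time is assumed to be excluded upstream, as the definition above states):

```python
def processing_average_ms(timings):
    """Average time between calling an object and delivering it, in ms.

    `timings` is a hypothetical list of (retrieved_at_ms, delivered_at_ms)
    pairs, one per object. Database operations are assumed to be excluded
    from these timestamps, matching the monitor's definition.
    """
    if not timings:
        return 0.0
    return sum(end - start for start, end in timings) / len(timings)
```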

 

View: Delivered

This view displays up-to-date information for delivered objects.

Task

Name of the crawling task (description: see Overview view)

State

Current status of the crawler (description: see Overview view)

Processed

Number of documents that have been processed by the crawler.

This value need not match the number of delivered documents and folders, since filters have not yet been applied at this point.

Provided

Number of documents that the crawler has processed and made available to TREX or other applications.

New

Number of new documents in an incremental update.

Changed

Number of changed documents in an incremental update.

Deleted

Number of deleted documents in an incremental update.
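The New, Changed, and Deleted counters can be pictured as a comparison of the current crawl against the previous one. A simplified sketch (the real crawler works incrementally against the index, not on full snapshots as shown here):

```python
def incremental_counts(previous, current):
    """Classify documents for an incremental update.

    `previous` and `current` are hypothetical mappings from document ID
    to a change marker (e.g. a modification timestamp).
    Returns (new, changed, deleted) counts.
    """
    new = sum(1 for doc in current if doc not in previous)
    deleted = sum(1 for doc in previous if doc not in current)
    changed = sum(
        1 for doc, stamp in current.items()
        if doc in previous and previous[doc] != stamp
    )
    return new, changed, deleted
```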

 

View: Statistics

This view displays the current statistics for a crawling task.

Task

Name of the crawling task (description: see Overview view)

State

Current status of the crawler (description: see Overview view)

Delivered

Number of delivered documents and folders.

Processing Errors

Number of errors that occurred during processing.

Retrieving Errors

Number of errors that occurred when the objects were being called.

Providing Errors

Number of errors that occurred when the objects were being forwarded.

Bad Links

Number of links with errors.

Filtered

Number of documents that were filtered.

Retrieving Time

In hours, minutes, and seconds.

Providing Time

In hours, minutes, and seconds.

Retrieving Average (ms)

Average time taken to call a document in milliseconds.

Providing Average (ms)

Average time taken to forward a processed document in milliseconds.

 

Note that crawlers used by the content exchange service or the subscription service are only visible in the crawler monitor at certain times. If you interrupt the crawling tasks of the subscription service by restarting the portal, they are continued at the next time entered in the corresponding scheduler tasks.

 

Detailed Information on Crawling Tasks

When you click on the name of a crawling task, detailed information on the selected crawling task is displayed in a new window. This information is split into groups. Any log files that exist for the crawler can also be called up here.

To update the view, choose Refresh. You can also set an automatic refresh for the window by choosing the required interval from the Auto Refresh dropdown box.
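Auto Refresh simply re-reads the task details at a fixed interval. A generic polling sketch; the `fetch` callable is a stand-in for whatever retrieves the task details, not an actual SAP API:

```python
import time

def auto_refresh(fetch, interval_seconds, max_polls):
    """Poll `fetch` every `interval_seconds` and yield each result.

    `fetch` is a hypothetical stand-in for the call that retrieves
    the crawling task details shown in the monitor window.
    """
    for _ in range(max_polls):
        yield fetch()
        time.sleep(interval_seconds)
```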

To display information on documents that the selected crawler is currently accessing, choose On from the dropdown box Show Documents.

 

Note

If the display does not change within a few minutes, even after repeated refreshes, check the data sources that the crawler is accessing. For example, a Web server may have slowed down or frozen under high load.

 

Activities

To call up the crawler monitor, choose System Administration → Monitoring → Knowledge Management → Crawler Monitor.

 

You can use the following functions:

Suspend

You use this function to suspend the selected crawling tasks. Each crawler notes the position at which it was suspended and can be continued from there later.

Resume

You use this function to resume suspended crawling tasks. They continue from the position at which you suspended them.

Stop

You use this function to stop the selected crawling tasks. Stopped crawling tasks cannot be continued. However, you can restart them using Reindex or Incremental Update in the index administration area. 

Delete

You use this function to remove the selected crawling tasks from the list. The crawler must be stopped. Note that it can take several minutes to delete a large number of documents. You can restart the tasks using the Reindex function in the index administration area.
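The difference between Suspend/Resume and Stop is whether the crawler's position survives. A toy checkpointing crawler illustrating these semantics (purely illustrative, not SAP code):

```python
class ToyCrawler:
    """Illustrates the suspend/resume/stop semantics described above."""

    def __init__(self, documents):
        self.documents = documents
        self.position = 0          # checkpoint noted when suspended
        self.stopped = False

    def run(self, limit=None):
        """Process documents from the last checkpoint; return processed IDs."""
        if self.stopped:
            raise RuntimeError(
                "Stopped tasks cannot be continued; restart via Reindex.")
        end = len(self.documents) if limit is None else min(
            self.position + limit, len(self.documents))
        processed = self.documents[self.position:end]
        self.position = end        # Resume will continue from here
        return processed

    def stop(self):
        self.stopped = True        # the noted position is no longer usable
```

A suspended `ToyCrawler` picks up exactly where it left off, while a stopped one refuses to continue, mirroring the monitor's Resume and Stop functions.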

 

The chosen functions are started after a short time delay.
