Setting Up Distributed Preprocessing

Use

The procedure below explains how to implement distributed preprocessing. The description assumes that:

  • You have set up a distributed system with at least one master host.

  • You want to connect a host that exclusively preprocesses documents (preprocessor host). You want the preprocessors on this host to have as many system resources as possible.

Procedure

Adding a Preprocessor Host to the Distributed System

  1. Install TREX on the preprocessor host. During the installation specify the number of preprocessors to run on the host.

  2. If TREX is not running, start it.

  3. Start the TREX admin tool on a host that is already configured in the distributed system.

  4. Go to the Landscape Configuration window.

  5. Use Add Host to add the new preprocessor host.

Configuring Preprocessor Hosts

  1. Choose index as the preprocessor mode for the preprocessor host.

  2. Configure the TREX daemon on the preprocessor host so that only the name server and preprocessors run there:

    1. Select the host in question and choose Edit Services.

    2. Change the programs parameter as follows:

      [daemon]

      programs = nameserver, preprocessor1, ..., preprocessor<n>

  3. Go to the Landscape Services window.

  4. Select one of the servers to run on the preprocessor host. Choose Start New/Stop Removed Services@<hostname>(*) from the context menu.
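The value of the programs parameter follows a regular pattern, so it can be generated for any number of preprocessors. The following is a minimal sketch, not part of TREX: the helper name daemon_programs is hypothetical, and the sketch assumes the preprocessors are numbered consecutively from 1 to n, as in the fragment above.

```python
def daemon_programs(preprocessor_count: int) -> str:
    """Build the programs value for a host that runs only the
    name server and the given number of preprocessors.

    Hypothetical helper for illustration; assumes preprocessors are
    named preprocessor1 .. preprocessor<n>.
    """
    if preprocessor_count < 1:
        raise ValueError("a preprocessor host needs at least one preprocessor")
    names = ["nameserver"]
    names += [f"preprocessor{i}" for i in range(1, preprocessor_count + 1)]
    return ", ".join(names)


# Example: a host installed with three preprocessors.
print("[daemon]")
print("programs = " + daemon_programs(3))
```

The printed lines can be compared with the entry made under Edit Services to confirm that no other services are listed for the host.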

Configuring Master and Backup Hosts

  1. Go to the Landscape Ini window.

  2. Determine the maximum possible number of preprocessor threads across all hosts that preprocess documents. Include every host on which a preprocessor runs in either any or index mode.

    For more information about the calculation, see Number of Preprocessors and Preprocessor Threads.

  3. Calculate the pool size for each queue server.

  4. Edit the configuration file TREXQueueServer.ini for all queue servers. Enter the calculated value in the parameter poolsize.

  5. Go to the Landscape Services window.

  6. Select a queue server whose configuration you have changed. Choose Restart queueserver@<hostname>:<port> from the context menu.

    Carry out this step for all other queue servers.

    The queue servers are automatically restarted by the TREX daemon.
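The thread count and the poolsize entry can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the TREX procedure itself: the host dictionaries, the mode value "other", the section name "queueserver", and the example pool size are all hypothetical, and the sketch assumes that each preprocessor on a host uses the same number of threads. The formula for the pool size is not reproduced here; take it from Number of Preprocessors and Preprocessor Threads.

```python
import configparser
import io


def total_preprocessor_threads(hosts):
    """Sum the preprocessor threads of all hosts in any or index mode.

    Assumption: threads per host = preprocessors * threads_per_preprocessor.
    """
    return sum(
        host["preprocessors"] * host["threads_per_preprocessor"]
        for host in hosts
        if host["mode"] in ("any", "index")
    )


def set_poolsize(ini_text, section, poolsize):
    """Return ini_text with the poolsize parameter set in the given section.

    The section name is supplied by the caller because it is not given
    in this document.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep parameter names exactly as written
    parser.read_string(ini_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "poolsize", str(poolsize))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical landscape: only hosts in any or index mode are counted.
hosts = [
    {"name": "master", "mode": "any", "preprocessors": 1, "threads_per_preprocessor": 2},
    {"name": "prep1", "mode": "index", "preprocessors": 3, "threads_per_preprocessor": 2},
    {"name": "extra", "mode": "other", "preprocessors": 2, "threads_per_preprocessor": 2},
]
print("preprocessor threads:", total_preprocessor_threads(hosts))

# Placeholder pool size; replace with the value calculated in step 3.
calculated_poolsize = 8
print(set_poolsize("[queueserver]\npoolsize = 1\n", "queueserver", calculated_poolsize))
```

Working on the text of the file rather than on the file itself keeps the sketch free of side effects; the result still has to be entered in TREXQueueServer.ini for every queue server, followed by the restart described above.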

Result

You can check whether the preprocessors are receiving as many system resources as possible by looking at the CPU load for the hosts in question in the TREX admin tool. When documents are being preprocessed, the CPU usage should be at the upper limit.