
Setting Up Distributed Preprocessing

Use

The procedure below explains how to implement distributed preprocessing. The description assumes that:

  • You have set up a distributed system with at least one master host.
  • You want to connect a host that exclusively preprocesses documents (preprocessor host). You want the preprocessors on this host to have as many system resources as possible.

Adding a Preprocessor Host to the Distributed System

  1. Install TREX on the preprocessor host. During the installation specify the number of preprocessors to run on the host.
  2. If TREX is not running, start it.
  3. Start the TREX admin tool on a host that is already configured in the distributed system.
  4. Go to the Landscape Configuration window.
  5. Use Add Host to add the new preprocessor host.

Configuring Preprocessor Hosts

  1. Choose the preprocessor mode index for the preprocessor host.
  2. Configure the TREX daemon on the preprocessor host so that only the name server and preprocessors run there:
    1. Select the host in question and choose Edit Services.
    2. Change the programs parameter as follows:

      [daemon]

      programs = nameserver, preprocessor1, ..., preprocessor<n>

  3. Go to the Landscape Services window.
  4. Select one of the servers to run on the preprocessor host. Choose Start New/Stop Removed Services@<hostname>(*) from the context menu.
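The programs value from step 2 can be written out for a concrete number of preprocessors. The following is a minimal sketch in Python, assuming the entries follow the nameserver, preprocessor1, ..., preprocessor<n> pattern shown above. The function name daemon_programs_line is hypothetical and is not part of TREX.

```python
def daemon_programs_line(preprocessor_count: int) -> str:
    """Build the 'programs' line for a host that runs only the name server
    and the given number of preprocessors.

    Assumption: entry names follow the pattern shown in the procedure
    (nameserver, preprocessor1, ..., preprocessor<n>).
    """
    if preprocessor_count < 1:
        raise ValueError("a preprocessor host needs at least one preprocessor")
    names = ["nameserver"]
    names += [f"preprocessor{i}" for i in range(1, preprocessor_count + 1)]
    return "programs = " + ", ".join(names)


if __name__ == "__main__":
    # Print the fragment for a host with three preprocessors.
    print("[daemon]")
    print(daemon_programs_line(3))
```

Use the same number of preprocessors that you specified during the installation on the preprocessor host.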

Configuring Master and Backup Hosts

  1. Go to the Landscape Ini window.
  2. Determine the maximum possible number of preprocessor threads for all hosts that preprocess documents. Take into account every host on which a preprocessor is running in either the any mode or the index mode.

    For more information about the calculation, see Preprocessor Threads and Queue Server Pool Size.

  3. Calculate the pool size for each queue server.

    For more information, see Preprocessor Threads and Queue Server Pool Size.

  4. Edit the configuration file TREXQueueServer.ini for all queue servers. Enter the calculated value in the parameter poolsize.
  5. Go to the Landscape Services window.
  6. Select a queue server whose configuration you have changed. Choose Restart queueserver@<host_name>:<port> from the context menu.

    Carry out this step for all other queue servers.

    The queue servers are automatically restarted by the TREX daemon.
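If you maintain TREXQueueServer.ini outside the admin tool, the change to poolsize in step 4 could be scripted. The following is a minimal sketch in Python, assuming that the file contains exactly one line of the form poolsize = <value>. The function name set_poolsize, the section name in the sample text, and the value 12 are placeholders for illustration only. The real value comes from the calculation described in Preprocessor Threads and Queue Server Pool Size.

```python
import re

# Matches one "poolsize = <value>" line; only spaces and tabs are allowed
# around the name and the equals sign, so the match never spans lines.
_POOLSIZE_LINE = re.compile(r"^([ \t]*poolsize[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_poolsize(ini_text: str, poolsize: int) -> str:
    """Return ini_text with the value of the poolsize parameter replaced.

    Assumption: the text contains exactly one poolsize line. All other
    lines, including comments, are left unchanged.
    """
    if poolsize < 1:
        raise ValueError("poolsize must be a positive integer")
    new_text, count = _POOLSIZE_LINE.subn(
        lambda match: match.group(1) + str(poolsize), ini_text
    )
    if count != 1:
        raise ValueError(f"expected exactly one poolsize line, found {count}")
    return new_text


if __name__ == "__main__":
    # Placeholder content; the section name is hypothetical.
    sample = "[example_section]\npoolsize = 4\n"
    print(set_poolsize(sample, 12))
```

The sketch only rewrites the text. You still have to restart each queue server as described in step 6 for the new value to take effect.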

Result

You can check whether the preprocessors are receiving as many system resources as possible by looking at the CPU load for the hosts in question in the TREX admin tool. When documents are being preprocessed, the CPU usage should be at the upper limit.