Data Storage

In a distributed system, you can keep TREX data (indexes, queues, and index snapshots) either centrally or on the individual hosts.

Decentralized Data Storage

If data is not kept centrally, each host stores its data in its own directory structure. The data normally resides on the local disks of each host.

The following graphic depicts the data and directory structure with decentralized data storage:

The master indexes, corresponding queues, and the index snapshots are located on a master host. The index snapshots are index copies that the system needs for index replication.

The slave indexes are located on a slave host. They are created and updated by index replication. There is no other data on the slave hosts.

You cannot use backup hosts in systems where data storage is decentralized. This means that you cannot make indexing highly available in such systems.
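
The restriction on backup hosts can be expressed as a simple planning check. The following is a minimal sketch only, not part of TREX: the class LandscapePlan, the function check_plan, and the host names are hypothetical and exist purely to illustrate the rule that backup hosts require centralized data storage.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LandscapePlan:
    """Hypothetical description of a planned TREX landscape (not a TREX API)."""
    centralized_storage: bool
    master_hosts: List[str] = field(default_factory=list)
    slave_hosts: List[str] = field(default_factory=list)
    backup_hosts: List[str] = field(default_factory=list)


def check_plan(plan: LandscapePlan) -> List[str]:
    """Return a list of problems; an empty list means the plan is consistent."""
    problems = []
    # Rule from the text: backup hosts cannot be used when data storage
    # is decentralized.
    if plan.backup_hosts and not plan.centralized_storage:
        problems.append(
            "Backup hosts require centralized data storage: "
            + ", ".join(plan.backup_hosts)
        )
    return problems


# Example with made-up host names: a decentralized plan that lists a backup host.
plan = LandscapePlan(
    centralized_storage=False,
    master_hosts=["master01"],
    slave_hosts=["slave01", "slave02"],
    backup_hosts=["backup01"],
)
for problem in check_plan(plan):
    print(problem)
```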

Centralized Data Storage

With centralized data storage, the data is stored so that all TREX hosts can access it.

Centralized data storage can be realized with different hardware solutions: The data can be located on a server that is optimized for file sharing, in a storage area network (SAN), or on a network-attached storage server (NAS server). The connection between the TREX hosts and the data must be sufficiently fast. In the following documentation, a central storage location is referred to as a file server regardless of the underlying hardware.

Centralized data storage is necessary if you want indexing to be highly available. You can switch from a master index or queue server to a backup index or queue server only if you are using centralized data storage. You can use standard solutions such as a RAID system to make the data itself highly available.

Centralized data storage also has the following advantages if you are only using master and slave hosts:

Index replication generates less of a network load because the replicated files do not have to be copied onto every slave host.
Index replication is quicker.
Less disk space is required for the replicated indexes because all slave hosts share an index copy.
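
The disk space advantage can be illustrated with simple arithmetic. The sketch below is not TREX code. The function name and the sample figures (a 20 GB index and four slave hosts) are assumptions chosen for illustration, and the sketch assumes that decentralized storage keeps one full index copy per slave host, whereas centralized storage keeps a single copy shared by all slave hosts.

```python
def replicated_index_footprint_gb(index_size_gb: float,
                                  slave_hosts: int,
                                  centralized: bool) -> float:
    """Disk space in GB occupied by the replicated copies of one index.

    Assumption: one full copy per slave host when storage is decentralized,
    one shared copy when storage is centralized.
    """
    if slave_hosts < 0:
        raise ValueError("slave_hosts must not be negative")
    if slave_hosts == 0:
        return 0.0
    copies = 1 if centralized else slave_hosts
    return index_size_gb * copies


# Made-up example: a 20 GB index replicated for four slave hosts.
decentralized = replicated_index_footprint_gb(20, 4, centralized=False)
central = replicated_index_footprint_gb(20, 4, centralized=True)
print(f"decentralized: {decentralized} GB, centralized: {central} GB")
```

Under the same assumptions, the volume of data copied over the network per replication scales in the same way, which is why replication generates less network load with centralized storage.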

The following graphic depicts the data and directory structure with centralized data storage:

Features of a Blade System

If you do not want to implement individual hosts, you can install TREX on a blade system. TREX supports blade systems that run on UNIX.

A blade system consists of hosts in the form of server blades. A blade system has the advantage that the initial costs and the running costs for maintaining the system are lower than with individual hosts.

The server blades are connected to a central disk storage. This is referred to here as a file server, regardless of the underlying hardware.

The special feature of a TREX installation on a blade system is that the TREX software, as well as the TREX data, can be stored centrally. This means that you only have to install the software once on the file server. Maintaining the system is efficient because you only have to apply software updates once.

All server blades on which TREX is running access the same program files. However, each server blade has its own configuration files. The configuration files in the directory <TREX_DIR> are only used as templates. A script contained in the TREX delivery creates a separate subdirectory for each server blade and copies the configuration files to this subdirectory. For more information, see Activating the Configuration Clones for Server Blades.
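
To make the cloning step concrete, the following sketch shows the general idea of copying template configuration files into one subdirectory per server blade. It is not the script from the TREX delivery and does not reproduce its behavior. The function name clone_config_for_blades, the *.ini file pattern, the use of the blade name as the subdirectory name, and the example path and blade names are all assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def clone_config_for_blades(trex_dir: Path,
                            blade_names: Iterable[str],
                            pattern: str = "*.ini") -> List[Path]:
    """Copy template configuration files into one subdirectory per blade.

    Hypothetical illustration only: the file pattern and the directory
    naming are assumptions, not the layout used by the delivered script.
    """
    # Collect the templates first so that files inside the new
    # subdirectories are never treated as templates themselves.
    templates = sorted(p for p in trex_dir.glob(pattern) if p.is_file())
    created = []
    for blade in blade_names:
        target = trex_dir / blade
        target.mkdir(exist_ok=True)
        for template in templates:
            destination = target / template.name
            # Keep existing files so blade-specific changes are not overwritten.
            if not destination.exists():
                shutil.copy2(template, destination)
        created.append(target)
    return created


if __name__ == "__main__":
    # Made-up path and blade names for demonstration.
    for directory in clone_config_for_blades(Path("/tmp/trex_demo"),
                                             ["blade01", "blade02"]):
        print(directory)
```

The sketch deliberately skips files that already exist in a blade's subdirectory, so running it again does not discard settings that were changed for an individual blade.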

Apart from running this script, configuration is the same as for a system with individual hosts.

The graphic below depicts how data, programs, and configuration files might be stored in a blade system.