The queue parameters in the table below define the following:
Queue Parameter | Purpose |
---|---|
Index Bulk Size | Maximum number of documents to be indexed at one time. If more documents are awaiting indexing, the queue server distributes them over several indexing runs. Example: Indexing takes place at hourly intervals, and a maximum of 1000 documents is to be indexed at a time. When the start condition is next reached, 3000 documents are ready for indexing. The queue server distributes the documents over three indexing runs. If fewer documents are ready for indexing than the parameter specifies, the queue server nevertheless triggers indexing. Example: With the same settings, only 900 documents are ready for indexing when the start condition is next reached. Although this is fewer than specified in the Index Bulk Size parameter, the queue server nevertheless triggers indexing. |
Max Size of Index Bulk | Maximum number of bytes to be indexed at one time. The duration of indexing depends on the size of the documents. If documents are several MB in size, indexing takes correspondingly longer. This parameter therefore defines an upper limit for the data quantity. If the documents exceed this limit, the queue server distributes them over several indexing runs. |
Optimize Bulk Size | Number of indexing intervals after which the queue server triggers optimizing. In the optimization phase, the index server rebuilds the index. It inserts new documents in the index, removes deleted objects from the index, and optimizes the index structure so that it can reply to search queries as quickly as possible. While optimization is running, the queue server does not trigger any more indexing. It waits until optimization is completed. |
Initial Indexing Mode | Specifies the situations in which the queue server triggers optimizing. This parameter influences the performance of the initial indexing of large data sets (100,000 documents or more). The settings used in the examples below are On and Off. |
When changing the parameters, remember that if the queue has the status Indexing or Optimizing, the changes do not affect actions that are currently taking place. The changes only take effect when the queue server has completed the actions.
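The way Index Bulk Size and Max Size of Index Bulk split a backlog into several indexing runs can be illustrated with a short sketch. This is not TREX code: the function name `plan_indexing_runs` and the rule that a single oversized document gets a run of its own are assumptions made for illustration only.

```python
from typing import List


def plan_indexing_runs(doc_sizes: List[int],
                       index_bulk_size: int,
                       max_bulk_bytes: int) -> List[List[int]]:
    """Group pending documents (given as sizes in bytes) into indexing runs.

    A run is closed as soon as adding the next document would exceed
    either the document count limit or the byte limit.
    """
    runs: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0
    for size in doc_sizes:
        too_many = len(current) >= index_bulk_size
        too_big = bool(current) and current_bytes + size > max_bulk_bytes
        if too_many or too_big:
            runs.append(current)
            current, current_bytes = [], 0
        # Assumption: a document larger than the byte limit still gets
        # indexed, in a run of its own.
        current.append(size)
        current_bytes += size
    if current:
        runs.append(current)
    return runs


# 3000 small documents with Index Bulk Size = 1000 give three runs.
three_runs = plan_indexing_runs([1] * 3000, 1000, 10**9)
# 900 documents are fewer than the limit, but still form one run.
one_run = plan_indexing_runs([1] * 900, 1000, 10**9)
```

In the sketch, whichever limit is reached first closes the current run, which mirrors the description of the two parameters as independent upper limits.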
Initial indexing of large database tables
You want to index a large database table with around 200 million data records. TREX treats each data record as a document that consists only of attributes.
The queue server and index server should process the documents as efficiently as possible. Both servers should have a reasonable load. To avoid idle time and reduce administration overheads, the document sets should not be too small. However, the document sets should also not be so large that they overload both servers.
The application sends the table content to TREX in packages of 25,000 documents each. The queue server should always collect 100,000 documents before triggering indexing. A maximum of 50,000 documents should be indexed at a time. The index server should only carry out optimization once it has indexed 20 million documents. This means that 400 indexing runs of 50,000 documents each must take place before the queue server triggers optimization.
In this case, you set the queue parameters as follows:
Parameters | Value |
---|---|
Schedule Type | Count |
Schedule Max Documents | 100000 |
Index Bulk Size | 50000 |
Optimize Bulk Size | 400 |
Initial Indexing Mode | On |
The setting Initial Indexing Mode = On ensures that optimization takes place only after the index server has indexed 20 million documents.
If you set the Initial Indexing Mode to Off and keep the remaining configuration, the system triggers indexing as soon as it has collected 100,000 documents. When indexing is complete (that is, when 2 x 50,000 documents have been transmitted), the queue server triggers optimization. This means that optimization takes place after 100,000 documents have been indexed rather than after 20 million documents.
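The figures in this example can be checked with a short calculation. The sketch below is plain arithmetic on the values given in the text. The helper name `optimize_bulk_size_for` is hypothetical and not part of TREX.

```python
def optimize_bulk_size_for(docs_per_optimization: int,
                           index_bulk_size: int) -> int:
    """Number of indexing runs that must complete before optimization."""
    runs, remainder = divmod(docs_per_optimization, index_bulk_size)
    if remainder:
        raise ValueError("target must be a multiple of Index Bulk Size")
    return runs


total_records = 200_000_000        # size of the database table
schedule_max_documents = 100_000   # documents collected per trigger
index_bulk_size = 50_000           # documents per indexing run

# 20 million documents per optimization -> Optimize Bulk Size = 400
optimize_bulk_size = optimize_bulk_size_for(20_000_000, index_bulk_size)

# Each trigger of 100,000 collected documents is split into two runs.
runs_per_trigger = schedule_max_documents // index_bulk_size

# Over the whole table: 4000 indexing runs and 10 optimizations with
# Initial Indexing Mode = On.
total_runs = total_records // index_bulk_size
optimizations_mode_on = total_runs // optimize_bulk_size

# With Initial Indexing Mode = Off, the text states that optimization
# follows every 100,000 documents, which would give 2000 optimizations.
optimizations_mode_off = total_records // schedule_max_documents
```

The difference between 10 and 2000 optimization phases shows why the mode matters for the initial indexing of large data sets.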
Initial indexing of large document collections
You want to index 200 million documents (Word files, PDF files, and so on). Indexing should be as efficient as possible.
Processing documents with text content takes much more effort than processing documents that consist only of attributes. Therefore, the queue server should always collect only 10,000 documents before triggering indexing. You should only index a maximum of 10,000 documents at a time, with a maximum of 100 MB. The index server should trigger optimization once 100,000 documents have been indexed.
In this case, you set the queue parameters as follows:
Parameters | Value |
---|---|
Schedule Type | Count |
Schedule Max Documents | 10000 |
Index Bulk Size | 10000 |
Max Size of Index Bulk | 104857600 (corresponds to 100 MB) |
Optimize Bulk Size | 10 |
Initial Indexing Mode | On |
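Because Max Size of Index Bulk is entered in bytes, it can help to derive the value from a size in megabytes. The sketch below assumes binary units (1 MB = 1024 x 1024 bytes). The helper `mb_to_bytes` is a hypothetical name used only for this illustration.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def mb_to_bytes(megabytes: int) -> int:
    """Convert a size in MB to the byte value expected by the parameter."""
    return megabytes * BYTES_PER_MB


limit_100_mb = mb_to_bytes(100)    # 104857600 bytes
limit_1_gb = mb_to_bytes(1024)     # 1073741824 bytes, that is, 1 GB

# Optimize Bulk Size for this scenario: optimization after 100,000
# documents with 10,000 documents per indexing run.
optimize_bulk_size = 100_000 // 10_000
```

Under this assumption, a limit of 100 MB is 104857600 bytes, while 1073741824 bytes is a limit of 1 GB.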
Daily update of index
You want to update a large index every day. To avoid creating a heavy load during the day, you schedule the update for 2:00 AM.
In this case, you set the queue parameters as follows:
Parameters | Value |
---|---|
Schedule Type | Time |
Schedule Time | All(02:00 AM) |
Index Bulk Size | 10000 |
Max Size of Index Bulk | 104857600 (corresponds to 100 MB) |
Optimize Bulk Size | 1 |
Initial Indexing Mode | Off |
The queue server triggers indexing at 2:00 AM. If there are 10,000 documents or fewer, and the quantity of data does not exceed 100 MB, the documents are indexed in one run and then optimized.
If there are more than 10,000 documents, the queue server triggers indexing and optimizing for the first 10,000 documents. The queue server waits until the optimization of these documents is complete. It then processes the next batch of documents.
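The alternation of indexing and optimizing described above can be sketched as a simple loop. This is an illustrative model only, not TREX behavior taken from code: the function name `nightly_actions` is hypothetical, and the sketch ignores the byte limit and considers only the document count.

```python
from typing import List


def nightly_actions(pending_docs: int,
                    index_bulk_size: int = 10_000,
                    optimize_bulk_size: int = 1) -> List[str]:
    """List the actions the queue server would trigger, in order."""
    actions: List[str] = []
    runs_since_optimize = 0
    while pending_docs > 0:
        batch = min(pending_docs, index_bulk_size)
        actions.append(f"index {batch}")
        pending_docs -= batch
        runs_since_optimize += 1
        # With Optimize Bulk Size = 1, every run is followed by an
        # optimization, and the next run waits until it has finished.
        if runs_since_optimize == optimize_bulk_size:
            actions.append("optimize")
            runs_since_optimize = 0
    return actions


small_night = nightly_actions(8_000)
busy_night = nightly_actions(25_000)
```

For 25,000 pending documents the model yields three indexing runs (10,000, 10,000, and 5,000 documents), each followed by an optimization.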
Hourly update of index
You want to index a smaller document set that changes continually. This index should be updated hourly.
Because the document set to be indexed initially is not large, set the queue parameters as follows straight away:
Parameters | Value |
---|---|
Schedule Type | Time |
Schedule Time | All-1 |
Index Bulk Size | 10000 |
Max Size of Index Bulk | 104857600 (corresponds to 100 MB) |
Optimize Bulk Size | 1 |
Initial Indexing Mode | Off |