Excluding File Formats from Processing

Use

You can use the list of MIME types in the configuration file TREXValidMimeTypes.ini to control which file formats TREX processes. MIME types for graphic formats such as image/jpeg, image/gif, and image/bmp are not listed in the configuration file, although the filter software integrated into TREX supports these formats (see Supported File Formats). Leaving them out spares TREX the cost of processing formats that are not normally worth indexing. There may be other scenarios in which it makes sense to exclude certain file formats.

Tip

A company archives its financial statements in the form of PDF files. These files contain mostly figures, with hardly any relevant text information. Processing these large files would slow TREX down without adding useful content to the index. It therefore makes sense to exclude these files from processing.
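
The effect of the list can be pictured as a simple membership check. The following Python sketch only models the behavior described above; it is not how TREX is implemented, and the function name is_processed and the sample set are made up for illustration.

```python
def is_processed(mime_type: str, valid_mime_types: set[str]) -> bool:
    """Return True if the MIME type is listed, that is, TREX may process it.

    Assumption: MIME types are compared case-insensitively.
    """
    listed = {m.strip().lower() for m in valid_mime_types}
    return mime_type.strip().lower() in listed


# Hypothetical excerpt of the list in TREXValidMimeTypes.ini.
valid = {"application/pdf", "application/msword", "text/html"}

print(is_processed("application/pdf", valid))  # listed, so it is processed
print(is_processed("image/jpeg", valid))       # not listed, so it is skipped
```

Once application/pdf is removed from the set, the first call would also report that the format is skipped, which corresponds to the financial-statement scenario in the tip.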

Procedure

To exclude documents of a particular file format from being processed by TREX, remove the corresponding MIME type from the configuration file TREXValidMimeTypes.ini. Proceed as follows:

  1. Stop TREX.
  2. Open the configuration file <TREX_installation_directory>\TREXValidMimeTypes.ini with a text editor.

    The configuration file TREXValidMimeTypes.ini is located in the TREX installation directory. The path to the directory is:

    • On UNIX: /usr/sap/trex_<instance_number>
    • On Windows: <disk_drive>:\usr\sap\trex_<instance_number>
  3. Remove the entry for the file format that you want to exclude from the list.
    Tip

    You do not want TREX to process PDF files because such files contain no relevant text information for your scenario. You remove the entry application/pdf from the list of MIME types in the configuration file TREXValidMimeTypes.ini.

  4. Save the file.
  5. Start TREX.
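
Steps 2 to 4 can also be scripted. The following Python sketch assumes that TREXValidMimeTypes.ini holds one MIME type per line; the source does not describe the file layout, so check the actual file before using anything like this. The function names and the .bak backup convention are illustrative and not part of TREX.

```python
from pathlib import Path
import shutil


def remove_mime_types(ini_text: str, excluded: set[str]) -> str:
    """Return ini_text without the lines that consist of an excluded MIME type.

    Assumption: one MIME type per line. All other lines (blank lines,
    comments, section headers) are passed through unchanged.
    """
    unwanted = {m.strip().lower() for m in excluded}
    kept = [
        line
        for line in ini_text.splitlines(keepends=True)
        if line.strip().lower() not in unwanted
    ]
    return "".join(kept)


def exclude_from_file(ini_path: Path, excluded: set[str]) -> None:
    """Back up the configuration file, then rewrite it without the entries."""
    backup = ini_path.with_name(ini_path.name + ".bak")
    shutil.copy2(ini_path, backup)  # keep the original so the change can be undone
    original = ini_path.read_text()
    ini_path.write_text(remove_mime_types(original, excluded))
```

To exclude PDF files, you would call exclude_from_file with the path to TREXValidMimeTypes.ini in your TREX installation directory and the set {"application/pdf"}. As in the manual procedure, stop TREX before the change and start it again afterwards.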

List of MIME Types in the Configuration File TREXValidMimeTypes.ini

| MIME Type | File Extension | Application |
|---|---|---|
| application/andrew-inset | ec | |
| application/dca-rft | rft | IBM Revisable Form Text |
| application/excel | xls | MS EXCEL |
| application/macwriteii | MWII | MacWrite II |
| application/msword | doc, dot | MS Word |
| application/oda | oda | CALS Raster (GP4) |
| application/pdf | pdf | Adobe PDF |
| application/powerpoint | ppt | MS Powerpoint |
| application/rtf | rtf | Rich Text Format |
| application/smil | smil, smi | |
| application/vnd.lotus-1-2-3 | 123, w4, w3, w1 | Lotus 1-2-3 |
| application/vnd.lotus-freelance | prz, pre | Lotus Freelance |
| application/vnd.lotus-wordpro | lwp, sam | Lotus WordPro |
| application/vnd.ms-excel | xls, xlb | MS EXCEL |
| application/vnd.ms-powerpoint | ppt, pps, pot | MS PowerPoint |
| application/vnd.ms-wpl | wpl | DEC WPS Plus (WPL) |
| application/wordperfect5.1 | wp5 | Word Perfect 5.1 |
| application/x-123 | w1, wk3, wk4, wks | Lotus 1-2-3 (DOS & Windows) |
| application/x-cdlink | vcd | |
| application/x-chess-pgn | pgn | |
| application/x-compress | | UNIX compress |
| application/x-csh | csh | UNIX CShell Script |
| application/x-dvi | dvi | |
| application/x-freelance | pre | Freelance for Windows |
| application/x-gtar | gtar | GNU UNIX tar archive |
| application/x-gzip | gz, tgz | GNU Zip compressed data |
| application/x-httpd-php | | |
| application/x-javascript | js | JavaScript |
| application/x-latex | latex | LaTex |
| application/x-maker | frm, maker, frame, rm, fb, book, fbdoc | Adobe FrameMaker |
| application/x-mif | mif | Adobe FrameMaker (MIF) |
| application/x-msdos-program | dll | Dynamic Link Library |
| application/x-msexcel | xls, xlb | MS EXCEL |
| application/x-msmetafile | wmf | MS Metafile |
| application/x-netcdf | nc, cdf | |
| application/x-ns-proxy-autoconfig | pac | Netscape Proxy Auto Config |
| application/x-perl | pl, pm | Perl Program |
| application/x-sh | sh | UNIX Bourne Shell Script |
| application/x-tar | tar | UNIX tar Archive |
| application/x-tcl | tcl | TCL Script |
| application/x-tex | tex | |
| application/x-texinfo | texinfo, texi | |
| application/x-troff | t, tr, troff | UNIX troff document |
| application/x-troff-man | man | UNIX man page |
| application/x-troff-me | me | UNIX troff document |
| application/x-troff-ms | ms | UNIX troff document |
| application/x-ustar | ustar | |
| application/x-wais-source | src | |
| application/xlc | xlc | |
| application/zip | zip | |

Note

File formats of the MIME types text/*, including HTML, XML, and plain text formats such as *.txt and *.rtf, are processed by TREX without being filtered.

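| MIME Type | File Extension | Application |
|---|---|---|
| text/asp | asp | Active Server Pages |
| text/css | css | Cascading Style Sheets |
| text/html | html, htm, shtml | Hypertext Markup Language |
| text/plain | txt, c, ec, cpp, h, hpp, eml, sap | |
| text/richtext | rtx | |
| text/rtf | rtf | |
| text/src-c | c | |
| text/src-c++ | cpp | |
| text/src-java | java | |
| text/src-perl | perl | |
| text/src-tcl | tcl | |
| text/tab-separated-values | tsv | |
| text/thtml | | |
| text/vnd.wap.wml | wml | |
| text/wiki | | |
| text/wml | wml | |
| text/x-asm | | |
| text/x-setext | | |
| text/x-sgml | | |
| text/x-ssi-html | | |
| text/x-uil | | |
| text/x-uuencode | | |
| text/x-vCalendar | | |
| text/x-vCard | | |
| text/xml | xml | Extensible Markup Language |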
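
Several file extensions appear under more than one MIME type in the list above (for example, xls under three MIME types and rtf under two). The following Python sketch looks up all MIME types for an extension, using a small excerpt of the list. MIME_TABLE, mime_types_for_extension, and is_filtered are illustrative names. Whether every matching entry has to be removed to exclude a format is an assumption; it depends on which MIME type is reported for your documents.

```python
# Excerpt of the list above: MIME type -> file extensions.
MIME_TABLE = {
    "application/excel": ["xls"],
    "application/vnd.ms-excel": ["xls", "xlb"],
    "application/x-msexcel": ["xls", "xlb"],
    "application/pdf": ["pdf"],
    "application/rtf": ["rtf"],
    "text/rtf": ["rtf"],
    "text/html": ["html", "htm", "shtml"],
}


def mime_types_for_extension(extension: str) -> list[str]:
    """Return all listed MIME types for a file extension, sorted by name."""
    ext = extension.lower().lstrip(".")
    return sorted(m for m, exts in MIME_TABLE.items() if ext in exts)


def is_filtered(mime_type: str) -> bool:
    """Per the note above, text/* formats are processed without being filtered."""
    return not mime_type.lower().startswith("text/")


for mime in mime_types_for_extension(".xls"):
    print(mime, "(filtered)" if is_filtered(mime) else "(processed directly)")
```

A lookup of this kind shows which entries in TREXValidMimeTypes.ini are candidates for removal before you edit the file.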