Excluding File Formats from Processing

Use

You can use the list of MIME types in the configuration file TREXValidMimeTypes.ini to control which file formats TREX processes. MIME types for graphic formats such as image/jpeg, image/gif, and image/bmp are not listed in the configuration file, although the filter software integrated into TREX supports these formats. Leaving them out spares TREX the unnecessary load of processing them, since indexing images and graphics rarely makes sense. There may be other scenarios in which it makes sense to exclude certain file formats.
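
The effect of such a list can be pictured as a simple allow-list check. The following minimal sketch is an illustration only and is not TREX code: the function name should_process and the sample set of MIME types are hypothetical, and the comparison ignores case on the assumption that MIME type names are case-insensitive.

```python
# Hypothetical illustration of allow-list filtering; this is not TREX code.
# The sample entries stand in for the contents of TREXValidMimeTypes.ini.
VALID_MIME_TYPES = {"application/pdf", "application/msword", "text/html"}


def should_process(mime_type, valid_types=VALID_MIME_TYPES):
    """Return True if the MIME type appears on the allow-list."""
    return mime_type.strip().lower() in valid_types


print(should_process("application/pdf"))  # True: listed, so it is processed
print(should_process("image/jpeg"))       # False: not listed, so it is skipped
```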

Example

A company archives its financial statements as PDF files. These files contain mostly figures and hardly any relevant text. Processing these large files would reduce the performance of TREX without adding useful content to the index. It therefore makes sense to exclude these files from processing.

Procedure

To exclude the document content of a particular file format from processing by TREX, remove the corresponding MIME types from the configuration file TREXValidMimeTypes.ini.

  1. Stop TREX.

  2. Open the configuration file <TREX_installation_directory>\TREXValidMimeTypes.ini with a text editor.

    The configuration file TREXValidMimeTypes.ini is located in the TREX installation directory. The path to the directory is:

    • On UNIX: /usr/sap/trex_<instance_number>

    • On Windows: <disk_drive>:\usr\sap\trex_<instance_number>

  3. Remove the entry for the file format that you want to exclude from the list.

    Example

    You do not want TREX to process PDF files because such files contain no relevant text information for your scenario. You remove the entry application/pdf from the list of MIME types in the configuration file TREXValidMimeTypes.ini.

  4. Save the file.

  5. Start TREX.
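
Steps 2 to 4 can also be scripted. The following is a minimal sketch under stated assumptions: the layout of TREXValidMimeTypes.ini is not described here, so the sketch assumes that each MIME type stands alone on its own line. The function names remove_mime_entry and exclude_format, and the .bak backup copy, are hypothetical choices made for this illustration. TREX must still be stopped before the change and started afterwards.

```python
import shutil
from pathlib import Path


def remove_mime_entry(text, mime_type):
    """Return the text without lines that consist solely of mime_type.

    Assumption: the configuration file lists one MIME type per line.
    """
    target = mime_type.strip().lower()
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if line.strip().lower() != target
    ]
    return "".join(kept)


def exclude_format(ini_path, mime_type):
    """Back up the configuration file, then rewrite it without the entry."""
    path = Path(ini_path)
    # Keep a copy next to the original so that the change can be undone.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    # The platform default encoding is used; the real encoding is an assumption.
    path.write_text(remove_mime_entry(path.read_text(), mime_type))


# Example call (replace the placeholder with your installation directory):
# exclude_format(r"<TREX_installation_directory>\TREXValidMimeTypes.ini",
#                "application/pdf")
```

Keeping the backup copy makes it easy to restore the entry later if PDF files need to be processed again.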

List of MIME Types in the TREX Configuration File TREXValidMimeTypes.ini

MIME Types                          File Extension                          Application
----------------------------------  --------------------------------------  ---------------------------
application/andrew-inset            ec
application/dca-rft                 rft                                     IBM Revisable Form Text
application/excel                   xls                                     MS EXCEL
application/macwriteii              MWII                                    MacWrite II
application/msword                  doc, dot                                MS Word
application/oda                     oda                                     CALS Raster (GP4)
application/pdf                     pdf                                     Adobe PDF
application/powerpoint              ppt                                     MS Powerpoint
application/rtf                     rtf                                     Rich Text Format
application/smil                    smil, smi
application/vnd.lotus-1-2-3         123, w4, w3, w1                         Lotus 1-2-3
application/vnd.lotus-freelance     prz, pre                                Lotus Freelance
application/vnd.lotus-wordpro       lwp, sam                                Lotus WordPro
application/vnd.ms-excel            xls, xlb                                MS EXCEL
application/vnd.ms-powerpoint       ppt, pps, pot                           MS Powerpoint
application/vnd.ms-wpl              wpl                                     DEC WPS Plus (WPL)
application/wordperfect5.1          wp5                                     Word Perfect 5.1
application/x-123                   w1, wk3, wk4, wks                       Lotus 1-2-3 (DOS & Windows)
application/x-cdlink                vcd
application/x-chess-pgn             pgn
application/x-compress                                                      UNIX compress
application/x-csh                   csh                                     UNIX CShell Script
application/x-dvi                   dvi
application/x-freelance             pre                                     Freelance for Windows
application/x-gtar                  gtar                                    GNU UNIX tar archive
application/x-gzip                  gz, tgz                                 GNU Zip compressed data
application/x-httpd-php
application/x-javascript            js                                      JavaScript
application/x-latex                 latex                                   LaTeX
application/x-maker                 frm, maker, frame, rm, fb, book, fbdoc  Adobe FrameMaker
application/x-mif                   mif                                     Adobe FrameMaker (MIF)
application/x-msdos-program         dll                                     Dynamic Link Library
application/x-msexcel               xls, xlb                                MS EXCEL
application/x-msmetafile            wmf                                     MS Metafile
application/x-netcdf                nc, cdf
application/x-ns-proxy-autoconfig   pac                                     Netscape Proxy Auto Config
application/x-perl                  pl, pm                                  Perl Program
application/x-sh                    sh                                      UNIX Bourne Shell Script
application/x-tar                   tar                                     UNIX tar Archive
application/x-tcl                   tcl                                     TCL Script
application/x-tex                   tex
application/x-texinfo               texinfo, texi
application/x-troff                 t, tr, troff                            UNIX troff document
application/x-troff-man             man                                     UNIX man page
application/x-troff-me              me                                      UNIX troff document
application/x-troff-ms              ms                                      UNIX troff document
application/x-ustar                 ustar
application/x-wais-source           src
application/xlc                     xlc
application/zip                     zip

Note

File formats of the MIME types text/*, including HTML, XML, and plain text formats such as *.txt and *.rtf, are processed by TREX without being filtered.

text/asp                            asp                                     Active Server Pages
text/css                            css                                     Cascading Style Sheets
text/html                           html, htm, shtml                        Hypertext Markup Language
text/plain                          txt, c, ec, cpp, h, hpp, eml, sap
text/richtext                       rtx
text/rtf                            rtf
text/src-c                          c
text/src-c++                        cpp
text/src-java                       java
text/src-perl                       perl
text/src-tcl                        tcl
text/tab-separated-values           tsv
text/thtml
text/vnd.wap.wml                    wml
text/wiki
text/wml                            wml
text/x-asm
text/x-setext
text/x-sgml
text/x-ssi-html
text/x-uil
text/x-uuencode
text/x-vCalendar
text/x-vCard
text/xml                            xml                                     Extensible Markup Language
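
Several file extensions appear under more than one MIME type in the list above. For example, xls is listed for application/excel, application/vnd.ms-excel, and application/x-msexcel. How TREX resolves such overlaps is not stated here, so it may be worth checking every matching entry before you edit the configuration file. The following minimal sketch shows such a lookup. The dictionary MIME_TABLE holds only a small excerpt of the list, and the names MIME_TABLE and mime_types_for are hypothetical.

```python
# Excerpt of the list above: MIME type -> file extensions.
# Only a few rows are reproduced here for illustration.
MIME_TABLE = {
    "application/excel": ["xls"],
    "application/vnd.ms-excel": ["xls", "xlb"],
    "application/x-msexcel": ["xls", "xlb"],
    "application/pdf": ["pdf"],
    "application/powerpoint": ["ppt"],
    "application/vnd.ms-powerpoint": ["ppt", "pps", "pot"],
    "application/rtf": ["rtf"],
    "text/rtf": ["rtf"],
}


def mime_types_for(extension):
    """Return all MIME types in MIME_TABLE that list the given extension."""
    ext = extension.lower().lstrip(".")
    return sorted(mime for mime, exts in MIME_TABLE.items() if ext in exts)


print(mime_types_for("xls"))
# ['application/excel', 'application/vnd.ms-excel', 'application/x-msexcel']
```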