Activating Python Extensions

Use

Some TREX functions are implemented as Python extensions. If the application that uses TREX relies on these functions, you must activate the Python extensions. The installation documentation for the application in question states whether you have to activate any Python extensions.

The following Python extensions are available:

| Extension | Description |
| --- | --- |
| XML attribute extraction | Extracts the attributes to be indexed from XML files. This extension is required if the texts to be indexed consist only of attributes and the attributes are transmitted to TREX as XML files. |
| Expansion of linguistic search queries | Enhances linguistic search queries so that TREX can carry out an exact search as well as a linguistic search. |
| Metadata extraction | Extracts metadata from HTML documents. |
| Topic maps | Uses topic maps to determine terms that have a semantic relationship to the search term. The semantic relationships involved depend on the structure of the topic map. In most cases the topic map stores synonyms, hypernyms, and hyponyms (superordinate and subordinate terms). |
| Semantic search | Uses topic maps to enhance search queries with additional search terms. This extension allows you to include lists of synonyms in the search, for example. |

The following procedure explains how you activate the Python extensions globally for all indexes.

Note

If you need to activate Python extensions locally for your application, the relevant information can be found in SAP Note 700771.

The global activation consists of the following two steps:

  1. Activate the Python extension handler.
  2. Register the required Python extensions.

Activating the Python extension handler

  1. Edit the configuration file <TREX_DIR>/TREXExtensions.ini.
  2. Check that the [activate] section has the structure below, and modify the section if necessary.

    [activate]
    imsapi=search, thesaurus, admin
    preprocessor

  3. In the [extensionhandlers] section, add the line trexxpy. If the line is already present but commented out, remove the comment sign (#) instead.

    [extensionhandlers]
    trexxpy
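You can check the result of these steps with a short script. The following is a minimal sketch, assuming that TREXExtensions.ini follows plain INI syntax that Python's standard configparser module can read when entries without a value are allowed. The function name handler_is_active and the sample text are illustrative and are not part of TREX.

```python
import configparser

# Sample content mirroring the two sections described above (illustrative only).
SAMPLE_INI = """\
[activate]
imsapi=search, thesaurus, admin
preprocessor

[extensionhandlers]
trexxpy
"""


def handler_is_active(ini_text: str) -> bool:
    """Return True if [extensionhandlers] has an uncommented trexxpy line.

    Assumption: the file uses plain INI syntax with '#' as the comment sign.
    """
    parser = configparser.ConfigParser(allow_no_value=True,
                                       comment_prefixes=("#",))
    parser.read_string(ini_text)
    # A missing section or a commented-out line both yield False.
    return parser.has_option("extensionhandlers", "trexxpy")


if __name__ == "__main__":
    print(handler_is_active(SAMPLE_INI))
```

To check a real installation, read the contents of <TREX_DIR>/TREXExtensions.ini into a string and pass it to the function. If the real file contains constructs that configparser rejects, this sketch does not apply as written.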

Registering the Python extensions

The directory <TREX_DIR>\extensions\example contains the file _extensions.py. This file serves as a template for the configuration file extensions.py.

  1. Copy the file _extensions.py to the TREX installation directory <TREX_DIR> and rename it to extensions.py.
  2. Edit the configuration file extensions.py.
  3. In the section for each extension that you want to register, change the entry if 0: to if 1:. You can identify the section for an extension by its class name.

    | Extension | Class |
    | --- | --- |
    | XML attribute extraction | XmlExtractor |
    | Expansion of linguistic search queries | LinguistFix |
    | Metadata extraction | AttributeExtractor |
    | Topic maps | XtmExpander |
    | Semantic search | SemanticSearch |
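Step 1 of this procedure can also be scripted. The following is a minimal sketch, assuming the directory layout described above. The function name install_template is hypothetical, and the refusal to overwrite an existing extensions.py is a design choice of this sketch, not a TREX requirement.

```python
import shutil
from pathlib import Path


def install_template(trex_dir: str) -> Path:
    """Copy extensions/example/_extensions.py to <trex_dir>/extensions.py.

    Raises FileExistsError if extensions.py already exists, so that
    earlier manual edits are not lost.
    """
    root = Path(trex_dir)
    template = root / "extensions" / "example" / "_extensions.py"
    target = root / "extensions.py"
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copyfile(template, target)
    return target
```

Call the function with the path of your TREX installation directory, then continue with steps 2 and 3 in the copied file.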

Tip

Register XML attribute extraction:

    # XML attribute extractor extension
    # --------------------
    if 1:
        sys.path.append(os.path.join(os.getenv('SAP_RETRIEVAL_PATH'),
                                     'extensions', 'attribute-extractor'))
        from xmlextractor import XmlExtractor
        trexx.registerExtension(trexx.EXTCLASS_INDEXING,
                                XmlExtractor(debug=0, mimetypes=['text/xml']))
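Before you restart TREX, it can help to list which extension blocks are switched on. The following is a rough sketch, assuming that every block in extensions.py is guarded by a top-level if 0: or if 1: line and mentions its class name inside the block, as in the example above. The layout of the other blocks is not shown in this document, so the second block in the sample is a placeholder, and the function name enabled_extensions is hypothetical.

```python
import re

# Class names taken from the table of extensions and classes.
KNOWN_CLASSES = ("XmlExtractor", "LinguistFix", "AttributeExtractor",
                 "XtmExpander", "SemanticSearch")

SAMPLE = """\
# XML attribute extractor extension
# --------------------
if 1:
    sys.path.append(os.path.join(os.getenv('SAP_RETRIEVAL_PATH'),
                                 'extensions', 'attribute-extractor'))
    from xmlextractor import XmlExtractor
    trexx.registerExtension(trexx.EXTCLASS_INDEXING,
                            XmlExtractor(debug=0, mimetypes=['text/xml']))

# Placeholder for a disabled block (body omitted, not real TREX code)
if 0:
    pass  # LinguistFix registration would go here
"""


def enabled_extensions(source: str) -> dict:
    """Map each known class name found in source to its switch state."""
    switch_on = None  # state of the most recent top-level if 0:/if 1: line
    found = {}
    for line in source.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip full-line comments such as block titles
        match = re.match(r"if\s+([01])\s*:", line)
        if match:
            switch_on = match.group(1) == "1"
            continue
        if switch_on is None:
            continue  # text before the first switch belongs to no block
        for name in KNOWN_CLASSES:
            if name in line and name not in found:
                found[name] = switch_on
    return found


if __name__ == "__main__":
    print(enabled_extensions(SAMPLE))
```

The script only reads text and never imports or runs extensions.py, so it can be used on a machine without TREX.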

Result

The changes take effect when you next start the TREX daemon.

If you want to use the semantic search or topic maps, you must carry out further configuration steps. If necessary, contact SAP Support.

If errors occur during routine operation and the required functions are not available, check the trace file (<TREX_DIR>/trace/PythonExtension.log). The trace file contains information on incorrect entries in the TREX configuration files. If you cannot solve the problem, contact SAP Support.
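The most recent entries in the trace file are usually the relevant ones after a restart. The following is a minimal sketch for reading them. The format and encoding of the trace file are not described here, so the sketch treats the file as plain text, assumes UTF-8, and replaces undecodable bytes. The function name last_trace_lines is hypothetical.

```python
from collections import deque
from pathlib import Path


def last_trace_lines(trace_file: str, count: int = 20) -> list:
    """Return the last `count` lines of the trace file without newlines."""
    with Path(trace_file).open(encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

Pass the path <TREX_DIR>/trace/PythonExtension.log with <TREX_DIR> replaced by your installation directory.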