Activating Python Extensions

Context

Some TREX functions are implemented as Python extensions. If the application that uses TREX relies on these functions, you must activate the Python extensions. The installation documentation for the application in question states whether you have to activate any Python extensions.

The following Python extensions are available:

  • XML attribute extraction: Extracts the attributes to be indexed from XML files. This extension is required if the texts to be indexed consist only of attributes and the attributes are transmitted to TREX as XML files.

  • Expansion of linguistic search queries: Enhances linguistic search queries so that TREX can carry out an exact search as well as a linguistic search.

  • Metadata extraction: Extracts metadata from HTML documents.

  • Topic maps: Uses topic maps to determine terms that have a semantic relationship to the search term. The semantic relationships involved depend on the structure of the topic map. In most cases the topic map stores synonyms, hypernyms, and hyponyms (superordinate and subordinate terms).

  • Semantic search: Uses topic maps to enhance search queries with additional search terms. This extension allows you to include lists of synonyms in the search, for example.

The following procedure explains how you activate the Python extensions globally for all indexes.

Note

If you need to activate Python extensions locally for your application, the relevant information can be found in SAP Note 700771.

The global activation consists of the following two steps:

  1. Activate the Python extension handler

  2. Register the required Python extensions

Procedure

  • Activating the Python Extension Handler
    1. Edit the configuration file <TREX_DIR>/TREXExtensions.ini.

    2. Check that the [activate] section has the structure below, and modify the section if necessary.

      [activate]
      imsapi=search, thesaurus, admin
      preprocessor

    3. In the [extensionhandlers] section, add the line trexxpy. If the line already exists but is commented out, remove the comment sign # in front of it.

      [extensionhandlers]
      trexxpy

  • Registering the Python Extensions

    The directory <TREX_DIR>\extensions\example contains the file _extensions.py. This file serves as a template for the configuration file extensions.py.

    1. Copy the file _extensions.py to the TREX installation directory <TREX_DIR> and rename it to extensions.py.

    2. Edit the configuration file extensions.py.

    3. In the section for each extension that you require, change the entry if 0: to if 1:. You can identify the section that belongs to an extension by its class name.

      Extension                                 Class
      XML attribute extraction                  XmlExtractor
      Expansion of linguistic search queries    LinguistFix
      Metadata extraction                       AttributeExtractor
      Topic maps                                XtmExpander
      Semantic search                           SemanticSearch

      Example

      Register XML attribute extraction:

      # XML attribute extractor extension
      # --------------------
      if 1:
          sys.path.append(os.path.join(os.getenv('SAP_RETRIEVAL_PATH'),
                                       'extensions', 'attribute-extractor'))
          from xmlextractor import XmlExtractor
          trexx.registerExtension(trexx.EXTCLASS_INDEXING,
                                  XmlExtractor(debug=0, mimetypes=['text/xml']))
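The change from if 0: to if 1: in step 3 can also be scripted. The following is a minimal sketch and not part of TREX. It assumes that the template consists of top-level if 0: blocks whose indented body mentions the class name, as in the example above. The function name enable_extension is hypothetical.

```python
import re


def enable_extension(source: str, class_name: str) -> str:
    """Return the text of extensions.py with `if 0:` switched to `if 1:`
    for every top-level block whose body mentions class_name.

    Assumption: each extension section is a top-level `if 0:` block.
    """
    lines = source.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if re.match(r"^if\s+0\s*:", line) is None:
            continue
        # Collect the indented (or blank) lines that follow this `if` line.
        body = []
        for follow in lines[i + 1:]:
            if follow.strip() and not follow[0].isspace():
                break  # next top-level line: the block has ended
            body.append(follow)
        if re.search(rf"\b{re.escape(class_name)}\b", "".join(body)):
            lines[i] = re.sub(r"^if\s+0\s*:", "if 1:", line)
    return "".join(lines)
```

To use the sketch, read the copied extensions.py as text, pass it to enable_extension together with a class name from the table above (for example, "XmlExtractor"), and write the result back to the file. Sections for other classes are left unchanged.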
                           

Results

The changes take effect when the TREX daemon is next started.
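Before restarting the daemon, the entries in TREXExtensions.ini described in the procedure can be sanity-checked with a short script. This is a sketch under assumptions: it treats the file as a standard INI file in which lines without a value (such as preprocessor and trexxpy) are bare keys and lines that start with # are comments. TREX itself may parse the file differently. The function name check_handler_config is hypothetical.

```python
import configparser


def check_handler_config(ini_text: str) -> list:
    """Return a list of problems found in the text of TREXExtensions.ini."""
    # allow_no_value=True accepts bare keys such as `trexxpy`.
    parser = configparser.ConfigParser(allow_no_value=True)
    parser.read_string(ini_text)

    problems = []
    if not parser.has_option("extensionhandlers", "trexxpy"):
        problems.append("trexxpy is missing or commented out in [extensionhandlers]")
    if not parser.has_option("activate", "preprocessor"):
        problems.append("preprocessor is missing in [activate]")

    # imsapi is expected to list search, thesaurus, and admin.
    raw = parser.get("activate", "imsapi", fallback="") or ""
    found = {part.strip() for part in raw.split(",")}
    if not {"search", "thesaurus", "admin"} <= found:
        problems.append("imsapi in [activate] does not list search, thesaurus, admin")
    return problems
```

An empty list means that the three entries named in the procedure were found. The check says nothing about whether the extensions themselves load correctly.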

If you want to use the semantic search or topic maps, you must carry out further configuration steps. If necessary, contact SAP Support.

If errors occur during routine operation and the required functions are not available, check the trace file (<TREX_DIR>/trace/PythonExtension.log). It contains information on incorrect entries in the TREX configuration files. If you cannot solve the problem, contact SAP Support.