CM Repository Manager

Use

A CM repository is used as the main repository for storing documents and folders that are managed by CM.

 

Prerequisites
  • You have installed a database for storing content and/or metadata.

    For an overview of supported databases, see the SAP Service Marketplace at service.sap.com/platforms → Product Availability Matrix.

  • You have defined the memory caches to be used by the repository managers (see below for more information).

 

Features

The CM repository manager supports all CM functions.

 

Persistence Modes

CM repository managers can be set up in various modes as follows:

  • DB Mode
  • DBFS Mode
  • FSDB Mode

 

DB Mode

All data (documents, folders, and metadata) is stored in the database.

If there is a large number of write requests in your CM usage scenario, set up the CM repository in database mode. Since all documents are stored in the database, this avoids unintentional external manipulation of the data.

Another advantage of storing all data in the database is that the procedure for data backup and restore is easy since only the database needs to be backed up.

 

DBFS Mode

Metadata and folders are stored in the database, but documents are stored in the file system.

This mode is faster than the database mode if you have large documents since there is no database data streaming. This mode also enables the size of the database to be controlled more easily since documents are stored in the file system.

Caution

Do not make any changes or manipulate files using the file system in DBFS mode.

 

Because documents and metadata are stored in different places, you have to back up both the database and the file system and keep the two synchronized when backing up and restoring data (more information: Backing Up and Restoring CM Repositories in DBFS Mode).

 

FSDB Mode

Folders and documents are stored in the file system, but metadata is stored in the database.

In this mode, the file system is predominant. File systems are not transactional, so this mode has restrictions and affects performance. If read and write operations take place for one document in the file system at the same time, the repository manager has to coordinate them. It does this by recording both write accesses and read accesses in the database, which costs performance.

The database is updated automatically if data is removed from the file system or added to it. The automatic update can be switched off in the configuration of the repository manager using the Disable Automatic FSDB Synchronization parameter.

Caution

Metadata (for example, locks or custom properties) can be partially or completely lost if you edit or manipulate documents or folders directly in the file system. For example, if you change content directly in the file system, the last known value of the Modified By system property is lost after synchronization. If you perform namespace operations such as Rename or Move, the resource loses all its metadata after synchronization. Only administrators should work in the file system, and they should only carry out mass operations such as copy and delete.

If the Disable Automatic FSDB Synchronization parameter is active, you can use the synchronization report (see Reports) to avoid metadata loss.

If this parameter is active, data can be corrupted when you read a document if write operations take place for the document at the same time as the read operation.

For a scenario with write accesses, this mode cannot ensure consistent data backup and restore. This is because changes to the folder structure in the file system could be made in the time between the two actions. This would lead to inconsistent data following the restore process. Consistent data backup and restore can therefore be ensured only if read accesses alone are taking place.

Caution

The FSDB persistence mode is designed around a 1:1 relation: one entry in the database (DB) represents exactly one resource in the file system (FS). Use this mode only with shares that preserve this relation (case-sensitive shares or shares that behave in the same way). If the relation is not met, for example with an N:1 relation (DB:FS) on case-insensitive shares, the FSDB persistence mode does not work correctly.
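
The 1:1 requirement above can be illustrated with a small check for names that differ only by case. This is a minimal sketch, not part of the product: the function name is hypothetical, and Python's casefold() only approximates how a real case-insensitive share compares names.

```python
from collections import defaultdict


def find_case_collisions(db_names):
    """Group resource names that differ only by case.

    On a case-insensitive share, each returned group would map to a single
    file, which breaks the 1:1 relation between database entries and files.
    """
    groups = defaultdict(list)
    for name in db_names:
        # casefold() normalizes case for caseless comparison.
        groups[name.casefold()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    print(find_case_collisions(["Report.doc", "report.doc", "notes.txt"]))
```

An empty result means that no two names in the list would collapse into one file on such a share.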

Selection of Suitable Mode

If you mainly have read requests, choose a mode in which content is stored in the file system. If this is the case, make sure that access to the relevant part of the file system is denied or restricted to other applications.

Note

We recommend DBFS mode if you want to manage large amounts of data.

 

Subsequent Change to the Persistence Mode

A subsequent change of the persistence mode is possible only from the DB mode to the DBFS mode. Changing from DB to FSDB, from FSDB to DB/DBFS, or from DBFS to DB/FSDB is not possible.

To change the mode, work through the following steps:

  1. Make sure that enough free disk space is available to migrate the content from the database to the file system.
  2. Change the configuration of the CM repository manager. In the Root Directory parameter, specify the path to the file system that the documents from the database are to be stored under. Change the entry in the Persistence Mode parameter from DB to DBFS.
  3. Restart the application server.
  4. Run the CM Store: Content Crawler report.
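
Step 1 of the procedure above can be approximated with a short free-space check before the migration. This is a minimal sketch under stated assumptions: the function names and the 20% safety margin are hypothetical, and the size of the content to be migrated has to be determined separately (for example, from database statistics).

```python
import shutil


def has_enough_space(free_bytes, content_bytes, safety_factor=1.2):
    """Return True if the free space covers the content plus a safety margin."""
    return free_bytes >= content_bytes * safety_factor


def check_root_directory(root_directory, content_bytes):
    """Compare the free space under root_directory with the expected content size."""
    free_bytes = shutil.disk_usage(root_directory).free
    return has_enough_space(free_bytes, content_bytes)
```

For example, check_root_directory("/usr/myshare/somedir", 50 * 1024**3) would report whether roughly 60 GB are free under that path.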

 

Versioning Modes

You can set up the CM repository in one of the following versioning modes:

  • Standard mode

    When the CM repository is set up in standard mode, the system always creates different copies of the content for the versioned resource and the last version. Depending on the persistence mode of the CM repository, the copies are stored either in the file system or in the database. Thus, the last version and the versioned resource always point to different copies of the content. This is the default mode for all CM repositories.

  • Disc-optimized mode

    When the CM repository is set up in disc-optimized mode, the system does not create content for the last version of the versioned resource. The last version and the versioned resource share one and the same content.

    If you check out the versioned resource, the content shared between the versioned resource and the last version is duplicated. The last version points to the newly copied content. Thus, the versioned resource and the last version point to different copies of the content. Therefore, even though the disc-optimized mode reduces the disc space utilization, you still need to have enough disc space available for a content copy of the last version when the resource is checked out.

    When you discard the changes to the versioned resource, the system sets the resource to point to the content of the last version. Thus, both the versioned resource and the last version share the same content. The old content of the versioned resource is deleted.

    We recommend that you use the standard versioning mode for CM repositories in FSDB persistence mode because of the specifics of this persistence mode. If you switch to disc-optimized versioning mode when operating in FSDB persistence mode, a 0 KB file is created for the last version, which handles the binding between the content of the versioned resource and the last version. In addition, any operations on the files that are not covered by the standard KM APIs (for example, replacing the file directly in the file system without using the standard check-out and check-in mechanisms) lead to a loss of the last version content.
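
The way the two versioning modes share or duplicate content can be pictured with a toy model. This is an illustrative sketch only: the class and its integer content IDs are hypothetical and do not reflect the internal implementation. It models only the behavior described above.

```python
import itertools


class VersionedResource:
    """Toy model of content sharing between a resource and its last version."""

    _ids = itertools.count(1)  # hands out hypothetical content IDs

    def __init__(self, disc_optimized):
        self.disc_optimized = disc_optimized
        self.resource_content = next(self._ids)
        # Standard mode: the last version always has its own copy.
        # Disc-optimized mode: the last version shares the resource's content.
        self.last_version_content = (
            self.resource_content if disc_optimized else next(self._ids)
        )

    def shares_content(self):
        return self.resource_content == self.last_version_content

    def check_out(self):
        # Disc-optimized mode: the shared content is duplicated on check-out
        # and the last version points to the new copy.
        if self.disc_optimized and self.shares_content():
            self.last_version_content = next(self._ids)

    def discard_changes(self):
        # Disc-optimized mode: the resource points to the last version's
        # content again. Standard mode is left unchanged in this sketch.
        if self.disc_optimized:
            self.resource_content = self.last_version_content
```

The model shows why disc-optimized mode still needs space for one extra copy: between check_out() and discard_changes(), two distinct content IDs exist.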

 

Resource Properties That Are Supported by the CM Repository Manager

The CM repository manager is the only repository manager that supports the entire spectrum of standard resource properties, including Description, Read-only, and Hidden. The default value of the Description property is blank, and the default value for Read-only and Hidden is 'false'. The file system repository manager does not support these three properties.

 

Namespace Restrictions for Resource Names

Maximum length of resource names:

  • FSDB: Restrictions of the file system are valid (Windows: 255 characters)
  • DBFS and DB: 448 characters

Maximum length of path (URI without prefix):

  • FSDB: Restrictions of the file system are valid (Windows: 255 characters)
  • DBFS and DB: No restriction

Permitted characters in resource names:

  • FSDB: Restrictions of the file system are valid (Unicode or Windows)
  • DBFS and DB: All Unicode characters, but restrictions may exist for some components of the repository framework.

Caution

Do not put a full stop/period at the end of the resource name.
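
The length limits and the trailing-period rule above can be expressed as a simple pre-check. This is a minimal sketch: the function name is hypothetical, the FSDB limit uses the Windows value quoted above, and the check does not cover file-system-specific character restrictions.

```python
# Maximum name lengths per persistence mode; the FSDB value is the
# Windows limit from the text and may differ on other file systems.
MAX_NAME_LENGTH = {"DB": 448, "DBFS": 448, "FSDB": 255}


def validate_resource_name(name, mode):
    """Return a list of problems found in a resource name (empty if none)."""
    problems = []
    if len(name) > MAX_NAME_LENGTH[mode]:
        problems.append(f"name exceeds {MAX_NAME_LENGTH[mode]} characters")
    if name.endswith("."):
        problems.append("name ends with a period")
    return problems
```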

 

Database Restrictions

Because Knowledge Management has no restriction on size, the maximum size of repositories depends on the store used for them (the maximum technical size of a database or file system).

In the case of a database, the maximum size is the total of all CM repositories located in the database.

 

Note

Note that extremely large amounts of data in repositories generate a higher system load for data backup and restore.

 

The following restrictions apply for the database:

  • Maximum number of resources (folders and documents) in a repository instance:
    FSDB: No restriction (but possibly limited by file system restrictions)
    DBFS and DB: 2,147,483,648
  • Maximum size of an individual resource:
    FSDB: No restriction (but possibly limited by file system restrictions)
    DBFS: 8 exabytes (MS SQL Server 2000), 2 GB (Oracle)
    DB: 2 GB
  • Maximum length of large property values (string type): 2,147,483,647 bytes (2 GB).

 

Persistent Caching in the File System

You can activate persistent caching for CM repositories that are operated in DB or DBFS persistence mode. An automatically generated cache saves the content of a document locally in the file system of the portal server, using the document ID. You can use this cache to help reduce database traffic, database accesses, and the network load.

The File System Content Cache Directory parameter is available for the configuration of a CM repository manager. You can restrict the size of the cache using the Max. Size of File System Cache and Maximum Number of Cache Files parameters. The criterion that is reached first restricts the size of the cache.

If you have configured persistent caching for a CM repository manager, each time that a document is launched the system first looks in the cache to find out whether an entry is already available for the document. If the entry is not available in the cache, the corresponding content is fetched from the database, written to a temporary file, and compressed as GZIP.

If the document for a request is found in the cache, the relevant entry is unpacked and sent to the client. The cache entry is then given a new time stamp so that it is recorded correctly in the cache statistics.
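
The lookup and size-limiting behavior described above can be sketched as follows. This is a simplified in-memory stand-in, not the actual implementation: the class, the load_from_db callback, and the least-recently-used eviction are assumptions, and the size limit is given in bytes here rather than megabytes.

```python
import gzip
import itertools


class ContentCacheSketch:
    """In-memory stand-in for a content cache limited by size and entry count."""

    def __init__(self, max_bytes, max_files, load_from_db):
        self.max_bytes = max_bytes
        self.max_files = max_files
        self.load_from_db = load_from_db  # hypothetical callback: doc_id -> bytes
        self.entries = {}  # doc_id -> gzip-compressed content
        self.stamps = {}   # doc_id -> last-access counter
        self.clock = itertools.count()

    def get(self, doc_id):
        if doc_id not in self.entries:
            # Cache miss: fetch the content and store it compressed.
            self.entries[doc_id] = gzip.compress(self.load_from_db(doc_id))
        # Every access refreshes the entry's time stamp.
        self.stamps[doc_id] = next(self.clock)
        content = gzip.decompress(self.entries[doc_id])
        self._evict(keep=doc_id)
        return content

    def _over_limit(self):
        # The limit that is reached first triggers eviction; the size limit
        # refers to the compressed entries.
        size = sum(len(data) for data in self.entries.values())
        return len(self.entries) > self.max_files or size > self.max_bytes

    def _evict(self, keep):
        # Remove the least recently used entries, never the one just served.
        while self._over_limit() and len(self.entries) > 1:
            oldest = min((d for d in self.entries if d != keep), key=self.stamps.get)
            del self.entries[oldest]
            del self.stamps[oldest]
```

With max_files=2, requesting documents a, b, a, c in that order leaves a and c in the cache, because b has the oldest time stamp when the limit is exceeded.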

Documents that have been changed are given a new ID and replaced by a new entry in the cache.

The ACLs of the original documents are adopted for the entries in the cache.

When you start Knowledge Management or reactivate a CM repository, the content of the persistent cache is read in full, sorted internally, and staged. This process can take a certain amount of time depending on the number of entries.

The persistent cache of a CM repository, which is created automatically, is located in the cache monitor with the name <repository prefix>_fsContentCache. If you empty the cache, all entries are deleted from the file system. In a cluster environment, the cache monitor displays the 'local' load of the cache on the specific portal server. If you empty the cache, only those entries are deleted that are in the file system of this portal server.

 

Parameters of a CM Repository Manager

Parameter Required Description

Name

Yes

Name of the repository manager.

Description

No

Description of the repository manager.

Prefix

Yes

The URI prefix for which the manager is registered.

This specification is entered in the list in the root directory.

The URIs of all resources managed by this repository manager have this prefix in common. This prefix is used to identify the repository manager that is responsible for a resource with a given URI. Note that you must enter the prefix with a forward slash, for example, /documents.

Repository ID in Database

No

Identifier of the repository in the database.

This is required because a database is generally used for storing the data of multiple repositories.

The value must be an alphanumeric string that is unique amongst the IDs of all repositories that use the database. If the ID is not specified, the name of the repository is used as the ID. If you use an ID, the prefix can be changed without having to coordinate it with the database.

Do not use special characters in the ID.

Root Directory

 

No

This parameter is only needed if the Persistence Mode parameter is set to FSDB or DBFS. It denotes the path to the root directory in the file system to which the repository manager is assigned. This can be a directory on the local server (for example, /usr/myshare/somedir) or on a released remote server. In order for you to access a file system of another UNIX host, the directory in question must be mounted on the portal server. The repository manager is responsible for this directory and all its subdirectories.

Multiple repository managers are not allowed to share a subpath in their Root Directory parameter.

Note

For security reasons, restrict authorizations for this path specification in the file system.

Root Directory for Versions

No

The path to the root directory in the file system that is used to store versions.

This parameter is only needed if the Persistence Mode parameter is set to FSDB.

This cannot be the path specified in Root Directory or any of its subdirectories.

If permissions have been assigned for the current version of a document, these are passed on to previous versions of the document.

Windows Landscape System

No

Not required for CM repository managers on UNIX.

Active

No

You can (de)activate the repository manager using the Active parameter.

Auto Check-In/Check-Out

No

If this parameter is activated, users can change versioned documents without having to check them out and then check them back in.

When documents of this type are saved, the system automatically creates a new version.

If this parameter is deactivated, you must always check out versioned documents before you can make changes. After making the changes, you must check in the changed document again.

Hide in Root Folder

No

Specifies whether or not the repository is listed in the root directory.

If you activate this parameter, the repository is not listed in the root directory.

Internal Links Default To Dynamic

No

If this parameter is activated, newly created internal links always reference the item in question, even if the item is moved within the repository where it is located. If the object is deleted, the corresponding link is also deleted.

Preserve Version Histories

No

Specifies whether versions are deleted or retained when the document to which they belong is deleted.

Activated: Versions are retained.

Deactivated: Versions are deleted.

The versions are located in the /.~system~ directory of the repository in question. This directory can only be called up using the admin explorer.

If you activate this parameter, you should restrict the permissions for the /.~system~ directory of the repository in question. Only allow system administrators to access this directory.

You cannot use this parameter if the CM repository is run in FSDB persistence mode and the W2K security manager has been implemented.

Send Events

No

Specifies whether or not the repository sends events when operations such as delete and update content are performed.

The repository sends events if this parameter is activated. This is necessary to use services such as the subscription service.

Persistence Mode

Yes

Selection of the persistence mode for the CM repository manager.

Defines where the namespace, content, and metadata are stored.

If you use database mode (DB), do not specify the Root Directory and Root Directory for Versions parameters, since they apply only to repositories that use the file system. Refer to the table below.

The persistence mode of an existing repository may never be changed.

In the case of customer-specific repository managers, you can subsequently change from DB mode to DBFS mode.

Property Search Manager

No

Selection of the manager for the property search.

Choose CM Property Search Manager.

Compress content greater than

No

Content greater than the specified value is stored in compressed form.

Repository services

No

Specifies the repository services that you want to use with the repository.

ACL Manager Cache

No

Specifies a memory cache for resource ACLs: ca_rsrc_acl

This parameter is required if an ACL security manager is specified in the Security Manager parameter. The cache is already created in the standard delivery (more information: Caches).

Memory Cache

No

Specifies a memory cache to be used for the CM repository manager.

The cache stores the names of resources, as well as properties and locks. No content is stored in the cache.

Memory Cache for small Content (<32KB)

No

Specifies a memory cache to be used for content smaller than 32 KB.

Security Manager

No

Selection of the security manager that controls access to repository content.

If you want CM to perform an authorization check when resources are accessed, you need to specify a security manager.

Generally, the AclSecurityManager is to be used for CM repositories. Only the CM /collaboration repository is to use the CollaborationSecurityManager.

If you are running a CM repository in FSDB persistence mode on Windows, you can use the W2K security manager.

File System Content Cache Directory

No

Specifies a folder in the local file system of the portal server in which the cache entries are stored persistently.

Example: /tmp/cmcontentcache

For more information, see the Persistent Caching in the File System section above.

In a clustered environment, make sure that a folder with this name is created on each host in the cluster. The folder name and the path must be identical on all hosts.

You should not use the same folder for different CM repositories. You should also not specify a folder belonging to a remote server - this can lead to loss of performance.

Note

For security reasons, restrict authorizations for this folder in the file system.

Max. Size of File System Cache

No

Maximum size, in megabytes, of entries in the persistent cache in the file system.

The size restriction refers to the compressed cache entries.

-1 signifies no restriction. However, we do not recommend using this.

Maximum Number of Cache Files

No

Maximum number of entries that the persistent cache in the file system can contain.

-1 signifies no restriction. However, we do not recommend using this.

After the last cache access, the entries are sorted. Old entries that are seldom called are removed from the cache and replaced with new entries.

Disable Automatic FSDB Synchronization

No

Controls the synchronization of CM repositories that are operated in the FSDB persistence mode.

If it is activated, automatic file system synchronization is not performed for the repository in question.

Changes that are made directly in the file system (changes to properties or the pasting, renaming, moving, or deleting of folders or documents) are not automatically made in the database as well. These changes are not visible in the portal - accessing objects that no longer exist can lead to errors.

In order to make these changes visible in the portal, you can use a report to start file system synchronization manually as required (more information: CM Repository FSDB Synchronization).

If this parameter is active, data can be corrupted when you read a document if write operations take place for the document at the same time as the read operation.

Enable Disk Optimized Mode

No

Controls the switching between one of the two versioning modes: standard mode or disc-optimized mode. In order to ensure backward compatibility, the default value of the Enable Disk Optimized Mode parameter is set to standard versioning mode.

You can use the Versioned Content Converter Report to convert versioned resources from standard to disc-optimized mode or vice versa. More information: Versioned Content Converter Report

Enable FSDB Content Tracking

No

Controls the tracking of content for repositories operated in FSDB persistence mode.

If this parameter is active, the CM repository manager uses the database to coordinate the read and write access to content in the file system by recording both write accesses and read accesses in the database. This ensures the consistency of the returned content when, for example, several threads try to read while other threads modify the same data.

Note

The database synchronization of content access might have a negative impact on performance. Every read or write content request to an FSDB resource waits to obtain a write lock on the lock record in the database. Therefore, the accumulated waiting time for obtaining the write lock in the database might increase and the waiting threads might consume a considerable amount of the available threads in the thread pool.

If the parameter is deactivated, no tracking of content is performed and content access is not synchronized. This might result in better performance due to a shorter waiting time for obtaining database write locks, as well as fewer database accesses and database locks.

Caution

Deactivating the Enable FSDB Content Tracking parameter might result in inconsistencies in the returned content. As the content access is not synchronized, several client threads might interfere with each other and the returned content might be corrupted.

 

Read-only Content Expiry Delay

No

Specifies how long clients are to store the content of write-protected resources.

During this time, clients do not reload write-protected resources from the server.

This specification is in seconds.

You can only use this function if the persistence mode is set as DB or DBFS.

The cache options and the settings for the temporary internet files of the browser used need to be set to automatic.
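
The role of the Prefix parameter in the table above, identifying the repository manager responsible for a given URI, can be illustrated with a small lookup. This is a sketch under assumptions: the function is hypothetical, and preferring the longest matching prefix is an illustrative choice, not documented behavior.

```python
def find_repository_prefix(uri, prefixes):
    """Return the registered prefix responsible for a URI, or None.

    A prefix matches only on a path-segment boundary, so "/docs/x" is not
    claimed by a manager registered for "/doc". If several prefixes match,
    the longest one is returned (an assumption made for this sketch).
    """
    matches = [
        p for p in prefixes
        if uri == p or uri.startswith(p.rstrip("/") + "/")
    ]
    return max(matches, key=len) if matches else None
```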

 

You enter particular parameters depending on the persistence mode chosen:

Persistence Modes and Root Parameters

Persistence Mode    Required Parameters                            Unnecessary Parameters
DB                  (none)                                         Root Directory, Root Directory for Versions
DBFS                Root Directory                                 Root Directory for Versions
FSDB                Root Directory, Root Directory for Versions    (none)

Root Directory for Versions and Root Directory cannot be subfolders of each other and cannot be identical.
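
The rules in the table above, together with the restriction on nested root directories, can be checked before a configuration is saved. This is a minimal sketch: the function names and the dictionary-based configuration are hypothetical, and paths are compared as POSIX paths without resolving symbolic links.

```python
from pathlib import PurePosixPath

# Required root parameters per persistence mode, taken from the table above.
REQUIRED = {
    "DB": set(),
    "DBFS": {"Root Directory"},
    "FSDB": {"Root Directory", "Root Directory for Versions"},
}


def is_same_or_nested(a, b):
    """Return True if the paths are identical or one lies inside the other."""
    a, b = PurePosixPath(a), PurePosixPath(b)
    return a == b or a in b.parents or b in a.parents


def check_root_parameters(mode, config):
    """Return a list of problems with the root parameters for a mode."""
    problems = [
        f"missing: {name}" for name in sorted(REQUIRED[mode]) if not config.get(name)
    ]
    root = config.get("Root Directory")
    versions = config.get("Root Directory for Versions")
    if root and versions and is_same_or_nested(root, versions):
        problems.append("root directories must not be identical or nested")
    return problems
```

For example, an FSDB configuration with the hypothetical paths /data/cm and /data/cm/versions would be rejected, while /data/cm and /data/cm_versions would pass.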

 

 

Activities

Several CM repositories are preconfigured in the KM standard configuration (more information: Internal Repositories). They are used as data and system repositories.

To create and configure a new CM repository manager for your own data, or to change the configuration of an existing one, choose System Administration → System Configuration → Knowledge Management → Content Management → Repository Managers → CM Repository in the portal.

 

Configuration of the Necessary Memory Caches

You can specify three caches in the CM repository definition:

  • ACL Manager Cache (ca_rsrc_acl)
  • Memory Cache (ca_cm)
  • Memory Cache for small Content (ca_cm_content)

The caches have already been preconfigured. You can adapt the configuration to your own requirements later on.

For more information about the configuration of caches, see Components and Their Caches.

 

Example

CM Repository Manager Configuration

 

Name                    = documents
Description             = Standard repository for content
Prefix                  = /documents
Repository-ID in DB     = documents
Send Events             = true
Persistence Mode        = DB
Property Search Manager = CMPropertySearchManager
Repository Services     = properties, feedback, comment, rating,
                          accessstatistic, personalnote, discussion,
                          tbp, statemngt, subscription, svc_acl
ACL Manager Cache       = ca_rsrc_acl
Memory Cache            = ca_cm
Memory Cache for small Content = ca_cm_content
Security Manager        = AclSecurityManager
File System Content Cache Directory = /tmp/cmdocumentscontentcache
Max. Size of File System Cache      = 100
Maximum Number of Cache Files       = 10000

 

The example above shows parameter settings for the documents repository manager, which is included in the KM standard configuration. The manager stores both content and metadata in the database. Persistent caching is set up.

 

More Information

CM Repository File System Check 

CM Repository Database Check 

Non-Configured CM Repositories 

CM Repository FSDB Synchronization