Physical Distribution of Data in CO-PA

If you have large data volumes, the system may need a lot of time to access the segment level (for example, in order to fill summarization levels with data). Since the standard installation of the SAP System is not optimized for handling large amounts of data in CO-PA, there is a lot of potential for fine-tuning if you distribute data physically among the available hard drives.

The following information is always applicable to Releases 2.1 through 3.0B. Beginning with Release 3.0C, you can use summarization levels so that it is not always necessary to access the segment table and the segment level in the online functions.

However, it is still necessary to access the segment table and segment level when you fill the summarization levels with data, or if you are working with report-specific summarization data as in Release 2.2.

Systems in an I/O-Bound State

In order to avoid increasing the complexity of a standard installation, the tables CE3xxxx and CE4xxxx (where operating concern = xxxx) are always created in the table space PSAPBTABD. The corresponding indexes are found in table space PSAPBTABI.

When reading large data volumes from the segment level, the system accesses the same hard drives repeatedly. As a result, these hard drives regularly operate at the limits of their workload capacity and thus become the limiting factor for the overall performance of the reading procedure. The processors involved usually have the status "idle" or "wait" during this time.

If this situation occurs, the system is said to be in an "I/O-bound" state. By distributing the data among several hard drives, you can increase the overall performance of the system considerably.

A Simple Way to Improve I/O

In standard installations, the system can typically read about 200,000 records per hour from the segment level. This performance is largely independent of the hardware used.

However, if you have four table spaces available (for example, PSAPCE4D, PSAPCE4I, PSAPCE3D and PSAPCE3I) which are stored on four different hard drives, you can distribute the data in CO-PA as follows:

Table CE4xxxx in table space PSAPCE4D

Indexes CE4xxxxn for table CE4xxxx in table space PSAPCE4I

Table CE3xxxx in table space PSAPCE3D

Primary index CE3xxxx0 for table CE3xxxx in table space PSAPCE3I
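
The assignment above can also be written down as a small lookup, for example to produce a checklist for one operating concern. The following Python sketch is only an illustration: the function name `copa_tablespace_plan` and the operating concern `S001` are hypothetical, and the sketch does not call any SAP or database interface.

```python
def copa_tablespace_plan(operating_concern: str) -> dict[str, str]:
    """Map the CO-PA segment objects to the four table spaces named in the text.

    The operating concern is assumed to be the four-character key that
    replaces "xxxx" in the object names.
    """
    oc = operating_concern.upper()
    if len(oc) != 4:
        raise ValueError("operating concern is expected to have four characters")
    return {
        f"CE4{oc}": "PSAPCE4D",                    # table CE4xxxx
        f"CE4{oc}n (indexes)": "PSAPCE4I",         # indexes CE4xxxxn
        f"CE3{oc}": "PSAPCE3D",                    # table CE3xxxx
        f"CE3{oc}0 (primary index)": "PSAPCE3I",   # primary index CE3xxxx0
    }


# "S001" is a hypothetical operating concern used only for this example.
for db_object, tablespace in copa_tablespace_plan("S001").items():
    print(f"{db_object:28} -> {tablespace}")
```

Each of the four table spaces is then placed on its own hard drive, as described above.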

If no data has been posted to Profitability Analysis in your system yet, it makes sense to redefine the parameters for those database objects using the database utility (transaction SE14). Otherwise you will have to back up the existing data before the conversion and then restore it later, a time-consuming process.

This "mini-solution", which can be achieved with relatively little effort, can usually increase the typical speed at which the system reads the segment level to about 500,000 records per hour.
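
To get a feeling for what these rates mean, the following Python sketch converts them into elapsed time. The two rates are the typical values quoted above; the volume of 10 million segment-level records is an assumed example figure, and the function name `read_hours` is illustrative.

```python
def read_hours(records: int, records_per_hour: int) -> float:
    """Return the time in hours needed to read `records` at the given rate."""
    if records_per_hour <= 0:
        raise ValueError("records_per_hour must be positive")
    return records / records_per_hour


STANDARD_RATE = 200_000      # typical rate in a standard installation (from the text)
DISTRIBUTED_RATE = 500_000   # typical rate with four separate table spaces (from the text)
records = 10_000_000         # assumed example volume, not a measured value

print(read_hours(records, STANDARD_RATE))     # 50.0
print(read_hours(records, DISTRIBUTED_RATE))  # 20.0
print(DISTRIBUTED_RATE / STANDARD_RATE)       # 2.5
```

Under these assumptions, the simple distribution shortens a full read of the segment level from about 50 hours to about 20 hours.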

More I/O Distribution

Despite the increase in performance achieved using the simple data distribution described above, systems with powerful processors will usually still be in an I/O-bound state. In that case, it makes sense to take additional measures to distribute the I/O workload.

Here we will only look at striping at the hardware level, which is in most cases supported in a transparent fashion by the database system.

Striping Data Files

In striping, files are created at the operating system level and stored (in small pieces) on as many drives as possible. The idea is that the system can perform random reads in this type of file more quickly because the time needed to position the read/write head can be eliminated, or the positioning can be performed on several drives in parallel. Since the file is stored in stripes on the different drives, this is referred to as "striping". The drives involved are called "stripe sets".

The files which are distributed to several drives are subsequently used to create a table space. The database system sees the table space as a consecutive sequence of blocks, which are used for the EXTENTS of a table. The operating system makes sure that random read accesses will most likely be performed on different drives. The "width" of a stripe should correspond to the size of the database blocks (generally 8 KB).
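
The effect of the stripe width can be illustrated with a short Python sketch. It assumes a simple round-robin layout in which consecutive stripes are placed on consecutive drives; real stripe sets may be laid out differently depending on the hardware and operating system. The function name `drive_for_block` and its parameters are illustrative.

```python
def drive_for_block(block_number: int, n_drives: int, blocks_per_stripe: int = 1) -> int:
    """Return the index of the drive holding a database block.

    Assumes round-robin striping: stripe 0 on drive 0, stripe 1 on drive 1,
    and so on, wrapping around after the last drive.
    """
    if block_number < 0 or n_drives < 1 or blocks_per_stripe < 1:
        raise ValueError("invalid striping parameters")
    stripe_number = block_number // blocks_per_stripe
    return stripe_number % n_drives


# Stripe width equal to one database block (for example 8 KB), four drives:
print([drive_for_block(b, 4) for b in range(8)])
# [0, 1, 2, 3, 0, 1, 2, 3]

# Stripe width of two database blocks: neighbouring blocks share a drive.
print([drive_for_block(b, 4, blocks_per_stripe=2) for b in range(8)])
# [0, 0, 1, 1, 2, 2, 3, 3]
```

With a stripe width of one block, neighbouring blocks always lie on different drives, which is why the stripe width should match the database block size.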

Striping for Tables in Profitability Analysis

Because of the way the system accesses data in Profitability Analysis, it makes sense to store the tables CE3xxxx and CE4xxxx (as well as their indexes) in a number of different table spaces in order to steer consecutive read accesses in drilldown reporting to as many physical drives as possible (and thus make it possible to run them in parallel).

To achieve as broad an I/O distribution as possible, you should stripe the four table spaces PSAPCE4D, PSAPCE4I, PSAPCE3D and PSAPCE3I as evenly as possible across all the available drives.

Supporting Measures

In addition to the striping, you should also make sure that the physical blocks within the table spaces are being used optimally (for example, PCTFREE=1 and PCTUSED=99 for ORACLE). In addition, the tables should contain as few EXTENTS as possible. That requires a careful analysis of the expected data volume.
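
A rough volume analysis can be sketched in Python as follows. This is a simplified estimate, not a sizing tool: it ignores block headers and other database overhead, and the record count, the average row length of 200 bytes, and the function name `blocks_needed` are assumptions chosen for illustration.

```python
def blocks_needed(records: int, row_bytes: int,
                  block_bytes: int = 8192, pctfree: int = 1) -> int:
    """Estimate the number of database blocks needed for `records` rows.

    Simplification: the share of a block given by PCTFREE is kept free,
    the rest is filled with whole rows; block header overhead is ignored.
    """
    if records < 0 or row_bytes < 1 or not 0 <= pctfree < 100:
        raise ValueError("invalid sizing parameters")
    usable_bytes = block_bytes * (100 - pctfree) // 100
    rows_per_block = usable_bytes // row_bytes
    if rows_per_block < 1:
        raise ValueError("row does not fit into the usable part of a block")
    # Integer ceiling division: a partly filled last block still counts.
    return (records + rows_per_block - 1) // rows_per_block


records = 1_000_000   # assumed number of segment-level records
row_bytes = 200       # assumed average row length in bytes

print(blocks_needed(records, row_bytes, pctfree=1))   # 25000
print(blocks_needed(records, row_bytes, pctfree=10))  # 27778
```

Multiplying the block count by the block size gives a lower bound for the table size, which can then be used to choose extent sizes large enough that the table needs only a few EXTENTS.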

Security Aspects

When you use stripe sets, you need to be especially careful about backing up your data. Remember that if even one hard drive containing part of a stripe set fails, the entire stripe set usually can no longer be used and needs to be restored from a data backup and the corresponding archive log.

This can be quite time-consuming, especially if you use the "ad-hoc" solution described above (all table spaces striped to all hard drives).

The technical realization of the recommendations given here varies depending on the database and operating systems you are using.