
Handling Non-Active Data to Optimize the Usage of the Main Memory in SAP HANA

Data that is never or only rarely needed in Data Warehouse processes or for analysis is referred to as non-active data. This data needs to be retained in the system for various reasons (for example, to store historical data, to meet legal retention requirements, or to be able to restore reporting-relevant layers if necessary). The concept of non-active data allows you to make more effective use of the main memory in SAP HANA.

Non-active data in a BW system is handled as follows:

  • If the main memory has sufficient capacity, non-active data resides in the main memory.

  • If main memory bottlenecks occur, the non-active data is displaced from the main memory with a higher priority than other data.

  • If non-active data that resides in the main memory is accessed, only the smallest possible volume of data is loaded into the main memory (the columns of the relevant partitions).

Reporting-relevant and process-relevant data nearly always resides in the main memory, which guarantees improved access to this data. It is also possible to reduce the size of the main memory if a large proportion of the data in the system is non-active.
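
Example

Whether and to what extent this data currently resides in the main memory can be checked in the SAP HANA monitoring views. A minimal sketch, assuming the monitoring view M_CS_TABLES with the columns LOADED and MEMORY_SIZE_IN_TOTAL; the schema name SAPBW1 is a placeholder, and /BIC/B* is used here only as an example pattern for PSA tables:

    -- Show per partition how much of a column table is currently loaded
    -- into the main memory (LOADED is typically 'NO', 'PARTIALLY', or 'FULL').
    SELECT schema_name,
           table_name,
           part_id,
           loaded,
           memory_size_in_total
      FROM m_cs_tables
     WHERE schema_name = 'SAPBW1'          -- placeholder schema
       AND table_name LIKE '/BIC/B%'       -- example: PSA tables
     ORDER BY table_name, part_id;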

Note

Note that the concept of Extended Tables with SAP HANA Dynamic Tiering can optimize the main memory usage in SAP HANA even further than the concept of non-active data. Unlike with the concept of active/non-active data, data persisted in extended tables does not require main memory in SAP HANA. For more information, see SAP HANA Dynamic Tiering for Using Extended Tables.

Displacement Concept for Main Memory Bottlenecks in the SAP HANA Database

If bottlenecks occur in the main memory of the SAP HANA database (or while the main memory is being cleaned up), strict time-based criteria are used to displace data from the main memory into the SAP HANA file system. Data is displaced if a database process that requires main memory causes a threshold value to be exceeded. The columns of table partitions are displaced based on the time of last access (least recently used concept): table columns whose data has not been accessed for the longest time are displaced first.
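
Example

Displacements triggered in this way can be traced retrospectively. A minimal sketch, assuming the monitoring view M_CS_UNLOADS with the columns UNLOAD_TIME and REASON (where 'LOW MEMORY' denotes unloads caused by memory pressure); check the exact column names and values for your SAP HANA revision:

    -- List recent column unloads that were caused by main memory bottlenecks.
    SELECT unload_time,
           schema_name,
           table_name,
           part_id,
           column_name,
           reason
      FROM m_cs_unloads
     WHERE reason = 'LOW MEMORY'
     ORDER BY unload_time DESC;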

Non-Active Data and Displacement

The SAP BW non-active data concept in SAP HANA Support Package Stack 05 and higher ensures that any BW data that does not need to be actively retained in the main memory is displaced first if bottlenecks occur.

Most of the non-active data (classified as "warm" based on multi-temperature classification) is stored in the BW system in the following locations:

  • In the Persistent Staging Areas (PSAs) of DataSources

    In the data acquisition layer, the PSA assumes the backup role for the other layers until the entire staging process is confirmed. The data storage duration in the PSA is medium to long term.

  • In the write-optimized DataStore objects of the data acquisition layer and of the corporate memory

    The write-optimized DataStore object retains the entire data history in the corporate memory.

The data in these objects is retained for security reasons, to enable reporting-relevant layers to be restored or to guarantee access to old data. These objects often contain data that is no longer required. New data is also loaded into these objects every day. Apart from the current request, however, this data is usually not required for processing in BW. If a memory bottleneck occurs, these objects should therefore be displaced (instead of, for example, data of reporting-relevant objects), even if their data was last used less than 24 hours ago. Nor should data that is no longer needed be reloaded into the main memory when new data is loaded.

Partitioning the PSA and Write-Optimized DataStore Objects by Request

PSAs and write-optimized DataStore objects are created in the database and are partitioned by request. There is a restriction here for DataStore objects: duplicate data records must be allowed.

Partitioning by request means that a complete request is written to one partition. If a threshold value is exceeded, a new partition is created and the data of the next request is written to it. The default threshold value is 5,000,000 records for PSAs and 20,000,000 records for write-optimized DataStore objects.
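
Example

The partitioning scheme generated by BW can be pictured roughly as follows. This is only a simplified sketch with hypothetical table and column names (ZPSA_EXAMPLE, REQUEST); the actual PSA and DataStore tables are generated and managed by the BW system and must not be created or changed manually:

    -- Simplified illustration of a request-partitioned table in SAP HANA:
    CREATE COLUMN TABLE "SAPBW1"."ZPSA_EXAMPLE" (
        "REQUEST"   BIGINT,       -- stand-in for the request identifier
        "DATAPAKID" INTEGER,
        "RECORD"    INTEGER,
        "MATERIAL"  NVARCHAR(18),
        "QUANTITY"  DECIMAL(17,3)
    )
    PARTITION BY RANGE ("REQUEST")
        (PARTITION 0 <= VALUES < 1000, PARTITION OTHERS);

    -- When the row threshold is exceeded, a further partition is added
    -- and the data of the next requests is written to it:
    ALTER TABLE "SAPBW1"."ZPSA_EXAMPLE" ADD PARTITION 1000 <= VALUES < 2000;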

The Data Warehouse processes (load data, read data) usually only access data from the newest partitions. These processes always specify the partition ID for operations performed on tables. This ensures that no data from other partitions is accessed.

The fact that PSAs and write-optimized DataStore objects can be partitioned by request means that only the partitions containing non-active data in old requests are displaced from the main memory. The new partitions with requests relevant to the Data Warehouse processes are not displaced. These partitions are used for write and merge operations and therefore remain in the main memory.
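
Example

Because these accesses are restricted to specific requests and partitions, SAP HANA only needs to load the columns of the partitions that are actually hit. A sketch against the hypothetical table from the previous example:

    -- Only the partition containing this request needs to be loaded;
    -- the columns of older, displaced partitions stay in the file system.
    SELECT "MATERIAL", "QUANTITY"
      FROM "SAPBW1"."ZPSA_EXAMPLE"
     WHERE "REQUEST" = 1042;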

Caution
  • With write-optimized DataStore objects connected to 3.x data flows (incoming or outgoing update rules), the system always loads all the data into the main memory, because the partition ID is not used to access the data here.

  • Write-optimized DataStore objects with a semantic key (for which duplicate data records are not allowed) are not partitioned by request. In this case, all the data is loaded into the main memory every time the object is accessed.

Implementing the Concept of Non-Active Data in BW

The Early Unload setting is used to control the displacement of BW object data from the main memory. If a memory bottleneck occurs, the data of these objects is displaced from the main memory with increased priority (UNLOAD PRIORITY 7). This is achieved by multiplying the time elapsed since the object was last accessed by a factor of 27 (default). Objects flagged with Early Unload are therefore displaced sooner than BW objects without this setting, even if the unflagged objects were last accessed longer ago. The weighting applies only to the flagged objects.

The tables of PSAs and write-optimized DataStore objects are automatically assigned the Early Unload setting to ensure that they are displaced with a higher priority.
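
Example

At database level, the Early Unload setting corresponds to the unload priority of the table. A sketch for inspecting this attribute, assuming the SYS.TABLES column UNLOAD_PRIORITY and the hypothetical table from the previous examples; for BW-managed tables, do not change the priority with SQL, but use the Early Unload setting in BW instead:

    -- Check the unload priority (7 = early unload, 5 = default for column tables).
    SELECT schema_name, table_name, unload_priority
      FROM sys.tables
     WHERE schema_name = 'SAPBW1'
       AND table_name  = 'ZPSA_EXAMPLE';

    -- For tables that are not managed by BW, the priority could be set directly:
    ALTER TABLE "SAPBW1"."ZPSA_EXAMPLE" UNLOAD PRIORITY 7;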

The fact that PSAs and write-optimized DataStore objects are partitioned by request has the following effects:

  • The columns in a non-active partition of a PSA table or of a table of a write-optimized DataStore object are displaced from the main memory with a higher priority - to be more exact, with a priority increased by a factor of 27 compared to tables with the default setting.

  • Once a partition has been displaced, it is not reloaded into the main memory because new data is only written to the newest partition.

  • Older data is usually not accessed. If older data is accessed, it is loaded into the main memory.

    This usually occurs with loading processes used to create new targets or when data has to be reloaded. In these cases, it is justifiable to initially load the data into the main memory.

  • If not all the columns of displaced objects are accessed (for example, by transformations that only read specific columns), only these specific columns are loaded into the main memory. All other columns remain in the SAP HANA file system (see the sketch after this list).
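
Example

This column-wise loading can be observed at column level. A minimal sketch, assuming the monitoring view M_CS_COLUMNS with a LOADED column and the hypothetical table from the previous examples:

    -- Show which columns of which partitions are currently loaded;
    -- columns that were not read since the last displacement remain unloaded.
    SELECT part_id,
           column_name,
           loaded,
           memory_size_in_total
      FROM m_cs_columns
     WHERE schema_name = 'SAPBW1'
       AND table_name  = 'ZPSA_EXAMPLE'
     ORDER BY part_id, column_name;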

In the Non-active data monitor in the SAP HANA database (transaction code RSHDBMON), you can monitor the handling of non-active data and change the settings for this data at object level.

Notes and Restrictions

  • Note that displacement from the main memory is based on actual access and is not based on the data classification for access frequency (warm). If "warm" data is frequently needed, this data is therefore not displaced from the main memory (as explained above).

  • This concept cannot be implemented for 3.x data flows, because the partition ID is not used to access the data and all the data of the object is loaded into the main memory.

  • The BW system is optimized in such a way that it accesses PSAs and write-optimized DataStore objects only with the relevant partition ID. This avoids the need to load the entire table into the main memory.

    Caution

    Avoid accessing these tables directly, for example through regular manual access (SQL Editor, transaction SE16) or custom code. Otherwise, the entire table is loaded into the main memory.

  • Other BW objects can also be flagged for Early Unload. However, we advise against doing this. For more information, see SAP Note 1767880.

Effects on Hardware Sizing

The concept of displacing non-active data from the main memory improves main memory resource management. This has a positive effect on hardware sizing when a large quantity of non-active data resides in the PSAs and write-optimized DataStore objects: the data is retained in the SAP HANA file system, and the main memory can be reduced in size accordingly. For more information, see SAP Note 1736976.

Caution

Note the following about making entries in the sizing report, which is explained in SAP Note 1736976:

Enter real and correct data classifications for access frequency. If you classify data as warm even though it is still frequently needed, it will not be displaced from the main memory with a higher priority (as explained above). This can cause problems with main memory management due to a constant lack of free memory.

More Information