Handling Duplicate Data Records

Use

DataSources for texts or attributes can transfer several data records with the same key to BW within one request. Whether a DataSource does so is a property of the DataSource. Transferring multiple data records with the same key (duplicate data records) within one request can be intended by the application and is then not an error. BW provides functions for resolving this ambiguity when handling duplicate data records.

Features

In a dataflow that is modeled using a transformation, you can work with duplicate data records for time-dependent and time-independent attributes and texts.

If you are updating attributes or texts from a DataSource to an InfoObject using a data transfer process (DTP), you can go to the Update tab page in the DTP maintenance and set the Handle Duplicate Record Keys indicator to specify how data records with the same record key are handled.

This indicator is not set by default.

If this indicator is not set, data records that have the same key are written to the error stack of the DTP.

If you set the indicator, data records with the same key are handled as follows:

  • Time-independent data:

    If data records have the same key, the last data record in the data package is interpreted as being valid and is updated to the target.

  • Time-dependent data:

    If data records have the same key, the system calculates new time intervals for the data record values based on the overlapping time intervals and the sequence of the data records. The prime criterion is the interval of the last data record; the intervals of the previous data records are corrected accordingly (see the sketch following this list).

    Example

    Data record 1 is valid from 01.01.2006 to 31.12.2006

    Data record 2 with the same key is valid from 01.07.2006 to 31.12.2007

    When handling duplicate data records, the system corrects the time interval for data record 1 to 01.01.2006 to 30.06.2006, because from 01.07.2006 onwards, the next data record in the data package (data record 2) is valid.

    Caution

    If you set the indicator for time-dependent data, note the following:

    The semantic key in the DTP (semantic grouping) specifies the structure of the data packages that are read from the DataSource. The field of the DataSource that contains the DATETO information must therefore not be part of the semantic key of the DTP. Otherwise, data records with the same key are sorted incorrectly and the validity periods are calculated incorrectly.
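
To make this behavior concrete, the following Python sketch mimics the logic described above under simplified assumptions. It is not SAP's implementation: the Record fields key, date_from, date_to, and value are illustrative stand-ins for the record key, the DATEFROM/DATETO fields, and the attribute or text value. For time-independent data, the last record per key in package order wins; for time-dependent data, the intervals of earlier records are trimmed wherever a record that comes later in the data package overlaps them.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Record:
    key: str          # record key (illustrative)
    date_from: date   # DATEFROM
    date_to: date     # DATETO
    value: str        # attribute or text value

def handle_time_independent(package):
    """Last record per key in package order wins."""
    latest = {}
    for rec in package:
        latest[rec.key] = rec          # later duplicates overwrite earlier ones
    return list(latest.values())

def handle_time_dependent(package):
    """Trim earlier intervals where a later record with the same key overlaps.

    The interval of the record that comes later in the data package is taken
    as authoritative; the intervals of earlier records are corrected.
    """
    result = []
    for i, rec in enumerate(package):
        valid_from, valid_to = rec.date_from, rec.date_to
        for later in package[i + 1:]:
            if later.key != rec.key:
                continue
            if later.date_from <= valid_from <= later.date_to:
                # The later record covers the start: move the start forward.
                valid_from = later.date_to + timedelta(days=1)
            if later.date_from <= valid_to <= later.date_to:
                # The later record covers the end: move the end back.
                valid_to = later.date_from - timedelta(days=1)
        if valid_from <= valid_to:     # drop records whose interval vanishes
            result.append(Record(rec.key, valid_from, valid_to, rec.value))
    return result

# The example from above: record 1 is cut back to 30.06.2006.
rec1 = Record("4711", date(2006, 1, 1), date(2006, 12, 31), "old value")
rec2 = Record("4711", date(2006, 7, 1), date(2007, 12, 31), "new value")
print(handle_time_dependent([rec1, rec2]))
```

Running the sketch reproduces the example above: record 1 is corrected to 01.01.2006 to 30.06.2006, and record 2 remains valid from 01.07.2006 to 31.12.2007.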

Example

You have two data records with the same key within one data package.

In the first case, assume that DATETO is not an element of the semantic key.

In the data package, the data records are sorted in the sequence data record 1, data record 2. In this case, the time interval for data record 1 is corrected:

Data record 1 is valid from 01.01.2002 to 31.12.2006.

Data record 2 is valid from 01.01.2000 to 31.12.2001.

In the second case, assume that DATETO is an element of the semantic key.

If DATETO is an element of the key, the records are sorted by DATETO, so the data record with the earliest date is placed before the data record with the most recent date. In the data package, the data records are therefore sorted in the sequence data record 2, data record 1, and the time interval for data record 2 is corrected:

Data record 2 is valid from 01.01.2000 to 31.12.2000.

Data record 1 is valid from 01.01.2001 to 31.12.2006.
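
Continuing the sketch above (again with an illustrative key and values), the effect of the semantic key on the package order, and thus on the corrected intervals, can be reproduced by passing the records in the two orders discussed here:

```python
# Records from the example; key and values are illustrative.
rec1 = Record("4711", date(2001, 1, 1), date(2006, 12, 31), "record 1")
rec2 = Record("4711", date(2000, 1, 1), date(2001, 12, 31), "record 2")

# DATETO not in the semantic key: the package keeps the delivery order
# record 1, record 2 -> record 1 is corrected to 01.01.2002-31.12.2006.
print(handle_time_dependent([rec1, rec2]))

# DATETO in the semantic key: the package is sorted by DATETO, that is,
# record 2, record 1 -> record 2 is corrected to 01.01.2000-31.12.2000.
print(handle_time_dependent(sorted([rec1, rec2], key=lambda r: r.date_to)))
```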

Note

You can specify how data records with the same key within a request are handled regardless of whether the DataSource setting that permits potentially duplicate data records has been made. This is useful if the setting has not been made for the DataSource, but you know from other sources that data records with the same key are being transferred (for example, when flat files are loaded).