Process Documentation: Data Reconciliation

Purpose

An important aspect of ensuring the quality of data in BI is data consistency. As a data warehouse, BI integrates, transforms, and stores data so that it is available for analysis and interpretation. The consistency of the data across the various process steps has to be ensured. Data reconciliation for DataSources allows you to verify the consistency of data that has been loaded into BI and is used productively there. You use the scenarios described below to validate the loaded data. Data reconciliation is based on comparing the data loaded into BI with the application data in the source system; to perform this comparison, you can access the data in the source system directly.

The term productive DataSource is used for DataSources that are used for data transfer in the productive operation of BI. The term data reconciliation DataSource is used for DataSources that are used as a reference for accessing the application data in the source directly and therefore allow you to draw comparisons to the source data.

You can use the process for transaction data. Limitations apply when you use the process for master data because, in this case, you cannot total key figures, for example.

Model

The following figure shows the data model for reconciling application data and loaded data in the data flow with transformation. The data model can also be based on 3.x objects (data flow with transfer rules).

The productive DataSource uses data transfer to deliver the data that is to be validated to BI. The transformation connects the DataSource fields with the InfoObject of a DataStore object that has been created for data reconciliation, by means of a direct assignment. The data reconciliation DataSource allows a VirtualProvider direct access to the application data. In a MultiProvider, the data from the DataStore object is combined with the data that has been read directly. In a query that is defined on the basis of a MultiProvider, the loaded data can be compared with the application data in the source system.
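Outside BI, the combination performed by the MultiProvider can be pictured as a union of the two data sets with an origin indicator, so that a query can place the loaded and the directly read value side by side for each characteristic value. The following is a minimal sketch of that idea; the record layout and key values are illustrative assumptions, not BI Content.

```python
# Sketch of how a MultiProvider combines the loaded data (DataStore object)
# with directly read data (VirtualProvider): a union with an origin indicator,
# so a query can compare both values per characteristic value.
# The record layout and keys are illustrative assumptions.

def combine(loaded, direct):
    """Union both data sets; per key, keep the loaded and the direct total."""
    combined = {}
    for origin, data in (("loaded", loaded), ("direct", direct)):
        for key, value in data.items():
            combined.setdefault(key, {"loaded": 0.0, "direct": 0.0})
            combined[key][origin] += value
    return combined

loaded = {"COMPANY_A": 250.0}                      # from the DataStore object
direct = {"COMPANY_A": 250.0, "COMPANY_B": 80.0}   # read directly in the source
print(combine(loaded, direct))
```

In this picture, a key that appears with different totals under the two origins is exactly what the query (or an exception defined on it) would flag.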

In order to automate data reconciliation, we recommend that you define exceptions in the query that proactively signal that differences exist between the productive data in BI and the reconciliation data in the source. You can use information broadcasting to distribute the results of data reconciliation by e-mail, for example.

Modeling Aspects

Data reconciliation for DataSources allows you to check the integrity of the loaded data by, for example, comparing the totals of a key figure in the DataStore object with the corresponding totals that the VirtualProvider accesses directly in the source system.
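The totals comparison described above amounts to checking, per characteristic value, whether the aggregated key figure from the DataStore object matches the total read directly from the source. A minimal sketch, with record structure and tolerance as illustrative assumptions:

```python
# Illustrative sketch of the totals comparison: flag characteristic values
# whose loaded total deviates from the directly read source total.
# Record structure and tolerance are assumptions for illustration.

def find_deviations(loaded, source, tolerance=0.0):
    """Return {key: (loaded_total, source_total)} for keys whose totals differ."""
    deviations = {}
    for key in set(loaded) | set(source):
        l, s = loaded.get(key, 0.0), source.get(key, 0.0)
        if abs(l - s) > tolerance:
            deviations[key] = (l, s)
    return deviations

loaded_totals = {"1000": 250.0, "2000": 100.0}  # totals from the DataStore object
source_totals = {"1000": 250.0, "2000": 120.0}  # totals read via the VirtualProvider
print(find_deviations(loaded_totals, source_totals))  # → {'2000': (100.0, 120.0)}
```

An empty result corresponds to consistent data; a non-empty result is the situation a query exception would signal.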

In addition, you can identify potential errors in the extractor or in data processing. This function is available if the data reconciliation DataSource uses a different extraction module from the productive DataSource.

We recommend that you keep the volume of data transferred as small as possible because the data reconciliation DataSource accesses the data in the source system directly. This is best performed using a data reconciliation DataSource delivered by BI Content or a generic DataSource using function modules because this allows you to implement an aggregation logic. For mass data, you generally need to aggregate the data or make appropriate selections during extraction.
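The aggregation logic that such a function-module DataSource can implement amounts to summarizing line items before they are transferred, so that one record per key leaves the source system instead of every document. A minimal sketch of that idea; the field names are illustrative assumptions:

```python
# Sketch of aggregation during extraction: instead of transferring every line
# item, the reconciliation DataSource returns one total per key, which keeps
# the transferred data volume small. Field names are illustrative assumptions.

from collections import defaultdict

def extract_aggregated(line_items, key_field, amount_field):
    """Return one summarized record per key instead of all line items."""
    totals = defaultdict(float)
    for item in line_items:
        totals[item[key_field]] += item[amount_field]
    return [{key_field: k, amount_field: v} for k, v in sorted(totals.items())]

items = [
    {"comp_code": "1000", "amount": 100.0},
    {"comp_code": "1000", "amount": 150.0},
    {"comp_code": "2000", "amount": 80.0},
]
print(extract_aggregated(items, "comp_code", "amount"))
# three line items become two aggregated records
```

For mass data, this kind of pre-aggregation (or equivalent selections) is what makes direct access by the reconciliation DataSource feasible.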

The data reconciliation DataSource has to provide selection fields that allow the same set of data to be extracted as the productive DataSource.

Selecting the DataSource for Data Reconciliation

Different DataSources can take on the function of a data reconciliation DataSource. The DataSources that can be used in your data reconciliation scenario are explained below.

BI Content DataSources for Data Reconciliation and Recommendations from BI Content for Data Reconciliation

Use the following process to validate your data:

  If a data reconciliation DataSource is specified for a productive DataSource in the BI Content documentation.

You can see that a DataSource of this type is delivered with BI Content if the Technical Data table in the documentation contains an appropriate entry in the Checkable row.

  If the DataSource documentation contains instructions on building a data reconciliation scenario.

Special DataSources for data reconciliation can be delivered in systems with PI Basis Release 2005.1 or higher, or in 4.6C source systems with PI 2004.1 SP10.

If the BI Content documentation does not include a reference to a delivered data reconciliation scenario, the decision as to which data reconciliation DataSource you use depends on the properties of the data that is to be compared.

Generic DataSource for Database View or InfoSet

Use this process:

  If BI Content does not deliver a data reconciliation DataSource and the documentation for the productive BI Content DataSource does not include instructions on building a data reconciliation scenario.

  If the data that the productive DataSource supplies is made available in a database table or can be extracted using an InfoSet.

  If you can use selections to significantly limit the volume of data that is to be extracted and transferred.

This process is particularly appropriate for calculated key figures.

If you have created a suitable database view or InfoSet, create a corresponding generic DataSource in the source system in transaction SBIW under Generic DataSources → Maintain Generic DataSource.

Generic DataSource for Function Module

Use this process:

  If BI Content does not deliver a data reconciliation DataSource and the documentation for the productive BI Content DataSource does not include instructions on building a data reconciliation scenario.

  If the data is not available in a database table or cannot be extracted using an InfoSet.

  If you can supply equivalent data for data reconciliation, despite the complex extraction logic of the productive DataSource.

You can reproduce a complex extraction logic using a generic DataSource that extracts data using a customer-defined function module. This allows you to stage data that is equivalent to the productive DataSource, without using the same extraction module as the productive DataSource. In addition, you can use aggregation to reduce the volume of data that is to be transferred.

Note that the extraction logic of the data reconciliation DataSource is prone to errors if the extraction logic of the productive DataSource is complex. Errors in the extraction logic of the data reconciliation DataSource lead to errors in the data reconciliation. We recommend that only experienced developers use this scenario.

Productive DataSource with Direct Access

Use this process:

  If none of the processes described above are possible.

  If the productive DataSource allows direct access.

Since the runtime largely depends on the volume of data that has to be read from the database and transferred, using this process requires that you set meaningful selections to keep the volume of transferred data small.

During data reconciliation, the data loaded into BI by means of delta transfer is compared with the data in the source system that the extractor accesses directly. Because the same extractor is used for loading and direct access, this process does not allow you to identify potential systematic errors in the logic of the extractor. Errors in processing the delta requests can be identified.

Prerequisites for Performing Data Reconciliation

You have to be able to use suitable selections (time intervals, for example) or pre-aggregation to restrict the scope of the data that you are going to compare so that it can be accessed directly by the VirtualProvider.

In addition, you have to ensure that the selection conditions for the productive DataSource and the data reconciliation DataSource filter the same data range.
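A simple way to think about this requirement: applying the selection conditions of the productive DataSource and of the reconciliation DataSource to the same data must yield the same subset of records. A sketch, using an interval selection on a posting date as an assumed example:

```python
# Sketch of the requirement that both DataSources filter the same data range:
# given the same selection (here a posting-date interval), the productive and
# the reconciliation extraction must cover exactly the same records.
# Field names and interval values are illustrative assumptions.

def in_selection(record, low, high):
    """Interval selection on the posting date, boundaries included."""
    return low <= record["posting_date"] <= high

records = [
    {"posting_date": "2006-01-05", "amount": 10.0},
    {"posting_date": "2006-02-01", "amount": 20.0},
]
selection = ("2006-01-01", "2006-01-31")

productive_set = [r for r in records if in_selection(r, *selection)]
reconciliation_set = [r for r in records if in_selection(r, *selection)]
assert productive_set == reconciliation_set  # same selection, same data range
```

If the two DataSources interpreted the interval differently (for example, one excluding the boundaries), the comparison would report spurious differences that have nothing to do with the loaded data.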

Process Flow

...

  1.  Create the object model for data reconciliation according to the requirements of your scenario.

  2.  Load data from the productive DataSource into the DataStore object using suitable selection conditions.

  3.  Make sure that there is no unloaded data in the delta queue or in the application for the productive DataSource when the check is performed. To achieve this, either stop the application while validating the data, or limit the data that is to be reconciled by means of selections (for example, by creating and using time stamps for the data records).

  4.  Check the data in the query.

  5.  If you find inconsistencies, check whether all the data was loaded from the source system. If applicable, load the data that has not yet been loaded into BI and perform the reconciliation again.

If the loaded data is not complete, start a repair request.

If the loaded data is complete but not correct, reinitialize the delta or contact SAP.
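The distinction made in the steps above (data not complete versus complete but not correct) can be sketched as a classification of each key, under an assumed record structure:

```python
# Sketch of the diagnosis in the steps above: a key missing from the loaded
# data points to unloaded data (start a repair request), while a key present
# with a different value points to incorrect data (reinitialize the delta).
# The record structure is an illustrative assumption.

def diagnose(loaded, source):
    """Classify source keys as missing from or incorrect in the loaded data."""
    missing, incorrect = [], []
    for key, source_value in source.items():
        if key not in loaded:
            missing.append(key)       # not complete: start a repair request
        elif loaded[key] != source_value:
            incorrect.append(key)     # complete but not correct
    return missing, incorrect

loaded = {"DOC1": 100.0, "DOC2": 55.0}
source = {"DOC1": 100.0, "DOC2": 50.0, "DOC3": 70.0}
print(diagnose(loaded, source))  # → (['DOC3'], ['DOC2'])
```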

Example

Information about application-specific scenarios for performing a data reconciliation is available in the How-To-Guide How to… Reconcile Data Between SAP Source Systems and SAP NetWeaver BI in SDN at http://www.sdn.sap.com/irj/servlet/prt/portal/prtroot/docs/library/uuid/7a5ee147-0501-0010-0a9d-f7abcba36b14.
