Data Flow in SAP NetWeaver Business Warehouse

Use

The data flow in the SAP NetWeaver Business Warehouse (BW) defines which objects are needed at design time and which processes are needed at runtime. These objects and processes transfer data from a source to BW and cleanse, consolidate, and integrate the data so that it can be used for analysis, reporting, and planning. Numerous options for designing the data flow support the individual requirements of your company's processes. You can use any data sources that transfer the data to BW or access the source data directly, apply simple or complex cleansing and consolidation methods, and define data repositories that correspond to the requirements of your layer architecture.

With SAP NetWeaver 7.0, the concepts and technologies for certain elements in the data flow were changed. The most important components of the new data flow are explained below, along with the changes from the previous data flow. To distinguish them from the new objects, the previously used objects have the suffix 3.x.

The following graphic shows the data flow in the Data Warehouse:

In BW, the metadata description of the source data is modeled with DataSources. A DataSource is a set of fields that are used to extract data of a business unit from a source system and transfer it to the entry layer of the BW system or provide it for direct access.

There is a new object concept for DataSources in BW (SAP NetWeaver 7.0 and higher). In BW, the DataSource is edited or created independently of 3.x objects on a unified user interface. When the DataSource is activated, the system creates a PSA table in the Persistent Staging Area (PSA), the entry layer of BW. In this way the DataSource represents a persistent object within the data flow.

Before data can be processed in BW, it has to be loaded into the PSA using an InfoPackage. In the InfoPackage, you specify the selection parameters for transferring data into the PSA. In the new data flow, InfoPackages are only used to load data into the PSA.

A transformation copies data from a source format to a target format in BW. It thereby allows you to consolidate and cleanse data from multiple sources and to perform semantic synchronization of data from various sources. You integrate the data into the BW system by assigning fields from the DataSource to InfoObjects. In the data flow, the transformation replaces the update and transfer rules, including transfer structure maintenance.
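Conceptually, a transformation is a set of rules that maps DataSource fields to InfoObjects, optionally applying cleansing logic per target. The following Python sketch illustrates this idea only; the field names, InfoObject names, and functions are hypothetical and do not represent SAP's actual API:

```python
# Illustrative sketch only -- not SAP code. Field and InfoObject names are made up.

def transform(record: dict, field_mapping: dict, rules: dict) -> dict:
    """Map source (DataSource) fields to target InfoObjects, applying an
    optional rule per InfoObject (e.g. a cleansing or conversion step)."""
    result = {}
    for source_field, info_object in field_mapping.items():
        value = record.get(source_field)
        rule = rules.get(info_object)
        result[info_object] = rule(value) if rule else value
    return result

# Hypothetical mapping of DataSource fields to InfoObjects
mapping = {"KUNNR": "0CUSTOMER", "NETWR": "0NET_VALUE"}
# A simple cleansing rule for one target
rules = {"0CUSTOMER": lambda v: v.strip().upper()}

print(transform({"KUNNR": " c100 ", "NETWR": 250.0}, mapping, rules))
# {'0CUSTOMER': 'C100', '0NET_VALUE': 250.0}
```

In a real BW system this mapping is maintained in the transformation editor rather than in code; the sketch only shows the field-assignment principle described above.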

InfoObjects are the smallest information units in BW. They structure the information in the form needed to build up InfoProviders.

InfoProviders consist of several InfoObjects. They are persistent data repositories that are used in the layer architecture of the Data Warehouse or in data views. They provide data for analysis, reporting and planning. You also have the option of writing the data to other InfoProviders.

Using an InfoSource (optional in the data flow), you can connect multiple sequential transformations. You therefore only require an InfoSource for complex transformations (multistep procedures).

You use the data transfer process (DTP) to transfer the data within BW from one persistent object to another object, in accordance with certain transformations and filters. Possible sources for the data transfer include DataSources and InfoProviders; possible targets include InfoProviders and open hub destinations. To distribute data within BW and in downstream systems, the DTP replaces the InfoPackage, the Data Mart Interface (export DataSources) and the InfoSpoke.

You can also distribute data to other systems using an open hub destination.

In BW, process chains are used to schedule the processes associated with the data flow, including InfoPackages and data transfer processes.

The complexity of data flows varies. As an absolute minimum, you need a DataSource, a transformation, an InfoProvider, an InfoPackage and a data transfer process.
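The minimal data flow described above can be sketched as a small pipeline: an InfoPackage loads selected source rows into the PSA, and a data transfer process then moves them through a transformation into an InfoProvider. All classes and function names below are illustrative analogies, not SAP objects:

```python
# Conceptual sketch of the minimal BW data flow -- illustrative only, not SAP code.

class PSA:
    """Persistent Staging Area: entry layer that stores loaded records unchanged."""
    def __init__(self):
        self.records = []

def info_package_load(psa, source_rows, selection=None):
    """InfoPackage: load source data into the PSA, optionally restricted
    by selection parameters."""
    for row in source_rows:
        if selection is None or selection(row):
            psa.records.append(row)

def dtp_transfer(psa, transformation, info_provider):
    """Data transfer process: move data from the PSA through a
    transformation into a target InfoProvider."""
    for row in psa.records:
        info_provider.append(transformation(row))

psa = PSA()
info_package_load(
    psa,
    [{"region": "EU", "amount": 10}, {"region": "US", "amount": 20}],
    selection=lambda r: r["region"] == "EU",   # selection parameter of the InfoPackage
)
provider = []
dtp_transfer(psa, lambda r: {**r, "amount": r["amount"] * 2}, provider)
print(provider)  # [{'region': 'EU', 'amount': 20}]
```

Note that, as in the text, the InfoPackage only loads into the PSA; all further movement within the sketch is done by the DTP.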

Uses and Advantages of the Data Flow with SAP NetWeaver 7.0

DataSource object type RSDS (new DataSource):

The new DataSource of object type RSDS enables real-time data acquisition, as well as direct access to source systems of type File and DB Connect.

Data Transfer Process:

The data transfer process (DTP) makes the transfer processes in the data warehousing layers more transparent. Optimized parallelization improves the performance of the transfer processes. With the DTP, delta processes can be separated for different targets, and filtering options can be applied to the persistent objects on different levels. Error handling can also be defined for DataStore objects with the DTP. Error handling is simplified by the ability to sort incorrect records into an error stack and to write the data to a buffer after the processing steps of the DTP. When you use a DTP, you can also directly access any DataSource in the SAP source system that supports the corresponding mode in its metadata (including master data and text DataSources).
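The error-stack behavior can be illustrated with a short sketch: instead of aborting the whole load, records that fail validation are diverted, together with the error reason, into a separate collection for later correction. This is an analogy only, not SAP code:

```python
# Illustrative sketch of DTP-style error handling -- not SAP code.
# Invalid records are diverted to an "error stack" instead of failing the load.

def dtp_run(records, validate, transformation):
    target, error_stack = [], []
    for rec in records:
        try:
            if not validate(rec):
                raise ValueError("validation failed")
            target.append(transformation(rec))
        except Exception as exc:
            # Keep the record with its error reason for later correction.
            error_stack.append({"record": rec, "error": str(exc)})
    return target, error_stack

rows = [{"id": 1, "qty": 5}, {"id": 2, "qty": -3}]
ok, errors = dtp_run(
    rows,
    validate=lambda r: r["qty"] >= 0,
    transformation=lambda r: dict(r),
)
print(len(ok), len(errors))  # 1 1
```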

Transformation:

Transformations simplify the maintenance of rules for cleansing and consolidating data. Instead of two sets of rules (transfer rules and update rules), as in the past, only transformation rules are needed. You edit transformation rules in an intuitive graphical user interface. InfoSources are no longer mandatory; they are optional and only required for certain functions. Transformations also provide additional functions, such as quantity conversion and performance-optimized reading of master data and DataStore objects, as well as the option to create an end routine or expert routine.

Data Flow Modeling Tool

You model data flows and their elements in the Modeling functional area of the Data Warehousing Workbench. The graphical user interface helps you create top-down models and use best-practice models (data flow templates provided by SAP). With top-down modeling, you create a model blueprint in the BW system, which you can later use to create a persistent data flow.

For more information, see Modeling (especially the Graphical Modeling section).

Migration of 3.x Data Flows

For more information on migrating existing data flows with 3.x objects, see Migrating a Data Flow.