Data Flow Design

A data flow may contain multiple sources, but has a single target object.

The first transform takes its input from source tables or files. The input is transformed as needed and mapped to the Output pane. Subsequent transforms in the data flow take as input the output columns of the previous transform step. The final transform must be a target transform. SAP Cloud Integration for data services automatically creates the correct type of target transform based on the target type.
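The chaining described above can be pictured with a small sketch. This is an illustrative model only, not SAP Cloud Integration for data services code: the names `run_data_flow`, `keep_active`, and `to_target`, and the sample columns, are hypothetical and chosen for the example.

```python
from typing import Callable

# Hypothetical model: a row is a dict of column name -> value,
# and a transform maps a list of rows to a new list of rows.
Row = dict[str, object]
Transform = Callable[[list[Row]], list[Row]]


def run_data_flow(source_rows: list[Row], transforms: list[Transform]) -> list[Row]:
    """Feed the output of each transform into the next one, in order."""
    rows = source_rows
    for transform in transforms:
        rows = transform(rows)
    return rows


def keep_active(rows: list[Row]) -> list[Row]:
    # First transform: reads the source rows and filters them.
    return [r for r in rows if r["status"] == "active"]


def to_target(rows: list[Row]) -> list[Row]:
    # Final transform: maps columns to the (assumed) target schema.
    return [{"CUSTOMER_ID": r["id"], "NAME": str(r["name"]).upper()} for r in rows]


source = [
    {"id": 1, "name": "Ada", "status": "active"},
    {"id": 2, "name": "Bob", "status": "inactive"},
]
result = run_data_flow(source, [keep_active, to_target])
```

In this sketch, `to_target` only ever sees the columns that `keep_active` outputs, which mirrors how each transform step works from the output columns of the step before it.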

About the target schema

The Output pane of the final transform shows the target object schema. Changes to the schema cannot be made in the Output pane of the target transform. If changes are required, they must be made in the database, file format or web service. Changed database and web service objects must be reimported in the datastore. Changed file format objects do not need to be reimported.
Note
A web service object can be reimported only while the web service is up and running.

Transform order in a data flow

Within a data flow, data must be transformed in a specific order: first any ABAP transforms (for SAP sources), next any additional transforms, and finally the target transform.

The target transform is the only required transform in a data flow. All other transforms are optional and serve to manipulate the data as needed to meet your requirements.
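The ordering rule above can be expressed as a small check. This is a sketch under stated assumptions: the kind labels `"abap"`, `"other"`, and `"target"` and the function `is_valid_order` are hypothetical and are not part of the product.

```python
# Hypothetical labels for the three groups of transforms named in the text.
ORDER = {"abap": 0, "other": 1, "target": 2}


def is_valid_order(kinds: list[str]) -> bool:
    """Return True if the transforms follow ABAP -> other -> target.

    The data flow must end with exactly one target transform;
    every other transform is optional.
    """
    if not kinds or kinds[-1] != "target" or kinds.count("target") != 1:
        return False
    ranks = [ORDER[kind] for kind in kinds]
    # A valid sequence never steps back to an earlier group.
    return ranks == sorted(ranks)
```

For example, `is_valid_order(["target"])` and `is_valid_order(["abap", "other", "target"])` both hold, while `is_valid_order(["other", "abap", "target"])` does not, because an ABAP transform follows a non-ABAP one.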

Considerations

Before you begin to create a data flow from scratch, consider the following points:
  • For each target object, determine what sources are required and what transformations are needed for that data. With that information, you can map out what transform types you will use.

  • Consider what global variables will be useful.

    Values assigned to global variables apply across all data flows within a task.

  • If you have an existing data flow that you can adapt, you can create a duplicate and then modify the duplicated data flow as needed.

Best Practices

Best practice when creating a data flow from scratch is to begin by defining the first transform in the data flow. This is the transform that extracts the data from your source and may also manipulate your data. As needed, you can add intermediate transforms to manipulate the data. The target transform loads data to the target and must be the final transform in the data flow. As such, it would be the last transform you define.

Best practice is to rename columns or edit data types so that they match those in the target schema as early in the data flow as possible. By doing this, you can take advantage of Automap functionality in the Target Query transform.
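To see why early renaming helps, consider a simplified model of name-based mapping. The exact matching rules that Automap applies are not described here, so this sketch assumes a plain, case-sensitive match on column name; the function `automap` and all column names are hypothetical.

```python
def automap(upstream_columns: list[str], target_columns: list[str]):
    """Map each target column to an upstream column with the same name.

    Assumption: matching is by exact column name only. Returns the
    mapping found and the target columns left unmapped.
    """
    mapped = {col: col for col in target_columns if col in upstream_columns}
    unmapped = [col for col in target_columns if col not in upstream_columns]
    return mapped, unmapped


target = ["CUSTOMER_ID", "NAME", "CITY"]

# Column names as they arrive from the source.
before = ["cust_no", "NAME", "town"]

# The same columns after an early rename to the target names.
rename = {"cust_no": "CUSTOMER_ID", "town": "CITY"}
after = [rename.get(col, col) for col in before]

mapped_before, unmapped_before = automap(before, target)
mapped_after, unmapped_after = automap(after, target)
```

Under this assumption, only `NAME` is mapped automatically before the rename, leaving `CUSTOMER_ID` and `CITY` to be mapped by hand. After the rename, all three target columns are matched.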