Show TOC

Creating DataSources for File Source SystemsLocate this document in the navigation structure

Use

Before you can transfer data from a file source system, the metadata (the file and field information) must be available in BW in the form of a DataSource.

Prerequisites

Note the following with regard to CSV files:

  • Fields that are not filled in a CSV file are filled with a blank space if they are character fields or with a zero (0) if they are numerical fields.

  • If separators are used inconsistently in a CSV file, the incorrect separator (which is not defined in the DataSource) is read as a character and both fields are merged into one field and can be shortened. Subsequent fields are then no longer in the correct order.

  • A line break cannot be used as part of a value, even if the value is enclosed with an Escape sign.

Note the following with regard to CSV files and ASCII files:

  • The conversion routines that are used determine whether you have to specify leading zeros. See Conversion Routines in BW Systems.

  • For dates, you usually use the format YYYYMMDD, without internal separators. Depending on the conversion routine used, you can also use other formats.

Notes on Loading

  • When you load external data, you have the option of loading the data from any workstation into BW. Loading from a workstation is only possible in the dialog however. For performance reasons, you should store the data on an application server and load it from there into BW. This means that you can also load the data in the background and thus use it in process chains too.

  • If you want to load a large amount of transaction data into BW from a flat file and you can specify the file type of the flat file, you should create the flat file as an ASCII file. From a performance point of view, loading data from an ASCII file is the most cost-effective method. Loading from a CSV file takes longer because the separator characters and escape characters then have to be sent and interpreted. In some circumstances, generating an ASCII file can involve more effort.

  • When loading from a workstation, the data is loaded to a file in a single data package. With very large files (large number of data records, broad structure), a crash can occur when loading into the PSA via an InfoPackage. There are various ways of avoiding this:
    • We recommend placing the file on the application server. The InfoPackage can read from the application server on a package-by-package basis. Note however that Microsoft Excel files cannot be read from the application server. You should therefore convert your file into a CSV file.
    • If possible, an alternative solution is to split your file into multiple smaller files.
Procedure

You are in the DataSource tree in the Data Warehousing Workbench.

  1. Select the application component that you want to create the DataSource in and choose Create DataSource.

  2. On the next screen, enter a technical name for the DataSource, select the type of DataSource and choose Copy.

    The DataSource maintenance screen appears.

  3. Go to the General tab page.

    1. Enter descriptions for the DataSource (short, medium, long).

    2. If required, specify whether the DataSource builds an initial non-cumulative and has the ability to return duplicate data records in a request.

    3. Specify whether you want to generate the PSA for the DataSource in the character format. If the PSA is not typed it is not generated in a typed structure but is generated with character-like fields of type CHAR only.

      Use this option if conversion during loading causes problems, for example because there is no appropriate conversion routine, or if the source cannot guarantee that data is loaded with the correct data type.

      After activating the DataSource, you can then load data into the PSA and correct it there.

  4. Go to the Extraction tab page.

    1. Define the delta process for the DataSource.

      You can use the generic delta. Using a delta-relevant field whose value rise monotonously over time, the system determines which data to transfer at runtime. More information: Using Generic BW Deltas.

    2. Specify whether you want the DataSource to support direct access to data.

    3. Real-time data acquisition is not supported for data transfer from files.

    4. Select the adapter for the data transfer. You can load text files or binary files from your local work station or from the application server:

      • Text-type files only contain characters that can be displayed and read as text. CSV and ASCII files are examples of text files. For CSV files, you have to specify a character that separates the individual field values. In BW you have to specify this separator character and an escape character which specifies this character as a component of the value, as required. After you have specified these characters you have to use them in the file. ASCII files contain data in a specified length. The defined field length in the file must be the same as the length of the assigned field in BW.

      • Binary files contain data in the form of bytes. A file of this type can contain any type of byte value, including bytes that cannot be displayed or read as text. In this case, the field values in the file have to be the same as the internal format of the assigned field in BW.

      Choose Properties if you want to display the general adapter properties.

    5. Select the path to the file that you want to load or enter the name of the file directly, for example C:/Daten/US/Kosten97.csv.

      You can also create a routine that determines the name of your file. If you do not create a routine to determine the name of the file, the system reads the file name directly from the File Name field.

    6. Depending on the adapter and the file to be loaded, there are further settings that need to be made.

      • For binary files:

        Specify the character record settings for the data that you want to transfer.

      • Text-type files:

        1. Specify how many rows in your file are header rows and can therefore be ignored when the data is transferred.

        2. Specify the character record settings for the data that you want to transfer:

          If you are loading data from an ASCII file, the data is requested with a fixed data record length.

          If you are loading data from an Excel CSV file, specify the data separator and the escape character.

          1. In the Data Separator field, specify the separator that your file uses to divide the fields.

          2. If the data separator character is a part of the value, the file indicates this by enclosing the value in particular start and end characters. Enter these start and end characters in the Escape Characters field.

          3. In a text editor (Notepad for example) check the data separator and the escape character currently being used in the file. These depend on the country version of the file you are using.

          4. If you select the Hex flag, you can specify the data separator and the escape character in hexadecimal format. When you enter a character for the data separator and the escape character, these are displayed as hexadecimal code after the entries have been checked. A two character entry for a data separator or an escape sign is always interpreted as a hexadecimal entry.

          Note
          • Example of how to use data separators and escape characters: You have chosen the ; character as the data separator. Your file contains the value 12;45 for a field however. If you set " as the escape character, the value in the file must be "12;45" so that 12;45 is loaded into BW. The complete value that you want to transfer has to be enclosed by the escape characters.

          • If the escape characters do not enclose the value but are used within the value, the system interprets the escape characters as a normal part of the value. If you have specified " as the escape character, the value 12"45 is transferred as 12"45 and 12"45" is transferred as 12"45".

          • Note that if you do not specify an escape character, the space character is interpreted as the escape character. We recommend using a different character as the escape character.

    7. Make the settings for the number format (thousand separator and character used to represent a decimal point) if necessary.

    8. Make the settings for currency conversion if necessary.

    9. Make any further settings that are dependent on your selection if necessary.

  5. Go to the Proposal tab page.

    Here, you create a proposal for the field list of the DataSource based on the sample data of your file.

    1. Specify the number of data records that you want to load and choose Upload Sample Data.

      The data is displayed in the upper area of the tab page in the format of your file.

      The system displays the proposal for the field list in the lower area of the tab page.

    2. In the table of proposed fields, use Copy to Field List to select the fields you want to copy to the DataSource's field list. In the default setting, all fields are selected.

  6. Go to the Fields tab page.

    Here you edit the fields that you transferred to the field list of the DataSource from the Proposal tab page. If you did not transfer the field list from a proposal, you can define the fields of the DataSource here.

    If the system detects changes between the proposal and the field list when switching from the Proposal tab to the Fields tab, a dialog box is displayed where you can specify whether you want to copy changes from the proposal to the field list.

    1. To define a field, choose Insert Row and enter a field name.

    2. Under Transfer, specify the decision-relevant DataSource fields that you want to be available for extraction and transferred to BW.

    3. Instead of generating a proposal for the field list, you can enter InfoObjects to define the fields for the DataSource. Under Template InfoObject, specify InfoObjects for the fields in BW. This allows you to transfer the technical properties of the InfoObjects into the DataSource field.

      Note Entering InfoObjects here does not equate to assigning them to DataSource fields. This assignment is made in the transformation. When you define the transformation, the system proposes the InfoObjects you entered here as InfoObjects that you might want to assign to a field.
    4. Change the data type of the field if necessary.

    5. Specify the key fields of the DataSource.

      These fields are generated as a secondary index in the PSA. This is important to ensure good performance for data transfer process selections, in particular with semantic grouping.

    6. Specify whether lowercase is supported.

    7. Specify whether the source provides the data in the internal or external format.

    8. If you choose the external format, make sure that the output length of the field (external length) is correct. Change the entries if necessary.

    9. If necessary, specify a conversion routine that converts data from an external format into an internal format.

    10. Select the fields that you want to be able to set selection criteria for when scheduling a data request using an InfoPackage. Data for this type of field is transferred in accordance with the selection criteria specified in the InfoPackage.

    11. Choose the selection options (such as EQ, BT) that you want to be available for selection in the InfoPackage.

    12. If necessary, define whether the data to be selected is language-dependent or time-dependent in the field type.

  7. Check, save and activate the DataSource.

  8. Go to the Preview tab.

    If you select Read Preview Data, the number of data records you specified in your field selection is displayed in a preview.

    This function allows you to check whether the data formats and data are correct.

Result

The DataSource is created and is visible in the Data Warehousing Workbench in the DataSource overview for the file source system in the application component. When you activate the DataSource, the system generates a PSA table and a transfer program.

You can now create an InfoPackage. You define the selections for the data request in the InfoPackage. The data can be loaded into the inbound layer of the BW system, the PSA. With a data transfer process, you can load the data into the InfoProvider directly without loading it into the PSA first. You can also access the data directly if the DataSource supports direct access, and you have defined a VirtualProvider in the data flow.