Modeling Guide

Configure the Data Source Node

Data Source nodes provide connections to the input data.

Prerequisites

  • You have modeled the Data Transform operator with the Data Source node.
  • You have created connections in the SAP Data Hub Connection management application.
  • The connection type used in the connection definition has the STRUCTURED_TRANSFORM capability.

Procedure

  1. Double-click the Data Source node.
  2. Select the source data set.
    1. In the Connection ID text field, enter the required connection ID.
      You can also browse and select the required connection.
    2. In the Source field, browse and select the required source data set.
    In the Columns section, the tool displays all columns from the selected data set.
  3. (Optional) Import a data set from the Metadata Catalog.
    You can browse the folders in the Metadata Catalog and select the required data set.
    1. In the editor, choose Import from Catalog.
    2. Browse and select the required data set.
    3. Choose OK.
      The tool automatically populates the connection details based on the selected data set. For more information about the Metadata Catalog, see Managing the Catalog in the Data Governance User Guide.
  4. Define the output columns.
    In the Columns section, define the data set. The steps depend on the connection type of the selected connection ID.

    Connection Type: HDFS, Amazon S3, ADL, Google Cloud Storage (GCS), and WASB
    Next Steps:
      1. In the Source field, browse to the required file in HDFS, Amazon S3, ADLS, Google Cloud Storage, or WASB.
      2. In the Format dropdown list, select the file format.
        The tool supports Parquet, CSV, and ORC file formats. For CSV files, you can define additional CSV-specific properties such as the character set, column delimiter, and the text delimiter. The tool parses the CSV files based on the values that you provide.
      3. For CSV files, select a value for Includes Header.
        This value helps the tool identify whether the selected file contains a header row. If the file already has a header row, enable the Includes Header toggle button.

    Connection Type: VORA
    Next Steps:
      1. In the Schema Name field, browse to the required SAP Vora table.
        The tool automatically populates the Table Name field.

    1. (Optional) Choose + (Add Column) to define additional columns.
    2. Select a data type for each column.
      Depending on the selected data type, you can define the length, scale, precision, or format.
    3. Under the Primary Key column, select a column that serves as the primary key.
      Typically the data in this column is unique.
    4. Under the Nullable column, choose whether the column value can be empty (nullable).
    5. (Optional) To reorder the columns, select the column that you want to move and click the up or down arrow.
    6. (Optional) To use the structure definition that the tool proposes for the selected file or table, choose Auto Propose in the Columns section.
  5. Connect nodes.
    If you want to configure the Data Transform operator with another node, do the following:
    1. In the menu bar, use the breadcrumb navigation to navigate back to the operator configuration editor.
    2. Add new nodes.
    3. To connect the nodes, select the output port of a node and drag the cursor to an input port of another node.
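
To see how the settings from step 4 interact, the following minimal Python sketch may help. It is not the tool's implementation or API. It uses only the Python standard library, and the names Column and read_source are hypothetical, introduced here for illustration. It assumes that the character set, column delimiter, text delimiter, and Includes Header values control how a CSV file is split into values, and that the Nullable and Primary Key settings are checked per row.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical stand-in for one row of the Columns section."""
    name: str
    data_type: type            # Python type used to convert the raw text
    primary_key: bool = False
    nullable: bool = True


def read_source(raw, columns, charset="utf-8", column_delimiter=",",
                text_delimiter='"', includes_header=True):
    """Parse CSV bytes and check each row against the column definitions."""
    text = raw.decode(charset)
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=column_delimiter,
                        quotechar=text_delimiter)
    rows = list(reader)
    if includes_header:
        rows = rows[1:]        # the first row holds column names, not data

    seen_keys = set()
    records = []
    for row_no, row in enumerate(rows, start=1):
        if len(row) != len(columns):
            raise ValueError(
                f"row {row_no}: expected {len(columns)} values, got {len(row)}")
        record = {}
        for col, value in zip(columns, row):
            if value == "":
                if not col.nullable:
                    raise ValueError(f"row {row_no}: {col.name} must not be empty")
                record[col.name] = None
            else:
                record[col.name] = col.data_type(value)
        # Assumption: primary key values are expected to be unique.
        key = tuple(record[c.name] for c in columns if c.primary_key)
        if key:
            if key in seen_keys:
                raise ValueError(f"row {row_no}: duplicate primary key {key}")
            seen_keys.add(key)
        records.append(record)
    return records


# Sample file: semicolon as column delimiter, single quote as text delimiter.
raw = "id;name;city\n1;'Smith; Ann';Berlin\n2;Lee;\n".encode("utf-8")
columns = [
    Column("id", int, primary_key=True, nullable=False),
    Column("name", str, nullable=False),
    Column("city", str),
]
records = read_source(raw, columns, column_delimiter=";", text_delimiter="'")
print(records)
```

In the sample data, the text delimiter keeps the semicolon inside 'Smith; Ann' from being treated as a column boundary, and the empty city value in the second row is accepted only because that column is nullable.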