Modeling Guide for SAP Data Hub

Creating Graphs

A graph is a network of operators connected to each other using typed input ports and output ports for data transfer. Users can define and configure the operators in a graph.
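As a rough mental model, the definition above can be sketched as a small data structure. This is an illustrative sketch only, not the Modeler's internal representation: the class names `Operator` and `Graph`, the port names, and the type name `string` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Operator:
    """A node in the graph with named, typed input and output ports."""
    name: str
    inports: dict = field(default_factory=dict)   # port name -> port type
    outports: dict = field(default_factory=dict)  # port name -> port type


@dataclass
class Graph:
    """A network of operators plus the connections between their ports."""
    operators: dict = field(default_factory=dict)
    connections: list = field(default_factory=list)

    def add(self, op: Operator) -> None:
        self.operators[op.name] = op

    def connect(self, src: str, outport: str, tgt: str, inport: str) -> None:
        # Only existing ports can be connected; type checks are left out here.
        if outport not in self.operators[src].outports:
            raise KeyError(f"{src} has no output port {outport!r}")
        if inport not in self.operators[tgt].inports:
            raise KeyError(f"{tgt} has no input port {inport!r}")
        self.connections.append((src, outport, tgt, inport))


# Hypothetical two-operator graph: a reader feeding a writer.
graph = Graph()
graph.add(Operator("reader", outports={"out": "string"}))
graph.add(Operator("writer", inports={"in": "string"}))
graph.connect("reader", "out", "writer", "in")
```

A graph with a single operator is simply a `Graph` with one entry in `operators` and no connections.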

Procedure

  1. Start the SAP Data Hub Modeler.
  2. In the navigation pane, select the Graphs tab.
  3. In the navigation pane bar, choose + (Create Graph).
    The application opens an empty graph editor in the same window, where you can define your graph.
  4. Select operators.
    A graph can contain a single operator, or a network of operators based on the business requirement.
    1. In the navigation pane, choose the Operators tab.
    2. Select the required operator.
      You can also use the search bar to search for and select the required operator. Double-click the operator (or drag and drop it to the graph editor) to include it as a process for the graph execution.
  5. Configure the operators.
    Operators are defined with default configuration parameters. You can customize an operator by providing your own values for the parameters, or by defining additional configuration parameters.
    1. In the graph editor, select the required operator.
    2. In the editor bar, choose (Open Configuration).
      The application displays the default configuration parameters defined for the selected operator. You can customize the operator by providing new values for the default parameters. Depending on the operator type, you can also specify values for the following configuration parameters.

      Configuration Parameter | Description
      subengines | If multiple implementations exist for the operator, you can select the required subengine in which you want to execute the operator. The default value is main (Pipeline Engine). To select the required engine, in the subengines section, choose + and in the dropdown list, select the required subengine.

  6. (Optional) Define new configuration parameters.
    You can add and define new configuration parameters for an operator:
    1. In the Configuration pane, choose + (Add Parameter).
    2. In the Add Property dialog box, provide a name for the new configuration parameter.
    3. Select the property type.
      Select Text, JSON, or Boolean, based on whether the parameter value is a string value, JSON value, or a Boolean value.
    4. In the Value text field, provide a value.
    5. Choose OK.
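The three property types in the step above can be pictured as three ways of interpreting the text entered in the Value field. The following is a minimal sketch under that assumption; `parse_parameter` and the parameter names `delimiter`, `header`, and `columns` are hypothetical and are not part of the product.

```python
import json


def parse_parameter(prop_type: str, raw: str):
    """Convert the text entered for a parameter into a typed value."""
    if prop_type == "Text":
        return raw
    if prop_type == "JSON":
        return json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if prop_type == "Boolean":
        lowered = raw.strip().lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        raise ValueError(f"not a Boolean value: {raw!r}")
    raise ValueError(f"unknown property type: {prop_type!r}")


# Hypothetical custom parameters for an operator.
config = {
    "delimiter": parse_parameter("Text", ";"),
    "header": parse_parameter("Boolean", "true"),
    "columns": parse_parameter("JSON", '["id", "name"]'),
}
```

Choosing JSON rather than Text matters when the operator expects a structured value such as a list or an object instead of a plain string.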
  7. (Optional) Validate configurations.
    If the operator has a type schema associated with it, the application lets you validate the configuration parameter values against the conditions defined in the schema. For example, you can validate mandatory fields, minimum or maximum length, value formats, regular expressions, and so on, as defined in the type schema.
    1. In the Configuration pane, choose the Validate button.
    The application runs validations on the configuration parameter values and displays validation errors, if any.
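To illustrate the kinds of checks listed in this step, here is a hand-rolled sketch of schema-style validation. It is not the Modeler's validator: the rule names `required`, `minLength`, `maxLength`, and `pattern` are borrowed from common JSON Schema vocabulary as an assumption, and the parameter names are hypothetical.

```python
import re


def validate(config: dict, schema: dict) -> list:
    """Return a list of validation error messages (empty if valid)."""
    errors = []
    for name, rules in schema.items():
        value = config.get(name)
        if value is None or value == "":
            # Missing values only matter for mandatory fields.
            if rules.get("required"):
                errors.append(f"{name}: value is mandatory")
            continue
        text = str(value)
        if "minLength" in rules and len(text) < rules["minLength"]:
            errors.append(f"{name}: shorter than {rules['minLength']} characters")
        if "maxLength" in rules and len(text) > rules["maxLength"]:
            errors.append(f"{name}: longer than {rules['maxLength']} characters")
        if "pattern" in rules and not re.fullmatch(rules["pattern"], text):
            errors.append(f"{name}: does not match {rules['pattern']}")
    return errors


# Hypothetical schema and configuration values.
schema = {
    "path": {"required": True, "pattern": r"/[\w/.-]+"},
    "delimiter": {"minLength": 1, "maxLength": 1},
}
errors = validate({"path": "", "delimiter": ";;"}, schema)
```

Here `errors` contains one entry for the empty mandatory `path` and one for the over-long `delimiter`; a configuration that satisfies every rule yields an empty list.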
  8. (Optional) Add new ports.
    For JavaScript operators, Python operators, multiplexer operators, and other extensible operators, you can define more input and output ports.
    1. In the graph editor, right-click the operator (for example, a multiplexer or JavaScript operator) and choose Add Port.
    2. Provide a port name and a type.
    3. Select whether the port is an input or an output port.
    4. Choose OK.
    For more information on valid port types, see the topic Port Types.
  9. Connect operators.
    Select an output port of an operator and drag the cursor to an input port of another operator.
    The application highlights all input ports, based on the output port type, to which you can connect the operator.
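The highlighting behavior can be thought of as a filter over all input ports in the graph. The sketch below assumes that two ports are compatible only when their types are identical; the actual compatibility rules are described in the topic Port Types and may be broader. The function name and the operator, port, and type names are hypothetical.

```python
def compatible_inputs(out_type: str, operators: dict) -> list:
    """List (operator, port) pairs whose input type matches out_type."""
    return [
        (op_name, port_name)
        for op_name, inports in operators.items()
        for port_name, port_type in inports.items()
        if port_type == out_type  # assumption: exact type match only
    ]


# Hypothetical operators, each mapping input port names to port types.
operators = {
    "terminal": {"in1": "string"},
    "writer": {"inFile": "message", "inPath": "string"},
}
targets = compatible_inputs("string", operators)
```

In this example, dragging from a `string` output port would offer `terminal.in1` and `writer.inPath` as connection targets, but not `writer.inFile`.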
  10. (Optional) Create groups.
    You can partition a graph into groups. Each group's subgraph runs in its own Docker container, and the containers can be assigned to different cluster nodes. Operators inside the same group, however, always run on the same node. You can configure each group with a different restart policy, tags, or multiplicity. For more information on the use cases of groups, see Groups, Tags, and Dockerfiles.
    1. Hold the SHIFT key and select the operators that you want to group together.
    2. In the context menu, choose Group.
    3. If you want to ungroup the operators, right-click the group and choose Ungroup.
    4. If you want to expand an existing group to include more operators, click and drag the group boundary over the required operators.

      Alternatively, drag and drop the operators inside the existing group region.

  11. (Optional) Configure groups.
    Like operators, each group (subgraph) is associated with certain configuration parameters. You can provide your own values for these parameters, or define additional configuration parameters for the group.
    1. In the editor, select a group (subgraph).
    2. In the editor bar, choose the icon that opens the Configuration pane.
      The application displays the predefined configuration parameters available for the group. You can override the default values for the predefined configuration parameters.

      Configuration Parameter | Description
      description | Provide a description for the group.
      restartPolicy | The restart policy describes the behavior of the cluster scheduler when a group execution results in a crash. In the dropdown list, select a value for this property. If you do not specify a value, the application uses the default value, never. If you set the restart policy to never, the cluster scheduler does not restart the crashed group, and the final state of the graph is dead. If you set the restart policy to restart, the cluster scheduler restarts the group execution. The restart changes the state of the graph from dead to pending and then to running.
      tags | Tags describe the runtime requirements of groups. Choose + (Add Tag) to define tags for the group. In the dropdown list, select the required tag and version. You can associate the group with more than one tag. For more information on tags, see Groups, Tags, and Dockerfiles.
      multiplicity | Specify the multiplicity as an integer value. For example, a multiplicity of 3 means that the application executes 3 instances of this group at runtime. Data arriving at a group with a multiplicity greater than one is distributed to the instances in round-robin fashion.

    3. If you want to define new configuration parameters for the group, in the Configuration pane, choose + (Add Parameter).
    4. In the Add Property dialog box, provide a name for the new property.
    5. Select the property type.
      Select Text, JSON, or Boolean, based on whether the property value is a string value, JSON value, or a Boolean value.
    6. In the Value text field, provide a value.
    7. Choose OK.
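Two of the group parameters in this step describe runtime behavior that a short sketch can make concrete: the state changes that follow a crash under each restart policy, and round-robin distribution across the instances of a group. This is an illustration under stated assumptions, not the scheduler's implementation; the function names and message labels are hypothetical.

```python
from itertools import cycle


def states_after_crash(restart_policy: str = "never") -> list:
    """Return the graph states that follow a group crash."""
    if restart_policy == "restart":
        # The scheduler restarts the group: dead -> pending -> running.
        return ["dead", "pending", "running"]
    # With never (the default), dead is the final state.
    return ["dead"]


def distribute(messages: list, multiplicity: int) -> dict:
    """Assign messages to group instances in round-robin order."""
    assignment = {i: [] for i in range(multiplicity)}
    for instance, message in zip(cycle(range(multiplicity)), messages):
        assignment[instance].append(message)
    return assignment


# Five hypothetical messages arriving at a group with multiplicity 3.
assignment = distribute(["m1", "m2", "m3", "m4", "m5"], multiplicity=3)
```

With a multiplicity of 3, instance 0 receives `m1` and `m4`, instance 1 receives `m2` and `m5`, and instance 2 receives `m3`.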
  12. (Optional) Configure the graph.
    You can provide a description and select a display icon that the application must use for the graph.
    1. In the editor bar, choose the icon that opens the graph configuration.
    2. Provide values for the configuration parameters.

      Configuration Parameter | Description
      description | Description of the graph.
      iconsrc | Path to the graph display icon that the application must use.
      icon | In the icon dropdown list, you can select the required icon for the graph. The application uses this icon for display only if you have not provided any value to iconsrc.
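The precedence between iconsrc and icon amounts to a simple fallback rule. The sketch below shows that rule only; `resolve_icon` is a hypothetical helper, and the icon path and icon name are made-up example values.

```python
def resolve_icon(config: dict):
    """Pick the display icon: iconsrc wins, icon is the fallback."""
    return config.get("iconsrc") or config.get("icon")


# Hypothetical graph configurations.
with_both = resolve_icon({"iconsrc": "icons/custom.svg", "icon": "puzzle-piece"})
icon_only = resolve_icon({"icon": "puzzle-piece"})
```

In the first case the path from iconsrc is used; in the second, the icon selected from the dropdown list applies because iconsrc is empty.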

  13. (Optional) Export the JSON definition.
    1. After creating the graph, if you want to export the JSON definition for the graph, in the navigation pane, select the Graphs tab.
    2. Right-click the required graph and choose the Export menu option.
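Once exported, the JSON definition can be inspected with any JSON tooling. The following sketch lists the top-level keys of a definition without assuming its structure. The sample string is a hypothetical stand-in; the keys in a real export may differ.

```python
import json


def summarize(definition_text: str) -> dict:
    """Map each top-level key of a JSON definition to its value's type name."""
    definition = json.loads(definition_text)
    return {key: type(value).__name__ for key, value in definition.items()}


# Hypothetical stand-in for an exported definition, not the actual format.
sample = '{"description": "demo graph", "processes": {}, "connections": []}'
summary = summarize(sample)
```

To inspect a real export, read the exported file's contents into a string and pass that string to `summarize`.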
  14. (Optional) Refer to operator documentation.
    SAP Data Hub provides documentation for its built-in operators and example graphs. The documentation provides more information on how to use the operator or a graph.

    Documentation | Steps
    Graphs | Open the graph in the graph editor and in the editor bar, choose (Show Documentation).
    Operator | Right-click an operator in the graph editor and choose the Open Documentation menu option.

  15. Save the graph.
    1. After creating a graph, in the editor bar, choose (Save) to save your graph.
    2. Choose the Save menu option.
    3. Provide a name along with the fully qualified path to the graph.
      For example, com.sap.others.graphname.
    4. Provide a description for the graph.
    5. Choose OK.
      The graphs and operators are stored in a folder structure within the modeler repository. For example, com.sap.others.graphname. If you want to save another instance of the graph, in the editor bar, choose (Save As) and provide a name along with the fully qualified path to the graph.
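One way to picture the relationship between a fully qualified name and the folder structure is to treat each dot-separated segment as a folder. That mapping is an assumption made for illustration; the repository's actual layout is not specified here, and `graph_folder` is a hypothetical helper.

```python
from pathlib import PurePosixPath


def graph_folder(qualified_name: str) -> PurePosixPath:
    """Turn a dot-separated graph name into a relative folder path."""
    parts = qualified_name.split(".")
    if not all(parts):
        raise ValueError(f"empty segment in {qualified_name!r}")
    # Assumption: every segment of the name becomes one folder level.
    return PurePosixPath(*parts)


folder = graph_folder("com.sap.others.graphname")
```

Under this assumption, the example name from the step above maps to the relative path `com/sap/others/graphname`.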
  16. (Optional) Validate the graph.
    The Modeler allows you to validate the configurations of operators in a graph before executing the graph.
    1. In the bottom pane, choose the Validation tab.
    2. Select the Enable validation toggle button.
      The application validates the graph only if this toggle button is enabled. Once enabled for a selected graph, the setting is saved for the logged-in user and the application runs validations for all open graphs.
    3. In the editor toolbar, choose (Save) to save the changes.

    In the Validation tab, the application displays the validation results. The validation errors are grouped for each open graph. You can expand the results to view operator-specific validation results.
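The grouping described for the Validation tab (by graph, then by operator) can be sketched as a nested dictionary. This is only an illustration of the grouping; the tuple layout, the function name, and the graph, operator, and message values are hypothetical.

```python
from collections import defaultdict


def group_results(results: list) -> dict:
    """Group (graph, operator, message) tuples by graph, then by operator."""
    grouped = defaultdict(lambda: defaultdict(list))
    for graph, operator, message in results:
        grouped[graph][operator].append(message)
    # Convert to plain dictionaries for easier display and comparison.
    return {graph: dict(ops) for graph, ops in grouped.items()}


# Hypothetical validation errors from two open graphs.
results = [
    ("graphA", "reader", "path: value is mandatory"),
    ("graphA", "writer", "mode: invalid value"),
    ("graphB", "reader", "path: value is mandatory"),
]
grouped = group_results(results)
```

Expanding an entry such as `grouped["graphA"]` then corresponds to viewing the operator-specific results for that graph.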