Modeling Guide

Create Graphs

A graph is a network of operators connected to each other using typed input ports and output ports for data transfer. Users can define and configure the operators in a graph.
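
To make the idea of typed ports concrete, the following is a minimal in-memory sketch of a graph, not the Modeler's actual data model or API. The class names (Port, Operator, Graph), the operator names, and the port type names ("string", "int64") are all hypothetical, and the sketch assumes that two ports are compatible only when their types match exactly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Port:
    name: str
    type: str  # hypothetical type name, e.g. "string"


@dataclass
class Operator:
    name: str
    inputs: Dict[str, Port] = field(default_factory=dict)
    outputs: Dict[str, Port] = field(default_factory=dict)
    config: Dict[str, object] = field(default_factory=dict)


@dataclass
class Graph:
    operators: Dict[str, Operator] = field(default_factory=dict)
    # Each connection is (source operator, output port, target operator, input port).
    connections: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def add(self, op: Operator) -> None:
        self.operators[op.name] = op

    def compatible_inputs(self, src: str, out_port: str) -> List[Tuple[str, str]]:
        """Return (operator, input port) pairs whose type equals the output port type."""
        wanted = self.operators[src].outputs[out_port].type
        return [
            (op.name, port.name)
            for op in self.operators.values()
            if op.name != src
            for port in op.inputs.values()
            if port.type == wanted
        ]

    def connect(self, src: str, out_port: str, dst: str, in_port: str) -> None:
        """Record a connection, rejecting ports whose types differ (assumed rule)."""
        out_type = self.operators[src].outputs[out_port].type
        in_type = self.operators[dst].inputs[in_port].type
        if out_type != in_type:
            raise TypeError(f"cannot connect {out_type} output to {in_type} input")
        self.connections.append((src, out_port, dst, in_port))


# Hypothetical three-operator graph.
graph = Graph()
graph.add(Operator("generator", outputs={"out": Port("out", "string")}))
graph.add(Operator("printer", inputs={"in": Port("in", "string")}))
graph.add(Operator("counter", inputs={"in": Port("in", "int64")}))
graph.connect("generator", "out", "printer", "in")
```

In this sketch, compatible_inputs plays the role of a lookup from an output port to the input ports that could accept it. A real tool may apply looser matching rules than exact equality.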

Procedure

  1. Start the SAP Data Hub Modeler.
  2. In the navigation pane, select the Graphs tab.
  3. In the navigation pane toolbar, choose + (Create Graph).
    The tool opens an empty graph editor in the same window, where you can define your graph.
  4. Select operators.
    A graph can contain a single operator or a network of operators based on the business requirement.
    1. In the navigation pane, choose the Operators tab.
    2. Select the required operator.
      You can also use the search bar to find and select the required operator. Double-click the operator (or drag and drop it onto the graph editor) to include it as a process in the graph execution.
  5. Configure the operators.
    Operators are defined with default configuration parameters. You can customize an operator by providing your own values for these parameters, or by defining additional configuration parameters.
    1. In the graph editor, select the required operator.
    2. In the editor toolbar, choose (Open Configuration).
      The tool displays the default configuration parameters defined for the selected operator. You can customize the operator by providing new values for the default parameters. Depending on the operator type, you can also specify values for the following configuration parameters.

      Configuration Parameter
        Description

      subengines
        If multiple implementations exist for the operator, you can select the subengine in which you want to execute the operator. The default value is main (SAP Data Hub Pipeline Engine). To select a subengine, in the subengines section, choose + and, in the dropdown list, select the required subengine.

      securitycontexts
        To access Hadoop, you need to create a security context. The security context values are available to the operators at runtime. To annotate the operator with security contexts, in the securitycontexts section, choose + (Add Security Context) and, in the dropdown list, select the required security context.

        For more information on security contexts, see 624ce81c22f94cb99a1100f4aae925e8.html

  6. (Optional) Define new configuration parameters.
    To configure the operator with additional configuration parameters:
    1. In the Configuration pane, choose + (Add Parameter).
    2. In the Add Property dialog box, provide a name for the new configuration parameter.
    3. Select the property type.
      Select Text, JSON, or Boolean, depending on whether the parameter value is a string, a JSON value, or a Boolean value.
    4. In the Value text field, provide a value.
    5. Choose OK.
  7. (Optional) Add new ports.
    For JavaScript operators, Python operators, multiplexer operators, and other extensible operators, you can define additional input and output ports.
    1. In the graph editor, right-click the operator (for example, a multiplexer or JavaScript operator) and choose Add Port.
    2. Provide a port name and a type.
    3. Select whether the port is an input port or an output port.
    4. Choose OK.
  8. Connect operators.
    1. Select an output port of an operator and drag the cursor to an input port of another operator.
      Based on the type of the selected output port, the tool highlights all input ports to which you can connect it.
  9. (Optional) Group operators into a subgraph.
    You can group related operators, for example by use case, into a subgraph. This means that you can partition a single graph into multiple subgraphs.
    1. Hold down the SHIFT key and select the operators that you want to group together.
    2. In the context menu, choose Group.
    3. If you want to ungroup the operators, right-click the group and choose Ungroup.
  10. (Optional) Configure groups.
    Like operators, each group (subgraph) is associated with certain configuration parameters. You can provide your own values for these parameters, or you can define additional configuration parameters for the group.
    1. In the editor, select a group (subgraph).
    2. In the editor toolbar, choose the icon that opens the Configuration pane.
      The tool displays the predefined configuration parameters available for the group. You can override the default values for the predefined configuration parameters.

      Configuration Parameter
        Description

      description
        Provide a description for the group.

      restartPolicy
        The restart policy describes how the cluster scheduler behaves when a subgraph execution crashes. In the dropdown list, select a value for this property. If you do not specify a value, the tool uses the default value, never.

        If you set the restart policy to never, the cluster scheduler does not restart the crashed subgraph, and the crash leaves the graph in the final state 'dead'. If you set the restart policy to restart, the cluster scheduler restarts the subgraph execution. The restart changes the state of the graph from 'dead' to 'pending' and then to 'running'.

      tags
        Tags describe the runtime requirements of groups and are the annotations of Dockerfiles that the tool provides.

        Choose + (Add Tag) to define tags for the group. In the dropdown list, select the required tag and version. You can associate the group with more than one tag.

      multiplicity
        Specify the multiplicity as an integer value. For example, a multiplicity of 3 means that the tool executes 3 instances of this group at runtime.

    3. If you want to define new configuration parameters for the group, in the Configuration pane, choose + (Add Parameter).
    4. In the Add Property dialog box, provide a name for the new property.
    5. Select the property type.
      Select Text, JSON, or Boolean, depending on whether the property value is a string, a JSON value, or a Boolean value.
    6. In the Value text field, provide a value.
    7. Choose OK.
  11. (Optional) Configure the graph.
    Provide a description and select a display icon that the tool must use for the graph.
    1. In the editor toolbar, choose the icon that opens the Configuration pane.
    2. Provide values for the configuration parameters.

      Configuration Parameter
        Description

      description
        Description for the graph.

      iconsrc
        Path to the graph display icon that the tool must use.

      icon
        In the icon dropdown list, you can select the required icon for the graph.

        The tool uses this icon for display only if you have not provided any value for iconsrc.

  12. (Optional) Export the JSON definition.
    1. To export the JSON definition of the graph after you create it, in the navigation pane, select the Graphs tab.
    2. Right-click the required graph and choose the Export menu option.
  13. (Optional) Refer to the documentation.
    SAP Data Hub provides documentation for its built-in operators and example graphs. The documentation provides more information on how to use an operator or a graph. To view this documentation:

    Documentation
      Steps

    Graphs
      Open the graph in the graph editor and, in the editor toolbar, choose (Show Documentation).

    Operator
      Right-click an operator in the graph editor and choose the Open Documentation menu option.

  14. Save the graph.
    1. After creating a graph, in the editor toolbar, choose (Save) to save your graph.
    2. Choose the Save menu option.
    3. Provide a name along with the fully qualified path to the graph.
      For example, com.sap.others.graphname.
    4. Provide a description for the graph.
    5. Choose OK.
      The graphs and operators are stored in a folder structure within the SAP Data Hub Pipeline Modeler repository. For example, com.sap.others.graphname. If you want to save another instance of the graph, in the editor toolbar, choose (Save As) and provide a name along with the fully qualified path to the graph.
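
The relationship between a fully qualified graph name and a folder structure can be sketched as follows. This is only an illustration under stated assumptions: the function name graph_folder and the root folder "graphs" are hypothetical, and the sketch assumes that each dot-separated segment of the name maps to one folder level. The actual repository layout may differ.

```python
from pathlib import PurePosixPath


def graph_folder(qualified_name: str, root: str = "graphs") -> PurePosixPath:
    """Map a dot-separated graph name to a nested folder path.

    The root folder name is a placeholder, not the real repository root.
    """
    parts = qualified_name.split(".")
    if not all(parts):
        # Reject names such as "com..sap" or an empty string.
        raise ValueError(f"empty segment in graph name: {qualified_name!r}")
    return PurePosixPath(root, *parts)


path = graph_folder("com.sap.others.graphname")
print(path)  # graphs/com/sap/others/graphname
```

Under this assumption, two graphs saved as com.sap.others.graphname and com.sap.others.othergraph would share the parent folder com/sap/others.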