Modeling Guide for SAP Data Hub

Execution Model

This section provides technical details of the Modeler's execution behavior.

Operators in a graph execute concurrently. They communicate with each other by sending data to their outports and receiving data from their inports. Back-pressure in a connection between two operators is handled by blocking the operator that tries to send data until the receiving operator reads it. This prevents data from accumulating in a region of the pipeline that produces data faster than other parts can consume it.
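
The blocking behavior described above can be illustrated with a small sketch. It is not the Modeler's implementation: the `Connection` class, its `send` and `receive` methods, and the `run_demo` helper are hypothetical names that stand in for a connection between an outport and an inport, and Python threads stand in for concurrently executing operators.

```python
import queue
import threading


class Connection:
    """Hypothetical stand-in for a connection between two operators."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)
        self.log = []  # order in which reads and completed sends happen

    def send(self, item):
        self._q.put(item)   # hand the item over to the connection
        self._q.join()      # block until the receiver has read it
        self.log.append(("sent", item))

    def receive(self):
        item = self._q.get()
        self.log.append(("read", item))
        self._q.task_done()  # unblocks the sender waiting in join()
        return item


def run_demo(count):
    conn = Connection()

    def producer():
        for i in range(count):
            conn.send(i)

    def consumer():
        for _ in range(count):
            conn.receive()

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return conn.log


if __name__ == "__main__":
    for event in run_demo(3):
        print(event)
```

In this sketch every send completes only after the matching read, so the producer can never run ahead of the consumer, however fast it generates data.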

Deadlock on Graphs with Cycles

Consider the graph below, which has a cycle containing operators B and C:

A--->B--->C
     ^    |
     +----+

Suppose A generates just one message and sends it to B. Each time B or C receives a message through its inport, it processes the message and then sends a new message through its outport. This means that a single message circulates around the cycle indefinitely, and the graph does not deadlock.

Now suppose that A feeds two messages into the graph instead of one. In this case the graph deadlocks, because at some point B is blocked trying to send to C, while C does not read from its inport because it is itself blocked trying to send to B. Neither operator can make progress.

In general, a graph of this type deadlocks when there are at least as many messages within the cycle as there are nodes in it. The example graph also reaches this condition if A generates just one message but B emits two messages for each message that it receives on its inport.
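
The deadlock rule can be checked with a small, deterministic simulation. This is only a sketch under simplifying assumptions: the function name `deadlocks` and its parameters are hypothetical, each operator in the cycle either waits to read or holds exactly one message that it is trying to send, a send succeeds only when the receiving operator is waiting to read, and a message from the upstream operator (A in the example) is accepted whenever the first operator in the cycle is free.

```python
def deadlocks(cycle_size, injected, max_steps=1000):
    """Return True if the simulated cycle gets stuck.

    cycle_size: number of operators in the cycle (2 for B and C).
    injected:   number of messages the upstream operator feeds in.
    """
    holding = [False] * cycle_size  # True: operator is blocked on a send
    pending = injected              # messages still waiting upstream

    for _ in range(max_steps):
        # The upstream operator delivers a message if the entry node is free.
        if pending > 0 and not holding[0]:
            holding[0] = True
            pending -= 1
            continue

        # Otherwise, let one operator in the cycle complete its send.
        for i in range(cycle_size):
            nxt = (i + 1) % cycle_size
            if holding[i] and not holding[nxt]:
                holding[i], holding[nxt] = False, True
                break
        else:
            # No send can complete: stuck if any operator still holds data.
            return any(holding)

    # Messages were still circulating when the step budget ran out.
    return False


if __name__ == "__main__":
    print(deadlocks(2, 1))  # one message in the B-C cycle
    print(deadlocks(2, 2))  # two messages in the B-C cycle
```

Under these assumptions, one message in the two-node cycle keeps circulating, while two messages leave both operators blocked on their sends. The same threshold appears for larger cycles: a three-node cycle runs with two messages and gets stuck with three.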

References

When an operator sends data to another operator, the data is not copied; only a reference to the data is sent. This behavior reduces the communication cost. It also matters when you program a script operator: if you change a mutable object received as input, the change may be visible in other parts of the pipeline. To be safe, we recommend making a copy of a mutable object before changing it. The message data type is a common example of a mutable object that should be copied before it is changed.
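
The following sketch shows the effect of sharing a reference and how a copy avoids it. It uses a plain Python dictionary as a hypothetical stand-in for a message; the `body` and `attributes` keys and the function names `tag_in_place` and `tag_copy` are assumptions made for illustration, not the Modeler's API.

```python
import copy


def tag_in_place(message):
    # Changes the object that every holder of the reference sees.
    message["attributes"]["tag"] = "seen"
    return message


def tag_copy(message):
    # Works on a private deep copy, leaving the received object untouched.
    message = copy.deepcopy(message)
    message["attributes"]["tag"] = "seen"
    return message


if __name__ == "__main__":
    received = {"body": "payload", "attributes": {}}
    elsewhere = received  # another operator holding the same reference

    tag_copy(received)
    print("tag" in elsewhere["attributes"])  # copy: no change leaks out

    tag_in_place(received)
    print("tag" in elsewhere["attributes"])  # in place: change is visible
```

A deep copy is used here because a shallow copy would still share the nested `attributes` dictionary with the original object.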