Sizing of Replications

Replication Management Service (RMS) allows you to replicate a number of tables (or other data structures) from a source system to a target system.

Sizing of Replication Flows in Data Intelligence Cloud

Like all other workloads, replications are executed in the form of pipelines, which are called worker graphs in this case. The worker graphs run in the background when replication flows are started via the replication flow user interface. Worker graphs are internal and cannot be modified like regular pipelines being created by a user in the Modeler application. A replication flow allows one tenant to replicate an arbitrary number (N) of data sets (for example, tables or CDS views) from one source system to one target system. Each table constitutes a task of the replication flow. All data are replicated in relatively small chunks (typically between 10 MB and 30 MB during the initial load), such that the memory usage of the worker graphs remains constant.

Each worker graph has five connections to the source and target system; that means five parallel threads can read and write data. The total number of connections (threads) used by a replication flow is a configurable parameter. This can be configured in the Monitoring application, and the default value is 10. In this case, two worker graphs are started for each replication flow. The general formula where the result has to be rounded up to the next integer is as follows:
Number of worker graphs per replication flow = Number of connections per replication flow / 5

To scale out replications in order to achieve better throughput, there are two options:

Option Result
Increase the number of replication flows. For example, don't replicate all tables as tasks within one replication flow, but distribute the tables (tasks) over several replication flows that can run in parallel. Increasing the number of replication flows will create for each replication flow background processes in the source and target systems. Scale them carefully so as not to overload the source or target system with background processes.

Increase the number of connections within a replication flow to get more threads and, equivalently, more worker graphs.

With the start of multiple connections to the source and target systems, background processes get blocked for the handling of the data packages. Scale the number of connections carefully to not block too many resources in the system.

Scale-out will work only up to a certain point, because at some point a threshold in throughput will be reached due to constraints of the source and target systems, or due to other limitations. Therefore, up to this threshold (which depends on the details of the connected systems), the trade-off between higher throughput and higher resource consumption (higher TCO) versus lower throughput and lower resource consumption (lower TCO) is a matter of configuration.

The resource consumption of a worker graph is relatively predictable. It usually consumes around 2 GB of RAM and between 1 and 2 vCPUs. If we assume roughly 1.5 vCPUs on average, the RMS formula for RAM and CPU becomes (with A the number of replication flows running in parallel and B the configured number of connections per replication flow) as shown below:
RAM_RMS = A * [B/5] * 2 GB
CPU_RMS = A * [B/5] * 1.5

If there is a certain requirement for total throughput (in terms of [MB/s] or [GB/h]), and if this required throughput is still in the realm of linear scalability for the given constellation, it can be achieved by configuring A and B such that the throughput that is achieved per worker graph depends on the specific scenario. However, from various measurements it is known that around 5 MB/s (uncompressed data volume) is often a good estimate.

A * [B/5] >= (required throughput) / (throughput per worker graph)