Modeling Guide

Introduction to SAP Data Hub Modeling

SAP Data Hub helps close the gap between data silos and integrate them seamlessly into business processes. It also helps combine data from these silos with enterprise data and process it for better business insights.

SAP Data Hub achieves this objective, integrating data silos and combining and processing them for business insights, through tasks and task workflows. Task workflows arrange data flows in a logical order and trigger their execution based on conditions. The SAP Data Hub Modeling tool, a component of SAP Data Hub, lets you graphically model the design-time objects required to execute a task workflow.
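
The idea of ordering tasks and triggering each one based on a condition can be illustrated with a minimal Python sketch. This is not the SAP Data Hub API: the `Task`, `Step`, and `run_workflow` names, the state strings (`COMPLETED`, `FAILED`, `SKIPPED`), and the sample task names are all assumptions introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Task:
    """Hypothetical stand-in for a task; not an SAP Data Hub object."""
    name: str
    action: Callable[[], bool]  # returns True on success, False on failure


@dataclass
class Step:
    """One workflow step: a task plus the condition that triggers it."""
    task: Task
    depends_on: Optional[str] = None  # name of an earlier task, if any
    run_if: str = "COMPLETED"         # state that earlier task must have reached


def run_workflow(steps: List[Step]) -> Dict[str, str]:
    """Run steps in the given order and return the final state of each task."""
    states: Dict[str, str] = {}
    for step in steps:
        # Skip the task when its triggering condition is not met.
        if step.depends_on is not None and states.get(step.depends_on) != step.run_if:
            states[step.task.name] = "SKIPPED"
            continue
        states[step.task.name] = "COMPLETED" if step.task.action() else "FAILED"
    return states


# Assumed example: the flowgraph task fails, so only the notification follows it.
steps = [
    Step(Task("copy_files", lambda: True)),
    Step(Task("run_flowgraph", lambda: False), depends_on="copy_files"),
    Step(Task("notify_on_error", lambda: True), depends_on="run_flowgraph", run_if="FAILED"),
    Step(Task("publish", lambda: True), depends_on="run_flowgraph"),
]
result = run_workflow(steps)
print(result)
```

In this sketch the order of the list is the execution order, and each step names the state of an earlier task that must hold before it runs. A real workflow engine would typically add parallel branches, retries, and persistence, none of which are modeled here.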

You can also use the SAP Data Hub Modeling tool to transfer data from an SAP BW system or an SAP HANA system to an SAP Vora system.

The image here illustrates the end-to-end process for creating and executing a task or task workflow, or for transferring data.

Create a system, which is a stand-alone data source in the distributed data landscape. A system can belong to only one zone.

Create a connection, which is a technical access point to a system via Data Hub Agent. A system may have multiple connections, but a connection is always related to a system within a zone.

Projects are containers for design-time objects in SAP Data Hub. You create SAP Data Hub objects such as destinations, data sets, tasks, and task workflows within a project.

A data set is a generic but logical data reference consisting of a name, type, URL, and a logical destination name. It is an abstraction of the actual data or structures that reside in distributed stores.

A destination is a reference to a physical connection, which can be resolved on a connected logical system. You create and activate a destination to receive and persist data from the system.

Creating and executing a file operation task helps perform file operations such as copy and delete on data sets.

Creating and executing a flowgraph task helps execute a flowgraph in an SAP Vora system.

Creating and executing a data pipeline task helps execute a data pipeline in an SAP Vora system.

Creating and executing an SAP BW process chain task helps execute an SAP BW process chain in an SAP BW system.

Creating and executing an SAP HANA flowgraph task helps execute an SAP HANA flowgraph in an SAP HANA system.

Creating and executing an SAP Data Services task helps execute an SAP Data Services job. Executing an SAP Data Services job helps users integrate and transform data and improve data quality.

Creating and executing a notification task helps send e-mails with a message to recipients. You can customize the message to help recipients identify the status of a task in a task workflow that completed with a certain state.

A task workflow orchestrates multiple tasks and executes them in a given order.

After creating a task workflow, execute it from within the SAP Data Hub Modeling tool. You can also schedule its execution in the SAP Data Hub cockpit.

Use the monitoring dashboard in the SAP Data Hub cockpit to monitor the execution status of tasks and task workflows.

The different entities involved in the end-to-end SAP Data Hub modeling scenarios are described here:

| Component | Description |
|---|---|
| System | Systems are stand-alone data sources in the distributed data landscape. A system can belong to only one zone. |
| Connection | A connection is a technical access point to a system. A system may have multiple connections, but a connection is always related to a system within a zone. |
| Project | Projects are containers for design-time objects in SAP Data Hub. You create SAP Data Hub objects within a project. |
| Task Assist | Task assist simplifies the creation and activation of lengthy and complicated tasks. |
| Destination | A destination is a reference to a connection that points to an actual system in the landscape. You create and activate a destination to receive and persist data from the system. |
| Data set | A data set is a generic but logical data reference consisting of a name, type, URL, and a logical destination name. It is an abstraction of the actual data or structures that reside in distributed stores. |
| Task | Tasks are automatic operations that you can execute, control, and monitor based on certain user-defined conditions. The tool supports different task types. Depending on the task type, a task can contain one or more data sets (as both source and target), or can be a complex data pipeline. |
| Task Workflow | A task workflow orchestrates multiple tasks and executes them in a given order. |
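
The relationships among the entities described above (a data set names a destination, a destination references a connection, and a connection belongs to one system in one zone) can be sketched as a small Python data model. This is an illustrative assumption, not how SAP Data Hub stores these objects: the class names, field names, the `resolve_system` helper, and all sample values such as `HANA_PROD` and `SALES_DEST` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class System:
    name: str
    zone: str  # a system belongs to only one zone


@dataclass(frozen=True)
class Connection:
    name: str
    system: System  # a connection is always related to one system


@dataclass(frozen=True)
class Destination:
    name: str
    connection: Connection  # reference to a connection that points to a system


@dataclass(frozen=True)
class DataSet:
    name: str
    type: str
    url: str
    destination_name: str  # logical name, resolved only at lookup time


def resolve_system(data_set: DataSet, destinations: Dict[str, Destination]) -> System:
    """Follow data set -> destination -> connection -> system."""
    return destinations[data_set.destination_name].connection.system


# Hypothetical sample landscape: one system with two connections.
hana = System(name="HANA_PROD", zone="on_premise")
primary = Connection(name="hana_primary", system=hana)
backup = Connection(name="hana_backup", system=hana)
destinations = {"SALES_DEST": Destination(name="SALES_DEST", connection=primary)}
orders = DataSet(name="orders", type="TABLE", url="/sales/orders",
                 destination_name="SALES_DEST")

resolved = resolve_system(orders, destinations)
print(resolved.name, resolved.zone)
```

Because the data set holds only a logical destination name, pointing `SALES_DEST` at a different connection changes where the data set resolves without changing the data set itself, which is the kind of abstraction the description of data sets suggests.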