Create a Connection
Create a connection, which is a technical access point to a system via Data Hub Agent. A system may have multiple connections, but a connection is always related to a system within a zone.
Context
You create a connection in the SAP Data Hub cockpit.
Procedure
- In the Landscape Management page, click Manage Connections.
- Click + to add a new connection to an existing system.
- Enter the connection ID, and select the system that it belongs to.
Note: To execute flowgraph and profiling tasks on SAP Data Hub Pipeline connections, you must define a default connection per system. The default connection ID must include the suffix _DEFAULT in uppercase (for example, Pipeline_DEFAULT). Each system can have only one default SAP Data Hub Pipeline connection.
- Select the connection type. The connection type determines the rest of the connection information that you must provide.
The following connection types are supported for each system type:
System Type: Admin

  Connection: Email (Server): connection and access information to an email server
  Operation: Allows email tasks to be created in the Modeling tool. For example, a user can create an email task and include it in a task workflow, where the email is sent after a set of tasks is completed.

System Type: SAP HANA

  Connection: SAP HANA SDI connection: connection and access information to a REST API service
  Operation: Allows browsing and execution of jobs (flowgraphs) in a remote SAP HANA system.

  Connection: SAP HANA SDI connection: connection and access information to a REST API service
  Operation: Allows creation of HANA-based datasets.

System Type: SAP Vora

  Connection: HDFS connection: connection and access information to an HDFS server
  Operation: Allows:
    - Browsing folders and files in the HDFS server
    - Obtaining file metadata and data profiling
    - Copying and deleting files in the HDFS server
    - Performing flowgraph tasks with HDFS files as the source and/or target
  Note: In addition to RPC, HDFS connections can now use HTTP and HTTPS.

  Connection: Amazon S3 connection: connection and access information for an S3 provider
  Operation: Allows:
    - Browsing buckets, folders, and files on an Amazon S3 endpoint
    - Obtaining file metadata and data profiling
    - Copying and deleting files in Amazon S3 buckets
    - Performing flowgraph tasks with Amazon S3 files as the source and/or target

  Connection: SAP Data Services connection: connection and access information to a SOAP server from an SAP Data Services administration server
  Operation: Allows browsing and execution of SAP Data Services jobs. The jobs run in the SAP Data Services Job Server.

  Connection: SAP Vora Catalog: connection and access information to an SAP Vora Catalog service
  Operation: Allows:
    - Browsing schemas and tables in an SAP Vora instance
    - Obtaining SAP Vora table metadata and data profiling
    - Performing flowgraph tasks with an SAP Vora table as the source and/or target
    - Loading data from other cloud storage services, such as Amazon S3, Google Cloud Storage, Azure Data Lake, and Azure Storage Blobs

  Connection: Azure Data Lake (ADL)
  Operation: Allows:
    - Browsing folders and files in the Azure Data Lake Storage server
    - Obtaining file metadata and data profiling
    - Previewing data
    - Performing flowgraph tasks with Azure Data Lake files as the source and/or target

  Connection: Google Cloud Storage (GCS)
  Operation: Allows:
    - Browsing folders and files in the Google Cloud Storage server
    - Obtaining file metadata and data profiling
    - Previewing data
    - Performing flowgraph tasks with Google Cloud Storage files as the source and/or target

  Connection: Azure Storage Blobs (WASB)
  Operation: Allows:
    - Browsing folders and files in the Azure Storage Blob server
    - Obtaining file metadata and data profiling
    - Previewing data
    - Performing flowgraph tasks with Azure Storage Blob files as the source and/or target

  Connection: SAP Data Hub Pipeline: connection and access information to VFlow
  Operation: Allows browsing and execution of VFlow graphs.

System Type: SAP BW

  Connection: SAP BW Process Chains: connection and access information to SAP BW
  Operation: Allows browsing and execution of process chains.

System Type: SAP LT Replication Server

  Connection: SAP Landscape Transformation REST API
  Operation: Allows all requests to be forwarded to the REST API.
- Enter and select the remainder of the connection information.
Note: Profiling permissions: Ensure that the user has the privileges required to write to the specified output. If no “Cloud Storage” is specified, a default HDFS connection must be deployed and the user must have permission to write to the /tmp directory.
- For HDFS, SAP Vora Catalog, and SAP Data Hub Pipeline connections, you must specify whether the configuration is default or manual. A default configuration uses the default connection for the connection type. For a manual configuration, you must specify the connection URL.
Note: Profiling on SAP Vora: When profiling data on SAP Vora, the user can set a cloud storage connection or location, or create an HDFS connection to the system where the SAP Data Hub adapter is installed. There must be an SAP Data Hub Pipeline connection on the same system as the connection being profiled, and the name of that connection must end with the _DEFAULT suffix. Then enter the user and password for the VSystem.
- To validate the connection, click the Validate button.
- Click Add.
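
The naming and configuration rules from the notes in this procedure can be checked against a planned list of connections before you enter them in the cockpit. The following Python sketch is only an illustration: the Connection record, its field names, and the check_connections function are hypothetical and are not part of any SAP Data Hub API. It assumes that a default SAP Data Hub Pipeline connection is identified purely by an ID ending in _DEFAULT.

```python
from dataclasses import dataclass
from typing import List, Optional

DEFAULT_SUFFIX = "_DEFAULT"
PIPELINE_TYPE = "SAP Data Hub Pipeline"
# Connection types for which the procedure asks for a default or manual configuration.
CONFIGURABLE_TYPES = {"HDFS", "SAP Vora Catalog", PIPELINE_TYPE}


@dataclass
class Connection:
    """Hypothetical record of the values you plan to enter in the cockpit form."""
    connection_id: str
    system: str
    connection_type: str
    configuration: str = "default"  # "default" or "manual"
    url: Optional[str] = None


def check_connections(connections: List[Connection]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    pipeline_systems = set()
    defaults_per_system = {}

    for conn in connections:
        cid = conn.connection_id
        if conn.connection_type == PIPELINE_TYPE:
            pipeline_systems.add(conn.system)
            if cid.endswith(DEFAULT_SUFFIX):
                defaults_per_system.setdefault(conn.system, []).append(cid)
            elif cid.upper().endswith(DEFAULT_SUFFIX):
                # The suffix is present but not written in uppercase.
                problems.append(f"{cid}: suffix must be uppercase {DEFAULT_SUFFIX}")
        if (conn.connection_type in CONFIGURABLE_TYPES
                and conn.configuration == "manual" and not conn.url):
            problems.append(f"{cid}: manual configuration requires a connection URL")

    # Each system with Pipeline connections needs exactly one default connection.
    for system in sorted(pipeline_systems):
        defaults = defaults_per_system.get(system, [])
        if not defaults:
            problems.append(f"{system}: no default Pipeline connection defined")
        elif len(defaults) > 1:
            problems.append(f"{system}: more than one default Pipeline connection: {defaults}")
    return problems
```

For example, calling check_connections with two Pipeline connections on the same system whose IDs both end in _DEFAULT reports a duplicate default for that system, and an HDFS connection set to manual without a URL is reported as incomplete.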