Modeling Guide for SAP Data Hub

Microsoft Azure Blob Storage (WASB)

Windows Azure Storage Blob (WASB) is one of Microsoft Azure's cloud storage services. Additional information, including documentation, is available on the official WASB homepage.

Many SAP Data Hub storage operators support WASB. This document describes the characteristics of the service that are common across operators.

This document may refer to an object as a "file" and to an object's prefix as a "directory" where that fits the context of the operator.

Connection

To use an operator that connects to WASB, you can either use a Connection ID from the Connection Management or set up a Manual connection with the following values:
  • Account Name [Mandatory]
    The account name from WASB.
    • ID: accountName
    • Type: string
    • Default: "myaccount"
  • Root Path
    The optional root path name for browsing. Starts with a slash and the container name (e.g. /MyContainer/MyFolder).
    • ID: rootPath
    • Type: string
    • Default: ""
  • Endpoint Suffix
    Optional endpoint suffix.
    • ID: endpointSuffix
    • Type: string
    • Default: "core.windows.net"
  • Protocol [Mandatory]
    The protocol scheme to be used (WASBS/HTTPS or WASB/HTTP).
    • ID: protocol
    • Type: string (May be "wasb" or "wasbs")
    • Default: "wasbs"
  • Account Key [Mandatory]
    The account key from WASB.
    • ID: accountKey
    • Type: string (Password format)
    • Default: ""
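The Manual connection values above can be pictured as a small configuration object. The following is a minimal sketch only: the class WasbConnection and its methods are hypothetical names chosen for illustration, not part of SAP Data Hub. The host layout <account>.blob.<suffix> is the usual Azure Blob endpoint form; whether the operators build the endpoint exactly this way is an assumption.

```python
from dataclasses import dataclass


@dataclass
class WasbConnection:
    """Hypothetical holder for the Manual connection values (illustration only)."""

    account_name: str = "myaccount"            # ID: accountName (mandatory)
    account_key: str = ""                      # ID: accountKey (mandatory)
    root_path: str = ""                        # ID: rootPath (optional)
    endpoint_suffix: str = "core.windows.net"  # ID: endpointSuffix (optional)
    protocol: str = "wasbs"                    # ID: protocol (mandatory)

    def validate(self) -> None:
        """Check the constraints stated in the parameter list."""
        if not self.account_name:
            raise ValueError("accountName is mandatory")
        if not self.account_key:
            raise ValueError("accountKey is mandatory")
        if self.protocol not in ("wasb", "wasbs"):
            raise ValueError('protocol must be "wasb" or "wasbs"')
        if self.root_path and not self.root_path.startswith("/"):
            raise ValueError("rootPath must start with a slash")

    def blob_endpoint(self) -> str:
        """Assumed mapping: wasbs uses HTTPS, wasb uses HTTP."""
        scheme = "https" if self.protocol == "wasbs" else "http"
        return f"{scheme}://{self.account_name}.blob.{self.endpoint_suffix}"
```

For example, WasbConnection(account_key="secret").blob_endpoint() yields an HTTPS URL on the default endpoint suffix, because the protocol defaults to "wasbs".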
Further connection configurations, which are not part of the Connection Management, may also be set:
  • Container
    Optional container name to be accessed. It works as a "fallback" for the Connection's Root Path configuration, i.e. if no container is given in the Root Path, the value from Container is used.
    • ID: containerName
    • Type: string
    • Default: "mycontainer"
  • Blob Type
    Only used in Write File operator. It sets the blob type of the destination blob ("file").
    • ID: wasbBlobType
    • Type: string
    • Default: "BlockBlob"
    • Values:
      • "BlockBlob"
      • "PageBlob"
      • "AppendBlob"
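The fallback relationship between Root Path and Container can be sketched as follows. The function resolve_container is a hypothetical helper written for illustration; it assumes that the first segment of the Root Path names the container, as described for the Root Path parameter.

```python
def resolve_container(root_path: str, container_name: str = "mycontainer") -> str:
    """Return the container to access (illustrative sketch, not operator code).

    The first non-empty segment of the Root Path is taken as the container.
    If the Root Path names no container, the Container value is used instead.
    """
    segments = [segment for segment in root_path.split("/") if segment]
    return segments[0] if segments else container_name
```

With a Root Path of /MyContainer/MyFolder the sketch returns MyContainer; with an empty Root Path it falls back to the Container value.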

Permissions

Operating on blobs in Azure Blob Storage requires permissions. WASB currently restricts access to blobs through the container's access policy:
  • Full public read access

  • Public read access for blobs only

  • No public read access

Find more information here and here (information published on non-SAP sites).

Operators need full access to the data. If the given credentials do not belong to the owner of the container, the container should therefore have "Full public read access"; otherwise, any permission setting should be sufficient.

Restrictions

Any WASB-specific restriction of the operators is documented here. Some apply broadly to every operator:
  • Directories:

    A path is interpreted as a directory only if it ends with /. For example, /tmp/ is a directory, while /tmp is a file named tmp.

  • Working directory:

    Since the service has no concept of a "working directory", any relative path given to or returned by this service is resolved against the root directory (/).
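The two general rules above can be expressed as a short sketch. Both function names are hypothetical and only illustrate the stated conventions: a trailing slash marks a directory, and relative paths are resolved against the root directory.

```python
def is_directory(path: str) -> bool:
    """A path counts as a directory only if it ends with a slash."""
    return path.endswith("/")


def to_absolute(path: str) -> str:
    """Resolve a relative path against the root directory (/).

    Absolute paths are returned unchanged, so a trailing slash is preserved.
    """
    return path if path.startswith("/") else "/" + path
```

Under these rules, /tmp/ is treated as a directory and /tmp as a file, and a relative path such as data/in.csv refers to /data/in.csv.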

Move File Restrictions

As the WASB API does not support a move operation, a move consists of a copy followed by the removal of the source file. Thus, if a failure occurs, the file may have been copied but not removed.

Further restrictions are documented in Copy File Restrictions.
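The copy-then-remove behavior of a move, and the partial state a failure can leave behind, can be illustrated with an in-memory stand-in for the storage service. InMemoryStore and move are hypothetical names; the sketch does not call any real WASB API.

```python
class InMemoryStore:
    """Hypothetical stand-in for blob storage, used only for illustration."""

    def __init__(self, blobs, fail_on_delete=False):
        self.blobs = dict(blobs)
        self.fail_on_delete = fail_on_delete

    def copy(self, source, destination):
        self.blobs[destination] = self.blobs[source]

    def delete(self, path):
        if self.fail_on_delete:
            raise IOError("simulated delete failure")
        del self.blobs[path]


def move(store, source, destination):
    """Emulate a move: copy first, then remove the source.

    The two steps are not atomic. If the delete step fails,
    the copied blob remains and the source is still present.
    """
    store.copy(source, destination)
    store.delete(source)
```

If delete raises after copy has succeeded, both the source and the destination exist afterwards, which is the failure case described above.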

Copy File Restrictions

Given that the operation has a "source" and a "destination" path:
  • If the destination is a file, source must also be a file.

  • If the destination is an existing directory, it must be empty.

For instance, given the following file structure:
.
|
+-- a
|   +-- file1.txt
|   +-- file2.txt
+-- b
    +-- f1.txt
    +-- f2.txt
  • Copying source: a/file1.txt to destination: newfile.txt would succeed, since the destination does not exist.

  • Copying source: a/file1.txt to destination: b/f1.txt would succeed and overwrite b/f1.txt, since the destination is an existing file.

  • Copying source: a/file1.txt to destination: b/ would fail, since b/ already exists and is not empty.

  • Copying source: a/ to destination: b/ would fail, since b/ already exists and is not empty.

  • Copying source: a/ to destination: b/dir/ would succeed, since b/dir/ does not exist.
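The rules and examples above can be checked with a small sketch over a set of blob names. The function copy_allowed is hypothetical and models only the two stated restrictions; it assumes that a directory is "empty" when no blob name starts with its prefix, and it does not verify that the source exists.

```python
def copy_allowed(blobs: set, source: str, destination: str) -> bool:
    """Return True if the copy restrictions permit the operation (sketch)."""
    source_is_dir = source.endswith("/")
    destination_is_dir = destination.endswith("/")

    if not destination_is_dir:
        # Destination is a file: the source must also be a file.
        # An existing destination file is overwritten.
        return not source_is_dir

    # Destination is a directory: it must be empty or not exist,
    # i.e. no blob name may start with the destination prefix.
    return not any(name.startswith(destination) for name in blobs)


# The file structure from the example above.
EXAMPLE_BLOBS = {"a/file1.txt", "a/file2.txt", "b/f1.txt", "b/f2.txt"}
```

Calling copy_allowed(EXAMPLE_BLOBS, "a/", "b/dir/") is permitted under this model, while copy_allowed(EXAMPLE_BLOBS, "a/", "b/") is not, matching the listed cases.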