Amazon S3
Amazon S3 is an object store service, further documented on its owner's page. Other services supporting the S3 API, such as Rook, MinIO, and Swift, have also been tested; any other service supporting the S3 API is not guaranteed to be compatible.
This page documents only the operators' relationship with the S3 API.
This document may refer to an object as a "file", and to an object's prefix as a "directory", if it fits the context of the operator.
Connection
- Custom endpoint
Allows using a custom endpoint to access the S3 service. If not set, the default AWS endpoint is used.
- ID: endpoint
- Type: string
- Default: ""
- Protocol [Mandatory]
Sets the protocol to be used. The configured value overwrites the protocol prefix of the Custom endpoint, if one is given.
- ID: protocol
- Type: string
- Default: "HTTP"
- Possible values:
- "HTTP"
- "HTTPS"
- Region [Mandatory]
The AWS region the configured bucket (found in Root Path) belongs to.
- ID: region
- Type: string
- Default: "eu-central-1"
- Access Key [Mandatory]
The Access Key ID used to authenticate to the service. It pairs with the Secret Key in order to authenticate.
- ID: accessKey
- Type: string
- Default: "AWSAccessKeyID"
- Secret Key [Mandatory]
The Secret Access Key used to authenticate to the service. It pairs with the Access Key in order to authenticate.
- ID: secretKey
- Type: string
- Default: "AWSSecretAccessKey"
- Root Path
The bucket and an optional root path for browsing. The value starts with a slash followed by the bucket name, optionally followed by another slash and the root path (e.g. /MyBucket/My Folder). Dataset names for this connection do not contain segments of the rootPath; instead, their first segment is a subdirectory of the root path.
- ID: rootPath
- Type: string
- Default: "/MyBucket/MyFolder"
- Bucket
Optional bucket name to be accessed. It works as a fallback for the Connection's Root Path configuration: if no bucket is given in the Root Path, the value from Bucket is used.
- ID: awsBucket
- Type: string
- Default: "com.sap.datahub.test"
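The split of the Root Path into bucket and prefix, with the Bucket setting as fallback, could be sketched like this (the helper is illustrative, not a documented API):

```python
def resolve_bucket_and_prefix(root_path: str, aws_bucket: str = "") -> tuple:
    """Split a Root Path such as "/MyBucket/MyFolder" into (bucket, prefix).

    If the Root Path names no bucket, fall back to the Bucket
    setting (illustrative helper, not part of the product).
    """
    parts = [p for p in root_path.split("/") if p]
    if parts:
        return parts[0], "/".join(parts[1:])
    return aws_bucket, ""

# resolve_bucket_and_prefix("/MyBucket/My Folder")
#   -> ("MyBucket", "My Folder")
# resolve_bucket_and_prefix("", "com.sap.datahub.test")
#   -> ("com.sap.datahub.test", "")
```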
- Proxy
An optional proxy to be used in the connection to the service.
- ID: awsProxy
- Type: string
- Default: ""
- Use SSL
Whether to use SSL/TLS when connecting to the service.
- ID: useSSL
- Type: boolean
- Default: true
Permissions
Permissions in AWS are required to operate on S3 objects. Each operator may require a specific set of permissions to operate successfully.
Read File Permissions
- s3:GetObject for the given object. See also, AWS S3 GET Object.
- s3:GetObjectVersion for the given object. See also, AWS S3 Object Permissions.
- s3:ListBucket for the bucket where the prefix is to be listed. Note that the permission may be narrowed to a directory inside the bucket, and the prefix is subject to this restriction. See also, AWS S3 GET Bucket.
- s3:DeleteObject for the given object. See also, DELETE Object.
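A minimal IAM policy sketch granting the core read permissions listed above (s3:GetObject, s3:GetObjectVersion, s3:ListBucket) might look as follows. The bucket name "MyBucket" is a placeholder; adjust the resources to your own bucket and prefix.

```python
import json

# Illustrative IAM policy granting read access to a hypothetical
# bucket "MyBucket". Object-level actions apply to keys under the
# bucket; the list action applies to the bucket itself.
read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:GetObjectVersion"],
            "Resource": "arn:aws:s3:::MyBucket/*",
        },
        {
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::MyBucket",
        },
    ],
}

print(json.dumps(read_policy, indent=2))
```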
Write File Permissions
- s3:PutObject for the bucket to receive the object. See also, AWS S3 Object Permissions.
- s3:GetObject for the given object. See also, AWS S3 GET Object. This is due to the restrictions documented in Write File Restrictions below.
Remove File Permissions
- s3:DeleteObject for the given object. See also, AWS S3 Object Permissions.
Move File Permissions
As moving consists of copying and removing in S3, you will need the permissions documented in Remove File Permissions and Copy File Permissions.
Copy File Permissions
- s3:GetObject for the source object.
- s3:PutObject for the bucket to receive the copied object. See also, AWS S3 Multipart Upload API and Permissions.
If copying by prefix (i.e. a "directory"), the operation is bound to the same permissions documented in Read File Permissions.
Restrictions
- Directories:
In order for a path to be interpreted as a directory, it should end with /. For example: /tmp/ is a directory, while /tmp is a file named tmp.
- Working directory:
Since there is no concept of a "working directory", any relative directory given to/by this service will have the root directory (/) as working directory.
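The trailing-slash rule above could be captured in a one-line helper (illustrative, not part of the product):

```python
def is_directory(path: str) -> bool:
    """A path is treated as a directory only when it ends with "/"."""
    return path.endswith("/")

# is_directory("/tmp/") -> True   (directory)
# is_directory("/tmp")  -> False  (file named "tmp")
```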
Write File Restrictions
The S3 API does not support an "Append" mode; when "Append" is used, the whole file is retrieved from the service, the new data is appended, and the result is written back to S3, which compromises the operation's efficiency.
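The emulated append described above can be sketched as a read-modify-write cycle. A plain dict stands in for the object store here; a real implementation would issue GET Object and PUT Object requests instead.

```python
# Emulating "Append" mode: S3 has no append, so the whole object is
# downloaded, the new data concatenated, and the result re-uploaded.
def append_object(store: dict, key: str, data: bytes) -> None:
    existing = store.get(key, b"")   # full download of current content
    store[key] = existing + data     # full re-upload with data appended

store = {"log.txt": b"line1\n"}
append_object(store, "log.txt", b"line2\n")
# store["log.txt"] == b"line1\nline2\n"
```

Note that the cost of each append grows with the object's current size, which is why the document flags this mode as inefficient.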
Move File Restrictions
As the S3 API does not support a move operation, a move consists of a copy followed by removal of the source file. Thus, in cases of failure, the file may have been copied but not removed.
Further restrictions are documented in Copy File Restrictions.
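The copy-then-delete sequence, and its failure mode, can be sketched as follows. A dict stands in for the object store; a real implementation would issue CopyObject and DeleteObject requests.

```python
# Move as copy-then-delete: the S3 API has no native move operation.
def move_object(store: dict, src: str, dst: str) -> None:
    store[dst] = store[src]   # copy step -- may succeed on its own
    del store[src]            # remove step -- if this fails, the object
                              # now exists at both source and destination

store = {"a/old.txt": b"data"}
move_object(store, "a/old.txt", "b/new.txt")
# store == {"b/new.txt": b"data"}
```

Because the two steps are not atomic, a crash between them leaves the source object in place alongside the freshly written copy.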
Copy File Restrictions
- If the destination is a file, the source must also be a file.
- If the destination is a directory, it must be empty.

Given the following directory structure:

```
.
+-- a
|   +-- file1.txt
|   +-- file2.txt
+-- b
    +-- f1.txt
    +-- f2.txt
```

- Copying source a/file1.txt to destination newfile.txt would succeed, since the destination does not exist.
- Copying source a/file1.txt to destination b/f1.txt would succeed and overwrite b/f1.txt, since the destination is an existing file.
- Copying source a/file1.txt to destination b/ would fail, since b/ already exists and is not empty.
- Copying source a/ to destination b/ would fail, since b/ already exists and is not empty.
- Copying source a/ to destination b/dir/ would succeed, since b/dir/ does not exist.
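The destination rules above can be condensed into a small validation function. A dict of object keys stands in for a bucket listing; the function and its name are illustrative, not part of the product.

```python
def check_copy(store: dict, src: str, dst: str) -> bool:
    """Return True when a copy is allowed under the rules above:
    - destination file: source must also be a file (overwrite allowed)
    - destination directory: must be empty (no object under the prefix)
    """
    src_is_dir = src.endswith("/")
    dst_is_dir = dst.endswith("/")
    if dst_is_dir:
        # Destination directory must be empty, i.e. no key lies under it.
        return not any(key.startswith(dst) for key in store)
    # Destination is a file: source must also be a file.
    return not src_is_dir

store = {"a/file1.txt": b"", "a/file2.txt": b"",
         "b/f1.txt": b"", "b/f2.txt": b""}
# check_copy(store, "a/file1.txt", "newfile.txt") -> True
# check_copy(store, "a/file1.txt", "b/f1.txt")    -> True (overwrite)
# check_copy(store, "a/", "b/")                   -> False (b/ not empty)
# check_copy(store, "a/", "b/dir/")               -> True
```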