Modeling Guide

Remove File

This operator removes files or directories from a storage service.

The operation is recursive: if the given path is a directory, everything under it is removed as well.
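To make the recursive behavior concrete, here is a minimal sketch for the local file system only. It is not the operator's implementation; the function name `remove_path` and the folder names are hypothetical and chosen for illustration.

```python
import os
import shutil
import tempfile

def remove_path(path: str) -> str:
    """Remove a file, or a directory together with everything under it."""
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)   # directory: remove the whole subtree
    else:
        os.remove(path)       # regular file or symbolic link
    return path

# Usage: build a small tree in a temporary directory, then remove it.
base = tempfile.mkdtemp()
target = os.path.join(base, "MyFolder")
os.makedirs(os.path.join(target, "MySubfolder"))
with open(os.path.join(target, "MySubfolder", "data.txt"), "w") as f:
    f.write("x")

remove_path(target)
removed = not os.path.exists(target)
```

After the call, `MyFolder`, `MySubfolder` and `data.txt` are all gone, while the temporary parent directory is left in place.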

Supported services are:
  • Azure Data Lake Store (ADLS)
  • Local File System (file)
  • Google Cloud Storage (GCS)
  • HDFS
  • Amazon S3
  • Azure Storage Blob (WASB)
  • WebHDFS
Configuration Parameters

Parameter

Type

Description

service

string

The file service to operate on. Additional parameters may depend on the selected service.

Default: "file"

terminateOnError

boolean

Determines whether the graph terminates when the operator fails.

Default: true

timeoutInMs

int

The time limit, in milliseconds, for executing the operation. If `0`, no timeout is applied.

Default: 0

retryPeriodInMs

int

The time interval, in milliseconds, between connection attempts.

Default: 0

numRetryAttempts

int

The number of times to retry a connection.

Default: 0
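How retryPeriodInMs and numRetryAttempts might interact can be sketched as follows. This is an illustration under stated assumptions, not the operator's code: it assumes numRetryAttempts counts retries after the first attempt, and the names `with_retries` and `flaky_connect` are hypothetical.

```python
import time

def with_retries(connect, num_retry_attempts=0, retry_period_ms=0, sleep=time.sleep):
    """Call connect(); on ConnectionError, retry up to num_retry_attempts more times."""
    attempt = 0
    while True:
        try:
            return connect()
        except ConnectionError:
            if attempt >= num_retry_attempts:
                raise                      # retries exhausted: propagate the error
            attempt += 1
            sleep(retry_period_ms / 1000)  # wait before the next attempt

# Usage: a connection that fails twice before succeeding.
calls = []

def flaky_connect():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("temporarily unavailable")
    return "connected"

result = with_retries(flaky_connect, num_retry_attempts=2, retry_period_ms=10)
```

With the defaults (both 0), the first failure is raised immediately and no waiting takes place.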

simultaneousRequests

int

The number of simultaneous requests generated on recursive calls (only available for GCS, S3 and WASB).

Default: 1

stopRequestOnError

boolean

Determines whether simultaneous requests from recursive calls stop at the first error (only available for GCS, S3 and WASB).

Default: false
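The effect of simultaneousRequests and stopRequestOnError can be pictured with a bounded worker pool. The sketch below is one possible reading of the two parameters, not the operator's implementation; `remove_all`, `delete_one` and the object keys are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def remove_all(keys, delete_one, simultaneous_requests=1, stop_request_on_error=False):
    """Delete every key using at most simultaneous_requests parallel requests.

    Returns (deleted, errors). If stop_request_on_error is set, requests that
    have not started yet are skipped after the first failure.
    """
    stop = threading.Event()
    lock = threading.Lock()
    deleted, errors = [], []

    def task(key):
        if stop.is_set():
            return                      # skipped because an earlier request failed
        try:
            delete_one(key)
        except Exception as exc:
            with lock:
                errors.append((key, exc))
            if stop_request_on_error:
                stop.set()
        else:
            with lock:
                deleted.append(key)

    with ThreadPoolExecutor(max_workers=simultaneous_requests) as pool:
        list(pool.map(task, keys))      # wait for all submitted tasks
    return deleted, errors

# Usage: the second delete fails.
def delete_one(key):
    if key == "bad":
        raise IOError("cannot delete " + key)

keys = ["a", "bad", "c"]
deleted_stop, errors_stop = remove_all(keys, delete_one, 1, stop_request_on_error=True)
deleted_all, errors_all = remove_all(keys, delete_one, 1, stop_request_on_error=False)
```

With one request at a time, stopping on error leaves `c` untouched, whereas the default continues past the failure and still removes it.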

connection

object

Holds the connection information for the selected service.

configurationType

string

connection parameter: The type of connection information to use: Manual (user input) or retrieved from the Connection Management Service.

Default: ""

connectionID

string

connection parameter: The ID of the connection information to retrieve from the Connection Management Service.

Default: ""

connectionProperties

object

connection parameter: The connection properties for the selected service, entered manually.

clientId

string

ADL parameter: Mandatory. The client ID from ADLS.

Default: ""

tenantId

string

ADL parameter: Mandatory. The tenant ID from ADLS.

Default: ""

clientKey

string

ADL parameter: Mandatory. The client key from ADLS.

Default: ""

accountName

string

ADL parameter: Mandatory. The account name from ADLS.

Default: ""

rootPath

string

ADL parameter: The optional root path name for browsing. Starts with a slash (e.g. /MyFolder/MySubfolder).

Default: "/MyFolder/MySubfolder"

host

string

HDFS parameter: Mandatory. The IP address of the Hadoop name node.

Default: "127.0.0.1"

port

string

HDFS parameter: Mandatory. The port of the Hadoop name node.

Default: "9000"

user

string

HDFS parameter: Mandatory. The Hadoop user name.

Default: "hdfs"

rootPath

string

HDFS parameter: The optional root path name for browsing. Starts with a slash (e.g. /MyFolder/MySubfolder).

Default: "/MyFolder/MySubfolder"

keyFile

string

GCS parameter: Mandatory. The service account JSON key.

Default: ""

projectId

string

GCS parameter: Mandatory. The ID of the project to use.

Default: "projectID"

rootPath

string

GCS parameter: The optional root path name for browsing. Starts with a slash and the **bucket** name (e.g. /MyBucket/MyFolder).

Default: "/MyBucket/MyFolder"

accessKey

string

S3 parameter: Mandatory. The AWS access key ID.

Default: "AWSAccessKeyId"

secretKey

string

S3 parameter: Mandatory. The AWS secret access key.

Default: "AWSSecretAccessKey"

endpoint

string

S3 parameter: A custom endpoint (e.g. http://awsEndpointURL).

Default: ""

awsProxy

string

S3 parameter: The optional proxy URL.

Default: ""

region

string

S3 parameter: Mandatory. The AWS region to create the bucket in.

Default: "eu-central-1"

rootPath

string

S3 parameter: Mandatory. The root path name for browsing. Starts with a slash and the bucket name (e.g. /MyBucket/MyFolder).

Default: "/MyBucket/MyFolder"

protocol

string

S3 parameter: Mandatory. The protocol scheme to be used (HTTP or HTTPS).

Default: "HTTP"

accountName

string

WASB parameter: Mandatory. The account name from WASB.

Default: ""

accountKey

string

WASB parameter: Mandatory. The account key from WASB.

Default: ""

rootPath

string

WASB parameter: Mandatory. The root path name for browsing. Starts with a slash and the **container** name (e.g. /MyContainer/MyFolder).

Default: "/MyContainer/MyFolder"

protocol

boolean

WASB parameter: The protocol scheme to be used (WASBS/HTTPS or WASB/HTTP).

Default: true

rootPath

string

WebHDFS parameter: The optional root path name for browsing. Starts with a slash (e.g. /MyFolder/MySubfolder).

Default: "/MyFolder/MySubfolder"

protocol

string

WebHDFS parameter: Mandatory. The scheme used for the WebHDFS connection (webhdfs/http or swebhdfs/https).

Default: "webhdfs"

host

string

WebHDFS parameter: Mandatory. The IP address of the WebHDFS node.

Default: "127.0.0.1"

port

string

WebHDFS parameter: Mandatory. The port of the WebHDFS node.

Default: "9000"

user

string

WebHDFS parameter: Mandatory. The WebHDFS user name.

Default: "hdfs"

webhdfsToken

string

WebHDFS parameter: The token used to authenticate to WebHDFS.

Default: ""

webhdfsOAuthToken

string

WebHDFS parameter: The OAuth token used to authenticate to WebHDFS.

Default: ""

webhdfsDoAs

string

WebHDFS parameter: The user to impersonate. Has to be used together with webhdfsUser.

Default: ""
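As an illustration of how the parameters above fit together, the following sketch writes out a manual S3 configuration as a Python dictionary and checks that the S3 properties described as mandatory are filled in. The nesting of connectionProperties inside connection follows the table; the exact value strings "S3" and "Manual", the helper `missing_mandatory` and the choice of checked fields are assumptions made for this example.

```python
# Hypothetical configuration for the S3 service, using parameter names from the table.
config = {
    "service": "S3",                     # assumed spelling of the service value
    "terminateOnError": True,
    "timeoutInMs": 0,                    # 0 means no timeout
    "retryPeriodInMs": 0,
    "numRetryAttempts": 0,
    "simultaneousRequests": 1,
    "stopRequestOnError": False,
    "connection": {
        "configurationType": "Manual",   # assumed value for manual input
        "connectionProperties": {
            "accessKey": "AWSAccessKeyId",
            "secretKey": "AWSSecretAccessKey",
            "region": "eu-central-1",
            "rootPath": "/MyBucket/MyFolder",
            "protocol": "HTTP",
        },
    },
}

# S3 properties described as mandatory in the table (subset used for this check).
S3_MANDATORY = ("accessKey", "secretKey", "region", "protocol")

def missing_mandatory(config, mandatory=S3_MANDATORY):
    """Return the mandatory connection properties that are absent or empty."""
    props = config.get("connection", {}).get("connectionProperties", {})
    return [name for name in mandatory if not props.get(name)]

missing = missing_mandatory(config)
```

A check of this kind is useful before starting a graph, because several parameters default to an empty string and would otherwise only fail at connection time.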

Input

Input

Type

Description

in

string

The path of the file or directory to be removed.

Output

Output

Type

Description

out

string

A copy of the input string, sent once the operation has completed successfully.
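The relationship between the in and out ports can be sketched as a small handler. This is a simplified model rather than the operator's code: `handle_input`, `remove` and `emit` are hypothetical names, and the behavior when terminateOnError is false (the error is swallowed and nothing is sent) is an assumption.

```python
def handle_input(path, remove, emit, terminate_on_error=True):
    """Remove `path`; forward the same string only if the removal succeeded."""
    try:
        remove(path)
    except Exception:
        if terminate_on_error:
            raise            # modeled here as the graph terminating
        return False         # assumption: nothing is sent on failure
    emit(path)               # out port receives a copy of the input
    return True

# Usage with stand-ins for the storage call and the out port.
outputs = []

def fake_remove(path):
    if path.startswith("/missing"):
        raise FileNotFoundError(path)

ok = handle_input("/MyFolder/old.csv", fake_remove, outputs.append)
skipped = handle_input("/missing/file", fake_remove, outputs.append,
                       terminate_on_error=False)
```

A downstream operator can therefore treat a message on out as confirmation that the path it names no longer exists.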