Read File
The Read File operator reads a file, or periodically polls a directory for its contents, in a storage service.
The operation takes a single input parameter: the path of the file. This is given as a string in the body of a message on the inPath port. If no input is connected to the port, the operator periodically polls the configured path.
The file content is output as the body of a message on the outFile port. Further details of the operation are reported as headers of the message, as listed in the port documentation.
An example of usage is given in the com.sap.demo.file graph.
Polling directories: when the given path points to a directory, the operator polls all files inside that directory. Subdirectories are included only if the recursive parameter is set to true.
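The polling behavior can be approximated on a local file system. The following is a minimal sketch, not the operator's implementation: `list_files` and `poll_once` are hypothetical helper names, the local directory stands in for a storage service, and the way the regular expression is matched (a search within the absolute path) is an assumption. It mirrors the recursive, pattern, and onlyReadOnChange parameters described in the configuration table.

```python
import os
import re
import tempfile
from pathlib import Path


def list_files(directory, recursive=False, pattern=""):
    """Return the absolute paths of the files under `directory`, sorted.

    Hypothetical stand-in for a single polling pass.
    """
    root = Path(directory).resolve()
    candidates = root.rglob("*") if recursive else root.iterdir()
    regex = re.compile(pattern) if pattern else None
    files = []
    for entry in candidates:
        if not entry.is_file():
            continue
        # Assumption: the expression is searched within the absolute path.
        if regex is None or regex.search(str(entry)):
            files.append(str(entry))
    return sorted(files)


def poll_once(directory, seen_mtimes, **options):
    """Return the files that are new or whose modification time changed."""
    changed = []
    for path in list_files(directory, **options):
        mtime = os.stat(path).st_mtime_ns
        if seen_mtimes.get(path) != mtime:
            seen_mtimes[path] = mtime
            changed.append(path)
    return changed


# Demonstration on a throwaway directory.
demo_dir = tempfile.mkdtemp()
Path(demo_dir, "a.txt").write_text("alpha")
Path(demo_dir, "sub").mkdir()
Path(demo_dir, "sub", "b.csv").write_text("beta")

flat = [os.path.basename(p) for p in list_files(demo_dir)]
deep = [os.path.basename(p) for p in list_files(demo_dir, recursive=True)]
csv_only = [
    os.path.basename(p)
    for p in list_files(demo_dir, recursive=True, pattern=r"\.csv$")
]

state = {}
first_poll = poll_once(demo_dir, state, recursive=True)
second_poll = poll_once(demo_dir, state, recursive=True)
print(flat, deep, csv_only, len(first_poll), len(second_poll))
```

In this sketch the second poll returns nothing because no modification time has changed, which is the effect onlyReadOnChange is described as having.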
Configuration Parameters
Parameter |
Type |
Description |
---|---|---|
service |
string |
The file service to operate on. Additional parameters may depend on the selected service. Default: "file" |
path |
string |
A directory to be polled (ends with /) or a file to be read. This only applies if inPath is not connected. Default: "/tmp/test.txt" |
deleteAfterSend |
bool |
A flag that indicates whether the file should be deleted after its contents have been sent. Default: false |
chunkSize |
string |
The maximum number of bytes read from a file at once. The operator reads the file in blocks until it reaches the end of the file, which can reduce graph latency and memory usage. If chunkSize is zero, each file is read in a single chunk. Otherwise, the file is broken into chunks with a maximum size of `chunkSize`. The value may be customized dynamically through the message header storage.chunkSize. This field accepts metric prefixes and an optional "i" to indicate binary bases. For example:
Default: "0" |
numRetryAttempts |
int |
The number of times to retry a connection. Default: 0 |
retryPeriodInMs |
int |
The time interval in milliseconds between connection attempts. Default: 0 |
pollPeriodInMs |
int |
The time interval in milliseconds between successive polls. If no interval is needed, the value `0` should be used. Default: 1000 |
batchRead |
bool |
A flag that controls whether files are read in batches. If set to true and a directory is being polled, outFilename receives a list with one filename per line, and outFile receives a message whose body is a list of messages, each of which corresponds to a single file. Default: false |
recursive |
bool |
A flag that controls whether a directory listing should recursively include all sub-directories. Default: false |
pattern |
string |
A regular expression used to filter file paths before reading them. If empty, all files are accepted. The expression is applied to a path after it has been converted to an absolute one, and only if:
Default: "" |
onlyReadOnChange |
bool |
If true, a file is output only if it is new or has changed, which avoids repeatedly reading unchanged files. Changes are detected from the latest modification time reported by the file system, not from the actual file contents. Default: false |
terminateOnError |
boolean |
Specifies whether the graph should terminate when the operator fails. Default: "true" |
connection |
object |
Holds the connection information for the services. Each service's connection parameters are documented separately:
|
configurationType |
string |
connection parameter: Which type of connection information will be used: Manual (user input) or retrieved by the Connection Management Service. Default: "" |
connectionID |
string |
connection parameter: The ID of the connection information to retrieve from the Connection Management Service. Default: "" |
connectionProperties |
object |
connection parameter: All the connection properties for the selected service for manual input. |
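The chunkSize description can be illustrated with a short sketch. The parser below is an assumption about how such strings might be interpreted, not the operator's actual grammar: it treats a bare metric prefix as a power of 1000 and a prefix followed by "i" as a power of 1024, and the accepted prefixes (k, K, M, G, T) are chosen for illustration only. `parse_chunk_size` and `read_chunks` are hypothetical names.

```python
import io
import re

# Assumed multipliers: the exponent applied to 1000 (metric) or 1024 (binary).
_PREFIX_POWER = {"": 0, "k": 1, "M": 2, "G": 3, "T": 4}


def parse_chunk_size(text):
    """Convert strings such as "0", "500", "1k" or "1Ki" into a byte count."""
    match = re.fullmatch(r"\s*(\d+)\s*([kKMGT]?)(i?)\s*", text)
    if match is None:
        raise ValueError(f"unsupported chunk size: {text!r}")
    number, prefix, binary = match.groups()
    if binary and not prefix:
        raise ValueError(f"'i' needs a prefix: {text!r}")
    power = _PREFIX_POWER["k" if prefix == "K" else prefix]
    base = 1024 if binary else 1000
    return int(number) * base ** power


def read_chunks(stream, chunk_size):
    """Yield the stream's content in pieces of at most `chunk_size` bytes.

    A chunk size of zero yields the whole content as a single chunk.
    """
    if chunk_size == 0:
        yield stream.read()
        return
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


data = b"x" * 2500
chunked_sizes = [
    len(c) for c in read_chunks(io.BytesIO(data), parse_chunk_size("1k"))
]
single_sizes = [
    len(c) for c in read_chunks(io.BytesIO(data), parse_chunk_size("0"))
]
print(chunked_sizes, single_sizes)
```

Under these assumptions, a 2500-byte file read with a chunk size of "1k" produces three chunks, while "0" produces one chunk holding the whole file.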
Input
Input |
Type |
Description |
---|---|---|
inPath |
message |
A message whose body is the path (relative or absolute) of a file or directory (ends with /) to be read. When reading a single file, the message header storage.offset may be set to read a specific chunk from a file, which is also subject to the chunkSize configuration. |
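The interaction between storage.offset and chunkSize can be sketched against a local file. This is an illustration under stated assumptions: the message is modeled as a plain dictionary with "body" and "headers" keys, the offset is taken to be a byte position, and `read_from_message` is a hypothetical helper rather than part of the operator.

```python
import os
import tempfile


def read_from_message(message, chunk_size=0):
    """Read the file named in the message body, honoring an optional offset."""
    path = message["body"]
    # Assumption: storage.offset is a byte position and defaults to 0.
    offset = int(message.get("headers", {}).get("storage.offset", 0))
    with open(path, "rb") as handle:
        handle.seek(offset)
        # A negative size makes read() continue to the end of the file.
        return handle.read(chunk_size if chunk_size > 0 else -1)


# Demonstration with a ten-byte temporary file.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as handle:
    handle.write(b"0123456789")

whole = read_from_message({"body": demo_path})
piece = read_from_message(
    {"body": demo_path, "headers": {"storage.offset": 4}}, chunk_size=3
)
os.remove(demo_path)
print(whole, piece)
```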
Output
Output |
Type |
Description |
---|---|---|
outFilename |
message |
A message whose body is the path of the file. This will be equal to the path that prompted the reading (either inPath or path). |
outFile |
message |
A message whose headers describe the file read and whose body contains the file's
contents as a blob. The message contains the following
headers:
|
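When batchRead is enabled, a downstream consumer has to pair the filename list with the list of per-file messages. The sketch below shows one way this might be done. It models messages as plain dictionaries with a "body" key and assumes that both outputs list the files in the same order; neither point is confirmed by this documentation, and `unpack_batch` is a hypothetical helper.

```python
def unpack_batch(filename_message, file_message):
    """Pair each filename with the body of the matching per-file message."""
    names = filename_message["body"].splitlines()
    files = file_message["body"]
    if len(names) != len(files):
        raise ValueError("filename list and file list differ in length")
    # Assumption: both outputs list the files in the same order.
    return {name: item["body"] for name, item in zip(names, files)}


# Hand-built stand-ins for the outFilename and outFile messages.
filenames = {"body": "/data/a.txt\n/data/b.txt"}
batch = {"body": [{"body": b"alpha"}, {"body": b"beta"}]}
contents = unpack_batch(filenames, batch)
print(contents)
```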