File/Hadoop Event XML Input Adapter Configuration

Configure the File/Hadoop Event XML Input adapter by specifying values for the ESP connector, formatter, and transporter modules in the adapter configuration file. The File/Hadoop Event XML Input adapter uses SimpleDateFormat formatting codes.

Logging
XML Element Description
Log4jProperty

Type: string

(Optional) Specify a full path to the log4j.properties logging file you wish to use. The default value is STREAMING_HOME/adapters/framework/config/log4j.properties.

XML Element Description
CharsetName

Type: string

(Optional) Specify the name of a supported charset for the input file. The default value is US-ASCII.

Transporter Module: File Input Transporter

The File Input transporter reads data from local files, wraps the data as a string, and sends it to the next module specified in the adapter configuration file. Set values for this transporter in the adapter configuration file.

XML Element Description
Module

(Required) Element containing all information for this module. It contains a type attribute for specifying the module type.

For example, transporter.

InstanceName

Type: string

(Required) Instance name of the specific module you want to use. For example, MyInputTransporter.

Name

Type: string

(Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter.

Next

Type: string

(Required) Instance name of the module that follows this one.

BufferMaxSize

Type: integer

(Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240.

Parameters

(Required) Element containing the FileInputTransporterParameters element.

FileInputTransporterParameters

(Required) Element containing elements for the File Input transporter.

Dir

Type: string

(Required) Specify the absolute path to the directory containing the data files that you want the adapter to read. For example, <username>/<foldername>.

Alternatively, you can leave this value empty and specify the absolute path in the File property. No default value.

To read files from a Hadoop file system, use an HDFS folder URI instead of a local file system folder. For example, hdfs://<hdfsserver>:9000/<foldername>/<subfoldername>/<leaffoldername>.

To use Hadoop, download the binaries for Hadoop version 1.2.1 or 2.2.0 from http://hadoop.apache.org. For version 1.2.1, copy the hadoop-core-1.2.1.jar file to %STREAMING_HOME%\adapters\framework\libj.

For version 2.2.0, copy these files over to %STREAMING_HOME%\adapters\framework\libj:
  • hadoop-common-2.2.0.jar
  • hadoop-auth-2.2.0.jar
  • hadoop-hdfs-2.2.0.jar
  • guava-11.0.2.jar
  • protobuf-java-2.5.0.jar
Ensure you use a stable version rather than a beta.

Use a forward slash for both UNIX and Windows paths.
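As an illustration of the Dir rules above, the fragment below is a minimal sketch of a Dir element that points at an HDFS folder, with a local Windows folder shown as a commented alternative. The host name, port, and folder names are hypothetical placeholders, not values from this documentation.

```xml
<FileInputTransporterParameters>
    <!-- Hypothetical HDFS folder URI; replace host, port, and folders with your own -->
    <Dir>hdfs://hdfs.example.com:9000/data/events/incoming</Dir>

    <!-- Alternative (hypothetical) local Windows folder, written with forward slashes:
         <Dir>C:/data/events/incoming</Dir> -->
</FileInputTransporterParameters>
```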

File

Type: string

(Required) Specify the relative path to the file you want the adapter to read, or a regex pattern that filters the files in a given directory. See the DynamicMode element.

If the Dir property is left blank, use this property to specify the absolute path to the data files which you want the adapter to read. No default value.

AccessMode

Type: string

(Required) Specify an access mode:
  • rowBased - the adapter reads one text line at a time.
  • Streaming - the adapter reads a preconfigured size of bytes into a buffer.
No default value.

DynamicMode

Type: string

(Advanced) Specify a dynamic mode:
  • Static - the adapter reads the file specified in the Dir and File elements.
  • dynamicFile - the adapter reads the file specified in the Dir and File elements and keeps polling for newly appended content. The polling period is specified in the PollingPeriod element.
  • dynamicPath - the adapter polls for all new files under the Dir element. The File element acts as a regex pattern that selects which of those files are read.
The default value is Static.

If DynamicMode has been set to dynamicPath and you leave the File element empty, the adapter reads all the files under the specified directory.

An example regex pattern is ".*\.txt", which selects only files that end with ".txt". In regex patterns, you must place an escape character, "\", before a metacharacter to match it literally in the pattern string.
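The following fragment sketches how the dynamicPath mode and a regex filter might be combined. It assumes a hypothetical local directory and a five-second polling period; the element names come from this section, while the values and the element order are illustrative only.

```xml
<FileInputTransporterParameters>
    <!-- Hypothetical directory to watch for new files -->
    <Dir>/home/espuser/incoming</Dir>
    <!-- Regex filter: only files whose names end in .txt are read -->
    <File>.*\.txt</File>
    <AccessMode>rowBased</AccessMode>
    <!-- Poll the directory for new files -->
    <DynamicMode>dynamicPath</DynamicMode>
    <!-- Polling interval in seconds (illustrative value) -->
    <PollingPeriod>5</PollingPeriod>
</FileInputTransporterParameters>
```

Leaving the File element empty in this sketch would cause the adapter to read every file under the directory, as described above.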

PollingPeriod

Type: integer

(Advanced) Define the period, in seconds, to poll the specified file or directory. Set this element only if the value of the DynamicMode element is set to dynamicFile or dynamicPath.

The default value is 0. A value of 0 or less turns off polling.

RemoveAfterProcess

Type: boolean

(Optional) If this property is set to true, the file is removed from the directory after the adapter processes it. This element takes effect only if the value of the DynamicMode element is set to dynamicPath; it is ignored if DynamicMode is set to dynamicFile.

The default value is false.

ScanDepth

Type: integer

(Optional) Specify the depth of the schema discovery. The adapter reads the number of rows specified by this element value when discovering the input data schema.

The default value is 3.
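Putting the transporter elements together, a complete module entry might look like the sketch below. The instance names, the module name FileInputTransporter (assumed to follow the <TransporterType>InputTransporter pattern), the directory, and the file name are assumptions for illustration; check the modulesdefine.xml file for the names that apply to your installation.

```xml
<Module type="transporter">
    <!-- Hypothetical instance name for this module -->
    <InstanceName>MyInputTransporter</InstanceName>
    <!-- Assumed module name; verify against modulesdefine.xml -->
    <Name>FileInputTransporter</Name>
    <!-- Instance name of the module that follows (hypothetical) -->
    <Next>MyXmlFormatter</Next>
    <!-- Buffer queue capacity; 10240 is the documented default -->
    <BufferMaxSize>10240</BufferMaxSize>
    <Parameters>
        <FileInputTransporterParameters>
            <!-- Hypothetical directory and file -->
            <Dir>/home/espuser/data</Dir>
            <File>events.xml</File>
            <AccessMode>rowBased</AccessMode>
            <!-- Read the file once, without polling -->
            <DynamicMode>Static</DynamicMode>
            <!-- Only takes effect in dynamicPath mode -->
            <RemoveAfterProcess>false</RemoveAfterProcess>
            <!-- Rows read during schema discovery; 3 is the documented default -->
            <ScanDepth>3</ScanDepth>
        </FileInputTransporterParameters>
    </Parameters>
</Module>
```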

Formatter Module: XML String to ESP Formatter

The XML String to ESP formatter translates ESP XML strings to AepRecord objects.

XML Element Description
Module

(Required) Element containing all information for this module. It contains a type attribute for specifying the module type.

For example, formatter.

InstanceName

Type: string

(Required) Instance name of the specific module you want to use. For example, MyInputTransporter.

Name

Type: string

(Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter.

Next

Type: string

(Required) Instance name of the module that follows this one.

BufferMaxSize

Type: integer

(Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240.

Parallel

Type: boolean

(Optional) If set to true, the module runs as a separate thread. The default value is true.

Parameters

(Required) Element containing the XmlStringToEspFormatterParameters element.

XmlStringToEspFormatterParameters

(Required) Element containing the XML String to ESP formatter elements.

SecondDateFormat

Type: string

(Optional) Format string for parsing SecondDate values.

The default value is yyyy-MM-dd'T'HH:mm:ss.

MsDateFormat

Type: string

(Optional) Format string for parsing MsDate values.

The default value is yyyy-MM-dd'T'HH:mm:ss.SSS.

BigDatetimeFormat

Type: string

(Advanced) Specify the format for parsing bigdatetime values.

The default value is yyyy-MM-dd'T'HH:mm:ss.SSSSSS. Using fewer than six Ss gives precision to that exact number of Ss and ignores values past that specification. Using more than six Ss truncates any values beyond the sixth, and replaces them with zero. This may result in slower behavior.
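A formatter module entry using the elements above might be written as in the following sketch. The instance names and the module name XmlStringToEspFormatter are assumptions made for illustration (verify the module name against modulesdefine.xml); the three format strings are the documented defaults and could be omitted.

```xml
<Module type="formatter">
    <!-- Hypothetical instance name -->
    <InstanceName>MyXmlFormatter</InstanceName>
    <!-- Assumed module name; verify against modulesdefine.xml -->
    <Name>XmlStringToEspFormatter</Name>
    <!-- Instance name of the module that follows (hypothetical) -->
    <Next>MyPublisher</Next>
    <!-- Run this module in its own thread (documented default) -->
    <Parallel>true</Parallel>
    <Parameters>
        <XmlStringToEspFormatterParameters>
            <!-- SimpleDateFormat patterns; these are the documented defaults -->
            <SecondDateFormat>yyyy-MM-dd'T'HH:mm:ss</SecondDateFormat>
            <MsDateFormat>yyyy-MM-dd'T'HH:mm:ss.SSS</MsDateFormat>
            <BigDatetimeFormat>yyyy-MM-dd'T'HH:mm:ss.SSSSSS</BigDatetimeFormat>
        </XmlStringToEspFormatterParameters>
    </Parameters>
</Module>
```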

ESP Connector Module: ESP Publisher

The ESP Publisher module obtains data from a transporter or formatter module and publishes it to the SAP Event Stream Processor project.

XML Element Description
Module

(Required) Element containing all information for this module. It contains a type attribute for specifying the module type.

For example, espconnector.

InstanceName

Type: string

(Required) Instance name of the specific module you want to use. For example, MyInputTransporter.

Name

Type: string

(Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter.

BufferMaxSize

Type: integer

(Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240.

Parameters

(Required) Element containing the EspPublisherParameters element.

EspPublisherParameters

(Required) Element containing elements for the ESP publisher.

ProjectName

Type: string

(Required if adapter is running in standalone mode; optional if it is running in managed mode) Name of the ESP project to which the adapter is connected. For example, StreamingProject2.

This is the same project tag that you specify later in the adapter configuration file in the Name element within the Event Stream Processor (EspProjects) element.

If you are starting the adapter with the ESP project to which it is attached (that is, running the adapter in managed mode), you need not set this element as the adapter automatically detects the project name.

StreamName

Type: string

(Required if adapter is running in standalone mode; optional if it is running in managed mode) Name of the stream to which the adapter publishes data.

If you are starting the adapter with the ESP project to which it is attached (that is, running the adapter in managed mode), you need not set this element as the adapter automatically detects the stream name.

MaxPubPoolSize

Type: positive integer

(Optional) Maximum size of the record pool. Record pooling, also referred to as block or batch publishing, allows for faster publication since there is less overall resource cost in publishing multiple records together, compared to publishing records individually.

Record pooling is disabled if this value is 1. The default value is 256.

MaxPubPoolTime

Type: positive integer

(Optional) Maximum period of time, in milliseconds, for which records are pooled before being published. If not set, pooling time is unlimited and the pooling strategy is governed by MaxPubPoolSize. No default value.

UseTransactions

Type: boolean

(Optional) If set to true, pooled messages are published to Event Stream Processor in transactions. If set to false, they are published in envelopes. The default value is false.

SafeOps

Type: boolean

(Advanced) Converts the opcodes INSERT and UPDATE to UPSERT, and converts DELETE to SAFEDELETE. The default value is false.

SkipDels

Type: boolean

(Advanced) Skips the rows with opcodes DELETE or SAFEDELETE. The default value is false.
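The publisher elements described above could be combined as in the sketch below, which assumes the adapter runs in standalone mode so that ProjectName and StreamName are required. The module type espconnector, the module name EspPublisher, the instance name, the stream name, and the pooling time are assumptions for illustration and should be checked against your own modulesdefine.xml and project.

```xml
<Module type="espconnector">
    <!-- Hypothetical instance name -->
    <InstanceName>MyPublisher</InstanceName>
    <!-- Assumed module name; verify against modulesdefine.xml -->
    <Name>EspPublisher</Name>
    <Parameters>
        <EspPublisherParameters>
            <!-- Must match the Name element of an EspProject entry -->
            <ProjectName>StreamingProject2</ProjectName>
            <!-- Hypothetical input stream in the project -->
            <StreamName>InputStream1</StreamName>
            <!-- Pool up to 256 records (documented default) -->
            <MaxPubPoolSize>256</MaxPubPoolSize>
            <!-- Publish pooled records after at most 500 ms (illustrative value) -->
            <MaxPubPoolTime>500</MaxPubPoolTime>
            <!-- false: pooled records are published in envelopes -->
            <UseTransactions>false</UseTransactions>
            <SafeOps>false</SafeOps>
            <SkipDels>false</SkipDels>
        </EspPublisherParameters>
    </Parameters>
</Module>
```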

SAP Event Stream Processor Elements

SAP Event Stream Processor elements configure communication between Event Stream Processor and the File/Hadoop Event XML Input adapter.

XML Element Description
EspProjects

(Required) Element containing elements for connecting to Event Stream Processor.

EspProject

(Required) Element containing the Name and Uri elements. Specifies information for the ESP project to which the adapter is connected.

Name

Type: string

(Required) Specifies the unique project tag of the ESP project which the EspConnector (publisher/subscriber) module references.

Uri

Type: string

(Required) Specifies the full project URI used to connect to the ESP project. For example, if you have SSL enabled, use esps://localhost:19011/ws1/p1.

Security

(Required) Element containing all the authentication elements below. Specifies details for the authentication method used for Event Stream Processor.

User

Type: string

(Required) Specifies the user name required to log in to Event Stream Processor (see AuthType). No default value.

Password

Type: string

(Required) Specifies the password required to log in to Event Stream Processor (see AuthType).

Includes an "encrypted" attribute indicating whether the Password value is encrypted. The default value is false. If set to true, the password value is decrypted using RSAKeyStore and RSAKeyStorePassword.

AuthType

Type: string

(Required) Method used to authenticate to Event Stream Processor. Valid values are:
  • server_rsa - RSA authentication using keystore
  • kerberos - Kerberos authentication using ticket-based authentication
  • user_password - LDAP, SAP BI, and Native OS (user name/password) authentication

If the adapter is operated as a Studio plug-in, AuthType is overridden by the Authentication Mode Studio start-up parameter.

RSAKeyStore

Type: string

(Dependent required) Specifies the location of the RSA keystore, and decrypts the password value. Required if AuthType is set to server_rsa, or the encrypted attribute for Password is set to true, or both.

RSAKeyStorePassword

Type: string

(Dependent required) Specifies the keystore password, and decrypts the password value. Required if AuthType is set to server_rsa, or the encrypted attribute for Password is set to true, or both.

KerberosKDC

Type: string

(Dependent required) Specifies the host name of the Kerberos key distribution center. Required if AuthType is set to kerberos.

KerberosRealm

Type: string

(Dependent required) Specifies the Kerberos realm setting. Required if AuthType is set to kerberos.

KerberosService

Type: string

(Dependent required) Specifies the Kerberos principal name that identifies the Event Stream Processor cluster. Required if AuthType is set to kerberos.

KerberosTicketCache

Type: string

(Dependent required) Specifies the location of the Kerberos ticket cache file. Required if AuthType is set to kerberos.

EncryptionAlgorithm

Type: string

(Optional) Used when the encrypted attribute for Password is set to true. If left blank, RSA is used as default.
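As a sketch of how these connection elements fit together, the fragment below configures one project with user name and password authentication. It assumes that the Security element is nested inside EspProject; the user name and password are hypothetical placeholders, and the URI is the SSL example given above.

```xml
<EspProjects>
    <EspProject>
        <!-- Project tag referenced by ProjectName in the publisher module -->
        <Name>StreamingProject2</Name>
        <!-- SSL-enabled project URI from the example above -->
        <Uri>esps://localhost:19011/ws1/p1</Uri>
        <Security>
            <!-- Hypothetical credentials; replace with your own -->
            <User>espuser</User>
            <Password encrypted="false">changeme</Password>
            <AuthType>user_password</AuthType>
            <!-- RSAKeyStore and RSAKeyStorePassword are needed only for
                 server_rsa authentication or an encrypted password.
                 The Kerberos elements are needed only when AuthType is kerberos. -->
        </Security>
    </EspProject>
</EspProjects>
```

With AuthType set to kerberos instead, the KerberosKDC, KerberosRealm, KerberosService, and KerberosTicketCache elements would all need values, as listed above.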