File/Hadoop JSON Output Adapter Configuration

Configure the File/Hadoop JSON Output adapter by specifying values for the ESP connector, formatter, and transporter modules in the adapter configuration file. The File/Hadoop JSON Output adapter uses SimpleDateFormat formatting codes.

Logging
XML Element Description
Log4jProperty

Type: string

(Optional) Specify a full path to the log4j.properties logging file you wish to use. The default value is $STREAMING_HOME/adapters/framework/config/log4j.properties.

Encryption
XML Element Description
Cipher

Type: string

(Optional) Specify a full path to the adapter.key encryption file you wish to use. The default value is $STREAMING_HOME/adapters/framework/adapter.key.

ESPConnector Module: ESP Subscriber

The ESP Subscriber module obtains data from the SAP Event Stream Processor project and passes it along to a transporter or formatter module.

XML Element Description
Module

(Required) Element containing all information for this module. It contains a type attribute for specifying the module type.

For example, espconnector.

InstanceName

Type: string

(Required) Instance name of the specific module you want to use. For example, MyInputTransporter.

Name

Type: string

(Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter.

Next

Type: string

(Required) Instance name of the module that follows this one.

BufferMaxSize

Type: integer

(Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240.

Parameters

Element containing the EspSubscriberParameters element.

EspSubscriberParameters

(Required) Element containing elements for the ESP subscriber.

ProjectName

Type: string

(Required if running adapter in standalone mode; optional if running in managed mode) Specifies the unique project tag of the ESP project to which the adapter is connected. For example, StreamingProject2.

This is the same project tag that you specify later in the adapter configuration file in the Name element within the Event Stream Processor (EspProjects) element.

If you are starting the adapter with the ESP project to which it is attached (running the adapter in managed mode), you do not need to set this element as the adapter automatically detects the project name.

StreamName

Type: string

(Required if running adapter in standalone mode; optional if running in managed mode) Name of the stream to which the adapter subscribes for data.

If you are starting the adapter with the ESP project to which it is attached (running the adapter in managed mode), you do not need to set this element as the adapter automatically detects the stream name.

OutputBase

Type: boolean

(Optional) If set to true, the adapter outputs the initial stream contents in addition to stream updates.

If this option is enabled and the adapter is running in GD mode, once the adapter has done a GD commit on the entire base data, the ESP Server does not redeliver the base data on adapter restart and only sends deltas that are saved for delivery. The default value is false.

OnlyBase

Type: boolean

(Advanced) Sends a one-time snapshot of initial contents in a stream. Default value is false. If set to true, OutputBase automatically becomes true.

EnableGDMode

Type: boolean

(Advanced) Specifies whether the adapter runs in guaranteed delivery (GD) mode. GD ensures that data continues to be processed in the case that the ESP Server fails, or the destination (third-party server) fails or does not respond for a long time. See Guaranteed Delivery in the SAP Event Stream Processor: Developer Guide for details on enabling GD for your project.

The default value is false.

EnableGDCache

Type: boolean

(Advanced) If set to true, only rows that can be recovered (that is, checkpointed) by the ESP Server on restart are sent to the end source. Other rows are cached internally by the adapter.

GDSubscriberName

Type: string

(Advanced) If the adapter is running in GD mode (the EnableGDMode property is set to true), specify a unique name to identify the GD subscription client. If this value is empty when running in GD mode, the adapter logs an error and shuts down.
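Putting the subscriber elements above together, a minimal EspConnector module entry might look like the following sketch. The instance names, project tag, and stream name are illustrative assumptions; the Name value must match the module name defined in modulesdefine.xml for your installation.

```xml
<!-- Illustrative sketch of an ESP Subscriber module entry; names are assumptions. -->
<Module type="espconnector">
    <InstanceName>MyEspSubscriber</InstanceName>
    <!-- Must match the module name defined in modulesdefine.xml -->
    <Name>EspSubscriber</Name>
    <Next>MyJsonFormatter</Next>
    <Parameters>
        <EspSubscriberParameters>
            <!-- Required in standalone mode; detected automatically in managed mode -->
            <ProjectName>StreamingProject2</ProjectName>
            <StreamName>MyOutputStream</StreamName>
            <EnableGDMode>false</EnableGDMode>
        </EspSubscriberParameters>
    </Parameters>
</Module>
```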

Formatter Module: ESP to JSON Stream Formatter

The ESP to JSON Stream formatter translates AepRecord objects to JSON strings, and sends the JSON strings to the next streaming output transporter that is configured in the adapter configuration file.

XML Element Description
Module

(Required) Element containing all information for this module. It contains a type attribute for specifying the module type.

For example, formatter.

InstanceName

Type: string

(Required) Instance name of the specific module you want to use. For example, MyInputTransporter.

Name

Type: string

(Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter.

Next

Type: string

(Required) Instance name of the module that follows this one.

BufferMaxSize

Type: integer

(Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240.

Parameters

(Required) Element containing the EsptoJsonStreamFormatterParameters element.

EsptoJsonStreamFormatterParameters

(Required) Element containing the ESP to JSON Stream formatter elements.

ColsMapping

(Required) Element containing the Column element.

Column

Type: complextype

(Required) Specify JSONPath expressions for the JSON data that you want to map to the columns of a stream. You can have multiple Column elements.

For example, if you had the following JSON data about a person,
{
    "firstName": "John",
    "lastName": "Smith",
    "phoneNumbers": [
        {
            "type": "home",
            "number": "212 555-1234"
        },
        {
            "type": "fax",
            "number": "646 555-4567"
        }
    ],
    "friends": [
        ["female1","female2","female3"],
        ["male1","male2","male3"]
    ]
}
you could get the individual's first name by using the JSONPath expression firstName. If you want the first phone number, specify phoneNumbers[0].number as the JSONPath expression. Each JSONPath expression must represent one record.

The first <Column/> element and its value are mapped to the first column of a stream, the second <Column/> element and its value are mapped to the second column of a stream, and so on.

SecondDateFormat

Type: string

(Advanced) The format string for SecondDate values.

For example, yyyy-MM-dd'T'HH:mm:ss is the default value.

MsDateFormat

Type: string

(Advanced) The format string for MsDate values.

For example, yyyy-MM-dd'T'HH:mm:ss.SSS is the default value.

BigDatetimeFormat

Type: string

(Advanced) Format string for parsing bigdatetime values. The default value is yyyy-MM-dd'T'HH:mm:ss.SSSSSS.

Specifying fewer than six S characters gives precision to exactly that number of digits and ignores any digits beyond that precision. Specifying more than six S characters truncates any digits beyond the sixth and replaces them with zeros; this may result in slower behavior.
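The formatter elements above can be combined into a module entry such as the following sketch, using the person JSON sample to illustrate ColsMapping. The instance names and the Name value are assumptions for the example; Name must match the entry in modulesdefine.xml.

```xml
<!-- Illustrative sketch of the ESP to JSON Stream formatter module; names are assumptions. -->
<Module type="formatter">
    <InstanceName>MyJsonFormatter</InstanceName>
    <!-- Must match the module name defined in modulesdefine.xml -->
    <Name>EspToJsonStreamFormatter</Name>
    <Next>MyFileOutTransporter</Next>
    <Parameters>
        <EsptoJsonStreamFormatterParameters>
            <ColsMapping>
                <!-- Mapped, in order, to the first, second, and third stream columns -->
                <Column>firstName</Column>
                <Column>lastName</Column>
                <Column>phoneNumbers[0].number</Column>
            </ColsMapping>
            <!-- Default date formats, shown explicitly -->
            <SecondDateFormat>yyyy-MM-dd'T'HH:mm:ss</SecondDateFormat>
            <MsDateFormat>yyyy-MM-dd'T'HH:mm:ss.SSS</MsDateFormat>
        </EsptoJsonStreamFormatterParameters>
    </Parameters>
</Module>
```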

Transporter Module: File Output Transporter

The File Output transporter obtains data from the previous module specified in the adapter configuration file and writes it to local files.

XML Element Description
Module

(Required) Element containing all information for this module. It contains a type attribute for specifying the module type.

For example, transporter.

InstanceName

Type: string

(Required) Instance name of the specific module you want to use. For example, MyInputTransporter.

Name

Type: string

(Required) Name of the module as defined in the modulesdefine.xml file. For example, <TransporterType>InputTransporter.

BufferMaxSize

Type: integer

(Advanced) Capacity of the buffer queue between this module and the next. The default value is 10240.

Parameters

(Required) Element containing the FileOutputTransporterParameters element.

FileOutputTransporterParameters

(Required) Element containing the elements for the File Output transporter.

Dir

Type: string

(Required) Specify the absolute path to the data files that you want the adapter to write to. For example, <username>/<foldername>.

The default value is ".", meaning the current directory in which the adapter is running.

Alternatively, you can leave this value empty and specify the absolute path in the File property.

To use Hadoop system files, use an HDFS folder instead of a local file system folder. For example, hdfs://<hdfsserver>:9000/<foldername>/<subfoldername>/<leaffoldername>.

To use Hadoop, download the binaries for Hadoop version 1.2.1 or 2.2.0 from http://hadoop.apache.org (information published on a non-SAP site). For version 1.2.1, copy the hadoop-core-1.2.1.jar file to %STREAMING_HOME%\adapters\framework\libj.

For version 2.2.0, copy these files over to %STREAMING_HOME%\adapters\framework\libj:
  • hadoop-common-2.2.0.jar
  • hadoop-auth-2.2.0.jar
  • hadoop-hdfs-2.2.0.jar
  • guava-11.0.2.jar
  • protobuf-java-2.5.0.jar
Ensure you use a stable version rather than a beta.

Use a forward slash for both UNIX and Windows paths.

File

Type: string

(Required) Specify the relative path to the file to which the adapter writes.

If the Dir property is left blank, use this property to specify the absolute path to the data file to which you want the adapter to write.

No default value.

AccessMode

Type: string

(Required) Specify an access mode:
  • rowBased – the adapter writes one text line at a time into the file.
  • Streaming – the adapter writes the raw data in a ByteBuffer into the file.
No default value.

AppendMode

Type: boolean

(Optional) If set to true, the adapter appends the data into the existing file. If set to false, the adapter overwrites existing content in the file.

If the adapter is running in GD mode, SAP recommends that you set this property to true. Otherwise, the content of the output file will be erased every time the adapter is restarted.

The default value is false.

BatchSize

Type: integer

(Advanced) If the adapter is running in GD (guaranteed delivery) mode, specify the number of message rows after which the adapter issues a commit command to the external data source and a GD commit to the stream to which the adapter is attached.

If the adapter is running without GD mode, specify the number of message rows after which the adapter issues a commit command to the external data source. The default value is 1.

Increasing this value improves performance at the expense of latency. It also increases memory consumption in the ESP Server because the uncommitted rows need to be preserved for redelivery in case of failure.

If the ESP Subscriber module EnableGDMode element is set to true, set either this or the BatchPeriod property to greater than 0. If neither property is set to greater than 0, a warning is sent and this property is set to 1.

BatchPeriod

Type: integer

(Advanced) If the adapter is running in GD mode, specify the number of seconds after which the adapter issues a commit command to the external data source and a GD commit to the stream to which the adapter is attached.

If the adapter is running without GD mode, specify the number of seconds after which the adapter issues a commit command to the external data source. The default value is 0.

Increasing this value improves performance at the expense of latency. It also increases memory consumption in the ESP Server because the uncommitted rows need to be preserved for redelivery in case of failure.

FileSizeLimit

Type: integer

(Optional) Specify the maximum size, in bytes, of the output file. If this property is set, the adapter starts writing a new file every time the size of the current output file becomes greater than this value. The files are named <filename>, <filename>.001, <filename>.002, and so on where <filename> is the value of the File element.

If you do not specify a value or you specify a value of 0 for this property, the output file does not have a size limit.

No default value.

TimeBasedRotate

Type: boolean

(Optional) Specify whether to rotate files at predefined intervals.

The default value is false.

TimeBasedRotateInterval

Type: interval

(Optional) Specify the amount of time, in seconds, to wait between file rotations.

The default value is 24 hours.

TimeBasedRotateStartAt

Type: time

(Optional) Specify the time of the first file rotation. The supported formats are:       
  • yyyy-MM-DD HH:mm:ss.SS z
  • yyyy-MM-DD HH:mm:ss.SS
  • yyyy-MM-DD HH:mm:ss z
  • yyyy-MM-DD HH:mm:ss
  • yyyy-MM-DD HH:mm z
  • yyyy-MM-DD HH:mm
  • yyyy-MM-DD z
  • yyyy-MM-DD
  • HH:mm:ss z
  • HH:mm:ss
  • HH:mm z
  • HH:mm
The default value is 0:00 UTC.

If you do not specify a time zone, the adapter defaults to UTC.

TimestampInFilenames

Type: boolean

(Optional) Specify whether to append the system time, in UTC, to the output file name when a new output file is created.

The default value is false.

TimestampInFilenamesFormat

Type: string

(Optional) Specify the timestamp format that gets appended to the output file name. Valid formats include any formats accepted by the Java SimpleDateFormat class. For example, yyyy-MM-dd_HH-mm-ss.

On Windows, the following symbols are not permitted in the file name: \ / : * ? " < > |.

The default timestamp format is yyyyMMdd_HHmmss.

HDFSReplaceDataNodeOnFailureEnable

Type: boolean

(Optional; applicable only for use with HDFS folders) Enable this property if you wish to provide an alternate value for the dfs.client.block.write.replace-datanode-on-failure.enable property. Use only with Hadoop 2.2.0 or higher. For additional information on configuring this property, see the Hadoop documentation.

By default, this property is commented out and the adapter inherits the default value for the dfs.client.block.write.replace-datanode-on-failure.enable property on the Hadoop client side.

HDFSReplaceDataNodeOnFailurePolicy

Type: string

(Optional; applicable only for use with HDFS folders) Enable this property if you wish to provide an alternate value for the dfs.client.block.write.replace-datanode-on-failure.policy property. Use only with Hadoop 2.2.0 or higher. For additional information on configuring this property, see the Hadoop documentation.

By default, this property is commented out and the adapter inherits the default value for the dfs.client.block.write.replace-datanode-on-failure.policy property on the Hadoop client side.

HDFSReplication

Type: integer

(Optional; applicable only for use with HDFS folders) Enable this property if you wish to provide an alternate value for the dfs.replication property. Use only with Hadoop 2.2.0 or higher.

If you are using a Hadoop system with less than three nodes, set this property accordingly. For additional information on configuring this property, see the Hadoop documentation.

By default, this property is commented out and the adapter inherits the default value for the dfs.replication property on the Hadoop client side.
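The transporter elements above can be combined into a module entry such as the following sketch. The directory, file name, and instance names are illustrative assumptions, and only a subset of the optional properties is shown.

```xml
<!-- Illustrative sketch of the File Output transporter module; paths and names are assumptions. -->
<Module type="transporter">
    <InstanceName>MyFileOutTransporter</InstanceName>
    <!-- Must match the module name defined in modulesdefine.xml -->
    <Name>FileOutputTransporter</Name>
    <Parameters>
        <FileOutputTransporterParameters>
            <!-- For HDFS, use a URI such as hdfs://hdfsserver:9000/data/out instead -->
            <Dir>/home/user1/json_out</Dir>
            <File>person.json</File>
            <AccessMode>rowBased</AccessMode>
            <!-- Recommended true when running in GD mode -->
            <AppendMode>true</AppendMode>
            <BatchSize>1</BatchSize>
            <!-- Rotate once a day, stamping each new file name with the system time -->
            <TimeBasedRotate>true</TimeBasedRotate>
            <TimeBasedRotateInterval>86400</TimeBasedRotateInterval>
            <TimestampInFilenames>true</TimestampInFilenames>
        </FileOutputTransporterParameters>
    </Parameters>
</Module>
```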

SAP Event Stream Processor Elements

SAP Event Stream Processor elements configure communication between Event Stream Processor and the File/Hadoop JSON Output adapter.

XML Element Description
EspProjects

(Required) Element containing elements for connecting to Event Stream Processor.

EspProject

(Required) Element containing the Name and Uri elements. Specifies information for the ESP project to which the adapter is connected.

Name

Type: string

(Required) Specifies the unique project tag of the ESP project which the EspConnector (publisher/subscriber) module references.

Uri

Type: string

(Required) Specifies the total project URI to connect to the ESP project. For example, if you have SSL enabled, use esps://localhost:19011/ws1/p1.

Security

(Required) Element containing all the authentication elements below. Specifies details for the authentication method used for Event Stream Processor.

User

Type: string

(Required) Specifies the user name required to log in to Event Stream Processor (see AuthType). No default value.

Password

Type: string

(Required) Specifies the password required to log in to Event Stream Processor (see AuthType).

Includes an "encrypted" attribute indicating whether the Password value is encrypted. The default value is false. If set to true, the password value is decrypted using RSAKeyStore and RSAKeyStorePassword.

AuthType

Type: string

(Required) Method used to authenticate to Event Stream Processor. Valid values are:
  • server_rsa - RSA authentication using keystore
  • kerberos - Kerberos authentication using ticket-based authentication
  • user_password - LDAP, SAP BI, and Native OS (user name/password) authentication

If the adapter is operated as a Studio plug-in, AuthType is overridden by the Authentication Mode Studio start-up parameter.

RSAKeyStore

Type: string

(Dependent required) Specifies the location of the RSA keystore, and decrypts the password value. Required if AuthType is set to server_rsa, or the encrypted attribute for Password is set to true, or both.

RSAKeyStorePassword

Type: string

(Dependent required) Specifies the keystore password, and decrypts the password value. Required if AuthType is set to server_rsa, or the encrypted attribute for Password is set to true, or both.

KerberosKDC

Type: string

(Dependent required) Specifies the host name of the Kerberos key distribution center. Required if AuthType is set to kerberos.

KerberosRealm

Type: string

(Dependent required) Specifies the Kerberos realm setting. Required if AuthType is set to kerberos.

KerberosService

Type: string

(Dependent required) Specifies the Kerberos principal name that identifies the Event Stream Processor cluster. Required if AuthType is set to kerberos.

KerberosTicketCache

Type: string

(Dependent required) Specifies the location of the Kerberos ticket cache file. Required if AuthType is set to kerberos.

EncryptionAlgorithm

Type: string

(Optional) Used when the encrypted attribute for Password is set to true. If left blank, RSA is used by default.
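Drawing on the elements in the table above, a connection block using user name/password authentication might be sketched as follows. The host, port, workspace, project, and credentials are illustrative assumptions, and the exact nesting of the Security element is inferred from the element order in the table.

```xml
<!-- Illustrative sketch of the Event Stream Processor connection elements; values are assumptions. -->
<EspProjects>
    <EspProject>
        <!-- Same project tag referenced by the EspConnector module's ProjectName -->
        <Name>StreamingProject2</Name>
        <!-- Use esps:// when SSL is enabled, esp:// otherwise -->
        <Uri>esps://localhost:19011/ws1/p1</Uri>
        <Security>
            <User>espuser</User>
            <Password encrypted="false">secret</Password>
            <AuthType>user_password</AuthType>
        </Security>
    </EspProject>
</EspProjects>
```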