Modeling Guide for SAP Data Hub

HANA Client

The HANA Client operator executes SQL statements and inserts CSV or JSON data into an SAP HANA instance. It supports three fundamental operations:

  • Executing arbitrary SQL code:

    SQL code arrives at port sql.

  • Executing prepared statements:

    A message arrives at port data with the hana.preparedStatement attribute, which names which of the prepared statements configured in preparedStatements is to be executed. Data is expected to arrive in tabular form, that is, an array of arrays of values ([][]interface{}). If the message body is not of this type, it will be ignored and the operator will try to execute the prepared statement without arguments.

  • Inserting data into a predefined table:

    A message arrives at port data without the hana.preparedStatement attribute. The message body is parsed as either CSV or JSON (see inputFormat) and the resulting rows are inserted into the table identified by tableName, according to the column names and types in tableColumns.

Configuration Parameters

Parameter

Type

Description

Connection

object

Mandatory. The object containing the connection parameters to a HANA instance.

ConnectionType

string

The type of connection to use when connecting to HANA. The three values below are accepted:
  • Insecure (no encryption): a clear-text connection, without TLS encryption.
  • Insecure TLS (skip verify): an encrypted TLS connection that skips server certificates validation.
  • Secure TLS: an encrypted TLS connection that validates server certificates. If this option is chosen, the certificates field must be filled out.

Default: "Insecure (no encryption)"

Certificates

string

The filename, (absolute or relative to Pipeline Modeler's repo-root) to a PEM-encoded root CA certificate. Only used when a Secure TLS connection is selected.

Default: "

initStatements

string

The SQL code to be executed when the operator is initialized. Note that the result of running these statements is not written to any ports.

Default: "

preparedStatements

array

A JSON array of SQL statements. Those statements will be prepared during operator initialization and can be invoked using the hana.preparedStatement message attribute for port data.

For efficient insertion, prefer BULK INSERT INTO over BULK INSERT, since the latter will insert rows one by one whereas the former will batch them to reduce network usage. Note that BULK INSERT statements without arguments (question-mark placeholders) will produce no effect.

Default: []

tableName

string

The name of the table to insert data into. If empty, port data can only receive messages with the hana.preparedStatement attribute.

Default: "

tableColumns

array

An array describing the columns of the table identified by tableName. If empty, port data can only receive messages with the hana.preparedStatement attribute. The size field, present for types (N)VARCHAR and FLOAT, is only required if initTable is not None.

Default: []

inputFormat

string

Mandatory. The format used to parse the data received at the data port. The following values are accepted:
  • CSV: input will be broken into records using csvBreak; records will be broken into fields using csvComma. Each field will be parsed according to the type defined for its column in tableColumns..
  • JSON: input must come as a JSON array. Each of its elements can be either:
    • a JSON array representing a table row; its elements must be in the same order as tableColumn.

    • a JSON object whose keys are the column names in tableColumns; spare keys will be ignored.

When using JSON, each column value will be treated in its JSON-native type, unconverted.

csvMode string Specifies how the operator should interpret the CSV data that arrives at the data port for insertion (without the hana.preparedStatement attribute):
  • Batch: each chunk from the input will be parsed as a self-contained CSV string. The records will be inserted before reading the next input from data.

  • Stream: each chunk is considered as the next piece of a single stream, so each individual input does not need to be complete. Once csvStreamWindow records are buffered (or csvBulkTimeoutMs milliseconds go by without any new records), they are inserted into the table in a single bulk and an output message is sent to result (if applicable).

    When the graph stops, the stream is ended, so all buffered records are inserted even if fewer than csvStreamWindow, and output is not produced.

    Note that the end of the stream may truncate the last record, so if it was not complete (had fewer fields than required), it will be discarded. An the error will be logged and the pending records that came before will be inserted as usual.

Default: "Batch"

csvBulkSize int When in CSV stream mode, this property determines how many CSV records should be accumulated to be inserted in a single bulk.

Default: 100

csvBulkTimeoutMs int When in CSV stream mode, if this value is positive and no new record was received for this duration, all buffered records will be inserted as a bulk.

Default: 10000

csvEmptyFieldValue string When input is in CSV format, this property specifies whether empty fields are interpreted as empty strings ("Empty string") or as null values ("NULL").

Default: "Empty string"

csvBreak string The character used as record delimiter in CSV input. If set to "\n", both LF and CRLF will be accepted.

Default: "\n"

csvComma string The character used as field delimiter in CSV input.

Default: ","

csvQuote string The character used as quote.

Default: """

csvLazyQuote bool If true, allows fields to contain unescaped quotes. With this option, only quotes at the start of a field need to be escaped.

Default: true

insertMode

string

Mandatory. Specifies whether the INSERT or the UPSERT command should be used for inserting data. When using UPSERT, the target table must have a primary key. For other uses of UPSERT, such as WHERE clauses, use preparedStatements.

Default: "INSERT"

initTable string Mandatory. Whether the table should be initialized or not. It is executed before the initStatements. If different than None ensures the schema exists (if explicitly given in tableName) and requires parameter tableName and tableColumns. The following options are given:
  • None: no table initialization is done.
  • Create: ensures table exists.
  • Truncate: ensures table exists and is empty, truncating the table.
  • Delete Rows: ensures table exists and is empty, deleting all rows.
  • Drop (Cascade): ensures table exists and is empty, dropping table with cascade option.
  • Drop (Restrict): ensures table exists and is empty, dropping table with restrict option.

numRetryAttempts

int

The number of retry attempts.

Default: 0

retryPeriodInMs

int

The waiting time in milliseconds between consecutive retry attempts.

Default: 0

connectionTimeoutInMs

int

The number of milliseconds to wait for when connecting to HANA before timing out.

Default: 10000

terminateOnError

bool

A flag that indicates whether or not the graph should stop if this operator encounters an error. If set to false, errors messages will be sent as output to result.

Default: true

Input

Input

Type

Description

sql

message

Accepts SQL code in a message's body to be executed by the SAP HANA instance.

data

message

Accepts CSV or JSON data in a message's body. If the attribute hana.preparedStatement is present, its value is used to identify a prepared statement to which this data will be submitted. Otherwise, data will be inserted into the table identified by tableName.

Output

Output

Type

Description

result

message

A message containing the result of an operation. If terminateOnError is false, a message may represent an error, in which case its attribute message.error will be true and its body will contain a string describing the error. The original input message attributes will be preserved in the output.

Supported Types

The following data types are supported by this operator when using the data port:
  • Numeric Types
    • TINYINT
    • SMALLINT
    • INTEGER
    • BIGINT
    • FLOAT
    • DOUBLE
  • Boolean Type:
    • BOOLEAN
  • Character String Types:
    • VARCHAR
    • NVARCHAR

    There must be no size specifier for the string types, so they must appear exactly as above, without trailing parentheses.

  • Datetime Types: the following types are supported, along with HANA's default format for each of them:
    • DATE: YYYY-MM-DD (e.g. '1957-06-13')
    • TIME: HH24:MI:SS (e.g. '23:32:18')
    • TIMESTAMP: YYYY-MM-DD HH24:MI:SS.FF7 (e.g. '1957-06-13 23:32:18.8261933')
    • SECONDDATE: YYYY-MM-DD HH24:MI:SS (e.g. '1957-06-13 23:32:18')
When executing queries with the sql port, the following equivalence is observed for port result:
Column Type Output Type
TINYINT uint8
SMALLINT int16
INTEGER int32
BIGINT int64
REAL float32
FLOAT, DOUBLE float64
DECIMAL, SMALLDECIMAL string ("numerator/denominator")
ALPHANUM, SHORTTEXT, VARCHAR, NVARCHAR string
SECONDDATE string ("YYYY-mm-dd HH:MM:SS")
TIMESTAMP string ("YYYY-mm-dd HH:MM:SS.NNNNNNN")
DATE string ("YYYY-mm-dd")
TIME string ("HH:MM:SS")
BOOLEAN uint8 (0 or 1)
VARBINARY, TEXT, BLOB, CLOB, NCLOB []byte