Modeling Guide

SAP Vora Ingestor

The SAP Vora Ingestor operator dynamically ingests data into SAP Vora based on incoming record messages. The database table and its column definitions are determined by the metadata included in the message under the attribute "vora.record.definition". This metadata is typically generated by the SAP Vora PreIngestor operator. For more information, see the documentation for SAP Vora PreIngestor.

Sample Graph

Description

com.sap.demo.vora.ingestion.avro_ingestion_example_disk

Reads Avro messages from Kafka and writes them to the SAP Vora disk engine.

com.sap.demo.vora.ingestion.avro_ingestion_example_series

Reads Avro messages from Kafka and writes them to the SAP Vora time series engine.

com.sap.demo.vora.ingestion.csv_ingestion_example2_disk

Generates CSV messages and writes them to the SAP Vora disk engine.

com.sap.demo.vora.ingestion.csv_ingestion_example2_series

Generates CSV messages and writes them to the SAP Vora time series engine.

com.sap.demo.vora.ingestion.csv_ingestion_example3_disk

Generates CSV messages and writes them to the SAP Vora disk engine, using an Avro schema with additional metadata to customize the table.

com.sap.demo.vora.ingestion.json_ingestion_example2_disk

Generates JSON messages and writes them to the SAP Vora disk engine.

com.sap.demo.vora.ingestion.rec_ingestion_example2_disk

Generates record messages and writes them to the SAP Vora disk engine.

Configuration Parameters

Parameter

Type

Description

connectionType

string

Specifies how the connection to SAP Vora is configured: directly, using the dsn parameter, or indirectly, using the connection parameter.

Default: "dsn"

dsn

string

A valid data source name in the format v2://host:port/?binary=true. Make sure that you add /?binary=true to the end, because only binary transfer is available for the SAP Vora Transaction Coordinator.

Default: "v2://localhost:2204/?binary=true"

user

string

The user name if the connection is configured using dsn.

Default: ""

password

string

The password if the connection is configured using dsn.

Default: ""

connection

object

A valid connection configuration provided by ConnectionManager.

aggregation

bool

Enables the automatic aggregation of records to trigger a series of bulk inserts independently of the number of records contained in each Avro message.

Default: false

aggregateMaxBytes

int

The maximum total size, in bytes, of the aggregated records in auto-aggregation mode. Records are aggregated until this limit is reached.

Default: 4194304

aggregateMaxRecs

int

The maximum number of aggregated records in auto-aggregation mode. Records are aggregated until this limit is reached.

Default: 1000

aggregateMaxTime

int

The maximum time, in milliseconds, to wait before flushing aggregated records that have not yet reached the size limits.

Default: 2000

databaseSchema

string

The database schema name.

Default: "TPCH"

engineType

string

The engine type ("DISK" or "SERIES").

Default: "DISK"

partitionKeyRegex

string

A regular expression that matches the Avro field names to be used as the partition key.

Default: ""

partitionCriterion

string

A partition function.

Default: false

tableType

string

The table type ("STREAMING").

Default: "STREAMING"

ingestionMode

string

The ingestion mode ("INSERT" or "UPSERT").

Default: "INSERT"

primaryKeyRegex

string

A regular expression that matches the record field names to be used as primary keys. Primary keys can also be specified in the Avro schema; this parameter is only useful when they are not.

Default: ""
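The interaction of the three aggregation limits (aggregateMaxBytes, aggregateMaxRecs, and aggregateMaxTime) can be illustrated with a minimal Python sketch. This is not the operator's implementation: the RecordAggregator class, its flush callback, and the tick method are hypothetical names introduced here, and it is an assumption that a batch is flushed as soon as it reaches either the byte limit or the record limit, and that the time limit is measured from the first record of the current batch.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RecordAggregator:
    """Hypothetical buffer illustrating the three aggregation limits."""

    flush: Callable[[List[bytes]], None]      # stand-in for one bulk insert
    max_bytes: int = 4194304                  # aggregateMaxBytes default
    max_recs: int = 1000                      # aggregateMaxRecs default
    max_time_ms: int = 2000                   # aggregateMaxTime default
    clock: Callable[[], float] = time.monotonic
    _buf: List[bytes] = field(default_factory=list, init=False)
    _size: int = field(default=0, init=False)
    _started: float = field(default=0.0, init=False)

    def add(self, record: bytes) -> None:
        """Buffer one record; flush when the byte or record limit is reached."""
        if not self._buf:
            # Assumption: the timer starts with the first record of a batch.
            self._started = self.clock()
        self._buf.append(record)
        self._size += len(record)
        if self._size >= self.max_bytes or len(self._buf) >= self.max_recs:
            self._flush()

    def tick(self) -> None:
        """Call periodically; flushes a partial batch that has waited too long."""
        waited_ms = (self.clock() - self._started) * 1000
        if self._buf and waited_ms >= self.max_time_ms:
            self._flush()

    def _flush(self) -> None:
        self.flush(self._buf)
        self._buf, self._size = [], 0
```

With aggregation enabled in this sketch, many small incoming messages are combined into fewer, larger batches, while the time limit keeps a slow stream from holding records indefinitely.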

Input

Input

Type

Description

in

message

Accepts record messages to be processed.

Output

Output

Type

Description

out

message

Messages whose header properties indicate the commit progress. If the input message does not contain a commit token (that is, the message header message.commit.token), no output is generated.
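The commit-token rule can be sketched as follows. The build_output function is hypothetical and messages are modeled as plain dictionaries of header attributes; only the header name message.commit.token comes from the description above. Which properties the real operator writes to report commit progress is not specified here, so the sketch simply forwards the token as a placeholder.

```python
from typing import Optional

COMMIT_TOKEN_KEY = "message.commit.token"  # header name from the description


def build_output(attributes: dict) -> Optional[dict]:
    """Return output attributes, or None when no output message is produced."""
    token = attributes.get(COMMIT_TOKEN_KEY)
    if token is None:
        # No commit token on the input message: nothing is sent to "out".
        return None
    # Assumption: the token is forwarded so that a downstream operator can
    # correlate the commit notification with the original input message.
    return {COMMIT_TOKEN_KEY: token}
```

A downstream operator that relies on the out port should therefore make sure that the upstream producer sets the commit token on every message it needs confirmed.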