Vora Ingestor
The Vora Ingestor operator allows you to dynamically ingest data into SAP Vora based on incoming record messages. The target database table and its column definitions are determined by the metadata included in the message under the attribute "vora.record.definition".
The target table is created automatically from the information provided in the metadata. If the table already exists, its columns must match the columns defined in the metadata. This metadata is typically generated by the SAP Vora Avro Decoder, which extracts record fields based on the provided Avro schema from input data in various formats such as Avro, JSON, and CSV.
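For example, the column metadata can originate from an Avro schema like the minimal sketch below. The record name, field names, types, and the use of the "primaryKey" extension property are illustrative assumptions, not a prescribed schema.

```python
import json

# Hypothetical Avro schema, for illustration only. The SAP Vora Avro Decoder
# derives the record field definitions (which it attaches to the message as the
# "vora.record.definition" attribute) from a schema of this general shape.
sensor_schema = {
    "type": "record",
    "name": "SensorReading",
    "fields": [
        # "primaryKey" is the extension property mentioned under primaryKeyRegex
        # below; declaring it here is optional and shown only as an example.
        {"name": "sensor_id", "type": "string", "primaryKey": True},
        {"name": "ts", "type": "long"},
        {"name": "value", "type": "double"},
    ],
}

print(json.dumps(sensor_schema, indent=2))
```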
Sample Graph | Description
---|---
com.sap.demo.vora.ingestion.avro_ingestion_example_disk | Reading Avro messages from Kafka and writing to Vora Disk Engine.
com.sap.demo.vora.ingestion.avro_ingestion_example_series | Reading Avro messages from Kafka and writing to Vora Timeseries Engine.
com.sap.demo.vora.ingestion.csv_ingestion_example2_disk | Generating CSV messages and writing to Vora Disk Engine.
com.sap.demo.vora.ingestion.csv_ingestion_example2_series | Generating CSV messages and writing to Vora Timeseries Engine.
com.sap.demo.vora.ingestion.csv_ingestion_example3_disk | Generating CSV messages and writing to Vora Disk Engine using an Avro schema with additional metadata to customize the table.
com.sap.demo.vora.ingestion.json_ingestion_example2_disk | Generating JSON messages and writing to Vora Disk Engine.
com.sap.demo.vora.ingestion.rec_ingestion_example2_disk | Generating record messages and writing to Vora Disk Engine.
Configuration Parameters
Parameter | Type | Description
---|---|---
connectionType | string | The connection to SAP Vora can be configured directly using "dsn" or indirectly using "connection". Default: "dsn"
dsn | string | A valid data source name in the format v2://host:port/?binary=true. Make sure that you add /?binary=true to the end, because only binary transfer is available for the SAP Vora Transaction Coordinator. Default: "v2://localhost:2204/?binary=true"
user | string | The user name if the connection is configured using dsn. Default: ""
password | string | The password if the connection is configured using dsn. Default: ""
connection | object | A valid connection configuration provided by the Connection Manager.
aggregation | bool | Enables the automatic aggregation of records to trigger a series of bulk inserts independently of the number of records contained in each Avro message. Default: false
aggregateMaxBytes | int | Limits the maximum size in bytes of the aggregated records in auto-aggregation mode. Records are aggregated until this limit is reached. Default: 4194304
aggregateMaxRecs | int | Limits the maximum number of aggregated records in auto-aggregation mode. Records are aggregated until this limit is reached. Default: 1000
aggregateMaxTime | int | Limits the maximum time in milliseconds to wait before flushing aggregated records that have not reached the size constraints. Default: 2000
databaseSchema | string | The database schema name. Default: "TPCH"
engineType | string | The engine type ("DISK" or "SERIES"). Default: "DISK"
partitionKeyRegex | string | A regular expression that selects a sequence of Avro record field names to be bound to the arguments of the specified partition function. When generating the partitioning scheme (the actual binding of arguments to the parameters of the partition keys), the system uses this regular expression to match the actual columns to the parameters of the partition scheme. Example: given a table T(col1 VARCHAR(500), col2 BIGINT, col3 DATE) and a hash partition function pf(pa, pb), setting partitionKeyRegex to .*1\|.*3 selects col1 and col3 to be bound to the parameters pa and pb, respectively (see the sketch after this table). Default: ""
partitionCriterion | string | A partition function. Default: ""
tableType | string | The table type ("STREAMING"). Default: "STREAMING"
ingestionMode | string | The ingestion mode ("INSERT" or "UPSERT"). Default: "INSERT"
primaryKeyRegex | string | A regular expression that selects the Avro record field names to be used as the primary keys. The primary keys can also be specified in the Avro schema using the extension property "primaryKey"; this parameter is only useful when they are not specified there. Default: ""
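The following is a minimal configuration sketch, written as a plain Python dict purely for illustration; the host, user name, and chosen values are assumptions, and in practice these parameters are set in the operator's configuration.

```python
# Hypothetical Vora Ingestor configuration, shown as a dict for illustration only.
vora_ingestor_config = {
    "connectionType": "dsn",
    # Only binary transfer is supported, so "/?binary=true" must be appended.
    "dsn": "v2://localhost:2204/?binary=true",
    "user": "INGEST_USER",       # illustrative value
    "password": "********",      # illustrative value
    "databaseSchema": "TPCH",
    "engineType": "DISK",
    "tableType": "STREAMING",
    "ingestionMode": "INSERT",
    # Auto-aggregation: flush when 4 MiB, 1000 records, or 2 seconds is reached,
    # whichever comes first.
    "aggregation": True,
    "aggregateMaxBytes": 4194304,
    "aggregateMaxRecs": 1000,
    "aggregateMaxTime": 2000,
}
```

The next sketch illustrates how a regular expression such as the partitionKeyRegex example above selects field names. The helper function and the full-match semantics are assumptions made for this illustration, not the operator's actual implementation.

```python
import re

def select_fields(field_names, pattern):
    """Return the field names matched by the given regular expression.

    Illustrative only: the Vora Ingestor performs this kind of matching
    internally when binding columns to partition-function parameters or
    primary keys.
    """
    if not pattern:
        return []
    regex = re.compile(pattern)
    return [name for name in field_names if regex.fullmatch(name)]

# Columns of the example table T(col1 VARCHAR(500), col2 BIGINT, col3 DATE).
columns = ["col1", "col2", "col3"]

# partitionKeyRegex = ".*1|.*3" binds col1 and col3 to the partition
# function's parameters pa and pb, in that order.
print(select_fields(columns, r".*1|.*3"))   # ['col1', 'col3']

# primaryKeyRegex works the same way when the Avro schema does not
# declare a "primaryKey" extension property.
print(select_fields(columns, r"col1"))      # ['col1']
```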
Input
Input | Type | Description
---|---|---
in | message | Accepts record messages to be processed.
Output
Output | Type | Description
---|---|---
out | message | Messages with header properties that indicate the commit progress. If the input message does not contain a commit token (that is, the message header message.commit.token), no output is generated.
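To illustrate the commit-token behavior, the sketch below shows a hypothetical input message as a plain Python dict. Only the attribute names "vora.record.definition" and "message.commit.token" come from this documentation; the message layout and field values are assumptions.

```python
# Hypothetical input record message, shown as a plain dict for illustration.
incoming_message = {
    "Attributes": {
        # Column metadata, typically produced by the SAP Vora Avro Decoder.
        "vora.record.definition": "...",
        # Because a commit token is present, the Ingestor emits a message on
        # "out" whose header properties indicate the commit progress.
        "message.commit.token": "token-42",
    },
    # One record per row, matching the column definitions above (placeholder values).
    "Body": [
        ["sensor-1", 1700000000, 21.5],
        ["sensor-2", 1700000001, 19.8],
    ],
}

# If "message.commit.token" were absent, the records would still be ingested,
# but no message would be produced on the "out" port.
```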