Modeling Guide

CSV Ingest3 via Disk

This sample graph is similar to the sample graph csv_ingestion_example2_disk, but it uses an Avro schema with additional extension attributes to customize the table definition.

The table definition is derived from the Avro schema configured on the preingestor:
{
  "name": "sample_demo_deep_record3",
  "type": "record",
  "fields": [
    {"name": "idx", "type": "int", "colName": "_idx"},
    {"name": "code", "type": "string", "maxLength": 5},
    {"name": "magnitude", "type": "double"},
    {"name": "name", "type": "string", "maxLength": 32},
    {"name": "coordinates", "type": "record", "fields": [
      {"name": "latitude", "type": "double"},
      {"name": "longtitude", "type": "double"}]},
    {"name": "ts", "type": "long", "logicalType": "timestamp-millis"},
    {"name": "status", "type": "boolean"}
  ]
}
In the Avro schema above, the colName attribute customizes the column name and the maxLength attribute sets the size limit of a string column. With these attributes, the resulting table columns are:
_idx INTEGER, code VARCHAR(5), magnitude DOUBLE, name VARCHAR(32), coordinates_latitude DOUBLE, coordinates_longtitude DOUBLE, ts TIMESTAMP, status BOOLEAN
Without these attributes, the resulting table columns would be:
idx INTEGER, code VARCHAR(*), magnitude DOUBLE, name VARCHAR(*), coordinates_latitude DOUBLE, coordinates_longtitude DOUBLE, ts TIMESTAMP, status BOOLEAN
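
The mapping from schema fields to table columns can be illustrated with a short Python sketch. This is not the preingestor's implementation; it only reproduces the two column lists shown above. The function name derive_columns and the use_extensions flag are hypothetical, the type mapping covers only the types that appear in this example, and applying colName inside a nested record is an assumption that this guide does not confirm.

```python
import json

# The Avro schema from this guide, including its extension attributes.
SCHEMA_JSON = """
{"name": "sample_demo_deep_record3", "type": "record", "fields": [
  {"name": "idx", "type": "int", "colName": "_idx"},
  {"name": "code", "type": "string", "maxLength": 5},
  {"name": "magnitude", "type": "double"},
  {"name": "name", "type": "string", "maxLength": 32},
  {"name": "coordinates", "type": "record", "fields": [
    {"name": "latitude", "type": "double"},
    {"name": "longtitude", "type": "double"}]},
  {"name": "ts", "type": "long", "logicalType": "timestamp-millis"},
  {"name": "status", "type": "boolean"}]}
"""

# Type mapping inferred only from the column lists in this guide.
SIMPLE_TYPES = {"int": "INTEGER", "double": "DOUBLE", "boolean": "BOOLEAN"}


def derive_columns(fields, prefix="", use_extensions=True):
    """Return column definitions for a list of Avro fields (illustrative only)."""
    columns = []
    for field in fields:
        name = field["name"]
        if use_extensions and "colName" in field:
            name = field["colName"]  # colName overrides the column name
        name = prefix + name
        ftype = field["type"]
        if ftype == "record":
            # Nested records are flattened as <record>_<field>.
            columns.extend(
                derive_columns(field["fields"], name + "_", use_extensions))
        elif ftype == "string":
            # maxLength sets the VARCHAR size; without it the size is "*".
            size = field.get("maxLength", "*") if use_extensions else "*"
            columns.append(f"{name} VARCHAR({size})")
        elif ftype == "long" and field.get("logicalType") == "timestamp-millis":
            columns.append(f"{name} TIMESTAMP")
        elif ftype in SIMPLE_TYPES:
            columns.append(f"{name} {SIMPLE_TYPES[ftype]}")
        else:
            raise ValueError(f"type not covered by this sketch: {ftype!r}")
    return columns


schema = json.loads(SCHEMA_JSON)
print(", ".join(derive_columns(schema["fields"])))
print(", ".join(derive_columns(schema["fields"], use_extensions=False)))
```

The first print call yields the column list with the extension attributes applied, and the second yields the list without them, so the effect of colName and maxLength can be compared side by side.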

Prerequisites

You need a running SAP Vora instance.

Configure and Run the Graph

Follow the steps below to run the example from the Data Pipeline UI:
  1. In the left panel, select the Graphs tab and navigate to com/sap/demo/vora/ingestion/csv_ingestion_example3_disk.
  2. Check the dsn configuration of the ingestor node.
  3. In the tool bar, select Run (play button).
  4. The Status panel indicates whether the graph is running.
  5. Use the context menu Open UI of the Wiretap node to open the wiretap.
  6. The wiretap opens and you see the commit tokens.
  7. Stop the graph, change the generator's batchSize, and run the graph again.