HDFS

The Data Generator produces arbitrary data.

The Write File node writes the data to files on HDFS. The files are stored in the directory /tmp/demo/ on the configured HDFS instance. The file names follow the scheme file_<counter>.txt.

The Read File node reads these files, and their content is written to a terminal.
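
The data flow described above can be sketched outside the pipeline as a small Python script. This is only an illustration under stated assumptions: a temporary local directory stands in for HDFS, the function names generate_data, write_file, read_file and run_demo are hypothetical, the payload text is made up, and the counter is assumed to start at 0. It does not reproduce the actual operators.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def generate_data(counter: int) -> str:
    """Stand-in for the Data Generator: returns an arbitrary payload."""
    return f"sample payload {counter}\n"


def write_file(directory: Path, counter: int, content: str) -> Path:
    """Stand-in for Write File: stores content as file_<counter>.txt."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"file_{counter}.txt"
    path.write_text(content, encoding="utf-8")
    return path


def read_file(path: Path) -> str:
    """Stand-in for Read File: returns the content of a written file."""
    return path.read_text(encoding="utf-8")


def run_demo(root: Path, count: int = 3) -> List[Tuple[str, str]]:
    """Run generate -> write -> read and return (file name, content) pairs."""
    # Assumption: the local folder <root>/tmp/demo mimics /tmp/demo/ on HDFS.
    target = root / "tmp" / "demo"
    results = []
    for counter in range(count):  # assumption: the counter starts at 0
        written = write_file(target, counter, generate_data(counter))
        results.append((written.name, read_file(written)))
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for name, content in run_demo(Path(root)):
            # Stand-in for the Terminal node: print file name and content.
            print(f"{name}: {content}", end="")
```

Keeping the three stages as separate functions mirrors the one-node-per-task layout of the graph, so each stage can be checked on its own.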

Configure and Run the Graph

Follow the steps below to run the example from the Data Pipeline UI:

In the left panel, select the Graphs tab and navigate to com/sap/demo/hdfs.
Check the configuration of the Write File node: hadoopNamenode and hadoopUser.
Check the configuration of the Read File node: hadoopNamenode and hadoopUser.
In the toolbar, select Run (play button).
The Status panel indicates whether the graph is running.
In the context menu of the Terminal node, select Open UI to open the terminal.
The terminal opens and you see the produced files and their content.

Optional Step: Check the files on HDFS

Connect to HDFS and browse the directory /tmp/demo/ to see the produced files.
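
One possible way to browse the directory is the WebHDFS REST API, which offers a LISTSTATUS operation. The following Python sketch assumes that WebHDFS is enabled on the cluster and reachable over HTTP. The host name namenode.example.com, the port 9870, and the user hdfs are placeholders, and the HTTP address may differ from the value configured in hadoopNamenode. The helper names are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def liststatus_url(namenode: str, path: str, user: str) -> str:
    """Build a WebHDFS LISTSTATUS URL for the given HDFS directory."""
    query = urlencode({"op": "LISTSTATUS", "user.name": user})
    return f"http://{namenode}/webhdfs/v1{quote(path)}?{query}"


def file_names(response_body: str) -> list:
    """Extract the names of regular files from a LISTSTATUS JSON response."""
    statuses = json.loads(response_body)["FileStatuses"]["FileStatus"]
    return sorted(s["pathSuffix"] for s in statuses if s["type"] == "FILE")


def list_directory(namenode: str, path: str, user: str) -> list:
    """Query WebHDFS and return the file names found in the directory."""
    with urlopen(liststatus_url(namenode, path, user)) as response:
        return file_names(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder values (assumptions); replace them with your cluster's
    # WebHDFS HTTP address and user before calling list_directory.
    print(liststatus_url("namenode.example.com:9870", "/tmp/demo/", "hdfs"))

    # Hand-written sample response, used here instead of a live request.
    sample = json.dumps({"FileStatuses": {"FileStatus": [
        {"pathSuffix": "file_1.txt", "type": "FILE", "length": 12},
        {"pathSuffix": "file_0.txt", "type": "FILE", "length": 12},
    ]}})
    print(file_names(sample))
```

If the Hadoop command-line client is installed and configured, listing the directory with hdfs dfs -ls /tmp/demo/ is an alternative.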