Acquiring data from a text file

You can acquire data from one or more text files if the data is stored with delimiters or in fixed-width columns. An example of a text file that uses delimiters is a comma-separated values (.csv) file.

A .csv file stores numbers and text in plain-text format. Each record consists of fields usually separated by a comma or a tab, and records are separated by line breaks. Here is an example of a .csv file, with data separated by commas:

"Product","Country","Year","Quantity","Margin"
"Skis","Italy","2013","1,297","1,929"
"Computers","China","2014","609","10,659"

Acquiring data from this .csv file results in five columns in the dataset: "Product", "Country", "Year", "Quantity", and "Margin". Column 2, in this example, has the header "Country" and contains the values "Italy" and "China".
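
To illustrate how a delimited file maps to columns, here is a minimal Python sketch that reads the sample above with the standard csv module. It is not how SAP Lumira itself is implemented. The names SAMPLE_CSV and read_columns are hypothetical, and the sketch assumes the first row supplies the column names.

```python
import csv
import io

# The sample file from the text, held in memory so the sketch is self-contained.
SAMPLE_CSV = '''"Product","Country","Year","Quantity","Margin"
"Skis","Italy","2013","1,297","1,929"
"Computers","China","2014","609","10,659"
'''


def read_columns(text):
    """Return a dict mapping each column name to the list of values in that column."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)  # assumption: the first row holds the column names
    columns = {name: [] for name in header}
    for record in reader:
        for name, value in zip(header, record):
            columns[name].append(value)
    return columns


columns = read_columns(SAMPLE_CSV)
print(list(columns))        # the five column names
print(columns["Country"])   # the values in column 2
print(columns["Quantity"])  # quoted values keep their embedded commas
```

Because each field is enclosed in quotation marks, a value such as "1,297" is read as a single field even though it contains the delimiter character.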

Here is an example of a text file with the data stored in fixed-width columns:

Product   Country   Year      Quantity  Margin
Skis      Italy     2013      1,297     1,929
Computers China     2014      609       10,659

You can acquire data from multiple files into a single dataset. The files must have the same format and data types.

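
The requirement that files share one format can be pictured with the following Python sketch. It assumes that records from each file are appended under a common header and that a file with different columns is rejected; the source does not describe how SAP Lumira handles a mismatch, so that behavior is an assumption. The function combine_files and the file names are hypothetical.

```python
import csv
import io


def combine_files(sources):
    """Append records from several delimited sources that share one header.

    `sources` is a list of (name, text) pairs. Raises ValueError if a
    source has a different header from the first one (assumed behavior).
    """
    header = None
    rows = []
    for name, text in sources:
        reader = csv.reader(io.StringIO(text))
        file_header = next(reader)
        if header is None:
            header = file_header
        elif file_header != header:
            raise ValueError(f"{name}: columns {file_header} do not match {header}")
        rows.extend(reader)  # the remaining lines are data records
    return header, rows


# Hypothetical file names and contents, kept in memory for the example.
sources = [
    ("sales_2013.csv", '"Product","Country","Year"\n"Skis","Italy","2013"\n'),
    ("sales_2014.csv", '"Product","Country","Year"\n"Computers","China","2014"\n'),
]
header, rows = combine_files(sources)
print(header)
print(rows)
```

A source whose header differs, for example one with only "Product" and "Year", would raise ValueError in this sketch rather than being merged.
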
Table 1: Add new dataset dialog options for text files

Option
    Description

Dataset Name
    The name of the dataset.

File(s)
    The file or files that contain the data for the new dataset. You can import data from one or multiple files. To specify multiple files, separate the file paths in the File(s) field with semicolons, or select Add Files and choose one or more files to add to the selection.

Separator
    Choose whether data in your files is separated by delimiters or is entered in fixed-width columns. Delimiters are symbols, such as commas, tabs, or spaces, that separate fields in the data source and that determine the columns of the dataset in SAP Lumira.

Set first row as column names
    Select this check box to use the first row of data as column names in the dataset. Clear this check box to use the default column names ("Column1", "Column2", and so on).

Advanced Options > Number format
    The format for numeric columns in the dataset.

Advanced Options > Date format
    The format for date columns in the dataset.

Advanced Options > Break Column
    When you acquire data stored as fixed-width columns, SAP Lumira analyzes the data file and suggests column widths (in characters) for separating the data into columns in the dataset.
    If the suggested widths aren’t suitable, you can change them by entering values separated by commas. For example, if your data is in three columns and the column widths are 5, 10, and 15 characters, enter 5,10,15 in the Break Column box, and select Apply to see a preview of the resulting dataset.

Advanced Options > Trim leading spaces
    Select this check box to remove leading and trailing spaces from numbers and text in the dataset so that column headers do not appear as empty fields. For example, if a "Product" entry has a leading space (" Product"), the space is removed and "Product" appears as the column header.
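
The Break Column and trimming options can be sketched in Python as follows. This is an illustration only, not SAP Lumira's implementation. The functions parse_widths and break_columns are hypothetical, and the widths 10,10,10,10,6 are an assumption read off the fixed-width sample shown earlier in this topic.

```python
def parse_widths(text):
    """Parse a comma-separated width list such as "5,10,15" into integers."""
    return [int(part) for part in text.split(",")]


def break_columns(line, widths):
    """Split one fixed-width line into fields using a list of column widths."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width])
        start += width
    return fields


# The fixed-width sample from this topic, one string per line of the file.
SAMPLE = [
    "Product   Country   Year      Quantity  Margin",
    "Skis      Italy     2013      1,297     1,929",
    "Computers China     2014      609       10,659",
]

widths = parse_widths("10,10,10,10,6")  # assumed widths for this sample
raw = [break_columns(line, widths) for line in SAMPLE]

# Trimming removes the padding spaces that fixed-width storage adds.
trimmed = [[field.strip() for field in row] for row in raw]

for row in trimmed:
    print(row)
```

Before trimming, the first field of the last row is "Computers " with one padding space; after trimming it is "Computers".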

  1. On the Home page, select Acquire Data.
  2. In the Add new dataset dialog, select Text, and select Next.
  3. Choose one or more text files, and select Open.
    Data from the files is previewed in the Add new dataset dialog.
  4. (Optional) Adjust the dataset options in the dialog as needed.
  5. Select Create.

The Visualize room opens, and you can start building charts and analyzing data. If you want to modify the dataset first, switch to the Prepare room.