Import and Prepare Fact Data for a Classic Account Model
You are working with a classic account model and want to import fact data to your model. The process can be run once on demand, or scheduled to be executed on recurring basis.
-
Select
or depending where your data is stored. - Select your file, if your data is stored in a local file or connect to your data source.
- Select the data you want to import .
The uploaded draft data appears in either the Draft Data list or the Draft Sources list on the Data Management screen, and expires 7 days after the upload.
After you perform data integration on the draft data, the draft source is identified as one of the Draft Sources on the Data Management screen. You can set schedules for different data sources as necessary.
It's the step where you can wrangle your data: Rename your columns, create transformations, etc, and resolve quality issues in your data. Click on your draft source to edit your data.
Selecting Dimension Types
When importing data to a new model, columns containing text are typically identified as Dimensions, and numeric data may be identified as Measures. If the assigned dimension type isn't appropriate, you can change it to another type; for example, to a Date or Organization dimension. For details, see Learn About Dimensions and Measures.
Location Dimensions
If your data contains columns of latitude and longitude coordinate data, or state and county data (for the USA only), you can create a Location dimension. Select the (Geo Enrichment) icon under Transformations, and choose either By Area Name or By Coordinates.
When creating locations using coordinate data, location dimensions can be based on a combination of latitude and longitude pairs. You also have the option to specify a column as an identifier for the location dimension, and another column as the location description.
For more information, see Creating a Model with Coordinate or Area Data for Geospatial Analysis.
Maximum Number of Dimension Members
When importing data into a new model, there is an upper limit applied to the number of unique members in a dimension, to maintain good performance. See System Requirements and Technical Prerequisites.
However, for non-planning-enabled models only, you can import dimensions with more than the maximum number of members, with these restrictions:
- In the data integration view, these dimensions cannot have any dimension attributes added to them, such as description or property.
- Once the dimensions are imported into the Modeler, they will have only one ID column, and will be read-only.
- The dimensions can't be used as exception aggregation dimensions or required dimensions.
- The dimensions can't be referenced in formulas.
Calculated Columns
While you're preparing data in the data integration view, you can create a calculated column based on input from another column and the application of a formula expression.
For a listing of supported functions, see Supported Functions for Calculated Columns.
Dimensions Attributes
Dimension attributes are information that is not suitable to be standalone dimensions.
For example, if you have a Customers column, and a Phone Numbers column, you could set Phone Numbers to be an attribute of the Customers dimension.
You can use dimension attributes to create charts, filters, calculations, input controls, linked analyses, and tables (with “import data” models only).
These are the available dimension attributes:
- Description: The column can be used for descriptive
labels of the dimension members when the member IDs (the unique identifiers for
the dimension members) are technical and not easily understandable.
For example, if your imported data contains a pair of related columns Product_Description and Product_ID, containing data descriptions and data identifiers, you can set Product_Description to be the Description attribute for the Product_ID dimension. Note that the Product_ID column would then need to contain unique identifiers for the dimension members.
ID
Description
X123
Computer and equipment
Y456
Books
- Property: The column represents information that is related to the dimension; for example, phone numbers.
- Parent-Child Hierarchy (Parent): The column is the parent of the
parent/child hierarchy pair.
For example, if your imported data contains the two columns Country and City, you can set the Country column to be the parent of the Country-City hierarchy.
The hierarchy column is a free-format text attribute where you can enter the ID value of the parent member. By maintaining parent-child relationships in this way, you can build up a data hierarchy that is used when viewing the data to accumulate high-level values that can be analyzed at lower levels of detail.
You can also create level-based hierarchies when your data is organized into levels, such as Product Category, Product Group, and Product. When the data is displayed in a story, hierarchies can be expanded or collapsed. In the toolbar, select (Level Based Hierarchy). For more information about hierarchies, see Learn About Hierarchies.
- Currency: If you set a column to be an Organization dimension, the Currency attribute is available. The Organization dimension offers an organizational analysis of the account data, based, for example, on geographic entities. You can add the Currency attribute to provide currency information for the geographic entities.
Data Quality Considerations
During data import, anomalies in your data can prevent the data from being imported properly, or prevent the model from being created. If any of the following issues are found in your data, those data cells are highlighted, and messages in the Details panel explain the issues. You'll need to resolve these issues before the data can be fully imported or the model can be created.
- In account dimensions only, dimension member IDs cannot contain these restricted characters , ; : ' [ ] =.
- Numeric data cells in measures cannot contain non-numeric characters, and scientific notation is not supported.
- When importing data to an existing model, all dimensions of the existing model need to be mapped.
- When importing data to an existing model, cells in a column that is mapped to an existing dimension must match the existing dimension members. Unmatched values will result in those rows being omitted from the model.
- For stories, if any member IDs are empty, you can type values in those cells, or select Delete empty rows in the Details panel to remove those rows.
- When creating a new model, if member IDs are empty, they are automatically filled with the “#” value if you select the Fill applicable empty ID cells with the "#" value option. Otherwise, those rows are omitted from the model.
- In dimensions and properties, a single member ID cannot correspond to different
Descriptions in multiple rows (but a single Description can correspond to
multiple member IDs).
For example, if member IDs are employee numbers, and Descriptions are employee names, it's allowable to have more than one employee with the same name, but not allowable to have more than one employee with the same member ID.
- In a Date dimension column, cell values must match the format specified in the
Details panel. The following date formats are
supported:
- dd-mm-yy
- dd.mm.yy
- dd/mm/yy
- dd-mm-yyyy
- dd.mm.yyyy
- dd/mm/yyyy
- dd-mmm-yyyy
- dd-mmmm-yyyy
- mm-dd-yy
- mm.dd.yy
- mm/dd/yy
- mm-dd-yyyy
- mm.dd.yyyy
- mm/dd/yyyy
- mm.yyyy
- mmm yyyy
- yy-mm-dd
- yy.mm.dd
- yy/mm/dd
- yyyq
- yyyy
- yyyymm
- yyyy-mm
- yyyy.mm
- yyyy/mm
- yyyymmdd
- yyyy-mm-dd
- yyyy-mm/dd
- yyyy.mm.dd
- yyyy/mm-dd
- yyyy/mm/dd
- yyyy.mmm
- yyyyqq
Examples:
- mmm: JAN/Jan/jan
- mmmm: JANUARY/January/january
- q: 1/2/3/4
- qq: 01/02/03/04
- Latitude and longitude columns, from which location dimensions are created, must contain values within the valid latitude and longitude ranges.
- For planning-enabled models, in a hierarchy measure, non-leaf-node members are
not allowed.NoteFor analytic (non-planning-enabled) models only, non-leaf node members are allowed, but be aware of the effects of this behavior. Example: in an organizational chart that includes employee salaries, the Manager has her individual salary, and her staff members have their own salaries as well. In a visualization, do you expect the Manager’s data point to reflect her individual salary, or the sum of her staff members' salaries?
- Nodes in a hierarchy cannot be circular. For example, this hierarchy arrangement is not allowed: USA > California > San Francisco > California.
Mapping Considerations
-
At least one measure must be defined for the model.
If the source data doesn't contain a column that can be used directly as a measure, you can create a measure column based on a count of an existing column that contains 100% unique values. The generated measure column will contain the value 1 for all rows, representing the count of the unique values.
If you proceed to create the model without having created a measure column, a measure column will be generated automatically, based on the first column that contains 100% unique values.
To create the measure column yourself, select a column with 100% unique values, and then select Create Count in the Details panel.
If there isn't an existing column that contains 100% unique values, a measure column will be generated anyway when the model is created, containing the value 0 for all rows.
- There can be only one of each of these dimension types:
- Account
- Organization
- Version
Viewing the data imported to create a new model
During the initial data import, the data type of each column is inferred, as well as dimension type in the data integration view. In the right hand Details side panel, you will have a summary of the characteristics of the model.
- Under Layout, select the Card
View icon to display all the columns as separate cards. Each
card includes the column name, a specific summary of the data corresponding
to its type (mean value for numbers, date ranges for dates, number of unique
values for text, etc.), as well as a quality bar that shows in green the
proportion of values formatted correctly – those in red formatted
incorrectly and require action, and those in gray are empty.
You can sort the displayed cards in original, ascending, or descending order, or according to the model structure (listing all measures first, followed by dimensions). If you switch to (Grid View), the columns associated with your selected cards are automatically highlighted.
NoteUse the (Show/Hide) icon to limit the view to a few selected columns in either (Grid View) or Card View. A drop-down list of all the columns in the uploaded data is displayed under (Show/Hide). All columns listed with a check mark are displayed in the current view. Select a column name to show or hide the column. You can use the search tool to look for a specific column. You can also hide columns by directly selecting columns in the data integration view, and choosing . - The Details panel is divided into two sections:
- Model level: shows general information about the imported data, including any data-quality issues including mapping issues, and model-specific information. A Model Options section is provided to enable planning, set the default currency for the model, plus other options.
- Column level: allows you to map columns to existing dimensions in the model, displays histograms for the currently selected column (rectangles indicating frequency of data items in successive numerical intervals), lists the attribute and dimension types that you can apply to the column, and displays any data-quality issues represented as quality bars under Data Quality. The green section indicates correct values, while those in red indicate incorrect values. Those in gray are empty.
- When you select a column in (Grid
View) or Card
View, numerical or textual histograms are displayed under
Data Distribution.
- Text histograms are horizontal, and the values are clustered by
count. The number of clusters can be adjusted by dragging the slider
shown above the histogram. When a cluster contains more than one
value, the displayed count is the average count for the cluster. The
count is prefaced by a tilde symbol (~) if there are multiple
different occurrences. Expand the cluster for a more detailed view
of the values in the cluster along with individual counts. Use the
search tool to
look up specific column values, and press
Enter to initiate the search. When you
select a value in the histogram, the corresponding column is sorted
and the value is highlighted in the grid.NoteThe displayed histogram is determined by the data type of the column. A column with numbers could still be considered as text if the data type is set to text.
- Numerical histograms are vertical, and represent the range of values
along the x-axis. Hover over any bar to show the count, minimum, and
maximum values for the data in the bar. The number of bars can also
be adjusted by using the slider above the histogram. Toggling the
Show Outliers box includes or removes
outliers from the histogram.NoteBelow the histogram, there is a box and whisker plot to visualize the histogram's distribution of values.
- Text histograms are horizontal, and the values are clustered by
count. The number of clusters can be adjusted by dragging the slider
shown above the histogram. When a cluster contains more than one
value, the displayed count is the average count for the cluster. The
count is prefaced by a tilde symbol (~) if there are multiple
different occurrences. Expand the cluster for a more detailed view
of the values in the cluster along with individual counts. Use the
search tool to
look up specific column values, and press
Enter to initiate the search. When you
select a value in the histogram, the corresponding column is sorted
and the value is highlighted in the grid.
- You use level-based hierarchies when your data is organized into levels, such as Product Category, Product Group, and Product. When the data is displayed in a story, hierarchies can be expanded or collapsed. In the toolbar, select (Level Based Hierarchy). For more information, see Learn About Hierarchies.
- Select any of the following Model Options:
- Specify whether you want the model to be Planning Enabled. For information about model types, see Selecting the Right Model for Your Data.
- Specify whether you want empty ID cells to be filled with automatically generated values. If you clear this check box, rows with empty ID cells will be omitted.
Viewing and mapping data imported to an existing model
The Card View displays after you import data into an existing model. By default, cards representing dimensions and measures are displayed. Double-click a card to display attribute cards under the selected card.
Cards display as either mapped or unmapped entities. A card represents mapped data if it is shaded solid and has defined borders.
A card represents unmapped data if it does not have a border and appears transparent.
A red dot in the top right corner of the card indicates there are mapping errors that need to be addressed. A blue dot indicates that new values have been added from the import to the existing data.
Icon |
Data Type |
---|---|
Generic dimension | |
Date dimension | |
Account dimension | |
Measure | |
Version dimension | |
Organization dimension | |
Attribute only available for organization dimension | |
Description or property attributes (for example, unit, scale, or decimal) | |
Timestamp dimension | |
Hierarchy attribute |
When you select a card, the Details panel is populated with information specific to your selection – the number of rows, unique values, data types, and corresponding histograms. A pie chart is provided under Mapping Quality showing the proportion of Accepted versus Rejected values.
- Original Order: reflects the original order in the imported data source.
- A-Z: alphabetical order based on column headings.
- Z-A: reverse alphabetical order based on column headings.
- Mapped First: mapped columns listed alphabetically before unmapped columns.
- Unmapped First: unmapped columns listed alphabetically before mapped columns.
The following headings appear under color-coded progress bars indicating what percentage of the imported data was accepted: Dimensions, Attributes, and Measures. If a progress bar is not 100 percent blue, select the heading for more information on the mapping progress and a listing of unmapped entities.
Combining data with your acquired data
- Under Transformations, select the
(Combine Data) icon.
The Let's add some data! dialog is displayed.
- Choose how you want to add data:
- Import a file.
- Acquire data from a data source.
Once the new data is loaded, the Combine Data dialog is displayed. Under Combine Settings, a table representing all columns in the original data is displayed on the left, and a table listing columns from the new dataset is on the right.
- Choose the most appropriate columns in each table to combine data. Click
to select, or drag columns to the field provided under
Combine Column for each table. NoteThe columns should be the best match and should not contain duplicates. You can use up to three columns to join the two datasets.
- Select how you want to combine the columns:
- All primary data: maintains the non-intersecting data from the original dataset.
- Intersecting data only: omits any data from the original dataset that does not match in the new data.
The results for the column combination in terms of Accepted, Duplicated, Nulls, and Omitted values are provided under Combine Preview. A sample of combined data rows is displayed under Data Sample.
- Select Combine when you are ready to combine the
two tables. NoteCurrently, when you combine data:
- The number of combined cells cannot exceed 30,000,000.
- The number of combined rows cannot exceed 1,000,000.
- The number of combined columns cannot exceed 100.
- You cannot run more than ten combine data processes on a given dataset.
- You cannot combine two datasets using a calculated column.
- You cannot combine two datasets in a story.
Importing cross table data
- Once the acquired data (or sample) is loaded in the data integration
view, select the icon from the
Actions menu.
The Transpose Columns into Rows panel appears in Details.
- Under Transpose Columns select columns you want to include in the new table format. Choose Select All to include all the listed columns.
- Set Preview on grid on to preview the new format.
- Select OK to implement the transform and to
actualize the new format.NoteTo undo the transformation, open the Transformation Log and delete the associated entry.
- Determine which unmapped imported columns can be associated with specific transparent borderless cards in the card view.
- Drag and drop an unmapped imported column on the associated card.NoteIf you need to undo a mapping, hover over the card and select the (Cancel) button.
- Map the remaining imported columns.NoteThe final import cannot be started until all dimensions have been mapped to a corresponding card. You also need to map at least one imported measure. If there are fewer cards than imported columns, any extra columns will be ignored.NoteCheck Fiscal Year in Details to map imported fiscal period data into a date dimension in a model enabled to support fiscal year. Review the format listed under Fiscal Year Format and change it if required.
- To map an unmapped location dimension, select the corresponding card and from Map LocationsGeo by Area Name if your locations are based on area data, or the to display either the Map Latitude/Longitude dialog if your locations are based on coordinates. For coordinates, use the Latitude and Longitude fields to select corresponding columns. If your locations are based on area, select a column corresponding to Region and another for Sub-Region 1. choose
- Under Mapping Options select View all options to choose any of the following
options:
- Update dimension with new values: Select this option if you want to add and update dimension
members:
- New members in the dimension are added to the model.
- Changes to the description or properties of existing members are updated in the model.
- Changes to a hierarchy that has been previously defined are updated in the model.
Restriction- Dimension members can be updated if the dimension is private (i.e. not shared in multiple models). Updates to public dimensions are not supported.
- Account dimension members are updated only if a column in the imported dataset is mapped to that account dimension.
- Although changes to existing properties are updated, new properties are not added.
- Convert value symbol by account type: Choose how accounting data is
handled in this import. Special data types for accounting data are
available. These automatically apply negative signs to values in the
database as necessary, although the data shown on screen is shown as a
positive value (this applies to the types INC and LEQ – refer to the
Account dimension type for more information). When this option is
selected, the value symbol is matched to the account type. See Attributes of an Account Dimension and Combine Data with Your Acquired Data to display either
for details.
The signs (+/-) of INC and LEQ account values in a model are automatically reversed during data analysis when displayed in tables, charts, and so on. If you want your INC and LEQ values to show up as positive, import them as negative values, and vice-versa. Before importing data, make sure to check the signs of the original source values first, and then set the Reverse the Sign of the Data Based on Account Type option accordingly.
If you’re importing account values with the same signs and want to keep them unchanged during analysis, select this option. Your INC and LEQ values will be imported with reversed signs, too. For example, a positive gross sales value (an INC account value) in your draft data will be imported to the model as a negative value, and displayed for analysis as a positive value.
- Fill applicable empty ID cells with the "#" value: When creating a new model, if member IDs are empty, they are automatically filled with the “#” value if you select this option. Otherwise, those rows are omitted from the model.
- Update dimension with new values: Select this option if you want to add and update dimension
members:
- Select the Import Method to determine how existing data is handled.
A choice of three import methods is available.
NoteThese options affect measures and dimensions. To include both measures and dimensions, see Update and Schedule Models.Update Updates the existing data and adds new entries to the target model. The scope of this update is based on a combination of all dimensions in the target model. For a more refined scope, use either the Clean and replace selected version data or Clean and replace subset of data update options. Append Keeps the existing data as is and adds new entries to the target model. NoteWhen importing data into a planning model using the Append option, you can choose between aggregating duplicated rows, or rejecting them.Clean and replace selected version data Deletes the existing data and adds new entries to the target model, only for the versions that you specify in the import. You can choose to use either the existing version or specify a new version under Version. If you specify to import data for the "actual" version, only the data in the "actual" version is cleaned and replaced. Other versions, for example "planning", are not affected. Clean and replace subset of data Replaces existing data and adds new entries to the target model for a defined subset of the data based on a scope of selected versions using either the Existing Version or New Version buttons. You can also limit the scope to specific dimensions. To define a scope based on a combination of dimensions, select + Add Scope and use the Select a dimension field to specify a dimension.
When a Date dimension is defined in the scope, the time range in the source data (calculated based on the target model's granularity) combined with other dimensions in the scope will determine what existing data is cleaned and replaced.If for example, Date and Region dimensions are defined as part of a scope, only entries that fall within the time range and match Region from the source data will be replaced in the target model. Existing data that does not match the scope will be kept as is.NoteBlank dates (allowed in non-Planning models) are not part of the time range: They are matched directly in order to clean and replace data with blank dates in addition to data within the time range. - Choose which Category you want to use for the imported data.
- You use level-based hierarchies when your data is organized into levels, such as Product Category, Product Group, and Product. When the data is displayed in a story, hierarchies can be expanded or collapsed. On the toolbar, select (Level Based Hierarchy). For more information see Learn About Hierarchies.
For details about storing data in non-leaf members, see Entering Values with Multiple Hierarchies.
- a. If you're working with a dataset sample, choose Validate Data to apply your transforms across the entire
dataset and check for data quality errors. NoteAny newly detected errors will be retrieved in the sample you're working on, and appear under the Quality Issues or Model Requirements areas in the Details panel. You should resolve these errors before proceeding.
- b. Choose either Create Model to import the entire dataset and create your new model, or if you're adding
data to a model or story, choose Finish Mapping.NoteIf you choose to create the model without resolving detected errors, you'll be asked if you wish to continue. Depending on the types of errors, either entire rows or invalid cells will be omitted from the new model.NoteIf any rows in the dataset were rejected, you can select Download Rejected Rows to save the rejected rows as a .csv file. You can then examine this file to see which data was rejected. You could fix the data in the .csv file and then import the .csv file into the model using the Import Data workflow, or fix the data in the source system.NoteOnce you have saved the model, you can view the data mapping and transform history by selecting (View Mapping and Transforms). The Mapping Summary shows which columns in the data source were mapped to which attributes (for example, dimension properties or geo enrichment mapping) for the mapping targets (the dimensions and measures in the model). For an import job for a dimension, that dimension is the only mapping target. The Action Log shows the transforms, version mapping, and any renaming of source columns that were done during the data import. You can select a row in the mapping summary to show only the actions that apply to that mapped column.
Open this video in SAP Media Share
In this video, you will import data into an existing model, map your data to the model dimensions and measures, review the available mapping options and import methods, and validate your data.