Models vs. Datasets
Here's information to help you decide between a model and a dataset.
Dataset | Acquired-data Model |
---|---|
Best for ad-hoc data (ungoverned) | Best for governed data |
Create in a story, or separately | Create outside of a story |
No data is deleted during data preparation | Data is overwritten during data preparation |
No row-level security | Row-level security |
Data is stored as a table, plus separate metadata | Data is stored as a star schema |
Limited data management of dimensions | Fine-grained data management of dimensions |
If all you want to do is upload some data, for example in .csv or .xlsx format, and start analyzing it immediately in a story, a dataset will probably be the right choice.
You can change the data or the data structure simply by editing the dataset.
Example: Say you want to change the data type of a field from a Dimension to a Measure. In a dataset, only the metadata definition of that column needs to change. In a model, you would have to delete the dimension table and update the fact table to include an additional column, which is more time-consuming.
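The difference in effort can be illustrated with a minimal sketch. The structures below are hypothetical stand-ins (plain Python dictionaries with made-up column names such as `Discount` and `Region_id`), not the actual internal storage of datasets or models. They only show why a metadata-only change is cheaper than restructuring a star schema.

```python
# Illustrative only: hypothetical structures, not the product's internal storage.

# Dataset: one flat table plus separate metadata describing each column.
dataset = {
    "rows": [
        {"Region": "EMEA", "Discount": 5, "Revenue": 100.0},
        {"Region": "APAC", "Discount": 10, "Revenue": 250.0},
    ],
    "metadata": {"Region": "dimension", "Discount": "dimension", "Revenue": "measure"},
}


def dataset_to_measure(ds, column):
    """Reclassify a column: only the metadata entry changes; the rows are untouched."""
    ds["metadata"][column] = "measure"


# Model: star schema with one table per dimension and a fact table
# that holds dimension keys and measure values.
model = {
    "dimensions": {
        "Region": {1: "EMEA", 2: "APAC"},
        "Discount": {1: 5, 2: 10},
    },
    "fact": [
        {"Region_id": 1, "Discount_id": 1, "Revenue": 100.0},
        {"Region_id": 2, "Discount_id": 2, "Revenue": 250.0},
    ],
}


def model_to_measure(m, column):
    """Reclassify a column: delete its dimension table and rewrite every fact row."""
    members = m["dimensions"].pop(column)  # delete the dimension table
    for row in m["fact"]:
        key = row.pop(column + "_id")      # remove the key that pointed to the dimension
        row[column] = members[key]         # add the value as a new measure column


dataset_to_measure(dataset, "Discount")
model_to_measure(model, "Discount")
```

In the dataset case a single metadata entry is updated, while in the model case every row of the fact table has to be touched, which is where the extra time goes.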
If you prefer to start by defining the data structure, a model will probably be the right choice.
Typical examples of when a model is suitable are:
- Connecting to live data.
- Planning use cases, where the planner already has the structure in mind and would then either input the data or import it from different sources to fit into the model.
- Governed data that IT owns and wants to share with others.
Models guarantee that the data they hold follows a set of business rules, which ensures that workflows such as planning can be run. You can change the structure of a model either directly at the structure level, if the fact table is empty, or by rebuilding the model from the original data preparation session.
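The rule for structural changes can be sketched as follows. This is a simplified illustration under stated assumptions: the `Model` class, `add_column`, and `rebuild` are hypothetical names, and a list of source rows stands in for the original data preparation session.

```python
# Hypothetical sketch; these names are not part of any product API.

class Model:
    def __init__(self, columns):
        self.columns = list(columns)
        self.fact_rows = []

    def add_column(self, name):
        """Change the structure in place, which is only allowed while the fact table is empty."""
        if self.fact_rows:
            raise ValueError("fact table is not empty; rebuild the model instead")
        self.columns.append(name)

    @classmethod
    def rebuild(cls, columns, source_rows):
        """Recreate the model with a new structure from the original source rows."""
        model = cls(columns)
        model.fact_rows = [{c: row.get(c) for c in columns} for row in source_rows]
        return model


# Stand-in for the data held by the original data preparation session.
source = [{"Region": "EMEA", "Product": "A", "Revenue": 100.0}]

# Empty fact table: the structure can be changed directly.
empty = Model(["Region", "Revenue"])
empty.add_column("Product")

# Populated fact table: the in-place change is rejected, so the model is rebuilt.
loaded = Model.rebuild(["Region", "Revenue"], source)
try:
    loaded.add_column("Product")
    needs_rebuild = False
except ValueError:
    needs_rebuild = True

rebuilt = Model.rebuild(["Region", "Product", "Revenue"], source)
```

The guard in `add_column` mirrors the condition that the fact table must be empty. Once data is loaded, the only path in this sketch is to rebuild from the source.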
Models also support row-level security, fine-grained data management of dimensions, and fact tables.