Variable Statistical Types

A variable can have one of the following statistical types:
Type Description Example
Continuous

Values are numerical, continuous, and sortable. They can be used to calculate measures such as the mean or the variance.

During modeling, a continuous variable may be grouped into significant discrete bins.

The variable <salary> is both a numerical variable and a continuous variable. It may, for example, take on the following values: <$1,050>, <$1,700>, or <$1,750>.

The mean of these values may be calculated.
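
As a minimal sketch of the <salary> example, the measures can be computed with Python's standard library. The bin edge and bin labels below are hypothetical and chosen only for illustration; they are not the bins a predictive model would derive during modeling.

```python
from bisect import bisect_right
from statistics import mean, pvariance

# Salary values from the example, in dollars.
salaries = [1050, 1700, 1750]

salary_mean = mean(salaries)           # arithmetic mean of the three values
salary_variance = pvariance(salaries)  # population variance of the three values

# Hypothetical bin edge and labels (assumption, for illustration only).
BIN_EDGES = [1500]
BIN_LABELS = ["below $1,500", "$1,500 and above"]

def salary_bin(value):
    """Map a continuous salary value to a discrete bin label."""
    return BIN_LABELS[bisect_right(BIN_EDGES, value)]

binned = [salary_bin(s) for s in salaries]
print(salary_mean, salary_variance, binned)
```

Here the continuous values keep their numeric meaning for the mean and variance, while the binned version treats them as a small set of ordered categories.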

Ordinal

Values are discrete. They can be regrouped into categories and are sortable.

Ordinal variables may be:
  • Numerical: the values are numbers and they are ordered according to the natural number system (0, 1, 2, and so on).
  • Textual: the values are character strings. They are ordered according to alphabetic conventions.
The variable <school grade> is an ordinal variable. Its values belong to distinct categories and can be sorted. This variable can be:
  • numerical, if its values range between <0> and <20>,
  • textual, if its values are <A>, <B>, <C>, <D>, <E>, and <F>.
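
The two orderings can be sketched in Python as follows. The sample grades are hypothetical, and Python's built-in string comparison stands in for "alphabetic conventions" here, which is an assumption about the ordering rule rather than a description of the product's behavior.

```python
# Hypothetical numerical grades between 0 and 20, ordered by numeric value.
numeric_grades = [12, 7, 20, 0]
sorted_numeric = sorted(numeric_grades)

# Hypothetical textual grades, ordered alphabetically.
letter_grades = ["C", "A", "F", "B"]
sorted_letters = sorted(letter_grades)

# The same numeric grades stored as character strings: alphabetic ordering
# compares character by character, so it no longer matches numeric order.
grades_as_text = ["12", "7", "20", "0"]
sorted_as_text = sorted(grades_as_text)

print(sorted_numeric)   # [0, 7, 12, 20]
print(sorted_letters)   # ['A', 'B', 'C', 'F']
print(sorted_as_text)   # ['0', '12', '20', '7']
```

The last case shows why it matters whether an ordinal variable is declared as numerical or textual: the same values sort differently under each convention.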
Nominal

Values are discrete. They can be regrouped into categories.

Caution
Binary variables (variables with only 2 distinct values) are considered nominal variables. They are the variables that can be used as the target of a classification predictive model.

The variable <zip code> is a nominal variable. The values that this variable may assume are clearly distinct, non-ranked categories, although they happen to be represented by numbers. For example: <10111>, <20500>, or <90210>.

The variable <eye color> is a nominal variable. The values that this variable may assume are clearly distinct, non-ordered categories, and are represented by character strings. For example: <blue>, <brown>, or <black>.
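
A minimal Python sketch of these ideas follows. The helper `is_binary` and the <churned> variable are hypothetical names introduced for illustration; the zip codes are kept as strings to reflect that they are categories rather than quantities.

```python
from collections import Counter

def is_binary(values):
    """Return True when the variable has exactly two distinct values."""
    return len(set(values)) == 2

# Zip codes look like numbers but are categories, so they are stored as strings.
zip_codes = ["10111", "20500", "90210"]
eye_colors = ["blue", "brown", "black", "blue"]
churned = ["yes", "no", "no", "yes"]  # hypothetical binary variable

# Counting occurrences per category is meaningful for a nominal variable;
# averaging or ranking its values is not.
eye_color_counts = Counter(eye_colors)

print(is_binary(zip_codes), is_binary(eye_colors), is_binary(churned))
print(eye_color_counts["blue"])
```

Under this sketch, only <churned> would qualify as a candidate target for a classification predictive model, because it is the only variable with exactly two distinct values.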

Textual
Note
These variables are currently not supported by Smart Predict, and are therefore excluded from the training of a predictive model.
A type of nominal variable containing phrases, sentences, or complete texts. Textual variables are used for text analyses.

For example, the variable <Bluetooth Headphones Customer Feedback> is a textual variable. The values for this variable can be <Durable cord, connect easy to phone and plug.>, <Great fit and great sound!>, or <Great length and color. Super fast charging.>.
Note
During training, the values of the categorical variables are regrouped into homogeneous categories. These categories are then ordered as a function of their relative contribution with respect to the values of the target variable. For more information, see Category Influence.
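
The idea of ordering categories by their relationship to the target can be illustrated with a simple Python sketch. This is not the method Smart Predict uses: the toy data, the function name `target_rate_by_category`, and the choice of the positive-target rate as the ordering measure are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (eye color, binary target) pairs.
rows = [
    ("blue", 1), ("blue", 1),
    ("brown", 0), ("brown", 1),
    ("black", 0), ("black", 0),
]

def target_rate_by_category(rows):
    """Return the share of positive targets observed for each category."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for category, target in rows:
        totals[category] += 1
        positives[category] += target
    return {category: positives[category] / totals[category] for category in totals}

rates = target_rate_by_category(rows)

# Order categories from the highest to the lowest positive-target rate.
ordered_categories = sorted(rates, key=rates.get, reverse=True)
print(ordered_categories)  # ['blue', 'brown', 'black']
```

Categories with similar rates could then be merged into one group, which is one simple way to picture "homogeneous categories" with respect to the target.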