hana_ml.model_storage

This module provides the features of model storage.

All these features are accessible to the end user via a single class:

exception hana_ml.model_storage.ModelStorageError

Bases: hana_ml.ml_exceptions.Error

Exception class used in Model Storage

class hana_ml.model_storage.ModelStorage(connection_context, schema=None, meta=None)

Bases: object

The ModelStorage class allows users to save, list, update, restore or delete models.

Models are saved into SAP HANA tables in a schema specified by the user.

A model is identified with:

  • A name (string of 255 characters maximum),

    It must not contain any of the following characters: comma, semicolon, tab, end-of-line, single quote, double quote (',', ';', '\t', '\n', ''', '"').

  • A version (positive integer starting from 1).
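The identification rules above can be checked on the client side before a save is attempted. The following is a minimal sketch, not part of hana_ml: the helper names `is_valid_model_name` and `is_valid_model_version` are hypothetical, and rejecting an empty name is an assumption that the documentation does not state.

```python
# Characters the documentation lists as forbidden in a model name.
FORBIDDEN_CHARS = set(",;'\"\n\t")
MAX_NAME_LENGTH = 255


def is_valid_model_name(name):
    """Return True if `name` satisfies the documented naming constraints."""
    # Assumption: an empty name is rejected (not stated in the documentation).
    if not isinstance(name, str) or not name:
        return False
    if len(name) > MAX_NAME_LENGTH:
        return False
    return not any(ch in FORBIDDEN_CHARS for ch in name)


def is_valid_model_version(version):
    """Return True if `version` is a positive integer (1, 2, 3, ...)."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(version, int) and not isinstance(version, bool) and version >= 1
```

Running such a check first gives an early, local error message instead of a failure at save time.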

A model can be saved in three ways:

  1. It can be saved for the first time.

    No model with the same name and version is supposed to exist.

  2. It can be saved as a replacement.

    If a model with the same name and version already exists, it will be overwritten.

  3. It can be saved with a higher version.

    The model will be saved with an incremented version number.

Internally, a model is stored as two parts:

  1. The metadata.

    It contains the model identification (name, version, algorithm class) and the attributes of the Python model object that are required to reinstantiate it. It is saved in a table named HANAML_MODEL_STORAGE by default.

  2. The back-end model.

    It consists of the model returned by APL or PAL.

    For APL, it is always saved into the table HANAMl_APL_MODELS_DEFAULT, while for PAL, a model can be saved into different tables depending on the nature of the algorithm.

Parameters
connection_context : ConnectionContext

The connection object to an SAP HANA database. It must be the same as the one used by the model.

schema : str

The schema name where the model storage tables are created.

Examples

Creating and training a model:

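>>> from hana_ml import dataframe as hana_df
>>> from hana_ml.dataframe import ConnectionContext
>>> from hana_ml.algorithms.pal.neural_network import MLPClassifier
>>> from hana_ml.algorithms.apl.classification import AutoClassifier
>>> from hana_ml.model_storage import ModelStorage
>>> # HDB_HOST, HDB_PORT, HDB_USER and HDB_PASS are placeholders for your connection settings
>>> conn = ConnectionContext(HDB_HOST, HDB_PORT, HDB_USER, HDB_PASS)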
>>> # Train dataset
>>> data = hana_df.DataFrame(conn, 'SELECT * '
...                               'from PERFTEST_01.IRIS')
>>> data_test = hana_df.DataFrame(conn, 'SELECT ID, '
...                                     '"sepal length (cm)","sepal width (cm)",'
...                                     '"petal length (cm)","petal width (cm)" '
...                                     'from PERFTEST_01.IRIS_MULTICLASSES '
...                                     )
>>> model_pal_name = 'MLPClassifier 1'
>>> model_name = 'AutoClassifier 1'
>>> model_pal = MLPClassifier(conn, hidden_layer_size=[10, ], activation='TANH',
...                           output_activation='TANH', learning_rate=0.01, momentum=0.001)
>>> model = AutoClassifier(conn_context=conn)
>>> model_pal.fit(data,
...           label='IS_SETOSA',
...           key='ID')
>>> model.fit(data,
...           label='IS_SETOSA',
...           key='ID')

Creating an instance of ModelStorage:

>>> MODEL_SCHEMA = 'MODEL_STORAGE' # HANA schema in which models are to be saved
>>> model_storage = ModelStorage(connection_context=conn, schema=MODEL_SCHEMA)

Saving the trained model for the first time:

>>> # Saves model
>>> model_pal.name = model_pal_name
>>> model_storage.save_model(model=model_pal)
>>> model.name = model_name
>>> model_storage.save_model(model=model)

Listing saved models:

>>> # Lists model
>>> list_models2 = model_storage.list_models()
>>> print(list_models2)
               NAME  VERSION LIBRARY                         ...
0  AutoClassifier 1        1     APL  hana_ml.algorithms.apl ...
1  MLPClassifier 1         1     PAL  hana_ml.algorithms.pal ...

Reloading a saved model:

>>> # Loads model
>>> model1 = model_storage.load_model(name=model_pal_name, version=1)
>>> model2 = model_storage.load_model(name=model_name)

Using the loaded model for new predictions:

>>> # predict
>>> out = model2.predict(data=data_test)
>>> out = out.head(3).collect()
>>> print(out)
   ID PREDICTED  PROBABILITY IS_SETOSA
0   1      True     0.999492      None    ...
1   2      True     0.999478      None
2   3      True     0.999460      None

Saving the model again:

>>> # Saves model by overwriting
>>> model_storage.save_model(model=model, if_exists='replace')
>>> list_models = model_storage.list_models(name=model.name)
>>> print(list_models)
               NAME  VERSION LIBRARY                            ...
0  AutoClassifier 1        1     APL  hana_ml.algorithms.apl    ...
>>> # Upgrades model
>>> model_storage.save_model(model=model, if_exists='upgrade')
>>> list_models = model_storage.list_models(name=model.name)
>>> print(list_models)
               NAME  VERSION LIBRARY                            ...
0  AutoClassifier 1        1     APL  hana_ml.algorithms.apl    ...
1  AutoClassifier 1        2     APL  hana_ml.algorithms.apl    ...

Deleting a model:

>>> model_storage.delete_model(name=model.name, version=model.version)

Deleting all versions of a model:

>>> model_storage.delete_models(name=model.name)

Cleaning up all the models and the metadata:

>>> model_storage.clean_up()

Methods

clean_up(self)

Be cautious! This function will delete all the models and the meta table.

delete_model(self, name, version)

Deletes the model of a given name and version.

delete_models(self, name[, start_time, end_time])

Deletes models in batch mode within a specified time range.

disable_persistent_memory(self, name, version)

Disable persistent memory.

enable_persistent_memory(self, name, version)

Enable persistent memory.

list_models(self[, name, version])

Lists existing models.

load_into_memory(self, name, version)

Loads a model into memory.

load_model(self, name[, version])

Loads an existing model from the database.

model_already_exists(self, name, version)

Checks if a model already exists in the model storage.

save_model(self, model[, if_exists])

Saves a model.

unload_from_memory(self, name, version[, ...])

Unloads a model from memory.

list_models(self, name=None, version=None)

Lists existing models.

Parameters
name : str, optional

The model name pattern to be matched.

Defaults to None.

version : int, optional

The model version.

Defaults to None.

Returns
pandas.DataFrame

The metadata of the models matching the provided name and version.

model_already_exists(self, name, version)

Checks if a model already exists in the model storage.

Parameters
name : str

The model name.

version : int

The model version.

Returns
bool

True if a model with the same name and version already exists, False otherwise.
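A common use of this check is to guard a first-time save. The sketch below is illustrative only: `save_if_absent` is a hypothetical helper, `storage` is assumed to behave like a ModelStorage instance, and the default `version=1` assumes that a first save receives version 1, as in the example output earlier in this page.

```python
def save_if_absent(storage, model, version=1):
    """Save `model` only when (model.name, version) is not stored yet.

    `storage` is expected to expose model_already_exists(name, version)
    and save_model(model, if_exists), as ModelStorage does.
    Returns True if the model was saved, False if it already existed.
    """
    if storage.model_already_exists(model.name, version):
        return False
    # 'fail' makes the save raise an error rather than overwrite or upgrade
    # if the same model was stored between the check and the save.
    storage.save_model(model=model, if_exists='fail')
    return True
```

Passing `if_exists='fail'` explicitly matters here because the documented default is 'upgrade', which would create a new version instead of reporting the conflict.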

save_model(self, model, if_exists='upgrade')

Saves a model.

Parameters
model : a model instance

The model name must have been set beforehand. The pair (name, version) serves as the unique identifier.

if_exists : str, optional

It specifies the behavior when a model with the same name and version already exists:
  • fail: Raises an Error.

  • replace: Overwrites the model.

  • upgrade: Saves the model with an incremented version.

Defaults to 'upgrade'.
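The three behaviors can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions, not the hana_ml implementation: `ToyModelStore` is hypothetical, it takes a name and a payload instead of a model object, and it assumes that 'upgrade' assigns the highest stored version plus one.

```python
class ToyModelStore:
    """In-memory stand-in that mimics the documented if_exists behavior."""

    def __init__(self):
        self._models = {}  # maps (name, version) -> stored payload

    def _latest_version(self, name):
        versions = [v for (n, v) in self._models if n == name]
        return max(versions) if versions else 0

    def save_model(self, name, payload, version=1, if_exists='upgrade'):
        """Store `payload` and return the version it was stored under."""
        key = (name, version)
        if key in self._models:
            if if_exists == 'fail':
                raise ValueError('model %r version %d already exists' % key)
            if if_exists == 'upgrade':
                # Assumption: the new version is the highest stored version + 1.
                key = (name, self._latest_version(name) + 1)
            elif if_exists != 'replace':
                raise ValueError('unknown if_exists value: %r' % if_exists)
        # 'replace' falls through and overwrites the existing entry.
        self._models[key] = payload
        return key[1]


store = ToyModelStore()
first = store.save_model('AutoClassifier 1', 'payload A')
replaced = store.save_model('AutoClassifier 1', 'payload B', if_exists='replace')
upgraded = store.save_model('AutoClassifier 1', 'payload C', if_exists='upgrade')
```

In this toy, a replace keeps the version number while an upgrade adds a second entry, which mirrors the listing shown in the Examples section.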

delete_model(self, name, version)

Deletes the model of a given name and version.

Parameters
name : str

The model name.

version : int

The model version.

delete_models(self, name, start_time=None, end_time=None)

Deletes models in batch mode within a specified time range.

Parameters
name : str

The model name.

start_time : str, optional

The start timestamp for deleting.

Defaults to None.

end_time : str, optional

The end timestamp for deleting.

Defaults to None.
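Since both bounds are strings, the caller has to build the timestamps. The sketch below shows one way to compute an end bound a given number of days in the past. It rests on assumptions that the documentation does not confirm: the `'%Y-%m-%d %H:%M:%S'` format, the idea that the range applies to the time at which a model was stored, and the hypothetical helper name `delete_versions_older_than`.

```python
from datetime import datetime, timedelta

# Assumed timestamp format; check it against your hana_ml version.
TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'


def delete_versions_older_than(storage, name, days, now=None):
    """Ask `storage` to delete the versions of `name` older than `days` days.

    `storage` is expected to expose delete_models(name, start_time, end_time),
    as ModelStorage does. Returns the end_time string that was passed.
    """
    now = now or datetime.now()
    cutoff = (now - timedelta(days=days)).strftime(TIMESTAMP_FORMAT)
    # Only the upper bound is set; start_time keeps its default of None.
    storage.delete_models(name=name, end_time=cutoff)
    return cutoff
```

The `now` argument exists only to make the helper deterministic in tests.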

clean_up(self)

Be cautious! This function will delete all the models and the meta table.

load_model(self, name, version=None, **kwargs)

Loads an existing model from the database.

Parameters
name : str

The model name.

version : int, optional

The model version. By default, the last version will be loaded.

Returns
PAL/APL object

The loaded model ready for use.
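To know in advance which version a call without `version` would pick, the latest version can be derived from a model listing. This is a minimal sketch: `latest_version` is a hypothetical helper, and `rows` is assumed to be a list of dictionaries with NAME and VERSION keys, such as the records obtained from the DataFrame returned by list_models (for example via pandas `DataFrame.to_dict('records')`).

```python
def latest_version(rows, name):
    """Return the highest VERSION among rows whose NAME equals `name`.

    Returns None when no row matches.
    """
    versions = [row['VERSION'] for row in rows if row['NAME'] == name]
    return max(versions) if versions else None


# Sample rows shaped like the listing in the Examples section.
rows = [
    {'NAME': 'AutoClassifier 1', 'VERSION': 1},
    {'NAME': 'AutoClassifier 1', 'VERSION': 2},
    {'NAME': 'MLPClassifier 1', 'VERSION': 1},
]
newest = latest_version(rows, 'AutoClassifier 1')
```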

enable_persistent_memory(self, name, version)

Enable persistent memory.

Parameters
name : str

The model name.

version : int

The model version.

disable_persistent_memory(self, name, version)

Disable persistent memory.

Parameters
name : str

The model name.

version : int

The model version.

load_into_memory(self, name, version)

Loads a model into memory.

Parameters
name : str

The model name.

version : int

The model version.

unload_from_memory(self, name, version, persistent_memory=None)

Unloads a model from memory. The data will be loaded back into memory after the next query.

Parameters
name : str

The model name.

version : int

The model version.

persistent_memory : {'retain', 'delete'}, optional

Only works when persistent memory is enabled.

Defaults to None.