hana_ml.artifacts package

The artifacts package consists of the following sections:

The hana_ml.artifacts.generators.abap and hana_ml.artifacts.deployers.amdp modules provide various methods that help you embed machine learning algorithms of SAP HANA (e.g. the Predictive Analysis Library (PAL)) via the Python API into SAP S/4HANA business applications with the Intelligent Scenario Lifecycle Management (ISLM) framework. The ISLM framework is integrated into the ABAP layer (SAP Basis) so that the intelligent scenarios from the layers above in the SAP S/4HANA stack can fully utilize the framework. Specifically, a custom ABAP Managed Database Procedure (AMDP) class needs to be created for a machine learning model so that it can be consumed by ISLM.

Suppose you have a machine learning model developed in hana-ml and decide to embed it into an SAP S/4HANA business application. First, you create an AMDPGenerator to establish a corresponding AMDP class and then import the generated ABAP class code into your ABAP development environment. Then you create an AMDPDeployer to upload that class into the ISLM framework by creating an intelligent scenario managed by ISLM. Within ISLM, you can perform operations such as training, activating, and monitoring of the intelligent scenario for a specific SAP S/4HANA system.

Note

SAP S/4HANA System Requirement: S/4HANA 2020 FPS1 or higher.

Supported hana-ml algorithm for AMDP: UnifiedClassification.

Apart from the standard developer authorizations, you need the SAP_INTNW_ISLM role for the deployer-related functions.

AMDP Examples

Let's assume we have a connection to SAP HANA called connection_context and a basic Random Decision Trees Classifier 'rfc' with training data 'diabetes_train_valid' and prediction data 'diabetes_test'. Remember that every model has to contain fit and predict logic, so the methods fit() and predict() have to be called at least once. Note that we also need to enable the SQL trace before the training.

>>> connection_context.sql_tracer.enable_sql_trace(True)
>>> connection_context.sql_tracer.enable_trace_history(True)
>>> rfc_params = dict(n_estimators=5, split_threshold=0, max_depth=10)
>>> rfc = UnifiedClassification(func="randomdecisiontree", **rfc_params)
>>> rfc.fit(diabetes_train_valid,
            key='ID',
            label='CLASS',
            categorical_variable=['CLASS'],
            partition_method='stratified',
            stratified_column='CLASS')
>>> rfc.predict(diabetes_test.drop(cols=['CLASS']), key="ID")

Then, generate the ABAP Managed Database Procedure (AMDP) artifact by creating an AMDPGenerator:

>>> generator = AMDPGenerator(project_name="PIMA_DIAB", version="1", connection_context=connection_context, outputdir="out/")
>>> generator.generate()

The generate() process creates a .abap file on your local machine based on the work that was done previously. This .abap file contains the SQL logic, wrapped in AMDPs, that you have created by interacting with the hana-ml package.
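
For instance, a minimal sketch to locate the generated file under the outputdir used above; the exact path and file name depend on the project name and version:

>>> import glob
>>> abap_files = glob.glob("out/**/*.abap", recursive=True)  # exact path depends on project_name and version
>>> print(abap_files)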

You can now take the generated code in the 'outputdir' and deploy it to SAP S/4HANA with ISLM. All you need to provide is the .abap file and some basic parameters for the ISLM registration.

>>> deployer = AMDPDeployer(backend_url=backend_url,
                            backend_auth=(backend_user,
                                          backend_password),
                            frontend_url=frontend_url,
                            frontend_auth=(frontend_user,
                                           frontend_password))
>>> guid = deployer.deploy(fp="XXX.abap",
                           model_name="MODEL_01",
                           catalog="$TMP",
                           scenario_name="DEMO_CUSTOM01",
                           scenario_description="Hello S/4 demo!",
                           scenario_type="CLASSIFICATION",
                           force_overwrite=True,
                           master_system="ER9",
                           transport_request="$TMP",
                           sap_client='000')

After the deployment is completed, you can see an intelligent scenario in the 'Intelligent Scenarios' Fiori app of the ISLM framework. The scenario carries the name specified during the deployment step.

hana_ml.artifacts.deployers.amdp

This module provides AMDP-related functionality.

The following function and class are available:

hana_ml.artifacts.deployers.amdp.gen_pass_key(url, user, passwd=None)

This function encrypts the user name and password and returns a key for future access.

Parameters:
url : str

The URL of the backend/frontend.

user : str

User name.

passwd : str, optional

Password.

Returns:
pass_key
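
A minimal usage sketch (the URLs, user name, and password below are placeholders); as documented for AMDPDeployer, the generated keys can then be passed as backend_key and frontend_key instead of passwords:

>>> from hana_ml.artifacts.deployers.amdp import gen_pass_key, AMDPDeployer
>>> backend_key = gen_pass_key("https://backend.example.com", "DEVELOPER", passwd="secret")
>>> frontend_key = gen_pass_key("https://frontend.example.com", "DEVELOPER", passwd="secret")
>>> deployer = AMDPDeployer(backend_url="https://backend.example.com",
                            backend_key=backend_key,
                            frontend_url="https://frontend.example.com",
                            frontend_key=frontend_key)
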
class hana_ml.artifacts.deployers.amdp.AMDPDeployer(backend_url=None, backend_auth=None, frontend_url=None, frontend_auth=None, backend_key=None, frontend_key=None, url=None, auth=None, user_key=None)

Bases: object

This class provides AMDP deployer related functionality. After you have used an AMDPGenerator to establish a corresponding AMDP class, you can create an AMDPDeployer to upload that class into the ISLM framework by creating an intelligent scenario.

Note

Supported hana-ml algorithm for AMDP: UnifiedClassification.

Apart from the standard developer authorizations, you need the SAP_INTNW_ISLM role for the deployer-related functions.

Parameters:
backend_url : str

The URL of the backend.

backend_auth : str or tuple

The authentication information of the backend, which contains the user name and password.

frontend_url : str

The URL of the frontend.

frontend_auth : str or tuple

The authentication information of the frontend, which contains the user name and password.

backend_key : bytes, optional

If a backend key has been generated, it can be used instead of the password.

Defaults to None.

frontend_key : bytes, optional

If a frontend key has been generated, it can be used instead of the password.

Defaults to None.

url : str

The URL of both the backend and the frontend.

auth : str or tuple

The authentication information for both the backend and the frontend, which contains the user name and password.

user_key : bytes, optional

If a user key has been generated, it can be used instead of the password.

Defaults to None.

Examples

After you use an AMDPGenerator to generate a .abap file, you can take the generated code in the 'outputdir' and deploy it to SAP S/4HANA, or for that matter to any ABAP stack with ISLM. All you need to provide is the .abap file and some basic parameters for the ISLM registration.

>>> deployer = AMDPDeployer(backend_url=backend_url,
                            backend_auth=(backend_user,
                                          backend_password),
                            frontend_url=frontend_url,
                            frontend_auth=(frontend_user,
                                           frontend_password))
>>> guid = deployer.deploy(fp="XXX.abap",
                           model_name="MODEL_01",
                           catalog="$TMP",
                           scenario_name="DEMO_CUSTOM01",
                           scenario_description="Hello S/4 demo!",
                           scenario_type="CLASSIFICATION",
                           force_overwrite=True,
                           master_system="ER9",
                           transport_request="$TMP",
                           sap_client='000')

After the deployment is completed, you can see an intelligent scenario in the 'Intelligent Scenarios' Fiori app of the ISLM framework. The scenario carries the name specified during the deployment step.

Methods

deploy(fp, model_name, catalog, ...[, ...])

Deploy an AMDP class into SAP S/4HANA with Intelligent Scenario Lifecycle Management (ISLM).

deploy_class(class_name, abap_class_code[, ...])

Deploy the class.

format(abap_class_code, master_system)

Format from AMDP session.

get_is_information_from_islm(scenario_name, ...)

Get Intelligent Scenario Lifecycle Management (ISLM) information.

register_islm(class_name, model_name, ...)

Register in Intelligent Scenario Lifecycle Management (ISLM).

deploy(fp, model_name, catalog, scenario_name, scenario_type, class_description=None, scenario_description=None, force_overwrite=False, master_system='ER9', transport_request='$TMP', sap_client='000')

Deploy an AMDP class into SAP S/4HANA with Intelligent Scenario Lifecycle Management (ISLM).

Parameters:
fp : str

Name of the .abap file to be opened.

model_name : str

Name of the model.

catalog : str

Name of the catalog.

scenario_name : str

Name of the intelligent scenario.

scenario_type : str

Type of the intelligent scenario.

class_description : str, optional

Description of the class.

Defaults to None.

scenario_description : str, optional

Description of the intelligent scenario.

Defaults to None.

force_overwrite : bool, optional

Whether to overwrite the class if it already exists.

Defaults to False.

master_system : str, optional

Name of the master system you are working on.

Defaults to "ER9".

transport_request : str, optional

Name of the transport package you are working on.

Defaults to '$TMP'.

sap_client : str, optional

The SAP client you are using.

Defaults to '000'.

Returns:
GUID (Globally Unique Identifier).

Examples

Create an AMDPDeployer object:

>>> deployer = AMDPDeployer(backend_url=backend_url,
                            backend_auth=(backend_user,
                                          backend_password),
                            frontend_url=frontend_url,
                            frontend_auth=(frontend_user,
                                           frontend_password))

Deploy:

>>> guid = deployer.deploy(fp="XXX.abap",
                           model_name=model_name,
                           catalog="XXX",
                           scenario_name=scenario_name,
                           scenario_description=scenario_description,
                           scenario_type=scenario_type,
                           force_overwrite=True,
                           master_system="ER9",
                           transport_request="$TMP",
                           sap_client='000')
deploy_class(class_name, abap_class_code, class_description=None, master_system='ER9', force_overwrite=False, transport_request='$TMP')

Deploy the class.

Note that all request data in this class is kept in XML because that allows for easier development in combination with the SAP ABAP Development Tools (ADT, Eclipse): the communication log that can be viewed in the IDE is entirely XML, which translates directly to this method.

Parameters:
class_name : str

Name of the class.

abap_class_code : str

Code of the SAP ABAP class.

class_description : str, optional

Description of the class.

Defaults to None.

master_system : str, optional

Name of the master system you are working on.

Defaults to "ER9".

force_overwrite : bool, optional

Whether to overwrite the class if it already exists.

Defaults to False.

transport_request : str, optional

Name of the transport package you are working on.

Defaults to '$TMP'.
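
A hypothetical sketch of calling deploy_class directly with code read from a generated file (the file name follows the placeholder used above, and the class name is a placeholder as well; deploy() normally drives this step for you):

>>> abap_class_code = open("XXX.abap", encoding="utf-8").read()
>>> deployer.deploy_class(class_name="Z_CL_MODEL_01",
                          abap_class_code=abap_class_code,
                          class_description="Generated AMDP class",
                          master_system="ER9",
                          force_overwrite=True,
                          transport_request="$TMP")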

register_islm(class_name, model_name, catalog, scenario_name, scenario_type, scenario_description, sap_client)

Register in Intelligent Scenario Lifecycle Management (ISLM).

Parameters:
class_name : str

Name of the class.

model_name : str

Name of the model.

catalog : str

Name of the catalog.

scenario_name : str

Name of the intelligent scenario.

scenario_type : str

Type of the intelligent scenario.

scenario_description : str

Description of the intelligent scenario.

sap_client : str

The SAP client.
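
For example, a sketch that registers an already deployed class, reusing the values from the deploy() example above (the class name is a placeholder; deploy() normally performs this registration for you):

>>> deployer.register_islm(class_name="Z_CL_MODEL_01",
                           model_name="MODEL_01",
                           catalog="$TMP",
                           scenario_name="DEMO_CUSTOM01",
                           scenario_type="CLASSIFICATION",
                           scenario_description="Hello S/4 demo!",
                           sap_client='000')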

get_is_information_from_islm(scenario_name, sap_client)

Get Intelligent Scenario Lifecycle Management (ISLM) information.

Parameters:
scenario_name : str

Name of the intelligent scenario.

sap_client : str

The SAP client.
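
For example, querying the scenario deployed earlier:

>>> info = deployer.get_is_information_from_islm(scenario_name="DEMO_CUSTOM01",
                                                 sap_client='000')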

format(abap_class_code, master_system)

Format from AMDP session.

Parameters:
abap_class_code : str

Code of the SAP ABAP class.

master_system : str

Name of the master system.
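
A sketch of formatting generated class code before deployment (the file name is the placeholder used above, and it is assumed here that the formatted source is returned):

>>> abap_class_code = open("XXX.abap", encoding="utf-8").read()
>>> formatted_code = deployer.format(abap_class_code, master_system="ER9")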

hana_ml.artifacts.generators.abap

This module handles the generation of all AMDP (ABAP Managed Database Procedure) related artifacts based on the provided consumption layer elements. Currently this is experimental code only.

The following class is available:

class hana_ml.artifacts.generators.abap.AMDPGenerator(project_name, version, connection_context, outputdir)

Bases: object

This class provides AMDP (ABAP Managed Database Procedure) specific generation functionality. It also extends the config to cater for AMDP-generation-specific settings.

Note

Supported hana-ml algorithm for AMDP: UnifiedClassification.

Parameters:
project_name : str

Name of the project.

version : str

The version.

connection_context : ConnectionContext

The connection to the SAP HANA system.

outputdir : str

The output directory.

Examples

Let's assume we have a connection to SAP HANA called connection_context and a basic Random Decision Trees Classifier 'rfc' with training data 'diabetes_train_valid' and prediction data 'diabetes_test'. Remember that every model has to contain fit and predict logic, so the methods fit() and predict() have to be called at least once.

>>> rfc_params = dict(n_estimators=5, split_threshold=0, max_depth=10)
>>> rfc = UnifiedClassification(func="randomdecisiontree", **rfc_params)
>>> rfc.fit(diabetes_train_valid,
            key='ID',
            label='CLASS',
            categorical_variable=['CLASS'],
            partition_method='stratified',
            stratified_column='CLASS')
>>> rfc.predict(diabetes_test.drop(cols=['CLASS']), key="ID")

Then, generate the ABAP Managed Database Procedure (AMDP) artifact by creating an AMDPGenerator:

>>> generator = AMDPGenerator(project_name="PIMA_DIAB", version="1", connection_context=connection_context, outputdir="out/")
>>> generator.generate()

The generate() process creates a .abap file on your local machine based on the work that was done previously. This .abap file contains the SQL logic, wrapped in AMDPs, that you have created by interacting with the hana-ml package.

Methods

generate([training_dataset, apply_dataset, ...])

Generate artifacts by first building up the required folder structure for artifacts storage and then generating different required files.

generate(training_dataset='', apply_dataset='', no_reason_features=3)

Generate artifacts by first building up the required folder structure for artifacts storage and then generating different required files.

Parameters:
training_dataset : str, optional

Name of the training dataset.

Defaults to ''.

apply_dataset : str, optional

Name of the apply dataset.

Defaults to ''.

no_reason_features : int, optional

The number of features that contribute most to the classification decision. This reason code information is displayed during the prediction phase.

Defaults to 3.
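
For example, a sketch that passes explicit dataset names and raises the number of reason-code features (the table names are placeholders):

>>> generator.generate(training_dataset="DIABETES_TRAIN",
                       apply_dataset="DIABETES_TEST",
                       no_reason_features=5)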

hana_ml.artifacts.generators.hana

This module handles the generation of all HANA design-time artifacts based on the provided base and consumption layer elements. These artifacts can be incorporated into development projects in SAP Web IDE for SAP HANA or SAP Business Application Studio and deployed via the SAP HANA Deployment Infrastructure (HDI) into an SAP HANA system.

The following class is available:

class hana_ml.artifacts.generators.hana.HANAGeneratorForCAP(project_name, output_dir, namespace=None)

Bases: object

HANA artifacts generator for the existing CAP project.

Parameters:
project_name : str

The name of the project.

output_dir : str

The output directory.

namespace : str, optional

Specifies the namespace for the project.

Defaults to "hana.ml".

Examples

>>> my_pipeline = Pipeline([
                    ('PCA', PCA(scaling=True, scores=True)),
                    ('HGBT_Classifier', HybridGradientBoostingClassifier(
                                            n_estimators=4, split_threshold=0,
                                            learning_rate=0.5, fold_num=5,
                                            max_depth=6))])
>>> my_pipeline.fit(diabetes_train, key="ID", label="CLASS")
>>> my_pipeline.predict(diabetes_test_m, key="ID")
>>> hanagen = HANAGeneratorForCAP(project_name="my_proj",
                                  output_dir=".",
                                  namespace="hana.ml")
>>> hanagen.generate_artifacts(my_pipeline)

Methods

generate_artifacts(obj[, cds_gen, ...])

Generate CAP artifacts.

materialize_ds_data([to_materialize])

Create input table for the input DataFrame.

materialize_ds_data(to_materialize=True)

Create input table for the input DataFrame.

Parameters:
to_materialize : bool, optional

If True, the input DataFrame will be materialized.

Defaults to True.
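
For example, used with the hanagen object from above to switch off materialization of the input DataFrame before generating the artifacts:

>>> hanagen.materialize_ds_data(to_materialize=False)
>>> hanagen.generate_artifacts(my_pipeline)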

generate_artifacts(obj, cds_gen=False, model_position=None, tudf=False)

Generate CAP artifacts.

Parameters:
obj : hana-ml object

The hana-ml object that has generated the execution statement.

cds_gen : bool, optional

Controls whether to allow the Python client to generate HANA tables, procedures, and so on. If True, the HANA artifacts are generated from CDS instead.

Defaults to False.

model_position : bool or dict, optional

Specifies the model table position among the procedure outputs and inputs, e.g. {"out": 0, "in": 1}. If True, the model position {"out": 0, "in": 1} is used.

Defaults to None.

tudf : bool, optional

If True, it will generate a table UDF for applying. Defaults to False.
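
For example, a sketch (reusing the hanagen and my_pipeline objects from above) that pins the model table position and additionally generates a table UDF for applying:

>>> hanagen.generate_artifacts(my_pipeline,
                               cds_gen=False,
                               model_position=True,
                               tudf=True)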

class hana_ml.artifacts.generators.hana.HanaGenerator(project_name, version, grant_service, connection_context, outputdir, generation_merge_type=1, generation_group_type=12, sda_grant_service=None, remote_source='')

Bases: object

This class provides HANA-specific generation functionality. It also extends the config file to cater for HANA-specific config generation.

Parameters:
project_name : str

The name of the project.

version : str

The version name.

grant_service : str

The grant service.

connection_context : ConnectionContext

The connection to the SAP HANA system.

outputdir : str

The output directory.

generation_merge_type : int, optional

Merge type determines which operations are merged together. At this stage there are only two options:

  • 1: GENERATION_MERGE_NONE: All operations are generated separately (i.e. as individual procedures in SAP HANA).

  • 2: GENERATION_MERGE_PARTITION: A partition operation is merged into the respective related operation and generated as one procedure in SAP HANA.

Defaults to 1.

generation_group_type : int, optional

  • 11: GENERATION_GROUP_NONE: No grouping is applied; the solution-specific implementation defines how to deal with this.

  • 12: GENERATION_GROUP_FUNCTIONAL: Grouping is based on functional grouping, meaning that logically related elements, such as partition, fit, and the related score, are put together.

Defaults to 12.

sda_grant_service : str, optional

The grant service of Smart Data Access (SDA).

Defaults to None.

remote_source : str, optional

The name of the remote source.

Defaults to ''.

Examples

Let's assume we have a connection to SAP HANA called connection_context and a basic Random Decision Trees Classifier 'rfc' with training data 'diabetes_train_valid' and prediction data 'diabetes_test'.

>>> rfc_params = dict(n_estimators=5, split_threshold=0, max_depth=10)
>>> rfc = UnifiedClassification(func="randomdecisiontree", **rfc_params)
>>> rfc.fit(diabetes_train_valid,
            key='ID',
            label='CLASS',
            categorical_variable=['CLASS'],
            partition_method='stratified',
            stratified_column='CLASS')
>>> rfc.predict(diabetes_test.drop(cols=['CLASS']), key="ID")

Then, we could generate HDI artifacts:

>>> hg = hana.HanaGenerator(project_name="test", version='1', grant_service='', connection_context=connection_context, outputdir="./hana_out")
>>> hg.generate_artifacts()

This returns the output path of the root folder where the HANA-related artifacts are stored:

>>> './hana_out\test\hana'

Methods

generate_artifacts([base_layer, ...])

Generate the artifacts by first building up the required folder structure for artifacts storage and then generating the different required files.

generate_artifacts(base_layer=True, consumption_layer=True, sda_data_source_mapping_only=False)

Generate the artifacts by first building up the required folder structure for artifacts storage and then generating the different required files. Be aware that this method only generates the generic files and offloads the generation of artifacts where traversal of base and consumption layer elements is required.

Parameters:
base_layer : bool, optional

The base layer consists of the low-level procedures that will be generated.

Defaults to True.

consumption_layer : bool, optional

The consumption layer is the layer that consumes the base layer artifacts.

Defaults to True.

sda_data_source_mapping_only : bool, optional

In case data source mapping is provided, you can force this to be done only for the Smart Data Access (SDA) HANA Deployment Infrastructure (HDI) container.

Defaults to False.

Returns:
str

The output path of the root folder where the HANA-related artifacts are stored.
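
For example, reusing the hg object from above with all parameters spelled out and the returned output path captured:

>>> output_path = hg.generate_artifacts(base_layer=True,
                                        consumption_layer=True,
                                        sda_data_source_mapping_only=False)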