FAQs

Q1 : Which version of hana-ml should I install?

A1 : We recommend installing the latest version of hana-ml regardless of which SAP HANA version you are using. Use the command "pip install hana-ml" for a first installation, or "pip install --upgrade hana-ml" to upgrade an existing one.
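
To confirm which version is actually installed after running pip, the package metadata can be queried with the Python standard library. This is a minimal sketch; the helper name installed_version is illustrative and not part of hana-ml.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed hana-ml version, or None when hana-ml is not installed.
print(installed_version("hana-ml"))
```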

Q2 : Why do I get a 'matplotlib' error when I invoke box_plot()?

A2 : This issue may be caused by a matplotlib version mismatch. hana-ml 2.6.201016 requires matplotlib 3.1.3, and matplotlib 3.3.2 (the latest version at the time) causes errors because some of its APIs were modified. This issue has been fixed in hana-ml 2.6.20110600, which works with matplotlib 3.3.2.

Q3 : Why do I get the error "[WinError 126] The specified module could not be found" when I import hana-ml?

A3 : This error occurs on Windows when Shapely has been installed with the pip install command inside a conda environment. The current solution is to install Shapely with "conda install -c conda-forge shapely" instead.

Q4 : What dependencies are required for hana-ml?

A4 : If you install hana-ml with pip, all dependencies from PyPI are installed by default. The one special case is:

Shapely. If you want to use the spatial and graph features, please install Shapely. If you use a conda environment on Windows, please refer to Q3 for installation.

Q5 : How do I fix garbled Chinese characters such as □□ when I use Matplotlib?

A5 : To solve the issue, you need to configure the Chinese font for Matplotlib. Please follow the steps below:

Download and unpack the Chinese font zip.
Find the font file 'SourceHanSansSC-Normal.ttf'.
Copy this file to the path '~/site-packages/matplotlib/mpl-data/fonts/ttf'.
Restart your notebook to make it work.
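
The target directory in the third step depends on where Matplotlib is installed. The sketch below locates it without importing Matplotlib, assuming a standard pip or conda layout in which mpl-data sits inside the package directory (some system packages place it elsewhere). The function name matplotlib_ttf_dir is illustrative.

```python
from importlib import util
from pathlib import Path


def matplotlib_ttf_dir():
    """Return the expected Matplotlib TTF font directory, or None if Matplotlib is not installed."""
    spec = util.find_spec("matplotlib")
    if spec is None or spec.origin is None:
        return None
    # spec.origin points at .../site-packages/matplotlib/__init__.py
    return Path(spec.origin).parent / "mpl-data" / "fonts" / "ttf"


# Copy 'SourceHanSansSC-Normal.ttf' into the directory printed here.
print(matplotlib_ttf_dir())
```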

Q6 : Why is the output of FeatureNormalizer and KBinsDiscretizer not what I expect? For example, the transformed value is an integer when a float is expected in FeatureNormalizer.

A6 : In KBinsDiscretizer and FeatureNormalizer, the data type of the output value is the same as that of the input value. Therefore, if the data type of the original data is integer, the output value will be converted to an integer instead of the result you expect.

The solution is to cast the feature column(s) from INTEGER to DOUBLE before invoking these functions.
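
As a minimal sketch of that cast, the helper below relies on the cast() method of hana_ml.dataframe.DataFrame, which takes the column name(s) and the target SQL type. The helper name cast_to_double and the column names in the usage note are hypothetical.

```python
def cast_to_double(df, feature_cols):
    """Return a DataFrame whose feature columns are cast to DOUBLE.

    `df` is expected to be a hana_ml.dataframe.DataFrame; `feature_cols`
    is a column name or a list of column names.
    """
    return df.cast(feature_cols, "DOUBLE")
```

For example, df_double = cast_to_double(df, ["X1", "X2"]) would be passed to FeatureNormalizer or KBinsDiscretizer in place of df.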

Q7 : How can I solve the "RuntimeError: Failed to transform image object to string!" error when I use UnifiedReport?

A7 : This issue may be caused by an outdated version of Matplotlib. Please upgrade Matplotlib to a version above 3.4.0.

Q8 : How can I solve the error "KeyError: 'STORAGE_TYPE'" when I use the model storage functionality after upgrading hana-ml to version 2.9.21XXXX?

A8 : This issue is caused by an upgrade of the model storage module. The solution is to call the new upgrade_meta() method of a ModelStorage object to update the meta table. upgrade_meta() is available in hana-ml 2.9.210630.
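
The call sequence can be sketched as follows, assuming conn is an existing ConnectionContext and that ModelStorage is created with its connection_context argument. The wrapper name upgrade_model_storage_meta is illustrative.

```python
def upgrade_model_storage_meta(conn):
    """Create a ModelStorage on the given connection and upgrade its meta table."""
    # Imported inside the function so this sketch can be loaded without hana-ml installed.
    from hana_ml.model_storage import ModelStorage

    storage = ModelStorage(connection_context=conn)
    storage.upgrade_meta()  # one-time update of the meta table after upgrading hana-ml
    return storage
```

The returned ModelStorage object can then be used as before for saving and loading models.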

Q9 : Why do I get an error message like "SAP DBTech JDBC: [328]: invalid name of function or procedure: no procedure with name XXX found:"?

A9 : This error message means that the function is not supported by your SAP HANA instance, even though hana-ml offers it. Similarly, some newly added parameters do not work on older SAP HANA versions. Please refer to the PAL documentation that corresponds to the version of your HANA instance for more details.

Some versions of PAL documentation are listed as follows:

The following functions are not available in SAP HANA SPS05 but available in SAP HANA cloud QRC 4/2021:

Similarly, the following functions are not available in SAP HANA SPS06 but available in SAP HANA cloud:

LSTM

Please note that one exception is Online Linear Regression (Stateful), which is available in SAP HANA SPS05/06 but not in SAP HANA Cloud. In SAP HANA Cloud, hana-ml only supports Online Linear Regression (Stateless).

Q10 : Why do I get the error "'NoneType' object has no attribute 'connection'" when I use the model storage functionality for ARIMA/AutoARIMA in the predict function?

A10 : This issue has been fixed in hana-ml 2.6.20110600. If you use an earlier version of hana-ml, please call set_conn() explicitly after loading the ARIMA/AutoARIMA model to set the connection to SAP HANA. ARIMA/AutoARIMA is a special case for model storage because it does not require data input for prediction.
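
The workaround for earlier versions can be sketched as below. It assumes storage is a ModelStorage object, conn is a ConnectionContext, and the model was previously saved under the given name; the helper name load_arima_with_connection is illustrative.

```python
def load_arima_with_connection(storage, conn, name, version=None):
    """Load a stored ARIMA/AutoARIMA model and attach a connection before predicting."""
    model = storage.load_model(name=name, version=version)
    # Without this call, predict() has no connection to SAP HANA in affected versions.
    model.set_conn(conn)
    return model
```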

Q11 : Why does no figure appear when I invoke :class:`~hana_ml.visualizers.shap.ShapleyExplainer` and its function summary_plot()?

A11 : This issue may be caused by versions of ipykernel and matplotlib that are too new. Please try ipykernel <= 6.3.1 and matplotlib <= 3.4.3.

Q12 : Why is an additional figure (e.g. a confusion matrix) displayed under each page of a report when I invoke :class:`~hana_ml.visualizers.unified_report.UnifiedReport` to display a model report?

A12 : This issue may be caused by incompatible versions of the dependent packages ipykernel and matplotlib.

Please try ipykernel <= 6.3.1 and matplotlib <= 3.4.3, and DO NOT use the command '%matplotlib inline', as it affects how the result is displayed.
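
The version limits named in A11 and A12 can be checked before opening a notebook. The sketch below compares version strings against those limits; parse_version and too_new are illustrative helper names, and the simple parser assumes plain dotted numeric versions (non-numeric components are skipped).

```python
# Upper bounds taken from A11 and A12.
MAX_VERSIONS = {"ipykernel": "6.3.1", "matplotlib": "3.4.3"}


def parse_version(text):
    """Turn a dotted version such as '3.4.3' into a tuple of ints (3, 4, 3)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def too_new(installed):
    """Return the names of packages whose version exceeds the recommended maximum."""
    return [
        name
        for name, limit in MAX_VERSIONS.items()
        if name in installed and parse_version(installed[name]) > parse_version(limit)
    ]


# Hypothetical installed versions, for illustration only.
print(too_new({"ipykernel": "6.4.0", "matplotlib": "3.4.3"}))
```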

Q13 : How can I install hana-ml on a Mac with the new M1 processor? I get "ERROR: Could not find a version that satisfies the requirement hdbcli (from versions: none)" and "ERROR: No matching distribution found for hdbcli".

A13 : The current distribution status of hdbcli for M1 MacBooks is as follows:

Universal binaries are planned for a future version of the HANA client.

There is no timeline yet.

Currently, Apple's Rosetta technology still allows x86_64 binaries to run on the new ARM platform.

Hence, please follow the steps below to get hdbcli working on an M1 with Apple's Rosetta technology.

Install Rosetta by executing the following in a Terminal:
$ softwareupdate --install-rosetta
Duplicate your Terminal.app, rename the duplicate (e.g. to Terminal Rosetta.app), and configure the duplicate to run with Rosetta.

Open your Rosetta terminal and install Brew by running:
$ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Note that you need to do this even if you have already installed Brew in the ARM terminal.

In the Rosetta Terminal use Brew to install Python:
$ brew install python3
You can now use this Python environment (for example in VS Code or Jupyter Notebooks) to install hdbcli or hana-ml.

By using the platform package in this Python environment, you can verify that your environment thinks it runs on x86.
>>> import platform
>>> platform.machine()
'x86_64'

Q14 : How do I manage SAP HANA workload with workload classes?

A14 : AutomaticClassification and AutomaticRegression provide a method called enable_workload_class() to set the SAP HANA workload class. For example, if auto_c is an instance of AutomaticClassification, you can manage the workload as follows:

>>> auto_c.enable_workload_class(workload_class_name='MY_WORKLOAD_CLASS')

For other algorithms, we currently offer a method called apply_with_hint() for workload management. In the next release (hana-ml 2.14), enable_workload_class() will be supported for all algorithms. An example using apply_with_hint() is below:

>>> amf = additive_model_forecast.AdditiveModelForecast(growth='linear')
>>> workload_class_name = 'MY_WORKLOAD_CLASS'
>>> amf.apply_with_hint(with_hint='WORKLOAD_CLASS("{}")'.format(workload_class_name), apply_to_anonymous_block=True)

Q15 : How do I solve the error "search table error:_SYS_AFL.AFLPAL:TMGETRELATEDDOC_ANY" when I invoke :func:`~hana_ml.text.tm.get_related_doc` in the hana_ml.text.tm module?

A15 : The input DataFrame named pred_data in get_related_doc() can only have one row, because only one piece of content can be processed per call. The same restriction applies to get_related_term(), get_relevant_doc(), get_relevant_term() and get_suggested_term().
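
One way to guard against this error is to reduce the input to a single row before calling the function. The sketch below uses the count() and head() methods of a hana-ml DataFrame; the helper name single_row is illustrative, and which row head(1) returns is not guaranteed unless the DataFrame has been sorted first.

```python
def single_row(pred_data):
    """Return pred_data unchanged if it has at most one row, otherwise only its first row."""
    if pred_data.count() > 1:
        # head(1) keeps one row; sort beforehand if a specific row is required.
        return pred_data.head(1)
    return pred_data
```

To process several documents, call the text mining function once per single-row DataFrame.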

Q16 : How do I establish a connection to an SAP HANA Cloud or HANA on-premise instance?

A16 : Before you can connect to the HANA instance, you need a valid IP address, port number, user name, and password. The required port differs between cloud and on-premise scenarios, as detailed below.

For HANA On-Premise:

>>> from hana_ml import dataframe
>>> conn = dataframe.ConnectionContext(address="<hostname>",
                                       port=3<NN>MM,
                                       user="<username>",
                                       password="<password>")

NN and MM in port 3<NN>MM are explained as follows:

For HANA tenant databases, use the port number 3NN13 (where NN is the SAP instance number - e.g. 30013).
For HANA system databases in a multitenant system, the port number is 3NN13.
For HANA single-tenant databases, the port number is 3NN15.
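
The port patterns above can be turned into a concrete number as in the following sketch. hana_sql_port is a hypothetical helper, not part of hana-ml; it only fills the two-digit instance number NN into 3NN13 or 3NN15.

```python
def hana_sql_port(instance_number, suffix="13"):
    """Build an on-premise port of the form 3NNMM from the instance number NN and suffix MM."""
    if not 0 <= instance_number <= 99:
        raise ValueError("instance number must be between 0 and 99")
    if suffix not in ("13", "15"):
        raise ValueError("suffix must be '13' or '15'")
    # Zero-pad the instance number so that instance 0 gives 30013, not 3013.
    return int(f"3{instance_number:02d}{suffix}")


print(hana_sql_port(0))         # instance 00 with suffix 13
print(hana_sql_port(90, "15"))  # instance 90 with suffix 15
```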

For HANA Cloud:

>>> from hana_ml import dataframe
>>> conn = dataframe.ConnectionContext(address="<hostname>",
                                       port=<port>,
                                       user="<username>",
                                       password="<password>")

If you encounter an error while connecting, you can try adding the relevant authentication parameters, for example:

>>> conn = dataframe.ConnectionContext(address="<hostname>",
                                       port=<port>,
                                       user="<username>",
                                       password="<password>",
                                       encrypt=True,
                                       sslValidateCertificate=None)

For Windows users encountering an error message such as Error: (-10709, 'Connection failed (RTE:[1000013] Key not valid for use in specified state...'), a possible fix could be upgrading hdbcli to its latest version and removing the Crypto folder located in Users\<yourUsername>\AppData\Roaming\Microsoft. After deleting the Crypto folder, there should be no need to specify the encrypt and sslValidateCertificate values in the ConnectionContext.