ScheduledExecution

class hana_ml.algorithms.pal.scheduler.ScheduledExecution(connection_context, current_user=None)

Python implementation of PAL scheduled execution. Basically, with an instance of class ScheduledExecution, users can take the following actions:

create a single/composite task
create a scheduled execution for a task
alter the scheduled execution for a task
pause the scheduled execution for a task
resume the scheduled execution for a task
remove the scheduled execution for a task
remove a task
create a scheduled execution of the fit() method of a hana-ml object
create a scheduled execution of the predict() method of a fitted hana-ml object
create a scheduled execution of the score() method of a fitted hana-ml object (limited to the case that the score() method is associated with a PAL SCORE procedure)

Parameters

connection_contextConnectionContext: Specifies the valid connection to SAP HANA database.

Attributes

connection_contextConnectionContext: Representing the connection to SAP HANA database.
current_userstr: Representing the info of CURRENT_USER reflected by the connection to SAP HANA database.

Methods

`alter_task_schedule`(task_id[, cron, ...])	Alter a schedule.
`cancel_schedule_job`(task_id, max_wait_duration)	Cancel running scheduled job.
`create_fit_schedule`(obj, fit_params, ...[, ...])	Create a scheduled execution of the fit() method of a hana-ml object.
`create_fit_task`(obj, fit_params, task_id[, ...])	Create a task for the fit() method of a hana-ml object.
`create_one_off_task_schedule`(task_id)	Create one-off schedule for a task and return the corresponding job ID.
`create_parallel_composite_task`(task_id, subtasks)	Create parallel composite task.
`create_predict_schedule`(obj, predict_params, ...)	Create a scheduled execution of the predict() method of a hana-ml object.
`create_predict_task`(obj, predict_params, task_id)	Create a task for the predict() method of a hana-ml object.
`create_score_schedule`(obj, score_params, ...)	Create a scheduled execution of the score() method of a hana-ml object (the method must invoke a PAL SCORE procedure internally).
`create_score_task`(obj, score_params, task_id)	Create a task for the score() method of a hana-ml object.
`create_sequential_composite_task`(task_id, ...)	Create sequential composite task.
`create_task`(task_id, proc_name, proc_schema)	Create a task to be scheduled for execution.
`create_task_schedule`(task_id, cron[, ...])	Create scheduled execution for a task.
`get_executed_task_jobs`(task_id[, job_id, order])	Retrieve the executed task jobs (from table "PAL_SCHEDULED_EXECUTION"."TASK_SCHEDULE_JOB").
`get_fit_sql_proc_create_statement`()	Get the SQL statement for creating the fit procedure that has been scheduled for execution.
`get_predict_sql_proc_create_statement`()	Get the SQL statement for creating the predict procedure that has been scheduled for execution.
`get_score_sql_proc_create_statement`()	Get the SQL statement for creating the score procedure that has been scheduled for execution.
`get_task_definition`(task_id)	Get the definition of a created task given task ID.
`get_task_log`(task_id)	Get the log of a specified task given its task_id.
`get_task_param`(task_id)	Get the parameters of a created task given a task_id.
`get_task_schedules`([task_owner])	Get the info of scheduled jobs from system view SCHEDULER_JOBS via task owner specification.
`list_materialized_tables_fit`()	Get the materialization table names of temp tables for the scheduled hana-ml fit() execution.
`list_materialized_tables_predict`()	Get the materialization table names of temp tables for the scheduled hana-ml predict() execution.
`list_materialized_tables_score`()	Get the materialization table names of temp tables for the scheduled hana-ml score() execution.
`list_output_tables_fit`()	Get the output table names for the scheduled hana-ml fit() execution.
`list_output_tables_predict`()	Get the output table names for the scheduled hana-ml predict() execution.
`list_output_tables_score`()	Get the output table names for the scheduled hana-ml score() execution.
`pause_task_schedule`(task_id)	Pause a running schedule.
`query_schedule_job`(task_id[, job_id])	Query the info of a scheduled job.
`query_task`(task_id)	Query the information of procedure task as well as composite task that belong to the current user through task_id.
`query_task_schedule`(task_id[, last_days])	Query the schedule info of a scheduled task.
`remove_task`(task_id[, force, with_hierarchy])	Remove a task.
`remove_task_schedule`(task_id)	Remove the scheduled execution of a task.
`resume_task_schedule`(task_id)	Resume a paused schedule.

Examples

Scenario: A dataset that is updated continuously is stored in a table called 'EXPERIMENT_DATA_FULL_TBL' in the SAP HANA database. We want to schedule the training of an HGBT model on this dataset at 8 AM each Monday, and have the latest HGBT model stored in a table called 'EXPERIMENT_MODEL_TBL'.

The entire scheduling process of the scenario above can be illustrated as follows:

First, we create a ScheduledExecution instance as follows:

>>> from hana_ml.dataframe import ConnectionContext
>>> from hana_ml.algorithms.pal.scheduler import ScheduledExecution
>>> url, port, user, pwd = 'mocksite.com', 30015, 'MOCK_USER', 'pt&%$sdxy'
>>> conn = ConnectionContext(url, port, user, pwd)
>>> sexec = ScheduledExecution(conn)
>>> sexec.current_user
'MOCK_USER'

Then we can execute the following SQL statement to create a stored SQL procedure for each single training process:

CREATE PROCEDURE EXPERIMENT_HGBT_TRAIN(TREE_NUM INTEGER)
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
AS
BEGIN
DECLARE param_tab TABLE("PARAM_NAME" VARCHAR(256), "INT_VALUE" INTEGER, "DOUBLE_VALUE" DOUBLE, "STRING_VALUE" VARCHAR(1000));
:param_tab.insert(('HAS_ID', 1, NULL, NULL));
:param_tab.insert(('DEPENDENT_VARIABLE', NULL, NULL, 'median_house_value'));
:param_tab.insert(('ITER_NUM', :TREE_NUM, NULL, NULL));
data_tab = SELECT * FROM EXPERIMENT_DATA_FULL_TBL;
CALL _SYS_AFL.PAL_HGBT(:data_tab, :param_tab, model_tab, varimp_tab, cm_tab, stat_tab, cv_tab);
TRUNCATE TABLE EXPERIMENT_MODEL_TBL;
INSERT INTO EXPERIMENT_MODEL_TBL SELECT * FROM :model_tab;
END

Once created, the procedure will be under the schema of current user (i.e. 'MOCK_USER' shown in the connection). Then, we can create a task for it, demonstrated as follows:

>>> task_info = sexec.create_task(task_id='EXPERIMENT_DATA_HGBT_FIT',
...                               proc_name='EXPERIMENT_HGBT_TRAIN',
...                               proc_schema='MOCK_USER',
...                               task_desc='Fitting HGBT model using EXPERIMENT dataset',
...                               task_params=[('TREE_NUM', None, 10, 2)],
...                               force=True)#drop the old task with same task id if exists

The task is successfully created if no error is raised. We can then attach the prescribed schedule mentioned in the beginning of this section to the created task, illustrated as follows:

>>> schedule_info = sexec.create_task_schedule(task_id='EXPERIMENT_DATA_HGBT_FIT',
...                                            cron="* * * mon 8 0 0")#means 8AM each Monday.

If we change our mind and want to postpone the training process to 9AM each Tuesday, then we only need to alter the schedule using a different execution frequency pattern, illustrated as follows:

>>> schedule_info = sexec.alter_task_schedule(task_id='EXPERIMENT_DATA_HGBT_FIT',
...                                           cron="* * * tue 9 0 0")#means 9AM each Tuesday.

We can pause and resume the schedule at any time, illustrated as follows:

>>> sexec.pause_task_schedule(task_id='EXPERIMENT_DATA_HGBT_FIT')
>>> sexec.resume_task_schedule(task_id='EXPERIMENT_DATA_HGBT_FIT')

If we no longer need the task to be scheduled, we can remove the schedule:

>>> sexec.remove_task_schedule(task_id='EXPERIMENT_DATA_HGBT_FIT')

Finally if the task is no longer needed, we can remove the task:

>>> sexec.remove_task(task_id='EXPERIMENT_DATA_HGBT_FIT')

create_task(task_id, proc_name, proc_schema, task_owner=None, task_params=None, task_desc='', force=False)

Create a task to be scheduled for execution. Basically, a task consists of the task ID, the owner, and a stored SQL procedure (with parameters) to be invoked.

Parameters

task_idstr

Specifies the name of the task to be created. The name must be unique, i.e. it must not conflict with the names of existing tasks.

proc_namestr

Specifies the name of the stored SQL procedure to be invoked.

proc_schemastr

Specifies the schema of the stored SQL procedure given in proc_name.

Two simple examples for illustration:

If the stored SQL procedure to be invoked is created by user 'PAL_TESTER', then proc_schema should be assigned the value of 'PAL_TESTER'.
All PAL procedures are under the schema '_SYS_AFL'. If the stored SQL procedure to be invoked is a PAL procedure, then proc_schema should be assigned the value of '_SYS_AFL'.

task_ownerstr, optional

Specifies the task owner, who must be granted the privilege to call the stored SQL procedure specified by proc_name.

Defaults to CURRENT_USER.

task_paramslist of tuples, optional

Specifies the parameters of the stored SQL procedure. Each parameter must be specified as a tuple of the following form:

(parameter name, parameter schema, parameter value, parameter type).

Currently, parameter type can take the following values:

0 : table
1 : view
2 : literal

Note that if parameter type is literal (i.e. takes the value of 2), then its corresponding parameter schema should be None.
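The tuple layout above can be illustrated with a small sketch. The helper make_task_param and the constants TABLE, VIEW and LITERAL are hypothetical names introduced here for illustration and are not part of hana-ml; only the four-element layout, the type codes and the rule about literals come from the description above.

```python
# Hypothetical helper, not part of hana-ml: builds one task_params entry.
TABLE, VIEW, LITERAL = 0, 1, 2  # parameter type codes listed above


def make_task_param(name, value, param_type, schema=None):
    """Return (parameter name, parameter schema, parameter value, parameter type)."""
    if param_type not in (TABLE, VIEW, LITERAL):
        raise ValueError(f"unknown parameter type: {param_type!r}")
    if param_type == LITERAL and schema is not None:
        # A literal carries no schema, per the note above.
        raise ValueError("a literal parameter must have schema=None")
    return (name, schema, value, param_type)


task_params = [make_task_param("TREE_NUM", 10, LITERAL)]
print(task_params)  # [('TREE_NUM', None, 10, 2)]
```

The resulting list has the same shape as the task_params argument used in the class-level example.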

task_descstr, optional

Description of the task.

Defaults to empty string.

forcebool, optional

Specifies whether or not to drop the previously created task with the same task_id.

Set as True if you want to drop the old task with the same task_id. In this case, if the old task is scheduled for execution, the schedule is dropped as well.

If set as False and a task with the same task_id already exists, an error is raised.

Defaults to False.

Returns

DataFrame: DataFrame containing the information of the created task.

create_sequential_composite_task(task_id, subtasks, task_desc='', force=True, with_hierarchy=True)

Create sequential composite task.

Parameters

task_idstr

Specifies the name of the composite task to be created. The name must not conflict with the names of existing tasks.

subtaskslist of str

Specifies the list of subtask IDs to be included in the sequential composite task. The subtasks will be executed in the order as given in the list.

task_descstr, optional

Description of the composite task.

Defaults to empty string.

forcebool, optional

Specifies whether or not to drop the previously created task with the same task_id. Set as True if you want to drop the old task with the same task_id. In this case, if the old task is scheduled for execution, the schedule is dropped as well. If set as False and a task with the same task_id already exists, an error is raised.

Defaults to True.

with_hierarchybool, optional

Specifies whether or not to remove the hierarchy of the composite task when dropping an existing task.

Defaults to True.

Returns

DataFrame containing the information of sequential composite task created.
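A minimal sketch of how two existing tasks might be chained is given below. The function name and its arguments are hypothetical; sexec is assumed to be a ScheduledExecution instance, and both subtasks are assumed to have been created already (for example with create_task).

```python
def create_train_then_predict_chain(sexec, fit_task_id, predict_task_id, chain_task_id):
    """Chain two existing tasks so that the second runs after the first.

    Hypothetical wrapper for illustration; it only forwards to the documented
    create_sequential_composite_task method of a ScheduledExecution instance.
    """
    return sexec.create_sequential_composite_task(
        task_id=chain_task_id,
        subtasks=[fit_task_id, predict_task_id],  # executed in list order
        task_desc="Train, then predict",
    )
```

Since force defaults to True for this method, calling the sketch again with the same chain_task_id would drop and recreate the composite task.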

create_parallel_composite_task(task_id, subtasks, task_desc='', parallel_attribute='ALL', force=True, with_hierarchy=True)

Create parallel composite task.

Parameters

task_idstr

Specifies the name of the composite task to be created. The name must be unique, i.e. it must not conflict with the names of existing tasks.

subtaskslist of str

Specifies the list of subtask IDs to be included in the parallel composite task.

task_descstr, optional

Description of the composite task.

Defaults to empty string.

parallel_attributestr, optional

Specifies the parallel attribute of the composite task. It can take either of the following values:

'ALL' : the parallel composite task is considered successful only if all its subtasks are successful.
'ANY' : the parallel composite task is considered successful if any of its subtasks is successful.

Defaults to 'ALL'.
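The success rule for the two parallel_attribute values can be modelled in a few lines. This is only an illustration of the rule as stated above: the function composite_succeeded is hypothetical, and how the scheduler evaluates subtasks internally is not covered here.

```python
def composite_succeeded(subtask_results, parallel_attribute="ALL"):
    """Model the documented success rule of a parallel composite task.

    subtask_results is an iterable of booleans, one per subtask
    (True meaning the subtask succeeded). Hypothetical helper.
    """
    if parallel_attribute == "ALL":
        return all(subtask_results)
    if parallel_attribute == "ANY":
        return any(subtask_results)
    raise ValueError("parallel_attribute must be 'ALL' or 'ANY'")


print(composite_succeeded([True, False, True], "ALL"))  # False
print(composite_succeeded([True, False, True], "ANY"))  # True
```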

forcebool, optional

Specifies whether or not to drop the previously created task with the same task_id. Set as True if you want to drop the old task with the same task_id. In this case, if the old task is scheduled for execution, the schedule is dropped as well. If set as False and a task with the same task_id already exists, an error is raised.

Defaults to True.

with_hierarchybool, optional

Specifies whether or not to remove the hierarchy of the composite task when dropping an existing task.

Defaults to True.

Returns

DataFrame containing the information of parallel composite task created.

query_task(task_id)

Query the information of procedure task as well as composite task that belong to the current user through task_id.

Parameters

task_idstr: Task ID.

Returns

Tuple of DataFrames that contain the chain, definition, parameter and log information of the corresponding task to be queried.

remove_task(task_id, force=False, with_hierarchy=False)

Remove a task.

Parameters

task_idstr

Specifies the name of the task to be removed.

forcebool, optional

Specifies whether or not to continue removing the specified task if the task is scheduled.

If it is set as True and the task is scheduled, the schedule will be removed as well in order to facilitate the removal of the task (otherwise an error is raised).

Defaults to False.

with_hierarchybool, optional

Specifies whether or not to remove all kinds of tasks owned by current user. Suitable when the target task is a composite task.

Defaults to False.

Returns

DataFrame: DataFrame containing the information of the task that has been removed.

get_task_log(task_id)

Get the log of a specified task given its task_id.

Parameters

task_idstr: Task ID.

Returns

DataFrame that contains the log information of the corresponding task.

get_task_definition(task_id)

Get the definition of a created task given task ID.

Parameters

task_idstr: Task ID.

Returns

DataFrame that contains the definition of the corresponding task identified through task_id.

get_task_param(task_id)

Get the parameters of a created task given a task_id.

Parameters

task_idstr: Task ID.

Returns

DataFrame that contains the parameter information of the corresponding task associated with task_id.

create_task_schedule(task_id, cron, recurrence_range=None, force=False)

Create scheduled execution for a task.

Parameters

task_idstr

Name of the task to be scheduled for execution.

cronstr

Specifies the frequency pattern of task to be executed. It should be a string of the following format (please note that there is a space between neighboring frequency categories)

"<YEAR> <MONTH> <MONTHDAY> <WEEKDAY> <HOUR> <MINUTE> <SECOND>"

YEAR Four digit number, representing the year
MONTH 1 - 12, representing the month
MONTHDAY 1 - 31, representing the date
WEEKDAY 'mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun', representing the day of week
HOUR 0 - 23, representing the hour
MINUTE 0 - 59, representing the minute
SECOND 0 - 59, representing the second

Besides valid values for each frequency category listed above, each frequency pattern also supports wildcard character, range pattern and cycle pattern, illustrated as follows:

*	Any frequency value
*/n	Every n-th value, starting from the first valid value
a:b	Valid values ranging from a to b, inclusive of endpoints
a:b/n	Valid values from a to b with step n

Moreover, each frequency pattern can also be entered as a comma-separated list. For example, the <WEEKDAY> frequency pattern can be specified as 'mon, wed, fri', which means that the task is scheduled for execution on Monday, Wednesday and Friday.
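To make the pattern rules concrete, the following sketch expands a single numeric frequency pattern into the values it matches. It models the rules as described above and is not SAP HANA's own parser; the function expand_field is hypothetical, and it handles numeric categories only (not the <WEEKDAY> names).

```python
def expand_field(pattern, lo, hi):
    """Expand one numeric frequency pattern into the sorted values it matches.

    lo and hi are the smallest and largest valid values of the category,
    e.g. 0 and 23 for <HOUR>. Hypothetical helper for illustration.
    """
    values = set()
    for part in pattern.split(","):
        part = part.strip()
        base, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if base == "*":
            start, stop = lo, hi          # wildcard covers the full range
        elif ":" in base:
            first, last = base.split(":")
            start, stop = int(first), int(last)  # inclusive range a:b
        else:
            start = stop = int(base)      # single value
        if step < 1 or not lo <= start <= stop <= hi:
            raise ValueError(f"invalid pattern {part!r} for range {lo}..{hi}")
        values.update(range(start, stop + 1, step))
    return sorted(values)


print(expand_field("14:16", 0, 23))    # [14, 15, 16]
print(expand_field("*/6", 0, 23))      # [0, 6, 12, 18]
print(expand_field("0:30/10", 0, 59))  # [0, 10, 20, 30]
print(expand_field("1, 15", 1, 31))    # [1, 15]
```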

Example

cron = "2025 2 25 * 14:16 0 0"

specifies an hourly frequency pattern from 14:00 to 16:00 on Feb 25, 2025.
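Since the order of the seven categories is easy to get wrong, a small builder can assemble the string from named fields. The function make_cron below is a hypothetical convenience, not part of hana-ml; it only joins the fields in the documented order and performs no validation.

```python
def make_cron(*, hour, minute, second, year="*", month="*", monthday="*", weekday="*"):
    """Join the seven frequency categories into a cron string.

    Time-of-day fields are required so that an accidental wildcard does not
    schedule a task every second. Hypothetical helper for illustration.
    """
    fields = (year, month, monthday, weekday, hour, minute, second)
    return " ".join(str(field) for field in fields)


print(make_cron(weekday="mon", hour=8, minute=0, second=0))
# * * * mon 8 0 0
print(make_cron(year=2025, month=2, monthday=25, hour="14:16", minute=0, second=0))
# 2025 2 25 * 14:16 0 0
```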

recurrence_rangedict, optional

This parameter specifies the range of time allowed for scheduled task execution. This setting is optional: the user can set the lower bound (i.e. start), the upper bound (i.e. end), both, or neither.

To specify the start or end point of the recurrence range (or both), always use either a timestamp string of the format "YYYY-MM-DD HH24:MI:SS.FF7" or a Python object of class datetime.datetime.

Example recurrence range in dict : {'start': '2025-02-22 14:00:00.0000000', 'end': '2025-02-28 15:00:00.0000000'}, which specifies a recurrence range from 14:00 on Feb 22, 2025 to 15:00 on Feb 28, 2025.
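When the string form is preferred, the seven-digit fractional part needs some care, because Python's %f directive produces only six digits. The sketch below shows one way to build the dict; the helper to_hana_timestamp is a hypothetical name and not part of hana-ml.

```python
from datetime import datetime


def to_hana_timestamp(moment):
    """Format a datetime as "YYYY-MM-DD HH24:MI:SS.FF7" (seven fractional digits)."""
    # %f yields six digits (microseconds); append one zero to reach seven.
    return moment.strftime("%Y-%m-%d %H:%M:%S.%f") + "0"


recurrence_range = {
    "start": to_hana_timestamp(datetime(2025, 2, 22, 14, 0, 0)),
    "end": to_hana_timestamp(datetime(2025, 2, 28, 15, 0, 0)),
}
print(recurrence_range)
# {'start': '2025-02-22 14:00:00.0000000', 'end': '2025-02-28 15:00:00.0000000'}
```

As stated above, datetime.datetime objects are also accepted, so this conversion is only needed when strings are required elsewhere, for example in logs or configuration files.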

Returns

DataFrame: DataFrame containing the created schedule for task execution.

query_task_schedule(task_id, last_days=0)

Query the schedule info of a scheduled task.

Parameters

task_idstr

Task ID.

last_daysint, optional

Number of most recent days of log records to return, where 0 indicates all logs.

Defaults to 0.

Returns

Tuple of DataFrames containing the schedule and log information of the corresponding task.

query_schedule_job(task_id, job_id=0)

Query the info of a scheduled job.

Parameters

task_idstr

Task ID.

job_idint, optional

Job ID.

Defaults to 0.

Returns

Tuple of DataFrames containing the schedule job information and log information of the corresponding task job.

create_one_off_task_schedule(task_id)

Create one-off schedule for a task and return the corresponding job ID.

Parameters

task_idstr: Task ID.

Returns

The job ID of the corresponding one-off task schedule.
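A one-off run is typically followed by a look at the resulting job. The sketch below combines the two documented calls; it assumes that the job ID returned here can be passed directly as the job_id argument of query_schedule_job, which the signatures suggest but which should be verified against your hana-ml version. The function name is hypothetical.

```python
def run_once_and_inspect(sexec, task_id):
    """Trigger a one-off execution of a task and query the resulting job.

    sexec is assumed to be a ScheduledExecution instance. Returns the job ID
    together with whatever query_schedule_job returns for that job.
    """
    job_id = sexec.create_one_off_task_schedule(task_id)
    # Assumption: the returned job ID is valid input for query_schedule_job.
    job_info = sexec.query_schedule_job(task_id, job_id=job_id)
    return job_id, job_info
```

The job may still be running when it is queried, so the query may need to be repeated later to see the final status.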

alter_task_schedule(task_id, cron=None, recurrence_range=None)

Alter a schedule.

Parameters

task_idstr: Name of the task whose schedule is to be altered.
cronstr, optional: Specifies the frequency pattern of task execution; its format is the same as the cron parameter in create_task_schedule.
recurrence_rangedict, optional: This parameter specifies the range of time allowed for scheduled task execution. The setting of this parameter is the same as that of the recurrence_range parameter in create_task_schedule.

Returns

DataFrame: DataFrame containing the information of the altered scheduled execution.

pause_task_schedule(task_id)

Pause a running schedule.

Parameters

task_idstr: Task ID.

Returns

DataFrame: DataFrame containing the information of the (paused) task schedule.

resume_task_schedule(task_id)

Resume a paused schedule.

Parameters

task_idstr: Task ID.

Returns

DataFrame: DataFrame containing the information of the (resumed) task schedule.

get_executed_task_jobs(task_id, job_id=None, order='desc')

Retrieve the executed task jobs (from table "PAL_SCHEDULED_EXECUTION"."TASK_SCHEDULE_JOB").

Parameters

task_idstr

Task ID.

job_idint, optional

Job ID.

Defaults to None.

order{'asc', 'desc'}, optional

Sort order of the retrieved records by execution start time.

Defaults to 'desc'.

Returns

DataFrame: DataFrame containing the information of the executed task jobs.

get_task_schedules(task_owner=None)

Get the info of scheduled jobs from system view SCHEDULER_JOBS via task owner specification.

Parameters

task_ownerstr, optional

Task owner.

Defaults to the value of class attribute current_user.

Returns

DataFrame: Filtered view of SCHEDULER_JOBS.

cancel_schedule_job(task_id, max_wait_duration)

Cancel running scheduled job.

Parameters

task_idstr: Task ID.
max_wait_durationint: Maximum wait duration for canceling the schedule job, in seconds.

Returns

DataFrame: DataFrame containing result message of the cancel process.

remove_task_schedule(task_id)

Remove the scheduled execution of a task.

Parameters

task_idstr: Task ID.

Returns

DataFrame: DataFrame containing the information of the scheduled task execution.

create_fit_task(obj, fit_params, task_id, output_table_names=None, proc_name=None, force=True)

Create a task for the fit() method of a hana-ml object.

Parameters

objhana-ml object

A hana-ml object (i.e. an instance of some hana-ml class) with a callable fit() method.

For example, obj can be a hana-ml object defined as follows:

from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
obj = UnifiedClassification(func='HybridGradientBoostingTree', n_estimators=100)

fit_paramsdict

The key-value arguments (parameters) passed to the fit() method of obj. Intrinsically it is the execution of

obj.fit(**fit_params)

to be scheduled.

task_idstr

The task ID for the task associated with the stored SQL procedure associated with the execution of the fit() method of obj.

output_table_namesListOfStrings, optional

User-specified names of output tables for the corresponding PAL procedure for model fitting.

If not provided, the table names will be automatically generated.

proc_namestr, optional

Procedure name of the generated stored SQL procedure.

Defaults to f"PROCEDURE_{task_id}" if not provided.

forcebool, optional

Specifies whether or not to force the creation of the task schedule for the execution of the fit() method of obj.

If set as True, it will first try to drop previously existing procedures with the same name as well as tasks with the same task_id, then recreate them.

Defaults to True.

create_fit_schedule(obj, fit_params, task_id, cron, recurrence_range=None, output_table_names=None, proc_name=None, force=True)

Create a scheduled execution of the fit() method of a hana-ml object. To achieve this, the following actions are taken in sequence:

1. A stored SQL procedure is first created for the fit() method to be executed
2. A task is created for the stored SQL procedure
3. The task created in Step 2 is scheduled for future execution
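The three steps can also be pictured as two separate calls. The sketch below assumes that create_fit_task covers Steps 1 and 2 (it generates the stored procedure and creates the task) and that create_task_schedule then performs Step 3; this decomposition is an assumption made for illustration, not a statement about how create_fit_schedule is implemented. The function name is hypothetical.

```python
def schedule_fit_in_two_calls(sexec, obj, fit_params, task_id, cron, recurrence_range=None):
    """Create a fit task first, then attach a schedule to it.

    sexec is assumed to be a ScheduledExecution instance and obj a hana-ml
    object with a callable fit() method.
    """
    # Steps 1 and 2 (assumed): generate the stored procedure and create the task.
    sexec.create_fit_task(obj=obj, fit_params=fit_params, task_id=task_id, force=True)
    # Step 3: schedule the task that was just created.
    return sexec.create_task_schedule(
        task_id=task_id, cron=cron, recurrence_range=recurrence_range
    )
```

Splitting the calls this way can be useful when the task should exist before its schedule is decided, for example to inspect it with get_task_definition first.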

Parameters

objhana-ml object

A hana-ml object (i.e. an instance of some hana-ml class) with a callable fit() method.

For example, obj can be a hana-ml object defined as follows:

from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
obj = UnifiedClassification(func='HybridGradientBoostingTree', n_estimators=100)

fit_paramsdict

The key-value arguments (parameters) passed to the fit() method of obj. Intrinsically it is the execution of

obj.fit(**fit_params)

to be scheduled.

task_idstr

The task ID for the task associated with the stored SQL procedure associated with the execution of the fit() method of obj.

cronstr

Specifies the frequency pattern of task to be executed, which is the same as the definition of cron in method create_task_schedule.

recurrence_rangedict, optional

This parameter specifies the range of time allowed for scheduled task execution. It is the same as the definition of recurrence_range in method create_task_schedule.

output_table_namesListOfStrings, optional

User-specified names of output tables for the corresponding PAL procedure for model fitting.

If not provided, the table names will be automatically generated.

proc_namestr, optional

Procedure name of the generated stored SQL procedure.

Defaults to f"PROCEDURE_{task_id}" if not provided.

forcebool, optional

Specifies whether or not to force the creation of the task schedule for the execution of the fit() method of obj.

If set as True, it will first try to drop previously existing procedures with the same name as well as tasks/schedules with the same task_id, then recreate and reschedule them.

Defaults to True.

Examples

Assume that a dataset for classification is stored in table "CLS_DATA_TBL" (with ID column "ID" and label column "CLASS"). To schedule the training of an HGBT model using the UnifiedClassification interface provided in hana-ml, we can proceed as follows:

>>> from hana_ml.dataframe import ConnectionContext
>>> from hana_ml.algorithms.pal.scheduler import ScheduledExecution
>>> cc = ConnectionContext(address=..., port=..., user=..., password=...)
>>> data = cc.table("CLS_DATA_TBL")
>>> fit_params = dict(data=data, key="ID", label="CLASS")
>>> scheduler = ScheduledExecution(cc)
>>> from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
>>> uhgc = UnifiedClassification(func="HybridGradientBoostingTree", n_estimators=100)
>>> schedule_info = scheduler.create_fit_schedule(obj=uhgc,
...                                               fit_params=fit_params,
...                                               task_id="CLS_DATA_TBL_FIT",
...                                               cron="2025 3 14 * 9 0 0",
...                                               force=True)

get_fit_sql_proc_create_statement(): Get the SQL statement for creating the fit procedure that has been scheduled for execution.

list_materialized_tables_fit(): Get the materialization table names of temp tables for the scheduled hana-ml fit() execution.

list_output_tables_fit(): Get the output table names for the scheduled hana-ml fit() execution.

create_predict_task(obj, predict_params, task_id, proc_name=None, output_table_names=None, force=True)

Create a task for the predict() method of a hana-ml object.

Parameters

objhana-ml object

A hana-ml object (i.e. an instance of some hana-ml class) with a callable predict() method.

For example, obj can be a hana-ml object defined as follows:

from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
obj = UnifiedClassification(func='HybridGradientBoostingTree', n_estimators=100)

predict_paramsdict

The key-value arguments (parameters) passed to the predict() method of obj, i.e.

obj.predict(**predict_params)

to be scheduled.

task_idstr

The task ID for the task associated with the stored SQL procedure associated with the execution of the predict() method of obj.

output_table_namesListOfStrings, optional

User-specified names of output tables for the corresponding PAL procedure for prediction.

If not provided, the table names will be automatically generated.

proc_namestr, optional

Procedure name of the generated stored SQL procedure.

Defaults to f"PROCEDURE_{task_id}" if not provided.

forcebool, optional

Specifies whether or not to force the creation of the task schedule for the execution of the predict() method of obj.

If set as True, it will first try to drop previously existing procedures with the same name as well as tasks with the same task_id, then recreate them.

Defaults to True.

create_score_task(obj, score_params, task_id, proc_name=None, output_table_names=None, force=True)

Create a task for the score() method of a hana-ml object.

Parameters

objhana-ml object

A hana-ml object (i.e. an instance of some hana-ml class) with a callable score() method.

For example, obj can be a hana-ml object defined as follows:

from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
obj = UnifiedClassification(func='HybridGradientBoostingTree', n_estimators=100)

score_paramsdict

The key-value arguments (parameters) passed to the score() method of obj, i.e.

obj.score(**score_params)

task_idstr

The task ID for the task associated with the stored SQL procedure associated with the execution of the score() method of obj.

output_table_namesListOfStrings, optional

User-specified names of output tables for the corresponding PAL procedure for model scoring.

If not provided, the table names will be automatically generated.

proc_namestr, optional

Procedure name of the generated stored SQL procedure.

Defaults to f"PROCEDURE_{task_id}" if not provided.

forcebool, optional

Specifies whether or not to force the creation of the task schedule for the execution of the score() method of obj.

If set as True, it will first try to drop previously existing procedures with the same name as well as tasks with the same task_id, then recreate them.

Defaults to True.

create_predict_schedule(obj, predict_params, task_id, cron, proc_name=None, recurrence_range=None, output_table_names=None, force=True)

Create a scheduled execution of the predict() method of a hana-ml object. A prerequisite is to execute the fit() method of the hana-ml object first, so that a model is available for inference.

Then, to achieve this, the following actions are taken in sequence:

1. A stored SQL procedure is first created for the predict() method to be executed
2. A task is created for the stored SQL procedure
3. The task created in Step 2 is scheduled for future execution

Parameters

objhana-ml object

A hana-ml object (i.e. an instance of some hana-ml class) with a callable predict() method. It needs to be fitted first.

For example, obj can be a hana-ml object defined as follows:

from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
obj = UnifiedClassification(func='HybridGradientBoostingTree', n_estimators=100).fit(data=data, key=..., label=...)

predict_paramsdict

The key-value arguments (parameters) passed to the predict() method of obj. Intrinsically it is the execution of

obj.predict(**predict_params)

to be scheduled.

task_idstr

The task ID for the task associated with the stored SQL procedure associated with the execution of the predict() method of obj.

cronstr

Specifies the frequency pattern of task to be executed, which is the same as the definition of cron in method create_task_schedule.

recurrence_rangedict, optional

This parameter specifies the range of time allowed for scheduled task execution. It is the same as the definition of recurrence_range in method create_task_schedule.

output_table_namesListOfStrings, optional

User-specified names of output tables for the corresponding PAL procedure for prediction.

If not provided, the table names will be automatically generated.

proc_namestr, optional

Procedure name of the generated stored SQL procedure.

Defaults to f"PROCEDURE_{task_id}" if not provided.

forcebool, optional

Specifies whether or not to force the creation of the task schedule for the execution of the predict() method of obj.

If set as True, it will first try to drop previously existing procedures with the same name as well as tasks/schedules with the same task_id, then recreate and reschedule them.

Defaults to True.

Examples

Assume that the training dataset for classification is stored in table "CLS_DATA_TBL_TRAIN" and that separate data for prediction is stored in table "CLS_DATA_TBL_PREDICT". We want to schedule the prediction of an HGBT model, built through the UnifiedClassification interface, on the prediction dataset. We can proceed as follows:

>>> from hana_ml.dataframe import ConnectionContext
>>> from hana_ml.algorithms.pal.scheduler import ScheduledExecution
>>> cc = ConnectionContext(address=..., port=..., user=..., password=...)
>>> scheduler = ScheduledExecution(cc)
>>> train_data = cc.table("CLS_DATA_TBL_TRAIN")
>>> from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
>>> uhgc = UnifiedClassification(func="HybridGradientBoostingTree",
...                              n_estimators=100).fit(data=train_data, key=...)
>>> predict_data = cc.table("CLS_DATA_TBL_PREDICT")
>>> predict_params = dict(data=predict_data, key=...)
>>> schedule_info = scheduler.create_predict_schedule(obj=uhgc,
...                                                   predict_params=predict_params,
...                                                   task_id="CLS_DATA_TBL_PREDICT",
...                                                   cron="2025 3 14 * 9 0 0",#means 9:00 AM, March 14, 2025
...                                                   force=True)

get_predict_sql_proc_create_statement(): Get the SQL statement for creating the predict procedure that has been scheduled for execution.

list_materialized_tables_predict(): Get the materialization table names of temp tables for the scheduled hana-ml predict() execution.

list_output_tables_predict(): Get the output table names for the scheduled hana-ml predict() execution.

create_score_schedule(obj, score_params, task_id, cron, proc_name=None, recurrence_range=None, output_table_names=None, force=True)

Create a scheduled execution of the score() method of a hana-ml object (the method must invoke a PAL SCORE procedure internally). A prerequisite is to execute the fit() method of the hana-ml object first, so that a model is available for scoring on test data.

Then, to achieve this, the following actions are taken in sequence:

1. A stored SQL procedure is first created for the score() method to be executed
2. A task is created for the stored SQL procedure
3. The task created in Step 2 is scheduled for future execution

Parameters

objhana-ml object

A hana-ml object (i.e. an instance of some hana-ml class) with a callable score() method which can invoke the execution of a PAL SCORE procedure. It needs to be fitted first.

For example, obj can be a hana-ml object defined as follows:

from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
obj = UnifiedClassification(func='HybridGradientBoostingTree', n_estimators=100).fit(data=data, key=..., label=...)

score_paramsdict

The key-value arguments (parameters) passed to the score() method of obj. Intrinsically it is the execution of

obj.score(**score_params)

to be scheduled.

task_idstr

The task ID for the task associated with the stored SQL procedure associated with the execution of the score() method of obj.

cronstr

Specifies the frequency pattern of task to be executed, which is the same as the definition of cron in method create_task_schedule.

proc_namestr, optional

Procedure name of the generated stored SQL procedure.

Defaults to f"PROCEDURE_{task_id}" if not provided.

recurrence_rangedict, optional

This parameter specifies the range of time allowed for scheduled task execution. It is the same as the definition of recurrence_range in create_task_schedule.

output_table_namesListOfStrings, optional

User-specified names of output tables for the corresponding PAL procedure for model scoring.

If not provided, the table names will be automatically generated.

forcebool, optional

Specifies whether or not to force the creation of the task schedule for the execution of the score() method of obj.

If set as True, it will first try to drop previously existing procedures with the same name as well as tasks/schedules with the same task_id, then recreate and reschedule them.

Defaults to True.

Examples

Assume that a dataset for classification is split into train and test parts, stored separately in table "CLS_DATA_TBL_TRAIN" and table "CLS_DATA_TBL_TEST". We want to schedule the scoring of an HGBT model, trained using the UnifiedClassification interface provided in hana-ml, on the test data. We can proceed as follows:

>>> from hana_ml.dataframe import ConnectionContext
>>> from hana_ml.algorithms.pal.scheduler import ScheduledExecution
>>> cc = ConnectionContext(address=..., port=..., user=..., password=...)
>>> scheduler = ScheduledExecution(cc)
>>> from hana_ml.algorithms.pal.unified_classification import UnifiedClassification
>>> uhgc = UnifiedClassification(func="HybridGradientBoostingTree",
...                              n_estimators=100)
>>> train_data = cc.table("CLS_DATA_TBL_TRAIN")
>>> uhgc.fit(data=train_data, key=...)#fit the training data first to generate a model for the inference task
>>> test_data = cc.table("CLS_DATA_TBL_TEST")
>>> score_params = dict(data=test_data, key=..., label=...)
>>> schedule_info = scheduler.create_score_schedule(obj=uhgc,
...                                                 score_params=score_params,
...                                                 task_id="CLS_DATA_TBL_SCORE",
...                                                 cron="2025 3 14 * 9 0 0",#means 9:00 AM, March 14, 2025
...                                                 force=True)

get_score_sql_proc_create_statement(): Get the SQL statement for creating the score procedure that has been scheduled for execution.

list_materialized_tables_score(): Get the materialization table names of temp tables for the scheduled hana-ml score() execution.

list_output_tables_score(): Get the output table names for the scheduled hana-ml score() execution.

Inherited Methods from PALBase

Besides the methods mentioned above, the ScheduledExecution class also inherits methods from the PALBase class. Please refer to PAL Base for more details.