text_log_parse

hana_ml.text.tm.text_log_parse(data, thread_number=None, thread_ratio=None)

The log parsing algorithm analyzes logs and extracts log templates. It currently supports English logs.

Note: Table variables do not support emojis.

Parameters

data : DataFrame

Input data, structured as follows:

1st column, ID.
2nd column, Log content with newline-separated entries.
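The two-column layout above can be prepared on the client before calling the function. The following is a minimal sketch, not the library's prescribed workflow: the helper names `build_log_rows` and `upload_logs`, the column names `ID` and `LOG`, the table name `LOG_INPUT_TBL`, and the sample log lines are all assumptions made for illustration. It assumes an open `ConnectionContext` and uses `hana_ml.dataframe.create_dataframe_from_pandas` to upload the rows.

```python
def build_log_rows(log_texts):
    """Pair each log text with a 0-based integer ID, giving (ID, content) tuples."""
    return [(i, text) for i, text in enumerate(log_texts)]


def upload_logs(conn, rows, table_name="LOG_INPUT_TBL"):
    """Upload (ID, content) rows as a two-column HANA table (hypothetical names).

    `conn` is assumed to be an open hana_ml.dataframe.ConnectionContext.
    """
    # Imported here so that build_log_rows works without pandas or hana_ml.
    import pandas as pd
    from hana_ml.dataframe import create_dataframe_from_pandas

    # 1st column: ID, 2nd column: log content.
    pdf = pd.DataFrame(rows, columns=["ID", "LOG"])
    return create_dataframe_from_pandas(conn, pdf, table_name, force=True)


# Invented sample logs; the first cell holds two newline-separated entries.
rows = build_log_rows([
    "Connection from 10.0.0.1 closed\nConnection from 10.0.0.2 closed",
    "Disk usage at 91 percent",
])
```

With a live connection, `text_df = upload_logs(conn, rows)` would yield a DataFrame in the expected shape.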

thread_number : int, optional

Number of threads to use.

Defaults to 1.

thread_ratio : float, optional

Ratio of the available threads that this function may use, in the range 0 to 1. A value of 0 uses a single thread and a value of 1 uses all available threads. Values outside this range result in an error.

Defaults to 0.0.
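Because an out-of-range thread_ratio results in an error, a caller may prefer to check the value before invoking the function. The sketch below is a hypothetical client-side helper, not part of hana_ml; the name `validate_thread_ratio` and the choice of `ValueError` are assumptions, and the library reports its own error independently of this check.

```python
def validate_thread_ratio(thread_ratio):
    """Hypothetical pre-check: return the ratio as a float if it lies in [0, 1].

    None is passed through unchanged so the library default still applies.
    """
    if thread_ratio is None:
        return None
    ratio = float(thread_ratio)
    # The chained comparison is also False for NaN, so NaN is rejected too.
    if not 0.0 <= ratio <= 1.0:
        raise ValueError(
            "thread_ratio must be between 0 and 1, got {!r}".format(thread_ratio)
        )
    return ratio


ratio = validate_thread_ratio(0.5)
```

The validated value could then be passed on as `text_log_parse(data=text_df, thread_ratio=ratio)`.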

Returns

A tuple of DataFrames

Sentences Template Result: rows of parsed sentences and their templates.
Template Info Result: distinct templates and frequencies.
Extra Result: auxiliary key-value metadata.

Examples

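>>> from hana_ml.text.tm import text_log_parse
>>> # text_df: a hana_ml DataFrame with an ID column followed by a log content column
>>> sent_res, tpl_res, extra_res = text_log_parse(data=text_df)
>>> sent_res.collect()
>>> tpl_res.collect()
>>> extra_res.collect()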