text_log_parse

hana_ml.text.tm.text_log_parse(data, thread_number=None, thread_ratio=None)

The log parsing algorithm analyzes logs and extracts log templates. It currently supports English logs.

Note: Table variables do not support emojis.

Parameters
data : DataFrame

Input data, structured as follows:

  • 1st column, ID.

  • 2nd column, Log content with newline-separated entries.
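As a minimal sketch of the expected input layout, the snippet below builds two-column rows (ID, log content) in plain Python, joining the log lines of each row with a newline. The column names `ID` and `LOG_CONTENT`, the sample log lines, and the helper `build_log_rows` are hypothetical and not part of the API. Loading the rows into a HANA DataFrame (for example via pandas and `hana_ml.dataframe.create_dataframe_from_pandas`) is left out because it needs a live connection.

```python
# Hypothetical sample logs; each inner list holds the log lines of one row.
raw_logs = [
    ["Connection from 10.0.0.1 closed", "Connection from 10.0.0.2 closed"],
    ["Disk usage at 91 percent"],
]


def build_log_rows(entries):
    """Return (ID, content) rows with the log lines joined by a newline."""
    return [(idx, "\n".join(lines)) for idx, lines in enumerate(entries)]


COLUMNS = ("ID", "LOG_CONTENT")  # assumed names, not mandated by the API
rows = build_log_rows(raw_logs)
```

Each tuple in `rows` corresponds to one row of the input table: the first element is the ID and the second is the newline-separated log content.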

thread_number : int, optional

Number of threads to use.

Defaults to 1.

thread_ratio : float, optional

Specifies the ratio of threads that can be used by this function. The range of this parameter is from 0 to 1, where 0 means only using one thread and 1 means using all available threads. Values outside this range are ignored, and the function heuristically determines the number of threads to use.

Defaults to 0.0.

Returns
A tuple of DataFrames
  • Sentences Template Result: rows of parsed sentences and their templates.

  • Template Info Result: distinct templates and frequencies.

  • Extra Result: auxiliary key-value metadata.

Examples

>>> from hana_ml.text.tm import text_log_parse
>>> sent_res, tpl_res, extra_res = text_log_parse(data=text_df)
>>> sent_res.collect()
>>> tpl_res.collect()
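
The call above can also be wrapped with an explicit thread setting. This is a sketch only, assuming `text_df` is an existing hana_ml DataFrame in the layout described under Parameters. The helper name `parse_logs` and the value 0.5 are illustrative and not part of the library.

```python
def parse_logs(text_df, thread_ratio=0.5):
    """Run text_log_parse and fetch all three results to the client."""
    # Imported lazily so this helper can be defined without hana_ml installed.
    from hana_ml.text.tm import text_log_parse

    sent_res, tpl_res, extra_res = text_log_parse(
        data=text_df,
        thread_ratio=thread_ratio,
    )
    # collect() pulls each result DataFrame from the database to the client.
    return sent_res.collect(), tpl_res.collect(), extra_res.collect()
```

Calling `parse_logs(text_df)` returns the three results in the order listed under Returns.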