sentiment_analysis

hana_ml.text.ta.sentiment_analysis(data, lang=None, thread_ratio=None, timeout=None)

A sentiment score is a numerical representation of the sentiment or emotion conveyed in a piece of text, such as a tweet, a product review, or an article. It indicates whether the expressed sentiment is positive, negative, or neutral. Sentiment scores help businesses, marketers, and data scientists make data-driven decisions. This function outputs document-, sentence-, and word-level sentiment.

data : DataFrame

The input data for text analysis. It should be a DataFrame structured as follows:

  • 1st column : ID of input text, of type INT, VARCHAR or NVARCHAR

  • 2nd column : Text content, of type VARCHAR, NVARCHAR or NCLOB

  • 3rd column (optional) : Language of the text content. Can be 'en', 'de', 'fr', 'es', 'pt' or NULL (the language is then detected automatically).
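The layout above can be checked on the client before any data reaches the server. The following is a minimal sketch in plain Python, assuming the rows are held as tuples; `check_rows` and `SUPPORTED_LANGS` are hypothetical names used for illustration and are not part of hana_ml.

```python
# Hypothetical helper, not part of hana_ml: checks rows against the
# two- or three-column layout described above.
SUPPORTED_LANGS = {"en", "de", "fr", "es", "pt"}


def check_rows(rows):
    """Raise ValueError if any row violates the expected layout."""
    for row in rows:
        if len(row) not in (2, 3):
            raise ValueError(f"expected 2 or 3 columns, got {len(row)}")
        doc_id, text = row[0], row[1]
        # ID column: INT, VARCHAR or NVARCHAR map to int or str here.
        if not isinstance(doc_id, (int, str)):
            raise ValueError(f"bad ID type: {type(doc_id).__name__}")
        # Text column: VARCHAR, NVARCHAR or NCLOB map to str here.
        if not isinstance(text, str):
            raise ValueError(f"bad text type: {type(text).__name__}")
        # Optional language column: a supported code, or None for NULL.
        if len(row) == 3 and row[2] is not None and row[2] not in SUPPORTED_LANGS:
            raise ValueError(f"unsupported language: {row[2]!r}")
    return True


rows = [
    (1, "The service was excellent.", "en"),
    (2, "Das Produkt ist leider defekt.", None),  # NULL: language auto-detected
]
check_rows(rows)
```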

lang : {'en', 'de', 'fr', 'es', 'pt'}, optional

Specifies the language of the input texts in data.

Effective only when the language column in data is not provided (i.e. data has two columns).
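The precedence between the language column and the `lang` argument can be sketched as follows. This is an illustration of the rule stated above, not the library's implementation; `effective_language` is a hypothetical name, and `None` stands for "no language specified".

```python
def effective_language(row, lang=None):
    """Return the language code that applies to one input row."""
    if len(row) == 3:
        # A language column is present, so `lang` has no effect;
        # a NULL (None) entry means the language is detected automatically.
        return row[2]
    # Two-column data: the `lang` argument applies to every row.
    return lang


print(effective_language((1, "Guten Tag", "de"), lang="en"))
print(effective_language((2, "Bonjour"), lang="fr"))
```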

thread_ratio : float, optional

Specifies the ratio of threads that can be used by this function, with valid range from 0 to 1, where

  • 0 means only using a single thread.

  • 1 means using at most all the currently available threads.

Values outside the valid range are ignored (no error is thrown); in that case the function heuristically determines the number of threads to use.

Defaults to 0.0.
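As a rough illustration of these rules, the sketch below maps a `thread_ratio` to a thread count. Only the endpoints (0 and 1) and the out-of-range behaviour come from the description above; the linear scaling in between, the `available` parameter, and the name `planned_threads` are assumptions made for this example.

```python
import os


def planned_threads(thread_ratio=None, available=None):
    """Return an illustrative thread count, or None for "heuristic"."""
    # `available` is a stand-in for the currently available threads.
    available = available or os.cpu_count() or 1
    if thread_ratio is None:
        thread_ratio = 0.0  # documented default
    if not 0 <= thread_ratio <= 1:
        # Out of range: ignored without error, the function decides itself.
        return None
    # Assumption: linear scaling between the two documented endpoints,
    # never dropping below a single thread.
    return max(1, int(available * thread_ratio))


print(planned_threads(0.5, available=8))
```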

timeout : int, optional

Specifies the maximum amount of time (in seconds) the client will wait for a response from the server.

Defaults to 10.

Returns
A tuple of DataFrames:
  • DataFrame 1 : Documents sentiment result table

  • DataFrame 2 : Sentences sentiment result table

  • DataFrame 3 : Phrases sentiment result table

  • DataFrame 4 : Sentences result table

  • DataFrame 5 : Extra result table

Examples

>>> from hana_ml.text.ta import sentiment_analysis
>>> doc_sentiment, sentence_sentiment, phrase_sentiment, sentences, extra = sentiment_analysis(data=df, thread_ratio=0.5, timeout=20)
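Because the function returns five DataFrames in a fixed order, it can be convenient to label them. The wrapper below is a sketch under stated assumptions: `analyze_table` and `RESULT_NAMES` are hypothetical names, `conn` is assumed to be an open `hana_ml.dataframe.ConnectionContext`, and the table is assumed to follow the input layout described for `data`.

```python
# Hypothetical labels for the five returned DataFrames, in documented order.
RESULT_NAMES = (
    "doc_sentiment",
    "sentence_sentiment",
    "phrase_sentiment",
    "sentences",
    "extra",
)


def analyze_table(conn, table_name, lang=None, thread_ratio=0.5, timeout=20):
    """Run sentiment_analysis on a table and label the five results."""
    # Imported lazily so this module loads even where hana_ml is absent.
    from hana_ml.text.ta import sentiment_analysis

    # Assumes `conn` is a hana_ml.dataframe.ConnectionContext.
    df = conn.table(table_name)
    results = sentiment_analysis(
        data=df, lang=lang, thread_ratio=thread_ratio, timeout=timeout
    )
    # Pair each returned DataFrame with a readable key.
    return dict(zip(RESULT_NAMES, results))
```

With a live connection, `analyze_table(conn, "REVIEWS")["doc_sentiment"].collect()` would fetch the document-level result to the client; `REVIEWS` is a placeholder table name.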