The HANA Sentiment Analysis component enables you to analyze a complex stream of text
(for example, the opinions of Twitter users about a product or service). The component
analyzes the opinion contained in each unit of text and reports whether the sentiment is
positive or negative. This way, you can transform your unstructured data into a series of
easily understandable categories to discover influencing factors. From there, you can
generate insights to better run your business.
Prerequisites:
Take the following steps to analyze a stream of text for sentiments:
- In Expert Analytics, connect to a data source. For example, to analyze Twitter
users' opinions on a product or service, you could use a table called TwitterFeed.
- In the Predict Room, from the Component List select Data Preparation >
Preprocessors > HANA Sentiment Analysis. Drag and drop the HANA Sentiment Analysis
component to the analysis editor. Alternatively, double-click the HANA Sentiment
Analysis component. Click OK.
- Double-click the HANA Sentiment Analysis component to work with its
configuration settings. Alternatively, click the Settings icon on the component
and select Configure Settings from the context menu.
- In the HANA Sentiment dialog box, in the Properties panel, select a
Target Variable from the menu. The menu lists only text columns of the
following types: TEXT, BINTEXT, VARCHAR, NCLOB, CLOB, or BLOB.
- Enter a Sentiment Column Name, which is the name of the output column.
In the Twitter example, this is the column into which the sentiment of each
tweet is written.
- In the Advanced panel, take the following actions in the Behavior section:
  - Select the languages of the text for analysis. By default, the component
    analyzes all supported languages. You can optimize the analysis by specifying
    only the languages contained in the dataset.
  - Select the MIME type to indicate the type of text contained in your
    target variable. By default, the component analyzes all supported MIME types.
    You can optimize the analysis by specifying only the MIME types contained in
    the dataset.
  - Select or clear the Enable Profanities checkbox to choose whether the
    analysis reports the number of profanities.
  - Map the sentiments that you are interested in for analysis. In the same
    section, name the sentiments for use in analysis and reporting. In the
    Twitter example, you can map each sentiment as either good or bad. That way,
    you can work with a two-class problem. Click Done.
- Once the component is configured, you can use the sentiments for analysis. For
example, you can complete the analysis with a decision tree, which you can add to
the analysis chain from the Algorithms section of the Component List panel.
Note
The analysis is available for display in visual support tools such as a decision tree.
- Click the Run Analysis icon. Allow time for the analysis to complete: a full text
index is created during execution, which can extend the execution time depending on
the amount of text that is tokenized and analyzed.
- Click the Results tab to view the Summary of the results.
The Summary includes the Total input records, Records with sentiments and Records without
sentiments, and a breakdown of your mapped sentiments. In the Twitter example, the
Summary includes a percentage of good and bad sentiments and the number of unique
tokens.
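The figures in the Summary can be illustrated with a minimal Python sketch. It is not how the component computes its results. It assumes that the output column holds a mapped label for each record, that None marks a record without a sentiment, and that percentages are taken over the records that carry a sentiment. The function name and the sample data are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def summarize_sentiments(labels: Sequence[Optional[str]]) -> dict:
    """Summarize mapped sentiment labels; None marks a record without a sentiment."""
    total = len(labels)
    found = [label for label in labels if label is not None]
    counts = Counter(found)
    # Assumption: percentages are relative to records that carry a sentiment.
    shares = {}
    if found:
        shares = {
            label: round(100.0 * n / len(found), 1) for label, n in counts.items()
        }
    return {
        "total_input_records": total,
        "records_with_sentiments": len(found),
        "records_without_sentiments": total - len(found),
        "sentiment_percentages": shares,
    }


# Hypothetical contents of the sentiment output column for five tweets.
tweets = ["good", "bad", None, "good", "good"]
summary = summarize_sentiments(tweets)
print(summary)
```

In this sample, four of the five records carry a sentiment, and three of those four are mapped to good.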
You can now configure the HANA Sentiment Analysis component and use it as a
preprocessing step in a complex analysis.
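As an illustration of the sentiment mapping used as a preprocessing step, the following Python sketch collapses fine-grained sentiment names into the two classes good and bad and writes them to an output column. The sentiment names in the mapping, the row layout, and the column names are assumptions made for this example; they are not a list of what the component reports.

```python
from typing import Dict, List, Optional

# Hypothetical mapping from fine-grained sentiment names to two classes.
# The keys are illustrative only.
SENTIMENT_MAP: Dict[str, str] = {
    "StrongPositiveSentiment": "good",
    "WeakPositiveSentiment": "good",
    "StrongNegativeSentiment": "bad",
    "WeakNegativeSentiment": "bad",
}


def add_sentiment_column(
    rows: List[dict],
    raw_key: str = "raw_sentiment",
    out_key: str = "Sentiment",
) -> List[dict]:
    """Return copies of the rows with a mapped two-class sentiment column."""
    result = []
    for row in rows:
        enriched = dict(row)  # copy so the input rows stay unchanged
        # Unmapped or missing sentiments become None (no sentiment).
        mapped: Optional[str] = SENTIMENT_MAP.get(row.get(raw_key))
        enriched[out_key] = mapped
        result.append(enriched)
    return result


# Hypothetical rows from a table such as TwitterFeed.
feed = [
    {"text": "Love this product", "raw_sentiment": "StrongPositiveSentiment"},
    {"text": "Not great", "raw_sentiment": "WeakNegativeSentiment"},
    {"text": "Arrived on Tuesday", "raw_sentiment": None},
]
prepared = add_sentiment_column(feed)
for row in prepared:
    print(row["text"], "->", row["Sentiment"])
```

The resulting two-class column could then serve as input for a downstream algorithm such as a decision tree.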