Configuring the HANA Sentiment Analysis Component

The HANA Sentiment Analysis component enables you to analyze a complex stream of text (for example, the opinions of Twitter users about a product or service). The component analyzes the opinion contained in each unit of text and reports whether the sentiment is positive or negative. This way, you can transform your unstructured data into a series of easily understandable categories to discover influencing factors. From there, you can generate insights to better run your business.

Prerequisites:
  • Server: HANA system (SPS 9+) with PAL, APL and R configured.

  • Client: Predictive Analytics 2.4 installed and R configured.

Take the following steps to analyze a stream of text for sentiments:

  1. In Expert Analytics, connect to a Data Source. For example, for an analysis of Twitter user opinions on a product or service, you could use a table called TwitterFeed.
  2. In the Predict Room, from the Component List select Data Preparation - Preprocessors - HANA Sentiment Analysis. Drag and drop the HANA Sentiment Analysis component onto the analysis editor. Alternatively, double-click the HANA Sentiment Analysis component. Click OK.
  3. Double-click the HANA Sentiment Analysis component to work with its configuration settings. Alternatively, on the component click the Settings icon and from the context menu, select Configure Settings.
  4. In the HANA Sentiment dialog box, in the Properties panel select a Target Variable from the menu. The menu is filtered to list only text columns of the following types: TEXT, BINTEXT, VARCHAR, NCLOB, CLOB or BLOB.
  5. Enter a Sentiment Column Name, which is the name of the output column. In the example of Twitter, this is the column into which the sentiment for each tweet is written.
  6. In the Advanced panel, take the following actions in the Behavior section:
    1. Select the languages of the text for analysis. By default, the component analyzes all supported languages, but you can optimize the analysis by specifying only the languages contained in the dataset.
    2. Select the MIME type to choose the type of text contained in your target variable. By default, the component analyzes all supported MIME types, but you can optimize the analysis by specifying only the MIME types contained in the dataset.
    3. Choose whether or not to report the number of profanities in the analysis via the Enable Profanities checkbox.
    4. Map the sentiments that you are interested in for analysis. In the same section, name the sentiments for use in analysis and reporting. In the example of Twitter, you can map each sentiment as either good or bad. That way, you can work with a two-class problem. Click Done.
  7. When the component is configured, you can use the sentiments for analysis. For example, you can complete the analysis with a decision tree, which you can add to the analysis chain from the Algorithms section of the Component List panel.
    Note

    The analysis is available for display in visual support tools such as a decision tree.

  8. Click the Run Analysis icon. Allow time for the analysis to complete. During execution a full-text index is created, which can extend the execution time depending on the amount of text that is tokenized and analyzed.
  9. Click the Results tab to view the Summary of the results.
The Summary includes the Total input records, Records with sentiments and Records without sentiments, and a breakdown of your mapped sentiments. In the Twitter example, the Summary includes a percentage of good and bad sentiments and the number of unique tokens.
You can now configure the HANA Sentiment Analysis component and use it as a pre-processing step in a complex analysis.
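
To make the sentiment mapping and the Summary figures more concrete, the following Python sketch reproduces the idea outside the product. It is an illustration only and not the component's implementation. The raw sentiment labels, the SENTIMENT_MAP dictionary, and the map_sentiment and summarize functions are all assumptions introduced for this example. The sketch maps each raw label to good or bad, as in the Twitter example, and then computes the total input records, the records with and without sentiments, and the percentage breakdown of the mapped sentiments.

```python
from collections import Counter

# Hypothetical mapping from raw sentiment labels to the two classes used in
# the Twitter example. The raw label names are assumptions for illustration.
SENTIMENT_MAP = {
    "StrongPositiveSentiment": "good",
    "WeakPositiveSentiment": "good",
    "WeakNegativeSentiment": "bad",
    "StrongNegativeSentiment": "bad",
}


def map_sentiment(raw_label):
    """Return the mapped class, or None if the label is missing or unmapped."""
    return SENTIMENT_MAP.get(raw_label)


def summarize(raw_labels):
    """Compute figures similar to those listed in the Summary."""
    mapped = [map_sentiment(label) for label in raw_labels]
    with_sentiment = [m for m in mapped if m is not None]
    counts = Counter(with_sentiment)
    total = len(mapped)
    return {
        "total_input_records": total,
        "records_with_sentiments": len(with_sentiment),
        "records_without_sentiments": total - len(with_sentiment),
        # Percentages are relative to the records that carry a sentiment.
        "breakdown_percent": {
            name: 100.0 * n / len(with_sentiment) for name, n in counts.items()
        },
    }


# One entry per input record; None stands for a record with no sentiment.
sample = [
    "StrongPositiveSentiment",
    "WeakNegativeSentiment",
    None,
    "WeakPositiveSentiment",
    "StrongNegativeSentiment",
]
summary = summarize(sample)
print(summary)
```

Because every record ends up in one of two classes (or in neither), the mapped column can be passed to a two-class algorithm such as a decision tree, as described in step 7.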