SAP HANA 2 - XS Classic JavaScript API Reference

new Session(p) → {$.text.mining.Session}

The Session object represents a Text Mining session.
This constructor function creates a Text Mining session object linked to the given reference table and column. Text Mining functions can subsequently be invoked using this object and they will use this linked reference data and the configuration parameters it was initialized with. Multiple such objects can be created to handle multiple sets of reference data.
This function does not initialize Text Mining for the reference table and column. Initialization is done separately, typically when the full text index is created for the table and column.
Note that it is possible to use a custom XS SQL connection to access the Text Mining reference table as a different user (see "Creating Custom XS SQL Connections" in the SAP HANA Developer Guide).

Parameters:

Name	Type	Description
`p`	object	Encapsulates constructor parameters.

Properties

Name	Type	Argument	Description
`referenceTable`	string		The table in which the reference documents are stored.
`referenceColumn`	string		The column in which the reference documents' text content is stored.
`connection`	$.db.Connection	<optional>	A database connection object that will be used to authenticate Text Mining database access. By default the credentials of the caller are used.

Returns:

A Text Mining session object that holds context for the session and is used to call the Text Mining method functions.

Type: $.text.mining.Session

Example

var TM = new $.text.mining.Session({
    referenceTable: "SYSTEM.TMDOCUMENTS",
    referenceColumn: "FILECONTENT"
});
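
The optional `connection` property can be illustrated with a small helper. This is a minimal sketch, assuming that `$.db.getConnection` accepts the repository path of an XS SQL connection configuration (.xssqlcc). The helper name `openSession` and the path used in the usage note are hypothetical. The `$` object is passed in as `xs` so that the helper can also be exercised with a stub outside XS.

```javascript
// Sketch: create a Text Mining session, optionally under a custom XS SQL connection.
// `xs` is the XS `$` object (passed in so that a stub can be used for testing).
// Assumption: xs.db.getConnection(path) returns a $.db.Connection for an .xssqlcc path.
function openSession(xs, referenceTable, referenceColumn, sqlccPath) {
    var params = {
        referenceTable: referenceTable,
        referenceColumn: referenceColumn
    };
    if (sqlccPath) {
        // Text Mining database access is then authenticated with this connection
        // instead of the credentials of the caller.
        params.connection = xs.db.getConnection(sqlccPath);
    }
    return new xs.text.mining.Session(params);
}
```

Inside an XS application this might be called as `openSession($, "SYSTEM.TMDOCUMENTS", "FILECONTENT", "acme.textmining::reader")`, where the last argument is a made-up connection configuration path.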

Methods

categorizeKNN(p) → {Array.<$.text.mining.Session~CategoryResult>}

Given an input document, this function returns the top-ranked category values for the given category set columns in the reference data, using the KNN (K Nearest Neighbors) method.

! Security note

This note applies to this function and to the subsequent Text Mining functions.

The following four parameters are SQL expressions. The calling application is responsible for preventing potentially malicious values from being passed in them:

inputDocumentSubquery
inputDocumentCondition
documentRestriction
termTypeRestriction
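
One way to act on this note is to never pass raw user input into these parameters, and instead to build the SQL fragment from values that have been validated first. The following is a minimal sketch of that idea, not a complete defense. The helper name `buildIdCondition` and the column name used in the usage note are hypothetical.

```javascript
// Sketch: build a "where" clause fragment from strictly validated parts.
// Anything that is not a simple identifier or a plain non-negative integer
// is rejected before it can reach a SQL-expression parameter such as
// inputDocumentCondition.
function buildIdCondition(columnName, rawId) {
    // Allow only simple SQL identifiers for the column (letters, digits, underscore).
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(columnName)) {
        throw new Error("Invalid column name");
    }
    // Allow only decimal digits for the ID value.
    if (!/^[0-9]+$/.test(String(rawId))) {
        throw new Error("Invalid document ID");
    }
    return '"' + columnName + '" = ' + String(rawId);
}
```

The result of `buildIdCondition("ID", userSuppliedId)` could then be passed as `inputDocumentCondition`, with `userSuppliedId` standing for whatever value the application received from its caller.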

Parameters:

Name	Type	Description
`p`	object	Encapsulates categorizeKNN parameters.

Properties

Name	Type	Argument	Description
`inputDocumentText` | `inputDocumentSubquery` | `inputDocumentCondition` | `inputDocumentIDs`	string		Input document to process. One and only one of the following:

`inputDocumentText`	This literal text
`inputDocumentSubquery`	Text returned by this SQL subquery
`inputDocumentCondition`	Text returned from reference table rows for which this SQL "where" clause is true
`inputDocumentIDs`	Text returned from reference table rows with this (these) internal document ID number(s)

`language`	string	<optional>	Language code of input text, e.g. "EN", "DE" ("" for all).
`mimeType`	string	<optional>	Mime type of input text, e.g. "text/plain" ("" for unspecified).
`categorySets`	Array.<string>		Category set column names in the reference table that have been assigned category values.
`kNN`	integer	<optional>	The number of nearest neighbors to be considered.
`top`	integer	<optional>	Maximum number of returned results.
`threshold`	number	<optional>	Restricts the returned results to those displaying a score greater than or equal to this numeric value in the range [0,1] (0 to allow all).
`documentRestriction`	string	<optional>	Specified condition (SQL "where" clause) to be met for reference document rows to be considered in the computation ("" for all).
`termTypeRestriction`	string	<optional>	Comma-separated list of term types to consider ("" for all).

Throws:

Throws an error if the parameters object is not valid or the execution fails.

Returns:

Array of CategoryResult objects.

Type: Array.<$.text.mining.Session~CategoryResult>

Example

var categoryResults = TM.categorizeKNN({
    inputDocumentSubquery: "SELECT CONTENT FROM TWEETS WHERE ID = 132",
    categorySets: ["SUBJECT", "REGION"],
    top: 15
});
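
Because results for several category sets come back in a single array, a caller often wants the best value per set. The following sketch shows one way to do that, relying only on the `categorySet` and `score` properties documented for CategoryResult. The helper name `bestCategoryPerSet` is hypothetical.

```javascript
// Sketch: reduce an array of CategoryResult objects to the best-scoring
// category value for each category set.
function bestCategoryPerSet(categoryResults) {
    // A prototype-free object avoids clashes with inherited property names.
    var best = Object.create(null);
    categoryResults.forEach(function (r) {
        var current = best[r.categorySet];
        if (!current || r.score > current.score) {
            best[r.categorySet] = r;
        }
    });
    return best;
}
```

With the example above, `bestCategoryPerSet(categoryResults).SUBJECT` would hold the highest-scoring result for the SUBJECT column, if any was returned.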

getRelatedDocuments(p) → {Array.<$.text.mining.Session~DocumentResult>}

Given an input document, this function returns the top-ranked related documents from the reference data, based on co-occurrence statistics of terms.

Parameters:

Name	Type	Description
`p`	object	Encapsulates getRelatedDocuments parameters.

Properties

Name	Type	Argument	Description
`inputDocumentText` | `inputDocumentSubquery` | `inputDocumentCondition` | `inputDocumentIDs`	string		Input document to process. One and only one of the following:

`inputDocumentText`	This literal text
`inputDocumentSubquery`	Text returned by this SQL subquery
`inputDocumentCondition`	Text returned from reference table rows for which this SQL "where" clause is true
`inputDocumentIDs`	Text returned from reference table rows with this (these) internal document ID number(s)

`language`	string	<optional>	Language code of input text, e.g. "EN", "DE" ("" for all).
`mimeType`	string	<optional>	Mime type of input text, e.g. "text/plain" ("" for unspecified).
`top`	integer	<optional>	Maximum number of returned results.
`threshold`	number	<optional>	Restricts the returned results to those displaying a score greater than or equal to this numeric value in the range [0,1] (0 to allow all).
`documentRestriction`	string	<optional>	Specified condition (SQL "where" clause) to be met for reference document rows to be considered in the computation ("" for all).
`termTypeRestriction`	string	<optional>	Comma-separated list of term types to consider ("" for all).
`includeColumns`	Array.<string>	<optional>	Specifies columns from the reference table that are to be included in the result table. This provides a way to obtain the content or other data belonging to returned documents.
`correlationMatrix`	boolean	<optional>	If specified and true, the returned result includes a document correlation matrix.
`principalComponents`	integer	<optional>	If specified and non-zero, the returned result includes the specified number of principal components of the correlation matrix. Must be in the range from 0 to 3.
`clustering`	string	<optional>	If specified, the returned result includes hierarchical clusters computed with the method indicated. The possible values "COMPLETE_LINKAGE", "SINGLE_LINKAGE", "AVG_DISTANCE_WITHIN", "AVG_DISTANCE_BETWEEN", and "WARD" stand respectively for the methods Complete Linkage, Single Linkage, Average Distance Within, Average Distance Between, and Ward's Method.

Throws:

Throws an error if the parameters object is not valid or the execution fails.

Returns:

Array of DocumentResult objects.

Type: Array.<$.text.mining.Session~DocumentResult>

Example

var documentResults = TM.getRelatedDocuments({
    top: 16,
    inputDocumentText: "animals",
    includeColumns: ["KEY", "FILECONTENT"]
});
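
The "one and only one" rule for the input document properties can be checked before the call, which gives the application a chance to report a clearer message than the error thrown by the method itself. This is a minimal sketch of such a pre-check. The helper name `assertSingleInputDocument` is hypothetical, and treating `null` as "not specified" is an assumption of this example.

```javascript
// Sketch: verify that exactly one input-document selector is present
// before the parameters object is handed to a Text Mining method.
var INPUT_DOCUMENT_KEYS = [
    "inputDocumentText",
    "inputDocumentSubquery",
    "inputDocumentCondition",
    "inputDocumentIDs"
];

function assertSingleInputDocument(params) {
    var present = INPUT_DOCUMENT_KEYS.filter(function (key) {
        // Assumption: undefined and null both count as "not specified".
        return params[key] !== undefined && params[key] !== null;
    });
    if (present.length !== 1) {
        throw new Error("Expected exactly one of " + INPUT_DOCUMENT_KEYS.join(", ") +
            " but found " + present.length);
    }
    // Return the name of the selector that was supplied.
    return present[0];
}
```

The same check applies to the other methods that take an input document (categorizeKNN and getRelevantTerms).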

getRelatedTerms(p) → {Array.<$.text.mining.Session~TermResult>}

Given an input term, this function returns the top-ranked related terms from the reference data, based on co-occurrence statistics.

Parameters:

Name	Type	Description
`p`	object	Encapsulates getRelatedTerms parameters.

Properties

Name	Type	Argument	Description
`inputTermText` | `inputTermIDs`	string		Input term to process. One and only one of the following:

`inputTermText`	This literal text. Typically a single term, but can be multiple terms with optional term types and wildcarding. See SAP HANA SQL and System Views Reference for details.
`inputTermIDs`	Text associated with reference table columns with this (these) internal term ID number(s)

`top`	integer	<optional>	Maximum number of returned results.
`threshold`	number	<optional>	Restricts the returned results to those displaying a score greater than or equal to this numeric value in the range [0,1] (0 to allow all).
`documentRestriction`	string	<optional>	Specified condition (SQL "where" clause) to be met for reference document rows to be considered in the computation ("" for all).
`termTypeRestriction`	string	<optional>	Comma-separated list of term types to consider ("" for all).
`correlationMatrix`	boolean	<optional>	If specified and true, the returned result includes a term correlation matrix.

Throws:

Throws an error if the parameters object is not valid or the execution fails.

Returns:

Array of TermResult objects.

Type: Array.<$.text.mining.Session~TermResult>

Example

var termResults = TM.getRelatedTerms({
    top: 16,
    inputTermText: "animals"
});

getRelevantDocuments(p) → {Array.<$.text.mining.Session~DocumentResult>}

Given an input term, this function returns the top-ranked documents from the reference data that are deemed relevant to the term.

Parameters:

Name	Type	Description
`p`	object	Encapsulates getRelevantDocuments parameters.

Properties

Name	Type	Argument	Description
`inputTermText` | `inputTermIDs`	string		Input term to process. One and only one of the following:

`inputTermText`	This literal text. Typically a single term, but can be multiple terms with optional term types and wildcarding. See SAP HANA SQL and System Views Reference for details.
`inputTermIDs`	Text associated with reference table columns with this (these) internal term ID number(s)

`top`	integer	<optional>	Maximum number of returned results.
`threshold`	number	<optional>	Restricts the returned results to those displaying a score greater than or equal to this numeric value in the range [0,1] (0 to allow all).
`documentRestriction`	string	<optional>	Specified condition (SQL "where" clause) to be met for reference document rows to be considered in the computation ("" for all).
`termTypeRestriction`	string	<optional>	Comma-separated list of term types to consider ("" for all).
`correlationMatrix`	boolean	<optional>	If specified and true, the returned result includes a document correlation matrix.

Throws:

Throws an error if the parameters object is not valid or the execution fails.

Returns:

Array of DocumentResult objects.

Type: Array.<$.text.mining.Session~DocumentResult>

Example

var documentResults = TM.getRelevantDocuments({
    top: 16,
    inputTermText: "animals",
    includeColumns: ["KEY", "FILECONTENT"]
});

getRelevantTerms(p) → {Array.<$.text.mining.Session~TermResult>}

Given an input document, this function returns the top-ranked keyphrases or relevant terms from the reference data, i.e., the terms that saliently describe the document.

Keyphrases are used to summarize, characterize and provide thematic access to data.

Parameters:

Name	Type	Description
`p`	object	Encapsulates getRelevantTerms parameters.

Properties

Name	Type	Argument	Description
`inputDocumentText` | `inputDocumentSubquery` | `inputDocumentCondition` | `inputDocumentIDs`	string		Input document to process. One and only one of the following:

`inputDocumentText`	This literal text
`inputDocumentSubquery`	Text returned by this SQL subquery
`inputDocumentCondition`	Text returned from reference table rows for which this SQL "where" clause is true
`inputDocumentIDs`	Text returned from reference table rows with this (these) internal document ID number(s)

`language`	string	<optional>	Language code of input text, e.g. "EN", "DE" ("" for all).
`mimeType`	string	<optional>	Mime type of input text, e.g. "text/plain" ("" for unspecified).
`top`	integer	<optional>	Maximum number of returned results.
`threshold`	number	<optional>	Restricts the returned results to those displaying a score greater than or equal to this numeric value in the range [0,1] (0 to allow all).
`documentRestriction`	string	<optional>	Specified condition (SQL "where" clause) to be met for reference document rows to be considered in the computation ("" for all).
`termTypeRestriction`	string	<optional>	Comma-separated list of term types to consider ("" for all).
`correlationMatrix`	boolean	<optional>	If specified and true, the returned result includes a term correlation matrix.

Throws:

Throws an error if the parameters object is not valid or the execution fails.

Returns:

Array of TermResult objects.

Type: Array.<$.text.mining.Session~TermResult>

Example

var termResults = TM.getRelevantTerms({
    top: 16,
    inputDocumentText: "animals"
});
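
Since every Text Mining method throws when the parameters object is invalid or execution fails, callers may want to handle that in one place. The sketch below wraps a method call in try/catch and returns a plain result object. The helper name `callTextMining` and the shape of the returned object are choices made for this example, and reading `message` from the thrown value is an assumption about the error object.

```javascript
// Sketch: call a Text Mining method by name and convert a thrown error
// into a result object instead of letting it propagate.
function callTextMining(session, methodName, params) {
    try {
        return { ok: true, results: session[methodName](params) };
    } catch (e) {
        // Assumption: the thrown value carries a `message`; fall back to its string form.
        return { ok: false, message: (e && e.message) ? e.message : String(e) };
    }
}
```

For example, `callTextMining(TM, "getRelevantTerms", { top: 16, inputDocumentText: "animals" })` yields either the TermResult array in `results` or the error text in `message`.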

getSuggestedTerms(p) → {Array.<$.text.mining.Session~TermResult>}

Given an input term initial substring, this function returns the top-ranked terms from the reference data that complete that initial substring.

Term suggestion is used to present a user with likely search terms as the user enters characters within a search application.

Parameters:

Name	Type	Description
`p`	object	Encapsulates getSuggestedTerms parameters.

Properties

Name	Type	Argument	Description
`inputTermText` | `inputTermIDs`	string		Input term to process. One and only one of the following:

`inputTermText`	This literal text
`inputTermIDs`	Text associated with reference table columns with this (these) internal term ID number(s)

`top`	integer	<optional>	Maximum number of returned results.
`threshold`	number	<optional>	Restricts the returned results to those displaying a score greater than or equal to this numeric value in the range [0,1] (0 to allow all).
`documentRestriction`	string	<optional>	Specified condition (SQL "where" clause) to be met for reference document rows to be considered in the computation ("" for all).
`termTypeRestriction`	string	<optional>	Comma-separated list of term types to consider ("" for all).

Throws:

Throws an error if the parameters object is not valid or the execution fails.

Returns:

Array of TermResult objects.

Type: Array.<$.text.mining.Session~TermResult>

Example

var termResults = TM.getSuggestedTerms({
    top: 16,
    inputTermText: "a"
});
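
In a search-as-you-type scenario the returned TermResult objects usually need to be reduced to a short list of strings. The following sketch does this using the documented `term` and `termNormalized` properties. It assumes the results arrive in ranked order, and the helper name `toSuggestions` is hypothetical.

```javascript
// Sketch: turn TermResult objects into a de-duplicated list of suggestion strings.
// Terms that share the same normalized form are reported only once.
function toSuggestions(termResults, limit) {
    var seen = Object.create(null);
    var suggestions = [];
    termResults.forEach(function (r) {
        if (suggestions.length < limit && !seen[r.termNormalized]) {
            seen[r.termNormalized] = true;
            suggestions.push(r.term);
        }
    });
    return suggestions;
}
```

Calling `toSuggestions(termResults, 5)` on the example above would return at most five distinct terms for display.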

initialize(p)

This function initializes (or re-initializes) Text Mining for the reference table and column linked to the $.text.mining.Session object. This creates the Term-Document matrix and other configuration context data that is needed for Text Mining functions. The Text Mining context is specific to the reference data, but it is persistent and global for all users. The configuration context specified at initialization time later serves as the set of defaults for unspecified parameters when Text Mining functions are invoked on the given reference data.

! Advanced function

This function is typically not used. Initialization of Text Mining is normally done separately when the full text index is created for a given reference table and column. Since the Text Mining context is persistent and global for all users, that is the best way to ensure consistent results and avoid confusion.

This initialize() function provides a way to directly initialize Text Mining for development purposes or special customer applications. If this function is used carelessly, it can unexpectedly affect other running applications.

Parameters:

Name	Type	Description
`p`	object	Encapsulates initialize parameters.

Properties

Name	Type	Argument	Description
`configuration`	string	<optional>	Repository path to the configuration. If omitted, the default configuration is used.
`list of parameters...`	*	<optional> <repeatable>	Text Mining parameters and defaults to use for this reference table and column that override what is specified in the configuration. See the SAP HANA Text Mining Developer Guide for details.

Throws:

Throws an error if the parameters object is not valid or the execution fails.

Example

TM.initialize({
    configuration: "acme.textmining::defaults.textminingconfig",
    minTermFrequency: 3,
    maxTermFrequency: 100
});

Type Definitions

CategoryResult

Represents a single category value result from Text Mining categorization.

Type:

object

Properties:

Name	Type	Description
`categorySet`	string	The name of the category set column in which this category value occurs.
`category`	string	The category value. One or more reference documents in the group of K nearest neighbors were assigned this category.
`documentCount`	integer	The number of reference documents in the group of K nearest neighbors that were assigned this category value.
`score`	number	The score of this category value in the range [0,1].

DocumentResult

Represents a single document result from certain Text Mining methods.

Type:

object

Properties:

Name	Type	Description
`includeColumns...`	*	The requested columns from the reference table to be included in the result table, as specified via the includeColumns input parameter. These are returned as separate columns with the same names and types as the original specified include columns.
`id`	integer	The document ID number used internally by Text Mining. This can be used in subsequent Text Mining method calls in the inputDocumentIDs parameter for faster performance.
`termCountTotal`	integer	The total number of terms in this document, including duplicates.
`termCount`	integer	The number of different terms in this document.
`correlation1...correlationN`	number	These appear if the document correlation matrix was requested. The columns of the document correlation matrix contain the correlation values for this document and each of the other returned documents. N is the number of returned documents. The document correlation matrix is a square matrix where the rows and the columns each list all the returned documents in order. The matrix portrays every combination of the returned document pairs, with a duplicate reflection across the diagonal of the matrix. Each cell of the matrix contains the correlation value for the two documents at that row and column, based on the co-occurrence of their terms in the reference documents.
`factor1...factorN rotation1...rotationN`	number	These appear if principal components analysis was requested. The factor and rotation values from principal component analysis (dimensionality reduction) for this document. N is the number of principal components requested via the principalComponents input parameter.
`clusteringLevel`	number	This appears if clustering was requested. The clustering level for this document.
`clusteringLeft clusteringRight`	integer	These appear if clustering was requested. The clustering left value and right value for this document.
`score`	number	The score of this document in the range [0,1].
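
When the correlation matrix is requested, it is spread across the `correlation1...correlationN` properties of the result rows. The sketch below gathers them into a two-dimensional array. It assumes the property names are spelled exactly as shown in the table (`correlation1`, `correlation2`, and so on), and the helper name `correlationMatrix` is hypothetical.

```javascript
// Sketch: collect correlation1..correlationN from each result row
// into an N x N array, where N is the number of returned results.
function correlationMatrix(results) {
    var n = results.length;
    return results.map(function (row) {
        var values = [];
        for (var i = 1; i <= n; i++) {
            // Assumption: the property names follow the pattern "correlation" + index.
            values.push(row["correlation" + i]);
        }
        return values;
    });
}
```

The entry at `[i][j]` is then the correlation between the result at index i and the result at index j. The same helper works for TermResult arrays that include a term correlation matrix.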

TermResult

Represents a single term result from certain Text Mining methods.

Type:

object

Properties:

Name	Type	Description
`term`	string	The term.
`termNormalized`	string	The normalized version of this term. Text Mining terms are normalized with respect to capitalization, whitespace, and accentuation.
`termType`	string	The type of this term, an entity type or part-of-speech.
`id`	integer	The term ID number used internally by Text Mining. This can be used in subsequent Text Mining method calls in the inputTermIDs parameter for faster performance.
`frequencyTotal`	integer	The total number of times this term occurs in the reference documents.
`frequencyDocumentCount`	integer	The number of reference documents in which this term occurs.
`correlation1...correlationN`	number	These appear if the term correlation matrix was requested (not available with the getSuggestedTerms method). The columns of the term correlation matrix contain the correlation values for this term and each of the other returned terms. N is the number of returned terms. The term correlation matrix is a square matrix where the rows and the columns each list all the returned terms in order. The matrix portrays every combination of the returned term pairs, with a duplicate reflection across the diagonal of the matrix. Each cell of the matrix contains the correlation value for the two terms at that row and column, based on their co-occurrence in the reference documents.
`factor1...factorN rotation1...rotationN`	number	These appear if principal components analysis was requested (not available with the getSuggestedTerms method). The factor and rotation values from principal component analysis (dimensionality reduction) for this term. N is the number of principal components requested via the principalComponents input parameter.
`clusteringLevel`	number	This appears if clustering was requested (not available with the getSuggestedTerms method). The clustering level for this term.
`clusteringLeft clusteringRight`	integer	These appear if clustering was requested (not available with the getSuggestedTerms method). The clustering left value and right value for this term.
`score`	number	The score of this term in the range [0,1].
