hanaml.LinkPredict {hana.ml.r}    R Documentation

Link Prediction

Description

hanaml.LinkPredict is an R wrapper for the PAL link prediction algorithm.

Usage

hanaml.LinkPredict(conn.context,
                    data,
                    used.cols = NULL,
                    method,
                    katz.beta = NULL,
                    min.score = NULL)

Arguments

conn.context

ConnectionContext
Connection to the SAP HANA system.

data

DataFrame
DataFrame containing links among all nodes in a social network.

used.cols

list of character, optional
Names of the two columns in data that hold the two end nodes of each link. One node is referred to as "node1" and the other as "node2". Defaults to the first and second columns of data if not provided.

method

c("common.neighbors", "jaccard", "adamic.adar", "katz")
Method for predicting potential missing links between nodes.

katz.beta

double, optional
The beta parameter for the 'katz' method. The value should be between 0 and 1. Values closer to 0 are usually preferred. Defaults to 0.005.

min.score

double, optional
Link prediction algorithms compute a score for every pair of nodes with a missing link. A link is assumed to exist only if its computed score is above 'min.score'; links whose scores are lower than this threshold are filtered out of the result table. Defaults to 0.
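
To illustrate the role of katz.beta, the following base R sketch computes the textbook Katz index, the sum over l >= 1 of beta^l times the number of walks of length l between two nodes, on the edge list from the Examples section. It does not call hanaml.LinkPredict or PAL. The closed form used here and the absence of any normalization are assumptions, so the values need not match what PAL returns; the variable names are illustrative only.

```r
# Edge list copied from the Examples section (links treated as undirected).
edges <- data.frame(
  NODE1 = c(1, 1, 2, 3, 5, 6, 7, 7, 6, 5),
  NODE2 = c(2, 4, 3, 4, 1, 2, 4, 5, 7, 4)
)

# Build a symmetric 0/1 adjacency matrix.
nodes <- sort(unique(c(edges$NODE1, edges$NODE2)))
n <- length(nodes)
adj <- matrix(0, n, n, dimnames = list(nodes, nodes))
idx <- cbind(match(edges$NODE1, nodes), match(edges$NODE2, nodes))
adj[idx] <- 1
adj[idx[, 2:1]] <- 1

beta <- 0.005  # documented default of katz.beta

# The series converges only if beta < 1 / (largest eigenvalue of adj).
lambda_max <- max(abs(eigen(adj, symmetric = TRUE, only.values = TRUE)$values))
stopifnot(beta < 1 / lambda_max)

# Textbook closed form: sum_{l >= 1} beta^l adj^l = (I - beta * adj)^{-1} - I.
# Assumption: PAL may truncate or normalize differently.
katz <- solve(diag(n) - beta * adj) - diag(n)
dimnames(katz) <- dimnames(adj)

# Nodes 1 and 3 are unlinked but share two neighbors (2 and 4), so with a
# small beta the score is dominated by the two 2-step walks: about 2 * beta^2.
katz["1", "3"]
```

With a small beta, longer walks contribute very little, which is one reason values close to 0 are a common choice.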

Details

Predicting potential missing links between nodes is a common task in social network analysis. Link prediction algorithms compute the distance between any two nodes from the existing links in a social network, and predict missing links based on these distances.
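
As a rough illustration of how such scores can be derived from existing links, the base R sketch below computes a common-neighbors score for every unlinked pair in the edge list from the Examples section. It does not call hanaml.LinkPredict or PAL. Dividing the shared-neighbor count by the total number of nodes is an assumption inferred from the example output (2/7 = 0.2857143, 1/7 = 0.1428571), and min_score is a hypothetical stand-in for the min.score argument.

```r
# Edge list copied from the Examples section (links treated as undirected).
edges <- data.frame(
  NODE1 = c(1, 1, 2, 3, 5, 6, 7, 7, 6, 5),
  NODE2 = c(2, 4, 3, 4, 1, 2, 4, 5, 7, 4)
)

# Build a symmetric 0/1 adjacency matrix.
nodes <- sort(unique(c(edges$NODE1, edges$NODE2)))
n <- length(nodes)
adj <- matrix(0, n, n, dimnames = list(nodes, nodes))
idx <- cbind(match(edges$NODE1, nodes), match(edges$NODE2, nodes))
adj[idx] <- 1
adj[idx[, 2:1]] <- 1

# (adj %*% adj)[i, j] counts the neighbors shared by nodes i and j.
# Assumption: normalize by the node count, as the example output suggests.
common <- (adj %*% adj) / n

# Keep each unlinked pair once (i < j) if its score exceeds the threshold.
min_score <- 0  # hypothetical stand-in for min.score
keep <- upper.tri(adj) & adj == 0 & common > min_score
pairs <- which(keep, arr.ind = TRUE)
scores <- data.frame(
  NODE1 = nodes[pairs[, "row"]],
  NODE2 = nodes[pairs[, "col"]],
  SCORE = common[pairs]
)
scores <- scores[order(scores$NODE1, scores$NODE2), ]
scores
```

Raising min_score to, say, 0.2 would leave only the pairs that share two neighbors.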

Value

Examples

## Not run: 
Social network data containing existing links between nodes:

> df
   NODE1 NODE2
1      1     2
2      1     4
3      2     3
4      3     4
5      5     1
6      6     2
7      7     4
8      7     5
9      6     7
10     5     4

Creating a LinkPredict instance for predicting potential missing links between all nodes:
> lp <- hanaml.LinkPredict(conn.context = conn,
                           data = df,
                           used.cols = c(node1 = "NODE1",
                                         node2 = "NODE2"),
                           method = "common.neighbors")

Link prediction result:
> lp$result
   NODE1 NODE2     SCORE
1      1     3 0.2857143
2      1     6 0.1428571
3      1     7 0.2857143
4      2     4 0.2857143
5      2     5 0.1428571
6      2     7 0.1428571
7      4     6 0.1428571
8      3     5 0.1428571
9      3     6 0.1428571
10     3     7 0.1428571
11     5     6 0.1428571

## End(Not run)

[Package hana.ml.r version 1.0.8 Index]