R: Page Rank

hanaml.PageRank {hana.ml.r}

R Documentation

Page Rank

Description

hanaml.PageRank is an R wrapper for the PAL PageRank algorithm.

Usage

hanaml.PageRank(conn.context,
                 data,
                 used.cols = NULL,
                 damping = NULL,
                 max.iter = NULL,
                 tol = NULL)

Arguments

`conn.context`	`ConnectionContext` Connection to the SAP HANA system.
`data`	`DataFrame` DataFrame containing links among all nodes in a social network.
`used.cols`	`list of characters, optional` Specifies the two columns of `data` that hold the source and sink nodes of each link. The entry named "source" gives the source-node column and the entry named "sink" gives the sink-node column. Defaults to the 1st and 2nd columns of `data` if not provided.
`damping`	`double, optional` The damping factor for PageRank scores. Defaults to 0.85.
`max.iter`	`integer, optional` The maximum number of iterations of the power method used to solve the PageRank problem. The value 0 means that no maximum is set and the calculation stops only when the result converges. Defaults to 0.
`tol`	`double, optional` The stopping criterion for the power method. The calculation stops when the mean improvement in the rank values falls below this value. Defaults to 1e-6.

Details

PageRank is an algorithm used by search engines to measure the importance of web pages. A page is considered more important if it receives more links from other pages. The PageRank score represents the likelihood that a visitor reaches a particular page by randomly clicking links on other pages. A higher rank means a greater probability of the page being reached.
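The power method referred to under `max.iter` and `tol` can be illustrated with a minimal plain-R sketch. This is not the PAL implementation: `pagerank.sketch` is a hypothetical helper written for this illustration, and it assumes that every node has at least one outgoing link, that no link is listed twice, and that "mean improvement" means the mean absolute change in rank between iterations. Under those assumptions, running it on the links from the Examples section should give ranks close to the ones shown there.

```r
# Hypothetical helper, not part of hana.ml.r: PageRank by power iteration.
# Assumes no dangling nodes and no duplicate links.
pagerank.sketch <- function(from, to, damping = 0.85, max.iter = 0, tol = 1e-6) {
  nodes <- sort(unique(c(from, to)))
  n <- length(nodes)
  src <- match(from, nodes)
  dst <- match(to, nodes)
  out.degree <- tabulate(src, nbins = n)

  # Column-stochastic transition matrix:
  # M[j, i] = 1 / outdegree(i) for every link i -> j.
  M <- matrix(0, n, n)
  M[cbind(dst, src)] <- 1 / out.degree[src]

  rank <- rep(1 / n, n)  # start from a uniform distribution
  iter <- 0
  repeat {
    iter <- iter + 1
    new.rank <- (1 - damping) / n + damping * as.vector(M %*% rank)
    # Assumed stopping rule: mean absolute change below tol.
    converged <- mean(abs(new.rank - rank)) < tol
    rank <- new.rank
    # max.iter = 0 means "no iteration limit", as in the argument description.
    if (converged || (max.iter > 0 && iter >= max.iter)) break
  }
  data.frame(NODE = nodes, RANK = rank)
}

# Links taken from the Examples section.
from <- c("Node1", "Node1", "Node1", "Node2", "Node2", "Node3", "Node4", "Node4")
to   <- c("Node2", "Node3", "Node4", "Node3", "Node4", "Node1", "Node1", "Node3")
res <- pagerank.sketch(from, to)
print(res)
```

The sketch builds a dense matrix for readability only; a graph of realistic size would call for a sparse representation, and nodes without outgoing links would need separate handling.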

Value

result: DataFrame
The data frame that contains the ranking scores of all nodes in a network.

Examples

## Not run: 
Social network data containing the existing links between nodes:

> df
  FROM_NODE TO_NODE
1     Node1   Node2
2     Node1   Node3
3     Node1   Node4
4     Node2   Node3
5     Node2   Node4
6     Node3   Node1
7     Node4   Node1
8     Node4   Node3

Creating a PageRank instance for calculating the ranking scores of all nodes in the network:

> pr <- hanaml.PageRank(conn.context = conn,
                        data = df,
                        used.cols = c(source = "FROM_NODE",
                                      sink = "TO_NODE"),
                        damping = 0.85)

Computed ranking result:

> pr$result
   NODE      RANK
1 Node1 0.3681516
2 Node2 0.1418082
3 Node3 0.2879621
4 Node4 0.2020780

## End(Not run)

[Package hana.ml.r version 1.0.8 Index]