PageRank

class hana_ml.algorithms.pal.pagerank.PageRank(damping=None, max_iter=None, tol=None, thread_ratio=None)

A PageRank model, which ranks the nodes of a directed graph by their link structure.

Parameters:
damping : float, optional

The damping factor d.

Defaults to 0.85.

max_iter : int, optional

The maximum number of iterations of the power method.

A value of 0 sets no iteration limit; the calculation stops only when the result converges.

Defaults to 0.

tol : float, optional

Specifies the stop condition.

The calculation stops when the mean improvement of the rank values falls below this threshold.

Defaults to 1e-6.

thread_ratio : float, optional

Adjusts the percentage of available threads to use, from 0 to 1. A value of 0 uses a single thread, while 1 uses all currently available threads. Values outside this range are ignored, and the function determines the number of threads heuristically.

Defaults to 0.
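
The roles of damping, max_iter and tol can be illustrated with a minimal pure-Python sketch of the power method. This is not the PAL implementation. The function name pagerank, the sample edge list, the update rule (1 - d)/N + d * sum(rank/out_degree), the even redistribution of rank held by nodes without outgoing links, and the reading of "mean improvement" as the mean absolute change per node are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, max_iter=0, tol=1e-6):
    """Illustrative power-method PageRank over (from_node, to_node) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)

    # Outgoing links per node.
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution.
    rank = {node: 1.0 / n for node in nodes}
    iteration = 0
    while True:
        # Assumption: rank held by nodes with no outgoing links is spread evenly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}

        # Each node passes a damped, equal share of its rank to every target.
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)

        # Assumed stop condition: mean absolute change per node below tol.
        mean_change = sum(abs(new_rank[v] - rank[v]) for v in nodes) / n
        rank = new_rank
        iteration += 1

        if mean_change < tol:
            break
        # max_iter == 0 means no iteration limit, as in the parameter description.
        if max_iter > 0 and iteration >= max_iter:
            break
    return rank


# Hypothetical three-node graph, not the data from the Examples section.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
ranks = pagerank(edges)
one_step = pagerank(edges, max_iter=1)
```

With a damping factor below 1 the iteration converges, so max_iter=0 is safe in this sketch; a positive max_iter only caps the number of passes.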

Examples

Input DataFrame df:

>>> df.collect()
   FROM_NODE    TO_NODE
0   Node1       Node2
1   Node1       Node3
...
6   Node4       Node1
7   Node4       Node3

Create a PageRank instance and run it on df:

>>> pr = PageRank()
>>> pr.run(data=df).collect()
    NODE    RANK
0   NODE1   0.368152
1   NODE2   0.141808
2   NODE3   0.287962
3   NODE4   0.202078

Attributes:
None

Methods

run(data)

This method reads link information and calculates rank for each node.

Parameters:
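data : DataFrame

Link information for the graph, one row per directed link. In the example above, the columns are FROM_NODE (source) and TO_NODE (destination).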

Returns:
DataFrame

Calculated rank values and corresponding node names.

Inherited Methods from PALBase

In addition to the methods described above, the PageRank class inherits methods from the PALBase class. Refer to PAL Base for details.