PageRank
- class hana_ml.algorithms.pal.pagerank.PageRank(damping=None, max_iter=None, tol=None, thread_ratio=None)
A PageRank model.
- Parameters
- damping : float, optional
The damping factor d.
Defaults to 0.85.
- max_iter : int, optional
The maximum number of iterations of the power method.
A value of 0 sets no iteration limit; the calculation stops only when the result converges.
Defaults to 0.
- tol : float, optional
Specifies the stopping condition.
The calculation stops when the mean improvement in rank values falls below this value.
Defaults to 1e-6.
- thread_ratio : float, optional
Specifies the fraction of available threads to use, from 0 to 1. A value of 0 uses a single thread; a value of 1 uses all currently available threads. Values outside this range are ignored, and the function heuristically determines the number of threads to use.
Defaults to 0.
- Attributes
- None
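How damping, max_iter, and tol interact can be sketched in plain Python. This is an illustrative power-method loop, not PAL's implementation: the function name pagerank, the uniform starting ranks, the handling of nodes without outgoing links, and the reading of "mean improvement" as the mean absolute change in rank per iteration are all assumptions made for the example.

```python
def pagerank(edges, damping=0.85, max_iter=0, tol=1e-6):
    """Illustrative power-method PageRank over (from_node, to_node) pairs.

    Hypothetical helper, not part of hana_ml. max_iter=0 means no iteration
    cap, mirroring the parameter description above.
    """
    nodes = sorted({node for edge in edges for node in edge})
    if not nodes:
        return {}

    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    ranks = {node: 1.0 / n for node in nodes}  # assumed uniform start
    iteration = 0
    while True:
        # Assumption: rank held by nodes with no outgoing links is
        # redistributed evenly over all nodes.
        dangling = sum(ranks[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {v: base for v in nodes}

        # Each node passes a damped, equal share of its rank to its targets.
        for src in nodes:
            targets = out_links[src]
            if targets:
                share = damping * ranks[src] / len(targets)
                for dst in targets:
                    new_ranks[dst] += share

        # Assumed stop measure: mean absolute change in rank.
        mean_change = sum(abs(new_ranks[v] - ranks[v]) for v in nodes) / n
        ranks = new_ranks
        iteration += 1

        if mean_change < tol:
            break  # converged
        if max_iter > 0 and iteration >= max_iter:
            break  # iteration cap reached

    return ranks


# Hypothetical three-node graph: both A and B link to C, and C links back to A.
example_ranks = pagerank([("A", "C"), ("B", "C"), ("C", "A")])
for node, rank in sorted(example_ranks.items(), key=lambda item: -item[1]):
    print(node, round(rank, 4))
```

In this sketch a smaller tol or a larger damping generally means more iterations before the loop stops, and a positive max_iter bounds the work even if tol has not been reached.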
Methods
run(data): This method reads link information and calculates rank for each node.
Examples
Input DataFrame df:
>>> df.collect()
  FROM_NODE TO_NODE
0     Node1   Node2
1     Node1   Node3
...
6     Node4   Node1
7     Node4   Node3
Create a PageRank instance and call run():
>>> pr = PageRank()
>>> pr.run(data=df).collect()
    NODE      RANK
0  NODE1  0.368152
1  NODE2  0.141808
2  NODE3  0.287962
3  NODE4  0.202078
- run(data)
This method reads link information and calculates rank for each node.
- Parameters
- data : DataFrame
Link data, one row per link, giving the source node and the target node (FROM_NODE and TO_NODE in the example above).
- Returns
- DataFrame
Calculated rank values and corresponding node names.