PageRank
- class hana_ml.algorithms.pal.pagerank.PageRank(damping=None, max_iter=None, tol=None, thread_ratio=None)
A PageRank model that ranks the nodes of a directed graph from its link information.
- Parameters:
- damping : float, optional
The damping factor d.
Defaults to 0.85.
- max_iter : int, optional
The maximum number of iterations of the power method.
A value of 0 sets no iteration limit, so the calculation stops only when the result converges.
Defaults to 0.
- tol : float, optional
Specifies the stopping condition.
The calculation stops when the mean improvement in rank values falls below this value.
Defaults to 1e-6.
- thread_ratio : float, optional
Specifies the ratio of available threads to use, from 0 to 1. A value of 0 uses a single thread, and a value of 1 uses all currently available threads. Values outside this range are ignored, and the function determines the number of threads heuristically.
Defaults to 0.
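To make the roles of damping, max_iter and tol concrete, the following is a minimal pure-Python sketch of the textbook power method. It is not PAL's implementation: the function name pagerank, the sample edge list, the even redistribution of rank from nodes without outgoing links, and the reading of "mean improvement" as the mean absolute change in rank per node are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, max_iter=0, tol=1e-6):
    """Textbook power-iteration PageRank over (from_node, to_node) pairs.

    Illustrative only; hana_ml's PageRank delegates the work to PAL,
    whose internals may differ from this sketch.
    """
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {node: 1.0 / n for node in nodes}
    iteration = 0
    # max_iter == 0 means "no limit": loop until the tol check succeeds.
    while max_iter == 0 or iteration < max_iter:
        # Assumption: rank held by nodes with no outgoing links is
        # spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        # Assumption: "mean improvement" is the mean absolute change per node.
        mean_change = sum(abs(new_rank[v] - rank[v]) for v in nodes) / n
        rank = new_rank
        iteration += 1
        if mean_change < tol:
            break
    return rank


# Hypothetical three-node graph, unrelated to the example data in this page.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]
ranks = pagerank(edges)
one_step = pagerank(edges, max_iter=1)
```

In this sketch a smaller tol yields more iterations and a tighter result, while a nonzero max_iter caps the work even if the ranks have not yet converged.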
Examples
Input DataFrame df:
>>> df.collect()
  FROM_NODE TO_NODE
0     Node1   Node2
1     Node1   Node3
...
6     Node4   Node1
7     Node4   Node3
Create a PageRank instance and run it on df:
>>> pr = PageRank()
>>> pr.run(data=df).collect()
    NODE      RANK
0  NODE1  0.368152
1  NODE2  0.141808
2  NODE3  0.287962
3  NODE4  0.202078
- Attributes:
- None
Methods
- get_model_metrics(): Get the model metrics.
- get_score_metrics(): Get the score metrics.
- run(data): Read link information and calculate the rank of each node.
- run(data)
Reads link information and calculates the rank of each node.
- Parameters:
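- data : DataFrame
Link information from which the ranks are calculated, one row per directed link (the FROM_NODE and TO_NODE columns in the example above).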
- Returns:
- DataFrame
Calculated rank values and corresponding node names.
- get_model_metrics()
Get the model metrics.
- Returns:
- DataFrame
The model metrics.
- get_score_metrics()
Get the score metrics.
- Returns:
- DataFrame
The score metrics.
Inherited Methods from PALBase
In addition to the methods described above, the PageRank class inherits methods from the PALBase class. Refer to PAL Base for details.