tfgraph.algorithms.pagerank

tfgraph.algorithms.pagerank Module

This module contains a set of PageRank’s algorithm implementations on the tfgraph module.

PageRank

class tfgraph.algorithms.pagerank.pagerank.PageRank(sess: tensorflow.python.client.session.Session, name: str, beta: float, T: tfgraph.algorithms.pagerank.transition.transition.Transition, writer: tensorflow.python.summary.writer.writer.FileWriter = None, is_sparse: bool = False) → None[source]

PageRank base class.

This class model the PageRank algorithm as Abstract Class containing all methods that the heir classes need to implements. Also, this class provides a set of attributes that helps to implement the algorithm.

The PageRank algorithm calculates the rank of each vertex in a graph based on the relational structure from them and giving more importance to the vertices that connects with edges to vertices with very high in-degree recursively.

This class depends on the TensorFlow library, so it’s necessary to install it to properly work.

sess

tf.Session – This attribute represents the session that runs the TensorFlow operations.

name

str – This attribute represents the name of the object in TensorFlow’s op Graph.

G

tfgraph.Graph – The graph on witch it will be calculated the algorithm. It will be treated as Directed Weighted Graph.

beta

float – The reset probability of the random walks, i.e. the probability that a user that surfs the graph an decides to jump to another vertex not connected to the current.

T

tfgraph.Transition – The transition matrix that provides the probability distribution relative to the walk to another node of the graph.

v

tf.Variable – The stationary distribution vector. It contains the normalized probability to stay in each vertex of the graph. So represents the PageRank ranking of the graph.

writer

tf.summary.FileWriter – This attribute represents a TensorFlow’s Writer, that is used to obtain stats.

is_sparse

bool – Use sparse Tensors if it’s set to True. Not implemented yet.

__init__(sess: tensorflow.python.client.session.Session, name: str, beta: float, T: tfgraph.algorithms.pagerank.transition.transition.Transition, writer: tensorflow.python.summary.writer.writer.FileWriter = None, is_sparse: bool = False) → None[source]

The constructor of the class.

This method initializes all the attributes needed to compute the PageRank of the graph.

Parameters:
  • sess (tf.Session) – This attribute represents the session that runs the TensorFlow operations.
  • name (str) – This attribute represents the name of the object in TensorFlow’s op Graph.
  • beta (float) – The reset probability of the random walks, i.e. the probability that a user that surfs the graph an decides to jump to another vertex not connected to the current.
  • T (tfgraph.Transition) – The transition matrix that provides the probability distribution relative to the walk to another node of the graph.
  • v (tf.Variable) – The stationary distribution vector. It contains the normalized probability to stay in each vertex of the graph. So represents the PageRank ranking of the graph.
  • writer (tf.summary.FileWriter) – This attribute represents a TensorFlow’s Writer, that is used to obtain stats.
  • is_sparse (bool) – Use sparse Tensors if it’s set to True. Not implemented yet.
error_vector_compare_np(other_pr: tfgraph.algorithms.pagerank.pagerank.PageRank, k: int = -1) → numpy.ndarray[source]

The comparison method between two PageRank algorithm results.

This method compares the self PageRank with another one passed as parameter of the function. The comparison is based on the difference of the Norm One of each v vector.

The method also provides a k parameter as option to base the comparison only the k better ranked vertices.

Parameters:
  • other_pr (tfgraph.PageRank) – Another PageRank object to compare the resulting ranking.
  • k (int, optional) – An additional parameter that allows to base the comparison only on the k better vertices. Not implemented yet.
Returns:

A np.ndarray with 0-D shape, that represents the

difference between the two rankings using the Norm One.

Return type:

(np.ndarray)

error_vector_compare_tf(other_pr: tfgraph.algorithms.pagerank.pagerank.PageRank, k: int = -1) → tensorflow.python.framework.ops.Tensor[source]

The comparison method between two PageRank algorithm results.

This method compares the self PageRank with another one passed as parameter of the function. The comparison is based on the difference of the Norm One of each v vector.

The method also provides a k parameter as option to base the comparison only the k better ranked vertices.

Parameters:
  • other_pr (tfgraph.PageRank) – Another PageRank object to compare the resulting ranking.
  • k (int, optional) – An additional parameter that allows to base the comparison only on the k better vertices. Not implemented yet.
Returns:

A tf.Tensor with 0-D shape, that represents the

difference between the two rankings using the Norm One.

Return type:

(tf.Tensor)

Todo

  • Implement ranking based only on the k better ranked vertices.
pagerank_vector_np(convergence: float = 1.0, steps: int = 0, topics: typing.List[int] = None, c_criterion=<function ConvergenceCriterion.<lambda>>) → numpy.ndarray[source]

The Method that runs the PageRank algorithm

This method returns a Numpy Array that contains the result of running the PageRank algorithm customized by the parameters passed to it.

This method acts as interface between the algorithm and the external classes, so it contains a set of parameters that in some implementations of PageRank algorithms will not be needed. All the parameters is defined as optional for this reason.

Parameters:
  • convergence (float, optional) – A float between 0 and 1 that represents the convergence rate that allowed to finish the iterative implementations of the algorithm to accept the solution. It has more preference than the steps parameter. Default to 1.0.
  • steps (int, optional) – A positive integer that sets the number of iterations that the iterative implementations will run the algorithm until finish. It has less preference than the convergence parameter. Default to 0.
  • topics (list of int, optional) – A list of integers that represent the set of vertex where the random jumps arrives. If this parameter is used, the uniform distribution over all vertices of the random jumps will be modified to jump only to this vertex set. Default to None.
  • c_criterion (function, optional) – The function used to calculate if the Convergence Criterion of the iterative implementations is reached. Default to tfgraph.ConvergenceCriterion.ONE.
Returns:

A 1-D np.ndarray of [n] shape, where n is the

cardinality of the graph vertex set. It contains the normalized rank of vertex i at position i.

Return type:

(np.ndarray)

pagerank_vector_tf(convergence: float = 1.0, steps: int = 0, topics: typing.List[int] = None, topics_decrement: bool = False, c_criterion=<function ConvergenceCriterion.<lambda>>) → tensorflow.python.framework.ops.Tensor[source]

The Method that runs the PageRank algorithm

This method generates a TensorFlow graph of operations needed to calculate the PageRank Algorithm and sets to it different parameters passed as parameters.

This method acts as interface between the algorithm and the external classes, so it contains a set of parameters that in some implementations of PageRank algorithms will not be needed. All the parameters is defined as optional for this reason.

Parameters:
  • convergence (float, optional) – A float between 0 and 1 that represents the convergence rate that allowed to finish the iterative implementations of the algorithm to accept the solution. It has more preference than the steps parameter. Default to 1.0.
  • steps (int, optional) – A positive integer that sets the number of iterations that the iterative implementations will run the algorithm until finish. It has less preference than the convergence parameter. Default to 0.
  • topics (list of int, optional) – A list of integers that represent the set of vertex where the random jumps arrives. If this parameter is used, the uniform distribution over all vertices of the random jumps will be modified to jump only to this vertex set. Default to None.
  • topics_decrement (bool, optional) – If topics is not None and topics_decrement is True the topics will be casted to 0-Index. Default ` to False`.
  • c_criterion (function, optional) – The function used to calculate if the Convergence Criterion of the iterative implementations is reached. Default to tfgraph.ConvergenceCriterion.ONE.
Returns:

A 1-D tf.Tensor of [n] shape, where n is the

cardinality of the graph vertex set. It contains the normalized rank of vertex i at position i.

Return type:

(tf.Tensor)

ranks_np(convergence: float = 1.0, steps: int = 0, topics: typing.List[int] = None, topics_decrement: bool = False) → numpy.ndarray[source]

Generates a ranked version of PageRank results.

This method returns the PageRank ranking of the graph sorted by the position of each vertex in the rank. So it generates a 2-D matrix with shape [n,2] where n is the cardinality of the vertex set of the graph, and at the first column it contains the index of vertex and the second column contains it normalized rank. The i row is referred to the vertex with i position in the rank.

Parameters:
  • convergence (float, optional) – A float between 0 and 1 that represents the convergence rate that allowed to finish the iterative implementations of the algorithm to accept the solution. It has more preference than the steps parameter. Default to 1.0.
  • steps (int, optional) – A positive integer that sets the number of iterations that the iterative implementations will run the algorithm until finish. It has less preference than the convergence parameter. Default to 0.
  • topics (list of int, optional) – A list of integers that represent the set of vertex where the random jumps arrives. If this parameter is used, the uniform distribution over all vertices of the random jumps will be modified to jump only to this vertex set. Default to None.
  • topics_decrement (bool, optional) – If topics is not None and topics_decrement is True the topics will be casted to 0-Index. Default ` to False`.
Returns:

A 2-D np.ndarray than represents a sorted PageRank

ranking of the graph.

Return type:

(np.ndarray)

update_edge(edge: numpy.ndarray, change: float) → None[source]

The callback to receive notifications about edge changes in the graph.

This method is called from the Graph when an addition or deletion is produced on the edge set. So probably is necessary to recompute the PageRank ranking.

Parameters:
  • edge (np.ndarray) – A 1-D np.ndarray that represents the edge that changes in the graph, where edge[0] is the source vertex, and edge[1] the destination vertex.
  • change (float) – The variation of the edge weight. If the final value is 0.0 then the edge is removed.
Returns:

This method returns nothing.

AlgebraicPageRank

class tfgraph.algorithms.pagerank.algebraic_pagerank.AlgebraicPageRank(sess: tensorflow.python.client.session.Session, name: str, graph: tfgraph.graph.graph.Graph, beta: float, writer: tensorflow.python.summary.writer.writer.FileWriter = None, is_sparse: bool = False) → None[source]

The Algebraic PageRank implementation.

This class acts as the algebraic algorithm to obtain the PageRank ranking of a graph.

The PageRank algorithm calculates the rank of each vertex in a graph based on the relational structure from them and giving more importance to the vertices that connects with edges to vertices with very high in-degree recursively.

This class depends on the TensorFlow library, so it’s necessary to install it to properly work.

sess

tf.Session – This attribute represents the session that runs the TensorFlow operations.

name

str – This attribute represents the name of the object in TensorFlow’s op Graph.

beta

float – The reset probability of the random walks, i.e. the probability that a user that surfs the graph an decides to jump to another vertex not connected to the current.

T

tfgraph.Transition – The transition matrix that provides the probability distribution relative to the walk to another node of the graph.

v

tf.Variable – The stationary distribution vector. It contains the normalized probability to stay in each vertex of the graph. So represents the PageRank ranking of the graph.

writer

tf.summary.FileWriter – This attribute represents a TensorFlow’s Writer, that is used to obtain stats.

is_sparse

bool – Use sparse Tensors if it’s set to True. Not implemented yet.

__init__(sess: tensorflow.python.client.session.Session, name: str, graph: tfgraph.graph.graph.Graph, beta: float, writer: tensorflow.python.summary.writer.writer.FileWriter = None, is_sparse: bool = False) → None[source]

Constructor of the class.

This method initializes the attributes needed to run the Algebraic version of PageRank algorithm. It uses the tfgraph.TransitionMatrix as transition matrix.

Parameters:
  • sess (tf.Session) – This attribute represents the session that runs the TensorFlow operations.
  • name (str) – This attribute represents the name of the object in TensorFlow’s op Graph.
  • G (tfgraph.Graph) – The graph on witch it will be calculated the algorithm. It will be treated as Directed Weighted Graph.
  • beta (float) – The reset probability of the random walks, i.e. the probability that a user that surfs the graph an decides to jump to another vertex not connected to the current.
  • v (tf.Variable) – The stationary distribution vector. It contains the normalized probability to stay in each vertex of the graph. So represents the PageRank ranking of the graph.
  • writer (tf.summary.FileWriter) – This attribute represents a TensorFlow’s Writer, that is used to obtain stats.
  • is_sparse (bool) – Use sparse Tensors if it’s set to True. Not implemented yet.
update_edge(edge: numpy.ndarray, change: float) → None[source]

The callback to receive notifications about edge changes in the graph.

This method is called from the Graph when an addition or deletion is produced on the edge set. So probably is necessary to recompute the PageRank ranking.

Parameters:
  • edge (np.ndarray) – A 1-D np.ndarray that represents the edge that changes in the graph, where edge[0] is the source vertex, and edge[1] the destination vertex.
  • change (float) – The variation of the edge weight. If the final value is 0.0 then the edge is removed.
Returns:

This method returns nothing.

IterativePageRank

class tfgraph.algorithms.pagerank.iterative_pagerank.IterativePageRank(sess: tensorflow.python.client.session.Session, name: str, graph: tfgraph.graph.graph.Graph, beta: float, writer: tensorflow.python.summary.writer.writer.FileWriter = None, is_sparse: bool = False) → None[source]

The Iterative PageRank implementation.

This class acts as the iterative algorithm to obtain the PageRank ranking of a graph.

The PageRank algorithm calculates the rank of each vertex in a graph based on the relational structure from them and giving more importance to the vertices that connects with edges to vertices with very high in-degree recursively.

This class depends on the TensorFlow library, so it’s necessary to install it to properly work.

sess

tf.Session – This attribute represents the session that runs the TensorFlow operations.

name

str – This attribute represents the name of the object in TensorFlow’s op Graph.

beta

float – The reset probability of the random walks, i.e. the probability that a user that surfs the graph an decides to jump to another vertex not connected to the current.

T

tfgraph.Transition – The transition matrix that provides the probability distribution relative to the walk to another node of the graph.

v

tf.Variable – The stationary distribution vector. It contains the normalized probability to stay in each vertex of the graph. So represents the PageRank ranking of the graph.

writer

tf.summary.FileWriter – This attribute represents a TensorFlow’s Writer, that is used to obtain stats.

is_sparse

bool – Use sparse Tensors if it’s set to True. Not implemented yet.

iter

tf.Tensor – The operation that will be repeated in each iteration of the algorithm.

__init__(sess: tensorflow.python.client.session.Session, name: str, graph: tfgraph.graph.graph.Graph, beta: float, writer: tensorflow.python.summary.writer.writer.FileWriter = None, is_sparse: bool = False) → None[source]

Constructor of the class.

This method initializes the attributes needed to run the Algebraic version of PageRank algorithm. It uses the tfgraph.TransitionResetMatrix as transition matrix between vertex.

Parameters:
  • sess (tf.Session) – This attribute represents the session that runs the TensorFlow operations.
  • name (str) – This attribute represents the name of the object in TensorFlow’s op Graph.
  • G (tfgraph.Graph) – The graph on witch it will be calculated the algorithm. It will be treated as Directed Weighted Graph.
  • beta (float) – The reset probability of the random walks, i.e. the probability that a user that surfs the graph an decides to jump to another vertex not connected to the current.
  • v (tf.Variable) – The stationary distribution vector. It contains the normalized probability to stay in each vertex of the graph. So represents the PageRank ranking of the graph.
  • writer (tf.summary.FileWriter) – This attribute represents a TensorFlow’s Writer, that is used to obtain stats.
  • is_sparse (bool) – Use sparse Tensors if it’s set to True. Not implemented yet.
update_edge(edge: numpy.ndarray, change: float) → None[source]

The callback to receive notifications about edge changes in the graph.

This method is called from the Graph when an addition or deletion is produced on the edge set. So probably is necessary to recompute the PageRank ranking.

Parameters:
  • edge (np.ndarray) – A 1-D np.ndarray that represents the edge that changes in the graph, where edge[0] is the source vertex, and edge[1] the destination vertex.
  • change (float) – The variation of the edge weight. If the final value is 0.0 then the edge is removed.
Returns:

This method returns nothing.