Skip to content

A local graph partitioning algorithm based on a nonlinear modification of the PageRank algorithm.

License

Notifications You must be signed in to change notification settings

DmsPas/Nonlinear_modified_PageRank

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

35 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Nonlinear modified PageRank problem for local graph partitioning

This repository contains the official code for the paper "Nonlinear modified PageRank problem for local graph partitioning", available online at arXiv.

Initial data p Clusters

In this work, we developed a new method for local graph partitioning based on a nonlinear generalisation of the PageRank problem involving the Moore-Penrose inverse of the incidence matrix. The Levenberg-Marquardt method is used with a full rank Jacobian variant to obtain a numerical solution to the generalised problem. Sets of vertices are formed according to the ranking supplied by the solution, and a conductance criterion decides upon the set that best represents the local cluster around a starting vertex. Experiments on both synthetic and real-world inspired graphs demonstrate the capability of the approach to not only produce low conductance sets, but to also recover local clusters with an accuracy that consistently surpasses state-of-the-art algorithms.

Motivation & contributions

This work takes the system of linear equations that forms the PageRank problem as the starting point in the construction of a local clustering algorithm. Only clusters around a single vertex are considered. We study a modification to the PageRank problem, and the theoretical motivation for a nonlinear generalisation involving the p-norm. The Moore-Penrose inverse of the incidence matrix plays a very important role in this generalisation, which can reduce to a system of linear equations closely resembling the original modified PageRank problem. Additionally, an insight into the effect of the generalised problem on the cluster criterion based on an infinitesimal perturbation argument is offered. The Levenberg-Marquardt method with a full rank Jacobian variant is suggested for obtaining a numerical solution to the generalised problem.

Requirement

Our code uses and adapts the Levenberg-Marquardt implementation from immoptibox. The necessary paths are included in the script addpaths_NPR.m. The code has been tested in Mac and Ubuntu distributions. All the algorithms are implemented in MATLAB R2021b.

Input data

The complete list of datasets used in this study can be found at this permalink. A small subset of this data is stored in Input/.

Usage

The main script Benchmark_NPR.m runs the experiments on the input data. For the ORBIS real-world experiment, use the script Benchmark_NPR_Orbis.m.

Run: To perform local graph clustering on the available input data include the desired dataset in cases{} (Line 16), and type in the matlab command line

>> cd src/
>> Benchmark_NPR

or

>> cd src/
>> Benchmark_NPR_Orbis

The following parameters (Lines 30 - 38) control important features of the algorithm:

print_level          = 2;     % 0/1/2 for no print/low/high verbosity
p_levels             = [1.95; 1.9; 1.8; 1.7; 1.6; 1.5; 1.45]; % levels of p
% β = (1 - α)/α, where α is the teleportation constant
beta                 = 0.01;      % for majority of cases
num_trials           = 10;      % number of different seed nodes
norm_Lap             = 2;       % 0/1/2 for combinatorial/norm. symmetric/random walk Laplacian
write_output_to_file = false;   % write output to file

Output: The output is printed in the command window for each case under question, or saved to a file if write_output_to_file = true. It includes the index of the seed node (s_node), the value of p at which the best partitioning was found (best_p), the values of the best conductance (RCCut) and F-score, and the elapsed time in seconds of the NPR algorithm per starting node. Mean results are then displayed with standard deviation based on all the seed nodes. The ORBIS experiments have additional visual output, illustrating the classification of the Roman settlements on the map. An example is offered below.

Const clusters Lond clusters Roma clusters

Code Structure

The structure of the files in this repository is as follows:

├── Input                       # input graphs in .mat format
├── immoptibox                  # DTU toolbox (modified) with functions for optimization and data fitting
├── src                         # source code of the paper
├── src/Optimization            # objective, Jacobian, and Hessian computations
├── src/Metrics                 # evaluation of the clustering results
├── src/Graph_Construction      # graph generation code
├── src/Visualization           # graph visualization code and some figures

Further details can be found within the code.

Maintainers

Citation

@misc{KoPa2024,
      title={Nonlinear Modified PageRank Problem for Local Graph Partitioning}, 
      author={Costy Kodsi and Dimosthenis Pasadakis},
      year={2024},
      eprint={2409.01834},
      archivePrefix={arXiv},
      primaryClass={math.NA},
      url={https://arxiv.org/abs/2409.01834}, 
}

About

A local graph partitioning algorithm based on a nonlinear modification of the PageRank algorithm.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages