-
Notifications
You must be signed in to change notification settings - Fork 0
/
README
executable file
·126 lines (104 loc) · 5.39 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
This clone of cuVite will modify the graph data structs to eliminate the CPU overhead for heterogeneous CPU and CPU implementation
*************
-------------
COMPILATION
-------------
*************
Pass -DDEBUG_PRINTF if detailed diagonostics is required along
program run. This program requires OpenMP and C++11 support,
so pass -fopenmp (for g++)/-qopenmp (for icpc) and -std=c++11/
-std=c++0x. For Cray systems, use CC.
Pass -DUSE_32_BIT_GRAPH if number of nodes in the graph are
within 32-bit range (2 x 10^9), else 64-bit range is assumed.
Pass -DDONT_CREATE_DIAG_FILES if you dont want to create 2 files
per process with detail diagonostics.
Upon building, the program will generate a binary file named
`graphClustering`.
Cmake build option:
$ mkdir build install
$ cd build
$ cmake ../ -DCMAKE_INSTALL_PREFIX=~/vite-gpu/install -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx
$ make -j4
$ make install
*******************
-------------------
GRAPH CONVERSION
-------------------
*******************
The file format of cuVite is compatible with Vite, so please see Vite
README for information on binary file conversion from the various native
formats.
https://github.com/Exa-Graph/vite/blob/master/README
Also, see above for comparing communities with a ground-truth.
***********************
-----------------------
EXECUTING THE PROGRAM
-----------------------
***********************
0. Input file conversion to Vite binary format from the respective native
file formats: convert the input file to binary format using fileConvert.
1. Set the desired number of processes and threads.
2. Run distributed parallel Louvain algorithm (graphClustering binary)
using the binary file created in Step #0:
mpiexec -n 2 bin/./graphClustering -c -i -f karate.bin
Possible options (can be combined):
1. -c <ncolors> : Enable coloring, adjacent vertices are processed
in different order. Synchronization happens once
per <ncolors>.
2. -d <ncolors> : Enable coloring, adjacent vertices are processes
in different order. No synchronization.
3. -i : Threshold cycling, threshold changes dynamically
in every phase of Louvain algorithm.
4. -t 1 : If a vertex is in the same community for past 3
iterations, then consider it inactive.
5. -t 3 : If a vertex is in the same community for past 3
iterations, then consider it inactive. Also,
adds a communication step to gather inactive
vertices and terminate Louvain if >= 90% vertices
at an iteration are inactive.
6. -t 2 -a <0-1> : Early termination* using probability alpha.
7. -t 4 -a <0-1> : Early termination* using probability alpha. Also,
adds a communication step to gather inactive
vertices and terminate Louvain if >= 90% vertices
at an iteration are inactive.
8. -b : Only valid for real-world inputs. Attempts to
distribute approximately equal number of edges among
processes. Irregular number of vertices owned by a
particular process. Increases the distributed graph
creation time due to serial overheads, but may
significantly improve the overall execution time.
9. -o : Output communities into a file. This option will result
in Vite dumping the communities (community-per-vertex in
each line, total number of lines == number of vertices)
in a text file named <input-binary-file>.communities in
the same path as the input binary file.
10. -r <nranks> : This is used to control the number of aggregators in MPI
I/O and is meaningful when an input binary graph file is
passed with option "-f".
naggr := (nranks > 1) ? (nprocs/nranks) : nranks;
11. -g <gfile> : Pass a ground truth file for community comparison. We
expect the ground truth file to contain N lines (equal to
the total #vertices in the graph), while each line containing
a distinct vertex ID and associated community ID, separated by
a space or tab. Ground truth community comparison is performed
in a single node, and it uses OpenMP to parallelize. It may take
a substantial amount of time for large files.
12. -z : Only applicable if "-g <gfile>" option is passed. This tells us
that the passed ground truth file is 1-based. If this option is
not passed, we assume the ground truth to the 0-based.
*Note: Option to update inactive vertices percentage is defined as
macro ET_CUTOFF in louvain.hpp, and the default is 2%.
*Note: Only the default version has been ported on GPUs, most of the
heuristics mentioned above will only work for the CPU-only version.
See Vite: https://github.com/Exa-Graph/vite/blob/master/README
***************
---------------
OUTPUT RESULT
---------------
***************
If -DDONT_CREATE_DIAG_FILES is passed during compilation,
then output is send to stdout.
Otherwise, the output result is dumped per process on files
named as dat.out.<process-id>. Check dat.out.0 to review
program diagonostics.
Output files are cleared with: `make clean`.