Showing 24 changed files with 18,885 additions and 1 deletion.
Submodule miniVite deleted from f4367f
@@ -0,0 +1,270 @@ | ||
***************** | ||
* miniVite FAQs * | ||
***************** | ||
---------------------------------------------------- | ||
FYI, typical "How to run" queries are addressed from Q5
onward.
|
||
Please send your suggestions for improving this FAQ | ||
to zsayanz at gmail dot com OR hala at pnnl dot gov. | ||
---------------------------------------------------- | ||
|
||
------------------------------------------------------------------------- | ||
Q1. What is graph community detection? | ||
------------------------------------------------------------------------- | ||
|
||
A1. In most real-world graphs/networks, the nodes/vertices tend to be | ||
organized into tightly-knit modules known as communities or clusters, | ||
such that nodes within a community are more likely to be "related" to | ||
one another than they are to the rest of the network. The goodness of | ||
partitioning into communities is typically measured using a metric | ||
called modularity. Community detection is the method of identifying | ||
these clusters or communities in graphs. | ||
|
||
[References] | ||
|
||
Fortunato, Santo. "Community detection in graphs." Physics reports | ||
486.3-5 (2010): 75-174. https://arxiv.org/pdf/0906.0612.pdf | ||
|
||
-------------------------------------------------------------------------- | ||
Q2. What is miniVite? | ||
-------------------------------------------------------------------------- | ||
|
||
A2. miniVite is a distributed-memory code (or mini application) that
performs partial graph community detection using the Louvain method.
The Louvain method is a multi-phase, iterative heuristic that performs
modularity optimization for graph community detection. miniVite only
performs the first phase of the Louvain method.
|
||
[Code] | ||
|
||
https://github.com/Exa-Graph/miniVite | ||
http://hpc.pnl.gov/people/hala/grappolo.html | ||
|
||
[References] | ||
|
||
Blondel, Vincent D., et al. "Fast unfolding of communities in large | ||
networks." Journal of statistical mechanics: theory and experiment | ||
2008.10 (2008): P10008. | ||
|
||
Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Gebremedhin AH. | ||
miniVite: A Graph Analytics Benchmarking Tool for Massively Parallel | ||
Systems. | ||
|
||
--------------------------------------------------------------------------- | ||
Q3. What is the parent application of miniVite? How are they different? | ||
--------------------------------------------------------------------------- | ||
|
||
A3. miniVite is derived from Vite, which implements the multi-phase | ||
Louvain method. Apart from a parallel baseline version, Vite provides | ||
a number of heuristics (such as early termination, threshold cycling and | ||
incomplete coloring) that can improve the scalability and quality of | ||
community detection. In contrast, miniVite provides only a parallel
baseline version and has an option to select different MPI communication
methods (such as send/recv, collectives and RMA) for one of the most
communication-intensive portions of the code. miniVite also includes an
in-memory random geometric graph generator, making it convenient for
users to run miniVite without any external files. Vite can also convert
graphs from different native formats (like matrix market, SNAP, edge
list, DIMACS, etc.) to the binary format that both Vite and miniVite
require.
|
||
[Code] | ||
|
||
http://hpc.pnl.gov/people/hala/grappolo.html | ||
|
||
[References] | ||
|
||
Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Lu H, Chavarria-Miranda D, | ||
Khan A, Gebremedhin A. Distributed louvain algorithm for graph community | ||
detection. In 2018 IEEE International Parallel and Distributed Processing | ||
Symposium (IPDPS) 2018 May 21 (pp. 885-895). IEEE. | ||
|
||
Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Gebremedhin AH. | ||
Scalable Distributed Memory Community Detection Using Vite. | ||
In 2018 IEEE High Performance extreme Computing Conference (HPEC) 2018 | ||
Sep 25 (pp. 1-7). IEEE. | ||
|
||
----------------------------------------------------------------------------- | ||
Q4. Is there a shared-memory equivalent of Vite/miniVite? | ||
----------------------------------------------------------------------------- | ||
|
||
A4. Yes, Grappolo performs shared-memory community detection using the
Louvain method. Apart from community detection, Grappolo has routines for
matrix reordering as well.
|
||
[Code] | ||
|
||
http://hpc.pnl.gov/people/hala/grappolo.html | ||
|
||
[References] | ||
|
||
Lu H, Halappanavar M, Kalyanaraman A. Parallel heuristics for scalable | ||
community detection. Parallel Computing. 2015 Aug 1;47:19-37. | ||
|
||
Halappanavar M, Lu H, Kalyanaraman A, Tumeo A. Scalable static and dynamic | ||
community detection using grappolo. In High Performance Extreme Computing | ||
Conference (HPEC), 2017 IEEE 2017 Sep 12 (pp. 1-6). IEEE. | ||
|
||
------------------------------------------------------------------------------ | ||
Q5. How does one perform strong scaling analysis using miniVite? How to | ||
determine 'good' candidates (input graphs) that can be used for strong | ||
scaling runs? How much time is approximately spent in performing I/O? | ||
------------------------------------------------------------------------------ | ||
|
||
A5. Use a large graph as input, preferably one with over a billion edges. Not
all large graphs have a good community structure, so you may need a few trials
to identify one that serves your purpose. Graphs can be obtained from various
websites serving as repositories, such as the Sparse TAMU collection[1], the
SNAP repository[2] and the MIT Graph Challenge website[3], to name a few of the
prominent ones. You can convert graphs from their native format to the binary
format that miniVite requires using the converters in Vite (please see README).
If your graph is in Webgraph[4] format, you can convert it to an edge list
first (example code snippet below) before passing it on to Vite for subsequent
binary conversion.
|
||
#include <fstream>
#include "offline_edge_iterator.hpp"
...
using namespace webgraph::ascii_graph;

// argv[1] is the input graph, argv[2] is the output edge-list file
std::ofstream ofile(argv[2]);
offline_edge_iterator itor(argv[1]), end;

// write every edge as one "source target" line
while (itor != end) {
    ofile << itor->first << " " << itor->second << '\n';
    ++itor;
}
ofile.close();
...
|
||
Due to its simple vertex-based distribution, miniVite takes about 2-4 s to read
a 55 GB binary file when a burst buffer (Cray DataWarp) or Lustre striping
(about 25 OSTs, default 1M blocks) is used. Hence, the overall I/O time that we
have observed in most cases is within 1/2% of the overall execution time.
|
||
[1] https://sparse.tamu.edu/ | ||
[2] http://snap.stanford.edu/data | ||
[3] http://graphchallenge.mit.edu/data-sets | ||
[4] http://webgraph.di.unimi.it/ | ||
|
||
----------------------------------------------------------------------------------- | ||
Q6. How does one perform weak scaling analysis using miniVite? How does one scale | ||
the graphs with processes? | ||
----------------------------------------------------------------------------------- | ||
|
||
A6. miniVite has an in-memory random geometric graph generator (please see
README) that can be used for weak-scaling analysis. An n-D random geometric
graph (RGG) is generated by randomly placing N vertices in an n-D space and
connecting pairs of vertices whose Euclidean distance is less than or equal to
d. We only consider 2D RGGs contained within a unit square, [0,1]^2. We
distribute the domain such that each process receives N/p vertices (where p is
the total number of processes).
|
||
Each process owns a (1 x 1/p) portion of the unit square, and d is computed as
follows (please refer to Section 4 of the miniVite paper for details):

    d = (dc + dt) / 2
    where dc = sqrt(ln(N) / (pi * N)) and dt = sqrt(2.0736 / (pi * N))
|
||
Therefore, the number of vertices (N) passed during miniVite execution on p | ||
processes must satisfy the condition -- 1/p > d. | ||
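
The condition above can be checked before launching a run. The following C++
sketch evaluates d for a given N and tests whether 1/p > d holds. The function
names are hypothetical and chosen for this illustration only; they are not
part of miniVite.

```cpp
#include <cmath>
#include <cstdint>

// Distance threshold d = (dc + dt) / 2 for a 2D RGG with n_vertices vertices,
// where dc = sqrt(ln(N) / (pi * N)) and dt = sqrt(2.0736 / (pi * N)).
double rgg_distance_threshold(std::int64_t n_vertices) {
    const double pi = std::acos(-1.0);
    const double n = static_cast<double>(n_vertices);
    const double dc = std::sqrt(std::log(n) / (pi * n));
    const double dt = std::sqrt(2.0736 / (pi * n));
    return 0.5 * (dc + dt);
}

// Returns true when the condition 1/p > d holds for N vertices on p processes.
bool rgg_config_is_valid(std::int64_t n_vertices, int n_procs) {
    if (n_vertices < 2 || n_procs < 1) return false;  // guard against degenerate input
    return 1.0 / static_cast<double>(n_procs) > rgg_distance_threshold(n_vertices);
}
```

For example, with N = 1,000,000 the threshold d is roughly 0.00145, so 512
processes satisfy the condition (1/512 is about 0.00195) while 1024 processes
do not (1/1024 is about 0.00098).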
|
||
Please note, the default distribution of graph generated from the in-built random | ||
geometric graph generator causes a process to only communicate with its two | ||
immediate neighbors. If you want to increase the communication intensity for | ||
generated graphs, please use the "-p" option to specify an extra percentage of edges | ||
that will be generated, linking random vertices. As a side-effect, this option | ||
significantly increases the time required to generate the graph. | ||
|
||
------------------------------------------------------------------------------ | ||
Q7. Does Vite (the parent application to miniVite) have an in-built graph | ||
generator? | ||
------------------------------------------------------------------------------ | ||
|
||
A7. At present, Vite does not have an in-built graph generator like the one in
miniVite, so we rely on users providing external graphs for Vite (strong/weak
scaling) analysis. However, Vite has bindings to NetworKit[5], and users can use
those bindings to generate graphs of their choice from Vite (refer to the
README). Generating large graphs in this manner can take a lot of time, since
there are intermediate copies and the graph generators themselves may be serial
or may use threads on a shared-memory system. We do not plan on supporting the
NetworKit bindings in the future.
|
||
[5] https://networkit.github.io/ | ||
|
||
------------------------------------------------------------------------------ | ||
Q8. Does providing a larger input graph translate to comparatively larger | ||
execution times? Is it possible to control the execution time for a particular | ||
graph? | ||
------------------------------------------------------------------------------ | ||
|
||
A8. No. A relatively small graph can run for many iterations, whereas a larger
graph may converge in just a few. Since miniVite is iterative, the number of
iterations to convergence (and hence the execution time) depends on the
structure of the graph. It is, however, possible to exit early by passing a
larger threshold with the "-t <...>" option. The default threshold (or
tolerance) is 1.0E-06; passing a larger value, e.g., "-t 1.0E-03", should
reduce the overall execution time for graphs in general (at least for miniVite,
which only executes the first phase of the Louvain method).
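
The effect of the threshold can be pictured with a small C++ sketch. It
assumes that iteration stops as soon as the gain in modularity between two
consecutive iterations falls below the threshold; the function name and the
list of per-iteration modularity values are hypothetical and serve only as an
illustration, not as a copy of the miniVite implementation.

```cpp
#include <cstddef>
#include <vector>

// Counts how many iterations execute before the modularity gain between
// consecutive iterations drops below `threshold`.
// modularity_per_iteration holds hypothetical modularity values, one per
// iteration, in the order in which they would be produced.
std::size_t iterations_until_converged(
        const std::vector<double>& modularity_per_iteration, double threshold) {
    double previous = -1.0;  // below the minimum possible modularity
    std::size_t iterations = 0;
    for (double current : modularity_per_iteration) {
        ++iterations;
        if (current - previous < threshold) break;  // gain too small: stop
        previous = current;
    }
    return iterations;
}
```

With the hypothetical sequence 0.30, 0.50, 0.58, 0.5805, 0.58051, 0.58051, a
threshold of 1.0E-06 runs all six iterations, while a threshold of 1.0E-03
stops after the fourth.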
|
||
------------------------------------------------------------------------------ | ||
Q9. Is there an option to add some noise in the generated random geometric | ||
graphs? | ||
------------------------------------------------------------------------------ | ||
|
||
A9. Yes, the "-p <percent>" option allows extra edges to be added between
random vertices (see README). This increases the overall communication, but it
also perturbs the community structure of the generated graph. As a result,
adding extra edges will most probably reduce the global modularity and
decrease the number of iterations to convergence.
The number of extra edges that can be added is bounded by INT_MAX; at
present, we do not handle data ranges beyond INT_MAX.
|
||
------------------------------------------------------------------------------ | ||
Q10. What are the steps required for using real-world graphs as an input to | ||
miniVite? | ||
------------------------------------------------------------------------------ | ||
|
||
A10. First, please download Vite (parent application of miniVite) from: | ||
http://hpc.pnl.gov/people/hala/grappolo.html | ||
|
||
Graphs/sparse matrices come in several native formats (matrix market, SNAP,
DIMACS, etc.). Vite has several options to convert graphs from their native
format to the binary format that miniVite requires (please take a look at the
Vite README).
|
||
As an example, you can download the Friendster file from: | ||
https://sparse.tamu.edu/SNAP/com-Friendster | ||
The option to convert Friendster to binary using Vite's converter is as follows | ||
(please note, this part is serial): | ||
|
||
$VITE_BIN_PATH/bin/./fileConvertDist -f $INPUT_PATH/com-Friendster.mtx | ||
-m -o $OUTPUT_PATH/com-Friendster.bin | ||
|
||
After the conversion, you can run miniVite with the binary file obtained | ||
from the previous step: | ||
|
||
mpiexec -n <...> $MINIVITE_PATH/./dspl -r <processes-per-node> | ||
-f $FILE_PATH/com-Friendster.bin | ||
|
||
-------------------------------------------------------------------------------- | ||
Q11. miniVite is scalable for a particular input graph, but not for another | ||
similar sized graph, why is that? | ||
-------------------------------------------------------------------------------- | ||
|
||
A11. Presently, our distribution is vertex-based. That means a process owns N/p | ||
vertices and all the edges connected to those N/p vertices (including ghost | ||
vertices). Load imbalances are very probable in this type of distribution, | ||
depending on the graph structure. | ||
|
||
As an example, let's say there is a large (real-world) graph, and its structure
is such that only a few processes end up owning a majority of the edges, as per
the miniVite graph data distribution. Also, let's assume that the graph has
either a very poor community structure (modularity closer to 0) or a very
stable community structure (modularity close to 1 after a few iterations, which
means not many vertices are migrating to neighboring communities). In both
these cases, community detection in miniVite will run for relatively few
iterations, which may affect the overall scalability.
@@ -0,0 +1,29 @@ | ||
BSD 3-Clause License | ||
|
||
Copyright (c) 2018, Battelle Memorial Institute | ||
All rights reserved. | ||
|
||
Redistribution and use in source and binary forms, with or without | ||
modification, are permitted provided that the following conditions are met: | ||
|
||
* Redistributions of source code must retain the above copyright notice, this | ||
list of conditions and the following disclaimer. | ||
|
||
* Redistributions in binary form must reproduce the above copyright notice, | ||
this list of conditions and the following disclaimer in the documentation | ||
and/or other materials provided with the distribution. | ||
|
||
* Neither the name of the copyright holder nor the names of its | ||
contributors may be used to endorse or promote products derived from | ||
this software without specific prior written permission. | ||
|
||
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" | ||
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE | ||
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE | ||
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL | ||
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR | ||
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER | ||
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, | ||
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. |
@@ -0,0 +1,33 @@ | ||
CXX = mpicxx | ||
# use -xmic-avx512 instead of -xHost for Intel Xeon Phi platforms | ||
PLUGIN_FLAG = -Xclang -load -Xclang ~/git/unifiedmem/code/llvm-pass/build/uvm/libOMPPass.so | ||
#OPTFLAGS = -O3 -xHost -qopenmp -DCHECK_NUM_EDGES #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD | ||
OPTFLAGS = -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -DOMP_GPU -DOMP_GPU_ALLOC -DCHECK_NUM_EDGES #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD | ||
#OPTFLAGS = -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -DOMP_GPU -DCHECK_NUM_EDGES #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD | ||
#OPTFLAGS = -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -DOMP_GPU -DCHECK_NUM_EDGES -DDEBUG_PRINTF #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD | ||
#OPTFLAGS = -O3 -fopenmp -DOMP_GPU -DCHECK_NUM_EDGES -DDEBUG_PRINTF #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD | ||
#OPTFLAGS = -O3 -fopenmp -DCHECK_NUM_EDGES -DDEBUG_PRINTF #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD | ||
#-DUSE_MPI_SENDRECV | ||
#-DUSE_MPI_COLLECTIVES | ||
# use export ASAN_OPTIONS=verbosity=1 to check ASAN output | ||
SNTFLAGS = -std=c++11 -fopenmp -fsanitize=address -O1 -fno-omit-frame-pointer | ||
CXXFLAGS = -std=c++11 -g $(OPTFLAGS) | ||
|
||
OBJ = main.o | ||
TARGET = miniVite | ||
|
||
all: $(TARGET)

%.o: %.cpp
	$(CXX) $(CXXFLAGS) $(PLUGIN_FLAG) -c -o $@ $^

%.ll: %.cpp
	$(CXX) $(CXXFLAGS) $(PLUGIN_FLAG) -emit-llvm -S -c -o $@ $^

$(TARGET): $(OBJ)
	$(CXX) $^ $(OPTFLAGS) -o $@

.PHONY: clean

clean:
	rm -rf *~ $(OBJ) $(TARGET) *.ll