Readd miniVite
lingda-li committed Aug 1, 2019
1 parent 8246268 commit 584f234
Showing 24 changed files with 18,885 additions and 1 deletion.
1 change: 0 additions & 1 deletion miniVite
Submodule miniVite deleted from f4367f
270 changes: 270 additions & 0 deletions miniVite/FAQS
@@ -0,0 +1,270 @@
*****************
* miniVite FAQs *
*****************
----------------------------------------------------
FYI, typical "How to run" queries are addressed from Q5
onward.

Please send your suggestions for improving this FAQ
to zsayanz at gmail dot com OR hala at pnnl dot gov.
----------------------------------------------------

-------------------------------------------------------------------------
Q1. What is graph community detection?
-------------------------------------------------------------------------

A1. In most real-world graphs/networks, the nodes/vertices tend to be
organized into tightly-knit modules known as communities or clusters,
such that nodes within a community are more likely to be "related" to
one another than they are to the rest of the network. The goodness of
partitioning into communities is typically measured using a metric
called modularity. Community detection is the method of identifying
these clusters or communities in graphs.
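
To make the modularity metric concrete, here is a minimal serial sketch in
C++. It assumes an undirected, unweighted graph without self loops, given as
an edge list, and uses the standard definition
Q = sum over communities c of [ in_c/(2m) - (tot_c/(2m))^2 ], where m is the
number of edges, in_c is twice the number of edges inside c, and tot_c is the
sum of degrees of the vertices in c. The function name and data layout are
illustrative only and are not taken from miniVite.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Modularity of a partition of an undirected, unweighted graph
// without self loops (illustrative helper, not miniVite code).
// edges: every undirected edge listed once as (u, v).
// comm:  comm[v] is the community id of vertex v, in [0, comm.size()).
double modularity(const std::vector<std::pair<int, int>>& edges,
                  const std::vector<int>& comm) {
    assert(!edges.empty());
    const double two_m = 2.0 * static_cast<double>(edges.size());
    std::vector<double> internal(comm.size(), 0.0);   // 2 * (edges inside c)
    std::vector<double> degree_sum(comm.size(), 0.0); // sum of degrees in c
    for (const auto& e : edges) {
        const int cu = comm[e.first];
        const int cv = comm[e.second];
        degree_sum[cu] += 1.0;
        degree_sum[cv] += 1.0;
        if (cu == cv) internal[cu] += 2.0; // edge seen from both endpoints
    }
    double q = 0.0;
    for (std::size_t c = 0; c < comm.size(); ++c) {
        const double a = degree_sum[c] / two_m;
        q += internal[c] / two_m - a * a;
    }
    return q;
}
```

For two triangles joined by a single edge, placing each triangle in its own
community gives Q = 5/14, while placing all six vertices in one community
gives Q = 0.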

[References]

Fortunato, Santo. "Community detection in graphs." Physics reports
486.3-5 (2010): 75-174. https://arxiv.org/pdf/0906.0612.pdf

--------------------------------------------------------------------------
Q2. What is miniVite?
--------------------------------------------------------------------------

A2. miniVite is a distributed-memory code (or mini application) that
performs partial graph community detection using the Louvain method.
The Louvain method is a multi-phase, iterative heuristic that performs
modularity optimization for graph community detection. miniVite only
performs the first phase of the Louvain method.
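
The first phase (local moving) can be sketched serially as follows. This is
a simplified illustration, not miniVite's distributed implementation: it
assumes an undirected, unweighted graph without self loops stored as
adjacency lists, starts with every vertex in its own community, and stops
when a sweep moves no vertex (a modularity threshold could be used instead).
All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// One sweep: each vertex moves to the neighboring community with the best
// modularity gain. Returns the number of vertices that changed community.
int louvain_sweep(const std::vector<std::vector<int>>& adj,
                  std::vector<int>& comm,
                  std::vector<double>& comm_degree, // sum of degrees per community
                  double two_m) {
    int moved = 0;
    for (std::size_t v = 0; v < adj.size(); ++v) {
        const double k_v = static_cast<double>(adj[v].size());
        std::map<int, double> links; // edges from v into each nearby community
        for (int u : adj[v]) links[comm[u]] += 1.0;
        const int old_c = comm[v];
        comm_degree[old_c] -= k_v; // temporarily remove v from its community
        // Gain of joining c is proportional to links[c] - tot_c * k_v / (2m).
        int best_c = old_c;
        double best_gain = links[old_c] - comm_degree[old_c] * k_v / two_m;
        for (const auto& kv : links) {
            const double gain = kv.second - comm_degree[kv.first] * k_v / two_m;
            if (gain > best_gain) { best_gain = gain; best_c = kv.first; }
        }
        comm_degree[best_c] += k_v;
        if (best_c != old_c) { comm[v] = best_c; ++moved; }
    }
    return moved;
}

// Repeats sweeps until no vertex moves or max_sweeps is reached.
std::vector<int> louvain_first_phase(const std::vector<std::vector<int>>& adj,
                                     int max_sweeps = 100) {
    const std::size_t n = adj.size();
    std::vector<int> comm(n);
    std::vector<double> comm_degree(n);
    double two_m = 0.0;
    for (std::size_t v = 0; v < n; ++v) {
        comm[v] = static_cast<int>(v); // every vertex starts alone
        comm_degree[v] = static_cast<double>(adj[v].size());
        two_m += comm_degree[v];
    }
    assert(two_m > 0.0);
    for (int s = 0; s < max_sweeps; ++s)
        if (louvain_sweep(adj, comm, comm_degree, two_m) == 0) break;
    return comm;
}
```

On two triangles joined by one edge, this sketch ends with each triangle in
its own community.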

[Code]

https://github.com/Exa-Graph/miniVite
http://hpc.pnl.gov/people/hala/grappolo.html

[References]

Blondel, Vincent D., et al. "Fast unfolding of communities in large
networks." Journal of statistical mechanics: theory and experiment
2008.10 (2008): P10008.

Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Gebremedhin AH.
miniVite: A Graph Analytics Benchmarking Tool for Massively Parallel
Systems.

---------------------------------------------------------------------------
Q3. What is the parent application of miniVite? How are they different?
---------------------------------------------------------------------------

A3. miniVite is derived from Vite, which implements the multi-phase
Louvain method. Apart from a parallel baseline version, Vite provides
a number of heuristics (such as early termination, threshold cycling and
incomplete coloring) that can improve the scalability and quality of
community detection. In contrast, miniVite provides only a parallel
baseline version, and has an option to select different MPI communication
methods (such as send/recv, collectives and RMA) for one of the most
communication-intensive portions of the code. miniVite also includes an
in-memory random geometric graph generator, making it convenient for
users to run miniVite without any external files. Vite can also convert
graphs from different native formats (like matrix market, SNAP, edge
list, DIMACS, etc.) to the binary format that both Vite and miniVite
require.

[Code]

http://hpc.pnl.gov/people/hala/grappolo.html

[References]

Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Lu H, Chavarria-Miranda D,
Khan A, Gebremedhin A. Distributed louvain algorithm for graph community
detection. In 2018 IEEE International Parallel and Distributed Processing
Symposium (IPDPS) 2018 May 21 (pp. 885-895). IEEE.

Ghosh S, Halappanavar M, Tumeo A, Kalyanaraman A, Gebremedhin AH.
Scalable Distributed Memory Community Detection Using Vite.
In 2018 IEEE High Performance extreme Computing Conference (HPEC) 2018
Sep 25 (pp. 1-7). IEEE.

-----------------------------------------------------------------------------
Q4. Is there a shared-memory equivalent of Vite/miniVite?
-----------------------------------------------------------------------------

A4. Yes, Grappolo performs shared-memory community detection using the
Louvain method. Apart from community detection, Grappolo also has routines
for matrix reordering.

[Code]

http://hpc.pnl.gov/people/hala/grappolo.html

[References]

Lu H, Halappanavar M, Kalyanaraman A. Parallel heuristics for scalable
community detection. Parallel Computing. 2015 Aug 1;47:19-37.

Halappanavar M, Lu H, Kalyanaraman A, Tumeo A. Scalable static and dynamic
community detection using grappolo. In High Performance Extreme Computing
Conference (HPEC), 2017 IEEE 2017 Sep 12 (pp. 1-6). IEEE.

------------------------------------------------------------------------------
Q5. How does one perform strong scaling analysis using miniVite? How to
determine 'good' candidates (input graphs) that can be used for strong
scaling runs? How much time is approximately spent in performing I/O?
------------------------------------------------------------------------------

A5. Use a large graph as an input, preferably with over a billion edges. Not all
large graphs have a good community structure. You should be able to identify
one that serves your purpose, hopefully after a few trials. Graphs can be
obtained from various websites serving as repositories, such as the Sparse TAMU
collection[1], the SNAP repository[2] and the MIT Graph Challenge website[3], to name
a few of the prominent ones. You can convert graphs from their native format to
the binary format that miniVite requires, using the converters in Vite (please
see README). If your graph is in Webgraph[4] format, you can easily convert it
to an edge list first (example code snippet below), before passing it on to Vite
for subsequent binary conversion.

#include <fstream>
#include "offline_edge_iterator.hpp"
...
using namespace webgraph::ascii_graph;

// argv[1]: input Webgraph file, argv[2]: output edge list file
std::ofstream ofile(argv[2]);
offline_edge_iterator itor(argv[1]), end;

// write one "source target" pair per line
while( itor != end ) {
    ofile << itor->first << " " << itor->second << "\n";
    ++itor;
}
ofile.close();
...

Due to its simple vertex-based distribution, miniVite takes about 2-4s to read a 55GB
binary file if you use Burst buffer (Cray DataWarp) or Lustre striping (about 25 OSTs,
default 1M blocks). Hence, the overall I/O time that we have observed in most cases is
within 1/2% of the overall execution time.

[1] https://sparse.tamu.edu/
[2] http://snap.stanford.edu/data
[3] http://graphchallenge.mit.edu/data-sets
[4] http://webgraph.di.unimi.it/

-----------------------------------------------------------------------------------
Q6. How does one perform weak scaling analysis using miniVite? How does one scale
the graphs with processes?
-----------------------------------------------------------------------------------

A6. miniVite has an in-memory random geometric graph generator (please see
README) that can be used for weak-scaling analysis. An n-D random geometric graph
(RGG) is generated by randomly placing N vertices in an n-D space and connecting
pairs of vertices whose Euclidean distance is less than or equal to d. We only
consider 2D RGGs contained within a unit square, [0,1]^2. We distribute the domain
such that each process receives N/p vertices (where p is the total
number of processes).
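
As an illustration of the RGG definition only, the following is a naive
serial sketch that places N points uniformly in the unit square and tests
every pair, so it costs O(N^2) time. It is not miniVite's distributed
generator; the struct, function name and seed parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// A 2D random geometric graph: vertex coordinates plus an edge list.
struct Rgg {
    std::vector<std::pair<double, double>> pos;
    std::vector<std::pair<int, int>> edges;
};

// Places n vertices uniformly at random in the unit square and connects
// every pair whose Euclidean distance is at most d.
Rgg make_rgg(int n, double d, unsigned seed) {
    assert(n > 0 && d >= 0.0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> coord(0.0, 1.0);
    Rgg g;
    g.pos.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        const double x = coord(gen);
        const double y = coord(gen);
        g.pos.emplace_back(x, y);
    }
    const double d2 = d * d; // compare squared distances to avoid sqrt
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            const double dx = g.pos[i].first - g.pos[j].first;
            const double dy = g.pos[i].second - g.pos[j].second;
            if (dx * dx + dy * dy <= d2) g.edges.emplace_back(i, j);
        }
    }
    return g;
}
```

With d larger than the diagonal of the unit square (for example d = 2), every
pair is connected and the result is a complete graph with N(N-1)/2 edges.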

Each process owns (1 * 1/p) portion of the unit square and d is computed as (please
refer to Section 4 of miniVite paper for details):

d = (dc + dt)/2;
where dc = sqrt(ln(N) / (pi*N)); dt = sqrt(2.0736 / (pi*N))

Therefore, the number of vertices (N) passed during miniVite execution on p
processes must satisfy the condition -- 1/p > d.
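
The check can be sketched as a small helper. It reads pi*N as the
denominator inside both square roots, which is an interpretation of the
formula above; the function names are hypothetical and are not miniVite
options.

```cpp
#include <cassert>
#include <cmath>

// Distance threshold d for a 2D RGG with n vertices, as the mean of dc and dt.
double rgg_distance(long long n) {
    assert(n > 1);
    const double pi = std::acos(-1.0);
    const double nd = static_cast<double>(n);
    const double dc = std::sqrt(std::log(nd) / (pi * nd));
    const double dt = std::sqrt(2.0736 / (pi * nd));
    return (dc + dt) / 2.0;
}

// True if n vertices on p processes satisfy the condition 1/p > d.
bool rgg_fits(long long n, int p) {
    assert(p > 0);
    return 1.0 / static_cast<double>(p) > rgg_distance(n);
}
```

For N = 1,000,000 this gives d of roughly 0.00145, so 512 processes satisfy
the condition and 1024 processes do not.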

Please note, the default distribution of graph generated from the in-built random
geometric graph generator causes a process to only communicate with its two
immediate neighbors. If you want to increase the communication intensity for
generated graphs, please use the "-p" option to specify an extra percentage of edges
that will be generated, linking random vertices. As a side-effect, this option
significantly increases the time required to generate the graph.

------------------------------------------------------------------------------
Q7. Does Vite (the parent application to miniVite) have an in-built graph
generator?
------------------------------------------------------------------------------

A7. At present, Vite does not have an in-built graph generator like the one in
miniVite, so we rely on users providing external graphs for Vite (strong/weak
scaling) analysis. However, Vite has bindings to NetworKit[5], and users can use
those bindings to generate graphs of their choice from Vite (refer to the
README). Generating large graphs in this manner can take a lot of time, since
there are intermediate copies and the graph generators themselves may be serial
or may use threads on a shared-memory system. We do not plan on supporting the
NetworKit bindings in the future.

[5] https://networkit.github.io/

------------------------------------------------------------------------------
Q8. Does providing a larger input graph translate to comparatively larger
execution times? Is it possible to control the execution time for a particular
graph?
------------------------------------------------------------------------------

A8. No. A relatively small graph can run for many iterations, whereas a larger
graph may converge in just a few. Since miniVite is iterative, the final number
of iterations to convergence (and hence, execution time) depends on the
structure of the graph. It is, however, possible to exit early by passing a
larger threshold with the "-t <...>" option. The default threshold (or
tolerance) is 1.0E-06; passing a larger value, e.g., "-t 1.0E-03", should
reduce the overall execution time for all graphs in general (at least w.r.t.
miniVite, which only executes the first phase of the Louvain method).
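
The effect of the threshold can be illustrated with a small sketch. It
assumes the loop stops as soon as the modularity gain between consecutive
iterations drops below the threshold; the function and its input (a
hypothetical sequence of per-iteration modularity values) are for
illustration only and do not reproduce miniVite's actual loop.

```cpp
#include <cassert>
#include <vector>

// Counts how many iterations run before the modularity gain between
// consecutive iterations falls below the threshold.
int iterations_until_converged(const std::vector<double>& modularity_per_iter,
                               double threshold) {
    assert(threshold > 0.0);
    double prev = -1.0; // below the lowest possible modularity
    int iters = 0;
    for (double q : modularity_per_iter) {
        ++iters;
        if (q - prev < threshold) break; // gain too small: stop early
        prev = q;
    }
    return iters;
}
```

For a made-up sequence 0.30, 0.50, 0.58, 0.5805, 0.58050005, 0.58050006, a
threshold of 1.0E-03 stops after 4 iterations, while the default 1.0E-06
stops after 5.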

------------------------------------------------------------------------------
Q9. Is there an option to add some noise in the generated random geometric
graphs?
------------------------------------------------------------------------------

A9. Yes, the "-p <percent>" option allows extra edges to be added between
random vertices (see README). This increases the overall communication, but
it also affects the structure of communities in the generated graph. Adding
extra edges will most probably reduce the global modularity, and the number
of iterations to convergence is expected to decrease.
The maximum number of edges that can be added is bounded by INT_MAX; at
present, we do not handle data ranges larger than INT_MAX.

------------------------------------------------------------------------------
Q10. What are the steps required for using real-world graphs as an input to
miniVite?
------------------------------------------------------------------------------

A10. First, please download Vite (parent application of miniVite) from:
http://hpc.pnl.gov/people/hala/grappolo.html

Graphs/Sparse matrices come in several native formats (matrix market, SNAP,
DIMACS, etc.). Vite has several options to convert graphs from native to the
binary format that miniVite requires (please take a look at Vite README).

As an example, you can download the Friendster file from:
https://sparse.tamu.edu/SNAP/com-Friendster
The option to convert Friendster to binary using Vite's converter is as follows
(please note, this part is serial):

$VITE_BIN_PATH/bin/./fileConvertDist -f $INPUT_PATH/com-Friendster.mtx \
    -m -o $OUTPUT_PATH/com-Friendster.bin

After the conversion, you can run miniVite with the binary file obtained
from the previous step:

mpiexec -n <...> $MINIVITE_PATH/./dspl -r <processes-per-node> \
    -f $FILE_PATH/com-Friendster.bin

--------------------------------------------------------------------------------
Q11. miniVite is scalable for a particular input graph, but not for another
similar sized graph, why is that?
--------------------------------------------------------------------------------

A11. Presently, our distribution is vertex-based. That means a process owns N/p
vertices and all the edges connected to those N/p vertices (including ghost
vertices). Load imbalances are very probable in this type of distribution,
depending on the graph structure.
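
A rough way to estimate this imbalance before a run is sketched below. It
assumes each process owns a contiguous block of about N/p vertices and that
its load is the sum of the degrees of those vertices; both the block layout
and the function names are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Number of edge endpoints owned by each of p processes when vertices are
// split into contiguous blocks of roughly n/p vertices.
std::vector<long long> edges_per_process(const std::vector<long long>& degree,
                                         int p) {
    assert(p > 0);
    const long long n = static_cast<long long>(degree.size());
    std::vector<long long> load(static_cast<std::size_t>(p), 0);
    for (int r = 0; r < p; ++r) {
        const long long lo = r * n / p;       // first vertex owned by rank r
        const long long hi = (r + 1) * n / p; // one past the last owned vertex
        for (long long v = lo; v < hi; ++v) load[r] += degree[v];
    }
    return load;
}

// Ratio of the heaviest process load to the average load (1.0 is perfect).
double imbalance(const std::vector<long long>& load) {
    assert(!load.empty());
    const long long total = std::accumulate(load.begin(), load.end(), 0LL);
    assert(total > 0);
    const long long max_load = *std::max_element(load.begin(), load.end());
    return static_cast<double>(max_load) * static_cast<double>(load.size()) /
           static_cast<double>(total);
}
```

For example, eight vertices with degrees 100, 1, 1, 1, 1, 1, 1, 1 on four
processes give loads of 101, 2, 2 and 2, so one process holds almost all of
the edges even though every process owns the same number of vertices.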

As an example, let's say there is a large (real-world) graph, and its structure
is such that only a few processes end up owning a majority of edges, as per
miniVite graph data distribution. Also, let's assume that the graph has either a
very poor community structure (modularity closer to 0) or a very stable community
structure (modularity close to 1 after a few iterations, meaning that not many
vertices are migrating to neighboring communities). In both these cases,
community detection in miniVite will run for relatively few iterations, which
may affect the overall scalability.
29 changes: 29 additions & 0 deletions miniVite/LICENSE
@@ -0,0 +1,29 @@
BSD 3-Clause License

Copyright (c) 2018, Battelle Memorial Institute
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

* Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

* Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
33 changes: 33 additions & 0 deletions miniVite/Makefile
@@ -0,0 +1,33 @@
CXX = mpicxx
# use -xmic-avx512 instead of -xHost for Intel Xeon Phi platforms
PLUGIN_FLAG = -Xclang -load -Xclang ~/git/unifiedmem/code/llvm-pass/build/uvm/libOMPPass.so
#OPTFLAGS = -O3 -xHost -qopenmp -DCHECK_NUM_EDGES #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD
OPTFLAGS = -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -DOMP_GPU -DOMP_GPU_ALLOC -DCHECK_NUM_EDGES #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD
#OPTFLAGS = -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -DOMP_GPU -DCHECK_NUM_EDGES #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD
#OPTFLAGS = -O3 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -DOMP_GPU -DCHECK_NUM_EDGES -DDEBUG_PRINTF #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD
#OPTFLAGS = -O3 -fopenmp -DOMP_GPU -DCHECK_NUM_EDGES -DDEBUG_PRINTF #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD
#OPTFLAGS = -O3 -fopenmp -DCHECK_NUM_EDGES -DDEBUG_PRINTF #-DPRINT_EXTRA_NEDGES #-DPRINT_DIST_STATS #-DUSE_MPI_RMA -DUSE_MPI_ACCUMULATE #-DUSE_32_BIT_GRAPH #-DDEBUG_PRINTF #-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_LOHI_RANDOM_NUMBERS#-DUSE_MPI_RMA #-DPRINT_LCG_DOUBLE_RANDOM_NUMBERS #-DPRINT_RANDOM_XY_COORD
#-DUSE_MPI_SENDRECV
#-DUSE_MPI_COLLECTIVES
# use export ASAN_OPTIONS=verbosity=1 to check ASAN output
SNTFLAGS = -std=c++11 -fopenmp -fsanitize=address -O1 -fno-omit-frame-pointer
CXXFLAGS = -std=c++11 -g $(OPTFLAGS)

OBJ = main.o
TARGET = miniVite

all: $(TARGET)

%.o: %.cpp
	$(CXX) $(CXXFLAGS) $(PLUGIN_FLAG) -c -o $@ $^

%.ll: %.cpp
	$(CXX) $(CXXFLAGS) $(PLUGIN_FLAG) -emit-llvm -S -c -o $@ $^

$(TARGET): $(OBJ)
	$(CXX) $^ $(OPTFLAGS) -o $@

.PHONY: clean

clean:
	rm -rf *~ $(OBJ) $(TARGET) *.ll