Reproducibility: Performance Evaluation of MemXCT on Azure CycleCloud Platform

Specification

Compile

  • To compile the code, first use Spack to load the Intel Parallel Studio 2019.5 compiler, then run make in the released code folder.

Figure

  • preprocessing script

    • The script process_logs.py can be run with python process_logs.py log_folder_name output_file_name. Here log_folder_name is the folder where the experiment logs are saved, and output_file_name is a JSON file that will contain the parameters, times and results of every experiment.
  • visualization script

    • Each visualization script specifies a data source (the JSON file generated by preprocessing) and a result directory where the figure is saved.
    • The script fig9_single_cpu_gpu_barplot.py reproduces figure 9 from the MemXCT paper and can be run with the command python fig9_single_cpu_gpu_barplot.py
    • The script fig10_tune_param_heatmap.py reproduces figure 10 from the MemXCT paper and can be run with the command python fig10_tune_param_heatmap.py
    • The scripts fig11_strong_scaling_lineplot.py and fig11_strong_scaling_lineplot_multinode.py generate the strong-scaling figures for a single node (varying the number of processors) and for multiple nodes (varying the number of nodes). They reproduce figure 11 from the MemXCT paper and can be run with a command such as python fig11_strong_scaling_lineplot.py
    • The scripts above use the Python packages matplotlib and seaborn to generate the figures.
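
The preprocessing step described above (a folder of logs in, one JSON file out) can be sketched as follows. This is a minimal illustration only, not the actual process_logs.py: the log format (one `key: value` pair per line), the `*.log` file extension, and the helper names parse_log and collect_logs are all assumptions made for the example.

```python
import json
import sys
from pathlib import Path


def parse_log(path):
    """Collect `key: value` lines from one log file (assumed, hypothetical format)."""
    record = {"log_file": path.name}
    for line in path.read_text().splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that are not key/value pairs
        value = value.strip()
        try:
            record[key.strip()] = float(value)  # numeric fields such as times
        except ValueError:
            record[key.strip()] = value  # keep everything else as text
    return record


def collect_logs(log_folder):
    """Parse every *.log file in the folder, sorted by name for stable output."""
    return [parse_log(p) for p in sorted(Path(log_folder).glob("*.log"))]


# Mirrors the documented call: python process_logs.py log_folder_name output_file_name
if __name__ == "__main__" and len(sys.argv) == 3:
    log_folder_name, output_file_name = sys.argv[1], sys.argv[2]
    with open(output_file_name, "w") as f:
        json.dump(collect_logs(log_folder_name), f, indent=2)
```

A list of flat records like this is convenient for the plotting scripts, since it can be loaded with json.load and filtered by parameter before being handed to matplotlib or seaborn.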

Run

  • The shell script run.sh runs the MemXCT algorithm on a specified dataset and can be executed with the command ./run.sh [arguments...] DATASET [-- MPI_ARGS...] [-- PROGRAM_ARGS...]. More details can be found by executing ./run.sh -h.
  • The shell script calc_dataset_size.sh reports the size of a dataset: the command ./calc_dataset_size.sh DATASET outputs its size in M $\times$ N format.
  • The shell scripts for tuning the parameters on CPU and GPU, for comparing performance on a single CPU and GPU, and for testing weak and strong scaling are kept separately in the run/script folder.
  • To run one of these scripts, e.g. run_tune_params.sh, execute ./run_tune_params.sh. The result for each parameter setting is saved in a log file in the run/output folder.
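
When driving run.sh from Python (for example to sweep several datasets), the documented synopsis can be assembled as an argument list. The sketch below is illustrative: the helper name build_run_command is hypothetical, and it assumes the two `--` separators are positional, so that passing only PROGRAM_ARGS still requires an empty MPI_ARGS section. Check ./run.sh -h for the authoritative behaviour.

```python
import shlex


def build_run_command(dataset, script_args=(), mpi_args=(), program_args=()):
    """Build argv for: ./run.sh [arguments...] DATASET [-- MPI_ARGS...] [-- PROGRAM_ARGS...]"""
    cmd = ["./run.sh", *script_args, dataset]
    if mpi_args or program_args:
        # Assumption: the first "--" always opens the MPI_ARGS section,
        # even when that section is empty.
        cmd += ["--", *mpi_args]
    if program_args:
        # Assumption: the second "--" opens the PROGRAM_ARGS section.
        cmd += ["--", *program_args]
    return cmd


# MPI_ARG and PROGRAM_ARG are placeholders, not real options.
print(shlex.join(build_run_command("DATASET", mpi_args=["MPI_ARG"],
                                   program_args=["PROGRAM_ARG"])))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues when dataset names or arguments contain spaces.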

Publication

This reproducibility experiment was performed during the Student Cluster Competition 2020 (SC-20) by the GeekPie_HPC team from ShanghaiTech University. We thank our shepherds and the reviewers for their constructive feedback, and ShanghaiTech University for its support. We also thank our co-advisor, Mr. Yingdong Zhang from the Library and Information Technology Center, ShanghaiTech University.