Commit

Clean up LaTeX files

briancpark committed Dec 20, 2023
1 parent 3f3179f commit 80f273a
Showing 13 changed files with 386 additions and 275 deletions.
54 changes: 54 additions & 0 deletions .github/workflows/latex.yml
@@ -0,0 +1,54 @@
name: LaTeX Build and Lint

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
  schedule:
    # Schedule to run at 00:00 UTC on the 1st of every month
    - cron: '0 0 1 * *'

jobs:
  build-and-lint:
    runs-on: ubuntu-latest

    steps:
      - name: Set up Git repository
        uses: actions/checkout@v3

      - name: Install LaTeX
        run: |
          sudo apt-get update
          sudo apt-get install -y texlive-latex-base texlive-fonts-recommended texlive-latex-extra texlive-fonts-extra texlive-latex-recommended texlive-science

      - name: Install cpanminus and Perl dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y cpanminus
          sudo cpanm Log::Log4perl Log::Dispatch::File YAML::Tiny File::HomeDir Unicode::GCString

      - name: Install latexindent
        run: |
          curl -L https://github.com/cmhughes/latexindent.pl/archive/master.zip -o latexindent.zip
          unzip latexindent.zip -d latexindent
          sudo cp -r latexindent/latexindent.pl-main/* /usr/local/bin/
          sudo chmod +x /usr/local/bin/latexindent.pl
          sudo mv /usr/local/bin/latexindent.pl /usr/local/bin/latexindent

      - name: Verify latexindent installation
        run: |
          latexindent --version

      - name: Check LaTeX formatting
        working-directory: latex
        run: make check

      - name: Compile LaTeX documents
        working-directory: latex
        run: make all

      - name: Upload PDFs
        uses: actions/upload-artifact@v3
        with:
          name: Compiled-PDFs
          path: latex/*.pdf
11 changes: 9 additions & 2 deletions .github/workflows/lint.yml
@@ -1,10 +1,17 @@
 name: Lint
 
-on: [push, pull_request]
+on:
+  push:
+    branches: [ main ]
+  pull_request:
+    branches: [ main ]
+  schedule:
+    # Schedule to run at 00:00 UTC on the 1st of every month
+    - cron: '0 0 1 * *'
 
 jobs:
   lint:
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v3
       - uses: psf/black@stable
50 changes: 50 additions & 0 deletions latex/Makefile
@@ -0,0 +1,50 @@
# Makefile for LaTeX project

# List of source LaTeX files
SOURCES := $(wildcard *.tex)
PDFS := $(SOURCES:.tex=.pdf)

# Default target
all: $(PDFS)

# Rule to compile LaTeX files
%.pdf: %.tex
	pdflatex -shell-escape $<
	-bibtex $(<:.tex=) 2>/dev/null
	pdflatex -shell-escape $<
	pdflatex -shell-escape $<

# Clean up temporary files
clean:
	rm -f *.aux *.log *.out *.toc *.bak* *.pdf *.bbl *.blg *.synctex.gz
	rm -rf *_minted-*

# Lint .tex files
lint:
	for file in $(SOURCES); do latexindent -w -s $$file; done

# Check if .tex files are properly formatted
check:
	@errors=0; \
	for file in $(SOURCES); do \
		latexindent $$file > $$file.formatted; \
		if ! diff -q $$file $$file.formatted > /dev/null; then \
			echo "Formatting issue detected in $$file"; \
			errors=$$((errors + 1)); \
		fi; \
		rm -f $$file.formatted; \
	done; \
	if [ $$errors -ne 0 ]; then \
		echo "Formatting issues found in $$errors files."; \
		exit 1; \
	else \
		echo "All files are properly formatted."; \
	fi

# Help
help:
	@echo "Available commands:"
	@echo "  make        Compile all LaTeX files to PDF"
	@echo "  make clean  Remove temporary files"
	@echo "  make lint   Format .tex files using latexindent"
	@echo "  make check  Check if .tex files are properly formatted"
Binary file removed latex/final.pdf
202 changes: 101 additions & 101 deletions latex/final.tex

Large diffs are not rendered by default.

Binary file removed latex/proj1.pdf
56 changes: 28 additions & 28 deletions latex/proj1.tex
@@ -36,49 +36,49 @@ \subsection{Pruning Configuration}

For the second model, we chose ResNet-101, a much deeper DNN, to see the effect that residual connections have on deep neural networks. We used the CIFAR-10 dataset, as it is more challenging than MNIST. To select a pruner, we brute-forced the search for the best pruning method. Training multiple pruning methods on the same dataset is an embarrassingly parallel problem, so we used Ray to exploit this parallelism on the A100 GPU \cite{ray}. Ray is a distributed framework for scaling AI and Python applications. Since there is no communication between the training processes, any distributed programming framework would have worked, but we chose Ray because it has GPU support and can run multiple Python processes on a single GPU. Once we train in parallel, we fully utilize both GPU memory and compute, as shown in the output of \verb|nvidia-smi| in Figure \ref{fig:nvidiasmi}. We retrained for 20 more epochs after the pruning step, since this is a much deeper network. We experimented with \verb|LevelPruner|, \verb|L1NormPruner|, and \verb|L2NormPruner|, and found that pruning only \verb|Conv2d| layers while excluding the last layer gave the best results.
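
Because each run is fully independent, the Ray dispatch itself is short. The following is a minimal sketch rather than our exact script; \verb|train_one_config| stands in for a hypothetical helper that prunes to a given sparsity, retrains for 20 epochs, and returns the accuracy:

\begin{minted}{python}
import ray

ray.init()  # connect to the local Ray runtime

@ray.remote(num_gpus=0.25)  # pack four workers onto one A100
def prune_and_retrain(sparsity):
    # hypothetical helper: prune, retrain, report accuracy
    return train_one_config(sparsity)

# launch every sparsity level at once; Ray schedules them onto the GPU
futures = [prune_and_retrain.remote(s / 10) for s in range(1, 10)]
results = ray.get(futures)  # blocks until every run finishes
\end{minted}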

For both networks, we pruned at sparsity ratios from 10\% to 90\% in intervals of 10\%. We experimented with pruners such as \verb|LevelPruner|, \verb|L1NormPruner|, and \verb|L2NormPruner| \cite{levelpruner, l1prune}.
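
As a concrete illustration, one point in that sweep might be configured with NNI's pruning API as follows; this is a sketch assuming the NNI 2.x module layout and torchvision's \verb|fc| name for the final layer:

\begin{minted}{python}
from torchvision.models import resnet101
from nni.compression.pytorch.pruning import L1NormPruner

model = resnet101(num_classes=10)  # CIFAR-10 head

config_list = [
    {"sparsity": 0.5, "op_types": ["Conv2d"]},  # one point in the sweep
    {"exclude": True, "op_names": ["fc"]},      # leave the last layer dense
]

pruner = L1NormPruner(model, config_list)
_, masks = pruner.compress()  # masks record which weights were zeroed
\end{minted}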


\begin{figure}
\centerline{\includegraphics[width=6in]{../proj1/figures/nvidia-smi_pruning.png}}
\caption{Peak GPU utilization}
\label{fig:nvidiasmi}
\end{figure}

\section{Experimental Results}
%(3) What results you have obtained, including the output of "print(model)" of your original model and pruned model, their running speeds and accuracies;

\begin{figure}
\centerline{
\includegraphics[width=2in]{../proj1/figures/mnist_cnn.png}
\includegraphics[width=2in]{../proj1/figures/resnet18.png}
}
\caption{DNN Computation Graphs}
\label{fig:dnngraph}
\end{figure}

The outputs of \verb|print(model)| are shown for the \hyperref[sec:A1]{Simple CNN} and \hyperref[sec:A2]{ResNet-18} in Appendix A. For visuals, we also rendered the computation graphs of the DNNs in Figure \ref{fig:dnngraph}. Note that although we used ResNet-101 for the report, the print output and computation graph show ResNet-18 for readability. For the Simple CNN, we show the outputs both before and after pruning.

After training each model, we benchmarked both models' inference time as a baseline. Then we used NNI to prune as aggressively as we could, to showcase the trade-off between performance and accuracy degradation. Unfortunately, we could not run the pruning process on the M1 MacBook Pro, as the \href{https://github.com/pytorch/pytorch/issues/78915}{top-$k$ function is not supported by the MPS backend for Apple Silicon}; we could not run NNI without changing its source code to work around this issue.
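
In practice, we guard the device selection so that the same script runs on both machines; a sketch is below. Newer PyTorch builds also honor the \verb|PYTORCH_ENABLE_MPS_FALLBACK=1| environment variable, which falls back to the CPU for individual unsupported operators.

\begin{minted}{python}
import torch
from torchvision.models import resnet18

# prefer CUDA, then MPS on Apple Silicon, then CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

model = resnet18().to(device)
\end{minted}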

\begin{figure}
\centerline{
\includegraphics[width=4in]{../proj1/figures/mnist_cnn_benchmark.png}
\includegraphics[width=4in]{../proj1/figures/resnet101_benchmark.png}
}
\caption{MNIST CNN and CIFAR-10 ResNet-101 Benchmark on NVIDIA A100}
\label{fig:a100benchmark}
\end{figure}

The results for the NVIDIA A100 GPU are shown in Figure \ref{fig:a100benchmark}. We report the inference time and the training and validation accuracy. We ran each inference model for 25 trials and also plot the variance, since a higher variance across trials may indicate effects such as GPU cache misses.
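
The timing loop is worth spelling out, because CUDA kernels launch asynchronously: without explicit synchronization we would measure launch overhead rather than execution time. A minimal sketch of such a loop for the A100 runs:

\begin{minted}{python}
import time
import statistics
import torch

def benchmark(model, inputs, trials=25):
    model.eval()
    times = []
    with torch.no_grad():
        for _ in range(trials):
            torch.cuda.synchronize()  # drain any pending kernels
            start = time.perf_counter()
            model(inputs)
            torch.cuda.synchronize()  # wait for this inference to finish
            times.append(time.perf_counter() - start)
    return statistics.mean(times), statistics.variance(times)
\end{minted}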

\begin{figure}
\centerline{
\includegraphics[width=4in]{../proj1/figures/mnist_cnn_benchmarkm1.png}
\includegraphics[width=4in]{../proj1/figures/resnet101_benchmarkm1.png}
}
\caption{MNIST CNN and CIFAR-10 ResNet-101 Benchmark on M1 MacBook Pro}
\label{fig:m1benchmark}
\end{figure}

For the M1 Mac, we transferred the ResNet-101 weights from the A100 node we trained on, ran just the inference models, and obtained the results reported in Figure \ref{fig:m1benchmark}. One caveat is that we do not see any synchronization primitives for the MPS kernels, and we are not sure whether MPS GPU kernels are synchronous or asynchronous like CUDA. The A100 inference time is also much flatter. We suspect this is due to the overhead of transferring the dataset from main memory to GPU HBM, which makes such a small model memory-bound. We do not see this effect on the M1, since its CPU and GPU memory is \textit{unified}, or shared.
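
Moving the checkpoint across machines only requires remapping the storage device at load time; a sketch, with an illustrative checkpoint filename:

\begin{minted}{python}
import torch
from torchvision.models import resnet101

device = torch.device("mps")
model = resnet101(num_classes=10)  # CIFAR-10 head

# remap the CUDA-trained weights onto the M1's MPS device
state = torch.load("resnet101_a100.pt", map_location=device)
model.load_state_dict(state)
model.to(device).eval()
\end{minted}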
@@ -87,12 +87,12 @@ \section{Experimental Results}
For accuracy, the simple CNN degrades quickly as the network becomes more sparse, which is what we expected. Interestingly, the accuracy of ResNet-101 actually increases significantly: training accuracy rises from 84\% to 96\% and validation accuracy from 79\% to 86\%. Both accuracies then start to degrade as the model becomes more sparse. Our only suspicion is that retraining for 20 more epochs helped the model become more accurate. We also report our benchmarks for \verb|L2NormPruner| and \verb|LevelPruner| in Figure \ref{fig:otherpruner}. These results are not comprehensive, as we did not tune the configurations and parameters as we did for \verb|L1NormPruner|. We wish to do a more thorough analysis of why this happens, time permitting.

\begin{figure}
\centerline{
\includegraphics[width=4in]{../proj1/figures/resnet101_benchmark_l2.png}
\includegraphics[width=4in]{../proj1/figures/resnet101_benchmark_level.png}
}
\caption{ResNet-101 Benchmark for L2NormPruner (left) and LevelPruner (right) on NVIDIA A100}
\label{fig:otherpruner}
\end{figure}


Binary file removed latex/proj2.pdf
