Discussion of the advantages and disadvantages of AdaHessian, a state-of-the-art Second Order Method, over First Order Methods on a Non-Convex Optimization Problem (digit classification on the MNIST database using ResNet18). - @ EPFL

Discussion on the viability of a modern Second Order Method for Non-Convex Optimization: training a Deep Convolutional Neural Network

This repository contains the code and report for the final project of the Optimization for Machine Learning (CS-439) course at EPFL, Spring term 2021.

Team

This project was carried out by:

Abstract

Second order algorithms are among the most powerful optimization algorithms, with superior convergence properties compared to first order methods such as SGD and Adam. However, computing or approximating the curvature matrix can be very expensive, both in per-iteration computation time and in memory cost. In this study we analyze the practicality of using a state-of-the-art Second Order Method (AdaHessian) in Non-Convex Optimization, training a Deep Convolutional Neural Network (ResNet18) on the MNIST database and comparing it with traditional First Order Methods. Indeed, almost all theoretical results for these methods do not extend to Non-Convex optimization, so we have to limit ourselves to experimental comparisons. Advantages and disadvantages of both approaches are discussed, and a final hybrid method combining the advantages of both is proposed.

For further information about this project, read report.pdf.
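The second-order information AdaHessian relies on is a diagonal approximation of the Hessian obtained with Hutchinson's estimator, which only needs Hessian-vector products. A minimal sketch of that estimator in plain Python (the function name and the toy matrix below are illustrative, not code from this repository):

```python
import random

def hutchinson_diag(matvec, dim, n_samples=2000, seed=0):
    """Estimate the diagonal of a matrix H given only products Hz,
    using Hutchinson's method: E[z * (Hz)] = diag(H) when z has
    independent Rademacher entries (+1 or -1)."""
    rng = random.Random(seed)
    est = [0.0] * dim
    for _ in range(n_samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = matvec(z)  # one Hessian-vector product per sample
        for i in range(dim):
            est[i] += z[i] * hz[i]
    return [e / n_samples for e in est]

# Toy symmetric "Hessian" H = [[2, 1], [1, 3]]; its diagonal is [2, 3].
H = [[2.0, 1.0], [1.0, 3.0]]
mv = lambda v: [sum(H[i][j] * v[j] for j in range(2)) for i in range(2)]
diag = hutchinson_diag(mv, 2)  # approaches [2.0, 3.0] as samples grow
```

In AdaHessian this diagonal estimate replaces the squared-gradient term of Adam's second-moment accumulator, so the per-iteration overhead is one extra backward pass rather than a full Hessian.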

Environment

The project has been developed and tested with Python 3.8.3.

Required libraries:

  • torch (version 1.9.0)
  • matplotlib (version 3.3.4)
  • time (built into the standard library)
  • sklearn (version 0.24.1)
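As a quick check that the torch installation supports the double-backward pass AdaHessian needs, the snippet below computes a one-sample Hutchinson estimate of a Hessian diagonal with `torch.autograd` (the tensors and loss are illustrative, not code from this repository):

```python
import torch

# Toy quadratic loss: 3*w0^2 + 0.5*w1^2, whose Hessian diagonal is [6, 1].
w = torch.tensor([1.0, -2.0], requires_grad=True)
loss = 3.0 * w[0] ** 2 + 0.5 * w[1] ** 2

# create_graph=True keeps the graph so we can differentiate the gradient
# again; AdaHessian requires the same flag on its backward pass.
(grad,) = torch.autograd.grad(loss, w, create_graph=True)

z = torch.tensor([1.0, -1.0])                      # Rademacher probe vector
(hz,) = torch.autograd.grad(grad, w, grad_outputs=z)  # Hessian-vector product Hz
diag_est = z * hz  # one-sample Hutchinson estimate of diag(H)
```

Because this toy Hessian is diagonal, a single probe already recovers [6, 1] exactly; for a real network the estimate is averaged across iterations.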
