This is a university project based on the Google Landmark Recognition 2020 Kaggle competition.
The goal of this project is to correctly classify images of known landmarks from around the world, given a large and challenging train set to learn from and a test set that contains mainly out-of-domain images.
To deal with the special and challenging characteristics of the data set, we proposed and implemented two possible solutions using machine learning techniques.
The first solution, a baseline, is a simple and straightforward approach: training a CNN (EfficientNet with the RAdam optimizer) and using it as a classifier. This solution fails to overcome the challenging aspects of the data set and yields poor results.
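For illustration, here is a minimal sketch of such a baseline, assuming the `efficientnet_pytorch` and `torch_optimizer` packages listed in the requirements below; the number of classes, learning rate and data loader are placeholders, not the exact settings used in the baseline directory:

```python
import torch
import torch.nn as nn
from efficientnet_pytorch import EfficientNet
import torch_optimizer as optim

NUM_CLASSES = 81313  # placeholder: set to the number of landmark classes in the train set
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Pre-trained EfficientNet backbone with a freshly initialized classification head
model = EfficientNet.from_pretrained('efficientnet-b0', num_classes=NUM_CLASSES).to(device)

# RAdam optimizer from the torch_optimizer package
optimizer = optim.RAdam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_one_epoch(model, loader):
    """One training epoch over `loader`, which yields (image batch, label batch)."""
    model.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)            # (batch, NUM_CLASSES)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
```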
The second solution is a retrieval-based solution that draws inspiration from other teams' solutions to this competition.
This solution consists of two steps. The first is to clean the test set of out-of-domain images using object detection (we used the YOLO darknet implementation; a minimal filtering sketch appears after this overview). Object detection examples:
The second is classification with a nearest-neighbor algorithm operating on the images' feature vectors, as in the sketch below.
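A minimal sketch of this step, assuming the feature vectors are global-average-pooled EfficientNet features and using scikit-learn's NearestNeighbors; the embedding model and the 5-neighbor setting are illustrative choices, not necessarily the exact ones in the feature_extraction directory:

```python
import torch
import torch.nn.functional as F
from efficientnet_pytorch import EfficientNet
from sklearn.neighbors import NearestNeighbors

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = EfficientNet.from_pretrained('efficientnet-b0').to(device).eval()

@torch.no_grad()
def embed(images):
    """Return one L2-normalized feature vector per image in the batch."""
    feats = model.extract_features(images.to(device))   # (B, C, H, W) feature maps
    feats = F.adaptive_avg_pool2d(feats, 1).flatten(1)   # global average pool -> (B, C)
    return F.normalize(feats, dim=1).cpu().numpy()

def build_index(train_feats):
    """Fit a 5-NN index over the (N, C) train-set embeddings (built with `embed`)."""
    return NearestNeighbors(n_neighbors=5, metric='euclidean').fit(train_feats)

def classify(knn, train_labels, test_images):
    """Label each test image with the landmark id of its nearest train neighbor."""
    dist, idx = knn.kneighbors(embed(test_images))
    return train_labels[idx[:, 0]], dist[:, 0]           # predicted id + distance as a confidence proxy
```

Since the embeddings are L2-normalized, Euclidean distance here behaves like cosine similarity between feature vectors.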
The power of using feature vectors and K-NN (the test set image is on the left, followed by its 5 nearest neighbors from the train set; a small red X in the lower right corner of an image means that its class differs from the test set image's class):
Even when the classification is not successful, the nearest neighbors still bear some resemblance to the test set image:
This solution is built to address the challenging characteristics of the data set, and although the results it yields are far from great, they are much better than the baseline's results.
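To illustrate the first, filtering step, here is a minimal sketch that loads a darknet model through OpenCV's dnn module rather than the original darknet binary; the cfg/weights paths, the input size and the out-of-domain rule below are assumptions, and the actual pre- and post-processing lives in the landmark_classifier directory:

```python
import cv2
import numpy as np

# Illustrative paths; any darknet cfg/weights pair (e.g. YOLOv3 or YOLOv4) can be loaded this way.
net = cv2.dnn.readNetFromDarknet('yolov4.cfg', 'yolov4.weights')
out_layers = net.getUnconnectedOutLayersNames()

def detect(image, conf_thresh=0.5):
    """Run the darknet model on one BGR image and return (class_id, confidence) pairs."""
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    detections = []
    for output in net.forward(out_layers):
        for row in output:                 # row = [cx, cy, w, h, objectness, class scores...]
            scores = row[5:]
            class_id = int(np.argmax(scores))
            if scores[class_id] > conf_thresh:
                detections.append((class_id, float(scores[class_id])))
    return detections

def is_out_of_domain(image, non_landmark_ids):
    """Hypothetical filtering rule: flag the image when only non-landmark objects are detected."""
    dets = detect(image)
    return len(dets) > 0 and all(cid in non_landmark_ids for cid, _ in dets)
```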
To run all of the code in this project, one needs the following libraries (in the specified versions or higher):
Library | Version |
---|---|
Python | 3.6 |
torch | 1.8.0 |
torchvision | 0.9.0 |
pandas | 1.25.0 |
numpy | 1.19.0 |
opencv | 4.2.0 |
matplotlib | 3.2.1 |
seaborn | 0.11.0 |
efficientnet_pytorch | 0.7.0 |
torch_optimizer | 0.1.0 |
sklearn | 0.21.3 |
Pillow | 6.1.0 |
tqdm | 4.55.0 |
In this project we also used the YOLO darknet implementation as an object detector. We used YOLOv3 and YOLOv4 networks that were pre-trained on the Open Images dataset and the COCO dataset, respectively.
Much of the code in this project is part of Jupyter notebooks. Unfortunately, GitHub is not always able to render the notebooks successfully, so one can download them and run them locally or via Colab, or view them using nbviewer with the links in the nbviewer directory.
The code we wrote for this project is organized into sub-directories, one for each part of the project. Each sub-directory contains the relevant code files (.py or .ipynb) and may also contain csv files or images.
Sub-Directory | Content |
---|---|
\baseline | implementation of the baseline, its results and evaluation |
\data | analysis of the GLDv2 dataset |
\feature_extraction | implementation of feature extraction and the K-NN classifier |
\images | images used in this repository |
\landmark_classifier | pre-processing of the data as input to the YOLO Darknet implementation, and analysis and evaluation of its results |
\nbviewer | nbviewer links for the Jupyter notebooks in this repository |
\poster | the project poster |
\results_and_evaluation | the classification results and evaluation |
We tried to keep the code organized and well documented.
Matan Kleiner
Yuval Snir
Supervised by Ori Linial
[1] T. Weyand, A. Araujo, B. Cao and J. Sim, "Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval", in Proc. CVPR, 2020.
[2] K. Chen et al., "2nd Place and 2nd Place Solution to Kaggle Landmark Recognition and Retrieval Competition 2019", arXiv:1906.03990 [cs.CV], Jun. 2019.
[3] J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement", arXiv:1804.02767 [cs.CV], Apr. 2018.
[4] A. Krizhevsky, I. Sutskever and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", in Proc. NIPS, pp. 1106-1114, 2012.