clementmariebrisson/Finding_Objects_In_Images

About The Project

The objective is to build an algorithm that detects the company logos present on documents. This lets us identify the companies that issued the invoices and classify each invoice according to the issuer's field of activity.

During this project, we read a lot of documentation and tested the different possible methods of logo detection on invoices.

The goal of this project is to classify images by detecting objects. Files can be:

  • bills (from several companies)
  • payslips
  • account statements

To do so, we used two methods:

  • Template matching
  • SIFT: Scale Invariant Feature Transform
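
As an illustration of the first method, here is a minimal NumPy-only sketch of template matching with a zero-mean normalized cross-correlation score. It is not taken from the notebooks: the function name `match_template_ncc` is hypothetical, and both inputs are assumed to be 2-D grayscale arrays. OpenCV's `cv2.matchTemplate` with `cv2.TM_CCOEFF_NORMED` computes the same kind of score much faster; whether the notebooks use that exact mode is an assumption.

```python
import numpy as np


def match_template_ncc(image, template):
    """Slide `template` over `image` and return (best_score, (row, col)).

    The score is the zero-mean normalized cross-correlation, in [-1, 1].
    Hypothetical helper for illustration; inputs are 2-D grayscale arrays.
    """
    image = np.asarray(image, dtype=np.float64)
    template = np.asarray(template, dtype=np.float64)
    ih, iw = image.shape
    th, tw = template.shape
    if th > ih or tw > iw:
        raise ValueError("template must not be larger than image")

    # Centre the template once; each window is centred inside the loop.
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())

    best_score, best_pos = -1.0, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            w = window - window.mean()
            denom = t_norm * np.sqrt((w ** 2).sum())
            if denom == 0:
                continue  # flat window or flat template: score undefined
            score = float((w * t).sum() / denom)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_score, best_pos
```

A score close to 1 at some position suggests the logo appears there at the same scale and orientation as the template. Template matching is sensitive to changes in scale and rotation, which is one reason to also try a scale-invariant method such as SIFT.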

This is a student project.

Built With

Getting Started

How to install, run and use the project.

Prerequisites

The project consists of Jupyter notebooks. You will need Jupyter Notebook installed on your computer, either on its own or through Anaconda.

Installation

Before running either of the two notebooks, check that all the required Python libraries are installed, and install any that are missing.

  1. Clone the repo
    git clone https://github.com/clementmariebrisson/Finding_Objects_In_Images.git
  2. Open Jupyter Notebook
  3. Install Python packages
     pip install opencv-python
     pip install numpy
     pip install matplotlib
     pip install pdf2image
     pip install Pillow
     pip install regex
     pip install pytesseract
     pip install pandas
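
To see which of these packages are still missing, a small check like the following can be run in a notebook cell. It is a sketch, not part of the repository: the `REQUIRED` mapping and the `missing_packages` helper are hypothetical names, and the mapping simply pairs each pip name above with the module name it is imported under.

```python
import importlib.util

# pip name -> import name. Two of them differ:
# opencv-python is imported as cv2, and Pillow as PIL.
REQUIRED = {
    "opencv-python": "cv2",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "pdf2image": "pdf2image",
    "Pillow": "PIL",
    "regex": "regex",
    "pytesseract": "pytesseract",
    "pandas": "pandas",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose module cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages, install with: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

Note that `pdf2image` and `pytesseract` are wrappers: they also need Poppler and the Tesseract OCR engine, respectively, installed on the system, and pip does not install those for you.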
    

Usage

The scores are written to a CSV file, which shows the ratio obtained by each logo detection test on a file. Of the methods we tested, SIFT with brute-force matching works best: it is the most precise and the best suited to our initial problem.

Every method has its weak points. For SIFT, it is the distance between the keypoints that are matched. We were able to overcome this with SIFT brute-force matching, but if a logo does not have much detail, few keypoints are detected, and this affects the score calculated by our program.

The brute-force method was the best way to solve the initial problem, thanks to all the possibilities offered by the knnMatcher (k-nearest-neighbour matching).
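
To make the idea concrete, here is a minimal NumPy sketch of brute-force 2-nearest-neighbour matching followed by Lowe's ratio test, a common way to filter matches by keypoint distance. It is an illustration under assumptions, not the notebooks' code: the function names, the 0.75 threshold, and the score definition (fraction of logo descriptors with a confident match) are all hypothetical. With OpenCV, the descriptors would typically come from `cv2.SIFT_create().detectAndCompute(image, None)`, and the two nearest neighbours from `cv2.BFMatcher().knnMatch(des_logo, des_doc, k=2)`.

```python
import numpy as np


def ratio_test_matches(desc_logo, desc_doc, ratio=0.75):
    """Brute-force 2-NN matching with Lowe's ratio test.

    Returns a list of (logo_index, doc_index) pairs. A match is kept only
    if its nearest neighbour is clearly closer than the second nearest.
    The 0.75 threshold is an assumed, commonly used value.
    """
    desc_logo = np.asarray(desc_logo, dtype=np.float64)
    desc_doc = np.asarray(desc_doc, dtype=np.float64)
    if len(desc_logo) == 0 or len(desc_doc) < 2:
        return []

    # Pairwise Euclidean distances, shape (n_logo, n_doc).
    diff = desc_logo[:, None, :] - desc_doc[None, :, :]
    dists = np.sqrt((diff ** 2).sum(axis=2))

    # Indices of the two closest document descriptors for each logo descriptor.
    two_nearest = np.argsort(dists, axis=1)[:, :2]

    good = []
    for i, (best, second) in enumerate(two_nearest):
        if dists[i, best] < ratio * dists[i, second]:
            good.append((i, int(best)))
    return good


def match_score(desc_logo, desc_doc, ratio=0.75):
    """Hypothetical score: share of logo descriptors with a confident match."""
    if len(desc_logo) == 0:
        return 0.0
    return len(ratio_test_matches(desc_logo, desc_doc, ratio)) / len(desc_logo)
```

A score defined this way also shows the weakness described above: when a logo yields only a handful of descriptors, one or two spurious or missed matches shift the score considerably.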

For more information, see the report.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Contact

Alexis Guillotin & Clement Marie Brisson

Project Link: https://github.com/clementmariebrisson/Finding_Objects_In_Images

About

Python lab project (USSI4Z) - Finding Objects In Images
