Authors: Kushankur Ghosh, Colin Bellinger, Roberto Corizzo, Bartosz Krawczyk, and Nathalie Japkowicz
Published in IEEE-BigData'2021
(This paper is one of my works)
Structural concept complexity, class overlap, and data scarcity are some of the most important factors influencing the performance of classifiers under class imbalance conditions. When these effects were uncovered in the early 2000s, understandably, the classifiers on which they were demonstrated belonged to the classical rather than Deep Learning categories of approaches. As Deep Learning is gaining ground over classical machine learning and is beginning to be used in critical applied settings, it is important to assess systematically how well Deep Learning systems respond to the kind of challenges their classical counterparts have struggled with over the past two decades. The purpose of this paper is to study the behavior of Deep Learning systems in settings that have previously been deemed challenging for classical machine learning systems, to find out whether the depth of the systems is an asset in such settings. The results on both artificial and real-world image datasets show that these settings remain mostly challenging for Deep Learning systems. Deeper architectures help with structural concept complexity but not with data scarcity and class overlap.
Paper Link: click here
This repository contains the source code and data of our paper:
- ArtificialImg_P1_C1_Train.py: Creates complexity 1 (easy complexity) image data
- ArtificialImg_P1_C2_Train.py: Creates complexity 2 (medium complexity) image data
- ArtificialImg_P1_C3_Train.py: Creates complexity 3 (higher complexity) image data
- ArtificialImg_BackboneTrain.py: Follows the Backbone framework on the image data
- TraditionalBackbone.py: Implements the traditional backbone
- GaussianBackbone.py: Implements the Gaussian backbone
- libraries.py: Contains the library list
The program is written in Python 3.8:
- Using conda:
conda install -c conda-forge jupyterlab
- or using pip:
pip install jupyterlab
The program requires the following Python libraries:
- scikit-learn v1.0.1
- pandas v1.3.4
- scipy v1.7.3
- scikit-image v0.19.3
- PIL v1.1.7
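To check an environment against the versions listed above, something like the following sketch may help. It is not part of the repository: the names `EXPECTED`, `installed_versions`, and `report` are hypothetical, and it assumes that the `PIL` import is provided by the `Pillow` distribution, so no specific version is compared for that entry.

```python
from importlib import metadata

# Versions as listed in this README. "Pillow" is an assumption: it is the
# distribution that normally provides the PIL import on Python 3.8, so no
# expected version is compared for it.
EXPECTED = {
    "scikit-learn": "1.0.1",
    "pandas": "1.3.4",
    "scipy": "1.7.3",
    "scikit-image": "0.19.3",
    "Pillow": None,
}


def installed_versions(packages):
    """Return {distribution name: installed version, or None if missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(expected):
    """Build one status line per package, comparing installed vs. listed."""
    lines = []
    for name, have in installed_versions(expected).items():
        want = expected[name]
        if have is None:
            status = "not installed"
        elif want is None or have == want:
            status = "ok"
        else:
            status = f"differs from listed v{want}"
        lines.append(f"{name}: {have} ({status})")
    return lines


if __name__ == "__main__":
    print("\n".join(report(EXPECTED)))
```

Running the script prints one line per library. A "differs" line is only a hint that results may not reproduce exactly, not necessarily an error.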
- Kushankur Ghosh, kushanku@ualberta.ca
- Kushankur Ghosh, Dept. of Computing Science, University of Alberta, kushanku@ualberta.ca
- Colin Bellinger, Digital Technologies, National Research Council of Canada, colin.bellinger@nrc-cnrc.gc.ca
- Roberto Corizzo, Dept. of Computer Science, American University, rcorizzo@american.edu
- Bartosz Krawczyk, Dept. of Computer Science, Virginia Commonwealth University, bkrawczyk@vcu.edu
- Nathalie Japkowicz, Dept. of Computer Science, American University, japkowic@american.edu