The Python libraries required to run the code in the Jupyter notebook are as follows:
In addition, one will need the Online Retail II UCI dataset available on Kaggle.
The dataset must be downloaded into the same directory in which the Jupyter notebook is saved.
In this project I use the dataset mentioned in the Installation section to answer the following questions:
- How many online customers are there in the dataset, and what are their countries of origin?
- Which countries are most represented in the dataset?
- What revenue was made in each month, and what percentage of the revenue comes from each country?
- Can a machine learning model estimate whether a given customer will buy from the online shop again in the next quarter?
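The revenue and repeat-purchase questions above can be sketched roughly as follows. This is only an illustration, not the code from the notebook: it uses the Python standard library alone, and the sample rows, their field layout, and all function names are hypothetical assumptions rather than the dataset's confirmed schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (customer_id, country, invoice_date, quantity, unit_price).
# This layout is an assumption for illustration, not the real dataset schema.
ROWS = [
    (1, "United Kingdom", date(2011, 1, 5), 10, 2.0),
    (1, "United Kingdom", date(2011, 4, 9), 5, 4.0),
    (2, "France", date(2011, 1, 20), 4, 5.0),
    (3, "Germany", date(2011, 2, 11), 2, 10.0),
]


def monthly_revenue(rows):
    """Sum quantity * unit_price per (year, month)."""
    totals = defaultdict(float)
    for _, _, day, quantity, price in rows:
        totals[(day.year, day.month)] += quantity * price
    return dict(totals)


def country_revenue_share(rows):
    """Return each country's share of total revenue, in percent."""
    totals = defaultdict(float)
    for _, country, _, quantity, price in rows:
        totals[country] += quantity * price
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}  # no revenue at all, so no meaningful shares
    return {country: 100.0 * value / grand_total for country, value in totals.items()}


def repeat_purchase_labels(rows, cutoff, horizon_end):
    """Label every customer seen before `cutoff` with 1 if they buy again
    in the window [cutoff, horizon_end), and 0 otherwise."""
    seen_before = {cid for cid, _, day, _, _ in rows if day < cutoff}
    bought_after = {cid for cid, _, day, _, _ in rows if cutoff <= day < horizon_end}
    return {cid: int(cid in bought_after) for cid in seen_before}


revenue_by_month = monthly_revenue(ROWS)
share_by_country = country_revenue_share(ROWS)
labels = repeat_purchase_labels(ROWS, date(2011, 4, 1), date(2011, 7, 1))
```

The labels produced by `repeat_purchase_labels` are the kind of binary target a classifier could be trained on, with the quarter boundaries chosen here purely as an example.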
This project contains a single Jupyter notebook, together with an HTML version of it, in which all the questions listed in the Project Motivation section are answered.
The main results of my analysis in the Jupyter notebook have been communicated here.
Credit goes to Mashlyn for the Online Retail II UCI dataset, which is licensed under CC0: Public Domain. This repository is distributed under the GNU GPLv3 license.