Portfolio of data science projects completed by me for academic, self-learning, and hobby purposes.

abdo-projects/data-science-portfolio

Data Science Portfolio

This repository contains a portfolio of data science projects I completed for academic, self-learning, and hobby purposes. They are presented as IPython notebooks, Python scripts, and R Markdown files.

For a more visually pleasing way to browse the portfolio, check out abdo.tech.

The R portfolio is coming soon!

Note: The data used in the projects (available under the Dataset directory) is for demonstration purposes only. The datasets are free public datasets available on Kaggle.

Contents

  • Machine Learning (Supervised learning)

    • k-nearest neighbors (KNN)

      • Predicting Adult Income: A model that predicts individuals' income using the Adult data set from the UCI Machine Learning Repository. The goal is to build a model that accurately predicts whether an individual makes more than $50,000.
      • Predicting Cars Classes: Experiment with the KNN algorithm to predict the class label of the selected data. The default KNN configuration is tried with at least two different values of k, followed by custom KNN configurations evaluated with at least 5-fold cross-validation.
      • Credit Card Approval: Use the KNN algorithm to help banks decide whether to approve or reject each customer's credit card application. The most critical metric derived from the confusion matrix is specificity, also known as the true negative rate (TNR).
    • Decision Tree

      • Predicting Cars Classes: Experiment with the Decision Tree algorithm to predict the class label of the selected data.
    • Naive Bayes

      • Predicting Cars Classes: Experiment with the Naive Bayes algorithm to predict the class label of the selected data.
    • Support Vector Machine (SVM)

      • Predicting Cars Classes: Experiment with the SVM algorithm to predict the class label of the selected data.
      • Credit Card Approval: Use the SVM algorithm to help banks decide whether to approve or reject each customer's credit card application. The most critical metric derived from the confusion matrix is specificity, also known as the true negative rate (TNR).
    • Linear Regression

Tools: scikit-learn, Pandas, Seaborn, Matplotlib, Numpy.
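
Since the Credit Card Approval projects single out specificity, here is a minimal sketch of how that metric follows from predicted and actual labels. It uses only the Python standard library rather than scikit-learn, and the label encoding (0 = reject, 1 = approve) and the sample values are assumptions made for illustration, not data from the notebooks.

```python
def specificity(y_true, y_pred, negative_label=0):
    """Return TN / (TN + FP): the share of actual negatives predicted as negative."""
    tn = fp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual != negative_label:
            continue  # only actual negatives enter the specificity formula
        if predicted == negative_label:
            tn += 1  # true negative: correctly rejected
        else:
            fp += 1  # false positive: approved but should have been rejected
    if tn + fp == 0:
        raise ValueError("y_true contains no actual negatives")
    return tn / (tn + fp)


# Hypothetical labels (assumption): 0 = application rejected, 1 = approved.
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 0, 0, 1, 0, 1, 0]

tnr = specificity(y_true, y_pred)
print(f"specificity (TNR) = {tnr:.2f}")
```

A high specificity means few applicants who should be rejected are approved by mistake, which is presumably why it is treated as the key metric in these projects.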

  • Machine Learning (Unsupervised learning)

    • Hierarchical clustering

      • Cluster Automotives: Cluster automobiles using agglomerative clustering based on MPG (miles per gallon) and engine displacement.

Tools: scikit-learn, Pandas, Seaborn, Matplotlib, Numpy.
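
To illustrate the idea behind agglomerative clustering, the sketch below starts with one cluster per car and repeatedly merges the two closest clusters. It is a simplified standard-library version, not the scikit-learn code from the notebook: it assumes single linkage and min-max scaling, and the (MPG, displacement) values are hypothetical.

```python
import math


def min_max_scale(rows):
    """Rescale every column to [0, 1] so neither feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]  # assumes each column varies
    return [
        tuple((value - low) / span for value, low, span in zip(row, lows, spans))
        for row in rows
    ]


def agglomerative(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]  # start with one cluster per point

    def linkage(a, b):
        # Single linkage (an assumption): distance between the closest members.
        return min(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Pick the pair of clusters with the smallest linkage distance.
        _, left, right = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[left] = clusters[left] + clusters[right]  # merge right into left
        del clusters[right]  # safe because left < right
    return [sorted(cluster) for cluster in clusters]


# Hypothetical (MPG, displacement) pairs, not taken from the project's dataset.
cars = [(33.0, 91), (31.0, 98), (18.0, 250), (17.0, 260), (13.0, 400), (14.0, 350)]
groups = agglomerative(min_max_scale(cars), n_clusters=3)
print(groups)
```

Scaling matters here because displacement spans hundreds of units while MPG spans tens; without it, the distance would be driven almost entirely by displacement.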

  • Predictive Business Analytics

    • Market Basket Analysis

      • Association Rules for Covid Symptoms: Mine association rules with the Apriori algorithm to show how likely different COVID-19 symptoms are to occur together.
      • Market Basket Analysis for Online Retail: Determine which products are most often bought together and identify how customers' figure size affects purchasing patterns, giving better insight into inventory planning and stock management.

Tools: Mlxtend, Pandas, Seaborn, Matplotlib, Numpy.
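
Association rules are usually judged by support, confidence, and lift. The sketch below computes these three metrics for a single rule with the standard library only; it does not reproduce the Mlxtend workflow from the notebooks, and the symptom transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    confidence = both / support(transactions, antecedent)  # P(consequent | antecedent)
    lift = confidence / support(transactions, consequent)  # > 1 means positive association
    return {"support": both, "confidence": confidence, "lift": lift}


# Hypothetical symptom records (assumption), one set per patient.
transactions = [
    frozenset(t)
    for t in [
        {"fever", "cough"},
        {"fever", "cough", "fatigue"},
        {"fever", "fatigue"},
        {"cough"},
        {"fever", "cough"},
    ]
]

metrics = rule_metrics(transactions, {"fever"}, {"cough"})
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Apriori itself adds an efficient search for itemsets whose support exceeds a threshold; the metrics above are then evaluated on the rules built from those itemsets.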

  • Time Series Analysis

    • Dresses Sales Forecast for ModCloth Using ARIMA: Forecast dress sales for the online retailer ModCloth. The first model's parameters are determined by testing the stationarity of the time series; the p and q values are determined by inspecting the ACF (autocorrelation function) and PACF (partial autocorrelation function) plots. The second model's (p, d, q) parameters are determined using Auto ARIMA.
    • Dresses Sales Forecast for ModCloth Using Xgboost: Forecast dress sales for ModCloth using XGBoost. Two models with different learning rates are implemented, and MAE, MSE, and RMSE are calculated for each. Finally, 20 periods (20 months) are forecast with XGBoost.
    • Aircrafts Crashes Forecast Using Xgboost: Use XGBoost to analyze historical airplane crash and fatality data and forecast future casualties for 20 periods (20 months).

Tools: Statsmodels, xgboost, Scikit-learn, Pandas, Seaborn, Matplotlib, Numpy.
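
The XGBoost forecasts above are compared using MAE, MSE, and RMSE. As a minimal standard-library sketch of how these three error metrics relate, the function below computes them for one forecast; the sales figures are hypothetical and the helper name `forecast_errors` is introduced here only for illustration.

```python
import math


def forecast_errors(actual, forecast):
    """Return MAE, MSE, and RMSE for two equally long sequences."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equally long")
    errors = [a - f for a, f in zip(actual, forecast)]
    mae = sum(abs(e) for e in errors) / len(errors)  # mean absolute error
    mse = sum(e * e for e in errors) / len(errors)   # mean squared error
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Hypothetical monthly sales (assumption), not taken from the ModCloth data.
actual = [120, 135, 150, 160]
forecast = [110, 140, 150, 170]

scores = forecast_errors(actual, forecast)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

RMSE is expressed in the same units as the series but penalizes large misses more heavily than MAE, so a model can rank differently under the two metrics.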

  • Deep Learning

    • Convolutional Neural Network

      • Passion Fruit Classification using Convolutional Neural Network: Build a CNN architecture to classify several cultivars of passion fruit (Markisa), notably Sweet Passion Fruit (Markisa Manis), Yellow Passion Fruit (Markisa Kuning), Purple Passion Fruit (Markisa Ungu), and Big Passion Fruit (Markisa Besar).
      • Classification of COVID-19 from Chest X-ray images using Transfer Learning: Build a CNN-based model with DenseNet201 transfer learning to detect COVID-19, lung opacity, and viral pneumonia in patients from chest X-ray radiographs. The model reaches a training accuracy of 94.5%, a validation accuracy of 96.49%, and a validation AUC of 99.39%. The results suggest that transfer learning is an effective, robust, and easily deployable approach for COVID-19 detection.

Tools: Tensorflow, keras, Colab, Pandas, Seaborn, Matplotlib, Numpy.

  • Natural Language Processing

    • Sentiment Analysis

      • Twitter Sentiment Analysis for Cryptocurrency Price Prediction: Build a system that connects to the Twitter API v2 to collect posts related to BTC and stores them in a MySQL database, using Hugging Face, an NLP library, for sentiment analysis. The library labels each post as "Positive," "Negative," or "Neutral," and the labels are stored back in the MySQL database.

Tools: Transformers, HuggingFace, Colab, Twitter API, Pandas, Numpy, MySQL.

I also work with many other kinds of technologies. You can find my general portfolio here.

If you liked what you saw and want to chat with me about the portfolio, work opportunities, or collaboration, send me an email at a7med.abdu@gmail.com.

Support My Work

If this portfolio inspired you, gave you ideas for your own portfolio, or helped you, please consider following my GitHub profile for new and updated content.
