This repository collects some of the challenges, mini-projects, course homework, tutorials, and similar work that I've done.
Each one lives in its own sub-directory, with a README file or a notebook that describes it in more detail.
-
Sentiment Analysis Web App: This project uses NLTK and PyTorch to build a Sentiment Analysis model. The model is trained locally and then deployed to SageMaker. The project also includes a Web App, which uses AWS API Gateway to receive POST requests and AWS Lambda to do the NLP pre-processing.
-
Plagiarism Detection: This project contains code and associated files for deploying a plagiarism detector using AWS SageMaker. It examines a text file and performs binary classification, labeling that file as either plagiarized or not depending on how similar it is to a provided source text. The project is broken down into three main notebooks: Data Exploration, Feature Engineering, and Train and Deploy in SageMaker.
-
Movie Recommendations Engine: This project builds Knowledge-Based, Content-Based, and Collaborative Filtering-Based (both Neighborhood-Based and Model-Based) recommendation engines using the MovieTweetings dataset.
-
Recommendations Engine with IBM dataset: This project builds a recommendation engine that blends Knowledge-Based and Collaborative Filtering-Based approaches, because the dataset has no reviews or ratings, only logs of which users viewed which articles.
-
Disaster Response Pipeline Project: In this project, we analyze messages sent during disasters to build a model for an API that classifies disaster messages. It creates both an ETL pipeline (for NLP processing) and an ML pipeline (using scikit-learn Pipelines, GridSearch, etc.), and then hosts the model in a Flask web app.
-
Bank Data: In this project, we have data on 10,000 (fictitious) customers of a bank, and want to use the insights to improve customer retention and identify customers at risk of leaving the bank. Finally, we want to predict which of these customers we will be able to retain over the next 12 months.
-
Variational Autoencoders with TensorFlow: In this notebook we build a Variational Autoencoder (a type of generative model) and explore its latent/hidden representations.
-
Generative Adversarial Networks with TensorFlow: In this notebook we build different types of Generative Adversarial Networks (another type of generative model): GAN, DCGAN and VAEGAN.
-
Visualizing Representations with TensorFlow: In this notebook we walk through visualizing the gradients of trained convolutional networks (Inception, VGG, Illustration2Vec), and also explore Deep Dream and Style Net.
-
Fashion MNIST: Uses the Fashion MNIST dataset from Zalando to build a CNN using Keras.
-
Kaggle Reuters: Uses NLP and ML to classify texts into categories. The domain is topic modeling, focused on newspaper articles. It also includes exploratory data analysis (EDA).
-
AWS ETL and ML Pipeline: Notebooks for data preparation, model development, ETLs, model training, inference pipelines, and batch transformations, using AWS Glue, Athena, and SageMaker services.
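
As a rough illustration of the idea behind the Plagiarism Detection entry above (scoring how similar a text is to a source text), here is a minimal sketch of an n-gram containment score in plain Python. This is an assumption about what such a similarity feature could look like, not the code from that project's notebooks; the function names `ngram_counts` and `containment` are hypothetical, and the tokenization is a simple whitespace split.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count word n-grams in lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def containment(answer, source, n=1):
    """Fraction of the answer's n-grams that also occur in the source.

    Hypothetical similarity feature: 0.0 means no overlap,
    1.0 means every n-gram of the answer appears in the source.
    """
    answer_counts = ngram_counts(answer, n)
    source_counts = ngram_counts(source, n)
    total = sum(answer_counts.values())
    if total == 0:
        return 0.0  # answer is shorter than n words
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((answer_counts & source_counts).values())
    return shared / total


if __name__ == "__main__":
    source = "the quick brown fox jumps over the lazy dog"
    answer = "the quick brown cat sleeps"
    print(containment(answer, source, n=1))
    print(containment(answer, source, n=2))
```

Scores like these, computed for a few values of `n`, could then serve as input features for a binary classifier; how the actual project engineers its features is described in its own notebooks.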