Machine Learning Exercises

Practical assignments (TPs) in the field of machine learning.

💬 Description	📁 Data	👨🏻‍💻 Code
TP-1: Anscombe's quartet. Analysis of the importance of outlier effects and of data visualization.	Anscombe's datasets (source).	Jupyter Notebook
TP-2.1: Data visualization. General exploratory analysis to find data showing abnormal behavior.	Health and epidemiological situation of the municipality of Bahía Blanca, Argentina (source).	Jupyter Notebook
TP-2.2: Parametric classifier. Design of a minimum-error classifier and analysis of its performance as the mean and standard deviation of the generated data vary.	"Randomly" generated Gaussian-distributed data.	Jupyter Notebook
TP-3.1: KNN overview. Creation of k-nearest neighbors (KNN) classifiers and evaluation of their performance against several training parameters.	Random samples from a normal (Gaussian) distribution.	Jupyter Notebook
TP-3.2: KNN grid search. Evaluation of hyperparameters and their combinations for a k-nearest neighbors (KNN) classifier. K-fold cross-validation is used to assess the influence of the data on the model.	Random samples from a normal (Gaussian) distribution.	Jupyter Notebook
TP-3.3: Spotify songs. Development and tuning of a k-nearest neighbors (KNN) classifier to predict whether a given song will be liked or not. Feature engineering is applied to select the data that contribute the most information to the model.	More than 2000 Spotify songs from a specific user, marked as liked or disliked (source).	Jupyter Notebook
Fog event forecasting. Comparison of ensemble methods to predict the occurrence of a fog event in the next hour. Bagging and boosting algorithms are used for this purpose, with some basic hyperparameter tuning.	Meteorological data from the Ezeiza (Buenos Aires, Argentina) weather station, with hourly measurements from 1979 to 2011 (source).	Jupyter Notebook
TP-5: Customer segmentation. Construction of clustering algorithms to segment customers based on their annual consumption patterns across product categories. The silhouette coefficient is used to evaluate each model's performance.	Clients of a wholesale distributor, including their annual spending in monetary units (m.u.) on diverse product categories (source).	Jupyter Notebook
TP-6: Boston housing prices. Construction of regression algorithms to predict property sales prices in the city of Boston. Feature selection techniques are implemented to reduce data dimensionality.	Boston Housing dataset with 506 observations and 14 features describing housing prices (source).	Jupyter Notebook
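
As a rough illustration of the idea behind TP-1, the sketch below computes summary statistics for the four Anscombe datasets using only the Python standard library. It is not taken from the notebook: the helper names (`pearson`, `summarize`) and the layout of the `ANSCOMBE` dictionary are assumptions made for this example, and the values are the commonly published ones for the quartet.

```python
from math import sqrt
from statistics import mean, variance

# Commonly published values of Anscombe's quartet.
# Datasets I to III share the same x values; dataset IV has its own.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return the sample statistics usually compared across the quartet."""
    return {
        "mean_x": float(mean(xs)),
        "mean_y": float(mean(ys)),
        "var_x": float(variance(xs)),  # sample variance (n - 1 denominator)
        "var_y": float(variance(ys)),
        "r": pearson(xs, ys),
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in ANSCOMBE.items()}

for name, s in summaries.items():
    print(
        f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} "
        f"var_x={s['var_x']:.2f} var_y={s['var_y']:.2f} r={s['r']:.3f}"
    )
```

The four rows printed are nearly identical, even though the scatter plots of the datasets look very different (a linear trend, a curve, a line with one outlier, and a vertical cluster with one extreme point). This is why summary statistics alone can be misleading and plotting the data matters.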
