Buenos Aires Real Estate Price Prediction

Introduction

This project aims to predict the prices of apartments in Buenos Aires, Argentina, using a robust machine-learning model. The focus is on properties costing less than $400,000. Accurate price predictions are crucial for various stakeholders, including buyers, sellers, real estate agents, and policymakers. The objective is to develop a reliable model even though the data contains no temporal indicators.

Problem Statement

The goal is to identify the features that most accurately predict apartment prices in Buenos Aires and to achieve a Mean Absolute Error (MAE) below 50% of the MAE of a baseline model.
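As a rough illustration of how such a target can be checked, the sketch below compares a model's MAE with that of a naive baseline. It assumes the baseline always predicts the mean training price, which is a common choice but is not confirmed by this README. The function names and the prices are hypothetical.

```python
from statistics import mean

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted prices."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))

def baseline_mae(y_train, y_test):
    """MAE of a naive model that always predicts the mean training price (assumed baseline)."""
    guess = mean(y_train)
    return mean_absolute_error(y_test, [guess] * len(y_test))

def meets_target(model_mae, base_mae, max_ratio=0.5):
    """True when the model's MAE is below max_ratio times the baseline's MAE."""
    return model_mae < max_ratio * base_mae

# Made-up prices in USD, for illustration only
train_prices = [100_000, 150_000, 200_000, 250_000]
test_prices = [120_000, 180_000, 260_000]
model_preds = [125_000, 170_000, 240_000]  # hypothetical model output

base = baseline_mae(train_prices, test_prices)
model = mean_absolute_error(test_prices, model_preds)
print(f"baseline MAE: {base:,.2f}  model MAE: {model:,.2f}")
print("target met:", meets_target(model, base))
```

In a real run, the two MAE values would come from the held-out test split rather than from hand-written lists.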

Methodology

We followed a prescriptive methodology to guide our model development:

Data Collection: Scraped 12,000+ apartment listings from real estate websites.
Data Preprocessing: Cleaned the dataset by handling missing values, converting data types, removing duplicates, and normalizing features.
Exploratory Data Analysis (EDA): Conducted to understand feature distributions and relationships.
Feature Engineering: Extensively used to identify the most significant features.
Modeling: Iteratively experimented with various models, including multiple versions of the Ordinary Least Squares (OLS) model.
Handling Heteroskedasticity: Identified and addressed this issue to improve model accuracy.
Final Model: Selected a Gradient Boosting Regressor, which achieved the best performance of the models tested.
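The preprocessing step above can be sketched roughly as follows. This is a standard-library-only illustration, not the project's code (the project lists Pandas for this work). The column names `price_usd`, `area_m2`, and `neighborhood` are assumptions, as is the choice to drop incomplete rows rather than impute them.

```python
from statistics import mean, pstdev

PRICE_CAP_USD = 400_000  # price ceiling stated in the introduction

def clean_listings(rows):
    """Convert types, drop incomplete or duplicate rows, and apply the price cap."""
    seen, cleaned = set(), []
    for row in rows:
        try:
            price = float(row["price_usd"])
            area = float(row["area_m2"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value: skip the row
        key = (price, area, row.get("neighborhood"))
        if key in seen or price >= PRICE_CAP_USD:
            continue  # duplicate listing, or at/above the cap
        seen.add(key)
        cleaned.append({"price_usd": price, "area_m2": area,
                        "neighborhood": row.get("neighborhood")})
    return cleaned

def standardize(values):
    """Z-score scaling: zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

# Made-up rows, for illustration only
raw = [
    {"price_usd": "120000", "area_m2": "45", "neighborhood": "Palermo"},
    {"price_usd": "120000", "area_m2": "45", "neighborhood": "Palermo"},    # duplicate
    {"price_usd": "450000", "area_m2": "120", "neighborhood": "Recoleta"},  # above cap
    {"price_usd": "", "area_m2": "60", "neighborhood": "Belgrano"},         # missing price
    {"price_usd": "95000", "area_m2": "35", "neighborhood": "Almagro"},
]
listings = clean_listings(raw)
scaled_area = standardize([r["area_m2"] for r in listings])
print(len(listings), "rows kept; scaled areas:", scaled_area)
```

The population standard deviation is used here because that is also what scikit-learn's `StandardScaler` computes.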

Results

Optimal High Leverage Threshold: 0.0003
Optimal High Residual Threshold: 3
Mean Absolute Error (MAE): $23,809.9879
R²: 0.7863
MAE Improvement: 60% better than the baseline

The final model successfully handled heteroskedasticity and outperformed previous iterations.
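The results list leverage and residual thresholds but do not say how they were applied. One common approach, sketched here as an assumption, is to flag observations whose hat-matrix leverage or internally studentized residual exceeds a cutoff. The example fits a one-feature OLS model in plain Python; the function names and data are hypothetical, and the toy leverage cutoff of 0.5 is used because a value such as 0.0003 is only meaningful for datasets with thousands of rows.

```python
from math import sqrt
from statistics import mean

def ols_diagnostics(x, y):
    """Fit y = a + b*x by least squares; return leverage and
    internally studentized residuals for each point."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Diagonal of the hat matrix for simple linear regression
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Residual standard error with n - 2 degrees of freedom
    s = sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    studentized = [e / (s * sqrt(1 - h)) for e, h in zip(residuals, leverage)]
    return leverage, studentized

def flag_points(leverage, studentized, lev_threshold, resid_threshold=3):
    """Indices whose leverage or |studentized residual| exceeds its threshold."""
    return [i for i, (h, r) in enumerate(zip(leverage, studentized))
            if h > lev_threshold or abs(r) > resid_threshold]

# Made-up data: area in square meters, price in USD
area = [30, 40, 50, 60, 70, 200]
price = [90_000, 118_000, 152_000, 178_000, 212_000, 380_000]

leverage, studentized = ols_diagnostics(area, price)
flagged = flag_points(leverage, studentized, lev_threshold=0.5)
print("flagged indices:", flagged)
```

For models with several features, statsmodels exposes the same quantities through `OLSInfluence` (`hat_matrix_diag` and `resid_studentized_internal`), available from a fitted OLS result via `get_influence()`.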

Libraries Used

Python: Core language used for development.
Pandas: For data manipulation and analysis.
NumPy: For numerical operations.
Matplotlib: For data visualization.
Seaborn: For statistical data visualization.
Plotly: For interactive graphs.
Dash: For web-based application development.
Scikit-learn: For machine learning modeling.
Statsmodels: For statistical modeling.

Dependencies

Make sure to install the following dependencies before running the project:

pip install pandas numpy matplotlib seaborn plotly dash scikit-learn statsmodels
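After installing, a quick check like the one below can confirm that each package is importable. It is a small convenience sketch, not part of the repository. Note that the import name differs from the pip name for scikit-learn (`sklearn`).

```python
from importlib.util import find_spec

# pip package name -> import name (they differ for scikit-learn)
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "plotly": "plotly",
    "dash": "dash",
    "scikit-learn": "sklearn",
    "statsmodels": "statsmodels",
}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]

missing = missing_packages()
if missing:
    print("Missing:", " ".join(missing))
else:
    print("All dependencies are importable.")
```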


ALPHAbilal/Predicting-Apartments-Prices-in-Buenos-Aires
