This repository contains the implementation of the LGBM_MA model. Below is a detailed guide on how to use, contribute to, and understand the codebase.
Title: Gradient Boosting With Moving-Average Terms for Nonlinear Sequential Regression
Abstract: The paper investigates sequential nonlinear regression and introduces a novel gradient boosting algorithm. This algorithm is inspired by the well-known linear auto-regressive-moving-average (ARMA) models and exploits the residuals, i.e., prediction errors, as additional features. The main idea is to utilize the state information from early time steps contained in the residuals to enhance the performance in a nonlinear sequential regression/prediction framework. By exploiting the changes in the previous time steps through residual terms, the algorithm aims to achieve improved predictive accuracy in the context of boosting.
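The idea of feeding residuals back in as features can be illustrated with a minimal sketch. This is not the algorithm from the paper or the code in this repository: it is a simplified two-stage scheme, assuming a tiny NumPy gradient booster built from regression stumps in place of LightGBM and a made-up synthetic series. All function names (`fit_stump`, `fit_gbm`, `lag_matrix`, and so on) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def fit_stump(X, r):
    """Find the one-feature threshold split that best fits r in squared error."""
    best = (np.inf, 0, 0.0, r.mean(), r.mean())
    for j in range(X.shape[1]):
        for thr in np.quantile(X[:, j], np.linspace(0.1, 0.9, 9)):
            left = X[:, j] <= thr
            if left.all() or not left.any():
                continue  # skip splits that leave one side empty
            lv, rv = r[left].mean(), r[~left].mean()
            sse = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if sse < best[0]:
                best = (sse, j, thr, lv, rv)
    return best[1:]

def predict_stump(stump, X):
    j, thr, lv, rv = stump
    return np.where(X[:, j] <= thr, lv, rv)

def fit_gbm(X, y, n_rounds=50, lr=0.1):
    """Squared-loss gradient boosting: each stump fits the current residuals."""
    base = y.mean()
    pred = np.full(len(y), base)
    stumps = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y - pred)
        pred = pred + lr * predict_stump(stump, X)
        stumps.append(stump)
    return base, stumps, lr

def predict_gbm(model, X):
    base, stumps, lr = model
    return base + lr * sum(predict_stump(s, X) for s in stumps)

def lag_matrix(v, n_lags):
    """Row for time t holds v[t-1], ..., v[t-n_lags], for t = n_lags .. len(v)-1."""
    return np.column_stack([v[n_lags - k: len(v) - k] for k in range(1, n_lags + 1)])

# Hypothetical nonlinear series with a moving-average noise component.
rng = np.random.default_rng(0)
n, p, q = 400, 2, 2  # series length, number of AR lags, number of MA lags
eps = rng.normal(scale=0.5, size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = np.sin(y[t - 1]) + 0.3 * y[t - 2] + eps[t] + 0.8 * eps[t - 1]

# Stage 1: boosting on lagged targets only (the "AR" part).
X_ar = lag_matrix(y, p)
y_ar = y[p:]
model_ar = fit_gbm(X_ar, y_ar)
resid = y_ar - predict_gbm(model_ar, X_ar)  # prediction errors

# Stage 2: add lagged residuals as extra features (the "MA" part).
X_ma = np.hstack([X_ar[q:], lag_matrix(resid, q)])
y_ma = y_ar[q:]
model_ma = fit_gbm(X_ma, y_ma)

mse_ar = np.mean(resid ** 2)
mse_ma = np.mean((y_ma - predict_gbm(model_ma, X_ma)) ** 2)
print(f"in-sample MSE, lags only:        {mse_ar:.4f}")
print(f"in-sample MSE, lags + residuals: {mse_ma:.4f}")
```

The residuals here are computed in-sample, which is optimistic. In a truly sequential setting, each residual would be the error of a prediction made before the corresponding target was observed.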
- Clone the Repository:
git clone https://github.com/YigitTurali/LGBM_MA.git
- Navigate to the Directory:
cd LGBM_MA
- Install Dependencies:
pip install -r requirements.txt
- Run the Code: Detailed instructions on specific scripts and models are provided in the Code Explanation section below.
This file contains the implementation of the LGBM-MA model, which combines gradient boosting (GBM) with moving-average (MA) terms to achieve better predictive performance.
This script is responsible for loading the data required for the MLP (Multi-Layer Perceptron) model. It ensures that the data is in the correct format and is ready for training and evaluation.
In this file, the MLP model is defined. It includes the architecture, training, and evaluation methods for the model.
This script contains various data preprocessing pipelines. These pipelines are essential for preparing the data for different models and ensuring that it's in the right format.
This file contains the implementation of the LightGBM model, including the model's definition, training, and evaluation methods.
This script is used to prepare synthetic datasets. It contains various utilities and methods to generate and process synthetic data.
This file contains the implementation related to the letter paper dataset. It includes data loading, preprocessing, and model training for this specific dataset.
Similar to the letter_paper.py file, this script deals with the real data from the letter paper dataset.
This script is dedicated to the preparation of the M4 dataset. It includes utilities and methods to load, preprocess, and prepare the M4 dataset for model training.
This is the main script where the entire workflow is orchestrated. It calls various utilities and models defined in other files and executes the project's main logic.
Another utility script for synthetic data generation and processing.
Contributions are welcome! Please read the contributing guidelines to get started.
This project is licensed under the MIT License. See the LICENSE file for details.
For more information or questions, please contact Yigit Turali.