2. Examples: regression (soon)

George Lopatenko edited this page Jul 12, 2024 · 6 revisions

Regression

Basic case

Code

from fedot_ind.core.architecture.pipelines.abstract_pipeline import ApiTemplate

dataset_name = 'AppliancesEnergy'  # BeijingPM10Quality
api_config = dict(
    problem='regression',
    metric='rmse',
    timeout=1,
    n_jobs=2,
    logging_level=20
)
metric_names = ('r2', 'rmse', 'mae')
# Pass the regression metrics defined above ('f1' and 'accuracy' are classification metrics).
result_dict = ApiTemplate(api_config=api_config,
                          metric_list=metric_names).eval(dataset=dataset_name, finetune=False)
print(result_dict['metrics'])

Industrial Examples

Ethereum analysis

Ethereum analysis notebook

Historical Sentiment Data dataset

In all four datasets, the predictors are the hourly closing price (in USD) and the trading volume for each respective cryptocurrency for one day. This results in a two-dimensional series of length 24. The target variable is the normalized sentiment score on the day spanned by the timepoints. The datasets were divided into training and testing sets by randomly selecting 30% of each set as the test data.
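To make the data layout concrete, here is a minimal sketch of a random 70/30 split over samples shaped as 2 channels by 24 hourly values. The helper `random_split`, the seed, and the toy values are assumptions made for illustration; this is not how the published datasets were actually generated.

```python
import random

def random_split(samples, targets, test_fraction=0.3, seed=0):
    """Randomly hold out `test_fraction` of the samples as test data."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([samples[i] for i in train_idx], [targets[i] for i in train_idx])
    test = ([samples[i] for i in test_idx], [targets[i] for i in test_idx])
    return train, test

# Ten toy samples, each with 2 channels (price, volume) of 24 hourly values.
samples = [[[float(k)] * 24, [float(k) * 10] * 24] for k in range(10)]
# Toy stand-ins for the normalized sentiment score of each day.
targets = [k / 10 for k in range(10)]
(train_x, train_y), (test_x, test_y) = random_split(samples, targets)
```

Each element of `train_x` and `test_x` is one two-dimensional series of length 24, paired with a single scalar target.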

import pandas as pd
from fedot_ind.api.utils.path_lib import PROJECT_PATH
from fedot_ind.core.architecture.pipelines.abstract_pipeline import ApiTemplate

api_config = dict(
    problem='regression',
    metric='rmse',
    timeout=5,
    n_jobs=-1,
    with_tuning=False,
    logging_level=10
)
metric_list = ('r2', 'rmse', 'mae')
dataset_name = 'EthereumSentiment'
data_path = PROJECT_PATH + '/examples/data'

api_client = ApiTemplate(api_config=api_config, metric_list=metric_list)

The next steps are straightforward: fit the model, then predict values for the test data, as with any scikit-learn estimator.

At the fit stage, FedotIndustrial transforms the initial time series data into a feature dataframe and trains a regression model.
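As a rough illustration of the transformation step described above, the sketch below turns one multivariate series into a flat row of per-channel statistics. This is not FedotIndustrial's actual feature generator: the function name `extract_features`, the input layout, and the choice of statistics are all assumptions made for the example.

```python
import statistics

def extract_features(sample):
    """Summarise each channel of a multivariate series with basic statistics.

    `sample` is a list of channels, each a list of floats (hypothetical layout).
    """
    row = {}
    for i, channel in enumerate(sample):
        row[f"ch{i}_mean"] = statistics.fmean(channel)
        row[f"ch{i}_std"] = statistics.pstdev(channel)
        row[f"ch{i}_min"] = min(channel)
        row[f"ch{i}_max"] = max(channel)
    return row

# Toy two-channel series standing in for (price, volume).
toy_sample = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
feature_row = extract_features(toy_sample)
```

Stacking one such row per sample gives a tabular dataset that any standard regressor can be trained on.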

result_dict = api_client.eval(dataset=dataset_name, finetune=False)
print(result_dict['metrics'])
r2 rmse mae
0.302 0.224 0.171
[Figures: eth_analysis_target_vs_automl, eth_fitted_pipeline]

Here is a comparison of RMSE values against several state-of-the-art (SOTA) models (lower is better):

model avg(rmse) model avg(rmse)
DrCIF_RMSE 0.223448 MultiROCKET_RMSE 0.249028
Fedot_Industrial_AutoML 0.224000 InceptionT_RMSE 0.251960
FreshPRINCE_RMSE 0.225876 XGBoost_RMSE 0.252351
RotF_RMSE 0.229423 FCN_RMSE 0.252702
FPCR_RMSE 0.231336 ResNet_RMSE 0.255854
RandF_RMSE 0.233020 Grid-SVR_RMSE 0.257539
TSF_RMSE 0.234477 SingleInception_RMSE 0.257753
FPCR-Bs_RMSE 0.235169 Ridge_RMSE 0.262168
RDST_RMSE 0.239879 CNN_RMSE 0.271610
RIST_RMSE 0.241129 ROCKET_RMSE 0.287077
5NN-DTW_RMSE 0.241217 1NN-DTW_RMSE 0.318791
5NN-ED_RMSE 0.246046 1NN-ED_RMSE 0.333878
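For reference, the three reported metrics (r2, rmse, mae) follow their standard definitions, sketched here in plain Python. The function names and toy values are illustrative assumptions; the library may compute the metrics through a different code path.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: penalises large errors more strongly."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Toy values: three exact predictions and one that is off by 2.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
scores = {"r2": r2(y_true, y_pred), "rmse": rmse(y_true, y_pred), "mae": mae(y_true, y_pred)}
```

RMSE and MAE are expressed in the units of the target, so their values are not comparable across datasets with different target scales.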

Oil and gas prices analysis

Oil and gas prices analysis notebook

2000-2022 Oil and Gas Data dataset

Note

This type of model could help companies and governments to better analyse and predict economic situations and correlations regarding oil and natural gas.

The dataset consists of historical prices of Brent Oil, Crude Oil WTI, Natural Gas, and Heating Oil from 2000 to 2022. The DailyOilGasPrices sample was created by using 30 consecutive business days of Crude Oil WTI close prices and traded volumes as predictors, and the average natural gas close price over each 30-day time frame as the target variable. The final dataset has 191 two-dimensional time series of length 30, of which 70% were randomly sampled as training data and the remaining 30% as testing data.
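The windowing described above can be sketched as follows. The source does not say whether the 30-day windows overlap, so non-overlapping windows are an assumption here, as are the function name `build_windows` and the toy input values.

```python
def build_windows(wti_close, wti_volume, gas_close, window=30):
    """Cut aligned daily series into (2 x window) samples with a mean-price target.

    Assumes non-overlapping windows; any trailing partial window is dropped.
    """
    n_days = min(len(wti_close), len(wti_volume), len(gas_close))
    samples, targets = [], []
    for start in range(0, n_days - window + 1, window):
        end = start + window
        # Predictors: WTI close price and traded volume over the window.
        samples.append([wti_close[start:end], wti_volume[start:end]])
        # Target: average natural gas close price over the same window.
        targets.append(sum(gas_close[start:end]) / window)
    return samples, targets

# 65 toy business days yield two full windows; the last 5 days are dropped.
days = 65
wti_close = [50.0 + d for d in range(days)]
wti_volume = [1000.0 + 10 * d for d in range(days)]
gas_close = [float(d) for d in range(days)]
samples, targets = build_windows(wti_close, wti_volume, gas_close)
```

Each sample is a two-channel series of length 30 paired with one scalar target, matching the shape described for DailyOilGasPrices.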

import pandas as pd
from fedot_ind.api.utils.path_lib import PROJECT_PATH
from fedot_ind.core.architecture.pipelines.abstract_pipeline import ApiTemplate

api_config = dict(
    problem='regression',
    metric='rmse',
    timeout=15,
    n_jobs=-1,
    with_tuning=False,
    logging_level=20
)

metric_list = ('r2', 'rmse', 'mae')
dataset_name = 'DailyOilGasPrices'
data_path = PROJECT_PATH + '/examples/data'

api_client = ApiTemplate(api_config=api_config, metric_list=metric_list)

At the fit stage, FedotIndustrial transforms the initial time series data into a feature dataframe and trains a regression model.

result_dict = api_client.eval(dataset=dataset_name, finetune=False)
print(result_dict['metrics'])
r2 rmse mae
0.584 1.161 0.833
[Figures: oil_gas_analysis_target_vs_automl, oil_gas_fitted_pipeline]

Here is a comparison of RMSE values against several state-of-the-art (SOTA) models (lower is better):

model avg(rmse) model avg(rmse)
Fedot_Industrial_tuned 1.161000 FPCR_RMSE 2.052389
FreshPRINCE_RMSE 1.490442 5NN-DTW_RMSE 2.055256
RIST_RMSE 1.501047 FCN_RMSE 2.069046
RotF_RMSE 1.559385 FPCR-Bs_RMSE 2.097964
DrCIF_RMSE 1.594442 SingleInception_RMSE 2.149368
TSF_RMSE 1.684828 CNN_RMSE 2.150854
RandF_RMSE 1.708196 Grid-SVR_RMSE 2.203537
XGBoost_RMSE 1.716903 5NN-ED_RMSE 2.251424
RDST_RMSE 1.772813 ROCKET_RMSE 2.275254
MultiROCKET_RMSE 1.773578 Ridge_RMSE 2.363609
ResNet_RMSE 1.938074 1NN-DTW_RMSE 2.742105
InceptionT_RMSE 2.030315 1NN-ED_RMSE 2.822595

Building energy consumption analysis

Building energy consumption analysis notebook

ASHRAE Energy prediction notebook with data

The dataset, published on Kaggle, aims to assess the value of energy efficiency improvements. For that purpose, four types of energy sources are identified: electricity, chilled water, steam, and hot water. The goal is to estimate energy consumption in kWh. The input dimensions are air temperature, dew temperature, wind direction, and wind speed. These values were recorded hourly over one week, and the output is the meter reading for each of the four sources. This yields four datasets: ChilledWaterPredictor, ElectricityPredictor, HotwaterPredictor, and SteamPredictor.

[Figures: ashrae_analysis_target_vs_automl, ashrae_fitted_pipeline]

Here is a comparison of RMSE values against several state-of-the-art (SOTA) models (lower is better):

model avg(rmse) model avg(rmse)
FCN_RMSE 1072.502 RandF_RMSE 1310.439
Fedot_Industrial_AutoML 1106.248 FPCR_RMSE 1331.504
RDST_RMSE 1142.764 5NN-DTW_RMSE 1331.504
ResNet_RMSE 1145.645 RotF_RMSE 1383.306
InceptionT_RMSE 1156.251 TSF_RMSE 1401.285
SingleInception_RMSE 1162.325 XGBoost_RMSE 1424.823
RIST_RMSE 1172.270 FPCR-Bs_RMSE 1427.171
CNN_RMSE 1174.255 5NN-ED_RMSE 1458.866
ROCKET_RMSE 1236.408 Grid-SVR_RMSE 1587.147
FreshPRINCE_RMSE 1240.376 1NN-DTW_RMSE 1819.103
DrCIF_RMSE 1246.467 1NN-ED_RMSE 1906.032
MultiROCKET_RMSE 1252.545 Ridge_RMSE 2719.383