3. Tools

Vadim A. Potemkin edited this page Jun 6, 2024 · 4 revisions

This page describes tools that were developed within the framework to support its functionality but can also be used on their own.

Time-series generator

Code

A tool for generating four types of synthetic time series:

  • sin wave
  • random walk
  • auto regression
  • smooth normal

All you need to do is pass a config dict to the main class. Here are example configs for each of the four time series types listed above:

sin_config = {
    'ts_type': 'sin',
    'length': 1000,
    'amplitude': 10,
    'period': 500
}

random_walk_config = {
    'ts_type': 'random_walk',
    'length': 1000,
    'start_val': 36.6
}

auto_regression_config = {
    'ts_type': 'auto_regression',
    'length': 1000,
    'ar_params': [0.5, -0.3, 0.2],
    'initial_values': None
}

smooth_normal_config = {
    'ts_type': 'smooth_normal',
    'length': 1000,
    'window_size': 300
}
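To make the config keys more concrete, here is a minimal pure-Python sketch of what each series type could look like. It is an illustration under assumptions, not the code of TimeSeriesGenerator: the function names (sin_wave, random_walk, auto_regression, smooth_normal, make_series), the standard normal noise and the moving-average smoothing are all hypothetical choices.

```python
import math
import random


def sin_wave(length, amplitude, period):
    # `period` is measured in samples, so one full cycle spans `period` points.
    return [amplitude * math.sin(2 * math.pi * t / period) for t in range(length)]


def random_walk(length, start_val, rng):
    # Assumed step law: each value is the previous one plus a standard normal step.
    series = [start_val]
    for _ in range(length - 1):
        series.append(series[-1] + rng.gauss(0, 1))
    return series


def auto_regression(length, ar_params, rng, initial_values=None):
    # AR(p): x[t] = sum(ar_params[i] * x[t - 1 - i]) + noise.
    p = len(ar_params)
    if initial_values is not None:
        series = list(initial_values)
    else:
        series = [rng.gauss(0, 1) for _ in range(p)]
    while len(series) < length:
        past = series[-1:-p - 1:-1]  # the p most recent values, newest first
        series.append(sum(a * x for a, x in zip(ar_params, past)) + rng.gauss(0, 1))
    return series[:length]


def smooth_normal(length, window_size, rng):
    # Assumed smoothing: moving average of standard normal noise.
    noise = [rng.gauss(0, 1) for _ in range(length + window_size - 1)]
    window_sum = sum(noise[:window_size])
    series = [window_sum / window_size]
    for t in range(1, length):
        window_sum += noise[t + window_size - 1] - noise[t - 1]
        series.append(window_sum / window_size)
    return series


def make_series(config, seed=0):
    # Dispatch on 'ts_type'; the remaining keys become keyword arguments.
    params = {k: v for k, v in config.items() if k != 'ts_type'}
    if config['ts_type'] == 'sin':
        return sin_wave(**params)
    builders = {'random_walk': random_walk,
                'auto_regression': auto_regression,
                'smooth_normal': smooth_normal}
    return builders[config['ts_type']](rng=random.Random(seed), **params)


example = make_series({'ts_type': 'sin', 'length': 1000, 'amplitude': 10, 'period': 500})
```

With period 500, the sketch reaches its first maximum a quarter of a cycle in, at index 125.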

Then create the generator with one of these configs and call its get_ts() method:

from fedot_ind.tools.synthetic.ts_generator import TimeSeriesGenerator

# Any of the four configs above can be passed here
ts_generator = TimeSeriesGenerator(sin_config)
ts = ts_generator.get_ts()

[Figure: ts, ts dataset, ts anomaly generators]

Dataset generator

Code

This tool creates a synthetic dataset of time series for classification and regression tasks. Several arguments must be passed to the TimeSeriesDatasetsGenerator class:

  • num_samples - The number of samples to generate.
  • task - The type of task the dataset is generated for, e.g. 'classification'.
  • max_ts_len - The maximum length of the time series.
  • binary - Whether to generate binary classification datasets or multiclass.
  • test_size - The proportion of the dataset to include in the test split.
  • multivariate - Whether to generate multivariate time series.

generator = TimeSeriesDatasetsGenerator(num_samples=80,
                                        task='classification',
                                        max_ts_len=50,
                                        binary=True,
                                        test_size=0.5,
                                        multivariate=False)
train_data, test_data = generator.generate_data()
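As a rough illustration of the test_size argument, the sketch below splits sample indices by a simple proportion. The helper split_indices is hypothetical and the rounding rule is an assumption; the generator itself may split differently.

```python
import random


def split_indices(num_samples, test_size, seed=0):
    # Shuffle the sample indices, then cut off the first share as the test part.
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(num_samples * test_size)  # assumed rounding rule
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(num_samples=80, test_size=0.5)
print(len(train_idx), len(test_idx))  # 40 40
```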

Anomaly generator

The AnomalyGenerator class injects anomalies into time series data: it takes a time series as input and returns the series with anomalies added. Anomalies are generated according to the anomaly_config parameter, a dict with anomaly class names as keys and anomaly parameters as values. The names must match the anomaly class names in anomalies.py.

First, create an instance of the AnomalyGenerator class, passing a config that defines the hyperparameters of every anomaly type to use. There are six types of anomalies:

  • dip
  • peak
  • decrease_dispersion
  • increase_dispersion
  • shift_trend_up
  • add_noise

Each of them is configured with the corresponding parameters: level, number, min_anomaly_length and max_anomaly_length (add_noise additionally takes noise_type). For example, if all six types were needed, the config would look like this:

anomaly_config = {'dip': {'level': 20,
                          'number': 5,
                          'min_anomaly_length': 10,
                          'max_anomaly_length': 20},
                  'peak': {'level': 2,
                           'number': 5,
                           'min_anomaly_length': 5,
                           'max_anomaly_length': 10},
                  'decrease_dispersion': {'level': 70,
                                          'number': 2,
                                          'min_anomaly_length': 10,
                                          'max_anomaly_length': 15},
                  'increase_dispersion': {'level': 50,
                                          'number': 2,
                                          'min_anomaly_length': 10,
                                          'max_anomaly_length': 15},
                  'shift_trend_up': {'level': 10,
                                     'number': 2,
                                     'min_anomaly_length': 10,
                                     'max_anomaly_length': 20},
                  'add_noise': {'level': 80,
                                'number': 2,
                                'noise_type': 'uniform',
                                'min_anomaly_length': 10,
                                'max_anomaly_length': 20}}
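Since every anomaly entry shares the same four length and count parameters, a config can be sanity-checked before it is passed to the generator. The helper below is a hypothetical sketch, not part of the framework; the rules it enforces (all four keys present, min length not above max length, at least one anomaly) are assumptions about what a sensible config looks like.

```python
REQUIRED_KEYS = {'level', 'number', 'min_anomaly_length', 'max_anomaly_length'}


def check_anomaly_config(anomaly_config):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for name, params in anomaly_config.items():
        missing = REQUIRED_KEYS - params.keys()
        if missing:
            problems.append(f"{name}: missing {sorted(missing)}")
            continue
        if params['min_anomaly_length'] > params['max_anomaly_length']:
            problems.append(f"{name}: min_anomaly_length exceeds max_anomaly_length")
        if params['number'] < 1:
            problems.append(f"{name}: number must be at least 1")
    return problems


good = {'peak': {'level': 2, 'number': 5,
                 'min_anomaly_length': 5, 'max_anomaly_length': 10}}
bad = {'dip': {'level': 20, 'number': 5,
               'min_anomaly_length': 30, 'max_anomaly_length': 20}}
print(check_anomaly_config(good))  # []
print(check_anomaly_config(bad))
```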

For our case we use peak and decrease_dispersion:

from fedot_ind.tools.synthetic.anomaly_generator import AnomalyGenerator

config = {'peak': {'level': 50,
                   'number': 5,
                   'min_anomaly_length': 20,
                   'max_anomaly_length': 50},
          'decrease_dispersion': {'level': 50,
                                  'number': 5,
                                  'min_anomaly_length': 40,
                                  'max_anomaly_length': 50}}
generator = AnomalyGenerator(config=config)

Then we can generate anomalies in the time series using the generate method, whose arguments are time_series_data (an np.array, or a config for synthetic ts_data), plot and the acceptable overlap. As initial data we will use a synthetic time series obtained with the generator introduced earlier:

from fedot_ind.tools.synthetic.ts_generator import TimeSeriesGenerator

sin_config = {
    'ts_type': 'sin',
    'length': 1000,
    'amplitude': 10,
    'period': 1000
}
ts_generator = TimeSeriesGenerator(sin_config)
ts = ts_generator.get_ts()

# Pass the series generated above (the original snippet referenced an undefined name)
initial_ts, modified_ts, intervals = generator.generate(time_series_data=ts,
                                                        plot=True,
                                                        overlap=0.1)

This method returns the initial time series, the modified time series and a dict with anomaly intervals, which can be visualised:

[Figure: anomaly]
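The exact layout of the returned intervals dict is not shown on this page. As a library-independent cross-check, the changed spans can be recovered by comparing the initial and modified series directly. The helper changed_intervals below is a hypothetical sketch that works on any two equal-length sequences of numbers.

```python
def changed_intervals(initial, modified, tol=1e-12):
    """Return inclusive (start, end) index pairs where the two series differ."""
    intervals = []
    start = None
    for i, (a, b) in enumerate(zip(initial, modified)):
        if abs(a - b) > tol:
            if start is None:
                start = i  # a new run of differing values begins here
        elif start is not None:
            intervals.append((start, i - 1))
            start = None
    if start is not None:
        # The last run extends to the end of the compared range.
        intervals.append((start, min(len(initial), len(modified)) - 1))
    return intervals


# Toy data standing in for initial_ts and modified_ts
toy_initial = [0.0] * 10
toy_modified = [0.0, 0.0, 0.0, 5.0, 5.0, 5.0, 0.0, 0.0, 0.0, 1.0]
print(changed_intervals(toy_initial, toy_modified))  # [(3, 5), (9, 9)]
```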

Data Loader

A class for reading data files, which downloads a dataset from the UCR archive if it is not found locally. At the moment it supports the .ts, .txt, .tsv, and .arff formats.

data_loader = DataLoader('ItalyPowerDemand')
train_data, test_data = data_loader.load_data()
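For orientation, UCR archive .tsv files store one series per line, with the class label in the first column followed by tab-separated values. The sketch below parses that layout with the standard library only. The function parse_ucr_tsv is hypothetical and says nothing about how DataLoader reads files internally; the sample rows are made up for the example.

```python
def parse_ucr_tsv(text):
    """Split UCR-style tab-separated rows into labels and value lists."""
    labels, series = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split('\t')
        labels.append(fields[0])                       # first column: class label
        series.append([float(v) for v in fields[1:]])  # remaining columns: values
    return labels, series


sample = "1\t-0.71\t-1.18\t-1.37\n2\t0.62\t0.58\t0.41\n"
labels, series = parse_ucr_tsv(sample)
print(labels)     # ['1', '2']
print(series[0])  # [-0.71, -1.18, -1.37]
```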
