Regress using many meta-learning packages, including caret and TPOT, from one script. Also includes an artificial data generator, a comparison script, and several experiments. Messy.

ran88dom99/GeneratedRegMLBenchmark

This project is licensed under the terms of the GNU Affero General Public License (AGPL-3.0).

GeneratedRegMLBenchmark

Tests what the machine learning algorithms in R's caret package can detect. Many benchmarks have been run on common real-world datasets; this benchmark instead runs on generated extremes.

For results, see the wiki and the folders whose names contain [number]th, especially the files named minnrec ods and power png. Some patterns, such as three variables multiplied together to produce the target variable or nested if statements, can barely be reconstructed by any algorithm.
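The two hard patterns mentioned above can be sketched in a few lines. This is only an illustration in Python, not the repository's R generator: the function names, the uniform input range, and the specific branch rules are assumptions. The target is placed in column 1 to match the layout the benchmark expects.

```python
import random

def make_product_pattern(n_rows=100, seed=0):
    """Rows of [y, x1, x2, x3] where y = x1 * x2 * x3 (hypothetical generator)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x1, x2, x3 = (rng.uniform(-1.0, 1.0) for _ in range(3))
        rows.append([x1 * x2 * x3, x1, x2, x3])
    return rows

def make_nested_if_pattern(n_rows=100, seed=0):
    """Rows of [y, x1, x2, x3] where y comes from nested if statements.

    The branch rules here are made up for illustration.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        x1, x2, x3 = (rng.uniform(-1.0, 1.0) for _ in range(3))
        if x1 > 0:
            if x2 > 0:
                y = x3
            else:
                y = -x3
        else:
            y = 0.0
        rows.append([y, x1, x2, x3])
    return rows

product_rows = make_product_pattern()
nested_rows = make_nested_if_pattern()
```

Both patterns have no simple additive structure: each input on its own carries almost no information about the target, which is one plausible reason they are hard to recover.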

First run: R version 3.3.2, 100 data points, 3-fold cross-validation repeated 5 times, 16 random hyper-parameter settings tested each time. bagEarth and cubist are best at detecting most patterns, with svmLinear2 and cforest in support. xyf and SBC 42 models always fail.
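To make the resampling scheme concrete, here is a minimal sketch of 3-fold cross-validation repeated 5 times over 100 rows. It is plain Python for illustration only and is not how caret implements it; the function name and the seed are assumptions. In caret itself, this kind of setup is normally requested through `trainControl(method = "repeatedcv", number = 3, repeats = 5, search = "random")` together with `tuneLength = 16` in `train`, though the scripts here may configure it differently.

```python
import random

def repeated_kfold(n_rows, n_folds=3, n_repeats=5, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        # Reshuffle the row indices for every repeat.
        indices = list(range(n_rows))
        rng.shuffle(indices)
        # Deal the shuffled indices into n_folds roughly equal folds.
        folds = [indices[i::n_folds] for i in range(n_folds)]
        for held_out in range(n_folds):
            test_idx = folds[held_out]
            train_idx = [i for k, fold in enumerate(folds)
                         if k != held_out for i in fold]
            yield repeat, held_out, train_idx, test_idx

splits = list(repeated_kfold(100))
```

With these settings each candidate hyper-parameter setting is fitted 15 times (3 folds times 5 repeats), each time on roughly two thirds of the 100 rows.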

Warning! Installs each model's package without prompting. Run "multiple generators" to generate data, then "model tester" to see whether any of caret's modeling algorithms can detect each generated pattern. If a specific test is already in "test out.csv", the program will not run it again. Besides the loops over each model and each generated pattern, there are 2 additional for loops (6 extra iterations in total).
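The skip-if-already-recorded behavior can be sketched as follows. This is a hedged illustration in Python, not the repository's R code: the column positions of the model and pattern names, the function names, and the placeholder log line are all assumptions, since the layout of "test out.csv" beyond its first two columns is not documented here.

```python
import csv
import io

def load_finished(csv_text, model_col=2, pattern_col=3):
    """Collect (model, pattern) pairs already present in the results log.

    model_col and pattern_col are hypothetical column positions.
    """
    finished = set()
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) > max(model_col, pattern_col):
            finished.add((row[model_col], row[pattern_col]))
    return finished

def pending_tests(models, patterns, finished):
    """Return the (model, pattern) pairs that still need to be run."""
    return [(m, p) for m in models for p in patterns
            if (m, p) not in finished]

# Placeholder log content with dummy values, not real benchmark results.
example_log = "0,0,modelA,pattern1\n"
todo = pending_tests(["modelA", "modelB"], ["pattern1"],
                     load_finished(example_log))
```

Because finished tests are skipped, an interrupted run can be restarted without repeating work, and a test can be forced to rerun by deleting its row from the log.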

To regress on your own data, add your data file, with the target in column 1, to the Generats folder and add the file's name to More Generators.R. Then run Multiple Generators.R. Look up the number of your file in gensnames and make it the only number on line 51 of ModelTesterAllAuto.R. There are many options and no documentation, so good luck.

The first two columns of test out.csv are the percentage of the RMSE produced by predicting the mean that the algorithm resolves, and the same percentage for MAE.
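One plausible reading of these two columns is sketched below: the model's error is compared with the error of always predicting the mean, and the score is the share of that baseline error the model removes. The formula `100 * (1 - error_model / error_baseline)`, the function names, and the choice of which mean to use as the baseline are assumptions, not taken from the scripts.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def percent_resolved(error_model, error_baseline):
    """Percentage of the mean-baseline error removed by the model (assumed formula)."""
    return 100.0 * (1.0 - error_model / error_baseline)

actual = [1.0, 2.0, 3.0, 4.0]
mean_prediction = [sum(actual) / len(actual)] * len(actual)

# A perfect model resolves all of the baseline error; the baseline resolves none.
perfect_score = percent_resolved(rmse(actual, actual),
                                 rmse(actual, mean_prediction))
baseline_score = percent_resolved(mae(actual, mean_prediction),
                                  mae(actual, mean_prediction))
```

Under this reading, 100 means a perfect fit, 0 means no better than predicting the mean, and negative values mean the model did worse than the mean.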
