Feature Request: inclusion of the trivial random forest model #711

DoktorPi · 2023-11-26T15:58:26Z

Currently, the `ranger` package does not support fitting a model with no covariates (i.e., the trivial model `Y ~ 1`). In the context of random forests, this would effectively amount to bootstrapping the mean of Y, along with the respective OOB estimate for each observation and the OOB prediction error (essentially the variance of Y).
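To make "bagging the mean" concrete, here is a minimal, language-neutral sketch in plain Python. It is not `ranger` code and says nothing about how `ranger` would implement this; the names `bag_mean`, `num_trees` and `seed` are hypothetical. Each "tree" is just a root node fitted on a bootstrap sample, so its prediction is the bootstrap mean, and each observation's OOB prediction averages the trees that did not draw it.

```python
import random
import statistics


def bag_mean(y, num_trees=500, seed=0):
    """Bag the mean of y; every 'tree' is a single root node (hypothetical helper)."""
    rng = random.Random(seed)
    n = len(y)
    tree_means = []
    oob_sum = [0.0] * n   # running sum of OOB tree predictions per observation
    oob_count = [0] * n   # number of trees for which the observation was OOB

    for _ in range(num_trees):
        # Bootstrap sample: n draws with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        tree_mean = statistics.fmean([y[i] for i in idx])
        tree_means.append(tree_mean)
        # Observations not drawn are out-of-bag for this tree.
        for i in range(n):
            if i not in in_bag:
                oob_sum[i] += tree_mean
                oob_count[i] += 1

    # OOB prediction per observation (None if it was in-bag in every tree).
    oob_pred = [s / c if c else None for s, c in zip(oob_sum, oob_count)]
    sq_err = [(y[i] - p) ** 2 for i, p in enumerate(oob_pred) if p is not None]
    oob_mse = statistics.fmean(sq_err) if sq_err else None

    return {
        "prediction": statistics.fmean(tree_means),  # forest prediction for any new point
        "oob_predictions": oob_pred,
        "oob_mse": oob_mse,
    }


fit = bag_mean([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], num_trees=200, seed=1)
print(fit["prediction"], fit["oob_mse"])
```

In this sketch the forest prediction is close to the plain mean of Y, and the OOB mean squared error is of the same order as the variance of Y, which is what makes the trivial model usable as a baseline.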

The potential benefits of implementing the case of zero covariates could be the following:

- The trivial model without covariates often serves as a reference model or literal null model.
- It fits neatly into the class of random forests as a trivial but essential subclass, i.e. bagging the mean.
- Many scenarios involve automated screening through various sets of predictors, which may or may not include the empty set. With built-in support for the trivial model, no additional code would be needed to handle that case separately.

As a side note: setting `mtry = 0` while passing a formula with at least one covariate does not force a fit of the trivial model; it seems to fit with `mtry = 1` instead.

mnwright · 2023-11-30T07:16:36Z

Thanks, that's a good idea. Could the interface look like this:

- Formula interface: `y ~ 1`
- `dependent.variable.name` interface: Supply data with just the target column?
- x/y interface: `x = NULL`

Or any other ideas?

DoktorPi · 2023-12-04T13:39:14Z

I fully agree with your scheme:

- Formula interface: `y ~ 1` is the standard way to specify the trivial model in R's formula syntax.
- `dependent.variable.name` interface: Triggering the trivial model by providing only the target column aligns very well with automated data processing pipelines.
- x/y interface: Providing `y` without `x` as a way to trigger the trivial model is convenient and consistent.

So, this seems to be the most consistent interface scheme.

Btw, I also discovered a workaround to emulate the trivial model in ranger: specify a non-trivial formula (at least one predictor) but set `min.node.size` equal to the sample size.
