Cookies

We use cookies to ensure that we give you the best experience on our website. By continuing to browse this repository, you give consent for essential cookies to be used. You can read more about our Privacy and Cookie Policy.


Durham Research Online
You are in:

Imprecise weighted extensions of random forests for classification and regression.

Utkin, L.V. and Kovalev, M.S. and Coolen, F.P.A. (2020) 'Imprecise weighted extensions of random forests for classification and regression.', Applied soft computing., 92 . p. 106324.

Abstract

One of the main problems of using the random forests (RF) in classification and regression tasks is a lack of sufficient data which fall into certain leaves of trees in order to estimate the tree predicted values. To cope with this problem, robust imprecise classification and regression RF models, called the imprecise RF, are proposed. They are based on the following ideas. First, imprecision of the tree estimates is taken into account by means of imprecise statistical inference models and confidence interval models. Secondly, we introduce weights assigned to trees or to groups of trees, which are computed in order to correct the RF estimates under condition of imprecise tree predicted values. In fact, the weights can be regarded as a robust meta-learner controlling the imprecision of estimates. Special modifications of loss functions to compute optimal weights for the classification and regression tasks are proposed in order to simplify maximin optimization problems. As a result, simple linear and quadratic optimization problems are obtained, whose solution does not meet any difficulties. Various numerical examples with real datasets illustrate the proposed robust models and show outperforming results when datasets are rather small or noisy.

Item Type:Article
Full text:(AM) Accepted Manuscript
Available under License - Creative Commons Attribution Non-commercial No Derivatives.
Download PDF
(1855Kb)
Status:Peer-reviewed
Publisher Web site:https://doi.org/10.1016/j.asoc.2020.106324
Publisher statement:© 2020 This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Date accepted:16 April 2020
Date deposited:20 April 2020
Date of first online publication:21 April 2020
Date first made open access:21 April 2021

Save or Share this output

Export:
Export
Look up in GoogleScholar