Combining tree-boosting and mixed effects models improves the performance of remote-sensing based forest age predictions
Oxford University Press
2025
Toivonen_etal_2025_Forestry_Combining_treeboosting.pdf - Publisher's version - 1.83 MB
How to cite: Janne Toivonen, Annika Kangas, Timo P Pitkänen, Mari Myllymäki, Matti Maltamo, Mikko Kukkonen, Petteri Packalen, Combining tree-boosting and mixed effects models improves the performance of remote-sensing based forest age predictions, Forestry: An International Journal of Forest Research, 2025, cpaf083, https://doi.org/10.1093/forestry/cpaf083
Permanent address
Abstract
Forest age is an important attribute from the perspectives of forest management and biodiversity, but predicting it from remote sensing data is difficult. In this study, we evaluated the performance of airborne laser scanning (ALS) and Sentinel-2 data in plot-level forest age predictions. In addition, we accounted for site conditions in the modelling by utilizing categorical variables, such as the main site type of the forest plot. The categorical variables were derived from field data but were available for the entire landscape. We compared two prediction methods: linear mixed effects (LME) modelling and tree-boosted mixed effects (GPBoost) modelling. Our field data contained 870 National Forest Inventory plots in northern Finland, with ages ranging from 0 to 300 years. Some plots contained seedling and retention trees (hereafter hold-over trees) left from the previous generation, which makes age prediction for these plots a major challenge. To mitigate this, we tested an alternative strategy that included a prior classification step to identify hold-over plots. Overall, three age modelling strategies were tested: (1) without categorical variables, (2) with categorical variables, and (3) with both categorical variables and hold-over plot classification. Our results showed that GPBoost was superior to LME in each tested scenario, and that the addition of categorical variables led to a clear decrease in the prediction error. When categorical variables were added as random components, the relative root mean square error (RMSE) improved from 46.2% to 40.2% for LME and from 41.7% to 38.5% for GPBoost. The best-performing modelling strategy included hold-over plot classification before age modelling, which yielded relative RMSE values of ~38.2% and 36.3% for LME and GPBoost, respectively. Compared to earlier research, our approach exhibited better prediction performance for older forests (≥150 years old), which in turn enables better identification of old-growth forests.
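The relative RMSE figures quoted in the abstract can be read with the help of a small sketch. It assumes the common definition, RMSE divided by the mean of the observed values and expressed as a percentage; the abstract does not state the exact formula, so this definition, the function name `relative_rmse`, and the plot ages below are illustrative assumptions and not values or code from the study.

```python
import math


def relative_rmse(observed, predicted):
    """Return RMSE as a percentage of the mean observed value (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared prediction errors.
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    mean_observed = sum(observed) / len(observed)
    return 100.0 * math.sqrt(mse) / mean_observed


# Hypothetical plot ages in years; these are NOT data from the study.
observed_age = [40.0, 80.0, 120.0, 160.0]
predicted_age = [50.0, 70.0, 130.0, 150.0]

print(round(relative_rmse(observed_age, predicted_age), 1))  # prints 10.0
```

Under this definition, a relative RMSE of 36.3% would mean that the typical prediction error is roughly a third of the mean observed plot age.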
ISBN
OKM publication type
A1 Original article in a scientific journal
Publication series
Forestry
Volume
Issue
Pages
14 p.
ISSN
0015-752X
1464-3626
