Jukuri, open repository of the Natural Resources Institute Finland (Luke) All material supplied via Jukuri is protected by copyright and other intellectual property rights. Duplication or sale, in electronic or print form, of any part of the repository collections is prohibited. Making electronic or print copies of the material is permitted only for your own personal use or for educational purposes. For other purposes, this article may be used in accordance with the publisher’s terms. There may be differences between this version and the publisher’s version. You are advised to cite the publisher’s version. This is an electronic reprint of the original article. This reprint may differ from the original in pagination and typographic detail. Author(s): Jalisha Theanutti Kallingal, Marko Scholze, Paul Anthony Miller, Johan Lindström, Janne Rinne, Mika Aurela, Patrik Vestin, and Per Weslien Title: Assimilating multi-site eddy-covariance data to calibrate the wetland CH4 emission module in a terrestrial ecosystem model Year: 2025 Version: Published version Copyright: The Author(s) 2025 Rights: CC BY 4.0 Rights url: https://creativecommons.org/licenses/by/4.0/ Please cite the original version: Kallingal, J. T., Scholze, M., Miller, P. A., Lindström, J., Rinne, J., Aurela, M., Vestin, P., and Weslien, P.: Assimilating multi-site eddy-covariance data to calibrate the wetland CH4 emission module in a terrestrial ecosystem model, Biogeosciences, 22, 4061–4086, https://doi.org/10.5194/bg-22-4061- 2025, 2025.. https://creativecommons.org/licenses/by/4.0/ https://doi.org/10.5194/bg-22-4061-2025 https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 © Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License. R esearch article Assimilating multi-site eddy-covariance data to calibrate the wetland CH4 emission module in a terrestrial ecosystem model Jalisha Theanutti Kallingal1, Marko Scholze1, Paul Anthony Miller1, Johan Lindström2, Janne Rinne3, Mika Aurela4, Patrik Vestin1, and Per Weslien5 1Department of Physical Geography and Ecosystem Science, Lund University, Lund, 223 62, Sweden 2Centre for Mathematical Sciences, Lund University, Lund, 223 62, Sweden 3Natural Resources Institute Finland, Helsinki, 00 790, Finland 4Finnish Meteorological Institute, Helsinki, 00 101, Finland 5Department of Geosciences, University of Gothenburg, 413 90, Sweden Correspondence: Jalisha Theanutti Kallingal (jalisha.theanutti@nateko.lu.se) Received: 22 October 2024 – Discussion started: 26 November 2024 Revised: 29 April 2025 – Accepted: 26 May 2025 – Published: 25 August 2025 Abstract. In this study, we use a data assimilation framework based on the adaptive Markov chain Monte Carlo (MCMC) algorithm to constrain process parameters in LPJ-GUESS model using CH4 eddy-covariance flux observations from 14 different natural boreal, temperate, and arctic wetlands. The objective is to derive a single set of calibrated parame- ter values. The calibrated parameter values are then used in the model to validate its CH4 flux output against indepen- dent CH4 flux observations from five different types of nat- ural wetlands situated in different locations, assessing their generality for simulating CH4 fluxes from boreal, temperate, and arctic wetlands. The results show that the MCMC frame- work has substantially reduced the cost function (measur- ing the misfit between simulated and observed CH4 fluxes) and facilitated detailed characterisation of the posterior pa- rameter distribution. A reduction of around 50 % in RMSE was achieved, reflecting improved agreement with the ob- servations. The results of the validation experiment indicate that for four out of the five validation sites the RMSE was successfully reduced, demonstrating the effectiveness of the framework for estimating CH4 emissions from wetlands not included in the assimilation experiment. For wetlands above 45° N, the total mean annual CH4 emission estimation using the optimised model resulted in 28.16 Tg C yr−1 and for re- gions above 60 ° N it resulted in 7.46 Tg C yr−1. 1 Introduction Methane (CH4) emissions from wetlands contribute 20 %– 30 % to the total global emissions (IPCC AR6 Chap. 5: Canadell et al., 2021; Saunois et al., 2020). About one third to one half of these wetland emissions are from wetlands lo- cated at northern latitudes of North America, Europe, and Russia (Saunois et al., 2016a). According to the IPCC AR6 report, wetlands are the largest single source of uncertainty to the global CH4 budget estimate. It is expected to have in- creased uncertainties in wetland CH4 emissions in the future (Christensen et al., 2007), partly due to climate change and partly due to spatio-temporal changes in wetland extent (that in itself is partly a consequence of climate change) (Saunois et al., 2016b; Zhang et al., 2017). A key question to consider here is the extent to which these changes in emissions are occurring and how they will impact the future global green- house gas (GHG) budget and hence the climate. While cur- rent in situ measurement techniques such as eddy-covariance (EC) flux observations are promising for drawing assump- tions at local scales, studies to date have faced difficulties in estimating wetland CH4 emissions over large landscapes pri- marily due to the spatial heterogeneity of wetlands, strong temporal variability in emissions, and the limited coverage of long term flux observations (Saunois et al., 2020). An attempt to overcome this limitation through process- based modelling of global CH4 emissions was first initi- ated by Fung et al. (1991) followed by Christensen and Cox Published by Copernicus Publications on behalf of the European Geosciences Union. 4062 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data (1995) and more mechanistically by Cao et al. (1996) and Walter and Heimann (2000). These models were simple in structure and later more attention was given to improving process representation through the studies of mainly Segers and Leffelaar (2001), Gedney et al. (2004), and Zhuang et al. (2006). In the past decade, more advanced models have in- corporated additional complexities such as dynamic water table fluctuations, temperature-dependent microbial activity, substrate diffusion, and oxidation in the soil column, along with broader applications across diverse wetland types and climatic conditions (e.g. Wania et al., 2010; Ringeval et al., 2010; Susiluoto et al., 2018). All of these past efforts indicate that comprehensive, process-based modelling of CH4 emis- sions from wetland ecosystems is unquestionably a key way to understand the variability of wetlands and how they re- spond to environmental stresses and climate change (Saunois et al., 2020). As all these models are approximations of the real world and exhibit their own uncertainties; here again the question is how to reduce the uncertainty for large scale applications. According to Kuppel et al. (2012), every terrestrial biosphere model contains uncertainties in five different ways: errors in real data used for calibration, errors in meteorological forc- ing, errors in process descriptions, errors in model parame- ter values, and inaccurate initial state of the model. The first two errors are related to measurement, while the last three are related to model formulation and are important to im- prove the model performance for general applications. There has been a growing effort to reduce uncertainty related to the last three sources of error factors in several ways. A popular method to reduce uncertainty in model parameters is to cal- ibrate the model simulations against observations. Previous studies like Williams et al. (2009), Susiluoto et al. (2018) and Kuppel et al. (2012), based on different models, dif- ferent data, and different parameter sets provide examples of improving model parameters and reducing uncertainties through data assimilation. In this study, we consider uncertainties in parameter val- ues of the CH4 module of a global process-based ecosys- tem model, Lund–Potsdam–Jena General Ecosystem Simu- lator (LPJ-GUESS) v4.1 and aim to reduce their uncertain- ties. Dynamic global vegetation models (DGVMs) like LPJ- GUESS are state-of-the-art tools for studying the function- ing of high latitude wetlands and estimating the dynamics in their global carbon balance (Sitch et al., 2003). In a previ- ous study, Kallingal et al. (2024) have used EC flux observa- tions collected at an individual site to investigate the poten- tial of a Markov chain Monte Carlo (MCMC) type (Global Rao–Blackwellised Adaptive Metropolis, GRaB-AM) algo- rithm to optimise the parameters in the CH4 module of LPJ- GUESS. The study showed that EC flux measurements of CH4 contains useful information for optimising the CH4 model parameters due to the high temporal resolution of the CH4 flux measurements. However, the small spatial scale (site scale) and limited temporal extent of data collected from a single site could have over fitted parameters to the specifici- ties of the particular site used. This points to the need for a more general approach. For example, in a study conducted by Groenendijk et al. (2011) the parameters of a photosynthesis model are optimised using EC flux observations from sev- eral Fluxnet sites. Similarly, studies like Kuppel et al. (2012) and Raoult et al. (2016) have constrained the parameters of a global ecosystem model using multi-site EC flux observa- tions. Considering these studies and the results of Kallingal et al. (2024), we hypothesise that assimilating multi-site daily CH4 flux observations using the GRaB-AM framework can derive a set of optimised general parameters capable of rep- resenting various types of northern wetlands. The present study’s objective is to investigate the capacity of the GRaB-AM framework developed by Kallingal et al. (2024) for calibrating selected model process parameters in a more general multi-site setup. We aim to use the calibrated model parameters to assess the extent to which this optimisa- tion improves the model’s ability to estimate CH4 emissions from various wetlands across northern latitudes and to esti- mate the total mean annual budget at larger scales. We also aim to estimate the posterior process and parameter uncer- tainties, as well as the posterior parameter correlations. The 14 different arctic, boreal and temperate wetland sites chosen for this study were selected to encompass diverse bioclimatic and geographical characteristics of wetlands. This deliber- ate selection aimed to endow the optimised parameters with the ability to represent a range of wetland types, indepen- dent of their distinct climatic and geographical features. We then perform additional validation simulations to evaluate the performance of the calibrated model against five different, independent validation sites to verify our above-mentioned hypothesis. In the following, we first give a brief overview of the LPJ-GUESS model, the data used in the assimilation and the data assimilation methodology itself. The assimila- tion results are then presented and discussed in Sects. 3 and 4 before we end with the conclusion (Sect. 5). 2 Data and methodology 2.1 LPJ-GUESS model LPJ-GUESS represents the structure and dynamics of ter- restrial ecosystems from local to global scales (Smith, 2001; Smith et al., 2014). The model combines basic eco- physiological features with detailed vegetation dynamics and canopy structure as used in forest gap models and includes an interactive nitrogen cycle (Smith et al., 2014). In ver- sion 4.1, which we used for this study, global vegetation is grouped into 13 different co-occurring mixtures of plant functional types (PFTs) and 5 additional PFTs that can only exist on peatland stands. The model input data consists of climate parameters (mean daily air temperature, precipita- tion, and incoming shortwave radiation), atmospheric CO2 Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4063 concentrations, and soil properties. LPJ-GUESS simulates vegetation dynamics, ecosystem biogeochemistry, water cy- cling, and energy and carbon fluxes on a daily time step. The peatland module in LPJ-GUESS contains detailed represen- tations of wetland PFT characteristics and bio-geo-chemical processes such as estimation of peat temperature, hydrology, and ecosystem exchanges, including CH4 emissions. 2.1.1 Main process description in CH4 module of LPJ-GUESS A detailed description of the wetland and CH4 emissions module is given in Wania et al. (2010) and in Kallingal et al. (2024). Here, we only briefly summarise the most important aspects of the module. The wetland peat in LPJ-GUESS is 1.5 m deep and is divided into an acrotelm with a thickness of 0.3 m with varying water table depth (wtd) and a perma- nently saturated catotelm. Peat hydrology and peat temper- ature in this layered structure depend on the composition of each layer and prevailing meteorological conditions. The five types of PFTs implemented in the wetlands are Sphagnum mosses, C3 graminoids, evergreen and deciduous shrubs, and a generic herbaceous cushion lichen moss. Shade mortal- ity, inundation stress, and daily desiccation stress are lim- iting factors for the existence and productivity of these PFTs. The basic concept of the CH4 module in LPJ-GUESS is a soil carbon pool distributed in proportion to the root distri- bution. This “potential carbon pool” serves as the substrate for methanogens to produce CH4. A portion of the soil car- bon get transformed to CH4 and/or CO2 depending on the hydrological conditions. A fraction of the produced CH4 is in dissolved form and the remainder is in gaseous form. A part of this CH4 is oxidised by the oxygen in the soil and the other part is eventually transported to the atmosphere through either diffusion, plant-mediated transport, or ebullition. As shown in Eq. (1) the sum of the emissions through these three pathways constitutes the total CH4 flux from the soil to the atmosphere (Wania et al., 2010; Kallingal et al., 2024): FCH4 = CH4diff+CH4plant+CH4ebul, (1) where FCH4 is the total CH4 flux, CH4diff is the CH4 flux component from diffusion, CH4plant is the CH4 flux compo- nent from plant-mediated transport, and CH4ebul is the CH4 flux component from ebullition. 2.1.2 Parameter selection The parameters selected for optimisation in this study are shown in Table 1. For this study we have considered 10 out of 11 parameters calibrated by Kallingal et al. (2024) in their single-site optimisation. The parameter wtiller, which is the plant tiller weight, is removed because it showed high corre- lation with rtiller in Kallingal et al. (2024). 2.2 Flux sites and climate data As mentioned in Sect. 1, for this study we selected 14 natural wetland sites for the assimilation and five additional wetland sites for validation (see Fig. 1, Table 2 and Table 4). The selection criteria for the sites were (1) that they are located above 40° N, (2) that they include at least 3 years of consec- utive CH4 (at least the summer) measurements and meteoro- logical measurements available for the sites used for assim- ilation (same criteria but at least 2 years of measurements available for the sites used for validation), and (3) that they represent arctic, boreal, or temperate ecosystems. We did not consider lakes, uplands, etc. as they are beyond the scope of the present study. The 19 sites are representative of a range of wetland types including fens, mires, bogs, marsh, and a wet tundra (for validation). In total we have used 18 437 data points of daily CH4 EC flux observations for assimilation, spanning approximately 93 measurement years. We did not consider any sites where the climate data had gaps of more than 14 d. In case of gaps smaller than 14 d the data are backfilled (by copying the val- ues from the preceding cells) to ensure dataset continuity, as data with gaps cannot be used in the model as input. For validation, we utilised 5111 data points spanning a total of 14 measurement years. For all sites, only the available CH4 flux observations are compared with the model both for as- similation and validation; i.e. we do not use any gap-filled CH4 fluxes. It should be noted that for the assimilation sites Att, Bon, Sco, Uoa, and Zak, most of the data from winter months were missing, while for the validation sites Lgt and Wpt, most of the data from summer months were missing. See Appendix A for the details regarding data collection. 2.3 Data assimilation system To find an optimal posterior parameter set we used an adaptive Rao–Blackwellised Markov chain Monte Carlo Metropolis–Hastings (MCMC-MH) algorithm (Andrieu and Thoms, 2008) to iteratively reduce the model–observation misfit in terms of a so-called cost function (see Eq. 2) that compares the modelled observable with the field measure- ments. As mentioned in Sect. 1, details of this search algo- rithm and its application to a single-site optimisation have been described in Kallingal et al. (2024). Efficient sam- pling from the target distribution requires a proposal dis- tribution that correctly represents the dependence structure of the target; to avoid manual tuning of the proposal they used an adaptive MCMC to tune the proposal distribution, where the Rao–Blackwellisation improves the adaptation step. The tuning improves the MCMC convergence speed and avoids cases of incomplete convergence (Andrieu and Thoms, 2008), especially for a complex non-linear model like LPJ-GUESS. We assumed errors in observation and parameters in the form of Gaussian distributions yielding the cost https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4064 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Table 1. Parameters selected for the multi-site assimilation. Model prior values, prior standard deviation (std), units, and description of the parameters are given. Number Parameter Prior value Prior SD Unit Description 1. Rmoist 0.4 0.396 – Moisture response in acrotelm 2. CH4/CO2 0.085 0.236 – CH4 to CO2 ratio 3. foxid 0.5 0.36 – Litter CO2 fraction 4. φtiller 70 36 % Porosity of tiller 5. rtiller 0.0035 0.004 m Radius of tiller 6. fair 0 4 % Fraction of air in peat 7. poracro 0.98 0.06 – Acrotelm porosity 8. porcato 0.92 0.076 – Catotelm porosity 9. Rmoist-an* 0.025 0.04 – Moisture response in catotelm 10. λroot 25.17 12 cm Decay length of root biomass Figure 1. Location of measurement sites selected for the study. The 14 sites used for assimilation are indicated as circles and the 5 sites used for validation are indicated as stars. The numbers in the left and right side legends correspond to the numbers assigned to the sites in Tables 2 and 4, respectively. function J (x), J (x)= 1 2 n∑ i=1 (Yi −Mi(x)) tR−1 i (Yi −Mi(x)) + 1 2 (x− xp) tB−1(x− xp), (2) where Yi , Mi(x), and Ri are the CH4 flux observations, model simulations, and the covariance matrix of the obser- vation errors, respectively at the ith site, xp are the expected prior parameters, and B is the prior parameter error covari- ance matrix. Thus, the first term represents the model–data misfit weighted by the observation error covariances and the second term represents the prior information on the parame- ters weighted by the parameter error covariances. Samples are generated by drawing xprop from a proposal distribution and then either accepting the proposed state (xi = xprop) or keeping the current state (xi = xi−1) based on the posterior probabilities. The probability of accepting the proposed state (α) is generally computed as α =min ( 1, P (xprop) P (xi−1) ) . (3) Here, P(xprop) is the posterior probability of the proposed state and P(xi−1) is the posterior probability of the cur- rent state, both computed using the cost function, Eq. (2). The acceptance probability ensures a balanced exploration of the parameter space, accepting states that improve the fit while allowing occasional exploration of less favourable re- gions (see Andrieu and Thoms, 2008, for technical details and Kallingal et al., 2024, for implementation). For optimisation we used the GRaB-AM with a chain length of 100 000 iterations, where each iteration involves one complete model run for all 14 sites. As mentioned above, daily averages of flux observations collected from the above- mentioned 14 sites are assimilated simultaneously. For each site, only the actual CH4 flux observations; i.e. non-gap-filled data are used to calculate the cost function. Having multi- ple sites in this framework, one crucial challenge was scal- ing the cost function to maintain a balanced representation of each site’s contribution to the overall model–data misfit. This process is particularly relevant when sites exhibit vari- ations in the magnitude of their individual cost functions or when the number of observations at each site differs signif- icantly. Here, the scaling factors are carefully chosen to en- sure an approximately equal representation of all sites in the cost function, regardless of their individual characteristics, and to ensure that each site has an equal influence on the op- timisation outcome. Considering the difficulty of calculating error correlations in the flux observations, we only considered errors in indi- Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4065 Table 2. Site information and data references of the 14 natural wetland sites used for assimilation. MAT refers to the mean annual temperature and MAPr to the mean annual precipitation. For Bib, Deg, Lom, Los, Ole, Sco, Sii, and Uao, MAT and MAPr are extracted from the corresponding grid cells of the site locations of the WorldClim 2.0 gridded product (Fick and Hijmans, 2017) and for the rest of the sites the information is taken from their references. The table also includes type, coordinates, and climate zones of the wetlands, as well as the time period of data availability. No Site Abr. Type Location Climate MAT MAPr Period Reference Zone (° C) (mm) 1. Abisko Abi Bog 68.21° N, 19.03° E Arctic −0.7 304 2014–2018 Łakomiec et al. (2021) 2. Attawapiskat Att Fen 52.70° N, −83.95° E Boreal −1.3 700 2011–2020 Todd and Humphreys (2018) 3. Bibai Bib Bog 43.32° N, 141.81° E Temperate 6.7 1153 2015–2019 Ueyama et al. (2020) 4. Bonanza Creek Bon Bog 64.69° N, −148.32° E Boreal −0.9 331 2014–2017 Euskirchen and Edgar (2020) 5. Degerö Deg Fen 64.18° N, 19.55° E Boreal 1.2 523 2014–2019 Granberg et al. (2001) 6. Huetelmoor Hue Fen 54.21° N, 12.17° E Temperate 10.2 572 2011–2019 Koebsch and Jurasinski (2020) 7. Lompolojänkkä Lom Fen 68.00° N, 24.21° E Boreal −0.4 484 2006–2015 Lohila et al. (2020), Aurela et al. (2015) 8. Lost Creek Los Fen 46.08° N, −89.97° E Temperate 4.8 833 2014–2019 Desai and Thom (2020) 9. Olentangy Ole Marsh 40.02° N, −83.01° E Temperate 12.1 1120 2011–2016 Bohrer and Morin (2020) 10. Scotty Creek Sco Bog 61.30° N, −121.29° E Boreal −2.8 414 2014–2018 Sonnentag and Helbig (2020) 11. Siikaneva Sii Fen 61.83° N, 24.193° E Boreal 4.2 707 2005–2015 Rinne et al. (2018) 12. Uni. of Alaska Uoa Bog 64.86° N, −147.85° E Boreal −2.9 611 2011–2019 Iwata et al. (2020) 13. Zackenberg Zak Fen 74.30° N, −20.30° E Arctic −8.6 253 2006–2020 Scheller et al. (2021) 14. Zarnekow Zar Fen 53.87° N, 12.88° E Temperate 9.7 426 2014–2019 Sachs et al. (2020) Table 3. Data availability and threshold estimated for the base error values. The number of available flux observations from each site is also provided. No Site No. Available Threshold for Error below of data base error the threshold obs. (%) (g C m−2 d−1) (g C m−2 d−1) 1. Abi 1310 89.7 0.003 0.09 2. Att 1952 61.65 0.0012 0.036 3. Bib 815 60.1 0.01 0.3 4. Bon 560 60.0 0.01 0.3 5. Deg 1361 74.6 0.0057 0.15 6. Hue 2124 76.4 0.037 1.1 7. Lom 1682 51.3 0.011 0.32 8. Los 1472 83.0 0.0085 0.25 9. Ole 1135 74.6 0.0085 0.25 10. Sco 646 49.5 0.018 0.53 11. Sii 1547 44.1 0.01 0.3 12. Uao 1126 41 0.013 0.38 13. Zak 1294 26.67 0.007 0.21 14. Zar 1413 77.42 0.04 1.2 vidual observations; i.e. we did not consider off-diagonal el- ements in specifying the observational error covariance ma- trix R in the cost function (Eq. 2). Estimating the exact ob- servation error for each site is again challenging. Assigning a constant percentage error for all measured values could intro- duce a bias, as it would result in high error values for mea- surements with high magnitudes and very low error values for observations with small magnitudes. To overcome this challenge, we followed the procedure introduced by Knorr and Kattge (2005) for the case of assimilating CO2 eddy- covariance observations and assign a threshold value set at 5 % of the variance of the distributions of observations, cal- culated separately for each site. Values below this thresh- old are identified and a uniform error is assigned to them (see Table 3). An error of 30 % is estimated for the observa- tions greater than the threshold values. For the matrix B in the Eq. (2), for each parameter a standard deviation of 40 % of their possible range is assumed based on the expertise of LPJ-GUESS modellers. 2.4 Posterior estimation After completing a full run of the MCMC chain, the posterior parameter error covariance matrix (Bp) is estimated from the prior error covariance matrices of the observations, R, and of the parameters, B, and the linearisation of the model at the minimum of the cost function, J (M∞), as described in Tarantola (1987): Bp = [M t ∞R −1M∞+B −1 ] −1. (4) Bp is then used to estimate the level of optimisation of each parameter and the sensitivity of the cost function to them. The posterior parameter uncertainties have been estimated from the square root of the diagonal elements of Bp. Large absolute values of posterior error correlations indicate that the observations do not provide independent information to distinguish the effects of a given parameter pair (Tarantola, 1987). From the 100 000 samples yielded by the GRaB-AM framework 75 % of this chain was discarded as burn-in. The remaining part of the chain, which we consider con- verged to its stationary distribution, was used for calculating the maximum a posteriori estimation (MAP) and posterior mean estimations. The posterior distributions of parameters are classified as “well-constrained”, “poorly constrained”, and “edge-hitting” parameters. The well-constrained param- https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4066 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data eters are characterised by a clearly defined unimodal distribu- tion with a low standard deviation. Conversely, poorly con- strained parameters exhibit a relatively flat multimodal dis- tribution with a large standard deviation. For a more precise estimation, we classified posterior parameter distributions as poorly constrained if the standard deviation exceeded 20 % of the total range. Edge-hitting parameters cluster near one of the edges of their prior range, as described by Kallingal et al. (2024) and Braswell et al. (2005). 2.5 Experimental setup The model was spun up for 500 years using Climate Research Unit (CRU) meteorological forcing data (Harris et al., 2014), which includes daily measurements of air temperature, pre- cipitation, and incoming shortwave radiation, to bring the model’s state variables, i.e. the various carbon pools, to ini- tial equilibrium. After spinning up, the model was run for the 14 assimilation sites using local, daily meteorological forc- ing data collected directly at the sites. We have bias-corrected the CRU data for the grid cells in which the sites are lo- cated to enforce agreement with monthly mean values of the site-specific meteorological data. For this we have used at least 2 years of meteorological data that are recorded prior to the time period of the site flux observations that we used for the assimilation. Atmospheric CO2 concentration, as de- scribed in McGuire et al. (2001) and updated until recent years using data from the NOAA Global Monitoring Labo- ratory (NOAA-GML, https://gml.noaa.gov/ccgg/trends, last access: 29 April 2025) is used as the CO2 concentration in- put to the model. Most of the CH4 flux observations at the sites were avail- able with a half-hourly resolution, but for the assimilation we used daily mean values corresponding to the LPJ-GUESS temporal resolution. Also, using daily values reduces the complexity of error correlations of half-hourly data and is better suited for a broad range timescale assimilation (Lass- lop et al., 2008). Days with less than 50 % of half-hourly CH4 data availability were removed from the assimilation. To test whether the resulting posterior parameters can enhance model performance for other individual wetlands within the study area, we designed a validation experiment that uses ad- ditional flux observations not included in the assimilation ex- periment. Independent flux observations from five different wetlands located in various parts of the study area were used for the experiment (see Sect. 2.2 and Table 4). Additionally, we conducted a mean annual budget estima- tion for all wetlands above 45° N and for all wetlands above 60° N using the optimised parameters. The model was run using the peat fraction map PEATMAP (Xu et al., 2018) and daily gridded climate data from the Climatic Research Unit-Japanese Reanalysis (CRU-JRA) (Harris et al., 2020) as model inputs. CRU-JRA data can be accessed at https://rda. ucar.edu/datasets/ds628.0/ (last access: 29 April 2025). The results were then compared with the output of the JSBACH- HIMMELI model, as described in Zhang et al. (2023) (here- after referred to as JSBACH-H), for regions above 45° N (for easier comparison with other literature) as well as with the estimates from the Global CH4 Budget of the Global Carbon Project (GCP) (Saunois et al., 2025) for regions above 60° N. 3 Results 3.1 Posterior parameter distributions The PDFs of the parameters from the MCMC chains after the burn-in are displayed in Fig. 2. All parameters, except for φtiller, fair, and poracro, are well-constrained with only one peak in the PDF. However, the parameters φtiller, fair, and poracro are rather poorly constrained, exhibiting some clus- tering around multiple peaks in the PDF and having large standard deviations. The Rmoist, fair, and λroot showed high skewness and kurtosis. However, the smaller kurtosis values of foxid, porcato, and Rmoistan , along with their low skewness, indicate that they closely resemble Gaussian distributions. The remaining four parameters showed low skewness, sug- gesting agreement with a Gaussian distribution, but with very low kurtosis indicating somewhat flatter distributions than a Gaussian distribution. The figure shows that, except for λroot, none of the parameters exhibited edge-hitting behaviour, in- dicating that the hypothetical boundaries assigned for each parameter align well with the model structure. The param- eter λroot finds its solution nearly at the lower edge. Most parameters displayed posterior solutions far from their prior values, i.e. the prior values where outside the posterior mean estimate±1σ , except for φtiller. For φtiller, both the MAP and posterior mean appeared close to the prior value, i.e. within 1σ of the posterior mean estimate. 3.2 Posterior parameter estimates and cross-correlation The posterior parameter values estimated from the MAP and the posterior mean estimate, along with their standard devi- ations, are presented in Table C1 and Fig. 2. The MAP and posterior mean estimates of all parameters remained close to each other, except for φtiller, rtiller, and fair. The param- eters λroot and φtiller exhibited high standard deviations in their posterior distributions, indicating greater uncertainty. In contrast, the parameters CH4/CO2, rtiller, porcato, and rmost showed low standard deviations, suggesting more precise posterior estimates. A cross-correlation plot (see Fig. 3) shows the correlation between all 10 parameter pairs after optimisation to exam- ine potential optimisation issues due to parameter correla- tion. High positive or negative correlations suggest that these parameters may convey similar information and one of them might be redundant in further studies. The results show that not many parameters have strong positive or negative cor- relations, except for the correlation between CH4/CO2 and Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 https://gml.noaa.gov/ccgg/trends https://rda.ucar.edu/datasets/ds628.0/ https://rda.ucar.edu/datasets/ds628.0/ J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4067 Table 4. Site information and data references of five natural wetland sites used for validation. MAT refers to the mean annual temperature and MAPr to the mean annual precipitation collected from their references. The table also includes the time period of data collected, the availability of data and the type and climate zones of the wetlands. No Site Abr. Type Location Climate MAT MAPr Period No. of Available Reference zone (°C) (mm) obs. data (%) 1. Chersky Che Wet tundra 68.61° N, 161.35° E Arctic −9.8 200 2014–2017 923 84 Merbold et al. (2020) 2. La Guette Lgt Fen 47.32° N, 2.28° E Temperate 11.07 650 2017–2019 227 31 Jacotot et al. (2020) 3. Mycklemossen Myk Bog 58.36° N, 12.16° E Hemi-boreal 6.9 802 2016–2019 1095 100 White et al. (2023) 4. Schechenfilz N. Sfn Bog 47.80° N, 11.32° E Temperate 8.28 700 2012–2015 700 64 Schmid and Klatt (2020) 5. Winous Point N. Wpt Marsh 41.48° N, −82.99° E Temperate 11.4 900 2011–2014 477 44 Chen and Chu (2020) Figure 2. Probability distribution functions (PDFs) of the posterior obtained after the GRaB-AM experiment. The blue curves are smoothed Gaussian kernel estimates on the posterior histograms, while the black curves represent the prior distributions. The dotted vertical lines in green, orange, and black correspond to the posterior mean, MAP, and prior means, respectively. The shaded grey area in the distributions represents the 1σ error estimate of the PDFs. Skewness and kurtosis values for each posterior distribution are provided in insets. Rmoistan , which has a high negative correlation of −0.82. Many slight positive correlations are observed, with a few pairs like poracro and porcato, porcato and fair having compar- atively higher values. A detailed discussion of the parameter correlation and their possible impacts on the posterior esti- mates is given in Sect. 4.1. 3.3 Performance of the optimisation The site-wise data–model misfit, presented in terms of RMSE both before and after optimisation, along with the av- erage RMSE for all sites combined is presented in Fig. 4. Most sites demonstrate a substantial reduction in RMSE, with many achieving over 50 % improvement. The un- weighted cost function values for each of the 14 sites indi- vidually and their collective sum, along with the correspond- ing RMSE and reduced χ2 values, are presented in Table C2 in the appendix. The estimated prior cost function value was 1 763 294.9. After optimisation the value changed to 79 296.4 (around 5 % of the prior) for the posterior mean estimate. The total χ2 value is 8.6, accounting for measurement and param- eter uncertainty. Notably, none of the sites exhibit overconfi- dence, with individual χ2 values larger than 1, except for the site Hue, which shows a χ2 value slightly below but close to 1. The median RMSE reduction is 52.2 % and the χ2 value is 6.8. Taking these values as thresholds, the sites Abi, Att, Sii, and Lom show significant reductions in RMSE but with high χ2 values. In contrast, sites such as Bib, Deg, Hue, Sco, Zak, and Zar exhibit low RMSE reductions but low χ2 values. The sites Bon, Los, Ole, and Uoa display both high RMSE reduc- tions and low χ2 values. None of the sites are characterised by low RMSE reduction and a high χ2 value. 3.4 Impact of the optimisation The observed CH4 flux data available from all 14 sites col- lectively provide a total of 888.8 g C m−2. The prior sum es- timate for the corresponding data was 1957 g C m−2. This value reduced to 850.9 g C m−2 after optimisation. A reduc- tion of approximately 56.52 % from the prior to the pos- terior CH4 flux was observed. Compared to the observa- tion, the posterior resulted only a slight underestimation of −38.1 g C m−2. Time series of the annual sums of CH4 emis- sions at 4 of the 14 sites used in this study are shown in Fig. 6a. The time series of the remaining sites are shown in Appendix C3, Fig. C3 for convenience. All four sites shown in Fig. 6a exhibited a better fit to the annual budget after the optimisation. Particularly noteworthy is the site Bib, which https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4068 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Figure 3. Posterior correlations between parameters derived from the GRaB-AM optimisation. In the upper triangle of the figure, negative correlations are depicted in blue and positive correlations are shown in red. The numerical labels on the upper triangle correspond to values of Pearson’s correlation coefficient. The diagonal panels exhibit 1-D histograms for each model parameter. The lower triangle displays two- dimensional marginal distributions for each parameter. The grey dots on the marginal distributions represent the parameter values obtained from the posterior GRaB-AM chain. The ranges of the distributions are labelled on the left and bottom of the figure. Figure 4. Prior and posterior Root Mean Square Error (RMSE) estimates are provided for each of the 14 sites individually, along with the combined average values. In the figure, purple, cyan, and grey bars represent the RMSE corresponding to the prior estimate, maximum a posteriori (MAP) estimate and posterior mean estimate, respectively. Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4069 had a prior estimation in 2016 significantly deviating from the observed values. This discrepancy was corrected in the posterior estimate and the annual posterior CH4 emissions at all four sites align well with the flux observations after the optimisation. Figure C3 shows that the sites Att, Los, Uoa, and Zar have shown improvement in all the years assimilated. Deg improved in more than half of the years assimilated. Hue improved in 2013 but did not improve in the other years. Bon improved in 2 out of 3 years but was slightly overestimated in 2017. For Zak, although the prior and posterior merged in some years, it showed improvement in almost all years. Lom showed improvement in 2010, 2012, 2013, and 2014. Sco showed improvement in 2016 and 2017, which accounts for half of the total assimilated years. Figure 6b displays the mean annual sums of CH4 estimated at all 14 sites. The fig- ure illustrates that the highest contribution of CH4 came from the sites Hue and Zar, which have the highest mean annual temperature (MAT) as compared to other sites. The lowest contributions are from Zak and Abi, which have below-zero MAT. Abi, Att, Los, Ole, Sco, Sii, Uoa, Zak, and Zar showed improvement in the mean annual budgets after the optimisa- tion. The remaining sites did not show an improvement. Table 5 presents the total uncertainty for each site and the total uncertainty estimated for all sites together (see Sect. B in the appendix for technical details regarding the uncer- tainty estimation). The total uncertainty of the posterior CH4 flux estimates for all the sites together was 0.19 g C m−2 d−1, whereas for the prior fluxes it was 0.36 g C m−2 d−1. This re- sults in a reduction of the total uncertainty of around 50 % after the optimisation. Comparing the prior and posterior RMSE (Fig. 4) and the uncertainty reduction, it can be con- cluded that the more constrained sites, such as Abi, Att, Bon, and Uoa, exhibited high uncertainty reduction. Notably, Abi and Att, which had the highest prior RMSE, showed a reduc- tion of uncertainty of around 95 % and 82 %, respectively. A low reduction in uncertainty was mainly observed in sites that demonstrated a low reduction in RMSE. In contradiction to this, even though the RMSE reduction observed in the case of Zak is very small, this site showed an uncertainty reduc- tion of around 33 %. Figure 5 illustrates the time series of observed, prior, and posterior fluxes for all 14 sites considered in the assimilation. The posterior model continued to overestimate emissions at the Abi, Att, Los, Ole, Sii, and Uoa sites. For Bib (except for 2015), Deg, Hue, and Zak emissions were consistently un- derestimated. The remaining four sites demonstrated reason- ably good agreement of the posterior estimates with the flux observations. The model adeptly captures the seasonal cycles of CH4 emissions from all wetlands, both for flux estimates using the prior and the posterior parameter values. Generally, no significant phase shift is observed in either the prior or the posterior estimates. However, for Bib, both prior and poste- rior estimates exhibit a slight phase shift to early summer, while the sites Hue, Los and Zar show a similar shift to late summer. Summer and winter anomalies Figure 7 illustrates the mean annual summer and winter emissions for all sites and their corresponding standard devi- ations. Figure C2 in the appendix shows their “summer” and “winter” anomalies. It should be noted that the winter mean and anomaly estimation for the sites Att, Bon, Sco, Uoa, and Zak was conducted with only a very limited number of avail- able data points, as most of them were missing. Conversely, it should be taken into account that proper winter measure- ments were not carried out at these sites, given the almost negligible emission estimates during the winter months due to their extremely cold temperatures. For all these sites, the mean annual temperature (MAT) was estimated to be below zero (refer to Table 2 and the corresponding site references). After the optimisation, both mean annual summer and winter emission estimates for sites Abi, Att, Los, Sco, Sii, and Zar exhibited improved agreement with the flux observa- tions. The sites Uoa and Bib showed improvement in summer but not in winter, whereas the sites Bon, Hue, Lom, and Ole improved in winter but not in summer. The sites Zak and Deg did not show any significant improvement in either season. For the site Deg, the prior estimates were closer to the flux observations than the posterior and for the site Zak no signifi- cant changes were observed. Summer emissions in Zak were underestimated by both the prior and posterior models. This may be attributed to the relatively low (below zero) mean an- nual temperature (MAT) of−8.6 at this location, variation of summer months with latitude, and the seasonality of decom- position. In contrast, for the sites Abi and Att with negative MAT, the model tends to overestimate summer emissions but the observed average summer emissions were comparatively lower. However, despite a MAT of−2.9 °C at the site Sco and −2.6 °C at the site Uoa, both these sites demonstrated rela- tively high summer emissions, which were better captured by the model using the posterior parameter values. Sites with a higher MAT, such as Hue, Bib, and Zar, exhibited the highest summer emission values. Although, the site Ole, which has the highest MAT of 12.1 °C, displayed comparatively lower summer emissions. This difference could be influenced by the substantial MAPr of 1120 mm at the Ole site. For most of the sites, the observed winter annual mean was very close to zero, except for Bib, Deg, Hue, Ole, and Zar. For Hue, Ole, and Zar, the posterior estimate showed better performance in capturing the seasonal trends in flux observations than the prior. The site Hue exhibited high win- ter emissions. When the air temperature input for this site was estimated, it showed a mean value of 3.8 °C in winter months. Overall, although the majority of sites showed im- proved estimation of winter and summer emissions, some of the sites remained unchanged or did not perform better than before. The estimation of the standard deviation for summer and winter months showed a reduction for all sites after opti- misation. https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4070 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Figure 5. The CH4 simulation from the LPJ-GUESS model from 14 different wetland sites (purple dots) after optimising with the GRaB-AM algorithm. The black dots are the real CH4 flux observations from corresponding wetlands. The teal dots are the prior simulation with the prior model parameters used to start the MCMC chain; 3 d running averages are calculated from the original time series and from most of the figure a few outliers on the vertical axis have been removed for better visualisation. Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4071 Table 5. Prior and posterior total uncertainty estimates (σ (g C m−2 d−1)). Total parameter and model uncertainty estimates separately for each site and collectively for all sites combined are shown. Uncertainty is abbreviated to “unc.”. Site Prior unc. Posterior unc. Site Prior unc. Posterior unc. Abi 0.18 0.01 Los 0.04 0.03 Att 0.11 0.02 Ole 0.02 0.019 Bib 0.08 0.04 Sco 0.10 0.02 Bon 0.15 0.02 Sii 0.05 0.04 Deg 0.03 0.02 Uaf 0.08 0.03 Hue 0.13 0.12 Zak 0.03 0.02 Lom 0.03 0.027 Zar 0.11 0.10 Total unc. 0.36 0.19 Figure 6. Panel (a) displays the annual sum estimation of CH4 from 4 out of 14 sites used in the study. The sites are represented in different colours with distinct markings to distinguish between observation, prior, and posterior. Panel (b) presents the mean annual sums of CH4 estimated for all 14 sites used in this study as bar charts. The averages of the flux observations are calculated with only the available daily averages used for the assimilation. 3.5 Validation of optimised parameters and large scale simulation Table 6 and Fig. 8 show the results of the validation experi- ment explained in Sect. 2.5. A collective reduction of 58.8 % in RMSE was observed across all the validation sites. The RMSE decreased from 0.4 in the prior simulation to 0.16 in the posterior simulation. The total sum of flux observations for all five sites was 319.4 g C m−2. Corresponding posterior simulation resulted in a total sum of 421.3 g C m−2, com- pared to 521.9 g C m−2 in the prior simulation. This repre- sents a 19.3 % decrease in the posterior estimate compared to the prior. Similarly, the prior σ was 0.61, which was re- duced to 0.17, representing a 72.13 % reduction. The posterior estimates for four of the five sites showed improved RMSE (Fig. 8). For the sites Sfn, Myk, and Che, reductions of 81.1 %, 59.8 %, and 31.2 % were observed, re- spectively. The RMSE improvement for the site Lgt was neg- ligible, with a value of 0.6 %. The site Wpt, a temperate marsh (note that marshes were not included in the sites used for assimilation), exhibited a very slight increase in RMSE. The total prior model–data mismatch of CH4 estimated at this site during the time period was 72.5 g C m−2, which in- creased to 78.98 g C m−2 after optimisation. Including Wpt, the posterior estimates of all these sites appeared improved in terms of σ reduction. Wet tundra was not used for assim- ilation; however, the site Che, a wet tundra used for valida- tion, demonstrated a remarkable 31.2 % reduction in RMSE, https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4072 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Figure 7. Mean annual summer (a) and winter (b) emissions for all sites, along with their corresponding standard deviations. The dots represents mean seasonal values and error lines indicates their 1 standard deviation. Table 6. RMSE reduction and the changes in the total emission es- timate (in %) for the validation sites, along with their prior and pos- terior uncertainty (σ ) estimates. Site RMSE Change in Total Total Reduction Total Estimation Prior Posterior σ (%) (%) σ g C m−2 d−1 Che 31.2 −21.5 0.29 0.19 Lgt 0.6 4.9 0.10 0.10 Myk 59.8 43.6 0.17 0.09 Sfn 81.1 38.5 0.77 0.16 Wpt −0.23 31.8 0.2 0.18 Total 58.8 19.3 0.61 0.17 along with a significant reduction in total posterior σ , from 0.29 to 0.19. For all the sites except Che, the prior was over- estimating the posterior flux. For Che, there was an increase of 21.5 % in the posterior estimate. Figures 9 and 10 show the results of large scale simula- tions of CH4 emissions from wetlands located above 45° N. Figure 9 includes prior and posterior simulations and their differences, as well as the fluxes from JSBACH-H simula- tions for the same domain. The maps in the top row indicate that both the prior and posterior simulations identify hotspots of emission in North America, Canada, Russia, and Europe, though the total posterior estimate is clearly smaller com- pared to the prior. However, examining the differences of prior and posterior emission illustrated in the bottom left map reveals that the smaller posterior emissions are not system- atic over the whole simulation domain. Essentially the mag- nitude of the emissions at the hotspot locations is reduced in the posterior, whereas surrounding areas of low emissions Figure 8. Scatterplot comparing prior and posterior RMSE values for each site. Each point represents one of the study sites. The 1 : 1 dashed line indicates no change in RMSE. Points below the line indicate improved model performance after assimilation. in the prior are slightly enhanced in the posterior. The bot- tom right map displays the CH4 simulation from JSBACH-H, with a resolution 1.875°× 1.875°, for comparison. It is evident that both LPJ-GUESS and JSBACH-H exhibit similar patterns in high and low emissions. The total mean emission over 2010–2020 was 32.32 Tg C yr−1 for the LPJ- GUESS prior simulation and 28.16 Tg C yr−1 for the pos- terior simulation. For JSBACH-H, the mean emission over Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4073 2010–2016 was 15.66 Tg C yr−1. The posterior simulation of LPJ-GUESS is closer to the result from JSBACH-H but still much larger. When comparing CH4 emissions from wet- lands above 60° N, the agreement between the posterior LPJ- GUESS estimate (7.46 Tg C yr−1) and the JSBACH-H esti- mate (7.62 Tg C yr−1, mean over 2010–2016) is much closer. However, the LPJ-GUESS prior estimate (8.69 Tg C yr−1) aligns better with the Global CH4 Budget estimate from the GCP (9.0 Tg C yr−1, mean over 2010–2019). 4 Discussion In this study we have conducted detailed examination of the posterior results to assess parameter distributions, un- certainties, error correlations, changes in flux components, and model–observation fit as well as the changes in summer and winter anomalies. The well-constrained seven parame- ters discussed in Sect. 3.1 with single-peaked PDFs indicate strong convergence and effective estimation, suggesting the model is sensitive to the data for these parameters. How- ever, the poorly constrained parameters with multiple peaks highlight areas where the model might need refinement or where additional types of observations could improve param- eter estimation. The ranges of posterior parameter distribu- tions indicate a successful search within the permitted pa- rameter ranges (Fig. 2). The non-Gaussian behaviour of the parameters φtiller, fair, and poracro described in Sect. 3.1 can still be considered acceptable, as the posterior distribution is expected to be perfectly Gaussian only under the assump- tion of a linear model. Beyond the inherent nonlinearity of LPJ-GUESS, this behaviour also suggest that the assimilated fluxes provide more complex information about the wetland processes than initially assumed or that there are complex interactions between the model parameters. This complexity may warrant further investigation into the model structure. It also indicates that the model could benefit from account- ing for errors that place less weight on extreme observations, allowing for a smoother response curve from the model. 4.1 Parameter correlations and their implications Rmoist and Rmoistan : these parameters are related to the mois- ture response in the acrotelm and catotelm, respectively. A smaller posterior value of Rmoist and Rmoistan would result in a slower soil carbon turnover time in the posterior model, leading to slightly less carbon available for CH4 production. Although this could decrease the total decomposed carbon in the soil, the strong negative correlation between Rmoistan and CH4/CO2, and the weak negative correlation between Rmoist and CH4/CO2, indicate that the decrease in these pa- rameters has influenced the increase in the CH4 fraction from this reduced amount of decomposed soil fraction. This indi- cates that other factors such as water table depth, availability of oxygen, soil temperature, etc., might have influenced the CH4 production. CH4/CO2: this parameter represents the ratio of CH4 to CO2. The parameter was slightly increased to a value of 0.14 from 0.085 following optimisation. This indicates a com- paratively higher CH4 emission fraction from the total de- composed carbon. All the parameters except for Rmoist and Rmoistan discussed above showed weak positive or neutral correlations with CH4/CO2. The weak positive correlations observed between most parameters and the CH4/CO2 ra- tio suggest that, while these factors may not be the primary drivers of CH4 production, they still contribute to the in- creases in the ratio. foxid: the fraction of oxidised CH4, utilising available oxy- gen in the soil, is represented by the parameter foxid. The posterior parameter value (0.76) is increased compared to the prior (0.5). This indicates that a substantial fraction of the produced CH4 will get oxidised, while the remaining CH4 (24 %) will get transported to the atmosphere. This parame- ter has shown very low correlation with the other parameters and zero correlation with rtiller and porcato. φtiller and rtiller: the posterior parameter values estimated for φtiller and rtiller are higher than the prior values, 0.77 and 0.0081, respectively. With aerenchyma tissues having more porous space and a larger radius, the plant-mediated transport of CH4 to the atmosphere is facilitated. However, through the same spacious aerenchyma tissues, plants also have the potential to transport more O2 to the soil. This potential in- crease in the transport of O2 to the soil could be a reason for increase in foxid, considering the slight positive correlation observed between foxid and φtiller. fair, poracro, and porcato: these three parameters are related to soil composition. The posterior values of fair increased compared to the model prior indicate a higher fraction of air in the soil. The decrease in poracro and porcato indicates decreased porosity in the acrotelm (which can contain both water or air) and catotelm (which can contain only water) re- spectively. fair and poracro are positively correlated, indicat- ing more air in a more compact acrotelm environment with less water. A higher amount of air in the acrotelm can have a positive effect on diffusion. In the model, the diffusivity of CH4 in air is estimated to be four orders of magnitude larger than in water. On the other hand, lower porosity in the catotelm can reduce ebullition due to the limited porous space in the soil, which holds less water. This decreases the capacity to retain excess CH4 and release it when the solu- bility threshold is reached. (see Kallingal et al., 2024; Wania et al., 2010, for details). λroot: λroot played a crucial role in this optimisation. After optimisation, this parameter got a significantly lower value of 10.25 cm compared to the prior. It seems that the optimi- sation, when generally trying to reduce the emission from the model, has a tendency to reduce the decay length of root biomass in the soil. The posterior parameter value closely aligns with the values reported in Kallingal et al. (2024) https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4074 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Figure 9. Total mean annual CH4 estimates above 45° N. (a) Prior CH4 emissions from LPJ-GUESS averaged from 2010 to 2020 with a spatial resolution of 0.5°× 0.5°. (b) Posterior CH4 emissions from LPJ-GUESS averaged from 2010 to 2020 with a spatial resolution of 0.5°× 0.5°. (c) Difference between the prior and posterior emissions from LPJ-GUESS. (d) CH4 emissions from JSBACH-H averaged from 2010 to 2016 with a spatial resolution of 1.875°× 1.875°. (10.47 cm). The optimisation results indicate a much shal- lower soil profile for the majority of root decay activities and CH4. Given that most of the peat decomposition activities are assumed to occur in acrotelm, the reduction in the magnitude of λroot could substantially facilitate diffusion. λroot has not shown any strong correlation with any of the other param- eters, but its weak negative correlation with rmoist, rmoistan, foxid, and fair suggests that soils with shallow root depth exhibit increased moisture response and air fraction, with higher oxygen availability. Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4075 4.2 Posterior model estimates Overall, the optimisation successfully reduced the model– data misfits, with some variation in the degree of improve- ment among different sites. This improvement is evident from the substantial magnitude change in the total cost func- tion, RMSE and uncertainty from the prior to the posterior, and the estimated χ2 values (see Tables C1 and C2, Fig. 4, and Table 5). It should be noted that this successful improve- ment occurred even when assimilating data from multiple sites with diverse climatic conditions. The approximately 95 % improvement in the fit, as mea- sured by the cost function, demonstrates the effectiveness of the GRaB-AM algorithm in sampling from high-probability regions. In general, such a large reduction in the magnitude of the cost function value can be interpreted as the model becoming overly tuned to the specific dataset, which risks a loss of generalisability. However, the high magnitude of the posterior cost, 79 296.4, along with the high chi-square value of 8.6, indicates a slight underfitting. A portion of this underfitting in posterior estimate can be attributed to the in- ability of GRaB-AM to capture the collective variability in such a large and diverse EC flux observations (see the Fig. 5). GRaB-AM assumes normally distributed fluxes, but in real- ity, observed fluxes often deviate from normality, exhibiting outliers, skewness, and heavy tails. However, the simplifi- cation to normality is a direct result of applying the com- monly used quadratic loss function in data assimilation, and aids tractability of the MCMC-implementation. Introducing more complex error distributions would require estimation of the parameters of those distributions, significantly increasing both the number of parameters to estimate and the computa- tional expense. Such an expansion, while possible and inter- esting future work, is outside the scope of this study. Although LPJ-GUESS is a well-developed model, there are still several processes affecting the calculation of CH4 emissions that are still not fully included in the model. For example, the assumption of zero wind speed above wetlands, the lack of detailed representation of ebullition, and the in- adequate representation of wintertime emissions are some of the process in need of improvement (Kallingal et al., 2024). Another main reason for this underfitting could be the vari- ability in the model input data. It has been observed that the model is highly sensitive to precipitation and temperature in- puts. Temperature alone can explain a large proportion of the variation in the CH4 simulations (Aalto et al., 2025), which lies beyond the constraints of GRaB-AM. And even though we used the meteorological measurements at the sites, there may still be biases in the representativeness of these mea- surements compared to the flux footprint. Examining the sites individually, the four sites with a sig- nificant reduction in RMSE and a high χ2 value imply en- hanced accuracy in the model predictions but indicate sim- ilar underfitting as discussed before. More attention should be given to these sites, as there may be potential for fur- ther improvement in capturing the variability of the assim- ilated data. On the other hand, the seven sites showing the smallest RMSE reduction and low χ2 values indicate less model improvement, but the model captures the variability in the assimilated data. The three sites with a high reduc- tion in RMSE and low χ2 values indicate good performance in model prediction and in capturing the variability. No sites have appeared with low RMSE reduction and high χ2 val- ues, which indicates that the predictions after optimisation have improved in at least one way or the other. No system- atic trends are observed in these matrices, which indicates that the reason a few sites show considerable improvement in both RMSE and χ2 is more due to the external factors such as model structure and assimilated or input data rather than the assimilation framework. Even though no correlations are observed between the types of wetlands and their locations, we note that the sites that showed a considerable reduction in RMSE, such as Abi, Att, Bon, Deg, Lom, Los, Sii, Uoa, and Zar, are those that are boreal or arctic in nature, miss- ing only Sco and Zak. Except for Los, all sites located in the temperate region, namely Hue, Ole, Bib, and Zar, showed comparatively lower reductions in RMSE (see Table 2 and Fig. 4). Compared to the total assimilated flux observations, the posterior estimate showed a slight underestimation of −38.1 g C m−2 (see Sect. 3.4). The assimilated sites did not exhibit any consistent patterns of over or underestimation (Fig. 5). Testing the GRaB-AM algorithm at the Siikaneva site, Kallingal et al. (2024) observed systematic underesti- mation in LPJ-GUESS posterior over many years. Employ- ing multiple sites with varying climatic variability has proven beneficial in resolving this issue, as sites like Bib, Deg, Sco, etc., exhibit both overestimation and underestimation in con- secutive years. On the other hand, Kallingal et al. (2024) observed a considerable reduction in RMSE for Siikaneva in their single-site experiment, whereas in this study, being one among the 14 sites, the optimisation for Siikaneva was comparatively limited. This is rather expected in a multi-site setup, as the model must balance variability across multiple flux observations, which can degrade the fit at any individual site in favour of overall performance and generalisability. Emission of CH4 from wetlands can vary significantly across latitudes, largely influenced by the growing season and climatic conditions. When assimilating data from a sin- gle site into the LPJ-GUESS model, Kallingal et al. (2024) encountered difficulties in accurately capturing winter-time emissions. We faced a similar issue, where the model emitted zero CH4 when temperatures dropped below freezing point. However, observational data showed that wetlands continue to emit CH4 even under sub-zero conditions (Ito et al., 2023; Treat et al., 2018; Aalto et al., 2025). This discrepancy high- lights a notable limitation of the LPJ-GUESS model: its high sensitivity to temperature inputs for CH4 emissions and mi- crobial CH4 production and consumption being strongly in- hibited in cold and frozen soils. The study by Ito et al. https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4076 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data (2023), in which they compared cold-season CH4 fluxes sim- ulated by 16 models, shows that many similar models ex- hibit the issue with underestimating wintertime emissions. The same was observed in the study by Aalto et al. (2025), in which six models were compared. In both studies, LPJ- GUESS was a participant. One of the main observations by Ito et al. (2023) is that LPJ-GUESS is one of the models that discretely suppressed emissions under sub-zero temper- atures and set a clear temperature threshold for CH4 produc- tion. This could limit the GRaB-AM framework from ad- justing the model’s wintertime emissions and capturing the overall variability. Consequently, the algorithm compensates for the winter model–data mismatch by adjusting summer values. One potential model modification could involve in- corporating mechanisms to simulate microbial activity un- der frozen conditions, snowpack insulation, and detailed rep- resentation of soil temperature dynamics, allowing for CH4 emissions even when surface temperatures drop below zero. It has been observed that models representing detailed mi- crobial activities and soil temperature profiles, such as the WETMETH model (Nzotungicimpaye et al., 2021), could simulate wintertime emissions on a comparable scale. A fu- ture study should involve using the GRaB-AM framework on such a somewhat simpler model to compare and verify its performance. 4.3 Validation of the posterior model and budget estimation of northern wetlands Through selecting 14 different wetlands as described in Sect. 2.2, we aimed to equip the optimised parameters with the capability to accurately represent different types of wet- lands, irrespective of their specific climatic and geographical features. Our validation analysis suggests that the optimised parameters achieved the goal to a large extent (see Sect. 3.5). In general, the optimised parameters perform better in repre- senting different types of wetlands, especially bogs. A mis- match of 102 g C m−2 is observed between the total flux ob- servation and the total posterior model estimate, which is a large value compared to the result from the optimisation. Given this observation and the less constrained nature of sites like Wpt and Lgt, which are temperate, future studies might consider the necessity of different parameter sets for different wetland types. This is evident from the study of Treat et al. (2018), which shows that the behaviour of wetlands can be completely different based on their type and location, espe- cially during the non-growing season. Here it is worth noting, as mentioned in Sect. 4.2, that the majority of temperate sites used for optimisation also exhibited lower fit improvement in terms of cost function. There is a significant shortage of reliable, long-term CH4 flux observations from wetlands, which is a major chal- lenge for studies like ours. Many of the wetland sites used in this analysis only provided measurements during summer months, or contained substantial data gaps. This limitation makes it difficult for any assimilation algorithm to fully cap- ture the natural variability in the data and could potentially lead to biased misinterpretations. Having used all the main sites with long-term measurements for assimilation, it was challenging to identify additional measurement sites for val- idation. Both the assimilation and validation experiments showed a reduction in the posterior compared to the prior. The large scale simulation above 45° N also showed a similar tendency, with around a 12.87 % reduction in the posterior compared to the prior (see Figs. 9 and 10 and Sect. 3.5). We compared our results against CH4 wetland emissions from JSBACH- H. This comparison primarily indicates that the change in the LPJ-GUESS simulated emissions after optimisation is in the direction towards better agreement with JSBACH-H fluxes, even though the comparison with JSBACH-H still suggests that LPJ-GUESS overestimates emissions. How- ever, other studies, such as the one by Nzotungicimpaye et al. (2021), reported 24.825 Tg C yr−1 from wetlands north of 45° N, which aligns much better with the posterior LPJ- GUESS estimate. Also, the study by Aalto et al. (2025) shows that the 16 wetland models they used simulated annual CH4 emissions from wetlands of 45–90° N during 2000– 2020 as 20.1 Tg C yr−1. We also compared the resulting pos- terior to the observations quantified for the same domain. The gridded data product of northern wetland CH4 emis- sions, based on upscaling EC flux observations estimated by Peltola et al. (2019), resulted in an average value of 25.98 Tg C yr−1 (23.77–28.2 Tg C yr−1). The results indicate that the LPJ-GUESS posterior is close to their quantified ob- servation. However, the large uncertainty in various estimates of wetland CH4 emissions makes it difficult to compare the results with other studies. For example, as mentioned in Sect. 3.5, when compared with the GCP estimate of wetland CH4 emissions above 60° N, the prior emissions from LPJ- GUESS show better agreement. Nevertheless, the GCP esti- mate also reports a wide range, from 4.5 to 13.5 Tg C yr−1, which would also cover the LPJ-GUESS posterior. 4.4 Possibilities and limitations of GRaB-AM The GRaB-AM algorithm incorporates the adaptation mech- anism with Rao–Blackwellisation, which recursively updates the covariance of the proposal distribution to capture the de- pendence among different parameters. Such a recursive up- date of the proposal distribution can improve the efficiency of MCMC by allowing the search algorithm to take larger steps in the parameter space, while still accounting for parameter inter-dependencies. This is particularly important for high- dimensional and correlated parameter spaces. In the optimi- sation process laid out here, with flux observations from mul- tiple sites and with a relatively high-dimensional parameter space of a highly non-linear model, the GRaB-AM algorithm is particularly beneficial because it enhances the exploration of a wider parameter space while adapting the proposal dis- Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4077 Figure 10. Time series of the total annual CH4 estimates above 45° N. Prior and posterior CH4 estimates from LPJ-GUESS, from 2010 to 2020 are shown in blue and green, respectively. CH4 estimate from JSBACH-H, from 2010 to 2016, is shown in purple. tribution over iterations. The algorithm has the potential to incorporate other types of natural data distributions and ac- count for temporal correlations in the data. Based on the re- sults of this study and the results form Kallingal et al. (2024), it is evident that the algorithm is useful for optimising com- plex models but can also be easily applied to simpler, more linear models such as WETMETH (see Sect. 4.2 for the dis- cussion). Additionally, there is potential for the algorithm to be used in optimising other model variables, such as CO2 fluxes, vegetation, or soil dynamics, where long, reliable time series data are available. While GRaB-AM is a powerful technique for highly non- linear models, there are several considerations and poten- tial challenges when applying it to multi-site assimilation. (a) GRaB-AM can be computationally intensive, especially in high-dimensional parameter spaces or when dealing with a large number of sites. In this study, it took around 480 com- putational hours to complete 100 000 iterations on an AMD Ryzen Threadripper processor. (b) Adaptation mechanisms in GRaB-AM need to be carefully tuned to balance parameter exploration. Improper tuning may lead to suboptimal explo- ration of the parameter space. Additionally, the algorithm re- quires tuning hyperparameters that control the speed of adap- tation, and its performance is sensitive to these choices, mak- ing it challenging to find appropriate hyperparameter values to begin with. (c) GRaB-AM assumes that observational data are Gaussian distributed, which is not always true when con- sidering natural variability in multi-site flux data. The data distribution may be Lorentzian, Bernoulli, heavy-tailed, or take other forms. (d) Temporal correlations in the data are not addressed in the current form of GRaB-AM. (e) We encoun- tered difficulties in efficiently minimising the total misfit, i.e. simultaneously reducing the misfit across multiple sites. This is due to the largely different cost function values for individ- ual sites, caused by the varying conditions at each location. As a result, scaling was applied independently to each site to homogenise the contribution (weight) of each site in the optimisation process. However, this weighting of each site in a multi-site assimilation is a general challenge in multi- site data assimilation experiments and does not pertain to the GRaB-AM technique. 5 Conclusions This study aimed to optimise the simulation of CH4 emis- sions from natural wetlands in the LPJ-GUESS DGVM us- ing eddy-covariance flux measurement data obtained from 14 diverse natural wetlands, characterised by variations in temporal, spatial, and/or climatic features. Ten selected model process parameters with the greatest influence on wetland CH4 flux simulation were optimised using the Global Rao–Blackwellised Adaptive MCMC (GRaB-AM) algorithm within a Bayesian framework as a follow-up study of Kallingal et al. (2024). Following the optimisation, the study used flux observations from five different wetlands, which again differ in their temporal, spatial, and bioclimatic features, to validate the results of the optimisation. The op- timisation results showed a substantial enhancement in the model’s capacity to align with observed CH4 fluxes, with a total reduction of approximately 50 % in RMSE and an ap- proximately 53 % reduction in total uncertainty. The discrep- ancy between the modelled and observed values decreased from 1068.5 to 38.1 g C m−2. For wetlands above 45 ° N, the total mean annual emission from the posterior LPJ-GUESS estimate is 7.46 Tg C yr−1. Validation results demonstrate that four out of five sites reduced RMSE, contributing to an overall reduction of approximately 58.8 %. Given the re- maining mismatches between observations and simulations and the presence of less constrained sites, future investi- gations will focus on individual sites, and grouping them based on their bio-geo-climatic characteristics, to examine if they need to be parameterised with different sets of pa- rameters. Additionally, further studies are planned to quan- https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4078 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data tify CH4 emissions from boreal and temporal wetlands on large spatial scales, using the optimised parameters, and to validate them against independent atmospheric observations, i.e. atmospheric CH4 observations provided by the Euro- pean ICOS observation network. Another intended outcome of this study is to make use of the error correlation derived from the study as prior input to the atmospheric CH4 inver- sion model, such as Lund University Modular Inversion Al- gorithm (LUMIA) (Monteil and Scholze, 2021). Appendix A: Data source description Among the sites used for assimilation, Bib (JP-BBY), Bon (US-BZB), Deg (SE-Deg), Hue (DE-Hte), Los (US-Los), Ole (US-ORv), Sco (CA-SCB), Uoa (US- Uaf), and Zar (DE-Zrk) were collected from Fluxnet datasets (the IDs in the bracket corresponds to Fluxnet) (https://fluxnet.org/data/fluxnet-ch4-community-product/, last access: 7 April 2025, Delwiche et al., 2021; Pastorello et al., 2020). For the site Abi, CH4 data was collected from ICOS Sweden (2023) (https://doi.org/10.18160/Q6H6- B94B and the climate data was obtained from SMHI (https://www.smhi.se/data/meteorologi/, last access: 29 April 2025). Data for Att was collected from Ameri- flux (https://doi.org/10.17190/AMF/1902822; Todd and Humphreys, 2025). For the site Lom (FI-Lom), climate observations of 2006 were taken from the Fluxnet site mentioned above. Observations for the remaining years were obtained from the station Principal Investigator (PI). Precipitation and temperature data for Sii were taken from FMI (https://en.ilmatieteenlaitos.fi/download-observations, last access: 7 April 2025), and CH4 data and short- wave radiation data for Sii were collected from AVAA- SMEAR (https://smear.avaa.csc.fi/download, last ac- cess: 29 April 2025) (see Kallingal et al., 2024, for details). For the site Zak, the data was taken from GEM CH4 (https://doi.org/10.17897/430P-DS31; Green- land Ecosystem Monitoring, 2025a) and GEM climate (https://doi.org/10.17897/9RY6-ZB75; Greenland Ecosys- tem Monitoring, 2025b). The data for Che, Lgt, Sfn, and Wpt, used for validation, were collected from the Fluxnet datasets mentioned above. Climate data for Myk was obtained from SITES (https://www.fieldsites.se/sv-SE, last access: 29 April 2025) and the CH4 data were obtained from station PIs. Appendix B: Statistical metrics The following statistical metrics are used, among others, to analyse the results. The total reduced chi-square (χ2) is used to assess the goodness of the fit between the flux data and the model outputs. We calculated it by dividing twice the value of the cost function by the number of observations as χ2 = 2J (x) N , (B1) where J is the cost function, x corresponds to the parameters, and N is the number of observations. The root mean square error (RMSE) is estimated to quan- tify the average error between predicted and observed flux data as RMSE= √√√√ 1 N N∑ i=1 (Yi −Mi) 2, (B2) where N is the number of observations, Yi is the observed value, and Mi is the predicted value. An uncertainty estimation was conducted assuming inde- pendence between parameter uncertainty and model uncer- tainty, using the following equation: σtotal = √ σ 2 model+ σ 2 param, (B3) where σmodel is the model structural uncertainty estimated from the standard deviation of the prior and posterior resid- uals (Desroziers et al., 2005). σparam represents the contri- bution of parameter uncertainty to the overall uncertainty in observation space, estimated from the 95 % credible interval of the parameters and the standard deviation of total sums of the model prediction by taking into account both the parame- ter uncertainty from the MCMC sampling and the variability in the model predictions. The calculation is performed as fol- lows: σparam = σpredic(CIupp−CIlow) 1.96 , (B4) where σpredic is the standard deviation of total sums of the model prediction over MCMC runs, CIupp and CIlow are the upper and lower bounds of the credible intervals of the pa- rameters, and the factor 1.96 is the conversion factor to con- vert the 95 % credible interval to a standard deviation assum- ing a Gaussian distribution. In addition, we used skewness and kurtosis estimates of the posterior parameter probability density functions (PDFs) to describe their structure. We also calculated summer and winter anomalies by subtracting the mean values for April through September (summer) and October through March (winter) from the corresponding time series to estimate the variability in the seasonal cycle of the observed and mod- elled CH4 fluxes. Appendix C: Result of optimisation C1 Changes in component contribution Changes in the component-wise estimation of ebullition, dif- fusion, and plant-mediated transport before and after opti- misation is illustrated in Fig. C1. The inner circles repre- sent the priors, while the outer circles represent the posterior Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 https://fluxnet.org/data/fluxnet-ch4-community-product/ https://doi.org/10.18160/Q6H6-B94B https://doi.org/10.18160/Q6H6-B94B https://www.smhi.se/data/meteorologi/ https://doi.org/10.17190/AMF/1902822 https://en.ilmatieteenlaitos.fi/download-observations https://smear.avaa.csc.fi/download https://doi.org/10.17897/430P-DS31 https://doi.org/10.17897/9RY6-ZB75 https://www.fieldsites.se/sv-SE J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4079 model estimates. The optimised parameters are constrained differently for different components across sites. There was no general trend of any one of the three transport mecha- nisms being dominant after optimisation. Regarding the prior estimate, all sites but Hue and Att had significant contribu- tions from all three emission components. Hue and Att had a very minor contribution from plant-mediated transport. After optimisation, zero contributions from plant-mediated trans- port were estimated for both these sites. Interestingly, for the site Los, the majority of the prior was contributed by plant- mediated transport and ebullition. However, in the posterior, ebullition contributed very little and was taken over by diffu- sion. Furthermore, many sites showed the dominance of only two components after optimisation. It was consistently ob- served that the third component was suppressed, regardless of the nature or climatic conditions of the site. Table C1. Posterior parameter estimates from the GRaB-AM optimisation. The “start prior values” are the initial random values sampled from the prior distributions. Maximum a posteriori (MAP) estimates, posterior means, and posterior standard deviations (std) are provided. The cost function values corresponding to the prior and posterior estimates are also given. Parameter Rmoist CH4/CO2 foxid φtiller rtiller fair poracro porcato Rmoist_an λroot Cost value (w) Start prior vals. 0.3 0.2 0.6 0.8 0.003 0.001 0.9 0.87 0.04 35 4897.95 MAP 0.13 0.12 0.75 0.82 0.0092 0.008 0.95 0.86 0.017 10.25 221.01 Posterior mean 0.15 0.14 0.76 0.77 0.0081 0.023 0.94 0.86 0.017 10.25 227.30 Posterior std ∓ 0.045 0.007 0.027 0.15 0.0012 0.02 0.027 0.008 0.005 0.23 Table C2. A comprehensive overview of un-weighted (uw) prior and posterior cost values, RMSE reduction in percentages, and the χ2 values estimated for all sites individually and together. Site Prior cost Posterior cost RMSE χ2 Site Prior cost Posterior cost RMSE χ2 value (uw) value (uw) reduc. (%) (uw) value (uw) value (uw) reduc. (%) (uw) Abi 717 913.5 14 818.9 63.5 22.6 Los 163 304.4 8711.5 63.42 11.8 Att 451 259.9 19 867.7 68.1 20.3 Ole 25 368.6 2693.4 52.2 4.7 Bib 54 339.3 1274.8 46.0 3.1 Sco 58 147.0 644.4 36.7 2.0 Bon 23 300.2 1906.2 70.9 6.8 Sii 58 183.1 12 406.5 59.2 16.0 Deg 10 510.6 873.4 51.3 1.3 Uoa 130 041.1 5172.5 68.4 9.2 Hue 5033.6 1018.4 27.16 0.96 Zak 3770.6 1357.8 4.3 2.1 Lom 54 172.8 7672.8 55.9 9.12 Zar 7949.7 877.3 42.6 1.2 Total 1 763 294.9 79 296.42 50.30 8.6 https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4080 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Figure C1. Component-wise percentage contribution of CH4 to the total modelled emission for all 14 sites is presented separately. The inner circle represents the prior estimate, and the outer circle represents the posterior estimate. Another interesting observation pertains to the site Zak, an arctic fen with a very low mean annual temperature (MAT), where nearly equal contributions from all three components were observed in the prior. The posterior, however, showed that nearly all emissions were from diffusion, with very little contributions from the two other components. The RMSE es- timate for this site indicates a very low reduction compared to other sites, suggesting that the optimisation did not perform well in constraining this site. The dominance of only one or two components after op- timisation indicates possibilities of biases. Verifying and re- solving this issue might be achievable through component- wise assimilation into the model using data from all three components and local hydrology observations. However, this will be challenging due to the unavailability of data, espe- cially of the ebullition. Measuring ebullition fluxes poses sig- nificant challenges, primarily attributed to the pronounced spatio-temporal variability. Ecosystems exhibit rapid, mo- mentary surges in fluxes, reaching exceptionally high levels within seconds, interspersed with prolonged periods of neg- ligible ebullition (Canadell et al., 2021) C2 Summer and winter anomaly estimation Summer and winter anomalies of flux observations, prior, and posterior estimated separately for all 14 sites used can be seen in Fig. C2. The figure also provides details about the years in which the model either underestimated or overes- timated the emissions. It is clear that neither the simulation nor the observation follows any common seasonal patterns or trends. This indicates that CH4 emissions from wetlands are generally highly dependent on the variabilities in the under- lying climatic variables, and the same holds for the model. A detailed analysis of the correlation and sensitivity between the model’s CH4 emission and input climatic variables can be seen in Kallingal et al. (2024). High deviations were observed in the summer anomaly compared to the winter anomaly at all the sites. In general, in the majority of cases, the model was capable enough to capture trends shown by the observational anomaly, though there were differences in magnitude. For example, for the sites Bon and Zak, the model was successful in capturing all the summer and winter trends of the observation. Notably, the high positive anomaly of Abi in 2016 and of Ole in 2015, etc., and the high negative anomaly of Att in 2015, of Deg in 2017, of Low in 2017, etc., were also captured by the model. C3 The annual sum estimation of CH4 Figure C3 shows the annual sum estimation of CH4 from 10 out of 14 sites used in the study. The remaining four sites are illustrated in Sect. 3.4 of the main paper. The figure illus- trates that, after optimisation, most sites exhibited improved annual CH4 estimation throughout the year. However, for the site Hue, the model consistently failed to capture the obser- vation pattern in most years, except for 2013. For the site Att, particularly for the year 2016, the model displayed shortcom- ings. On the other hand, Att in 2016 completely aligned with the observed value for both prior and posterior estimations. Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4081 Figure C2. Summer and winter anomalies were estimated from the averages of the summer months (April to September) and winter months (October to March). The black, green, and purple dashed lines represent flux observations, prior, and posterior values, respectively. Dots and plus signs denote summer and winter data points of the season, respectively. https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 4082 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Figure C3. The annual sum estimation of CH4 from 10 out of 14 sites used in the study. The sites are represented in different colours with distinct markings to distinguish between observation, prior, and posterior. Appendix D: Time series estimation of validation sites Figure D1. The prior and posterior CH4 simulation from the LPJ-GUESS compared with flux observations from five different wetland sites used for validation. The black dots represent the CH4 flux observations. The teal dots depict the prior simulation using the prior model parameters and the purple dots represent the posterior simulation; 3 d running averages are calculated from the original time series. In most of the figures, a few outliers on the vertical axis have been removed for better visualisation. Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5194/bg-22-4061-2025 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data 4083 Code and data availability. The GRaB-AM code and data used for this article are available at https://doi.org/10.5281/zenodo.10589593 (Theanutti Kallin- gal, 2024). The LPJ-GUESS model code can be obtained at https://doi.org/10.5281/zenodo.8065737 (Nord et al. , 2021). Please contact the site PIs if the site observations are intended to be used for purposes other than use in this publication. Author contributions. Conceptualisation was undertaken by JTK and MS. Methodology was formulated by JTK, JL, and MS. PM assisted in setting up the multi-site simulation in LPJ-GUESS. MA provided the CH4 flux observations collected at Lompolojänkkä. PV and PW provided the flux observations collected at Myckle- mossen. Setting up the GRaB-AM and writing the original draft was carried out by JTK. Editing were performed by JTK, MS, JL, PAM, JR, PV, PW, and MA. All authors have read and agreed to the published version of the paper. Competing interests. The contact author has declared that none of the authors has any competing interests. Disclaimer. Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, pub- lished maps, institutional affiliations, or any other geographical rep- resentation in this paper. While Copernicus Publications makes ev- ery effort to include appropriate place names, the final responsibility lies with the authors. Acknowledgements. We would like to express our gratitude to Ivan Mammarella, Annalea Lohila, Kirsty Langley, Jutta Holst, and Elin Humphreys for their guidance on data repositories. Special thanks go to Sadat Ismayil for assisting with computational resources. Additionally, we acknowledge the Fluxnet database, the Ameri- flux database, the Integrated Carbon Observation System (ICOS), Greenland Ecosystem Monitoring (GEM), the Swedish Meteoro- logical and Hydrological Institute (SMHI), the Swedish Infrastruc- ture for Ecosystem Science (SITES), the Institute for Atmospheric and Earth System Research (SMEAR-INAR) at the University of Helsinki, and the Finnish Meteorological Institute (FMI) for pro- viding open access to their data. Financial support. This research has been supported by the Strategic Research Area: Biodiversity and Ecosystem services in a Changing Climate (BECC), Lund University, and is a contribution to the Strategic Research Area: ModElling the Regional and Global Earth system (MERGE). BECC and MERGE are funded by the Swedish government. The publication of this article was funded by the Swedish Research Council, Forte, Formas, and Vinnova. Review statement. This paper was edited by Paul Stoy and re- viewed by two anonymous referees. References Aalto, T., Tsuruta, A., Mäkelä, J., Müller, J., Tenkanen, M., Burke, E., Chadburn, S., Gao, Y., Mannisenaho, V., Kleinen, T., Lee, H., Leppänen, A., Markkanen, T., Materia, S., Miller, P. A., Peano, D., Peltola, O., Poulter, B., Raivonen, M., Saunois, M., Wårlind, D., and Zaehle, S.: Air temperature and precipitation constraining the modelled wetland methane emissions in a bo- real region in northern Europe, Biogeosciences, 22, 323–340, https://doi.org/10.5194/bg-22-323-2025, 2025. Andrieu, C. and Thoms, J.: A tutorial on adaptive MCMC, Statistics and computing, 18, 343–373, 2008. Aurela, M., Lohila, A., Tuovinen, J.-P., Hatakka, J., Penttilä, T., and Laurila, T.: Carbon dioxide and energy flux measurements in four northern-boreal ecosystems at Pallas, Boreal Environ. Res., 20, 455–473, 2015. Bohrer, G. and Morin, T. H.: FLUXNET-CH4 US-ORv Olentangy River Wetland Research Park, Tech. rep., FluxNet; The Ohio State Univ., Columbus, OH (United States), 2020. Braswell, B. H., Sacks, W. J., Linder, E., and Schimel, D. S.: Esti- mating diurnal to annual ecosystem parameters by synthesis of a carbon flux model with eddy covariance net ecosystem exchange observations, Global Change Biol., 11, 335–355, 2005. Canadell, J., Monteiro, P., Costa, M., da Cunha, L. C., Cox, P., Eliseev, A., Henson, S., Ishii, M., Jaccard, S., Koven, C., Lo- hila, A., Patra, P., Piao, S., Rogelj, J., Syampungani, S., Zaehle, S., and Zickfeld, K.: Global Carbon and other Biogeochemical Cycles and Feedbacks, in: Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, IPCC AR6, chapter 5, Cambridge University Press, https://doi.org/10.1017/9781009157896.007, 2021. Cao, M., Marshall, S., and Gregson, K.: Global carbon exchange and methane emissions from natural wetlands: Application of a process-based model, J. Geophys. Res.-Atmos., 101, 14399– 14414, 1996. Chen, J. and Chu, H.: FLUXNET-CH4 US-CRT Curtice Walter- Berger cropland, Tech. rep., FluxNet; University of Tole- do/Michigan State University, 2020. Christensen, J. H., Hewitson, B., Busuioc, A., Chen, A., Gao, X., Held, I., Jones, R., Kolli, R. K., Kwon, W.-T., Laprise, R., Mag- aña, R. V., Mearns, L., Menéndez, C. G., Räisänen, J., Rinke, A., Sarr, A., and Whetton, P.: Regional climate projections, Chapter 11, 847–940 pp., https://doi.org/10.1017/CBO9780511546013, 2007. Christensen, T. and Cox, P.: Response of methane emission from Arctic tundra to climatic change: results from a model simula- tion, Tellus B, 47, 301–309, 1995. Delwiche, K. B., Knox, S. H., Malhotra, A., Fluet-Chouinard, E., McNicol, G., Feron, S., Ouyang, Z., Papale, D., Trotta, C., Can- fora, E., Cheah, Y.-W., Christianson, D., Alberto, Ma. C. R., Alekseychik, P., Aurela, M., Baldocchi, D., Bansal, S., Billes- bach, D. P., Bohrer, G., Bracho, R., Buchmann, N., Campbell, D. I., Celis, G., Chen, J., Chen, W., Chu, H., Dalmagro, H. J., Dengel, S., Desai, A. R., Detto, M., Dolman, H., Eichelmann, E., https://doi.org/10.5194/bg-22-4061-2025 Biogeosciences, 22, 4061–4086, 2025 https://doi.org/10.5281/zenodo.10589593 https://doi.org/10.5281/zenodo.8065737 https://doi.org/10.5194/bg-22-323-2025 https://doi.org/10.1017/9781009157896.007 https://doi.org/10.1017/CBO9780511546013 4084 J. T. Kallingal et al.: Assimilating multi-site eddy-covariance data Euskirchen, E., Famulari, D., Fuchs, K., Goeckede, M., Gogo, S., Gondwe, M. J., Goodrich, J. P., Gottschalk, P., Graham, S. L., Heimann, M., Helbig, M., Helfter, C., Hemes, K. S., Hirano, T., Hollinger, D., Hörtnagl, L., Iwata, H., Jacotot, A., Jurasinski, G., Kang, M., Kasak, K., King, J., Klatt, J., Koebsch, F., Krauss, K. W., Lai, D. Y. F., Lohila, A., Mammarella, I., Belelli Marchesini, L., Manca, G., Matthes, J. H., Maximov, T., Merbold, L., Mitra, B., Morin, T. H., Nemitz, E., Nilsson, M. B., Niu, S., Oechel, W. C., Oikawa, P. Y., Ono, K., Peichl, M., Peltola, O., Reba, M. L., Richardson, A. D., Riley, W., Runkle, B. R. K., Ryu, Y., Sachs, T., Sakabe, A., Sanchez, C. R., Schuur, E. A., Schäfer, K. V. R., Sonnentag, O., Sparks, J. P., Stuart-Haëntjens, E., Sturtevant, C., Sullivan, R. C., Szutu, D. J., Thom, J. E., Torn, M. S., Tuittila, E.- S., Turner, J., Ueyama, M., Valach, A. C., Vargas, R., Varlagin, A., Vazquez-Lule, A., Verfaillie, J. G., Vesala, T., Vourlitis, G. L., Ward, E. J., Wille, C., Wohlfahrt, G., Wong, G. X., Zhang, Z., Zona, D., Windham-Myers, L., Poulter, B., and Jackson, R. B.: FLUXNET-CH4: a global, multi-ecosystem dataset and analy- sis of methane seasonality from freshwater wetlands, Earth Syst. Sci. Data, 13, 3607–3689, https://doi.org/10.5194/essd-13-3607- 2021, 2021. Desai, A. R. and Thom, J.: FLUXNET-CH4 US-Los Lost Creek, Tech. rep., FluxNet; Univ. of Wisconsin, Madison, WI (United States), 2020. Desroziers, G., Berre, L., Chapnik, B., and Poli, P.: Diagnosis of ob- servation, background and analysis-error statistics in observation space, Q. J. Roy. Meteor. Soc., 131, 3385–3396, 2005. Euskirchen, E. and Edgar, C.: FLUXNET-CH4 US-BZB Bonanza Creek Thermokarst Bog, Tech. rep., FluxNet, 2020. Fick, S. E. and Hijmans, R. J.: WorldClim 2: new 1-km spatial reso- lution climate surfaces for global land areas, Int. J. Climatol., 37, 4302–4315, 2017. Fung, I., John, J., Lerner, J., Matthews, E., Prather, M., Steele, L., and Fraser, P.: Three-dimensional model synthesis of the global methane cycle, J. Geophys. Res.-Atmos., 96, 13033– 13065, 1991. Gedney, N., Cox, P., and Huntingford, C.: Climate feedback from wetland methane emissions, Geophys. Res. Lett., 31, 20, L20503, https://doi.org/10.1029/2004GL020919, 2004. Granberg, G., Sundh, I., Svensson, B., and Nilsson, M.: Effects of temperature, and nitrogen and sulfur deposition, on methane emission from a boreal mire, Ecology, 82, 1982–1998, 2001. Greenland Ecosystem Monitoring: GeoBasis Zackenberg – Flux monitoring – AC (Version 1.0), [CC-BY-SA-4.0], Greenland Ecosystem Monitoring [data set], https://doi.org/10.17897/430P- DS31, 2025a. Greenland Ecosystem Monitoring: GeoBasis Zacken- berg – Meteorology – MM2 (Version 1.0), [CC-BY- SA-4.0], Greenland Ecosystem Monitoring [data set], https://doi.org/10.17897/9RY6-ZB75, 2025b. Groenendijk, M., Dolman, A., Van der Molen, M., Leuning, R., Ar- neth, A., Delpierre, N., Gash, J., Lindroth, A., Richardson, A., Verbeeck, H., and Wohlfahrt, G.: Assessing parameter variability in a photosynthesis model within and between plant functional types using global Fluxnet eddy covariance data, Agr. Forest Me- teorol., 151, 22–38, 2011. Harris, I., Jones, P. D., Osborn, T. J., and Lister, D. H.: Up- dated high-resolution grids of monthly climatic observations – the CRU TS3.10 Dataset, Int. J. Climatol., 34, 623–642, https://doi.org/10.1002/joc.3711, 2014. Harris, I., Osborn, T. J., Jones, P., and Lister, D.: Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset, Sci. Data, 7, 109, https://doi.org/10.1038/s41597-020- 0453-3, 2020. ICOS Sweden: Collection of Abisko Stordalen Palsa Bog Swedish network data, ICOS [data set], https://doi.org/10.18160/Q6H6- B94B, 2023. Ito, A., Li, T., Qin, Z., Melton, J., Tian, H., Kleinen, T., Zhang, W., Zhang, Z., Joos, F., Ciais, P., Hopcroft, P. O., Beerling, D. J., Liu, X., Zhuang, Q., Zhu, Q., Peng, C., Chang, K.-Y., Fluet- Chouinard, E., McNicol, G., Patra, P., Poulter, B., Sitch, S., Ri- ley, W., and Zhu, Q.: Cold-season methane fluxes simulated by GCP-CH4 Models, Geophys. Res. Lett., 50, e2023GL103037, https://doi.org/10.1029/2023GL103037, 2023. Iwata, H., Ueyama, M., and Harazono, Y.: FLUXNET-CH4 US-Uaf University of Alaska, Fairbanks, Tech. rep., FluxNet; Osaka Pre- fecture University; Shinshu University, 2020. Jacotot, A., Gogo, S., and Laggoun-Défarge, F.: FLUXNET-CH4 FR-LGt La Guette, France, FLUXNET-CH4 Community Prod- uct [data set], https://doi.org/10.18140/FLX/1669641, 2020. Kallingal, J. T., Lindström, J., Miller, P. A., Rinne, J., Raivo- nen, M., and Scholze, M.: Optimising CH4 simulations from the LPJ-GUESS model v4.1 using an adaptive Markov chain Monte Carlo algorithm, Geosci. Model Dev., 17, 2299–2324, https://doi.org/10.5194/gmd-17-2299-2024, 2024. Knorr, W. and Kattge, J.: Inversion of terrestrial ecosystem model parameter values against eddy covariance measurements by Monte Carlo sampling, Global Change Biol., 11, 1333–1351, 2005. Koebsch, F. and Jurasinski, G.: FLUXNET-CH4 DE-Hte Huetel- moor, Tech. rep., FluxNet; Lan