Jukuri, open repository of the Natural Resources Institute Finland (Luke)               

   
All material supplied via Jukuri is protected by copyright and other intellectual property rights. 

Duplication or sale, in electronic or print form, of any part of the repository collections is prohibited. 

Making electronic or print copies of the material is permitted only for your own personal use or for 

educational purposes. For other purposes, this article may be used in accordance with the publisher’s 

terms. There may be differences between this version and the publisher’s version. You are advised to 

cite the publisher’s version. 

This is an electronic reprint of the original article.  

This reprint may differ from the original in pagination and typographic detail. 

 
Author(s): Diego Rondon, Samu Mäntyniemi, Jouni Aspi, Laura Kvist, Mikko J. Sillanpää 

Title: A Bayesian multi-state model with data augmentation for estimating population size 

and effect of inbreeding on survival 

Year: 2024 

Version: Published version 

Copyright: The Author(s) 2024 

Rights: CC BY 4.0 

Rights url: https://creativecommons.org/licenses/by/4.0/  

 
Please cite the original version: 

Diego Rondon, Samu Mäntyniemi, Jouni Aspi, Laura Kvist, Mikko J. Sillanpää, A Bayesian multi-state 

model with data augmentation for estimating population size and effect of inbreeding on survival, 

Ecological Modelling, Volume 490, 2024, 110662, ISSN 0304-3800, 

https://doi.org/10.1016/j.ecolmodel.2024.110662. 

https://creativecommons.org/licenses/by/4.0/


Ecological Modelling 490 (2024) 110662

A
0

Contents lists available at ScienceDirect

Ecological Modelling

journal homepage: www.elsevier.com/locate/ecolmodel

A Bayesian multi-state model with data augmentation for estimating
population size and effect of inbreeding on survival
Diego Rondon a,∗, Samu Mäntyniemi b, Jouni Aspi c, Laura Kvist c, Mikko J. Sillanpää a

a Research Unit of Mathematical Sciences, University of Oulu, Oulu, FI-90014, Finland
b Natural Resources Institute Finland, Helsinki, FI-00790, Finland
c Research Unit of Ecology and Genetics, University of Oulu, Oulu, FI-90014, Finland

A R T I C L E I N F O

Dataset link: https://etsin.fairdata.fi/

Keywords:
Data augmentation
Hidden Markov model
Inbreeding
Multi-state model
Survival

A B S T R A C T

A joint model framework for estimating population sizes over time and survival probabilities while considering
inbreeding and age as covariates in the survival function is elaborated. This methods is tested with data
simulated over two small (𝑁 ≈ 150) close to extinction open populations, that aims to imitate wild individuals
under decline and bottleneck dynamics. A Hidden Markov Model (HMM) perspective with a multi-state
formulation that considers young and adult individuals is applied with data augmentation to account for
non-seen individuals. The transition probabilities are estimated, and different treatments of the covariates and
levels of data seen are compared. Our results suggest that the model framework correctly estimates different
population size trends but the parameter estimation is challenging in some cases. The proposed model presents
a new way to use the inbreeding coefficient, and further implementations on a real conservation data of wild

species can help the decision-making process for the management of small populations.
1. Introduction

High effort in the implementation of conservation management in
small populations is required to avoid extinction (Pimm et al., 2014;
Ripple et al., 2017). Usually, there is a lack of information on various
aspects of population dynamics, such as the number of individuals, sur-
vival rate, and fecundity amounts. In addition, environmental (Lande,
1993) and societal factors such as hunting (Leclerc et al., 2015) or
urbanization (Benson et al., 2019) bring big challenges (Ovaskainen
and Meerson, 2010) and increase the uncertainty on the actions and
decisions that should be taken for conservation.

When a population size becomes small, the probability of inbreeding
and inbreeding depression increases, leading to a rise in mortality,
and a decrease in reproduction, further reducing the population size
(Blomqvist et al. 2010; Godwin et al. 2020). This phenomenon is
known as the extinction vortex (Frankham et al., 2017). How to es-
cape or avoid it is an active area of research on the conservation
of species. Thus far, several practices have been employed, such as
the reintroduction of new individuals, assisted gene flow or migration
to reduce inbreeding and increase genetic variation (Armstrong and
Seddon 2008; Aitken and Whitlock 2013; Fitzpatrick et al. 2020) or
supplementary feeding to reduce mortality (Robb et al., 2008).

It is still not clear how inbreeding information and population
dynamics are related (Keller and Waller, 2002), since previous research

∗ Corresponding author.
E-mail address: diego.rondonbautista@oulu.fi (D. Rondon).

has focused on quantifying the effect of inbreeding on various mea-
sures of population viability (Leimu et al., 2006). For example, Marr
et al. (2006) suggested a relationship between inbreeding and envi-
ronmental factors that reduce the hatching success of Melitaea cinxia.
Domingue and Teale (2007) established that in a beetle population
(Ips pini), the number of offspring decreases as the inbreeding coeffi-
cient increases. Bozzuto et al. (2019) showed an inverse relationship
between inbreeding and population growth of the Capra ibex, alongside
other authors such as Latter and Robertson (1962) and Saccheri et al.
(1998). An exception is a study by Armstrong et al. (2021), where a
complex multi-step model uses inbreeding information to calculate sur-
vival probabilities and population sizes of North Island Robin (Petroica
longipes) in New Zealand.

However, there is still a lack of understanding of how inbreeding
is relevant to the dynamics of the populations. One of the difficulties
when considering inbreeding levels as part of the dynamics of the
wild population has been that pedigree data of several generations are
not often available. Nowadays, this can be overcome by estimating
inbreeding levels based on whole genome sequencing, where the in-
breeding level can be estimated from the DNA sample of an individual
(Nietlisbach et al., 2019).

In this study, we aimed to combine inbreeding and population
dynamics to jointly estimate the evolution of population size and sur-
vival probabilities. The work was conducted using an open simulated
vailable online 29 February 2024
304-3800/© 2024 The Author(s). Published by Elsevier B.V. This is an open access a

https://doi.org/10.1016/j.ecolmodel.2024.110662
Received 16 October 2023; Received in revised form 16 February 2024; Accepted 1
rticle under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

6 February 2024

https://www.elsevier.com/locate/ecolmodel
https://www.elsevier.com/locate/ecolmodel
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
https://etsin.fairdata.fi/
mailto:diego.rondonbautista@oulu.fi
https://doi.org/10.1016/j.ecolmodel.2024.110662
https://doi.org/10.1016/j.ecolmodel.2024.110662
http://crossmark.crossref.org/dialog/?doi=10.1016/j.ecolmodel.2024.110662&domain=pdf
http://creativecommons.org/licenses/by/4.0/


Ecological Modelling 490 (2024) 110662D. Rondon et al.

b
i

D
a
o
o

population where different values for the effect of inbreeding and age
were proposed resulting in two distinctive scenarios, namely, Decline
and Bottleneck populations. We used a Jolly–Seber model with a multi-
state formulation, also known as a Hidden Markov Model (HMM)
(Zucchini et al., 2016), This model has the advantage of separating
the effect of males and females. As opposed to a previous study by
Armstrong et al. (2021), which also considered inbreeding, we could
model the population without assuming conditions on the first captures,
as is commonly the case for wild open populations. Different meth-
ods are used to model the transition probabilities while considering
different levels of data seen. Model performance was evaluated via
the Mean Square Error (MSE) between the estimated population size
against the true simulated value. The proposed model uses a Bayesian
framework, which can easily incorporate any prior information in the
dynamics of the population and it works with Data Augmentation (DA)
to account for the unseen individuals (Kéry and Schaub, 2011; Royle
and Dorazio, 2012).

This work proposes a new way to consider the use of inbreeding in
estimating survival probabilities on an individual level. Additionally,
we present insights on how to extend the use of the model while using
different covariates and gaining more information on how they affect
the survival probability at different life stages of an individual.

2. Material and methods

Here, we first present and formulate a multi-state model, where
continuous covariates serve to parameterize the survival probability.
Then, we test the performance of the method using simulated data.
The presented model uses input data extracted from random samples
belonging to two distinct scenarios that represent populations close
to extinction and which are thus most likely to be of conservation
concern and suffer from inbreeding. These two scenarios are hereafter
named Decline and Bottleneck. To mitigate the effects of sampling
variation, the simulation analyses are replicated multiple times. This
model and simulation framework are not specialized for any specific
species. Instead, it is designed towards optimizing the reproductive
potential of all individuals, overlooking species-specific behaviors that
would otherwise reduce inbreeding and influence mate selection.

2.1. Multi-state model for data analysis

A multi-state model is used to describe the population dynamics
(Lebreton and Cefe, 2002; Kéry and Schaub, 2002). It is interpreted as a
Hidden Markov Model (HMM) given that it can be divided into hidden
and observed processes. In this framework, the life of an individual is
summarized as a set of possible discrete hidden states 𝐙 that represents
a succession of life stages, and it follows a Markov process, meaning
that the present state of an individual 𝑖 at generation 𝑡 depends only
on P(𝑧𝑖,𝑡|𝑧𝑖,𝑡−1). The observed outcome depends only on the present
hidden state through P(𝑦𝑖,𝑡|𝑧𝑖,𝑡) and refers to the observation or emission
probability (Rabiner, 1989).

States of the individuals and possible transitions between hidden
states can be seen in Fig. 1. The hidden states can be denoted numeri-
cally and interpreted as:

• 𝑧 = 1: refers to the unborn state. It is possible to move from this
state to any living state 𝑧 = [2,… , 5]. Moving to any young state
refers to a newborn individual. Moving to an adult state can occur
under two scenarios, either the individual was not detected as
young, or, the individual is an immigrant.

• 𝑧 = 2 or 𝑧 = 4: are the young states of females and males,
respectively. From here, an individual can survive and make a
transition to the adult stage. We assume that the time scales of
the observation process and of the biological process are the same
and thus it is not possible to observe an individual several times
2

as a young.
• 𝑧 = 3 or 𝑧 = 5: are the adult stages for females and males,
respectively. An individual remains in this stage until it dies.

• 𝑧 = 6: marks an individual as dead and it is an absorbing state.

The transition probability between hidden states is modeled by
a transition matrix 𝛺𝑖,𝑡, that represents all the possible changes of
individual 𝑖 between generations 𝑡 and 𝑡 + 1. The matrix is individual
and generation specific, the sum of all independent rows must add up
to one. As seen below:

𝛺𝑖,𝑡 =

1 2 3 4 5 6
⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎣

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎦

1 𝑠1,𝑡 𝑠2,𝑡 𝑠3,𝑡 𝑠4,𝑡 𝑠5,𝑡 0
2 0 0 𝜙𝑓𝑦;𝑖 0 0 1 − 𝜙𝑓𝑦;𝑖
3 0 0 𝜙𝑓𝑎;𝑖,𝑡 0 0 1 − 𝜙𝑓𝑎;𝑖,𝑡
4 0 0 0 0 𝜙𝑚𝑦;𝑖 1 − 𝜙𝑚𝑦;𝑖
5 0 0 0 0 𝜙𝑚𝑎;𝑖,𝑡 1 − 𝜙𝑚𝑎;𝑖,𝑡
6 0 0 0 0 0 1

. (1)

The transition probability between two unborn states (𝑧𝑡 = 1 →
𝑧𝑡+1 = 1) for adjacent generations is logit(𝑠1,𝑡+1) ∼ N(𝜂𝑡, 1), where 𝜂𝑡 =
logit−1(𝑠1,𝑡) is the back-transformed entry probability of the previous
generation, a random variation is achieved by the use of a standard nor-
mal distribution centered on previous value, the link function is defined
as logit(𝑠) = log( 𝑠

1−𝑠 ) and with the inverse as logit−1(𝑠) = log( exp(𝑠)
exp(𝑠)+1 ).

The use of the inverse function makes the arguments of the smooth
process on the same range, avoiding numerical problems. By linking
the transition probabilities of subsequent generations, that is, using a
first order random walk smoothing (Lang and Brezger, 2004), we create
a more stable model and avoid abrupt jumps in the population size.
The remaining transition probabilities between an unborn state and
any alive state are assumed to follow a Dirichlet distribution, 𝑠2∶5,𝑡 ∼
𝐷𝑖𝑟(K, 𝛼), which is then scaled to the range [0, 1−𝑠1,𝑡]. The true outcome
of an individual can take the values of: 𝑦 = 1 not seen (NS), 𝑦 = 2
seen as young (SY), or 𝑦 = 3 seen as an adult (SA). This is represented
y a succession of discrete states that depend on the true state of the
ndividual (P(𝑦𝑖,𝑡|𝑧𝑖,𝑡)). The probability of a certain result depends on

the observation matrix 𝛩𝑖,𝑡, and it is independent of the population size
and number of augmented individuals, this is to take full advantage of
the Bayesian approach and gain more modeling freedom. In practice,
however, considering underlying occurrence probabilities, which give
rise to the observed proportions (N/M), may lead to roughly similar
estimates as using observed proportions directly. Here the rows repre-
sent the current hidden state at a given generation and the columns the
observable outcome, as follows:

𝛩𝑖,𝑡 =

𝑁𝑆 𝑆𝑌 𝑆𝐴
⎡

⎢

⎢

⎢

⎢

⎢

⎢

⎣

⎤

⎥

⎥

⎥

⎥

⎥

⎥

⎦

1 1 0 0
2 1 − 𝑝𝑓𝑦;𝑖,𝑡 𝑝𝑓𝑦;𝑖,𝑡 0
3 1 − 𝑝𝑓𝑎;𝑖,𝑡 0 𝑝𝑓𝑎;𝑖,𝑡
4 1 − 𝑝𝑚𝑦;𝑖,𝑡 𝑝𝑚𝑦;𝑖,𝑡 0
5 1 − 𝑝𝑚𝑎;𝑖,𝑡 0 𝑝𝑚𝑎;𝑖,𝑡
6 1 0 0

. (2)

Some assumptions of the model based on the structure of the transition
matrix include (1) perfect classification of young and adults, and (2)
no observation of dead individuals.

A categorical distribution is used to model the transition between
the hidden states (𝑧𝑖,𝑡 → 𝑧𝑖,𝑡+1) and their emission to the observable
states (𝑧𝑖,𝑡 → 𝑦𝑖,𝑡,), allowing to write the model as:

𝑧𝑖,𝑡+1|𝑧𝑖,𝑡, 𝛺 ∼ 𝑐𝑎𝑡𝑒𝑔𝑜𝑟𝑖𝑐𝑎𝑙
(

𝛺𝑧𝑖,𝑡 ,1∶6,𝑖,𝑡

)

, (3)

𝑦𝑖,𝑡|𝑧𝑖,𝑡, 𝛩 ∼ 𝑐𝑎𝑡𝑒𝑔𝑜𝑟𝑖𝑐𝑎𝑙
(

𝛩𝑧𝑖,𝑡 ,1∶3,𝑖,𝑡

)

. (4)

ata Augmentation (DA) is used to represent the individuals that
re alive but non-observed (Royle and Dorazio, 2012). By adding
bservations that do not reflect information, we created a data set
f individuals seen and augmented. This is done in such a way that


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Fig. 1. Proposed multi-state model. White circles represent hidden states and gray boxes are observed states. Solid lines (dashed) are observation (transition) probabilities,
respectively. Indices for individual and time are removed to facilitate interpretation.
Fig. 2. Representation of the data augmentation. The augmented data is filled with
non-observed individuals (coded as 1) in the capture history. The covariates (inbreeding
coefficient and age) are sampled as pairs from observed individuals. The horizontal
red line represents the division between the true but not seen and the augmented
individuals.

the augmentation does not introduce any information into the model
(Kéry and Schaub, 2011). In this case, a certain number of unborn and
unseen individuals are added to the observed (𝑦 = 1) and hidden data
3

𝑖,𝑡
(𝑧𝑖,𝑡 = 1), the covariate information about age and inbreeding for non-
observed individuals are jointly sampled at random from observed part
of the data and their information is used to create the augmented data
set. We consider that not augmenting the covariates or augmenting the
covariates with missing values (NA) would easily lead to a biased esti-
mation. In contrast, our augmentation scheme makes the assumption
that the unobserved population follows the same distribution as the
observed one, as shown in Fig. 2. There is no correct answer of how
many individuals need to be added for data augmentation to work,
here, a fixed value of 600 individuals was used. It is possible to inspect
the posterior distribution of the population size to gain information
on this issue, given that a too small size of augmented data may lead
to underestimating the population size, while too large values will
increase unnecessarily computational time (see supplement material).

2.1.1. Survival probabilities
It is of interest to examine if the structure of the covariates have

an effect on the estimation of the population dynamics, specifically on
the population size, survival and entry probabilities. Therefore, three
different ways of treating the covariates of survival probability are
proposed and presented in Table 1. First, a model where no covariates
are used and a common parameter is applied for all individuals 𝑖,
this representation is the most simple model with the proposed states
and aims to work as a reference model. Second, a model where the
covariates age (age𝑖,𝑡) and inbreeding level (F𝑖) are modeled for indi-
vidual 𝑖 at generation 𝑡, common parameters are used for males (𝜙𝑚;𝑖,𝑡)
and females (𝜙𝑓 ;𝑖,𝑡). Third, a model where the covariates are scaled in
order to facilitate numerical sampling (Lesaffre and Lawson, 2012), this
transformation of the parameter will improve the mixing properties
of the MCMC, and needs a shorter MCMC sample to obtain better
estimates; this is carried out in following way age′𝑖,𝑡 = log(agei,t + 1) and
F′ = exp(F ), and different parameters are used to model the survival
𝑖 𝑖


Ecological Modelling 490 (2024) 110662D. Rondon et al.

s
o

s
w

P

S
b
I
p

P

H

P

T
o
d

m
t

P

N
i
M
i

p
p
t
b
o
c
o
a
i
z

M
w
B
i
a
o

Table 1
Survival modeling.
No covariates

logit(𝜙𝑖) = 𝛽0
No scaling

logit(𝜙𝑓𝑦;𝑖,𝑡) = logit(𝜙𝑚𝑦;𝑖,𝑡) = 𝛽0 + 𝛽1F𝑖
logit(𝜙𝑓𝑎;𝑖,𝑡) = logit(𝜙𝑚𝑎;𝑖,𝑡) = 𝛽0 + 𝛽1F𝑖 + 𝛽2age𝑖,𝑡

Scaling
logit(𝜙𝑓𝑦;𝑖,𝑡) = 𝛽0;𝑓 + 𝛽1;𝑓F′𝑖
logit(𝜙𝑚𝑦;𝑖,𝑡) = 𝛽0;𝑚 + 𝛽1;𝑚F′𝑖
logit(𝜙𝑓𝑎;𝑖,𝑡) = 𝛼0;𝑓 + 𝛼1;𝑓F′𝑖 + 𝛼2;𝑓 age′𝑖,𝑡
logit(𝜙𝑚𝑎;𝑖,𝑡) = 𝛼0;𝑚 + 𝛼1;𝑚F′𝑖 + 𝛼2;𝑚age′𝑖,𝑡

Three different cases are proposed to model the survival probability of males and
females: (i) No covariates, (ii) No scaling, and (iii) scaling.

probability of males and females. While working with wild individu-
als, the effect of sex(female/male) and age category(young/adult) are
important to consider, their modeling is often assumed to have an
additive or multiplicative effect over the survival probability. This is
a strong assumption, therefore relaxing this assumption and assigning
separate coefficients allows us to have a complete view of how the
model performs estimation over the survival function.

2.2. Model building

Consider a capture-recapture data set for M individuals, from which
𝑁 are observed (𝐘) and M−N are augmented (𝐘′), and in the matrix
𝐘̃ = [𝐘,𝐘′]′ = [𝐲1, 𝐲2,… , 𝐲N, 𝐲′N+1,… , 𝐲′M]′ each row provides infor-
mation of temporal observations of an individual for a window of a
study period T, where 𝐭 = {1, 2,… ,T} and 𝐲𝑖 = [𝑦𝑖,1, 𝑦𝑖,2,… , 𝑦𝑖,T]. Let
Z be an arrangement of possible hidden states for the M individuals,
in a similar way, 𝐙 = [𝐳1, 𝐳2,… , 𝐳N,… , 𝐳M]′ and 𝐳𝑖 = [𝑧𝑖,1, 𝑧𝑖,2,… , 𝑧𝑖,T].
Given a set of covariates 𝐗̃ = [𝐗,𝐗′]′ that are observed (𝐗) for the
first N individuals and augmented (𝐗′) for the reminding M−N, it is
assumed a direct influence on the survival of an individual, therefore
the observed values allows us to build part of the transition matrix
(𝜴). The posterior distribution of a Hidden Markov Model can be
represented in the following way:

P(𝐙,𝜴,𝜣,𝐗′
|𝐘,𝐗)

⏟⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏟⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏟
Posterior

∝ P(𝐘̃|𝐙,𝜣)
⏟⏞⏞⏟⏞⏞⏟
Likelihood

P(𝐙|𝜴) P(𝜣) P(𝜴|𝝋, 𝐗̃) P(𝝋|𝐗̃)P(𝐗′
|𝐗).

⏟⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏟⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏞⏟
Joint prior

(5)

Here, on the right hand side, the first term represents the complete like-
lihood (probability of observing a specific pattern 𝐘̃ given the hidden
states Z and emission matrix 𝜣), the second term is the relationship
between the life history of the individuals and the respective transitions
over time (probability of hidden states given the transition matrix 𝜴),
thirdly the given prior distribution for the emission matrix (𝜣) and
the remaining terms are the prior distributions needed it to build the
transition matrix, where 𝝋 = {𝐒,𝝓, 𝛿, 𝜖} is the joint distribution of the
entry probabilities (S), survival coefficients (𝝓), with their variance as a
hyperparameter (𝛿) and smoothing variance as another hyperparameter
(𝜖).

We can now give more detail. Let 𝐩𝑖,𝑡 be an arrangement of obser-
vation probabilities for individual 𝑖 at time 𝑡. It is possible to construct
the prior contribution of a single individual to the observation matrix
P(𝜣𝑖) =

∏T
𝐭=1 P(𝐩𝑖,𝑡) and expand the observed contribution to the

likelihood across all time points for a single individual 𝑖 to:

P(𝐲𝑖|𝐳𝑖, 𝛩𝑖) =
T
∏

𝐭=1
P(𝑦𝑖,𝑡|𝑧𝑖,𝑡,𝜣𝑖,𝑡). (6)

Considering the set of covariates from the observed and augmented
data 𝐗 that can be individual specific, such as inbreeding level and age
(in temporal scale), it is possible to model the survival probabilities of
4

the transition matrix (𝜴) for a given individual 𝑖 at time 𝑡 using GLM t
with a logit link function, logit(𝝓𝑖,𝑡) = 𝐗𝜷. A hyperprior (𝛿) is used to
allow sharing of information between the coefficients 𝜷. That is, in their
prior distributions, 𝜷’s are realizations of a common distribution with
unknown (estimated) variance 𝛿.

The probabilities of any individual to enter the system, or entry
probabilities, at a given time 𝑡 are governed by a set of variables 𝐒𝑡 =
[𝑠1,𝑡,… , 𝑠5,𝑡], where ∑5

𝑗=1 𝑠𝑗,𝑡 = 1. The smooth process is controlled by a
random walk prior logit(𝑠1,𝑡+1) ∼ N(𝜂𝑡, 1), where 𝜂𝑡 = logit−1(𝑠1,𝑡). Given
this, the reminder probabilities are estimated, 𝑠2∶5,𝑡 ∼ 𝐷𝑖𝑟(1, 1, 1, 1), and
caled each them by a factor (1 − 𝑠1,𝑡) so that they jointly sum up to
ne.

Now it is possible to factorize the prior distribution of the hidden
tates by considering all possible generations of a single individual 𝑖,
here on the first occasion the contribution is taken at random:

(𝐳𝑖|𝜴𝑖) = P(𝑧𝑖,1)
T
∏

𝐭=2
P(𝑧𝑖,𝑡|𝑧𝑖,𝑡−1, 𝛺𝑖,𝑡). (7)

imilarly, the prior contribution of the transition matrix is decomposed
etween the parameters that allow estimating the survival probability.
n this case, they are the covariates X and the group of other required
arameters 𝝋, in the following way:

(𝜴|𝝋, 𝐗̃) = P(𝛺𝑖,1|𝜑1, 𝐗̃)
T
∏

𝐭=2
P(𝛺𝑖,𝑡|𝜑𝑡, 𝐗̃). (8)

ere the prior distribution of 𝝋 is:

P(𝝋|𝐗̃) = P(𝐒,𝝓, 𝜖, 𝛿|𝐗̃) =

(𝑠1,1|𝜖)P(𝜖)P(𝑠2∶5,1|𝑠1,1)P(𝝓𝑖,1|𝐗̃, 𝛿)
T
∏

𝑡=2
P(𝑠1,𝑡|𝑠1,𝑡−1)P(𝑠2∶5,𝑡|𝑠1,𝑡)P(𝝓𝑖,𝑡|𝐗̃, 𝛿).

(9)

he last step on building the joint prior is to write the dependency
f the augmented and observed set of covariates P(𝐗′

|𝐗), which is
escribed on more detail in Section 2.1.

Finally, considering the contribution across all (observed and aug-
ented) individuals, allows us to write the complete data likelihood of

he proposed model as:

(𝐘̃|𝐙,𝜣) ∝
N
∏

𝑖=1
P(𝐲𝑖|𝐳𝑖,𝜣𝑖)

⏟⏞⏞⏞⏞⏞⏞⏞⏟⏞⏞⏞⏞⏞⏞⏞⏟
Observed

⋅
M
∏

j=N+1
P(𝐲′𝑗 |𝐳𝑗 ,𝜣𝑗 )

⏟⏞⏞⏞⏞⏞⏞⏞⏞⏞⏟⏞⏞⏞⏞⏞⏞⏞⏞⏞⏟
Augmented

. (10)

ote that in Eq. (10) we assume that individuals are conditionally
ndependent given 𝐙 and 𝜣. This is a typical assumption in Hidden
arkov Models. A graphical representation of the model is presented

n Fig. 3.
A benefit of the Bayesian framework is that in addition to obtaining

osterior distributions for all the model parameters, we can obtain
osterior distributions for all their deterministic functions. In this case,
he number of individuals does not belong to the set of parameters,
ut we can still obtain its posterior distribution as a direct function
f the model parameters. This can be calculated from the MCMC
hain simply by monitoring the function that estimates the number
f individuals that exist outside the unborn state at every generation,
s follows N(𝑡) =

∑𝑀
𝑖=1 𝟏{𝑍𝑖,𝑡>1

⋂

𝑍𝑖,𝑡<6}, where 𝟏{𝑍𝑖,𝑡>1
⋂

𝑍𝑖,𝑡<6} is an
ndicator variable which equals one if the 𝑖th individual is alive, and is
ero otherwise.

Samples from the target posterior distribution are generated via
arkov Chain Monte Carlo (MCMC) simulation methods using a soft-
are package called Numerical Inference for statistical Models for
ayesian and Likelihood Estimation (NIMBLE) (de Valpine et al., 2017)

n R (R Core Team, 2021). Descriptions of used prior distributions
re collected in Table 2, where the truncated beta distribution for the
bservation probability ensures that there are observations at every

ime point, avoiding sampling of low values.


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Fig. 3. Model representation. Gray block represents observed data, white oval
represents hidden parameter of the model. Solid arrows indicate statistical dependency
and dotted arrow a functional relationship.

Table 2
Prior specification for key parameters.

Parameter Prior distribution Informativeness

Survival coefficient 𝝓 N(0, 10) Weak
Observation probability 𝐩𝑖,𝑡 TBeta(10, 10)a Slight
Hyperpriorb 𝛿 Gamma(2, 2) Weak
Regression parameters 𝜷 N(0, 𝛿) Weak
Transition from unborn 𝑆2∶5,𝑡 Dir(1, 1, 1, 1) Weak

a Truncated Beta on interval (0.2, 1].
b Used in no covariates and no scaling model.

2.3. Simulation of test scenarios

The simulation framework builds upon Abadi et al. (2010), incorpo-
rating adjustments to account for age and inbreeding when calculating
survival probabilities and the number of offsprings. It is important to
note that the simulated population does not intend to accurately mirror
any particular real-world species due to the inherent complexity of
actual interactions in nature. Both female and male individuals are
simultaneously simulated, a single generation time is defined as a
complete iteration through the three procedural steps, and after one
generation, young ones are marked as adults. as follows:

Step 1. Immigration: The number of new females and males that enter
the population is defined as a Poisson random variable with a
constant mean across all the simulations. New individuals are
unrelated to individuals already in the population.

Step 2. Reproduction: The simulation allows each male to reproduce
with multiple females. The mating pair selection is made at
random and only occurs between adults. Once an assignment is
made, the number of offspring to enter the system as young fol-
lows a Poisson distribution where the mean is a function of age
and inbreeding of the male and female pair, the higher the age
or inbreeding coefficient, the fewer offspring will be produced. A
new inbreeding coefficient is calculated from pedigree data using
the tabular method, taking into account the coancestry between
simulated individuals (Caballero, 2020).

Step 3. Survival: Survival probability of individual 𝑖 at generation 𝑡
of females (𝜙𝑓 ) and males (𝜙𝑚) is modeled with a generalized
linear model (GLM) and using the inbreeding coefficient (F𝑖) and
age (age𝑖,𝑡) as covariates. High values of age or inbreeding will
reduce survival probability.

Fig. 4 presents a graphical representation of the simulation and
sampling procedure.
5

Table 3
Initial conditions and parameter specifications for the simulations.

(a) Initial conditions for both scenarios

Number of females 15
Number of males 15
Immigration rate Poisson(0.2)
Initial inbreeding Beta(5, 5)
Immigrants’ inbreeding Beta(1, 6)

(b) Coefficient of the survival probability for both scenarios

Decline Bottleneck
Intercept 7 8
Inbreeding −0.2 −3
Age −2.2 −4

(c) Coefficient of the reproduction function for both scenarios

Female Male
Intercept 2 2
Inbreeding −0.8 −0.9
Age −0.4 −0.6

2.3.1. Simulated scenarios
Two different scenarios are proposed by changing the effects of the

survival function. The initial values of the parameters of each scenario
are given in Table 3. Both scenarios aim to represent small populations
close to extinction, at a stage where inbreeding plays a significant
role in an extinction vortex, where decreased genetic diversity and
increased inbreeding cause an inbreeding depression that together with
demographic stochasticity and genetic drift lead to decreased fitness via
decreased survival and reproduction, further decreasing population size
(Caughley, 1994). First, the Decline population scenario had a survival
function with more weight on the effect of age than of inbreeding
and 52 generations were simulated. Second, the Bottleneck population
scenario had age and inbreeding with similar effects, and 8 individuals
were manually added at generations 42 and 43 to avoid extinction
and 60 generations were simulated. Initial inbreeding parameters were
sampled randomly from a proposed distribution. The immigration rate
is the same for both scenarios.

Fig. 5 presents the evolution of the population size and inbreeding
coefficient of the simulation, we use individuals from generations 33
to 52 as our final data set since it presents the population pattern that
are of interest while allowing the simulated inbreeding level of each
individual to be the outcome of an interaction, discarding the initial
random values that are given at the start of the simulation, working as
a ‘‘burn-in’’ period. Once all individuals are simulated, 50 data sets are
built for three different levels of data seen at 40%, 60%, and 80%.

2.3.2. Model assessment
From each simulated scenario, 50 replicated data sets were built cor-

responding to different levels of observation. The size of the observed
data was variable since sampling occurs at random. The goal of the data
replicates is to inspect the stability of the posterior distributions, avoid
finding results by chance, check how often the posterior mean will
deviate from the true trend, and to monitor the posterior convergence.
Every data set was analyzed under the same model characteristics.

Deviation of the posterior mean compared to the true simulated
value of population size can be captured by estimating the mean square
error (MSE). The calculation takes the average across all MCMC chains
(n) of the replicates. Comparing the square difference between the true
simulated value ̂𝑁(𝑡) of population size with the posterior mean ( ̄𝑁𝑖(𝑡))
at each generation (t), creating a time series of errors, low values of
MSE reflect a better estimates.

MSE(t) = 1
𝑛
∑

(

̄𝑁𝑖(𝑡) − ̂𝑁(𝑡)
)2

. (11)

𝑛 𝑖=1


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Fig. 4. Simulation and sampling. The simulation process combines information from genealogy, survival, and reproduction on an open population. Sampling process uses life
stages of individuals and sampling probabilities to recreate capture-recapture data, this process is replicated 50 times to produce simulated data replicates.
3. Results

The proposed model is used to estimate population size, effect of
covariates on survival and entry probabilities. The model is used over
two scenarios (Decline and Bottleneck) with different levels of data
seen (40%, 60%, 80%) and treatment of covariates (scaling and no
scaling). Additionally a model with no covariates and observation of
data of 80% is used as a reference in both scenarios. Models without
the smoothing term were tested, but the results converged to non-ideal
posterior distribution. Three separate chains with different random
initial values were run for each replicated data set, each with 70
000 MCMC dependent samples (thinned by a factor of 5), including
50 000 MCMC samples for a burn-in period, leaving 4 000 effective
MCMC samples. Default MCMC algorithms assigned by NIMBLE are
used for most of the nodes, with the exception of the coefficients for
the survival function (𝛽) which use an Automated Factor Slice Sampler
(AFSS) (Tibbits et al., 2014) to reduce the auto-correlation among the
MCMC samples and improve the mixing of the chains. Convergence
of the MCMC simulation was ensured via visual inspection of the
MCMC chains for many different parameters, Effective Sample Size is
an approximate number of independent samples which could reach the
same estimation accuracy as our generated MCMC samples (Robert and
Casella, 2010), which values for some parameters are presented in the
supplementary materials. The models were run at the Finnish Center
for Scientific Computing (CSC) using the high-performance computer
Puhti and it took about 20 h of CPU time per chain.

3.1. Estimation of population size

The behavior of three estimated time series for different scenarios
(no covariates, no-scaling, scaling) and observation levels (40%, 60%
and 80%) are presented in Figs. 6–8. The posterior distributions of the
population sizes at each generation were summarized by the posterior
means calculated over chains and the credible intervals estimated
6

Fig. 5. Simulated population scenarios. Total population size for Decline (Bottle-
neck) population scenario on the top (bottom) panel, respectively. The x-axis represents
simulated generation, and y-axis population size, color and size of the circles represent
the mean and standard deviation of the inbreeding coefficient at each time point across
all alive individuals. The shaded area defines the study generations, where the data is
collected between generations 33 to 52.

across all replicates from a concatenated chain. Using no covariates,
Fig. 6, presents over-estimated values at the first two generations,
across all the replicates in both scenarios. In the case of no scaling, there


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Fig. 6. Population size posterior estimates over time, no covariates case: Left (Right) panel presents the scenario: Bottleneck (Decline) population, with a 80% of data seen.
Comparison between the true simulated value (blue) and the posterior mean estimate for every MC chain in each replicate (gray), an average posterior across the 50 replicates of
the upper and lower 95% credible intervals are presented by the red dashed lines.
Fig. 7. Population size posterior estimates over time, no scaling of covariates case: Left (Right) panel presents the scenario: Bottleneck (Decline) population, with a 40%,
60% and 80% of data seen from top to bottom, respectively. Comparison between the true simulated value (blue) and the posterior mean estimate for every MC chain in each
replicate (gray), an average posterior across the 50 replicates of the upper and lower 95% credible intervals are presented by the red dashed lines.
is also an over-estimation that decreases as the observation level of data
increases (Fig. 7). The Decline population presents 5% of the chains
converging to a non-ideal posterior distribution. Finally, scaling the
covariates results in over-estimation in the first generation at different
observation levels of the Bottleneck scenario while the Decline scenario
has less variability across all the replicates along the level of data seen
(Fig. 8).

MSE was calculated across the 50 replicates between the posterior
means and the true simulated population sizes at each generation.
Logarithmic transformation was used to facilitate interpretation. Fig. 9,
top row, presents the results of the two scenarios across different
models, with the highest error occurring at the beginning of the study.
In the Decline scenario, the model with no covariates provided the
most accurate results (the lowest MSE), it is possible that the models
that used covariates are not strong enough to capture the complexity
7

of the data, making the simpler model a better option. All models
performed equally well for the Bottleneck population scenario. In both
cases, the first two time points presented the highest deviation from
data. Examining various levels of data seen enables us to see their
effect on the uncertainty present in the posterior distributions. Fig. 9,
bottom row, presents the average standard deviations of each replicate,
in both scenarios, a larger levels of data seen reduced the variability,
as expected.

3.2. Posterior estimation of survival probability

Posterior means are collected across all the replicates for effects
of inbreeding and of age on the survival probability and they are
presented in Fig. 10. In Decline scenario, when no scaling is applied,
the effect of inbreeding is difficult to estimate accurately with 40%


Ecological Modelling 490 (2024) 110662

8

D. Rondon et al.

Fig. 8. Population size posterior estimates over time, scaling of covariates case: Left (Right) panel presents the scenario: Bottleneck (Decline) population, with a 40%, 60%
and 80% of data seen from top to bottom, respectively. Comparison between the true simulated value (blue) and the posterior mean estimate for every MC chain in each replicate
(gray), an average posterior across the 50 replicates of the upper and lower 95% credible intervals are presented by the red dashed lines.

Fig. 9. Mean Square Error (MSE): Top (bottom) row present the Log average (standard deviation) difference between the estimated posterior mean value with the true simulated
population size, across all the replicates over generations. Left (Right) column presents the Decline (Bottleneck) population scenario, respectively. Three cases for modeling the
survival probabilities: scaling of covariates, no scaling of covariates and no covariates are shown with colors: red, blue, and green; where the type of lines refers to different levels
of data seen being 40% (dotted), 60% (dashed), and 80% (solid lines), respectively.


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Fig. 10. Distribution of the 50 posterior means of the survival regression coefficient on the Decline scenario. Top (bottom) row is used for the no scaling (scaling) scheme.
Left (Right) panel presents information of inbreeding (age), respectively. The sample probability is represented by colors red, green, and blue corresponding to samples observed
at 40, 60 and, 80% respectively. For the model with no scaling of the covariates, the true simulated value is presented with the black dotted vertical line.
level of data seen given that the estimated posterior means do not
move from the given priors, one reason being the lack of data. On
the other hand, when observation level of either 60% or 80% is used,
the posterior values seem to move away from their true simulated
values. The estimated effect of the covariate age is close to the true
simulated value when observation levels are over 40%. On the other
hand, when the covariates are scaled, the level of data seen of 40%
results in estimates close to zero, and effects become easier to estimate
at 80% observation level.

For the Bottleneck scenario in Fig. 11, no scaling of the covariates
finds an effect when observation levels is 80%, while in the case of
scaling of the covariates, a bimodal posterior distributions are obtained
at all observation levels.

The average 95% credible intervals of the lower and upper bounds
are estimated across all replicates from a concatenated chain and
presented in Tables 4–9, in Appendix A. In all the cases, the model
finds it difficult to identify an effect of the covariates when the level
of data seen is 40%. The effect of inbreeding is identifiable in males
when the covariates are scaled and the data seen is 80%. In contrast,
the same effect is not found in females. In the case of no scaling, the
effect of age is easily identifiable for both males and females.

For no scaling of covariates, the posterior estimates (Tables 5 and
8 in the Appendix A) are compared with the true values used in
the simulation (Table 3(b)). For both scenarios, the posterior mean
estimates of the age coefficient are closer to the true values when the
level of data seen increases (Fig. 8). For the regression coefficient of
inbreeding, the posterior mean is closer to the simulated value when
the effect of inbreeding is large (Bottleneck scenario). The posterior
credible intervals seem to behave as expected so that they become
wider when the observation level decreases.
9

3.3. Posterior estimation of the entry probabilities

The estimated posterior probability of staying outside the system
across the study generations (𝑠1,𝑡), namely, no entry probability, is
averaged over the 50 replicates and presented in Fig. 12. Decline
population shows a positive trend when there are no covariates or
the covariates are scaled, while the no scaling case presents a small
decrease over time. The results are consistent at different levels of data
seen. In the Bottleneck scenario, the probability declines across the
study generations. Remaining probabilities (𝑠2∶5,𝑡) are presented in the
Appendix A, Figs. 13 and 14. In both scenarios, posterior probability
estimates for entering as an adult (𝑠3,𝑡 and 𝑠5,𝑡) are close to zero and
have small variations across the different study generations.

4. Discussion

The well-known connection between population decline, high levels
of inbreeding, deleterious mutations, and inbreeding depression has
been discussed and examined for decades and still is observed (Hedrick
et al. 2019; Kyriazis et al.; 2021 Robinson et al. 2023; among others).
To learn more about this connection, we studied a method that includes
the inbreeding coefficient as an explanatory variable to model the
survival probability of a small simulated population. Our approach aims
to measure the effect that the different levels of inbreeding have on
individuals and if this could improve the accuracy of the population size
predictions. The presented joint model framework allows the MCMC
sampling to consider the whole posterior distribution of the parameters,
where information is summarized. In the future, the presented method

could be implemented on real conservation data and may help in the


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Fig. 11. Distribution of the 50 posterior means of the survival regression coefficient on the Bottleneck scenario. Top (bottom) row is used for the no scaling (scaling)
scheme. Left (Right) panel presents information of inbreeding (age), respectively. The sample probability is represented by colors red, green, and blue corresponding to different
levels of data seen at 40, 60 and, 80% respectively. For the model with no scaling of the covariates, the true simulated value is presented with the black dotted vertical line.
Fig. 12. No entry probability (𝑠1,𝑡): Averaged posterior probability of no entry across all the replicates. Left (Right) column represent the Decline (Bottleneck) scenario, respectively.
Three cases for modeling the survival probabilities: scaling, no scaling and no covariates are shown with colors: red, blue, and green; where the type of lines refers to different
levels of data seen being 40% (dotted), 60% (dashed), and 80% (solid lines), respectively.
decision-making process for the management of small populations. This
can be done by obtaining capture-recapture data at an individual level
and the covariates that are of interest. The individual inbreeding levels
needed for the model can be acquired from pedigrees or genomic data.

Even if the number of parameters in our model was large and their
dependence structure was complex, we did not find large problems
in estimating posterior distributions for population size time series, in
contrast to Newman et al. (2022). Here, it was possible to see that under
different approaches to model the survival probability (no covariates,
no scaling, and scaling of covariates) and different levels of data seen
(40%, 60%, and 80%), the model is consistent with the location of
the population size time series and trend. However, a high number
of sample individuals brought less variability in the posterior distribu-
tions. It is possible that the identified overestimation in the number of
10
alive individuals may be due to the small population size, but further
research is needed, several approaches such as having a longer time
series, observing more individuals, or introducing informative priors to
some parameters may help on obtaining more accurate estimates. On
the other hand, estimation of the survival parameters is challenging for
more complex models, especially with a small observation level.

Estimating suitable posterior distributions for modeling transition
probabilities is a big challenge for this framework, and care must be
taken in the interpretation of the parameters. Here, the most accurate
parameter estimation was easier to find when the covariates were not
scale, the model was not over-parameterized and the observation level
was large. A more complex model can bring identifiability issues in
the estimation, as seen in the bi-modality of the posterior distribution
in Figs. 10 and 11. We found consistency in our results and results


Ecological Modelling 490 (2024) 110662D. Rondon et al.
of Gimenez et al. (2009), where a CJS model was used to study
how parameter redundancy produces low precision estimates. A more
detailed analysis of over-parameterization can be found in Cole (2020).
Parameter estimation over a transitional state, such as young individ-
uals, is poorly estimated, possibly because there is a lack of observed
individuals to have a meaningful impact on the posterior distribution.

Misspecification of the probability of staying outside of the system,
no entry probability, may lead to a wrong interpretation of the overall
behavior of the population. In both simulated scenarios the population
is close to extinction and it will be expected for 𝑠1,𝑡 to have a positive
trend, indicating a decrease in the rate of individuals entering the
system as time progresses. This behavior is exclusively observed in the
Decline scenario when the covariates are scaled.

Data augmentation was found to perform well in controlling for
unseen individuals. In the presence of covariates, the augmentation
has to be performed also over them. Here it is natural to assume that
covariates of non-seen and seen individuals come from the same dis-
tribution. Multiple approaches were tested (results not shown), where
augmentation was performed with NA values, zero values, or ignored
during the MCMC, all leading to poor posterior estimates. Then, by
sampling non-observed covariates jointly from the observed popula-
tion, one can ensure that the augmentation is not inducing any bias
and MCMC will converge more easily. Also, by assuming neighboring
entry probabilities to be close to each other (i.e., to be smooth over
generations), we were able to improve identifiability of the posterior
estimates of the population size. This avoided sudden jumps in the
process and gave the model stability, since the entry probability at
generation 𝑡 + 1 depends on 𝑡, allowing the model to borrow strength
from adjacent generations, while extending the modeling to a real life
application, it could be useful to reduce the computational time and
try different approaches such as Approximate Bayesian Computation
(Beaumont, 2010; Baey et al., 2023).

Several extensions can be implemented on the model. One that
can be relevant for the parameter estimation is to use time-to-event
data to model the survival probabilities (Ergon et al., 2018). This
framework would bring a more straightforward interpretation of the
parameters, correctly handling censoring in data that can exist as
migration or an unseen dead, and may be able to reduce the bias on
the estimation of the survival coefficients. Another relevant extension
would be to allow the model to use continuous time points, which may
help to obtain more information in order to have a better estimability
of the parameters. Thirdly, it will be of interest to model the entry
probabilities as a function of different covariates, which can be done
by using multinomial logit models. Finally, it is possible to extend the
model, allowing for the observation of a dead state and studying the
true survival of the individuals, in this case, the structure of the model
needs to be adjusted creating a new hidden state that holds information
about the ‘‘recently dead’’, that is transitory to a ‘‘dead’’ state. This new
hidden state can emit ‘‘not seen’’ or ‘‘recover’’ with certain probabilities.
Overall, this research underscores the need for continued innovation
and refinement of statistical models to improve our understanding and
management of complex ecological systems.

CRediT authorship contribution statement

Diego Rondon: Writing – review & editing, Writing – original draft,
Visualization, Software, Methodology, Formal analysis, Data curation,
Conceptualization. Samu Mäntyniemi: Validation, Conceptualization.
Jouni Aspi: Validation, Conceptualization. Laura Kvist: Validation,
Conceptualization. Mikko J. Sillanpää: Writing – review & editing,
Validation, Conceptualization.

Declaration of competing interest
11

Authors declare no conflict of interest.
Table 4
Mean estimate and credible interval in the Decline scenario, scaling: mean, lower and
upper 95% credible interval calculated across a concatenate chain of the 50 replicates.
Boldface values are credible intervals that do not contain the value zero.

40% 60% 80%

𝛼0;𝑓 1.48(−4.46;11.80) 5.33(−4.06;13.01) 10.35(6.13;14.41)
𝛼1;𝑓 0.80(−2.29;5.10) 2.04(−1.13;5.93) 2.82(−0.02;5.85)
𝛼2;𝑓 −1.56(−8.68;1.26) −4.63(−9.64;1.25) −8.13(−10.31;−6.10)
𝛼0;𝑚 5.63(−3.47;12.14) 9.07(−2.60;14.01) 11.00(7.11;14.90)
𝛼1;𝑚 3.09(−1.24;7.87) 3.48(−0.04;7.29) 3.42(0.58;6.62)
𝛼2;𝑚 −5.53(−9.86;1.06) −7.71(−10.61;0.97) −8.74(−10.90;−6.79)
𝛽0;𝑓 1.92(−3.15;7.28) 1.51(−2.76;6.74) 1.47(−2.76;5.99)
𝛽1;𝑓 2.10(−2.73;7.32) 0.80(−3.51;6.80) 0.13(−3.55;5.21)
𝛽0;𝑚 2.15(−2.96;7.44) 2.22(−2.91;7.54) 2.10(−3.04;7.37)
𝛽1;𝑚 2.20(−2.78;7.41) 2.41(−2.51;7.55) 2.19(−2.56;7.37)

Table 5
Mean estimate and credible interval in the Decline scenario, no scaling: mean, lower
and upper 95% credible interval calculated across a concatenate chain of the 50
replicates. Boldface values are credible intervals that do not contain the value zero.

40% 60% 80%

𝛽0 6.27(1.07;14.79) 10.75(7.47;15.47) 10.40(7.87;13.69)
𝛽1 0.20(−4.16;4.85) 1.22(−3.47;5.72) 1.60(−2.07;5.21)
𝛽2 −1.41(−3.26;−0.27) −2.41(−3.41;−1.71) −2.34(−3.03;−1.81)

Table 6
Mean estimate and credible interval in the Decline scenario, no covariates: mean,
lower and upper 95% credible interval calculated across a concatenate chain of the
50 replicates. Boldface values are credible intervals that do not contain the value
zero.

80%

𝛽0 0.65(0.50;0.80)

Data availability

Code and data for the simulation and analysis are available in:
https://etsin.fairdata.fi/.

Acknowledgments

We are very grateful to the anonymous referees for their construc-
tive comments, which helped us to improve the presentation of the
work.

The authors wish to acknowledge the Center of Scientific Computing
(CSC), Finland, for generous computational resources.

This work was financially supported by the Kvantum Institute, Uni-
versity of Oulu. Under the project: ‘‘Fall and rise of endangered species:
Detection of genomic and population ecological signals of decreasing
and increasing populations’’.

Appendix A
Posterior estimation of the survival probability

Average posterior mean, low and high credible intervals for a
concatenated chain across all the replicates

Posterior estimation of the entry probabilities

See Figs. 13 and 14.

Appendix B. Supplementary data

Supplementary material related to this article can be found online
at https://doi.org/10.1016/j.ecolmodel.2024.110662.

https://etsin.fairdata.fi/
https://doi.org/10.1016/j.ecolmodel.2024.110662


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Fig. 13. Entry probability (𝑠2∶5,𝑡) Decline scenario: Average posterior mean of the possible entry probabilities across all the replicates. Top (Bottom) row represent entry
probability as Female (Male) and left (right) column represent entry probability as Young (Adult). Three cases for modeling the survival probabilities: scaling, no scaling and no
covariates are shown with colors: red, blue, and green; where the type of lines refers to the different levels of data seen, being 40% (dotted), 60% (dashed), and 80% (solid lines),

respectively.
Fig. 14. Entry probability (𝑠2∶5,𝑡) Bottleneck scenario: Average posterior mean of the possible entry probabilities across all the replicates. Top (Bottom) row represent entry
probability as Female (Male) and left (right) column represent entry probability as Young (Adult). Three cases for modeling the survival probabilities: scaling, no scaling and no
covariates are shown with colors: red, blue, and green; where the type of lines refers to the different levels of data seen, being 40% (dotted), 60% (dashed), and 80% (solid lines),

respectively.
Table 7
Mean estimate and credible interval in the Bottleneck scenario, scaling: mean, lower and
upper 95% credible interval calculated across a concatenate chain of the 50 replicates.
Boldface values are credible intervals that do not contain the value zero.

40% 60% 80%

𝛼0;𝑓 2.48(−3.61;10.54) 3.78(−2.27;12.18) 2.92(−2.16;12.87)
𝛼1;𝑓 −0.26(−4.34;4.75) −0.52(−4.03;4.19) −1.12(−4.15;3.63)
𝛼2;𝑓 −1.84(−8.41;1.61) −2.30(−8.96;1.56) −1.24(−9.34;1.70)
𝛼0;𝑚 3.34(−4.29;10.54) 3.69(−3.94;11.38) 6.54(−3.29;13.46)
𝛼1;𝑚 1.17(−2.92;5.44) 1.36(−2.26;5.46) 1.43(−1.92;4.89)
𝛼2;𝑚 −3.36(−8.68;1.47) −3.60(−9.23;1.46) −5.38(−10.21;1.53)
𝛽0;𝑓 2.10(−2.86;7.28) 2.61(−2.34;7.78) 2.86(−2.08;8.00)
𝛽1;𝑓 1.84(−3.21;7.20) 2.79(−1.92;7.81) 3.15(−1.46;8.07)
𝛽0;𝑚 1.73(−3.35;7.11) 2.43(−2.60;7.70) 2.72(−2.31;7.93)
𝛽1;𝑚 2.00(−2.59;7.13) 2.82(−1.71;7.76) 3.17(−1.32;8.04)
12
Table 8
Mean estimate and credible interval in the Bottleneck scenario, no scaling: mean,
lower and upper 95% credible interval calculated across a concatenate chain of the
50 replicates. Boldface values are credible intervals that do not contain the value
zero.

40% 60% 80%

𝛽0 0.40(−0.16;1.03) 3.77(0.79;16.34) 14.81(8.73;23.58)
𝛽1 0.16(−1.44;1.84) −0.79(−6.32;1.57) −3.69(−8.57;0.63)
𝛽2 −0.25(−0.40;−0.11) −1.04(−4.21;−0.28) −3.83(−6.01;−2.30)


Ecological Modelling 490 (2024) 110662D. Rondon et al.
Table 9
Mean estimate and credible interval in the Bottleneck scenario, No covariates: mean,
lower and upper 95% credible interval calculated across a concatenate chain of the 50
replicates. Boldface values are credible intervals that do not contain the value zero.

80%

𝛽0 0.28(0.09;0.46)

References

Abadi, Fitsum, Gimenez, Olivier, Arlettaz, Raphaël, Schaub, Michael, 2010. An as-
sessment of integrated population models: bias, accuracy, and violation of the
assumption of independence. Ecology 91 (1), 7–14. http://dx.doi.org/10.1890/08-
2235.1.

Aitken, Sally N., Whitlock, Michael C., 2013. Assisted gene flow to facilitate local
adaptation to climate change. Annu. Rev. Ecol. Evol. Syst. 44, 367–388. http:
//dx.doi.org/10.1146/annurev-ecolsys-110512-135747.

Armstrong, Doug P, Parlato, Elizabeth H, Egli, Barbara, Dimond, Wendy J,
Kwikkel, Renske, Berggren, Åsa, McCready, Mhairi, Parker, Kevin A, Ewen, John G,
2021. Using long-term data for a reintroduced population to empirically estimate
future consequences of inbreeding. Conserv. Biol. 35 (3), 859–869. http://dx.doi.
org/10.1111/cobi.13646.

Armstrong, Doug P., Seddon, Philip J., 2008. Directions in reintroduction biology.
Trends Ecol. Evol. 23 (1), 20–25. http://dx.doi.org/10.1016/j.tree.2007.10.003.

Baey, Charlotte, Smith, Henrik G, Rundlöf, Maj, Olsson, Ola, Clough, Yann, Sahlin, Ull-
rika, 2023. Calibration of a bumble bee foraging model using approximate Bayesian
computation. Ecol. Model. 477, 110251. http://dx.doi.org/10.1016/j.ecolmodel.
2022.110251.

Beaumont, Mark A., 2010. Approximate Bayesian computation in evolution and ecology.
Annu. Rev. Ecol. Evol. Syst. 41, 379–406. http://dx.doi.org/10.1146/annurev-
ecolsys-102209-144621.

Benson, John F, Mahoney, Peter J, Vickers, T Winston, Sikich, Jeff A, Beier, Paul,
Riley, Seth PD, Ernest, Holly B, Boyce, Walter M, 2019. Extinction vortex dynamics
of top predators isolated by urbanization. Ecol. Appl. 29 (3), e01868. http://dx.
doi.org/10.1002/eap.1868.

Blomqvist, Donald, Pauliny, Angela, Larsson, Mikael, Flodin, Lars-Åke, 2010. Trapped
in the extinction vortex? Strong genetic effects in a declining vertebrate population.
BMC Evol. Biol. 10 (1), 1–9. http://dx.doi.org/10.1186/1471-2148-10-33.

Bozzuto, Claudio, Biebach, Iris, Muff, Stefanie, Ives, Anthony R, Keller, Lukas F, 2019.
Inbreeding reduces long-term growth of alpine ibex populations. Nat. Ecol. Evol. 3
(9), 1359–1364. http://dx.doi.org/10.1038/s41559-019-0968-1.

Caballero, Armando, 2020. Quantitative Genetics. Cambridge University Press.
Caughley, Graeme, 1994. Directions in conservation biology. J. Anim. Ecol. 215–244.
Cole, Diana J., 2020. Parameter Redundancy and Identifiability. Chapman and

Hall/CRC.
de Valpine, P., Turek, D., Paciorek, C.J., Anderson-Bergman, C., Temple Lang, D.,

Bodik, R., 2017. Programming with models: writing statistical algorithms for
general model structures with NIMBLE. J. Comput. Graph. Statist. 26, 403–417.
http://dx.doi.org/10.1080/10618600.2016.1172487.

Domingue, Michael J., Teale, Stephen A., 2007. Inbreeding depression and its effect on
intrinsic population dynamics in engraver beetles. Ecol. Entomol. 32 (2), 201–210.
http://dx.doi.org/10.1111/j.1365-2311.2007.00854.x.

Ergon, Torbjørn, Borgan, Ørnulf, Nater, Chloé Rebecca, Vindenes, Yngvild, 2018. The
utility of mortality hazard rates in population analyses. Methods Ecol. Evol. 9 (10),
2046–2056. http://dx.doi.org/10.1111/2041-210X.13059.

Fitzpatrick, Sarah W, Bradburd, Gideon S, Kremer, Colin T, Salerno, Patricia E,
Angeloni, Lisa M, Funk, W Chris, 2020. Genomic and fitness consequences of
genetic rescue in wild populations. Curr. Biol. 30 (3), 517–522. http://dx.doi.org/
10.1016/j.cub.2019.11.062.

Frankham, Richard, Ballou, Jonathan D, Ralls, Katherine, Eldridge, Mark Derek Bruce,
Dudash, Michele R, Fenster, Charles B, Lacy, Robert C, Sunnucks, Paul, 2017. Ge-
netic Management of Fragmented Animal and Plant Populations. Oxford University
Press.

Gimenez, Olivier, Morgan, Byron J.T., Brooks, Stephen P., 2009. Weak identifiability in
models for mark-recapture-recovery data. In: Thomson, David L., Cooch, Evan G.,
Conroy, Michael J. (Eds.), Modeling Demographic Processes in Marked Populations.
Springer. Springer US, Boston, MA, pp. 1055–1067. http://dx.doi.org/10.1007/978-
0-387-78151-8_48.

Godwin, Joanne L, Lumley, Alyson J, Michalczyk, Łukasz, Martin, Oliver Y,
Gage, Matthew JG, 2020. Mating patterns influence vulnerability to the extinction
vortex. Global Change Biol. 26 (8), 4226–4239. http://dx.doi.org/10.1111/gcb.
15186.

Hedrick, PW, Robinson, JA, Peterson, Rolf O, Vucetich, John A, 2019. Genetics and
extinction and the example of isle royale wolves. Animal Conserv. 22 (3), 302–309.
http://dx.doi.org/10.1111/acv.12479.
13
Keller, Lukas F., Waller, Donald M., 2002. Inbreeding effects in wild populations. Trends
Ecol. Evol. 17 (5), 230–241. http://dx.doi.org/10.1016/S0169-5347(02)02489-8.

Kéry, Marc, Schaub, Michael, 2011. Bayesian Population Analysis using WinBUGS: A
Hierarchical Perspective. Academic Press.

Kyriazis, Christopher, Wayne, Robert, Lohmueller, Kirk, 2021. Strongly deleterious mu-
tations are a primary determinant of extinction risk due to inbreeding depression.
Evol. Lett. 5 (1), 33–47. http://dx.doi.org/10.1002/evl3.209.

Lande, Russell, 1993. Risks of population extinction from demographic and envi-
ronmental stochasticity and random catastrophes. Am. Nat. 142 (6), 911–927.
http://dx.doi.org/10.1086/285580.

Lang, Stefan, Brezger, Andreas, 2004. Bayesian P-splines. J. Comput. Graph. Statist. 13
(1), 183–212. http://dx.doi.org/10.1198/1061860043010.

Latter, B.D.H., Robertson, Alan, 1962. The effects of inbreeding and artificial selection
on reproductive fitness. Genet. Res. 3 (1), 110–138. http://dx.doi.org/10.1017/
S001667230000313X.

Lebreton, Jean-Dominique, Cefe, R. Pradel, 2002. Multistate recapture models: mod-
elling incomplete individual histories. J. Appl. Stat. 29 (1–4), 353–369. http:
//dx.doi.org/10.1080/02664760120108638.

Leclerc, Camille, Bellard, Céline, Luque, Gloria M, Courchamp, Franck, 2015. Over-
coming extinction: understanding processes of recovery of the tibetan antelope.
Ecosphere 6 (9), 1–14. http://dx.doi.org/10.1890/ES15-00049.1.

Leimu, Roosa, Mutikainen, PIA, Koricheva, Julia, Fischer, Markus, 2006. How general
are positive relationships between plant population size, fitness and genetic
variation? J. Ecol. 94 (5), 942–952. http://dx.doi.org/10.1111/j.1365-2745.2006.
01150.x.

Lesaffre, Emmanuel, Lawson, Andrew B., 2012. Bayesian Biostatistics. John Wiley &
Sons.

Marr, AB, Arcese, P, Hochachka, WM, Reid, Jane Margaret, Keller, LF, 2006. Interactive
effects of environmental stress and inbreeding on reproductive traits in a wild bird
population. J. Anim. Ecol. 75 (6), 1406–1415. http://dx.doi.org/10.1111/j.1365-
2656.2006.01165.x.

Newman, Ken, King, Ruth, Elvira, Víctor, de Valpine, Perry, McCrea, Rachel S, Mor-
gan, Byron JT, 2022. State-space models for ecological time-series data: Practical
model-fitting. Methods Ecol. Evol. 14 (1), 26–42. http://dx.doi.org/10.1111/2041-
210X.13833.

Nietlisbach, Pirmin, Muff, Stefanie, Reid, Jane M, Whitlock, Michael C, Keller, Lukas F,
2019. Nonequivalent lethal equivalents: Models and inbreeding metrics for unbiased
estimation of inbreeding load. Evol. Appl. 12 (2), 266–279. http://dx.doi.org/10.
1111/eva.12713.

Ovaskainen, Otso, Meerson, Baruch, 2010. Stochastic models of population extinction.
Trends Ecol. Evol. 25 (11), 643–652. http://dx.doi.org/10.1016/j.tree.2010.07.009.

Pimm, Stuart L, Jenkins, Clinton N, Abell, Robin, Brooks, Thomas M, Gittleman, John L,
Joppa, Lucas N, Raven, Peter H, Roberts, Callum M, Sexton, Joseph O, 2014. The
biodiversity of species and their rates of extinction, distribution, and protection.
Science 344 (6187), 1246752. http://dx.doi.org/10.1126/science.1246752.

R Core Team, 2021. R: A Language and Environment for Statistical Computing. R
Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/.

Rabiner, Lawrence R., 1989. A tutorial on hidden Markov models and selected
applications in speech recognition. Proc. IEEE 77 (2), 257–286.

Ripple, William J, Wolf, Christopher, Newsome, Thomas M, Hoffmann, Michael,
Wirsing, Aaron J, McCauley, Douglas J, 2017. Extinction risk is most acute for
the world’s largest and smallest vertebrates. Proc. Natl. Acad. Sci. USA 114 (40),
10678–10683. http://dx.doi.org/10.1073/pnas.1702078114.

Robb, Gillian N, McDonald, Robbie A, Chamberlain, Dan E, Reynolds, S James,
Harrison, Timothy JE, Bearhop, Stuart, 2008. Winter feeding of birds increases
productivity in the subsequent breeding season. Biol. Lett. 4 (2), 220–223. http:
//dx.doi.org/10.1098/rsbl.2007.0622.

Robert, Christian, Casella, George, 2010. Introducing Monte Carlo Methods with R.
Springer.

Robinson, Jacqueline, Kyriazis, Christopher C, Yuan, Stella C, Lohmueller, Kirk E,
2023. Deleterious variation in natural populations and implications for conservation
genetics. Annu. Rev. Anim. Biosci. 11, 93–114. http://dx.doi.org/10.1146/annurev-
animal-080522-093311.

Royle, J. Andrew, Dorazio, Robert M., 2012. Parameter-expanded data augmentation
for Bayesian analysis of capture–recapture models. J. Ornithol. 152 (2), 521–537.
http://dx.doi.org/10.1007/s10336-010-0619-4.

Saccheri, Ilik, Kuussaari, Mikko, Kankare, Maaria, Vikman, Pia, Fortelius, Wilhelm,
Hanski, Ilkka, 1998. Inbreeding and extinction in a butterfly metapopulation.
Nature 392 (6675), 491–494. http://dx.doi.org/10.1038/33136.

Tibbits, Matthew M, Groendyke, Chris, Haran, Murali, Liechty, John C, 2014. Auto-
mated factor slice sampling. J. Comput. Graph. Statist. 23 (2), 543–563. http:
//dx.doi.org/10.1080/10618600.2013.791193.

Zucchini, Walter, MacDonald, Iain L., Langrock, Roland, 2016. Hidden Markov Models
for Time Series: An Introduction using R. Chapman and Hall/CRC.

http://dx.doi.org/10.1890/08-2235.1
http://dx.doi.org/10.1890/08-2235.1
http://dx.doi.org/10.1890/08-2235.1
http://dx.doi.org/10.1146/annurev-ecolsys-110512-135747
http://dx.doi.org/10.1146/annurev-ecolsys-110512-135747
http://dx.doi.org/10.1146/annurev-ecolsys-110512-135747
http://dx.doi.org/10.1111/cobi.13646
http://dx.doi.org/10.1111/cobi.13646
http://dx.doi.org/10.1111/cobi.13646
http://dx.doi.org/10.1016/j.tree.2007.10.003
http://dx.doi.org/10.1016/j.ecolmodel.2022.110251
http://dx.doi.org/10.1016/j.ecolmodel.2022.110251
http://dx.doi.org/10.1016/j.ecolmodel.2022.110251
http://dx.doi.org/10.1146/annurev-ecolsys-102209-144621
http://dx.doi.org/10.1146/annurev-ecolsys-102209-144621
http://dx.doi.org/10.1146/annurev-ecolsys-102209-144621
http://dx.doi.org/10.1002/eap.1868
http://dx.doi.org/10.1002/eap.1868
http://dx.doi.org/10.1002/eap.1868
http://dx.doi.org/10.1186/1471-2148-10-33
http://dx.doi.org/10.1038/s41559-019-0968-1
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb10
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb11
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb12
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb12
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb12
http://dx.doi.org/10.1080/10618600.2016.1172487
http://dx.doi.org/10.1111/j.1365-2311.2007.00854.x
http://dx.doi.org/10.1111/2041-210X.13059
http://dx.doi.org/10.1016/j.cub.2019.11.062
http://dx.doi.org/10.1016/j.cub.2019.11.062
http://dx.doi.org/10.1016/j.cub.2019.11.062
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb17
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb17
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb17
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb17
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb17
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb17
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb17
http://dx.doi.org/10.1007/978-0-387-78151-8_48
http://dx.doi.org/10.1007/978-0-387-78151-8_48
http://dx.doi.org/10.1007/978-0-387-78151-8_48
http://dx.doi.org/10.1111/gcb.15186
http://dx.doi.org/10.1111/gcb.15186
http://dx.doi.org/10.1111/gcb.15186
http://dx.doi.org/10.1111/acv.12479
http://dx.doi.org/10.1016/S0169-5347(02)02489-8
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb22
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb22
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb22
http://dx.doi.org/10.1002/evl3.209
http://dx.doi.org/10.1086/285580
http://dx.doi.org/10.1198/1061860043010
http://dx.doi.org/10.1017/S001667230000313X
http://dx.doi.org/10.1017/S001667230000313X
http://dx.doi.org/10.1017/S001667230000313X
http://dx.doi.org/10.1080/02664760120108638
http://dx.doi.org/10.1080/02664760120108638
http://dx.doi.org/10.1080/02664760120108638
http://dx.doi.org/10.1890/ES15-00049.1
http://dx.doi.org/10.1111/j.1365-2745.2006.01150.x
http://dx.doi.org/10.1111/j.1365-2745.2006.01150.x
http://dx.doi.org/10.1111/j.1365-2745.2006.01150.x
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb30
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb30
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb30
http://dx.doi.org/10.1111/j.1365-2656.2006.01165.x
http://dx.doi.org/10.1111/j.1365-2656.2006.01165.x
http://dx.doi.org/10.1111/j.1365-2656.2006.01165.x
http://dx.doi.org/10.1111/2041-210X.13833
http://dx.doi.org/10.1111/2041-210X.13833
http://dx.doi.org/10.1111/2041-210X.13833
http://dx.doi.org/10.1111/eva.12713
http://dx.doi.org/10.1111/eva.12713
http://dx.doi.org/10.1111/eva.12713
http://dx.doi.org/10.1016/j.tree.2010.07.009
http://dx.doi.org/10.1126/science.1246752
https://www.R-project.org/
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb37
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb37
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb37
http://dx.doi.org/10.1073/pnas.1702078114
http://dx.doi.org/10.1098/rsbl.2007.0622
http://dx.doi.org/10.1098/rsbl.2007.0622
http://dx.doi.org/10.1098/rsbl.2007.0622
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb40
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb40
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb40
http://dx.doi.org/10.1146/annurev-animal-080522-093311
http://dx.doi.org/10.1146/annurev-animal-080522-093311
http://dx.doi.org/10.1146/annurev-animal-080522-093311
http://dx.doi.org/10.1007/s10336-010-0619-4
http://dx.doi.org/10.1038/33136
http://dx.doi.org/10.1080/10618600.2013.791193
http://dx.doi.org/10.1080/10618600.2013.791193
http://dx.doi.org/10.1080/10618600.2013.791193
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb45
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb45
http://refhub.elsevier.com/S0304-3800(24)00050-4/sb45

	Kansi_Rondon_etal-2024-A_Bayesian_multi-state_model_with_data_augmentation
	1-s2.0-S0304380024000504-main
	A Bayesian multi-state model with data augmentation for estimating population size and effect of inbreeding on survival
	Introduction
	Material and Methods
	Multi-state model for data analysis
	Survival probabilities

	Model building
	Simulation of test scenarios
	Simulated scenarios
	Model assessment


	Results
	Estimation of population size
	Posterior estimation of survival probability
	Posterior estimation of the entry probabilities

	Discussion
	CRediT authorship contribution statement
	Declaration of competing interest
	Data availability
	Acknowledgments
	Appendix A
	Posterior estimation of the survival probability
	Posterior estimation of the entry probabilities

	Appendix B. Supplementary data
	References