Jukuri, open repository of the Natural Resources Institute Finland (Luke)               
   
 
 
All material supplied via Jukuri is protected by copyright and other intellectual property rights. 
Duplication or sale, in electronic or print form, of any part of the repository collections is prohibited. 
Making electronic or print copies of the material is permitted only for your own personal use or for 
educational purposes. For other purposes, this article may be used in accordance with the publisher’s 
terms. There may be differences between this version and the publisher’s version. You are advised to 
cite the publisher’s version. 
This is an electronic reprint of the original article.  
This reprint may differ from the original in pagination and typographic detail. 
 
Author(s): Mari Myllymäki, Tomáš Mrkvička 
Title: GET: Global Envelopes in R 
Year: 2024 
Version: Published version 
Copyright: The Author(s) 2024 
Rights: CC BY 4.0 
Rights url: https://creativecommons.org/licenses/by/4.0/  
 
Please cite the original version: 
Myllymäki, M., & Mrkvička, T. (2024). GET: Global Envelopes in R. Journal of Statistical Software, 111(3), 
1–40. https://doi.org/10.18637/jss.v111.i03.  
JSS Journal of Statistical Software
November 2024, Volume 111, Issue 3. doi: 10.18637/jss.v111.i03
GET: Global Envelopes in R
Mari Myllymäki
Natural Resources Institute Finland (Luke)
Tomáš Mrkvička
University of South Bohemia
Abstract
This work describes the R package GET that implements global envelopes for a general
set of d-dimensional vectors T in various applications. A 100(1−α)% global envelope is a
band bounded by two vectors such that the probability that T falls outside this envelope
in any of the d points is equal to α. The term ‘global’ means that this probability is
controlled simultaneously for all the d elements of the vectors. The global envelopes can
be employed for central regions of functional or multivariate data, for graphical Monte
Carlo and permutation tests where the test statistic is multivariate or functional, and
for global confidence and prediction bands. Intrinsic graphical interpretation property is
introduced for global envelopes. The global envelopes included in the GET package have
this property, which particularly helps to interpret test results, by providing a graphical
interpretation that shows the reasons of rejection of the tested hypothesis. Examples
of different uses of global envelopes and their implementation in the GET package are
presented, including global envelopes for single and several one- or two-dimensional func-
tions, Monte Carlo goodness-of-fit tests for simple and composite hypotheses, comparison
of distributions, functional analysis of variance, functional linear model, and confidence
bands in polynomial regression.
Keywords: functional linear model, central region, goodness-of-fit, graphical normality test,
Monte Carlo test, multiple testing, permutation test, R, spatial point pattern.
1. Introduction
Global envelopes are useful for formal testing of various hypotheses using functional or mul-
tivariate statistics when interpretation of the test results is of key interest, for determining
central regions of functional or multivariate data, and also for determining confidence or
prediction bands (e.g., Myllymäki, Mrkvička, Grabarnik, Seijo, and Hahn 2017; Mrkvička,
Myllymäki, and Hahn 2017; Narisetty and Nair 2016). Global envelopes have shown their
usefulness already in many areas, e.g., spatial statistics, functional data analysis and im-
age analysis, with applications to agriculture, architecture and art, astronomy and astro-
2 GET: Global Envelopes in R
physics, ecology, evolution, economics, eye movement research, fisheries, forestry, geography,
material science, and medicine, health and neurosciences (e.g., Stoyan 2016; Murrell 2018;
Häbel et al. 2017; Chaiban et al. 2019; Pollington, Tildesley, Hollingsworth, and Chapman
2020). To make these methods easily accessible, the R (R Core Team 2024) package GET
has been developed that is available from the Comprehensive R Archive Network (CRAN)
at https://CRAN.R-project.org/package=GET. A development version of the package is
available via the repository https://github.com/myllym/GET. The package provides an
implementation of global envelopes in various settings.
Because there are many other methods that can be used for the same purposes as global
envelopes, we first give a motivating example of functional analysis of covariance (ANCOVA)
to demonstrate the main features and advantages of global envelope tests (Section 1.1). Other
possible usages are also discussed in more detail. The second part of this introductory section
describes the competing and complementary methods and software for these same usages.
Thereafter, in Section 2, we give an overview of global envelopes including the formal defi-
nition of their graphical interpretation, summary of the types of global envelopes and their
implementation in GET. In Section 3, the usage of global envelopes is shown for several
examples of applications illustrating main features of the GET package, namely (1) the com-
putation of central regions and functional boxplots for a set of functions or jointly for several
sets of functions (Section 3.1); (2) the Monte Carlo goodness-of-fit test for simple hypotheses
with application to spatial statistics (Section 3.2); (3) the Monte Carlo goodness-of-fit test
for composite hypotheses with application to graphical normality testing (Section 3.3); 4) the
graphical n-sample test of correspondence of distribution functions, n ≥ 2 (Section 3.4);
(5) the graphical functional one-way analysis of variance (ANOVA) (Section 3.5); (6) the
functional general linear model (GLM) for images (Section 3.6); and (7) the computation of
the confidence band in polynomial regression (Section 3.7). The final section, Section 4, is
left for discussion.
1.1. A motivating example and possible usages
Let us study the differences between population growth accumulated in different continents
over the period of 55 years. We assume that GDP (gross domestic product) can be the main
driver of population growth in different countries, therefore we add GDP into the model as
a nuisance factor. Figure 1 (left) shows population growth, defined here as the population
at the end of the year divided by the population at the beginning of the year, in years from
1960 to 2014 in Africa, Asia, Europe and North America, and Latin America, total in 112
countries with more than one million inhabitants in 1950 (Nagy, Gijbels, and Hlubinka 2017;
Dai, Mrkvička, Sun, and Genton 2020). Figure 1 (right) shows the GDP of every country in
the study discounted to the 1960 USD, according to USD inflation. The data was obtained
from the World Bank. The missing values of the GDP of a country were extrapolated using
the closest known ratio of the GDP of the country and the median GDP in that year, and
interpolated using linear interpolation of the two closest ratios.
Since the global envelopes are a nonparametric tool, we can use as a test statistic any multi-
variate or functional test statistic without the worry of breaking assumptions of normality or
homogeneity of the distribution of the test statistic in different years. Also, since we can use
any test statistic, we can combine several functional test statistic into one (for more details
about combining test statistics see Appendix B). Here, we combine four functional test statis-
Journal of Statistical Software 3
0.95
1.00
1.05
1.10
1960 1980 2000
Year
Po
pu
la
tio
n 
gr
ow
th
0
5000
10000
15000
1960 1980 2000
Year
G
DP
 (in
 19
60
 U
SD
)
Continent Africa Asia Europe and North America Latin America
Figure 1: Population growth and GDP (in 1960 USD) curves for 1960–2014 for 112 countries
from the four continents.
tics, each representing a coefficient vector attached to a continent in our ANCOVA model,
i.e., our test statistic is
βCont = (βCont1,1960, . . . , βCont1,2014︸ ︷︷ ︸
1:Africa
, βCont2,1960, . . . , β
Cont
2,2014︸ ︷︷ ︸
2:Asia
,
βCont3,1960, . . . , β
Cont
3,2014︸ ︷︷ ︸
3:Europe and North America
, βCont4,1960, . . . , β
Cont
4,2014︸ ︷︷ ︸
4:Latin America
), (1)
where βCont1,i , . . . βCont4,i are the parameters of the univariate ANCOVA model for year i, com-
puted with condition∑4j=1 βContj,i = 0. This model parametrization enables that the parameter
βContj,i is the difference between the j-th continent’s growth rate and the overall mean.
This test statistic allows for a direct comparison of each continent’s growth rate with the
world’s mean growth rate. Figure 2 shows the test statistic (Equation 1) computed for the
data together with 95% global envelope computed under the null model of no continent
effect but with nuisance effect of GDP. This output shows, first, the formal global p value,
which indicates that the effect of the continent is significant after applying an adjustment
for multiple comparisons for the 55 years, and four continents. Second, it shows the reasons
of rejections, i.e., which years have led to the significant test result. Due to the intrinsic
graphical interpretation (IGI, see Definition 2.1) of the global envelope, we have a one-to-one
correspondence between the formal global test and its graphical interpretation, i.e., the data
statistic lies outside the envelope for some point (here years and continents) if and only if
the global test is significant. This allows for direct identification of the significant years.
Third, the output shows the direction and amount of deviation from the null hypothesis. For
example, the growth rate in Europe and North America is significantly lower than the world’s
average in all years, whereas Asia’s growth rate is slightly significantly higher than the world’s
average only up to the year 1966. Fourth, due to the free choice of the test statistic, the test
output shows here the comparison individually for every continent (level of the categorical
predictor). The GET package also allows for the choice of comparison between all pairs of
continents in the style of the post hoc test. Note that usually, the two tests are applied: global
4 GET: Global Envelopes in R
Continent.Europe and North America Continent.Latin America
Continent.Africa Continent.Asia
1960 1980 2000 1960 1980 2000
−0.01
0.00
0.01
−0.01
0.00
0.01
r (Year)
β^ i(
r)
Central function Data function
Graphical functional GLM: p < 0.001
Figure 2: Test for the main effect of the continent, given the GDP (in 1960 USD) as a nuisance
factor. The red color is used to highlight the (significant) years where the data coefficient
βˆContj,i , i = 1960, . . . , 2014, j = 1, 2, 3, 4, goes outside the 95% global envelope computed under
the null model of no continent effect. Here j refers to the continent given in the titles.
AN(C)OVA and multiple comparison post hoc test. Note also that we can calculate results
also for continuous GDP effect and interaction effect between the GDP and all continents (all
levels of the categorical effect) in the same way (see Mrkvička and Myllymäki 2023).
Technically, the global envelope of Figure 2 was computed from 5000 permutations under the
null model using the standard Freedman and Lane (1983) permutation procedure. The IGI
measure used was the area measure (see Appendix A for details).
The above example illustrates the global envelope test, a possible usage of global envelopes.
The usage of global envelopes is however more versatile. They can be used for producing:
(i) A central region: a central region is constructed for a set of vectors or functions in
order to find central or outlying vectors or functions (e.g., outlier detection, functional
boxplot).
(ii) A global envelope test: a Monte Carlo goodness-of-fit test or a permution test where
the test statistic is multivariate or a function of any dimension (e.g., goodness-of-fit test
Journal of Statistical Software 5
for point patterns, random sets, or for a family of distributions, functional ANOVA,
functional GLM, n-sample test of correspondence of distribution functions).
(iii) Global confidence or prediction bands: a confidence or prediction band is produced
from a set of vectors or functions obtained by bootstrap or sampling from Bayesian
posterior distribution (e.g., confidence band in polynomial regression, Bayesian posterior
prediction).
In fact, the global envelope test uses the 95% central region (case (i) usage) computed under
the null hypothesis for determining the test results. The global confidence bands (case (iii)
usage) also uses the 95% central region computed under the full model. Thus in core of all
applications lies the central region. Central regions can be constructed based on any functional
ordering, but we concentrate only in those satisfying the IGI, the one-to-one correspondence
of a functional ordering and its graphical interpretation. In fact, when central region is not
satisfying IGI and we see a function which is not contained completely inside the central
region, we still do not know if such a function can be regarded as extremal. On the contrary,
when we use IGI ordering, we know that such a function is among the most, say 5%, extremal
functions.
In this article, we focus on global envelopes for testing the global hypothesis under the control
of family-wise error rate (FWER). This is to determine if at least one component of the
test statistic can be considered significant. Alternatively, one can be interested in the local
test, which aims to differentiate between the domain where the test should be rejected and
where it should not be rejected. For this purpose, the false discovery rate (FDR) control is
typically suitable, and we recently developed the FDR envelope (Mrkvička and Myllymäki
2023) which accomplishes this task with an envelope which has the IGI under the FDR control.
Examples of the FDR envelopes are provided in the vignette which can be accessed by typing
library("GET") and vignette("FDRenvelopes") in R.
1.2. Competing and complementary methods and software
Below are listed the other R packages (or code) that we know to provide functions for some
global envelopes or central regions. Further, as already mentioned, the problems (i)–(iii)
can be solved by other methods as well, not just by global envelopes. The relation of these
methods to the global envelope methods are also discussed below.
Global envelopes and central regions:
• The R package fda (Ramsay, Graves, and Hooker 2022) provides the function fbplot()
for the computation of the central region and functional boxplot according to two dif-
ferent orderings than those described here, namely the band depth and modified band
depth (MBD, López-Pintado and Romo 2009; Sun, Genton, and Nychka 2012), but
these depths do not allow for IGI.
• The R package depthTools (López-Pintado and Torrente 2021; Torrente, López-Pintado,
and Romo 2013) similarly allows for central regions based on MBD (no IGI).
• The R package spatstat (Baddeley, Rubak, and Turner 2015) provides the function
envelope() for the simulation of envelopes based on a given summary function of
a spatial point pattern. By default, envelope() provides a pointwise envelope, but
6 GET: Global Envelopes in R
the option global = TRUE allows one to compute the global envelope of Ripley (1981),
which corresponds to the "unscaled" envelope in GET (see Table 1). It has been shown
that this unscaled global envelope test has generally lower power than the other methods
of Table 1 (Myllymäki, Grabarnik, Seijo, and Stoyan 2015; Myllymäki et al. 2017).
The corresponding adjusted unscaled global envelope (Dao and Genton 2014; Baddeley,
Hardegen, Lawrence, Milne, Nair, and Rakshit 2017) for composite hypotheses is also
provided in spatstat (the function dg.envelope()).
• Aldor-Noiman, Brown, Buja, Rolke, and Stine (2013) presented a global envelope for
a Q-Q plot (and provided a link to an R script). The shape of the envelope is derived
theoretically, but the size of the envelope has to be computed from simulations. The
methods of the GET package can be used for this purpose as well, both for simple and
composite hypotheses, but the theoretical achievement of Aldor-Noiman et al. (2013)
for simple hypotheses has apparent advantages.
• The R package boot (Canty and Ripley 2024; Davison and Hinkley 1997) provides
the function envelope() for the computation of a global envelope from bootstrapped
functions. This envelope has the same shape as the global rank envelope ("rank" in
Table 1), but the appropriate envelope (l of Equation 8) is chosen in boot experimentally
(Davison and Hinkley 1997). Since the differences in the nominal levels of the subsequent
(l-)envelopes from which the choice is made can be large, the predetermined level is
reached only approximately.
• The package dbmss (Marcon, Traissac, Puech, and Lang 2015) provides similar global
envelopes as the boot package (Duranton and Overman 2005) but for the global confi-
dence envelopes of spatial summaries.
• There are other R packages with the ability to compute simultaneous confidence bands
for various models, e.g., excursions (Bolin and Lindgren 2015, 2017, 2018) for Gaus-
sian processes, AdaptFitOS (Wiesenfarth, Krivobokova, Klasen, and Sperlich 2012) for
semiparametric regression models and SCBmeanfd (Degras 2016) for nonparametric re-
gression models with functional data using a functional asymptotic normality result.
Instead, the global envelopes of the GET package (see Table 1) are constructed non-
parametrically from a set of vectors. Apparently, the package excursions allows also for
a non-parametric version with the same shape as the global rank envelope ("rank" in
Table 1), similarly as boot.
Multiple testing: The global envelope tests can be seen as a general solution to the multiple
testing problem in Monte Carlo tests (Mrkvička et al. 2017). There are several other meth-
ods and R packages to the multiple testing problem controlling FWER. The few packages
mentioned below have a link to the methods of GET:
• The R packages coin (Hothorn, Hornik, Van de Wiel, and Zeileis 2008, 2006) and
multtest (Pollard, Dudoit, and van der Laan 2005) enable one to compute the p value
adjusted for multiple testing in a multiple permutation test based on the minimum
p value computed from all individual tests. The null distribution of the minimum p val-
ues or the maximum of a test statistic is obtained from permutations. The minimum
p value method corresponds to the conservative rank test based on the p+ value (see
global rank envelope in Appendix A).
Journal of Statistical Software 7
• General multiple test procedures are also provided by the package sgof (Castro Conde
and de Una Alvarez 2024) for goodness-of-fit testing and by the package stats (R Core
Team 2024) for adjusting the p values for multiple comparisons by Bonferroni type
methods (the function p.adjust()).
Functional GLM: The global envelope tests can also be used for functional GLM using a
permutation strategy to generate samples under the null hypothesis. There are several other
methods and software to the functional GLM problem:
• The PALM software (Winkler, Ridgway, Webster, Smith, and Nichols 2014) allows
for the computation of various functional GLM designs using permutation tests. The
multiple testing problem is solved by an unscaled envelope constructed for the test
statistic (e.g., F statistic).
• The R packages fda.usc (Febrero-Bande and Oviedo de la Fuente 2012) and fdANOVA
(Gorecki and Smaga 2018) allow for the computation of functional ANOVA designs
by several methods together with the computation of factor’s significances. Similarly
the package fda (Ramsay et al. 2022) allows for computations in functional regression
designs. However, to the best of our knowledge, these do not provide the IGI of tests
of a factor significance.
2. Overview of global envelopes in GET
2.1. Global envelopes and intrinsic graphical interpretation (IGI)
Global envelopes are constructed for general multivariate statistics, so in the case when the
data are purely functional, they first have to be discretized. The discretization of the functions
can be arbitrary, as long as it is the same for each function. Therefore, let T1,T2, . . . ,Ts be
d-dimensional vectors, Ti = (Ti1, Ti2, . . . , Tid) for i = 1, . . . , s. In the case (i) (see Section 1.1),
the vectors Ti are observed functions assumed to follow the same distribution and the aim is
to find the least extreme 100(1 − α)% of the s vectors to construct the central region, with
α ∈ (0, 1). In the case (ii), the vector T1 is an observed function and T2, . . . ,Ts are vectors
simulated under the tested null hypothesis. The key idea of a global envelope test is the same
as a general Monte Carlo or permutation test: the rank of the data vector T1 among all the
vectors T1, . . . ,Ts determines the result and the p value of the test (Barnard 1963). And, in
the case (iii), Ti are generated under the given bootstrap or Bayesian scheme. As mentioned
already, in the core of all cases (i)–(iii) lies the central region and, for the purpose of ordering
the d-dimensional vectors Ti (or functions) from the most extreme to the least extreme and
finding the central region, many different measures exist. The GET package focuses on such
measures for which it is possible to construct the global envelope with a practically interesting
graphical interpretation, which we call IGI.
In general, an envelope is considered to be a band bounded by two vectors. A 100(1 − α)%
global envelope is a set (T(α)low,T
(α)
upp) of envelope vectors with T(α)low = (T
(α)
low 1, . . . , T
(α)
low d) and
T(α)upp = (T (α)upp 1, . . . , T
(α)
upp d) such that the probability that Ti falls outside this envelope in any
of the d points is equal to α, for α ∈ (0, 1), i.e.,
P (Tik /∈ [T (α)low k, T (α)upp k] for any k ∈ {1, . . . , d}) = α. (2)
8 GET: Global Envelopes in R
In all cases (i)–(iii), global means that the envelope is given with the prescribed coverage
100(1 − α)% simultaneously for all the elements of the multivariate or functional statistic,
but the probability depends on the situation (i)–(iii). In case (i), the probability is taken
under the distribution of the vectors Ti. In case (ii), the probability is taken under the null
hypothesis H0, and, in case (iii), the probability is taken under the distribution of the random
vector Ti generated under the given bootstrap or Bayesian scheme. It should be noted that in
a pointwise (or local) envelope the probability to fall out of the envelope is controlled instead
individually for every element of the vector Ti.
We define the IGI property for global envelopes as follows.
Definition 2.1 Assume that a general ordering ≺ of the vectors Ti, i = 1, . . . , s, is induced
by a univariate measure Mi. That is, Mi ≥ Mj iff Ti ≺ Tj, which means that Ti is less
extreme or as extreme as Tj. (The smaller the measure Mi, the more extreme the Ti is.)
The 100(1 − α)% global envelope [T(α)low k,T(α)upp k] has intrinsic graphical interpretation (IGI)
with respect to the ordering ≺ if
1. M(α) ∈ R is the largest of the Mi such that the number of those i for which Mi < M(α)
is less or equal to αs;
2. Tik < T (α)low k or Tik > T
(α)
upp k for some k = 1, . . . , d iff Mi < M(α) for every i = 1, . . . , s;
3. T (α)low k ≤ Tij ≤ T (α)upp k for all k = 1, . . . , d iff Mi ≥M(α) for every i = 1, . . . , s.
The points 2 and 3 are equivalent, but we specify them both in order to stress both graphical
properties of IGI. The IGI property means that the vector Ti is outside the 100(1 − α)%
global envelope in any of its components if and only if the vector is considered to be extreme
by the measure M at the level α, and the vector Ti is completely inside the global envelope
if and only if the vector is not extreme at the level α. Thus, the global envelope with IGI
provides a solution to the tasks (i)–(iii) in a graphical manner.
In particular in case (ii), the data vector T1 is compared with a global envelope in order to
decide if the data vector is extreme (H0 is rejected) or not extreme (H0 is not rejected), and to
find reasons for the possible rejection of H0 through inspecting for which k = 1, . . . , d the data
vector T1 is outside the global envelope.For this testing task, in addition to a global envelope,
a Monte Carlo p value is computed according to the measure Mi: p =
∑s
i=1 1(Mi ≤ M1)
/
s
(see, e.g., Myllymäki et al. 2017; Mrkvička et al. 2017; Mrkvička, Myllymäki, Jílek, and
Hahn 2020; Mrkvička, Myllymäki, Kuronen, and Narisetty 2022; Mrkvička, Roskovec, and
Rost 2021). In order to obtain an exact Monte Carlo test, i.e., a test that achieves the
prescribed FWER, the exchangeability of the test vectors Ti is required. All the examples in
Section 3 satisfy exchangeability, except the functional GLM where the permutation of the
residuals from the null model (Freedman and Lane 1983) is used. This permutation scheme
is commonly used in univariate permutation GLMs and widely accepted as the best available
solution (Anderson and Robinson 2001; Anderson and Ter Braak 2003; Winkler et al. 2014).
2.2. Types of global envelopes
The GET package implements seven global envelopes defined in earlier works as specified
in Table 1, together with their short descriptions and specifications in the GET functions.
Journal of Statistical Software 9
Type Introduced in Description
"rank" Myllymäki et al. (2017) Global rank envelope corresponding to the extreme rank
measure (with ties); unique ordering (or p value) pro-
vided additionally as specified in the argument ties,
e.g., "erl" for extreme rank lengths
"erl" Myllymäki et al.
(2017); Narisetty and
Nair (2016); Mrkvička
et al. (2020)
Global rank envelope corresponding to extreme rank
length (ERL) measure
"cont" Hahn (2015); Mrkvička
et al. (2022)
Global rank envelope corresponding to the continuous
rank measure
"area" Mrkvička et al. (2022) Global rank envelope corresponding to the area measure
"qdir" Myllymäki et al. (2017,
2015)
Directional quantile envelope test corresponding to
the directional quantile maximum absolute deviation
(MAD) measure
"st" Myllymäki et al. (2017,
2015)
Studentized envelope test corresponding to the studen-
tized MAD measure
"unscaled" Ripley (1981) Unscaled envelope test corresponding to the classical,
unscaled MAD measure. The envelope has a constant
width.
Table 1: Overview of different types of global envelopes in the GET package. The types
"erl", "cont" and "area" refine the type "rank" by breaking the ties in the extreme ranks,
for details see Appendix A.
Definitions of these envelopes are given in Appendix A. The first four envelopes in Table 1
("rank", "erl", "cont", "area") are completely non-parametric envelopes and are called
global rank envelopes, because they are all based on pointwise ranks of the vectors Ti. The
extreme rank length, continuous and area envelopes are refinements to the rank envelope.
Note here that if a global FWER control is provided within other packages, it is only provided
for not refined "rank" envelope or "unscaled" envelope. (see details in Appendix A). The
"st" and "qdir" envelopes parameterize the marginal distributions of Tik, i = 1, . . . , s by
one or two parameters, respectively, for all k = 1, . . . , d. Thus they can be regarded as
approximations of the first four envelopes.
In a typical application one needs to choose one of the measures with IGI (Table 1). In
general, the first five types of Table 1 instead of the last two, "st" and "unscaled", can be
recommended based on previous studies (Myllymäki et al. 2015, 2017). Regarding the choice
between the first five types, when one can afford a large number of simulations in cases (ii)–
(iii) of Section 1.1, one can well use type "erl" that is based only on the ranks, thus also
suiting particularly well for combined tests where the test statistic is a combination of several
functional test statistics (see Appendix B and examples in Sections 1.1 and 3.1). On the other
hand, any other choice is also fine, because the "rank", "erl", "cont" and "area" measures
lead to an equivalent outcome for a large number of simulations or permutations. However,
the definition of large depends on the situation.A simulation study presented in Myllymäki
and Mrkvička (2020) supplements this article, giving guidance on the required number of
simulations under different scenarios.
Another situation arises in case (i) with a low number of vectors or functions, or in cases (ii)–
10 GET: Global Envelopes in R
(iii) where the simulations or permutations are too time consuming to have a large number
of them. Then the choice of the measure plays a role. Based on our experience supported by
the simulation study presented in Myllymäki and Mrkvička (2020), the "erl" and "area"
measures are typically good choices for detecting integral type of extremeness where the
vector Ti is extreme in the set of vectors for a large range of its components. On the
other hand, the "cont" and "qdir" measures are most sensitive to the maximum type of
extremeness, i.e., the case where Ti is extreme only for a few of its components, but also
the "area" measure performs well. Thus, if no particular type of extremeness is expected a
priori, the "area" measure is often a good compromise, since it is sensitive to the amount
of outlyingness (similarly as "erl") and to the value of outlyingness (similarly as "cont"
and "qdir"). Illustration of the different measures can be found in Mrkvička et al. (2022,
Section 2.4).
To conclude, for testing, if just possible, we recommend to use a large number of simulations
and one of the first five measures of Table 1. If large number of simulations is not possible,
then still Barnard’s Monte Carlo test is valid. For testing and ordering functional data, use
then the measure which is sensitive to the type of extremeness that you regard as important.
If the type of extremeness is unknown, then the "area" measure can be preferred.
As it was mentioned earlier, other measures or depths without IGI can be used for the order-
ing the data. We believe that there exists no specific depth/measure which would perform
universally the best and that for every depth/measure it is possible to construct a situation
for which the chosen depth/measure will be the most powerful. However, for an application at
hand, it is typically not possible to know the most powerful depth/measure; one has to make
the choice of the depth/measure in advance. Therefore, we believe that it is often practical
to choose the IGI measure, which brings an easier interpretation. In a particular study of the
power of goodness of fit tests in spatial statistics our IGI measures were much more powerful
than the modified band depth and modified half region depth (Myllymäki et al. 2017), but
no universal conclusions can be drawn from a single study.
2.3. Implementation and key functions in GET
As described above, the construction of a global envelope is based on a measure M . The
calculation of different measures in the GET package is provided by the function forder()
(functional ordering). Most often, the user however calls either central_region() for con-
structing central regions with IGI or global_envelope_test() for performing global envelope
tests (equipped with p values as well). Both functions utilize forder() for the calculation of
the measures M . The most important arguments of these functions are
central_region(curve_sets, type = "erl", coverage = 0.50, ...)
global_envelope_test(curve_sets, type = "erl", alpha = 0.05, ...)
where the multivariate or functional data are provided in curve_sets, type specifies type
of the global envelope (see Table 1 and descriptions in Appendix A), and the coverage or
level of the global envelope is specified by coverage or alpha (= 1−coverage), respectively.
Additionally, one can, for example, specify the one or two-sided alternative, i.e., whether
only small or large values of Ti or both should be considered extreme. These two functions
are the core functions for global envelopes in the package GET: given an appropriate set of
Journal of Statistical Software 11
curves, or, in fact vectors, they can be used for producing global envelopes of Table 1 in all
tasks (i)–(iii) listed in Section 1.1.
Recently, the argument typeone was added to global_envelope_test(). This argument can
be used to specify the control for the global test, FWER or FDR, where the former (default)
leads to the tests of Table 1, as further specified by the argument type, and the latter to the
FDR envelopes proposed in Mrkvička and Myllymäki (2023).
Different objects are supported for the data in curve_sets (see help files of the functions and
examples below), but the basic form provided by the GET package is a ‘curve_set’ object
that can be constructed by the function curve_set() simply providing the observed and/or
simulated curves, and optionally the (one- or two-dimensional) argument values where the
curves have been observed (see Section 3.2 for an example). The function curve_set() takes
care of checking the appropriateness of the data, and saving the data in the form that contains
the relevant information of the curves for global envelope methods, in particular for plotting
the results with graphical interpretation (see examples in Section 3).
In addition to constructing global envelopes from a set of curves, the central_region() and
global_envelope_test() functions provide combined central regions or combined global
envelope tests if the user provides a list consisting of (appropriate) sets of curves in the
argument curve_sets. Details of the combining methodology is given in Appendix B and
examples are given in Sections 1.1 and 3. The GET package also provides functions for specific
tasks (see Table 2 and the examples in Section 3). These functions utilize central_region()
and global_envelope_test() for the global envelope construction. In addition, many of
these functions take care of preparing the simulations or permutations for the specific testing
task. The print() and plot() methods are available for the objects obtained as the output
of the global envelope methods of GET. The plots present the results with IGI. They are
produced using the ggplot2 package (Wickham 2016).
3. Using GET
3.1. Sets of functions, ordering, combining, and central regions
The R package fda contains Berkeley Growth Study data (Ramsay and Silverman 2005) of
the heights of 39 boys and 54 girls from ages 1 to 18 and the ages at which the data were
collected. To illustrate different features of GET, we investigated whether there are any
outliers in the girls regarding both their annual heights and changes within years. First two
‘curve_set’ objects were created containing the raw heights and the differences within the
years (see Figure 3):
R> library("fda")
R> years <- paste(1:18)
R> curves <- growth[["hgtf"]][years,]
R> cset1 <- curve_set(r = as.numeric(years), obs = curves)
R> cset2 <- curve_set(r = as.numeric(years[-1]),
+ obs = curves[-1,] - curves[-nrow(curves),])
Ordering the functions from the most extreme to the least extreme by the "area" measure
(see Table 1 and Appendix A), the 8th girl was observed to have the most extreme heights
12 GET: Global Envelopes in R
Function name Description
curve_set() Create a ‘curve_set’ out of data given in the right form
crop_curves() Crop curves
forder() Different measures for ordering the multivariate statistics
from the most extreme to least extreme one
central_region() Central regions or global envelopes or confidence bands with
IGI (see types in Table 1)
global_envelope_test() Global envelope tests
GET.composite() Adjusted global envelope tests for composite null hypotheses
fBoxplot() Functional boxplot based on a central region with IGI
graph.fanova() One-way ANOVA tests for functional data with graphical
interpretation (Mrkvička et al. 2020)
frank.fanova() One-way functional ANOVA tests based on the global en-
velopes applied to F values (Mrkvička et al. 2020)
graph.flm() Non-parametric graphical tests of significance in functional
general linear model (GLM) (Mrkvička et al. 2021)
frank.flm() Global envelope tests applied to F values in permutation
inference for the GLM (Mrkvička et al. 2022)
fclustering() Functional clustering based on a specified measure (Dai,
Athanasiadis, and Mrkvička 2022)
GET.distrequal() Graphical n sample test of correspondence of distribution
functions
GET.distrindep() Permutation-based tests of independence to samples from
any bivariate distribution (Dvořák and Mrkvička 2022)
GET.spatialF() Testing global and local covariate effects in point process
models (Myllymäki, Kuronen, and Mrkvička 2020)
Table 2: Key and special purpose functions in the GET package. More complete list
of functions can be found from the main help page of the package available by typing
help("GET-package") in R.
and the 15th girl the most extreme changes (below the first ten most extreme girl indices are
printed):
R> A1 <- forder(cset1, measure = "area")
R> order(A1)[1:10]
[1] 8 13 29 48 42 25 7 38 18 40
R> A2 <- forder(cset2, measure = "area")
R> order(A2)[1:10]
[1] 15 7 3 8 25 52 19 16 24 5
Generally, ordering with respect to heights or height differences leads to two different orderings
of the girls. Combined ordering can be done by combining these two by the ERL measure as
described in Appendix B (two-step combining procedure). In R, the two sets of curves need
to be provided in a list to the function forder():
Journal of Statistical Software 13
90
120
150
180
5 10 15
Age (years)
H
ei
gh
t
0
5
10
15
20
5 10 15
Age (years)
Ch
an
ge
girl08
girl15
girl07
Other
Figure 3: The heights (left) and height differences (right) of the 54 girls of the growth data of
the R package fda at ages from 1 to 18. Three girls having the most extreme curves (combined
ordering by the area measure) are highlighted with the colors specified in the legend.
R> csets <- list(Height = cset1, Change = cset2)
R> A <- forder(csets, measure = "area")
R> order(A)[1:10]
[1] 8 15 7 13 3 29 48 25 42 52
Figure 3 highlights the curves of the three most extreme girls. The plots of the two sets of
curves were produced using the GET and ggplot2 packages and combined by the patchwork
package (Pedersen 2024):
R> library("ggplot2")
R> library("patchwork")
R> cols <- c("#21908CFF", "#440154FF", "#5DC863FF")
R> p1 <- plot(cset1, idx = order(A)[1:3], col_idx = cols) +
+ labs(x = "Age (years)", y = "Height")
R> p2 <- plot(cset2, idx = order(A)[1:3], col_idx = cols) +
+ labs(x = "Age (years)", y = "Change")
R> p1 + p2 + plot_layout(guides = "collect")
The labels were above redefined for the default plots by the function labs() of the ggplot2
package. In general, parts of the default plots of GET can be edited in this manner using
ggplot2 functions such as labs() and ggtitle().
The combined central region can be constructed directly by passing the sets of curves to
the function central_region(), in a similar manner as for forder() above. By using the
functional boxplot (Sun and Genton 2011) with the same measure and central region, an
investigation can be made into whether the most extreme girls are outliers with respect to
height or its change. Figure 4 shows the 50% central region and the functional boxplot with
the inflation factor 1.5 jointly for the heights and their changes obtained by:
14 GET: Global Envelopes in R
Height Change
5 10 15 5 10 15
0
10
20
30
50
100
150
200
Age (years)
Va
lu
e Outliers
girl15
Combined functional boxplot based on 50% central region (area)
Figure 4: The functional boxplot (entire gray band) using the 50% central region (inner dark
gray band) and the expansion factor 1.5 jointly for the heights and changes of heights of the
54 girls (see Figure 3). The solid line is an observed data function (vector) that goes outside
the functional boxplot (the 15th girl in the data).
R> res <- fBoxplot(csets, type = "area", factor = 1.5)
R> plot(res) + labs(x = "Age (years)", y = "Value")
One can see that one of the girls (the 15th girl) is an outlier, because she has grown extraor-
dinarily much in her sixth year. However, the highest height curve of Figure 3 (left) is not
regarded as an outlier with the given specifications.
Note that the combined central region computed using any measure of Table 1 has IGI. On
the contrary, central regions computed with the use of band depths implemented in the fda
package do not satisfy IGI. Narisetty and Nair (2016) proposed central regions and functional
boxplots based on the ERL measure (see Table 1) and compared them to those based on
band depths, claiming more reasonable behavior. Note also that IGI is not a property of the
functional boxplot.
3.2. Monte Carlo goodness-of-fit testing for simple hypotheses
Figure 5 shows the locations of 67 large trees (with height > 25 m) in an area of size 75 m×
75 m from an uneven aged multi-species broadleaf nonmanaged forest in Kaluzhskie Zaseki,
Russia (Grabarnik and Chiu 2002; Van Lieshout 2010). The x- and y-coordinates of the
locations are available in the data adult_trees in the GET package.
The test of complete spatial randomness (CSR) is a typical first step in analyzing a spatial
point pattern such as the tree pattern of Figure 5. CSR along with other hypotheses for
spatial point patterns are commonly tested using an estimator of a summary function that is
a function of distance r, e.g., Ripley’s K function or its transformation L(r) =
√
K(r)/π − r
for r ≥ 0 (Ripley 1977; Besag 1977). In this context, one typically resorts to the Monte Carlo
simulation (see, e.g., Illian, Penttinen, Stoyan, and Stoyan 2008; Diggle 2013; Myllymäki et al.
2017). First, this example is used to show the general steps to prepare a global envelope test
for testing a simple hypothesis. Second, it is shown how the same example of testing a simple
hypothesis for a spatial point pattern can be performed by utilizing the R package spatstat.
Journal of Statistical Software 15
0
20
40
60
0 20 40 60
x
y
Figure 5: Locations of 67 trees with height > 25 m observed in an area of 75 m× 75 m.
The testing of a simple hypothesis does not require the estimation of any model parameters,
and the one-stage test illustrated below can be used. In the case of a composite null hypothesis,
the level of the test needs to be adjusted, see Section 3.3 and Appendix C.
General setup for testing simple hypotheses
The first step of a Monte Carlo test is to generate nsim simulations under the null hypothesis
and to calculate the chosen test function (vector) for the data and simulations. Here the
functions runifpoint() and Lest() of spatstat are used to generate a simulation from the
binomial process (CSR with the number of points fixed to the observed number of points in
the pattern X) and to estimate the centred L-function for a pattern, respectively:
R> library("spatstat.explore")
R> data("adult_trees", package = "GET)
R> X <- as.ppp(adult_trees, W = square(75))
R> nsim <- 999
R> obs.L <- Lest(X, correction = "translate")
R> r <- obs.L[["r"]]
R> obs <- obs.L[["trans"]] - r
R> sim <- matrix(nrow = length(r), ncol = nsim)
R> for(i in 1:nsim) {
+ sim.X <- runifpoint(ex = X)
+ sim[, i] <- Lest(sim.X, correction = "translate", r = r)[["trans"]] - r
+ }
Thereafter, the function curve_set() can be used to construct a ‘curve_set’ object from
16 GET: Global Envelopes in R
−2
−1
0
1
2
0 5 10 15
r
L^ (r
)−
r
Central function Data function
Global envelope test: p = 0.299
Figure 6: The global envelope test for the CSR of the tree pattern of Figure 5 using the
centred L-function. The gray band represents the 95% global envelope ("erl").
the argument values where the test vectors were evaluated (r), the observed vector (obs) and
the simulated vectors (sim):
R> cset <- curve_set(r = r, obs = obs, sim = sim)
In some cases, missing or infinite values can occur for the computed statistic at some chosen
r (e.g., for too large r with another spatial summary function J). These problematic, unin-
teresting r-values can be easily omitted from cset using the function crop_curves(). The
final step is to make the global envelope test on the given set of vectors:
R> res <- global_envelope_test(cset, type = "erl")
R> plot(res) + ylab(expression(italic(hat(L)(r)-r)))
In this manner, the global envelope test can be constructed for any simple hypothesis and
any test vector, as long as one can generate the required simulations and calculate the test
vectors.
The test output is shown in Figure 6 (left), which shows no evidence against CSR (see more
detailed description in Myllymäki et al. 2017, Section S4).
Testing simple hypotheses for a point pattern utilizing the R package spatstat
For point process testing, the GET package and global_envelope_test() support the use
of the R package spatstat (Baddeley et al. 2015) for the simulations and calculations of the
summary functions by the function envelope(). Namely, the object returned by envelope()
can simply be given to the function global_envelope_test() in the argument curve_sets.
Importantly, the functions must be saved setting savefuns = TRUE in the envelope() call:
Journal of Statistical Software 17
R> env <- envelope(X, nsim = 999, fun = "Lest", correction = "translate",
+ transform = expression(.-r), simulate = expression(runifpoint(ex = X)),
+ savefuns = TRUE, verbose = FALSE)
R> res <- global_envelope_test(env, type = "erl")
Above the arguments fun, correction and transform define the summary function to be
calculated (the latter two parameters are passed to the function Lest()) and simulate spec-
ifies how the patterns are simulated under the null hypothesis (here CSR). The result can be
plotted similarly as above.
Further examples of use of the GET package for point pattern analysis are given in an accom-
panying vignette available in R by typing library("GET") and vignette("pointpatterns").
3.3. Monte Carlo goodness-of-fit testing for composite hypotheses
Aldor-Noiman et al. (2013) provided a graphical test for normality for simple hypotheses
(i.e., known parameters of sample distribution) based on a qq-plot envelope, whose shape
was derived from theoretical properties of quantiles of the uniform distribution. They also
provided a version of this algorithm for composite hypotheses (i.e., unknown parameters
of sample distribution). However, according to our unpublished study, this test does not
achieve the required significance level. Therefore, the example of the exact adjustment for
the composite hypothesis is provided here, based on the two-stage procedure of Baddeley
et al. (2017) (see Appendix C). For simplicity, in this example, this adjustment is applied
directly to the empirical distribution functions. Apparently, the adjustment could also be
applied to the qq-plot envelopes of Aldor-Noiman et al. (2013).
The normality test is illustrated for nitrogen oxides (NOx) emission levels available in the data
poblenou from the R package fda.usc (Febrero-Bande and Oviedo de la Fuente 2012). The
data contains NOx emission levels (µg/m3) measured every hour by a control station close
to an industrial area in Poblenou in Barcelona (Spain) for 115 days from 23 February to 26
June, 2005. NOx is a pollutant which is caused by combustion processes in sources that burn
fuels, e.g., motor vehicles, electric utilities, and industries (Febrero, Galeano, and González-
Manteiga 2008). In Section 3.5, the whole functional trajectories of 24 h observations are
studied, but for illustration purposes, here the attention is restricted to the NOx levels at
10 am.
A general solution to make the adjusted test is to prepare all the required simulations and
provide them to the function GET.composite() in arguments X and X.ls. Let dat be a vector
containing the data values and n is the number of observations specified as follows:
R> library("fda.usc")
R> data("poblenou", package = "fda.usc")
R> dat <- poblenou[["nox"]][["data"]][,"H10"]
R> n <- length(dat)
Then, first, the parameters of the normal distribution are estimated (step 1 of the algorithm
of Appendix C)
R> mu <- mean(dat)
R> sigma <- sd(dat)
18 GET: Global Envelopes in R
0.00
0.25
0.50
0.75
1.00
0 100 200 300
NOx
Ec
df
Adjusted global test: p = 0.005
0.00
0.25
0.50
0.75
1.00
2 3 4 5
log(NOx)
Ec
df
Adjusted global test: p = 0.07
Central function Data function
Figure 7: Graphical normality test for the NOx (left) and logarithm of the NOx (right) levels
at 10 am. The gray band represents the 95% global envelope ("erl"). Red dots are attached
to the data function outside the envelope.
and, using the function ecdf() of the R package stats (R Core Team 2024), the empirical
cumulative distribution functions are calculated for the data and for nsimsub replicates of n
simulations from the fitted normal distribution (step 2):
R> nsim <- nsimsub <- 199
R> r <- seq(min(dat), max(dat), length = 100)
R> obs <- stats::ecdf(dat)(r)
R> sim <- replicate(nsimsub, {
+ x <- rnorm(n, mean = mu, sd = sigma)
+ stats::ecdf(x)(r)
+ })
R> cset <- curve_set(r = r, obs = obs, sim = sim)
Here the last command creates a ‘curve_set’ object of the observed and simulated empirical
cumulative distribution functions. Thereafter, another nsim replicates of the n simulations
from the fitted model are simulated, and the same calculations as above for the data are done
for each of these simulations (steps 3–4 of the algorithm of Appendix C):
R> cset.ls <- list()
R> for(rep in 1:nsim) {
+ x <- rnorm(n, mean = mu, sd = sigma)
+ mu2 <- mean(x)
+ sigma2 <- sd(x)
+ obs2 <- stats::ecdf(x)(r)
+ sim2 <- replicate(nsimsub, {
+ x2 <- rnorm(n, mean = mu2, sd = sigma2)
+ stats::ecdf(x2)(r)
+ })
+ cset.ls[[rep]] <- curve_set(r = r, obs = obs2, sim = sim2)
+ }
Journal of Statistical Software 19
Thus, the list cset.ls contains all the simulations from the second stage of the algorithm.
As a final step, GET.composite() can be used to prepare the adjusted test:
R> res <- GET.composite(X = cset, X.ls = cset.ls, type = "erl")
R> plot(res) + labs(x = "NOx", y = "Ecdf")
Figure 7 (left) shows the test result for the NOx levels at 10 am. One can see that the
normality does not hold according to the test: the estimated distribution function is skewed
to the right with respect to the normal envelope. Therefore, we further applied the same
normality test to the logarithm of the NOx values as well, and then the normality hypothesis
was not rejected (Figure 7, right).
3.4. Graphical n-sample test of correspondence of distribution functions
The graphical n-sample test of correspondence of distribution functions serves as a simple
example of permutation tests. Figure 8 shows the empirical cumulative distribution functions
(ECDFs) obtained by the function ecdf() of the R package stats (R Core Team 2024) for
the heights of the 54 girls and 39 boys of the growth data (see above, and Ramsay and
Silverman 2005) at ages 10 (left) and 14 (right). A global envelope test can be performed to
investigate whether the two (or more generally n) distribution functions differ from each other
significantly and how they differ. This test is a generalization of the two-sample Kolmogorov-
Smirnov test with a graphical interpretation. The generalization is provided in two ways, in
the number of samples and in the variable width of the envelopes. Namely, the Kolmogorov-
Smirnov test provides an envelope of constant width which suffers in determining differences
in extreme quantiles. Here it is assumed that the heights in the sample i are an i.i.d. sample
from the distribution Fi(r), i = 1, . . . , n, and the hypothesis F1(r) = . . . = Fn(r) is to be
tested. The simulations under the null hypothesis that the distributions are the same can be
obtained by permuting the individuals of the groups. The GET package provides the wrapper
Age 10 Age 14
140 160 180 140 160 180
0.00
0.25
0.50
0.75
1.00
x (Height in cm)
e
cd
f
Gender
Boys
Girls
Figure 8: The empirical cumulative distribution functions of the heights of the 54 girls and
39 boys of the growth data of the R package fda at ages 10 (left) and 14 (right).
20 GET: Global Envelopes in R
Girls, 10 yr. Boys, 10 yr.
130 140 150 160 130 140 150 160
0.00
0.25
0.50
0.75
1.00
x (Height in cm)
F^ (
x
)
Combined global test: p = 0.312
Girls, 14 yr. Boys, 14 yr.
150 160 170 180 150 160 170 180
0.00
0.25
0.50
0.75
1.00
x (Height in cm)
F^ (
x
)
Combined global test: p = 0.001
Central function Data function
Figure 9: Global envelope tests for comparison of the empirical cumulative distribution func-
tions of the heights of the girls and boys (see Figure 8). Red color indicates the heights where
the observed distribution functions go outside the 95% global envelope ("erl"; gray bands).
Left: Age 10; Right: Age 14.
function GET.distrequal() that can be used to compare n distribution functions graphically,
n = 2, 3, . . .. The (default) test vector is
T = (Fˆ1(r), . . . , Fˆn(r)),
where Fˆi(r) = (Fˆi(r1), . . . , Fˆi(rd)) is the ECDF of the i-th sample evaluated at argument
values r = (r1, . . . , rd). To test the equality of distributions, one simply needs to provide the
samples as a list (code for age 10 shown here) for GET.distrequal() and plot the object
returned by GET.distrequal() (Figure 9, left):
R> fm10.l <- list("Girls, 10 yr." = growth$hgtf["10",],
+ "Boys, 10 yr." = growth$hgtm["10",])
R> res10 <- GET.distrequal(fm10.l, nsim = 1999)
R> myxlab <- substitute(paste(italic(i), " (", j, ")", sep = ""),
+ list(i = "x", j = "Height in cm"))
R> plot(res10) + xlab(myxlab)
The height distributions at age 10 do not differ from each other significantly, but at age 14
the boys are taller, particularly with a difference that the proportion of girls reaching a height
of around 175 cm is much lower (Figure 9, right).
3.5. Graphical functional one-way ANOVA
The use of the function graph.fanova() of the GET package for the graphical functional
one-way ANOVA is illustrated using the data set poblenou of the R package fda.usc (Febrero-
Bande and Oviedo de la Fuente 2012, see also Section 3.3 above). The trajectories of the 24
h observations of the NOx levels for Monday–Thursday (MonThu), Friday (Fri) and non-
working days (Free) including weekend and festive days (Figure 10) are compared. For the
purposes of this example, a factor vector Type was prepared containing the type of the day
for each of the 115 days having levels "MonThu", "Fri" and "Free":
Journal of Statistical Software 21
MonThu Fri Free
0 5 10 15 20 250 5 10 15 20 250 5 10 15 20 25
0
100
200
300
400
Hour
N
O
x
Max
100
200
300
Figure 10: The NOx levels for Monday–Thursday (Mon–Thu), Friday (Fri) and non-working
days (Free) including weekend and festive days in Poblenou for 115 days from 23 February to
26 June, 2005. The color of the daily curves is according to the maximum NOx level (µg/m3)
of the day.
R> library("fda.usc")
R> data("poblenou", package = "fda.usc")
R> fest <- poblenou$df$day.festive; week <- as.integer(poblenou$df$day.week)
R> Type <- vector(length = length(fest))
R> Type[fest == 1 | week >= 6] <- "Free"
R> Type[fest == 0 & week %in% 1:4] <- "MonThu"
R> Type[fest == 0 & week == 5] <- "Fri"
R> Type <- factor(Type, levels = c("MonThu", "Fri", "Free"))
Assuming that the NOx levels Tij(r) at times r ∈ R = [0, 24] are i.i.d. samples from stochastic
processes SP (µj , γj) with mean functions µj(r), r ∈ R, and covariance functions γj(s, t),
s, t ∈ R, for j = 1, . . . , J (here J = 3), the groups of NOx levels can be compared by means
of the graphical functional ANOVA (Mrkvička et al. 2020). The hypothesis
H0 : µ1(r) = . . . = µJ(r), r ∈ R,
can be tested by the test statistic
T = (T 1, T 2, . . . , T J), (3)
where T j = (T j(r1), . . . , T j(rd)) is the mean of functions in the j-th group at the discrete
number of arguments r1, . . . , rd (here each hour of the day). The hypothesis can be equiva-
lently expressed as
H ′0 : µj′(r)− µj(r) = 0, r ∈ R, j′ = 1, . . . , J − 1, j = j′, . . . , J
and an alternative test statistic is
T′ = (T 1 − T 2, T 1 − T 3, . . . , T J−1 − T J), (4)
22 GET: Global Envelopes in R
MonThu Fri Free
0 5 10 15 20 0 5 10 15 20 0 5 10 15 20
25
50
75
r (Hour of day)
Z
i(r)
Central function Data function
Graphical functional ANOVA: p = 0.001
Figure 11: The test for the equality of variances of the NOx levels observed each hour (r)
of the day. The 95% global envelope ("erl"; gray band) representing the null hypothesis of
equal variances and the observed test statistic, the mean of functions Zij(r) = |Tij(r)−T j(r)|
in each group (solid line with red dots when outside the envelope).
where we used the notation T j − T j′ = (T j(r1) − T j′(r1), . . . , T j(rd) − T j′(rd)). The lat-
ter test statistic (Equation 4) can be obtained by setting contrasts = TRUE in the call of
graph.fanova().
Because the ANOVA design assumes equal covariance functions, we first tested for the ho-
moscedasticity using the test proposed by Mrkvička et al. (2020, Section 2.3) and implemented
in GET. We created the ‘curve_set’ object
R> cset <- curve_set(obs = t(poblenou[["nox"]][["data"]]), r = 0:23)
and we used the function graph.fanova() with the option test.equality = "var" to test
the equality of variances (Figure 11):
R> res.v <- graph.fanova(nsim = 2999, curve_set = cset, groups = Type,
+ test.equality = "var")
R> myxlab <- substitute(paste(italic(i), " (", j, ")", sep = ""),
+ list(i = "r", j = "Hour of day"))
R> plot(res.v) + xlab(myxlab)
Here the test suggests heteroscedasticity and also Febrero et al. (2008) assumed heteroscedas-
ticity of working and non-working days. Therefore, we applied correction for unequal vari-
ances to the three groups by rescaling the functions Tij(r) of J groups containing n1, . . . , nJ
functions observed on the finite interval R = [0, 24] by the transformation
Yij(r) =
Tij(r)− T j(r)√
Var(Tj(r))
·
√
Var(T (r)) + T j(r), j = 1, . . . , J, i = 1, . . . , nj , (5)
where the group sample mean T j(r) and overall sample variance Var(T (r)) are involved to
keep the mean and variability of the functions at the original scale. The group sample variance
Journal of Statistical Software 23
MonThu−Fri MonThu−Free Fri−Free
0 5 10 15 20 0 5 10 15 20 0 5 10 15 20
−1.0
−0.5
0.0
0.5
1.0
r (Hour of day)
T
i(r)
−
T
j(r)
Central function Data function
Graphical functional ANOVA: p < 0.001
Figure 12: The output of the graphical functional ANOVA to test the difference between the
type of the day on the log NOx levels observed each hour (r) of the day. The 95% global
envelope ("erl"; gray band) accompanied with the observed differences between the group
means (solid line with red dots when outside the envelope).
Var(Tj(r)) corrects the unequal variances. This scaling is applied to the set of curves given
to the function graph.fanova() if the user specifies variances = "unequal" (the default is
no correction, variances = "equal"). When using the rescaled functions, the test vectors
(Equations 3 and 4) are asymptotically exchangeable under the null hypothesis of equal
means only under the assumption of normality of stochastic processes SP (µj , γj) (Mrkvička
et al. 2020). Therefore, the log transformation was applied to the NOx values prior to the
transformation (Equation 5):
R> lcset <- curve_set(obs = t(log(poblenou[["nox"]][["data"]])), r = 0:23)
To sample from the null hypotheses, the simple permutation of raw functions among the
groups is performed. The permutations and the global envelope test can be done by the
graph.fanova() function (Figure 12):
R> res.c <- graph.fanova(nsim = 2999, curve_set = lcset, groups = Type,
+ variances = "unequal", contrasts = TRUE)
R> plot(res.c) + xlab(myxlab)
Thus, the test rejects the null hypothesis H ′0 that the differences between the groups would be
zero and shows that on Monday–Thursday and Friday the (log) NOx levels are significantly
higher than on free days basically during the whole day with peaks around 8 am and 4 pm.
The difference between Monday–Thursday and Friday was not significant.
The graphical functional ANOVA allows one to detect either a) which groups deviate from
the mean (default) or b) which specific groups are different (option contrasts = TRUE). The
example above was for the latter. Note that this test directly has the nature of a post hoc
test. Furthermore, both versions of the test allow one to identify which r values lead to the
potential rejection of the null hypothesis.
24 GET: Global Envelopes in R
1 27
50 60 70 50 60 70
20
25
30
35
40
x
y
0.5
0.6
0.7
0.8
Figure 13: The example brain image data in the right Crus Cerebellum 1 region at a slice
for one subject from the autism group (subject number 1) and one from the control group
(subject number 27).
When a graphical interpretation for group specific differences is not of interest but the area
of rejection is, instead of graph.fanova() it is possible to apply the one-way functional
ANOVA based on the r-wise F statistics, r ∈ R. This test is implemented in the function
frank.fanova(). For the log NOx data, the test result was that there are differences between
the groups for the hours from 5 am to 6 pm (figure omitted).
3.6. Functional GLM and two-dimensional plots
Similar type of methods as in the functional one-way ANOVA above can be used in a more
general setup of functional general linear models (GLMs). The global envelopes for functional
GLMs are illustrated here by an example of a small subset of the autism brain imaging data
collected by resting state functional magnetic resonance imaging (R-fMRI) (Di Martino et al.
2014). The preprocessed fMRI data contains measurements from 514 individuals with the
autism spectrum disorder (ASD) and 557 typical controls (TC), where subjects with low
quality on imaging data or having a large proportion of the missing values were removed.
The imaging measurement for local brain activity at resting state was fractional amplitude of
low frequency fluctuations (Zou et al. 2008). The data considered here and available as the
data object abide_9002_23 in the GET package contains data from one of the 116 different
anatomical regions in the brain partitioning being based on the anotomical automatic labeling
system of Tzourio-Mazoyer et al. (2002). The studied region is the right Crus Cerebellum 1
region of the brain at one slice (23) accompanied with three subject-specific factors, i.e., group
(autism and control), sex and age. Figure 13 obtained by
R> data("abide_9002_23", package = "GET")
R> plot(abide_9002_23[["curve_set"]], idx = c(1, 27))
shows the data for two subjects, illustrating the small region used as the example. In the
examples below, the effect of the group on the images is studied in the presence of nuisance
factors sex and age.
Journal of Statistical Software 25
Graphical functional GLM
The functional GLM is the general linear model
Y(r) = X(r)β(r) + Z(r)γ(r) + ϵ(r) (6)
where the argument r ∈ {1, . . . , d} determines the component of the vector or the spatial
point or pixel of an image. For every argument r, a one-dimensional GLM is considered with
X(r) being a n × k matrix of regressors of interest (here group), Z(r) being a n × l matrix
of nuisance regressors (here sex and age), Y(r) being a n × 1 vector of observed data, and
ϵ(r) being a n× 1 vector of random errors with a mean of zero and a finite variance σ2(r) for
every r ∈ I. Further, β(r) and γ(r) are the regression coefficient vectors of dimensions k × 1
and l × 1, respectively, and the null hypothesis to be tested is
H0 : βi(r) = 0, ∀ r = 1, . . . , d, ∀ i = 1, . . . , k,
where βi(r) are the elements of the β(r). For a continuous factor of interest k = 1 and β(r)
serves as the test statistic. For a discrete factor of interest, in the default setup, k is equal to
the number of groups of the discrete factor, adding the additional condition that∑i βi(r) = 0
for all r ∈ {1, . . . d}. Similarly, for interaction of a continuous and a discrete factor, k is also
equal to the number of groups of the categorical factor, adding the same additional condition.
For the interaction of two discrete factors, k is equal to the product of the numbers of groups
of the discrete factors, adding the same additional condition. If the argument contrasts of
the function graph.flm() is set to FALSE, the test statistic for a discrete factor consist of
βi(r) for all r = 1, . . . , d and i = 1, . . . , k. This test allows to detect which groups deviate
from the zero (mean). An alternative test statistic is obtained by setting contrasts = TRUE:
then the test statistic is formed by all the pairwise differences between the group effects,
βi(r)− βj(r) for all r and i ̸= j, 1 ≤ i < j ≤ k. This test specifies which specific groups are
different in the post hoc nature. Note that this also holds for interaction terms. Furthermore,
all the options allow one to identify which of the components of the vector, r ∈ {1, . . . , d},
lead to the potential rejection of the null hypothesis. Permutations under the null hypothesis
are obtained using the Freedman-Lane procedure (Freedman and Lane 1983; Mrkvička et al.
2022, 2021).
Often factors are given for the whole function, i.e., they do not depend on argument r, and
so the matrices X(r) and Z(r) are identical for every r. These kind of constant factors (such
as sex and age in the considered example) can be provided in the argument factors of the
graph.flm() function. However, this simplification is not necessary and factors varying in
space can be provided in the argument curve_sets, along with the data curves in a named
list.
The functional GLM is performed by the function graph.flm():
R> res <- graph.flm(nsim = 999, formula.full = Y ~ Group + Sex + Age,
+ formula.reduced = Y ~ Sex + Age,
+ curve_sets = list(Y = abide_9002_23[["curve_set"]]),
+ factors = abide_9002_23[["factors"]], contrasts = TRUE,
+ GET.args = list(type = "area"))
Here the arguments formula.full and formula.reduced specify the full GLM and the GLM
where the interesting factor has been dropped out, and the number of simulations is given in
26 GET: Global Envelopes in R
50 60 70
20
25
30
35
40
−0.015
−0.010
−0.005
0.000
Direction
Below
Deviation
0.025
0.050
0.075
0.100
0.125
Graphical functional GLM: p = 0.024
Figure 14: Graphical functional GLM for testing the effect of the group (autism, control) in
the brain image example: the observed difference (autism-control) and the locations (blue
circles) where the observed coefficient goes below the 95% global envelope (area). The size
of the circles are proportional to the size of the deviation of the observed function from the
lower bound of the envelope divided by the width of the envelope.
nsim. Further arguments to global_envelope_test() can be passed in GET.args, e.g., the
type of the global envelope.
The r component of the abide_9002_23[["curve_set"]] object is a data frame with columns
x, y, width and height, where the width and height give the width and height of the pixels
placed at x and y. When such two-dimensional argument values are provided in a curve_set
object, the resulting default envelope plots produced by
R> plot(res) + scale_radius(range = 1.5 * c(1, 6))
are two-dimensional as well (Figure 14). Above scale_radius() function of ggplot2 is used
to enlarge the circles which show the size of deviation of the observed function from the bound
(here lower) of the envelope divided by the width of the envelope. Here only two groups were
compared, and the plot shows that the brain measurements were lower in the autism group
than in the control group in a part of the small example region (blue circles in Figure 14).
When the basic assumption of the homoscedasticity in the linear model (6) for every argument
r is violated, it is important to handle it. One possibility is to apply transformations to the
functions a priori as suggested by Mrkvička et al. (2020) and Mrkvička et al. (2021), see
Equation 5. Alternatively weighted least squares might be used for estimation of regression
coefficients.
Journal of Statistical Software 27
Observed Upper envelope Sign.: above
50 60 70 50 60 70 50 60 70
20
25
30
35
40
3
6
9
Global envelope test: p = 0.015
Alternative = "greater"
Figure 15: F rank functional GLM for testing the effect of the group (autism, control) in the
brain image example: the observed F statistic, the upper bound of the one-sided 95% global
envelope (area), and the significant region (red contour lines) where the observed F statistic
exceeds the envelope.
Functional GLM based on F statistics
In the F rank functional GLM, the same linear model (Equation 6) is fitted at each r ∈
{1, . . . , d} and permutations under the null hypothesis are obtained similarly by the Freedman-
Lane procedure as in the graphical functional GLM (Freedman and Lane 1983; Mrkvička et al.
2022). However, the test statistic is the classical F statistic (see, e.g., Winkler et al. 2014)
which is calculated for the hypothesis that the data follows the simpler reduced model of the
two proposed linear models that are nested within each other (given in formula.full and
formula.reduced). The use of the function frank.flm() is similar to that of graph.flm():
R> res.F <- frank.flm(nsim = 999, formula.full = Y ~ Group + Age + Sex,
+ formula.reduced = Y ~ Age + Sex,
+ curve_sets = list(Y = abide_9002_23[["curve_set"]]),
+ factors = abide_9002_23[["factors"]], GET.args = list(type = "area"))
R> plot(res.F, what = c("obs", "hi", "hi.sign"), sign.type = "contour")
For illustration of different options, we plot here the observed function (what = "obs"),
the upper envelope (what = "hi") and the observed function with the significant region
(what = "hi.sign"). The significant region is here shown by contour lines, as chosen in the
argument sign.type. Contours do not indicate the size of deviation from the bounds of the
global envelope, but contours or colored regions (sign.type = "col") can be preferable if
the observed function exceeds the envelope on large parts of the domain.
Figure 15 shows that the F rank GLM found significant differences between the groups ap-
proximately at the same pixels r ∈ {1, . . . , d} of the brain image as the graphical functional
GLM above. In general, for a factor with more than two groups, the F rank GLM is however
not able to tell between which specific groups of the factor the differences occur (or which of
the groups deviate from the mean). In the case of heteroscedasticy, the weighted least squares
test statistics can be used instead (Christensen 2002).
28 GET: Global Envelopes in R
0.0
0.1
0 25 50 75 100
x
f(x
)
95% 80% 50% central region (erl)
Figure 16: The global 95%, 80% and 50% confidence bands ("erl") for the cubic regression
(nested gray bands; widest is 95%), the true function (solid line) from which the data points
(dots) were simulated and the median computed from the simulations (dashed line).
3.7. Confidence band in polynomial regression
The bootstrap procedure described in Narisetty and Nair (2016) can be used to compute global
confidence bands for the fitted curve in the linear or polynomial regression. In this example,
regression data was simulated according to the cubic model f(x) = 0.8x − 1.8x2 + 1.05x3
for x ∈ [0, 1] with i.i.d. random noise (dots in Figure 16). Then the data was fitted with
cubic regression (black solid line in Figure 16) and by permuting the residuals 2000 bootstrap
samples were obtained and functions fitted (see more details about the bootstrap procedure in
Narisetty and Nair 2016). Finally a ‘curve_set’ object was constructed of these bootstrapped
functions and the central_region() function was applied to this set to obtain the 95%, 80%
and 50% global confidence bands. The multiple global bands were obtained by setting the
argument coverage = c(0.95, 0.80, 0.50).
The result of the procedure is shown in Figure 16. The code can be found in the help file of
the function central_region() in R.
A theoretical 95% confidence band can be considered under the given bootstrap scheme.
Based on a simulation experiment where the theoretical confidence band was computed from
200000 bootstrapped functions, we observed that the 95% confidence region computed as a
convex hull from s functions converged to the theoretical one from inside for increasing s.
The 95% confidence band computed as the extreme rank envelope from s functions (see
Equation 8) converged to the theoretical one from outside instead. Both these envelopes are
finite approximations of the theoretical envelope. On the other hand, in the sense of Barnard’s
Monte Carlo test (Barnard 1963), the global envelope test (convex hull) is exact for the given
set of simulated functions. In the same sense, the confidence band reaches the given global
level exactly under the given set of functions.
Journal of Statistical Software 29
4. Summary and discussion
We presented the GET package which was designed for global envelopes that are constructed
for a general vector and have IGI (Definition 2.1). The great advantage of these methods is
their graphical output given by the 100(1 − α)% (e.g., 95%) central region, which helps one
to interpret the results in the various applications. The package implements different types
of global envelopes (see Table 1) and their usage in general and for several specific problems.
Because the global envelopes can be used for so many different purposes specified in cases (i)–
(iii) in Section 1, there are several other software, particularly other R packages, that deal
with methods that can be used for the same purposes as the methods in GET. However, to
the best of our knowledge, GET is the first package specializing to the global envelopes with
IGI. Further, it provides testing results both under the FWER control and the FDR control.
Besides the graphical interpretation, another advantage of the proposed global rank envelopes
is their non-parametric (rank-based) nature, which ensures that the functional or multivariate
data coming into the analysis can be inhomogeneous across the domain of their arguments
and this phenomenon does not influence the result of the analysis. For example, before the
methods discussed in this paper appeared, formal goodness-of-fit testing in spatial statistic
was commonly based on the unscaled MAD test (Ripley 1981) or its non-graphical integrated
counterpart (Diggle 1979). However, the result of these tests is influenced by unequal vari-
ability of the test function across its domain leading in general to loss of power (Myllymäki
et al. 2015, 2017). A similar situation appears in the permutation GLM tests which, in the
functional data analysis or neuroimage analysis (see, e.g., Winkler et al. 2014), are commonly
based on the F statistic that standardizes the first and second moments of the data but not the
high quantiles. Thus, when the data are inhomogeneous across the domain and non-normal,
the commonly used F max test (which is similar to the unscaled MAD test) is influenced by
the inhomogeneous quantiles. The rank-based tests discussed here can then lead to higher
power (for details see Mrkvička et al. 2022). Similarly, the rank based methods can adjust the
shape of the central region to inhomogeneous distribution of the studied functions. Therefore,
the global rank envelopes are a valuable tool in all these situations.
A further advantage of the rank tests is that it allows one to give equal weights to the
components fed in. Thus, the method is particularly well suited also for multiple testing with
several univariate or functional test statistics (Mrkvička et al. 2017), for constructing central
regions jointly for various transformations of the functions (for details see Dai et al. 2020), and
for combining various dimensions of multidimensional functions or various functional elements
of multivariate functions. Further, it can be used for functional cluster analysis so that various
transformations of the functions have the same impact on the clustering algorithm (Dai et al.
2022).
Finally, the good properties of the methods presented here are retained also in the case of
testing a composite hypothesis: the two-stage Monte Carlo test is applicable to these graphical
methods (Appendix C).
We are committed to developing the GET package further. For example, new types of global
envelopes can be added, if such are invented, and support for specific applications or different
type of data will be extended.
30 GET: Global Envelopes in R
Acknowledgments
MM was financially supported by the Research Council of Finland (project numbers 295100,
306875, 327211) and TM by the Grant Agency of Czech Republic (Project No. 19-04412S and
24-11096S). The authors thank Jiří Dvořák, Pavel Grabarnik, Ute Hahn, Mikko Kuronen,
Michael Rost and Henri Seijo for their contributions and suggestions of code, as well as
Stavros Athanasiadis, Wenlin Dai, Marc G. Genton, Milan Jílek, Naveen Naidu Narisetty,
Tomáš Roskovec, Samuel Soubeyrand and Ying Sun for their collaboration on developing
global envelope methods. The authors wish to acknowledge CSC – IT Center for Science,
Finland, for computational resources.
References
Aldor-Noiman S, Brown LD, Buja A, Rolke W, Stine RA (2013). “The Power to See: A
New Graphical Test of Normality.” The American Statistician, 67(4), 249–260. doi:
10.1080/00031305.2013.847865.
Anderson M, Ter Braak C (2003). “Permutation Tests for Multi-Factorial Analysis of
Variance.” Journal of Statistical Computation and Simulation, 73(2), 85–113. doi:
10.1080/00949650215733.
Anderson MJ, Robinson J (2001). “Permutation Tests for Linear Models.” Australian & New
Zealand Journal of Statistics, 43(1), 75–88. doi:10.1111/1467-842x.00156.
Baddeley A, Hardegen A, Lawrence T, Milne RK, Nair G, Rakshit S (2017). “On Two-Stage
Monte Carlo Tests of Composite Hypotheses.” Computational Statistics & Data Analysis,
114, 75–87. doi:10.1016/j.csda.2017.04.003.
Baddeley A, Rubak E, Turner R (2015). Spatial Point Patterns: Methodology and Applications
with R. Chapman & Hall/CRC, London. doi:10.1201/b19708.
Barnard GA (1963). “Discussion on ’The Spectral Analysis of Point Processes’ (by M. S.
Bartlett).” Journal of the Royal Statistical Society B, 25, 294. doi:10.1111/j.2517-6161.
1963.tb00508.x.
Besag JE (1977). “Comment on ‘Modelling Spatial Patterns’ by B. D. Ripley.” Journal of the
Royal Statistical Society B, 39(2), 193–195. doi:10.1111/j.2517-6161.1977.tb01615.x.
Bolin D, Lindgren F (2015). “Excursion and Contour Uncertainty Regions for Latent Gaussian
Models.” Journal of the Royal Statistical Society B, 77(1), 85–106. doi:10.1111/rssb.
12055.
Bolin D, Lindgren F (2017). “Quantifying the Uncertainty of Contour Maps.” Journal of
Computational and Graphical Statistics, 26(3), 513–524. doi:10.1080/10618600.2016.
1228537.
Bolin D, Lindgren F (2018). “Calculating Probabilistic Excursion Sets and Related Quantities
Using Excursions.” Journal of Statistical Software, 86(5), 1–20. doi:10.18637/jss.v086.
i05.
Journal of Statistical Software 31
Canty A, Ripley BD (2024). boot: Bootstrap R (S-PLUS) Functions. doi:10.32614/CRAN.
package.boot. R package version 1.3-31.
Castro Conde I, de Una Alvarez J (2024). sgof: Multiple Hypothesis Testing. doi:10.32614/
CRAN.package.sgof. R package version 2.3.5.
Chaiban C, Biscio C, Thanapongtharm W, Tildesley M, Xiao X, Robinson TP, Vanwambeke
SO, Gilbert M (2019). “Point Pattern Simulation Modelling of Extensive and Intensive
Chicken Farming in Thailand: Accounting for Clustering and Landscape Characteristics.”
Agricultural Systems, 173, 335–344. doi:10.1016/j.agsy.2019.03.004.
Christensen R (2002). Plane Answers to Complex Questions: The Theory of Linear Models.
Springer-Verlag, New York. doi:10.1007/978-1-4419-9816-3.
Dai W, Athanasiadis S, Mrkvička T (2022). “A New Functional Clustering Method with Com-
bined Dissimilarity Sources and Graphical Interpretation.” In R López-Ruiz (ed.), Compu-
tational Statistics and Applications. IntechOpen. doi:10.5772/intechopen.100124.
Dai W, Mrkvička T, Sun Y, Genton MG (2020). “Functional Outlier Detection and Taxonomy
by Sequential Transformations.” Computational Statistics & Data Analysis, 149, 106960.
doi:10.1016/j.csda.2020.106960.
Dao NA, Genton MG (2014). “A Monte Carlo Adjusted Goodness-Of-Fit Test for Paramet-
ric Models Describing Spatial Point Patterns.” Journal of Computational and Graphical
Statistics, 23, 497–517. doi:10.1080/10618600.2012.760459.
Davison AC, Hinkley DV (1997). Bootstrap Methods and Their Application. Cambridge
University Press, Cambridge.
Degras D (2016). SCBmeanfd: Simultaneous Confidence Bands for the Mean of Functional
Data. doi:10.32614/CRAN.package.SCBmeanfd. R package version 1.2.2.
Di Martino A, Yan C, Li Q, Denio E, Castellanos F, Alaerts K, Anderson J, Assaf M,
Bookheimer S, Dapretto M, et al (2014). “The Autism Brain Imaging Data Exchange:
Towards a Large-Scale Evaluation of the Intrinsic Brain Architecture in Autism.” Molecu-
lar Psychiatry, 19, 659–667. doi:10.1038/mp.2013.78.
Diggle PJ (1979). “On Parameter Estimation and Goodness-of-Fit Testing for Spatial Point
Patterns.” Biometrics, 35(1), 87–101. doi:10.2307/2529938.
Diggle PJ (2013). Statistical Analysis of Spatial and Spatio-Temporal Point Patterns. 3rd
edition. Chapman & Hall/CRC, Boca Raton.
Duranton G, Overman HG (2005). “Testing for Localization Using Micro-Geographic Data.”
Review of Economic Studies, 72(4), 1077–1106. doi:10.1111/0034-6527.00362.
Dvořák J, Mrkvička T (2022). “Graphical Tests of Independence for General Distributions.”
Computational Statistics, 37, 671–699. doi:10.1007/s00180-021-01134-y.
Febrero M, Galeano P, González-Manteiga W (2008). “Outlier Detection in Functional Data
by Depth Measures, with Application to Identify Abnormal NOx Levels.” Environmetrics,
19(4), 331–345. doi:10.1002/env.878.
32 GET: Global Envelopes in R
Febrero-Bande M, Oviedo de la Fuente M (2012). “Statistical Computing in Functional Data
Analysis: The R Package fda.usc.” Journal of Statistical Software, 51(4), 1–28. doi:
10.18637/jss.v051.i04.
Freedman D, Lane D (1983). “A Nonstochastic Interpretation of Reported Significance Lev-
els.” Journal of Business and Economic Statistics, 1(4), 292–298. doi:10.2307/1391660.
Gorecki T, Smaga L (2018). fdANOVA: Analysis of Variance for Univariate and Multivariate
Functional Data. doi:10.32614/CRAN.package.fdANOVA. R package version 0.1.2.
Grabarnik P, Chiu SN (2002). “Goodness-of-Fit Test for Complete Spatial Randomness
against Mixtures of Regular and Clustered Spatial Point Processes.” Biometrika, 89(2),
411–421. doi:10.1093/biomet/89.2.411.
Häbel H, Rajala T, Marucci M, Boissier C, Schladitz K, Redenbach C, Särkkä A (2017). “A
Three-Dimensional Anisotropic Point Process Characterization for Pharmaceutical Coat-
ings.” Spatial Statistics, 22, 306–320. doi:10.1016/j.spasta.2017.05.003.
Hahn U (2015). “A Note on Simultaneous Monte Carlo Tests.” Technical report, Centre for
Stochastic Geometry and Advanced Bioimaging, Aarhus University.
Hothorn T, Hornik K, Van de Wiel M, Zeileis A (2008). “Implementing a Class of Permutation
Tests: The coin Package.” Journal of Statistical Software, 28(8), 1–23. doi:10.18637/jss.
v028.i08.
Hothorn T, Hornik K, Van de Wiel MA, Zeileis A (2006). “A Lego System for Conditional
Inference.” The American Statistician, 60(3), 257–263. doi:10.1198/000313006x118430.
Illian J, Penttinen A, Stoyan H, Stoyan D (2008). Statistical Analysis and Modelling of Spatial
Point Patterns. Statistics in Practice, 1st edition. John Wiley & Sons, Chichester.
López-Pintado S, Romo J (2009). “On the Concept of Depth for Functional Data.” Journal of
the American Statistical Association, 104(486), 718–734. doi:10.1198/jasa.2009.0108.
López-Pintado S, Torrente A (2021). depthTools: Depth Tools Package. doi:10.32614/
CRAN.package.depthTools. R package version 0.7.
Marcon E, Traissac S, Puech F, Lang G (2015). “Tools to Characterize Point Patterns: dbmss
for R.” Journal of Statistical Software, Code Snippets, 67(3), 1–15. doi:10.18637/jss.
v067.c03.
Mrkvička T, Myllymäki M (2023). “False Discovery Rate Envelopes.” Statistics and Comput-
ing, 33, 109. doi:10.1007/s11222-023-10275-7.
Mrkvička T, Myllymäki M, Hahn U (2017). “Multiple Monte Carlo Testing, with Applications
in Spatial Point Processes.” Statistics and Computing, 27(5), 1239–1255. doi:10.1007/
s11222-016-9683-9.
Mrkvička T, Myllymäki M, Jílek M, Hahn U (2020). “A One-Way ANOVA Test for
Functional Data with Graphical Interpretation.” Kybernetika, 56(3), 432–458. doi:
10.14736/kyb-2020-3-0432.
Journal of Statistical Software 33
Mrkvička T, Myllymäki M, Kuronen M, Narisetty NN (2022). “New Methods for Multiple
Testing in Permutation Inference for the General Linear Model.” Statistics in Medicine,
41(2), 276–297. doi:10.1002/sim.9236.
Mrkvička T, Roskovec T, Rost M (2021). “A Nonparametric Graphical Tests of Significance
in Functional GLM.” Methodology and Computing in Applied Probability, 23, 593–612.
doi:10.1007/s11009-019-09756-y.
Murrell DJ (2018). “A Global Envelope Test to Detect Non-Random Bursts of Trait Evo-
lution.” Methods in Ecology and Evolution, 9(7), 1739–1748. doi:10.1111/2041-210x.
13006.
Myllymäki M, Grabarnik P, Seijo H, Stoyan D (2015). “Deviation Test Construction and
Power Comparison for Marked Spatial Point Patterns.” Spatial Statistics, 11, 19–34. doi:
10.1016/j.spasta.2014.11.004.
Myllymäki M, Kuronen M, Mrkvička T (2020). “Testing Global and Local Dependence of
Point Patterns on Covariates in Parametric Models.” Spatial Statistics, 42, 100436. doi:
10.1016/j.spasta.2020.100436.
Myllymäki M, Mrkvička T (2020). “Comparison of Non-Parametric Global Envelopes.”
arXiv:2008.09650 [stat.ME]. doi:10.48550/arxiv.2008.09650.
Myllymäki M, Mrkvička T, Grabarnik P, Seijo H, Hahn U (2017). “Global Envelope Tests
for Spatial Processes.” Journal of the Royal Statistical Society B, 79, 381–404. doi:
10.1111/rssb.12172.
Nagy S, Gijbels I, Hlubinka D (2017). “Depth-Based Recognition of Shape Outlying
Functions.” Journal of Computational and Graphical Statistics, 26(4), 883–893. doi:
10.1080/10618600.2017.1336445.
Narisetty NN, Nair VJ (2016). “Extremal Depth for Functional Data and Applications.” Jour-
nal of the American Statistical Association, 111(516), 1705–1714. doi:10.1080/01621459.
2015.1110033.
Pedersen TL (2024). patchwork: The Composer of Plots. doi:10.32614/CRAN.package.
patchwork. R package version 1.3.0.
Pollard KS, Dudoit S, van der Laan MJ (2005). “Multiple Testing Procedures: The multtest
Package and Applications to Genomics.” In R Gentleman, VJ Carey, W Huber, RA Irizarry,
S Dudoit (eds.), Bioinformatics and Computational Biology Solutions Using R and Biocon-
ductor. Statistics for Biology and Health, pp. 249–271. Springer-Verlag, New York.
Pollington TM, Tildesley MJ, Hollingsworth TD, Chapman LAC (2020). “Developments
in Statistical Inference When Assessing Spatiotemporal Disease Clustering with the Tau
Statistic.” Spatial Statistics, p. 100438. doi:10.1016/j.spasta.2020.100438.
Ramsay JO, Graves S, Hooker G (2022). fda: Functional Data Analysis. doi:10.32614/
CRAN.package.fda. R package version 6.0.5.
Ramsay JO, Silverman BW (2005). Functional Data Analysis. Springer Series in Statistics,
2nd edition. Springer-Verlag, New York. doi:10.1007/b98888.
34 GET: Global Envelopes in R
R Core Team (2024). R: A Language and Environment for Statistical Computing. R Founda-
tion for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
Ripley BD (1977). “Modelling Spatial Patterns.” Journal of the Royal Statistical Society B,
39(2), 172–212. doi:10.1111/j.2517-6161.1977.tb01615.x.
Ripley BD (1981). Spatial Statistics. Wiley Series in Probability and Mathematical Statistics.
John Wiley & Sons, New Jersey.
Stoyan D (2016). “Point Process Statistics: Application to Modern and Contempo-
rary Art and Design.” Journal of Mathematics and the Arts, 10(1-4), 20–34. doi:
10.1080/17513472.2016.1208980.
Sun Y, Genton MG (2011). “Functional Boxplots.” Journal of Computational and Graphical
Statistics, 20(2), 316–334. doi:10.1198/jcgs.2011.09224.
Sun Y, Genton MG, Nychka DW (2012). “Exact Fast Computation of Band Depth for Large
Functional Datasets: How Quickly Can One Million Curves Be Ranked?” Stat, 1(1), 68–74.
doi:10.1002/sta4.8.
Torrente A, López-Pintado S, Romo J (2013). “depthTools: An R Package for a Ro-
bust Analysis of Gene Expression Data.” BMC Bioinformatics, 14(237). doi:10.1186/
1471-2105-14-237.
Tzourio-Mazoyer N, Landeau B, Papathanassiou D, Crivello F, Etard O, Delcroix N, Mazoyer
B, Joliot M (2002). “Automated Anatomical Labeling of Activations in SPM Using a
Macroscopic Anatomical Parcellation of the MNI MRI Single-Subject Brain.” NeuroImage,
15, 273–289. doi:10.1006/nimg.2001.0978.
Van Lieshout MC (2010). “Spatial Point Process Theory.” In AE Gelfand, PJ Diggle,
M Fuentes, P Guttorp (eds.), Handbook of Spatial Statistics, Handbooks of Modern Statis-
tical Methods, 1st edition. Chapman & Hall, CRC.
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York.
doi:10.1007/978-0-387-98141-3.
Wiesenfarth M, Krivobokova T, Klasen S, Sperlich S (2012). “Direct Simultaneous Inference
in Additive Models and Its Application to Model Undernutrition.” Journal of the American
Statistical Association, 107(500), 1286–1296. doi:10.1080/01621459.2012.682809.
Winkler AM, Ridgway GR, Webster MA, Smith SM, Nichols TE (2014). “Permutation
Inference for the General Linear Model.” NeuroImage, 92, 381–397. doi:10.1016/j.
neuroimage.2014.01.060.
Zou QH, Zhu CZ, Yang Y, Zuo XN, Long XY, Cao QJ, Wang YF, Zang YF (2008). “An
Improved Approach to Detection of Amplitude of Low-Frequency Fluctuation (ALFF) for
Resting-State fMRI: Fractional ALFF.” Journal of Neuroscience Methods, 172, 137–141.
doi:10.1016/j.jneumeth.2008.04.012.
Journal of Statistical Software 35
A. Definitions of global envelopes
Here, we provide definitions of the different global envelopes and the corresponding measures,
as outlined in Table 1. When working with functions, it is assumed that they have already
been discretized as required in practice. Thus, we consider general vectors Ti = (Ti1, . . . , Tid),
i = 1, . . . , s, for which a global envelope is to be constructed.
The measures satisfy the IGI property with probability 1 if there are no pointwise ties,
meaning that there are no ties in T1k, . . . , Tsk for every k = 1, . . . , d. This holds if Ti are
realizations of an absolutely continuous random vector. If pointwise ties do occur, the IGI
property may be violated in elements of the vector where ties are present. For the measures
based on pointwise ranks (Sections A.1-A.4), we use the pointwise mid-rank for tied values
to weigh down the influence of elements (k) where many Tik, i = 1, . . . , s, coincide.
A.1. Global rank envelope
The extreme rank Ri of the vector Ti is defined as the minimum of its pointwise ranks, namely
Ri = min
k=1,...,d
Rik, (7)
where the pointwise rank Rik is the rank of the element Tik among the corresponding elements
T1k, T2k, . . . , Tsk of the s vectors such that the lowest ranks correspond to the most extreme
values of the statistics. How the pointwise ranks are determined depends on whether a one-
sided or a two-sided global envelope (test) is to be constructed: Let r1k, r2k, . . . , rsk be the
raw ranks of T1k, T2k, . . . , Tsk, such that the smallest Tik has rank 1. In the case of ties among
T1k, T2k, . . . , Tsk, the raw ranks are averaged. The pointwise ranks are then calculated as
Rik =

rik, for the one-sided case ‘less’,
s+ 1− rik, for the one-sided case ‘greater’,
min(rik, s+ 1− rik), for the two-sided case.
The cases ’less’ and ’greater’ are most often relevant in a hypothesis testing case (ii) (see
Section 1). The case ’less’ corresponds to the alternative that the values of interesting vector
(T1) are smaller than under the null hypothesis. The case ’greater’ corresponds to the alter-
native that the values are larger than under the null hypothesis. The extreme rank measure
Ri induces an ordering of Ti, i = 1, . . . , s, which can be used to detect the extremeness of
the vectors among each other. Given that T1 is the observed vector in the Monte Carlo or
permutation test, the (conservative) p value of the test is equal to p+ =
∑s
i=1 1(Ri ≤ R1)
/
s.
Since the extreme ranks can have many ties, the test is also equipped with the liberal p value,
p− =
∑s
i=1 1(Ri < R1)
/
s. Then, when α falls inside the p interval (p−, p+], the decision of
the test is not defined.
The 100(1− α)% global rank envelope induced by this measure is defined through
Tlocal,llow k = min
l
i=1,...,s
Tik and Tlocal,lupp k = max
l
i=1,...,s
Tik for k = 1, . . . , d, (8)
by taking l = R(α), according to the point 1 of IGI (see Definition 2.1), i.e., setting T
(α)
low k =
Tlocal,R(α)low k and T
(α)
upp k = T
local,R(α)
upp k . Here minl and maxl denote the l-th smallest and largest
36 GET: Global Envelopes in R
values, respectively, and l = 1, 2, . . . , ⌊s/2⌋. If Ti is strictly outside the envelope for some
k = 1, . . . , d, then also Ri < R(α), and if T
(α)
low k < Tik < T
(α)
upp k for all k = 1, . . . , d, then
Ri > R(α). However, if Ti coincides either with T
(α)
low k or T
(α)
upp k for some k = 1, . . . , d, then
Ri = R(α), and α ∈ (p−, p+] in the testing case.
Because the extreme rank can achieve many ties (see, e.g., Mrkvička et al. 2022), it is necessary
to use a relatively large s for the global rank envelope. The following three refinements of the
extreme rank solve the ties problem and enable the use of a smaller s.
A.2. Global extreme rank length (ERL) envelope
The extreme rank length (ERL) measure (Myllymäki et al. 2017; Narisetty and Nair 2016)
refines the extreme rank measure by breaking the ties in the extreme ranks Ri by taking into
account also the number of Rik which are equal to Ri. Further, the numbers of ranks equal
to Ri + 1, Ri + 2, ... are used to break any remaining ties.
Formally, the ERL measure of Ti is defined based on the vector of the pointwise ordered
ranks Ri = (Ri[1], Ri[2], . . . , Ri[d]), where the ranks are arranged from smallest to largest,
i.e., Ri[k] ≤ Ri[k′] whenever k ≤ k′. While the extreme rank (Equation 7) corresponds to
Ri = Ri[1], the ERL measure takes all these ranks into account by the reverse lexical ordering.
The ERL measure of Ti is
Ei =
1
s
s∑
i′=1
1(Ri′ ≺ Ri) (9)
where
Ri′ ≺ Ri ⇐⇒ ∃n ≤ d : Ri′[k] = Ri[k]∀k < n,Ri′[n] < Ri[n].
The division by s leads to normalized ranks that obtain values between 0 and 1. Consequently,
the ERL measure corresponds to the extremal depth of citetNarisettyNair2016.
The probability of having a tie in the ERL measure is rather small, thus practically the ERL
solves the ties problem. The final p value of a Monte Carlo test is perl =
∑s
i=1 1(Ei ≤ E1)
/
s.
Let E(α) be defined according to the point 1 of IGI and Iα = {i ∈ 1, . . . , s : Ei ≥ E(α)} be the
index set of vectors less or as extreme as E(α). Then the 100(1 − α)% global ERL envelope
induced by Ei is
T(α)low k = mini∈Iα
Tik and T(α)upp k = maxi∈Iα
Tik for k = 1, . . . , d, (10)
see Narisetty and Nair (2016) and Mrkvička et al. (2020).
A.3. Global continuous rank envelope
The ties can alternatively be broken by the continuous rank measure (Hahn 2015; Mrkvička
et al. 2022) which refines the extreme rank measure by considering instead of the (discrete)
pointwise ranks Rik continuous pointwise ranks Cik defined by the ratios of Tik to the closest
other Tjk, j = 1, . . . , s, j ̸= i.
Formally, the continuous rank measure is
Ci =
1
s
min
k=1,...,d
Cik,
Journal of Statistical Software 37
where s scales the values to interval from 0 to 1. The definition of pointwise continuous
ranks Cik depends again on whether a one-sided or two-sided global envelope (test) is to be
constructed:
Cik =

cik, for the one-sided case ’less’
s− cik, for the one-sided case ’greater’
min(cik, s− cik), for the two-sided case.
where cik is the raw continuous rank of Tik among T1k, . . . , Tsk according to Definition A.1
and the three cases are similar to those of Rik above.
Definition A.1 Let y[1] ≤ y[2] ≤ . . . ≤ y[s] denote the ordered set of values yi, i = 1, 2, . . . , s.
Define the raw continuous rank such that the smallest yi has smallest rank following Mrkvička
et al. (2022):
c[j] = j − 1 +
y[j] − y[j−1]
y[j+1] − y[j−1]
for j = 2, 3, . . . , s− 1 and
c[1] = exp
(
−y[2] − y[1]
y[s] − y[2]
)
, c[s] = s− exp
(
−y[s] − y[s−1]
y[s−1] − y[1]
)
.
If there are ties, y[i−1] < y[i] = . . . = y[j] < y[j+1], then the raw continuous rank is defined as
c[k] = i+j2 − 12 for k = i, i+ 1, . . . , j.
The p value of the univariate Monte Carlo test is pcont =
∑s
i=1 1(Ci ≤ C1)
/
s. The 100(1−α)%
global continuous rank envelope induced by Ci is constructed in the same way as global ERL
envelope, i.e., as a hull of Ti for which Ci ≥ C(α), where C(α) is defined according to the
point 1 of IGI. This is achieved by having Iα = {i ∈ 1, . . . , s : Ci ≥ C(α)} in Equation 10.
A.4. Global area rank envelope
Another refinement of rank envelope is the area rank measure (Mrkvička et al. 2022),
Ai =
1
s
(
Ri − 1
d
d∑
k=1
(Ri − Cik)1(Cik < Ri)
)
.
The area measure breaks the ties in the extreme ranks by the sum (area) of the differences
between the extreme rank Ri and the pointwise continuous rank Cik from those k = 1, . . . , d
where the continuous rank is smaller than the extreme rank. The univariate Monte Carlo
test is performed based on Ai with parea =
∑s
i=1 1(Ai ≤ A1)
/
s. The 100(1− α)% global area
rank envelope induced by Ai is constructed similarly as the global ERL and continuous rank
envelopes, with Iα = {i ∈ 1, . . . , s : Ai ≥ A(α)} in Equation 10.
A.5. Global directional quantile, studentized and unscaled envelope
The above four global envelopes are based on the whole distributions of T1k, . . . , Tsk, k =
1, . . . , d. It is also possible to approximate the distribution by a few sample characteristics.
The sample characteristics are in the package GET estimated from Tik, i = 1, . . . , s, for
each k.
38 GET: Global Envelopes in R
The global directional quantile envelope uses the expectation T0k, β% upper quantile T ·k
and β% lower quantile T ·k to approximate the distributions. Setting β = 2.5 was used in
Myllymäki et al. (2017); setting β = 25 can also be useful especially for defining the 50%
central region from a low number of functions. Note that β has to be greater than 100/s
in order to be able to estimate the β and 1 − β quantiles. The directional quantile measure
(Myllymäki et al. 2015, 2017) Di is defined as
Di = max
k
(
1(Tik ≥ T0k) Tik − T0k|T ·k − T0k|
+ 1(Tik < T0k)
T0k − Tik
|T ·k − T0k|
)
, (11)
From historical reasons, Di is defined to be bigger for more extreme vectors. The same holds
for the following two measures. The univariate Monte Carlo test is performed based on Di
with pqdir =
∑s
i=1 1(Di ≥ D1)
/
s, and the 100(1 − α)% global directional quantile envelope
induced by Di is defined by
T(α)low k = T0k −D(α)|T ·k − T0k| and T(α)upp k = T0k +D(α)|T ·k − T0k| for k = 1, . . . , d, (12)
where D(α) is taken according to the point 1 of IGI.
The global studentized envelope approximates the distribution instead by the expectation T0k
and the standard deviation sd(T·k). The studentized measure (Myllymäki et al. 2015, 2017)
is
Si = max
k
∣∣∣Tik − T0ksd(T·k)
∣∣∣, (13)
and the univariate Monte Carlo test is performed based on si with pst =
∑s
i=1 1(Si ≥ S1)
/
s.
The 100(1− α)% global studentized envelope induced by Si is defined by
T(α)low k = T0k − S(α)sd(T·k) and T(α)upp k = T0k + S(α)sd(T·k) for k = 1, . . . , d, (14)
where S(α) is taken according to the point 1 of IGI.
The global unscaled envelope considered for the sake of completeness has its origin in the
classical Kolmogorov-Smirnov statistic. The unscaled measure Ui can be defined as Ui =
maxk|Tik − T0k|, the univariate Monte Carlo test performed based on Ui has the p value
punsc =
∑s
i=1 1(Ui ≥ U1)
/
s, and the 100(1 − α)% global unscaled envelope induced by ui is
given by
T(α)low k = T0k − U(α) and T(α)upp k = T0k + U(α) for k = 1, . . . , d, (15)
where U(α) is taken according to the point 1 of IGI. A problem of this envelope is that its
width is the same along the whole domain, thus it cannot account for the changes in the
variability of the distributions Ti across different values of k (Myllymäki et al. 2015, 2017).
B. Combined global envelopes
Assume that there are G vectors Tji = (T
j
i1, . . . , T
j
idj
), j = 1, . . . , G, i = 1, . . . , s, dj ≥ 1, and
the construction of a global envelope is wanted jointly for all of them. A combined global
envelope test can be made in two different ways.
In the two-step combining procedure, first, a measure is chosen for each j = 1, . . . , G and
computed for the vectors Tji , i = 1, . . . , s and j = 1, . . . , G. Let the resulting measures be
Journal of Statistical Software 39
mji . As the second step, the one-sided extreme rank length is applied to the new vector
T′i = (m1i ,m2i , . . . ,mGi ) of the measures. As a result, a joint sorting of vectors T1i , . . .TGi , i =
1, . . . , s, is obtained and a joint extreme rank length measure Ei is attached to every i =
1, . . . , s. The p value of the combined Monte Carlo test is perl =
∑s
i=1 1(Ei ≤ E1)
/
s, and
the common 100(1− α)% global envelope is constructed similarly as the 100(1− α)% global
extreme rank length envelope (Equation 10): Let E(α) be defined according to the point 1 of
IGI and Iα = {i ∈ 1, . . . , s : Ei ≥ E(α)} be the index set of vectors less or as extreme as E(α).
Then the common 100(1− α)% global envelope is
T(α),jlow k = mini∈Iα
T jik and T
(α),j
upp k = maxi∈Iα
T jik for k = 1, . . . , dj , j = 1, . . . , G. (16)
The extreme rank length measure is chosen in the second step because it gives the same weight
to every component (even when dj , j = 1, . . . , G, are different or even if different measures
are used in the first step), it is based on ranks only and it achieves almost no ties.
In cases where d1 = . . . = dG (= d), it is also possible to use a simple one-step combining
procedure. Then the global envelope (any of those in Table 1) is constructed for the long
vectors
Ti = (T 1i1, . . . , T 1id, T 2i1, . . . , T 2id, . . . . . . , TGi1 , . . . , TGid ), i = 1, . . . , s.
The one-step combining can be used for example when multivariate functional data (Ti1, . . . , Tid),
i = 1, . . . , s, where Tik = (T 1ik, . . . , T Jik) are multivariate vectors of J elements, are investigated.
Then it is possible to separate the dimensions into a set of J marginal vectors applying the
one-step combining procedure, i.e., to take Ti = (T 1i1, . . . , T 1id, . . . , T Ji1, . . . , T Jid). Further, it is
also possible to add other vectors expressing the correlation between the elements of the vec-
tors, e.g., if we have a two-dimensional functional data, the vector (T 1i1T 2i1−T 101T 201, . . . , T 1idT 2id−
T 10dT
2
0d) can be added into Ti behind the marginal vectors. Here T
j
0k denotes the expectation
of T jik, i = 1, . . . , s.
The graphical functional ANOVA and GLM (see the functions in Table 2) use the one-step
combining procedure to merge the mean or contrast vectors under inspection, because in this
case all the vectors have the same structure (see Sections 3.5 and 3.6 and Mrkvička et al. 2020,
2021). On the other hand, for generality, the default combining procedure of global envelope
construction functions in GET is the two-step procedure, which is presented for the first time
here as an improvement of the combined tests of Mrkvička et al. (2017) (see an example
in Section 3.1). The combined envelopes are implemented in the central_region() and
global_envelope_test() functions as mentioned above, and the one- or two-step procedure
can be specified in the argument nstep (either 1 or 2).
C. Adjusted global envelopes for composite null hypotheses
The Monte Carlo tests for which the global envelopes are constructed are exact only in the
case when the null hypothesis is simple, i.e., when no parameters have to be estimated. This
is the case in permutation tests of task (iii), but in task (ii) the null hypothesis can often be
composite, i.e., some parameters of the null model have to be estimated. In such a composite
case, the classical Monte Carlo test can be liberal or conservative. This problem can be
solved by a two-stage procedure, where in the first stage the level of the test is estimated.
Such a procedure was first introduced by Dao and Genton (2014) for Monte Carlo tests.
40 GET: Global Envelopes in R
Myllymäki et al. (2017) extended this adjusted method for global envelopes. Baddeley et al.
(2017) improved the procedure further in order to obtain an exact significance level. Here
the procedure of Baddeley et al. (2017) is summarized and extended for global envelopes as
implemented in GET.
Let M denote the chosen measure and α the chosen significance level. Let T1 be the test
vector computed from the data.
1. Estimate the parameters θ1 of the null model.
2. Simulate s2−1 replicates of the data from the null model with the estimated parameters
θˆ1, and compute the test vectors T11 = T1,T12, . . . ,T1s2 (create a ‘curve_set’ object of
vectors, C1).
3. Simulate another s− 1 replicates of the data from the null model with the parameters
θˆ1 and estimate the parameters of the null model from each of them (θˆi, i = 2, . . . , s),
4. For every i = 2, . . . , s, simulate s2−1 replicates from the null model with parameters θˆi,
and compute the test vectors Ti1,Ti2, . . . ,Tis2 (create a ‘curve_set’ object of vectors,
Ci).
5. For each set of curves Ci, i = 1, 2, . . . , s, compute the Monte Carlo p value pi =∑s2
j=1 1(M ij ≤M i1)
/
s2, whereM i1, . . . ,M is2 are the chosen measure computed forTi1, . . . ,Tis2 .
6. The adjusted MC p value is padj =
∑s
j=1 1(pi ≤ p1)
/
s.
7. Let pα denote the lower α quantile of the sample p1, . . . , ps. Construct the chosen
100(1− pα)% global envelope from T11, . . . ,T1s2 .
This adjusted test is implemented in the GET.composite() function of the GET package.
Examples can be found in the help page of the function in R and in Section 3.3.
Affiliation:
Mari Myllymäki
Natural Resources Institute Finland (Luke)
Latokartanonkaari 9, FI-00790 Helsinki, Finland
E-mail: mari.myllymaki@luke.fi
URL: https://www.luke.fi/en/experts/mari-myllymaki/
Tomáš Mrkvička
University of South Bohemia
Faculty of Agriculture and Technology
Studentská 1668, 37005 České Budějovice, Czech Republic
E-mail: mrkvicka.toma@gmail.com
Journal of Statistical Software https://www.jstatsoft.org/
published by the Foundation for Open Access Statistics https://www.foastat.org/
November 2024, Volume 111, Issue 3 Submitted: 2020-10-24
doi:10.18637/jss.v111.i03 Accepted: 2024-05-04