COMMUNICATIONES INSTITUTI FORESTALIS FENNIAE 138 MODELS AND METHODS FOR ANALYSING SPATIAL PATTERNS OF TREES ERKKI TOMPPO SELOSTE MALLEJA JA MENETELMIÄ PUIDEN TILAJÄRJESTYKSEN ANALY SOIMISEKSI HELSINKI 1986 COMMUNICATIONES INSTITUTI FORESTALIS FENNIAE THE FINNISH FOREST RESEARCH INSTITUTE (METSÄNTUTKIMUSLAITOS) Unioninkatu 40 A SF-00170 Helsinki 17 FINLAND telex: 125181 hvfor sf attn: metla/ phone: 90-661 401 Director: Professor Aarne Nyyssönen Distribution and exchange of publications: The Finnish Forest Research Institute Library Unioninkatu 40 A SF-00170 Helsinki 17 FINLAND Publications of the Finnish Forest Research Institute: Communicationes Instituti Forestalls Fenniae (Commun. Inst. For. Fenn.) Folia Forestalia (Folia For.) Metsäntutkimuslaitoksen tiedonantoja Cover (front & back): Scots pine ( Pinus sylvestris L.) is the most important tree species in Finland. Pine dominated forest covers about 60 per cent of forest land and its total volume is nearly 700 mil. cu.m. The front cover shows a young Scots pine and the back cover a 30-metre-high, 140-year-old tree. COMMUNICATIONES INSTITUTI FORESTALIS FENNIAE 138 ERKKI TOMPPO MODELS AND METHODS FOR ANALYSING SPATIAL PATTERNS OF TREES Academic dissertation to be presented, with the permission of the Faculty of Social Sciences of the University of Helsinki, for public criticism in Auditorium Porthania 111 , Hallituskatu 11, on December 18th, 1986, at 12 o'clock noon. SELOSTE MALLEJA JA MENETELMIÄ PUIDEN TILAJÄRJESTYKSEN ANALYSOIMISEKSI HELSINKI 1986 TOMPPO, E. 1986. Models and methods for analysing spatial patterns of trees. Seloste: Malleja ja menetelmiä puiden tilajärjestyksen analysoimiseksi. Communicationes Instituti Forestalls Fenniae 138. 65 p. The objective is to present modern statistical methods for analysing spatial patterns of trees. The Gibbs process with pairwise interactions is used as a stochastic model for tree patterns. The Takacs-Fiksel approach is applied as an estimation method. This method, intended originally for mapped point patterns, is modified for field observations where the measurements are distances. The final aim is to be able to predict and simulate the spatial pattern in terms of some conventional forest variates with the obtained results. Therefore, the prediction of the type of pattern, given these variates, is considered. The methods are applied to sample plots representing development classes from seedling to old thinning stands in mineral soils in the southern half of Finland. Tutkimuksessa esitetään moderneja tilastollisia menetel miä puiden tilajärjestyksen analysoimiseksi. Stokastise na mallina tilajärjestykselle käytetään pareittaisten vuorovaikutusten Gibbsin prosessia. Parametrien esti mointiin sovelletaan Takacs-Fikselin menettelytapaa. Tämä menetelmä, joka alunperin on esitetty kartoite tulle aineistolle, yleistetään etäisyysmittauksista muo dostuville kenttähavainnoille. Tavoitteena on, että esitetyillä menetelmillä ja tuloksilla voidaan ennustaa ja simuloida puiden tilajärjestys, kun joidenkin tavan omaisten metsikkömuuttujien arvot tunnetaan. Tämän vuoksi tarkastellaan myös tilajärjestyksen ennustamista näiden muuttujien avulla. Menetelmiä sovelletaan metsikkökoealoihin, jotka edustavat Suomen eteläpuo liskon kivennäismaiden kasvatusmetsiä. Keywords: spatial pattern, Gibbs point processes, Takacs-Fiksel method, nearest neighbour measurements ODC 53 + 181.6 Author's address: Technical Research Centre of Finland, Laboratory of Urban Planning and Building Design, Itä tuulenne 2, SF-02100 Espoo, Finland Helsinki 1986. Valtion painatuskeskus ISBN 951-40-0752-2 ISSN 0358-9609 1 462662 T CONTENTS 1. INTRODUCTION 5 2. MATHEMATICAL CONCEPTS 8 3. SOME MATHEMATICAL MODELS FOR SPATIAL PATTERNS OF TREES ... 12 31. Poisson forest 12 32. Inhomogeneous Poisson processes 13 33. Poisson cluster processes 14 34. Doubly stochastic Poisson processes 15 35. Lattice-based processes 16 36. Markov point processes 17 361. Hard-core processes 18 362. Pairwise interaction processes 18 4. ANALYSIS OF SPATIAL POINT PATTERNS 23 41. Field methods 23 411. Quadrat methods 23 412. Distance methods 24 42. Analysis of mapped data 25 421. Testing of Poisson hypothesis by means of Ripley's K 26 43. Estimation of the parameters of point processes 27 431. Problems occurring in estimation 27 432. Takacs-Fiksel estimation method 29 433. Estimation of the parameters of Gibbs process based on the nearest neighbour measurements 31 5. THE EMPIRICAL DATA OF THE STUDY 34 51. The sampling design of INKA sample plots 34 6. RESULTS 37 61. Proportions of spatial pattern types 37 62. Prediction of the spatial pattern type in terms of stand variates 39 621. Estimation method used 40 622. Conditional distributions of the spatial pattern types 41 63. The estimates of the parameters of Gibbs processes 46 631. The estimates from mapped data 46 632. The estimates of the parameters of Gibbs processes by means of nearest neighbour measurements 50 7. SIMULATION OF THE SPATIAL PATTERNS OF TREES IN A FOREST REGION 53 71. Simulation of spatial birth-and-death processes 53 8. SUMMARY 57 REFERENCES 60 SELOSTE 62 APPENDIX LIST OF SYMBOLS 64 4 Tomppo, E. PREFACE I wish to express my sincere gratitude to Professor Hannu Niemi for his encourage ment, for tirelessly reading through the many versions of the manuscript, and for his valuable advice. I am also deeply indebted to Dr. Antti Penttinen for constant guidance, encouragement and fruitful discussions in the course of the study. For the empirical data, my thanks are due to Professor Yrjö Vuokila and the Depart ment of the Forest Inventory and Yield of the Finnish Forest Research Institute, es pecially to Mr. Hans Gustavsen and Mr. Risto Päivinen. The manuscript was read and criticized in a constructive manner by Professors Pekka Kilkki, Kullervo Kuusela, Simo Poso and Yrjö Vuokila. Further, Ms. Juni Palmgren suggested several improvements in one chapter. My thanks are due to all of them. I also wish to thank my family, my wife Irma and my daughters Liisa-Maaria and Anna Karelia for the support. For financial support I am indebted to the Finnish Cultural Foundation. The Academy of Finland has granted me a half year research scholarship, the Technical Research Centre of Finland, apart from granting me a research scolarship for one and half a months, provided the facilities for me to carry out the study, and the Finnish Forest Research Institute gave me free computer time for the heavy computations of the study. Finally, I am grateful to the Finnish Forest Research Institute for ac cepting this study to its publication series. Espoo, October 1986 Erkki Tomppo 5 Commun. Inst. I'or. I'enn. 138 1. INTRODUCTION The characteristics used in forestry are commonly divided into three groups de scribing 1) an individual tree, 2) a forest stand and 3) a forest region. A forest stand can be assumed to consist of soil, trees and other vegetation. Trees in a stand can be characterized in terms of their size distribu tions and their relative locations with respect to each other. The relative locations of trees can be illustrated by the point configuration formed by the dimensionless trees on the horizontal plane, called the spatial pattern of trees. The spatial pattern of trees can also be described by using stochastic models, called spatial point processes. The main objective of this study is to present statistical models and methods to analyse spatial patterns of trees in a forest stand (or point configurations in general). The models are estimated for the spatial patterns of stands representing forests from the seedling stage to the old thinning stage in mineral soils in southern half of Finland. Further, the prediction of the spatial pattern type, given some 'conventional' forest vari ates, is considered. The goal is that, with the obtained results, one can predict and simulate the spatial pattern if some conven tional stand variates are known. Before going to the details, we present a survey of different fields of forestry where the spatial pattern should be taken into account. The relative spatial distribution of trees plays an important role in many areas of forest research and forest management. Examples of purposes where its estimate may be utilized are forest inventory plann ing, the construction of growth models of trees or stands, and problems relating to forest regeneration and thinning. A forest inventory usually involves measurement in the field even if aerial photographs or satellite imagery are utilized. Before the measurements can be carried out, the sampling design has to be defined, that means, the size, form and the locations of the sample plots (for example the choice between a fixed size sample plot and a relascope sample plot). The sampling design is quite often affected also by tradition or the purpose of the inventory, for instance a timber or forest sale, a working out forest management planning, a reconnaissance type of survey and so on. However, in the ideal case, the final sample plan should be chosen in such a way that the variances of estimators, given the total costs, are mini mized. Minimizing the variances is rather complicated in forestry. One reason is the large number of variates affecting the variances of estimators. The variances de pend for example on the size distributions of trees, the spatial variation in the fertility of the soil, and the relative locations of trees; see Matern (1960 and 1972) and Oder wald (1981). (The fact that different variates may require different solutions causes addi tional difficulties; cf. Ranneby (1981).) All these factors, including spatial patterns of trees, should therefore be taken into ac count in the inventory planning. It should at least be possible to estimate their values, given a priori information. Because of the complexity of the task, the variances of estimators have been assessed experimen tally. Examples of extensive investigations are Nyyssönen et ai. (1967 and 1971). A large number of samples from a completely measured forest were selected in those studies, and variances of estimators were computed through sample estimates. If a forest model (involving the spatial distribu tion of trees) were available, one could produce artificial forests and simulate sam pling, comparing the efficiency of different sampling methods. Thus the spatial pattern of trees can be utilized in forest inventory at least in two different ways: 1) for determi ning or approximating the variances of estimators, and 2) for generating artificial forests in which sampling designs will be simulated. The spatial pattern affects, through the interaction of trees, the growth of an individual tree and thereby the current timber productivity of the whole stand. This fact has been recognized in forestry for 6 Tomppo, E. a long time. Early references to it in Finnish forest research are to be found in Aaltonen (1923, 1925 a and 1925 b). According to Aaltonen (1925 b), interaction depends for instance on the crown sizes, the lengths and especially the root contacts of trees. On the other hand, root contacts depend for instance on the forest site type (Aaltonen 1923). The interaction of trees could be integrated into the growth models as follows. The growth of an individual tree is estimated as a function of the soil fertility, the size and age of the tree, and the sizes and locations of the neighbouring trees. (The locations can be introduced for instance by using tessellations.) The growth of the stand is obtained as the sum of the growths of individual trees. Another possibil ity is to estimate the growth of the stand as a function of the 'conventional' stand vari ates and the relative locations of trees, the last one expressed for instance by the para meters of a statistical model. The distribution of the crowns in the vertical plane affects the growth of trees, too. The distribution of the growing space in the vertical plane has been treated by Lönnroth (1925) and its development and connections to the quality classes of crowns and stems by Nyyssönen (1950). The methods of spatial analysis have been applied in forestry perhaps most frequently in regeneration surveys. The problem usual ly is whether the intensity of a seedling or sapling stand is high enough and the spatial distribution of the trees regular enough. Different kinds of indices have traditionally been applied; cf. Persson (1964). The configuration of trees is usually compared with a configuration of independently and uniformly distributed trees, that is, with a configuration generated by a Poisson pro cess. A description used in forestry is the zero-plot diagram which measures the pro portion of empty plots as a function of the size of the circle-shaped sample plot (Cox 1971). This has been applied by Pohtila (1980) to investigate how the degree of aggregation of the spatial pattern develops in seedling and sapling stands in a region located in Northern Finland. However, mathematical models have not been exten sively applied even to seedling surveys. One aim of the treatments, such as thinnings, applied to forests is to make the spatial pattern more regular in order to distribute the growth factors evenly among the trees. Usually, however, the thinning models are based for instance on the basal area and dominant height of trees. (In practice the degree of aggregation is taken into account by thinning the thicknesses.) Attempts to include the degree of aggrega tion in thinning models have been discussed by Kilkki et ai. (1985). Further examples of circumstances under which the spatial pattern of trees might be of significance are 1) spreading of diseases, 2) production high-quality timber (the growing space usually affects the quality of timber), and 3) the simulation of artifical forests for different research purposes; cf. Kilkki (1983). Flow can the distribution of trees be taken into account when solutions for these and other forestry problems are being sought? Traditionally, spatial distribution, if introduced at all, has been summarized in indices; cf. Loetsch et al. (1973, Sec. 72). For some problems they may be informative enough. However, according to Ripley (1981, p. 1), 'there are so many different types of spatial patterns that we need to summarize the data in one or more graphs rather than by single numbers, such as the mean and standard deviation of classical statistics'. Further, the indices do not indicate, for instance, in which scale the possible aggregation or repulsion occurs. A natural data summary is a mathematical model of the phenomenon generating the pattern, even if 'a model must often be almost grotesquely oversimplified in com parison with the actual phenomenon stu died', as Matern (1960, p. 28) remarks. The use of statistical models leads to the problem of hypothesis testing and model fitting. The simplest model for point configur ations is the model of independently and uniformly distributed individuals, the so called Poisson process in the plane. There fore a classical problem in dealing with point configurations is whether the configuration can be regarded as generated by a Poisson process. The usual alternatives are the regular and clustered ones. Different kinds of indices have traditionally been applied, later also graphs, such as second-order summaries. If the Poisson hypothesis is rejected, more complicated models and mathematical tools for the parameter estima 7 Commun. Inst. For. Fenn. 138 tion will be needed. A natural sub-family of processes to model the interaction in point configur ations is the family of Gibbs processes, originally introduced in statistical physics. They will also be utilized in the present study. The parameters of Gibbs processes can also be used as a data summary. The model fitting and hypothesis testing of spatial point processes is somewhat problematic. One difficulty in hypothesis testing is that the distribution of the test statistic is often unknown under a null hypothesis other than a Poisson process. Therefore a Monte Carlo test is used quite often. In the parameter estimation, the straightforward maximum likelihood method is not applicable to many common spatial processes. Therefore various kinds of approximative methods have been de veloped; for a review see Penttinen (1984). A new and very elegant method to estimate the parameters of the Gibbs type processes has been developed by Takacs (1983) and Fiksel (1984). The method is based on the reduced Palm distribution of the relevant process. The Palm distribution, which is a function of the location x, is in fact a conditional distribution, given a tree is located at the point x. If the integral of a suitably chosen function u is computed with respect to the Palm distribution and with respect to the original process, a measure for a deviance from a Poisson process is obtained. This method is applied in the present study. We also pay special attention to the choice of the function u. In spatial analysis, the sampling method partly determines what kind of information we can get from the sample. The mapped data are usually the most informative and also the most expensive. So far the para meter estimation has been based on mapped data. A method of estimating the parameters from unmapped data is presented in this study. The measurements are distances from randomly chosen points and trees to the nearest tree and countings of the numbers of trees in circles around random points. The empirical part of this study concen trates on three main problems: 1) The estimation of parameters of Gibbs processes from mapped data. A pairwise interaction process is applied as the model. The models are estimated by the mean diameter class and spatial pattern type for the three most important dominant tree species in Finland. 2) The estimation of parameters of Gibbs processes with unmapped data described above. The models referred to are estimated for pine ( Pinus sylvestris L.) on the basis of the nearest neighbour measurements. Ac cording to the obtained results, the parame ters of a Gibbs process can in many cases be quite reliably estimated also with these kinds of samples. 3) The prediction of the spatial pattern type of trees as a function of conventional forest variables (tree species, basal area of trees, etc.). This information can be utilized for example for the produc tion of artificial forests, because the distribu tions of conventional forest variates are generally better known than the spatial distributions of trees. The conditional distributions of the spatial pattern types are estimated by a multivariate logistic model. We also deal with the simulation of the locations of trees in a forest region with the obtained results. The models are applied to the permanent sample plots of the National Forest Inven tory, called the INKA growth sample plots, and to the sample plots of the 7th National Forest Inventory founded by the Finnish Forest Research Institute, Department of Forest Inventory and Yield. Only plots located in mineral soils in the southern half of Finland, totally about 1300 sample plots, are used. An essential feature of the INKA plots is that the coordinates of the trees are known. Thus the data can be treated with methods based on mapped data or field measurements. The plan of the study is as follows. Some basic mathematical definitions are intro duced in Chapter 2. Chapter 3 presents some possible models for spatial pattern of trees. Chapter 4 deals with hypothesis testing and parameter estimation. The Takacs-Fiksel estimation procedure is pre sented in Section 432 and the new estima tion method based on the field measure ments in Section 433. The applied INKA stand data and its properties are discussed in Chapter 5. The obtained results are summar ized in Chapter 6. Finally, the simulation of the locations of trees is described in Chapter 7. 8 Tomppo, E 2. MATHEMATICAL CONCEPTS In modelling the locations of trees in a forest, the forest is generally regarded as a subset of the horizontal Euclidean plane on which the supposed origin points of trees are projected. If the area of the relevant forest is large compared with the diameters of the trees, an individual tree can be represented as a single point. The set consisting of these points is called a spatial point pattern. The points of a pattern are referred to as individuals or simply trees, to distinguish them from arbitrary points of the plane. A spatial point pattern thus indicates the distribution of the horizontal space among trees within the region. Figures 1 a)-f) show six spatial patterns consisting of the locations of trees in six INKA sample plots (see Chapter 5). In the first two figures, the locations of trees can be regarded as randomly distributed. In Figures c)-d) trees form clusters, which are bigger or appear more frequently than in a random case. This type of pattern will be called 'clustered' (the terms 'aggregated' and 'attractive' have also been used). In the last two figures, the trees are distributed over the sample plot more regularly than in the previous figures. There seems to be some kind of repulsion between the trees. We therefore refer to such a pattern as 'repul sive' or 'regular'. The aim of the analysis of spatial point patterns is to measure, by using quantitative characteristics, how individuals are located with respect to each other and to describe with mathematical models the laws regu lating the location. The choice of the method of analysis depends among other things on the type of spatial sampling, i.e., on the types of measurements used at data collection. The sampling methods usually fall into three classes: 1) The area in question is divided into small subareas, for example into quadrats, and the number of individuals in each quadrat is counted. 2) The distances between an arbitrary individual or an arbitrary point of the plane and the nearest neighbouring individuals are measured. 3) The locations of the individuals in the whole area or some subarea are mapped. With measurements belonging to the first and second groups, one can test whether the positions of individuals are distributed randomly, and, if not, whether the pattern can be regarded as a regular or a clustered one. Further, different kinds of indices to measure the amount and the direction of deviation from a random pattern can be calculated. In a more advanced analysis, mathemati cal laws for the relative locations of trees are searched. The most informative spatial data for this purpose are the mapped data, i.e., the data in which the coordinates of trees in some subarea are known. With mathematical models one can obtain de tailed knowledge about the underlying random mechanism which has generated the pattern. Further, mathematical models allow the use of simulation procedures for arti ficial production of point patterns compat ible with data. First, we shall give some notations and definitions which are used in modelling the locations of trees. For this purpose we need some basic definitions from probability theory; at this point the reader can refer for example to Neveu (1965). As a model for spatial point pattern we use stochastic point processes in the plane, also called spatial point processes. Recall that an ordinary stochastic process is a collection of random variables {Z(t,co) | tET, toEfi}, where T is a set of indices. Often T is the real line or its subset, for instance the set of non-negative real numbers [o,°°). With every fixed t, Z(t,w) is a random variable and with every fixed to it is a real valued function of t. This is called a realization or trajectory of Z. As an example of stochastic processes, we mention the stem volume of a tree from 9 2 462662 T Commun. Inst. For. Fenn. 138 Figure 1. a) —f) Locations of trees in six INKA sample plots. The location of each number refers to the location of a tree in a plot and the value of the number to the diameter class of the tree, measured at the height of 1.3 m, the class interval five cm. The mean diameter (D), the basal area (G), the number of trees per hectare (N) and the radius of the sample plot (r) are given under the figure. 10 Tomppo, E, birth to death. In this case, the set of non negative real numbers, R_|_, or non-negative integers, N O, could be taken as the index set, depending on whether the volume is observed continuously or at certain time intervals. A realization of the process in this case is the volume of a certain tree from birth to death. In applications, the probability laws related to the stochastic process are deter mined through finite dimensional distribu tions, that is, through probabilities of the form Under certain general regularity condi tions, a probability measure into the space of all realizations can be defined in this way. This probability measure is called the distribution of the process. In the case of spatial point processes Z, one could take the set of all pairs of real numbers x=(xi,x2) as an index set, and Z could be defined by Z(x)=l if there were an individual in the point x, and Z(x)=o otherwise. Such representation would, however, be useless because P(Z(x)=l) would be zero almost everywhere and the distribution of the process would contain no information at all. This problem can be overcome by consider ing, instead of points, subsets of the Euclidean plane and the numbers of individ uals in the subsets. Formally, we define a point process in the plane as a probability measure in the space consisting of all possible realizations; see for example Mecke (1967). Let us denote an individual located at the point x by <5 X and the configuration formed by individuals by /J. The point configuration can be identified with an integer-valued Radon measure defined on Borel sets R 2 of the Euclidean plane R 2. The value of the measure fj(B), BG/i 2, is equal to the number of individuals in the set B. Let us further denote the set of all the measures defined in this way by N and by N the smallest a-algebra which makes all the counting functions measurable for every B €r R 2. (The value of the counting function at /J for any B £ R 2 is the number of individuals in the configur ation /U belonging to the set B.) A spatial point process can be defined as a probability measure P on the space (N,7V). (We can interpret P(N;), N; G N as the relative frequency of configurations belonging to the set N;.) The intensity measure of the point process P on (N,7V) is a mapping A: R 2 (o,°°) defined by Also, the intensity measure of a set B is the expected number of individuals be longing to the set B. The intensity function A(x) of the point process is, by definition, the Radon- Nikodym derivative of the intensity measure with respect to the Lebesgue measure, that is, the function defined by the equation where v stands for the Lebesgue measure of the real plane. The second-order intensity measure A 2 of the point process P is defined through the equation The second-order intensity function A.2(x,y) is the Radon-Nikodym derivative of the second-order intensity measure with respect to the measure v(g)v; cf. for example Nguyen and Zessin (1979). For every x E R 2 we define the shift operator T x on the space (N,AO by where B+s ={x | x-s GB). A point process is stationary if its distribution is invariant under an arbitrary translation of the origin, that is, A point process P is isotropic if its distribution is invariant under an arbitrary rotation about the origin. Stationarity implies \(x) = A (= constant) for every x, in other words, the homogeneity of the process, and, further, that A.2( x>y) * s a function of the difference x-y only. For an isotropic process, A 2(x,y) is a function of x and the Euclidean distance of x and y, d(x,y) = t. If a process is both stationary and isotropic, then X2(x,y) is (2.1) P(Z(tj) G A;, i=l,.. ~m). f B : N - N o> hif) = A<(B) (2.2) A(B) =///(B) P(d/u), B G R 2. (2.3) A(A) = E&u(A)) = Ja(X) y(dx), A G R 2, A (2.4) A2(A XB) = P(djU), A, B G R 2. N T = /u(B+s), B £ R 2, P(T X F) = P(F) for every F6N, x G R 2. 11 Commun. Inst. For. Fenn. 138 a function of d(x,y) only. The second-order property of a stationary isotropic process can be summarized in the function A.2(t). Equivalently, this may be done by using Ripley's second-order para meter K(t) (Ripley 1976 and 1981). Formally K(t) is defined by and a l is the uniform distribution on the surface of the sphere of radius t; see Ripley (1981). The functions K(t) and are con ~ nected (if both exist) by the relation (Ripley 1977). The parameter K(t) has an intuitive interpretation: AK(t) is the expected num ber of distinct individuals within distance t from an arbitrary individual (without coun ting the centre individual). In some cases, it may be convenient to use some other summary characteristics of spatial point processes in addition to second-order properties. (In general, the second-order property does not uniquely determine the process). Diggle (1983) pre sents two distribution functions for station ary and isotropic processes, the distributions of the distances to the nearest tree, measured 1) from an arbitrary tree and 2) from an arbitrary point of plane, If k=l, we denote these functions by G and F, respectively. Ripley (1977) has introduced a so-called test set method based on the function where B is a test set or a structur ing element, usually a circle, hexagon or square, centred at the origin. Here, tB = {tx | x£B|, that is, if for example B is the unit circle, tB will be the circle with radius t. Moreover, if B is the unit circle, this function is in fact the distribution function F. (If the process is stationary, p x(t) is independent of x.) By using oriented shapes such as ellipses or rectangles we can describe departures from isotropy. For an unbiased estimator of p x(t) see Ripley (1977). It is also possible to use measures based on tessellations or triangulations. Dirichlet cells are constructed by associating with each tree the part of the plane that is nearer to that tree than any other. A Delanau triangulation can be formed from Dirichlet cells joining points for which the associated polygons have an edge in common; see for example Ripley (1981). In forestry applica tions, a Dirichlet cell of a tree stands for the available growing space of the tree and its area for the share of the growing factors of that tree. It thus represents a tree variate for the description of the spatial distributions of the trees of a stand. (2.5) E( m (A) m (B)) = Ai'(AOB) + A 2/ v t (A X B)dK(t). o where << t (A XB) =/ a t [{y-x |yG B, d(x,y) = t}] y(dx) A (2.6) \K(t) = 27rA-i/A„(s)sds O (2.7) G]l(t) =P( the distance from an arbitrary tree to the kth nearest tree is at most t) (2.8) F|l (t) =P( the distance from an arbitrary point to the kth nearest tree is at most t). (2.9) Px (t) = PG«(x+tß)>o), 12 Tomppo, E 3. SOME MATHEMATICAL MODELS FOR SPATIAL PATTERNS OF TREES 31. Poisson forest The homogeneous Poisson point process in the plane, subsequently called a Poisson process or Poisson forest, is the simplest model for spatial patterns of trees. Poisson processes are also suitable building blocks for other point processes. A Poisson process can be defined in several equivalent ways. We shall recall two definitions. A point process f is a homogeneous Poisson process with constant intensity function X (number of trees per hectare) if for every measurable non-negative function u in (R 2 X N, R2(g)N) holds; see Mecke (1967). This is equivalent with the following definition; cf. for example Kallenberg (1974) or Nguyen and Zessin (1979). A point process is a Poisson process if it satisfies the following two conditions: 1. The number of trees in any Borel set A is Poisson distributed with the mean Af(A). 2. The numbers of trees in disjoint sets are indepen dent. The equation 3.1 indicates among other things that the numbers of trees in two circles of the same size, one centred on an arbitrary point of the plane, and the other on an arbitrary tree, are the same. A Poisson process is completely charac terized by its intensity X. The second-order intensity can be calculated from the indepen dence property 2, The parameter K(t) is (cf. the intuitive interpretation of K(t)). In the case of a Poisson forest it is also easy to calculate the distribution functions (2.7) and (2.8). Let us denote the distance from an arbitrary point of the set to the kth nearest tree by Then because >t if and only if the number of trees in a circle, radius t, is at most k. The number of trees in this circle is Poisson distributed with the mean value X.7rt 2 . The differentiation of (3.4) with respect to t yields the probability density function In other words, has a Gamma (k, TTX) distribution and a x 2 distribution with 2k degrees of freedom. Because the locations of the trees are distributed over the area uniformly and independently of each other, the distribution function (3.4) and the density function (3.5) remain unchanged if an arbitrary tree is replaced by an arbitrary point of the set. This implies that Gk(t) is equal to Fk(t). This property is typical for a Poisson forest and can be used in constructing randomness tests based on distances. Especially in ecological literature, the spatial pattern produced by the homogene ous Poisson process is referred to as random, which term is also used here. There are, however, infinitely many different ran dom mechanisms. Therefore patterns pro duced by a Poisson process are sometimes also called completely random or purely random; see Matern (1971). The conditional Poisson process, defined in a bounded region E and conditioned on the event yi/(E)=n is called a binomial process. That is, the realization of a binomial process consists of n trees, which are distributed uniformly and independently over the set E. The simulation of the Poisson forest in a bounded set E is quite simple. Let us denote (3.1) // u(x,Ai)Ai(dx)f(tV) N R 2 = \J f u(x,/j+ö x )f(d/Li)i>(dx) R 2 N (3.2) \,(t) = A 2, t>o. (3.3) K(t) = 7rt 2 , t>o, (3.4) Fk(t) = P(Xkt) k_l (X 77" t2) n = 1-2 e_7rXt • 1——L , k = 1,2, ... n=o n! (3.5) fk(t) = 2(7r\) k t 2k - l exp(-77-At 2)/(k-l)! , t>o. 13 Commun. Inst. For. Fenn. 138 the intensity of the process by A. At the first stage, we simulate the total number of trees, /i(E)=n, in the set E by simulating the Poisson distribution with mean At the second stage, we shall sample the locations of trees by simulating n uniformly and independently distributed points in the set E. A uniformly distributed point in the set E can be simulated for example in the following way. First we imbedd the set E into a rectangle A=(o,a) X (o,b). Then we simulate uniformly distributed points, xj from interval (o,a) and x 2 from interval (o,b). If the resulting point (x!,x2) is situated in the set E, it is accepted as the location of a tree, otherwise the procedure is repeated until an accepted point is obtained. Figure 2 shows two realizations of Poisson forests. 32. Inhomogeneous Poisson processes A variation of growth factors within a forest stand and the effect of the variation on the spatial pattern of trees can be taken into account by replacing the constant intensity of a Poisson forest by an intensity function A(x), which depends on the loca tion x. This inhomogeneous (unstationary) Poisson process in the plane can be defined in the following way. A point process P is a (general) Poisson process with the intensity measure A if holds for any non-negative, measurable function u on (R 2 X N, R 2(x)N). The equation (3.1) follows from (3.6) by re placing the varying intensity measure A with a constant intensity measure \v. A point process P satisfies (3.6) if and only if it satisfies the following two conditions; see Nguyen and Zessing (1979). a) The number of trees, ju(A) in any Borel set A is Poisson distributed with parameter f\(x) dx. A b) The numbers of trees in disjoint sets are indepen dent. The realization of an inhomogeneous Poisson process can be simulated for example as the realization of a homogeneous Poisson process but by replacing the uni form distribution of the xj-coordinate on the xj-side of rectangle by a density function which is proportional to the integral The on the X2-side of the rectangle is simulated in a similar manner. Another possibility is to simulate a homo geneous Poisson process in the set E with Figure 2. Two realizations of Poisson forests. The process has been conditioned to produce a) 40 trees in a circle, radius 9 m, b) 39 trees in a circle, radius 8 m. Cf. Figure 1, a) and b). (3.6) / /u(x,/y)/y(dx)P(d/i) =/ /u(x,/i+6 x )P(d/i)A(dx) NR2 R2N (3.7) / X(x l ,x 2)dx 2 . 14 Tomppo, E. Figure 3. a) An inhomogeneous INKA sample plot, the tree species pine, the radius of sample plot 11 m, the number of trees 43. The value of the number refers to the diameter class of the tree as in Figure 1. b) A realization of an inhomogeneous Poisson process, the intensity function is proportional to the function λ(x1,x 2) = 1.3x1 - 4.0x 2 + 0.9x21 - 0.035x1x 2 . the intensity A 0 = max A(x) and to accept the resulting point as a location of a tree with probability X(x)/X 0 ; see Diggle (1983). Figure 3 a) shows an inhomogeneous INKA sample plot (see Chapter 5). Figure 3 b) shows a simulated realization of an inhomo geneous Poisson process. The inhomogeneous process can be used as a model of a spatial pattern of trees, for example in the following way. We observe some spatial covariates y 1 5.. . ,yp to describe the heterogeneity of the soil and try to find a mapping g such that 33. Poisson cluster processes In clustered forests (also called aggregated forests), trees form clusters (groups of two or more tree individuals) more frequently on an average than the Poisson hypothesis implies. Clustering can be caused by a birth mechanism of trees or the inhomogeneity of the soil or both. In the above-mentioned characteristics, clustering appears in such a way that with small values of t, G(t) is greater, and F(t) and K(t) are smaller than in the case of a Poisson forest. This is a direct consequence of the definitions of the characteristics. As a stochastic model for a clustered forest, we can use for example a process which produces centre points of clusters, called parents, whose number and locations are distributed according to a certain point process, called a parent process. Each parent produces new trees, called daughters, whose number and locations are distributed accord ing to a certain daughter process, indepen dently of the parent process. If the parent process is a Poisson forest, the point process is called a Poisson cluster process. The most important special case of Poisson cluster processes is the Neyman-Scott process. Here the daughter process generates a random number of independent and identically distributed trees around the corresponding parent (Ripley 1977). This process is defined by the following properties a)—c): a) The parent process is a Poisson forest with an intensity o. b) Each parent produces, independently of each other, a random number M of daughters according to a probability distribution |p m , m = 0,1,2,. .. c) The locations of the daughters are distributed identically and independently relative to their parents. The location is determined by a bivariate distribution function H(x), or probability density function h(x), if this exists. The final forest may consist either of daughters only, or of both parents and M*) = g(yi(x)-y2(x),. ■ -,y p(x)). 15 Commun. Inst. For. Fenn. 138 daughters, which may be indistinguishable (as in the case of the present study). If H(x) is isotropic, i.e., independent of direction (radially symmetric), the final process is isotropic, too. Stationarity follows from the stationarity of the parent process. The intensity of the isotropic process (the number of trees per hectare) is X = p + pE(M) if parents are included and A = pE(M) otherwise. For an isotropic process, parents ex cluded, Ripley's K is where Hj(t) is the distribution function of the distance between two daughters of the same parent. The equation (3.8) follows from the intuitive interpretation of K(t); see Ripley (1981). It can be seen from (3.8) that the corresponding process is clustered, because K(t) is greater than 7rt 2 . (The last term on the right-hand side is positive). For distribution functions F(t) and G(t) we have the formulas where P(r,t) is the probability that in the circle, centred at the origin with radius t, there are no daughters of a parent whose distance from the origin is r, and Q(t) is the probability that for an arbitrary daughter 8 X there are no daughters of the same parent within a distance t from <5 X (Bartlett 1975). Theoretically, the probabilities can be calcu lated from the properties a)—c), but in practice, difficulties may occur. For some special cases, there exist explicit formulas of probabilities; see for example Warren (1971) and Diggle (1975). Neyman-Scott processes are suitable models for spatial patterns of trees for example in forests where a great proportion of trees has grown from sprouts of decidu ous trees. The inhomogeneity of a forest may have caused this kind of clustering, too. A realization of a Neyman-Scott process is shown in Figure 4. Here, the intensity of the process is 400 trees per hectare, M is Poisson distributed with parameter 2, and the daughters are distributed around the parent uniformly in the circle, centred at the parent with radius 1.8 m; cf. Figure 1 d). Figure 4. A realization of the Neyman-Scott process in a circle, radius 12 m. The parameters of the process are indicated above. 34. Doubly stochastic Poisson processes If the intensity function \(x) of the inhomogeneous Poisson process, defined in Section 32, is replaced by a two-dimensional stochastic process A(x), a doubly stochastic point process is obtained. This process is a suitable spatial pattern model of a forest belonging to a set of forest stands in which the intensity of trees varies both within stands and between stands and contains also a random component in addition to a possible systematic component. Doubly stochastic Poisson processes are defined by the following conditions: a) {A(x)| is a non-negative, two-dimensional stochastic process. b) Conditioned on the event (A(x) = A(x)j the process is an inhomogeneous Poisson process with an intensity function A(x). This process is stationary if and only if A(x) is stationary. A similar condition holds for isotropicity; cf. for instance Matern (1971). A stationary, isotropic process is called a doubly stochastic Poisson process or a Cox process (Ripley 1981). The first- and second-order intensity functions of the Cox process are (3.8) K(t) =m 2 + E(M(M-l))H,(t)/p(EM) 2 , (3.9) F(t) = I—exp(—2np J (I—P(r,t)) rdr), o (3.10) G(t) = 1—(1—F(t)) Q(t), 16 Tomppo, E. and where 7(t) Cov[A(x), A(y)] and t = d(x,y) (Matern 1971). As an example of Cox processes, Matern (1971) presents a process where circles with same radius are placed on the plane according to a Poisson process or some other process. At every point of the plane the intensity A(x) is proportional to the number of circles covering the point x. Figure 5 a) shows circles placed according to a Poisson process and Figure 5 b) a realization of a Cox process, constructed with the aid of the previous realization. In this case the resulting Cox process is also a Neyman-Scott process; see Diggle (1983) and Bartlett (1964). It can be shown that, under certain conditions, a Cox process and a Poisson cluster process are equivalent; see Ripley (1981, p. 165—166) and Diggle (1983, p. 58—59). 35. Lattice-based processes Lattice-based processes are suitable models for spatial patterns of trees especially in plantation forests. An extreme case of this type of model is a deterministic lattice structure, where trees form regular, for instance rectangular or triangular configur ations. A more realistic model is obtained if the locations of trees are distributed around the corresponding point of the lattice according to some probability law. Another possibility is to place trees on the plane sequentially with the distance from a tree to the previous tree being a random variable, the length of the side of the lattice being its expected value. We can combine for example to the realization of this process a realization of a Poisson process, or the realization can be thinned by deleting each individual with a certain probability p. The survival probabil ity can depend on the state of the neigh bouring trees, i.e., on whether they are alive or not. This dependence can be caused for instance by competition between the trees or by variation in the fertilization of the soil. Let us denote by Y x the random variable which indicates whether the tree at the point x is alive or not (e.g. 0 meaning death, 1 alive). If, in the case of a square lattice, the interaction is isotropically affected by the four nearest neighbours only, the condi tional distribution of Y is Figure 5. a) Circles, radius 1.8 m, placed according to a Poisson process in a circle, radius 12 m. b) A realization of a Cox process, intensity proportional to the numbers of circles covering the point; cf. Figure 3 a). (3.11) E[A(X)] = K (3.12) A, =A2 + 7(t), nil , n/v , x exp(—y(a+/3(t+u+v+w))) (3.13) P(Y=y t.u.v.w) == , —7- 1 , 1 -hexp(—(a+jB(t-h u+v+w))) 3 462662 T Commun. Inst. For. Fenn. 138 Figure 6. Realizations of processes based on square lattices. The radii of the sample plots are 8 m. a) The locations of the trees are distributed around the corresponding point of lattice uniformly in a circle of radius 45 cm, b) the previous realization is thinned by deleting each tree with probability 0.2, c) a realization of a Poisson forest, intensity 0.05 trees per square metre, has been combined with the previous thinned realization. i.e., Y has an auto-logistic distribution; see Besag (1972) and Cormack (1979). Here t,u,v,w denote the values given for the four nearest neighbours and (3 indicates the interaction between trees (/3=o means no interaction). In this interaction model, equal dependence on all four nearest neighbours can be extended to include directional effects (different values of ft for different directions); about details see Cormack (1979). Non-lattice interaction models will be considered in the next chapter. Figure 6 shows realizations of square lattice pro cesses; the side length is 2 m; a) the locations of the trees are distributed uni formly in the circle, centred at the corre sponding point of the lattice, radius 45 cm; b) the previous realization has been thinned by deleting each tree with probability 0.2; c) a realization of a Poisson forest, intensity 0.05 trees/m 2 , has been combined with the previous thinned realization. The consideration of lattice-based pro cesses often leads to cumbersome functions. Distribution functions of distances have been calculated in some special cases by Persson (1964), Brown and Holgate (1974) and Diggle (1975). 36. Markov point processes Let us consider processes in a bounded region E and define a neighbourhood relation on E by calling neighbours a pair of points whose mutual distance is smaller than a prescribed real number r. Let B be a Borel subset of E. A neighbourhood of the set B is defined by the condition B* = {xEE | y E B with d(x,y)o and p'C /d dP implies see e.g. Ripley (1977). Let us denote by Fg the restriction of the event F, F EN, to the set B, BEi? 2 . By definition, a point process P has a Markov property if the conditional probability of an event FG, given //£\B, equals the probability of FB , given yu B .:, i.e., Recall that the Markov property of a stochastic process in a real line implies that to predict the future of the process it is enough to know only the present state of the process. The previous definition extends this property to spatial point processes, the future state is replaced by the unknown configuration in the set B and the present time by the neighbourhood of the set B. 361. Hard-core processes Some hard-core processes (or simple inhibition processes) are special cases of Markov processes. Only stationary hard core processes are considered in this section. In hard-core models, pairs, with intertree distances smaller than, let us say, r, are excluded. The minimum distance between trees may be due to their diameters, competition between them, the silvicultural or other measures, or the mode of birth of the forest. An essential parameter in the char acterizations of hard-core models is the packing intensity of the process, where A is the intensity of the process. The packing intensity is the expected proportion of region E covered by randomly placed non-overlapping circles, radius r/2, intensity A. Its maximum value is attained when the locations of trees are generated by a deterministic triangular lattice process. In this case p = 37t/6 ~ 0.907. A simple way to define a hard-core model in a bounded region E is to do it through realizations of a homogeneous Poisson process. A homogeneous Poisson process is simulated and a realization is retained only if it contains no pairs of trees less than r apart (Ripley 1977). The simulation of this model is time-consuming if the packing intensity p is large. In general, the hard-core processes introduced by Matern (1960) are faster to simulate. Their realizations are much like those of Ripley's model, but their statistical properties appear to be different (and Matern's models are not Markov processes). Some of Matern's hard-core models have the advantage that their first- and second-order intensity functions can be calculated; see Matern (1960). Diggle et al. (1976) have presented a variation of one of Matern's models, a so called simple sequential inhibition (SSI) process. Trees are placed sequentially in the region. A new tree is accepted only if its distance from all previously accepted trees is at least r. Figure 7 shows realizations of SSI processes. 362. Pairwise interaction processes Often, the spatial pattern of trees re quires a smoother model than the processes defined in the previous section can provide. For instance, repulsion between trees can be complete up to some distance, about the diameter of the trees. After that, within a certain distance, competition between the trees or the measures applied to the forest can cause an 'uncomplete' repulsion, that is, the probability of trees being located close to each other is smaller than in the corresponding Poisson forest. If the distance between the trees increases further, repul sion may change into attraction (possibly after an interval of no interaction). Such attraction may be caused by regularity in the locations of trees, which can be a conse quence of silvicultural measures. Suitable models for this kind of spatial patterns are interaction processes. Interaction may be pairwise, that is, the interaction between two trees may depend on the relative location of these two trees only, or it may depend on the relative location of other trees, too. The final interaction model consists, of course, of interaction between all the points. Because the trees of a Poisson forest do not interact, it is natural to compare the configurations of interaction processes with (3.14) P(FB I AO. In effect, this condition also sets an upper bound to the total number of trees in a bounded region E. The third sufficient condition is obtained by imposing an upper limit on the total number of trees in each circle, radius a; see Gates and Westcott (1985). In this way, an upper limit can also be imposed on the size of clusters of trees. Formally, this can be done by defining a new Markov function where f(yu) is the original Markov function; cf. Gates and Westcott (1985). In actual fact, this corresponds to a type of k-point interaction process. In the present study, we apply a special case of the general interaction process, the so-called multiscale pairwise interaction pro cess, where g is a step function, i.e., The corresponding Markov function is where n = /u(E) and Y; = y;(/u) is the number of pairs of trees in n with interpoint distance in (r; _ i ,rj] ,i = 1,...,p; see Pent tinen (1984). A sufficient condition for the well-behavedness is also for example that cj = 0 or that q < 1 for all i=1,...,p. If p = l, we have the so-called Strauss process (Strauss 1975). For the Strauss process, the Markov function is where y(/u) is the number of trees in fj. with interpoint distance in (O,R]. The corres ponding interaction function is The parameter b reflects the intensity of the process and c describes the interaction between trees. If c = 1, a Strauss process is a Poisson process. If o 1, the Markov function (3.22) corresponds to a clustered point pattern. In this case one has to fix the total number of trees or to use a function like (3.18). In general, dealing with clustered interac tion processes one has carefully to take into account the so-called stability condition to guarantee that the process is well-behaved; see Gates and Westcott (1985). In statistical physics interaction processes are called Gibbs processes. Let P be a point process, whose Radon-Nikodym derivative with respect to some base process (e.g. Poisson process) is f. Let us suppose that a configuration /J X 5 X j has been observed and consider adding a new tree <5 X to the configuration. The conditional probability that there is a tree at the point x, given fj, is The process P, which satisfies (3.23), is called a Gibbs process with local energy E(x,/i); see for example Glötzl and Rauchen schwandter (1981). The function (d) = °°) for d< r and an interaction distance R R. The Markov processes are special cases of Gibbs pro cesses. For Markov processes, E(x,//) de pends only on those trees of /u which are neighbours of 6 X . An equivalent charac terization of the Gibbs process will be given in Section 432. The realization of an interaction process can be simulated for example in the following way. Let us suppose that a f(/j) if all discs of diameter a have (3.18) f a ,k(f) = fewer than k points, 0 otherwise, p..i 6(j(>.y))-{; i i;uAS*'''r)£r" i "1 p ' (3.20) {(n) = cb n .ft , i=l (3.21) f(/i) =a b n Cy("), (3.22) g(d(x,y))=r lf J x >y)^ R ' 11 otherwise. ti/J JA.J < J23> -toT~ b ,!,«'■*> n = exp(— (—ln b ln g(x,x ;))) = exp( -E(x,^)). 21 Commun. Inst. For. Fenn. 138 realization of n trees in a bounded set E has to be simulated and that the Markov function is of the form (3.17). n Let yU = £ <5 X : be an arbitrary configuration i=l of n 1 trees in the set E. Let us consider adding a new tree <5 y . The conditional probability of the event (JJ(y) = 1), given fj, according to formula (3.23), is b n g(d(x;,y)). Because g is bounded, say g(d) < c, c < °°, the function is a probability density function on the set E. Let us delete an arbitrarily chosen tree from the n 1 trees in E and sample a uniformly distributed point y from E. The point y is accepted as a location of a new tree with probability given by (3.24). If the position y is rejected, the procedure is repeated until an accepted point y is obtained. It can be proved that the process defined by alternative depletions and replace ments converges to the process defined by the Markov function (3.17) (conditioned to produce n trees) if the number of depletions and replacements is increasing (Ripley 1977). The method presupposes a starting configur ation. Ideally, this would be a sample of the process (3.17). Theoretically, an arbitrary starting configuration n, f(/u) >O, guaran tees the convergence to the process (3.17), because the process defined by the alterna tive depletions and replacements is an irreducible Markov chain; see Ripley (1977 and 1979 a). The convergence holds also if the deleted trees are sampled systematically instead of by random sampling. This simulation is based on the fact that, on certain general conditions, a spatial birth and-death process has a Gibbs process as a unique equilibrium distribution; cf. Preston (1975) and Ripley (1977). Spatial birth-and death processes will be further discussed in Section 71. Ripley (1979 a) presents a FORTRAN subroutine for the simulation of a Strauss process. In this context he suggests that a sufficient number of depletions and replace ments would be four times the number of trees. In some cases, a simulation of a Gibbs point process with an attractive potential function is rather slow. There is a trick that might be used in that kind of case: the effect of the first-order property on the second order property. If a suitably chosen repul sive Gibbs process in a bounded region E is conditioned to produce a relatively high number of trees, the resulting configuration is a clustered one (see Figure 8 d). If the 'packing density' is large, the effect of the first-order property on the unconditional second-order property is essential. However, it is an open question how this effect can be expressed in an analytic form. Figure 8 presents realizations of pairwise interaction processes: The last process has been conditioned to produce 63 trees in a circle with radius 10 m. (3.24) p(y) - ( .n g(d(x p y)) )/c n 1 a) Strauss process, c 0.1, r 2 m, 0 ,0 2.7 0 ,0 1.5 0 ,0 1.75 22 Tomppo, E. Figure 8. Realizations of pairwise interaction processes. The interaction functions are given above. The radii of the sample plots are 9 m, 14 m, 5 m and 10 m, respectively. The plot of the interaction function is given by the realization. Cf. Figures 1 e), f) and d). 23 Commun. Inst. For. Fenn. 138 4. ANALYSIS OF SPATIAL POINT PATTERNS The purpose of the analysis of spatial point patterns is to identify the process that has generated the configuration, or to find a suitable model for this process. In some cases, it is sufficient to investigate whether the pattern is generated by the Poisson process or not, and if not, whether the process can be regarded as clustered or as repulsive. Different kinds of indices can be used to measure the deviation of the given forest from the Poisson forest. It is usually necessary for the identifica tion of the process to formulate hypotheses and testing them. The first hypothesis usually is that the pattern is generated by the Poisson process; if this is not true, a more complicated model is used. One difficulty in the testing of point pattern hypotheses is that the set of alternative hypotheses is so wide as to make it very hard to find a uniformly most powerful test. One test can be powerful against a repulsive alternative and another test against a clus tered one, perhaps of some special type. (This holds especially for the tests of Section 412.) This difficulty can be partially overcome by using several tests in the analysis. The methods of spatial point pattern analysis can be divided into two groups depending on whether they require mapped data or not. The methods of the latter group are more suitable for field work or for the preliminary analysis of the data, while the methods of the former group generally give more detailed information about the underlying process. Therefore, we will call the methods based on unmapped data field methods. These can be further divided into two subgroups, quadrat methods and dis tance methods. 41. Field methods 411. Quadrat methods Quadrat methods are based on what is called quadrat sampling. Sample plots of an equal size are placed in the region E and the numbers of trees in sample plots are counted. The name originates from the form of sample plots favoured by plant ecologists. (In forestry the most common plot form may be circular.) If the point pattern is a random pattern (generated by the Poisson forest), the numbers of trees in each sample plot are Poisson distributed with the mean Xa, where A is the intensity of the process and a the area of the sample plot (cf. Section 31). If the sample plots are placed indepen dently, any goodness of fit test is possible for testing the Poisson hypothesis. Instead of or besides the analysis of the distribution of the counts, different kinds of indices can be used to measure the deviance from the Poisson forest; see for example Ripley (1981, p. 102—106). In general, a more powerful test is obtained in this way. The earliest tests of this type originate from the early 1920'5. An index of dispersion suggested by Fisher et al. (1922), i.e., the ratio of sample variance and mean, is still widely used. Here x; is the number of trees in sample plot i, i = 1,..., m. In the case of the Poisson forest, (m 1)1 is approxi mately x 2 distributed with m 1 degrees of freedom. Large values of I refer to a clustered configuration and small ones to a regular configuration. One drawback of the quadrat method is that the results is greatly affected by the size of the sample plot. Such effect can be reduced to some extent by using a grid of contiguous quadrats, that is, by sampling an area over and over again with quadrats of sizes every time larger than the preceding one. The idea in this is that the way in which the measure of dispersion varies with the quadrat size provides information of the pattern. Apart from being time-consuming, this method also does not work in every case. Sometimes, quite different types of patterns may give very similar types of graphs of variance to mean ratios versus the size of the sample plot; see Pielou (1977). s x 2 £(x,-x) 2 (4.1) I= —=- , x (m— l) x 24 Tomppo, E. In forestry, quadrat sampling has mainly been used in the estimation of the intensity X. The estimator is, naturally, the total number of trees in plots divided by the total area of the sample plots. This estimator is always (for every process) unbiased, and it is the maximum likelihood estimator in the case of the Poisson forest. 412. Distance methods In distance methods, the distances from arbitrary trees or from arbitrary points to the nearest trees are measured. The measure ments can be used for hypothesis testing or for the estimation of the parameters of the underlying processes. The earliest tests based on distances were effected in the late 1940's and early 1950'5. The test proposed independently by Hopkins (1954) and Moore (1954) and the tests derived from it have appeared to be among the most powerful ones against the repulsive and clustered alternatives. This test is based on the equation (3.5). Let X and Y be the distances from an arbitrary point and, respectively, from an arbitrary tree of the area to be studied to the nearest tree. In the case of the Poisson forest, 2irX.X.2 as well as 2n\Y2 are x 2 distributed with two degrees of freedom. This implies that for indepen dent samples (X;, i = 1,..., m) and (Y p i = 1,..., m) 27rA2Xi 2 and 27rX2Yj2 are inde pendent and x 2 distributed with 2m degrees of freedom. Hence for the Poisson forest, is F(2m,2m) distributed. The test statistic proposed by Byth and Ripley (1980), is based on this fact, too. Therefore, is approximately N(0.5, (12m) —J ) distributed under the null hypothesis that the pattern is generated by the homogeneous Poisson process. Hopkins' test statistic provides a measure of the clustering of the pattern, too. In the case of the Poisson forest, X; and Y; have the same distribution, which implies that hopN is about 0.5. If the pattern is a clustered one, the distance Y; is small on an average, compared to the distance X; and l/2 7rt2 , and for repulsive processes, K(t) < 7rt 2 on some subinterval (o,t]. Furthermore, the sizes of the deviations from 7rt 2 and the range of their occurrence yield further information of the process. An estimator of K(t) can be derived from its intuitive interpretation: AK(t) is the expected number of all the other trees within the distance t from an arbitrary tree, where A is the intensity of the forest. Let us consider a sampling window E. The ex pected number of trees within this window E is \v(E). This implies that the expected number of tree pairs ( B x ,By ) whose inter-tree distance d(x,y) is at most t, is On the other hand, counted from the data, this is where 1 /i / x , /1 if d(x,y) tj < 4 - 5 ) F(t)= #|i|r,>tl (4.6) d = sup | F(t) - F(t) I, t(En) converges to some constant, say a > 0. Suppose further that the so-called 'sparseness' condition holds; see Saunders and Funk (1977). (This condition is usually satisfied in forestry applications.) Then, under the hypothesis of complete random ness, the stochastic process (on the real line) converges, with some interval (0, t Q) (in the sense of weak convergence), to the Poisson process with intensity «7rt. Flence U(t), 0 < t < tG, can be approximated by a Poisson process with intensity n(n \)Trt/i>(E) and with fixed t by a Poisson distribution with the mean n(n l)7rt 2/2j;(E). This gives the pointwise 1 a confidence limits for K(t) (Randomized intervals must be used because of jumps.) The approximation of U(t) by a limiting Poisson process is suitable if t is small compared to the intensity k and n is 'large enough'. In the case of a binomial process, the validity of the approximation can be checked by using the mean and variance of U(t); see Ripley (1981). For instance, according to the author's simula tion experiments, if E is a circle, radius r, the Poisson approximation is valid with t lower than o.7rP\/n; cf. Ripley (1981). According to the same experiments, the Poisson approximation is not good enough for small values of t. If the intensity corresponds to the usual growing intensity in Finnish forests and the number of trees in the sample plot varies from 40 to 80, the minimum value of t for the above mentioned Poisson approximation varies from 0.25 m to 0.75 m. Beyond the upper limit of t one has to use normal approximation. In the case of binomial process (cf. Ripley (1982)), the variance of K(t) is where P is the circumference of E. This gives the pointwise 1 a confidence limits where Z a/2 is the 100 a/2 percentage point of the normal distribution and s(t) = The widths of the limits (4.12) and (4.14) depend on the value of t. If a transformation &(t) V&(t) is applied in the case of a Poisson process or a repulsive process, almost constant confidence limits (4.9) 2 Sk(x.y) l (0, t) (d(x,y)) , x^y (4.10) K(t) = £ 2 k(x,y) l (o , t] (d(x,y)). n x y (4.1.) u,„ = f|ä (4.12) (cyI (t),cy2(t)), where c = 2 j>(E)/n 2, ni ' s1 a y,(t) = (max n, | Z e~ s <9) and i = 0 l - z n 2 ; y 2 (t) = (min n 2 | 2 e~ s >1- y) i = 0 *■ L with s = 7rn(n l)t 2 /2i/(E). ,«3, 0.244-^, (4.14) (TTt 2 -Z a/2s(t), TTt 2 + Z„/2s(t), 27 Commun. Inst. For. Fenn. 138 are obtained; see Besag (1977) and Saunders, Kryscio and Funk (1982). The test of spatial randomness is gener ally based on the pointwise confidence limits (4.12) and (4.14) if Ripley's summary has been applied. Here K has been considered as a data summary and used for the classification of plots with respect to the type of spatial pattern. The configuration is regarded as repulsive if K. crosses the lower confidence limit first and as clustered if it first crosses the upper confidence limit. (Because the test of randomness is consistent and the Poisson hypothesis will be rejected at any risk level if n is large enough, one could argue the applied procedure to be incorrect. A This problem does not exist here because K has been used as a data summary only; see Section 61 and discussion at the end of the section.) One should note that for example the 0.95 pointwise confidence limits do not correspond to the test of spatial randomness at 5 % risk level. A global test should be used for this purpose. One possible test statistic is see Ripley (1981). To apply this test, one has to fix the maximum distance of interest, t Q . (In applications, t Q depends among other things on the intensity, the size of sample plot and the alternative hypothesis.) An interesting question is, to what risk level of a global test the pointwise confi dence limits correspond. These risk levels can be estimated analytically by approxi mating (numerically) the solution of an integral equation (Durbin 1971) or by using the martingale technique (Novikov 1981). The problem can also be solved by simula tion. For this purpose, a great number of Poisson forests was generated and the summary K(t) was computed from each configuration. The realizations for which K(t) crossed the lower boundary, the upper boundary and both boundaries were counted. The used numbers of trees were 40, 60, 80, 100 and 120, and the sizes of sample plots were chosen such that the intensity of trees varied from 500 to 3000 trees per hectare with each number of trees. Both of the confidence limits 0.99 and 0.95 were used. In a simulation study with 1000 repetitions with each Poisson forest, it turned out that the boundary crossing probability was almost independent of the number of trees and of the intensity of trees. If the 0.99 pointwise confidence limits were used, the probability of crossing lower or upper boundary varied from 0.068 to 0.088, and if the 0.95 limits were used, the probability varied from 0.24 to 0.31. The probability of both boundaries being crossed varied with the 0.99 limits from 0.000 to 0.002, and with the 0.95 limits from 0.001 to 0.004. Another way to use simulation in the approximation of boundary crossing pro babilities is to simulate the limiting Poisson process of &(t); see Ripley (1981). Figure 10 shows the summaries VK.(t)/7r with the 0.95 pointwise confidence limits under the Poisson hypothesis, computed from the data of Figures 1 c) f). 43. Estimation of the parameters of the point processes 431. Problems occurring in estimation In this chapter, we consider the problem of estimating the parameters of the models for spatial patterns. A usual way to proceed with this type of problem is to introduce a subset of models, uniquely specified up to some unknown parameters, and to estimate these parameters by maximizing the likeli hood function, given the observed data. The fitting of a spatial process to an observed point configuration is in general not so straightforward a procedure. Difficulties are caused for example by the fact that, with some parameter values, many processes may give nearly similar spatial patterns. There fore it is difficult to find a correct subset of models. Another difficulty arises from writing the likelihood function and maxi mizing it. For many usual models, including cluster processes, the analytic form of the likelihood function is so far unknown. On the other hand, for instance for Gibbs processes the intractable form of the likeli hood equation makes a straightforward solution impossible. Let the Markov function of a Gibbs process be f. Then the likelihood function, given the point pattern (4.15) Lm sup | VK(t)/7T —t I; t(B XN) = A(B), B e R 2, where A is the intensity measure of the process P. (To see this, replace F by N in the equations (4.19) and (4.20).) This implies that, for a fixed AGA/, Cp(- XA) and X A) are absolutely continuous with respect to A. If A is a-finite, then for fixed x, x E R 2, A a.e. exists and is a measure on (N, N), that is, a point process on the plane. This process is called the Palm distribution of the point process P with respect to the point x. Analogously, there exists a point process which is called the reduced Palm distribution of the point process P with respect to the point x. (P x and are the same on the space (N C (X }> New) •) The measures P X(A) and Pji(A) can be interpreted as the conditional probability of A, given the process has a tree at the point x. (In the reduced case, the tree <5X is not counted. The set x can be excluded from E without affecting the unconditional probabilities of events, because the y-measure of the set x is zero). Any point process P is uniquely determined by its intensity measure and its Palm distributions; see for example Nguyen and Zessin (1979). We now discuss Takacs-Fiksel method more closely. It has been developed for Gibbs processes and is based on the reduced Palm distributions of the process. Let Q p be a Poisson process with an intensity measure p, a non-negative Radon measure on (R 2 , R 2). Recall that a point process P can be described as the Gibbs process with a local energy E(x,/x) and with a weight process Qp if and only if C[> is absolutely continuous 'o (4.18) D(0) = f (S(t) c S(t,o) c) 2 dt, O (4.19) C,,(8 XF)=f f N B B E R 2, FG TV (4.20) Cp(B XF)=/ / 1 F(// - 8 )//(dx)P(d,u) N B B E R 2, F€ N dC P (x XA) (4.21) P X (A) = ,A G N dCp(x X A) (4.22) PI(A) = ,A EN, 30 Tomppo, E. with respect to the measure (p (x) P). Further, (i/(x)P) a.e. see Glötzl (1980). Let now P be a stationary and isotropic Gibbs process. Then the intensity measure is of the form A=\pj>, where Xp is a constant. Let us suppose that also the weight process Q p is a stationary and isotropic process with the intensity Aq. The equation (4.22) and stationarity imply Further according to the equation (4.23) and because of stationarity, the right most side of (4.24) is P a.e., nG N. These equations also imply that, for all measurable functions u: N - [o,°°), The equation can be given the following interpretation. Let us suppose that u; = K(r;), the Ripley's second-order parameter. Then the left-hand side of the equation is the expected number of trees in a circle, radius r;, centred at an arbitrary tree (the centre tree excluded), and the right-hand side of the equation without the 'weight factor' exp(—E(x, fx)) is the expected num ber of trees in a circle, radius r ; , centred on an arbitrary point of the plane. This interpretation becomes more evident in the case of a Poisson process, since in this case exp(— E(x,ju)) =l. The left-hand side of (4.25) is estimated in a non-parametric form and the right-hand side is supposed to be in the parametric form Fiksel (1984) proposes the following estimation procedure: 1. Choice of different suitable functions U;, i = 1,...,NU. 2. Construction of estimators h G (i) of the left-hand side of (4.25) with u(ju) = Uj(ju). 3. Construction of estimators hö(i) of the right-hand side of (4.25) with u(/z) = uj(/n) and E = Eö, where 6 is the unknown parameter vector. 4. Estimation of Qby solving the optimization problem N u (4.26) min £ [h D (i) - h e(i)] 2 . d i= 1 That is, the parameters of Eg are chosen such that the estimators of both sides of the equation are as close to each other as possible. Further, Fiksel (1984) presents estimators of hc(i) and hg(i). Let us suppose that a S realization n 2 and a chemical activity a 0 must be estimated. Further, let qi,...,qM be uniformly distributed test points in the set Er = E 6 b(O,R), where E d b(O,R) means those points of E whose distance from each point of the boundary of the set E is at least the interaction radius R while M = m(Er). Then is an unbiased consistent estimator of the left hand-side of (4.25). (Here T x is the shift operator, see Chapter 2.) An estimator of the right-hand side of (4.25) is In the present study the estimators are constructed in the following way. Let us suppose that we have m sample plots, where the spatial patterns of trees can be supposed to be generated by the same Gibbs process and the potential function is a step function The vector (a 0 ,a],...,an) now corresponds to the parameter vector 6. Let us define u i+j(n +l)+ l = L iL j> i, j = 0,..., n, where L; = \/K(rj)Tn and L 0 =l. Then it follows from (4.27) that is an estimator of the left-hand side of (4.25), after (4.25) has been divided by A.p . Here is the estimator of the trans formed Ripley's V£(r;)/7r, estimated (4-23) d^ (x^=e " E(x, " ); P&o_Pl(m)_ dCj, X q dCj, (• } dP dP d(X p v(x)P) A p d(X q „®P) • exp(-E(x,M)) - • exp(—E(o,/x)), (4.25) / u( M ) P^(d M ) = N \ f u(/x)exp(-E(O,M))P(d/x). N X q /u( /u)exp(—E e(o, /x))P(d/i) (4.27) h D (i) v(Er) j x |£ £ x £ U;(T XM S 0 ) (4.28) h,(i) = j M n M ui( T qi M)exp(lnX -a0 - 2 if d = 0, (4.29) 0(d) = a; if r; _,< d < r; , i = (do = 0), 0 if d > r n . i m (4.30) h o (i,j) =—IL ? L ( ? v / \ <>/ mk = , 0,1 o,j 31 Commun. Inst. For. Fenn. 138 from the sample plot k, and = 1. An estimator of the right-hand side of (4.25) is obtained as follows. First, uni formly distributed test points are placed in each sample plot is chosen equal to the number of trees in the plot k.) We define for each plot where b(q,r) is a circle of radius r (rQ = 0), centrepoint q. Let us further define for each sample plot k, k = 1,..., m and £, ( ok> 0k> =l. An edge correction in accord ance with Figure 9 has been done in the formulas (4.31) and (4.32). The equation (4.28) yields the estimator of the right-hand side of (4.25) (after (4.25) has been divided by X p ). with 0 = and A. = X p/X q. The values of the parameters a; are solved by placing h o (i,j) and ho(i,j) in the equation (4.26) and by minimizing it with respect to a;, i = 0,..., n. The optimum value is obtained when the first derivatives are zero, i.e., The solution can be found iteratively, for example by Newton's method; cf. Takacs (1983). Choosing the final form of the functions u; , i.e., ui+ j(n +l)+ i = LjLj, we made experiments with functions of the form Here B, = b(o,r;)\b(o,r; _O, fj(B 0) =l, and L is the transformed Ripley's parameter K. The performance of the function u in the estimation of the parameters of the process was judged by simulating the corresponding Gibbs process with the obtained estimates of the parameters a 0,..., and calculating from the simulated realiza tions the summaries £(t) and comparing them with the corresponding summary computed from the original data, that is, using a kind of a Monte Carlo method. The superiority of our choice of u can also be argued for in statistical terms. The variance of L(t) is almost independent of t for some processes; cf. Besag (1977) and Saunders, Kryscio and Funk (1982). Thus, in minimizing the function (4.25), all the values of t (in the interval in question) are taken into account with the same weight. The product LjLj (for example instead of L;) is used in order to make the measures Pj, and exp(—E(o,//))P coincide as well as possible. More precisely, we fit together the first and second moments of the random variables i = 1,..., n with respect to the measures PJ, and exp(—E(o,/u))P. Finally, several test points q; are placed in each sample plot instead of for example one point, to decrease the variance of the estimator (4.33). 433. Estimation of the parameters of Gibbs process based on the nearest neighbour measurements The method of the previous section presupposes that the coordinates of trees in some sampling window are known. Often, in forestry, mapped data are not available. Further, measuring the coordinates of trees in a forest area is time-consuming and expensive. Measuring the distances from randomly chosen trees and points to the nearest trees is much easier and quicker to carry out. So far, however, the nearest neighbour measurements have been applied only in hypothesis testing and intensity estimation. In this section we show that the para meters of Gibbs processes can be estimated also with nearest neighbour measurements, i.e., with samples used in a Hopkins type test of randomness, if some additional measurements are carried out. These addi mk (4.31) y,(k) = S M(b(q v , ri)\b(q v ,ri i = V = 1 y m k M(b(q v,r;)) , i = 1,..., n (4.33) h e(i,j) = m J, W" Jexp(-aO -J ]av y< k) ), i,j = 0,..., n (4.34) V(k) = X (h o (i,j) - ho(i,j)) ■£- h e (i,j) =O, k = 0,..., n. 1)U; = /U(B;), i = 0,..., n 2 ) u i + j(n + l) + 1 = MBiMBj), i,j = 0,..., n 3) U; = K(r|), i = 0,..., n 4 ) "i + j(n +l) +1 = K(r i)K(rj), i,j = 0,..., n 5 ) u i + j(n + l) + 1 = L( r ,)L( r j)> '.J = 0,..., n. 32 Tomppo, E. tional measurements are the counting of the numbers of trees around sampled random points in circles with radii up to about from 2.25 m to 4.5 m. Further, it is not necessary to know or to estimate the intensity of trees if the aim is to estimate the potential function as a description of the relative locations of trees. (This is the case also in the method of the previous section. The intensity A. in (4.33) affects the estimate of a G only, not the estimates of a;, i = 1,..., n.) The fact is that if only nearest neighbour measurements are available, these samples do not produce a very reliable estimate for the intensity; cf. Section 412. (However, the intensity can often be estimated quite easily by quadrat methods.) Therefore, we base the estimation of the potential function of a Gibbs process on the nearest neighbour distribution functions (2.7) and (2.8) (and on the corresponding measurements), rather than on Ripley's K or its transformed variants. Let us suppose, as in the previous section, that the process P is stationary and isotropic, and define Then the integral of d with respect to PJ, is the expected distance from an arbitrary tree to the nearest tree, and, with respect to P, the expected distance from an arbitrary point to the nearest tree. The functions u; in (4.25) are constructed from (4.35) as follows: where d Q j is d in (4.35) and dG 0 = 1. (In other words, we are using the first and second moments of the nearest neighbour distribution functions G and F (2.7) and (2.8), respectively, when a weighting has been used in the distances in (2.8).) The estimators of both sides of (4.25) are obtained in the following way. (We assume that (4.25) has been divided by Xp and Aq = 1.) Let us suppose that the potential function is a step function of the form (4.29), i.e., the process is a multiscale pairwise interaction process. Note that this assumption is not necessary for the method under discussion. First, the interaction radius, R, of the corresponding model is estimated, possibly by using the models in Tables 17—19 of Section 631. In order to estimate the left-hand side of (4.25), choose randomly mj trees, with the distances from the trees to the boundary of the sample plot greater than the interaction radius. If the plot is situated within a stand, far enough from the boundary of the stand, it is sufficient for the distance to the boundary of the plot to be greater than the distance to the nearest tree. The distances k = 1,..., mi, from each randomly chosen tree to the nearest tree are measured. Then unbiased consistent estimators of the left hand side of (4.25) with the choices (4.36) of ui,j> i.j = 0,1, are where d^ o = 1; cf. (4.30). The estimator of the right-hand side of (4.25) is obtained as follows. Choose uniformly distributed test points k = 1,...,ni2 from the set E 6 b(O,R), i.e., such that the distances to the boundary of the area E are larger than the interaction radius R. Let k = 1,...,m2 be the measured distances from each of these points to the nearest tree. Further, the numbers of trees in the annuli b(qk>r v)\t )(qk,r v - ]), v = 1,..., n are counted for each random point k = 1,...,m2. Here r 0 = 0 and r n is the interaction radius R. Let us denote the numbers of trees in the annuli by y"vV v = 1,..., n, k = cf. (4.31). Then, we can use as estimators of the right-hand side of (4.25) with choices (4.36) of u; j, i,j = 0,1. Here - 1, X is the intensity of trees and a;, i = 1,..., n are the unknown parameters; cf. (4.33). The estimates of the parameters are obtained by minimizing with respect to 0 = (ai,...,an). The mini mizing is carried out by setting the first derivatives of Z with respect to a;, i = 1,..., n as zero, and solving these equations iter atively, as in the previous section. (As we noted at the beginning of this section, A can be replaced by an arbitrary real number if we (4.35) d(ju) = min {d(o,x) | <5 X 6 juj (4.36) uijOx) = doii(/j)d oij(M), i,j=o,l, ml (4.37) h (i,j) = 2 (3 l/ mi J ojl O.J» ' ' ' (4.38) ho(i,j) = 1 m 2 1 n X d'k, c! (k) -exp(-a 0 - S a v yj k) ), m 2 k=l 1 J A v— 1 i,j = 0,1 (4.39) 7.(8) = 2 2 [h D (i,j) - he(i,j)]2 I=o j = O 33 5 462662 T Commun. Inst. For. Fenn. 138 are interested in a;, i = 1,..., n only.) Here, as well as in the previous section, a general pairwise interaction process (3.17) can also serve as a model. Then the corresponding potential function must be used in (4.38) and (4.33), respectively. The presented method is actually based on the first two moments of the nearest neighbour distributions (2.7) and (2.8). Fiksel (1984) has suggested the estimation of the parameters of a Gibbs process from mapped data, by means of the nearest neighbour measurements, by using the functions This choice corresponds to the fitting together the sample distribution functions (2.7) and (2.8). u (lif^(b(o, ri 1 (0 otherwise. 34 Tomppo, E. 5. THE EMPIRICAL DATA OF THE STUDY As empirical data in the present study we use the grid of the permanent sample plots of the National Forest Inventory, the so called INKA growth sample plots, founded by the Finnish Forest Research Institute, Department of Forest Inventory and Yield. This set of field sample plots, which has been planned and collected under the supervision of Professor Yrjö Vuokila, is in fact the only mapped data covering the whole land; see Vuokila (1983). The INKA plots are a subsample of the 7th National Forest Inventory (7NFI). Some variates measured in this Inventory are also used. The 7th Inventory is presented in Kuusela and Salminen (1980 and 1983). The establishing of INKA sample plots was begun by the Department of Forest Inventory and Yield in 1976. The principal purpose of these sample plots is their use for developing growth and yield models of single trees and stands for forest manage ment planning (Roiko-Jokela 1975). The sampling method and stand population of INKA are discussed in this chapter. Figure 11. The inventory districts of INKA plots and the locations of INKA stands in the southern half of Finland. 51. The sampling design of INKA sample plots The INKA sample plots are distributed all over Finland. The plots applied here (1309 plots) are situated in the southern half of the country, more precisely in the forestry board districts 1—15; see Figure 11. Therefore, the described sampling design concerns only this area. (The sampling design in the northern half of Finland is nearly similar.) The sampling unit of this data is what is called a forest stand, which, by definition, is a connected subarea of forest, homogeneous with respect to some variates; see Loetsch and Haller (1964). These stand variates can be the tree species composition, the forest site type, the development class of the growing stock, and so on. The size of a stand varies in the southern half of Finland usually between 0.2 and 5 hectares (a typical size may be I—3 hectares). The population is restricted to the four most important fertility classes on mineral soils, namely, 1) rich sites, 2) damp sites, 3) sub-dry sites and 4) dry sites; see Kuusela and Salminen (1969 and 1983). At the planning stage, only sound one-storeyed stands capable of further development are accepted from the development classes 'seed ling or sapling', 'young thinning' and 'old thinning' stand (stoniness and marshiness are allowed); see Roiko-Jokela (1975) and Gustavsen (1985). At the sampling stage, a low proportion of the classes 'mature' and 'shelterwood' stand is also accepted. The accepted dominant tree species on rich and damp sites are Scots pine, Norway spruce 35 Commun. Inst. For. Fenn. 138 and birch, and on sub-dry and dry sites Scots pine alone. Another limitation is that the proportion of the dominant tree species in the stand volume is at least 70 percent. (At the sampling stage this limitation has been only partially observed.) The data include 10—15 percent of stands where the proportion of coniferous (or deciduous) trees is at least 70 percent if the dominant tree species is coniferous (or birch). The variables used in the definition of the INKA population and the feasible values of the variables are (Gustavsen 1985; see also Kuusela and Salminen 1969 and 1983): 1) The size of the forest stand. The minimum size of a forest stand is about 0.5 hectare. This follows from the sampling method. The stand must have space for the cluster of three circle sample plots. 2) Land use category. Only forest land is feasible, i.e., the minimum mean growing capacity during the rotation age is 1 m 3/ hectare. 3) Forest site type. Rich sites (Oxalis-Myrtillus Type, Pyrola Type, Geranium-Oxalis-Myrtillus Type), damp sites (Myrtillus Type, Vactinium-Myrtillus Type), sub dry sites ( Vactinium Type, Empetrum-Vactinium Type) and dry sites ( Calluna Type, Empetrum- Calluna Type). 4) Drainage. Undrained mineral soil, or, if the tree species is birch, a small proportion of heathy peatland (drained peatland with upland vegetation) is included. 5) Dominant tree species. Scots pine ( Pinus silvestris L.), Norway spruce (Picea abies Karst), birch ( Betula pendula Roth, and Bet ula pubescens Ehrh.). 6) Proportion of dominant tree species. At least 70 percent of the stem volume (a small number of stands where the total stem volume of coniferous trees or that of deciduous trees is at least 70 percent). 7) Development class of trees. The development classes 3 —5 of the National Forest Inventory, i.e., 'seedling' or 'sapling', 'young thinning' and 'old thinning' stands; also a small number representing the development classes 6—7, i.e., 'mature' and 'shelterwood' stands. For the description of the development classes see Kuusela and Salminen (1969). 8) Age classes. 11—20, 21—30,..., 101—110 years 9) Quality of stand. Good or satisfactory (a small number of low yielding or unmanaged stands; a low proportion of 'unsuitable tree species for site' if the tree species is birch). 10) Tree storeys. One tree storey. 11) Damage. No damage. A cluster of three permanent circle sample plots is placed in each sampling unit (INKA-stand). The partition of the sample plot into a cluster of three small plots aims at a reduction of the measurement costs, not at an improvement of the sampling efficiency by increasing the variances of estimators within a stand. The radii of the plots depend on the number of trees per hectare such that the trees in a cluster will total at least 120 even if the stand is an 'old thinning' stand. This, with the present density trends, implies the minimum size of a stand to be 0.5 hectare. The INKA data are sampled in the following way (Gustavsen 1985): 1. The southern half of Finland is divided into six inventory districts, each containing parts of some forestry board districts; see Figure 11. 2. The sample plot data satisfying the above conditions 1 —ll are selected from the 7NFI data. For each inventory district, the following steps have been repeated. 3. The numbers of the sample plots satisfying the conditions 1 —ll are counted for that part of each forestry board district, which belongs to the inven tory district. 4. The sample plots of the inventory district (usually 70 clusters) are allocated to the forestry board districts (relevant parts) in the proportions of these counts. 5. For each forestry board district in southern Finland, the relative frequencies of the sample plots of 7NFI are calculated on the basis of the variates (i) the forest site type, (ii) the dominant tree species, (iii) the precision of the dominant tree species, (iv) the development class of trees, (v) the age class of trees and (vi) the quality class of trees. (The precision of the dominant tree species is defined to be 0 if the stem volume of the species is at least 70 % of the total stem volume, 1 if the total volume of coniferous trees or deciduous trees is at least 70 °lc and 2 otherwise.) 6. Based on these distributions, the INKA plots are first allocated to different tree species within each forestry board district. For each tree species (in each forestry board district), the plots are classified according to the age, development, and quality classes. This last procedure is not effected exactly in accordance with the relative allocations. In fact, the proportions of observations in (i) the age class of over 80 years, (ii) the development classes 5, 6 or 7, (iii) the quality classes 3, 4 or 6 and (iv) the class where the precision of the dominant tree species is 1, are smaller within INKA plots than the quotas defined by the 7NFI show. 7. Finally, the sampling units are chosen from each stratum formed above, in such a way that the final spatial distribution of the units is regular. Because INKA are a subsample of the 7th National Forest Inventory of Finland we can combine the data of INKA and 36 Tomppo, E. 7NFI, which possibility is made use of in this study. Figure 11 shows the inventory districts of INKA plots in the southern half of Finland and the locations of INKA stands. The dense grid in the district of the year 1981 includes the sample plots used in the so-called Nurmes project. Those plots are also used in the present study. The numbers of sample plots used in this study in different inventory districts are: District (inventory year) Number of sample plots 1978 ' 209 1979 209 1980 209 1981 414 1982 209 1983 59 In total 1309 (11 sample plots including coding errors are excluded.) The number of stands is thus about 440. The following INKA variates are used here: Stand variates: 1) the inventory tract number in 7NFI, 2) the stand number in 7NFI, 3) the stand number in the tract, 4) the forest site type, 5) the development class, 6) the radius of sample plot, 7) the measurement year. Tree variates: 8) the coordinates of trees, 9) the tree species, 10) the diameter of the tree at the height of 1.3 metres. The following variates from the 7NFI data for each plot are also used: 11) the way of the regeneration of the stand, 12) the demand of thinning, 13) the quality of the stand. 37 Commun. Inst. For. Fenn. 138 6. RESULTS 61. Proportions of spatial pattern types The basic unit of analysis in this study is an individual sample plot, not a cluster of three sample plots. This makes it possible to take into account also the variation within the stand in the analysis of spatial patterns of trees. The sample plots are first divided into three classes by using Ripley's second-order summary K(t). The classes are called repul sive (or regular), Poisson and clustered forests. The classification is based on the pointwise confidence limits of the parameter K(t), obtained under the Poisson hypo thesis. The used confidence limits are the limits of (4.12) with small values of t and the limits of (4.14) with larger values of t. If K(t) crosses the lower limit first, the sample plot is regarded as repulsive, and if it first crosses the upper limit, it is regarded as clustered. In this way a new variate is obtained for each sample plot, namely, the estimated type of spatial pattern. us denote this by T. (The probability of K(t) crossing both the lower and upper limits, under the Poisson hypothesis, while 0.95 pointwise confidence limits are used, in the case of sample plot like INKA plots, is quite low. According to our simulation experi ments, this varies from 0.001 to 0.004, depending on the size of the plot and the intensity of trees.) The results are naturally affected by the actual confidence level. Tables 1 and 2 show the distributions of the spatial pattern types in each inventory district, with the classifi cation based on the 0.99 and 0.95 pointwise confidence limits. The choice of the used confidence levels is here based on the visual inspection of the sample plots and on the simulation experiments with different spatial pattern models. If the classification is made with the 0.95 confidence limits, the percentage of regular forests in the whole data is 57 %, of Poisson forests 25 % and of clustered forests 18 %. According to this result, most of the seedling, sapling and thinning stands are not so clustered as is sometimes conjectured. (One reason for this conjecture might be that Poisson forests are regarded as clustered; cf. Figure 1 a).) If the classification is carried out with the Table 1. The sample plots classified according to the spatial pattern type T in the inventory districts I—VI: cf. Figure 11. The classification is based on the pointwise 0.99 and 0.95 confidence limits. >unts istrict II III iv VI .99 .99 .95 .99 .95 .95 .99 .95 .99 .95 re Po cl 64 106 39 107 50 52 85 113 11 124 64 21 100 96 13 139 50 20 138 193 83 200 103 111 111 84 14 141 37 31 21 33 5 36 18 5 Vital 209 209 209 209 414 414 59 59 •oportions istrict II III IV VI .99 .95 .95 .99 .95 .99 .95 .99 .95 .99 .95 re Po cl .306 .507 .187 .512 .239 .249 .407 .541 .052 .593 .306 .101 .479 .459 .062 .665 .239 .096 .333 .466 .201 .483 .249 .268 .531 .402 .067 .675 .177 .148 .356 .559 .085 .610 .305 .085 'otal 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 38 Tomppo, E. 0.99 confidence limits, the classes we obtain are strongly repulsive and strongly clustered. From Table 2 it can be seen that, among 747 repulsive sample plots, there are 519 (69 %) strongly repulsive ones and that out of 240 clustered sample plots 165 (also 69 %) are strongly clustered ones. The proportion of Poisson forests in different inventory districts varies from 24 % to 30 %, except in the district V, where it is 18 %. As regards the percentages of repulsive forests and clustered forests, the ranges of variation are 48 %—6B % and 9 %—27 %, respectively. In the districts I and IV the proportions of repulsive forests are smaller and those of clustered forests larger than in other districts (the differences are statistically significant with a risk level below 1 %). The INKA plots represent young or middle-aged stands with spatial patterns strongly affected, among other things, by the degree of the forest manage ment intensity. According to this result, there seems to be a relatively larger number of unmanaged forests in the inventory districts I and IV than in other districts. In Table 3, the sample plots of each inventory district are classified according to the type of spatial pattern and the dominant tree species. The dominant tree species of a sample plot is here defined to be the tree species with the largest cumulative sum of basal area at the height of 1.3 m. (This definition usually yields the same dominant tree species as the INKA system, because the stem volume of the dominant tree Table 2. The absolute and relative frequencies of the spatial pattern types T in the whole study area. The classification is based on the 0.99 and 0.95 pointwise confidence limits. species in INKA plots represents in most cases at least 70 % of the total tree volume.) Table 4 shows the distributions of the spatial pattern types in the whole study area for each of the tree species. In pine- and spruce-dominant stands, the proportion of clustered forests is the same, 17—18 %, while repulsive forests are slightly more frequent in pine-dominant stands (61 %) than in spruce-dominant ones (52 %). In birch-dominant stands, the percentage of clustered forests is 31 % and that of repulsive forests 53 %. The distribu tions of the spatial pattern types also depend on other stand variates, for example the mean diameter of trees, as will be seen in the next section. If repulsive stands are further classified into repulsive and strongly repulsive classes, and clustered ones into clustered and strongly clustered classes, in the way described above, we obtain the distributions of the data presented in Table 5. Table 3. The number of sample plots of each inventory district classified by the spatial pattern type T and the dominant tree species (p = pine, s = spruce, b = birch). The classification is based on the pointwise 0.95 (above) and 0.99 (below) confidence limits. Abs. fr. Rel.fr. Abs. fr. Rel. fr. -e Po :1 519 625 165 .397 .477 .126 747 322 240 .571 .246 .183 "otal 1309 1.000 1309 l.i istrict re Po cl 60 19 24 38 28 23 9 3 5 71 22 5 44 39 15 9 3 1 78 29 12 53 17 5 8 4 3 137 66 67 55 33 29 8 4 15 88 19 16 44 18 12 9 0 3 25 12 4 8 6 1 3 0 0 "otal 103 89 17 98 98 13 119 75 15 270 117 27 123 74 12 41 15 re Po cl 38 47 18 19 53 17 7 6 4 52 42 4 28 64 6 5 7 1 53 58 8 41 31 3 6 7 2 89 134 47 42 48 27 7 11 9 73 42 8 31 40 3 7 2 3 15 22 4 5 9 1 1 2 0 "otal 103 89 17 98 98 13 119 75 15 270 117 27 123 74 12 41 15 39 Commun. Inst. For. Fenn. 138 Table 4. The sample plots of the whole study area classified by the spatial pattern type and the dominant tree species, with the 0.95 and 0.99 pointwise confidence limits. Table 5. The sample plots of the study area classified by the spatial pattern type (5 classes) and by the dominant tree species. All the classifications made here are, in fact, based on a test of the spatial randomness of trees; cf. Section 421. However, the use of a test in this kind of a classification is open for criticism. Firstly, the result is affected by the used risk level. Secondly, it should be noted that if the sample size is large enough the Poisson hypothesis will be rejected at any risk level. Another possibility would be to base the classification on a model and its parameters. However, the used method is justified, because the second-order parameter, Ripley's K(t), is equivalent for example to the potential function of the Gibbs process. Moreover, the number of trees in the sample plots does not much vary but falls between 32 and 48 in 76 % of the plots. The use of a parametric model is also quite problematic. The sample plots are fairly small, and the parameter estimation based on a single sample plot would therefore be unreliable. The resulting classifi cation would also be rather crude. Further, the estimation of the parameters based on a single sample plot is time-consuming if the used model is not a Gibbsian one. The suitable type of model for each plot should be chosen first, possibly by means of simulation, and the parameters of the models should then be estimated, possibly also with an intractable method. This procedure also requires a parametric classi fication rule. 62. Prediction of the spatial pattern type in terms of stand variates This section deals with the prediction of the spatial pattern type of trees by applying forest stand variates, herein called cova riates. If it is possible to predict the spatial pattern in terms of these 'conventional' forest characteristics, it will be possible to simulate artificial forests, for different kinds of research purposes, with the relative loca tions of trees distributed in accordance with the same laws as in natural forests. For the simulation, only the values of covariates from the relevant forest must be known. Further, the measurement of the values of conventional forest variates is usually easier and cheaper than the identification of the spatial pattern for example by investigating the exact locations of the trees. Naturally, a given combination of covariate values does not uniquely determine the spatial pattern of trees. The corresponding configuration Abs. fr. Rel. fr. Abs. fr. Rel. fr. re Po cl 459 167 128 242 141 85 46 14 27 61 22 17 52 30 18 53 16 31 320 345 89 166 245 57 33 35 19 42 46 12 36 52 12 38 40 22 "otal 754 468 87 100 100 100 754 468 87 100 100 100 frequency ;rcentage Pine Spruce Birch Total Pine Spruce Birch Total ;tr re e Po :1 ;tr cl 320 139 167 39 89 166 76 141 28 57 33 13 14 8 19 519 228 322 75 165 42 19 22 5 12 36 16 30 6 12 38 15 16 9 22 40 17 25 5 13 "otal 754 87 1309 100 100 100 100 40 Tomppo, E. can actually vary between extreme regularity and close clustering. However, it is probable that the relative frequencies of the spatial pattern types depend on the values of the covariates. For instance, on might expect that an increase from 15 cm to 30 cm in the mean diameter of trees will make the spatial pattern more regular on an average, and, if other variates are fixed, an increase in the standard deviation of the diameters of trees will raise the proportion of clustered forests. Our aim is to find those covariates which are the best predictors of the distribution of the spatial pattern types and to estimate the corresponding conditional distributions. Here the types of spatial pattern have been divided into three classes, repulsive, Poisson and clustered forests, by applying Ripley's K and the 0.95 pointwise confidence limits of K. The choice of the confidence level is based 1) on the fact that the number of trees in a plot is fairly small, 2) on the visual inspection of the point configurations, and 3) on simulation experiments with each type of model, the sizes of plots and the intensity of trees corresponding to those of INKA sample plots. 621. Estimation method used The relationship between the type of spatial pattern and the covariates was first inspected by applying correlograms and 2- way cross classifications. The objective was to find those covariates which would best explain the variation of the spatial pattern type. The dependence of the pattern type on some covariates of 7NFI is also inspected. The applied covariates and their abbrevia tions are: Forest site type (xj) Dominant tree species (x 2) Development class of trees (x 3) Basal area (x 4) Mean diameter of trees (x 5) Standard deviation of tree diameter (x 6) Number of trees per hectare (x 7) Mode of forest regeneration (x 8) Quality of stand (x 9) Demand of thinning ( xio) The values of all covariates are calculated plotwise. The mean diameter and standard deviation of diameter are calculated as weighted mean values, the diameter of the tree as the weight. The determination of the dominant tree species is described in the previous section. This preliminary analysis showed the dependence of the spatial pattern type on the forest site type (xj) and the dominant tree species (X 2) to be quite weak. Further, these covariates are mutually dependent due to the choice of the population; cf. Chapter 5. However, there are some differences in the distributions of the pattern types of different tree species, wherefore models are calculated by tree species, too. Moreover, the development class is strongly dependent on the mean diameter of the trees; so it is enough to include the mean diameter, because its explanation capacity seems to be better. The covariate 'quality of stand' is excluded due to its weak explanation capacity and its dependence on 'demand of thinning'. Also the values of 'quality of stand' do not vary very much; two thirds of the stands represented the value 'good quality'. The covariates chosen as potential ex planatory variates of the spatial pattern type in a more detailed analysis are 1) the basal area of trees, 2) the mean diameter of trees, 3) the number of trees per hectare, 4) the standard deviation of diameter of trees, 5) the mode of forest regeneration and 6) the demand of thinning. The aim is to identify those ones which best explain the variation of the pattern type and to estimate the probabilities of the classes, given the values of these covariates. A natural model for the conditional probabilities is a multivariate logistic model. To justify the use of the logistic model we assume, for simplicity, that we have three 'explanatory' covariates A, B and C. We apply the logistic model with classified covariates. The classification will be needed later in the estimation of the parameters in the Gibbs models; cf. Section 63. The sample plots are classified with respect to the values of the covariates and the response variate T into a four-dimensional array. Let us denote the (theoretical) frequencies of the array by Yj ', i = 1,2,3, j = 1,...,J, k = 1,...,K, 1 = 1,..., L, where i,j,k,l refer to the levels of T, A, B and C, respectively. We are not interested in modelling interactions between the covariates. Thus we consider the marginal totals $ to be fixed to their observed values, denoted by n)kl. We assume 41 Commun. Inst. For. Fenn. 138 that the random vector (Yjjkl; i = 1,2,3) is multinomially distributed with parameters (n) kl ; 7rji kl , i = 1,2,3), for all j,k,l, where Trjjkl = pr (T =i|A=j,B =k,C = 1) denote the conditional probabilities of T given the values of the covariates. The probabilities are subject to the constraints 2 7r; jkl=l, > 0. A model which has the'advantage of being linear in the parameters to be estimated and also takes account of the above constraints on TT\ jkl is the multivariate linear logistic model. We express the conditional probabilities as where has a linear structure in the parameters. We proceed by presenting different structures for rjjjkl and their interpretations. The model with no effects of covariates on the probabilities, called the minimum model, is According to this model, = for all j,k,l, and the estimates of are the proportions of the different values of T in the data. If all covariates A, B and C affect the probabilities, and do this independently of each other, the model is If, for example, the effect of A is different in the different categories of B, the corresponding interaction parameter (a/3)ijk should be included in the linear expression, and we obtain the model If the interaction parameters (ajB)ijk change with the values of C, also the second-other interaction parameter (ay3"y)ijkl should be included. The so-called full model (the saturated model) contains all the main effect parameters and the first- and second order interaction parameters. The aim is to find a parsimonious model between the minimum model and the full model which fits the data well. The fit of a particular model can be tested using the statistic where l c is the likelihood of the current model and If the likelihood of the full model with the given data. Note that the full model has as many parameters as the data has random frequencies, and, consequently, the full model fits the data perfectly. Under certain regularity conditions, any nested models 1 and 2 (i.e., the parameter space under the model 2 is a subspace of that under model 1) can be tested using a statistic of the form (6.5). Then, under the model 2, 5(2,1) = —210g(12/lj) is asymptoti cally x 2 distributed with t] —t2 degrees of freedom, where t; is the number of independent parameters estimated under the model i; see for example Kendall and Stuart (1967). Both the overall test of goodness of fit and tests of nested models can be used in the selection of interaction terms for the right-hand side of the model (6.2) and in the comparison of the explanation capacities of different covariates. If no computer program is available which fits multivariate linear logistic models directly, a program for fitting log-linear models to Poisson data can be used. For the correspondence between logistic models and log-linear models, see Birch (1963) and Palmgren (1981). 622. Conditional distributions of the spatial pattern types For the estimation of the conditional distributions of the spatial pattern types, the values of each covariate are divided into three classes, as shown in Table 6. This classification is thought to be fine enough also for continuous covariates, because of the conjecture that small changes in the covariates would not essentially change the distributions of the spatial Table 6. The classes of covariates. (6.1) 7r; >kl = £ kl V i = 1,2,3, j = k = 1,...,K, 1 = 1,..., L, (6.2) = Mi (6.3) rj;) kl -Mi + <*ij +Ak + Til (6.4) r7i« kl -m + «ij + + Til + («o)ijk- l c (6.5) S(c,f) =-2 log (jp, Zovariate Class 1 Class 2 Class 3 4 (m 2/ha) <12 5 (cm) <10 6 (trees/ha) < 1000 7 (cm) < 3 8 natural 10 urgent 12 < Xj < 24 24 < 10 < x 5 < 20 20 < 1000 7i ,kl =Mi + «ij +Ak + Til- (6.6) r;Jkl -M, + <*ij + ftk + 7il + (fiy),kl Covariates S(c,f) Degrees of freedom x 4, x5, x 6 x 4 , x 5, x 7 x4, x5, x 10 X 4. x 6, X 10 x 4 , x7, X 10 x 5> x 7, x 8 x 6> *7. x 8 75.0 74.2 139.7 137.0 104.5 104.2 155.5 40 40 40 40 40 40 40 x 5 x 4 1 Total 1 1 2 3 7 12 18 0 0 0 o o o 14 8 5 30 5 1 1 0 0 1 1 1 85 52 22 78 39 11 216 117 58 2 1 2 3 84 20 24 0 0 0 0 o o 44 16 15 91 36 28 41 24 14 0 0 0 3 8 7 34 33 6 297 137 94 3 1 2 3 100 25 28 26 7 6 1 1 1 7 2 4 57 19 33 40 13 15 0 0 0 1 o o 2 1 1 234 68 88 Total 318 39 3 115 300 148 3 178 205 1309 43 Commun. Inst. For. Fenn. 138 classes are given in Table 6. The spatial pattern type has been classified by using the pointwise 0.95 confidence limits of K. The classes are 1 = repulsive, 2 = Poisson and 3 = clustered sample plot. The conditional probabilities are estimated from Table 8 with the model (6.6). The weight zero is given to the cells with low frequencies, more precisely, to the 3 cells (i, j, k, 1), i = 1,.. ~ 3, with 1 yJ kl <3. i = l Those cells would have made the model poorer (and their frequencies would have been unreliably estimated in every case). In this way we obtain a model with S(c, f) = 19.44 (degrees of freedom = 24) and p-value = 0.728. The estimated probabilities are indicated in Table 9. All computations concerning multivariate logistic models are carried out by using the GLIM software package. (The Poisson error and log link in the GLIM are used; cf. Palmgren (1981).) The examination of the effects of the covariates on the distributions of the spatial pattern types in Table 9 shows among other things that 1. An increase in the mean diameter does not increase the proportions of repulsive stands, but mildly decreases them. (Here, unusual combinations of covariate values are excluded, for instance the combination of the mean diameter = 1 and the intensity of trees = 1. In this class, which mainly includes pine-dominant sample plots, the large proportion of clustered forests is a consequence of the inhomogeneity of the spatial pattern.) 2. In the diameter classes 2 and 3, a rise in the intensity of trees will increase the probability of aggregation and decrease the probability of repulsion. This kind of relationship does not exist in the diameter class 1. 3. If the mean diameter and the intensity of trees are kept constant, an increase in the basal area will raise the probability of repulsion and decrease the probability of aggregation. The last observation indicates that, with fixed values of the mean diameter and the intensity of trees, a stand with a regular spatial pattern has a larger basal area (and stem volume) than the corresponding clus tered forest. (One reason for this might be that in a clustered stand the diameter of the trees varies in a wider range than in the corresponding regular one.) The conditional distributions of the spatial pattern types have been estimated also by the dominant tree species, i.e., separately for pine, spruce and birch. Tables 10, 11 and 12 give the cell frequencies for these. The classification is based on the spatial pattern types, mean diameter, basal area and intensity of trees. The class limits are the same as in Table 6. The conditional probabilities correspond ing to Tables 10—12 are estimated by applying the model (6.6). The results are indicated in Tables 13 —15. The correspond ing p-values, the values of S(c,f) and the degrees of freedom are indicated in Table 16. A comparison of Tables 13 and 14, i.e., the distributions of pine and spruce, yields (among others) the following conclusions: 1. If the mean diameter of trees is less than 10 cm, spruce-dominant stands are more regular than pine dominant ones. The probability of a typical spruce dominant stand being regular is 0.8, while the figure for a pine-dominant stand is 0.6. One explanation of this phenomenon can be the mode of regeneration, spruce-dominant stands being plantation stands Table 9. The estimated probabilities of the spatial pattern types as a function of stand covariates, estimated from the whole data. x 5 x 7 T x 4 1 1 1 2 3 .189 .324 .487 .688 .207 .105 .706 .206 .088 .556 .309 .135 .583 .327 .090 2 1 2 3 .656 .156 .188 .539 .237 .224 .565 .242 .193 .607 .262 .131 .384 .424 .192 .412 .457 .131 3 1 2 3 .654 .163 .183 .676 .168 .156 .491 .165 .344 .527 .171 .302 .592 .193 .215 .500 .250 .250 44 Tomppo, E more frequently than the pine-dominant ones. 2. In the diameter class 2 the situation is the opposite, the probability of a stand being regular varies in pine-dominant forests from 0.56 to 0.77 and in spruce-dominant forests from 0.34 to 0.63. Corre spondingly, the probability of a stand being clustered varies from 0.20 to 0.24 with pine and from 0.14 to 0.40 with spruce. 3. When the mean diameter rises from the class 1 to the class 2, pine-dominant stands become somewhat more regular and spruce-dominant stands substan tially more clustered. This trend can be caused by the fact that new tree individuals capable of competition do not appear any more in pine dominant stands of this development phase as happens in spruce-dominant stands. In the diameter class 2, an increase in the intensity of trees raises the proportion of clustered forests for both tree species. 4. In the diameter class 3, the differences between the distributions of the spatial pattern types become smaller, especially with an intensity of trees below 1000 trees/hectare. With larger values of intensity, the pine-dominant stands are still more irregular than the spruce-dominant ones. The distributions of the spatial pattern types of birch (Table 15) markedly differ from the previous ones. A characteristic feature of the diameter class 1 is the small proportion of Poisson forests and of the diameter class 2 the small proportion of regular forests. When the mean diameter exceeds or equals 20 cm, the distributions of birch approach those of pine and spruce. However, one should note that the number of birch-dominant sample plots is rather small for this kind of analysis and that the obtained estimates can be unreliable. Table 10. Pine-dominant sample plots classified by the spatial pattern type (T), the mean diameter (x 5), the basal area (x 4) and the intensity of trees (x 7). Table 11. Spruce-dominant sample plots classified by the spatial pattern type (T), the mean diameter (x 5), the basal area (x 4), and the intensity of trees (x 7). x 5 x 7 T x 4 1 3 Total 1 1 2 3 6 10 17 0 0 o o o o 11 8 4 25 2 1 1 0 0 1 1 1 44 26 7 26 13 6 114 60 36 2 1 2 3 54 17 22 O O O o o o 32 14 10 69 13 10 19 9 4 0 0 0 1 o 2 13 7 1 188 60 49 3 1 2 3 71 24 18 20 7 3 0 1 0 3 1 3 43 11 12 17 2 7 0 0 0 1 0 o 2 1 0 157 47 43 Total 239 30 1 86 186 59 3 81 69 754 x 5 x 7 T x 4 1 3 Total 1 1 2 3 0 2 1 0 0 0 o o o 1 0 1 3 3 0 o o o o o o 33 23 11 49 23 5 86 51 18 2 1 2 3 15 3 1 O O O o o o 11 2 4 19 20 13 21 14 9 0 0 0 2 7 5 21 26 4 89 72 36 3 1 2 3 26 0 7 2 0 0 1 0 o 3 1 1 13 7 17 22 10 5 0 0 0 0 0 0 0 0 1 67 18 31 Total 55 2 1 24 95 81 0 81 129 468 45 Commun. Inst. For. Fenn. 138 Table 12. Birch-dominant sample plots classified by the spatial pattern type (T), the mean diameter (x 5), the basal area (x 4), and the intensity of trees (x 7). Table 13. The estimated probabilities of the spatial pattern types for pine as a function of stand covariates. Table 14. The estimated probabilities of the spatial pattern types for spruce as a function of stand covariates. x 5 "7 T x 4 1 3 Total 1 1 2 3 1 0 0 0 0 0 0 0 0 2 0 0 2 0 0 0 0 0 0 0 0 8 3 4 3 3 0 16 6 4 2 1 2 3 15 0 1 0 0 0 0 0 0 1 0 1 3 3 5 1 1 1 0 0 0 0 1 0 0 0 1 20 5 9 3 1 2 3 3 1 3 4 0 3 0 0 1 1 0 0 1 1 4 1 1 3 0 0 0 0 0 0 0 0 0 10 3 14 Total 24 7 1 5 19 8 0 16 7 87 x 5 T x 4 1 1 1 2 3 .182 .303 .515 .624 .240 .136 .773 .160 .067 .588 .312 .100 .549 .333 .118 2 1 2 3 .581 .183 .236 .557 .249 .194 .725 .174 .101 .690 .189 .121 .579 .294 .127 3 1 2 3 .603 .233 .164 .760 .157 .083 .466 .178 .356 .662 .136 .202 .618 .145 .237 x 5 x 7 T x 4 1 1 1 2 3 .500 .500 .000 .487 .346 .167 .641 .297 .062 2 1 2 3 .789 .158 .053 .634 .149 .217 .337 .349 .314 .516 .348 .136 .247 .505 .248 .383 .509 .108 3 1 2 3 .788 .000 .212 .644 .093 .263 .363 .233 .404 .577 .241 .182 46 Tomppo, E. Table 15. The estimated probabilities of the spatial pattern types for birch as a function of stand covariates. 63. The estimates of the parameters of Gibbs processes 631. The estimates from mapped data The models for spatial patterns of regular and clustered stands are estimated from mapped data in the way described in Section 432. The multiscale pairwise interaction process (3.19) is used as a model. The arguments for this choice are: 1) The model is suitable even if interaction between trees varies with distance from repulsive (even hard-core) to attractive. Such a variation seems to be a typical feature of the potential functions of trees, i.e., some intertree distances can occur less frequently while, simultaneously, some other intertree dis tances occur more frequently in a forest stand than in the Poisson forest. The estimated potential function can be regarded as a second-order data summary if it is hard to find a biological interpretation for it. 2) The parameters of the models can be estimated with the available data. 3) It is easy to simulate the locations of trees with the estimated models. 4) It is not necessary to know the pattern type beforehand. Further, the chosen model is suitable for both repulsive and clustered stands as well as for stands where both types of interaction occur, as indicated above. The choice of the model will be further discussed at the end of this section from the point of view of the obtained estimates. A step function of the form (4.29) is applied as the potential function. The width Table 16. The p-values, scaled deviances and degrees of freedom of the models (6.6) for the conditional probabilities of the spatial pattern types for pine, spruce and birch. The models are estimated from Tables 10—12. of the step, based on the experiments, is 25 cm in all cases, except in repulsive stands with the mean diameter exceeding or equalling 20 cm, for which it is 50 cm. Before the estimation of the models, the sample plots are divided into homogeneous groups, where the trees can be supposed to be generated by the same spatial process. The variates used in the classification are the tree species (pine, spruce, birch), the mean diameter ((0,10), [10,20), [2O,—)), and the spatial pattern type (repulsive, Poisson, clustered). Thus, each tree species has nine possible models for the spatial pattern (including the Poisson forest). Further, no subclassification of plots for example with respect to the basal area or intensity of trees is considered necessary, because it is as sumed that, after the mean diameter and spatial pattern type have been fixed, the variation of these covariates does not substantially affect the form of the potential function. The estimated models are tested by simulating the obtained process. The second order summary K is calculated from the x 5 1 2 3 x 7 T x 4 1 2 3 1 2 3 1 2 3 1 .532 .503 1 2 — — — — — — — .244 .391 3 — — — — — — — .224 .106 1 .938 .281 .302 2 2 .000 — — — .243 .442 — — — 3 .062 — — — .476 .256 — — — 1 .429 .571 .154 .216 3 2 .142 .000 — — .111 .266 — — — 3 .429 .429 — — .735 .518 — — — 'iance ;rees of 5 ine »pruce »• _1_ 0.7380 0.9999 19.26 6.46 24 24 ~ 1 47 Commun. Inst. For. Fenn. 138 realizations and compared with the corre sponding summary computed from the data, if the differences are not small enough, the model is estimated with a new interaction radius. The inhomogeneous sample plots are removed before the estimation. Tables 17—19 present the estimates of the potential functions and the parameter a Q for each tree species. The tables also give the estimated intensities, trees/m 2 (for the Poisson forests as well). The estimated potential functions are of the form Here, the step width c is 25 cm except for repulsive forests in the diameter class 3, where it is 50 cm. Table 17. The estimates of potential functions, parameters aG and intensities of trees (X) (trees/m 2 ) for pine, by the mean diameter class (D) and spatial pattern type (1: D < 10 cm, 2: 10 cm < D < 20 cm, 3: D > 20 cm). Table 18. The estimates of potential functions, parameters a Q and intensities of trees (X) (trees/m 2) for spruce, by the mean diameter class (D) and spatial pattern type (1: D < 10 cm, 2: 10 cm < D < 20 cm, 3: D > 20 cm). °° if d = O, (6.7) (d) = 3.\ if (i —l)c < d < ic, i=l, . . n, 0 if d > nc. Pine Repulsive Clustered Poisson 1 2 3 1 2 3 1 2 3 A *0 h A 2 a 3 ! 4 *5 ! 6 A 7 A 8 A 9 « 10 A 1 ' ?' 2 ? 13 A 14 A 15 a 16 0.214 0.144 0.067 -5.8 -4.6 -4.1 4.5 30.0 30.0 2.8 6.5 7.0 2.2 5.2 3.5 2.0 2.6 2.0 1.0 1.4 1.0 1.0 0.7 0.0 0.0 0.0 0.0 -0.2 0.0 -0.4 -0.2 -0.2 -0.4 -0.2 -0.3 -0.3 0.212 0.173 0.068 0.167 -4.0 -4.0 -2.9 30.0 30.0 30.0 -1.0 1.0 30.0 -0.7 -1.0 1.0 -0.3 -0.7 -1.2 0.0 -0.3 -1.1 0.7 0.0 -0.5 0.7 0.0 -0.2 -0.2 1.0 0.0 -0.3 1.0 0.0 -0.3 0.4 -0.3 0.6 0.8 0.6 0.4 -0.6 -0.6 0.132 0.064 Repulsive Clustered Poisson 1 2 3 1 2 3 1 2 3 0 1 L 2 3 4 5 6 7 8 9 10 11 12 13 14 l 15 0.218 0.182 0.067 0.198 0.197 0.080 0.156 -5.8 -4.6 -4.1 -4.0 -4.0 -2.9 5.0 30.0 30.0 30.0 30.0 30.0 3.0 6.0 7.0 -1.0 0.9 30.0 2.5 5.0 3.5 -0.7 -1.1 1.1 2.5 2.7 2.0 -0.5 -0.7 -1.3 1.0 1.5 1.0 0.0 -0.3 -1.1 1.0 0.7 0.0 0.7 0.0 -0.6 0.0 0.0 0.0 0.7 0.0 -0.2 -0.3 0.0 -0.4 -0.2 1.0 0.0 -0.3 -0.2 -0.4 -0.2 1.0 0.0 -0.2 -0.3 0.4 -0.2 -0.2 0.6 -0.2 0.7 0.6 0.4 -0.6 -0.6 0.178 0.087 48 Tomppo, E. Table 19. The estimates of potential functions, parameters ac and intensities of trees (X) (trees/m 2) for birch, by the mean diameter class (D) and spatial pattern type (1: D < 10 cm, 2: 10 cm < D < 20 cm, 3: D > 20 cm). In the previous tables the intensity function value of 30 corresponds to the hard-core distance (theoretical value °°), i.e., the minimum distance between trees. The estimation of these distances is based on the observed minimum intertree distances and on the simulation experiments. (At this point, the estimation method usually gives a smaller value for the potential function.) The value of the parameter a Q has been estimated to correspond to the given intensity values and an area of 400 m 2. If the intensity of trees or the size of the area differ from these, the estimate of a Q has to be corrected. value of A the correction term is ln(400X/bA'), where A is the intensity given in the tables, A' is the new value of the intensity and b is the size of the area (m 2 ). Figures 12—17 illustrate the estimates of the potential functions. Each figure shows the functions for all the three considered tree species with one diameter class and one spatial pattern type. The tables and figures indicate the following facts. The interaction radius varies from 2.25 m to 4.5 m, depending on the mean diameter. Obviously, this is not a consequence of the biological properties of the trees, but of their current relative locations. These mostly result from the mode of regeneration, silvicultural measures and thinning operations. Figure 12. The pontential functions of repulsive stands, the mean diameter less than 10 cm. Figure 13. The potential functions of repulsive stands, the mean diameter greater than or equal to 10 cm and less than 20 cm. Birch Repulsive Clustered Poisson 1 2 3 1 2 3 1 2 3 V 0 l l 2 3 4 15 16 7 8 9 10 '11 '12 '13 '14 '15 '16 0.169 0.132 0.061 0.481 0.197 0.088 0.274 -5.6 -4.5 -4.1 -4.3 -4.0 -2.9 4.2 30.0 30.0 30.0 30.0 30.0 2.5 6.3 6.8 -0.8 1.2 30.0 2.0 5.2 4.0 -0.6 -1.2 0.8 2.0 2.6 2.0 -0.5 -0.8 -1.3 1.0 1.4 1.0 0.0 -0.3 -1.1 1.0 0.7 0.0 0.6 0.0 -0.5 0.0 0.0 0.0 0.6 0.0 -0.2 -0.3 0.0 -0.4 -0.2 1.0 0.0 -0.3 -0.2 -0.4 -0.2 1.0 0.0 -0.2 -0.2 0.4 -0.3 -0.2 0.7 -0.3 0.8 0.6 0.3 -0.6 -0.6 0.151 0.071 49 Commun. Inst. For. Fenn. 138 Figure 14. The potential functions of repulsive stands, the mean diameter greater than or equal to 20 cm. Figure 15. The potential functions of clustered stands, the mean diameter less than 10 cm. Figure 16. The potential functions of clustered stands, the mean diameter greater than or equal to 10 cm and less than 20 cm. Figure 17. The potential functions of clustered stands, the mean diameter greater than or equal to 20 cm. The estimated models indicated that, in repulsive stands, the potential function, the 'repulsion power', about exponentially de creases with the intertree distance. Re pulsion becomes zero in the diameter classes 1 and 2 at the distance of 1.5 m and in the diameter class 3 at the distance of 2.5 m; it changes into attraction in the diameter class 1 with the distances of 1.75 m —2.25 m, in the class 2 with the distances of 2.0 m —3.o m and in the class 3 with the distances of 3.5 m—4.5 m. These distances correspond to the average intertree distances in young and middle-aged forests. The potential functions of different tree species do not differ very much. The differences correspond to those in Tables 13—15. In the diameter class 1, the probability of being regular is larger for spruce-dominant stands than for pine dominant ones. Similarly, the repulsion is a little more powerful in spruce stands than in pine stands. In the diameter class 2 the situation is the opposite. In the diameter class 3 the potential functions of pine and spruce are equal (small differences). The potential functions of clustered forests have a hard core at the beginning, the hard-core distance being 25 cm in the diameter classes 1 and 2, and 50 cm in the diameter class 3. After that, interaction turns into attraction (in the diameter classes 2 and 3 after an interval with milder repulsion), up to about 1.0 m —1.75 m, depending on the diameter class. Beyond this interval interaction will change into repulsion and after a certain interval change back into attraction, before it totally vanishes. A possible interpretation of the form of the potential function is that the clusters of trees are rather small and that immediately outside the clusters there is an area of no or just a few trees. If the Gibbs type processes are used as a 50 Tomppo, E. model for clustered forests, there are some problems concerning the interpretation of the results. First, it is difficult to find a biological explanation of the form of the potential function presented above (that is, of the phenomenon that trees tend to cluster at some intertree distance and to be inhibited by each other at some greater distance). On the other hand, if the potential function is purely attractive, poss ibly with a hard-core distance, the realiz ation of the process in a bounded area is only one big cluster with a high probability. Therefore, with clustered forests, the esti mated potential function has to be regarded as a data summary only like the estimator (4.10) of Ripley's K(t). In fact, K(t) is equivalent to the corresponding potential function. In other words, we observe a spatial pattern, summarize it in terms of a potential function, which in our case is well behaved, and simulate a corresponding pattern by using the obtained data sum mary. More realistic models for clustered forests might be for instance the Neymann- Scott or Cox processes. Interaction pro cesses of a higher order or with a Markov function like (3.18) might also be possible models for clustered forests. 632. The estimates of the parameters of Gibbs processes by means of nearest neighbour measurements In order to estimate the parameters of Gibbs processes by means of nearest neigh bour measurements (cf. Section 433), we first compute the Hopkins' index of random ness (4.3) for each sample plot. Our aim is to see whether this summary could be used for classifying sample plots by the spatial pattern type. From each sample plot, 20 trees and 20 uniformly distributed test points are sampled. Only such trees and points are accepted whose distance to the boundary of the sample plot is at least 0.5 m and, more importantly, also greater than the distance to the nearest tree. Table 20 shows the distributions of the index values by the spatial pattern type; the latter classification is based on the Ripley's summary and the pointwise 0.95 confidence limits of the corresponding parameter. (Here the sample plots belonging to the Nurmes-project, locating in the area of the Table 20. The sample plots classified by the Hopkins' index and the spatial pattern type. inventory year 1981, are excluded, which is why the total number of plots is smaller than in the previous sections.) For repulsive forests is usually below 0.5 and for clustered forests above 0.5. However, Poisson forests cannot be separated from repulsive or from clustered forests with this sample size. (The larger sample size is not used because of the rather small number of trees in plots and because of the guard area described previously.) Therefore, in order to estimate the parameters of potential functions, we divide the sample plots into subclasses in the same way as in the previous section by using the tree species, mean diameter of trees and Ripley's second-order summary. The function (6.7), with the same step width as in the Section 631, is applied as the potential function. The parameters a;, i = 0,..., n are estimated by using the method described in Section 433. Three trees and three uniformly distributed test points q; are sampled from each plot and the distances to the nearest tree measured. In addition, the numbers of trees in the annuli b(q„ c • j)\b(qj, c(j - 1)), j = 1,..., n, are counted. Here c • n is chosen as the interaction radius R in the models of Tables 17—19. Only the trees and points with a distance from the boundary exceeding R are Type of pattern Hopkins' index 0.000—0.410 0.411—0.420 0.421—0.430 0.431—0.440 0.441—0.450 0.451—0.460 0.461—0.470 0.471—0.480 0.481—0.490 0.491—0.500 0.501—0.510 0.511—0.520 0.521—0.530 0.531—0.540 0.541—0.550 0.551—0.560 0.561—0.570 0.571—0.580 0.581—0.590 0.591—1.000 151 35 44 52 50 58 49 28 34 28 21 15 14 12 8 3 0 1 3 8 4 4 2 6 15 12 17 14 16 24 28 28 20 21 14 13 8 12 7 15 2 0 0 3 1 1 1 7 2 3 15 10 9 3 14 13 6 10 8 69 157 39 46 61 66 71 67 49 52 55 64 52 43 36 36 29 14 23 18 92 Total 614 280 177 1071 51 Commun. Inst. For. Fenn. 138 Table 21. The estimated potential functions, parameters a 0, and intensities of trees (A) (trees/m 2 ) and used sample sizes (n), by the mean diameter class (D) and spatial pattern type. accepted. The estimates of the potential functions seem to depend (naturally) on the size of the used sample and also on the sample itself. This is because the spatial patterns of trees vary between sample plots, within one subclass, in spite of the classifica tion. If Hopkins' index is also used in the classification, there is usually less variation in estimates between samples. The estimated potential functions for pine as well as the intensities and the used sample sizes (the numbers of random trees and random points) are given in Table 21. Comparing the estimates in Tables 17 and 21 (and simulated configurations) we can see that the estimates of repulsive forests are quite close to each other, especially in the diameter classes 1 and 2. The estimates of clustered forests in the diameter class 1 are satisfactory, too. In the diameter class 2 there are marked differencies between esti mates obtained with different methods. However, the simulated spatial patterns seem to be clustered in both cases. The number of observations (18) in the diameter class 3 is probably too small for this kind of estimation because the obtained estimates appeared to be unreliable. (They are not given here.) It can be deduced from these results that the method works quite well especially with a small interaction radius. It seems to work better in repulsive forests than in clustered ones. (One reason for this, apart from the smaller number of observations in clustered cases, might be that the spatial pattern of trees vary more in clustered forests than in repulsive ones.) The effect of the sample size on the estimates is also studied. For this purpose, 10 samples (m uniformly distributed test points and m random trees) from pine dominant sample plots, the mean diameter below 10 cm, are picked up. The used sample sizes m are 100, 80, 60, 40 and 20. The estimate of the potential function (6.7) is computed from each sample. Table 22 shows the means and standard deviations of the estimates computed from the ten samples with each sample size, and, further, the estimates computed with 300 random points and 300 random trees. According to these estimates, the mini mum adequate sample size is about 60—100 random trees and random points. In applications, the estimation of the potential function by using nearest neigh bour measurements can be carried out as follows. 1) The interaction radius of the forest stand in question is estimated, possibly by using the interaction radius of the potential functions in Tables 17—19. 2) About 60—100 trees and uniformly dis tributed test points are sampled. 3) The numbers of trees around each test point are counted in circles, radii up to the interaction radius (2.25 m—4.5 m), the increment in the radius 0.25 m—o.s m depending on the l Repulsive Cl 2 3 1 Justerec ;d 2 D n A 100 100 87 54 78 0.2003 0.1805 0.0650 0.2764 0.2190 Distance Parameters of the potential functions 0.00—0.25 0.25—0.50 0.50—0.75 0.75—1.00 1.00—1.25 1.25—1.50 1.50—1.75 1.75—2.00 2.00—2.25 2.25—2.50 2.50—2.75 2.75—3.00 Estimate of « 2 a 3 *5 h A 7 ! 8 A 9 A 10 A U a 12 a o 4.44 3.38 1.83 1.14 0.24 0.12 -0.15 -0.14 -0.20 -4.90 9.53 4.10 4.37 1.57 3.50 -0.22 -0.70 -0.81 -0.10 -0.18 -0.85 0.39 -3.50 1.63 2.17 2.17 1.34 0.47 -0.06 0.02 -0.43 -0.39 -3.40 6.46 -0.70 -1.02 0.87 0.22 -0.11 0.02 -0.26 -0.19 -4.10 11.84 -0.45 -0.35 -0.14 -0.10 0.02 -0.01 0.07 0.03 0.09 -0.08 -4.40 52 Tomppo, E, mean diameter. 4) The estimates are com puted in the way described in Section 433. One should note that the measurements needed for this method are relatively easy to carry out, because the total average number of trees to be counted around each test point (at the third step) varies from 3.1 to 5.1 (with intensities in Tables 17—19). Table 22. The means and standard deviations (in parenthesis) of the estimates of the potential functions for pine-dominant sample plots in the diameter class 1 as a function of the sample size m, computed from ten samples, and the estimates computed with 300 random points and trees. m 300 100 80 60 40 20 ®i 3.96 4.44(1.94) 5.80(2.27) 4.91(2.41) 5.16(3.36) 4.07(9.77) A 2 3.17 3.38(1.15) 4.58(1.28) 3.30(2.34) 2.65(2.48) 2.38(3.59) *3 2.27 1.83(0.53) 2.12(1.13) 2.34(1.88) 1.26(2.04) 1.68(2.14) ! 4 1.33 1.14(0.30) 1.45(0.41) 1.33(1.78) 0.69(1.98) 0.68(1.26) *5 0.18 0.24(0.45) 1.07(0.38) 0.88(0.73) -0.02(0.98) -0.14(2.79) K 0.01 0.12(0.26) -0.04(0.32) -0.14(0.49) 0.31(0.76) 0.16(3.10) A 7 -0.24 —0.15(0.21) -0.83(0.34) -0.98(0.25) -0.27(0.41) -0.37(3.29) A 8 -0.14 -0.07(0.31) -0.28(0.26) —0.29(0.40) 0.62(0.71) -0.49(0.89) ä 9 -0.05 -0.20(0.28) -0.73(0.51) -0.67(0.28) -0.95(0.37) -0.62(0.56) 53 Commun. Inst. For. Fenn. 138 7. SIMULATION OF THE SPATIAL PATTERNS OF TREES IN A FOREST REGION This chapter deals with the simulation of the locations of trees applying the above models. Let us suppose that the forest region consists of a set of stands. In each stand, the known quantities are 1) the tree species, 2) the mean diameter, 3) the basal area and 4) the intensity of trees (or some of these variates) with an accuracy given in Table 6, plus the area of the stand. The locations of trees can be simulated in the following way: 1. Take an arbitrary stand of the forest region. 2. Fix the conditional distribution of the pattern type corresponding to the observed tree species of Tables 13—15 (or of Table 9 if the tree species is unknown), with the mean diameter, basal area and intensity of trees given. Let the conditional probabilities of the classes be p1? p 2, P3, 1 re pulsive, 2 = Poisson and 3 = clustered. 3. Choose a type of spatial pattern randomly according to the probabilities pj, p2, P3. 4. Simulate the locations of trees with a model corresponding to the observed tree species, the mean diameter, the intensity of trees and the drawn spatial pattern type. 5. Repeat steps I—4 until all stands have been treated. If a Poisson forest is obtained at step 3, the simulation is effected as described in Section 31. (If the number of trees must be fixed, a binomial process has to be simu lated.) On the other hand, if the picked up pattern type is regular or clustered, the corresponding model is taken from Tables 17—19, and the locations of trees are generated by simulating this pairwise inter action process P. The simulation can be performed with an acceptance-rejection method: A realization of a homogenoeus Poisson process and a random variable U, uniformly distributed on (0,1), are simu lated. The realization n is accepted if P(/u) > U; cf. Ripley (1977). If the potential function deviates distinctly from zero, the acceptance probability of the realization becomes very small (see for example Pentti nen 1984) and the simulation procedure is very time-consuming. Another simulation method, also applicable in these cases, is based on spatial birth-and-death processes; cf. Section 362. 71. Simulation of spatial birth-and-death processes The usual birth-and-death process is a continuous-time Markov process, having the set of all non-negative integers as the state space. The state of the process indicates the number of individuals being alive. If the process is in state n, the immediate transitions can only be made to state n + 1 (the birth of an individual), or to state n 1 (the death of an individual). (In state 0 the only possible transition is to state 1.) The births and deaths are regulated by the birth and death rates of the process. In the case of a spatial birth-and-death process, the birth and death rates depend not only on the number of individuals being alive but also on their locations. The interaction between individuals makes some relative locations more favourable than the others. The spatial birth-and-death process can be defined through the birth and death probabilities. Let b: R 2 X N —■ R be a positive R 2 (X) TV-measurable function sat isfying Jb(x,/Lt)e(dx) < 00 for all bounded BER 2 and nEN. Further, let d: R 2 X N - R be a positive K 2 (x) TV-measurable func tion. The probability of a new individual being born in a time interval (t, t + s) in the set B is and the probability of the individual ö x from a configuration n+ö x dying in the same time interval (t, t + s) is (7.1) s/b(x,MMdx) + o(s), B (7.2) s • d(x,n) + o(s). 54 Tomppo, E. Functions b and d are called birth and death rates. (Some conditions must be imposed on b and d to ensure that the process exists.) In the simulation of Gibbs processes, we use the link between spatial birth-and-death processes and Gibbs pro cesses. Let us suppose that a Gibbs process P, specified by f: N—• R+ is given and that the birth and death rates satisfy the 'balance' condition: if f(O. Then (under some further conditions on b and d), the birth-and-death process exists, is time-reversible and its unique equilibrium distribution is P; see Preston (1975) and Ripley (1977). (The time-reversibility means that the behaviour of the process, if we look from any time point forward or backward, is equal.) Further, (3.23) and (7.3) imply that the local energy of P is The simulation of the pairwise interaction process can be based on (7.4). The locations of trees are simulated through the following phases: 1. Place the stand in a rectangle E and simulate a starting configuration with a Poisson process or a binomial process. 2. Construct a set E* surrounding the rectangle such that for each point x of the rectangle E, b(x,R) O (E U E*) = b(x,R) (b(x,R) is a circle, radius R, centrepoint x), where R is the interaction radius of the process. Simulate a starting configuration also in the set E*. The starting configuration in E* can be chosen in the following way: a) empty con figuration, b) fixed configuration which the progress of the process does not change, c) periodic surroundings; see Figure 18. In the last alternative the configuration in the surroundings changes with the births or deaths occurring in the realization of the process. The trees located near the boundary of the rectangle are in the same position as the trees in the centre when the potential energies are computed. (Such surroundings have been applied to the simulations of the patterns in Figures 20 a)—f).) 3. Simulate a birth-and-death process by choosing birth and death rates according to the equation (7.4). Given the local energy E, feasible choices are for example rates satisfying d(x,/u) = exp(aE(x,/u)) and b(x,/j) = exp(—(l a) E(x ;( u)), a G [o,l]. The least time-consum ing process might in some cases be reached Figure 18. An example of periodic surroundings. The set E* is identical with the union of the sur rounding rectangles. by choosing a=l, i.e., b(x,/x) =1 and = exp(E(x,/x)). Hence the birth probability (7.1) is + o(s) and the death probability (7.2) sd(x,/u + + o(s), where yu is the configuration of trees in the rectangle E and /j.' in E*. Note that the simulation procedure discussed in Section 362 corresponds to the choice b(x,ju) = exp (—E(x,ju)) and d(x,ju) = 1 if n(E) = n 1, and both 0 otherwise. One step of the spatial birth-and-death process can be simulated as follows. Let us suppose that the configuration at the time t n (in a unit square) is n(t) = 2 !• Simulate the random variable X = min(Xj,..., X n, Xn+l ). If X = X;, i = 1,..., n, tree O, btx,n) (7.4) E(x, M) = -log y 55 Commun. Inst. For. Fenn. 138 To simulate whether at the next step some tree will die or a new tree will be born, first compute the probabilities q; . In the proportions of these probabilities, a lot is drawn as to the removal of a tree or the addition of a new tree. This lot-drawing is done in the usual way by simulating a uniformly distributed random variable U, and determining the next step as shown in Figure 19, u standing for the simulated value of U. If the value of U is that shown in Figure 19, tree BXi8 Xi will die. If a new tree is born, its location is chosen randomly within the square. After each step (birth or death), a new configuration in the surrounding E* and new probabilities q; will be calculated. The process is simulated through suffi ciently many steps in order to reach the state of equilibrium. If the most extreme potential functions are excluded, 1 000— 10 000 steps are enough the number of trees being about 50—100. Gumbinger (1983) presents a FORTRAN program for the simulation of the birth-and Figure 19. Tree will die. death process. This program has served as a basis for the program used in this work. Figure 20 shows some examples of the realizations of Gibbs processes given in Tables 17—19. The simulations are based on the above-mentioned choice of birth-and death rates. Finally, we would like to point out that, if the growing stock of the relevant area is unknown, it can be estimated, in the case of young or middle aged stand, by means of the distributions given in Tables 3, 4, 8 and 10—12. In seedling stands, the distributions with the mean diameter class 1, basal area class 1 and intensity class 3, might serve as estimates of the distributions of the spatial pattern types, and in mature stands the distributions with the mean diameter class 3, basal area class 3 and intensity class 1. b) Pr(X = Xi) = = q,,i = 1,..., n +l, S a, i=i where «j = exp(E(xj,/j + n')), j = a n +j= 1. 56 Tomppo, E. Figure 20. Examples of spatial patterns of trees simulated with models given in Tables 17—19. The size of each plot is 20 X 20 m2. a) a repulsive pine-dominant stand, the mean diameter class 2, number of trees in configuration is 64; b) repulsive spruce-dominant stand, the mean diameter class 2, 67 trees, c) clustered spruce-dominant stand, the mean diameter class 2, 88 trees; d) repulsive spruce-dominant stand, the mean diameter class 3, 32 trees, e) clustered birch-dominant stand, the mean diameter class 1, 161 trees; f) clustered pine-dominant stand, the mean diameter class 1, 93 trees. 57 Commun. Inst. For. Fenn. 138 8. SUMMARY The objective of the study The purpose of this study is to present certain methods of modern spatial statistics which can be used to solve specific forestry problems. The known methods are further developed. The objective is to introduce statistical models and methods for the analysis of the relative locations of trees (or other point configurations), called a spatial pattern of trees. The models are estimated for stands representing mineral-soil growth forests in the southern half of Finland. Further, the prediction of the type of the spatial pattern in terms of 'conventional' forest variates is discussed. The idea is that one can predict and simulate the locations of trees with the obtained results if these conventional forest variates are known. The Takacs-Fiksel estimation procedure is studied for this purpose; see Takacs (1983) and Fiksel (1984). The method is developed further for the estimation of the parameters of a model based on field observations. The spatial pattern of trees has an important role in many fields of forestry. It affects for instance 1) the sampling design of a forest inventory, 2) the growth possi bilities of an individual tree and thus the timber production of the whole stand, 3) the need of silvicultural treatments in a seedling or sapling stand, and 4) the need of thinning in a young or middle-aged stand. The spatial pattern is implicitly observed in forest management, for example in treatments of seedling or sapling stands and in thinning operations. Nevertheless, spatial patterns of trees have seldom been presented in an analytic form in forest research, even more rarely in forest management. Spatial analysis has, however, been developing already for a few decades. Professor Matern began his pioneering work at developing the theory and methods of spatial pattern analysis in the 1940'5. A well-known reference book is Matern (1960). Models A suitable subclass of the so-called spatial point processes on the Euclidean plane is used as a model for the spatial pattern of trees. Spatial point processes on the plane can be identified with probability measures on integer-valued Radon measures on the plane (i.e., with locally finite measures). The simplest model for a spatial pattern is the Poisson process on the plane. In this model, the locations of the trees are uniformly and independently distributed over the relevant area. There is no interaction between the locations of the trees. Poisson processes can also be used to build up other processes. A process called pairwise interaction process has been chosen as the model in this study. This is a special case of what are called Gibbs processes. Gibbs processes can be characterized in terms of their Radon- Nikodym derivative f taken with respect to some weight process, such as a Poisson process. For each point configuration fi, f(/u) measures how much more likely the con figuration n is for that process than for a Poisson process. In the case of Gibbs processes, this derivative can be written as a product of interaction functions g. The function = —ln gis called a potential function of the process. In the present study, 4> has been chosen to be a step function. The arguments for the choice of a Gibbs process for the model are: 1) The model is suitable also when the interaction between trees can vary from repulsion (even hard core) to attraction. 2) It is possible to estimate the parameters of the models with available data in a reasonable length of time. 3) It is easy to simulate the locations of trees with the estimated models. 4) It is not necessary to know the nature of the interaction beforehand. 5) If no biological interpretation of the form of the potential function can be given, the estimated poten tial function can in any case be regarded as a second-order data summary. 58 Tomppo, E. Methods The analysis methods of point con figurations can be divided into those based on unmapped and those based on mapped data. The former ones are also called 'field methods'. In the case of mapped data, the coordinates of the trees in some sampling window are known. Quadrat and distance samplings can be applied with both methods. (In this study, however, distance methods are mainly considered.) The classical prob lem involved in the analysis of point config urations is testing the spatial randomness of the locations of individuals against regular and clustered alternatives. Both methods, with some limitations, are applicable to hypothesis testing. The estimation of the parameters of models has for a long time been held to be quite a difficult task. However, in recent years some progress has been made in this respect. So far the methods developed have been based on mapped data. In this study we present a method for estimating the parameters of a Gibbs process from field measurements. The estimation of the parameters of Gibbs processes P is based on reduced Palm distributions of the process. A suitable function u of the point configuration ju is chosen for the estimation procedure. After experiments with certain functions, u has been constructed here from transformed Ripley's second-order parameter K. For stationary isotropic processes the integral of u with respect to Pjj appears to be equal to the integral of u with respect to P if P is multiplied with exp(—E(o (/u)), where E is, broadly speaking, the sum of potential functions. The parameters of the potential function are chosen such that the integrals are as close as possible. For estimating parameters of models from field measurements, we have used the distances from randomly chosen trees and points to the nearest tree and the numbers of trees in circles around random points, radii up to the interaction radius (2.25 m— -4.5 m). The function u above has been chosen as the distance from an arbitrary point to the nearest tree. Data The grid of the permanent growth sample plots of the National Forest Inventory, called INKA sample plots, founded by the Finnish Forest Research Institute, Depart ment of Forest Inventory and Yield, is used as the mapped stand data. Some variates obtained from the 7th National Forest Inventory are also applied. The used sample plots, totalling 1309, are located in the southern half of Finland, in the areas of forestry board districts I—ls. A typical feature of these plots is that the coordinates of the trees are known, which is a necessary condition for the estimation method pre sented in Section 432. Results The sample plots are first divided into three classes by using Ripley's second-order summary and the pointwise confidence limits of the corresponding parameter. The classes are called repulsive or regular (re), Poisson (Po) and clustered (cl) forests. Thus, a new variate, called the spatial pattern type, is obtained for each sample plot. The proportions of different pattern types, with the limits of 0.95, are re 57 %, Po 25 % and cl 18 %. The differences between inventory districts are not large. However, because the number of plots is large, some differences are statistically sig nificant. There are some differences between tree species in the distributions of spatial pattern types. Among pine-dominant plots, repul sive forests occur slightly more frequently than among spruce-dominant plots (61 %/ 52 %), the percentages of clustered forests among pine- and spruce-dominant plots be ing 22 % and 30 %, respectively. Thus, the proportions of Poisson forests are about equal. The birch-dominant plots differ substantially from the previous ones. Broadly speaking, birch-dominant plots seem to be either repulsive or clustered. The spatial pattern of trees depends also on other stand variates. The dependence of the distributions of the spatial pattern type on some conventional forest variates is studied, and at the same time the best covariates are searched. The logit model with multinomial response is used as the model for these conditional distributions. Table 9 shows the conditional distributions estimated from the whole data and Tables 13—15 show those estimated by the domi 59 Commun. Inst. For. Fenn. 138 nant tree species. In order to estimate the parameters of Gibbs processes (here a multiscale pairwise interaction process), we divide the sample plots into homogeneous groups where the spatial pattern can be taken to have been generated by the same process. The variates used in the classification are the dominant tree species, mean diameter of trees, and spatial pattern type. Thus nine potential models are obtained for each tree species (if Poisson processes are included). The chosen width of the step is 25 cm in all classes, but in repulsive stands with the mean diameter exceeding 20 cm the width is 50 cm. To find the correct interaction radius, we estimate the models with different lengths of radius. The resulting model is simulated, and Ripley's second-order summary is computed and compared to the corresponding sum mary computed from the data. The esti mates of the parameters of potential func tions for different dominant tree species are presented in the Tables 17—19. A typical feature of the estimated potential functions seems to be that both repulsion and attraction occur in the corresponding pattern. The models for pine are estimated also by means of the nearest neighbour distances. It turns out that, with the sample size of about 60 —100 random points and trees, the parameters can be estimated reliably enough. The method seems to work well especially in regular stands with a mean diameter below 20 cm. The simulation of locations of trees in a forest area is discussed in the last chapter of the study. The previous results can be applied if the dominant tree species, basal area of trees, mean diameter of trees and number of trees per hectare (or some of these) for each stand are known. Discussion This study deals with the modelling of spatial patterns by means of one possible model family. The choices of the model and estimation method are affected among other things by the available data, a large number of relatively small sample plots. The used method and model may give the best result with these data if the computation time is not allowed to increase arbitrarily. If we had a few relatively large sample plots, the methods discussed in Section 431 might be noteworthy alternatives. With these methods, some other models are also possible, and might, in the case of clustered forests, be even more realistic than the used one. In this study, an unmarked point process has been applied as the model. This can be argued for because the data consist of sample plots with relatively homogeneous tree stands. In uneven-aged forests or in forests with several tree species, a marked point process might be more realistic. The simulation of the locations of trees with the obtained results is straightforward. The development of methods which would utilize the reults in solving essential forestry problems is a research object of its own. Some suggestions are discussed above. 60 Tomppo, E. REFERENCES Aaltonen, V. T. 1923. Über die räumliche Ordnung der Pflanzen auf dem Felde und im Walde. Acta Forestalia Fennica 25. 85 p. 1925 a. Metsikön itseharventumisesta ja puiden kasvutilasta luonnonmetsissä. Referat: Über die Selbstabscheidung und den Wuchsraum der Bäume in Naturständen. Communicationes ex Instituti Quaestionum Forestalium Finlandiae 9(5): I—2o. 1925 b. Allgemeines iiber die Einwirkung der Bäume auf einander. Acta Forestalia Fennica 29. 19 p. Bartlett, M. S. 1964. The spectral analysis of two dimensional point process. Biometrika 51: 299 311. 1975. The Statistical Analysis of Spatial Pattern. Chapman and Hall, London. Besag, J. 1972. Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society B 34: 75—83. 1977. Discussion on Ripley's paper. In: Ripley, B. D. Modelling Spatial Patterns. Journar of the Royal Statistical Society B 39: 193—195. 1978. Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Institute 47(2): 77—92. & Gleaves, J. T. 1973. On the detection of spatial pattern in plant communities. Bulletin of the International Statistical Institute 45(1): 153—158. , Milne, R. & Zachry, S. 1982. Point process limits of lattice processes. Journal of Applied Probability 19: 210—216. Birch, M. B. 1963. Maximum likelihood in three way contingency tables. Journal of the Royal Statistical Society B 25: 220—233. Brown, S. & Holgate, P. 1974. The thinned plantation. Biometrika 61: 253—261. Byth, K. & Ripley, B. D. 1980. On sampling spatial patterns by distance methods. Biometrics 36: 279 — 284. Cormack, R. M. 1979. Spatial aspects of competition between individuals. In: Cormack, R. M. & Ord, J. K. (eds.) Spatial and Temporal Analysis in Ecology. Fairland: International Co-operative Publishing House, p. 151 —211. Cox, F. 1971. Dichtebestimmung und Strukturanalyse von Pflanzenpopulationen mit Hilfe von Abstands messungen. Mitteilungen der Bundesforschungs anstalt. Reinbek bei Hamburg. 87: I—lB4. Diggle, P. J. 1975. Robust density estimation using distance methods. Biometrika 62: 39 —48. 1977. A note on robust density estimation for spatial point patterns. Biometrika 64: 91—95. 1979 a. On parameter estimation and goodness-of-fit testing for spatial point patterns. Biometrics 35: 87—101. 1979 b. Statistical methods for spatial point patterns in Ecology. In: Cormack, R. M. & Ord, J. K. (eds.) Spatial and Temporal Analysis in Ecology. Fairland: International Co-operative Publishing House, p. 95—150. 1983. Statistical Analysis of Spatial Point Patterns. Mathematics in Biology. Academic Press. London, New York. , Besag, J. & Gleaves, J. T. 1976. Statistical analysis of spatial point patterns by means of distance methods. Biometrics 32: 659—667. Durbin, J. 1971. Boundary-crossing probabilities for the Brownian motion and Poisson processes and techniques for computing the power of the Kolmogorov-Smirnov test. Journal of Applied Probability 8: 431 —453. Fiksel, T. 1984. Estimation of Parametrized Pair Potentials of Marked and Non-marked Gibbsian Point Processes. Elektron. Informationsverarb. Kybernet. Journal of Information Processing and Cybernetics-EIK 20: 285 —294. Fisher, R. A., Thornton, H. G. & Mackenzie, W. A. 1922. The accuracy of the plating method of estimating the density of bacterial populations, with particular reference to the use of Thornton's agar medium with soil samples. Annals of Applied Botany 9: 325—359. Gates, D. J. & Westcott, M. 1985. Clustering estimates for spatial point distributions with unstable poten tials. CSIRO Division of Mathematics and Statis tics, Canberra, ACT 2601. Glötzl, E. 1980. Lokale Energien und Potentiale fiir Punktprozesse. Mathematische Nachrichten 96: 195—206. & Rauchenschwandter, B. 1981. On the Statistics of Gibbsian Processes. In: The First Pannonian Symposium on Mathematical Statistics. Lecture Notes in Statistics 8. Springer. Greig-Smith, P. 1952. The use of random and contiguous quadrats in the study of the structure of plant communities. Annals of Botany 16: 293—316. Gumpinger, J. 1983. Ein Schätzer fiir die Chemische Aktivitet und fiir das Paarpotential eines Gibbs'schen Makrozustandes. Diplomarbeit. Jo hannes Kepler Universitet, Linz. Gustavsen, H. G. 1985. Valtakunnan metsien inven tointiin liittyvien pysyvien kasvukoealojen (INKA) kuvaus 1976—1984. The Finnish Forest Research Institute, Department of Forest Inventory and Yield. Holgate, P. 1965. Some new tests of randomness. Journal of Ecology 53: 261—266. Hopkins, B. 1954. A new method of determining the type of distribution of plant individuals. Annals of Botany 18: 213—227. Kallenberg, O. 1974. Lectures on random measures. University of North Carolina. Kendall, M. G. & Stuart, A. 1967 —1969. Advanced Theory of Statistics, vols. I—3. Charles Griffin & Co., London. Kilkki, P. 1983. Metsän inventointi osana metsätieto järjestelmää. Uudet menetelmät metsän inventoin nissa. Suomen Akatemian seminaari Hyytiälässä 21.—22.11.1983. p. I—4. , Pohjola, T. & Pohtila, E. 1985. Puiden ryhmittäi syyden huomioonottaminen harvennusmalleissa. Summary: Use of the spatial distributions of trees in thinning models. Silva Fennica 19(2): 137—143. Kuusela, K. & Salminen, S. 1969. The sth National Forest Inventory in Finland. Communicationes Instituti Forestalis Fenniae 69(4). 72 p. & Salminen, S. 1980. Ahavenanmaan maakunnan ja 61 Commun. Inst. For. Fenn. 138 maan yhdeksän eteläisimmän piirimetsälautakunnan alueen metsävarat 1977—1979. Summary: Forest resources in the province of Ahvenanmaa and the nine southernmost Forestry Board Districts in Finland 1977—1979. Folia Forestalia 446. 90 p. & Salminen, S. 1983. Metsävarat Etelä-Suomen kuuden pohjoisimman piirimetsälautakunnan alueel la 1979—1982 sekä koko Etelä-Suomessa 1977 1982. Forest resources in the six northernmost Forestry Board Districts of South Finland, 1979 1982, and in the whole of South Finland, 1977 1982. Folia Forestalia 568. 79 p. Loetsch, F. & Haller, K. E. 1964. Forest Inventory I. BLV Verlagsgesellschaft, Munchen. , Zöhrer, F. & Haller, K. E. 1973. Forest Inventory 11. BLV Verlagsgesellschaft, Miinchen. Lönnroth, E. 1925. Untersuchungen iiber die innere Struktur und Entwicklung gleichaltriger natur normaler Kiefernbestände basiert auf Material aus der Siidhälfe Finnlands. Acta Forestalia Fennica 30. 269 p. Matern, B. 1960. Spatial Variation. Meddelanden frän statens Skogsforskningsinstitut 49(5): 1—144. 1971. Doubly stochastic Poisson processes in the plane. In: Patil, G. P., Pielou, E. C. & Waters, W. E. (eds.) Statistical Ecology 1. University Park: Pennsylvania State University Press, p. 195—213. 1972. The precision of basal area estimates. Forest Science 18(2): 123—125. McCullagh, P. & Nelder, J. A. 1983. Generalized Linear Models. Chapman and Hall, London, New York. Mecke, J. 1967. Stationäre zufällige Masse auf lokal kompakten Abelschen Gruppen. Zeitschrift fur Wahrscheinlichkeitstheorie und verwandte Gebiete 9: 36—58. Moore, P. G. 1954. Spacing in plant populations. Ecology 35: 222—227. Neveu, J. 1965. Mathematical Foundations of the Calculus of Probability. Holden-Day, Inc., San Francisco, London, Amsterdam. Ngyuen, X. X. & Zessin, H. 1979. Integral and Differential Characterizations of the Gibbs Process. Mathematische Nachrichten 88: 105—115. Novikov, A. A. 1981. A martingale approach to first passage problems and a new condition for Wald's identity. In: Arato, M., Vermes, D. & Balakrishnan, A. V. (eds.) Lecture Notes in Control and Information Sciences. Stochastic Differential Systems. Springer. Nyyssönen, A. 1950. Vertailevia havaintoja luonnon tilaisten männiköiden rakenteesta ja kehityksestä. Summary: Comparative observations on the struc ture and development of tended and natural pine stands. Silva Fennica. 68. 48 p. , Kilkki, P. & Mikkola, E. 1967. On the precision of some methods of forest inventory. Acta Forestalia Fennica 81(4). 60 p. , Roiko-Jokela, P. & Kilkki, P. 1971. Studies on improvment on the efficiency of systematic sampling in forest inventory. Seloste: Systemaatti seen otantaan perustuvan metsän inventoinnin tehokkuudesta. Acta Forestalia Fennica 116. 26 p. Oderwald, R. G. 1981. Comparison of Point and Plot Sampling Basal Area estimators. Forest Science 27(1): 42—48. Ogata, Y. & Tanemura, M. 1984. Likelihood Analysis of Spatial Point Patterns. Journal of the Royal Statistical Society B 46: 496 —518. Palmgren, J. 1981. The Fisher information matrix for log linear models arguing conditionally on observed explanatory variables. Biometrika 68(2): 563—566. Penttinen, A. 1984. Modelling interactions in spatial point patterns: parameter estimation by maximum likelihood method. Jyväskylä Studies in Computer Science, Economics and Statistics 7. Jyväskylä. (Dissertation). Persson, O. 1964. Distance methods. Studia Forestalia Suecica 15: I—6B. Pielou, E. C. 1977. Mathematical Ecology. Wiley, New York. Pohtila, E. 1980. Havaintoja taimikoiden ja nuorten metsien tilajärjestyksen kehityksestä Lapissa. Sum mary: Spatial distribution development in young tree stands in Lapland. Communicationes Instituti Forestalis Fenniae 98(1). 35 p. Preston, C. 1975. Spatial birth-and-death processes. Bulletin of the International Statistical Institute 46(2): 371—391. Ranneby, B. 1981. Provytestorlekens betyldese vid skogsinventering. Projekt Nytax 83, Rapport 4. Sveriges Lantbruksuniversitet, Umeä. Ripley, B. D. 1976. The second-order analysis of stationary point processes. Journal of Applied Probability 13: 255—266. 1977. Modelling spatial patterns (with discussion). Journal of the Royal Statistical Society B 39: 172 212. 1979 a. Simulating spatial patterns: Dependent samples from a multivariate density. Applied Statistics 28: 109—112. 1979 b. Tests of 'randomness' for spatial point patterns. Journal of the Royal Statistical Society B 41: 368—374. 1981. Spatial Statistics. Wiley, New York- Chichester—Brisbane—T oronto. 1982. Edge effects in spatial stochastic processes. In: Ranneby, B. (ed.) Statistics in Theory and Practice. Essays in Honour of Bertil Matern. Swedish University of Agricultural Sciences, Umeä. p. 242 262. & Kelly, F. P. 1977. Markov Point Processes. Journal of the London Mathematical Society 15: 188—192. Roiko-Jokela, P. 1975. Inventoinnin pysyvät kasvukoe alat. The Finnish Forest Research Institute, De partment of Forest Inventory and Yield. Saunders, R. & Funk, G. M. 1977. Poisson limits for a clustering model of Strauss. Journal of Applied Probability 14: 776—784. , Kryscio, J. & Funk, M. 1982. Poisson limits for a hard-core clustering model. Stochastic Processes and their Applications 12: 97 —106. Strauss, D. J. 1975. A model for clustering. Biometrika 63: 467—475. Takacs, R. 1983. Estimator for the pair-potential of a Gibbsian point process. Johannes Kepler Univer sitet, Linz, Institut fiir Mathematik, Institutsbericht Nr. 238. Vuokila, Y. 1983. Suomalaisen puuntuotostutkimuk sen menneisyys ja tulevaisuus. Metsäntutkimus laitoksen tiedonantoja 89: 54 —56. Warren, W. G. 1971. The centre satellite concept as a basis for ecological sampling. In: Patil, G. P., Pielou, E. C. & Waters, W. E. (eds.). Statistical Ecology 2. University Park: Pennsylvania State University Press, p. 87 —116. Total of 77 references 62 Tomppo, E. SELOSTE Malleja ja menetelmiä metsikön puiden tilajärjestyksen analysoimiseksi Tutkimuksen tarkoitus Tutkimuksen tavoite on esittää matemaattisia malle ja ja tilastollisia menetelmiä puiden keskinäisen sijainnin eli puiden tilajärjestyksen analysoimiseksi. Mallit esti moidaan Suomen eteläpuoliskon kivennäismaiden kas vatusmetsille. Tutkimuksessa esitetään myös menetel mä mallien parametrien estimoimiseksi kenttähavain noista lähtien. Lisäksi tarkastellaan tilajärjestystyyppien jakauman estimointia, kun joukko 'tavanomaisia' met sikkömuuttujia tunnetaan. Jos näiden tavanomaisten metsikkömuuttujien arvot tunnetaan, saatuja malleja ja tuloksia käyttäen on mahdollista simuloida puiden si jainti metsikössä tai metsiköistä muodostuvassa metsä alueessa. Puiden tilajärjestyksellä on merkitystä muun muassa metsän inventoinnin suunnittelussa, kasvumallien laa dinnassa sekä ratkaistaessa ongelmia, jotka liittyvät metsän uudistamiseen, tuhonkestävyyteen ja harven nukseen sekä vertailtaessa eri puunkorjuumenetelmien tehokkuuksia. Tilajärjestyksen tilastollisia malleja Puiden keskinäistä sijaintia vaakatasossa kuvataan ta son spatiaalisilla pisteprosesseilla. Nämä voidaan sa maistaa tason kokonaislukuarvoisilla Radon-mitoilla (lokaalisti äärellisillä mitoilla) määriteltyjen todennäköi syysmittojen kanssa. Yksinkertaisin malli on tason Poisson-prosessi. Tässä mallissa puiden paikat jakautu vat metsikön pinta-alalle toisistaan riippumatta tasa jakauman mukaan; puilla ei siis ole vaikutusta toistensa sijainteihin. Poisson-prosessia käytetään usein perustana monimutkaisemmille malleille. Tässä tutkimuksessa malliksi on valittu pareittaisten vuorovaikutusten prosessi. Se on erikoistapaus Gibbsin prosesseista, jotka voidaan määritellä jonkin painopro sessin (esimerkiksi Poisson-prosessin) suhteen muodos tetun Radon-Nikodym -derivaatan avulla. Prosessin oleellinen parametri on vuorovaikutusfunktio g (tai tä män logaritmin vastaluku, potentiaalifunktio ). Perus telut kyseisen prosessin valinnalle malliksi ovat: 1) Mal lia voidaan käyttää, vaikka vuorovaikutus samassa met sikössä etäisyydestä riippuen vaihtelee repulsiosta attraktioon. 2) Mallin parametrit on mahdollista esti moida käytettävissä olevasta aineistosta kohtuullisessa laskenta-ajassa. 3) Saatujen mallien avulla voidaan simu loida puiden sijainti. 4) Vuorovaikutuksen laatua ei tar vitse tuntea etukäteen. 5) Mallin parametrien estimaatit voidaan tulkita toisen kertaluvun tunnusluvuiksi, vaik ka niille ei voitaisikaan löytää biologista tulkintaa. Pistekuvioiden analyysimenetelmiä Pistekuvioiden analyysimenetelmät jaetaan kartoit tamattoman ja kartoitetun aineiston menetelmiin. Jäl kimmäisissä täytyy tuntea puiden koordinaatit jossakin otosikkunassa. Molemmista on olemassa lukumäärämit tauksiin ja etäisyysmittauksiin perustuvia menetelmiä. Tässä tutkimuksessa tarkastellaan pääasiassa etäisyys menetelmiä. Klassinen pistekuvioihin liittyvä ongelma on tilajakauman satunnaisuuden testaus, vaihtoehtoina säännöllinen tai ryhmittäinen jakauma. Sekä etäisyys että lukumäärämenetelmiä on käytetty hypoteesien tes taamiseen. Parametrien estimointi useiden spatiaalisten pisteprosessien tapauksessa ei ole mahdollista suoravii vaisella suurimman uskottavuuden menetelmällä, mistä syystä on kehitetty erilaisia tätä approksimoivia esti mointimenetelmiä; ks. Penttinen (1984). Nämä edellyt tävät usein interaktiivista estimointia, melko suurta laskenta-aikaa sekä riittävän suurta koealaa. Jos käytettävissä on useita pienehköjä koealoja, käyt tökelpoinen tapa estimoida Gibbsin prosessin para metrit on Takacsin (1983) ja Fikselin (1984) esittämä menetelmä, joka perustuu prosessin P Palmin pistepro sesseihin Pj,. Estimointia varten valitaan puiden muo dostaman pistekuvion n funktio u(yu) (tässä konstruoitu transformoidusta Ripleyn toisen kertaluvun parametris ta TT). Funktion u intergraali suhteen on sama kuin u • exp(—Eo(x,/z)):n intergraali P:n suhteen, missä Eo(x,/u) on pisteen x suhteen laskettujen poten tiaalifunktioiden summa ja 6 tuntematon parametrivek tori. Parametrien arvot valitaan siten, että integraalien estimaatit yhtyvät mahdollisimman tarkoin. Menetel mää sovelletaan oheisessa tutkimuksessa. Lisäksi funk tion u valintaan kiinnitetään erityistä huomiota. Alku peräinen menetelmä edellyttää kartoitettua aineistoa. Tässä se yleistetään etäisyysmittauksiin perustuville kenttähavainnoille, joiden kerääminen on nopeampaa ja halvempaa kuin puukartan laatiminen. Tutkimusaineisto Tutkimusaineistona käytetään Metsäntutkimuslai toksen metsänarvioimisen tutkimusosaston valtakunnan metsien inventoinnin koealoille parustamaa pysyvien kasvukoealojen verkkoa, n.s. INKA-koealoja. Tähän tutkimukseen valitut koealat sijaitsevat Suomen etelä puoliskossa, piirimetsälautakuntien I—ls alueella, pää asiassa kivennäismailla. INKA-koealojen puiden koor dinaatit tunnetaan, joten koealoja voidaan soveltaa myös kartoitettua aineistoa edellyttäviin menetelmiin. Koealat ovat otos valtakunnan metsien 7. inventoinnin koealoista, joten niihin voidaan yhdistää myös tämän inventoinnin tietoja, kuten oheisessa tutkimuksessa on tehty. 63 Commun. Inst. For. Fenn. 138 Tulokset Tutkimuksessa käytetty metsikköjoukko jaetaan Ripleyn toisen kertaluvun tunnusta K käyttäen kol meen luokkaan puiden tilajärjestyksen perusteella: repulsiivisiin (eli säännöllisiin) (re), Poisson-metsiin (Po) ja klusteroituneisiin metsiin (kl). Jos sovelletaan para metrin K 95 % pisteittäisiä luottamusvöitä, eri luokkien osuudet ovat (re) 57 %, (Po) 25 % ja (kl) 18 %. Vastaa vat luvut 99 %:n pisteittäisillä luottamusvöillä ovat 40 %, 48 % ja 12 %. Inventointialueiden välillä ei ollut jakaumissa suuria eroja. (Erot ovat kuitenkin tilastolli sesti merkitseviä suuren koealamäärän vuoksi.) Suhteel lisesti eniten klusteroituneita ja samalla vähiten repul siivisia metsiä on inventointialueilla I ja IV (mittaus vuodet 1978 ja 1981); ks. kuva 11. Tilajärjestystyypin jakaumissa on puulajien välillä jonkin verran eroja. Mäntyvaltaisten koealojen joukossa on tilajärjestykseltään säännöllisiä koealoja jonkin ver ran enemmän kuin kuusivaltaisten joukossa (61 %/ 52 %) ja Poisson-metsiä vähemmän kuin kuusivaltaisilla koealoilla (22 %/30 %). Koivuvaltaiset metsät poikkea vat edellisistä selvästi. Karkeasti ilmaistuna koivukoe alat ovat joko repulsiivisia tai klusteroituneita. Tilajärjestystyypin ehdollinen jakautuminen edellä mainittuihin kolmeen luokkaan ehdolla, että tunnetaan joitakin tavanomaisia metsikkötunnuksia, estimoidaan moniulotteista logistista regressioanalyysiä käyttäen. Samalla etsitään ne muuttujat, jotka parhaiten ennusta vat kyseiset ehdolliset jakautumat. Parhaan kolmen muuttujan kombinaation muodostavat metsikön pohja pinta-ala, keskiläpimitta ja runkoluku. Taulukossa 9 on esitetty ehdolliset jakaumat estimoituna koko aineistos ta ja taulukoissa 13—15 pääpuulajeittain männylle, kuu selle ja koivulle. Jakaumista havaitaan seuraavaa: 1) Keskiläpimitan suureneminen ei lisää säännöllisten metsien osuutta. 2) Suuret pohjapinta-alan arvot ja suuri säännöllisen tila järjestyksen todennäköisyys vastaavat toisiaan, toisin sanoen säännöllisissä metsissä on keskimäärin suurempi pohjapinta-alan arvo (ja suurempi runkotilavuus) kuin vastaavissa epäsäännöllisissä metsissä. 3) Nuoret (keski läpimitta D < 10 cm) kuusivaltaiset metsät ovat vastaa via mäntyvaltaisia säännöilisempiä. Keskiläpimitan olles sa 10 cm:n ja 20 cm:n välillä mäntyvaltaiset koealat ovat keskimäärin säännöilisempiä. Tätä suuremmilla läpimi tan arvoilla tilajärjestystyypin jakaumissa ei ole suuria eroja. 4) Koivun tilajärjestysjakaumat poikkeavat edelli sistä huomattavasti, kun D < 20 cm. Kun D < 10 cm, Poisson-metsän esiintymistodennäköisyys on pieni ja kun 10 cm < D < 20 cm, säännöllisiä metsiä esiintyy huomattavasti vähemmän kuin männyllä ja kuusella. Gibbsin prosessien parametrien estimoimiseksi koe alat ryhmitellään luokkiin, joissa tilajärjestyksen voi daan olettaa olevan saman prosessin generoima. Luokit telussa käytetyt muuttujat ovat puulaji, keskiläpimitta ja tilajärjestystyyppi. Malliksi on valittu niin sanottu moniasteinen pareittaisten vuorovaikutusten prosessi, jonka potentiaalifunktio on porrasfunktio. Portaan le veydeksi on valittu kokeilujen perusteella, 25 cm, paitsi repulsiivisissa metsissä läpimittaluokassa D > 20 cm, 50 cm. Vuorovaikutussäteen pituus vaihtelee keski läpimitasta ja tilajärjestystyypistä riippuen 2.25 m:stä 4.5 m:iin. Mallien parametrien estimaatit on esitetty puulajeittain taulukoissa 17—19. Tyypillinen piirre estimoiduille malleille näyttää olevan, että puiden kes kinäisestä etäisyydestä riippuen sekä repulsiota että attraktiota esiintyy samassa metsikössä. Mallit on estimoitu männylle käyttäen myös lähim män naapurin etäisyyksiä. Jokaiselta koealalta poimi taan kolme satunnaisesti valittua pistettä ja kolme satunnaisesti valittua puuta (käyttäen sopivaa reuna vyöhykettä). Sekä satunnaispisteistä että satunnaispuis ta mitataan etäisyydet lähimpään puuhun. Näistä otok sista poimitaan edelleen osaotokset, joista estimoidaan mallien parametrit. Tarvittava otoskoko osoittautuu olevan noin 60—100 satunnaispistettä ja satunnaispuu ta. Erityisesti säännöllisissä metsissä keskiläpimitan ol lessa alle 20 cm menetelmä näyttää toimivan hyvin. Tutkimuksen viimeisessä luvussa esitetään menetelmä puiden keskinäisen sijainnin simuloimiseksi metsiköistä muodostuvassa metsäalueessa käyttäen edellä esitettyjä tuloksia. Menetelmä edellyttää, että metsiköistä tunne taan joitakin tavanomaisia metsikkötunnuksia. Tulosten tarkastelua Tutkimuksessa on tarkasteltu tilajärjestyksen mallit tamista yhden potentiaalisen malliperheen avulla. Mallin ja estimointimenetelmän valintaan on vaikuttanut yh tenä tekijänä käytettävissä oleva aineisto: suuri määrä suhteellisen pienikokoisia koealoja. Menetelmä ja malli antanevat tässä tilanteessa parhaan tuloksen, mikäli las kenta-ajan ei haluta nousevan kohtuuttomaksi. Jos käy tössä olisi muutama suurehko koeala, luvussa 431 mai nitut approksimatiiviset menetelmät olisivat varteen otettavia vaihtoehtoja. Näiden yhteydessä myös jokin toinen malli saattaisi tulla kysymykseen. Tutkimuksessa on rajoituttu malleihin, joissa puita ei laadullisesti eroteta toisistaan. Tämä on perusteltua so velletulla aineistolla, joka muodostuu verrattain homo geenisista koealoista. Sekametsissä ja eri-ikäisissä met sissä todenmukaisempi malli saattaisi olla merkattu pis teprosessi. TOMPPO, E. 1986. Models and methods for analysing spatial patterns of trees. Seloste: Malleja ja menetelmiä puiden tilajärjestyksen analysoimisek si. Communicationes Instituti Forestalis Fenniae 138. 65 p. 64 Tomppo, E. APPENDIX LIST OF SYMBOLS Symbols appearing in the text only once are usually omitted here. (H, A) a field and a a-algebra of its subsets (R 2 , R 2) the Euclidean plane and its Borel a-algebra R + the set of the non-negative real numbers N 0 the set of the non-negative integers t usually, a real number x, y usually, points of the Euclidean plane d(x,y) the Euclidean distance of the points x and y d in Section 42, a test statistic of the spatial randomness ( ) an open interval of the real line a closed interval of the real line ( ] and [ ) semi-open intervals of the real line 1 A the indicator function of the set A (flj X fl2 , A i product field and the product a-algebra b(x,R) a circle centred at the point x, radius R B* a neighbourhood of the set B, defined by the Euclidean dis tance; in Chapter 7, a sur rounding of the set B P, £ probability measures, especial ly point processes on the plane v the Lebesgue measure on the plane v (x) P the product measure of the measures v and P A(A), AE.R 2 the intensity measure of a point process A(x) = A(x,co) in section 34, a stochastic process on the plane A 2 (A X B), A,BGR 2 the second-order intensity measure of a point process A(x) the intensity function of a point process A a constant intensity function X an estimate of a (constant) intensity function x >y) second-order intensity function of a point process p a constant intensity function; the packing intensity of a point process; in Section 432, an intensity measure of a point process Y(t) the covariance function of a point process T x the shift operator <5 X a tree located at the point x H a point configuration, a spatial pattern of trees; an integer valued Radon measure on the plane N the set of spatial patterns; the set of integer-valued Radon measures N a the restriction of N to the set A