Vol.:(0123456789) Theoretical and Applied Genetics (2025) 138:77 https://doi.org/10.1007/s00122-025-04860-9 ORIGINAL ARTICLE Genetic architecture and genomic prediction for yield, winter damage, and digestibility traits in timothy (Phleum pratense L.) using genotyping‑by‑sequencing data N. Vargas Jurado1  · H. Kärkkäinen1 · D. Fischer1 · O. Bitz1 · O. Manninen2 · P. Pärssinen2 · M. Isolahti1 · I. Strandén1 · E. A. Mäntysaari1 Received: 9 August 2024 / Accepted: 18 February 2025 © The Author(s) 2025 Abstract Key message Accurate prediction of genomic breeding values for Timothy was possible using genomic best linear unbiased prediction. Abstract Timothy (Phleum pratense L.) is a grass species of great importance for Finnish agricultural production systems. Genotyping-by-sequencing along with genomic prediction methods offer the possibility to develop breeding materials effi- ciently. In addition, knowledge about the relationships among traits may be used to increase rates of genetic gain. Still, the quality of the genotypes and the validation population may affect the accuracy of predictions. The objectives of the study were (i) to estimate variance components for yield, winter damage and digestibility traits, and (ii) to assess the accuracy of genomic predictions. Variance components were estimated using genomic residual maximum likelihood where the genomic relationship matrix was scaled using a novel approach. Genomic breeding values were estimated using genomic best linear unbiased prediction in single- and multiple-trait settings, and for different marker filtering criteria. Estimates of heritability ranged from 0.13 ± 0.03 to 0.86 ± 0.05 for yield at first cut and organic matter digestibility at second cut, respectively. Genetic correlations ranged from −0.72 ± 0.12 to 0.59 ± 0.04 between yield at first cut and winter damage, and between digestibility at first and second cuts, respectively. Accuracy of prediction was not severely affected by the quality of genotyping. Using family cross-validation and single-trait models, predictive ability ranged from 0.18 to 0.62 for winter damage and digest- ibility at second cut, respectively. In addition, validation using forward prediction showed that estimated genomic breeding values were moderately accurate with little dispersion. Thus, genomic prediction constitutes a valuable tool for improving Timothy in Finland. Introduction Timothy (Phleum pratense L.) is an important perennial grass species widely used in cool and humid climates across the Nordic countries, central Europe, northeastern and northwestern North America, Russia, and Japan (Berg et al. 1996). Due to its high yield, tolerance to winter conditions in northern latitudes, and high nutritional value, Timothy is commonly used for feeding livestock, either alone or in combination with other grasses and legumes (Niemeläinen et al. 2004), and is offered fresh or preserved as hay or silage (Bélanger et al. 2001). In addition, in Finland, Timothy has been used as a cover crop to reduce bare soil conditions after the production of cash crops such as oats, barley, and peas (Peltonen-Sainio et al. 2022). As such, Timothy cultiva- tion plays an important economic and environmental role in Finnish production systems. With the intent of improving economically important traits such as yield, digestibility, winter survival, and disease resistance, selection strategies for Timothy have focused on traditional methods, such as progeny testing in multiple environments (Bélanger et al. 2001). Therefore, develop- ment of new varieties has been a lengthy and an expensive process. The use of genomic information combined with performance records in the framework of genomic selection Communicated by Hai-Chun Jing. * N. Vargas Jurado napoleon.vargas@luke.fi 1 Natural Resources Institute Finland, Luke, 31600 Jokioinen, Finland 2 Boreal Plant Breeding Ltd., 31600 Jokioinen, Finland Theoretical and Applied Genetics (2025) 138:77 77 Page 2 of 17 (GS) provides an alternative to increase the rate of genetic change and accuracy of selection (Meuwissen et al. 2001). Timothy is a hexaploid (2n = 6x = 42), outcrossing, wind pol- linated species (Cai and Bullen 1994; Tanhuanpää and Man- ninen 2012), and cultivars are usually open pollinated popu- lations. However, its genomic constitution (autohexaploid vs. allohexaploid) is still unclear (Cai and Bullen 1994; Cai et al. 2003), and its reference genome is not yet available. This creates additional challenges for genomic prediction methods (e.g., genome-wide association studies) that rely on a map of the genome. However, genotype-by-sequencing (GBS) methods provide an alternative to obtain genomic information in a cost-effective manner (Gorjanc et al. 2015) without the explicit need for a reference genome. Still, chal- lenges remain due to low sequencing depths or allelic biases which may affect predictions of genetic merit (Gerard et al. 2018; de Bem Oliveira et al. 2020). Knowledge about the genetic architecture of traits of eco- nomic importance in Timothy is limited. The heritability of several nutritive quality traits of Timothy clones, includ- ing organic matter digestibility and crude protein content, has been previously estimated (Ashikaga et al. 2008) for Japanese populations. However, for the local Finnish (or European) populations and for many other traits of eco- nomic importance such information is not yet available. Information about the genetic architecture of traits is not only important for selection decisions but estimates of vari- ance components are also necessary for genomic evaluation. Moreover, determining relationships among traits, whether favorable or unfavorable, is crucial for the development of selection indices (Cerón-Rojas and Crossa 2018). Indices that combine estimates of the genetic merit of several traits along with economic weights can be used to increase rates of genetic gain for sets of traits with favorable correlations while simultaneously limiting changes in traits with unfa- vorable or antagonistic correlations (Batista et al. 2021). Also, indices can be tailored for specific environments thus making efficient use of the genetic resources available (Louw 1990). Genomic best linear unbiased prediction models (GBLUP) using GBS data have been recently applied to Timothy (Kovi et al. 2019, 2021) and other grass species (Guo et al. 2018; Fiedler et al. 2018). Despite the success- ful application of genomic prediction, estimates of variance components or genomic breeding values (GEBV) from one population do not necessarily generalize to a different population. In addition, the accuracy of GEBV may depend on how the training and validation sets were selected (Kovi et al. 2021). Furthermore, when using GBS data in polyploid species, factors related to the quality of sequencing such as read depth and call rate, and bias due to overdispersion may affect the accuracy of predictions as the genomic relation- ship matrix may be impacted (Cericola et al. 2018; Guo et al. 2018). Therefore, validating GEBV resulting from genomic evaluations of a Finnish Timothy population where marker information is obtained from GBS data is of paramount importance. The objectives of the current study were to (i) estimate the narrow-sense heritability and genetic correla- tions for traits of economic importance in Finnish Timothy, and (ii) determine the accuracy of GEBV prediction by using cross-validation through a forward-prediction strategy. Materials and methods Sequencing and genotypes Variant data was obtained by LGC Genomics, Berlin, Germany, using double digest restriction associated DNA sequencing (ddRAD), a Genotyping-by-sequencing (GBS) method to reduce the complexity of the genome often used for non-model species with no prior or little genomic knowl- edge (Peterson et al. 2012). However, due to the lack of a published reference genome for Timothy, a “mock” refer- ence genome was constructed from the ddRAD data from six individuals that represented diverse genetic backgrounds. These six samples were also used to determine the most efficient combination of restriction enzymes as measured by the number of variants produced. As such, it was determined that a sequencing depth of 1.5 million read pairs per sam- ple was sufficient. The combination of restriction enzymes Pstl-ApeKl resulted in the highest number of identified vari- ants. After demultiplexing of the library groups, sequencing adapter remnants from all reads were clipped and reads with a final length < 20 bases were discarded. Following this step, a quality trimming was applied, where reads containing “N” bases were removed and reads were trimmed at the 3’-end to a minimum Phred quality score of 20 over a window of ten bases. The quality trimmed reads were then aligned against the cluster reference using Bowtie 2 (Langmead and Salz- berg 2012; version 2.2.3) and variants were called with Free- Bayes (Garrison and Marth 2012; version 1.0.2–16) where further filtering criteria were applied (Supplemental File). The resulting variant information was provided in the variant call format (VCF) text file. A total of n = 1764 mother clones were genotyped using the ddRAD method described above. Out of these, 107 mothers were genotyped multiple times (two to five) such that the repeated samples could be used to provide a measure of the consistency of the sequencing approach. Read depth information extracted from the associated VCF files included the number of reads for the reference ( RA ) and alternate ( RB ) alleles for each of the genotyped individuals. In addition, a single nucleotide polymorphism (SNP) marker was con- sidered missing if both RA and RB were zero. In contrast to SNP chip data, where discrete allele counts are available Theoretical and Applied Genetics (2025) 138:77 Page 3 of 17 77 (i.e., allele dosage), the current study used allele frequencies instead. Thus, genotypes were defined as (Guo et al. 2018): where fij was the estimated allele frequency of the reference allele for individual i (i = 1, …, n) and marker j (j = 1, …, m), and RAij and RBij were the read depths of the reference and alternate alleles, respectively. As defined in (1), fij were con- tinuous variables (or continuous genotypes) on the interval [0, 1]. The number of SNP markers (m) depended on the filtering criteria described below. Also, in cases where an individual was sequenced multiple times, their allele fre- quencies were averaged into one allele frequency. Due to the use of the ratio in (1), small or large values of the number of reads may lead to biased estimates of allele frequency and thus impact genomic predictions. As such, the impact of different filtering criteria on predictions was tested. In addition to minor allele frequency (MAF) ≥ 0.05, markers were filtered (removed) based on their average mini- mum and maximum number of total reads ( RT ), and four lev- els of call rate (100%, 90%, 80%, and 70%). The minimum and maximum average number of reads were set to 10 or 20 and 100 or 200, respectively. Both criteria affected the num- ber of markers available for later use in genomic prediction (Table 1). Before filtering and across all samples there were a total of 280 808 common markers. For this set of markers, call rate ranged from 100.0 to 2.6% with a mean (SD) of 72.4% (24.4%), and there were 222 063 markers (or 79.1% (1)fij = RAij RAij + RBij of all markers) with a call rate of at least 70%. Missing SNPs for an individual were imputed using the marker average. After filtering based on MAF, a minimum and maximum read depth of 10 and 200, respectively, and for a call rate of 70% the number of markers was m = 58,816. Averaged over markers, the mean total read depth was 37.88 (SD 33.09), ranging from 4.92 to 220.72, while the mean allele frequency was 0.77 (SD 0.20). The distribution of read depth and allele frequencies for the genotyped mother clones is presented in Fig. 1. In addition, the correlation between repeated samples for mother clones genotyped multiple times was, on average, 0.85 (SD 0.10). Genomic relationships and scaling factors The genomic relationship matrix (G; n × n ) was constructed as (Van Raden 2009): where F was the n by m matrix of genotypes, 퐟 = [ f 1,… , f m ]� was an m by 1 vector of mean allele frequencies, 1n was an n by 1 vector of ones, and h was a scaling factor. In the cur- rent work three scaling factors and their impact on G were studied (i) a naïve approach with a scaling factor derived from allele frequencies, (ii) the de-biasing strategy of Ceri- cola et al. (2018), and (iii) a novel approach based on the Beta-Binomial distribution. The naïve approach used a scal- ing factor defined as (Guo et al. 2018): where k was the ploidy of an individual (assumed to be 6 for Timothy) and f j was the mean allele frequency of marker j. Due to the low number of reads or bias in the sequencing reads, using (3) as a scaling factor may inflate the diagonal elements of G and subsequently the resulting GEBV. The de-biasing strategy by Cericola et al. (2018), on the other hand, addressed this inflation by assuming that the reads for a given allele followed a Binomial distribution: where i indexed over individuals and j indexed over markers, RAij was the number of reads for the reference allele, RTij = RAij + RBij was the total number of reads, and pj was the probability of sampling the reference allele, assumed to be constant. Under these assumptions, the mean and vari- ance of the sequencing reads were 피 [ RAij ] = RTijpj and (2) 퐆 = ( 퐅 − ퟏn퐟 � )( 퐅 − ퟏn퐟 � )� h (3)h = 1 k m∑ j=1 f j ( 1 − f j ) RAij ∼ Binomial ( RTij , pj ) Table 1 Number of markers according to different filtering criteria including, minimum and maximum average read depth (RD), and call rate (%) Minimum RD Maximum RD Call rate (%) Markers 20 100 100 2883 90 26,452 80 29,587 70 30,692 200 100 4525 90 29,353 80 32,488 70 33,593 10 100 100 2887 90 38,152 80 48,802 70 54,397 200 100 4529 90 41,053 80 51,703 70 57,298 Theoretical and Applied Genetics (2025) 138:77 77 Page 4 of 17 핍ar [ RAij ] = RTijpj ( 1 − pj ) such that the mean and variance of the resulting allele frequencies fij = RAij∕RTij were 피 [ fij ] = pj and 핍ar [ fij ] = pj ( 1 − pj ) ∕RTij , respectively. Using these moments and a Normal distribution approximation, Cericola et al. (2018) proposed to correct the diagonals of G by Gii = G̃ii ( 1 − wi ) , where G̃ii and Gii were the biased and the corrected ith diagonal element of G, respectively, wi = (k − 1)∕ ( RTi + k − 1 ) was an adjustment factor, and RTi was the average (total) read depth for an individual (averaged across markers). In the current study a novel approach, denoted as the Beta-Binomial model, was considered which addressed the bias (overdispersion) in GBS data by assuming that sequencing reads at locus j can instead be modeled using the following hierarchy: where 휇j and 휏j were location and dispersion parameters of the Beta distribution, respectively, such that 피 [ pij ] = 휇j and 핍ar [ pij ] = 휇j ( 1 − 휇j ) 휏j . Above, 휇j can be considered the underlying “true” allele frequency for locus j. As specified in (4), pij could vary across individuals but the location and dispersion parameters were common to all individuals at a given locus. Conditional on pij , the expected value and variance of the sequencing reads were the same as in those for the Binomial model. However, for the hierarchical model (4), the marginal mean and variance of the sequenc- i n g r e a d s we r e i n s t e a d 피 [ RAij ] = RTij휇j a n d 핍ar [ RAij ] = 휇j ( 1 − 휇j )[ 1 + ( RTij − 1 ) 휏j ] which can be rec- (4) RAij |pij ∼ Binomial ( RTij , pij ) pij ∼ Beta ( 휇j, 휏j ) Fig. 1 Distribution of allele frequency (for the reference allele) and read depth for Timothy (Phleum pratense L.) clones obtained from genotyp- ing-by-sequencing data Theoretical and Applied Genetics (2025) 138:77 Page 5 of 17 77 ognized as the first two moments of a Beta-Binomial distribu- tion. Similarly, the mean and variance of the estimated allele frequencies for the Beta-Binomial model were given by 피 [ fij ] = 휇j and 핍ar [ fij ] = 휇j ( 1 − 휇j )[ 1 + ( RTij − 1 ) 휏j ] ∕RTij , respectively. Thus, the assumed uncertainty in the underlying frequency of a marker ( pij ) was propagated to the estimated allele frequency ( fij ) as observed by a larger variance. The scaling factor for G proposed in the current work was then given by: where the mean allele frequency for marker j ( f j ) was used as an estimate of the location parameter 휇j , 휏̂j was an esti- mate of the overdispersion parameter, and RTj was the mean read depth at locus j, averaged across individuals. The scal- ing factor (5) can be written more generally as (5)h = m∑ j=1 f j(1 − f j) [ 1 + ( RTj − 1 ) 휏̂j ] RTj h = ∑m j=1 f j � 1 − f j � 휔j where 휔j = [ 1 + ( RTj − 1 ) 휏̂j ] ∕RTj was a heterogeneity factor (Williams 1982). When 휏̂j was zero, 휔j was equal to 1∕RTj and its value decreased for increasing RTj . On the other hand, if �𝜏j > 0 and RTj was large, then 휔j was almost entirely determined by the overdispersion parameter 휏j (Fig. 2). Unlike the de-biasing approach of Cericola et al. (2018), where the adjustment factor was inde- pendent of the allele frequency, the scaling factor (5) depends on the overdispersion factor estimated at each locus. Also, note that in (5) the ploidy number is absent because it is a function of the heterogeneity factor. Estimates of 휏j were obtained by maximizing the logarithm of the Beta-Binomial likelihood using the optim function in R (R Core Team 2024) with the L-BFGS-B algorithm. Other options for esti- mating the overdispersion parameter include the quasi-like- lihood approach (weighted least squares) of Williams (1982) or Bayesian approaches. Fig. 2 Heterogeneity factor ( 휔 ) for the Beta-Binomial model as a function of total read depth ( R T ) and overdispersion parameter ( 휏) Theoretical and Applied Genetics (2025) 138:77 77 Page 6 of 17 Field trials and performance records Timothy (Phleum pratense L.) lines evaluated were devel- oped and provided by Boreal Plant Breeding Ltd. The data contained 11,375 performance records corresponding to the progeny of the 1764 genotyped mother clones collected from 2001 to 2023, with an average of approximately 495 records per year. Between 1 and 4 progeny crosses corresponding to approximately 194 (SD 101) mothers were evaluated each year and in an average of 4 locations including southern and northern Finland, Czech Republic, Latvia, and Canada. Progeny lines were the result of top- and poly-crosses. In a top-cross a every mother clone is intermated to the same pollinator variety or other population, while in a poly-cross every mother clone is intermated to all other mother clones in the poly-cross field. The traits included yield at first, second, and third cut (kg DM/ha), winter damage (%), and digestible organic matter in dry matter (D-value; %; Nousiainen et al. 2003) at first and second cut. Test plots were harvested using an experimental harvester where grass yield was measured, and representa- tive samples were taken for laboratory analysis. Cuttings were made in silage stage, at first cut in growth stages 3.1 – 3.3 according to Moore (Moore et al. 1991). In Finland the first cut was normally made between 10 and 17th of June, the second cut six weeks after the first cut and the third cut in late August/beginning of September. Winter damage was defined as the proportion of deceased individuals in a test plot in the early spring on the second and third year after sowing. D-value was measured with FOSS DS 2500 near- infrared spectroscopy (NIRS) analyzer. D-value of the cali- bration samples was calculated as (100 – ash content) × OM digestibility, where OM (organic matter) digestibility was assessed by cellulase digestibility, and the ash content of the samples was determined as the proportion of weight remain- ing after placing the samples in a furnace at 550 °C for 6 h (Rinne and Nykänen 2000). Prior to this study, performance records were corrected for spatial effects. The trial design in those field trials was a row-column design with two replications. The statistical model used can be described as: where 휇 was the overall mean, 휃r , was the effect of row block r, 휙c was the effect of column block c, and 휂l was the effect of treatment l. Statistical analyses were done with SPSS (IBM SPSS Statistics, version 27, 2019), and the best model was selected based on Akaike's information criterion (AIC). In addition, the performed quality control included removal of outliers. The best linear unbiased estimators obtained after pre-correction were used as the dependent variable (traits) in the subsequent genomic prediction models. Summary yrcl = 휇 + 휃r + 휙c + 휂l + 휖rcl statistics including mean, SD, and number of records for these traits are provided in Table 2. Estimation of variance components The following multiple-trait GBLUP model was used to esti- mate variance components for the traits mentioned above: where 퐲 = [ 퐲1,… , 퐲T ]� was the vector of phenotypic records with traits stacked, 퐛 = [ 퐛1,… , 퐛T ]� was the vector of fixed effects including trial by year interaction (210 levels), 퐦 = [ 퐦1,… ,퐦T ]� was the vector of random genetic effects (GEBV) for mother clones, 퐜 = [ 퐜1,… , 퐜T ]� was the random vector of permanent environmental effects (progeny ID within trial, 5601 levels), and 퐞 = [ 퐞1,… , 퐞T ]� was the resid- ual . The block-diagonal matr ices   퐗 = ⊕T i=1 퐗i , 퐙m = ⊕ T i=1 퐙mi , and 퐙c = ⊕ T i=1 퐙ci related records to the fixed, additive, and permanent environmental effects, respectively, and ⊕ denotes the direct sum of matrices. In addition, T = 6 was the number of traits. Furthermore, it was assumed t h a t   퐦 ∼ N ( ퟎ,퐆⊗ 퐕M ) , 퐜 ∼ N ( ퟎ, 퐈⊗ 퐕C ) , a n d 퐞 ∼ N ( ퟎ, 퐈⊗ 퐕E ) where ⊗ denotes the Kronecker (direct) product, G was the genomic relationship matrix calculated using the Beta-Binomial scaling factor (defined in (2) and (5)), 퐕M, 퐕C , and 퐕E were the 6 by 6 mother, permanent, and residual (co)variance matrices among traits, respectively, and 휎2 Mk , 휎2 Ck , and 휎2 ek correspond to their kth diagonal element. Variance components for model (6) were estimated using genomic Monte-Carlo (MC) Expectation–Maximization REML (EM-REML) in MiX99 (Vuori et al. 2006; Mati- lainen et al. 2019) with 10 MC samples per round. Conver- gence was assumed when the round-to-round change in the variance component estimates was less than1 × 10−11 . Stand- ard errors for (co)variance parameters were calculated with 200 additional MC samples after convergence. Narrow-sense heritability (h2 k ) and repeatability ( c2 k ) for trait k = 1,… , 6 were calculated as (Legarra 2016; Becker 1992): (6)퐲 = 퐗퐛 + 퐙m퐦 + 퐙c퐜 + 퐞, Table 2 Summary statistics including, number of records, mean, standard deviation, minimum, and maximum for Timothy (Phleum pratense L.) traits used in the genomic evaluation Trait N Mean SD Min Max Yield at 1st cut (kg/ha) 11,192 6460.7 2511.7 354.0 18,476.0 Yield at 2nd cut (kg/ha) 11,214 4159.8 1634.2 269.0 11,224.0 Yield at 3rd cut (kg/ha) 7745 2623.8 978.4 404.0 6675.0 Winter damage (%) 3052 12.1 12.1 −3.0 79.0 D-value at 1st cut (%) 7383 68.2 3.3 54.0 77.9 D-value at 2nd cut (%) 7007 67.6 4.4 53.0 78.9 Theoretical and Applied Genetics (2025) 138:77 Page 7 of 17 77 where 휎2 Ak = 4휎2 Mk corresponded to the additive variance for trait k and diag (퐆) was the average of the diagonal elements of G calculated using the Beta-Binomial method. Given that the mother variance ( 휎2 Mk ) explains only ¼ of the genetic variance of the progeny, it was multiplied by 4 (Becker 1992) to obtain the additive variance ( 휎2 Ak ). In addition, phe- notypic and genetic correlations among the traits described above were calculated as: where rljk was the correlation between traits j and k for the l effect (for l = A, P for additive and phenotypic, respectively), and the additive variance and phenotypic variances were de f ined a s 퐕A = 4퐕M and 퐕P = 퐕M + 퐕C + 퐕E , respectively Validation of GEBV Estimation of GEBV was performed using both single-trait and multi-trait GBLUP models (6). The single-trait model was similarly defined as: where 퐲 was the vector of phenotypic records for a given trait, 퐗 , 퐙m , and 퐙c corresponded to the design matrices for the fixed, mother, and permanent environmental effects, respectively. In addition, 퐦 ∼ N ( ퟎ, 휎2 M 퐆 ) , 퐜 ∼ N ( ퟎ, 퐈휎2 C ) , and 퐞 ∼ N ( ퟎ, 퐈휎2 e ) were, as before, the mother, permanent environmental, and residual effects, the variances 휎2 M , 휎2 C , and 휎2 e were the appropriate diagonal element of their cor- responding matrix ( 퐕M , 퐕C , or 퐕E ) estimated from the multiple-trait model (6), and 퐈 was an identity matrix. In the current work, prediction, and validation of GEBV was not performed for specific environments or locations but across all locations. However, to account for potential differences in performance among environments (or locations), both mod- els (6) and (7) included the trial by year and the permanent environmental effects, which incorporate trial location in their definition. Because the G matrix scaled using the naïve and Beta-Binomial model are the same up to a constant mul- tiplier (e.g., all elements of G are multiplied by a scalar) no differences are expected in terms of estimates of variance components or the corresponding GEBV. On the other hand, for the debiased approach potential differences may arise. Thus, such impacts are considered in subsequent analyses. h2 k = diag (퐆) × 휎2 Ak 휎2 Mk × diag (퐆) + 휎2 Ck + 휎2 ek , c2 k = diag (퐆) × 휎2 Ak + 휎2 Ck 휎2 Mk × diag (퐆) + 휎2 Ck + 휎2 ek rljk = 휎ljk√ 휎2 ljj 휎2 lkk (7)퐲 = 퐗퐛 + 퐙m퐦 + 퐙c퐜 + 퐞, GEBV were validated using two strategies: (i) a fam- ily cross-validation, and (ii) a forward-prediction strategy. For both (i) and (ii) two sets of individuals were defined: the estimation set which corresponded to mother clones whose progeny records were used for estimating GEBV using models (6) and (7), and a validation set which corre- sponded to mother clones whose progeny records were not used in the evaluation but instead were used for calculating predictive ability (and other accuracy measures). However, the genomic information of all mother clones was used in the estimation of GEBV such that the predicted GEBV for individuals in the validation set were derived from the genomic relationships with individuals in the estimation set. For (i), a total of 18 half-sib progeny families were defined such that there were at least 50 mothers, and their progeny, in each family. Thus, there were 18 individual evaluations where the performance records correspond- ing to the progeny of the mother clones in a given fam- ily were not used to estimate their GEBV. The number of mother clones in the validation set ranged from 54 to 262 representing approximately 5% and 23% of all genotyped mothers, and were associated with 48 and 251 individual progenies, respectively. For (ii), on the other hand, per- formance records from the last four years of recording (2020, 2021, 2022, and 2023) were considered part of the validation set and thus not used in the genomic evaluation (estimation of GEBV). This resulted in 8217 phenotypic records for the estimation set and 3158 for the validation set. That is, about 27% of the records were removed. A total of 338 mother clones were identified that had records in the full data set but did not have records in the estima- tion set. For the mothers in the validation set, predictive ability was defined as the correlation between GEBV and mean progeny performance, adjusted for fixed and permanent environmental effects. The impact of the marker filter- ing criteria previously described on predictive ability was tested by constructing G with the marker sets shown in Table 1 and fitting the model (6) or (7). In addition, the validation method described in Legarra and Reverter (2018) was used, where the full data set consisted of the progeny records of all genotyped mothers, while the reduced data set was defined as the progeny records from mothers remaining in the evaluation after removing moth- ers in a validation group. For the full and reduced sets, GEBV were obtained, and the following criteria were cal- culated: (i) the correlation between GEBV in the full and reduced sets, and (ii) the slope (or dispersion) and (iii) the coefficient of determination from the regression of GEBV from the full set on the GEBV from the reduced set. To avoid confusion with overdispersion in the read depth of SNPs, the terms over- and under-prediction are used to refer to over- and -under-dispersion of GEBV. Theoretical and Applied Genetics (2025) 138:77 77 Page 8 of 17 Results Scaling factor Averaging across the marker sets (Table 1), the mean of the diagonal elements of G was 2.32 ± 0.09, 2.06 ± 0.07, and 1.26 ± 0.01 when scaled using the naïve, debiased, and Beta-Binomial approaches, respectively. In addition, the average of the diagonals of G increased along with the number of markers for the naïve and debiased approaches. This was less the case when G was scaled using the Beta- Binomial scaling factor (Fig. 3). Across all markers, the mean estimate of the overdis- persion parameter 휏̂ was 0.26 (SD 0.16), with individual values ranging from 0.00 to 0.76. In the same manner, the mean heterogeneity factor ( ̂휔 ) was 0.30 (SD 0.15) with values ranging from 0.01 to 0.77. In addition, the correla- tion between 휔̂ and 휏̂ was 0.99. Estimates of variance components Estimates of additive variance were 60,342.8 ± 12,184.4, 102,520.8 ± 10,432.4, and 21,229.6 ± 3912.0, for yield at first, second, and third cuts, respectively, 6.60 ± 2.28 for winter damage, and 0.72 ± 0.08 and 1.24 ± 0.12 for D-value at first and second cuts, respectively. Similarly, estimates of permanent environmental variances were 79,075.3 ± 9021.0, 29,488.8 ± 3685.9, and 11,370.6 ± 1988.1, for yield at first, second, and third cuts, respectively, 5.88 ± 1.65 for winter damage, and 0.03 ± 0.02 and 0.10 ± 0.03 for D-value at first and second cuts, respectively. Likewise, estimates of residual variances were 482,725.8 ± 10,220.0, 179,893.3 ± 4114.8, and 82,912.9 ± 2319.9 for yield at first, second, and third cuts, respectively, 35.44 ± 1.84 for winter damage, and 0.89 ± 0.02 and 1.32 ± 0.04 for D-value at first and second cuts, respectively. Lastly, favorable genetic covariances were found among yield traits, ranging from 2331.4 ± 1161.7 to 5256.7 ± 1144.9, between cuts 1 and 3, and between cuts 2 and 3, respectively. Similarly, a smaller covariance of 0.14 ± 0.02 was found among D-value traits. On the other Fig. 3 Average diagonal values of G scaled under naïve, debiased, and Beta-Binomial model assumption for Timothy clones (Phleum pratense L.) sequenced using genotyping-by-sequencing Theoretical and Applied Genetics (2025) 138:77 Page 9 of 17 77 hand, negative covariances ranging from −114.3 ± 31.0 to −0.04 ± 0.01 were found between winter damage and yield at first cut, and between winter damage and D-value at first cut, respectively. Estimates of heritability ranged from 0.13 ± 0.02 for yield at first cut to 0.86 ± 0.05 for D-value at second cut (Table 3). Estimates of genetic correlations among traits ranged from −0.72 ± 0.12 between yield at first cut and winter damage to 0.59 ± 0.04 between D-value at first and second cut. Pheno- typic correlations, on the other hand were smaller in mag- nitude, ranging from −0.45 ± 0.02 between yield at first cut and winter damage to 0.23 ± 0.01 between yield at second and third cuts. Estimates of repeatability were 0.27 ± 0.02, 0.66 ± 0.04, 0.38 ± 0.04, 0.33 ± 0.05, 0.83 ± 0.05, and 0.91 ± 0.05 for yield at 1st, 2nd and 3rd cuts, winter damage, and D-value at 1st and 2nd cuts, respectively. Accuracy of genomic prediction Filtering markers based on minimum read depth had little effect on predictive ability for both the family cross-vali- dation and forward-prediction strategies (P = 0.31; Fig. 4; Supplementary Table S1). Averaging across traits, mar- ginal means for predictive ability was 0.412 ± 0.002 and 0.418 ± 0.002, and 0.324 ± 0.004 and 0.331 ± 0.004 for mini- mum read depth of 10 and 20, respectively, for the family cross-validation and forward-prediction strategies, respec- tively. This represented an increase in predictive ability, with respect to a minimum read depth of 10, of 1.4%, and 1.7%, respectively. On the other hand, a larger effect was observed when filtering markers on maximum read depth (P = 0.014; Sup- plementary Table S2). Overall, marginal means for pre- dictive ability were 0.422 ± 0.002 and 0.408 ± 0.002, and 0.332 ± 0.004 and 0.323 ± 0.004 for maximum read depth of 100 and 200, respectively, for the family cross-valida- tion and forward-prediction strategies, respectively. These changes in predictive ability, with respect to maximum read depth of 200, represented an increase of 3.4% and 2.8%, respectively. Similarly, call rate had a considerable effect on predictive ability (P < 0.01; Supplementary Table S3). Marginal means for predictive ability were 0.348 ± 0.006, 0.437 ± 0.006, 0.444 ± 0.006, and 0.432 ± 0.006 for call rates of 100%, 90%, 80%, and 70% for the family cross-validation approach. Similarly, for the forward-prediction strategy, marginal means for predictive ability were 0.280 ± 0.002, 0.337 ± 0.002, 0.348 ± 0.002, and 0.345 ± 0.002 for call rates of 100%, 90%, 80% and 70%, respectively. The maximum change in predictive ability, compared to a call rate of 100%, represented an increase of 27% and 24% for the cross-valida- tion and forward-prediction strategies, respectively. Predictive ability decreased when using the debiased approach (P < 0.01). Averaging over traits, marginal means for predictive ability were 0.30 ± 0.02 and 0.39 ± 0.02 for the debiased, and Beta-binomial (and naïve) approaches, respectively. This represented a 23% decrease in predictive ability. The decrease in predictive ability ranged from 14 to 48% for winter damage and yield at second cut, respectively. Overall, multiple-trait models resulted similar predictive abilities (P = 0.61). Averaging across traits, marginal means for predictive ability were 0.414 ± 0.004 and 0.417 ± 0.004, and 0.325 ± 0.002 and 0.330 ± 0.002 for single- and multi- ple-trait models, respectively, for the family cross-validation and forward-prediction scenarios, respectively. However, the effect of single- and multi-trait models on predictive abil- ity was, in general, not consistent across traits or validation strategies (Fig. 5; Supplementary Table S4). For the fam- ily cross-validation strategy, the use of multi-trait models increased the predictive ability for winter damage by 0.05 (or 27%), while it resulted in a slight decrease (less than 3%) for the remaining traits. For the forward-prediction strategy, on the other hand, fitting the multi-trait GBLUP increased the predictive ability for yield at the second cut, and D-value at the first and second cuts by 0.033, 0.030, and 0.023, respectively. Additional validation criteria, including correlations between GEBV for the full and reduced sets, slope (or Table 3 Estimates of heritability (diagonal, bold), genetic correlations (lower triangle), and phenotypic correlations (upper triangle) for yield traits (kg/ha), winter damage (%), and D-value (%) in Timothy (Phleum pratense L.). Parameter estimates obtained from a multiple-trait model with the genomic relationship matrix scaled using the Beta-Binomial approach. Stand- ard errors of estimates in parentheses Trait Yield at 1st cut Yield at 2nd cut Yield at 3rd cut Winter damage D-value at 1st cut D-value at 2nd cut Yield at 1st cut 0.13 (0.03) 0.10 (0.01) 0.11 (0.01) −0.45 (0.02) −0.17 (0.01) 0.11 (0.01) Yield at 2nd cut 0.13 (0.08) 0.53 (0.04) 0.23 (0.01) −0.01 (0.02) −0.04 (0.01) −0.29 (0.01) Yield at 3rd cut 0.26 (0.10) 0.45 (0.06) 0.26 (0.04) −0.01 (0.03) 0.01 (0.01) −0.05 (0.02) Winter damage −0.72 (0.12) 0.37 (0.11) −0.09 (0.14) 0.19 (0.05) 0.19 (0.02) −0.15 (0.02) D−value at 1st cut −0.42 (0.08) −0.26 (0.04) 0.03 (0.07) −0.08 (0.10) 0.80 (0.06) 0.12 (0.03) D−value at 2nd cut 0.04 (0.08) −0.64 (0.06) −0.08 (0.08) −0.52 (0.11) 0.59 (0.04) 0.86 (0.05) Theoretical and Applied Genetics (2025) 138:77 77 Page 10 of 17 Fig. 4 Marginal means for the effect of minimum (top row) and maxi- mum (middle row) read depth, call rate (bottom row) and validation strategy on predictive ability (correlation between genomic breeding value and mean progeny phenotypes adjusted for fixed and permanent environmental effects), averaged over type of model (single- vs multi- ple-trait), for yield traits (kg/ha), winter damage (%), and D-value (%) in Timothy (Phleum pratense L.). Marginal means for family cross- validation in the left column (A, C, E) and for forward prediction on the right column (B, D, F) Theoretical and Applied Genetics (2025) 138:77 Page 11 of 17 77 dispersion, 훽1) , and coefficient of determination (R2) from regression of GEBV from the full on the reduced set, for the family cross-validation and the forward-prediction strate- gies are presented in Table 4. Overall, the correlations were higher for the forward-prediction strategy and for single- trait GBLUP. Multi-trait GBLUP increased predictive abil- ity only for yield at second cut and D-value at first cut and only in the forward-prediction scenario. On the other hand, GEBV accuracy decreased when fitting multi-trait GBLUP, with the largest change in accuracy for D-value at second cut in the family cross-validation scenario. Dispersion was small, in general, and not consistent across traits or validation strategies, but was less perva- sive for single-trait GBLUP models. For the family cross- validation strategy, the largest over-prediction and under- prediction were observed for yield at second cut and winter damage, respectively. However, winter damage changed from under-prediction when using single-trait GBLUP to over-prediction when fitting multi-trait models. Moreover, GEBV were over-predicted for all traits when using multi- trait GBLUP. For the forward-prediction strategy, GEBV for winter damage were under-predicted the most when fitting single-trait GBLUP, which was reduced when using multi- trait models. On the other hand, a large increase in over- prediction was observed for yield at third cut when fitting multiple-trait GBLUP. In the family cross-validation strategy, R2 values from the regression of GEBV from the full set on the reduced set ranged from 0.20 to 0.52 for yield at second cut and winter damage, respectively. However, for the forward-prediction strategy the R2 values ranged from 0.33 to 0.63 for yield at third cut and winter damage, respectively. Overall, the R2 values were larger for the forward-prediction strategy and the single-trait GBLUP models. In the family cross- validation scenario, fitting multi-trait GBLUP reduced the R2 values, while an increase in R2 was observed only for the forward-prediction strategy for yield at second cut and D-value at first cut. Fig. 5 Marginal means for the effect of type of model (single- and multiple-trait) and validation strategy (family cross-validation (A) and forward prediction (B)) on predictive ability (correlation between genomic breeding value and mean progeny phenotypes adjusted for fixed and permanent environmental effects), averaged over minimum and maximum read depth, and call rate, for yield traits (kg/ha), winter damage (%), and D-value (%) in Timothy (Phleum pratense L.) Theoretical and Applied Genetics (2025) 138:77 77 Page 12 of 17 Discussion Genotypes A similar distribution of read depth to the one presented in our study has been reported for ryegrass (Guo et al. 2018). Due to the skewed distribution of read depth, applying a minimum read depth criterion of 10 instead of 20 increased the number of available markers substantially more than choosing a maximum read depth criterion of 200 instead of 100. While markers at the lower end of the distribution may be affected by low coverage and allelic bias, those at the upper end of the distribution may be outliers that are also affected by allelic bias (Gerard et al. 2018). Therefore, restricting markers to a pre-specified range may be necessary to improve the quality of the available genotypes. Scaling factor To our knowledge, this is the first time that overdispersion has been estimated for GBS data and used to scale G. The increase in the mean values of the diagonals of G for the naïve (Binomial) and the debiased approach by Cericola et al. (2018) suggest that the marker variance is consider- ably larger than pj ( 1 − pj ) , and that overdispersion plays a considerable role when sequencing Timothy materials. Indeed, this has been reported by Gerard et al. (2018), where other systematic biases in addition to overdisper- sion, including sequencing error, allelic bias, and outliers, were modeled to increase the accuracy of genotyping for polyploid species. Similarly, Clark et al. (2019) used a Bayesian model to adjust SNPs for biases by accounting for overdispersion and a contamination rate (i.e., sequenc- ing error). Nevertheless, when calculating genomic rela- tionships within a population, biases may have less of an impact when the number of markers is large (i.e., may be averaged out), such that GEBV and their ranking are less affected by the scaling of G. Because the scaling factors for the naïve and Beta- Binomial approaches are constants (i.e., a single scalar value for all individuals genotyped), the corresponding G matrices are also identical (up to a constant). As such, with proper scaling the corresponding estimates of variance and GEBV would also be identical. Such is not the case for the debiased approach as each element of the diagonal of G is adjusted (or corrected) differently. While the overall adjustment of the diagonal values of G for the debiased approach was small for this Timothy population, the cor- responding GEBV were subsequently impacted. While the main aim of developing the scaling factor was to address the inflation of the diagonals of G due to overdispersion in the GBS data, when properly scaled, G also provides information about inbreeding (i.e., genomic inbreeding; Villanueva et al. 2021). In our current work the average diagonal of G was 1.26, suggesting an inbreed- ing coefficient of 0.26 (or 26%). While little information related to inbreeding levels of Timothy is available, esti- mated inbreeding coefficients for ryegrass ranged from 0.27 to 0.65, depending on the type of individual (F1 vs twice selfing of F1 individuals; Harris et al. 2023). However, it should be noted that estimates of heterozy- gosity using GBS may be unreliable because a site may be regarded as homozygous simply because it was not sequenced sufficiently (Wang et al. 2020). Table 4 Correlation between genomic breeding values (GEBV), slope (or dispersion; 훽 1 ), and coefficient of determination (R2) for GEBV from full and reduced data sets for family cross-validation, and forward-prediction strategies for yield traits (kg/ha), winter damage (%), and D-value (%) in Timothy (Phleum pratense L.) a Pearson correlation coefficient between GEBV obtained from full and reduced data sets. bSlope and coef- ficient of determination from regressing GEBV obtained from a full model on GEBV obtained from the reduced model Model Trait Family cross-validation Forward prediction Correlationa 훽 1 b R2b Correlation1 훽 1 b R2b Single-trait Yield first cut 0.61 1.01 0.40 0.71 0.95 0.51 Yield second cut 0.43 0.76 0.20 0.63 0.92 0.40 Yield third cut 0.61 1.01 0.42 0.57 0.75 0.33 Winter damage 0.70 1.10 0.52 0.79 1.37 0.63 D-value first cut 0.56 0.92 0.34 0.55 0.96 0.30 D-value second cut 0.62 0.96 0.42 0.73 0.99 0.53 Multiple-trait Yield first cut 0.60 0.98 0.38 0.67 0.94 0.46 Yield second cut 0.62 0.73 0.20 0.64 0.92 0.41 Yield third cut 0.59 0.97 0.38 0.57 0.73 0.33 Winter damage 0.57 0.90 0.35 0.76 1.12 0.58 D-value first cut 0.52 0.88 0.29 0.59 1.00 0.35 D-value second cut 0.50 0.82 0.28 0.69 1.03 0.48 Theoretical and Applied Genetics (2025) 138:77 Page 13 of 17 77 Estimates of variance components To our knowledge, (co)variance estimates for traits of eco- nomic importance in Timothy have not been previously reported, further emphasizing the importance of our current work. On the other hand, variance estimates for dry matter yield and disease resistance has been reported for other grass species (Guo et al. 2018). However, due to differences in the definition of traits (e.g., kg/m2 vs kg/ha) their reported variances differ substantially to the ones from our study. Thus, parameters from one population may not always (eas- ily) translate from one population to another. In addition to narrow-sense heritability, information related to broad-sense heritability has been reported for Timothy (Ashikaga et a., 2011) where estimates of genotypic variance were estimated from analysis of variance. Other strategies for estimating broad-sense heritability require esti- mation of variances due to genotype, year, location, and their interaction, or dominance (and epistatic) effects (Schmidt et al. 2019; Wittenburg et al. 2013). Due to unbalanced tri- als and the need for correcting for spatial effects, and lack of information on allele dosage in our study, estimates of broad-sense heritability could not be obtained. Repeatability estimates indicate that the permanent envi- ronment of an individual contributes considerably to the variability observed in Timothy. Such contributions were more noticeable for traits with lower heritability (e.g., yield at first cut, winter damage). Because plots may be recorded multiple times in a trial, accounting for permanent environ- mental effects can help to better separate environmental and additive effects, thereby improving estimates of heritability. The heritability of yield traits, ranging from 0.11 to 0.44, reflects a considerable environmental effect. For the first cut, performance is determined by how well individuals are established in a field, as well as by the effect of the previous winter after the first year of recording. For the second cut, on the other hand, environmental conditions would be more sta- ble and homogeneous, such that performance depends more on the genetic potential of the individual. Finally, because not every location is suitable for a third cut, and because of changing weather conditions from late summer to fall, performance at this stage is again heavily influenced by the environment. Our estimates differ considerably from those reported by Ashikaga et al. (2016), who reported a broad- sense heritability of 0.79 and 0.91 for dry matter yield at first and second cuts, respectively. These discrepancies, however, likely reflect not only the different populations considered but also the methods used to obtain the estimates as Ashi- kaga et al. (2016) used partial least-squares regression for predictions and functions of the mean squares derived from analysis of variance to estimate the variance components, and the clones under evaluation had not been selected for nutritive value traits. The estimate of heritability for winter damage was mod- erately low suggesting a strong environmental component, and the variability in winter conditions at the locations in which the Timothy materials were tested (e.g., Czech Repub- lic and northern Finland). On the other hand, Ashikaga et al. (2016) reported a broad-sense heritability of 0.66 for winter survival, which was scored on a scale from 1 to 9, with 1 and 9 representing poor and good survival, respectively. Thus, in addition to population differences, winter damage and winter survival may represent different traits, and the differences in heritability may be expected. Estimates of heritability for digestibility traits (D-value) were high for both cuts, suggesting a stronger genetic com- ponent, and the possibility of faster genetic progress for these traits. While not the same traits, Ashikaga et al. (2011) and Ashikaga et al. (2016) reported high heritability esti- mates for nutritional quality traits, including low-digestible fiber, organic cell wall, and water-soluble carbohydrate frac- tions, which are strongly associated with digestibility of plant material (Huhtanen and Krizsan 2023). Furthermore, their heritability estimates at the second cut were higher than those at the first cut, which was also the case in our study. Thus, despite potential environmental effects due to pos- sible hot and dry summers, a considerable additive variance still influences digestibility traits in the Finnish Timothy population. Estimates of additive variance (or heritability) coupled with selection intensity and cycle length (or generation inter- val), can be used to obtain knowledge about the response to selection (rate of gain) for different breeding strategies (Lush 1937; Cobb et al. 2019). While larger additive variances (and heritability) are associated, in general, with a larger response to selection, faster genetic gains can be achieved by decreasing cycle length (Cobb et al. 2019). Thus, given the estimates of additive variance for yield and digestibility in Timothy, using genomic selection to both reduce the cycle length and increase accuracy of selection can potentially increase the rates of genetic gain. Positive correlations among yield traits (e.g., yield at second and third cuts), and among D-value traits (e.g., first and second cuts) indicate that selection for one of the traits is likely to have a positive impact on the remaining traits, potentially increasing the rate of genetic progress. Moreo- ver, individual D-value traits were negatively correlated with yield traits (e.g., yield and D-value for the second cut), which is consistent with the results in Claessens et al. (2004) who reported a phenotypic correlation of −0.44 between dry matter yield and true in-vitro digestibility. These negative relationships between digestibility and dry matter yield have been previously reported for Timothy under northern climate conditions (Thorvaldsson et al. 2007; Hyrkäs et al. 2018). On the other hand, a positive genetic correlation between dry matter yield and winter hardiness, defined as the combined Theoretical and Applied Genetics (2025) 138:77 77 Page 14 of 17 effect of tillers per unit after winter damage, was found for ryegrass (Fè et al. 2015). These favorable genetic correla- tions among yield traits coupled with unfavorable genetic correlations between yield traits and digestibility, and between some yield traits and winter damage reflect the chal- lenging relationships between these traits and environmental conditions. For instance, individuals affected by harsh winter conditions are likely to have poor performance at their first cut of the following season, and individuals who are still in the growing phases when winter conditions arrive are also likely to be impacted such that their subsequent perfor- mance next season will also be affected. In the same manner, high yielding individuals may have low digestibility. Thus, indices should be implemented for the selection of suitable materials that combine high yield, digestibility, and toler- ance to winter conditions. Genomic prediction and validation Although substantially increasing (or decreasing) the num- ber of available markers, filtering criteria related to mini- mum and maximum read depth had only a small effect on predictive ability. On the other hand, the larger effect observed due to call rate showed that genomic relationships were more accurately calculated with more markers, despite the potential effect bias and overdispersion. However, with a larger proportion of markers requiring imputation, a slight decrease in predictive ability was observed at the lowest level of call rate. In addition, the lack of a reference genome, lack of allele dosage information (i.e., allele counts), and our use of continuous genotypes (i.e., ratio of number of reads) limits the types of methods that can be used for increasing the accuracy of imputation. While multi-trait models have shown improvement of predictive ability in ryegrass (Arojju et al. 2020), a clear advantage of multi-trait models was not observed in the cur- rent study. In our validations, no phenotypic records were available for genotyped individuals in the validation set, such that predictions of GEBV for these individuals were based entirely on the genomic relationships with other geno- typed individuals in the estimation set, and on the genetic relationships among traits. Thus, uncertainty in the estimates of genetic correlations, along with collinearity may lead to poor predictive ability for multiple-trait models. The discrepancy in predictive ability across traits for the two validation strategies is likely due to differences in the size and structure of the training and validation sets. For the family cross-validation strategy, records are removed (to a certain extent) uniformly across years. On the other hand, a considerable portion of the records were collected in the last three years (2020 to 2022) such that the predictive ability is more affected for the forward-prediction strategy. Overall, the predictive ability for yield traits was similar to that reported by Kovi et al. (2021) for a second genera- tion of full-sib families. These authors reported prediction accuracies ranging from 0.18 for yield at first cut to 0.54 for yield at second cut which are comparable to those found in the current work. Similar predictive abilities have also been reported for dry matter yield in switchgrass and ryegrass (Fiedler et al. 2018; Guo et al. 2018). On the other hand, the accuracy reported by Kovi et al. (2021) for yield at the third cut was much lower (−0.03 to 0.01) than the one obtained in our study. However, the differences in accuracy may be due to a larger validation population (more individuals geno- typed) and larger number of phenotypic records in our study. Also, Kovi et al. (2021) used hierarchical clustering to define validation groups as compared to the family cross-validation and forward-prediction strategies in our study. Predictive ability for winter damage was the lowest for single-trait models in the family cross-validation scenario. An increase in predictive ability for multiple-trait GBLUP was perhaps due to some half-sibs having records for winter damage and other traits, thus increasing the available infor- mation for the estimation of GEBV. On the other hand, for the forward-prediction strategy such may have not been the case, leading to a decrease in predictive ability. In addition, in the forward-prediction validation scenario, winter damage was found to have the most under-prediction when using sin- gle-trait GBLUP, probably due to the low number of records overall and outside of Finland, and the lower heritability of the trait. Such under-prediction was ameliorated when multi-trait models were used. Thus, multiple-trait models may be more useful when there is interest in improving win- ter survival. Overall, the predictive ability for digestibility traits was moderately high, likely due to high heritability and a rela- tively large number of records in the training population. In switchgrass, in-vitro dry matter digestibility had a predic- tive ability of 0.69 when cross-validation was performed across the entire dataset (Fiedler et al. 2018). This predic- tive ability is consistent with that obtained from the family cross-validation in our study. On the other hand, the predic- tive ability for organic matter digestibility in ryegrass was approximately 0.20 (Arojju et al. 2020) which more closely corresponds to the predictive ability of D-value at first cut in our study for the forward-prediction strategy. For both validation strategies, D-value GEBV were not affected by over- or under-prediction. In addition, single- and multi-trait GBLUP were similarly accurate. Using the Legarra and Reverter (2018) methodology, moderately high correlations between GEBV from full and reduced sets, along with minimal dispersion (under- or over-prediction), and moderate R2 values were found for all traits and for both the family cross-validation and forward- prediction methodologies. This suggest that the predictions of GEBV were accurate, likely due to the high number of Theoretical and Applied Genetics (2025) 138:77 Page 15 of 17 77 records and moderate heritability for the traits. Bornhofen et al. (2022) reported similar dispersion values for GEBV of dry matter yield and forage quality traits in ryegrass, ranging from 0.55 to 1.51. Similarly, the correlations between GBV obtained from full and reduced models reported by these authors ranged from about 0.45 to 0.95. As in our study, their reported values depended on the model used (e.g., sin- gle- vs multiple-trait) and validation strategy (e.g., tenfold cross-validation vs family validation). The impact of environment is evident in many grass popu- lations and can be included in the model through GxE inter- actions (Bornhofen et al. 2022). For models including many traits and environments as it would be the case in our study, estimates of variance components derived from multiple- trait and multiple-environment models may be unreliable, especially with limited number of records and unbalanced data across locations. An alternative is to use so-called reac- tion norm models (Kolmodin et al. 2002), which can pro- duce a parsimonious model for estimating GxE interactions that allows for many environments. Reaction norm models will be tested as more data become available. Therefore, with the intention of providing reliable estimates of variance components and accurate genomic predictions such models were not considered in our current work. Still, to minimize the impact of environment on estimates of variance compo- nents and prediction of GEBV, trial location was included in the definition of contemporary group and permanent envi- ronment in the model. Conclusion In the present study, genomic relationships derived from GBS data along with phenotypic records were used to esti- mate variance components of several traits of economic importance for a Finnish Timothy population. The moderate to large genetic variances, and the corresponding heritabil- ity, of some yield and digestibility traits showed the potential for making genetic gains in these traits. However, for the low heritable traits (e.g., winter damage), favorable correlations with other traits can be used to increase genetic gain. On the other hand, negative correlations between some yield traits and winter damage (and digestibility traits) highlight the need to use selection indices to reduce the effects of these unfavorable relationships. The current work constituted a first step toward a rou- tine and comprehensive genomic evaluation of Timothy in Finland. Markers obtained from GBS data were used in sin- gle- and multi-trait GBLUP to predict GEBV. While read depth and call rate directly affected the number of markers available for downstream analyses, only call rate had a con- siderable effect on predictive ability. Moreover, the poten- tial benefits of using multi-trait GBLUP were not consistent and need to be further investigated. Nevertheless, the high accuracy of some yield and digestibility traits under the dif- ferent validation strategies used emphasize the possibility for developing materials with improved yield and nutritional qualities. Supplementary Information The online version contains supplemen- tary material available at https:// doi. org/ 10. 1007/ s00122- 025- 04860-9. Author contributions All authors have contributed to the study design and conception. Sequencing and genotype data collecting, and prepa- ration was performed by DF, OB, OM, EAM, and NVJ. Phenotypic data collection and processing was performed by PP and MI. Statisti- cal analysis, modeling, and genomic prediction by NVJ, HK, IM, and EAM. First draft of the manuscript is by NVJ. All authors have com- mented on previous versions of the manuscript, and all authors read and approved the final manuscript. Funding Open access funding provided by Natural Resources Institute Finland. This project was supported by Ministry of Agriculture and Forestry, Makera fund VN/7519/2020, seed industry (Hankkija, Pel- tosiemen, Tilasiemen) and meat industry (Atria, HKScan, Snellman). Data availability The data used for this study are property of Boreal Plant Breeding Ltd., and as such not publicly available. Declarations Conflict of interests The authors have no relevant financial or non- financial interests to disclose. Open Access This article is licensed under a Creative Commons Attri- bution 4.0 International License, which permits use, sharing, adapta- tion, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. References Arojju SK, Cao M, Trolove M, Barrett BA, Inch C, Eady C, Stewart A, Faville MJ (2020) Multi-trait genomic prediction improves predic- tive ability for dry matter yield and water-soluble carbohydrates in perennial ryegrass. Front Plant Sci 11:1197. https:// doi. org/ 10. 3389/ fpls. 2020. 01197 Ashikaga K, Tanaka T, Fujii H, Tamaki H, Sato K, Deguchi K, Iida K (2011) Relationship between the first and second crops and esti- mation of genetic parameters of the second crop on the nutritive value of timothy (Phleum pratense L). Euphytica 182:325–334. https:// doi. org/ 10. 1007/ s10681- 011- 0420-3 Ashikaga K, Tanaka T, Fujii H, Tamaki H, Sato K (2016) Simultane- ous selection for nutritive value and agronomic traits in timo- thy (Phleum pratense L). Euphytica. https:// doi. org/ 10. 1007/ s10681- 015- 1583-0 Theoretical and Applied Genetics (2025) 138:77 77 Page 16 of 17 Ashikaga K, Tamaki H, Deguchi K, Sato K. 2008. Heritability of nutri- tive value in first crop of Timothy Phleum pratense L. Jpn J Grassl Sci. https:// doi. org/ 10. 14941/ grass. 54. 19. Batista LG, Gaynor RC, Margarido GRA, Byrne T, Amer P, Gorjanc G, Hickey JM (2021) Long-term comparison between index selection and optimal independent culling in plant breeding programs with genomic prediction. PLoS ONE 16(5):e0235554. https:// doi. org/ 10. 1371/ journ al. pone. 02355 54 Becker WA (1992) Manual of quantitative genetics, 5th edn. Academic Enterprises, USA Bélanger G, Michaud R, Jefferson PG, Tremblay GF, Brégard A (2001) Improving the nutritive value of timothy through management and breeding. Can J Plant Sci 81:577–585. https:// doi. org/ 10. 4141/ P00- 143 Berg CC, McElroy AR, Kunelius HT (1996) Timothy. In: Moser LE, Buxton DR, Casler MD (eds) Cool-season forage grasses, ASA, CSSA, SSSA, Madison, pp 643–664. https:// doi. org/ 10. 2134/ agron monog r34. c20. Bornhofen E, Fè D, Lenk I, Greve M, Didion T, Jensen CS, Asp T, Janss L (2022) Leveraging spatiotemporal genomic breed- ing value estimates of dry matter yield and herbage qual- ity in ryegrass via random regression models. Plant Genome 15:e20255. https:// doi. org/ 10. 1002/ tpg2. 20255 Cai Q, Bullen MR (1994) Analysis of genome-specific sequences in phleum species: identification and use for study of genomic relationships. Theoret App Genet 88:831–837. https:// doi. org/ 10. 1007/ BF012 53993 Cai H-W, Yuyama N, Tamaki H, Yoshizawa A (2003) Isola- tion and characterization of simple sequence repeat markers in the hexaploid forage grass timothy (Phleum pratense L.). Theoret App Genet 107:1337–1349. https:// doi. org/ 10. 1007/ s00122- 003- 1386-x Cericola F, Lenk I, Fè D, Byrne S, Jensen CS, Pedersen MG, Asp T, Jensen J, Janss L (2018) Optimized use of low-depth gen- otyping-by-sequencing for genomic prediction among multi- parental family pools and single plants in perennial ryegrass (Lolium perenne L.). Front Plant Sci. https:// doi. org/ 10. 3389/ fpls. 2018. 0036 Cerón-Rojas JJ, Crossa J (2018) Linear selection indices in mod- ern plant breeding. Springer, Cham. https:// doi. org/ 10. 1007/ 978-3- 319- 91223-3 Claessens A, Michaud R, Bélanger G, Mather DE (2004) Character- istics of timothy genotypes divergently selected for fiber traits. Crop Sci 44(1):81–88. https:// doi. org/ 10. 2135/ crops ci2004. 8100 Clark LV, Lipka AE, Sacks EJ (2019) polyRAD: genotype calling with uncertainty from sequencing data in polyploids and diploids. G3 Genes Genom Genet. https:// doi. org/ 10. 1534/ g3. 118. 200913 Cobb JN, Juma RU, Biswas PS, Arbelaez JD, Rutkoski J, Atlin G, Hagen T, Quinn M, Ng EH (2019) Enhancing the rate of genetic gain in public-sector plant breeding programs: lessons from the breeder’s equation. Theor Appl Genet 132:627–645. https:// doi. org/ 10. 1007/ s00122- 019- 03317-0 de Bem OI, Amadeu RR, Ferrão LFV, Muñoz PR (2020) Optimiz- ing whole-genomic prediction for autotetraploid blueberry breeding. Heredity 25(6):437–448. https:// doi. org/ 10. 1038/ s41437- 020- 00357-x Fè D, Pedersen MG, Jensen CS, Jensen J (2015) Genetic and environ- mental variation in a commercial breeding program of perennial ryegrass. Crop Sci 55:631–640. https:// doi. org/ 10. 2135/ crops ci2014. 06. 0441 Fiedler JD, Lanzatella C, Edmé SJ, Palmer NA, Sarath G, Mitchell R, Tobias CM (2018) Genomic prediction accuracy for switchgrass traits related to bioenergy within differentiated populations. BMC Plant Biol. https:// doi. org/ 10. 1186/ s12870- 018- 1360-z Garrison E, and Marth G (2012) Haplotype-based variant detection from short-read sequencing. arXiv preprint. https:// doi. org/ 10. 48550/ arXiv. 1207. 3907. Accessed 1 July 2024. Gerard D, Ventorim Ferrão LF, Franco Garcia AA, Stephens M (2018) Genotyping polyploids from messy sequencing data. Genetics 210:789–807. https:// doi. org/ 10. 1534/ genet ics. 118. 301468 Gorjanc G, Cleveland MA, Houston RD, Hickey JM (2015) Potential of genotyping-by-sequencing for genomic selection in livestock populations. Genet Sel Evol 47:1–12. https:// doi. org/ 10. 1186/ s12711- 015- 0102-z Guo X, Cericola F, Fè D, Pedersen MG, Lenk I, Jensen CS, Jensen J, Janss L (2018) Genomic prediction in tetraploid ryegrass using allele frequencies based on genotyping by sequencing. Front Plant Sci 9:1–14. https:// doi. org/ 10. 3389/ fpls. 2018. 01165 Harris C, Hall M, Arrowfield R, Herridge R, Eady C, Macknight R, Brownfield L (2023) Assessing inbreeding in perennial ryegrass (Lolium perenne) as a step towards F1 hybrid breeding. Plant Breeding 142(4):518–526. https:// doi. org/ 10. 1111/ pbr. 13099 Huhtanen P, Krizsan SJ (2023) Nutritional uniformity of cell wall and lignin fractions in grass and red clover silage evaluated by the Lucas test with application to forage feed evaluation. Anim Feed Sci Tech 306:1–12. https:// doi. org/ 10. 1016/j. anife edsci. 2023. 115819 Hyrkäs M, Korhonen P, Pitkänen T, Rinne M, Kaseva J (2018) Grass growth models for estimating digestibility and dry matter yield of forage grasses in Finland. In: Proceedings of the 27th General Meeting of the European Grassland Federation, Cork, Ireland, pp 252–254. Kolmodin R, Strandberg E, Madsen P, Jensen J, Jorjani H (2002) Gen- otype by environment interaction in nordic dairy cattle studied using reaction norms. Acta Agric Scand A Anim Sci 52(1):11–24. https:// doi. org/ 10. 1080/ 09064 70025 28063 80 Kovi MR, Marum P, Alsheikh M, Amdahl H, Asp T, Rognli OA (2019) Implementation of genomic predictions in timothy (Phleum pratense L.). In: Proceedings of the joint 20th symposium of the European Grassland Federation and the 33rd meeting of the EUCARPIA section ‘Fodder crops and amenity grasses’. Zürich, Switzerland, pp 262–274. https:// doi. org/ 10. 3929/ ethz-b- 00036 9264. Kovi MR, Pashapu AR Amdahl H, Alsheikh M, Marum P, Rognli OA (2021) Impact of genetic relatedness on the genomic prediction accuracies in Timothy (Phleum pratense L.). In: Proceedings of the 34th meeting of the EUCARPIA ‘Fodder crops and amenity grasses’ section in cooperation with the EUCARPIA Festulolium Working Group. Freising, Germany, pp 109–112. https:// doi. org/ 10. 3929/ ethz-b- 00051 1542. Langmead R, Salzberg BL (2012) Bowtie 2: fast and sensitive align- ment of short DNA sequences to the human genome. Nat Methods 9(4):357–359. https:// doi. org/ 10. 1038/ nmeth. 1923 Legarra A (2016) Comparing estimates of genetic variance across dif- ferent relationship models. Theoret Pop Biol 107:26–30. https:// doi. org/ 10. 1016/j. tpb. 2015. 08. 005 Legarra A, Reverter A (2018) Semi-parametric estimates of population accuracy and bias of predictions of breeding values and future phenotypes using the LR method. Genet Sel Evol 50(53):1–18. https:// doi. org/ 10. 1186/ s12711- 018- 0426-6 Louw JH (1990) A selection index to cope with genotype-environment interaction with an application to wheat breeding. Plant Breed 104(4):346–352. https:// doi. org/ 10. 1111/j. 1439- 0523. 1990. tb004 46.x Lush JL (1937) Animal breeding plans. Iowa State University Press, Ames Matilainen K, Strandén I, Mäntysaari EA (2019) Efficient monte carlo algorithm for restricted maximum likelihood estimation of genetic parameters. J Anim Breed Genet 136:252–261. https:// doi. org/ 10. 1111/ jbg. 12375 Theoretical and Applied Genetics (2025) 138:77 Page 17 of 17 77 Meuwissen TH, Hayes BJ, Goddard ME (2001) Prediction of total genetic value using genomewide dense marker maps. Genetics 157:1819–1829. https:// doi. org/ 10. 1093/ genet ics/ 157.4. 1819 Moore KJ, Moser LE, Vogel KP, Waller SS, Johnson BE, Pedersen JF (1991) Describing and quantifying growth stages of perennial forage grasses. Agron J 83:1073–1077. https:// doi. org/ 10. 2134/ agron j1991. 00021 96200 83000 60027x Niemeläinen O, Nissinen O, Jauhiainen L, Isolahti M. Timothy and tall fescue mixtures increase forage yield and improve yield quality in comparison with pure timothy in Finland (2004) In: Land use systems in grassland dominated regions. Proceedings of the 20th General Meeting of the European Grassland Federation. Luzern. Switzerland, pp 516–518 Nousiainen J, Rinne M, Hellämäki M, Huhtanen P (2003) Prediction of the digestibility of the primary growth of grass silages harvested at different stages of maturity from chemical composition and pepsin-cellulase solubility. Anim Feed Sci Technol. https:// doi. org/ 10. 1016/ S0377- 8401(02) 00283-3 Peltonen-Sainio P, Jauhiainen L, Känkänen H, Joona J, Hydén T, Mat- tila TJ (2022) Farmers’ experiences of how under-sown clovers, ryegrasses, and timothy perform in northern european crop pro- duction systems. Agronomy 12:1–18. https:// doi. org/ 10. 3390/ agron omy12 061401 Peterson BK, Weber JN, Kay EH, Fisher HS, Hoekstra HE (2012) Double digest radseq: an inexpensive method for de novo snp dis- covery and genotyping in model and non-model species. PlosOne 7(5):1–11. https:// doi. org/ 10. 1371/ journ al. pone. 00371 35 R Core Team (2024). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Aus- tria. https:// www.R- proje ct. org/. Accessed 1 July 2024. Rinne M, Nykänen A (2000) Timing of primary growth harvest affects the yield and nutritive value of timothyred clover mixtures. Agric Food Sci Finland. 9(2):121–134 Schmidt P, Hartung J, Rath J, Piepho H (2019) Estimating broad-sense heritability with unbalanced data from agricultural cultivar tri- als. Crop Sci 59:525–536. https:// doi. org/ 10. 2135/ crops ci2018. 06. 0376 Tanhuanpää P, Manninen O (2012) High SSR diversity but little differ- entiation between accessions of nordic timothy (Phleum pratense L.). Hereditas. https:// doi. org/ 10. 1111/j. 1601- 5223. 2012. 02244.x Thorvaldsson G, Tremblay FG, Kunelius HT (2007) The effects of growth temperature on digestibility and fibre concentration of seven temperate grass species. Acta Agric Scand B Soil Plant Sci 57(4):322–328. https:// doi. org/ 10. 1080/ 09064 71060 09842 21 Van Raden PM (2009) Efficient Methods to compute genomic pre- dictions. J Dairy Sci 91:4414–4423. https:// doi. org/ 10. 3168/ jds. 2007- 0980 Villanueva B, Fernández A, Saura M, Caballero A, Fernández J, Morales-Gonzáles E, Toro MA, Pong-Wong R (2021) The value of genomic relationship matrices to estimate levels of inbreed- ing. Genet Sel Evol. https:// doi. org/ 10. 1186/ s12711- 021- 00635-0 Vuori K, Strandén I, Lidauer M, Mäntysaari EA (2006) MiX99- effective solver for large and complex linear mixed models. In: Proceedings of the 8th World congress on genetics applied to livestock production, Belo Horizonte, Brazil, pp 27–33 Wang N, Yuan Y, Wang H, Yu D, Liu Y, Zhang A, Gowda M, Nair SK, Hao Z, Lu Y, San Vicente F, Prasanna BM, Li X, Zhang X (2020) Applications of genotyping-by-sequencing (GBS) in maize genetics and breeding. Sci Rep 10:16308. https:// doi. org/ 10. 1038/ s41598- 020- 73321-8 Williams DA (1982) Extra-Binomial variation in logistic linear mod- els. J R Stat Soc C Appl Stat 31(2):144–148. https:// doi. org/ 10. 2307/ 23479 77 Wittenburg D, Melzer N, Willmitzer L, Lisec J, Kesting U, Reinsch N, Repsilber D (2013) Milk metabolites and their genetic vari- ability. J Dairy Sci 96(4):2557–2569. https:// doi. org/ 10. 3168/ jds. 2012- 5635 Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.