Marker weighting improves single-step genomic prediction reliabilities of udder health traits in Nordic Red and Jersey dairy cattle populations

Author(s): Arash Chegini, Ismo Strandén, Emre Karaman, Terhi Iso-Touru, Jukka Pösö, Gert P. Aamand, Martin H. Lidauer
Year: 2025
Version: Published version
Copyright: The Author(s) 2025
Rights: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Source: Jukuri, open repository of the Natural Resources Institute Finland (Luke). This is an electronic reprint of the original article and may differ from it in pagination and typographic detail.

Please cite the original version: Arash Chegini, Ismo Strandén, Emre Karaman, Terhi Iso-Touru, Jukka Pösö, Gert P. Aamand, Martin H. Lidauer, Marker weighting improves single-step genomic prediction reliabilities of udder health traits in Nordic Red and Jersey dairy cattle populations, Journal of Dairy Science, Volume 108, Issue 1, 2025, Pages 651-663, ISSN 0022-0302, https://doi.org/10.3168/jds.2024-25374.

ABSTRACT

The standard single-step genomic prediction model assumes that all SNP markers explain an equal amount of genetic variance, which, however, may not be true. This is because SNPs are located in or near different genes with different functions.
Therefore, it seems logical to consider SNP marker-specific weights when predicting genomic breeding values. We hypothesized that allowing differences in the amount of genetic variance explained by each SNP marker will improve prediction reliability and response to selection. To investigate this hypothesis, we first developed multitrait standard single-step genomic models based on the current multitrait random regression evaluation models for udder health traits of the Nordic Red (RDC) and Jersey (JER) dairy cattle populations. The models included 4 clinical mastitis (CM) traits, 3 test-day SCS traits, and the conformation traits fore udder attachment and udder depth. In the second step, we investigated the effect of applying different SNP marker weighting scenarios in the single-step genomic prediction models, for which a single-step SNP best linear unbiased prediction model was applied. We investigated the prediction reliability of the different models by forward prediction, where the last 4 years of the data were removed to estimate breeding values for validation candidates. In addition, genetic trends of the pedigree-based estimated breeding values (PEBV) and GEBV were examined. The datasets for RDC and JER included 6.9 million and 1.2 million animals, of which 5.6 million and 0.9 million cows had records, respectively. The number of genotyped animals was 125,789 and 64,777 for RDC and JER, respectively. Cows had repeated SCS observations but only single observations for all other traits, and breeding values for all traits were modeled by one covariance function. This required modeling 12 eigenvalue breeding value coefficients for each cow and developing SNP marker weights for the principal components rather than for the biological traits.
We investigated 3 SNP marker weighting scenarios: (1) a nonlinear method similar to BayesA, (2) using the classical formula 2pqû² that accounts for allele heterozygosity, and (3) applying a mean SNP weight calculated by 2pqû² for every 20 adjacent SNP markers. Bias, dispersion, and prediction reliability were calculated by regressing PEBV or GEBV from the evaluation based on the full dataset on those from the evaluation based on the reduced dataset. We found that the recent favorable genetic trend in CM and SCS has accelerated since the introduction of genomic selection. The study also shows that a significant increase in prediction reliability, i.e., 0.74 versus 0.48 for RDC and 0.72 versus 0.41 for JER cows for CM, can be achieved with a standard single-step genomic prediction model compared with a pedigree-based prediction model. Almost all scenarios with SNP marker weighting further improved the prediction reliability, by between 0.5% and 12.7%. The highest improvement was achieved by weighting the SNP markers based on the 2pqû² formula.

Key words: genomic selection, clinical mastitis, single-step SNPBLUP model, SNP weight, BayesA

INTRODUCTION

Modern dairy cattle breeding programs include udder health traits in total merit index selection to reduce the incidence of mastitis, thereby improving cow welfare and the sustainability of dairy farming. The most important udder health trait is clinical mastitis (CM), which is the most costly disease in dairy cows, causing the second largest monetary loss in dairy farming after fertility failure (Egyedy and Ametaj, 2022). The CM traits have relatively low heritability; therefore, test-day SCS traits are often included as correlated traits to improve reliability of prediction (Pösö and Mäntysaari, 1996).
Nevertheless, genetic progress is slow when selection is based on predicted breeding values from pedigree-based models (Pösö and Mäntysaari, 1996; Rupp and Boichard, 1999; Negussie et al., 2010). By using a single-step genomic prediction model (Aguilar et al., 2010; Christensen and Lund, 2010), which combines pedigree and genomic information, higher prediction reliability can be expected (Aguilar et al., 2010).

Arash Chegini,1* Ismo Strandén,1 Emre Karaman,2 Terhi Iso-Touru,1 Jukka Pösö,3 Gert P. Aamand,4 and Martin H. Lidauer1
1 Natural Resources Institute Finland (Luke), 31600 Jokioinen, Finland
2 Center for Quantitative Genetics and Genomics, Aarhus University, 8830 Tjele, Denmark
3 Faba Co-op, 01301 Vantaa, Finland
4 Nordic Cattle Genetic Evaluation (NAV), 8200 Aarhus, Denmark
* Corresponding author: arash.chegini@luke.fi
J. Dairy Sci. 108:651–663. https://doi.org/10.3168/jds.2024-25374. © 2025, The Authors. Published by Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY license (https://creativecommons.org/licenses/by/4.0/). Received July 1, 2024. Accepted September 4, 2024. The list of standard abbreviations for JDS is available at adsa.org/jds-abbreviations-24. Nonstandard abbreviations are available in the Notes.

A standard single-step GBLUP (ssGBLUP) model assigns equal weights to all SNP markers (i.e., ssGBLUP assumes that each SNP marker contributes equally to the genetic variation; VanRaden, 2008). This assumption may not be true, because some SNP markers are in the proximity of influential genes.
Recent studies have highlighted the potential benefits of incorporating alternative SNP marker weighting strategies to improve prediction accuracy (Zhang et al., 2010; Wang et al., 2012; Fragomeni et al., 2019). Several approaches have been proposed to weight SNPs, including the classical weighting method by 2pqû², i.e., based on allele frequencies and marker effect (Falconer and Mackay, 1996; Wang et al., 2012), a BayesA-like procedure, namely Nonlinear A (VanRaden, 2008; Cole et al., 2009), and Bayesian methods (Habier et al., 2010, 2011). In a simulation study, Zhang et al. (2010) reported an improvement in prediction ability by using a trait-specific genomic relationship matrix calculated based on squared SNP effects. Wang et al. (2012) pointed out that although Bayesian methods are able to account for the variation in the amount of genetic variance explained by each SNP marker, they impose higher computational costs and do not include phenotypic information from nongenotyped animals. Therefore, they proposed an ssGBLUP model that considers SNP marker-specific weights. Another previously used weighting method is Nonlinear A, which is similar to BayesA and where all SNP markers have nonzero weights. Fragomeni et al. (2019) tested the Nonlinear A method on stature in US Holstein cows and attained slightly higher prediction reliabilities (about 3%) when SNP marker weights were implemented in a GBLUP model, but no improvement was achieved when implemented in an ssGBLUP model. Similarly, Zhang et al. (2016) acknowledged the superiority of marker-weighted single-step for both genome-wide association studies and GEBV prediction based on the comparison of different weighting scenarios in the ssGBLUP framework with different Bayesian methods using simulated data. They concluded that SNP marker weighting is useful when a trait is influenced by a large number of QTL.
Alternative marker weighting methods, such as the classical SNP weighting, have rarely been implemented in genetic evaluations, and have been reported to diverge in many cases (Fragomeni et al., 2019). The aim of this study is to investigate whether marker weighting in a single-step genomic prediction framework can improve prediction reliability when the heritabilities of the traits are low. Multiple trait genetic evaluations for udder health traits in the Nordic Red (RDC) and Jersey (JER) dairy cattle populations are used in the analyses of alternative SNP marker weighting approaches. We investigate the application of marker weighting in a multitrait single-step genomic prediction framework on large datasets, and we hypothesize that incorporating alternative weighting strategies will improve the prediction reliability of single-step genomic prediction for udder health traits. To achieve our objectives, (1) standard multitrait ssGBLUP models are developed and the obtained genetic trends in GEBV are compared with those from the pedigree-based estimated breeding values (PEBV), (2) single-step genomic prediction models with SNP marker-specific weights that are based on different weighting approaches are developed, and (3) prediction reliability and bias of breeding values are investigated by forward prediction validation.

MATERIALS AND METHODS

Data and Trait Definition

The datasets used in this study were the same as those used by the Nordic Cattle Genetic Evaluation (NAV; Aarhus, Denmark) for the official evaluation in February 2023. The datasets included all available records for 9 udder health traits collected in Denmark, Finland, and Sweden since 1990. The datasets comprised in total 74.5 million and 17.1 million records for RDC and JER, respectively. The number of cows with records was 5.6 million and 0.9 million for RDC and JER, and the pedigree included in total 6.9 million and 1.2 million animals for RDC and JER, respectively (Table 1).
Unknown parents were grouped by selection path, breed, and birth years into 391 and 319 unknown parent groups for RDC and JER, respectively.

The total number of genotyped RDC and JER animals was 249,223 and 136,562, respectively. However, only genotypes of individuals registered in the pedigree and born after 2008 were used for the analyses, leaving 125,789 and 64,777 genotyped RDC and JER animals, respectively. This was done to use the same genomic information as in the genomic evaluation by NAV. Individuals were genotyped with the Illumina Bovine SNP50 Bead Chip (Illumina, San Diego, CA). Using the same editing criteria as NAV, 46,914 and 41,897 SNP markers remained for genotyped RDC and JER animals, respectively.

The multitrait model currently used for the NAV udder health evaluation, which describes the observations of 9 traits, was the starting point for this study. The observations for all 9 traits were adjusted by NAV for heterogeneous variance across countries (Denmark, Finland, Sweden) and time. A description of the trait definitions and the variance components applied by NAV is given in Negussie et al. (2010). Summary statistics for the observations of all 9 traits used in this study are shown in Table 2 for both breeds.

The multivariate data include 4 CM traits as the traits of interest and 5 correlated traits to increase the reliability of CM breeding values. The CM traits were coded as binary traits, where a 1 was coded for at least one occurrence of CM during a defined period, and a 0 was coded for healthy. In the first lactation, 2 CM traits were defined: one trait (CM11) for the period from 15 d before calving until 50 DIM, and another trait (CM12) for the period from 51 to 305 DIM. The other 2 CM traits were defined for the second (CM2) and third (CM3) lactation for the lactation periods from 15 d before calving to 150 DIM.
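The binary coding of the 4 CM traits can be illustrated with a minimal sketch. The time windows come from the definitions above; the function name `code_cm_traits`, the event format, and the rule that a trait is missing when the cow never started the lactation are illustrative assumptions, not the NAV editing rules.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# (lactation, first day, last day), with days counted relative to calving;
# negative values are days before calving. Windows follow the trait definitions.
CM_WINDOWS: Dict[str, Tuple[int, int, int]] = {
    "CM11": (1, -15, 50),
    "CM12": (1, 51, 305),
    "CM2": (2, -15, 150),
    "CM3": (3, -15, 150),
}


def code_cm_traits(
    events: Iterable[Tuple[int, int]],
    lactations_started: Set[int],
) -> Dict[str, Optional[int]]:
    """Code the 4 binary CM traits for one cow.

    events: (lactation, day relative to calving) for each recorded CM case.
    lactations_started: lactations the cow actually began. Assumption: a trait
    is treated as missing (None) if its lactation never started.
    """
    event_list = list(events)
    coded: Dict[str, Optional[int]] = {}
    for trait, (lactation, first_day, last_day) in CM_WINDOWS.items():
        if lactation not in lactations_started:
            coded[trait] = None
            continue
        # 1 if at least one CM case falls inside the trait window, else 0
        hit = any(
            lac == lactation and first_day <= day <= last_day
            for lac, day in event_list
        )
        coded[trait] = 1 if hit else 0
    return coded


# Hypothetical cow: one case 3 d before first calving, one at 200 DIM in lactation 2
example = code_cm_traits(events=[(1, -3), (2, 200)], lactations_started={1, 2})
print(example)  # {'CM11': 1, 'CM12': 0, 'CM2': 0, 'CM3': None}
```

The case at 200 DIM in the second lactation falls outside the CM2 window (up to 150 DIM), so CM2 is coded 0 in this example.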
The 5 correlated traits describe the information from test-day SCS observations from the first 3 lactations, and the observations of fore udder attachment (UA) and udder depth (UD) for the first lactation. Original test-day somatic cell counts were transformed to SCS observations by a logarithmic transformation, i.e., $\log_e$(1,000 cells/mL). A cow’s SCS observations were assigned to 3 different traits, SCS1, SCS2, and SCS3, categorized by lactation 1, 2, and 3, respectively, where observations within a lactation were modeled by regression functions, as explained in the next section. The included type traits UA and UD were type scores given by classifiers using a scale from 1 to 9.

Multitrait Udder Health Model with Covariance Functions for Animal Effects

The applied multitrait model simultaneously includes the binary observations for the CM traits, the repeated observations for the SCS traits, and the observations for the udder conformation traits. The model is described in detail in Negussie et al. (2010), and here we present only the most important model features relevant for this study. The same model is used for both the RDC and JER breeds, with only minor breed-specific differences, because the JER population is mainly kept in Denmark.

To model the animal effects, multivariate variance component analyses were carried out in the first step during model development, with all 9 traits included in the analyses. In these analyses, both the animal nonadditive and additive genetic effects for SCS were described by a second-order Legendre polynomial function plus the exponential term exp(−0.05 × DIM). Furthermore, the residual effects for the CM traits and the udder conformation traits were omitted, and an animal nonadditive genetic effect was modeled instead. This allowed residual correlations between the repeated SCS observations and the observations of the other 6 traits to be accounted for in the model.
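As a rough illustration of the SCS transformation and of the regression covariables named above (second-order Legendre polynomial plus exp(−0.05 × DIM)), the following sketch may help. The DIM range of 8 to 312 used for standardization, the unnormalized form of the Legendre polynomials, and the function names are assumptions made for illustration; the evaluation software may scale these covariables differently.

```python
import math
from typing import List


def scs_from_scc(scc_cells_per_ml: float) -> float:
    """Natural log of the somatic cell count expressed in 1,000 cells/mL."""
    return math.log(scc_cells_per_ml / 1000.0)


def scs_covariables(dim: float, dim_min: float = 8.0, dim_max: float = 312.0) -> List[float]:
    """Covariables for one test day: Legendre P0, P1, P2 and exp(-0.05 * DIM).

    Assumptions: DIM is mapped linearly from [dim_min, dim_max] to [-1, 1],
    and the polynomials are left unnormalized.
    """
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    p0 = 1.0
    p1 = x
    p2 = 0.5 * (3.0 * x * x - 1.0)
    return [p0, p1, p2, math.exp(-0.05 * dim)]


print(scs_from_scc(2000.0))    # about 0.69, the minimum SCS reported in Table 2
print(scs_covariables(160.0))  # mid-lactation: x = 0, so P1 = 0 and P2 = -0.5
```

The exponential term decays quickly, so it mainly affects the first weeks of lactation, whereas the polynomial terms describe the shape over the whole lactation.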
Table 1. Summary of pedigree structure for Nordic Red (RDC) and Jersey (JER) dairy cattle

| Item | RDC | JER |
|---|---|---|
| No. of animals in total | 6,885,001 | 1,166,650 |
| No. of sires (average no. of daughters per sire) | 91,956 (71.7) | 26,657 (41.2) |
| No. of dams (average no. of daughters per dam) | 4,212,920 (1.5) | 719,384 (1.5) |
| No. of unknown parent groups | 391 | 319 |
| Average inbreeding coefficient | 0.022 | 0.037 |
| No. of herds | 59,514 | 12,639 |

Table 2. Summary statistics of the modeled observations given by trait and breed

| Trait¹ | RDC n | RDC mean | RDC SD | RDC minimum | RDC maximum | JER n | JER mean | JER SD | JER minimum | JER maximum |
|---|---|---|---|---|---|---|---|---|---|---|
| CM11 | 4,791,842 | 0.06 | 0.245 | 0.00 | 1.00 | 601,988 | 0.14 | 0.347 | 0.00 | 1.00 |
| CM12 | 4,641,064 | 0.06 | 0.237 | 0.00 | 1.00 | 590,198 | 0.11 | 0.307 | 0.00 | 1.00 |
| CM2 | 3,452,089 | 0.11 | 0.311 | 0.00 | 1.00 | 427,327 | 0.14 | 0.344 | 0.00 | 1.00 |
| CM3 | 2,246,733 | 0.14 | 0.345 | 0.00 | 1.00 | 287,409 | 0.16 | 0.366 | 0.00 | 1.00 |
| SCS1 | 29,944,467 | 4.05 | 1.191 | 0.69 | 9.90 | 7,317,581 | 4.44 | 1.093 | 0.69 | 9.47 |
| SCS2 | 20,997,798 | 4.41 | 1.276 | 0.69 | 9.90 | 5,100,038 | 4.67 | 1.190 | 0.69 | 9.59 |
| SCS3 | 12,978,293 | 4.63 | 1.302 | 0.69 | 9.90 | 3,301,183 | 4.84 | 1.232 | 0.69 | 9.62 |
| UA | 1,161,944 | 5.58 | 1.364 | 1.00 | 9.00 | 307,635 | 5.47 | 1.171 | 1.00 | 9.00 |
| UD | 1,160,324 | 5.62 | 1.661 | 1.00 | 9.00 | 307,634 | 5.39 | 1.117 | 1.00 | 9.00 |

¹CM11 and CM12 = incidence of mastitis in the first lactation from 15 d before calving until 50 DIM, and from 51 to 305 DIM; CM2 and CM3 = incidence of mastitis in the second and the third lactations in the range of 15 d before calving up to DIM 150; SCS1, SCS2, and SCS3 = records of SCS in lactations 1, 2, and 3, respectively; UA = fore udder attachment; UD = udder depth.

The (co)variance components for the animal nonadditive and additive genetic effects were used to build covariance functions (CF) following Lidauer et al. (2015).
The correlation matrices of the original (co)variance matrices were decomposed by an eigenvalue decomposition and the largest eigenvalues and eigenfunctions that explained at least 99.0% of the variances of the original matrices were used to build the CF. The obtained CF for the animal nonadditive genetic effects describes an animal’s nonadditive genetic effects of all 9 traits by 15 animal-specific regression coefficients and 15 trait-specific covariables. In analogy, a CF with 12 animal-specific regression coefficients was developed for the animal additive genetic effects. Hence, the animal effects of the multitrait models correspond to the 15 and 12 largest eigenvalues of the original (co)variance matrices for the animal nonadditive and additive genetic effects, respectively.

The multitrait udder health model can be described in matrix notation as

$$y = Xb + Tk + F_a a + F_p p + e,$$

where vector y contains the observations of all traits; vector b contains all fixed effects; vector k contains random herd × year effects for CM and udder conformation traits, and random herd × test-day effects for the SCS traits; vector a contains the random animal additive genetic effects modeled by a CF; vector p contains the random animal nonadditive genetic effects modeled by a CF; and vector e contains the random residual. The matrices X, T, $F_a$, and $F_p$ are design matrices relating the records to the fixed, the random herd, the animal additive and nonadditive genetic effects, respectively, and where $F_a$ and $F_p$ contain the trait-specific covariables of the CF.
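The rank reduction used to build the CF, i.e., keeping the largest eigenvalues that together explain at least 99.0% of the variance, can be sketched as follows. The eigenvalues in the example are invented for illustration, and `n_components_for_share` is a hypothetical helper, not part of any evaluation software named in this article.

```python
from typing import Sequence


def n_components_for_share(eigenvalues: Sequence[float], min_share: float = 0.99) -> int:
    """Smallest number of leading eigenvalues whose sum reaches min_share of the total.

    Assumes non-negative eigenvalues, as for a valid (co)variance or correlation matrix.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0.0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= min_share:
            return k
    return len(ordered)


# Hypothetical eigenvalues, not taken from the study
hypothetical = [5.0, 3.0, 1.5, 0.45, 0.04, 0.01]
print(n_components_for_share(hypothetical))  # 4: the first 4 explain 99.5% of the total
```

In the study this kind of criterion led to 15 retained eigenvalues for the nonadditive animal effects and 12 for the additive genetic effects.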
The fixed effects for CM and udder conformation traits included a herd × 5-year time period effect; for SCS traits, a herd × production year effect; for all traits, a calving year × month × country effect, a linear and quadratic regression on calving age × country effect, and a linear regression on the cow’s total heterozygosity; and additionally for the SCS traits, a lactation stage effect nested within 4-year time period × season × country, which was modeled by a third-order Legendre polynomial plus the exponential term exp(−0.05 × DIM); and for the udder conformation traits the effect of the classifier.

The assumptions for the random effects were as follows: $k \sim \mathrm{MVN}(0, I \otimes K)$, with K being a (co)variance matrix for the random herd effects; $a \sim \mathrm{MVN}(0, A \otimes E_a)$, with $E_a$ being a diagonal matrix containing the 12 largest eigenvalues of the original (co)variance matrix for the animal additive genetic effects and A being the numerator relationship matrix; $p \sim \mathrm{MVN}(0, I \otimes E_p)$, with $E_p$ being a diagonal matrix containing the 15 largest eigenvalues of the original (co)variance matrix for the animal nonadditive effects; $e \sim \mathrm{MVN}(0, I \otimes R)$, with R being a (co)variance matrix for the random residual effects of all 9 traits, derived in the same analyses that were applied for deriving the CF for the animal nonadditive genetic effects; and I is an identity matrix of size specific to the random effect. As can be seen, a linear model was used for the CM traits, which, unlike a logistic model, is computationally feasible given the volume of data in the present study. However, to make the linear model applicable for binary traits (Cook et al., 2017), the observations have been adjusted for heterogeneity of variance to account for differences in CM incidence rate between participating countries. We applied the same variance components as in Negussie et al. (2010).
For RDC, heritability was 0.038, 0.019, 0.051, and 0.046 for CM11, CM12, CM2, and CM3, and 0.30 and 0.39 for UA and UD, respectively. The lactational heritability for SCS was 0.135, 0.178, and 0.175 for SCS1, SCS2, and SCS3, respectively. For JER, the corresponding heritabilities were 0.037, 0.016, 0.040, and 0.065 for CM11, CM12, CM2, and CM3, and 0.24 and 0.32 for UA and UD, respectively. The lactational heritabilities for SCS were 0.154, 0.191, and 0.205 for SCS1, SCS2, and SCS3, respectively.

Models

For both breeds, we investigated the reliability of prediction for 5 different model alternatives. The effects of the above-described multitrait CF model can be predicted by pedigree-based BLUP, which yields PEBV. Hereafter, pedigree-based BLUP (PBLUP) refers to this model. For the other 4 model alternatives, we investigated different alternatives for including genomic information into the models to monitor the prediction reliability of GEBV.

Single-Step Genomic Prediction with Equal SNP Marker Weights. As a first genomic prediction model, we applied a standard ssGBLUP model, which was identical to the PBLUP model, except that the numerator relationship matrix A in the PBLUP model was substituted with an H matrix calculated using both the pedigree and genotype information (Aguilar et al., 2010; Christensen and Lund, 2010), i.e., $a \sim \mathrm{MVN}(0, H \otimes E_a)$, where

$$H^{-1} = A^{-1} + \begin{bmatrix} 0 & 0 \\ 0 & G^{-1} - A_{22}^{-1} \end{bmatrix},$$

with G equal to the genomic relationship matrix (GRM) and $A_{22}$ being the pedigree-based relationship matrix of the genotyped individuals. The standard ssGBLUP predictions were computed using a method called single-step GBLUP with a T factoring (ssGTABLUP) proposed in Mäntysaari et al. (2017). This method allows efficient solving of GEBV. In ssGTABLUP, the GRM has the form $G = ZZ' + C$, where Z is a centered and scaled marker matrix and $C = wA_{22}$, with w equal to the residual polygenic proportion, which was 0.10. For scaling the Z matrix, we required the average diagonal of $ZZ'$ to be equal to the average diagonal of $A_{22}$.

Single-Step Genomic Prediction with SNP Marker-Specific Weights. We studied 4 single-step genomic prediction models, of which 3 had marker-specific weights. Thus, the trait-specific (j), i.e., for the models studied here the eigenvalue-specific trait, $\mathrm{GRM}_j$ was

$$G_j = Z_j W_j Z_j' + C,$$

where the SNP markers received different weights in the diagonal matrix $W_j$. For the models with SNP marker-specific weights, the predictions were computed by applying the single-step single nucleotide polymorphism BLUP (ssSNPBLUP; Liu et al., 2014) approach, which is equivalent to ssGTABLUP when the $W_j$ matrix is an identity matrix. The ssSNPBLUP approach allows trait-specific SNP marker weights to be applied (Strandén and Jenko, 2024). Solving an ssSNPBLUP model needs less memory than solving an ssGTABLUP model, although ssSNPBLUP needs a higher number of iterations to reach the same convergence (Vandenplas et al., 2023; Strandén and Jenko, 2024).

In the computation of SNP marker weights, we used SNP marker estimates obtained from the standard ssGBLUP model using the reduced dataset, which will be explained in the Prediction of Breeding Values section. The mean marker weight was standardized to be one within every trait. The SNP marker weights were computed for each of the 12 eigenvalue-based traits separately by 3 different approaches:

1. A nonlinear formula (hereafter “Nonlinear”) introduced by VanRaden (2008) and Cole et al. (2009) for the weight of marker i of eigenvalue trait j is

$$1.25^{\frac{|\hat{u}_{ji}|}{\mathrm{sd}(\hat{u}_j)} - 2},$$

where $|\hat{u}_{ji}|$ is the absolute value of the estimated SNP effect for marker i and $\mathrm{sd}(\hat{u}_j)$ is the SD of all estimated SNP effects for eigenvalue j.
   No restriction was applied for the upper bound of $|\hat{u}_{ji}|/\mathrm{sd}(\hat{u}_j)$ because there were only a few SNPs with values higher than 10 (between 10 and 11).

2. In the second approach, which we will refer to as “2pqû²,” the SNP marker weights were calculated using the classical method described by Falconer and Mackay (1996) and Zhang et al. (2010), which assumes no dominance or epistatic effects among SNPs. The weight formula is $2 p_i q_i \hat{u}_{ji}^2$, where $p_i$ and $q_i = 1 - p_i$ are allele frequencies and $\hat{u}_{ji}$ is the estimated SNP effect for marker i for eigenvalue trait j.

3. The third approach was similar to the method used by Zhang et al. (2016), in which average weights are calculated from the weights obtained by the 2pqû² approach. The average weights of every 20 adjacent SNPs (20SNP_window) were calculated.

As mentioned earlier, in the marker-weighted ssSNPBLUP model, each trait (in our model eigenvalue trait) had its own specific marker weights in the $W_j$ matrix. Thus, each eigenvalue trait j had a diagonal marker weight matrix $W_{jj}$. The marker-weighted ssSNPBLUP model requires weights for the eigenvalue-by-eigenvalue trait combinations, i.e., a weight matrix $W_{jl}$ for every combination of eigenvalues j and l. In our study, these weights were computed from the eigenvalue-specific weights by assuming a correlation of one between the trait-specific marker weights. Thus, the weight for marker i between eigenvalue traits j and l was

$$w_{jl,i} = \sqrt{w_{jj,i}\, w_{ll,i}},$$

where $w_{jj,i}$ is the weight of eigenvalue trait j for marker i.

Prediction of Breeding Values

For each breed and for each model studied, 2 genetic evaluations were performed to facilitate model validation by forward prediction. One evaluation used the full data (denoted with subscript f), and the second used reduced data (denoted with subscript r), from which all observations in the last 4 years were excluded. Inbreeding coefficients were included in the pedigree-based relationship matrix computations.
For calculating the inbreeding coefficients, the RelaX2 program was used (Strandén and Vuori, 2006). The preconditioned conjugate gradient method implemented in the MiX99 program suite (Strandén and Lidauer, 1999) was employed to solve the mixed model equations. In all analyses, a value of $10^{-6}$ for the square root of the relative difference between consecutive solutions was used as the convergence criterion.

An animal’s PEBV and GEBV for CM and SCS were calculated by multiplying the animal’s 12 regression coefficient estimates for the animal’s additive genetic effects with the trait-specific covariables used in $F_a$. For studying the reliability of predictions, we formed combined breeding value indices for CM and for SCS. For CM, the trait-specific breeding value index weighted CM11, CM12, CM2, and CM3 by 0.15, 0.15, 0.25, and 0.45, respectively, hereafter denoted combined_CM. For SCS, first an average daily SCS breeding value was estimated for each lactation for the period DIM 8 to DIM 312, followed by combining the lactation averages for SCS1, SCS2, and SCS3 by applying the weights of 0.30, 0.25, and 0.45, respectively, denoted combined_SCS. The applied weights correspond to those currently used by NAV. For both combined_CM and combined_SCS, negative estimated breeding values are favorable.

Summary statistics of the combined GEBV were tabulated for CM and SCS, and the yearly means of the combined PEBV and GEBV from the reduced and full data evaluations were examined for both breeds to assess the genetic trends in CM. In addition, to show the source of the difference between different single-step scenarios, SNP weights calculated by the 2pqû² and Nonlinear approaches were plotted.
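The 3 SNP weighting approaches and the cross-trait weight described in the section on marker-specific weights can be sketched in a few lines. This is a minimal illustration under stated assumptions: the SD is taken as the population SD, the 20-SNP windows are non-overlapping runs in marker order that ignore chromosome boundaries, and all function names and example values are hypothetical rather than taken from the software used in the study.

```python
import math
import statistics
from typing import List, Sequence


def standardize(weights: Sequence[float]) -> List[float]:
    """Rescale weights so that their mean is one, as done within every trait."""
    mean = sum(weights) / len(weights)
    return [w / mean for w in weights]


def nonlinear_weights(u_hat: Sequence[float]) -> List[float]:
    """Nonlinear approach: 1.25 ** (|u_i| / sd(u) - 2), then mean-standardized."""
    sd = statistics.pstdev(u_hat)  # assumption: population SD of the SNP effects
    return standardize([1.25 ** (abs(u) / sd - 2.0) for u in u_hat])


def classical_weights(p: Sequence[float], u_hat: Sequence[float]) -> List[float]:
    """Classical approach: 2 * p_i * q_i * u_i ** 2 with q_i = 1 - p_i, mean-standardized."""
    return standardize([2.0 * pi * (1.0 - pi) * u * u for pi, u in zip(p, u_hat)])


def window_weights(weights: Sequence[float], window: int = 20) -> List[float]:
    """Replace each weight by the mean of its window of adjacent SNPs.

    Assumption: non-overlapping windows; a shorter final window is averaged
    over the SNPs it contains.
    """
    averaged: List[float] = []
    for start in range(0, len(weights), window):
        block = weights[start:start + window]
        averaged.extend([sum(block) / len(block)] * len(block))
    return averaged


def cross_weight(w_jj: float, w_ll: float) -> float:
    """Weight of one marker between eigenvalue traits j and l (correlation of one assumed)."""
    return math.sqrt(w_jj * w_ll)


# Hypothetical allele frequencies and SNP effect estimates for 4 markers
p = [0.5, 0.2, 0.1, 0.3]
u_hat = [0.1, -0.2, 0.0, 0.4]
print(nonlinear_weights(u_hat))
print(classical_weights(p, u_hat))
print(window_weights([1.0, 2.0, 3.0, 4.0, 5.0], window=2))  # [1.5, 1.5, 3.5, 3.5, 5.0]
print(cross_weight(4.0, 9.0))  # 6.0
```

One visible difference between the approaches is that the Nonlinear formula gives every marker a nonzero weight, whereas 2pqû² assigns a weight of zero to a marker with an estimated effect of zero. Window averaging preserves the mean weight, so the standardization to a mean of one still holds afterwards.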
Model Validation

For the calculation of validation statistics, suitable sets of validation animals were selected that received a breeding value evaluation without their own phenotypic information in the reduced data and a breeding value evaluation using the full data. The following steps were used to select validation candidate bulls and cows for each breed. First, individual animal reliabilities of PEBV were calculated for both the full and reduced datasets (PEBVf and PEBVr), using the method proposed by Tier and Meyer (2004) for multitrait models. The reliabilities of CM and SCS were computed for the combined_CM and combined_SCS breeding values. Second, the combined reliabilities of genotyped bulls and cows were used to calculate effective record contributions (ERC) separately for bulls and cows. This was carried out using the reversed reliability approximation method introduced by Harris and Johnson (1998) as implemented in MiX99 (Ben Zaabza et al., 2022). Then, genotyped bulls with an ERC ≥2 using the full data evaluation and equal to zero using the reduced data evaluation were selected as candidate bulls, yielding for CM 86 and 115 candidates, and for SCS 125 and 119 candidates for RDC and JER, respectively. Correspondingly, genotyped cows with an ERC ≥0.9 using the full data and equal to zero using the reduced data were selected as candidate cows, yielding for CM 8,440 and 8,224 candidates, and for SCS 18,112 and 6,537 candidates for RDC and JER, respectively.

Following Legarra and Reverter (2018), we used linear regression of PEBVf on PEBVr, and of GEBV from the full dataset on GEBV from the reduced dataset (GEBVf on GEBVr), to measure bias (b0; intercept), dispersion (b1; slope of regression), and prediction reliability (R²; square of the correlation between estimates obtained using the reduced and full datasets) for the 5 different models.
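The 3 validation statistics can be computed with an ordinary least-squares regression of full-data breeding values on reduced-data breeding values. The sketch below follows the definitions given above (intercept, slope, squared correlation); the function name `lr_validation` and the example values are illustrative assumptions.

```python
from typing import Sequence, Tuple


def lr_validation(ebv_full: Sequence[float], ebv_reduced: Sequence[float]) -> Tuple[float, float, float]:
    """Regress full-data EBV on reduced-data EBV for a set of validation candidates.

    Returns (b0, b1, r2): intercept (bias), slope (dispersion), and squared
    correlation (validation reliability).
    """
    if len(ebv_full) != len(ebv_reduced) or len(ebv_full) < 2:
        raise ValueError("need two equally long vectors with at least 2 animals")
    n = len(ebv_full)
    mean_f = sum(ebv_full) / n
    mean_r = sum(ebv_reduced) / n
    s_rr = sum((r - mean_r) ** 2 for r in ebv_reduced)
    s_ff = sum((f - mean_f) ** 2 for f in ebv_full)
    s_fr = sum((f - mean_f) * (r - mean_r) for f, r in zip(ebv_full, ebv_reduced))
    if s_rr == 0.0 or s_ff == 0.0:
        raise ValueError("breeding values must vary among candidates")
    b1 = s_fr / s_rr
    b0 = mean_f - b1 * mean_r
    r2 = s_fr * s_fr / (s_rr * s_ff)
    return b0, b1, r2


# Hypothetical breeding values for 4 candidates
reduced = [0.0, 1.0, 2.0, 3.0]
full = [0.5, 1.3, 2.1, 2.9]
b0, b1, r2 = lr_validation(full, reduced)
print(round(b0, 3), round(b1, 3), round(r2, 3))  # 0.5 0.8 1.0
```

A slope below 1, as in this toy example, means that the reduced-data predictions are overdispersed, i.e., differences among candidates shrink once their own records are added.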
For investigating the consistency of the estimates in different scenarios, the SE of the estimates was obtained by bootstrapping. The number of bootstrap samples was set to 10,000, and the method used was Ordinary Nonparametric Bootstrap (R Core Development Team, 2012). All the analyses were performed on a server with an AMD EPYC 7443P CPU (2.85 GHz).

RESULTS AND DISCUSSION

Summary Statistics and Input Parameters

Two large datasets of RDC and JER cows were used in this study to evaluate the value of applying SNP marker weights in single-step genomic prediction for the Nordic udder health evaluation. The 9 traits included in the Nordic multitrait model for udder health are CM and SCS in the first 3 lactations and UA and UD in the first lactation only, and of these, only a combined GEBV for CM has been incorporated in the Nordic Total Merit Index. Therefore, our main interest was in the effect of the different modeling alternatives on the reliability of the combined breeding values for CM. Furthermore, because breeding values for SCS are used in many countries, we also investigated the effect on the combined breeding values for SCS.

The CM incidences were higher in JER cows than in RDC cows, ranging from 0.11 to 0.16 for JER cows and from 0.06 to 0.14 for RDC cows, with the highest difference in the first lactation (Table 2). In addition, JER cows exhibited a higher susceptibility to new infections at the beginning of lactation, as their incidence rate during this period was more than double that of RDC cows. Furthermore, JER cows also had higher SCS averages, ranging from 4.44 in the first lactation to 4.84 in the third lactation, compared with RDC cows with average SCS of 4.05 and 4.63, respectively. The heritability of the combined_CM trait was 0.09 and 0.12 for RDC and JER cows, respectively, and the heritability of the combined_SCS trait was 0.25 and 0.26 for RDC and JER cows, respectively.
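The combined_CM and combined_SCS values referred to here are weighted sums of trait-specific breeding values, which in turn are products of an animal's regression coefficients and trait-specific covariables, as defined in the Prediction of Breeding Values section. A minimal sketch with the index weights given there is shown below; the function names and the example breeding values are hypothetical, and the averaging of daily SCS breeding values over DIM 8 to 312 is not shown.

```python
from typing import Dict, Sequence

# Index weights as stated in the Prediction of Breeding Values section
CM_INDEX_WEIGHTS: Dict[str, float] = {"CM11": 0.15, "CM12": 0.15, "CM2": 0.25, "CM3": 0.45}
SCS_INDEX_WEIGHTS: Dict[str, float] = {"SCS1": 0.30, "SCS2": 0.25, "SCS3": 0.45}


def trait_ebv(coefficients: Sequence[float], covariables: Sequence[float]) -> float:
    """Breeding value of one trait: sum of regression coefficients times covariables.

    In the study each animal has 12 coefficients; any equal lengths work here.
    """
    if len(coefficients) != len(covariables):
        raise ValueError("coefficients and covariables must have the same length")
    return sum(c * f for c, f in zip(coefficients, covariables))


def combined_index(trait_ebvs: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of trait-specific breeding values."""
    return sum(weight * trait_ebvs[trait] for trait, weight in weights.items())


# Hypothetical trait breeding values for one animal (negative is favorable)
cm_ebvs = {"CM11": -0.02, "CM12": -0.01, "CM2": 0.00, "CM3": 0.02}
print(combined_index(cm_ebvs, CM_INDEX_WEIGHTS))
```

Because CM3 carries the largest weight (0.45), an unfavorable third-lactation breeding value dominates the combined value in this example even though the first-lactation values are favorable.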
The structure of the full and reduced datasets created by truncating the last 4 years of the data is shown in Table 3. Truncation reduced the number of cows with phenotypic records by 6% and 13% for RDC and JER, respectively. On average, there were more test-day records per JER cow than per RDC cow, because in Finnish RDC herds, test-day recording is done in bi-monthly intervals. As genotyping gradually became cheaper, there was an increasing trend in the number of genotyped cows such that more than half of the genotyped cows had observations in the last 4 years of the data. However, not all of these genotyped cows could be used as candidates because many of them only had observations in the reduced dataset.

Validation of Clinical Mastitis

Before performing the forward validation, the PEBV and GEBV of animals were standardized based on the averages of PEBV and GEBV of animals, respectively, born between 2000 and 2002. The results of the forward validation for bull and cow candidates of RDC and JER breeds for combined_CM are shown in Table 4. Regression of breeding values from the full dataset on those from the reduced dataset showed slightly lower bias and dispersion for PBLUP compared with the single-step genomic prediction models. The estimated bias using regression of PEBVf on PEBVr for RDC cows was −0.001, compared with 0.003 to 0.005 for the single-step model scenarios. The SD of PEBVf and GEBVf for the RDC breed were 0.032 and 0.032, respectively. Among the genomic models, the standard ssGBLUP model generally resulted in b1 estimates closer to 1.0 than the marker-weighted scenarios. Similarly, Liu et al. (2020) reported higher dispersion when whole genome sequence was added to the conventional 50K SNP chip in genomic evaluation compared with using the conventional 50K SNP chip alone.
This is difficult to interpret, but it seems that an increase in prediction accuracy comes along with slightly lower precision (i.e., higher dispersion). Results showed that GEBV are significantly more accurate than PEBV. In the standard ssGBLUP model, the reliabilities of predictions for RDC and JER bull candidates were 0.50 and 0.65, respectively, which corresponds to 82% and 132% higher reliability than when using PBLUP. Corresponding values for RDC and JER cow candidates were 0.74 and 0.72, respectively, representing 54% and 77% improvements over PBLUP. We observed a higher relative improvement in reliability for JER compared with RDC when moving from PBLUP to single-step genomic prediction. This is probably due to the differences in the population structure between these 2 breeds, resulting in differences in linkage disequilibrium. The JER breed is genetically more homogeneous and more inbred than the RDC breed. Higher heritability of CM in the JER breed and additionally higher ERC for JER bull than RDC bull candidates could be the reasons for obtaining higher reliability in JER bulls. Previous studies (Meuwissen et al., 2001; Calus et al., 2008; Liu et al., 2015) showed that linkage disequilibrium has a direct significant effect on prediction reliability. Liu et al. (2015) found that when there is a high linkage disequilibrium between QTL and marker, there is higher stability in prediction accuracy over generations. Therefore, according to the results, it could be concluded that linkage disequilibrium is higher in JER cows than in RDC cows. On the other hand, Wientjes et al. (2013) expressed that the level of relationship with individuals in the reference population has a much higher impact on the prediction reliability than linkage disequilibrium per se. Accordingly, JER bull candidates might have on average higher relationship with their reference population, as they have on average a higher ERC in the full dataset.

Table 3. Structure of the full and reduced validation datasets and number of records of any trait

| Item | Nordic Red, full | Nordic Red, reduced | Jersey, full | Jersey, reduced |
|---|---|---|---|---|
| No. of cows | 5,550,887 | 5,223,866 | 916,258 | 801,873 |
| No. of records per cow | 13.42 | 13.12 | 18.64 | 18.80 |
| Genotyped cows with records | 123,436 | 59,881 | 63,531 | 28,865 |
| No. of records per genotyped cow | 15.74 | 14.11 | 17.56 | 19.07 |
| No. of sires | 91,956 | 47,197 | 26,657 | 7,875 |
| Genotyped sires with daughter records | 1,818 | 1,640 | 895 | 569 |
| No. of daughter records per genotyped sire | 3,692 | 1,939 | 3,467 | 2,277 |

Table 4. Results of forward validation (SE in parentheses) of bull and cow candidate groups for combined clinical mastitis breeding values based on pedigree-based or single-step genomic prediction with different SNP weights for Nordic Red (RDC) and Jersey (JER) dairy cattle

| Breed | Group | n | ERC² | Model³ | b0¹ | b1¹ | R2¹ | % Gain⁴ |
|---|---|---|---|---|---|---|---|---|
| RDC | Bull | 86 | 8.37 | PBLUP | 0.000 (0.006) | 0.79 (0.175) | 0.28 (0.078) | |
| | | | | Standard ssGBLUP | 0.002 (0.005) | 0.75 (0.084) | 0.50 (0.077) | |
| | | | | Nonlinear | 0.001 (0.005) | 0.73 (0.078) | 0.51 (0.079) | 1.8 |
| | | | | 2pqû2 | 0.001 (0.004) | 0.68 (0.063) | 0.57 (0.082) | 12.7 |
| | | | | 20SNP_window | 0.000 (0.004) | 0.70 (0.073) | 0.50 (0.082) | −1.4 |
| RDC | Cow | 8,440 | 1.63 | PBLUP | −0.001 (0.000) | 0.83 (0.010) | 0.48 (0.009) | |
| | | | | Standard ssGBLUP | 0.005 (0.000) | 0.87 (0.006) | 0.74 (0.005) | |
| | | | | Nonlinear | 0.005 (0.000) | 0.85 (0.005) | 0.74 (0.005) | 1.1 |
| | | | | 2pqû2 | 0.003 (0.000) | 0.79 (0.005) | 0.78 (0.004) | 5.3 |
| | | | | 20SNP_window | 0.005 (0.000) | 0.85 (0.005) | 0.75 (0.005) | 1.8 |
| JER | Bull | 115 | 45.20 | PBLUP | 0.006 (0.007) | 0.85 (0.128) | 0.28 (0.066) | |
| | | | | Standard ssGBLUP | 0.013 (0.004) | 0.78 (0.052) | 0.65 (0.049) | |
| | | | | Nonlinear | 0.015 (0.005) | 0.77 (0.051) | 0.66 (0.049) | 0.5 |
| | | | | 2pqû2 | 0.010 (0.004) | 0.70 (0.043) | 0.66 (0.047) | 1.1 |
| | | | | 20SNP_window | 0.012 (0.005) | 0.74 (0.050) | 0.64 (0.050) | −2.5 |
| JER | Cow | 8,224 | 1.30 | PBLUP | 0.004 (0.001) | 0.90 (0.011) | 0.41 (0.008) | |
| | | | | Standard ssGBLUP | 0.010 (0.000) | 0.89 (0.006) | 0.72 (0.005) | |
| | | | | Nonlinear | 0.012 (0.000) | 0.88 (0.006) | 0.73 (0.005) | 1.9 |
| | | | | 2pqû2 | 0.008 (0.000) | 0.79 (0.005) | 0.76 (0.005) | 5.3 |
| | | | | 20SNP_window | 0.011 (0.000) | 0.87 (0.006) | 0.74 (0.005) | 3.1 |

¹Forward validation: b0 = intercept (bias), b1 = regression coefficient (dispersion), R2 = validation reliability.
²Average of effective record contribution (ERC) using the full dataset for each candidate group.
³Prediction of breeding values without genomic information (PBLUP) and with genomic information where equal SNP marker weights (standard ssGBLUP) or SNP marker-specific weights (Nonlinear, 2pqû2, 20SNP_window) have been applied.
⁴Percent of gain in R2 relative to standard ssGBLUP.

All marker weighting scenarios, except for 20SNP_window in bulls, resulted in an increase in reliabilities compared with the standard ssGBLUP, ranging from 0.5% (for the Nonlinear weighting approach for JER bull candidates) to 12.7% (for the 2pqû2 weighting approach for RDC bull candidates). Nonetheless, it should be noted that the validation reliability estimates had large standard errors, especially for the bulls. The amount of improvement in reliability by marker weighting differed by breed. Considering the reliability of ssGBLUP as the basis, the RDC breed gained relatively higher improvements than the JER breed.
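The forward-validation statistics reported in Table 4 can be illustrated with a minimal sketch. It assumes that b0 and b1 are the intercept and slope from regressing full-data breeding values on reduced-data breeding values (as described in the text), that R2 is the coefficient of determination of that regression, and that the percent gain is the relative change in R2 versus the standard ssGBLUP. The exact computation in the original analysis may differ, and all function names and toy values are hypothetical.

```python
import statistics


def forward_validation(ebv_reduced, ebv_full):
    """Return (b0, b1, r2) for the regression ebv_full = b0 + b1 * ebv_reduced."""
    mean_x = statistics.fmean(ebv_reduced)
    mean_y = statistics.fmean(ebv_full)
    sxx = sum((x - mean_x) ** 2 for x in ebv_reduced)
    syy = sum((y - mean_y) ** 2 for y in ebv_full)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ebv_reduced, ebv_full))
    b1 = sxy / sxx               # dispersion: 1.0 means no over- or under-dispersion
    b0 = mean_y - b1 * mean_x    # bias: 0.0 means no level bias
    r2 = sxy ** 2 / (sxx * syy)  # squared correlation of the two sets of values
    return b0, b1, r2


def percent_gain(r2_weighted, r2_standard):
    """Relative change in R2 (in %) of a weighted scenario over the standard model."""
    return 100.0 * (r2_weighted - r2_standard) / r2_standard


# Toy values lying exactly on the line y = 0.1 + 0.8 * x (hypothetical data).
b0, b1, r2 = forward_validation([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 1.7, 2.5])
gain = percent_gain(0.55, 0.50)
print(b0, b1, r2, gain)
```

Gains recomputed from the rounded R2 values in the table need not match the published gains exactly, because rounding the reliabilities to 2 decimals changes their ratio.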
Among the weighting scenarios, 2pqû2 performed best, whereas 20SNP_window was less effective. However, the Nonlinear weights were only run for one round of analysis. In a simulation study for a relatively highly heritable trait, Zhang et al. (2016) showed that the Nonlinear weighting method yields higher prediction accuracy than the 20SNP_window method when the number of QTL affecting the trait is high (~500), whereas the 20SNP_window method became more advantageous as the number of influential QTL decreased. Fragomeni et al. (2019) obtained about 3% lower reliability by applying quadratic SNP weights, which is inconsistent with our results. They did not include allele frequencies when calculating quadratic weights (i.e., û2). Moreover, they studied a different trait, probably with a different genetic architecture. However, Lourenco et al. (2014) reported improvements in prediction reliability for fat and protein percentages in a small Israeli Holstein population by using quadratic weights. It can be expected that for CM, similar to milk yield traits, there are few QTL with large effects and many QTL with small effects (Cai et al., 2024). As inbreeding and consequently linkage disequilibrium are higher in JER than in RDC, SNP weights will undergo smaller changes over time in this breed and could therefore be used for a longer period before being updated.

In several studies, biological information has been used as priors to improve the prediction reliability of genomic prediction for different traits and species (Brøndum et al., 2015; Abdollahi-Arpanahi et al., 2017; Fang et al., 2017; Liu et al., 2020; Rezende et al., 2020; Farooq et al., 2021). However, there was no general consensus on the usefulness of employing biological information. Brøndum et al. (2015) obtained some improvement in prediction reliability, ranging from 0.5 percentage points for fertility to up to 5 percentage points for production traits in French Holsteins. Fang et al. (2017) reported that by using the best gene ontology, an average of 0.16 higher prediction accuracy was attained for production traits and CM. In another study, by including whole genome sequence information, Liu et al. (2020) reported significant improvement in prediction reliability for milk and protein production. However, negligible improvement was obtained for mastitis in their study. No improvement in the predictive ability of the models enriched with biological information was obtained by Abdollahi-Arpanahi et al. (2017) for sire conception rate in US Holstein cows. In contrast, using the same methodology, Rezende et al. (2020) obtained a 7% higher prediction accuracy for sire conception rate in US JER cows. Abdollahi-Arpanahi et al. (2017) stated that the predictive ability of functional classes of SNPs is not primarily influenced by their biological roles, but rather by considering the genomic relationship. By this definition, assigning weights to SNPs, such as constructing a trait-specific GRM, is better able to include the actual relationship between individuals for the trait of interest. In fact, this definition implies that the size of the relationship between 2 individuals is different for 2 different traits. For instance, considering that CM has different forms in terms of duration of disease as well as types of symptoms and can be caused by different species of bacteria, it can be expected that by allocating specific marker weights for a specific form, higher genetic improvement for that form of CM could be obtained.

The ranking of candidate bulls for combined_CM was compared between the different scenarios. Spearman correlations between GEBV of candidate bulls estimated using 2pqû2 and those estimated using Nonlinear, 20SNP_window, and standard ssGBLUP were 0.94, 0.94, and 0.92, respectively.
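The rank comparisons between scenarios can be sketched as a Spearman correlation, computed here as the Pearson correlation of average ranks, both for all candidates and for the k best animals chosen under a reference scenario. The sketch assumes that lower breeding values are better for these traits and that the values in each subset are not all identical. Function names and toy values are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither x nor y is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    return pearson(average_ranks(x), average_ranks(y))


def spearman_top_k(gebv_ref, gebv_other, k):
    """Spearman correlation among the k best animals chosen on gebv_ref
    (lowest values are treated as best)."""
    best = sorted(range(len(gebv_ref)), key=lambda i: gebv_ref[i])[:k]
    return spearman([gebv_ref[i] for i in best], [gebv_other[i] for i in best])


# Hypothetical GEBV of 5 candidates under two scenarios.
gebv_a = [0.3, -0.2, 0.1, -0.5, 0.0]
gebv_b = [0.2, -0.1, 0.15, -0.4, -0.3]
rho_top3 = spearman_top_k(gebv_a, gebv_b, 3)
```

Restricting the comparison to the best animals narrows the range of breeding values, so correlations within the selected group are expected to be lower than across all candidates.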
In addition, Spearman correlations between GEBV of the 20 best candidate bulls selected based on 2pqû2 and their corresponding GEBV estimated using Nonlinear, 20SNP_window, and standard ssGBLUP were 0.83, 0.83, and 0.77, respectively. The corresponding correlations for the 100 best cows selected from the candidate cows were 0.59, 0.56, and 0.47, respectively.

Validation of SCS

Results of forward validation for bull and cow candidates of RDC and JER for combined_SCS are shown in Table 5. For combined_SCS, prediction biases were lowest for PBLUP, and among genomic scenarios, the 2pqû2 weighting approach generally yielded lower biases. Biases for combined_SCS were higher than those for combined_CM, ranging from 0.45 for PBLUP in RDC bull candidates to 8.43 for the standard ssGBLUP approach in JER cow candidates. The SD of PEBVf and GEBVf for the RDC breed were 27.59 and 27.80, respectively. The corresponding values for the JER breed were 23.14 and 24.01, respectively. Dispersions followed the same pattern as for CM. Among the different scenarios, the standard ssGBLUP and the 2pqû2 weighting approaches showed the lowest and highest dispersion, respectively.

Reliabilities of the standard ssGBLUP model for combined_SCS ranged from 0.58 to 0.79 depending on the breed and sex. The improvement in reliability by employing the standard ssGBLUP model was more pronounced for combined_SCS than for combined_CM, and the amount of gain was higher for bull than for cow candidates. On average, the reliability of genomic prediction was 5.6% higher for combined_SCS than for combined_CM, which indicates that reliability for combined_CM was close to that of combined_SCS, although the heritability of combined_CM was clearly lower.
The small difference between the reliabilities of these 2 composite traits could be attributed to the type of model, which was a multitrait model. Schaeffer (1984) and Thompson and Meyer (1986) showed that low heritable traits benefit more when analyzed with high heritable traits, and the amount of gain in reliability depends on the absolute value of the difference between the genetic and environmental correlations among the traits included in the model. Moreover, additional improvement in reliability would be obtained by establishing better connections in the data by using a multitrait model (Thompson and Meyer, 1986). In estimating genetic and environmental correlations between udder health traits in Finnish dairy cows, Pösö and Mäntysaari (1996) reported a large difference in the absolute value of genetic and environmental correlations between CM and SCS. Considerable differences between genetic and environmental correlations of udder type traits and CM have been observed in other studies (Rupp and Boichard, 1999; Amin et al., 2002).

Table 5. Results of forward validation (SE in parentheses) of bull and cow candidate groups for combined SCS breeding values based on pedigree-based or single-step genomic prediction with different SNP weights for Nordic Red (RDC) and Jersey (JER) dairy cattle

| Breed | Group | n | ERC² | Model³ | b0¹ | b1¹ | R2¹ | % Gain⁴ |
|---|---|---|---|---|---|---|---|---|
| RDC | Bull | 125 | 15.28 | PBLUP | 0.45 (3.210) | 0.89 (0.124) | 0.28 (0.064) | |
| | | | | Standard ssGBLUP | 6.83 (2.622) | 0.86 (0.062) | 0.58 (0.069) | |
| | | | | Nonlinear | 7.34 (2.587) | 0.83 (0.058) | 0.59 (0.068) | 2.6 |
| | | | | 2pqû2 | 7.21 (2.433) | 0.77 (0.049) | 0.64 (0.060) | 11.1 |
| | | | | 20SNP_window | 6.66 (2.555) | 0.82 (0.062) | 0.59 (0.068) | 2.6 |
| RDC | Cow | 18,112 | 1.10 | PBLUP | 1.94 (0.185) | 1.01 (0.007) | 0.50 (0.005) | |
| | | | | Standard ssGBLUP | 6.11 (0.155) | 0.97 (0.004) | 0.77 (0.003) | |
| | | | | Nonlinear | 6.84 (0.155) | 0.94 (0.004) | 0.78 (0.003) | 0.5 |
| | | | | 2pqû2 | 5.82 (0.156) | 0.87 (0.003) | 0.79 (0.003) | 2.2 |
| | | | | 20SNP_window | 6.63 (0.154) | 0.94 (0.004) | 0.78 (0.003) | 0.9 |
| JER | Bull | 119 | 65.94 | PBLUP | −5.13 (5.164) | 0.63 (0.143) | 0.15 (0.061) | |
| | | | | Standard ssGBLUP | 8.17 (3.354) | 0.81 (0.055) | 0.61 (0.055) | |
| | | | | Nonlinear | 7.80 (3.284) | 0.80 (0.052) | 0.63 (0.052) | 2.6 |
| | | | | 2pqû2 | 4.06 (3.116) | 0.70 (0.045) | 0.65 (0.051) | 5.5 |
| | | | | 20SNP_window | 7.55 (3.378) | 0.80 (0.053) | 0.64 (0.052) | 4.1 |
| JER | Cow | 6,537 | 1.07 | PBLUP | 4.06 (0.500) | 1.00 (0.014) | 0.41 (0.009) | |
| | | | | Standard ssGBLUP | 8.43 (0.313) | 0.97 (0.006) | 0.79 (0.005) | |
| | | | | Nonlinear | 8.43 (0.314) | 0.96 (0.006) | 0.80 (0.004) | 0.9 |
| | | | | 2pqû2 | 5.71 (0.292) | 0.87 (0.005) | 0.81 (0.004) | 2.8 |
| | | | | 20SNP_window | 7.66 (0.306) | 0.95 (0.006) | 0.80 (0.004) | 1.3 |

¹Forward validation: b0 = intercept (bias), b1 = regression coefficient (dispersion), R2 = validation reliability.
²Average of effective record contribution (ERC) using the full dataset for each candidate group.
³Prediction of breeding values without genomic information (PBLUP) and with genomic information where equal SNP marker weights (standard ssGBLUP) or SNP marker-specific weights (Nonlinear, 2pqû2, 20SNP_window) have been applied.
⁴Percent of gain in R2 relative to standard ssGBLUP.

Similarly, marker weighting was also beneficial and improved reliabilities of GEBV for combined_SCS, and the amount of gain (compared with the standard ssGBLUP) was higher for bulls (ranging from 2.6% to 11.1%) than for cows (ranging from 0.5% to 2.8%). Obtained reliabilities of genomic predictions were higher for cows than for bulls. This could be because the candidate bulls had much more information in the full dataset compared with the candidate cows, and therefore more deviation from GEBV using the reduced dataset and subsequently lower reliability can be expected. Also, dispersions were lower for cows. Similar results were found in previous studies (Kudinov et al., 2022; Zavadilova et al., 2022).

For combined_SCS, the 2pqû2 weighting scenario also outperformed the other marker weighting methods, and there was a trivial difference between the reliabilities obtained by the Nonlinear and 20SNP_window methods. The average improvement obtained by all marker weighting scenarios over the standard ssGBLUP was about 2.8% for both traits, and the average increase in reliability obtained by all ssGBLUP scenarios over the PBLUP scenarios was 84% for combined_CM and 110% for combined_SCS.

SNP Marker Weights

In general, weighting markers by 2pqû2 resulted in the largest increase in reliability, and the estimates for 2pqû2 also had a lower standard error compared with the other weighting scenarios. Figure 1 illustrates the distribution of the average of SNP weights for 12 eigenvalues calculated by 2 approaches (2pqû2 and Nonlinear) in the RDC breed. The differences in SNP weights using 2pqû2 were significantly higher than those of Nonlinear. This indicates that the proportion of genetic variance that would be explained by different SNPs is different in these scenarios. The 2pqû2 formulation accounts for the allele frequencies and by doing so gives more weight to markers with a high rate of heterozygosity and no weight to noninformative markers. Therefore, more weight is allocated to gene regions which potentially can be under selection.
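The 2pqû2 weighting described above can be sketched as follows. The sketch assumes that p is the allele frequency of a marker, q = 1 − p, and û is the marker effect estimated in a previous evaluation, so that the raw weight is 2·p·q·û². Scaling the weights to an average of 1 and falling back to equal weights when all raw weights are zero are assumptions made here for illustration and are not taken from the original analysis. The function name and toy values are hypothetical.

```python
def snp_weights_2pqu2(allele_freqs, snp_effects):
    """Marker weights proportional to 2 * p * (1 - p) * u_hat**2, scaled to mean 1."""
    raw = [2.0 * p * (1.0 - p) * u * u for p, u in zip(allele_freqs, snp_effects)]
    total = sum(raw)
    if total == 0.0:
        # Assumption: fall back to equal weights if no marker is informative.
        return [1.0] * len(raw)
    scale = len(raw) / total
    return [w * scale for w in raw]


# Toy example: the third marker is monomorphic (p = 0), so its weight is 0
# regardless of the size of its estimated effect.
weights = snp_weights_2pqu2([0.5, 0.5, 0.0], [1.0, 2.0, 3.0])
print(weights)
```

Because the weight depends on û squared, doubling an estimated effect quadruples the raw weight, which is one reason the spread of weights is wider than under equal weighting.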
Solving the models with marker weighting increased the computational costs. However, the additional computing costs do not preclude the use of such models. The computing time required to solve the full models for the RDC was 31.7 h for standard ssGBLUP, 34.6 h for Nonlinear, 45.7 h for 2pqû2, and 45.9 h for 20SNP_window.

Figure 1. Distribution of SNP marker weights calculated by average of 12 eigenvalue traits for 2pqû2 (A) and nonlinear (B) models in the Nordic Red dairy cattle breed.

Trends of CM and SCS

Summary statistics of the GEBV for genotyped RDC and JER individuals for combined_CM and combined_SCS, predicted by the standard ssGBLUP model, are shown in Table 6. It should be noted that a lower (or negative) GEBV is favorable for these traits. Compared with the average of their populations, genotyped RDC individuals had better ranking in their population relative to genotyped JER individuals. Genotyped JER bulls were even slightly inferior compared with the mean of their population. A similar pattern was observed for combined_SCS. One reason is most likely that genotyping was started earlier for JER than for RDC cows; another is that more JER individuals were genotyped in the early years.

The trajectories of PEBV and GEBV for combined_CM using both the full and reduced datasets for RDC and JER cows, respectively, are plotted to monitor the genetic and genomic trends, as well as their consistency with each other (Figures 2 and 3). The average PEBV and GEBV of cows born in 1990 were considered as the baselines, and all PEBV and GEBV were adjusted according to their baselines. The PEBVf and GEBVf of CM overlapped until 2010 and, from then on, as expected due to genomic preselection, the trend in PEBVf was less favorable than that of GEBVf.
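The baseline adjustment behind the trend plots can be sketched as averaging breeding values by birth year and subtracting the mean of the baseline birth year (1990 in the text). The data layout, function name, and toy values are hypothetical.

```python
from collections import defaultdict
import statistics


def genetic_trend(birth_years, ebvs, base_year=1990):
    """Mean breeding value per birth year, expressed relative to the base year."""
    by_year = defaultdict(list)
    for year, ebv in zip(birth_years, ebvs):
        by_year[year].append(ebv)
    if base_year not in by_year:
        raise ValueError(f"no animals born in base year {base_year}")
    base_mean = statistics.fmean(by_year[base_year])
    return {
        year: statistics.fmean(values) - base_mean
        for year, values in sorted(by_year.items())
    }


# Toy data: 5 cows born in 3 different years (hypothetical values).
trend = genetic_trend(
    [1990, 1990, 2000, 2000, 2010],
    [0.02, 0.04, 0.05, 0.07, -0.01],
)
print(trend)
```

Applying the same function separately to PEBV and GEBV puts both trends on a common origin, so later divergence between them reflects differences between the evaluations rather than differences in level.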
The PEBVr and GEBVr in the most recent years were slightly overestimated (inflated), as also indicated by the forward validation b1 estimates. Both breeds have a relatively similar trend, starting with unfavorable increasing trends until 2002 (for RDC) and 2005 (for JER), when a favorable declining trend began. This is because of the inclusion of udder health traits into the breeding goals due to increased attention to health traits and societal demand to improve animal welfare.

Table 6. Mean, SD, minimum, and maximum of combined genomic estimated breeding values for clinical mastitis and log-transformed SCS for all genotyped Nordic Red dairy cattle (RDC) and Jersey (JER) bulls and cows

| Trait¹ | Breed | Sex | n | Mean | SD | Minimum | Maximum |
|---|---|---|---|---|---|---|---|
| CM | RDC | Bull | 1,830 | 0.000 | 0.037 | −0.131 | 0.129 |
| CM | RDC | Cow | 123,959 | −0.006 | 0.036 | −0.167 | 0.196 |
| CM | JER | Bull | 900 | 0.004 | 0.045 | −0.115 | 0.147 |
| CM | JER | Cow | 63,877 | 0.000 | 0.044 | −0.184 | 0.214 |
| SCS | RDC | Bull | 1,830 | −7.735 | 38.430 | −113.601 | 155.751 |
| SCS | RDC | Cow | 123,959 | −9.531 | 36.604 | −164.312 | 159.639 |
| SCS | JER | Bull | 900 | 3.707 | 32.126 | −84.610 | 112.958 |
| SCS | JER | Cow | 63,877 | −1.433 | 31.107 | −139.761 | 143.647 |

¹CM = clinical mastitis.

Figure 2. Trends of genomic (GEBV) and pedigree-based (PEBV) estimated breeding values for clinical mastitis using the full (f) and the reduced (r) datasets for Nordic Red dairy cattle (SD of averages of GEBVf and PEBVf across the years were 0.008 and 0.006, respectively).

Figure 3. Trends of genomic (GEBV) and pedigree-based (PEBV) estimated breeding values for clinical mastitis using the full (f) and the reduced (r) datasets for Jersey cows (SD of averages of GEBVf and PEBVf across the years were 0.016 and 0.015, respectively).

CONCLUSIONS

We conducted this study to investigate the effect of different SNP marker weighting scenarios in a single-step SNPBLUP framework on the prediction reliability of CM and somatic cell score in the Nordic Red and Jersey dairy cattle. According to the obtained results, the implementation of single-step genomic evaluation has a large effect on the rate of genetic gain for udder health traits compared with genetic evaluations based on pedigree relationship information. With the exception of 20SNP_window for bull candidates in combined_CM, the marker weighting scenarios applied in this study outperformed the standard single-step genomic prediction approach. In particular, the classical method of marker weighting by 2pqû2 was superior to the other studied approaches. In general, lower dispersion along with higher prediction reliability were obtained for cow candidates than for bull candidates, although the proportion of increase in reliability by applying marker weights was higher for bulls. The application of this method of marker weighting to other evaluated traits and populations deserves further research.

NOTES

This study has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement no. 815668. The authors acknowledge Faba Co-op (Vantaa, Finland), Nordic Cattle Genetic Evaluation (NAV; Aarhus, Denmark), and VikingGenetics Ltd. (Randers, Denmark) for providing all data necessary for the analyses. No human or animal subjects were used, so this analysis did not require approval by an Institutional Animal Care and Use Committee or Institutional Review Board. The authors have not stated any conflicts of interest.
Nonstandard abbreviations used: 20SNP_window = 20 adjacent SNPs; CF = covariance function; CM = clinical mastitis; CM2 = CM trait for the second lactation from 15 d before calving to 150 DIM; CM3 = CM trait for the third lactation from 15 d before calving to 150 DIM; CM11 = CM trait for 15 d before calving until 50 DIM in the first lactation; CM12 = CM trait for 51 to 305 DIM in the first lactation; ERC = effective record contribution; GEBVf = genomic estimated breeding values using the full dataset; GEBVr = genomic estimated breeding values using the reduced dataset; GRM = genomic relationship matrix; JER = Jersey; NAV = Nordic Cattle Genetic Evaluation; PBLUP = pedigree-based BLUP; PEBV = pedigree-based estimated breeding value; PEBVf = PEBV calculated based on full dataset; PEBVr = PEBV calculated based on reduced dataset; RDC = Nordic Red; SCS1, SCS2, and SCS3 = SCS observations in lactations 1, 2, and 3; ssGBLUP = single-step GBLUP; ssGTABLUP = single-step GBLUP with a T factoring; ssSNPBLUP = single-step single nucleotide polymorphism BLUP; UA = fore udder attachment; UD = udder depth.

REFERENCES

Abdollahi-Arpanahi, R., G. Morota, and F. Peñagaricano. 2017. Predicting bull fertility using genomic data and biological information. J. Dairy Sci. 100:9656–9666. https://doi.org/10.3168/jds.2017-13288.

Aguilar, I., I. Misztal, D. L. Johnson, A. Legarra, S. Tsuruta, and T. J. Lawlor. 2010. Hot topic: A unified approach to utilize phenotypic, full pedigree, and genomic information for genetic evaluation of Holstein final score. J. Dairy Sci. 93:743–752. https://doi.org/10.3168/jds.2009-2730.

Amin, A. A., T. Gere, and W. H. Kishk. 2002. Genetic and environmental relationship among udder conformation traits and mastitis incidence in Holstein–Friesian in two different environments. Arch. Tierz. 45:129–138. https://doi.org/10.5194/aab-45-129-2002.

Ben Zaabza, H., M. Taskinen, E. A. Mäntysaari, T. J. Pitkänen, G. P. Aamand, and I. Strandén. 2022. Breeding value reliabilities for multiple-trait single-step genomic best linear unbiased predictor. J. Dairy Sci. 105:5221–5237. https://doi.org/10.3168/jds.2021-21016.

Brøndum, R. F., G. Su, L. Janss, G. Sahana, B. Guldbrandtsen, D. Boichard, and M. S. Lund. 2015. Quantitative trait loci markers derived from whole genome sequence data increases the reliability of genomic prediction. J. Dairy Sci. 98:4107–4116. https://doi.org/10.3168/jds.2014-9005.

Cai, Z., T. Iso-Touru, M. P. Sanchez, N. Kadri, A. C. Bouwman, P. K. Chitneedi, I. M. MacLeod, C. J. Vander Jagt, A. J. Chamberlain, B. Gredler-Grandl, M. Spengeler, M. S. Lund, D. Boichard, C. Kühn, H. Pausch, J. Vilkki, and G. Sahana. 2024. Meta-analysis of six dairy cattle breeds reveals biologically relevant candidate genes for mastitis resistance. Genet. Sel. Evol. 56:54. https://doi.org/10.1186/s12711-024-00920-8.

Calus, M. P., T. H. Meuwissen, A. P. de Roos, and R. F. Veerkamp. 2008. Accuracy of genomic selection using different methods to define haplotypes. Genetics 178:553–561. https://doi.org/10.1534/genetics.107.080838.

Christensen, O. F., and M. S. Lund. 2010. Genomic prediction when some animals are not genotyped. Genet. Sel. Evol. 42:2. https://doi.org/10.1186/1297-9686-42-2.

Cole, J. B., P. M. VanRaden, J. R. O’Connell, C. P. Van Tassell, T. S. Sonstegard, R. D. Schnabel, J. F. Taylor, and G. R. Wiggans. 2009. Distribution and location of genetic effects for dairy traits. J. Dairy Sci. 92:2931–2946. https://doi.org/10.3168/jds.2008-1762.

Cook, J. P., A. Mahajan, and A. P. Morris. 2017. Guidance for the utility of linear models in meta-analysis of genetic association studies of binary phenotypes. Eur. J. Hum. Genet. 25:240–245. https://doi.org/10.1038/ejhg.2016.150.

Egyedy, A. F., and B. N. Ametaj. 2022. Mastitis: impact of dry period, pathogens, and immune responses on etiopathogenesis of disease and its association with periparturient diseases. Dairy 3:881–906. https://doi.org/10.3390/dairy3040061.

Falconer, D. S., and T. F. C. Mackay. 1996. Introduction to Quantitative Genetics. New York: Longman.

Fang, L., G. Sahana, P. Ma, G. Su, Y. Yu, S. Zhang, M. S. Lund, and P. Sørensen. 2017. Use of biological priors enhances understanding of genetic architecture and genomic prediction of complex traits within and between dairy cattle breeds. BMC Genomics 18:604. https://doi.org/10.1186/s12864-017-4004-z.

Farooq, M., A. D. J. van Dijk, H. Nijveen, M. G. M. Aarts, W. Kruijer, T. P. Nguyen, S. Mansoor, and D. de Ridder. 2021. Prior biological knowledge improves genomic prediction of growth-related traits in Arabidopsis thaliana. Front. Genet. 11:609117. https://doi.org/10.3389/fgene.2020.609117.

Fragomeni, B. O., D. A. L. Lourenco, A. Legarra, P. M. VanRaden, and I. Misztal. 2019. Alternative SNP weighting for single-step genomic best linear unbiased predictor evaluation of stature in US Holsteins in the presence of selected sequence variants. J. Dairy Sci. 102:10012–10019. https://doi.org/10.3168/jds.2019-16262.

Habier, D., R. L. Fernando, K. Kizilkaya, and D. J. Garrick. 2011. Extension of the Bayesian alphabet for genomic selection. BMC Bioinformatics 12:186. https://doi.org/10.1186/1471-2105-12-186.

Habier, D., J. Tetens, F. R. Seefried, P. Lichtner, and G. Thaller. 2010. The impact of genetic relationship information on genomic breeding values in German Holstein cattle. Genet. Sel. Evol. 42:5. https://doi.org/10.1186/1297-9686-42-5.

Harris, B., and D. Johnson. 1998. Approximate reliability of genetic evaluations under an animal model. J. Dairy Sci. 81:2723–2728. https://doi.org/10.3168/jds.S0022-0302(98)75829-1.

Kudinov, A. A., E. A. Mäntysaari, T. J. Pitkänen, E. I. Saksa, G. P. Aamand, P. Uimari, and I. Strandén. 2022. Single-step genomic evaluation of Russian dairy cattle using internal and external information. J. Anim. Breed. Genet. 139:259–270. https://doi.org/10.1111/jbg.12660.

Legarra, A., and A. Reverter. 2018. Semi-parametric estimates of population accuracy and bias of predictions of breeding values and future phenotypes using the LR method. Genet. Sel. Evol. 50:53. https://doi.org/10.1186/s12711-018-0426-6.

Lidauer, M. H., J. Pösö, J. Pedersen, J. Lassen, P. Madsen, E. A. Mäntysaari, U. S. Nielsen, J. Å. Eriksson, K. Johansson, T. Pitkänen, I. Strandén, and G. P. Aamand. 2015. Across-country test-day model evaluations for Holstein, Nordic Red Cattle, and Jersey. J. Dairy Sci. 98:1296–1309. https://doi.org/10.3168/jds.2014-8307.

Liu, A., M. S. Lund, D. Boichard, E. Karaman, S. Fritz, G. P. Aamand, U. S. Nielsen, Y. Wang, and G. Su. 2020. Improvement of genomic prediction by integrating additional single nucleotide polymorphisms selected from imputed whole genome sequencing data. Heredity 124:37–49. https://doi.org/10.1038/s41437-019-0246-7.

Liu, H., H. Zhou, Y. Wu, X. Li, J. Zhao, T. Zuo, X. Zhang, Y. Zhang, S. Liu, Y. Shen, H. Lin, Z. Zhang, K. Huang, T. Lübberstedt, and G. Pan. 2015. The impact of genetic relationship and linkage disequilibrium on genomic selection. PLoS One 10:e0132379. https://doi.org/10.1371/journal.pone.0132379.

Liu, Z., M. E. Goddard, F. Reinhardt, and R. Reents. 2014. A single-step genomic model with direct estimation of marker effects. J. Dairy Sci. 97:5833–5850. https://doi.org/10.3168/jds.2014-7924.

Lourenco, D. A. L., I. Misztal, S. Tsuruta, I. Aguilar, E. Ezra, M. Ron, A. Shirak, and J. I. Weller. 2014. Methods for genomic evaluation of a relatively small genotyped dairy population and effect of genotyped cow information in multiparity analyses. J. Dairy Sci. 97:1742–1752. https://doi.org/10.3168/jds.2013-6916.

Mäntysaari, E. A., R. D. Evans, and I. Strandén. 2017. Efficient single-step genomic evaluation for a multibreed beef cattle population having many genotyped animals. J. Anim. Sci. 95:4728–4737. https://doi.org/10.2527/jas2017.1912.

Meuwissen, T. H., B. J. Hayes, and M. E. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829. https://doi.org/10.1093/genetics/157.4.1819.

Negussie, E., M. Lidauer, E. A. Mäntysaari, I. Strandén, J. Pösö, U. S. Nielsen, K. Johansson, J.-Å. Eriksson, and G. P. Aamand. 2010. Combining test day SCS with clinical mastitis and udder type traits: A random regression model for joint genetic evaluation of udder health in Denmark, Finland and Sweden. Pages 25–31 in Proc. 2010 Interbull Mtg., Riga, Latvia.

Pösö, J., and A. E. Mäntysaari. 1996. Relationship between clinical mastitis, somatic cell score, and production for first three lactations of Finnish Ayrshire. J. Dairy Sci. 79:1284–1291. https://doi.org/10.3168/jds.S0022-0302(96)76483-4.

R Core Development Team. 2012. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Accessed May 16, 2023. http://www.R-project.org/.

Rezende, F. M., M. Haile-Mariam, J. E. Pryce, and F. Peñagaricano. 2020. Across-country genomic prediction of bull fertility in Jersey dairy cattle. J. Dairy Sci. 103:11618–11627. https://doi.org/10.3168/jds.2020-18910.

Rupp, R., and D. Boichard. 1999. Genetic parameters for clinical mastitis, somatic cell score, production, udder type traits, and milking ease in first lactation Holsteins. J. Dairy Sci. 82:2198–2204. https://doi.org/10.3168/jds.S0022-0302(99)75465-2.

Schaeffer, L. R. 1984. Sire and cow evaluation under multiple trait models. J. Dairy Sci. 67:1567–1580. https://doi.org/10.3168/jds.S0022-0302(84)81479-4.

Strandén, I., and J. Jenko. 2024. A computationally feasible multi-trait single-step genomic prediction model with trait-specific marker weights. Genet. Sel. Evol. 56:58. https://doi.org/10.1186/s12711-024-00926-2.

Strandén, I., and M. Lidauer. 1999. Solving large mixed linear models using preconditioned conjugate gradient iteration. J. Dairy Sci. 82:2779–2787. https://doi.org/10.3168/jds.S0022-0302(99)75535-9.

Strandén, I., and K. Vuori. 2006. RelaX2: Pedigree analysis program. Pages 27–30 in Proceedings of the 8th World Congress on Genetics Applied to Livestock Production. Belo Horizonte, MG, Brazil.

Thompson, R., and K. Meyer. 1986. A review of theoretical aspects in the estimation of breeding values for multi-trait selection. Livest. Prod. Sci. 15:299–313. https://doi.org/10.1016/0301-6226(86)90071-0.

Tier, B., and K. Meyer. 2004. Approximating prediction error covariances among additive genetic effects within animals in multiple-trait and random regression models. J. Anim. Breed. Genet. 121:77–89. https://doi.org/10.1111/j.1439-0388.2003.00444.x.

Vandenplas, J., J. Ten Napel, S. N. Darbaghshahi, R. Evans, M. P. L. Calus, R. Veerkamp, A. Cromie, E. A. Mäntysaari, and I. Strandén. 2023. Efficient large-scale single-step evaluations and indirect genomic prediction of genotyped selection candidates. Genet. Sel. Evol. 55:37. https://doi.org/10.1186/s12711-023-00808-z.

VanRaden, P. M. 2008. Efficient methods to compute genomic predictions. J. Dairy Sci. 91:4414–4423. https://doi.org/10.3168/jds.2007-0980.

Wang, H., I. Misztal, I. Aguilar, A. Legarra, and W. M. Muir. 2012. Genome-wide association mapping including phenotypes from relatives without genotypes. Genet. Res. (Camb.) 94:73–83. https://doi.org/10.1017/S0016672312000274.

Wientjes, Y. C., R. F. Veerkamp, and M. P. Calus. 2013. The effect of linkage disequilibrium and family relationships on the reliability of genomic prediction. Genetics 193:621–631. https://doi.org/10.1534/genetics.112.146290.

Zavadilova, L., E. Kasna, J. Kucera, and J. Bauer. 2022. Genomic evaluation for clinical mastitis in Czech Holstein. Pages 89–94 in Proc. 2022 Interbull Mtg., Montréal, Canada.

Zhang, X., D. Lourenco, I. Aguilar, A. Legarra, and I. Misztal. 2016. Weighting strategies for single-step genomic BLUP: an iterative approach for accurate calculation of GEBV and GWAS. Front. Genet. 7:151. https://doi.org/10.3389/fgene.2016.00151.

Zhang, Z., J. Liu, X. Ding, P. Bijma, D. J. de Koning, and Q. Zhang. 2010. Best linear unbiased prediction of genomic breeding values using a trait-specific marker-derived relationship matrix. PLoS One 5:e12648. https://doi.org/10.1371/journal.pone.0012648.