Jukuri, open repository of the Natural Resources Institute Finland (Luke) All material supplied via Jukuri is protected by copyright and other intellectual property rights. Duplication or sale, in electronic or print form, of any part of the repository collections is prohibited. Making electronic or print copies of the material is permitted only for your own personal use or for educational purposes. For other purposes, this article may be used in accordance with the publisher’s terms. There may be differences between this version and the publisher’s version. You are advised to cite the publisher’s version. This is an electronic reprint of the original article. This reprint may differ from the original in pagination and typographic detail. Author(s): Feng-Hua Lv, Yin-Hong Cao, Guang-Jian Liu, Ling-Yun Luo, Ran Lu, Ming-Jun Liu, Wen-Rong Li, Ping Zhou, Xin-Hua Wang, Min Shen, Lei Gao, Jing-Quan Yang, Hua Yang, Yong-Lin Yang, Chang-Bin Liu, Peng-Cheng Wan, Yun-Sheng Zhang, Wen-Hui Pi, Yan-Ling Ren, Zhi-Qiang Shen, Feng Wang, Yu-Tao Wang, Jin-Quan Li, Hosein Salehian-Dehkordi, Eer Hehua, Yong-Gang Liu, Jian-Fei Chen, Jian-Kui Wang, Xue- Mei Deng, Ali Esmailizadeh, Mostafa Dehghani-Qanatqestani, Hadi Charati, Maryam Nosrati, Ondřej Štěpánek, Hossam E Rushdi, Ingrid Olsaker, Ino Curik, Neena A Gorkhali, Samuel R Paiva, Alexandre R Caetano, Elena Ciani, Marcel Amills, Christina Weimann, Georg Erhardt, Agraw Amane, Joram M Mwacharo, Jian-Lin Han, Olivier Hanotte, Kathiravan Periasamy, Anna M Johansson, Jón H Hallsson, Juha Kantanen, David W Coltman, Michael W Bruford, Johannes A Lenstra, Meng-Hua Li Title: Whole-Genome Resequencing of Worldwide Wild and Domestic Sheep Elucidates Genetic Diversity, Introgression, and Agronomically Important Loci Year: 2022 Version: Published version Copyright: The Author(s) 2022 Rights: CC BY 4.0 Rights url: http://creativecommons.org/licenses/by/4.0/ Please cite the original version: Feng-Hua Lv, Yin-Hong Cao, Guang-Jian Liu, Ling-Yun Luo, Ran Lu, Ming-Jun Liu, Wen-Rong Li, Ping Zhou, Xin-Hua Wang, Min Shen, Lei Gao, Jing-Quan Yang, Hua Yang, Yong-Lin Yang, Chang-Bin Liu, Peng-Cheng Wan, Yun-Sheng Zhang, Wen-Hui Pi, Yan-Ling Ren, Zhi-Qiang Shen, Feng Wang, Yu- Jukuri, open repository of the Natural Resources Institute Finland (Luke) All material supplied via Jukuri is protected by copyright and other intellectual property rights. Duplication or sale, in electronic or print form, of any part of the repository collections is prohibited. Making electronic or print copies of the material is permitted only for your own personal use or for educational purposes. For other purposes, this article may be used in accordance with the publisher’s terms. There may be differences between this version and the publisher’s version. You are advised to cite the publisher’s version. Tao Wang, Jin-Quan Li, Hosein Salehian-Dehkordi, Eer Hehua, Yong-Gang Liu, Jian-Fei Chen, Jian- Kui Wang, Xue-Mei Deng, Ali Esmailizadeh, Mostafa Dehghani-Qanatqestani, Hadi Charati, Maryam Nosrati, Ondřej Štěpánek, Hossam E Rushdi, Ingrid Olsaker, Ino Curik, Neena A Gorkhali, Samuel R Paiva, Alexandre R Caetano, Elena Ciani, Marcel Amills, Christina Weimann, Georg Erhardt, Agraw Amane, Joram M Mwacharo, Jian-Lin Han, Olivier Hanotte, Kathiravan Periasamy, Anna M Johansson, Jón H Hallsson, Juha Kantanen, David W Coltman, Michael W Bruford, Johannes A Lenstra, Meng-Hua Li, Whole-Genome Resequencing of Worldwide Wild and Domestic Sheep Elucidates Genetic Diversity, Introgression, and Agronomically Important Loci, Molecular Biology and Evolution, Volume 39, Issue 2, February 2022, msab353, https://doi.org/10.1093/molbev/msab353. Whole-Genome Resequencing of Worldwide Wild and Domestic Sheep Elucidates Genetic Diversity, Introgression, and Agronomically Important Loci Feng-Hua Lv,†,1 Yin-Hong Cao,†,2,3 Guang-Jian Liu,†,4 Ling-Yun Luo,1 Ran Lu,1 Ming-Jun Liu,5 Wen-Rong Li,5 Ping Zhou,6,7 Xin-Hua Wang,6,7 Min Shen,6,7 Lei Gao,6,7 Jing-Quan Yang,6,7 Hua Yang,6,7 Yong-Lin Yang,6,7 Chang-Bin Liu,6,7 Peng-Cheng Wan,6,7 Yun-Sheng Zhang,6,7 Wen-Hui Pi,6,7 Yan-Ling Ren,8 Zhi-Qiang Shen,8 Feng Wang,9 Yu-Tao Wang,10 Jin-Quan Li,11 Hosein Salehian-Dehkordi,2,3 Eer Hehua,12 Yong-Gang Liu,13 Jian-Fei Chen,1 Jian-Kui Wang,1 Xue-Mei Deng,1 Ali Esmailizadeh ,14 Mostafa Dehghani-Qanatqestani,14 Hadi Charati,14 Maryam Nosrati,15 Ond�rej �St�ep�anek,16 Hossam E. Rushdi ,17 Ingrid Olsaker,18 Ino Curik,19 Neena A. Gorkhali,20 Samuel R. Paiva,21 Alexandre R. Caetano,21 Elena Ciani,22 Marcel Amills,23,24 Christina Weimann,25 Georg Erhardt,25 Agraw Amane,26,27 Joram M. Mwacharo,28,29 Jian-Lin Han,30,31 Olivier Hanotte,27,32 Kathiravan Periasamy,33 Anna M. Johansson,34 J�on H. Hallsson,35 Juha Kantanen,36 David W. Coltman,37 Michael W. Bruford ,38,39 Johannes A. Lenstra,40 and Meng-Hua Li *,1 1College of Animal Science and Technology, China Agricultural University, Beijing, China 2CAS Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences (CAS), Beijing, China 3College of Life Sciences, University of Chinese Academy of Sciences (UCAS), Beijing, China 4Novogene Bioinformatics Institute, Beijing, China 5Animal Biotechnological Research Center, Xinjiang Academy of Animal Science, Urumqi, China 6Institute of Animal Husbandry and Veterinary Medicine, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China 7State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China 8Shandong Binzhou Academy of Animal Science and Veterinary Medicine, Binzhou, China 9Institute of Sheep and Goat Science, Nanjing Agricultural University, Nanjing, China 10College of Life and Geographic Sciences, Kashi University, Kashi, China 11College of Animal Science, Inner Mongolia Agricultural University, Hohhot, China 12Grass-Feeding Livestock Engineering Technology Research Center, Ningxia Academy of Agriculture and Forestry Sciences, Yinchuan, China 13College of Animal Science and Technology, Yunnan Agricultural University, Kunming, China 14Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman, Iran 15Department of Agriculture, Payame Noor University, Tehran, Iran 16Department of Virology, State Veterinary Institute Jihlava, Jihlava, Czech Republic 17Department of Animal Production, Faculty of Agriculture, Cairo University, 12613 Giza, Egypt 18Department of Preclinical Sciences and Pathology, Faculty of Veterinary Medicine, Norwegian University of Life Sciences, Ås, Norway 19Department of Animal Science, Faculty of Agriculture, University of Zagreb, Zagreb, Croatia 20Animal Breeding Division, National Animal Science Institute, Nepal Agriculture Research Council (NARC), Kathmandu, Nepal 21Embrapa Recursos Gen�eticos e Biotecnologia, Parque Estaç~ao Biol�ogica, PqEB, Bras�ılia, DF, Brazil 22Dipartimento di Bioscienze, Biotecnologie e Biofarmaceutica, Universit�a degli Studi di Bari Aldo 24 Moro, Bari, Italy 23Department of Animal Genetics, Center for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Campus de la Universitat Aut�onoma de Barcelona, Bellaterra, Spain 24Department of Animal Sciences, Universitat Aut�onoma de Barcelona, Bellaterra, Spain 25Department of Animal Breeding and Genetics, Justus-Liebig-University Giessen, Giessen, Germany 26Department of Microbial, Cellular and Molecular Biology, Addis Ababa University, Addis Ababa, Ethiopia 27LiveGene Program, International Livestock Research Institute, Addis Ababa, Ethiopia 28Small Ruminant Genomics, International Centre for Agricultural Research in the Dry Areas (ICARDA), Addis Ababa, Ethiopia A rticle � The Author(s) 2021. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons. org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. Open Access Mol. Biol. Evol. 39(2):msab353 doi:10.1093/molbev/msab353 Advance Access publication December 10, 2021 1 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://orcid.org/0000-0003-0986-6639 http://orcid.org/0000-0002-8516-0366 https://orcid.org/0000-0001-6357-6080 https://orcid.org/0000-0001-5194-2506 29CTLGH and SRUC, The Roslin Institute Building, Easter Bush Campus, Edinburgh, Scotland 30CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing, China 31Livestock Genetics Program, International Livestock Research Institute (ILRI), Nairobi, Kenya 32School of Life Sciences, University of Nottingham, University Park, Nottingham, United Kingdom 33Animal Production and Health Laboratory, Joint FAO/IAEA Division, International Atomic Energy Agency (IAEA), Vienna, Austria 34Department of Animal Breeding and Genetics, Faculty of Veterinary Medicine and Animal Science, Swedish University of Agricultural Sciences, Uppsala, Sweden 35Faculty of Natural Resources and Environmental Sciences, Agricultural University of Iceland, Borgarnes, Iceland 36Production Systems, Natural Resources Institute Finland (Luke), Jokioinen, Finland 37Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada 38School of Biosciences, Cardiff University, Cathays Park, Cardiff, Wales, United Kingdom 39Sustainable Places Research Institute, Cardiff University, Wales, United Kingdom 40Faculty of Veterinary Medicine, Utrecht University, Utrecht, the Netherlands †These authors contributed equally to this work. *Corresponding author: E-mail: menghua.li@cau.edu.cn. Associate editor: Katja Nowick Abstract Domestic sheep and their wild relatives harbor substantial genetic variants that can form the backbone of molecular breeding, but their genome landscapes remain understudied. Here, we present a comprehensive genome resource for wild ovine species, landraces and improved breeds of domestic sheep, comprising high-coverage (�16.10�) whole genomes of 810 samples from 7 wild species and 158 diverse domestic populations. We detected, in total,�121.2 million single nucleotide polymorphisms, �61 million of which are novel. Some display significant (P< 0.001) differences in frequency between wild and domestic species, or are private to continent-wide or individual sheep populations. Retained or introgressed wild gene variants in domestic populations have contributed to local adaptation, such as the variation in the HBB associated with plateau adaptation. We identified novel and previously reported targets of selection on mor- phological and agronomic traits such as stature, horn, tail configuration, and wool fineness. We explored the genetic basis of wool fineness and unveiled a novel mutation (chr25: T7,068,586C) in the 30-UTR of IRF2BP2 as plausible causal variant for fleece fiber diameter. We reconstructed prehistorical migrations from the Near Eastern domestication center to South-and-Southeast Asia and found two main waves of migrations across the Eurasian Steppe and the Iranian Plateau in the Early and Late Bronze Ages. Our findings refine our understanding of genome variation as shaped by continental migrations, introgression, adaptation, and selection of sheep. Key words: whole-genome sequences, genetic diversity, adaptive introgression, genetic selection, agronomic traits, migration. Introduction Since their domestication, sheep (Ovis aries) have been pro- viding meat, wool, skin, milk, and other products to humans. They are essential for welfare of hundreds of millions of peo- ple living in rural communities of some Asian, African, and South American countries (Nkedianye et al. 2019). Since their domestication from Asiatic mouflon�10,000 years ago in the Near East (Zeder 2008), sheep expanded to other parts of Asia (Lv et al. 2015; Zhao et al. 2017) and to Europe (Ryder 1984; Tapio et al. 2006; Ciani et al. 2020), Africa (Bradford and Fitzhugh 1984; Muigai and Hanotte 2013), and the New World (Crispim et al. 2012; Campos et al. 2020; Revelo et al. 2020; Thorne et al. 2021). An early and major event was the post-domestication transition of hair sheep to wool sheep (Chessa et al. 2009). In addition, introgression from wild rel- atives into domestic populations has occurred (Barbato et al. 2017; Hu et al. 2019; Ciani et al. 2020; Cao et al. 2021). Thus, sheep have acquired a worldwide distribution through adap- tation to a diverse range of environments and genetic im- provement under different production systems, developing into plenty of unique breeds (Lv et al. 2015; Cao et al. 2021). Hence, the extensive variations in landraces and improved breeds underpin the genomic variations, adaptive character- istics, and important agronomic traits of domestic sheep (Yang et al. 2016; Li et al. 2020). In this study, we sequenced genomes with an average depth of �16.7� for one Asiatic mouflon and a diverse set of 165 samples from 82 domestic breeds. These include African, South Asian, Southeast Asian, Central Asian, East Asian, South American, European, and the Middle Eastern breeds which were underrepresented in early work (Alberto et al. 2018; Naval-Sanchez et al. 2018; Deng et al. 2020; Li et al. Lv et al. . doi:10.1093/molbev/msab353 MBE 2 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 2020), but are essential to characterize the genetic variation and the introgression from wild into domestic sheep, and to explore for the first time the history of South-and-Southeast Asian sheep. By combining these novel genomic resources with previously published whole-genome sequences (WGS; >10�), we achieved a total of 810 samples from 158 domes- tic breeds and seven wild ovine species (fig. 1A, supplemen- tary table S1, Supplementary Material online). Using this comprehensive data set, we explored patterns of genome- wide sequence variation among species/breeds, characterized regional and population-specific variants, inferred demo- graphic expansions, discovered retained or introgressed var- iants from wild sheep, and identified genes and variants associated with agronomically important traits in domestic sheep. Results Whole-Genome Sequencing and Variation Individual genomes of one Asiatic mouflon and 165 samples from 82 domestic sheep breeds across the world were se- quenced to an average depth of 16.7� and average genome coverage of 96.40% to the Oar_rambouillet_v1.0 reference genome. These data were combined with 644 available genomes (fig. 1A and supplementary table S1, Supplementary Material online). Of the 82 domestic breeds, whole genomes of 24 breeds from regions such as Africa (e.g., Menz), Central-and-East Asia (e.g., Minxian Black Fur, Gala, Azerbaijian Mountain Merino, Kazakh Arkhar-Merino, and Kazakh Edilbai), South-and-Southeast Asia (e.g., Indonesia Thin-tailed, Kajli, Vembur, Mecheri, Madras Red, Pattanam, Kilakarsal, and Baruwal), Europe (e.g., Russian Edilbai, Lleyn, Hebridean, Ryelands, Swedish Finewool, Icelandic Leader, Icelandic, and Oxford Down), and America (e.g., Morada Nova) were sequenced for the first time and extended our coverage of the sheep genetic resources (supplementary table S1, Supplementary Material online). In total, we analyzed�31.5 Tb raw data from 810 samples, comprising �41 billion 100-bp and 150-bp paired-end reads with average genome depth of 16.10� per individual (or �39.8 Gb raw data per sample). The reads were mapped, resulting in varied alignment rate, genome coverage and se- quence depth among individuals, and across the whole ge- nome (fig. 1B and supplementary table S1, Supplementary Material online). After filtering, we identified �137.7 million genome-wide genetic variants, including �121.2 million sin- gle nucleotide polymorphisms (SNPs) (�12.1 million/individ- ual), �16.4 million small insertions and deletions (INDELs, <50 bp, �15.5 million/individual), and 74,173 structural var- iants (SVs, 51–886,322 bp, 9,190/individual; 58,256 deletions, 10,890 translocations, 3,501 duplications, and 1,524 inver- sions) across all the samples (fig. 1B and table 1). The number of SNP variants per individual showed significant (Mann– Whitney, P< 0.001) difference between species (supplemen- tary table S2, Supplementary Material online). Of the set of �121.2 million high-quality SNPs, 77.9% had a minor allele frequency (MAF) < 5% (supplementary fig. S1A, Supplementary Material online). The number of SNP identified varies from 9.82 to 12.9 million per domestic indi- vidual, �9.9 million (European mouflon) to �18.46 million (Bighorn) per wild individual (supplementary fig. S1B, Supplementary Material online), and 10.4 to 13.0 million per domestic breed, which did not depend on the population sample size (supplementary fig. S1C, Supplementary Material online). We identified �61.0 million SNPs (50.2%) that have not been reported previously in the dbSNP (ftp://ftp.ncbi.nih.gov/ snp/organisms/archive/sheep_9940/, last accessed July 10, 2021) or the European Variation Archive (EVA, including accessions PRJEB33111, PRJEB23437, PRJEB6025, PRJEB15642, and PRJEB14685) (fig. 1C), which could be due to prior un- derrepresentation of the sheep breeds studied here. We ob- served �4.51 million (�3.7% of the total SNPs) SNPs shared among all the eight ovine species, �5.64 million (European mouflon vs. snow sheep) to 34.6 million (domestic sheep vs. Asiatic mouflon) between pairwise species, �50.7 million only in wild species, and�40.96 million SNPs shared between landraces and improved breeds of domestic sheep (fig. 1D). We identified�0.024 million (Brazilian Creole) to�1.74 mil- lion (local Moroccan sheep) breed-specific SNPs (supplemen- tary fig. S2, Supplementary Material online). The number of SNPs present in each major geographic region, including region-specific SNPs, varies from 27.8 million in the American breeds to �49.4 million in the Central-and-East Asian breeds (supplementary fig. S3 and table S3, Supplementary Material online), which could be partially as- cribed to the differences in individual WGS variability and number of samples included in the study. Variant Accuracy On average, 9,954,398 of the SNPs (82.05%) identified in do- mestic sheep and 8,252,754–13,976,424 of the SNPs detected in the wild species (60.26–81.47%) were confirmed in the sheep dbSNP database v.151 (supplementary table S4, Supplementary Material online). The proportions of SNPs confirmed were comparable with those reported in a previ- ous study (Li et al. 2020). In addition, we inspected 133 se- lected SNPs in candidate functional genes below from 2 to 16 individuals of ten populations obtained by Sanger sequencing approach, giving an overall validation rate of 98.58% (supple- mentary table S5, Supplementary Material online). Overall, the results provided confidence on the accuracy of variant calling. Genome Variant Map The average SNP densities were 43.67/kb, 33.15/kb, and 19.20/ kb in autosomes, X, and Y-chromosomes, respectively, illus- trating the notion that the nucleotide diversity was lower on the nonrecombining sex chromosomes (Sayres 2018). From all SNPs, 44.21% were located in or near genes, including upstream (6.71%) and downstream (2.66%) of open reading frame, introns (32.84%) and untranslated regions (UTRs, 0.88%). In total, 1,042,367 SNPs were located in protein- coding exons, causing 492,609 nonsynonymous and 549,128 synonymous mutations, with 45,679 start codon and 8,803 stop codon changes, showing their potentially large impact Genetic Diversity, Introgression, and Agronomically Important Loci of Worldwide Sheep . doi:10.1093/molbev/msab353MBE 3 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data http://ftp://ftp.ncbi.nih.gov/snp/organisms/archive/sheep_9940/ http://ftp://ftp.ncbi.nih.gov/snp/organisms/archive/sheep_9940/ http://ftp://ftp.ncbi.nih.gov/snp/organisms/archive/sheep_9940/ https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data on function of the relevant genes (supplementary table S6, Supplementary Material online). From the 26,342 SVs discovered only in domestic sheep, 11,391 (43.24%) were predicted to have functional conse- quences, and 6,816 (25.88%) are in 3,592 functional genes (e.g., BMPR1B and MYPN; supplementary table S7, Supplementary Material online). Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) annota- tions of the SVs indicated these genes to be involved in axonogenesis and axon guidance (supplementary fig. S4 and A B C D FIG. 1. Geographic distributions and genetic variants of wild and domestic sheep. (A) Sampling locations of 158 domestic sheep populations, and geographic distribution ranges and sampling locations of the seven wild sheep species. The brown and black dots indicate the sampling locations of domestic and wild sheep. The geographic distribution range of each wild sheep species is shown on the map with different colors as follows: Ovis musimon (yellow), O. orientalis (blue), O. vignei (green), O. ammon (orange), O. nivicola (purple), O. dalli (pink), and O canadensis (red). All European mouflon populations in wild parks of Europe have been imported recently from Coasica and Sardinia. (B) Diagram of SNPs found by resequencing of 734 individuals. Circles represent from outermost to innermost, 27 chromosomes (chr. 1–26, X) denoted by different colors (a), average sequence depth for 810 individuals (b), candidate genes selection based on the top 1% values of global FST (c), SNP abundance bars in improved populations (d) and landraces (e), nucleotide diversity (p) abundance bars in improved populations (f) and landraces (g), and INDEL abundance bars in improved populations (h) and landraces (i). (C) Venn diagrams for novel variants detected in wild and domestic sheep. (D) Venn diagrams for variants shared among wild sheep species and improved populations and landraces of domestic sheep. Lv et al. . doi:10.1093/molbev/msab353 MBE 4 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data table S8, Supplementary Material online). We found 19,875 SVs present only in the seven wild sheep species. GO and KEGG enrichment analyses for 8,215 genes associated with these unique SVs in wild sheep uncovered their essential roles in MAPK signaling pathway, cell morphogenesis, and GTPase activator activity (supplementary fig. S5 and table S9, Supplementary Material online). Genomic Diversity Nucleotide diversities (p) in the eight Ovine species were estimated at individual level after the correction for sample size. Asiatic mouflon and urial have the highest nucleotide diversity, whereas the lowest was observed in bighorn sheep (fig. 2A). Asiatic mouflon and urial, which genetically are not completely separated (Demirci et al. 2013; Deng et al. 2020) (see also Results), are potential resources to expand the ge- netic resources of domestic improvement. The diversity of domestic sheep varies considerably with p values ranging from p¼ 3.1� 10-3 (Bashibai breed) to 1.7� 10-3 (Sv€ardsjö breed; fig. 2B and supplementary table S10, Supplementary Material online). There are no large differences among breeds from different geographic regions (fig. 2C). Breeds with a low diversity show a high coverage by runs of homozygosity (ROH; supplementary fig. S6 and table S10, Supplementary Material online). Sheep from central and northern Europe with a long history of breed formation and selective breeding have a relatively low diversity compared to most other breeds. There are no systematic differences between landraces and productive breeds: several fine-wool breeds and breeds from marginal regions such as the Balkan, the Caucasus, Libya, Kazakhstan, and Tibet and Xinjiang of China have retained high diversity, whereas small population sizes have decreased the diversity in breeds from both categories (fig. 2B and sup- plementary table S10, Supplementary Material online). In general, the values of nucleotide diversity calculated here are in the range of earlier estimates in domestic sheep (p¼ 1.9–2.5� 10-3, Yang et al. 2016; p¼ 1.6� 10-3, Naval- Sanchez et al. 2018; p¼ 2.15–2.68� 10-3, Alberto et al. 2018; p¼ 2.44–2.84� 10-3, Pan et al. 2018; p¼ 1.2–42� 10-3, Hu et al. 2019; p¼ 1.05–1.18� 10-3, Li et al. 2020; p¼ 3.2� 10-3, Chen et al. 2021) (supplementary table S11, Supplementary Material online). For the same populations such as Merino, Tan, Lop, and Tibetan sheep, most showed similar estimates in this and early studies (supplementary table S12, Supplementary Material online), but the discrepancy in some populations (e.g., Iran sheep) can be due to the differ- ence in sequencing depth, sample size (e.g., the number of animals sequenced) and the sheep reference genomes (Oar v.4.0 or Oar_rambouillet_v1.0) used in the SNP calling, etc. Linkage disequilibrium (LD) analysis indicated that the physical distance between SNPs measured as half of its max- imal value occurred at 40.0 kb (r2¼ 0.35) for domestic sheep and at 29.1–37.7 kb (r2¼ 0.31–0.40) for wild sheep except the inbred European mouflons (fig. 2D). For landraces, LD values ranged from 14 to 296 kb (but >1,000 kb for the Cameroon population) and were higher than for improved breeds (14.6– 67 kb) (supplementary fig. S7A and B, Supplementary Material online). The LD decay in domestic sheep was slower than in cattle (�2–10 kb) (Mei et al. 2018), goat (�1–2.5 kb) (Zheng et al. 2020), pig (� 8–20 kb) (Li et al. 2013), and dog (�1–3 kb) (Wang et al. 2015). Global FST value for domestic sheep (FST ¼ 0.082, P< 0.001) is lower than those across other domestic species such as goats (FST ¼ 0.112) (Kim et al. 2019) and pigs (FST ¼ 0.115) (Mu~noz et al. 2019), which may partially reflect a rel- atively frequent genetic exchange during the development of modern breeds (Kijas et al. 2012). FST value for breeds from the same region ranges from 0.050 (the Middle East) to 0.193 (America; supplementary fig. S8, Supplementary Material online). Effective Population Size Pairwise sequentially Markovian coalescent (PSMC) analysis revealed that all wild and domestic sheep species experienced a dramatic contraction in Ne �100–400 thousand years ago (ka). Ancestors of domestic sheep (i.e., Asiatic mouflon) underwent a second decline in Ne during �10,000–30,000 years BP, coinciding with the glacial periods (supplementary fig. S9, Supplementary Material online). SMCþþ analysis showed a continual decline of all the domestic sheep popu- lations during �8,000–10,000 years BP, which probably reflects the bottleneck effect of domestication (fig. 2E). Their subsequent population expansions show the expansion of agricultural in sedentary societies (Gignoux et al. 2011). Estimates of recent Ne by SNeP for the wild relatives and domestic sheep breeds �50 generations ago were �38–54 and 42–53, respectively. Relatively smaller Ne was observed in South-and-Southeast Asian and American domestic popula- tions, which may be due to strong artificial selection and/or genetic isolation (Li et al. 2020). Phylogenetic Relationships and Population Structure Phylogenetic relationships among the wild and domestic sheep populations were inferred from whole-genome SNPs for 714 unrelated individuals (supplementary fig. S10 and Table 1. Summary of Whole-Genome Variations Identified in Wild and Domestic Sheep. Variations Wild Sheep Domestic Sheep Total SNPs 90,771,286 70,626,497 121,253,260 INDELs 12,304,629 10,694,323 16,409,609 Structural variants Deletions 42,048 40,885 58,256 Translocations 4,823 9,315 10,890 Duplications 1,875 2,742 3,501 Inversions 1,035 1,064 1,524 Insertions 0 2 2 Genetic Diversity, Introgression, and Agronomically Important Loci of Worldwide Sheep . doi:10.1093/molbev/msab353MBE 5 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data table S13, Supplementary Material online). The phylogenetic tree (fig. 3A) shows a close relationship between the European mouflon and domestic sheep, followed by Asiatic mouflon. This supports the notion that the modern European mouflon is the feral descendent of early domestic sheep rather than a genuine species of wild sheep (fig. 3A) (Ciani et al. 2020). Domestic sheep were split into six geographically structured clades (Central-and-East Asian, South-and-Southeast Asian, the Middle Eastern, African, European, and American popu- lations) with obvious explanations of apparent exceptions: the East European Baidarak sheep (blue) cluster within Central-and-East Asian sheep; thin-tailed Tibetan and Southwest Chinese sheep (red) are close to Indian breeds (yellow); Azerbaijan sheep (red) cluster within the Middle East sheep; and Asian fine-wool breeds (red) cluster are close to European Merino sheep (fig. 3A). The phylogenetic tree using the wild relatives as the outgroup revealed clear between-population genetic differentiation in domestic sheep as most of the samples from the same breeds clustered together. The Middle Eastern breeds were placed near the root within domestic sheep, being the descendant of the first domesticated sheep there (Zeder 2008). The European branch confirms separate subclusters for British and Nordic breeds, respectively (supplementary fig. S11, Supplementary Material online) (Kijas et al. 2012; Ciani et al. 2020), and the African branch indicates a genetic division between North- and-West African and East-and-South African breeds (Muigai and Hanotte 2013). The geographic pattern is further supported by principal component analysis (PCA) and model-based clustering via the Admixture. A PCA plot of 653 unrelated domestic sheep shows largely separates the six geographic groups, most of which could be connected to continental origins (fig. 3B and supplementary fig. S11, Supplementary Material online). The inbred Ouessant and Gotland breeds have the extreme posi- tions. European and African populations are largely separated from Asian populations with overlaps such as Merino-derived individuals (fig. 3B). Genetic clustering analysis using sNMF (Frichot et al. 2014) recapitulates the same patterns observed in the phylogenetic tree (fig. 3A) and PCA (fig. 3B). At the optimal number K¼ 11 with the smallest cross validation error (supplementary fig. S12, Supplementary Material online), we observed five genetic clusters of the seven wild sheep species (Asiatic mouflon and urial are in the same genetic clusters; bighorn and thinhorn sheep are in one genetic cluster) and the six geographically distributed genomic components of domestic sheep (fig. 3C). Interestingly, sNMF analysis revealed the genetic components of wild relatives in domestic breeds, for example, the genetic ancestry of Asiatic mouflon in Awassi and Bangladeshi sheep breeds, argali in Bashibai sheep, and European mouflon in Cameroon and Sv€ardsjö sheep (fig. 3C). Additionally, we 0. 00 0 0. 00 4 0. 00 8 0 100 200 300 400 500 0. 2 0. 4 0. 6 0. 8 Distance(Kb) r2 Domestic sheep European mouflon Asiatic mouflon Urial Argali Snow sheep Thinhorn sheep Bighorn sheep African populations American populations Central-and-East Asian populations European populations South-and-Southeast Asian populations the Middle Eastern populations Years (generation time = 3 years, mutation rate=1.51×10-8) 103 104 105 105 104 103 N e Predomestication (10,000-12,000 BP) Domestication episode (8,000-10,000 BP) π A D E B 0. 00 0 0. 00 4 0. 0 0 8 π 1. Central-and-East Asian populations 2. South-and-Southeast Asian populations 3. the Middle Eastern populations 1 2 3 4 5 6 C 4. African populations 5. European populations 6. American populations -50 0 50 -100 0 100 200 0.0020 0.0024 0.0028 La tit ud e Longitude π Dom es tic sh ee p Euro pe an m ou flo n Asia tic m ou flo n Uria l Arga li Sno w sh ee p Thin ho rn sh ee p Bigh orn sh ee p FIG. 2. Genetic variability of wild and domestic sheep. (A) Average nucleotide diversity (p) of eight wild and domestic sheep species. (B) Nucleotide diversity (p) of 158 domestic populations across the world. (C) Average nucleotide diversity (p) of six geographic groups of domestic sheep populations. (D) Decay of LD for eight wild and domestic sheep species, with one line per species. (E) Effective population sizes (Ne) for six domestic sheep populations inferred using SMCþþ, with one line per population. A generation time of 3 years and mutation rate of 1.51� 10�8 per site per generation were used to convert coalescent scaling to calendar time. Lv et al. . doi:10.1093/molbev/msab353 MBE 6 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data 0.04 Bighorn sheep Thinhorn sheep Snow sheep Argali Asiatic mouflon/Urial European mouflon HDW SXW W DS SSS THQ HUS MXS/HZS/GMA/SAB/SAJ/CHA TAN W ZS KHO/TAM BAY DAR BAJ1 BAG DLS LOP CLS KAZ/ALS BAJ2 KAED BSB DEG1 VEM KIL PAS/NES NES M ES M RS NES JLS KAC THA/KAJ/W G R KAG BAS G AR/BG E SUM ITT/G UR DQ S TCS ZLX ZR J DEG 1 KAS KAR BAL G SS H AS AFH SH A G H E M AZ BO Z G ALA AW A M AK/M O H IR O A AW A KR S N D Z AFH M EN BO G AFS SO M N Q A/R D A AW D /D PS W AD /M BS M O S/D JI SAH /M O S YAN U D A C AM LO C A D M AN D M AN A C OL / NA M D NA M D A C OL A C OL/ NA M D A C OL/ NA M D A C OL/A MIT A C OL/ NA M D A C OL/ D RAS/ NA M D BO U J/ D M AN /S AR D /L O C A D M AN /S AR D /T IM A/ LO C A BE N I/D M AN /S AR D /O U LE /L O C A LO C A O SS /B AR /L BS /R AH FI N SF S G O T H EL SV S IL S/ O SP /F SS /G TS SH E N R S EF R O LS D R S BE N T O U E RY E LI E O XD TE X ZW A R O S SW A W H S G TS /N W S/ W M S HE B SF K HA M G CN KU IB SU M RH S SO L BB S/ W BS AL T LE C M KS PR AM VA L PI S DA S RI P XI S AR M E/ KA TO /C AU C/ ZA B/ AZ M E/ DA M E M SF /M ER BC S CA S LA C CH U RO M SA L M NR SI S BE R As iat ic mou flo n 70.01. Central-and-East Asian populations 2. South-and-Southeast Asian populations 3. the Middle Eastern populations 4. African populations 5. European populations 6. American populations 1 2 3 4 5 6 -0.05 0.00 0.05 -0 .0 5 0. 00 0. 05 0. 10 PC1 (22.6%) P ) %2.71( 2 C Central-and-East Asian populations South-and-Southeast Asian populations the Middle Eastern populations African populations European populations American populations yrtsecnA 0.0 0.1 41 2 3 5 6 7 8 1. Central-and-East Asian populations 2. South-and-Southeast Asian populations 3. the Middle Eastern populations 7. European mouflon 8. Asiatic mouflon/Urial 9. Argali A B C 4. African populations 5. European populations 6. American populations 10. Snow sheep 11. Thinhorn sheep/Bighorn sheep TCEA G host 2 31 54 TAF NAME TME TSSA2 Asiatic mouflonPresent TSSA1 TEU 9.25 ka (9.14-9.33) 8.74 ka (8.66-8.83) 7.75 ka (7.68-7.83) 7.35 ka (7.27-7.43) 5.93 ka (5.85-6.01) 4.05 ka (3.93-4.18) 2.80 3.23 1.83 2.638.87 (8.70-9.05) 0.302 4.95 2.12 0.268 2.42 1.19 1.68 D E (0.296-0.307) (0.263-0.274) (2.37-2.46) (3.19-3.27) (2.78-2.83) (2.35-2.91) (1.47-2.20) (1.18-1.20) (1.67-1.69) (2.08-2.16) (4.86-5.04) 1. Central-and-East Asian populations 2. South-and-Southeast Asian populations 3. the Middle Eastern populations 4. African populations 5. European populations 1 2 3 4 5 6 7 9 10 11 1213 14 1 The Mediterranean routes 2 The Danubian route 3 The northern Europe route 4 The ancient sea trade route to the Indian subcontinent 5 Thin-tailed sheep routes in Africa Fat-tailed sheep routes in Africa 7 The first route to America ~15th century 8 The second route to America ~ 16th to 19th century 9 The recent route to America 10 11 12 The Qinghai-Tibetan Plateau route The Yunnan-Kweichow Plateau route 13 The South-and-Southeast Asian route 14 The admixture event from Middle Eastern sheep into South-and-Southeast Asian sheep The Eastern Eurasia route The major post-domestication migratory events of sheep All arrows are approximate The approximate domestication center The geographic distribution of Asiatic mouflon The approximate major migratory events of sheep in South-and-Southeast Asia inferred in this study 8 6 aE=0.30 (0.293-0.316) 9 East-and-South African Ouessant Gotland East-and-South African West African North African West African North African FIG. 3. Population genetic structure of wild and domestic sheep and the demographic history. (A) Neighbor-joining (NJ) trees of wild and domestic sheep based on whole-genome SNPs using the identity-by-state (IBS) genetic distances implemented in PLINK v1.90 (Purcell et al. 2007). BAJ1 and BAJ2 are from the population BAJ, and DEG1 and DEG2 are from the population DEG. (B) PCA of domestic sheep populations. (C) Population genetic structure of wild and domestic sheep inferred from the sNMF analyses (K¼ 11) using whole-genome SNP data. (D) Demographic history reconstruction of domestic sheep. The best-supported demographic model of South-and-Southeast Asian population: South-and-Southeast sheep diverged from Central-and-East Asian population�5.93 ka and was later admixed by the Middle Eastern population�2.42 ka. The Middle Eastern population contributed�30% of their gene pool to the South-and-Southeast Asian population. Dates of events (ka) are indicated on the Genetic Diversity, Introgression, and Agronomically Important Loci of Worldwide Sheep . doi:10.1093/molbev/msab353MBE 7 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 noted clear evidence of genetic heterozygosity among European, African, and Asian populations. This indicated sub- stantial historical genetic flows among elite genetic stocks from different continents in the development of local breeds (Galal et al. 2008; Ciani et al. 2020). One objective of this study was to fill the gaps of the global genomic coverage of population-scale WGS data by including samples from the so far neglected South-and-Southeast Asian, African, and South American sheep breeds. South- and-Southeast Asian breeds cluster with breeds from Pakistan and India. South American breeds share ancestry with sheep from both Western Africa and Europe as reported previously (Kijas et al. 2012). Genetic Origins of South-and-Southeast Asian Sheep In order to explore the origin of South-and-Southeast Asian sheep (Singh et al. 2013; Lv et al. 2015), we tested three alter- native models (supplementary fig. S13, Supplementary Material online). We computed the expected multidimensional site fre- quency spectra (SFS) under specific models and compared it with the observed multidimensional SFS by a composite like- lihood method. The best supported SFS with the highest aver- age value of estimated log10(likelihood) (supplementary fig. S14 and table S14, Supplementary Material online) suggested that South-and-Southeast Asian sheep descended from Central- and-East Asian population 5.85 to 6.01 ka (95% CIs). The op- timal model also showed a later admixture from the Middle Eastern sheep 3.93–4.18 ka, which contributed approximately 29–32% of the gene pool (fig. 3D and supplementary table S15, Supplementary Material online). The best model further indicated a divergence of the European, African, and the Middle Eastern sheep from their common ancestor 9.16–9.33, 8.66–8.83, and 7.68–7.83 ka, re- spectively (fig. 3D and supplementary table S15, Supplementary Material online). This was in agreement with the initial expansions of ancestral sheep as evidenced by the archaeological findings and early genetic studies (Zeder 2008; Deng et al. 2020). Asymmetric gene flow occurred from the Central-and-East Asia to the South-and-Southeast Asia (8.70–9.05 migrants per generation) and from the South-and- Southeast Asia to the Middle East (4.86–5.04 migrants per generation) (fig. 3D and E and supplementary table S16, Supplementary Material online). Retained or Introgressed Sequences from Wild Sheep in the Domestic Genomes The sharing of ancestry by wild and domestic sheep, indicated by the Admixture analysis, may be the result of two scenarios. First, sequences from Asiatic and European mouflons that were inherited directly by the first domestic sheep may have been retained in specific breeds, whereas other parts of the genome adapted to the domestic status and breeding objectives, for instance during the transition from hair to wool sheep (Chessa et al. 2009). Second, introgression may have taken place during later contacts between wild and domestic sheep. Here we investigated the pattern and con- sequences of the retained or introgressed sequences (RIS) using a combination of D and fd statistics. We used ABBA-BABA statistics to detect ancestry shared by wild donors to domestic breeds. In agreement with Ciani et al. (2020), we found substantial influence (negative D values) of European mouflon on several breeds with the higher jZj values in North European sheep such as Welsh Mountain, Icelandic Leader, and Shetland (fig. 4A and supplementary table S17, Supplementary Material online). Retained or introgressed an- cestry from Asiatic mouflon and the related urial are much lower (D value is closer to zero), although Z values indicate statistical significance for a few breeds. Asiatic mouflon and urial appear to have influenced the same breeds, confirming that these wild species have become mutually admixed (fig. 3C) (Demirci et al. 2013; Deng et al. 2020; Chen et al. 2021). Interestingly, the genomes of Bashibai sheep carry a sizable argali component. In addition to several other potential explan- ations such as horizontal gene transfer, hybridization, and com- mon ancestry (Wang et al. 2021), the observation could be very likely the result of introgression (fig. 4A and supplementary table S17, Supplementary Material online). To further locate the RIS genomic regions, we computed the fd values for 100-kb windows and 50-kb step across the genomes of relevant domestic populations. After the P-values adjusted (false discovery rate, FDR), we detected 258 (fd > 0.1714) � 434 (fd > 0.1585) significant introgressed blocks from argali, 283 (fd > 0.2271) � 335 (fd > 0.2363) significant introgressed tracts from urail, 311 (fd > 0.2374) � 344 (fd > 0.5322) significant introgressed blocks from Asiatic mouflon, 296 (fd > 0.5117)� 323 (fd > 0.3157) significant introgressed tracts from European mouflon, respectively (supplementary fig. S15 and table S18, Supplementary Material online). The proportion of genome introgression (PGI) of domestic sheep breeds from different donor wild species on the basis of the fd statistics gives different patterns (fig. 4B–E and supplementary table S19, Supplementary Material online). The proportion of RIS segments, which reflects the actual genome size of do- mestic sheep retained or introgressed from wild relatives was estimated to be 7.64–13.40%, 5.94–9.61%, 4.31–6.39%, and 3.67–5.03% of their whole genomes originating from left and right, and estimated migration rates (migrants per generation) are scaled by the effective population size were given above/below the corresponding arrows. The 95% confidence intervals (CIs) are shown in parentheses. (E) Major sheep migrations across the world. (1) The Mediterranean routes (Ryder 1984), (2) the Danubian route (Ryder 1984), (3) the northern Europe route (Tapio et al. 2006), (4) the ancient sea trade route to the Indian subcontinent (Singh et al. 2013), (5) thin-tailed sheep routes in Africa (Muigai and Hanotte 2013), (6) fat-tailed sheep routes in Africa (Muigai and Hanotte 2013), (7) the first route to America�15th century (Dunmire 2013), (8) the second route to America�16th to 19th century (Spangler et al. 2017), (9) the recent route to America (Ryder 1984), (10) the Eastern Eurasia route (Lv et al. 2015), (11) the Qinghai- Tibet Plateau route (Zhao et al. 2017; Hu et al. 2019), (12) the Yunnan-Kweichow Plateau route (Zhao et al. 2017; Hu et al. 2019), (13) the South- and-Southeastern Asia route (this study), and (14) the admixture event from Middle Eastern sheep into South-and-Southeastern Asian sheep (this study). Lv et al. . doi:10.1093/molbev/msab353 MBE 8 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data 0.0764 - 0.0914 0.0915 - 0.1014 0.1015 - 0.1128 0.1129 - 0.1227 0.1228 - 0.1340 0 0.0594 - 0.0638 0.0639 - 0.0694 0.0695 - 0.0773 0.0774 - 0.0892 0.0893 - 0.0961 0 0.0431 0.0432 - 0.0477 0.0478 - 0.0486 0.0487 - 0.0553 0.0554 - 0.0638 0 0.0367 0.0368 - 0.0503 0 -0.09 0 -30 0 D (M en z, X ; W ild s he ep , B ig ho rn s he ep ) D -v al ue Z- va lu e European mouflon Asiatic mouflon Urial Argali W H SF H AM IL S H EB R O S O XD G TS N R S IC S C AM SW A M ea t LI E R YE M ilk G C N N W S O U E TE X SF K O SP SH E M ER C AS SO L BE N T SV S O LS ZW A XI S R IP C H U D R S C AU C KU IB FS S M SF G O T R H S SA L BC S M FW H EL SF S D AM E R O M AR M EBM N LE C PI S KA TOAW D ZA B W BS AL T BB S EF R FI NSU M A O U LE BE N I D PS D M AN U N KN PR AM BS I SA R D TI M A LO C A SI S M KS D ASBE R A Z M EVA L N Q A R D A D Q S D JI M O S SA H ZR J BO U J TC SM N R KA Z KR S BG E TH O G AR BO Z M O H M AK S O M BA R YA N ZL X SU M LB S U D A W AD D EGM BS G U R D AR BA J C H A BA G KA ED BA S SS S R ED H U S W D S IT T W M S G U R N Q A R D A G AR N D Z SW A IR O A LO C A M ER SU M N Q A M ER G U R R D A BG E SU M BS B British North European South-and-East European Asian and African PGI from Asiatic mouflon PGI from Urial PGI from Argali PGI from European mouflon 0 20 40 60 80 100 120 140 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 European mouflon Asiatic mouflon Urial Argali Le ng th o f a cc um ul at ed R IS fr ag m en ts (M b) Chromosomes 0 0.2 0.4 0 0.2 0.6 0 0.2 0.4 0.6 0 0.2 0.6 UrialArgali Asiatic mouflonEuropean mouflon 0.40.4 0.8 H B BA K T3 M R P4 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 N PF FR 1 B C D E A F G FIG. 4. Genetic introgression of wild sheep species in domestic populations. (A) D statistic tests in the form (Menz sheep, X, wild sheep, Bighorn sheep) where wild sheep indicates European mouflon, Asiatic mouflon, urial, or argali, respectively. X indicate a candidate domestic population. Introgressions from European mouflon into British, North European, Southwest-and-East European, and Asian, and African populations are indicated by green, gold, purple, and black color, respectively. (B, C, D, E) The PGI across the whole genome of each wild sheep species in worldwide domestic populations. The levels of PGI from European mouflon, Asiatic mouflon, urial, and argali are mapped based on the results of fd statistics: fd (Menz sheep, X, European mouflon, Bighorn sheep), fd (Menz sheep, X, Asiatic mouflon, Bighorn sheep), fd (Menz sheep, X, Urial, Bighorn sheep), and fd (Menz sheep, X, Argali, Bighorn sheep), separately. X indicates a candidate domestic population. PGI can be estimated with the equation: PGI =(Rfdi�Gi)/G, where fdi and Gi refer to the fd value and the window size in base pairs for the ith window, and G is the genome size in base pairs (Zhou et al. 2020). (F) RIS of wild sheep species in domestic sheep on each chromosome. Colored heatmaps of blue, red, purple, and gray show the abundance of genomic regions containing RIS from the four wild sheep species based on fd statistics (Argali, Asiatic mouflon, Urial, and European mouflon). The genes with the most significant signals of introgression are highlighted. (G) Length of accumulated RIS from the four wild sheep species (Argali, Asiatic mouflon, Urial, and European mouflon) in domestic sheep. Genetic Diversity, Introgression, and Agronomically Important Loci of Worldwide Sheep . doi:10.1093/molbev/msab353MBE 9 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 European mouflon, Asiatic mouflon, urial, and argali, respec- tively (supplementary table S19, Supplementary Material on- line). Counting the RIS tracts of union set for every wild donor into all domestic recipients yielded 566, 1437, 1718, and 4808 RIS tracts from argali, urial, Asiatic mouflon, and European mouflon, respectively (fig. 4F and supplementary table S20, Supplementary Material online). The accumulated genome sizes of RIS segments on each chromosome in domestic genomes were shown in figure 4G. The number and size of detected introgression windows show different patterns among different chromosomes and different donor wild spe- cies (fig. 4F and G and supplementary table S20, Supplementary Material online). Of the tracts shared with the wild relatives, 79 common blocks were shared with all four wild species and span 125 functional genes (supplementary tables S21 and S22, Supplementary Material online). In the GO analysis of the retained or introgressed genes, we identified a significant (ad- justed P< 0.05) overrepresentation of genes involved in orga- nonitrogen compound (GO:0010243), hemoglobin complex (GO:0005833), olfactory receptor activity (GO:0004984), and oxygen transporter activity (GO:0005344) (supplementary fig. S16 and table S23, Supplementary Material online). The KEGG analysis identified a major and significant enrichment of genes associated with insulin signaling pathway and African trypanosomiasis (oas05143) (supplementary fig. S16, Supplementary Material online). In particular, the significantly enriched GO terms and KEGG pathways contained the oxy- gen transporter activity, patterning of blood vessels, and olfac- tory receptor activity genes such as HBB, VEGFC, and Olr51L1. In order to illustrate the potential of this approach, we focus on the Changthangi sheep in the high-altitude region of Ladakh ca. 4,000 m above sea level. Genome scans of fd values of Changthangi versus Asiatic mouflon were shown in figure 5A. One of the introgression signals is the HBB region on chromo- some 15 (chr15: 51.9–52.0 Mb). Various analyses such as the population branch statistics (PBS, Yi et al. 2010; fig. 5B and supplementary table S24, Supplementary Material online), dxy, FST distances, and haplotype pattern (supplementary fig. S17, Supplementary Material online) suggested an anomalous phy- logeny as the result of wild introgression of variants of HBB and/ or neighboring genes that confer hypoxia adaptation in the Changthangi sheep (Hu et al. 2019). In the GO analysis of the introgressed genes from wild species into Changtangi sheep, we identified a significant (P< 0.05) overrepresentation of genes involved in oxygen transporter activity (GO:0005344), oxygen binding (GO:0019825), pigmentation (GO:0043473), cellular re- sponse to UV-B (GO:0071493), and energy homeostasis (GO:0097009) (fig. 5C). The KEGG analysis identified a major and significant enrichment of genes associated with olfactory transduction (oas04740) and PI3K-Akt signaling pathway (oas04151) (fig. 5C). The two highest peaks in the PBS scan (fig. 5B) cover IGF2BP2, encoding an important protein in cell metabolism and development, and RXFP2, a gene involved in horn development (Li et al. 2018; Aldersey et al. 2020). Both also have elevated FST values (fig. 6A). We calculated the probability of incomplete lineage sort- ing (ILS) using a recombination rate of 1.5 cM/Mb (Petit et al. 2017), a generation time of 3 years for domestic sheep (Zhao et al. 2017), a divergence time of 2.36 Ma between argali and domestic sheep (Yang et al. 2016), �1.26 Ma between urial and domestic sheep (Rezaei et al. 2010), �11 ka between Asiatic mouflon and domestic sheep (Chessa et al. 2009), and 5–6 ka between European mouflon and domestic sheep (Vigne 1992). This gives a length of tracts in domestic sheep shared with argali of 42.4 bp, with urial of 79.4 bp, with Asiatic mouflons of 9,091 bp, and with European mouflon of 16.7– 20 kb. The probability of a length of 85.4 kb (i.e., the observed introgressed region from argali, urial, Asiatic mouflon, and European mouflon) is zero, which ruled out that the intro- gressed region identified in Changthangi sheep were due to random ILS without selection. Novel Signatures of Selection The distribution of genetic differentiation statistic FST values for 50-kb windows with 25-kb step across the genomes of domestic sheep, an indicator of selection, is summarized in figure 6A and supplementary table S25, Supplementary Material online. Selective sweeps contained 559 genes, of which 259 (46%) were novel, and 300 were identified by sev- eral previous investigations on the basis of WGS (supplemen- tary table S25, Supplementary Material online) (Yang et al. 2016; Alberto et al. 2018; Li et al. 2020). Functional annotation of putatively selected genes revealed that they were predom- inantly associated with morphological and agronomic traits, including genes such as KRT71 and FGF7 (skin development function), Olr226, Olr51B2, and Olr2D3-like (olfactory func- tion), and HBB, HBB-like, HBBC, and HBE2 (oxygen transporter function, see above). In addition, the previously characterized genes with relevant traits such as RXFP2 (see also fig. 5B), PDGFD (Dong et al. 2020), and TMEM154 (Heaton et al. 2012; Hu et al. 2019; Li et al. 2020) have elevated FST values (fig. 6A). We found significant overlap of the FST outliers with known QTLs in sheep (permutation test, P< 0.001), most notably with milk-, meat-, and disease-resistance-related QTLs (supplementary table S26, Supplementary Material on- line). Within the protein-coding exons in the selective regions, we identified 8,753 nonsynonymous SNP mutations in 559 putatively selected genes. The allele frequencies of nonsynon- ymous SNPs and their genotypes pattern in ten genes (e.g., EVC, GDF6, RXFP2, TMEM154) showed differences among populations with varied phenotypes such as body size (e.g., normal/dwarf), horn status (e.g., horned/polled), and disease resistance (e.g., pneumonia susceptibility/resistance) (supple- mentary table S27, Supplementary Material online). A genome-wide reduction of diversity (ROD) as alterna- tive selection signature indicated 134 genomic regions with top 1% of both ROD values between landraces and im- proved breeds of domestic sheep (fig. 6B and supplemen- tary table S28, Supplementary Material online). We found that ROD chr15: 49,000,001–50,000,000 were in concor- dance with high nonsynonymous to synonymous (N/S) ratio (N/S ratio ¼ 1.49; supplementary table S29, Supplementary Material online). Additionally, 59 SVs under selection during sheep improvement including 1,601 Lv et al. . doi:10.1093/molbev/msab353 MBE 10 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data candidate genes (supplementary table S30, Supplementary Material online), several of which are related to regulation of ovulation rate (e.g., BMP15 and CYP17) (Estienne et al. 2017). The composite likelihood ratio (CLR) statistic scores for 10- kb windows in Ouessant sheep with an extreme dwarf phe- notype was shown in figure 6C. Outlier windows with CLR > 1.31 (P< 0.001) were considered to be under artificial selec- tion and cover 1,702 genes (supplementary table S31, Supplementary Material online). These genes were mainly involved in bone resorption (e.g., PTGER4), bone mineraliza- tion (e.g., BMP6), and dwarfism (e.g., GDF6 and PTDSS1). In gene GDF6, we also detected significant (Mann–Whitney, P< 0.001) differences in a few nonsynonymous SNP variants and genotype patterns between the populations with differ- ent statures (supplementary fig. S18, Supplementary Material online). Several of the genes (e.g., IGF2BP1, SMOX family mem- ber, and FGFR family member) related to stature in sheep have been associated with body size in cattle and dogs (Rimbault et al. 2013; Hayward et al. 2016; Bouwman et al. 2018). Selective Signatures Associated with Wool Traits Genome-wide XP-CLR selection tests of domestic breeds pos- sessing different fleece fiber variations (e.g., hair, coarse wool, medium wool, and fine wool; supplementary table S32, FIG. 5. Signals of introgression and selection in the Changthangi sheep. (A) Manhattan plot showing the introgression signal from Asiatic mouflon to Changthangi sheep. The locations of the HBB, HBE2, HBBC, HBE1, RAB6A, and MRPL48 genes are shown by the highest fd value. (B) Signals of selection in the Changthangi sheep detected by the PBS (Yi et al. 2010). PBS values in windows of 100 kb and names of genes associated with the highest peaks are shown. (C) GO and KEGG pathway enrichment analysis based on genes across significant introgressed regions from wild species to Changthangi sheep. The dot size shows the gene count enriched in the pathway, and the dot color shows adjusted significance value of the enriched pathways. Genetic Diversity, Introgression, and Agronomically Important Loci of Worldwide Sheep . doi:10.1093/molbev/msab353MBE 11 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data Supplementary Material online) were shown in all the six pairwise comparisons (fig. 7A and supplementary figs. S19– S23 and tables S33–S38, Supplementary Material online). From 2,759 genes associated with 5,995 selective signals, 86 are scored in at least two of the six comparisons and with functions involved in fiber and wool development, including novel (e.g., HR, ASTN2, TWIST2, IRF4, INHBA, and FGF7) and previously reported (e.g., MCIR, FOXQ1, TP63, VEGFA, and PTCH1) genes. We also identified 414 selective SVs associated with wool traits in the selective sweeps of SVs (supplementary tables S39–S44, Supplementary Material online). Annotation of the SVs indicated their colocalizations with genes or QTLs associated with mean fiber diameter (e.g., FST gene) (Ma et al. 2017). Positional gene enrichment analysis of the gene sets showed significant (Padj. < 0.05) overrepresentation of genes related to fiber diameters in these regions (De Preter et al. 0. 0 0. 2 0. 4 0. 6 0. 8 1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 23 26 G lo ba l F ST IGF2BP2 ASTN2 NFX1 OSTF1 SOCS2 HOXC13 MSRB3 HOXA1 SSBP2 ABCC4 HOXB13 KIF1C NF1 CDH8 PDGFD LOC114118424 TMEM154 KIF9 LOC101111733 IRF2BP2 CSMD1 FOXO4 EVC RXFP2 113,776,294-bp(T/G;Tyr/Asp) CDS 113,776,250 113,776,275 113,776,300 113,776,325 EVC 30,967,896−bp(G/A,Glu/Lys) CDS 30,967,900 30,967,950 30,968,000 30,968,050 RXFP2 5,776,842−bp(A/G;Lys/Glu) CDS 5,776,900 5,777,000 TMEM154 0. 0 0. 1 0. 2 0. 3 1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 22 25 R O D ATPAF1 TMEM275 POMGNT1 MAST2 KNCN NAB1 NEMP2 ZNF804A EXOC6B ACTR1B ANKRD39 TMEM176B CNOT8 STBD1 FBXL5 HEBP2 SMIM28 PERP OSR2 RXFP2 HSPH1 SLC25A10 DYRK3 KIF5B GPCPD1 BMP2 MC1R MRPL17 TPP1 TRIM3 CCKBR DGAT2 TSSK1B MCTP2 KIF15 MDGA1 TMEM151A FGF8 SPIRE1 CHMP1B IRF2BP2 SMIM18 CAMK4 A B C LR s co re TGFBR3 SAMD5 EPHA2 ESR1 PTDSS1 GDF6 IGF2BP1 XPR1 SMOX SOX6 PTPN5 PRR14FGF10 PTGER4 PDGFC INPP4B BMP5 BMP6 TMED7 CHRM2 0 2 4 6 8 10 12 14 1 2 3 4 5 6 7 8 9 10 12 14 16 18 20 22 24 26 CDS 89,112,200 89,112,400 GDF6 OGFOD2 PRSS16 ATP2A1 C 89,112,389-bp(C/A;Thr/Lys) FIG. 6. Genome-wide selective signals across all the domestic sheep populations based on global FST and ROD, and associated with the stature (i.e., body size) phenotype. (A) Genome-wide distribution of global FST across all the 158 domestic sheep populations. (B) Candidate selective genomic regions based on ROD analysis. (C) Manhattan plot for the dwarf (i.e., small body size) phenotype showing the CLR score distribution in the Ouessant sheep population. Genes within regions with significant outlier scores are highlighted. Lv et al. . doi:10.1093/molbev/msab353 MBE 12 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data A C B D E F FIG. 7. Genomic evolution of hair/coarse-wool/medium-wool/fine-wool sheep. (A) Whole-genome selective signals between fine-wool and hairy populations of domestic sheep by the cross-population composite likelihood ratio (XP-CLR) test. (B) Six comparisons of p ratio (pHair sheep/pCoarse- wool sheep, pHair sheep/pMedium-wool sheep, pHair sheep/pFine-wool sheep, pCoarse-wool sheep/pMedium-wool sheep, pCoarse-wool sheep/pFine-wool sheep and pMedium- wool sheep/pFine-wool sheep) indicates strong selective signal in IRF2BP2 gene. (C) Allele frequency of the mutation site and frequencies of genotypes CC, TT, and CT in 30-UTR of IRF2BP2 (chr25: T7,068,586C), and significance of the genotype difference is tested. (D) Haplotype pattern of IRF2BP2 among wild sheep, hairy, coarse-wool, medium-wool, and fine-wool populations of domestic sheep. (E) Sequence comparison among different species at mutations chr25: 7,068,586, and chr25: 7,070,630. (F) GO and KEGG pathway enrichment analyses, with the significant (P< 0.05) GO terms and pathways and associated genes shown. Genetic Diversity, Introgression, and Agronomically Important Loci of Worldwide Sheep . doi:10.1093/molbev/msab353MBE 13 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 2008). All candidate genes under selection are directly or in- directly associated with wool growth and development (fig. 7F). We further dissected the genomic architecture of these strong candidate genes by calculating allele frequencies at nonsynonymous SNPs in exons. We observed significant dif- ferences in allele frequencies (Mann–Whiteney test, P< 0.001), and genotype or haplotype patterns of nonsynon- ymous SNPs in IRF2BP2 and MC1R genes between breeds displaying different wool phenotypes (supplementary table S27 and fig. S24, Supplementary Material online). Functional Analysis of IRF2BP2 Variation Remarkably, the signals overlapping with IRF2BP2 are outliers in four of the six comparisons in the XP-CLR analysis and p ratio analysis (fig. 7B and supplementary tables S26–S31, Supplementary Material online). We annotated the gene con- taining 85 SNPs and detected one intron and two exons in the gene. The 30-UTR variant (chr25: 7,068,586) and the intronic mutation (chr25: 7,070,630) showed significantly (P< 0.05) differentiated allele and genotype frequencies be- tween the groups of breeds with different fleece fiber diam- eters, such as hair, coarse wool, medium wool, and fine wool (fig. 7C and supplementary fig. S25, Supplementary Material online). In domestic sheep, the ancestral T (chr25: 7,068,586) and G alleles (chr25: 7,070,630) showed higher frequencies in hair and coarse-wool populations, corresponding to the an- cestral alleles, whereas the derived reference C (chr25: 7,068,586) and A (chr25: 7,070,630) alleles were dominant in medium-wool and fine-wool populations (fig. 7C and D). Further characterization of alleles of these two variants in a large variety of species including wild sheep, wild yak, goat, and horse showed both ancestral alleles in animals with hair or coarse-wool coat (fig. 7E). Thus, this observation indicated that the two variants in IRF2BP2 may have advantageous effects on fleece fiber diameter. In our worldwide sheep panel, these SNPs do not cosegregate with the EIF2S2 insertion up- stream of the IRF2BP2 gene previously found to be associated with wool and fine-wool genotype (Demars et al. 2017). The fragments per kilobase million (FPKM) value for IRF2BP2 mRNA expression showed significant (P< 0.05) dif- ference in skin tissues at different prenatal stages between coarse-wool and fine-wool sheep (fig. 8A and supplementary table S45, Supplementary Material online). No expression was observed in IRF2BP2 for coarse-wool sheep (fig. 8A). We iden- tified binding sites of the highly conserved miRNAs oar-miR- 496-3p, oar-miR-379-3p, oar-miR-411a-3p, and oar-miR-20a- 3p binding in the 30-UTR of IRF2BP2, which suggested IRF2BP2 as a target of these four miRNAs (supplementary table S46, Supplementary Material online). MiRNA expression profiling in skin samples of three coarse-wool Hu sheep and nine Aohan fine-wool sheep (supplementary table S47, Supplementary Material online) revealed expression of only oar-miR-20a-3p in coarse-wool sheep (fig. 8B). Binding of oar-miR-20a-3p to the 30-UTR of wild-type (WT) IRF2BP2 (fig. 8C) was confirmed by luciferase assay in HEK293T cells. We introduced the WT IRF2BP2 30-UTR into a luciferase expression vector and found that expression of oar- miR-20a-3p reduced reporter activity compared with a neg- ative control mimic (fig. 8D). However, mutated 30-UTR of IRF2BP2 abolished the reduction of luciferase by the oar-miR- 20a-3p-mediated (fig. 8D and E and supplementary fig. S26 and table S46, Supplementary Material online). Altogether, these data supported that the WT IRF2BP2 30-UTR was a direct and functional target of oar-miR-20a-3p, which could directly bind IRF2BP2 and potentially regulate IRF2BP2 expres- sion in the skin tissues of coarse-wool sheep. We also detected a strong and consistent selective signal in the gene VEGFA (vascular endothelial growth factor A) in four comparison tests (hair vs. fine wool, hair vs. medium wool, coarse wool vs. fine wool, and coarse wool vs. medium wool, supplementary tables S26–S31, Supplementary Material on- line). Early evidence suggested that IRF2BP2 coactivates VEGFA expression, leading to the activation of secondary wool follicles (Ho et al. 2012; Li, Lu, et al. 2012; Li, Man, et al. 2012; Zhang et al. 2018). In the histological analysis, hematoxylin and eosin (H&E) staining showed a larger ratio of diameters (DS/DP) but a smaller number (NS/NP) of sec- ondary to primary wool follicles in Wadi sheep (coarse wool) than in Chinese Merino sheep (fine wool) in the important developmental stage of wool follicles (fig. 8F), which conse- quently accounted for the higher yields and finer wool of uniformly small-diameter fibers in fine-wool sheep (Galbraith 2010). Thus, oar-miR-20a-3p may act on the de- velopment of wool follicles in coarse-wool sheep through its effect on IRF2BP2, which potentially inhibits VEGFA signaling expression in wool follicles (fig. 8F). Discussion In this study, we performed a whole-genome sequencing analysis of 810 wild and domestic sheep. This is thus the hitherto most comprehensive data set on population genetic structure of worldwide sheep using more than �2 times as many individuals and WGS as in previous studies (Alberto et al. 2018; Naval-Sanchez et al. 2018; Deng et al. 2020; Li et al. 2020). We explored the occurrence of sequences from wild species in the domestic genomes and further identified useful nonsynonymous SNP mutations and SVs involved local ad- aptation and agronomic traits. The Influence of Wild Species on Domestic Genomes In general, a high level of diversity with novel, rare, and private variants was observed in wild relatives compared with both improved and landrace breeds (fig. 1C and D). In domestic sheep, we observed many region-specific variants and a clear clustering of populations according to their geographic ori- gins (supplementary fig. S3, Supplementary Material online). Several valuable genetic variants for modern sheep appeared to originate from wild sheep species. Genetic influence of wild on domestic species as a mode of adaptation have already been described for chicken (Wang et al. 2020), pigs (Ai et al. 2015), dogs (Miao et al. 2016), and cattle (Chen et al. 2018) and is supported by ancient DNA studies (Frantz et al. 2020). We observed that among the wild sheep species, the European mouflon has the most widespread influence on domestic sheep, although the influence of Asiatic mouflon Lv et al. . doi:10.1093/molbev/msab353 MBE 14 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data may have been underestimated by a substantial influence of the urial on the current Asiatic mouflon population (this study) (Demirci et al. 2013; Deng et al. 2020). The European mouflon is supposed to have emerged as a feral population when the original hair-coat domestic sheep were replaced by wool sheep. During this process, mouflon DNA may have been retained in domestic sheep, in particular in North European breeds, some of which have retained a wild appear- ance (Chessa et al. 2009; Ciani et al. 2020; Deng et al. 2020). Several English breeds such as the Suffolk, Texel, Dorset, and the Longwools became transboundary mutton sheep and have been used for improving local breeds on different con- tinents (Kijas et al. 2012; Deniskova et al. 2018; Cao et al. 2021), thus spreading the ancestry of European mouflon. Alternatively, mouflon influence may be the result of a later introgression in a sympatric domestic population (Barbato et al. 2017; Cao et al. 2021). Historical records from ancient Rome and the 18th century have described hybridization between wild and domestic sheep in Europe (Cetti 2000). Even nowadays, farmers in Sardinia typically al- low sheep to graze in the wild, where European mouflon reside (Lorenzini et al. 2011). Our estimates of the proportion of the mouflon component in domestic breeds (7.64–13.40%) are higher than the 1.0–4.1% estimated by Barbato et al. (2017) and the 0.6–1.3% estimated by Cao et al. (2021), both of which were on the basis of 50K genotypes. This discrepancy is likely explained by sensitive detection of short mouflon fragments by WGS, which are missed by 50K SNP BeadChip genotyping with an average spacing between SNPs of about 54 kb. Introgression is the most plausible explanation for the argali influence on the Bashibai sheep. We observed that domestic sheep may have acquired advantageous alleles of several immune and sensory genes via natural or managed hybridization with the wild relatives (supplementary table S22, Supplementary Material online). This may very well have facilitated the worldwide range of domestic sheep adapting to a wide range of environments (Cao et al. 2021), for instance to the extreme conditions of the Arctic climate and the Himalaya Highland. Previous studies have showed genetic introgression of wild relatives in domestic animals such as sheep, goat, pig, dog, cattle, yak, cattle, rabbit, horse, and chicken (Ai et al. 2015; Medugorac et al. 2017; Jones et al. 2018; Wu et al. 2018; Lawal et al. 2020; Wang et al. 2020; Zheng et al. 2020; Cao et al. 2021; Yu et al. 2021; ). The proportion of wild introgression varies from 0.05% to 27.3% depending on the populations of do- mestic animals and the relevant wild relatives (supplementary table S48, Supplementary Material online). These introgressed variants and segments were located in genes with functions such as environmental adaptation (e.g., the response to hyp- oxia), innate immunity, olfactory relevance, and morpholog- ical traits (e.g., coat color, the polled, and the different shapes hair follicle VEGFA expression secondary follicle primary follicle secondary follicle IRF2BP2 mRNA oar-miR-20a-3p 3’UTR IRF2BP2 VEGFA expression C E * oar-miR-20a-3p negative control (NC) oar-miR-20a-3p N.S. R el at iv e lu ci fe ra se a ct iv ity NC+M U oa r-m iR -20 a-3 p+ MU NC+W T oa r-m iR -20 a-3 p+ W T Fine -w oo l sh ee p Coa rse -w oo l sh ee p 0 20 40 60 80 100 FP KM E55E65 E75Birth 1 m on th 0 20 40 60 80 100 5 m on th TP M A B D 3’-…UGAAAUUCACGAGUAUUACGUCA…-5’ 5’-…GUUGGUUACAUAAUACACA…-3’ 3’ UTR WT (coarse-wool sheep) 5’-…GUUGGUUACGUAAUACACA…-3’ 3’ UTR MU (fine-wool sheep) 7,068,586 IRF2BP2oar-miR-20a-3p MU: fine-wool mRNA WT: coarse-wool mRNA 0 0.2 0.4 0.6 0 20 40 60 80 E90 d E12 0d Birth 6 m on ths 2 y ea rs oa r-m iR -20 a-3 p oa r-m iR -37 9-3 p oa r-m iR -49 6-3 p oa r-m iR -41 1a -3p F Wadi sheep (coarse-wool) Merino sheep (fine-wool) 0 5 10 15 tra ns ve rs e 100 C oarse-w ool Fine-w ool Fine-wool sheep Coarse-wool sheep × IRF2BP2 mRNA Fine-wool sheep Coarse-wool sheep IRF2BP2 ① ② coactivates VEGFA tra ns ve rs e PF SF PF SF 40× 40× FIG. 8. Histological analysis, mRNA, and miRNA expressions and genetic mechanisms for the gene IRF2BP2 and oar-miR-20a-3p involved in the fleece fiber variation in domestic sheep. (A) Expressions of IRF2BP2 in embryonic developmental stages and 2 years old of fine-wool (left), coarse- wool sheep (medium) and boxplot of FPKM distribution between fine-wool and coarse-wool sheep (right). (B) Expressions of miRNAs in coarse- wool sheep (Hu sheep) normalized by reads per million reads mapped to miRNAs (RPM). (C) Putative binding sites for oar-miR-20a-3p to the 30- UTR of IRF2BP2. (D) Dual-luciferase reporter assays, *P< 0.05 (t-test); N.S., not significant. (E) Proposed genetic mechanisms for IRF2BP2 and oar- miR-20a-3p involved in the fleece fiber variation in domestic sheep. ‹Ho et al. (2012) and ›Zhang et al. (2018). (F) Histological transverse section of skin in Chinese Merino (fine wool, at embryonic 85 days) and Wadi sheep (coarse wool, at embryonic 90 days); PF, primary wool follicle; SF, secondary wool follicle. Genetic Diversity, Introgression, and Agronomically Important Loci of Worldwide Sheep . doi:10.1093/molbev/msab353MBE 15 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data of horns). Introgressed variants in genes associated with sim- ilar functions such as the oxygen transportation and hemo- globin complex have been observed in this and early studies (Hu et al. 2019). Nevertheless, genes with different functions were also detected in these studies (supplementary table S48, Supplementary Material online), which could be due to dif- ferentiated natural and artificial selection forces in a diverse range of environments and varied production systems. Demographic History of South-and-Southeast Asian Sheep Reconstruction of the history of the Asian sheep populations by demographic modeling suggested that Central-and- Eastern and South-and-Southeast Asian sheep diverged 5.85–6.01 ka with subsequent gene flow from the Middle East 3.93–4.01 ka. This was possibly mediated by gene flow along the Southern Silk Road route and the Tibet–Nepal and Burma–India routes (Pe 1959; Shaha 1970; Lv et al. 2015). In the Holocene and Neolithic time, the major transformation of farming and pastoralism in Eurasia were accompanied by human migrations (Narasimhan et al. 2019) and cultural transitions, such as the Late Neolithic to Early Bronze Age material cultures in the Near East, the Kura-Araxes culture in the Southern Caucasus, the Yamnaya culture across the Eurasian Steppe, and the Indus Valley Civilization (Narasimhan et al. 2019; Skourtanioti et al. 2020). Our findings are consistent with recent paleogenomic evidence for the dynamic population history on the Eurasian Steppe and the genetic makeup of most present-day South-and-Southeast Asian populations during the Early and Late Bronze ages, which share ancestry of: 1) Anatolian and Iranian farmers across the broad region of Iranian plateau �6,500–6,000 BP (Narasimhan et al. 2019; Raghavan et al. 2019; Shinde et al. 2019); and 2) North Eurasian and Central Asian Steppe pas- toralists along the Inner Asian Mountain Corridor, the ancient “Silk Roads” across Asia (e.g., Kazakhstan, Turkmenistan, Afghanistan, and Pakistan) �4,500–4,000 BP (Majumder 2010; Frachetti et al. 2017; Lipson et al. 2017; Narasimhan et al. 2019). Thus, our results illustrate the impact of the Middle Eastern farming to both the sheep husbandry in Iran and to the pastoralism on the Eurasian Steppe spreading eastward into South-and-Southeast Asia (Lazaridis et al. 2016). The demographic reconstruction and population genetic differentiation supported close genetic relationship between South-and-Southeastern Asian and Central-and-Eastern Asian populations (fig. 3A–D). Our results supported a rela- tively recent divergence and a close genetic affinity between South-and-Southeastern Asian and Chinese sheep popula- tions (including Tibetan and Southwestern Chinese popula- tions). The observation could also be explained by the strong gene flow from Central-and-Eastern Asian populations to South-and-Southeastern Asian populations indicated by the modeling analyses (fig. 3D), which could be through several popular ancient land (the Southern Silk Road route and the Tibet–Nepal and Burma–India routes) (Pe 1959; Shaha 1970; Lv et al. 2015). The later Near Eastern infusion could be as- sociated with the introduction of fat-tailed sheep, which originated in the Near East (around 5,000 years BP; Ryder 1984), as supported by the Y-chromosomal genetic variation analyses (Deng et al. 2020). Alternatively, an early mtDNA study suggested that sheep could have been introduced to Indian subcontinent from Near East, probably by ancient sea trade route (Singh et al. 2013). Overall, our study comple- mented the origins and migrations of sheep in previously understudied regions and presented a comprehensive picture of expansions of sheep across the world (fig. 3E). We noted that the estimated time of the main demographic events were in the approximate periods of the recent paleogenomic studies (Lazaridis et al. 2016; Frachetti et al. 2017; McColl et al. 2018; Narasimhan et al. 2019; Shinde et al. 2019; Skourtanioti et al. 2020), but the time difference observed here could be due to a limited number of modern samples and the lack of ancient samples in Central, South-and-Southeast Asia. Also, there were controversial estimates of time for the population demographic histories in these paleogenomic investigations. Additional samples in the regions (e.g., Kazakhstan, Afghanistan, Pakistan, Vietnam, Thailand, and Malaysia) in- cluding particularly ancient DNA will be needed to elucidate the genetic origins and prehistoric population interactions of sheep in South-and-Southeast Asia. Our results of sheep his- tory in South-and-Southeast Asia can provide insights into the waves of human migration in the regions. For a more complete visualization of the major migrations of domestic sheep (fig. 3E), we combined the evidence of the modeling for the Asian populations (fig. 3D) with 1) the to- pology with separate positions for British and Nordic breeds (fig. 3A and supplementary fig. S11, Supplementary Material online) (Tapio et al. 2006; Kijas et al. 2012; Ciani et al. 2020); 2) the clear separation of North-and-West African sheep and East-and-South African sheep (fig. 3A and supplementary fig. S11, Supplementary Material online); and 3) tentatively, the model-based clustering for the American populations (fig. 3C) (Kijas et al. 2012), which is in agreement with earlier results based on mtDNA and SNP BeadChip (Spangler et al. 2017; Paim et al. 2021). Molecular Basis of Adaptive and Agronomic Traits We performed several alternative methods for genome-wide searching of selection signatures in order to identify novel trait-associated genes and variants. With our data set, PBS, FST, and CLR gave clear signals above a uniform background, which is in contrast to rather noisy fd and ROD scans. In general, the various approaches each detect different panels of genes with only little overlap, indicating for all methods a high frequency of false negatives. At the same time, all meth- ods are prone to delivering false positives, further complicat- ing the interpretation (Utsunomiya et al. 2015; Saravanan et al. 2020). On the other hand, our large WGS data set allows to identify candidate causative mutations, at least in coding sequences but also in plausible miRNA binding sites, and to test their correlation with the breed-dependent phenotypes. In addition, alignment of haplotype patterns (figs. 5D and 7D) for selected regions appeared to be informative. From the multitude of potentially informative selection signatures, we have highlighted HBB and neighboring genes Lv et al. . doi:10.1093/molbev/msab353 MBE 16 D ow nloaded from https://academ ic.oup.com /m be/article/39/2/m sab353/6459180 by Finnish Forest R esearch Institute / Library user on 14 February 2022 https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data https://academic.oup.com/mbe/article-lookup/doi/10.1093/molbev/msab353#supplementary-data (adaptation to hypoxia), EVC (development), RXFP2 (horn development), TMEM154 (disease resistance), GDF6 (growth), PDGFD (tail configuration), and MC1R (coat color), all of which are key genes for the most relevant phenotypes of domestic sheep. For fleece fiber variation, we found an asso- ciation of mutated allele chr25:7,068,586(C) in the 30-UTR of IRF2BP2 with the medium- or fine-wool phenotype. Although we observed homozygous genotypes of ancestral and derived alleles at the SNP mutation (chr25: T7,068,586C) in the hair and coarse-wool sheep populations, the ancestral allele chr25:7,068,586(T) showed much higher frequencies than those in medium-wool and fine-wool sheep (fig. 7C). Thus, we concluded the ancestral allele chr25:7,068,586 (T) is asso- ciated with the phenotypes of hair and coarse wool in wild and domestic sheep. Variation between hairy and woolly fleece has been previ- ously found to be associated with an insertion of an antisense EIF2S2 retrogene in the 30-UTR of IRF2BP2, using a mainly French panel of populations (Demars et al. 2017). However, this association could not be entirely reproduced in our worldwide panel of wild and domestic sheep, particularly in the wild species (with the phenotype of hair), hair sheep, and coarse-wool sheep populations (supplementary fig. S27, Supplementary Material online). We estimated LD between the mutation (chr25: T7,068,586C) and the insertion (asEIF2S2) by calculating the parameter of LD (r2) (Hill and Robertson1968). We found strong linkage between them in wild species (frequency of IRF2BP2wt ¼ 1, frequency of T¼ 0.96) and fine-wool (r2 ¼ 0.27) sheep populations, whereas lower (r2 < 0.1) or no linkage was observed in the hair, coarse-wool, and medium-wool sheep populations (sup- plementary table S49, Supplementary Material online). We noted that only seven fine-wool individuals were included in the LD estimation, which might have inadequate statistical power and significance. The results showed cosegregation between the ancestral allele chr25:7,068,586T and the absence of EIF2S2 insertion in the 30-UTR of IRF2BP2 in wild sheep species. Thus, the expression of IRF2BP2 was regulated by both the EIF2S2 insertion and the SNP (chr25: T7,068,586C) in the 30-UTR, which is associated with the fleece fiber phe- notypes in wild and domestic sheep. Our results provided complementary evidence for the genetic mechanisms under- lying the fleece fiber variation in sheep. Different from Demars et al. (2017), we did not find IRF2BP2asEIF2S2/IRF2BP2asEIF2S2 homozygotes. The difference in genotype pattern between this and the previous study could be due to that 1) multiple approaches and strict thresholds have been adopted in obtaining the accurate genotypes; 2) the EIF2S2 insertion was genotyped in much less samples of medium- and fine- wool sheep here, which, however, showed a high frequency in medium- and fine-wool sheep (Demars et al. 2017). Besides these, we detected a set of candidate genes asso- ciated with fleece fiber variation in sheep. Among these and previously identified candidate genes, 7 genes (e.g., VEGFA, HRAS, and AKT) were located in the pathways of VEGF sig- naling, which plays a central role in regulating cell prolifera- tion, migration, and survival; 17 genes (e.g., CACNA2D3, FGF7, and MAP4K4) were found in MAPK signaling pathway, and 16 genes (e.g., EFNA5, PIK3R3, and VTN) were involved in the PI3K-Akt signaling pathway, which were functionally related to cell proliferation and survival (supplementary fig. S28, Supplementary Material online). Functions of these genes and network of these relevant pathways indicated that IRF2BP2, VEGFA, and oar-miR-20a-3p are the potentially im- portant factors regulating the fleece fiber diameter in domes- tic sheep. We conclude that the historical wild-relative genetic intro- gression played an important role in shaping the genetic di- versity and agronomic traits and introduced adaptive variants in domestic sheep. The large data set presented here is a useful resource by providing a genomic framework for explo- ration of the genomic basis underlying phenotypic traits and genetic improvement of sheep. Materials and Methods Data Sets, Samples, and DNA Extraction The sequencing data sets consisted of WGS generated in this study and published previously, totaling 810 individuals from 7 wild relatives (72 individuals) and 158 domestic sheep pop- ulations (738 individuals) around the world. The novel sequences included genomes of one Asiatic mouflon and 165 individuals from 82 domestic populations (fig. 1A and supplementary table S1, Supplementary Material online), which are from regions that so far have been understudied in previous investigations, such as India, Indonesia, Malaysia, and Pakistan (supplementary table S1, Supplementary Material online). The publicly available data sets included WGS of 644 indi- viduals (572 domestic sheep, 32 Asiatic mouflon, 6 bighorn sheep, 6 thinhorn sheep, 9 urial, 8 argali, 8 snow sheep, and 3 European mouflon), and they were derived from five sources: the NextGen Consortium, the International Sheep Genomics Consortium (ISGC), and three of our previous studies (Deng et al. 2020; Li et al. 2020; Chen et al. 2021). Summary infor- mation of these newly sequenced and publicly available sam- ples, including population names, codes, geographic origins, sample sizes, and contributors, is detailed in supplementary table S1, Supplementary Material online. Coordinates of geo- graphic origins of the populations were provided by the con- tributors or were assigned as the centroid of their known core distribution areas. Blood samples, ear or skin tissues were collected; ear and skin samples were preserved in 95% ethanol at �80�C. Resequencing and Variant Calling DNA of the samples sequenced in this study was extracted using the DNeasy Blood & Tissue kit (QiaGen, Shanghai, China), including a