
South African Journal of Animal Science

On-line version ISSN 2221-4062
Print version ISSN 0375-1589

S. Afr. j. anim. sci. vol.48 n.2 Pretoria  2018 

Whole genome study of linkage disequilibrium in Sahiwal cattle



H. Mustafa (I, II, #); N. Ahmad (I); H. J. Heather (II); K. Eui-soo (II); W. A. Khan (III); A. Ajmal (I); K. Javed (I); T. N. Pasha (IV); A. Ali (I); J. J. Kim (V); T. S. Sonstegard (II)

(I) Department of Livestock Production, University of Veterinary and Animal Sciences, Lahore 54000, Pakistan
(II) United States Department of Agriculture, Maryland 27030, USA
(III) Department of Biotechnology, University of Sargodha, Pakistan
(IV) Department of Animal Nutrition, University of Veterinary and Animal Sciences, Lahore 54000, Pakistan
(V) Department of Biotechnology, Yeungnam University Gyeongsan, Gyeongbuk 712749, Republic of Korea




Linkage disequilibrium (LD) is an important tool for quantitative trait locus (QTL) mapping and genomic selection. In this study, we estimated the extent of LD in Sahiwal cattle (n = 14) using the bovine high-density single-nucleotide polymorphism (SNP) BeadChip. After data filtering, 500,968 SNPs covering 2518.1 Mb of the genome were used for LD estimation. The minor allele frequency (MAF) was 0.21 in a substantial proportion of SNPs, and the mean distance between adjacent markers was 4.77 ± 2.83 kb. The overall mean LD between adjacent markers was 0.18 for r² and 0.55 for |D'|. The r² values declined as the distance between markers increased, from 0.35 at 1 kb to 0.12 at 100 kb, and |D'| showed a distinct decay of LD. Chromosomes 1, 27, 28 and 29 showed LD only at some distances between markers. Except for these four chromosomes, the extent of LD was higher for markers separated by 20 kb. At distances below 3 kb, the highest LD value observed was 0.30. High LD between markers was observed at the high MAF threshold (0.15) and at short distances between markers. These results indicate that the bovine high-density SNP BeadChip will be informative for estimating breeding values in Sahiwal cattle.

Keywords: Breeding value, linkage disequilibrium, minor allele frequency, molecular marker




Sahiwal is an important dairy breed of Pakistan, belonging to Bos indicus. The breed is well known worldwide for its dairy traits and is present in twenty-nine countries (FAO, 2007). The Sahiwal population is under threat because of crossbreeding. There are about 2,753 thousand head of this breed in Pakistan (GOP, 2006). Over the last two decades, interest in the genetic evaluation of indigenous cattle breeds has increased. Several genetic evaluation schemes for the improvement of Sahiwal cattle are in progress. The main objective of these programmes is to enhance the production and reproduction performance of the breed. Sahiwal is an internationally important dairy breed and has been used in the development of various synthetic breeds (Khan et al., 2008).

Animal breeding values can be estimated from whole-genome data by using marker-assisted selection, also known as genomic selection (Bennewitz et al., 2009). Genomic selection exploits the linkage disequilibrium (LD) between markers and the genes responsible for trait expression (QTL): because the markers are in LD with the QTL, the effects of chromosome segments can be estimated across the whole population. LD is the non-random association of alleles at different loci in a population. LD maps are important tools for determining the genetic basis of economically important traits in cattle. Similarly, comparing LD maps makes it possible to assess the diversity between cattle breeds with different biological characteristics and to identify genome regions that were subject to different selection pressures (McKay et al., 2007).

The two measures most commonly used to calculate LD between biallelic markers are r² and |D'|, both of which range from 0 to 1 (Hill & Robertson, 1968; Bohmanova et al., 2010). A value of |D'| < 1 indicates that recombination has occurred between two loci, whereas |D'| = 1 indicates a lack of recombination between them. The r² parameter denotes the correlation between two loci and is used in association studies. Moreover, r² is inversely related to the sample size required to detect an association with the same power. LD is required to study the association between a marker and a QTL (Pritchard & Przeworski, 2001). Marques et al. (2008) studied chromosome 14 with 505 SNPs in Holstein cattle and found a moderate level of LD (0.2) between markers separated by 100 kb. McKay et al. (2007) found similar results for 2,670 markers in eight different cattle breeds. In a comprehensive study of 19 taurine and indicine breeds with a dataset of 32,826 SNPs, a higher proportion of low allele frequencies and a lower level of LD were observed in indicine breeds (Villa-Angulo et al., 2009). Silva et al. (2010) genotyped the Gyr breed with 54,000 SNPs and found an LD of 0.21 between adjacent markers. Several studies have shown that Bos indicus breeds have a higher proportion of low-frequency alleles and a lower level of LD than Bos taurus breeds (McKay et al., 2007; Sargolzaei et al., 2008; Kim & Brian, 2009; Espigolan et al., 2013). However, there has been no report of LD estimation in the Sahiwal breed, which is an internationally important dairy breed. LD reflects population features and chromosome-specific patterns in a population. Thus, the present study aimed to estimate LD in Sahiwal cattle using the bovine high-density SNP BeadChip.


Materials and Methods

In the present study, fourteen Sahiwal individuals from two different cattle farms were used. Sample details are described by Mustafa et al. (2017). Approval from an animal ethics committee was not required for the present study because an existing data set was used. Genotyping was performed with the bovine HD SNP (700 K) BeadChip (Illumina HD assay®). The bovine HD SNP BeadChip contains 777,962 SNPs spread across the genome, with a mean distance of 3.43 kb between markers. The Genome Studio® software (Illumina) was used to analyse the genotypes. A total of 1,467 markers were excluded from the data set because their position in the genome was unknown, and 15,186 markers were monomorphic. Only autosomal markers were included in the LD analysis, at MAF thresholds of 0.05, 0.10 and 0.15. Additionally, only markers with a call rate > 0.90 and heterozygote (< 0.30) were considered. LD between two SNPs was evaluated using r² and |D'|. The r² was measured as follows:

$$ r^2 = \frac{(\text{freq. } XY \times \text{freq. } xy - \text{freq. } Xy \times \text{freq. } xY)^2}{\text{freq. } X \times \text{freq. } x \times \text{freq. } Y \times \text{freq. } y} $$

where freq. X, freq. x, freq. Y and freq. y are the frequencies of alleles X, x, Y and y, respectively.

Freq. XY, freq. xy, freq. xY and freq. Xy are the frequencies of haplotypes XY, xy, xY and Xy, respectively. If the two loci are independent, the expected frequency of haplotype XY (freq. XY) is the product of freq. X and freq. Y. A freq. XY higher or lower than this expected value indicates that the two loci tend to segregate together and are in LD. LD (r² and |D'|) was measured for all marker pairs on each chromosome using the SnppldHD software (Espigolan et al., 2013). LD was estimated from maternal haplotypes, as is common in LD studies where half-sib families exist (Reich & Lander, 2001).
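The two LD measures can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SnppldHD implementation: the function name `ld_stats` is hypothetical, and the |D'| normalisation (dividing |D| by its maximum attainable value given the allele frequencies) is the standard definition, assumed here because the text does not spell it out.

```python
def ld_stats(f_XY, f_Xy, f_xY, f_xy):
    """Return (r2, abs_d_prime) for two biallelic loci.

    Inputs are the four haplotype frequencies, which must sum to 1.
    The |D'| normalisation below is the standard one (an assumption here).
    """
    total = f_XY + f_Xy + f_xY + f_xy
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    # Allele frequencies follow from the haplotype frequencies.
    f_X = f_XY + f_Xy
    f_x = 1.0 - f_X
    f_Y = f_XY + f_xY
    f_y = 1.0 - f_Y

    # Disequilibrium coefficient D.
    d = f_XY * f_xy - f_Xy * f_xY

    denom = f_X * f_x * f_Y * f_y
    if denom == 0.0:
        # At least one locus is monomorphic, so LD is undefined.
        return float("nan"), float("nan")

    r2 = d * d / denom

    # Largest |D| attainable for these allele frequencies.
    if d > 0:
        d_max = min(f_X * f_y, f_x * f_Y)
    else:
        d_max = min(f_X * f_Y, f_x * f_y)
    abs_d_prime = abs(d) / d_max if d_max > 0 else 0.0

    return r2, abs_d_prime


# Hypothetical frequencies: only haplotypes XY and xy occur (complete LD).
print(ld_stats(0.5, 0.0, 0.0, 0.5))
# Hypothetical frequencies: the two loci are independent (no LD).
print(ld_stats(0.25, 0.25, 0.25, 0.25))
```

Working from haplotype frequencies keeps the sketch independent of any phasing step; in practice the haplotypes would first have to be inferred from the genotype data.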


Results and Discussion

Descriptive statistics for the SNP markers and for LD between adjacent markers on each autosome are shown in Table 1. A total of 500,968 (64.4%) markers met the filtering criteria and were included in the final analysis. This subset of markers covered 2,518.1 Mb of the genome, with a mean distance between markers of 4.77 ± 2.89 kb. The density of markers was similar across all chromosomes, ranging from 5.1 to 5.4 kb.
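The marker filtering described in Materials and Methods can be sketched as follows. This is a minimal illustration, not the Genome Studio workflow used in the study: the function name `snp_qc`, the 0/1/2 genotype coding and the use of `None` for missing calls are assumptions, and the heterozygote criterion is left out because the text does not define how it was computed.

```python
def snp_qc(genotypes, maf_min=0.05, call_rate_min=0.90):
    """Apply call-rate and MAF filters to one SNP.

    genotypes: one entry per animal, coded as 0, 1 or 2 copies of an
    arbitrary reference allele, or None for a missing call (assumed coding).
    Returns a dict with the call rate, the MAF and the keep/drop decision.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes) if genotypes else 0.0
    if not called:
        return {"call_rate": call_rate, "maf": 0.0, "keep": False}

    # Frequency of the counted allele among called genotypes.
    p = sum(called) / (2 * len(called))
    maf = min(p, 1.0 - p)

    # A monomorphic marker has MAF = 0 and is dropped by the MAF filter.
    keep = call_rate > call_rate_min and maf > maf_min
    return {"call_rate": call_rate, "maf": maf, "keep": keep}


# Hypothetical SNP genotyped in 14 animals, two of them heterozygous.
example = [1, 1] + [0] * 12
print(snp_qc(example, maf_min=0.05))
print(snp_qc(example, maf_min=0.10))
```

With only 14 animals, a single heterozygote already gives a MAF of 1/28 (about 0.036), so the three MAF thresholds used in the study separate markers that differ by very few allele copies.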

Chromosome 1 was the longest (158.49 Mb) and chromosome 25 the shortest (42.91 Mb). The proportion of SNPs with MAF < 0.20 (Figure 1) was similar to that in previous studies of different indicine breeds (McKay et al., 2007; Silva et al., 2010; Espigolan et al., 2013). Khatkar et al. (2008) reported that the MAF threshold affects the extent and distribution of LD. The mean MAF in this study was 0.21, which is higher than that reported by Matukumalli et al. (2009) and the HapMap consortium (2009) in Nellore cattle (0.19 to 0.20). However, it was similar to the mean MAF reported by Espigolan et al. (2013) in Nellore cattle. In this study, chromosomes 2, 4, 7, 15, 17, 25 and 26 showed a higher proportion of SNPs with MAF < 0.10, while chromosomes 6, 8, 16, 22 and 23 showed a lower proportion of SNPs with MAF < 0.10. All possible SNP pairs on the same chromosome separated by up to 100 kb formed 9,162,314 combinations of SNP pairs for estimating LD across the 29 autosomes. The overall mean LD between marker pairs was 0.18 for r² and 0.55 for |D'|. Espigolan et al. (2013) genotyped Nellore cattle using 444,986 SNPs and found mean LD between adjacent markers of 0.17 for r² and 0.52 for |D'|. The results of the present study and those reported elsewhere support the view that |D'| overestimates LD, especially when MAF is low.

The mean LD between SNPs ranged from 0.002 to 0.19 for r² and from 0.13 to 0.58 for |D'| across all autosomes (Table 1). This is similar to values previously reported in Nellore cattle (Espigolan et al., 2013). In this study, low levels of r² were estimated for chromosomes 1, 27, 28 and 29. These values were lower than those reported in zebu cattle (McKay et al., 2007; Silva et al., 2010), but Espigolan et al. (2013) reported similar values for these chromosomes in Nellore cattle. Differences in LD pattern between genomic regions are due to variation in the autosomal recombination rate (Arias et al., 2009). The low values for chromosomes 1, 27, 28 and 29 in this study might therefore be attributable to sampling variation, since the number and density of markers, the mean MAF and the MAF proportions on these chromosomes did not differ from those of the other autosomes.

To investigate the decline of LD, syntenic SNP pairs were grouped into bins (intervals) according to the distance between markers. Figures 2 and 3 depict the decline of LD and the mean r² and |D'| values for each interval, per chromosome and for the whole genome. LD decreased with increasing distance between markers (Table 2), although |D'| showed a less marked decrease than r². At distances < 30 kb, moderate r² values were observed (0.19 to 0.33). The mean r² decreased from 0.19 to 0.10 as the distance between markers increased from 30 to 100 kb. At marker distances of 10 kb, r² was highly variable. The average spacing was 38.81 kb and 41.74 kb for marker pairs with r² higher than 0.30 and 0.15, respectively, and not all markers spaced 40 to 50 kb apart had r² higher than 0.30. At distances of less than 40 kb, the proportion of markers with r² > 0.15 ranged from 33 to 54% and the proportion with r² > 0.30 ranged from 22 to 41%, similar to values reported in Nellore cattle (Espigolan et al., 2013). However, these proportions were lower than that reported by Sargolzaei et al. (2008), who genotyped 821 Holstein sires with 5,564 SNPs and found that 68.34% of markers spaced 0 to 0.1 Mb apart exceeded the r² threshold of 0.3. Qanbari et al. (2010) reported that 29% of markers separated by less than 100 kb had r² > 0.25.
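The binning step behind an LD decay curve can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `mean_r2_by_distance_bin`, the 10 kb bin width and the example values are hypothetical, since the bin boundaries used for Table 2 are not given in the text.

```python
def mean_r2_by_distance_bin(pairs, bin_width_kb=10.0, max_dist_kb=100.0):
    """Average r2 within distance bins.

    pairs: iterable of (distance_kb, r2) for syntenic SNP pairs.
    Pairs at or beyond max_dist_kb are ignored.
    Returns {(bin_start_kb, bin_end_kb): mean_r2}, ordered by distance.
    """
    sums, counts = {}, {}
    for dist_kb, r2 in pairs:
        if dist_kb < 0 or dist_kb >= max_dist_kb:
            continue
        index = int(dist_kb // bin_width_kb)
        sums[index] = sums.get(index, 0.0) + r2
        counts[index] = counts.get(index, 0) + 1
    return {
        (i * bin_width_kb, (i + 1) * bin_width_kb): sums[i] / counts[i]
        for i in sorted(sums)
    }


# Hypothetical (distance in kb, r2) values, for illustration only.
example_pairs = [(1.0, 0.4), (5.0, 0.3), (25.0, 0.2), (95.0, 0.1), (150.0, 0.9)]
for interval, mean_r2 in mean_r2_by_distance_bin(example_pairs).items():
    print(interval, round(mean_r2, 3))
```

Narrower bins resolve the steep decline at short distances better, at the cost of fewer pairs, and therefore noisier means, in each bin.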

Espigolan et al. (2013) reported r² values from 0.11 to 0.05 at distances of 100 kb to 1,000 kb. The present study found the same values at these distances, whereas McKay et al. (2007) estimated mean r² between adjacent markers of 0.15 to 0.20 at a distance of 100 kb. In this study, a linear relationship was observed between r² and chromosome length: LD increased with chromosome length, owing to the lower recombination rate of longer chromosomes (Yan et al., 2009). By contrast, Bohmanova et al. (2010) did not find any association between LD level and chromosome size. Low allele frequencies in SNP pairs lead to underestimation of r². Estimation of r² is considered less biased when the allele frequencies in the data set are high (Reich & Lander, 2001). Therefore, in this study, we examined the effect of MAF on the estimation of r² and |D'| (Figures 4 and 5). At the high MAF threshold (0.15) and at short distances between markers, LD was higher, as similarly reported by Yan et al. (2009) in maize. |D'| was unchanged across MAF thresholds for adjacent markers (< 10 kb), whereas at greater distances |D'| decreased as the MAF threshold increased, as reported in a study of different cattle breeds (Bohmanova et al., 2010). For SNP pairs with low allele frequencies, D is divided by a small number, because the denominator in the formula is a product of allele frequencies, and this results in a large |D'| value (Reich & Lander, 2001). The results of this study revealed significant variation in the magnitude and pattern of r² across the Sahiwal genome. This variation might be caused by recombination rate, genetic drift, heterozygosity and selection (Reich & Lander, 2001).
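The inflation of |D'| at low allele frequencies can be illustrated with a hypothetical pair of SNPs. The haplotype frequencies below are invented for illustration and are not taken from the Sahiwal data. The normalisation of D by its maximum attainable value is the standard definition of |D'| and is assumed here, since the text does not state it.

```latex
\begin{align*}
&\text{freq. } XY = 0.05,\quad \text{freq. } Xy = 0,\quad \text{freq. } xY = 0.45,\quad \text{freq. } xy = 0.50 \\
&\text{freq. } X = 0.05,\quad \text{freq. } x = 0.95,\quad \text{freq. } Y = 0.50,\quad \text{freq. } y = 0.50 \\
D &= \text{freq. } XY \times \text{freq. } xy - \text{freq. } Xy \times \text{freq. } xY
   = 0.05 \times 0.50 - 0 \times 0.45 = 0.025 \\
D_{\max} &= \min(\text{freq. } X \times \text{freq. } y,\ \text{freq. } x \times \text{freq. } Y)
   = \min(0.025,\ 0.475) = 0.025 \qquad (D > 0) \\
|D'| &= \frac{|D|}{D_{\max}} = 1 \\
r^2 &= \frac{D^2}{\text{freq. } X \times \text{freq. } x \times \text{freq. } Y \times \text{freq. } y}
   = \frac{0.000625}{0.011875} \approx 0.053
\end{align*}
```

Because allele X is rare, the normalising term is as small as D itself, so |D'| reaches its maximum of 1 even though r² for the same pair is only about 0.05.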

The r² level between adjacent markers (30 to 40 kb) detected in this study was similar to that reported in Bos indicus but lower than that in Bos taurus cattle (McKay et al., 2007). The difference between indicine and taurine breeds for markers separated by 80 to 100 kb probably reflects the recent history of the populations.

However, it is difficult to compare the LD levels found in different studies because of differences in sample size, LD measures, marker type, marker density and recent population history (Pritchard & Przeworski, 2001). Nevertheless, the differences between Bos taurus and Bos indicus that arose during domestication and selection, and the consequent differences in effective population size, seem to explain the discrepancy in LD at short distances between markers (Tenesa et al., 2007). Another reason is that indicine populations show a higher proportion of low-frequency alleles on the bovine high-density SNP BeadChip than taurine populations, which in turn affects LD estimates (Weiss & Clark, 2002; McKay et al., 2007).



The LD level estimated for markers separated by less than 30 kb indicates that the high-density bovine SNP BeadChip will probably be an appropriate tool for the prediction of genomic breeding values in Sahiwal cattle. Additional studies of LD in larger samples of the Sahiwal population are required to confirm the LD estimates found in this study.


The authors would like to acknowledge the cooperation and support of the animal breeders and research institutions that provided blood samples. The ARS/USDA platform is acknowledged for laboratory facilities. Financial support from the Higher Education Commission (HEC) of Pakistan is also appreciated.

Authors' Contributions

HM and TSS designed the experiment. HM, HJH and KE carried out the analysis. HM and AA drafted the manuscript. JJK assisted with the statistical analysis. HM, AA and TSS structured the scientific content. All authors provided editorial suggestions and approved the final manuscript draft.

Conflict of Interest Declaration

The authors of this manuscript declare that they have no conflict of interests.



Arias, J.A., Keehan, M., Fisher, P., Coppieters, W. & Spelman, R., 2009. A high density linkage map of the bovine genome. BMC Genet. 10, 18.

Bennewitz, J., Solberg, T. & Meuwissen, T., 2009. Genomic breeding value estimation using nonparametric additive regression models. Genet. Sel. Evol. 41, 20.

Bohmanova, J., Sargolzaei, M. & Schenkel, F., 2010. Characteristics of linkage disequilibrium in North American Holsteins. BMC Genom. 11, 421.

Espigolan, R., Fernando, B., Arione, A.B., Fabio, R.P.S., Daniel, G.M.G., Rafael, L.T., Diércles, F.C., Henrique, N.O., Humberto, T. & Mehdi, S., 2013. Study of whole genome linkage disequilibrium in Nellore cattle. BMC Genom. 14, 305.

FAO, 2007. The state of the world's animal genetic resources for food and agriculture. Food and Agriculture Organization; Rome, Italy.

Farnir, F., Coppieters, W. & Arranz, J.J., 2000. Extensive genome-wide linkage disequilibrium in cattle. Genome Res. 10, 220-227.

Gibbs, R.A., Taylor, J.F., Van Tassell, C.P., Barendse, W., Eversole, K.A., Gill, C.A., Green, R.D., Hamernik, D.L., Kappes, S.M. & Lien, S., 2009. Genome-wide survey of SNP variation uncovers the genetic structure of cattle breeds. Science 324, 528-532.

Government of Pakistan, 2006. Livestock Census 2006. Agricultural Census Organization, Statistics Division; Government of Pakistan.

Hill, W.G. & Robertson, A., 1968. Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38, 226-231.

Khan, M.S., Zia, U.R., Muqarrab, A.K. & Sohail, A., 2008. Genetic resources and diversity in Pakistani cattle. Pak. Vet. J. 28, 95-102.

Khatkar, M.S., Nicholas, F.W., Collins, A.R., Zenger, K.R., Cavanagh, J.A., Barris, W., Schnabel, R.D., Taylor, J.F. & Raadsma, H.W., 2008. Extent of genome-wide linkage disequilibrium in Australian Holstein-Friesian cattle based on a high density SNP panel. BMC Genom. 9, 187.

Kim, E.S. & Brian, W.K., 2009. Linkage disequilibrium in the North American Holstein population. Anim. Genet. 40, 279-288.

Marques, E., Schnabel, R., Stothard, P., Kolbehdari, D., Wang, Z., Taylor, J.F. & Moore, S.S., 2008. High density linkage disequilibrium maps of chromosome 14 in Holstein and Angus cattle. BMC Genet. 9, 45.

Matukumalli, L.K., Lawley, C.T., Schnabel, R.D., Taylor, J.F., Allan, M.F., Heaton, M.P., O'Connell, J., Moore, S.S., Smith, T.P., Tad, S.S. & Van Tassell, C.P., 2009. Development and characterization of a high density SNP genotyping assay for cattle. PLoS One 4, e5350.

McKay, S.D., Schnabel, R.D., Murdoch, B.M., Matukumalli, L.K., Aerts, J., Wouter C. W., Crews, D., Dias Neto, E. & Gill, C.A., 2007. Whole genome linkage disequilibrium maps in cattle. BMC Genet. 8, 74.

Mustafa, H., Kim, E., Huson, J.H., Adeela, A., David, R., Talat, N.P., Afzal, A., Khalid, J. & Tad, S.S., 2017. Genome-wide SNPs analysis of indigenous Zebu breeds in Pakistan. Bio. Anim. Hus. 33, 3-25.

Pritchard, J.K. & Przeworski, M., 2001. Linkage disequilibrium in humans: models and data. Am. J. Hum. Genet. 69, 1-14.

Qanbari, S., Pimentel, E.C.G., Tetens, J., Thaller, G., Lichtner, P., Sharifi, A.R. & Simianer, H., 2010. The pattern of linkage disequilibrium in German Holstein cattle. Anim. Genet. 41, 346-356.

Reich, D.E. & Lander, E.S., 2001. On the allelic spectrum of human disease. Trends Genet. 17, 502-510.

Sargolzaei, M., Schenkel, F.S., Jansen, G.B. & Schaeffer, L.R., 2008. Extent of linkage disequilibrium in Holstein cattle in North America. J. Dairy Sci. 91, 2106-2117.

Silva, C.R., Neves, H.H.R., Queiroz, S.A., Sena, J.A.D. & Pimentel, E.C.G., 2010. Extent of linkage disequilibrium in Brazilian Gyr dairy cattle based on genotypes of AI sires for dense SNP markers. In: Proceedings of the 9th World Congress on Genetics Applied to Livestock Production. Leipzig, Germany: WGAPL.

Tenesa, A., Navarro, P., Hayes, B.J., Duffy, D.L., Clarke, G.M., Goddard, M.E. & Visscher, P.M., 2007. Recent human effective population size estimated from linkage disequilibrium. Genome Res. 17, 520-526.

Villa-Angulo, R., Matukumalli, L.K., Gill, C.A., Choi, J., Van Tassell, C.P., John, J. & Grefenstette, J., 2009. High-resolution haplotype block structure in the cattle genome. BMC Genet. 10, 19.

Weiss, K.M. & Clark, A.G., 2002. Linkage disequilibrium and the mapping of complex human traits. Trends Genet. 18, 19-24.

Yan, J., Shah, T., Warburton, M.L., Buckler, E.S., McMullen, M.D. & Crouch, J., 2009. Genetic characterization and linkage disequilibrium estimation of a global maize collection using SNP markers. PLoS One 4, e845.



Received 26 September 2017
Accepted 27 December 2017
First published online 30 December 2017



# Corresponding author:

Creative Commons License All the contents of this journal, except where otherwise noted, is licensed under a Creative Commons Attribution License