SciELO - Scientific Electronic Library Online



South African Journal of Animal Science

Online version ISSN 2221-4062
Print version ISSN 0375-1589

S. Afr. j. anim. sci. vol.38 no.1 Pretoria Jan. 2008


The use of a cluster analysis in across herd genetic evaluation for beef cattle



F.W.C. Neser; G.J. Erasmus; M.M. Scholtz

Department of Animal, Wildlife and Grassland Sciences, UFS, PO Box 339, Bloemfontein 9300, South Africa





Abstract

To investigate the possibility of a genotype x environment interaction in Bonsmara cattle, a cluster analysis was performed on weaning weight records of 72 811 Bonsmara calves, the progeny of 1 434 sires and 24 186 dams in 35 herds. The following environmental factors were used to classify herds into clusters: solutions for herd effects corrected for year-season, sex, age of dam and age at weaning (indicative of the management level in a herd), herd size, and average temperature and rainfall. Two different genetic analyses were performed. Breeding values obtained in a univariate analysis were used as a basis for comparison with breeding values obtained from a multivariate analysis in which weaning weight in each cluster was considered as a separate trait. Direct additive, maternal additive, permanent maternal environment and Herd-Year-Season x Sire (HYSxS) interaction were included as random effects in both analyses. The direct genetic correlations between the clusters varied between 0.51 and 1.00. The low correlation estimates between some of the clusters indicate a possible genotype x environment interaction. Substantial re-ranking of sires between clusters occurred. However, further research is needed to identify and prioritize variables that can describe the genetics, management and climate of each herd more accurately.

Keywords: Bonsmara cattle, Genotype by environment interaction, weaning weight




Introduction

Modern reproduction technologies such as artificial insemination (AI) and embryo transfer ease the movement of genetic material across countries. Sub-populations of traditional South African breeds, such as the Bonsmara and Dorper, exist in countries such as Brazil, Argentina and Australia. By combining these populations in a total breed analysis, selection progress can be enhanced and genetic evaluation costs reduced.

Genotype x environment interaction can be defined as different sets of genes that determine varying levels of expression in different environments (Bertrand et al., 1987). Such genotype x environment interaction can reduce the accuracy of the genetic analysis (Neser et al., 1996). If interactions exist between genotype and environment, the accuracy and reliability of such evaluations are overestimated (Ibi et al., 2005).

In dairy cattle, cluster analyses over different ecological regions and countries are used to overcome this problem. The herd-cluster model is appealing because an animal's genetic merit is predicted for each unique environment or management system, regardless of country borders (Weigel & Rekaya, 2000). A number of studies have investigated genotype x environment interaction by using cluster analyses (Lin & Lin, 1994; Weigel & Rekaya, 2000; Fikse et al., 2001; Zwald et al., 2003a; b). Most of these were done on dairy cattle and investigated borderless international genetic evaluations.

In South Africa large differences in animal performance exist between different biomes or ecological regions. These differences can also extend to management levels between different herds within a specific region. It is therefore possible, by using only South African data, to simulate a total breed analysis over different environments in order to investigate the feasibility of borderless genetic evaluations.

The aim of the study was to investigate the presence and consequences of genotype x environment interaction that could occur in borderless genetic evaluations. This was done by clustering the National Bonsmara dataset into different smaller sets, according to meteorological and performance data. Weaning weight in each cluster was considered as a separate trait in a multivariate genetic analysis.


Materials and Methods

Weaning weight records of 274 882 Bonsmara calves were available for the study. These records were obtained from the National Beef Cattle Improvement Scheme. After editing, weaning weight records of 72 811 calves, the progeny of 1 434 sires and 24 186 dams in 35 herds, were used in the analysis. The period covered was between five and 15 years per herd. The following edits were performed on the data: all incomplete records (records without both parents), sires with fewer than five progeny and herds with fewer than five years of data were deleted from the dataset. Only herds that were genetically linked to at least five other herds (via sires used) were used in the analysis. This was done to improve connectedness in the dataset. Connectedness was further enhanced by utilizing an extended pedigree file.
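The edits listed above can be sketched as a sequence of filters. The following is a minimal illustration only: the record keys (`calf`, `sire`, `dam`, `herd`, `year`) and the function name are hypothetical, and the sketch assumes the edits were applied once, in the order listed, which the text does not state.

```python
from collections import defaultdict

def edit_records(records, min_progeny=5, min_years=5, min_linked_herds=5):
    """Apply the data edits once, in the order listed in the text.

    Each record is a dict with hypothetical keys:
    'calf', 'sire', 'dam', 'herd', 'year'.
    """
    # 1. Drop incomplete records (either parent unknown).
    recs = [r for r in records if r["sire"] is not None and r["dam"] is not None]

    # 2. Drop sires with fewer than `min_progeny` progeny.
    progeny = defaultdict(int)
    for r in recs:
        progeny[r["sire"]] += 1
    recs = [r for r in recs if progeny[r["sire"]] >= min_progeny]

    # 3. Drop herds with fewer than `min_years` distinct years of data.
    years = defaultdict(set)
    for r in recs:
        years[r["herd"]].add(r["year"])
    recs = [r for r in recs if len(years[r["herd"]]) >= min_years]

    # 4. Keep only herds linked, via shared sires, to enough other herds.
    herds_of_sire = defaultdict(set)
    for r in recs:
        herds_of_sire[r["sire"]].add(r["herd"])
    linked = defaultdict(set)
    for herds in herds_of_sire.values():
        for h in herds:
            linked[h] |= herds - {h}
    return [r for r in recs if len(linked[r["herd"]]) >= min_linked_herds]
```

Because removing a herd in step 4 can reduce the progeny count of a sire, a stricter implementation might repeat the steps until no further records are removed; whether that was done here is not known.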

A stepwise regression analysis was done to assess the importance of different environmental factors on weaning weight. The following environmental factors were tested using SAS (1992) and were found to be significant: solutions for herd effects corrected for year-season, sex, age of dam and age at weaning (to give an indication of the management level within a herd), herd size, rainfall and temperature. Herds were allocated to five rainfall and six temperature zones using the mean annual rainfall chart of Dent et al. (1989) as presented in Tainton (1999) and a mean annual temperature chart of South Africa (Environmentek, CSIR, South Africa). These factors were used to classify herds into clusters.

Cluster analysis is an exploratory technique designed to classify data into subgroups that share similar characteristics (Lin & Lin, 1994). The FASTCLUS clustering procedure in SAS (1992) was used for herd clustering. This procedure performs a disjoint cluster analysis based on Euclidean distances. This iterative method guarantees that the distances between all observations in the same cluster will be less than the distances between observations in different clusters (Zwald et al., 2003a).
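The idea of a disjoint, Euclidean-distance clustering can be illustrated with a plain k-means loop. This is a sketch of the general technique and not the FASTCLUS procedure itself, whose seed selection and options differ. The standardization step is an assumption (the text does not say whether the herd descriptors were scaled), and the function names and the choice of seeds are hypothetical.

```python
import math

def standardize(rows):
    """Scale each column to mean 0 and SD 1 (an assumption, not stated in the text)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has SD 0; use 1.0 instead to avoid dividing by zero.
    sds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def kmeans(rows, seeds, max_iter=100):
    """Assign each row to its nearest centre (Euclidean), then update the centres."""
    centres = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centres)), key=lambda k: math.dist(row, centres[k]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centres no longer move
        labels = new_labels
        for k in range(len(centres)):
            members = [row for row, lab in zip(rows, labels) if lab == k]
            if members:  # keep the previous centre if a cluster is empty
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres
```

In this setting each row would hold one herd's descriptors (herd solution, herd size, rainfall zone, temperature zone), and the number of seeds would equal the number of clusters requested.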

Two genetic analyses were carried out. In the first analysis a univariate animal model was fitted using all weaning weight records without clustering. In the second analysis a multivariate animal model was fitted, treating weaning weight in each cluster as a different trait.

Estimates of the (co)variance components were obtained using the ASREML program developed by Gilmour et al. (2002). Direct additive, maternal additive, permanent maternal environment and Herd-Year-Season x Sire (HYSxS) interaction were included as random factors in both the univariate and multivariate analyses. The covariance between direct and maternal effects was excluded from the model as it caused convergence problems in the multivariate analysis. Herd-year-season and sex were included as fixed effects. Age of dam and age at weaning were fitted as linear regressors.

The following model was used for all the different analyses:

$$ y = X\beta + Z_1 a + Z_2 m + Z_3 c_1 + Z_4 c_2 + e $$

where y is an n x 1 vector of records and X is an n x p incidence matrix that relates the data to the unknown vector of location parameters β. The vector β contained HYS and sex as fixed effects and age of dam and age at weaning as linear regressors. The incidence matrices Z1 and Z2 relate the unknown random vectors of direct breeding value (a) and maternal breeding value (m), respectively, to y. The incidence matrices Z3 and Z4 relate the unknown additional random vectors, permanent maternal environment (c1) and HYSxS interaction (c2), to y. The unknown vector e contains the random residuals due to environmental effects peculiar to individual records. The assumed (co)variance structure of the model is as follows:

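$$
\mathrm{Var}\begin{bmatrix} a \\ m \\ c_1 \\ c_2 \\ e \end{bmatrix} =
\begin{bmatrix}
A\sigma_a^2 & 0 & 0 & 0 & 0 \\
0 & A\sigma_m^2 & 0 & 0 & 0 \\
0 & 0 & I_{c1}\sigma_{c1}^2 & 0 & 0 \\
0 & 0 & 0 & I_{c2}\sigma_{c2}^2 & 0 \\
0 & 0 & 0 & 0 & I_n\sigma_e^2
\end{bmatrix}
$$

where A is the numerator relationship matrix, Ic1 is an identity matrix with order number of cows, Ic2 is an identity matrix with order number of HYSxS interaction levels, In is an identity matrix with order number of records, and $\sigma_a^2$, $\sigma_m^2$, $\sigma_{c1}^2$, $\sigma_{c2}^2$ and $\sigma_e^2$ are the corresponding variance components.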

Weaning weight in each cluster was considered as a separate trait. The results of the single trait analysis were used as starting values for the multivariate analysis. The multivariate analysis was used to obtain correlation estimates between the clusters. Data from progeny and other relatives in different clusters can be used to predict the performance of each sire in each production environment (Zwald et al., 2003a).
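Treating weaning weight in each of t clusters as a separate trait turns the genetic variances into t x t matrices. The following is a sketch of that structure under assumptions not stated in the text: the symbols $G_a$, $G_m$ and $\sigma_{e,i}^2$ are introduced here for illustration, the breeding values are assumed to be stacked cluster by cluster, and residual covariances between clusters are taken as zero because a calf is weaned, and therefore recorded, in one cluster only.

```latex
\mathrm{Var}(a) = G_a \otimes A, \qquad
\mathrm{Var}(m) = G_m \otimes A, \qquad
G_a = \begin{bmatrix}
\sigma_{a,11} & \cdots & \sigma_{a,1t} \\
\vdots & \ddots & \vdots \\
\sigma_{a,t1} & \cdots & \sigma_{a,tt}
\end{bmatrix}, \qquad
\mathrm{Var}(e_i) = I_{n_i}\,\sigma_{e,i}^2 \quad (i = 1, \dots, t)
```

Here $\sigma_{a,ii}$ is the direct additive variance in cluster i, $\sigma_{a,ij}$ is the direct additive covariance between clusters i and j, and $n_i$ is the number of records in cluster i. The off-diagonal elements of $G_a$ are what allow records of relatives in one cluster to inform breeding values in another.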


Results and Discussion

Four clusters were formed. Solutions for herd effects were by far the most important factor, accounting for about 82% of the variation in weaning weight. Ten percent of the variation was accounted for by rainfall while the rest of the factors made up less than 5%. The large contribution of the herd effects is unfortunate as it totally dominates the whole clustering process. Research to find better descriptors of the climate and management levels in the different herds should form part of further studies.

In Table 1 the basic statistics for each cluster are presented. One of the clusters (Cluster 2) had fewer than 3 500 animals. The distribution of records between Clusters 1, 3 and 4 was quite even. Unfortunately, only 22 sires were used in all four clusters, while 95 were used in at least three clusters and 207 in only two clusters.
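Counts of this kind can be obtained by tallying, for each sire, the number of distinct clusters in which it has progeny. The sketch below is illustrative; the function name and the (sire, cluster) input layout are assumptions.

```python
from collections import Counter, defaultdict

def sires_by_cluster_count(records):
    """Count sires by the number of distinct clusters in which they have progeny.

    `records` is an iterable of (sire_id, cluster_id) pairs, one pair per calf.
    Returns a Counter mapping number of clusters -> number of sires.
    """
    clusters_of_sire = defaultdict(set)
    for sire, cluster in records:
        clusters_of_sire[sire].add(cluster)
    return Counter(len(clusters) for clusters in clusters_of_sire.values())
```

Sires counted under 2 or more are the ones that provide genetic ties between clusters.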

Cluster 2 had the fewest sires in common with the other clusters (Table 2). This was to be expected as it comprises only 3 188 weaning weight records. The restricted use of AI is, however, a common problem in all beef cattle herds in South Africa and this could, unfortunately, adversely affect the genetic correlation estimates between clusters. This is supported by Weigel & Rekaya (2000), who stated that inadequate genetic ties might lead to erroneous covariance estimates.



The direct heritability estimates in Clusters 1 and 2 were similar, while the estimates for Clusters 3 and 4 were closer to those obtained in the univariate analysis. The direct heritability estimates obtained in Clusters 1, 2 and 3 are similar to those obtained by Neser et al. (1996) and Nephawe et al. (1999) in the same breed.

The estimates for maternal heritability were quite similar in both the clusters and the univariate analysis. The estimates are, however, lower than those obtained by Neser et al. (1996) in the same breed. In general all heritability estimates were within the range (0.068 - 0.66) obtained by Meyer (1992) and Koots et al. (1994).

The estimates for permanent maternal environment were in general higher than the maternal heritability estimates. The estimates of HYSxS as a proportion of the total variance correspond to results obtained by Neser et al. (1996) and Nephawe et al. (1999) in the same breed.
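The ratios discussed in this section can be written out explicitly. The following is a sketch assuming that the phenotypic variance is the sum of the five variance components of the model, which follows from excluding the direct-maternal covariance; the symbols ($\sigma_p^2$, $h_a^2$, $h_m^2$, $c_1^2$, $c_2^2$) are introduced here for illustration.

```latex
\sigma_p^2 = \sigma_a^2 + \sigma_m^2 + \sigma_{c1}^2 + \sigma_{c2}^2 + \sigma_e^2, \qquad
h_a^2 = \frac{\sigma_a^2}{\sigma_p^2}, \quad
h_m^2 = \frac{\sigma_m^2}{\sigma_p^2}, \quad
c_1^2 = \frac{\sigma_{c1}^2}{\sigma_p^2}, \quad
c_2^2 = \frac{\sigma_{c2}^2}{\sigma_p^2}
```

Here $h_a^2$ and $h_m^2$ are the direct and maternal heritabilities, $c_1^2$ is the permanent maternal environment ratio and $c_2^2$ is the HYSxS interaction as a proportion of the total variance.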


The direct genetic correlation estimates between the clusters varied between 0.51 and 1.00 (Table 4). In a preliminary study on a subset of the Bonsmara data, in which AI was used extensively, Neser (2002) obtained genetic correlations that varied between -0.04 and 0.91 between six different clusters. This range is wider than that of the present study, and both of its limits are lower. Robertson (1959) indicated that the genetic correlation gives a measure of the practical rather than statistical significance of the traits and suggested that an estimated genetic correlation appreciably lower than one (<0.80) would indicate changes in the ranking of genotypes in the two environments. Although the genetic correlation between some of the clusters was below 0.80, indicating a possible genotype x environment interaction, the high standard error of some of the estimates renders this result doubtful. The low correlations between some of the clusters indicate that a re-ranking of animals could take place. This would mean that different animals would be selected for different clusters (management levels). Similar results were obtained by Bradfield et al. (1997) and Nephawe et al. (1999) when investigating the possibility of a genotype x environment interaction between different regions.
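A genetic correlation between two clusters is the additive covariance scaled by the two additive standard deviations, and the 0.80 guideline attributed to Robertson (1959) can then be applied pair by pair. The sketch below shows only the arithmetic: the function names are hypothetical and no values from this study are used.

```python
import math

def genetic_correlation(cov_12, var_1, var_2):
    """r_g = cov(a1, a2) / sqrt(var(a1) * var(a2))."""
    return cov_12 / math.sqrt(var_1 * var_2)

def flag_low_correlations(corr, threshold=0.80):
    """Return the cluster pairs whose estimate is below `threshold`, sorted.

    `corr` maps (cluster_i, cluster_j) -> estimated genetic correlation.
    """
    return sorted(pair for pair, r in corr.items() if r < threshold)

# Illustrative values only, not estimates from the study.
example = {(1, 2): 0.75, (1, 3): 0.95, (2, 3): 0.80}
low_pairs = flag_low_correlations(example)  # pairs suggesting possible re-ranking
```

A point estimate below the threshold is only suggestive; as noted above, its standard error has to be considered before an interaction is inferred.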

A further analysis was conducted to assess the impact of genotype x environment interaction on the ranking of sires using estimated breeding values of sires from both the univariate and multivariate analyses. All sires with more than 100 progeny were sorted according to their estimated breeding values in the univariate analysis. The top 20 sires out of a possible 201 are presented in Table 5.

The bull that performed the best in all of the clusters ranked third in the univariate analysis (Table 5), while the best bull in the univariate analysis ranked 2nd, 7th, 21st and 2nd in the different clusters. There was, however, a substantial re-ranking of sires in the different clusters. Bull no 1143 for instance ranked 10th in the univariate analysis, but 22nd, 48th, 63rd, and 14th in the different clusters while bull no 153 556 ranked 29th in the univariate analysis, but 73rd, 74th, 10th and 99th in the different clusters. This re-ranking occurred despite the high genetic correlations that exist between the clusters.

Bulls in the top third are more likely to be selected for breeding purposes. It was therefore decided to estimate the product moment (Pearson) correlations as well as the rank correlations for animals in the top third, ranked according to their estimated breeding values in the univariate analysis (Table 6).
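A comparison of this kind can be sketched as follows: select the top third of sires on their univariate breeding values, then correlate those values with the breeding values from one cluster. The product moment correlation is the Pearson correlation; the Spearman rank correlation is included as an assumption about how re-ranking might be quantified. All function names and the dictionary inputs are hypothetical.

```python
import math

def pearson(x, y):
    """Product moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank 1 is the largest value; tied values receive their average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Rank correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def top_third(ebv_univariate, ebv_cluster):
    """Paired EBV lists for sires in the top third on the univariate EBV."""
    order = sorted(ebv_univariate, key=ebv_univariate.get, reverse=True)
    keep = order[: max(1, len(order) // 3)]
    return [ebv_univariate[s] for s in keep], [ebv_cluster[s] for s in keep]
```

Restricting the comparison to the top third narrows the range of breeding values, so correlations computed this way would generally be expected to be lower than those computed over all sires.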

If Cluster 3 is ignored, the correlations indicate no genotype x environment interaction (product moment correlations above 0.80) and only a small to moderate re-ranking between the clusters and the univariate analysis. In the case of these clusters the product moment correlations were between 0.67 and 0.86. Between clusters the correlation was as high as 0.94. In contrast, re-ranking in Cluster 3 was the most severe. For this cluster all the product moment correlations were below 0.40. The reasons for the peculiar values obtained whenever Cluster 3 is involved need to be investigated in future studies.



Conclusions

The results of this study give an indication of some of the problems that could occur in an across-country genetic evaluation when genotype x environment interaction is present. The most important problem in the beef industry in South Africa is the limited use of AI and the resulting lack of ties between clusters. The precision of the estimated genetic parameters depended more heavily on the level of genetic ties between clusters than on the number of observations per cluster (Weigel & Rekaya, 2000). Ties between beef herds in South Africa are much weaker than in dairy herds. Furthermore, more complex models are used in the beef industry compared to the dairy industry, which aggravates this problem. The only way to overcome this problem is to actively promote the use of AI in the beef industry.

With the herd-cluster model, herds are grouped based on likeness, rather than location (Weigel & Rekaya, 2000). Animals can therefore obtain breeding values for different management levels. This would simplify selection substantially as the best-adapted animal for a specific environment or management level could be selected. In theory an animal could thus get four breeding values, one for each cluster. The different breeding values would then give an indication of environmental sensitivity. Breeding values of animals that are more sensitive to the environment should vary more between the different clusters. This could facilitate the selection of better adapted animals over a wider variety of management levels as well as environments.
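One simple way to express the idea that the breeding values of more sensitive animals vary more between clusters is the spread of each animal's cluster-specific breeding values. The sketch below uses the population standard deviation as that measure; this choice, the function name and the input layout are assumptions and are not part of the study.

```python
import statistics

def environmental_sensitivity(ebvs_by_sire):
    """Spread of each sire's breeding values across clusters.

    `ebvs_by_sire` maps sire_id -> list of EBVs, one per cluster.
    A larger value means the EBV changes more from cluster to cluster.
    """
    return {sire: statistics.pstdev(ebvs) for sire, ebvs in ebvs_by_sire.items()}
```

A sire with identical breeding values in all clusters scores zero under this measure, whereas a sire whose breeding value differs widely between clusters scores high.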

The increasing use of large-scale genetic evaluations also raises concern about the possible occurrence of large differences between the environment of the seed stock producer and that of the commercial producer. With the information currently available, the best option for commercial producers seems to be to buy bulls from seed stock producers that have similar environments and production systems.

In this study, the average rainfall and temperature levels of each region were, inter alia, used to classify the herds into the different clusters. In future studies climatic factors specific to each herd should be used. Further research is needed to identify and prioritize variables that can describe the genetics and management levels of each herd more accurately before an across-country analysis can be performed.



Acknowledgements

The authors would like to thank the Bonsmara Breeder's Society as well as the ARC - Livestock Business Division for providing the data for this study.



References

Bertrand, J.K., Hough, J.D. & Benyshek, L.L., 1987. Sire x environment interactions and genetic correlations of sire progeny performance across regions in dam-adjusted field data. J. Anim. Sci. 64, 77-82.

Bradfield, M., Graser, H-U. & Johnston, D.J., 1997. Investigation of genotype x production environment interaction for weaning weight in the Santa Gertrudis Breed in Australia. Aust. J. Agric. Res. 48, 1-5.

Dent, M.C., Lynch, S.D. & Schulze, R.E., 1989. Mapping mean annual and other rainfall statistics over Southern Africa, Report No. 109/1/89, Water Research Commission, Pretoria. (Cited by Tainton, 1999).

Environmentek, CSIR, South Africa.

Fikse, F., Rekaya, R. & Weigel, K.A., 2001. Genotype by environment interaction for milk production traits in Guernsey cattle. Proc. Interbull Bull. 27, 9-12.

Gilmour, A.R., Cullis, B.R., Welham, S.J. & Thompson, R., 2002. ASREML user guide. NSW Agriculture Biometric Bulletin No. 3. NSW Agriculture, Orange Agricultural Institute, Forest Road, Orange 2800, NSW, Australia.

Ibi, T., Hirooka, H., Kahi, A.K., Sasae, Y. & Sasaki, Y., 2005. Genotype x environment interaction effects on carcass traits in Japanese Black cattle. J. Anim. Sci. 83, 1503-1510.

Koots, K.R., Gibson, J.P., Smith, C. & Wilton, J.W., 1994. Analyses of published genetic parameter estimates for beef production traits. 1 Heritability. Anim. Breed. Abstr. 62, 309-338.

Lin, C.Y. & Lin, C.S., 1994. Investigation of genotype-environment interaction by cluster analysis in animal experiments. Can. J. Anim. Sci. 74, 607-612.

Meyer, K., 1992. Variance components due to direct and maternal effects for growth traits of Australian beef cattle. Livest. Prod. Sci. 31, 179-204.

Nephawe, K.A., Neser, F.W.C., Roux, C.Z., Theron, H.E., Van der Westhuizen, J. & Erasmus, G.J., 1999. Sire x ecological region interactions in Bonsmara cattle. S. Afr. J. Anim. Sci. 29, 189-201.

Neser, F.W.C., 2002. A preliminary investigation into the use of cluster analyses in genotype x environment interaction studies in beef cattle. Proc. 7th World Congr. Gen. Appl. to Livestock Prod. Montpellier, France. 32, 391-393.

Neser, F.W.C., Konstantinov, K.V. & Erasmus, G.J., 1996. The inclusion of herd-year-season by sire interaction in the estimation of genetic parameters in Bonsmara cattle. S. Afr. J. Anim. Sci. 26, 75-78.

Robertson, A., 1959. The sample variance of the genetic correlation coefficient. Biometrics 15, 469-475.

SAS, 1992. Statistical Analysis Systems user's guide (Ver 6.03). SAS Institute Inc., Cary, North Carolina, USA.

Tainton, N., 1999. Veld Management in South Africa. University of Natal Press, Scottsville, South Africa.

Weigel, K.A. & Rekaya, R., 2000. A multiple-trait herd cluster model for international dairy sire evaluation. J. Dairy Sci. 83, 815-821.

Zwald, N.R., Weigel, K.A., Fikse, W.F. & Rekaya, R., 2003a. Application of a multiple trait herd cluster model for genetic evaluation of dairy sires from seventeen countries. J. Dairy Sci. 86, 376-382.

Zwald, N.R., Weigel, K.A., Fikse, W.F. & Rekaya, R., 2003b. Identification of factors that cause genotype by environment interaction between herds of Holstein cattle in seventeen countries. J. Dairy Sci. 86, 1009-1018.




Creative Commons License: All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons License.