Online version ISSN 1996-7489
Print version ISSN 0038-2353
S. Afr. j. sci. vol.110 no.3-4 Pretoria feb. 2014
School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Westville Campus, Durban, South Africa
Academic members of staff at the University of KwaZulu-Natal (UKZN) are expected to publish in research journals that have been accredited by the South African Department of Higher Education and Training. However, some members of staff have chosen to focus solely on the teaching aspect of their careers and as a result have no publication record. In this study, a set of per annum productivity unit counts was calculated for every academic at UKZN. Because a publishing academic can also record a zero count for a given year, an appropriate methodology is needed that can distinguish this zero count from one that will always be recorded by a non-publishing academic. By fitting a zero-inflated Poisson model to the data, specific factors can be identified that separately drive the non-publishing and publishing processes at UKZN. In particular, having a PhD and working in a large school have a significant impact on improving the research output of a publishing academic. If UKZN wants to become a research-focused university, non-publishing academics should be encouraged to undertake a PhD degree.
Keywords: publishing record; zero inflation; Poisson; science faculty; UKZN
In South Africa, the Department of Higher Education and Training allocates funds to universities based on a formula that rewards student throughput as well as the research output of publishing academics.1,2 In this paper, an attempt is made to identify possible demographic and academic factors that help to improve the research output of academics at the University of KwaZulu-Natal (UKZN). Similar studies have been conducted, mainly on US data (for example Xie and Shauman3, Aksnes et al.4 and Kyvik and Teigen5), in which it was found that for almost every age group in their respective data sets, men publish more than women. Barjak6, Gonzalez-Brambila and Veloso7 and Kyvik8 have found that research productivity tends to increase with age, reaching a peak before tapering off towards retirement. What distinguishes this study from those mentioned above is that it analyses a set of per annum publication counts for all members of staff at UKZN. Some staff members will have a zero value for a particular year because they did not publish anything during that year even though they have a record of previous publications, whereas others will have a zero value because they have chosen to focus entirely on the teaching aspect of their careers and therefore never publish. Essentially, a per annum publication output is observed, but an accompanying variable indicating whether the person recording a zero outcome is (or will eventually become) a publishing academic cannot be observed. A zero-inflation modelling approach overcomes this problem of not being able to identify a given record as belonging to a publishing or non-publishing academic by introducing an underlying process - process Z - that generates two types of zeroes: a structural zero for a non-publishing academic and a true zero for a publishing academic.
A zero-inflation model assumes two possible sources from which a zero observation can arise. Academics who have made a conscious decision to never publish will generate what is called a 'structural' zero. Those who have a record of previous publications but have not published in a particular year generate what is called a 'true' or 'sampling' zero.
Focusing on the structural zero recorded every year for a non-publishing academic, a zero-inflation model links the probability p₀ of observing such an outcome to the covariate profile x of a particular academic through the following logistic regression model9:

$$\log\left(\frac{p_0(x)}{1 - p_0(x)}\right) = x^{\top}\gamma$$
For the per annum research records generated by a publishing academic, for whom a true zero may be recorded in a particular year, the output is modelled using a Poisson distribution whose intensity parameter λ(x) depends on the covariates x through the function

$$\lambda(x) = \exp\left(x^{\top}\theta\right).$$

With these model choices in hand, a final model formulation - a zero-inflated Poisson model - is obtained for our response variable Y that sets13

$$P(Y = 0 \mid x) = p_0(x) + \bigl(1 - p_0(x)\bigr)\,e^{-\lambda(x)}, \qquad P(Y = k \mid x) = \bigl(1 - p_0(x)\bigr)\,\frac{e^{-\lambda(x)}\,\lambda(x)^{k}}{k!}, \quad k = 1, 2, \ldots$$
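The two building blocks described above can be combined in a few lines of code. The following is a minimal sketch, assuming the standard formulation of Lambert9 (a logit link for the structural-zero probability and a log link for the Poisson intensity); the function names, the two-element covariate profile and the coefficient values are hypothetical and chosen only for illustration.

```python
import math

def structural_zero_prob(x, gamma):
    """Logistic model for a structural zero: p0 = 1 / (1 + exp(-x'gamma))."""
    eta = sum(xi * gi for xi, gi in zip(x, gamma))
    return 1.0 / (1.0 + math.exp(-eta))

def poisson_intensity(x, theta):
    """Log-linear model for the Poisson mean: lambda(x) = exp(x'theta)."""
    return math.exp(sum(xi * ti for xi, ti in zip(x, theta)))

def zip_pmf(k, x, gamma, theta):
    """P(Y = k | x) under a zero-inflated Poisson model."""
    p0 = structural_zero_prob(x, gamma)
    lam = poisson_intensity(x, theta)
    poisson_k = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero is either structural or a 'true' Poisson zero.
        return p0 + (1.0 - p0) * poisson_k
    return (1.0 - p0) * poisson_k

# Hypothetical profile (intercept, PhD indicator) and made-up coefficients.
x = [1.0, 1.0]
gamma = [0.5, -1.5]
theta = [0.2, 0.6]
probs = [zip_pmf(k, x, gamma, theta) for k in range(60)]
```

The probability of a zero always exceeds the structural-zero probability, because publishing academics contribute true zeroes as well.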
Parameter estimates for θ and γ can be obtained using the method of maximum likelihood, details of which can be found in Lambert9. Because the mean of a Poisson distribution equals its intensity parameter λ(x), the above expression linking λ(x) to the covariates in x implies that any covariate with a positive estimate in θ has a positive effect, that is, it increases the per annum research output produced by that academic at UKZN. Similarly, covariates with a negative parameter estimate are associated with a reduced level of research output.
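To make the maximum likelihood step concrete, the sketch below writes out the log-likelihood that would be maximised, again assuming a logit link for the structural-zero probability and a log link for λ(x). The function name and the toy data are hypothetical; in practice the estimates would be found by passing the negative of this function to a numerical optimiser, or by using a statistical package as the authors did.

```python
import math

def zip_log_likelihood(gamma, theta, rows):
    """Log-likelihood of a zero-inflated Poisson model.

    rows is an iterable of (y, x) pairs, where y is a non-negative integer
    count and x is a list of covariate values (including an intercept).
    """
    total = 0.0
    for y, x in rows:
        eta = sum(xi * gi for xi, gi in zip(x, gamma))
        p0 = 1.0 / (1.0 + math.exp(-eta))  # structural-zero probability
        lam = math.exp(sum(xi * ti for xi, ti in zip(x, theta)))  # Poisson mean
        if y == 0:
            # Mixture of structural and true zeroes.
            total += math.log(p0 + (1.0 - p0) * math.exp(-lam))
        else:
            # Positive counts can only come from the Poisson component.
            total += (math.log(1.0 - p0) - lam + y * math.log(lam)
                      - math.lgamma(y + 1))
    return total

# Made-up toy data: (count, [intercept, PhD indicator]).
rows = [(0, [1.0, 0.0]), (0, [1.0, 0.0]), (2, [1.0, 1.0]),
        (0, [1.0, 1.0]), (3, [1.0, 1.0])]
ll_a = zip_log_likelihood([0.0, -1.0], [0.0, 0.8], rows)
ll_b = zip_log_likelihood([0.0, 1.0], [0.0, -0.8], rows)
```

On this toy data, the parameter values in which a PhD lowers the chance of a structural zero and raises the Poisson mean give the higher log-likelihood, as expected when the positive counts all belong to PhD holders.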
The response variable
At UKZN, the productivity unit counts that appear in Table 1 have been used to apportion, in an unbiased manner, a 'unit of worth' to a published piece of work. This approach differs from that of most other studies, in which each published article is usually weighted by the journal impact factor, which measures the average number of citations a journal paper will receive in the first 2 years following its year of publication. Because the focus in this study was to identify specific factors that affect the production of research rather than the quality of this research, a 'unit of worth' was chosen as the measure, but the method to be outlined can just as easily be applied to an impact factor adjusted response variable. This response variable was then rescaled using the following rule so that it better fitted into the model paradigm outlined in the previous section for a zero-inflated Poisson model:
In addition to this response variable, the following covariates that help to distinguish one academic from another were also collected:
• a 0/1 indicator variable denoting whether the academic member of staff was female or male
• a set of separate 0/1 indicator variables denoting whether the academic was a lecturer, senior lecturer or professor
• a set of separate 0/1 indicator variables denoting the racial group to which the academic belongs (African, Indian or white)
• a 0/1 indicator variable denoting whether the academic has a PhD or not
• a variable denoting the number of academics in the school in which the academic resides (size)
• an age-based category variable taking on a value 0 if the academic (in that particular year) is in their twenties, a value 1 if they are in their thirties, etc.
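The covariates listed above can be turned into a model-ready row as follows. This is a minimal sketch: the record keys and function names are hypothetical, and the choice to omit indicators for female, senior lecturer and white assumes these form the baseline category against which the other groups are compared.

```python
def age_category(age_years):
    """Return 0 for an academic in their twenties, 1 for thirties, and so on."""
    if age_years < 20:
        raise ValueError("age categories are assumed to start at 20")
    return age_years // 10 - 2

def encode_academic(record):
    """Map one academic's record for a given year to 0/1 indicators and counts."""
    position = record["position"].lower()
    race = record["race"].lower()
    return {
        "male": 1 if record["gender"].lower() == "male" else 0,
        # Senior lecturer is the assumed baseline, so it has no indicator.
        "lecturer": 1 if position == "lecturer" else 0,
        "professor": 1 if position == "professor" else 0,
        # White is the assumed baseline, so it has no indicator.
        "african": 1 if race == "african" else 0,
        "indian": 1 if race == "indian" else 0,
        "phd": 1 if record["has_phd"] else 0,
        "size": record["school_size"],
        "age_cat": age_category(record["age"]),
    }

# Hypothetical record, for illustration only.
row = encode_academic({
    "gender": "male", "position": "professor", "race": "Indian",
    "has_phd": True, "school_size": 42, "age": 47,
})
```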
Data was collected for the period from 2004 to 2008 and consisted of a total of 1236 year-on-year productivity unit counts. A breakdown of the data set according to age, race, gender, qualification and job position is given in Table 2.
Parameter estimates for the zero-inflation model were obtained using Stata. The strongly skewed nature of each of the plots that appear in Figure 1 seems to support the idea that the observed counts are being generated from a combination of two different sources: a non-publishing academic who has chosen to focus on the teaching aspect of their career and will therefore never produce a research article, or a research-active academic who, in a so-called 'dry year', may have recorded a zero for that particular year.
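One simple way to check whether a set of counts contains more zeroes than a single Poisson source would produce is to compare the observed share of zeroes with exp(-mean), the share an ordinary Poisson distribution with the same mean would predict. The sketch below illustrates this informal check; it is not taken from the study, and the counts are made up.

```python
import math

def zero_excess(counts):
    """Return (observed zero share, zero share expected under a plain Poisson)."""
    n = len(counts)
    mean = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    expected = math.exp(-mean)  # P(Y = 0) for a Poisson with this mean
    return observed, expected

# Made-up per annum counts, for illustration only.
counts = [0, 0, 0, 0, 0, 0, 1, 2, 0, 3, 0, 5, 0, 1, 0, 0, 4, 0, 2, 0]
observed, expected = zero_excess(counts)
```

An observed share well above the expected share is consistent with a second, structural source of zeroes.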
Table 3 contains a set of parameter estimates for θ and γ. Any significantly positive value that occurs in γ can be associated with a factor that will serve to increase the log-odds ratio in favour of generating a structural zero (i.e. a person who will never publish a research article). Any significantly positive value obtained for θ should be interpreted as identification of a factor that will help to increase the expected per annum based research output of that individual when compared with a person from the baseline category (a white female who holds a senior lecturer position and has no PhD qualification).
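The interpretation above can be made explicit with a short derivation, assuming a log link for λ(x) and a logit link for the structural-zero probability p₀(x). Let x⁽¹⁾ and x⁽⁰⁾ denote two covariate profiles that differ only in indicator j, which equals 1 in the first and 0 in the second. The value 0.5 used at the end is hypothetical and is not an estimate from Table 3.

```latex
\frac{\lambda(x^{(1)})}{\lambda(x^{(0)})}
  = \frac{\exp\bigl(x^{(1)\top}\theta\bigr)}{\exp\bigl(x^{(0)\top}\theta\bigr)}
  = e^{\theta_j},
\qquad
\frac{p_0(x^{(1)}) / \bigl(1 - p_0(x^{(1)})\bigr)}
     {p_0(x^{(0)}) / \bigl(1 - p_0(x^{(0)})\bigr)}
  = e^{\gamma_j}.

% Hypothetical illustration: \theta_j = 0.5 gives e^{0.5} \approx 1.65,
% i.e. roughly 65% more expected output per annum than the baseline.
```

A positive θⱼ therefore multiplies the expected output of a publishing academic by a factor greater than one, whereas a positive γⱼ multiplies the odds of being a non-publishing academic by a factor greater than one.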
Focusing on those factors that help to distinguish someone who does publish from someone who does not, the estimates for γ that appear in Table 3 suggest that having a PhD, being a professor (rather than a lecturer) and being of non-Indian origin are the only variables that significantly increase the odds ratio associated with becoming an academic who does publish. More important, however, are the factors that do not seem to affect the odds ratio associated with becoming a research productive academic in the Faculty of Science and Agriculture at UKZN. In particular, age and school size do not seem to play a significant role in whether or not someone becomes a research productive academic.
Focusing now on those academics who appear to be research active, the estimates for θ that appear in Table 3 suggest, at a 5% level of significance, that holding the position of professor, having a PhD qualification, being male and working in a large school all help to increase the research productivity of academics in the Faculty of Science and Agriculture at UKZN when compared with someone from the baseline category (a white female who holds a senior lecturer position in the faculty and has no PhD qualification). However, staff who are older seem to be less productive when compared with younger staff. Similarly, African and Indian researchers are not producing as much research output as their otherwise identical white counterparts.
The excess number of zero values observed in our data set may be attributable in part to a large number of academics within this data set who have chosen never to publish in their academic careers. Including these individuals in the analysis without accounting for them may impair the ability to focus on the problem of interest, which is to identify those factors that help to improve the publication rate of academics in the Faculty of Science and Agriculture at UKZN who have chosen to publish. The purpose of this study was to identify a specific set of factors that affect research productivity in the Faculty of Science and Agriculture at UKZN. Because the available data set does not explicitly identify someone as being a publishing or non-publishing academic, a method had to be developed to appropriately separate this pool of academics into those who publish and those who have elected never to publish. A set of covariates was then included in the model to establish their effect on research productivity.
In particular, having a PhD and working in a large school were found to have a significant impact on improving the research output of a publishing academic. If UKZN wants to become a research-focused university, non-publishing academics need therefore to be encouraged to complete a PhD degree.
1. Melck AP. Methods of financing universities with special reference to formula funding in South Africa [DCom thesis]. Stellenbosch: Stellenbosch University; 1982.
2. Venter RH. An investigation of government financing of universities. 2nd ed. Report: SAPSE-110. Pretoria: Department of National Education; 1985.
3. Xie Y, Shauman K. Sex differences in research productivity: New evidence about an old puzzle. Am Sociol Rev. 1998;63(6):847-870. http://dx.doi.org/10.2307/2657505
4. Aksnes DW, Rorstad K, Piro F, Sivertsen G. Are female researchers less cited? A large-scale study of Norwegian scientists. J Am Soc Inform Sci Technol. 2011;62(4):628-636. http://dx.doi.org/10.1002/asi.21486
5. Kyvik S, Teigen M. Child care, research collaboration, and gender differences in scientific productivity. Sci Technol Hum Val. 1996;21(1):54-71. http://dx.doi.org/10.1177/016224399602100103
6. Barjak F. Research productivity in the Internet era. Scientometrics. 2006;68(3):343-360. http://dx.doi.org/10.1007/s11192-006-0116-y
7. Gonzalez-Brambila C, Veloso FM. The determinants of research output and impact: A study of Mexican researchers. Res Policy. 2007;36(7):1035-1051. http://dx.doi.org/10.1016/j.respol.2007.03.005
School of Mathematics, Statistics and Computer Science,
University of KwaZulu-Natal, Westville Campus, Durban 4001, South Africa
Received: 10 Oct. 2013
Revised: 19 Dec. 2013
Accepted: 01 Jan. 2014