Lexikos
On-line version ISSN 2224-0039
Print version ISSN 1684-4904
Abstract
MALELE, Nomsebenzi and BOSCH, Sonja. Using Semi-automated Term Extraction for IsiNdebele Health Terminology. Lexikos [online]. 2024, vol.34, pp.269-287. ISSN 2224-0039. https://doi.org/10.5788/34-1-1926.
IsiNdebele, also known as Southern isiNdebele, has limited language resources and specialised terminology, especially when compared to other members of the Nguni language family. This study therefore explores ways of addressing the shortage of specialised terminology in isiNdebele by using semi-automatic term extraction methods. The focus is on health terminology intended for communication with laypersons rather than between experts in the health field. The methods combine manual identification and extraction of data from available corpora with the use of a software tool, WordSmith Tools (WST). The study illustrates the necessity of using all functions of WST, as they complement each other: terms overlooked by one function may be captured by another. For instance, the KeyWords function identified only a limited number of terms in this research, whereas manual identification proved more fruitful, and the Concord function emerged as particularly effective in identifying a greater number of terms. The use of WST in this research highlights the viability of corpus-driven studies, even for resource-scarce languages like isiNdebele. Given the limited resources available for isiNdebele, particularly the absence of specialised dictionaries, the health terms collected here are ideal candidates for inclusion in a general dictionary.
Keywords: IsiNdebele; Corpus-Driven Term Extraction; Health Corpora; Language for Specific Purposes (LSP); Language for General Purposes (LGP); WordSmith Tools; Word List; Key Words; Concordance; Semi-Automatic Extraction.












