SciELO - Scientific Electronic Library Online

 

Lexikos

Online version ISSN 2224-0039
Print version ISSN 1684-4904

Abstract

VAN NIEKERK, Tim; SCHAFER, Johannes and HEID, Ulrich. Semi-automating the Reading Programme for a Historical Dictionary Project. Lexikos [online]. 2018, vol.28, pp.343-360. ISSN 2224-0039. http://dx.doi.org/10.5788/28-1-1468.

This paper describes the resources and software procedures used or developed in a major enabling step towards the revision of the scholarly reference work A Dictionary of South African English on Historical Principles (DSAE, Silva et al. 1996), namely the semi-automatic generation of a digitally-sourced lexical database on which new and updated dictionary entries will be based; as well as the addition, in parallel, of a new corpus of South African English (SAE) to the project. Drawing on online data sources and an extensive list of known SAE word forms, we have developed a software toolchain to gather, encode, annotate and collate textual sources, producing: (i) a 3.1-billion part-of-speech-annotated corpus of South African English; (ii) a lexical database of illustrative quotations for over 20,000 known SAE word forms, available for selection at the entry-revision stage; and (iii) a list of potential new variant spellings and headword inclusion candidates. These steps replace, where recent electronic sources are concerned, the mechanical aspects of quotation gathering, normally undertaken manually through a reading programme requiring years of teamwork to acquire sufficient coverage (cf. Hicks 2010).
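
The collation step described above, matching a list of known word forms against source text to gather candidate quotations, can be illustrated with a minimal sketch. This is not the authors' toolchain: the function name `collect_quotations`, the crude regex tokenisation, the per-form cap and the sample sentences are all assumptions made purely for illustration.

```python
import re
from collections import defaultdict


def collect_quotations(sentences, word_forms, max_per_form=3):
    """Map each known word form to the (source, sentence) pairs containing it.

    Hypothetical helper for illustration only; the paper's actual software
    is not reproduced here.
    """
    known = {w.lower() for w in word_forms}
    quotations = defaultdict(list)
    for source, text in sentences:
        seen = set()  # record each form at most once per sentence
        # Crude tokenisation: runs of letters, apostrophes and hyphens.
        for token in re.findall(r"[A-Za-z][A-Za-z'-]*", text):
            form = token.lower()
            if form not in known or form in seen:
                continue
            if len(quotations[form]) < max_per_form:
                quotations[form].append((source, text))
                seen.add(form)
    return dict(quotations)


# Invented sample data (not taken from the corpus described in the paper).
sample_sentences = [
    ("news-2015", "We had a braai on Sunday."),
    ("blog-2016", "The bakkie broke down near the robot."),
    ("news-2017", "Another braai, another bakkie."),
]
sample_forms = ["braai", "bakkie", "robot"]

quotations = collect_quotations(sample_sentences, sample_forms)
for form, hits in sorted(quotations.items()):
    print(form, [source for source, _ in hits])
```

In a real setting the matching would presumably run over part-of-speech-annotated tokens rather than raw strings, and the cap on quotations per form would be replaced by editorial selection at the entry-revision stage.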

Keywords: corpora; dictionary workflows; historical lexicography; language varieties; lexical databases; reading programmes; South African English.


 

All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.