SciELO - Scientific Electronic Library Online

 
vol.115 número3-4 índice de autoresíndice de assuntospesquisa de artigos
Home Pagelista alfabética de periódicos  

Serviços Personalizados

Artigo

Indicadores

Links relacionados

  • Em processo de indexaçãoCitado por Google
  • Em processo de indexaçãoSimilares em Google

Compartilhar


South African Journal of Science

versão On-line ISSN 1996-7489
versão impressa ISSN 0038-2353

Resumo

BREED, Douw G.  e  VERSTER, Tanja. An empirical investigation of alternative semi-supervised segmentation methodologies. S. Afr. j. sci. [online]. 2019, vol.115, n.3-4, pp.1-7. ISSN 1996-7489.  http://dx.doi.org/10.17159/sajs.2019/5359.

Segmentation of data for the purpose of enhancing predictive modelling is a well-established practice in the banking industry. Unsupervised and supervised approaches are the two main types of segmentation and examples of improved performance of predictive models exist for both approaches. However, both focus on a single aspect - either target separation or independent variable distribution - and combining them may deliver better results. This combination approach is called semi-supervised segmentation. Our objective was to explore four new semi-supervised segmentation techniques that may offer alternative strengths. We applied these techniques to six data sets from different domains, and compared the model performance achieved. The original semi-supervised segmentation technique was the best for two of the data sets (as measured by the improvement in validation set Gini), but others outperformed for the other four data sets.Significance: •We propose four newly developed semi-supervised segmentation techniques that can be used as additional tools for segmenting data before fitting a logistic regression. •In all comparisons, using semi-supervised segmentation before fitting a logistic regression improved the modelling performance (as measured by the Gini coefficient on the validation data set) compared to using unsegmented logistic regression

Palavras-chave : data mining; predictive models; multivariate statistics; pattern recognition.

        · texto em Inglês     · Inglês ( pdf )

 

Creative Commons License Todo o conteúdo deste periódico, exceto onde está identificado, está licenciado sob uma Licença Creative Commons