SciELO - Scientific Electronic Library Online

 

South African Journal of Science

On-line version ISSN 1996-7489
Print version ISSN 0038-2353

Abstract

PAZI, Sisa; CLOHESSY, Chantelle M. and SHARP, Gary D. A framework to select a classification algorithm in electricity fraud detection. S. Afr. j. sci. [online]. 2020, vol.116, n.9-10, pp.1-7. ISSN 1996-7489. http://dx.doi.org/10.17159/sajs.2020/8189.

In the electrical domain, a non-technical loss often refers to energy used but not paid for by a consumer. The identification and detection of this loss is important as the financial loss by the electricity supplier has a negative impact on revenue. Several statistical and machine learning classification algorithms have been developed to identify customers who use energy without paying. These algorithms are generally assessed and compared using results from a confusion matrix. We propose that the data for the performance metrics from the confusion matrix be resampled to improve the comparison methods of the algorithms. We use the results from three classification algorithms, namely a support vector machine, k-nearest neighbour and naïve Bayes procedure, to demonstrate how the methodology identifies the best classifier. The case study is of electrical consumption data for a large municipality in South Africa.

SIGNIFICANCE:
  • The methodology provides data analysts with a procedure for analysing electricity consumption in an attempt to identify abnormal usage.
  • The resampling procedure provides a method for assessing performance measures in fraud detection systems.
  • The results show that no single metric is best, and that the selected metric is dependent on the objective of the analysis.

Keywords: electricity fraud detection; confusion matrix; classification algorithms.
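
The idea of resampling confusion-matrix metrics, as described in the abstract, can be illustrated with a minimal Python sketch. The abstract does not state which resampling scheme the authors use, so the bootstrap (sampling with replacement) is an assumption here, as are the function names `confusion_counts`, `metrics` and `bootstrap_metric`, the choice of metrics, and the toy labels. It is not the authors' implementation.

```python
import random


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 marks suspected fraud."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Compute common confusion-matrix metrics; an empty denominator yields 0.0."""
    def ratio(num, den):
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def bootstrap_metric(y_true, y_pred, metric, n_resamples=1000, seed=0):
    """Bootstrap one metric (assumed scheme): return (mean, 2.5th, 97.5th percentile)."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_resamples):
        # Resample (true, predicted) pairs with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        counts = confusion_counts([y_true[i] for i in idx],
                                  [y_pred[i] for i in idx])
        values.append(metrics(*counts)[metric])
    values.sort()
    lower = values[int(0.025 * n_resamples)]
    upper = values[int(0.975 * n_resamples) - 1]
    return sum(values) / n_resamples, lower, upper


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the case study.
    truth = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
    predicted = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]
    for name in ("accuracy", "precision", "recall", "specificity"):
        print(name, bootstrap_metric(truth, predicted, name))
```

Running the same procedure on the predictions of each classifier gives a distribution for every metric rather than a single point value, so classifiers can be compared by how much their intervals overlap on the metric that matches the objective of the analysis.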


 

Creative Commons License: All contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.