SciELO - Scientific Electronic Library Online


SAIEE Africa Research Journal

On-line version ISSN 1991-1696
Print version ISSN 0038-2221


VAN STADEN, W.J.C. and VAN DER POEL, E. Using automated keyword extraction to facilitate team discovery in a digital forensic investigation of electronic communications. SAIEE ARJ [online]. 2017, vol.108, n.2, pp.45-55. ISSN 1991-1696.

A major problem in Digital Forensics (DF) is the huge volume of data that has to be searched, filtered, and indexed to discover patterns that could lead to forensic evidence. The nature of the data, and the process by which it is collected, imply that the data also contain information about persons who are not implicated, or are only incidentally involved, in the crime under investigation. Privacy is therefore an important issue that needs to be managed in a DF investigation. This paper shows that techniques used in the Team Formation (TF) task can be successfully applied to address both the problems of data volume and privacy. The TF task can be re-formulated to fit the DF arena: to commit a crime, the culprit(s) may require the assistance of several other individuals, which implies that a team of some sort is established. During a post-mortem DF analysis, an investigator may have only one or a few names to start with, and one of the key challenges is finding possible co-conspirators. From a TF point of view, the culprit is trying to find the best team to commit the crime, given some constraints. The TF task in DF requires the recording of skill-sets, and the generation and/or discovery of a graph depicting interaction between candidates. If the data consist of an email corpus and people's roles in an organisation (such as in the Enron data), both of these are readily available. In this paper we consider the TF problem in general and extend it to the DF arena by considering the information that an investigator may have access to during the investigation. We also show that simple information retrieval and keyword extraction techniques (such as RAKE) can be used to automatically discover potential teams from the data while preserving privacy. Results from a series of experiments on the Enron data, using the new definitions of TF and the proposed information retrieval techniques, are then presented.

Keywords: Digital Forensics; Digital Forensic Investigation; Cyber-crime; Team-formation; Social Network Analysis; Expert Finding.
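
The abstract names RAKE (Rapid Automatic Keyword Extraction) as one of the keyword extraction techniques used. As a rough illustration of how RAKE-style scoring works in general, and not the authors' implementation, the following minimal Python sketch splits text into candidate phrases at stop words and punctuation, scores each word as degree divided by frequency, and sums the word scores per phrase. The stop-word list, the function names, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Tiny illustrative stop-word list (an assumption; practical RAKE
# implementations use a much larger list).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on",
              "for", "is", "are", "with", "we", "will", "by", "at"}


def candidate_phrases(text):
    """Split text into candidate phrases at punctuation and stop words."""
    phrases = []
    for fragment in re.split(r"[^\w\s-]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOP_WORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def rake_scores(text):
    """Return a dict mapping each candidate phrase to its RAKE-style score."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)    # how often each word occurs in candidates
    degree = defaultdict(int)  # summed length of the phrases containing it
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    # Phrase score = sum over its words of degree(word) / freq(word).
    return {
        " ".join(phrase): sum(degree[w] / freq[w] for w in phrase)
        for phrase in set(phrases)
    }


# Hypothetical message text, invented purely for illustration.
sample = ("The energy trading desk will review the energy contracts. "
          "Send the trading report.")
scores = rake_scores(sample)
best = max(scores, key=scores.get)
```

In this sketch, longer phrases built from words that rarely occur on their own tend to score highest. In a team-discovery setting, such phrases could plausibly serve as skill or topic labels attached to the people who exchange the messages, although how the paper maps keywords to candidates is not stated in the abstract.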



All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.