Intelligent Malware Detection Using a Neural Network Ensemble Based on a Hybrid Search Mechanism

Malware threats have become increasingly dynamic and complex, and, accordingly, artificial intelligence techniques have become the focal point for cybersecurity, as they are viewed as being more suited to tackling modern malware incidents. Specifically, neural networks, with their strong generalisation capability, are able to address a wide range of cyber threats. This article outlines the development and testing of a neural network ensemble approach to malware detection, based on a hybrid search mechanism. In this mechanism, individual networks are optimised by an adaptive memetic algorithm with tabu search, which is also used to improve the hidden neurons and weights of the neural networks. The adaptive memetic algorithm combines global and local search optimisation techniques in order to overcome premature convergence and obtain an optimal search outcome. The test results indicate that the proposed method is strongly adaptive and efficient in its detection of a range of malware threats, and that it generates better results than other existing methods.


Introduction
Malware attacks have increased recently, in Africa and globally, due to advances in technology and the growing number of miscellaneous internet of things (IoT) devices being connected to data networks (Xiao, Lin, Sun & Ma, 2019). The nature of malware attacks has also dramatically changed, as sophisticated attacks have become ubiquitous. The sophistication and complexity of malware have manifested in miscellaneous ways; the common way has been malware camouflage and obfuscation, where the attack comes in the form of a solution to a problem together with a demand for ransom money (Kalaimannan, John, DuBose & Pinto, 2017). This type of attack has continued to evolve and to produce more variants, so that the perpetrators can continue to profit from such pernicious attacks.
According to Symantec, malware has increased sharply since 2014 (Lee & Kwak, 2016) and continues to increase. Symantec also reports that the majority of the new malware programs are variants of the existing destructive malware, which is indicative of the evolution that is taking place in order for the programs to be more complex for the countermeasures and to avoid detection.
In order to mitigate malware, a number of countermeasures have been advanced in the literature (Jerlin & Marimuthu, 2018). However, existing techniques have not fared well due to the obfuscation tactics of malicious software and other advances in evading detection. Malware targets have also expanded to include mobile platforms, thereby posing another challenge to existing mitigation efforts.
Most existing research has reported on the efficiency of machine learning (ML) and artificial intelligence (AI) techniques in malware detection and mitigation (Chen, Su & Qiao, 2018). The techniques are useful in the classification of malware, such as Trojans and worms, among others, and in mapping suitable techniques to the malware type. In addition, the ML techniques, through feature extraction, increase the accuracy of detecting malware by reducing the search space, so as to home in on the specific malware (Khammas, 2018). This alleviates some of the current challenges, including the conventional detection methods being evaded by new variants of malware due to search limitations. This can be attributed to the learning abilities of ML techniques, as well as their capacity to mine data patterns, relationships and procedure analysis.
Modern malware is increasingly adaptive and dynamic in nature, which makes self-learning techniques important. Specifically, self-learning techniques, such as neural networks, are able to self-organise and self-evolve, as well as classify and process data in parallel, and are hence able to detect mutated and other intrinsic forms of malware (Barriga & Yoo, 2017).
Agent-based methods have also become promising approaches for malware detection in both web and mobile applications (Kendrick, Criado, Hussain & Randles, 2018). This is due to the interactions between agents and their environments, which create more focused and accurate inputs, so as to generate robust intelligent solutions against malware. (Agents are any entities that can make decisions like a human being through their interactions with one another and the environment.) Moreover, the heterogeneity of agents and of their attributes enriches the capabilities of agent-based applications in combating different types of malware. Through these interactions and the inherent data collection and storage capabilities of agents, patterns can be inferred, which are useful for predictions.
Prediction has been extensively explored in the domain of malware detection, especially using machine learning techniques to predict the behaviour of malware and its hotspots or risk areas (Mahrin et al., 2018). The behavioural analysis of malware also includes classification, which is essential for investigating and prioritising threats. The recent malware attacks have become much more coordinated, for example, botnets which represent a string of devices that are interconnected to communicate and share information with one another, so as to launch large-scale and high-level attacks. Botnets use autonomous programs known as bots, which mimic human behaviour in interactions with users with a view to collecting information and using it to conduct various kinds of malicious attacks (Khoshhalpour & Shahriari, 2018). The level of complexity inherent in these kinds of attacks continues to pose a significant challenge to the traditional malware detection techniques, as well as to the common network environments.
The threat of malware has spread to mobile telephony platforms and proliferated exponentially on account of the openness, and popularity of use, of mobile platforms (Ren, Liu, Cheng, Guo & Chen, 2018). These mobile platforms have become carriers of very sensitive data, ranging from personal financial information to the private details of users' lives. Any data breach on these platforms due to malware attacks can have severe consequences.
The web and desktop platforms nowadays also carry a similar risk, due to a proliferation of desktop applications that enable users to process sensitive information, and also due to mixed storage and processing of business and personal information. One of the biggest attacks in modern times was the WannaCry ransomware attack in 2017, which affected more than 150 countries in less than a week. More than 200,000 devices were attacked in a matter of days, through forcing the encryption of users' data until a ransom was paid. This is one of the many examples of malware that use advanced algorithms to conduct large-scale malicious attacks. The existential challenge posed by these techniques is their ability to evade detection and throw the traditional and less intelligent detection methods into confusion.
In this article, we advance a novel approach, with a view to outflanking the intelligent malware and providing a robust countermeasure to a wide variety of malware. The approach is based on a neural network ensemble, wherein an intelligent search optimisation process is conducted by memetic algorithm and by the k-means machine-learning clustering algorithm, in order to generate the optimal solution for complex malware detection. The approach that we advance applies to cyber systems throughout the world. Malware attacks in Africa are similar to attacks elsewhere, and thus we have designed the algorithm with the aim of helping to solve a global problem.
The rest of the article is organised as follows: section 2 provides a review of related work, section 3 provides the problem definition, section 4 provides the methodology, section 5 provides the discussion and presentation of experimental results, and section 6 concludes.

Neural-network-based malware detection
Due to the dynamic behaviour and increasing obfuscation tactics of malware, more intelligent solutions have been sought, and prominent among them are neural networks. Many factors make neural networks attractive for solving problems of this nature, but the overriding factor is that, due to their intrinsic training processes, they achieve accuracy and efficacy in solving very complex problems (Hassan & Hamada, 2017), thus making them suitable for malware detection. Yan, Qi and Rao (2018) present an ensemble method for detecting malware based on a deep neural network. The approach uses a convolutional neural network and a memory technique to learn raw data and make inferences regarding the existence or nonexistence of malware. The inferences are based on patterns extrapolated from both the structure and code of the malicious file. (A convolutional recurrent neural network is a blend of the recurrent neural network and the convolutional neural network. Convolutional neural networks can be characterised as those that apply convolutions (a kind of mathematical operation) and that classify data regardless of the positioning.) This approach is similar to the recurrent neural network ensemble proposed by Rhode, Burnap and Jones (2018). The ensemble studies behavioural data and makes inferences regarding the maliciousness of an executable file. This is done during the execution by collecting a small sample of behavioural data with a view to detecting and blocking malicious processes before they cause damage. In order to classify this behavioural data, a classifier is presented based on a convolutional recurrent neural network (Alsulami & Mancoridis, 2018), in order to classify families of malware and to extrapolate better patterns for improving detection accuracy. The method extracts features adaptively from MS Windows files to classify them.
In the same vein, Kabanga and Kim (2018) apply the convolutional neural network to the classification of malware images. Instead of using text and other forms of data as inputs, image vectors are used to train neural networks. The convolutional neural network is set up with three layers, in order to achieve the classification function. In order to identify and classify complex patterns in data for malware detection, Le, Boydell, Namee and Scanlon (2018) present a classification method based on deep learning. The approach uses data-driven techniques to identify features for classification. Multiple deep-learning architectures are utilised, and each input is classified into a malware class in terms of various neural network layers, whereby vectors are generated for feature extraction and classification. This classification method contrasts with mechanisms that rely on expert domain knowledge. For example, Zarras, Webster and Eckert (2016) propose a malware classification strategy based on both recurrent and convolutional neural networks. System calls are obtained and sequenced using a sequential model to form a domain for feature extraction and classification. The same principle is applied by Martinelli, Marulli and Mercaldo (2017), wherein a dynamic analysis is conducted on system calls and a convolutional neural network is deployed to distinguish malicious data from benign data in an Android data sample. Recurrent neural networks, unlike convolutional neural networks, recall input samples and reuse them for classification of the current samples.
A dynamic malware detection technique based on deep learning is also presented by Yin, Zhou, Wang, Jin and Xu (2018), wherein malware execution and monitoring functions are separated and analysed independently. A log is produced for the monitoring processes where information is then extracted as the input for the neural network. The training enables the neural network to recognise and classify various types of malware. The approach of Selvaganapathy, Nivaashini and Natarajan (2018) uses a restricted Boltzmann machine (RBM) with a stacking technique to select features in a neural network and detect malicious patterns in uniform resource locators (URLs). A miscellany of classes are used for classification. A similar problem is solved by Le, Pham, Sahoo and Hoi (2017), with convolutional neural networks and the detection technique embedded in the URL so as to train the neural network in all aspects of the URL, including words and characters. An RBM is an artificial neural network that is stochastic (i.e., randomly determined) in nature, and a probability distribution can be generated from its inputs.
In order to detect and classify malware in unseen files, Rad, Nejad and Shahpasand (2018) apply a binary classifier to MS Windows files. The training of the neural network classifier is done with a view to giving it the ability to distinguish malicious files from benign ones.
Many of the aforementioned approaches provide novel solutions, but the malware landscape has dramatically changed in recent years. The changes are mostly epitomised in the shifting optima (Souri & Hosseini, 2018), i.e., shifts in the most favourable solution among a set of constantly changing feasible solutions. The shifting optima makes tracking difficult, and makes it overwhelmingly difficult for static and slow-evolving solution approaches to find optima. Thus, the more efficient malware countermeasures will be those that are highly adaptive and dynamic in their search and optimisation processes for malware detection.

Memetic algorithm solutions
As malware has become more disguised and sophisticated, search heuristics have come to be considered effective optimisation mechanisms, not only for detecting malware threats but also for finding optimal solutions. However, the challenge for single search heuristics is that they get stuck in local optima in the course of the search (Xu et al., 2017), which undercuts the quality of the search outcome. Thus, hybrid mechanisms, such as memetic algorithms, are preferred metaheuristics for conducting the optimisation search process, especially in non-stationary malware environments.
Okobah and Ojugo (2018) present a memetic model for malware intrusion detection. The approach uses classification rules, as well as a fitness function based on the evolutionary process, to generate a feasible solution. Mohammadi and Namadchian (2017) apply a memetic algorithm to optimise detection of irregular traffic. In order to classify malicious traffic, a memetic evolutionary classifier is used. The classifier functions in diverse and dynamic environments. This approach draws parallels with the hybrid mechanism proposed by Xue, Jia, Zhao and Pang (2018), where the feature selection is done by differential evolution, and neighbourhood improvements are conducted by the k-nearest algorithm, with a view to averting premature convergence. Shah, Ehsan, Ishaq, Ali and Farooq (2018) present a hybrid classifier to classify irregular activity. A genetic algorithm and a support vector machine are deployed for, respectively, feature selection and optimising of parameters to enhance accuracy. The training resources required, with respect to time, are significantly reduced through faster convergence, provided a robust and optimal feature selection process exists.
Dash (2017) applies a hybrid of particle swarm optimisation (PSO) and gravitational search to detect intrusions and malicious activity. In the same vein, Altaher and Barukab (2017) combine the PSO with the adaptive neural fuzzy inference system in order to distinguish malware-infected Android applications from malware-free applications. Fuzzy rules are generated to guide the classification process, through intelligent optimisation of parameters, using PSO search based on the evolutionary process that Razak et al. (2018) apply to solve a similar problem.
In the work by Zhirou and Jing (2018), the malicious attack vector is presented as a weighted network, and a memetic algorithm is used to optimise the cost on each node with a view to minimising the cost of attacking a node. An optimal search yields a node combination with the lowest cost.
Although all the approaches outlined above represent state-of-the-art advances in the area, the malware problem landscape continually evolves, at an exponential speed, as a result of the interconnectedness of systems (WEF, 2018) in the fourth industrial revolution (4IR), which demands more intelligent and adaptive solutions provided by advanced algorithms to counter the increasingly complex attacks.

Problem definition
The malware detection problem we addressed with this research is a combinatorial optimisation problem, where there is a finite set of multiple feasible solutions and the aim is to optimise and generate the best solution (Schweidtmann & Mitsos, 2019). The neural network ensemble we developed represents a blend of various neural networks whose functions are synergised with a view to minimising error and achieving precision (Yan et al., 2018). Neural networks are trained to build a strong capability for solving specific problems. Once they are trained, neural networks generate individual outcomes which are combined to form an ensemble solution outcome. Ensemble approaches aim to offset the drawbacks associated with individual networks, and they present robust solutions to complex problems.
Accordingly, the problem we seek to address with our proposed model is two-fold: (1) optimisation of a neural network ensemble, and (2) malware detection. The mathematical formulation of the problem is as follows: Notation Denotation class label class of malware class of benign behaviour dataset where are feature vectors in the dataset. The behaviour of the data is then represented as and the directed graph, which represents the data relationships and dependencies in the form of weights in a neural network (Xiao et al., 2019), is denoted by , where represents the retrieved number of graphs from the dataset. The behaviour of the sample for a class of malware is defined as follows: where is the behaviour of the sample, in the dataset.
Therefore, the behaviour of the dataset is defined as: for malware or benign, so that if the sample contains 1 then data is malware and if it contains 0 then data is benign. This helps to calculate the detection error from the fitness function (Sheng et al., 2017), as follows.
where , and denote, respectively, the training error, complexity of the neural network, and detection error. The target output, and current network output, are used to compute the error generated from training the neural network, as shown in Equation (7). The parameters are defined by , and while active connections with weights, total connections including those on hidden neurons, training patterns, and neurons, are represented, respectively, by , , and . The problem therefore is to minimise the function in Equation (6), which ultimately reduces training and detection errors, which in turn increases the detection rate of malware behaviour patterns in the data sample.
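The three components of the fitness function named above can be illustrated with a small sketch. The exact formula and its parameter values are not recoverable from the text here, so everything below is an assumption made for illustration: a weighted sum of the three terms, a mean squared training error, a complexity term equal to the ratio of active to total connections, and the coefficient names `alpha`, `beta` and `gamma` with placeholder values.

```python
from dataclasses import dataclass


@dataclass
class FitnessWeights:
    # Hypothetical coefficients; the article's actual parameter values are not shown.
    alpha: float = 0.6  # weight on training error
    beta: float = 0.1   # weight on network complexity
    gamma: float = 0.3  # weight on detection error


def training_error(targets, outputs):
    """Mean squared difference between target and current network outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def complexity(active_connections, total_connections):
    """Share of connections that are active (assumed complexity measure)."""
    return active_connections / total_connections


def detection_error(labels, predictions):
    """Fraction of samples whose predicted class differs from the label."""
    wrong = sum(1 for y, p in zip(labels, predictions) if y != p)
    return wrong / len(labels)


def fitness(targets, outputs, labels, predictions, active, total,
            weights=FitnessWeights()):
    """Weighted sum to be minimised (assumed form, lower is better)."""
    return (weights.alpha * training_error(targets, outputs)
            + weights.beta * complexity(active, total)
            + weights.gamma * detection_error(labels, predictions))
```

Under these assumptions, a candidate network with a lower training error, fewer active connections, or fewer misclassified samples receives a lower (better) fitness value.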
The second part of the problem is the optimisation of a neural network ensemble based on connection weights. As mentioned before, the neural network ensemble represents an amalgam of disparate networks (Choi & Lee, 2018), where each individual network generates an output pattern from training input data. The network connections are defined by weights, which indicate the strength of these connections and of the influence of neurons on each other (Ojha, Abraham & Snasel, 2017) through synapses. The influence of neurons is determined by the strength level of the connections between them. The synapses generate an output, as a product of weight and input. Let represent weight of the neural network connection, such that and represents inputs. In order to adjust and map outputs and inputs to the neuron, a bias operator, is utilised. This is expressed as follows.
where all connections, use the activation function, , which activates neurons based on the weight, thus creating a bias to the process with a view to achieving a nonlinear output pattern, as demonstrated by Abiodun et al. (2018). The problem therefore is to optimise the weights in Equation (10) for all connections using a learning optimisation algorithm, so as to create a convergence of high-quality networks to the ensemble.
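The neuron computation described above (a weighted sum of inputs, shifted by a bias and passed through an activation function) can be sketched as follows. The sigmoid is used only as an example of a nonlinear activation; the text does not state which activation function the authors used, and the function names are illustrative.

```python
import math


def sigmoid(z):
    """Example nonlinear activation; an assumption, not the article's stated choice."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(weights, inputs, bias, activation=sigmoid):
    """Apply the activation to the weighted sum of the inputs plus the bias."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)
```

Optimising the ensemble then amounts to searching for the `weights` and `bias` values of every connection that minimise the fitness function.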

Memetic algorithm
Memetic algorithm is one of the methods we used in this study to tackle the optimisation problem defined in the previous section. Memetic algorithm is a blend of global and local search heuristics that combines exploration and exploitation search processes in order to generate high-quality solutions (Nguyen & Sudholt, 2018). The main differentiating factor between memetic algorithm and genetic algorithm is that memetic algorithm mimics the evolution of the cultural environment, rather than mimicking the natural evolution of genes. This capability enables memetic algorithm to lend itself to malware detection, since malware, seeking to avoid detection, is capricious in its attributes and its interactions with the environment (Acarali, Rajarajan, Komninos & Zarpelao, 2019).
The memetic algorithm exploitation function is executed by the local search technique, with a view to forestalling premature convergence of the search process, and thus yielding an optimal result. The local search scours the neighbourhood regions of the solution for a better solution, based on fitness values for mutated solutions. Due to this strength and other benefits, memetic algorithm has been extensively applied to solve various real-world complex problems (Chaimanee & Supithak, 2018) that have become increasingly difficult to deal with through single and non-dynamic optimisation techniques. As indicated by Gu et al. (2019), hybrid algorithms provide a good balance between diversification and intensification of the search, enhancing the quality of the search process as well as the outcome.
In the model we developed and tested, adaptive mutation and recombination operators are applied to the memetic algorithm, in order to create better offspring configurations, which are essentially hybrid solutions composed of two existing parent configurations. The features from both parent chromosomes (i.e., individual solutions in the sample) are extracted and combined using the recombination operator through individual interactions and cooperation. The mutation operator is then applied to build new features, with a view to forming more robust solutions (Bereta, 2019). The mutation operator helps in calibrating the diversification levels of the population sample, e.g., if there is low diversification, new features are injected by increasing mutation. Figure 1 illustrates this memetic algorithm procedure.
The population is randomly constructed in Algorithm 1, and one of the essential features of memetic algorithm is application of the local search technique. The local search procedure implemented in this work is the tabu search metaheuristic, and one of the prominent hallmarks of tabu search is its use of the adaptive memory function to store solutions as the search progresses through various iterations. This is important because it makes the information readily available for decision-making at any point in the course of the search (Lucay, Galvez & Cisternas, 2019). The search can then be strategically directed to promising areas where optimal solutions are most likely to be found, based on search information collected by tabu search. Figure 2 illustrates this tabu search procedure. In the tabu search algorithm, solution cycling is prevented via continual improvements until there is attainment of local optimality. The tabu search procedure in Algorithm 2 depicts process steps for local exploitation of the search space with a view to forestalling premature convergence.
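The interplay between recombination, adaptive mutation and tabu-based local search described above can be sketched on a toy problem. This is a minimal illustration rather than the authors' Algorithm 1 or Algorithm 2: the bit-string encoding, the placeholder objective (count of ones), the diversity measure, uniform crossover, and all parameter values are assumptions chosen to keep the example short.

```python
import random
from collections import deque


def fitness(bits):
    # Placeholder objective (number of ones); stands in for solution quality.
    return sum(bits)


def tabu_search(start, iterations=20, tenure=5):
    """Flip one bit per step, keeping a short memory of recent moves."""
    current, best = list(start), list(start)
    tabu = deque(maxlen=tenure)  # recently flipped positions
    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:]
            neighbour[i] = 1 - neighbour[i]
            # Aspiration: a tabu move is allowed if it beats the best so far.
            if i not in tabu or fitness(neighbour) > fitness(best):
                candidates.append((fitness(neighbour), i, neighbour))
        if not candidates:
            break
        _, move, current = max(candidates, key=lambda c: c[0])
        tabu.append(move)
        if fitness(current) > fitness(best):
            best = current[:]
    return best


def diversity(population):
    """Fraction of distinct individuals in the population."""
    return len({tuple(p) for p in population}) / len(population)


def recombine(parent_a, parent_b, rng):
    """Uniform crossover: each gene is taken from one of the two parents."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]


def mutate(bits, rate, rng):
    return [1 - b if rng.random() < rate else b for b in bits]


def memetic_algorithm(n_bits=16, pop_size=10, generations=15, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Adaptive mutation: inject more new features when diversity is low.
        rate = 0.2 if diversity(population) < 0.5 else 0.05
        offspring = []
        for _ in range(pop_size):
            p1, p2 = rng.sample(population, 2)
            child = mutate(recombine(p1, p2, rng), rate, rng)
            offspring.append(tabu_search(child))  # local refinement
        # Keep the fittest individuals among parents and offspring.
        population = sorted(population + offspring,
                            key=fitness, reverse=True)[:pop_size]
    return population[0]
```

In a malware detection setting, the bit string would instead encode a candidate network (for example, its weights or selected features), and `fitness` would be replaced by the error-based objective from the problem definition.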
The procedure exemplifies dynamic local search optimisation, where continual update of the tabu list is conducted. This lends itself to malware detection and search environments, especially in today's environments where malware evolves rapidly. The application of tabu search in optimisation is guided by weightage to determine the penalty severity of constraint violations (Dai, Cheng & Guo, 2018). Hard constraints, which are constraints that must be satisfied and applied, carry high weights, and thus large penalties in the case of violations, while soft constraints carry lower weights and smaller penalties.
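The weighting of hard and soft constraints can be illustrated with a simple penalised cost. The additive form and the two weight values are assumptions for illustration only; the article does not give the weights it used.

```python
HARD_WEIGHT = 1000.0  # assumed: large penalty, violations are effectively ruled out
SOFT_WEIGHT = 1.0     # assumed: small penalty, violations are tolerated at a cost


def penalised_cost(base_cost, hard_violations, soft_violations,
                   hard_weight=HARD_WEIGHT, soft_weight=SOFT_WEIGHT):
    """Add weighted penalties for constraint violations to a solution's cost."""
    return (base_cost
            + hard_weight * hard_violations
            + soft_weight * soft_violations)
```

With weights this far apart, a single hard violation outweighs many soft ones, which steers the tabu search away from infeasible neighbours first.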

Neural network ensemble
A neural network is a data-based network that maps inputs to output patterns, or processes inputs to generate output through training processes. The neural network architecture mimics the natural functioning of the brain, in which billions of interconnected neurons transmit signals to each other to generate activity or action for the various functions of the body (Shapshak, 2018). In order to apply this natural phenomenon to computing, artificial neural networks are designed to simulate the neuron structure and processes of the brain, with a view to creating cognitively intelligent computer systems that can be deployed to solve complex problems (Jat, Dhaka & Limbo, 2018). Neural networks go through training processes to enable them to learn and develop the required capability and mastery to solve various problems. The learning of the neural network is based on the way information travels through networks, as propagated by neurons. The neurons are connected to one another and assigned weight values to indicate the importance and value of each connection. The neural network observes and learns the information flow, and influence, of the neurons. This can be achieved by feed-forward, where learning is performed in a forward sequential format from inputs to outputs, or by back-propagation, where the optimisation procedure is reversed, starting from the actual outputs and comparing them with the expected output, so as to adjust the connection weights with a view to decreasing the error. Figure 3 illustrates the functioning of a neural network ensemble. A neural network ensemble represents a blend of individual neural networks that is aimed at combining various models to generate an environment with strong generalised capacity and minimal error (Li et al., 2018).
The mathematical formulation of the neural network that we developed and tested is as follows: From Figure 3, let , and represent neural networks 1, 2 and 3, such that , and Let denote the weights and be the network that represents the combined output of neural networks, and . The activation function is used to map inputs to output based on the weights of connections for inputs as well as biases between neurons (Eger, Youssef & Gurevych, 2018), such that . The neural network ensemble in Figure 3 is therefore presented as follows.
where , and are the activation function, the weight and bias vectors respectively. The bias factor helps to influence the outcome of the neural network, as well as its behaviour by determining the triggering value of the activation function, , hence acting as an anchor to the network.
This implies that with this additional parameter, the behaviour of the neural network can be adjusted with a view to achieving optimal learning and performance. In order to get an optimal neural network ensemble, the optimisation of the individual neural networks is vital (Ju, Bibaut & Van der Laan, 2018), and to this end, a memetic algorithm is deployed in this work to optimise the search process by exploitation and exploration of the search space, so as to generate high-quality trained neural networks that can compose a robust ensemble network.
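The idea of combining three member networks into one ensemble output through weights, a bias and an activation function can be sketched as below. This is an assumed form, since the ensemble equation is not legible here: each member is reduced to a single sigmoid neuron as a stand-in for a trained network, and the combiner is a weighted sum of member outputs passed through the same activation.

```python
import math


def sigmoid(z):
    """Example activation function (assumed for illustration)."""
    return 1.0 / (1.0 + math.exp(-z))


def make_network(weights, bias):
    """Return a single-neuron stand-in for one trained member network."""
    def network(x):
        return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
    return network


def ensemble_output(networks, combiner_weights, combiner_bias, x):
    """Combine member outputs with a weighted sum, a bias and an activation."""
    member_outputs = [net(x) for net in networks]
    z = sum(w * o for w, o in zip(combiner_weights, member_outputs))
    return sigmoid(z + combiner_bias)
```

The `combiner_bias` shifts the point at which the activation fires, which is the anchoring role of the bias described above.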

Experiments
The experiments are conducted on an Intel Core i3-4005U @ 1.70 GHz (4 CPUs) with 8 GB RAM and a 64-bit operating system. The R and MATLAB environments are used for the neural network ensemble implementations and analysis. The neuralnet library in R is utilised to train the neural networks. The environments are also used to perform memetic optimisation, where global search and local search improvements are done using genetic and tabu search algorithms respectively. A stacking approach is used to combine classifiers, so as to synthesise estimate outputs from various neural networks and achieve high levels of accuracy (Ma, Wang, Gao, Wang & Khalighi, 2018). A single outcome is then produced for the neural network ensemble. The optimisation threshold in the neural net is set at 0.1, such that if the , the optimisation will automatically stop.
The datasets used are obtained from the Center for Machine Learning and Intelligent Systems (2016). The training datasets are multivariate, with both malicious and benign features labelled as -1 and +1 respectively. Features are extracted, and data is divided into testing and training sets. The algorithm is first trained on the training dataset before conducting actual tests, so as to be able to detect malware. The mathematical formulation for the training is as follows: Let be the learning rate at , which controls the rate at which weights update, for each training epoch. Feature selection is conducted by memetic algorithm, which helps to avoid premature convergence and to reduce data dimensionality and computational resource use, in order to achieve faster convergence. The neural network ensemble using a memetic algorithm (NNE-MA) is compared with well-known optimisation techniques on a similar set of datasets. The techniques are genetic algorithm (GA) (Amjad et al., 2018), ant colony optimisation (ACO), and particle swarm optimisation (PSO) (Liu, Li & Zhu, 2019). The neural network ensemble is then combined with each of these techniques for feature selection optimisation, which results in, respectively, the NNE-GA, NNE-ACO, and NNE-PSO convergence comparisons, as shown in Figure 4. The training error as defined in Equation (9) (presented earlier, in section 3) represents the difference between the current output and the desired output, as per the labels. The actual output is defined in Equation (11) (presented earlier, in section 4), based on weights and bias of neural networks. The error that is generated is measured by the difference between the expected and obtained outputs based on the specific threshold value.
The neural network ensemble using a hybrid search (NNE-MA) produces the least error across all epochs, as shown in Figure 5. This can be ascribed to the improvement in the search mechanism embedded in the memetic algorithm to improve the fitness of solutions, as well as to provide a balance between intensification and diversification of the search. The tabu search, as presented in section 4, is applied to the solutions in the sample, so as to search the neighbourhood of each solution for fitter individuals.
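The role of the learning rate in each training epoch can be illustrated with a minimal gradient-descent sketch. It assumes a single linear neuron trained on a squared-error loss with the -1/+1 labels mentioned above; the function names, the toy samples and the learning rate value are illustrative and are not taken from the experiments.

```python
def predict(weights, bias, inputs):
    """Output of a linear neuron (assumed model for this sketch)."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias


def train_epoch(weights, bias, samples, learning_rate):
    """One pass over the samples, updating weights after each one."""
    for inputs, target in samples:
        error = predict(weights, bias, inputs) - target
        # Gradient of 0.5 * error**2 with respect to each weight is error * x.
        weights = [w - learning_rate * error * x
                   for w, x in zip(weights, inputs)]
        bias = bias - learning_rate * error
    return weights, bias


def mean_squared_error(weights, bias, samples):
    return sum((predict(weights, bias, x) - t) ** 2
               for x, t in samples) / len(samples)


# Toy samples: feature vector and label (+1 or -1), for illustration only.
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0)]
weights, bias = [0.0, 0.0], 0.0
error_before = mean_squared_error(weights, bias, samples)
weights, bias = train_epoch(weights, bias, samples, learning_rate=0.1)
error_after = mean_squared_error(weights, bias, samples)
```

A larger learning rate moves the weights further per sample, which can speed up convergence but may overshoot; a smaller one is more stable but needs more epochs.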

Figure 5: Error comparisons between NNE-MA and NNE with existing algorithms
Once a better solution is obtained, the current individual is replaced with the new solution and this ultimately leads to more accuracy and better quality of the final output for the neural network ensemble, which is demonstrated in Figure 5.
The statistical output in Figure 6 demonstrates a low error mean for the proposed algorithm, compared to other methods. There is also less variability in the data when NNE-MA is applied, as shown by comparisons of the standard deviation. N represents the dataset sample size. The degrees of freedom (the valid sample size minus one) and the t statistic (the ratio of the mean difference to its standard error) are represented by df and t respectively, as shown in Figure 7. Based on the t distribution, the p-value in Sig. (2-tailed) is less than the 0.05 threshold for determining the significance of the results (Shaffer, 2019). It can therefore be inferred that there is a significant difference between the experimental results generated by the algorithms in this work.
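The quantities df and t can be computed as in the sketch below. It assumes a one-sample t-test on a list of error values; the article does not state which t-test variant produced Figures 6 and 7, and the function name is illustrative.

```python
import math
import statistics


def one_sample_t(sample, hypothesised_mean=0.0):
    """Return the t statistic and degrees of freedom for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample standard deviation (n - 1)
    standard_error = sd / math.sqrt(n)
    t = (mean - hypothesised_mean) / standard_error
    return t, n - 1
```

The two-tailed p-value is then read from the t distribution with `df` degrees of freedom, for example with a statistics package, and compared against the 0.05 threshold.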

Conclusions
In this article, we have made the case for a neural network ensemble, based on a hybrid search mechanism, for malware detection. The approach combines global search and local search heuristics, through a memetic evolutionary search process. The tabu search algorithm is used as the local search technique, to improve the quality and fitness of solutions through scouring the neighbourhood of each solution for better individuals. After training the model on malware datasets to learn both benign and malicious features, the proposed model is able to detect malicious software and achieve faster convergence when compared with existing techniques. In addition, a proper balance between diversification and intensification of the search is achieved, which enables the algorithm to achieve strong accuracy levels. The experimental results we have presented in this article thus indicate that combining several neural networks in an ensemble generates strong performance, especially when a memetic algorithm is applied to develop solutions and produce optimal outcomes. Future work will include creating more algorithmic synergies to improve the ability of the search technique to converge towards high-quality solutions, which is necessary in today's rapidly changing and increasingly risk-ridden cyberspace environments.