M van Dijk^{I, }^{*}; SJ van Vuuren^{I}; JE van Zyl^{II}
^{I}Department of Civil & Biosystems Engineering, University of Pretoria, Pretoria 0001, South Africa
^{II}Department of Civil Engineering Science, University of Johannesburg, PO Box 524, Auckland Park 2006, South Africa
ABSTRACT
Genetic algorithms (GAs) have become the preferred water system design optimisation technique for many researchers and practitioners. The main reason for using GAs is their ability to deal with nonlinear, complex optimisation problems. An optimal decision on the design, expansion, extension or rehabilitation of, or additions to, water supply systems requires that the possible options be reviewed and a cost-effective and efficient solution selected. This paper presents a new approach to determining a penalty value that depends on the degree of failure to meet the set pressure criteria and on the importance of the link supplying a specific node. Further modifications are made to the crossover and mutation procedures to improve algorithm convergence. EPANET, a widely used water distribution network simulation model, is used in conjunction with the newly developed GA for the optimisation of water distribution systems. The developed GA procedure has been incorporated in a software package called GANEO, which can be used to design new networks, analyse existing networks and prioritise improvements to existing networks. The developed GA has been tested on several international benchmark problems and has proved to be efficient and robust. Both the EPANET hydraulic modelling software and the developed GANEO software, which performs the optimisation of the water distribution network, are freeware. The software provides a tool for consulting engineers to optimise the design or rehabilitation of a water distribution network.
Keywords: optimising, water distribution system, genetic algorithm
Introduction
As a vital part of water supply systems, water distribution networks represent one of the largest infrastructure assets of industrial society. Simulation of hydraulic behaviour within a pressurised, looped pipe network is a complex task, which effectively means solving a system of nonlinear equations. The South African objective to provide 'water for all' makes it essential that the limited capital be employed in such a way as to provide the maximum benefit. Optimising even a relatively small system requires numerous repetitive calculations. The discrete nature of the network optimisation problem (pipe diameters) and the size of the solution space make it virtually impossible to apply any of the conventional optimisation techniques to find the global optimum.
The development of hydraulic models in the last two decades improved the ability to simulate the hydraulic behaviour of large water distribution networks (Rossman, 2000). Most optimisation techniques are applicable only when continuous variables are evaluated. GAs are applicable when a large solution space consisting of discrete variables has to be searched, and it is now accepted by most experts that the GA is the best technique for network optimisation (Van Vuuren et al., 2005).
According to Michalewicz (1994), GAs can basically be described as artificial evolution search methods based on the theories of natural selection and mechanisms of population genetics. GAs apply the principle of survival of the fittest in a mathematical sense. In the evolutionary process, all species develop in such a way to improve their chances of survival and quality of living. It therefore means that all species are striving to a certain optimum, being physical, behavioural or otherwise. In this paper a traditional GA is developed which incorporates a unique penalty structure and tailored crossover and mutation procedures.
Problem formulation
The difficulty of optimising water distribution systems is mainly due to the discrete nature of the variables and the size of the solution space. Many optimisation techniques can only be applied to problems that have continuous variables, unlike the pipe diameter variables in the network optimisation problem. The size of the solution space (the total number of possible solutions to the problem) for the network optimisation problem can be calculated as the number of possible discrete pipe diameters to the power of the number of pipes in the network.
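The size calculation described above can be illustrated with a minimal Python sketch. The function name `solution_space_size` is hypothetical and chosen only for this illustration.

```python
def solution_space_size(num_diameters: int, num_pipes: int) -> int:
    """Return the number of candidate designs: diameter options ** pipe count."""
    if num_diameters < 1 or num_pipes < 0:
        raise ValueError("need at least one diameter option and a non-negative pipe count")
    return num_diameters ** num_pipes


# Hypothetical small network: 4 diameter options for each of 3 pipes.
print(solution_space_size(4, 3))

# 16 options for each of 21 pipes gives a space far too large to enumerate.
print(f"{solution_space_size(16, 21):.3e}")
```

Python integers have arbitrary precision, so the count is exact even for large networks.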
A major problem for water supply authorities is to select those components in a network that should be changed, increased or replaced to ensure a sustainable service to the consumers at the lowest cost. Water supply authorities are also faced with problems other than that of network design such as network calibration, operation and reliability. For each type of optimisation, the main objective function, possible variables and the main constraints are summarised in Table 1.
A water supply distribution system consists of a complex network of interconnected pipes, service reservoirs and pumps that deliver water from the treatment plant to a consumer. The distribution of water through the network is governed by complex, nonlinear, nonconvex and discontinuous hydraulic equations (Keedwell and Khu, 2005).
Two equations, which are used to determine if a network is hydraulically balanced, are the continuity and energy equations (Eqs. (1) and (2) respectively).
The continuity equation is applied to each node with q_{i} the flow rate (in and out of the node) and n the number of pipes joined at the node.
The energy equation is applied to each loop in the network with h_{i} the head loss in each pipe and m the number of pipes in the loop. The head loss is the sum of the local head losses and the friction head losses. The friction head loss can be calculated using the Darcy-Weisbach (Eq. (3)) or Hazen-Williams (Eq. (4)) empirical equations or something similar, where:
L = length (m)
D = internal diameter (m)
V = velocity (m/s)
g = gravitational acceleration (m/s^{2})
λ = Darcy-Weisbach friction factor
C = Hazen-Williams friction factor
ω = numerical conversion constant (in this paper ω = 10.667)
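For reference, commonly used forms of Eqs. (1) to (4) that are consistent with the symbols defined above can be written as follows. This is a sketch under stated assumptions: h_f (friction head loss, m) and Q (flow rate, m^{3}/s) are symbols introduced here for illustration, flows into a node are taken as positive and flows out as negative, and the Hazen-Williams exponents 1.852 and 4.871 are the values commonly paired with ω = 10.667 in SI units rather than values confirmed by the text.

```latex
\begin{align}
\sum_{i=1}^{n} q_i &= 0 \tag{1}\\
\sum_{i=1}^{m} h_i &= 0 \tag{2}\\
h_f &= \lambda \,\frac{L}{D}\,\frac{V^{2}}{2g} \tag{3}\\
h_f &= \frac{\omega \, L \, Q^{1.852}}{C^{1.852}\, D^{4.871}} \tag{4}
\end{align}
```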
The network can be hydraulically balanced utilising Eqs. (1) to (4) and methods such as the Hardy-Cross and nodal methods. The aim of designing a new network is to obtain a system that will meet the demand at each node at a required minimum pressure. Similarly, for the rehabilitation or improvement of a network, the optimum system is achieved when the components to be replaced or rehabilitated are identified such that the system provides the required level of service.
Genetic algorithm
Genetic algorithms (GAs) were developed (Holland, 1975) to assist in searching through complex solution spaces for the optimum solution. GAs have been applied as search techniques for various engineering problems such as structural design optimisation, water distribution network evaluation, pump scheduling, hydrological runoff prediction and resource utilisation. According to Michalewicz (1994), GAs can basically be described as artificial evolution search methods based on the theories of natural selection and mechanisms of population genetics. GAs emulate nature's optimisation technique of evolution, based on:

Survival and reproduction of the fittest members of the population

The maintenance of a population with diverse members
The inheritance of genetic information from parents

The occasional mutation of genes.
A GA evolves optimal solutions by sampling from all the possible solutions. The best of these solutions are then combined, using the genetic operators of crossover and mutation, to form new solutions. The identification of these best solutions is done based on a set objective function. This process continues until some termination condition is fulfilled. A flow diagram of the basic GA process is given in Fig. 1.
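The basic process described above (evaluate, select, cross over, mutate, repeat until a termination condition is met) can be sketched in Python as follows. This is a generic, minimal GA and not the GANEO implementation: the tournament selection, single-point crossover, elitism and all names (`run_ga`, `tournament`) are assumptions made for illustration, and the fitness function is minimised.

```python
import random


def tournament(population, fitness, rng):
    """Pick two random members and return the fitter (lower-cost) one."""
    a, b = rng.sample(population, 2)
    return a if fitness(a) <= fitness(b) else b


def run_ga(fitness, num_genes, num_options, pop_size=20, generations=50,
           mutation_rate=0.03, seed=1):
    """Minimise `fitness` over strings of `num_genes` integers in [0, num_options)."""
    rng = random.Random(seed)
    # Initial population: random strings of gene indices.
    population = [[rng.randrange(num_options) for _ in range(num_genes)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):  # termination: fixed number of generations
        children = [best[:]]      # elitism: carry the best string over unchanged
        while len(children) < pop_size:
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            cut = rng.randrange(1, num_genes)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Occasional mutation: replace a gene with a random option.
            child = [rng.randrange(num_options) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
        best = min(population, key=fitness)
    return best


# Toy objective (hypothetical): the "cost" of a string is the sum of its gene indices.
print(run_ga(sum, num_genes=8, num_options=4))
```

In a network design setting each gene would index a commercial pipe diameter and the fitness call would wrap a hydraulic simulation; here a toy objective keeps the sketch self-contained.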
The objective function in the optimisation of a water distribution system is usually the minimisation of total cost. The total actual cost is a combination of the capital costs and the operating and maintenance costs. In this paper only the capital cost of the pipes (supply, lay and jointing) is considered and hence it can be generally expressed as:
f_{cost}(D) = \sum_{j} K D_{j}^{n} L_{j}     (5)
The pipe cost per unit length usually varies nonlinearly with diameter, and in Eq. (5) it is assumed that it can be expressed as a single term for all diameters, where f_{cost}(D) is the cost of the pipes, L_{j} and D_{j} are the length and diameter of the j^{th} pipe and K and n are constants that depend on local conditions (Vairavamoorthy and Ali, 2000).
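Assuming Eq. (5) takes the single-term form implied by the description above, namely the sum over all pipes of K·D_{j}^{n}·L_{j}, the capital cost of a candidate design can be sketched in Python as follows. The function name `pipe_capital_cost` and the example values are hypothetical.

```python
def pipe_capital_cost(lengths, diameters, k, n):
    """Sum k * D_j**n * L_j over all pipes (assumed form of the cost function)."""
    if len(lengths) != len(diameters):
        raise ValueError("lengths and diameters must have the same number of entries")
    return sum(k * d ** n * length for length, d in zip(lengths, diameters))


# Hypothetical two-pipe design: lengths in m, diameters in mm, illustrative constants.
cost = pipe_capital_cost(lengths=[1000.0, 500.0], diameters=[300.0, 150.0],
                         k=0.01, n=1.5)
print(round(cost, 2))
```

The units of the result follow from K, which must be chosen to match the units used for diameter and length.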
Genetic algorithm optimisation model
GAs can be used to optimise various parameters in water distribution systems. Various types of optimisation can be identified for water distribution systems as shown in Table 1. In many cases the types and subsequent objectives of optimisations are in conflict with each other. For example, attempts to minimise operational cost will generally place the system in a more vulnerable state, less able to handle abnormalities such as pipe bursts, thus reducing the level of service (Jowitt et al., 1988). In such cases it is necessary to strike a balance between the objectives. This can be done either by defining the balance beforehand (for example using an objective function equally weighted for cost and level of service) or by using multi-objective optimisation. In this paper only a single objective, in this case minimising cost, will be set for a water distribution system (new or upgrading/rehabilitation). The genetic algorithm process with its newly developed penalty function is summarised in Table 2.
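To illustrate how a single cost objective can be combined with a pressure constraint, the Python sketch below adds a penalty proportional to the total pressure deficit. It is a generic penalty formulation and is not the weighted penalty function summarised in Table 2, which also accounts for the importance of the supplying link; the function name `penalised_cost` and all values are assumptions for illustration.

```python
def penalised_cost(capital_cost, node_pressures_m, min_pressure_m, penalty_factor):
    """Return capital cost plus a penalty proportional to the total pressure deficit."""
    # Only nodes below the minimum pressure contribute to the deficit.
    deficit = sum(max(0.0, min_pressure_m - p) for p in node_pressures_m)
    return capital_cost + penalty_factor * deficit


# Hypothetical design: two of three nodes fall below a 30 m requirement.
print(penalised_cost(100.0, [31.0, 29.0, 25.0], min_pressure_m=30.0,
                     penalty_factor=10.0))
```

A feasible design incurs no penalty, so the GA ranks feasible designs purely on cost while infeasible ones are pushed down the ranking in proportion to how badly they fail.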
Benchmarking the model
The developed genetic algorithm optimisation model was tested against benchmark problems in order to establish its functionality and efficiency. Software was developed (called GANEO) and used to test common networks for which many optimisations have been performed; these include traditional and heuristic methods (Savic and Walters, 1997). Three systems were tested: New York Tunnels (Example 1), Hanoi (Example 2) and the Two-loop network (Example 3), detailed below.
In all three examples the Hazen-Williams equation (Eq. (4)) is utilised to determine the friction loss in a pipe link between two nodes. Previous researchers have investigated these systems extensively and obtained numerous solutions that met the defined fitness function of minimum cost based on the constraint of required pressure and demand at every node. Due to different interpretations of the Hazen-Williams equation, different researchers obtained different solutions (Savic and Walters, 1997), which made direct comparison not always possible. According to Savic and Walters (1997) the numerical conversion constant (ω), see Eq. (4), varied from 10.5088 to 10.9031. The consequence of this variation is that systems designed with ω = 10.5088 calculate a smaller friction head loss than those designed with ω = 10.9031. As a result, some solutions meet the set pressure criteria when analysed with the lower boundary of ω but fail when reanalysed with the upper boundary, and are thus infeasible. The value of ω used in the EPANET software, which was used in the hydraulic analyses, is 10.667 (similar to Dandy et al. (1996), Montesinos et al. (1999), Wu and Simpson (2002) and Keedwell and Khu (2005)).
All three benchmark problems were analysed utilising a Pentium computer with a 3.2 GHz Intel processor and 1024 MB RAM, using Microsoft Windows XP Professional as operating system. The GANEO software utilised, an example of an improvement to a distribution network (Example 1) and two examples of the design of new networks (Examples 2 and 3) are described in the following paragraphs.
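The sensitivity to ω discussed above can be sketched in Python. The head loss expression assumes the common SI form of the Hazen-Williams equation, h_f = ω·L·Q^{1.852} / (C^{1.852}·D^{4.871}); the exponents, the function name `hazen_williams_head_loss` and the pipe data are assumptions for illustration and are not taken from the benchmark tables.

```python
def hazen_williams_head_loss(length_m, diameter_m, flow_m3s, c_factor, omega=10.667):
    """Friction head loss (m) for one pipe, assuming the common SI Hazen-Williams form."""
    return (omega * length_m * flow_m3s ** 1.852
            / (c_factor ** 1.852 * diameter_m ** 4.871))


# Hypothetical pipe: 1 000 m long, 0.5 m diameter, 0.2 m3/s, C = 130.
for omega in (10.5088, 10.667, 10.9031):
    loss = hazen_williams_head_loss(1000.0, 0.5, 0.2, 130.0, omega=omega)
    print(f"omega = {omega}: head loss = {loss:.3f} m")
```

Because the head loss scales linearly with ω, a design that only just satisfies the minimum pressure at the lower value of ω can fall below it at the upper value.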
GANEO software
The developed GA optimisation model as described above required the development of a software program to perform the computations. The developed program called GANEO requires six easy steps to optimise a water distribution system:
Step 1: Using EPANET, create a working network model and export the water distribution system in the correct format for importing into GANEO (*.inp file).
Step 2: Create a new project in GANEO and import the EPANET model (see Fig. 2).
Step 3: Create a pipe selection file in GANEO from which pipes (genes) will be selected for population of the network (strings).
Step 4: Set the GA parameters (crossover method, mutation percentage, etc.) and boundaries (penalties) in GANEO.
Step 5: Run the GA optimisation analysis procedure (see Fig. 3).
Step 6: Evaluate the alternative options and export the result back to EPANET.
Example 1: New York tunnels
The New York water supply system has been studied by a number of researchers in the pipe network optimisation field. The aim of the various studies was to determine the most economical design to improve the existing system of tunnels that constituted the main water distribution system. Figure 4 is a general layout of the system indicating the pipes (Table 3), nodes (Table 4) and single supply reservoir. The existing system was found to be inadequate due to ageing and an increase in demands, in terms of the pressure requirements.
The method utilised to improve the system was to lay parallel pipes between certain nodes. There are 15 available commercial diameters (Table 5) that could be used, as well as a so-called 'do nothing' option, for each of the 21 pipes in the system. The cost per unit length as shown in Table 5 is similar to other researchers' defined cost functions (see Eq. (5) with L = 1 m, K = 0.06537, n = 1.24 and D measured in mm).
A Hazen-Williams friction factor of C_{H} = 100 is assumed for both the old tunnels and the new pipes. It has been indicated that the conversion from imperial units to metric units could result in small differences and subsequent changes in optimum solutions (Savic and Walters, 1997).
The only system constraint that this network has is that the minimum head at each node should be as indicated in Table 4. Although the system is fairly unsophisticated and small in comparison to the internal distribution system, there are 16^{21}, or approximately 1.93 × 10^{25}, possible solutions for this system. It is thus impossible to analyse every single network improvement alternative.
The optimum obtained with the proposed GA procedure as used in the GANEO program was $M38.65. The number of iterations (generations) it took to obtain this optimum was 684. The convergence to the optimum is shown in Fig. 5. As can be seen, the initial reduction in total cost of the system occurs rapidly, after which the improvement process slows down.
The nodes with pressures closest to the minimum required are Nodes 16, 17 and 19, with pressure elevations of 79.26 m, 83.16 m and 77.73 m respectively. The five best solutions, based on randomly selected initial seed values, and their costs are shown in Table 6. The following GA parameters were used: number of generations = 1 000, population = 100, penalty factor = 5 to 6 and mutation rate = 3%.
These analyses in Table 6 compare favourably with the average cost of $M39.57 after 44 280 evaluations obtained by Montesinos et al. (1999) using different parameter sets and constraints. A direct comparison of the optimum solution obtained with the developed GA procedure and that obtained by other researchers is shown in Table 7.
As indicated by Savic and Walters (1997), GAs are stochastic search techniques: the solution found will not always be the same, and a number of runs are required to ensure that the solutions identified are of good quality.
Example 2: Hanoi network
The Hanoi system, Vietnam, is a new network that should be optimised. It has a single fixed head source at an elevation of 100 m. There are 34 pipes and 32 nodes as shown in Fig. 6 and listed in Tables 8 and 9.
The ground elevation of all the nodes is 0.0 m. The pipes that could be utilised in the design of the system are shown in Table 10.
A Hazen-Williams friction factor of C_{H} = 130 is assumed for all new pipes.
The head constraint for this system is 30 m, i.e. the pressure everywhere in the system should be greater than 30 m. The costs as indicated in Table 10 were determined with the cost function (Eq. (5)) with K = 0.008593, n = 1.5 and D measured in millimetres.
The Hanoi network has been optimised by other researchers as shown in Table 11. Fujiwara and Khang (1990) did not use discrete values for the pipe diameters and for this reason direct comparison with their solution is not possible.
The following GA parameters were used: number of generations = 2 000, population = 100, penalty factor = 2 and mutation rate was set at 3%. The optimum obtained with the GANEO program was $M6.110. The number of iterations (generations) it took to obtain this optimum was 495 and it was achieved after 324 s.
According to Savic and Walters (1997) the solution of Fujiwara and Khang (1990) is based on a continuous cost function solution since their method could not handle discontinuous objective functions directly. Furthermore, when the Fujiwara and Khang (1990) solution is reanalysed with the range values of ω = 10.5088 and ω = 10.9031, it does not meet the minimum 30 m pressure requirement, with pressures as low as 10.31 m and 7.69 m respectively.
As can be seen in Fig. 7, the initial reduction in total cost of the system occurs rapidly, after which the improvement process slows down.
Vairavamoorthy and Ali (2000) obtained the optimum solution after approximately 25 min. The analysis of Liong and Atiquzzaman (2004) took approximately 11 min, whilst the authors obtained the optimum at generation 495 after 6.4 min. The minimum pressure in the system was 30.045 m at node 32.
As detailed in Table 2, a new weighted penalty function is proposed by the authors. Wu and Walski (2005) provide a comparison of various constraint-handling techniques and this is reproduced in Table 12 (including the authors' results).
As can be seen in Table 12, the proposed GA produced satisfactory results compared to others. Furthermore, when the optimum Wu and Walski solution (Table 11) is reanalysed with ω = 10.667 as used in EPANET, it does not meet the minimum pressure requirement of 30 m. Direct comparison is thus difficult, but the solutions obtained by the authors are competitive.
Example 3: Two-loop network
The two-loop network was first studied by Alperovits and Shamir (1977) and by many others thereafter (Keedwell and Khu, 2005). The two-loop network consists of eight pipes, which are fed from a single fixed head reservoir to supply the demands as shown in Fig. 8.
The pipes in the network are all 1 000 m long and the only system constraint is the minimum pressure requirement for nodes 2 to 7, defined as 30 m. The available pipe diameters and costs that could be used in the design of the system are shown in Table 13.
The objective was to determine the pipe diameters that would yield the least total cost whilst still supplying the demand and adhering to the system constraint of minimum pressure at each node. Although the system seems extremely simple there are still 14^{8} (approximately 1.48 × 10^{9}) possible combinations of pipes, i.e. the number of available diameters to the power of the eight pipes in the network.
Savic and Walters (1997) found two solutions consistently at 419 000 and 420 000 cost units (depending on ω) which satisfied the demand and pressure requirements. According to Keedwell and Khu (2005) the algorithm was run for 500 generations with a population size of 50, i.e. 25 000 network simulations. Keedwell and Khu (2005) also analysed the two-loop network using an approach that combines cellular automata (CA) with a genetic algorithm (GA). The Cellular Automaton for Network Design Algorithm combined with a GA (called CANDA-GA by its authors) can be described as seeding the GA with CA solutions, i.e. providing a better initial population to start with. The optimisation results of Savic and Walters (1997), Keedwell and Khu (2005) and the authors are presented in Table 14.
As discussed by Savic and Walters (1997), other researchers also provided solutions in this range, such as Kessler and Shamir (1989) with 402 352 and Eiger et al. (1994) with 417 500, but these solutions did not meet the minimum pressure requirement of 30 m at every node. Other researchers, such as Goulter et al. (1986) with a cost of 435 015 and Alperovits and Shamir (1977) with a cost of 497 525, obtained feasible solutions but utilised split-pipe solutions.
Keedwell and Khu (2005) reported that the fitness of both algorithms (GA and CANDA-GA) converged after 3 000 generations. If this is compared with what was obtained with GANEO, it can be seen that an optimum or near-optimum is obtained very quickly. The results furthermore show an improvement in the average total cost of the five runs (different initial seed values) that were performed. The worst result obtained with GANEO was 9 000 cost units better than that obtained by Keedwell and Khu (2005) with the standard GA and 2 000 cost units better than that obtained with CANDA-GA.
The optimum solution of 419 000 cost units is obtained if the pipes as listed in Table 16 are used, resulting in a minimum pressure in the system of 30.44 m at node 6. These results were obtained with the following GA parameters: number of generations = 1 000, population = 100, penalty factor = 1.5 and mutation rate = 10%.
Features of GANEO
The developed GA was implemented in the software package GANEO to test its ability. Some of the features that make the optimisation process more powerful are:
Initial seed number – setting of the initial seed number for the pseudo random generator

Selecting of crossover method – single point, double point or uniform

Selecting the mutation procedure and value – random or MinMax

Setting the termination criteria – number of generations, time limit or when no change in fitness occurs for a set number of generations
Setting of population size – four to one hundred

Penalty factors – the penalty factor can be set for nodes not meeting the minimum pressure requirement as well as a penalty factor for velocities which are greater than a specified value

Pipe/link fixing – when an existing network is analysed for rehabilitation purposes GANEO allows the fixing, restricting the changing or adding, of certain pipes. These pipes will thus be kept as is and won't be upgraded or improved although these will be included in the hydraulic analysis of the system. The reason for fixing a pipe is for instance when it is too costly or difficult to change/improve or if it is part of the existing system that must simply be analysed as part of the new extension of the network.
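The three crossover methods named in the feature list (single point, double point and uniform) can be sketched in Python as follows. These are textbook versions of the operators and not the tailored GANEO procedures; the function names and the 50% swap probability in the uniform operator are assumptions for illustration.

```python
import random


def single_point(p1, p2, rng):
    """Swap the tails of two parent strings after one random cut point."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


def double_point(p1, p2, rng):
    """Swap the middle segment between two random cut points (needs >= 3 genes)."""
    a, b = sorted(rng.sample(range(1, len(p1)), 2))
    return p1[:a] + p2[a:b] + p1[b:], p2[:a] + p1[a:b] + p2[b:]


def uniform(p1, p2, rng):
    """Swap each gene position independently with probability 0.5."""
    c1, c2 = [], []
    for g1, g2 in zip(p1, p2):
        if rng.random() < 0.5:
            g1, g2 = g2, g1
        c1.append(g1)
        c2.append(g2)
    return c1, c2


# Hypothetical parents: each gene is an index into a list of pipe diameters.
rng = random.Random(42)
parent_a, parent_b = [0, 0, 0, 0, 0, 0], [5, 5, 5, 5, 5, 5]
for operator in (single_point, double_point, uniform):
    print(operator.__name__, operator(parent_a, parent_b, rng))
```

Passing an explicitly seeded generator mirrors the initial seed number feature: the same seed reproduces the same sequence of crossover decisions.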
Conclusions
The developed genetic algorithm optimisation model was tested on three benchmark networks and has been shown to produce good results in a limited number of generations (in relation to other GA-based methods). The weighted penalty function produced satisfactory final results and showed faster initial convergence. The developed GANEO program can be used in the design and analysis of a new network as well as for providing suggestions on how to improve an existing network (adding pipes or replacing existing pipes). The EPANET software used for the hydraulic analysis of the systems is well-accepted, well-tested analysis software used in various other hydraulic analysis packages. Both the EPANET hydraulic modelling software and the developed GANEO software, which performs the optimisation of the water distribution network, are freeware. The software provides a tool for consulting engineers to optimise the design or rehabilitation of a water distribution network.
Further research is envisaged to include local search procedures once near optimum solutions are obtained.
Acknowledgements
The authors wish to thank the Water Research Commission for funding this research project.
References
ALPEROVITS E and SHAMIR U (1977) Design of optimal water distribution systems. Water Resour. Res. 13 (6) 885-900.
BHAVE PR (1985) Optimal expansion of water distribution systems. J. Environ. Eng. ASCE 111 (2) 177-197.
COLEY DA (2003) An Introduction to Genetic Algorithms for Scientists and Engineers. World Scientific Publishing Co, Singapore. 227 pp.
DANDY GC, SIMPSON AR and MURPHY LJ (1996) An improved genetic algorithm for pipe network optimization. Water Resour. Res. 32 (2) 449-458.
DEB K and AGRAWAL S (1999) A niched-penalty approach for constraint handling in genetic algorithms. Proc. of Int. Conf. on Artificial Neural Networks and Genetic Algorithms. Control Theory and Applications Centre, Country Univ., UK.
EIGER G, SHAMIR U and BEN-TAL A (1994) Optimal design of water distribution networks. Water Resour. Res. 30 (9) 2637-2646.
FUJIWARA O and KHANG DB (1990) A two-phase decomposition method for optimal design of looped water distribution networks. Water Resour. Res. 26 (4) 539-549.
GESSLER J (1985) Pipe network optimization by enumeration. Proc. Spec. Conf. on Comp. Applications/Water Resourc. ASCE, New York 572-581.
GOULTER IC, LUSSIER BM and MORGAN DR (1986) Implications of head loss path choice in the optimization of water distribution networks. Water Resour. Res. 22 (5) 819-822.
HOLLAND JH (1975) Adaptation in Natural and Artificial Systems (2^{nd} edn.). University of Michigan Press, Michigan, USA. 1992.
JOINES J and HOUCK C (1994) On the use of non-stationary penalty functions to solve nonlinear constrained optimization problems with GAs. Z Michalewicz, JD Schaffer, HP Schwefel, DB Fogel and H Kitano (eds.) Proc. 1^{st} IEEE Int. Conf. on Evolutionary Computing. IEEE Press, Piscataway, NJ. 579-584.
JOWITT P, GARRETT R, COOK S and GERMANOPOULOS G (1988) Real-time forecasting and control for water distribution. In: B Coulbeck and C Orr (eds.) Computer Applications in Water Supply. John Wiley & Sons, Letchworth, England. 329-355.
KEEDWELL E and KHU ST (2005) A hybrid genetic algorithm for the design of water distribution networks. Eng. Applic. Artificial Intell. 18 461-472.
KESSLER A and SHAMIR U (1989) Analysis of the linear programming gradient method for optimal design of water supply networks. Water Resour. Res. 25 (7) 1469-1480.
LIN LY and WU WH (2004) Self-organizing adaptive penalty strategy in constrained genetic search. J. Struct. Multidisciplinary Optim. 26 417-428.
LIONG SY and ATIQUZZAMAN M (2004) Optimal design of water distribution network using shuffled complex evolution. J. Inst. Eng. 44 (1) 93-107.
MICHALEWICZ Z (1994) Genetic Algorithms + Data Structures = Evolution Programs (2^{nd} edn.). Springer-Verlag, Berlin.
MORGAN DR and GOULTER IC (1985) Optimal urban water distribution design. Water Resour. Res. 21 (5) 642-652.
MONTESINOS P, GARCIA-GUZMAN A and AYUSO JL (1999) Water distribution network optimization using a modified genetic algorithm. Water Resour. Res. 35 (11) 3467-3473.
QUINDRY GE, BRILL ED and LIEBMAN JC (1981) Optimization of looped water distribution systems. J. Environ. Eng. Div. ASCE 107 (4) 665-679.
ROSSMAN L (2000) EPANET Users Manual. Environmental Protection Agency, Risk Reduction Engineering Laboratory, Cincinnati.
SAVIC D and WALTERS G (1997) Genetic algorithms for the least-cost design of water distribution networks. J. Water Resour. Plann. Manage. 123 (2) 67-77.
SCHAAKE JC and LAI D (1969) Linear Programming and Dynamic Programming Applications to Water Distribution Network Design. Report 116, Hydrodyn. Laboratory, Department of Civil Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts.
VAIRAVAMOORTHY K and ALI M (2000) Optimal design of water distribution systems using genetic algorithms. Computer-Aided Civil and Infrastructure Eng. 15 374-382.
VAN VUUREN SJ, VAN ROOYEN PG, VAN ZYL JE and VAN DIJK M (2005) Application and Conceptual Development of Genetic Algorithms for Optimisation in the Water Industry. WRC Report No. 1388/1/05. Water Research Commission, Pretoria, South Africa.
WU ZY and SIMPSON AR (2002) A self-adaptive boundary search algorithm and its application to water distribution systems. J. Hydraul. Res. 40 191-203.
WU ZY and WALSKI T (2005) Self-adaptive penalty approach compared with other constraint-handling techniques for pipeline optimization. J. Water Resour. Plann. Manage. 131 (3) 181-192.
* To whom all correspondence should be addressed.
Tel: +2712 420 3176; fax: +2712 362 5218;
e-mail: marco.vandijk@up.ac.za