Integration of Overall Equipment Effectiveness (OEE) and reliability method for measuring machine effectiveness

Abdul Samat, H.; Kamaruddin, S.; Abdul Azid, I.

South African Journal of Industrial Engineering

Online version ISSN 2224-7890
Print version ISSN 1012-277X

S. Afr. J. Ind. Eng. vol.23 no.1 Pretoria Jan. 2012

GENERAL ARTICLES

Integration of Overall Equipment Effectiveness (OEE) and reliability method for measuring machine effectiveness

H. Abdul Samat^{I, *}; S. Kamaruddin^{II, **}; I. Abdul Azid^III

^ISchool of Mechanical Engineering Universiti Sains Malaysia, Engineering Campus, Malaysia. hasnida_absamat@yahoo.com
^IISchool of Mechanical Engineering Universiti Sains Malaysia, Engineering Campus, Malaysia. meshah@eng.usm.my
^IIISchool of Mechanical Engineering Universiti Sains Malaysia, Engineering Campus, Malaysia. ishak@eng.usm.my

ABSTRACT

Maintenance is an important process in a manufacturing system. Thus it should be conducted and measured effectively to ensure performance efficiency. A variety of studies have been conducted on maintenance as affected by factors such as productivity, cost, employee skills, resource utilisation, equipment, processes, and maintenance task planning and scheduling [1,2]. According to Coetzee [3], equipment is the most significant factor affecting maintenance performance because it is directly influenced by maintenance activities. This paper proposes an equipment performance and reliability (EPR) model for measuring maintenance performance based on machine effectiveness. The model is developed in four phases, using Pareto analysis for machine selection, and failure mode and effect analysis (FMEA) for failure analysis processes. Machine effectiveness is measured using the integration of overall equipment effectiveness and the reliability principle. The result is interpreted in terms of maintenance effectiveness, using five health index levels as bases. The model is implemented in a semiconductor company, and the outcomes confirm the practicality of the EPR model as it helps companies to measure maintenance effectiveness.

OPSOMMING (SUMMARY)

Maintenance is an important process in a manufacturing environment. It must therefore be undertaken and managed effectively with a view to efficient performance. Several studies have already been undertaken to determine the impact on maintenance of factors such as productivity, cost, employee skills, resource utilisation, equipment, processes, and maintenance planning and scheduling [1,2]. According to Coetzee [3], equipment has the most significant impact on maintenance performance because it is directly influenced by maintenance activities. This article presents a model for equipment performance and reliability that can be used to measure maintenance performance on the basis of machine effectiveness. The model was developed in four phases, using a Pareto analysis for machine selection and FMEA for failure analysis. Machine effectiveness is measured using OEE and the reliability principle. The model is then applied in a semiconductor company, and the practical applicability of the model is illustrated.

1. INTRODUCTION

Maintenance is done to ensure that machines are in a good condition, serviceable, and operationally safe for producing quality products. Chan et al. [4] reported that 15% to 40% of the total production cost is attributed to maintenance activities. However, up to 33% of this cost is spent unnecessarily [5]. This wastage shows that effective maintenance and equipment reliability can help companies to reduce waste and improve productivity without investing in costly equipment and systems [6]. Waste is accrued in maintenance costs because of failures in maintenance activities, such as using the wrong maintenance techniques, assigning underskilled workers to such tasks, and using fake spare parts. It is also caused by negligence in determining machine specifications and operational safety features that may contribute to the overall utilisation of the machines. This practice reduces machine reliability and sometimes causes dangerous accidents.

Maintenance activities require continuous monitoring, control, and measurement to determine performance levels. Performance measurement is a means of quantifying the effectiveness and efficiency of action [7]. Measurement provides a means of capturing performance data, which can be used to aid decision-making and the formulation of plans for improvement. Tangen [8] stated that performance measurements are often used to increase the competitiveness and profitability of manufacturing companies through the support and encouragement of productivity improvements.

Maintenance performance measurement requires a simple yet effective model that reveals the actual situation, based on measured factors [9]. A conceptual model or a theoretical construct is needed as a strong basis for research, especially in case studies [10]. The model may help companies to develop a measurement process that addresses maintenance performance levels, analyses the causes of ineffectiveness, and improves the system.

2. EQUIPMENT PERFORMANCE AND RELIABILITY (EPR) MODEL

Maintenance performance is defined as the periodic measurement of the state or condition of the processes involved in conducting maintenance functions [4]. In this paper, maintenance performance is measured using equipment performance and reliability as bases. This gauging system is grounded on the hypothesis that conducting good maintenance activities results in effective and reliable machines, given that maintenance directly affects machine effectiveness. The model introduced in this paper is called the Equipment Performance and Reliability (EPR) model. It consists of four phases: machine identification, critical system assessment, maintenance performance measurement, and maintenance performance level assessment. The model is illustrated in Figure 1.

2.1 Phase I: Machine identification

The first phase involves identifying 'critical machines' in a manufacturing plant where production processes are conducted. 'Critical machine' refers to the equipment whose failures have the greatest effect on the manufacturing process. A manufacturing plant usually has one or more machines for each process, which makes the plant complex to analyse. Because of cost, time, and resource constraints, analysing every machine is impractical in maintenance management, so an identification process is needed to select the equipment that requires immediate attention. Pareto analysis is routinely used in identifying failures that contribute to the majority of machine maintenance costs and operation downtimes [11]. The Pareto analysis principle, also known as the 80-20 rule, states that for many events, 80% of the effects come from 20% of the causes [12, 13]. By concentrating on the 20% (i.e., the critical machines), the measurement process and improvement plan produce a much more favourable effect, and can result in more effective maintenance [14]. Three basic steps characterise this phase.

The first step identifies the losses that occur in the manufacturing plant. Losses refer to equipment-related failures, problems, or breakdowns. The identification process can be carried out by analysing data from historical and failure records, which are generally kept by the manufacturing company. Historical records are the daily maintenance documents containing the information collected by the maintenance department. The historical record also includes planned maintenance activities, such as preventive maintenance (PM). The failure record is the documentation that contains a detailed analysis of failures that occur, including the examination of the occurrence, time, duration, and location of the failure, and the work done to repair it. This step focuses on losses related to mechanical factors. Thus all data collected should exclude failures and other problems caused by humans, materials, and facilities. Given that many types of losses arise from machines, the types of losses should be grouped to enable easy analysis and identification. Nakajima [15] identified 'six big losses' that affect machine effectiveness (Table 1).

Konopka & Trybula [17], Ljungberg [18], Jonsson & Lesshammar [19], Jabiri et al. [20], and De Ron & Rooda [21] applied loss segregation in their research and debated the definition of each loss type. On the basis of the definition of each loss group, we choose only three big losses for the first phase: breakdown, setup and adjustment, and idling and minor stoppages. This restriction is adopted because maintenance activities cannot improve a machine's acceleration or speed feature, which is more related to machine design. For process defect losses, most products are rejected because of inappropriate or defective materials, unsuitable process environments, and human error during the process setup. Startup loss is also disregarded because it involves manufacturing principles, and can only be improved using good quality materials.

2.1.2 Step 2: Loss occurrence analysis

Once the failures are grouped, a loss occurrence analysis is conducted. The aim is to record and calculate the occurrence, frequency, or rate of losses that arise in the manufacturing plant. The occurrences can be counted on a weekly, monthly, or yearly basis. In addition to the number of loss occurrences for each machine, the total number of loss occurrences is calculated by adding up the loss occurrences of all machines. The count for each classification is then divided by this total to determine the percentage for each individual problem classification. The cumulative percentage (C) of each type of loss is calculated to draw the Pareto chart.

2.1.3 Step 3: Machine selection

The final step of the first phase involves machine selection. The basic features of the Pareto chart are the columns with two vertical axes (Figure 2).

In the chart, the columns represent the frequency of loss occurrences, read against the left vertical axis, while the cumulative percentage is read against the right vertical axis. The left vertical axis is marked in increments from zero to the total number of all the losses classified, while the right vertical axis is marked in increments from zero to 100%. The most critical machine is the one represented by the first (leftmost) column of the Pareto chart. This machine should be selected for the measurement process of maintenance performance.
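The loss occurrence analysis and machine selection steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `pareto_table`, the machine names, and the loss counts are all hypothetical.

```python
from typing import Dict, List, Tuple


def pareto_table(loss_counts: Dict[str, int]) -> List[Tuple[str, int, float, float]]:
    """Return rows of (machine, count, percentage, cumulative percentage),
    sorted from the highest to the lowest loss count."""
    total = sum(loss_counts.values())
    if total == 0:
        raise ValueError("no loss occurrences recorded")
    rows = []
    cumulative = 0.0
    # Sort by descending count so the most critical machine comes first.
    ordered = sorted(loss_counts.items(), key=lambda kv: kv[1], reverse=True)
    for machine, count in ordered:
        pct = 100.0 * count / total
        cumulative += pct
        rows.append((machine, count, pct, cumulative))
    return rows


# Hypothetical monthly loss counts (not data from the case study).
example_counts = {"machine_A": 12, "machine_B": 45, "machine_C": 8, "machine_D": 35}
table = pareto_table(example_counts)

# The critical machine is the first (leftmost) column of the Pareto chart.
critical_machine = table[0][0]

for machine, count, pct, cum in table:
    print(f"{machine:10s} {count:4d} {pct:6.1f}% {cum:6.1f}%")
```

The first row of the sorted table corresponds to the leftmost column of the chart, and the last cumulative value always reaches 100%.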

2.2 Phase II: Critical machine assessment

Once the critical machine is identified, the machine failures are analysed. The purpose of this approach is to implement activities that eliminate or reduce failures, beginning with the highest-priority problems. Using failure mode and effect analysis (FMEA), failures are prioritised according to how serious the consequences are, how frequently they occur, and how easily they can be detected. FMEA is a structured, bottom-up approach that begins with identifying the potential failure modes at one level, and then investigating the effect on the next subsystem level [22, 23]. Five steps are involved in implementing Phase II.

2.2.1 Step 1: Identification of critical machine function

A machine consists of various components with different working functions and purposes. On the basis of the information from the machine operating system, manual, and components list, the first step in FMEA is to list each component function of the critical machine. Machine function is defined as the task assigned to a component of the critical machine to accomplish specific processes. This step aims to simplify and focus the analysis on the smaller component levels. This way, a direct and accurate maintenance solution can be planned for the critical machine, based on the functionality of its components.

The process can be carried out by first constructing a functional block diagram (FBD) of the critical machine. The FBD is constructed to show diagrammatically the breakdown of a machine into components that are required to achieve successful operation. The basic structure of an FBD is shown in Figure 3, and the terms used are defined in Table 2 [24].

2.2.2 Step 2: Identification of potential failure modes

The second step in FMEA is to identify the potential failure modes of the critical machine. A failure mode is the manner in which a component fails to perform its intended function; in general, it describes how the failure occurs. During this step, FMEA assesses the risk associated with each failure mode by rating its severity. A severity (S) rating considers the worst potential consequences of a failure, determined by the degree of injury, property damage, or system damage that ultimately occurs. The severity is quantitatively rated by experienced or expert workers in the work area, such as the process engineer, maintenance engineer, or technicians who are responsible for the selected machine. These workers operate the machine on a regular basis and are therefore the most familiar with its operations. The ratings range from 1 to 10 and are based on the quantitative judgment of these experts.

2.2.3 Step 3: Identification of Potential Failure Effects

The potential effects for each failure mode are identified, pertaining to the changes or consequences that stem from the failure mode. The effects are observed and recorded to assess effective maintenance action for the failure by looking at the historical record of previous failures, as well as the machine handbook, operation manuals, and actual observations of the machine. During this stage, rating the likelihood of occurrence (O) for each failure cause is necessary. The failure occurrence or failure rate represents the number of failures that occur in the identified failure mode. Experts make decisions by referring to the historical record of failure occurrences, and assess whether the failure has a remote, low, moderate, high, or very high probability of occurrence during operation. The rate is numerically valued from 1 to 10 points.

2.2.4 Step 4: Identification of potential failure causes

The fourth step is structured to extend the analysis of the failure mode by identifying its potential cause. The identification process can be carried out by asking questions such as:

What could cause the component to fail in this manner?
What circumstances could cause the component to fail to perform its function?
What can cause the component to fail to deliver its intended function?

During this step, the likelihood of prior detection (D) for each cause of failure is identified and rated from 1 to 10, as in the previous steps.

2.2.5 Step 5: Evaluation of current maintenance action

The next step is the evaluation of current maintenance action. This step is significant, because any improvement that is conducted at a later stage can be planned using previous maintenance activities as bases. This eliminates redundant action plans and ensures more effective maintenance. Here, the risk priority number (RPN) is calculated for each failure mode. The RPN can be obtained by multiplying the ratings of the severity (S), likelihood of occurrence (O), and likelihood of detection (D) obtained in the previous steps. Given that all the ratings are taken in the integral interval of 1 to 10, the three factors are considered to have the same weight in the RPN score. The RPN calculation is expressed as

$$RPN_i = S_i \times O_i \times D_i \quad (1)$$

where i is the failure mode number, with i = 1, ..., n. The assumption of this step is that the higher the value of RPN, the greater the risk of failure, and the lower the value of RPN, the lesser the risk of failure. Thus the RPN score is used to prioritise improvement activities by focusing first on the failure mode with the highest risk.
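As a minimal sketch of this step, the RPN can be computed as the product of the three ratings and the failure modes ranked from the highest to the lowest score. The class name `FailureMode`, the helper `rank_by_rpn`, and the example failure modes and ratings are hypothetical and are not taken from the case study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureMode:
    name: str
    severity: int    # S, rated 1 to 10
    occurrence: int  # O, rated 1 to 10
    detection: int   # D, rated 1 to 10

    def rpn(self) -> int:
        """Risk priority number: product of the S, O, and D ratings."""
        for rating in (self.severity, self.occurrence, self.detection):
            if not 1 <= rating <= 10:
                raise ValueError("ratings must be integers from 1 to 10")
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes: List[FailureMode]) -> List[FailureMode]:
    """Order failure modes so that the highest-risk mode comes first."""
    return sorted(modes, key=lambda m: m.rpn(), reverse=True)


# Hypothetical failure modes and ratings, for illustration only.
modes = [
    FailureMode("worn clamp", severity=6, occurrence=5, detection=4),
    FailureMode("sensor drift", severity=3, occurrence=2, detection=2),
    FailureMode("heater failure", severity=8, occurrence=6, detection=5),
]
ranked = rank_by_rpn(modes)
for mode in ranked:
    print(mode.name, mode.rpn())
```

Because each rating lies between 1 and 10, the RPN ranges from 1 to 1000, and the three factors carry equal weight in the score.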

2.3 Phase III: Machine performance measurement

After the critical machine is assessed and the problematic functions of the machine are identified, the focus is now directed toward measuring machine effectiveness and reliability. Overall equipment effectiveness (OEE) and reliability are the main concepts adopted for the model, because both methods can be used to measure maintenance performance based on machine effectiveness. The key point in this phase is the assumption that machine effectiveness can be achieved with effective maintenance activities. OEE is a diagnostic function for multi-attribute factors, which are availability, performance rate, and product quality rate. The measurement method provides the total effectiveness of machine performance during its operation. Meanwhile, the reliability principle can be used to gauge maintenance performance based on machine dependability and lifetime. The main objectives of reliability analysis are to reduce failure rate and to extend machine operating time. This phase can be divided into two steps.

2.3.1 Step 1: Calculation of machine effectiveness

Step 1 uses the OEE method to calculate machine effectiveness (ME). In accordance with OEE, a machine's availability measures the fraction of total operating time in an observation period, such as a week or a month, in which a machine is capable of performing processing work. Available time excludes times when the machine is non-operational due to repairs or queued repair schedules. It also excludes times when the machine is undergoing preventive maintenance, cleaning, calibration, and re-qualification after maintenance, or is being used in engineering efforts. The available time for the machine includes actual processing time and idle time. In a manufacturing plant, unavailable time (the complement of available time) is commonly called machine 'downtime'.

For the machine performance rate element, the OEE method measures the fraction of total operating time in an observation period in which the machine asset is actually engaged in a processing activity. For practical reasons, time credited to performance rate may include not only actual processing time, but also the short periods of time in which the machine is idle while operators perform handling, program downloading, and metrology tasks that are required between consecutive machine cycles.

The performance rate also considers the comparison between the actual production and the expected production of the process. It represents the associated speed losses caused by poor adjustment carried out during maintenance work. An ME time frame can be developed from the information about the elements. The time frame is drawn to show schematically how the elements are determined and calculated [25]. The durations of the ME time frame are determined in relation to the three big losses. Figure 4 shows the computation of the time frame for machine effectiveness [16, 26].

The time frame is structured as three levels of bars. The data can be collected from production data that contains production scheduling and operation. In the first level, the top bar represents planned production time (T_plan) and shows the total time a machine is supposed to be available to produce a product. Therefore the planned production time for a selected machine can be calculated by multiplying the days of work in a month by the total number of minutes the machine is expected to operate in a day, as in Equation 2.

$$T_{plan} = \delta \times t_{daily} \quad (2)$$

where δ is the number of working days and $t_{daily}$ is the daily production time (converted to minutes) planned for the machine to operate.

The bar in the second level represents the actual production time (T_act), calculated by eliminating downtime losses such as machine failures and setup and adjustments. This is the time planned for machine availability. T_plan is the maximum time of machine operation, but it is rarely achieved because of unplanned and planned downtimes. Therefore, T_act is used instead. T_act is affected by availability losses, which are grouped into breakdown downtime and setup and adjustment downtime. T_act is expressed as:

$$T_{act} = T_{plan} - (T_{updt} + T_{pdt}) \quad (3)$$

T_updt in Equation 3 denotes the duration of unplanned downtime that occurs during the entire T_plan. T_updt occurs when a machine experiences failures or breakdowns, while T_pdt is the duration of downtime planned on the machine for maintenance actions or breaks such as:

implementation of PM or routine checkups and calibrations of the machine;
machine trials and process improvement activities;
machine stoppages for change of components to produce different products;
machine stoppages for software installation.

The third level in the ME time frame is the machine's net production time (T_net). This is the time the machine takes to produce the finished product based on its capacity and capability as initially designed. The determination is based on the product cycle time as specified and recorded in the process manual and product specification. Thus the calculation can be carried out by multiplying the theoretical cycle time T_tc of one product by the number of products processed (α) by the machine. The equation is:

$$T_{net} = T_{tc} \times \alpha \quad (4)$$

The losses experienced in the net production time are performance losses such as idling and minor stoppages caused by poor machine conditions. The construction of the ME measure is undertaken using historical data for availability and performance elements. The data required for the measurement can be collected on a daily basis by the machine operators, and the actual machine performance can be calculated at the end of the day. The percentage for the 'world class performance' availability element is considered to be at 90% [27]. The mathematical model for availability calculation is:

$$A_{eff} = \frac{T_{act}}{T_{plan}} \times 100\% \quad (5)$$

For performance effectiveness, the target for machine performance rate is set at 95% [24]:

$$P_{eff} = \frac{T_{net}}{T_{act}} \times 100\% \quad (6)$$

Then, ME is

$$ME = A_{eff} \times P_{eff} \quad (7)$$

This research gauges ME during the machine's useful life by calculating the failures resulting from ineffective maintenance [31]. A machine deteriorates relative to usage and age; thus, reliability during usage life represents the prevention of machine failure by performing effective maintenance [32]. The reliability computation is carried out by calculating the number of failure occurrences based on the failures identified and analysed in Phase II, in which the final results are expressed as RPN values. The failure modes with the highest RPN values are chosen as the critical failures of the machine. The list of failures can be recorded as shown in Table 3.

The percentage for world class performance OEE is set at 85% [27]. The value is similar for ME. The discussions on how to interpret the results are provided in Phase IV: Maintenance performance level assessment.
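The ME calculation in this step can be sketched as follows. The sketch assumes the usual OEE reading of the time frame described above: availability is the ratio of T_act to T_plan, the performance rate is the ratio of T_net to T_act, and ME is their product. All function names and input figures are hypothetical.

```python
from typing import Tuple


def planned_time(working_days: int, daily_minutes: float) -> float:
    """T_plan: working days multiplied by planned minutes per day."""
    return working_days * daily_minutes


def actual_time(t_plan: float, t_updt: float, t_pdt: float) -> float:
    """T_act: planned time minus unplanned and planned downtime."""
    return t_plan - (t_updt + t_pdt)


def net_time(cycle_time: float, units: int) -> float:
    """T_net: theoretical cycle time per product times products processed."""
    return cycle_time * units


def machine_effectiveness(t_plan: float, t_act: float, t_net: float) -> Tuple[float, float, float]:
    """Return (availability, performance rate, ME), each as a percentage."""
    availability = t_act / t_plan
    performance = t_net / t_act
    return 100.0 * availability, 100.0 * performance, 100.0 * availability * performance


# Hypothetical month: 25 working days of 1200 minutes each.
t_plan = planned_time(25, 1200)                       # 30000 minutes
t_act = actual_time(t_plan, t_updt=1800, t_pdt=1200)  # 27000 minutes
t_net = net_time(cycle_time=0.5, units=27000)         # 13500 minutes

a_eff, p_eff, me = machine_effectiveness(t_plan, t_act, t_net)
print(f"A_eff={a_eff:.1f}%  P_eff={p_eff:.1f}%  ME={me:.1f}%")
```

With these made-up inputs the machine meets the 90% availability mark but falls well short of the 95% performance target, so ME stays far below the 85% world class level.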

2.3.2 Step 2: Reliability calculation

In the previous step, maintenance performance is gauged based on machine performance and capability to work in the operation system. The measurement in this step is based on the reliability principle, defined as the ability of a machine to perform, without failure, a specified function under a given production time [28]. The gauging system is based on the assumption that conducting good maintenance activities results in a more reliable machine. Machine failures typically follow the pattern portrayed by the bathtub curve, the graphical representation of the reliability principle shown in Figure 5. It shows three stages of failure rates that are usually experienced by a machine: infant mortality, normal or useful life, and end of life wear-out stage [29, 30].

During this step, the number of failures or failure rate (λ) can be calculated by

Here f(t) is the number of failure occurrences collected from the historical record during T_plan. However, the result of the failure rate calculation is a plain number, which cannot be interpreted directly in terms of maintenance performance level. Thus the solution suggested in the model is to convert the number into a percentage. The total value of λ is recorded in the last row of Table 3, and is used to calculate the failure ratio (ζ) for each failure type using Equation 9:

$$\zeta_i = \frac{\lambda_i}{\sum_{j=1}^{n} \lambda_j} \times 100\% \quad (9)$$

where $\lambda_i$ is the failure rate of failure mode i and n is the number of critical failure modes.

Of the six entries in Table 3, the highest failure percentage is taken to depict machine reliability, which is interpreted in the final phase. The interpretation of the failure ratio is described in Phase IV: Maintenance performance level assessment.
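The conversion from failure counts to a failure ratio can be sketched as below. The sketch assumes that the failure ratio of a failure mode is its count divided by the total count over all critical failure modes, expressed as a percentage. The function names and the counts are hypothetical.

```python
from typing import Dict, Tuple


def failure_ratios(failure_counts: Dict[str, int]) -> Dict[str, float]:
    """Failure ratio of each mode as a percentage of all recorded failures."""
    total = sum(failure_counts.values())
    if total == 0:
        raise ValueError("no failures recorded")
    return {mode: 100.0 * count / total for mode, count in failure_counts.items()}


def reliability_indicator(failure_counts: Dict[str, int]) -> Tuple[str, float]:
    """Return the failure mode with the highest failure ratio and that ratio."""
    ratios = failure_ratios(failure_counts)
    worst_mode = max(ratios, key=ratios.get)
    return worst_mode, ratios[worst_mode]


# Hypothetical failure counts for three critical failure modes.
counts = {"mode_1": 10, "mode_2": 25, "mode_3": 15}
worst_mode, worst_ratio = reliability_indicator(counts)
print(worst_mode, f"{worst_ratio:.1f}%")
```

Only the highest ratio is carried forward to the final phase, so a single dominant failure mode determines the reliability score.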

2.4 Phase IV: Assessment of maintenance performance level

The completion of previous phases yields the percentages of ME and reliability. A medium is needed to tally the score for machine performance with maintenance performance. A literature review is carried out to identify a rating system that can be used to convert machine performance to maintenance performance. From the reviews, the health index (HI) is determined as the most suitable method. The concept of the HI is commonly applied in the rating of power transformers [33, 34]. This index represents a practical method for quantifying the results of operation observations, field inspections, and site testing into an objective quantitative index that represents the overall condition of a machine. The HI is developed at five levels of maintenance performance, with Level 1 standing for very good performance and Level 5 for poor performance (Table 4).

In accordance with the OEE method, any ME percentage below 85% is considered ineffective and should be further improved. Thus, Level 1 for machine performance is set at 85% and above. The remaining percentages are divided into four groups. Any machine with an ME value below 24% is considered to be at Level 5 and requires immediate risk or failure assessment. As a result, the machine needs to be replaced or subjected to maintenance activities as identified by the FMEA.
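A lookup from ME percentage to HI level might be sketched as follows. Only two boundaries are stated in the text: 85% and above for Level 1, and below 24% for Level 5. The intermediate cut points used here (65% and 45%) are assumptions chosen only to split the remaining range roughly evenly; the actual boundaries are those of Table 4. The function name is hypothetical.

```python
def me_health_level(me_percent: float) -> int:
    """Map an ME percentage to a health index level (1 = best, 5 = worst)."""
    if not 0.0 <= me_percent <= 100.0:
        raise ValueError("ME must be a percentage between 0 and 100")
    if me_percent >= 85.0:   # stated in the text: Level 1 at 85% and above
        return 1
    if me_percent >= 65.0:   # ASSUMED boundary between Levels 2 and 3
        return 2
    if me_percent >= 45.0:   # ASSUMED boundary between Levels 3 and 4
        return 3
    if me_percent >= 24.0:   # stated in the text: below 24% is Level 5
        return 4
    return 5


for value in (90.0, 70.0, 50.0, 30.0, 10.0):
    print(value, me_health_level(value))
```

A similar lookup could be written for the failure ratio, but its boundaries other than the 5% mark for Level 1 are not stated in the text, so it is not sketched here.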

The HI for the reliability principle is based on failure ratio, which is the percentage of failures that occur in the critical machine. Any failure ratio under 5% is considered very good, to which Level 1 in maintenance performance is allocated. This indicates that the failure with the highest RPN value occurs only once in a while, and is addressed by maintenance activities. However, any performance level between Levels 2 and 5 should be further analysed and improved.

3. IMPLEMENTATION AND RESULTS

The model was implemented in a semiconductor company located in northern Malaysia. The company assembles and tests leaded semiconductor packages. Operating as a subcontractor, it offers different process packages for integrated circuit products, based on customer specifications (packages). The two main sections in the company are front of line (FOL) and end of line (EOL), as shown in Figure 6.

3.1 Phase I: Machine selection using a Pareto chart

The EPR model was implemented in FOL. Six processes in FOL use machines, and this is where the loss occurrence analysis was conducted. By referring to the historical and failure records, the number of loss occurrences (l) for the three big losses was calculated. To obtain an accurate analysis, we compiled month-long data. The Pareto chart was developed as in Figure 7. The horizontal axis of the chart shows the six processes in FOL. The left vertical axis is assigned to the number of loss occurrences, while the right vertical axis shows the cumulative percentage.

Using Pareto analysis, we determined that the machines at the wire bond process have the highest number of losses. This process also exhibits the highest cycle time in production time, with frequent machine breakdown leading to low productivity rates.

3.2 Phase II: Failure analysis using FMEA

There are 140 machines in the wire bond process, so a large amount of time and considerable human resources would be needed to collect data from all of them, which would also complicate the data analysis. However, choosing only one machine is impractical because the data would be inadequate and inaccurate. The company is in the process of adopting a new technology, known as the copper wire bonding process, to broaden its market niche, and the process engineers were concerned by the many unknown failures that occurred during the implementation of the new process. The analysis therefore focused on the copper wire bonding machines. The FBD for the copper wire bonding machine is illustrated in Figure 8.

Subsequently, the failure modes and their effects and causes, and the current maintenance activities for the machine components, were analysed. The final results for the failure modes with the highest and lowest RPN ranges are tabulated in Table 5, along with the S, O, D, and RPN ratings.

3.3 Phase III: Machine performance analysis using OEE and the reliability principle

3.3.1 Step 1: Calculation of machine effectiveness

Phase III in the EPR model was initiated by identifying the three big losses as listed in Table 1. The information was taken from the daily maintenance record. All these losses were recorded as downtime during operating time. From the data analysis, 40 types of failures were identified. The downtimes in the log sheet were recorded based on failure type, such as broken wires, machine downtimes, or insufficient preventive maintenance. Once identified, each failure was assigned to one of the three big loss groups. Five failure types (quality assurance (QA) buy-off, under engineering, under vendor, awaiting QA buy-off, and awaiting material) were omitted from the allocation process because they are not related to maintenance performance. The results for the other 35 types of failures are shown in Table 6.

For the timeline, the company followed the proposed ME timeframe. The machine downtime losses identified earlier were grouped into two: planned (PDT) and unplanned (UPDT) downtimes. The failures under the three big losses were considered UPDT, whereas failures segregated under 'other' were considered PDT.

Using the values of T_plan, T_act, and T_net, the process continues with the calculation of availability (A_eff), performance (P_eff), and finally, machine effectiveness. All results obtained for the 14 packages at the copper wire bonding process are shown in Table 7. Figure 9 is the graphical representation of A_eff, P_eff, and ME.

Table 7 and Figure 9 show that the availability value of all packages is 90.6%. This indicates that the machines are operated according to T_plan. The difference between T_plan and T_act is small because the machines rarely have major breakdowns. This also indicates that maintenance activities, such as machine setup and adjustment, are conducted effectively. However, machines in the copper wire bonding process have extremely low performance effectiveness, with an average of 45.6%. This result shows that the machines are frequently idle and experience minor stoppages during operation. The average ME is 41.5%, which is low compared with the world class mark of 85%.

3.3.2 Step 2: Calculation of machine reliability

The second step in Phase III was calculating machine reliability. This reliability analysis investigates machine performance based on a machine's resistance to failure and breakdown. For this purpose, this step used the data collected from the previous phase for analysis. Information was gained using the FMEA approach, in which failure modes with high RPN values were selected. In Phase II as implemented in the company, the range of RPN values is large: between 4 and 300 points. The reliability analysis was restricted to failure modes with an RPN value of more than 200 points, because the remaining failure modes carry a comparatively low risk. The decision was driven by the understanding that maintenance conducted on high-risk failure modes is important because these failures may cause major machine breakdowns. The selected failure modes are provided in Table 8.

The reliability analysis was initiated by the collection of failure occurrences during the entire T_plan in processing copper wire bonding packages. These historical data were used to count the failure frequency or failure rate (λ) of all 10 critical failure modes at the machines. However, because T_plan varies for all 14 packages, the λ during this phase was collected during the entire production time when all packages in the copper wire bonding process were produced.

The failure modes and their λ values are listed in Table 8. The total λ is 482 occurrences, the highest being component 5.0, with 81 occurrences of contamination build-up in the wire clamp. The contamination affects wire bonding quality and causes the component to produce inconsistent looping and starch wires. It is usually caused by the contamination of copper wire during the oxidation process. The failure ratio (ζ) of each failure mode was then calculated. The highest ζ was selected as the reliability value of the machine analysed using the EPR model. Machine reliability was calculated at 16.8%.
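The reported value can be reproduced from the two counts given above, under the assumption that the failure ratio of a mode is its occurrence count divided by the total count over all critical failure modes. This definition is inferred from the reported numbers, and the function name `failure_ratio` is hypothetical.

```python
def failure_ratio(mode_count: int, total_count: int) -> float:
    """Share of all recorded failures attributable to one failure mode.

    Assumption: zeta = mode count / total count, inferred from the text.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 <= mode_count <= total_count:
        raise ValueError("mode_count must lie between 0 and total_count")
    return mode_count / total_count


# Counts quoted in the text: 81 wire clamp contamination failures out of 482.
zeta_max = failure_ratio(81, 482)
print(f"Highest failure ratio: {zeta_max:.1%}")
```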

3.4 Phase IV: Validation of maintenance performance level assessment

The process began with the analysis of the results of the ME and reliability calculations. The main objective of applying relevant performance measures is to detect deviations in the conditions of the production and maintenance processes, so that the required actions can be implemented at an early stage with fewer resources such as time, labour, and cost. Furthermore, the analysis and diagnosis of the deviations of the performance measures yield better results when they are associated with identifying the root cause of the changes. The recommended action will help avoid failure re-occurrence.

The average of the machine effectiveness percentages was determined. The average machine effectiveness in the copper wire bonding process is 41.5%, which places maintenance performance at Level 4, a 'poor' maintenance performance level. The suggested action for this level is to begin planning the process for replacing or rebuilding the machine, considering the risks and consequences of the failures. The same process was also applied to the results from the reliability analysis. The failure ratio exhibited by the machine was calculated and then matched with the maintenance performance level in the HI. The failure ratio at copper wire bonding is 16.8%. The result, when matched with the HI, exhibits a maintenance performance of Level 3, which is a 'fair' maintenance performance level.

4. ACTIONS TAKEN BY THE COMPANY

The company studied in this research agreed that the machines in the copper wire bonding process lacked performance efficiency, with performance rates much lower than anticipated. The company also realised that many machines did not comply with the theoretical cycle time that had been set, because many unplanned breakdowns occurred during operating hours. These unplanned breakdowns usually resulted from idling and minor losses, which were handled through minor maintenance activities by the operators attending to the machines.

Machine reliability is satisfactory, but the maintenance activities should be periodically monitored to ensure effective performance. The purpose of this model is to gauge maintenance performance levels based on machine effectiveness and reliability. Maintenance plays a key role in ensuring that the company's wire bonding machine and other equipment perform their required functions during production. This matters because many practitioners, such as process engineers and maintenance engineers, have no references to guide them in ensuring that the process runs smoothly in the production line, and they have no ability to eliminate failures by modifying or improving the material properties of the wire. What they can do is set and maintain their machines properly, while addressing the failures from a maintenance perspective.

Based on the ME and reliability analysis results, we determined that the company has two levels of maintenance performance. The company opted to focus on the lowest level, because conducting an improvement plan in the future is easier for the company when it starts with the lowest level achieved. Thus the final step taken by the company was to conduct maintenance as described in the HI. The company usually practises PM and corrective maintenance (CM) in its operating system. PM is a proactive approach in which machines are monitored and maintained periodically to avoid failures throughout the manufacturing process. For each machine, PM is planned according to machine requirements, maintenance specifications, and design. PM is conducted on a weekly, monthly, quarterly, or yearly basis. CM is conducted whenever a failure occurs. It is a reactive maintenance approach, and is regarded as unplanned downtime during operation.

Based on the HI, the company opted for continuous maintenance to ensure high overall machine performance as well as high machine reliability. The suggestions given in the EPR model are found to be applicable to this company. The model is confirmed to be general yet suitable for the kind of maintenance system practised by the maintenance department.

5. CONCLUSION

An effective maintenance system is important, and requires monitoring and assessment so that an improvement plan can be efficiently formulated. This research takes its roots from various discussions and observations that arise from practitioners' dilemma in measuring maintenance performance. The motivation of this research is the development of a simple, easy to use, and viable model for measuring maintenance performance. The performance measurement requires holistic and effective approaches to enable the achievement of reliable results. The practice of measuring maintenance performance in this research is discussed, based on mechanical factors.

The combination of OEE and the reliability method is proposed in developing an EPR model. Machine effectiveness can be achieved by conducting effective maintenance. Numerous companies have already conducted measurements using OEE [4, 14, 16-21, 25-27].

However, the measurement processes conducted were focused on short-term perspectives. Machines are measured on their availability, performance rate, and quality rate, as was originally suggested by Nakajima [15] for the OEE model. The long-term effect (such as that of machine reliability) on maintenance performance was not measured. Martorell et al. [35] conducted a sensitivity study to investigate the effects of maintenance performance on machine survival functions and age. They found that when maintenance performance increases, the survival function and age of the machine also increase.

In addition, the asymptotic behaviour that represents machine reliability is achieved at a faster rate. This is deemed to be a natural consequence of implementing maintenance activities that further improve machine condition. This highlights the relationship between machine reliability and maintenance performance, as suggested in [26, 28, 31]. Thus combining OEE and the reliability principle serves as a viable approach to maintenance performance measurement.

The EPR model uses three big losses instead of six for machine performance measurement. The new group of losses is developed, based on discussions concerning the definitions and segregation of failures and downtimes exhibited by machines in manufacturing plants [32, 36, 37, 38]. Only three losses are directly related to maintenance activities. The machine downtime caused by breakdown, setup and adjustment, and idling and minor stoppages are the losses considered in maintenance performance measurement. These losses can be repaired and improved by effective maintenance activities. The losses caused by reduced speed, process defects, and startup are omitted because these usually involve human error, material problems, or process requirements. They only indirectly affect maintenance performance.
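The grouping described above can be written down as a small lookup. The sketch below is one possible encoding; the identifiers are hypothetical, and the split simply mirrors the paragraph.

```python
# True means the loss is counted in the EPR maintenance performance measurement.
SIX_BIG_LOSSES = {
    "breakdown": True,
    "setup and adjustment": True,
    "idling and minor stoppages": True,
    "reduced speed": False,      # omitted: human error, material, or process causes
    "process defects": False,    # omitted for the same reason
    "startup": False,            # omitted for the same reason
}


def maintenance_related(losses=SIX_BIG_LOSSES):
    """Return the losses that the EPR model relates directly to maintenance."""
    return [name for name, counted in losses.items() if counted]


print(maintenance_related())
```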

Quality rate is omitted in the EPR model because it is not directly related to maintenance effectiveness. Quality is measured based on the number of products produced. However, production rate is commonly related to human error during process setting or material defects that lead to product rejection. Chakravarthy et al. [38] measured machine effectiveness based on the availability and performance rate elements to obtain the actual maintenance performance level, without considering problems related to materials and human factors. Steege [39] also omitted the quality element in his research because of the tight interrelationship of machines; this relationship makes the identification of machine-related product defects difficult.

In this paper, the structured technique of an EPR model for identifying problematic machines and planning improvement actions is presented. In the model, the selection phase involves choosing the most problematic machine using Pareto analysis. Subsequently, the model uses FMEA as a failure analysis method to identify failures and improvement actions in machine operation. Next, the maintenance performance is gauged based on a machine's effectiveness and reliability in the operation plant. The results are interpreted as maintenance performance levels based on the HI. The case study shows that the model successfully measures maintenance performance based on machine effectiveness. The analysis from the model has been used to improve the maintenance system employed in the company.

REFERENCES

[1] Kumar, U. 2006. Development and implementation of maintenance performance measurement system: Issues and challenges, WCEAM Paper 127.

[2] Parida, A. & Kumar, U. 2006. Maintenance performance measurement (MPM): Issues and challenges, Journal of Quality in Maintenance Engineering, 12(3), pp. 239-251.

[3] Coetzee, J.L. 1999. A holistic approach to the maintenance problem, Journal of Quality in Maintenance Engineering, 5(3), pp. 276-280.

[4] Chan, F.T.S., Lau, H.C.W., Ip, R.W.L., Chan, H.K. & Konga, S. 2005. Implementation of total productive maintenance - A case study, International Journal of Production Economics, 95, pp. 71-94.

[5] Wireman, T. 2003. Benchmarking best practices in maintenance management, New York: Industrial Press.

[6] Prendergast, J., Murphy, E. & Stephenson, M. 1996. Building-in reliability - implementation and benefits, International Journal of Quality & Reliability Management, 13(3), pp. 77-90.

[7] Neely, A., Richards, H., Mills, J., Platts, K. & Broune, M. 1997. Designing performance measure: A structured approach, International Journal of Operations & Production Management, 17(11), pp. 1131-1152.

[8] Tangen, S. 2003. An overview of frequently used performance measures, Work Study, 52(7), pp. 347-354.

[9] Pintelon, L. & Van Puyveld, F. 1997. Maintenance performance reporting systems: Some experiences, Journal of Quality in Maintenance Engineering, 3(1), pp. 4-15.

[10] Ahmed, S., Hassan, M. & Taha, Z. 2004. State of implementation of TPM in SMIs - A survey study in Malaysia, Journal of Quality in Maintenance Engineering, 10(2), pp. 93-106.

[11] Knights, P.F. 2001. Rethinking Pareto analysis: Maintenance applications of logarithmic scatter plots, Journal of Quality in Maintenance Engineering, 7(4), pp. 252-263.

[12] Craft, R.C. & Leake, C. 2002. The Pareto principle in organizational decision making, Management Decision, 40(8), pp. 729-733.

[13] Envision. The Pareto principle - 80/20 rule [online]. [Accessed 28 June 2009]. Available from http://www.envisionsoftware.com/Management/Pareto_Chart.html

[14] Pomorski, T. 1997. Managing overall equipment effectiveness (OEE) to optimize factory performance, International Symposium on Semiconductor Manufacturing Conference, pp. A-33 to A-36.

[15] Nakajima, S. 1988. Introduction to Total Productive Maintenance (TPM), Cambridge: Productivity Press.

[16] Nachiappan, R.M. & Anatharaman, N. 2006. Evaluation of overall line effectiveness (OLE) in a continuous product line manufacturing system, Journal of Manufacturing Technology Management, 17(7), pp. 987-1008.

[17] Konopka, J. & Trybula, W. 1996. Overall equipment effectiveness (OEE) and cost measurement, IEEE/CPMT International Electronics Manufacturing Technology Symposium, pp. 137-140.

[18] Ljungberg, O. 1998. Measurement of overall equipment effectiveness as a basis for TPM activities, International Journal of Operations & Production Management, 18(5), pp. 495-507.

[19] Jonsson, P. & Lesshammar, M. 1999. Evaluation and improvement of manufacturing performance measurement systems - The role of OEE, International Journal of Operations & Production Management, 19(1), pp. 55-78.

[20] Jabiri, N.Z., Jaafari, A., Platfoot, R. & Gunaratnam, D. 2005. Promoting asset management policies by considering OEE in product's TLCC estimation, IEEE, pp. 480-484.

[21] De Ron, A.J. & Rooda, J.E. 2005. Fab performance, IEEE Transactions on Semiconductor Manufacturing, 18(3).

[22] Slack, N., Chambers, S. & Johnston, R. 2004. Operations management, 4th ed., Prentice Hall, Essex.

[23] Sharma, R.K., Kumar, D. & Kumar, P. 2005. Systematic failure mode effects analysis (FMEA) using fuzzy linguistic modeling, International Journal of Quality & Reliability Management, 22(9), pp. 986-1004.

[24] Benjamin, S.B. & Fabrycky, W.J. 2006. System engineering and analysis. Essex: Prentice Hall.

[25] Tjarning, A. & Brant, E. 2006. Changing from a reactive to a proactive maintenance culture - Implementation of OEE. Masters Thesis, Lulea University of Technology.

[26] Jeong, K.Y. & Phillips, D.T. 2001. Operational efficiency and effectiveness measurement, International Journal of Operations & Production Management, 21(11), pp. 1404-1416.

[27] Dal, B., Tugwell, P. & Greatbanks, R. 2000. Overall equipment effectiveness as a measure of operational improvement - A practical analysis, International Journal of Operations & Production Management, 20(12), pp. 1488-1502.

[28] Oyebisi, T.O. 2000. On reliability and maintenance management of electronic equipment in the tropics, Technovation, 20, pp. 517-522.

[29] Dhillon, B.S. 1999. Reliability-centered maintenance, Engineering maintainability, pp. 160-179.

[30] Booker, J.D., Raines, M. & Swift, K.G. 2001. Introduction to quality and reliability engineering, Designing capable and reliable products, pp. 1-36.

[31] Endrenyi, J., Anders, G.J. & Leite da Silva, A.M. 1998. Probabilistic evaluation of the effect of maintenance on reliability - An application, IEEE Transactions on Power Systems, 13(2), pp. 576-583.

[32] Endrenyi, J. & Anders, G.J. 2006. Aging, maintenance, and reliability, IEEE Power and Energy Magazine, 4(3), pp. 59-67.

[33] Dominelli, N., Lau, M., Olan, D. & Newell, J. 2004. Equipment health rating of power transformers, Conference Record of the 2004 IEEE International Symposium on Electrical Insulation, Indianapolis USA, 19-22 September 2004.

[34] Naderian, A., Cress, S., Piercy, R., Wang, F. & Service, J. 2008. An approach to determine the health index of power transformers, Proceeding of IEEE International Symposium on Electrical Insulation (ISEI).

[35] Martorell, S., Sanchez, A. & Serradell, V. 1999. Age-dependent reliability model considering effects of maintenance and working conditions, Reliability Engineering and System Safety, 64, pp. 19-31.

[36] Da Costa, S.E.G. & De Lima, E.P. 2002. Uses and misuses of the 'overall equipment effectiveness' for production management, IEEE, pp. 816-820.

[37] Oechsner, R., Pfeffer, M., Pfitzner, L., Binder, H., Muller, E. & Vonderstrass, T. 2003. From overall equipment efficiency (OEE) to overall Fab effectiveness (OFE), Materials Science in Semiconductor Processing, 5, pp. 333-339.

[38] Chakravarthy, G.R., Keller, P.N., Wheeler, B.R. & Van Oss, S. 2007. A methodology for measuring, reporting, navigating, and analyzing overall equipment productivity (OEP), IEEE/SEMI Advanced Semiconductor Manufacturing Conference, pp. 306-312.

[39] Steege, P. 1996. Overall equipment effectiveness in resist processing equipment, IEEE/SEMI Advanced Semiconductor Manufacturing Conference, pp. 76-79.

* Corresponding author
** The author is currently enrolled for a PhD degree at the Universiti Sains Malaysia.