"Behind the Scenes with EXAKT"
Under the Covers with EXAKT
How it works
July 2009
Ben Stevens
Ben@omdec.com
www.omdec.com

Agenda
• Thinking about data
• Data Pre-processing
  – Data Validation
  – Data Smoothing
  – Data Analysis
• EXAKT Analysis
  – Modeling for single dominant Failure Mode
  – Modeling for complex items (2+ Failure Modes or multiple components)
  – Cost Modeling
  – Sensitivity Analysis
  – Output Reports

Thinking about data: General Problem
• Each equipment, and each failure mode for each equipment, is potentially unique.
• What we need to measure are those variables which are important as predictors of failure.
• Thus we cannot provide a definitive list of the data set that is going to be required for an equipment problem (failure mode) until that equipment and failure modes have been determined.
• The analogy is trying to predict what procedures will be needed to do service work on a vehicle without knowing what type of vehicle will be serviced (car, truck, bus, train...), and what the problem is.

Thinking about data: General Answer
• Three key sources of data:
  – CBM data: oil sample analysis (what particles are in the oil and how much), vibration data, temperature data; whatever data can be used to help predict failure
  – Operating data: operating hours, load, power cycles
  – Event data: what has been done to the equipment which might affect the reliability of the data (parts swap-outs, suspensions from service, service jobs, oil changes)
+ if available, RCM analysis (this will help to define the failure modes)

Thinking about data: Amount of data
• Too much data is rarely a problem; missing data almost always is.
• Minimum of about 4 failures of any given failure mode (8+ is better)
• Too many co-variates is not a problem: EXAKT can sort out which are zero-impact or low-impact in terms of ability to predict failure
• Too little data (or inconsistent data) will show up as low confidence levels, which makes the prediction not much use (50% CL = flip a coin)

Thinking about data: Simple Examples

"Failure Modes" | CBM data | Operating data | Event data
Tire failure | Tire pressure, speed, operating temperature, wheel balance/vibration | Load, distance, tire compound, road surface | Install dates, repair dates, rotation dates, suspensions
Bearing failure | Vibration, oil sediment count | Operating hours, load | Install dates, lubrication dates, equipment suspensions
Pump failure | Vibration, oil sediment count, operating temperature | Operating hours, product pumped | Install dates, service dates, equipment suspensions
Pipeline failure | Pressure, wall-thickness measurement, temperature | Operating hours, product pumped, construction material | Install date, clean-out dates

Note: these "Failure Modes" are actually groups of failure modes for the purposes of examples; the specific co-variate data required to predict failure will depend on the specific failure mode.

Data Validation
1. Objective: sufficient, consistent, accurate data
2. Initial Data Analysis
   1. # Items, # Histories
   2. Total Event Beginnings versus Total Event Ends
3. Typical Problems identified by EXAKT
   1. Missing End Event (manually fill)
   2. Duplicate records (eliminate one)
   3. Two events on same Working Age but different dates (verify machine idle period)
   4. Multiple repeat readings (measurement error, from equipment or human reading)
   5. Missing record (interpolate)
   6. Unexplained peaks and valleys (verify or discard)

Data Smoothing
• Objective: eliminate outliers, make sense of variances from a trend
• Erratic data recordings (but with overall increasing or declining trend)
  – Smoothing data readings by linear regression
  – Smoothing data averages
  – Performed by EXAKT

Data Analysis
• Objectives:
  – Read the patterns in the data before transferring to EXAKT
  – Establish the impact of underlying trends before modeling
  – Establish dependence/independence
  – Decide on derivatives
• Examples:
  – Correlation analysis/investigation of relationships among variables
  – Correlation between inspections and events
  – Data reduction/consolidation/summarization
  – Variable transformations (derived variables)
  – Lagging and leading variables
  – History transformations
  – Signal processing

Modeling for single dominant Failure Mode
Objectives:
1. Target high-cost equipment, equipment with high cost of failure
2. Minimise cost (Preventive Maintenance + Failure Cost), or
3. Maximise Availability, or
4. A combination
5. Establish Remaining Useful Life

Modeling for single dominant Failure Mode – 2
General Procedure:
1. Create Inspection Table
   1. Shows condition data from CBM measurements at each working age
   2. Working age = cycles, tons produced, operating hours x stress
   3. (download from CBM db, type from inspection reports, extract from CMMS...)
2. Create Events table
   1. Any event which has an impact on reliability
   2. B = Beginning, EF = End in Failure, ES = End in Suspension (i.e. taken off line), OC = Oil change, etc.
   3. (extract from CMMS)
3. Combine Events table with Inspections

Modeling for single dominant Failure Mode – 3
General Procedure:
4. Proportional Hazards Modeling
   1. Objective is to determine which conditions have predictive capacity, and eliminate non-predictive conditions, AND to determine the relative significance of their contribution to the prediction
   2. Select variables (in EXAKT)
   3. Wald test automatically ranks the impact of each variable and shows Y or N impact and the probability of there being NO IMPACT
   4. From the N's, eliminate the variable with the highest P-value (lowest impact)
   5. Re-run and repeat until all N's are deleted
   6. Note: when P-values are similar for different variables, we may conclude that several different models are valid
   7. If so, run Comparative Report to distinguish among them

Modeling for single dominant Failure Mode – 4
General Procedure:
5. Goodness of Fit
   1. Kolmogorov-Smirnov test checks the accuracy of the prediction at 95% significance (i.e. 19 times out of 20)
   2. If not significant: search for a missing variable, and/or more data on existing variables, or more consistent data
6. Transition Probability model
   1. EXAKT sets bands for condition readings (ex: 5 to 10 ppm)
   2. TP model shows the probability of jumping from one defined band to the next highest before the next inspection period (helps to establish the probability of failure)

Modeling for single dominant Failure Mode – 5
General Procedure:
7. Cost Model
   1. Sets CR (Cost Ratio): ratio of Failure Cost to Preventive Replacement Cost
8. Outputs for group of assets
9. Re-run model for single asset with latest data
10. Review reports and recommend action

Modeling for complex items
• Complex item = 2+ Failure Modes or multiple components
  – Requires Marginal Analysis
  – Performed within EXAKT
  – Distinguishes among FMs
  – Shows which FM is most imminent
1. Prepare separate models for each component or Failure Mode
2. Examine and compare output reports
3. Perform maintenance action

Cost Modeling
1. EXAKT calculates CR (Cost Ratio): ratio of Cost of Failure Replacement to Cost of Preventive Replacement
2. Numbers typically used are ~ 3:1 to 10:1.
   One example of 1000:1

Cost of Preventive Replacement =
  Cost of PM Work
  + Cost of Lost Mission Readiness, lost Revenue or Profit during the PM
  + Penalty Costs, Reputation Costs, Fines and Reparations during the PM

Cost of Failure =
  Cost of Emergency Repair
  + Cost of Lost Mission Readiness, lost Revenue or Profit
  + Penalty Costs, Reputation Costs, Fines and Reparations

Sensitivity Analysis
• Objectives:
  – How sensitive are our conclusions to variance in cost and time?
  – How accurate do we have to be to get reliable results?
• Built into EXAKT
• Examples:
  1. Sensitivity of Preventive:Replacement cost ratios
  2. Sensitivity of Preventive:Replacement time ratio
  3. Downtime costs

Output Reports
1. Traffic light graph
2. Failure Risk plot
3. Conditional Failure Distribution
4. Conditional Density of Failure (frequency against working age)
5. Cost Report
6. Availability Report
7. Cost and Availability Report
8. Cost and Hazard Sensitivity reports
9. Time to Replace Report

[Figure: Traffic light graph]
[Figure: Failure Risk plot]
[Figure: Conditional Failure Distribution Function]
[Figure: Conditional Density of Failure (frequency against working age)]
[Figure: Cost Report, Step 12]
[Figure: Comparative Cost Reports for CR=3, CR=5, CR=6]
[Figure: Availability Report]
[Figure: Cost and Availability Report]
[Figure: Hazard Sensitivity Example]
[Figure: Cost Sensitivity Example]
[Figure: Remaining Useful Life (RUL)]

Questions?
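
Worked illustration for the Cost Modeling slide: the Cost Ratio arithmetic can be sketched in a few lines of Python. This is only a sketch of the two cost sums and their ratio as defined on that slide; it is not EXAKT's own code or interface. The names `CostBreakdown` and `cost_ratio` and all of the figures are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CostBreakdown:
    """The three cost components named on the Cost Modeling slide.

    Hypothetical structure for illustration; all amounts share one currency.
    """
    work: float         # PM work (preventive) or emergency repair (failure)
    lost_output: float  # lost mission readiness, revenue or profit
    penalties: float    # penalty costs, reputation costs, fines, reparations

    def total(self) -> float:
        # Total cost is the plain sum of the three components.
        return self.work + self.lost_output + self.penalties


def cost_ratio(failure: CostBreakdown, preventive: CostBreakdown) -> float:
    """CR = Cost of Failure / Cost of Preventive Replacement."""
    if preventive.total() <= 0:
        raise ValueError("preventive replacement cost must be positive")
    return failure.total() / preventive.total()


# Made-up figures, not taken from the presentation.
preventive = CostBreakdown(work=4_000, lost_output=5_000, penalties=1_000)
failure = CostBreakdown(work=15_000, lost_output=30_000, penalties=5_000)

cr = cost_ratio(failure, preventive)
print(f"CR = {cr:.1f}:1")
```

With these made-up figures the preventive total is 10,000 and the failure total is 50,000, so the ratio falls inside the ~3:1 to 10:1 range quoted on the slide.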