					                CAA HUMS RESEARCH PROGRAMME

Present:         Mr D Howson    - CAA, RASA
                 Mr J McColl    - CAA, Propulsion Dept
                 Mr A Field     - CAA, Propulsion Dept
                 Mrs S Salter   - Smiths Aerospace (Smiths)
                 Dr R Callan    - Smiths Aerospace (Smiths)
                 Mr B Larder    - Smiths Aerospace (Smiths)
                 Mr L Stoate    - Smiths Aerospace (Smiths)
                 Mr A Heather   - Smiths Aerospace (Smiths)

Apologies:       Mr M Bonnick - CAA, Avionic Systems Dept
                 Mr G Hughes - Bristow Helicopters Ltd (BHL)


1.1    Accuracy

           The notes of the 2nd March meeting were approved.

1.2    Actions from previous meeting

           Action 2.2:   Mr Healey to check the CAA's records for significant defects on
                         IHUMS equipped AS332Ls.

           Action completed. Mr Field had supplied a copy of the list of successes
           contained in the CAA's database of HUMS arisings. Although there had been
           no new entries since January 2002, the data provided a useful cross-check.
           Eight records matched entries in BHL's list of documented AS332L faults, and
           there were two records for which there was no matching BHL data. One of
           these was not an actual fault, but related to high 38th shaft order vibration on
           the forward end of the MGB intermediate shaft. The only fault not in the BHL
           data was listed as an “MGB HSS splined adaptor SO1” on 25 August 2001. Mr
           Hughes will be asked to investigate this (see item 1.3).

           Action 4.1:   Mr McColl to contact Paul Anuzis at Rolls Royce to try to
                         initiate an exchange of information.

           Action on-going. Although Mr McColl had spoken to someone in Mr Anuzis'
           team, no-one from the team had contacted Dr Callan. The option of arranging
           HHMAG presentations was considered, but it was believed that a smaller
           meeting with Rolls Royce to share experience would be more useful. Mr
           McColl will contact Mr Anuzis again.

           Action 4.2:   Mr Larder to discuss a possible data exchange with Mike
                         Horsey at WHL.

           Action on-going. Mr Larder had spoken to Mr Horsey and agreed in principle
           to the idea of an exchange of data from WHL with information from Smiths on
           the results of the analysis of that data. However, Mr Howson had subsequently
        received a request for CAA funding and full visibility of the research work from
        Mr Robertson at WHL. It was agreed that it was difficult to justify providing any
        funding to WHL, but that a brief analysis of the currently available WHL data
        would provide a useful addition to CAA's review of the MRGB seeded defect
        test programme data. Mr Larder will send Mr Howson a brief note specifying
        the data requested from WHL, and Mr Howson will then follow this up with Mr

        Action 4.3:   Mr Larder to produce a process flow diagram for the
                      “on-line” trial system.

        Action on-going, pending the completion of the system implementation
        feasibility assessment described in item 5.

1.3   New actions arising from item 1

        Action 5.1:   Mr Hughes to investigate the absence of the “MGB HSS
                      splined adaptor SO1” fault on 25 August 2001 from BHL’s


        Due to a number of factors, the data analysis process was proving to be very
        time consuming and this was putting the programme outcome at risk. There
        was a requirement to perform a detailed analysis of a large volume of data to
        define the anomaly detection process. The size of the cluster models was
        growing to the order of several hundred clusters, and there was a need to
        repeat many modelling steps with different parameters to understand the
        nature of the data and its influence on model training. These models could
        take several hours to train and the search for an optimum model could take up
        to 2 days. The technical challenges posed by the HUMS gearbox vibration
        data also required some fundamental research work, and it was necessary to
        try novel modelling approaches that would not be directly supported by any
        cluster algorithm.

        The data mining tool being used for the CAA HUMS research project had
        been developed under Smiths' ProDAPS programme, and has a powerful and
        flexible framework that provides capabilities beyond anything available in a
        third party software tool. It was therefore possible to build a software
        automation component to sit on top of the framework. This component was
        designed to facilitate batch processing of models and rapid prototyping of new
        modelling approaches.

        The ProDAPS data mining tool has proven to be essential to the progress of
        the programme. Its flexibility permits new modelling approaches to be rapidly
        prototyped, and the cluster modelling functionality has been key to tackling
        some of the difficult data issues. Furthermore, the tool has some advanced
        model diagnostic capabilities (i.e. the facility to extract information about
        complex models).


3.1   Choice of gearbox shaft for analysis to define the modelling

        As described in the minutes of the previous meeting, left hand accessory
        gearbox (AGB) shaft data had initially been used for the analysis to define the
        modelling approach for anomaly detection. The primary reason for this was
        that a high percentage of the relatively small number of documented faults
        had occurred on accessory gearbox components. There were a significant
        number of variables to be investigated in the development of an optimum
        anomaly detection process, for example data pre-processing options,
        parameter groupings for clustering, numbers of clusters to be used in the
        models, and how to handle unidentified anomalous data included in the
        training set. Due to the lower component loadings, the AGB data is more
        variable than MGB data, and unfortunately this variability hindered progress
        on defining the optimum anomaly detection process. Although some positive
        results were obtained, it proved extremely difficult to achieve the key objective
        of developing a robust approach. It was therefore decided to switch to data
        from an MGB shaft to continue this development. The bevel pinion shaft was
        then selected as the target component, as the most important fault test case
        related to this component – the cracked bevel pinion on a CHC Scotia aircraft.
        A key criterion for judging the success of the anomaly detection process
        would be its ability to detect this fault case.

3.2   Data pre-processing

        The first stage of pre-processing was “median filtering” to clean up the data by
        removing rogue outliers. To further reduce noise the commonly used signal
        processing technique of data smoothing was then applied. However, it was
        found that this actually increased the difficulty of modelling the data, as
        gearboxes tend to occupy their own “space” in the cluster model and
        smoothing added to their isolation. Therefore this pre-processing option was

        A number of pre-processing options were investigated to extract trend
        information from the data. The use of piecewise gradients (i.e. analysing the
        gradient of the signal at different points in time) was investigated; however, as
        a result of the variability of the data it was found that this approach provided
        no useful additional information. Trend information was extracted by
        differencing samples from the median of the first ‘n’ points; however, this was
        ultimately discounted as a result of concerns that the first data points acquired
        on a new gearbox could be un-representative of a life-time trend. A “lifetime
        median differencing” approach was found to be promising; however, this could
        not be applied in the live trial when new data is being added after every flight.
        The pre-processing option finally selected for trend identification was a
        “moving median difference” technique, in which the latest data sample is
        differenced from the median of all the data samples acquired to date, and this
        proved to be a valuable approach. It brings all gearboxes to a common zero
        baseline, and detects developing trends in data whilst suppressing early
        trends due to any initial component “running in”.

          The inputs to the data modelling stage were therefore: (i) median filtered data
          for building “absolute” models; and (ii) median filtered and moving median
          differenced data for building “trend” models.
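
          Illustrative note: the two pre-processing stages described above can be
          sketched as follows. This is a minimal sketch and not the ProDAPS
          implementation. The function names, the filter window of 3 samples and the
          example values are assumptions made for illustration, as is the choice to
          include the latest sample in the “to date” median.

```python
from statistics import median


def median_filter(values, window=3):
    """Replace each sample by the median of a centred window (removes isolated outliers).

    The window is truncated at the ends of the series. The window length is an
    assumption; the minutes do not state the value used.
    """
    half = window // 2
    filtered = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        filtered.append(median(values[lo:hi]))
    return filtered


def moving_median_difference(values):
    """Difference each sample from the median of all samples acquired up to that point."""
    return [v - median(values[: i + 1]) for i, v in enumerate(values)]


# Hypothetical indicator history: one rogue spike (9.0) followed by a developing trend.
raw = [1.0, 1.1, 9.0, 1.0, 1.2, 1.1, 1.6, 1.9, 2.3]

absolute_input = median_filter(raw)                      # input to "absolute" models
trend_input = moving_median_difference(absolute_input)   # input to "trend" models
```

          Because every series is differenced from its own running median, each
          gearbox starts from a zero baseline, and the approach can be applied as new
          data arrives after every flight.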

3.3     Development of modelling approach for anomaly detection

          Classical statistical analysis techniques were used to identify gearbox
          statistical characteristics to support the model investigation. Three different
          cluster modelling approaches were then developed and evaluated: a
          “classical” cluster model, and two novel approaches - clustering with
          “subspace suppression”, and a “gearbox ID conditional” cluster model.

          Two sets of main gearbox bevel data were prepared – a “training set”
          comprising data from approximately 60 gearboxes to train the cluster models,
          and a “validation set”, comprising data from approximately 40 gearboxes for
          evaluation of the models.

3.3.1    Anomaly detection concept

          The basic anomaly detection concept is relatively simple. One or more cluster
          models are built for each gearbox shaft type, which can then be applied to any
          gearbox. For a particular gearbox, after each flight (or each day's flying), the
          newly acquired indicator values are pre-processed and then used to perform a
          model prediction fit to produce a set of new “model fit” (Log Likelihood) values
          for that flight. The higher the values, the better the fit with the model.
          Assuming the model represents healthy data, if the newly acquired data has a
          good fit, the component can be assumed to be healthy. A time history of the
          model fit is plotted for a gearbox, and an alerting policy is implemented based
          on the identification of a negative trend in the time history of the model fit
          values. The time histories of many gearboxes can be plotted together for a
          simple visual comparison.
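
          Illustrative note: the concept above can be sketched as a log likelihood
          calculation followed by a trend check on the resulting time history. This is
          a sketch under stated assumptions only. The minutes do not define the
          alerting policy, so the least-squares slope test, the window of 5 flights and
          the slope limit are hypothetical, as are the two-cluster model and the data.

```python
import math


def log_likelihood(x, clusters):
    """Model fit of indicator vector x under a mixture of diagonal Gaussian clusters.

    clusters is a list of (weight, means, variances) tuples.
    """
    terms = []
    for weight, means, variances in clusters:
        log_p = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(log_p)
    peak = max(terms)  # log-sum-exp, for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def slope(values):
    """Least-squares slope of values plotted against their index."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def negative_trend(fit_history, window=5, slope_limit=-0.5):
    """Hypothetical alert rule: the recent model fit values fall steeply."""
    recent = fit_history[-window:]
    return len(recent) == window and slope(recent) < slope_limit


# Hypothetical "healthy" model with two clusters in a two-indicator space.
clusters = [(0.6, (0.0, 0.0), (1.0, 1.0)), (0.4, (3.0, 3.0), (1.0, 1.0))]

healthy_flights = [(0.1, -0.1)] * 6
drifting_flights = [(-0.5 * k, -0.5 * k) for k in range(6)]

healthy_fits = [log_likelihood(x, clusters) for x in healthy_flights]
drifting_fits = [log_likelihood(x, clusters) for x in drifting_flights]

print(negative_trend(healthy_fits), negative_trend(drifting_fits))
```

          Higher values indicate a better fit, so a gearbox drifting away from the
          modelled space shows a falling time history.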

3.3.2    Classical cluster model

          Although it was known that classical cluster models would not provide a
          suitable anomaly detection solution owing to the previously identified problems
          with the inability to eliminate anomalous data from any training data set, these
          models were useful for reference purposes. There are two fundamental types
          of model – a “diagonal” model in which cluster axes are aligned with the
          indicator space, and a “full covariance” model, in which clusters can rotate to
          model correlations. Both types of model were applied and it was found that
          the diagonal model appeared to be less stable, with more spurious clusters
          being generated; the full covariance model was therefore used. The models
          were built using a set of 9 of the most important indicators: M6*, ESA_PP,
          ESA_SD, GE_22, MS_2, SO1, SON, SIG_PP, SIG_SD.
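
          Illustrative note: the difference between the two model types can be shown
          with a single two-indicator cluster. This is a sketch only; the variances,
          the covariance of 0.9 and the test points are hypothetical values, not
          results from the programme.

```python
import math


def log_density_2d(point, mean, var_x, var_y, cov_xy=0.0):
    """Log density of a 2-D Gaussian cluster; cov_xy=0 gives a "diagonal" cluster."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    det = var_x * var_y - cov_xy ** 2
    quad = (var_y * dx * dx - 2 * cov_xy * dx * dy + var_x * dy * dy) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


mean = (0.0, 0.0)
on_axis = (1.0, 1.0)    # follows a positive correlation between the two indicators
off_axis = (1.0, -1.0)  # same distance from the mean, but against the correlation

# Diagonal cluster: axes aligned with the indicator space, no correlation modelled.
diag_on = log_density_2d(on_axis, mean, 1.0, 1.0)
diag_off = log_density_2d(off_axis, mean, 1.0, 1.0)

# Full covariance cluster: the cluster is rotated to follow the correlation.
full_on = log_density_2d(on_axis, mean, 1.0, 1.0, cov_xy=0.9)
full_off = log_density_2d(off_axis, mean, 1.0, 1.0, cov_xy=0.9)
```

          In this example the diagonal cluster scores both points identically, whereas
          the full covariance cluster gives a much lower fit to the point that breaks
          the correlation.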

          Although the models could identify the Scotia cracked bevel pinion data as
          clearly anomalous, other analysis had identified that the training set contained
          anomalies and, as expected, the models failed to detect these. This approach
          presents a dilemma. If anomalies are included in the training set the model
          learns to treat these anomalies as normal data; however, filtering anomalies
          from the training data assumes that we know how to detect these. This type of
          assumption is typical in most published work on anomaly detection, but for
         HUMS data it presumes too much knowledge; for example, there is little or no
         feedback on component condition from gearbox overhauls. Because
         knowledge of atypical behaviour is far from complete it is not practical to make
         a safe assessment of a gearbox state of health by visual inspection of HUMS
         indicator trends. The challenge is made even more difficult because
         gearboxes tend to occupy their own region of HUMS indicator space

3.3.3   Clustering with subspace suppression

         Cluster models were built using the BIC (Bayesian Information Criterion) to
         indicate an optimum model. When executing model fit predictions, parts of the
         cluster space that had a higher expectation of being associated with
         anomalous behaviour were then ignored. This “subspace suppression” used a
         heuristic approach to identify areas of the cluster space that might be
         associated with anomalies. Two heuristics were used for this subspace
         elimination: (i) Entropy – regions of space associated with low support and
         very few gearboxes (typically one or two) were eliminated from the prediction;
         (ii) Data dispersion – regions with points that were distant from other regions
         were eliminated from the prediction.
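
         Illustrative note: the first heuristic can be sketched as a filter over the
         training points assigned to each cluster. This is a sketch under stated
         assumptions. The thresholds, the use of Shannon entropy over gearbox IDs
         and the example assignments are hypothetical, and the data dispersion
         heuristic is not covered.

```python
import math
from collections import Counter, defaultdict


def gearbox_entropy(gearbox_ids):
    """Shannon entropy (natural log) of the gearbox IDs contributing to one cluster."""
    counts = Counter(gearbox_ids)
    total = len(gearbox_ids)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def suppressed_clusters(assignments, min_support=0.05, min_entropy=0.7):
    """Return the cluster IDs to ignore when computing model fit predictions.

    assignments is a list of (cluster_id, gearbox_id) pairs, one per training point.
    A cluster is suppressed when it holds a small share of the training data AND
    that data comes from very few gearboxes. Two equally represented gearboxes give
    an entropy of ln(2), about 0.693, so a limit of 0.7 catches "one or two".
    """
    by_cluster = defaultdict(list)
    for cluster_id, gearbox_id in assignments:
        by_cluster[cluster_id].append(gearbox_id)

    total = len(assignments)
    suppressed = set()
    for cluster_id, ids in by_cluster.items():
        support = len(ids) / total
        if support < min_support and gearbox_entropy(ids) < min_entropy:
            suppressed.add(cluster_id)
    return suppressed


# Hypothetical training assignments: clusters 0 and 1 are shared by many gearboxes,
# cluster 2 holds three points from a single gearbox.
assignments = (
    [(0, f"G{i % 10:02d}") for i in range(60)]
    + [(1, f"G{10 + i % 8:02d}") for i in range(37)]
    + [(2, "G07")] * 3
)

print(suppressed_clusters(assignments))
```

         When a model fit is then calculated, the terms for the suppressed clusters
         would simply be left out of the mixture sum.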

         There were some concerns over this approach. The measure of success of
         any modelling is the ability to generalize, and this is dependent on the
         heuristics for subspace suppression. Because these had still not been
         optimised, the “absolute” model did not identify one gearbox as anomalous
         when it had been determined by independent analysis that this gearbox had
         some anomalous data. As expected, the overall model fit for the validation
         data was lower than for the training data, which is a possible concern when
         defining an alerting strategy. It proved difficult to define a suitable diagonal
         model, and therefore it was necessary to rely on a full covariance model;
         however, this limits the type of model diagnostics that could be extracted in the
         future. It is also not clear how this type of model should be updated to allow
         for changes over time.

         Despite these concerns, good results were obtained from the models, with
         very good visibility of the Scotia gearbox fault. Indeed, these results were
         better than expected, and it should be possible to improve the modelling with
         further optimisation.

3.3.4   Gearbox ID conditional cluster model

         A cluster space was defined conditioned on gearbox IDs. In this novel
         approach, model fit predictions were performed leaving out the selected
         gearbox ID cluster sub-space. Other clusters that had very small support were
         also eliminated from the prediction. Models with spurious clusters were
         allowed, provided these clusters were eliminated in the prediction.
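
         Illustrative note: the leave-one-out prediction described above can be
         sketched as follows. This is a sketch and not the ProDAPS implementation.
         It assumes, for illustration only, a single indicator, one cluster per
         gearbox ID tagged with that ID, and a minimum support of 5 points; the
         cluster values are hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def conditional_fit(x, gearbox_id, clusters, min_support=5):
    """Model fit for x, leaving out the clusters of the gearbox being assessed.

    Clusters with very small support are also left out of the prediction.
    """
    used = [
        c for c in clusters
        if c["gearbox_id"] != gearbox_id and c["support"] >= min_support
    ]
    if not used:
        raise ValueError("no clusters left for prediction")
    total_support = sum(c["support"] for c in used)
    terms = [
        math.log(c["support"] / total_support)
        + gaussian_log_pdf(x, c["mean"], c["var"])
        for c in used
    ]
    peak = max(terms)  # log-sum-exp, for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Hypothetical model: gearboxes A and B behave alike, C occupies its own space,
# and a spurious two-point cluster from gearbox D is present in the model.
clusters = [
    {"gearbox_id": "A", "mean": 1.0, "var": 0.04, "support": 40},
    {"gearbox_id": "B", "mean": 1.2, "var": 0.04, "support": 35},
    {"gearbox_id": "C", "mean": 4.0, "var": 0.04, "support": 30},
    {"gearbox_id": "D", "mean": 7.0, "var": 0.04, "support": 2},
]

fit_c_own_clusters_included = conditional_fit(4.0, None, clusters)
fit_c_left_out = conditional_fit(4.0, "C", clusters)
fit_a_left_out = conditional_fit(1.1, "A", clusters)
```

         In this example the anomalous gearbox C fits well only if its own cluster is
         allowed to explain it. With that cluster left out the fit is very low, while
         gearbox A is still well explained by the data from gearbox B.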

         The optimum models generated thus far do not include M6*. Analysis of a
         number of data points with a predicted low fit showed that this was caused by
         M6* - the indicator is highly variable and the data contain many outliers. M6*
         also responds to a very different signal pattern than any other indicator and
         can cause a masking effect. This indicator may therefore be handled
         separately. We shall look further at the effect of M6* when we tune the

         Again, the models produced good results, with very good visibility of the
         Scotia gearbox fault. This approach offers a number of advantages. The
         models are much quicker to train, and are also more robust. The model fits for
         the validation data were higher and less spread than for the training data –
         this was consistent with the observation that the validation set was better
         behaved. The approach utilises a form of cross validation (a statistical
         technique utilised when there is a limited supply of data for training and
         validation). Models can be built using all the available data, which will produce
         better generalization in a fielded system, and it is easy to update a model with
         in-field experience. It is also possible to utilise the modelling approach
         demonstrated to the CAA at the proposal presentation in late 2003 to track
         changes in behaviour mode and assess their significance.

3.3.5   Example of model diagnostics

         Under the ProDAPS programme, Smiths is implementing some advanced
         modelling diagnostics. A set of indicator values can be used to predict the
         expected value of another indicator, with the output including a predicted
         value, predicted variance, and prediction support. If time permits, it may be
         possible to assess the utility of this type of prediction on HUMS data. A limited
         analysis has been performed to assess the influence of training attributes on
         the model fit. Although it is still in its early stages, this has the potential to
         reveal more information about a gearbox's fit within the model. For example,
         results for the Scotia cracked bevel pinion showed that the SO2 and ESA_SD
         indicators have the most anomalous trends given the behaviour of the
         remaining indicators.

3.3.6   Example feedback from BHL on identified anomalies in the training data

         Mr Hughes was asked to investigate some of the gearboxes with bevel pinion
         data that had been identified as anomalous in the modelling process. One
         gearbox that had produced implausibly low model fit values had been
         removed due to high HUMS vibration levels (this had only 101 hrs remaining
         before overhaul). It was later confirmed that there had been a sensor problem.
         From the combined gearbox model fit time histories it was clear that the
         gearbox data was implausible, and it was therefore possible to diagnose the
         probable cause as a sensor problem directly from the results. In another case,
         it was found that a gearbox had been removed due to high No. 2 engine input
         vibration one day earlier than had been specified in BHL's gearbox
         maintenance records that had been supplied earlier. Therefore the detected
         anomaly was actually due to a gearbox change.

3.3.7   Summary

         A robust data modelling and anomaly detection approach has now been
         defined, and testing has shown that this approach can clearly detect the
         cracked CHC Scotia MGB bevel pinion. The results obtained provide a global
         picture of gearbox behaviour that should clearly identify abnormal trends, and
         also enable independent checking of gearbox state following an IHUMS alert.

         Results for the bevel pinion shaft suggest that the modelling approach will not
         trigger high numbers of false positive alerts, and will therefore provide a
      reliable alerting mechanism. Equally importantly, the approach provides
      valuable information on the degree of abnormality to support the maintenance
      decision making process (e.g. should a component be rejected?).


      The historical database of BHL AS332L IHUMS data is providing the primary
      source of fault data for system testing. This contains some documented faults;
      however, the data analysis performed to date shows that there are additional
      undocumented faults and anomalies in the data set. Therefore, with BHL's
      assistance in investigating IHUMS indicator trends related to anomaly
      detection results, it is possible to perform a good assessment of anomaly
      detection capabilities using the existing data set.

      The most important fault data is the CHC Scotia IHUMS data for the cracked
      AS332L bevel pinion. As described above, this data has been received and

      Mike Horsey had previously sent Mr Larder documentary information on the
      availability of results from WHL's analysis of AS332L MGB data for the CAA
      seeded defect test programme. On each gearbox test a full analysis had only
      been performed on a small number of shafts where samples taken from the
      beginning, middle and end of the test indicated a potential fault-related trend.
      Whilst re-processing of raw tape recorded vibration data could not be justified,
      a limited analysis of the data that had already been acquired could provide a
      useful addition to the results from the CAA's gearbox seeded defect test
      programme, and it was therefore agreed that this should be pursued (see
      action 4.2 in item 1).


      As described at the previous meeting, it is proposed that the anomaly
      detection system for the live trial is located at Smiths in Southampton. IHUMS
      data files would be automatically copied by BHL to an Internet Portal area
      each night. Data would be transferred overnight from BHL's Portal onto the
      trial system and then analysed (a user name and password are required for
      portal access). BHL would be able to remotely login to the trial system to view
      results at any time. An initial feasibility assessment was carried out on this
      process. Mr Howson stated that a key output from the trial will be BHL's
      assessment of the usability of the system. It is therefore necessary for the
      trials implementation to be representative of a ‘real’ system.

      The data imported from BHL's IHUMS NT Server totalled 132 Megabytes and
      by the end of the trial a worst case estimate is that this could have increased
      to approximately 300 Megabytes. The speed of data transfer was tested using
      data of approximately 3 Megabytes in size, and was measured at 57
      kbytes/sec at night, which is approximately 3 Megabytes per minute.
      Therefore downloading all data overnight would be expected to take less than
      two hours. Using a secure (https) protocol for the portal with compression
      could double this speed. For the remote login to view results, “RealVnc” is
       being tested to control the trial machine from a web browser. Access would be
       controlled with a username and password.
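
       Illustrative note: the transfer time estimate above can be checked with a
       short calculation. This is a sketch only; it assumes 1 Megabyte = 1000
       kbytes and a constant transfer rate equal to the measured night-time figure.

```python
RATE_KBYTES_PER_SEC = 57      # measured night-time transfer rate
KBYTES_PER_MBYTE = 1000       # assumption; using 1024 changes the result by about 2%


def transfer_minutes(size_mbytes, rate_kbytes_per_sec=RATE_KBYTES_PER_SEC):
    """Time in minutes to transfer size_mbytes at a constant rate."""
    return size_mbytes * KBYTES_PER_MBYTE / rate_kbytes_per_sec / 60


rate_mbytes_per_min = RATE_KBYTES_PER_SEC * 60 / KBYTES_PER_MBYTE
current_minutes = transfer_minutes(132)      # data imported to date
worst_case_minutes = transfer_minutes(300)   # worst case by the end of the trial

print(f"{rate_mbytes_per_min:.2f} Megabytes per minute")
print(f"current data set: {current_minutes:.0f} minutes")
print(f"worst case: {worst_case_minutes:.0f} minutes")
```

       Under these assumptions the worst case transfer remains within the two
       hour estimate.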


       Mr Larder presented an updated project schedule, which is shown in Appendix
       1. The project is approximately 6 months behind the original schedule. The
       delay has primarily been due to the extra level of effort required to achieve the
       project goals. There were a number of reasons for this. For example, there
       was a need to “decode” the IHUMS data files without assistance, and then
       add in gearbox serial numbers (IDs) and identify gearbox changes etc. The
       inability to create anomaly detection models based on pure “healthy” data
       required more fundamental research than had been anticipated to provide a
       robust anomaly detection approach for operational HUMS VHM data from a
       fleet of aircraft (as compared to the original study based on a small data set
       from a well controlled “seeded defect” test). Due to the large volume of data
       and intensive processing needed, considerable time was also required to
       perform modelling tasks.

       In addition to the delay, Mr Larder reported that the extra effort required had
       resulted in a cost overrun, and it was therefore necessary to carefully control
       the project scope. It was proposed that the anomaly detection process be
       implemented for the MGB and left and right AGBs (including the oil cooler fan)
       as these are the primary components of interest. Any additional analysis of
       “fault” data from other sources will be limited to the currently available gearbox
       seeded fault test data from WHL. The development of the “live trial” system
       will focus on the core functionality needed to achieve a successful evaluation
       of the anomaly detection approach. Mr Howson accepted the updated project
       status report and schedule.

7   A.O.B.

       Mr Howson asked Mr Larder to give a presentation on the results being
       obtained from the HUMS research project at the next HSRMC meeting on 9th
       November. Mr Larder agreed.


       The next meeting has been provisionally arranged for Thursday 20 October at
       Smiths Aerospace, Southampton, starting at 10:00am.

       Brian Larder
       Smiths Aerospace Electronic Systems - Southampton             31 August 2005

Appendix 1: Updated Project Schedule