Validation Report for “Rapid Development Thunderstorms” (RDT

					                              Validation Report for “Rapid   Code:     SAF/NWC/CDOP/MFT/SCI/VR/01
                                                             Issue:1.3        Date:19 November 2007
                              Development Thunderstorms”     File:SAF-NWC-CDOP-MFT-SCI-VR-
                                  (RDT-PGE11 v1.3)           11_v1.3.doc
                                                             Page:                             1/24




      Validation Report for “Rapid
   Development Thunderstorms” (RDT-
              PGE11 v1.3)
                    SAF/NWC/CDOP/MFT/SCI/VR/01, Issue 1, Rev. 3
                                    19 November 2007




          Applicable to SAFNWC/MSG version 2008




Prepared by Météo-France / Direction de la Prévision



                         REPORT SIGNATURE TABLE

  Function              Name                     Signature                   Date
Prepared by     MF/DP/Dprévi/PI                Yann Guillou           19 November 2007
Reviewed by     MF/DP/Dprévi/PI               Stéphane Sénési         19 November 2007
                L. F. López Cotín
Authorised by   SAFNWC Project                                        19 November 2007
                Manager



                       DOCUMENT CHANGE RECORD


Version         Date         Pages                       CHANGE(S)
  1.3      12 October 2005            Creation
  1.3     19 November 2007      6     CDOP format reached



Table of contents

1. INTRODUCTION
1.1 PURPOSE
1.2 SCOPE OF THE DOCUMENT
1.3 DEFINITIONS, ACRONYMS AND ABBREVIATIONS
1.4 REFERENCES
   1.4.1 Applicable Documents
   1.4.2 Reference Documents
2. TUNING PGE11 SATELLITE-BASED DISCRIMINATION USING SEVIRI DATA
2.1 PRINCIPLE OF THE SATELLITE-BASED DISCRIMINATION
2.2 LEARNING DATA SET
2.3 INITIAL RESULTS
2.4 ADAPTATION OF DETECTION METHOD WITH MSG DATA: ΔTtower = 6°
2.5 ELECTRICAL FILTER
2.6 FINAL RESULTS OF MSG TUNING
2.7 PERSPECTIVES
3. PGE11 1.2 VALIDATION
3.1 OVERVIEW
   3.1.1 Modus operandi
   3.1.2 Study data base
3.2 VALIDATION OF PGE11 DISCRIMINATION 1.2 RELEASE
   3.2.1 Overall quality of discrimination
   3.2.2 Detailed quality of discrimination for summer period
3.3 CONCLUSION
4. PERSPECTIVES
4.1 DEFINE NEW POPULATIONS
4.2 CHANGE THE DISCRIMINATION METHOD



List of Tables and Figures

Figure 1: Periods of electrical activity chosen (right). Initial domain of tuning (left). Final domain appears as a red dashed rectangle. Black contour defines the area of good lightning detection.
Figure 2: Diagram illustrating for MSG, RSS and GOES (ΔTtower = 3°C) data the quality of discrimination tuning upon cost ratio.
Figure 3: Characteristics of convective trajectories start. MSG (left) and GOES (right) cases.
Figure 4: 20040801 at 09h30 case. Zoom on the result of detection method of RDT software applied to MSG image (left) and RSS (right). Orphan impacts (circle negative, cross positive) are yellow, those paired with a cloud cell are blue. Green numbers for electric cloud cells.
Figure 5: 20040801 at 09h30 case. Zoom on the result of adapted (ΔTtower = 6°) detection method of RDT software applied to MSG image.
Figure 6: Diagram illustrating for MSG (ΔTtower = 3°C and 6°C), RSS and GOES (ΔTtower = 3°C) data the quality of discrimination tuning upon cost ratio.
Figure 7: Precocity of discrimination against first flash (right). Grey arrow points for mean cost (=1).
Figure 8. Monthly discrimination skill at several costs. All trajectory durations are considered. One curve per month, one point per cost ratio (50, 20, 10, 5, 1, 0.7, 0.5, 0.3, 0.2, 0.1). Cost ratios increase from bottom left to upper right.
Figure 9. Monthly discrimination skill at fixed cost. One curve per month, one point per tracking duration and low duration on the left side. Cost ratio = 1 (left graph). Cost ratio = 5 (right graph).
Figure 10. Distribution of duration trajectory with population type.
Figure 11. Monthly distribution of discrimination skills upon trajectory duration. A curve for each duration class, one point for each cost value. Learning data set skills on top (June and August 2004).
Figure 12. Electrical activity (number of flashes) of convective cells. Learning data set (above), August 2004 (lower left), September 2005 (lower right).
Figure 13. Precocity of discrimination vs ground truth of discrimination tuning (left, strong activity), and vs first flash (right).
Figure 14. Discrimination skill for heat period.




Table 1: List of Applicable Documents
Table 2: List of Referenced Documents
Table 3: Distribution of population among ΔTtower values
Table 4: Monthly distribution of trajectories before and after previous filtrations on study dataset



1. INTRODUCTION

1.1 PURPOSE
The discrimination method of PGE11 has been developed to discriminate convective objects
over regions where lightning detection data are unavailable or unreliable. PGE11 aims to
identify convective systems before lightning occurs.

1.2 SCOPE OF THE DOCUMENT
PGE11 was first developed, tuned and validated using GOES and METEOSAT-7 data (RD.1,
RD.2, RD.3).
A first tuning with SEVIRI data was undertaken for the delivery of release 1.0 (RD.4).
Nevertheless, a representative dataset covering a summer period was necessary for a
complete tuning of the discrimination, before a validation process.
This document presents the tuning of PGE11 using MSG-SEVIRI data from summer 2004, and the
validation undertaken over long periods.
It applies to the algorithm implemented in release 1.2 of the SAFNWC/MSG SW package.

1.3 DEFINITIONS, ACRONYMS AND ABBREVIATIONS
SEVIRI                Spinning Enhanced Visible and InfraRed Imager
GOES                  Geostationary Operational Environmental Satellite




1.4 REFERENCES

1.4.1 Applicable Documents
Reference                           Title                                          Code                 Vers
 [AD.1.]           Algorithm Theoretical Basis Document                SAF/NWC/CDOP/MFT/SCI/ATBD/11      1.3
 [AD.2.]                   Product Use Manual                           SAF/NWC/CDOP/MFT/SCI/PUM/11      1.3
 [AD.3.]     Interface Control document for the External and             SAF/NWC/CDOP/INM/SW/ICD/1      2008
                            Internal Interfaces
 [AD.4.]       Interface Control Document for the input and              SAF/NWC/CDOP/INM/SW/ICD/3      2008
                            output data formats
 [AD.5.]      Software User Manual for the SAFNWC/MSG                    SAF/NWC/CDOP/INM/SW/SUM/2      2008
                         Application, Software Part
                                          Table 1: List of Applicable Documents


1.4.2 Reference Documents
Reference                    Title                                      Code                 Vers      Date



                                      Table 2: List of Referenced Documents



2. TUNING PGE11 SATELLITE-BASED DISCRIMINATION USING
SEVIRI DATA
The first discrimination files, obtained from a tuning with GOES data, were implemented in
release 1.0 of PGE11, because of the lack of summer MSG data at that time (see RD.4). This
approach seemed reasonable given the similar spatial resolution and frequency of GOES and MSG
infrared images; moreover, a tuning with the METEOSAT Rapid Scan Service (RSS) showed a poorer
quality than GOES despite a higher image update rate.
The availability of MSG data for summer 2004 allowed a specific tuning of the PGE11
discrimination method.

2.1 PRINCIPLE OF THE SATELLITE-BASED DISCRIMINATION
The discrimination method is based on a statistical approach. PGE11 tuning analyses how and
when characteristics of cloud systems are linked to convective activity on a learning data set.
Electrical data are used as “ground truth” in this learning data set, to distinguish convective
systems from non-convective ones.
As a first step, a segmentation method is applied in three levels: tracking duration, minimum
temperature reached and area at threshold brightness. For each final class defined, the tuning
computes the skill of the following discriminatory parameters:
              -      Mean peripheral gradient;
              -      95th percentile of peripheral gradient;
              -      Minimum temperature-based cooling rates;
              -      Mean temperature-based cooling rates;
              -      Initial temperature-based cooling rates (only for the first duration class).


In order to compare the discrimination quality for various periods of the year and for various
satellite data, we have to use a different false alarm score than the one used in the former
Reference Documents (RD), because the False Alarm Ratio (FAR; see footnote 1) is very sensitive
to the convective occurrence, which of course varies a lot with the season and the study area.
We hence rather use the Probability of False Detection (POFD), also called “False Alarm Rate”,
which represents the ratio of false detection cases to all non-convective cases, expressed as a
percentage. It is used together with the Probability Of Detection (POD). They are written as
follows:
         Let Ncv be the number of convective systems.
         Let Nnon convective be the number of non-convective systems.
         Let Ngood be the number of convective systems that are correctly discriminated as
         convective by the satellite-based discrimination method.
         Let Nfalse be the number of non-convective systems that are wrongly discriminated as
         convective.
         Then,
         \[ \mathrm{POFD} = 100 \times \frac{N_{false}}{N_{non\ convective}} \quad \text{and} \quad \mathrm{POD} = 100 \times \frac{N_{good}}{N_{cv}}. \]
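The scores defined above can be illustrated with a minimal Python sketch. The function name `scores` and all counts below are hypothetical and chosen only to show why FAR depends on how often convection occurs while POFD does not; they are not results from this report.

```python
def scores(n_good, n_cv, n_false, n_non_convective):
    """Return (POD, POFD, FAR) in percent from the counts defined in the text."""
    pod = 100.0 * n_good / n_cv                 # detected convective / all convective
    pofd = 100.0 * n_false / n_non_convective   # false detections / all non-convective
    far = 100.0 * n_false / (n_good + n_false)  # false detections / all detections
    return pod, pofd, far


# Hypothetical counts: identical POD and POFD, but convection is ten times
# rarer in the second sample (e.g. a less active season).
active = scores(n_good=800, n_cv=1000, n_false=1000, n_non_convective=100_000)
quiet = scores(n_good=80, n_cv=100, n_false=1000, n_non_convective=100_000)

print("active period: POD=%.1f POFD=%.1f FAR=%.1f" % active)
print("quiet period:  POD=%.1f POFD=%.1f FAR=%.1f" % quiet)
```

With these illustrative counts, POD and POFD are unchanged between the two samples, whereas FAR rises when convective systems become rarer, which is the sensitivity the text refers to.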



1  \( \mathrm{FAR} = 100 \times \frac{N_{false}}{N_{good} + N_{false}} \)




A relative cost of false alarms (i.e. cloud systems wrongly classified as “convective”) against
misses (i.e. convective systems not classified as “convective” by the discrimination method) is
defined and denoted CFA/ND. For each final class, the best discrimination parameter is identified
and optimal threshold values are computed for several costs. CFA/ND values range from 0.1 to 50.
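As an illustration of how a cost ratio can drive the choice of a threshold, the sketch below assumes that the quantity minimised is CFA/ND × (number of false alarms) + (number of misses), and that a system is classified as convective when its parameter value reaches the threshold. Both the cost expression and the sample values are assumptions made for illustration; the report does not give the exact formulation used in the tuning.

```python
def best_threshold(conv_values, nonconv_values, cost_ratio, candidates):
    """Return the candidate threshold minimising cost_ratio * false_alarms + misses.

    A system is classified 'convective' when its parameter value >= threshold
    (assumed convention). Ties are resolved in favour of the first candidate.
    """
    def cost(threshold):
        false_alarms = sum(v >= threshold for v in nonconv_values)
        misses = sum(v < threshold for v in conv_values)
        return cost_ratio * false_alarms + misses

    return min(candidates, key=cost)


# Hypothetical parameter values (e.g. a cooling rate) for one final class.
conv = [2.0, 3.0, 4.0, 5.0]
nonconv = [0.5, 1.0, 1.5, 2.5, 3.5]
candidates = [1, 2, 3, 4, 5]

low_cost = best_threshold(conv, nonconv, 0.1, candidates)   # false alarms are cheap
high_cost = best_threshold(conv, nonconv, 50, candidates)   # false alarms are expensive
print(low_cost, high_cost)
```

In this toy sample, a low cost ratio selects a permissive threshold (no misses, some false alarms), while a high cost ratio selects a stricter one (no false alarms, some misses).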



2.2 LEARNING DATA SET
34 days during summer 2004, corresponding to periods of electrical activity, were used to tune
the RDT software with MSG data: 6-15 and 15-20 June, 1-11 and 16-21 August.




 Figure 1: Periods of electrical activity chosen (right). Initial domain of tuning (left). The final domain appears as a red
                   dashed rectangle. The black contour defines the area of good lightning detection.



Parameters of the detection algorithm used during this tuning were:
Tcold = –55°C / Twarm = 5°C / ΔT = 1°C / ΔTtower = 3°C / Amin = 1 km².


Running the detection and tracking algorithms on these images first led to a sample of
1383 convective systems (i.e. with electrical activity below their RDT objects) and
140548 non-convective systems (i.e. without electrical activity below their RDT objects).



The tuning has been done for several values of CFA/ND (0.1, 0.3, 0.5, 0.7, 1, 2, 3, 5, 10 and 50) on
the 770 final classes:
         First level: 22 classes of duration (in minutes; see footnote 2):
         [15;30[; [30;45[; [45;60[; [60;75[; [75;90[; [90;105[; [105;120[; [120;135[; [135;150[;
         [150;165[; [165;180[; [180;195[; [195;210[; [210;240[; [240;270[; [270;300[; [300;330[;
         [330;360[; [360;390[; [390;420[; [420;450[; ≥450.
         Second level: 7 classes of minimum temperature reached during the duration (in °C):
         ]0;5]; ]-10;0]; ]-20;-10]; ]-30;-20]; ]-40;-30]; ]-50;-40]; ≤-50.
         Third level: 5 classes of maximum cloud cell area at threshold brightness during the
         duration (in km²):
         [10;100[; [100;200[; [200;500[; [500;1000[; ≥1000.
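The three-level segmentation above can be sketched as a lookup that maps a trajectory state to a triple of class indices. The class bounds are those listed above; the function name `class_index`, the index ordering (temperature classes numbered from coldest to warmest) and the handling of out-of-range values are assumptions made for this illustration.

```python
from bisect import bisect_left, bisect_right

# Lower bounds of the 22 duration classes (minutes); the last class is >= 450.
DURATION_LOWER = list(range(15, 210, 15)) + list(range(210, 451, 30))
# Upper bounds (inclusive) of the 7 minimum-temperature classes (degC),
# ordered from coldest (<= -50) to warmest (]0;5]).
TEMP_UPPER = [-50, -40, -30, -20, -10, 0, 5]
# Lower bounds of the 5 area classes (km2); the last class is >= 1000.
AREA_LOWER = [10, 100, 200, 500, 1000]

N_CLASSES = len(DURATION_LOWER) * len(TEMP_UPPER) * len(AREA_LOWER)


def class_index(duration_min, tmin_c, area_km2):
    """Return (duration, temperature, area) class indices, or None if out of range."""
    d = bisect_right(DURATION_LOWER, duration_min) - 1  # [lower; upper[ intervals
    t = bisect_left(TEMP_UPPER, tmin_c)                 # ]lower; upper] intervals
    a = bisect_right(AREA_LOWER, area_km2) - 1          # [lower; upper[ intervals
    if d < 0 or a < 0 or t >= len(TEMP_UPPER):
        return None
    return d, t, a


print(N_CLASSES)                   # 22 * 7 * 5 final classes
print(class_index(60, -45, 150))   # [60;75[ min, ]-50;-40] degC, [100;200[ km2
```

The product of the three class counts reproduces the 770 final classes quoted in the text.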

2.3 INITIAL RESULTS
With the same implementation parameters, the results over Europe were disappointing: similar
to those of the Rapid Scan Service for low cost ratios, but degraded compared with the GOES scores.




Figure 2: Diagram illustrating the quality of discrimination tuning upon cost ratio for MSG, RSS and GOES
                                             (ΔTtower = 3°C) data.




2  Each trajectory is over-sampled in several duration classes. For example, a cloud trajectory
with a lifetime of 60 minutes falls in the following duration classes:
    -    [15;30[: trajectory on the first time slot (one picture)
    -    [30;45[: trajectory on two time slots
    -    [45;60[: trajectory on three time slots


A further analysis of the trajectories labelled as “convective”, i.e. electrically active, has shown
that they consisted of more small cells with MSG data than with RSS, or even GOES data.
Another point is the shorter duration of these trajectories when compared to the GOES ones.
Moreover, the characteristics at the start of convective trajectories showed a large number of
systems detected at cold temperature thresholds, with cold minimum temperatures. When compared
with the GOES tuning case, this detection at cold temperature appeared to be excessive.




             Figure 3: Characteristics of convective trajectories start. MSG (left) and GOES (right) cases.



The detection method of the RDT software consequently had to be adapted to the spatial
resolution of MSG.




2.4 ADAPTATION OF DETECTION METHOD WITH MSG DATA: ΔTtower = 6°
Case studies have shown that a given situation may be analysed differently by the RDT software
depending on the resolution of the input data.
Indeed, the MSG resolution allows the selection of cloud “buds” satisfying the vertical extension
criterion (ΔTtower of 3°, i.e. around 300 m). With less detailed Rapid Scan Service data, the same
cloud systems are more frequently defined at a warmer threshold and by one single larger cell. The
higher resolution also leads to a different pairing of lightning data with cloud cells, and to more
orphan flashes in the MSG case.
The figure below illustrates the case of 1 August 2004 at 09h30 UTC. The cloud system is identified
as a whole in the RSS case, thanks to a warmer chosen threshold. All lightning flashes are paired
with this cell. The system is represented by several smaller and colder cells in the MSG case, only
two of them being paired with a few lightning flashes. Most lightning flashes are orphans here.




Figure 4: 20040801 at 09h30 case. Zoom on the result of detection method of RDT software applied to MSG image (left) and
   RSS (right). Orphan impacts (circle negative, cross positive) are yellow, those paired with a cloud cell are blue. Green
                                              numbers for electric cloud cells.



In such cases, cell detection and tracking differ considerably. Moreover, we can suspect in the
MSG case that small developed cells originating from the same convective cloud system are
distributed between the convective and non-convective populations depending on the paired
lightning flashes.
It consequently appeared necessary to adapt the detection method to the higher resolution of the
input data. The vertical extension threshold has been increased in the RDT software
implementation, in order to limit an excessive splitting of cloud systems into small cells. As can
be seen in the figure below, this adapted detection method seems to meet the objective of a
correct identification of the “active” cloud system.




Figure 5: 20040801 at 09h30 case. Zoom on the result of adapted (ΔTtower = 6°) detection method of RDT software applied
                                                      to MSG image.



This adaptation should mainly affect cloud systems whose vertical development is or remains
of limited extent. A more complete sensitivity study on cloud cell definition has been
undertaken for different values of ΔTtower. The value of 6° proves effective, with a limited
impact on the convective population. The effect on both populations is summarised in the table below.


                  ΔTtower 2°   ΔTtower 3°   ΔTtower 4°   ΔTtower 5°   ΔTtower 6°   ΔTtower 7°
Convective              1371         1317         1263         1197         1147         1046
Non-convective        125968        97150        79707        68075        60194        54380
                        Table 3: Distribution of population among ΔTtower values.

The majority of discarded convective cells show a cold temperature at time of first detection
(below –15°C).
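To read Table 3 in relative terms, the short sketch below expresses each population as a percentage of its value at ΔTtower = 3°, the setting used in the initial tuning. The counts are copied from Table 3; the choice of 3° as reference and the names `TABLE3` and `retained_percent` are assumptions made for this illustration.

```python
# Populations from Table 3: dT_tower (deg) -> (convective, non-convective)
TABLE3 = {
    2: (1371, 125968),
    3: (1317, 97150),
    4: (1263, 79707),
    5: (1197, 68075),
    6: (1147, 60194),
    7: (1046, 54380),
}


def retained_percent(reference_dt=3):
    """Percentage of each population retained relative to the reference dT_tower."""
    ref_conv, ref_nonconv = TABLE3[reference_dt]
    return {
        dt: (100.0 * conv / ref_conv, 100.0 * nonconv / ref_nonconv)
        for dt, (conv, nonconv) in TABLE3.items()
    }


retained = retained_percent()
for dt, (conv_pct, nonconv_pct) in sorted(retained.items()):
    print(f"dT_tower={dt}: convective {conv_pct:.1f}%, non-convective {nonconv_pct:.1f}%")
```

Relative to 3°, the 6° setting keeps roughly 87% of the convective trajectories and roughly 62% of the non-convective ones, so the non-convective population is reduced proportionally more.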


2.5 ELECTRICAL FILTER
The lightning activity is used as “ground truth” to identify the populations: convective or not.
More precisely, a trajectory with more than 5 lightning flashes is assumed to be convective, and
a trajectory without any lightning activity is assumed to be non-convective.
The first approach aimed to eliminate possible cases of convective systems labelled as
“non convective” because of inaccurate location of lightning data: the domain of interest has been
reduced to be fully included inside the area of good detection of Meteorage lightning data (see the
black contour and the dashed rectangle in Figure 1). The advantage was an approach similar to the
one adopted for the tuning with GOES data. On the other hand, the limited extent of the area in
this case reduced the number of long-duration trajectories.
The second approach consisted in eliminating from the convective population those trajectories
which could actually be non-convective, by strengthening the electrical activity criterion. To be
labelled as “convective”, a cloud system should then show at least 50 lightning flashes during its
first 3 hours, or at least 5 lightning flashes over any 15-minute time slot. The advantage was to
focus on more active systems. On the other hand, the number of convective trajectories was reduced
by half. By applying this criterion, we could only note an increase in the proportion of convective
trajectories associated with strong cooling rate values.
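The strengthened electrical criterion described above can be sketched as follows. The sketch assumes that flashes paired with a trajectory are available as counts per consecutive 15-minute slot starting at the first detection; this input layout and the function name `is_strongly_convective` are hypothetical.

```python
def is_strongly_convective(flashes_per_slot, slot_minutes=15):
    """Apply the strengthened electrical criterion to one trajectory.

    flashes_per_slot: flash counts per consecutive time slot, starting at the
    first detection of the trajectory (assumed input layout).
    """
    slots_in_3h = 180 // slot_minutes
    total_first_3h = sum(flashes_per_slot[:slots_in_3h])
    # Criterion 1: at least 50 flashes during the first 3 hours.
    # Criterion 2: at least 5 flashes in any single time slot.
    return total_first_3h >= 50 or any(n >= 5 for n in flashes_per_slot)


print(is_strongly_convective([0, 0, 6]))        # one slot with 6 flashes
print(is_strongly_convective([4] * 12))         # 48 flashes in 3 h, never 5 in a slot
print(is_strongly_convective([0] * 12 + [5]))   # 5 flashes in a slot after 3 h
```

Under this reading (12 aligned slots in 3 hours), a trajectory with 50 flashes necessarily has at least one slot with 5 flashes, so the per-slot test is the binding one; the tuning may have used a different slot convention.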
Both approaches, which can be considered as upstream filters, have then been combined. The
results showed a much better discrimination tuning when compared with the RSS or GOES skill
(see next subsection).

2.6 FINAL RESULTS OF MSG TUNING
A new tuning of discrimination has been undertaken for the RDT software applied to the same MSG
data, taking into account the following modifications:
           -  Adapted detection method with a vertical extension threshold of 6° to run the
              software over summer periods;
           -  Reduced geographical domain, fully included in the area of good lightning
              detection;
           -  Upstream filter based on strong electrical activity for labelling a cloud system as
              “convective”.
In these conditions, the tuning is based on a sample of 848 convective trajectories and 60194
non-convective ones.
With a ΔTtower of 6°, we observe fewer orphan lightning flashes, indicative of a better pairing,
and longer trajectory durations. The distribution of convective cell areas shows a shift towards
larger values, cloud systems now being tracked (defined) at warmer thresholds.
The results point to an MSG tuning close to the GOES one, and much better than in the RSS case.
The figure below confirms that the upstream filter succeeds in focusing on more marked
convective systems, with a better discrimination tuning.




Figure 6: Diagram illustrating for MSG (ΔTtower = 3°C and 6°C), RSS and GOES (ΔTtower = 3°C) data the quality of
                                       discrimination tuning upon cost ratio.



The precocity (earliness) of the MSG detection and discrimination tuning is illustrated below.




         Figure 7: Precocity of discrimination against first flash (right). Grey arrow points for mean cost (=1)



We can notice that the precocity of discrimination remains weak, even if the precocity of detection
is satisfactory. Moreover, it appears that the upstream filter concerning electrical activity
reduces this precocity: a discrimination tuned towards more active convective systems seems more
efficient, but also seems to discriminate later.


Finally, the results obtained led to release 1.2 of the RDT software for
SAFNWC.



2.7 PERSPECTIVES
The modifications detailed in the previous paragraphs have shown a better adaptation of RDT
software to MSG data. Nevertheless, the results remain below the objectives of a really early
convective discrimination.


A complete tuning with MSG data should take into account the following points:
            -   A discrimination tuning focusing on trajectories starting in low levels (at warm
                temperature thresholds), because MSG channels cannot provide useful information
                when convective development occurs below a cloud shield;
            -   An examination of a different ground truth for the separation of convective / non-
                convective populations inside the learning data set, because issues regarding the
                pairing of lightning data and MSG data may introduce some noise in the learning data set.



3. PGE11 1.2 VALIDATION

3.1 OVERVIEW

3.1.1 Modus operandi

The purpose of the long-period validation is to assess the skill of PGE11 in detecting and tracking
severe thunderstorms over a wide area and a long period.
Since the ground truth remains cloud-to-ground lightning flashes, the area of the objective
validation is restricted to the METEORAGE network coverage area, for which lightning data are
easily accessible.
This validation is divided into three steps:
            -   Build the database for the study:
                    o Analyse the MSG data in order to discard non-nominal data and wrong
                      datasets, and to identify missing images;
                    o Analyse the electrical activity in order to choose periods of interest for
                      the validation.
            -   Process the study database with the PGE11 software (without activating the
                discrimination mechanism) and match the computed trajectories with lightning data;
            -   Apply the discrimination algorithm to the trajectories according to tracking duration
                and for each false-alarm to non-detection cost ratio, according to the multi-level
                segmentation and to cost as recalled in the first chapter. Check the discrimination
                against the ground truth (lightning activity criteria as in the electrical filter, see §2.5).
For this study, PGE11 uses the same parameters as the tuning and the V1.2 release:
ΔTtower = 6°C and the same discrimination file (see first chapter).



3.1.2 Study data base

The validation period is July 2004 to August 2005. 86 images were identified as corrupted and
hence blacklisted.
Based on electrical activity and on the quality of MSG data (periods of missing data), the study
periods were chosen as:
        15-25 July 2004 / 11-21 September 2004 / 13-17 and 25-30 October 2004
        1-5 and 13-16 November 2004 / 10-20 December 2004 / 15-20 and 26-31 January 2005
        13-23 February 2005 / 21-31 March 2005 / 7-17 April 2005 / 10-20 May 2005
        1-30 June 2005 / 1-31 July 2005 / 1-31 August 2005
For consistency with the tuning method, we applied the same filters to the trajectories of the study
dataset: namely filtering out the “algorithm warm-up” period, and keeping only the trajectories
beginning with type “normal” and without spatial or time break.
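
The filtering rules above can be sketched in Python as follows. This is only an illustrative
sketch: the `Trajectory` record, its field names and the way the warm-up window is measured are
assumptions made for the example, not the PGE11 data model. Only the three criteria and the
6-hour spin-up (as labelled in Table 4) come from this report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

SPIN_UP = timedelta(hours=6)  # "Spin-up (6h)" as labelled in Table 4


@dataclass
class Trajectory:
    # Hypothetical record; the field names are assumptions for illustration.
    start_time: datetime  # time of first detection of the cell
    start_type: str       # e.g. "normal", "split" or "complex"
    has_break: bool       # True if the track has a spatial or time break


def keep_for_study(traj: Trajectory, period_start: datetime) -> bool:
    """Return True if the trajectory passes the three filters."""
    after_spin_up = traj.start_time >= period_start + SPIN_UP
    return after_spin_up and traj.start_type == "normal" and not traj.has_break
```

A trajectory is kept only if all three conditions hold, so the order in which the filters are
applied does not change the final study dataset, only the intermediate counts.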




                                 Trajectory number                                    Ground truth
                          After filtering
        Month    Initial   Spin-up (6h)      Normal beginning   Edge               Non-convective   Convective
                                                                (study dataset)    (0 flash)        (50/3 hours,
                                                                                                    5/cell/15 min)
        200406   42979     41984 (-2.3%)     29841 (-29%)       26339 (-11.7%)     25649            256
        200407   29881     29575 (-1%)       21043 (-29%)       18695 (-11%)       17940            325
        200408   58233     56939 (-2.2%)     40198 (-29.4%)     36058 (-10.2%)     34545            592
        200409   34258     33117 (-3.3%)     23850 (-28%)       21089 (-11.6%)     20721            113
        200410   31549     29773 (-5.6%)     21007 (-29.5%)     17880 (-15%)       17519            97
        200411   18511     17128 (-7.5%)     12174 (-29%)       10227 (-16%)       10167            14
        200412   15541     15239 (-2%)       11007 (-27.7%)     9397 (-14.6%)      9353             5
        200501   14182     13578 (-4.2%)     9800 (-27.8%)      7868 (-19.7%)      7827             7
        200502   18882     18496 (-2%)       12770 (-31%)       10483 (-18%)       10341            22
        200503   27105     26623 (-1.8%)     19468 (-26.8%)     16822 (-13.5%)     16685            32
        200504   26320     25922 (-1.5%)     18556 (-28.4%)     15869 (-14.5%)     15643            25
        200505   29338     28868 (-1.6%)     20679 (-28.3%)     18163 (-12%)       17812            123
        200506   80271     79551 (-0.9%)     56817 (-28.6%)     50902 (-10.4%)     49472            598
        200507   78553     76925 (-2%)       54649 (-29%)       49205 (-10%)       48010            448
        200508   80335     79389 (-1.2%)     56675 (-28.6%)     50662 (-10.6%)     49745            290
          Table 4: Monthly distribution of trajectories before and after the filters described above, applied to the study dataset.



These filters discard more than one out of three trajectories. The majority of discarded trajectories
do not show a “normal” start: most of them originate from a “split”, and the others from a
“complex” start (“split” combined with “merge”). Recall the physical interpretation behind this
terminology: a “split” or a “merge” may result from the actual splitting or merging of cloud
systems, but may equally be an effect of the adaptive brightness thresholding scheme (cooling,
warming, or emergence of a secondary cloud tower).
The same filter effects are observed with GOES and METEOSAT data.
Moreover, one can notice how small the winter convective population is, which does not allow
a full assessment of PGE11 discrimination over this period (November to April).
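
As a quick check of the “more than one out of three” figure, the short calculation below takes
three rows of Table 4 (initial count and study-dataset count) and computes the overall fraction
discarded by the three filters combined. The counts are copied from the table; the variable and
function names are only illustrative.

```python
# (initial, study dataset) trajectory counts copied from Table 4
table4 = {
    "200406": (42979, 26339),
    "200501": (14182, 7868),
    "200508": (80335, 50662),
}


def discarded_fraction(initial: int, kept: int) -> float:
    """Fraction of trajectories removed by the three filters combined."""
    return 1.0 - kept / initial


for month, (initial, kept) in table4.items():
    print(f"{month}: {discarded_fraction(initial, kept):.1%} discarded")
```

For these three rows the discarded share lies between about 37% and 45%, the winter month
losing the most.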


3.2 VALIDATION OF PGE11 DISCRIMINATION 1.2 RELEASE

3.2.1 Overall quality of discrimination

The figure below points out the variability of skill with the period of the year. The discrimination
is poor for the cold period (winter in blue, springtime in green). The discrimination skills for June
and August 2004 are clearly the best, which could indicate an over-discrimination on the learning
dataset (these months were used to tune PGE11 discrimination).




 Figure 8. Monthly discrimination skill at several costs. All trajectory durations are considered. One curve per month, one
     point per cost ratio (50, 20, 10, 5, 1, 0.7, 0.5, 0.3, 0.2, 0.1). Cost ratios increase from bottom left to upper right.



The discrimination skill for August 2005 (empty circle on magenta dashed line) is poor relative to
the other summer months. We show in the next subsection that, due to the tuning choice, this poor
result is linked with a low level of electrical activity for this month.


At this stage, we can only conclude that RDT seems to show a poor efficiency for the cold period
(albeit on a small sample of convective cases).



The precocity is not analyzed at this stage.




Figure 9. Monthly discrimination skill at fixed cost. One curve per month, one point per tracking duration, with short
                  durations on the left side. Cost ratio = 1 (left graph). Cost ratio = 5 (right graph).



At fixed cost ratio (balanced cost ratio of 1, and cost ratio of 5 in order to decrease false alarms),
we depict monthly scores for different tracking durations (15 to more than 450 minutes). A one-hour
duration is represented by the fourth point from the bottom. All curves can be compared to those
of the tuning period (thick red curves).
The figure clearly shows that, on the test data set, the quality degrades rapidly in terms of false
alarms when trajectory duration increases. This is not a paradox: with the recommended V1.2
setting for the RDT detection parameter, there are numerous short-lived cold cells in the
non-convective sample; hence the ratio of non-convective trajectories to convective trajectories
becomes much lower when duration increases, and this mechanically raises the POFD value (see
formula), except for the (over-determined) learning dataset.
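
To make this mechanism concrete, here is a minimal sketch of the two scores, assuming the usual
contingency-table definitions: POD = hits / (hits + misses) and POFD = false alarms / (false
alarms + correct negatives). The formula referred to above is not reproduced in this section, so
these definitions are an assumption, and the counts used below are made up for illustration; they
are not results from the study.

```python
def pod(hits: int, misses: int) -> float:
    """Probability of detection: share of convective trajectories diagnosed as convective."""
    return hits / (hits + misses)


def pofd(false_alarms: int, correct_negatives: int) -> float:
    """Probability of false detection: share of non-convective trajectories diagnosed as convective."""
    return false_alarms / (false_alarms + correct_negatives)


# Made-up counts: many short-lived non-convective cells, few long-lived ones.
short_lived = pofd(false_alarms=50, correct_negatives=9950)
long_lived = pofd(false_alarms=40, correct_negatives=360)
print(f"POFD short durations: {short_lived:.3f}, long durations: {long_lived:.3f}")
```

With a nearly constant number of false alarms, shrinking the non-convective sample from 10000 to
400 trajectories is enough to move the POFD from 0.005 to 0.1 in this toy example.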




                                   Figure 10. Distribution of trajectory duration by population type




3.2.2 Detailed quality of discrimination for summer period
Discrimination
The convective discrimination skill during the summer period has to be assessed against the
calculated quality of discrimination on the learning data set (June and August 2004).
[Figure panels: Months of learning dataset; Summer 2004; Summer 2005]




Figure 11. Monthly distribution of discrimination skills upon trajectory duration. A curve for each duration class, one point
                       for each cost value. Learning data set skills on top (June and August 2004).



These graphs further show that the quality of discrimination on independent data, during either
the 2004 or the 2005 summer, cannot reach that of the learning data set.
The discrimination skill seems to be associated with electrical activity rather than with the overall
number of convective cloud cells: despite less numerous convective systems, September 2004 for
example shows higher skill than August 2005. The figures below reveal a stronger per-cell electrical
activity for convective cells in September, with characteristics closer to those of the learning
data set.




 Figure 12. Electrical activity (number of flashes) of convective cells. Learning data set (above), August 2005 (lower left),
                                                September 2004 (lower right).




Precocity of discrimination
The precocity of discrimination is gauged with respect to the start of electrical activity (first
flash), and also with respect to the “ground truth” used for discrimination tuning (here, the onset
of stronger electrical activity, i.e. 5 flashes per cell in 15 minutes).
The figure below shows the precocity results for some summer months. At a medium cost ratio of 1,
less than half of the convective systems are diagnosed as convective before their strong
electrical activity occurs. The percentage is only 20% when the reference is the first flash.
On the other hand, even with respect to the first flash only and for a balanced cost ratio of 1,
around 80% (and sometimes 90%) of cloud cells are diagnosed as convective no later than
30 minutes after the flash.
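
The percentages quoted above are shares of convective trajectories whose diagnosis time falls
before a reference time (first flash, or onset of strong activity), or within a given delay after
it. A minimal sketch of that computation is given below; the function name, the sign convention
and the lead times are hypothetical and made up for illustration, not taken from the study.

```python
from typing import Sequence


def fraction_diagnosed_by(
    lead_times_min: Sequence[float], max_delay_min: float = 0.0
) -> float:
    """Share of cells diagnosed no later than `max_delay_min` after the reference time.

    Each lead time is (reference time - diagnosis time) in minutes:
    positive means the diagnosis came before the reference event.
    """
    if not lead_times_min:
        raise ValueError("no trajectories")
    on_time = sum(1 for lead in lead_times_min if lead >= -max_delay_min)
    return on_time / len(lead_times_min)


# Made-up lead times relative to the first flash (minutes), for illustration only.
leads = [15, -15, -30, -15, 0, -45, -30, -15, 30, -60]
before_flash = fraction_diagnosed_by(leads)                  # at or before the flash
within_30 = fraction_diagnosed_by(leads, max_delay_min=30)   # up to 30 min after it
```

Changing the reference event (first flash versus onset of strong activity) only changes how the
lead times are computed; the counting itself stays the same.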




Figure 13. Precocity of discrimination vs ground truth of discrimination tuning (left, strong activity), and vs first flash (right).




3.3 CONCLUSION
We can conclude that the quality of the discrimination step of the RDT software shows a strong
seasonal variability. As expected, the choices made for discrimination tuning (see first chapter)
tie good discrimination to strong electrical activity. Over the domain of study, the quality of
discrimination seems rather poor outside the summer season, i.e. from November to April.
Moreover, the convective diagnosis (discrimination) by PGE11 often comes late when compared to
the first lightning flash.




                               Figure 14. Discrimination skill for the warm period

The quality of discrimination during the summer period and the quality published in the Software
User Manual (R.D.5) are comparable.
Cold periods show a low probability of detection (POD) and, for long trajectory durations, most
periods show a high probability of false detection (POFD). The filtering of convective systems,
implemented in PGE11 since release 1.0, is in that case a necessary additional step to make
the discrimination useful. This filter had already been validated in 2003 through a real-time
experiment with local forecasters. Its efficiency has been assessed again, particularly for long
“convective” trajectories.



4. PERSPECTIVES
The real-time validation performed on RDT V1.2 shows some weaknesses of the discrimination
method. These are not related to MSG data, but the high spatial resolution emphasizes some
limitations of the current method. In order to improve PGE11 discrimination skill, we make
several proposals, described in the next subsections.



4.1 DEFINE NEW POPULATIONS
The RDT discrimination algorithm builds three populations: convective, non-convective and
others. Each population is subdivided according to tracking duration (after re-sampling each
trajectory in order to extract all its sub-trajectories with the same start). The third population is
not studied, although it accounts for thirty percent of the whole population. It mixes trajectories
from the warm-up period, trajectories with a spatial or time break and, for the largest share,
trajectories with a “split” or “complex” start.
Moreover, the convective population includes both convective towers from mature systems (i.e.
more or less well-defined overshoots) and convective towers from thunderstorms developing
rapidly in clear air. The latter are tracked from their triggering, while the former are sensed rather
late by the IR and VIS channels. Therefore, the discrimination is not tuned on a homogeneous
population.
We have to focus more on the aim of this PGE. To improve the precocity of detection and the
physical rationale behind trajectory classification, we should increase the homogeneity of the
convective populations. We therefore suggest two main enhancements:
            -     separate developing thunderstorms from mature systems;
            -     use either lightning activity or a satellite-based signature of vertical development
                  as the time origin for convective trajectories.


These proposals change the current strategy of PGE11. Indeed, we would only perform
discrimination on those cloud cells that are possibly developing thunderstorms (defined for
instance as those first detected at a brightness temperature threshold warmer than a fixed value, or
close to ground temperature). Once cloud cells have reached a significant vertical development (as
identified by the fact that a cell is defined at a rather cold temperature threshold), we would
basically only track and depict these cold cloud cells (without discrimination). This would amount
to describing and tracking the convective systems through their embedding cloud shield. A further
development could additionally allow the overshoots to be tracked as sub-objects of convective
systems.
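
One possible reading of this proposed strategy is sketched below. It is only an illustration: the
two temperature values, the function name and the returned labels are hypothetical, since the
report does not fix any threshold, and the handling of cells that are neither first detected warm
nor yet cold is an assumption.

```python
WARM_START_K = 263.15  # hypothetical "warmer than a fixed value" limit for first detection
COLD_CELL_K = 233.15   # hypothetical "rather cold" threshold marking strong vertical development


def processing_mode(first_threshold_k: float, current_threshold_k: float) -> str:
    """Decide how a cloud cell would be handled under the proposed strategy.

    first_threshold_k:   brightness temperature threshold at first detection (K)
    current_threshold_k: threshold at which the cell is currently defined (K)
    """
    if current_threshold_k <= COLD_CELL_K:
        # Significant vertical development reached: track and depict only.
        return "track_only"
    if first_threshold_k >= WARM_START_K:
        # Possibly a developing thunderstorm: apply the discrimination.
        return "discriminate"
    # Assumption: cells first seen at colder thresholds are not discriminated.
    return "track_only"
```

Under this reading, discrimination is restricted to cells that were picked up early in their
development, which is the population the precocity objective is concerned with.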



4.2 CHANGE THE DISCRIMINATION METHOD
The discrimination method described in the first chapter combines a segmentation approach with
discrimination parameters. The result of the V1.2 validation shows an over-discrimination due to
the high number of classes relative to the number of convective occurrences in the learning data
set. We suggest to:
        1- First assess the results of an unchanged method when both relaxing the
           over-determination (through a significant reduction of the number of classes) and using
           the classification by minimum cloud cell temperature reached as a proxy for the
           separation of developing and mature storms. Adapting the parameters of the scheme used
           for cell definition (adaptive thresholding) could also help in this respect.


        2- And then, depending on the results reached, discard the multi-level segmentation
           scheme and add discrimination parameters:
           -   Add morphological (IR-based) discrimination parameters;
           -   Add discrimination parameters describing the stability and moisture characteristics
               of the environment, such as the output of other PGEs and/or NWP-derived parameters;
           -   Use principal component and/or factorial analyses as a basis for the discrimination
               algorithm.