Data Archival, Exchange and Seismic Data Formats
Bernard Dost 1), Jan Zednik 2), J. Havskov 3), R. Willemann 4) and P. Bormann 5)

1) ORFEUS Data Center, Seismology Division KNMI, P.O. Box 201, 3730 AE De Bilt, The Netherlands
2) Jan Zednik, Geophysical Institute AS CR, Bocni II/1401, 141 31 Prague, Czech Republic
3) Jens Havskov, Institute of Solid Earth Physics, University of Bergen, Allegaten 41, 5007 Bergen, Norway
4) Raymond J. Willemann, International Seismological Centre, Pipers Lane, Thatcham, Berkshire RG19 4NS, UK-England
5) Peter Bormann, GeoForschungsZentrum Potsdam, Telegrafenberg E428, D-14473 Potsdam, Germany

Seismology entirely depends on international co-operation. Only the accumulation of large sets of compatible high quality data in standardized formats from many stations and networks around the globe and over long periods of time will yield sufficiently reliable long-term results in event localization, seismicity rate and hazard assessment, investigations into the structure and rheology of the Earth interior and other priority tasks in seismological research and applications.

For almost a century, only parameter readings taken from seismograms were exchanged with other stations and regularly transferred to national or international data centers for further processing. Because traditional paper seismograms are unique and there were few opportunities for producing high-quality copies at low cost, original analog waveform data, cumbersome to handle and prone to damage or even loss, were rarely exchanged. The procedures for carefully processing, handling, annotating and storing such records have been extensively described in the 1979 edition of the Manual of Seismological Observatory Practice. The formats for reporting parameter readings from seismograms to international data centers such as the U.S. Geological Survey National Earthquake Information Service (NEIS), the International Seismological Centre (ISC) or the European Mediterranean Seismological Centre (EMSC) are also outlined in detail in this manual, in the section Reporting output. They have not changed essentially since then. On the other hand, the respective working groups on parameter formats of the IASPEI and of its regional European Seismological Commission (ESC) have debated for many years, so far without conclusive results or binding recommendations, how to make these formats more homogeneous, consistent and flexible so as to better accommodate other seismologically relevant parameter information as well.

Meanwhile, the Database Management System (DBMS) of the Center for Seismic Studies (CSS) developed a standard IMS1.0 format for exchanging parametric seismological data used to monitor the Comprehensive Test Ban Treaty (CTBT). It uses a commercial relational database management system to facilitate storage and retrieval of seismological data. Since seismological research has a broader scope than the International Monitoring System (IMS) for the CTBT, an IASPEI Seismic Parameter Format (ISF) has now been proposed. It conforms with the IMS1.0 standard but has essential extensions and is currently being tested at the ISC and NEIC. It is hoped that this format will be adopted as binding at the IASPEI meeting in 2001 and that a standardized instruction on how to report seismological parameter data to seismological data centers will follow soon. This new reporting format will fully exploit the much greater flexibility and potential of E-mail and Internet information exchange as compared to the older telegraphic reports. It will be added to this manual as soon as it is adopted and recommended by the IASPEI Commission on Practice for general use.

By far the largest volume of seismic data stored and exchanged nowadays is digital waveform data. The number of formats in existence and their complexity far exceed the variability for parameter data. With the wide availability of continuous digital waveform data and of communication technologies for world-wide transfer of such complete original data, their reliable exchange and archival has gained tremendous importance. Several standards for exchange and archival have been proposed; however, a much larger number of formats are in daily use. The purpose of the section on digital waveform data is to describe the international standards and to summarize the most often used formats. In addition, there will be a description of some of the more common conversion programs.

Parameter formats

Parameter formats deal with all earthquake parameters like hypocenters, magnitudes, phase arrivals etc. There are no real standards, except the Telegraphic Format (TF) used for many years to report phase arrival data to international agencies. The format is not used for processing. There have been attempts to modernize TF for many years through the IASPEI Commission on Practice and, as mentioned in the introduction, a new standard might emerge from the year 2001. Thus there is currently no modern and internationally accepted exchange format comparable to SEED. In practice many different formats are used, and the most dominant ones have come from popular processing systems.

NORDIC format

In the eighties, there was one of the first attempts to create a more complete format for data exchange and processing. The initiative came from the need to exchange and store data in the Nordic countries, and the so-called Nordic format was agreed upon among the 5 Nordic countries. The format later became the standard format used in the SEISAN database and processing system and is now widely used. The format tried to address some of the shortcomings of the HYPO71 format by being able to store nearly all parameters used, having space for extensions, and being useful for both input and output. An example is given below.

    1996     6 6 0648 30.4 L          62.635    5.047 15.0     TES 13 1.4 3.0CTES 2.9LTES 3.0LNAO1
    GAP=267             5.92           18.8      43.0 31.8 -0.5630E+03       0.8720E+03 -0.3916E+03E
    1996-06-06-0647-46S.TEST__011                                                                        6
    STAT SP IPHASW D HRMM SECON CODA AMPLIT PERI AZIMU VELO SNR AR TRES W                         DIS CAZ7
       FOO   SZ EP         C    648 48.47      136                                       -0.110    116 180
       FOO   SZ ESG             649    2.67                                               0.710    116 180
       FOO   SZ E               649    2.89          426.4   0.3                                   116 180
       MOL   SZ EP         C    648 49.97      144                                       -0.310    129   92
       MOL   SZ EPG        C    648 50.90                                                 0.410    129   92
       MOL   AZ E               649    5.86                                                        129   92
       MOL   SZ ESG             649    5.87                                               0.410    129   92
       MOL   SZ E               649    6.98          328.6   0.6                                   129   92
       HYA   SZ EP              648 56.78      135                                        0.810    174 159
       HYA   SZ IP         D    648 56.78                                                 0.810    174 159
       HYA   SZ EPG        D    648 57.56                                                 0.110    174 159
       HYA   SZ ESG             649 18.07                                                 0.610    174 159
       NRA0 SZ    Pn           0649 24.03                          309.6   8.5 139   5 -0.410      403 119
       NRA0 SZ    Pg           0649 32.60                          305.6   7.285.2   1    0.410    403 119

Example of Nordic format. The data is the same as seen in Tabs. 10.1 and 10.2. The format starts with a series of header lines, with the type of line indicated in the last column (80), followed by the phase lines, which have no line type indicator. There can be any number of header lines, including comment lines. The first line gives, among other things, origin time, location and magnitudes; the second line is the error estimate; the third line is the name of the corresponding waveform file; and the fourth line is the explanation line for the phases (type 7). The abbreviations are: STAT: Station code, SP: Component, I: Onset (I or E), PHAS: Phase, W: Weight, D: Polarity, HRMM SECON: Time, CODA: Duration, AMPLIT: Amplitude, PERI: Period, AZIMU: Azimuth at station, VELO: Apparent velocity, SNR: Signal-to-noise ratio, AR: Azimuth residual of location, TRES: Travel time residual, W: Weight in location, DIS: Epicentral distance in km and CAZ: Azimuth from event to station.

IMS formats

At about the same time as the Nordic format was made, a new format was also created for exchange of data within the International Monitoring System (IMS) of the Comprehensive Test Ban Treaty Organization (CTBTO) (formerly called the GSE parameter format). The IMS1.0 format is similar in structure to the Nordic format, but more complete in some respects and lacking features in others. A major difference is that the line length can be more than 80 characters, which is not the case for any of the previously described formats. The IMS1.0 format was the first real international parameter format (although decided upon by a very limited and specialized user group) and has been used extensively for data exchange within the institutions participating in the IMS. It has also been used for data exchange outside the IMS, as in the popular AutoDRM system; however, it has been used less as a processing format than the HYPO71 and Nordic formats. The format has recently been extended to include all information needed under the IASPEI Commission on Practice, to be approved in the year 2001. This GSE-IMS extended format is called the IASPEI Seismic Format (ISF). An example of the ISF format is given further below.

Some commonly encountered digital data formats

The following section gives an alphabetical list of common formats in use. The list will of course not be complete, particularly for formats in little use; however, the most important formats in use today (2000) are included. In a later section, a list of popular analysis software systems is given, as well as a brief description of some conversion programs.

In the following, only those formats are listed which can be converted by at least one of these analysis software systems.

The computer platform on which a binary file has been written is of particular importance, since only a few analysis programs work on more than one platform. Therefore, the data file should usually be written on the same platform as the one on which the analysis program is run.

AH: The Ad Hoc (AH) format is used in the AH analysis system.

CSS: The Center for Seismic Studies (CSS) Database Management System (DBMS) was designed to facilitate storage and retrieval of seismic data for seismic monitoring of test ban treaties.

ESSTF binary: The European Standard Seismic Tape Format (ESSTF).

GeoSig: Binary format used by GeoSig recorders.

GSE: The GSE format has been extensively used with the GSETT projects on disarmament.

Güralp format: Format used by Güralp recorders.

IRIS dial-up expanded ASCII: The IRIS dial-up data retrieval system format.

ISAM-PITSA: Indexed Sequential Access Method (ISAM) is a commercial database file system designed for easy access. PITSA bases its internal file structure for digital waveform data on ISAM.

Ismes: Format used by Italian Ismes recorders.

Kinemetrics formats: Kinemetrics has several binary formats.

Lennartz: Format for Lennartz recorders.

Nanometrics: Format used by Nanometrics recorders.

NEIC ORFEUS: The format of the early NEIC ORFEUS CD-ROMs.

PDAS: The format used by the Geotech PDAS recorders.

PITSA BINARY: A PITSA format.

Public Seismic Networks format.

SAC: Seismic Analysis Code (SAC) is a general purpose interactive program designed for the study of time sequential signals.

SEED: The Standard for the Exchange of Earthquake Data (SEED). SEED was adopted by the Federation of Digital Seismographic Networks (FDSN) in 1987 as its standard. IRIS has also adopted SEED, and uses it as the principal format for its datasets. It is worth pointing out that formats (such as SEED) designed to handle the requirements of international data exchange are seldom suited to the needs of individual researchers. Thus the wide availability of software tools to convert between SEED and a full suite of Class 2 formats is crucial for its success.

SEISAN: The SEISAN binary format is used in the seismic analysis program SEISAN.

SeisGram ASCII and binary: SeisGram software format.

Sismalp: Sismalp is a widespread French seismic data recording system.

Sprengnether: Format used by Sprengnether recorders.

SUDS: SUDS stands for "The Seismic Unified Data System". The SUDS format was launched to be a more well thought out format, useful for both recording and analysis and independent of any particular equipment manufacturer.
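The remark in the format list that a binary file depends on the platform on which it was written can be illustrated with byte order, which is one common (though not the only) source of such dependence. The following is a minimal sketch, assuming that a waveform sample is stored as a 4-byte signed integer; the sample value and the helper names `read_sample` and `swap_bytes` are hypothetical and do not belong to any of the formats listed.

```python
import struct

# Hypothetical value standing in for one 32-bit waveform sample.
sample = 1000

# The same integer as written by a little-endian and by a big-endian machine.
little = struct.pack("<i", sample)
big = struct.pack(">i", sample)


def read_sample(raw: bytes, byte_order: str) -> int:
    """Decode one 4-byte signed integer; byte_order is '<' or '>'."""
    return struct.unpack(byte_order + "i", raw)[0]


def swap_bytes(raw: bytes) -> bytes:
    """Reverse the byte order of a single sample."""
    return raw[::-1]


# Reading with the writer's byte order recovers the sample.
print(read_sample(big, ">"))
# Reading big-endian bytes as little-endian gives a different number.
print(read_sample(big, "<"))
# Swapping the bytes first repairs the value.
print(read_sample(swap_bytes(big), "<"))
```

A conversion program that moves data between platforms has to perform this kind of swap for every binary field, which is one reason why writing the file on the platform where it will be analysed is the simpler choice.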

ISF format

    Sta       Dist   EvAz Phase   Time      TRes   Azim AzRes    Slow    SRes Def     SNR           Amp      Per Qual Magnitude           ArrID
    KSAR      13.04 16.5 P        01:15:20.300      1.2 200.2      1.2    12.5       -0.3     TAS   47.5              1.5 0.33       a__                25616243
    BJT       16.14 340.0 P       01:15:59.460      1.9 154.3     -1.9     9.0       -2.7     T__   26.3              1.3 0.33       a__                25616240
    MJAR      17.24 44.5 P        01:16:09.650     -0.4 240.1      7.9    10.9       -0.1     T__    6.0              0.4 0.33       a__                25616246
    CMAR      23.49 258.8 P       01:17:16.050      0.7 60.9       0.3     8.4        0.6     T__   35.6             10.5 0.83       a__ mb         4.1 25616266
    CMAR      23.49 258.8 LR      01:27:05.155     -9.3 80.0      10.3    37.7       -0.4     ___                    96.9 19.42      a__ Ms         3.4 25636151
    Net      Chan F Low_F HighF AuthPhas         Date      eTime wTime eAzim wAzim            eSlow wSlow                  eAmp    ePer eMag Author           ArrID
     (#OrigID 12345678)
    IMS       BZH C 1.00 10.0 Pg            1997/01/01 0.200 0.000          10.0    0.400         2.5   0.400               0.1    0.05 1.0 EIDC           25636151
    IMS       BZH C 1.00 10.0 pPKKPPKP      1997/01/01 99.200 0.000         10.0    0.400         2.5   0.400               0.1    0.05     EIDC           25616240
    IMS       BZH C 1.00 10.0 P             1997/01/01 0.200 0.000          10.0    0.400         2.5   0.400               0.1    0.05     EIDC           25616246
    IMS       BZH C 1.00 10.0 P             1997/01/01 0.200 0.000          10.0    0.400         2.5   0.400               0.1    0.05     EIDC           25616266
     (#MEASURE RECTILINEARITY=0.8)
    IMS       BZH C 1.00 10.0 LR            1997/01/01         0.000 10.0 0.400     2.5 0.400 1234567.9 1.00       EIDC                                    25636151
     (#ORIG   PZH NRA0                      1997/01/01 01:27:05.123 359.9        1234.5           123.4        1.3)
     (#MIN                                             -99.999      -100.0      -1000.0      -1234567.9-10.23)
     (#MAX                                             +99.999      +100.0      +1000.0      +1234567.9+10.23)
     (#COREC                                            +0.500      -100.0      -1234.5                        0.12)

Format conversions

Ideally we should all use the same format. Unfortunately, as the previous descriptions have shown, there are a large number of formats in use. With respect to parameter formats, one can get a long way with the HYPO71, Nordic and GSE/ISF formats, for which converters are available, e.g., in the SEISAN system. For waveform formats, the situation is much more difficult.

Many processing systems require a higher level format than the often primitive recording formats, which is probably the most common reason for conversion. A similar reason is to move from one processing system to another.

The SEED format has become a success for archival and data exchange. Unfortunately, it is not very useful for processing purposes, and it is almost unreadable on a PC. So it is also important to be able to move down in the hierarchy.

There are essentially two ways of converting. The first is to request data from a data center in a particular format, or to log into a data center and use one of its conversion programs. The other, more common way is to use a conversion program on the local computer. Such conversion programs are available both as free-standing programs and as part of processing systems.

HYPO71
The very popular location program HYPO71 has been around for many years and has been the most used program for local earthquakes. The format was therefore limited to work with only a few of the important parameters. An example is shown below.

    FOO EPC   96 6 6 64848.47              62.67ES                               136
    MOL EPC   96 6 6 64849.97              65.87ES                               144
    HYA EP    96 6 6 64856.78              78.07ES                               135
    ASK EP    96 6 6 649 2.94              34.72ES                               183
    BER EPC   96 6 6 649 7.56              36.61ES
    EGD EPD   96 6 6 649 5.76              40.53ES
                         10   5.0

Example of an input file in HYPO71 format. Each line contains, from left to right: station code (max 4 characters), E (emergent) or I (impulsive) for onset clarity, polarity (C: compression; D: dilatation), year, month, day, and time (hours, minutes, seconds, hundredths of seconds) for the P-phase, seconds for the S-phase (seconds and hundredths of seconds only), S-phase onset and, in the last column, duration. The blank space between ES and duration has been used for different purposes, such as amplitude. The last line is a separator line between events and contains control information.

The format is rather limited since only P or S phase names can be used and the S-phase is referenced to the same hour-minute as the P-phase and the format cannot be

Digital waveform formats

Many different formats for digital data are used today in seismology. Most formats can be grouped into one of the following five classes:

1. Local formats in use at individual stations or networks, or used by a particular seismic recorder (e.g. ESSTF, PDR-2, BDSN, GDSN).
2. Formats used in standard analysis software (e.g. SEISAN, SAC, AH, BDSN).
3. Formats designed for data exchange and archiving (SEED, GSE).
4. Formats designed for database systems (CSS, SUDS).
5. Formats for real time data transmission.

Use of the term "designed" in describing Class 3 and 4 formats is intentional. It is usually only at this level that very much thought has been given to the subtleties of format structure which result in efficiency, flexibility, and extensibility.

Conversion programs

Since conversion programs are often related to analysis programs, we list some of the better known analysis systems and the formats they use directly.

    Program    Author(s)                   Input format(s)                       Output format(s)
    CDLOOK     R.Sleeman                   SEED                                  SAC, GSE
    Geotool    J.Coyne                     CSS, SAC, GSE                         CSS, SAC, GSE
    PITSA      F.Scherbaum, J.Johnson      ISAM, SEED, Pitsa binary, GSE, SUDS   ISAM, ASCII
    SAC        LLNL                        SAC                                   SAC
    SEISAN     J.Havskov, L. Ottemöller    SEISAN, GSE                           SEISAN, GSE, SAC
     used with teleseismic data. However, the format is probably one of the most popular
                                                                                                                   The four classes (1-4) show a hierarchical structure. Class 4 forms a superset of the                                      SeismicHand    K.Stammler      q, miniSEED,      q, GSE,
     formats ever for local earthquakes. The HYPO71 program has seen many
                                                                                                                  others, meaning that classes 1-3 can be deduced from it. The same argument applies to                                       ler                            GSE, AH, ESSTF    miniSEED
     modifications and the format exists in many forms with small changes.
                                                                                                                  class 3 with respect to classes 1 and 2. Nearly all format conversions performed at
                                                                                                                  seismological data centers are done to move upwards in the hierarchy for the purpose                                        SNAP           M.Baer          SED, GSE          SED, GSE

                                           HYPOINVERSE                                                            of data archiving and exchange with other data centers. Software tools are widely
                                                                                                                  available to convert from one format to another and particularly upwards in the                                             SUDS           P.Ward          SUDS              SUDS
      Following the popularity of HYPO71, several other popular location programs                                 hierarchy.
     followed like Hypoinverse and Hypoellipse, however none has been used as much as                                                                                                                                                         Event          M.Musil         ESSTF, ASCII      ESSTF, ASCII
                                                                                                                  The GDSN (Global Digital Seismic Network) format began as a Class 1 format, but
     HYPO71. Below is an example of the input format for Hypoinverse.
                                                                                                                  because it was used by an important global seismograph network (DWWSSN, SRO) it                                             SeisBase       T.Fischer       ESSTF, Mars88,    GSE
                                                                                                                  became accepted as a de facto standard for data exchange (Class 3). The beginning of                                                                       GSE
     96 6 60648                                                                                                   widespread international data exchange within the FDSN (Federation of Digital
     FOO EPC   48.5 136                                                                                           Seismic networks) and GSE (Global Seismic Exchange) groups in the late 1980s
     FOO ES    62.7                                                                                               revealed the GDSN format's weaknesses in this role and put in motion the process of
     MOL EPC   50.0 144                                                                                           defining more capable exchange formats.
     MOL EPC   50.9
     MOL ES    65.9

     Example of the Hypoinverse input format. Note that year, month, day, hour, min is
     only given in the header and only one phase is given per line
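
To illustrate how the header time and the per-line seconds combine, the following minimal Python sketch reads the example above. It is not a reader for real Hypoinverse files, which use fixed columns with many more fields. It assumes the header consists of five two-character fields (year, month, day, hour, minute), that two-digit years belong to the 1900s, and that phase lines can be split on blanks. The names Pick, parse_header and parse_event are hypothetical, and the trailing number on some lines is kept uninterpreted because the text does not state its meaning.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Pick:
    station: str
    onset: str               # E (emergent) or I (impulsive)
    phase: str               # P or S
    polarity: Optional[str]  # C, D, or None when absent
    time: datetime           # header minute plus the seconds on the phase line
    extra: Optional[str]     # trailing field, meaning not stated in the text


def parse_header(line: str) -> datetime:
    # Assumption: five two-character fields (year, month, day, hour, minute).
    year, month, day, hour, minute = (int(line[i:i + 2]) for i in range(0, 10, 2))
    # Assumption: two-digit years are 19xx, as in the 1996 example.
    return datetime(1900 + year, month, day, hour, minute)


def parse_event(text: str) -> List[Pick]:
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    header_minute = parse_header(lines[0])
    picks = []
    for ln in lines[1:]:
        tokens = ln.split()
        code = tokens[1]  # e.g. "EPC": onset, phase, optional polarity
        picks.append(Pick(
            station=tokens[0],
            onset=code[0],
            phase=code[1],
            polarity=code[2] if len(code) > 2 else None,
            # Seconds may exceed 60; they count from the header minute.
            time=header_minute + timedelta(seconds=float(tokens[2])),
            extra=tokens[3] if len(tokens) > 3 else None,
        ))
    return picks


EXAMPLE = """\
96 6 60648
FOO EPC   48.5 136
FOO ES    62.7
MOL EPC   50.0 144
MOL EPC   50.9
MOL ES    65.9
"""

picks = parse_event(EXAMPLE)
for p in picks:
    print(p.station, p.phase, p.time.isoformat())
```

Because the seconds are added to the header minute as a time offset, an S-phase value such as 62.7 rolls over into the next minute without any special handling.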
