 8   ANSI/NIST-ITL 1-2011 SUPPLEMENT:
 9            VOICE RECORD
10
11
12
13           12 February, 2013
14
15            Draft Version 5a
36
37   Contents
38   Introduction .......................................................................................................................4
39   Investigatory Voice Biometric Committee (IVBC) Membership ........................................5
40   Definitions of Specialized Terms Used in this Document .................................................6
41   Scope of the Type-11 Record...........................................................................................9
42   Source Documents .........................................................................................................10
43   General Organization of the Type-11 Record.................................................................10
44   Record Type-11: Voice Record ......................................................................................12
45        1.    Field 11.001: Record header .......................................................................28
46        2.    Field 11.002: Information Designation Character/IDC .............................28
47        3.    Field 11.003: Audio Object Descriptor/AOD ..............................................28
48        4.    Field 11.004: Voice Recording Source Organization/VRSO....................29
49        5.    Field 11.005: Voice Recording Content Descriptor/VRC..............................30
50        6.    Field 11.006: Audio Recording Device/REC ..............................................30
51        7.    Field 11.007: Acquisition source / AQS ..........................................................31
52        8.    Field 11.008: Record Creation Date/RCD ...................................................31
53        9.    Field 11.009: Voice Recording Creation Date/VRD ...................................31
54        10. Field 11.010: Total Recording Duration/TRD .............................................31
55        11. Field 11.011: Physical Media Object/ PMO ................................................32
56        12. Field 11.012: Container Format/CFT...........................................................33
57        13. Field 11.013: Codec/CDC.............................................................................34
58        14. Field 11.014: Preliminary Signal Quality/PSQ ...........................................35
59        15. Fields 11.015-020: Reserved Fields............................................................36
60        16. Field 11.021: Redaction/ RED......................................................................36
61        17. Field 11.022: Redaction Diary/RDD ...........................................................37
62        18. Field 11.023: Snipping Segmentation/ SNP ..............................................38
63        19. Field 11.024: Snipping Diary/SPD ..............................................................38
64        20. Field 11.025: Diarization/DIA .......................................................................39
65        21. Field 11.026: Segment Diary/SGD..............................................................40
66        22. Fields 11.027-030: Reserved Fields............................................................41
67        23. Field 11.031: Time of Segment Recording /TME ......................................41
68        24. Field 11.032: Segment Geographical Information/GEO............................42
69        25. Field 11.033: Segment Quality Values/SQV ...............................................44
70        26. Field 11.034: Vocal Collision Identifier/VCI................................................45
71        27. Field 11.035: Processing Priority /PPY ......................................................45
72        28. Field 11.036: Segment Content/SCN ..........................................................46
73        29. Field 11.037: Segment Speaker Characteristics/SCC ...............................47
74        30. Field 11.038: Segment Channel/SCH..........................................................50
75        31. Fields 11.039-050: Reserved Fields.............................................................52
76        32. Field 11.051: Comments/COM.....................................................................52
77        33. Fields 11.052-099: Reserved Fields............................................................52
78        34. Fields 11.100-900: User-defined fields/UDF...............................................52
79        35. Field 11.901: Reserved field ........................................................................52
80        36. Field 11.902: Annotation information/ANN ................................................52
81        37. Fields 11.903-992: Reserved Fields............................................................53
82        38. Field 11.993: Source agency name/SAN ....................................................53
83   39.   Field 11.994: External file reference/EFR ...................................................53
84   40.   Field 11.995: Associated Context/ACN ......................................................53
85   41.   Field 11.996: Hash/HAS ...............................................................................53
86   42.   Field 11.997: Source representation/SOR..................................................53
87   43.   Field 11.998: Reserved field ........................................................................54
88   44.   Field 11.999: Voice record/DATA................................................................54
89
90




 91   Introduction
 92
 93   Speaker recognition presents some unique challenges not found in other forms of human
 94   recognition, such as fingerprint, iris or face. The human voice, generally carrying both speech
 95   and non-speech sounds, propagates varying distances through air (principally) or another
 96   medium to reach acoustic transducers (usually microphones) of varying amplitude and phase
 97   response. For purposes of the Type-11 record, a “speaker” is any person producing
 98   “vocalizations” from the throat or oral cavity, which may be voiced (activating the vocal cords)
 99   or unvoiced (such as aspirations, whispers, tongue clicks and other similar sounds). The current
100   state of technology for speaker recognition usually requires vocalizations containing some
101   speech (linguistic content). An automated interlocutor is considered to be a “speaker” for the
102   purposes of this record type, since the intent is to directly mimic human speech, although such a
103   speaker will not be the primary subject of a speaker recognition transaction.
104
105   When voice sounds carry speech, that speech usually occurs within a social context involving
106   more than one speaker. Consequently, a speech signal collected in situ may contain the voices of
107   multiple speakers, each voice signal with its own transfer function between the speaker and the
108   transducer. Segmenting and de-conflicting overlapped voice signals (“speaker separation”)
109   through automation is currently an unsolved problem in the general case, thus implying that
110   many operational applications of speaker recognition technology will involve audio recordings
111   containing multiple speakers and multiple acoustic transmission paths.
112
113   The ANSI/NIST-ITL standard was originally developed for the interchange of fingerprint data,
114   whether collected from latent prints lifted from crime scenes, scanned off of ink-based
115   fingerprint cards or taken directly from electronic “live” scanners. The standard, therefore, is
116   explicitly restricted to cases where, “All records in a transaction shall pertain to a single subject.”
117   This restriction presents special challenges for use of the standard for interchange of natural
118   voice signals, containing both speech and non-speech sounds, collected in a social, multi-speaker
119   context and stored either digitally or in analog form and either electronically or on physical
120   media. Therefore, a voice record type will have to accommodate:
121      1) bespoke recordings of single speaker voice signals for the specific purpose of speaker
122      recognition;
123      2) conversational and interview scenario voice signals, digitized and segmented into clips, or
124      “snips”, restricted to speech from the single speaker of interest (the voice data subject). In a
125      conversational setting, a speaker “turn” might be divided into several segments as the
126      conditions of the speech and its collection change;
127      3) unsegmented natural voice signals on digital or analog media, with or without an
128      accompanying timing diary of the segments attributable to speech from the single speaker of
129      interest;
130      4) unannotated speech segment(s) for input to annotation work-flow tools.
131
132   In all cases, the voice recordings referred to in the Type-11 record must accommodate signals collected
133   non-continuously and stored in multiple segments, a requirement that has been encountered before in
134   other ANSI/NIST record types. For example, the Type-14 (variable-resolution fingerprint images) record
135   has the capacity to carry multiple fingerprints in one image with segment boundary information for each
136   finger in the image, albeit from a single individual, and serves as a model in this regard.
137
138   There are other challenges facing a speaker recognition standard. The most significant ones
139   include:
140
141      • Voice signals generally contain both speech and non-speech elements, either of which
142          might be useful in speaker recognition applications.
143      • Unlike other modalities, voice signals are collected in time, not spatial, dimensions and
144          will not have a single “time of collection”.
145      • In mobile applications, even a single segment of a voice signal may not be linkable to a
146          single geographic location.
147      • Voice signals containing speech have direct informational content. Unlike other forms of
148          biometric recognition, the speech itself means something and, even if stripped of all
149          personally identifiable information including the acoustic content itself, may require
150          protection for privacy or security reasons.
151      • Unlike other modalities, voice signals may reflect the social and behavioral conditions of
152          the collection environment, including the relationship between the data subject and any
153          interlocutors.
154
155   Consequently, creating a Type-11 record for voice signal transmission within the ANSI/NIST-ITL
156   context is more complicated than simply copying an existing ANSI/NIST record type and
157   changing terminology (for example, substituting “voice” for “fingerprint” and “signal” for
158   “image”). In the case of DNA Type-18 records, the standard has previously shown significant
159   flexibility in dealing with record types which carry non-spatial data with significant content
160   beyond that required for the recognition of individuals.
161

162   Investigatory Voice Biometric Committee (IVBC) Membership
163
164
165   Joseph Campbell, MIT
166   Carson Dayley, FBI
167   Craig Greenberg, NIST
168   Peter Higgins, Consultant
169   Alysha Jeans, FBI
170   Ryan Lewis, FBI
171   Jim Loudermilk, FBI
172   Kenneth Marr, FBI
173   Alvin Martin, Consultant
174   Hirotaka Nakasone, FBI
175   Mark Przybocki, NIST (IVBC Chair)
176   Vince Stanford, NIST
177   Pedro Torres-Carrasquillo, MIT
178   James Wayman, Consultant
179   Bradford Wing, NIST
180
181   ANSI/NIST-ITL Voice Working Group (ANVWG)
182
183   In addition to the above members of the IVBC, the following persons participated in the
184   ANVWG:
185
186   Bonny Scheier, Saber
187   Walter Tewes, Forensic Odontology Partners
188 Martin Herman, NIST
189    TO BE ADDED TO AS NEEDED (CHECK SIGN-IN LIST)
190

191   Definitions of Specialized Terms Used in this Document
192
193   The following definitions are supplemental to Section 4 of ANSI/NIST-ITL 1-2011.
194
195   Acoustic signal
196   Pressure waves in a medium with information content.
197
198   Audio signal
199   Information in analog or digital form that contains acoustic content (voice or otherwise).
200
201   Audio recording
202   A stored audio signal capable of being transduced into an acoustic signal.
203
204   Contemporaneous
205   Existing at or occurring at the same period of time.
206       Note: In this record type, the phrase “contemporaneous capture of a voice signal” indicates
207     recording of the voice signal at the time of the speaker vocalization.
208
209   Diary
210   List giving the start and stop times of speech segments of interest pertaining to the primary voice
211   subject within the voice signal.
212         Note: Diarization of segments from multiple speakers requires multiple Type-11 records,
213         one for each speaker. These multiple Type-11 records may be contained in a single
214         transaction, as long as the transaction is focused upon a single subject.
215
216   Interlocutor
217   Definition solicited
218
219   Known Voice Signal
220   A voice signal from an individual who has been “identified”, or individuated in a way that allows
221   linking to additional, available information about that individual.
222
223   Digital sample (n)
224   Authoritative definition solicited.
225   A representative value of a signal at a chosen instant, derived from a portion of that signal.
226   Source: Vocabulary of Digital Transmission and Multiplexing, and Pulse Code Modulation (PCM)
227   Terms, ITU-T Recommendation G.701 (March 1993).
228   (v) To obtain the values of a function for regularly or irregularly spaced distinct values from its
229   domain. Source: ISO 2382-2.
230
231   Metadata


232   Documentation about the biometric data objects necessary or helpful in supporting the types of
233   transactions likely to be encountered in law enforcement and homeland security applications.
234   Note: Metadata may include both signal-related and content-related information.
235
236   Physical medium
237   Any external storage material of the voice signal and content information in either analog or
238   digital form. Examples include reel-to-reel recording tape, cassette tape, Compact Disc, and
239   phonograph record.
240
241   Quality
242   An ordinal estimate of the usefulness of biometric data for the purpose of recognition.
243
244   Questioned Voice Signal
245   A voice signal from an individual who is unknown and has not yet been linked to any previously
246   encountered individual. Note: The task of speaker identification is to link a questioned voice
247   signal to a known voice signal through determination of a common speaker.
248
249   Record (n)
250   An ANSI/NIST-ITL biometric data format type, in its entirety, within an ANSI/NIST-ITL
251   transaction.
252         Note 1: In this document, this will be the Type-11 record unless otherwise stated.
253         Note 2: An ANSI/NIST-ITL transaction might contain multiple Type-11 records, as well
254         as other record types, including the mandatory Type-1 record.
255
256   Record (v)
257   The act of converting an acoustic voice signal directly from an individual onto a storage medium,
258   perhaps through contemporaneous, intermediate (transient) signal types.
259         Note 1: This definition is retained because of its entrenchment in natural language use.
260        Consequently, a record (n) is not recorded; it is created.
261         Note 2: Transcoding is the term used for further processing of the voice signal and any
262        digital or analog representation of that signal.
263
264   Record creation
265   The act of creating a record contained in an ANSI/NIST-ITL transaction.
266
267   Recording (n)
268   A stored acoustic signal in either analog or digital form.
269
270   Redaction
271   Over-writing of segments of a voice signal for the purpose of masking speech content in a way
272   that does not disrupt the time record of the original recording.
273
274   Snip (n)
275   A segment of a voice signal extracted from a larger voice signal recording.
276        Note: Also called a “clip” or a “cut” in some communities.
277

278   Snip (v)
279   Extraction of segments of a voice signal in a way that disrupts the continuity and time record of
280   the original recording.
281
282   Speaker
283   A vocalizing human, whether or not the vocalizations contain speech.
284    Note: An interlocutor might be a synthesized voice, which can be considered a “speaker”
285   within the context of this supplement.
286
287   Speech
288   Audible vocalizations made with the intent of communicating information through linguistic
289   content.
290       Note 1: Nonsensical vocalizations with linguistic content will be considered as speech.
291       Note 2: Speech can be made by humans, by machine synthesizers, or by other means.
292
293   Subject of the record
294   The person to whom the data in the record applies.
295     Note: The subject of the record need not be the subject of the transaction, because a transaction
296   can include Type-11 records for interlocutors and others not named as the subject of the
297   transaction.
298
299   Subject of the transaction
300   The person to whom the transaction applies.
301    Note: The subject of a record need not be the subject of the transaction.
302
303   Track
304   Authoritative definition solicited.
305    On a data medium, a path associated with a single read/write head as the data medium moves
306   past it. ISO 2382-12 International Organization for Standardization. Geneva : ISO, 1988. 1 v.
307
308   Transaction
309   A transmission between sites or agencies comprised of records, types of which are defined in
310   ANSI/NIST-ITL.
311       Note: An ANSI/NIST-ITL transaction is called a file in Traditional encoding and an
312      Exchange Package in XML encoding.
313
314   Transcoding
315   Any transfer, compression, manipulation, re-formatting or re-storage of the original recorded
316   material.
317        Note 1: Transcoding is not the first recording of the acoustic signal.
318        Note 2: Transcoding can be lossless or lossy.
319
320   Voice data file
321   The digital, encoded file primarily containing the sounds of vocalizations of both speech and
322   non-speech content, convertible to an acoustic signal replicating the original acoustic signal.


323       Note 1: A voice data file is extracted from an audio recording, but not all audio recordings
324      contain voice signals and not all voice data is speech.
325       Note 2: A physical medium, such as a cassette tape, contains a voice signal but is not a
326      voice data file.
327
328   Voice recording
329   A signal, stored on a digital or analog medium, of vocalizations containing both speech and non-
330   speech content.
331
332   Voice signal
333   Any audible vocalizations emanating from the human mouth, throat and nasal cavity with or
334   without speech content.
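
The distinction drawn in the definitions above between redaction (which keeps the time record intact) and snipping (which disrupts it) can be illustrated with a minimal sketch. This is not part of the Type-11 specification: the function names `redact` and `snip`, the use of sample indices instead of times, and the diary layout are all assumptions made for illustration.

```python
# Illustrative only: a voice signal is modelled as a plain list of samples,
# and segment boundaries are sample indices (start inclusive, stop exclusive).

def redact(samples, start, stop, fill=0):
    """Overwrite samples[start:stop] with a filler value.

    The output has the same length as the input, so the time record of the
    original recording is not disrupted.
    """
    return samples[:start] + [fill] * (stop - start) + samples[stop:]


def snip(samples, segments):
    """Extract the listed (start, stop) segments from a larger recording.

    Returns the extracted snips plus a hypothetical diary recording where
    each snip sat in the original recording, since the snips themselves no
    longer carry that timing.
    """
    snips = [samples[a:b] for a, b in segments]
    diary = [
        {"snip": number, "start": a, "stop": b}
        for number, (a, b) in enumerate(segments, start=1)
    ]
    return snips, diary
```

For a ten-sample signal, `redact(signal, 3, 6)` still returns ten samples, whereas `snip(signal, [(0, 2), (7, 10)])` returns five samples in two pieces, and only the diary preserves their original positions.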
335
336

337   Scope of the Type-11 Record
338
339   The following updates Section 5.3.11 of ANSI/NIST-ITL 1-2011.
340
341    Type-11 records shall support the transmission of audio recordings containing speech by one or
342   more speakers, including noise (data of no interest to the transaction, whether speech, non-
343   speech voice data, or non-voice data) in the context of an ANSI/NIST-ITL transaction
344   pertaining to a single, perhaps unknown, individual. These transmissions support transactions
345   related to detecting and recognizing speakers, extracting from an audio recording speech
346   segments attributable to a single speaker, and linking speech segments by speaker, whether these
347   functions are to be accomplished through automated means (computers), human experts, or
348   hybrid human-assisted systems. Related functions, such as redaction, authentication, phonetic
349   transcription and enhancement, while also supported, are not the primary concern of this record
350   type, although audio recordings supporting these related functions may be transmitted via Type-
351   11 records. This standard does not specify which techniques will be used in any human expert,
352   automated or hybrid voice processing application and does not specify the form of the
353   examination report. Although not designed for use in logical or physical access control, time-
354   and-attendance, point-of-sale, or other consumer or commercial applications, nothing in this
355   record type should be construed as preventing its application in these or other transaction types
356   not specifically addressed here. This record type does not support streaming transactions. This
357   record does not define the transmission of features or models extracted from voice data, but does
358   allow the user to define specific fields to contain such information, in accordance with an
359   implementation domain or application profile. Fields that may be used for user-specific purposes
360   are specified as such in this supplement. This record type does not restrict the media by which
361   the audio recording will be transmitted, but will support digital transmission of transaction
362   information regardless of the audio recording media.
363
364
365



366 Source Documents
367 The following is added to Annex I of ANSI/NIST-ITL 1-2011.
368
369    1. Collaborative Digitization Program, Digital Audio Working Group, “Digital Audio Best
370        Practices”, version 2.1, October, 2006,
371        http://ucblibraries.colorado.edu/systems/digitalinitiatives/docs/digital-audio-bp.pdf
372    2. Audio Engineering Society, “AES standard for audio metadata - Audio object structures
373        for preservation and restoration”, AES57-2011, Sept. 21, 2011
374    3. Audio Engineering Society, “AES standard for audio metadata - Core audio metadata”,
375        AES60-2011, Sept. 22, 2011
376

377 General Organization of the Type-11 Record
378
379 The Type-11 record is organized into 6 parts:
380
381      I) Mandatory fields;
382      II) Initial global fields, applying to the entire voice data record;
383      III) Indication of presence and definition of segments within the voice data record;
384      IV) Fields applying to the individual segments;
385      V) Additional global fields modeled on other record types in the ANSI/NIST standard;
386      VI) Fields containing or pointing to the voice recording.
387
388 I. Mandatory fields:
389      01 Record header
390      02 Information designation character
391
392 II. The initial global fields are:
393      03 Audio object descriptor (internal or external digital file, external physical media
394      containing digital/analog known/unknown recording)
395      04 Voice recording information (source of the voice recording, phone numbers and POCs)
396      05 Voice recording content descriptor (number of speakers, status of speakers)
397      06 Recording device (hardware/software)
398      07 Acquisition source
399      08 Type-11 record creation date
400      09 Voice recording creation date
401      10 Total recording duration
402      11 Physical media object (tape, CD, phonograph record,...)
403      12 Container Format (wav, ogg, mp3/4)
404      13 Codec (PCM types)
405      14 Preliminary signal quality (multiple quality metrics possible)
406      15-20 Fields reserved for future ANSI/NIST use
407
408 III. The presence and definition of segments within the audio file follow.
409      21 Redaction (yes/no, by whom?)
410      22 Redaction diary (where in recording and why redaction occurred)
411      23 Snipping (yes/no, by whom?)
412      24 Snipping diary (separate snips/clips/cuts are numbered and identified by relative start/end
413      times, comments)
414      25 Diarization (yes/no, by whom?)
415      26 Segment diary (segments are numbered with relative start/end times, labels of attributes
416     attributed to the speech and speaker of each segment, and comments.)
417      27-30 Fields reserved for future ANSI/NIST use
418
419 IV. Repeating sets of sub-fields labeled by segment numbers as designated in the diarization. (If the
420      segment number is "0", that becomes the default for all segments not otherwise listed.)
421     31 Date/time of recording of segment/snip and labeled date/time of recording
422     32 Geolocation of data subject of this Type-11 record at start of segment/snip
423     33 Segment/snip quality values (possible multiple values for each segment)
424     34 Vocal collision indicator (two or more persons speaking at once)
425     35 Processing priority of the segment/snip
426     36 Segment content (language, prompted/read/conversation, word transcript, phonetic
427     transcript, translations)
428     37 Segment/snip speaker characteristics (impairment, intelligibility, health, emotion, vocal
429     effort, vocal style, language proficiency)
430     38 Segment channel (transducer, capture environment, channel type)
431     39-50 Fields reserved for future ANSI/NIST use
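
The segment-number rule stated for Part IV (an entry for segment "0" acts as the default for every segment not otherwise listed) amounts to a lookup with a fallback. The sketch below illustrates that rule only. The function name, the dictionary representation and the example values are hypothetical and are not defined by this supplement.

```python
from typing import Optional


def value_for_segment(entries: dict, segment: int) -> Optional[str]:
    """Return the value recorded for `segment` in a repeating per-segment field.

    `entries` maps segment numbers to values. An entry under segment 0 is
    used as the default for any segment that has no entry of its own.
    Returns None when there is neither a specific entry nor a default.
    """
    if segment in entries:
        return entries[segment]
    return entries.get(0)


# Hypothetical channel labels: segment 3 is listed explicitly, all other
# segments fall back to the segment-0 default.
channel_by_segment = {0: "landline telephone", 3: "mobile telephone"}
```

With these example values, segment 3 resolves to its own entry, while segments 1 and 2 resolve to the segment-0 default.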
432
433 V. More global fields modeled on other record types in ANSI/NIST-ITL 1-2011:
434     51 Global comments
435     52-99 Fields reserved for future ANSI/NIST use
436     100-900 Fields reserved for user-defined use
437     902 Annotation information
438     903-992 Fields reserved for future ANSI/NIST use
439     993 Source Agency Name
440 VI. The voice recording or pointers to that recording:
441     994 External file reference
442     995 Associated context reference (Type 21 record)
443     996 Voice data file hash
444     997 Source representation reference (Type 20 record with original audio)
445     998 Field reserved for future ANSI/NIST use
446     999 Voice data file
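
The six-part organization listed above can be summarized as a mapping from field number to part. The following is a minimal sketch under stated assumptions, not a normative definition: the names `PARTS` and `part_of` are hypothetical, reserved ranges are grouped with the part in which they are listed, and field 901 (absent from the list above but shown as reserved in the table of contents) is assumed to fall in Part V.

```python
# Field-number ranges for each part of the Type-11 record, following the
# list above. range() upper bounds are exclusive.
PARTS = [
    (range(1, 3), "I"),        # 01-02: mandatory fields
    (range(3, 21), "II"),      # 03-14: initial global fields; 15-20 reserved
    (range(21, 31), "III"),    # 21-26: segment presence/definition; 27-30 reserved
    (range(31, 51), "IV"),     # 31-38: per-segment fields; 39-50 reserved
    (range(51, 994), "V"),     # 51-993: additional global fields (901 assumed here)
    (range(994, 1000), "VI"),  # 994-999: voice recording or pointers to it
]


def part_of(field_number: int) -> str:
    """Return the part (I to VI) that contains the given Type-11 field number."""
    for numbers, part in PARTS:
        if field_number in numbers:
            return part
    raise ValueError(f"{field_number} is not a Type-11 field number")
```

For example, `part_of(26)` (the segment diary) returns `"III"` and `part_of(999)` (the voice data file) returns `"VI"`.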
447
448
449 The following is a replacement for Section 8.11 of ANSI/NIST-ITL 1-2011.

450
451
452



453   Record Type-11: Voice Record
454
455   The Type-11 record shall be used to exchange a single voice data file or a physical medium
456   containing a digital or analog voice recording, together with fixed and user-defined textual
457   information fields (referred to in this standard as “metadata”) pertinent for understanding and
458   processing the voice signal.
459
460   The Type-11 record references a recording of a voice signal stored as a digital voice data file
461   within the record, or a recording external to the transaction. Information regarding the recording
462   type, the voice data file size, and other parameters or comments required to process the voice
463   data file are given as fields within the Type-11 record. If the Type-11 record references a voice
464   recording contained in a physical medium (e.g., an analog tape, a digital tape, a CD, or a
465   phonograph record), the label and location of that medium shall be indicated in this Type-11
466   record, along with the information necessary to render the stored recording as acoustic output.
467
468   A transmitted voice recording may be processed by the recipient agencies to isolate the voice
469   signal of interest and to extract the desired feature or model information required for voice
470   comparison, speaker detection, or speech attribution purposes.
471
472   If there are multiple speakers of interest in a voice recording supported by a Type-11 record, then
473   a separate ANSI/NIST-ITL transaction may be created for each individual of interest, each
474   transaction possibly containing the same Type-11 records. If the voice recording included in or
475   pointed to by a Type-11 record has been extracted from a longer source recording, that source
476   recording may be included in digital form within the transaction as a Type-20 record, or referred
477   to as an external source in either digital or analog format in the Type-20 record. Voice models or
478   features extracted from voice data are not explicitly accommodated in this record, but may be
479   transmitted in user-defined fields.
480
481   All text fields are to be in Unicode.
482




         483
         484                                               Table 1 Type-11 record layout
         485   Key for Character type: N=Numeric; A=Alphabetic; AN=Alphanumeric; B=Binary or Base64; U=Unicode
         486   Key for Cond. code: M=Mandatory; O=Optional; D = Dependent upon another value or condition described in the text;
         487   M↑=Mandatory if the field/subfield is used; O↑=Optional if the field/subfield is used; S=special character
         488
         489
 Field          Mnemonic            Content                C         Character                     Value               Occurrence
Number                             Description             on                                    Constraints
                                                            d    T       M         M
                                                                 y       i         a                                     M         M
                                                                 p       n         x                                     i         a
                                                           co                                                            n         x
                                                                 e       #         #
                                                           de                                                            #         #


                                                                encoding specific: see
                                                                                             encoding specific: see
                                                                     Annex B:
                                                                                             Annex B: Traditional
                                   RECORD                       Traditional encoding
11.001
                                   HEADER
                                                           M                                 encoding or Annex C:         1        1
                                                                or Annex C: NIEM-
                                                                                              NIEM-conformant
                                                                    conformant
                                                                                                encoding rules
                                                                   encoding rules

11.002    IDC         INFORMATION DESIGNATION CHARACTER     M    N    1   2       0 ≤ IDC ≤ 99 integer                           1       1


11.003    AOD         AUDIO OBJECT DESCRIPTOR               M    N    1   1       See Supplement Table 2; 0 ≤ AOD ≤ 5            1       1


11.004    VRSO        VOICE RECORDING SOURCE ORGANIZATION   O                                                                    0       1
                STC   source organization type code         M↑   A    1   1       STC = U, P, I, G, or O                         0       1
                SON   source organization name              O↑   U    1   400     none                                           0       1
                POC   point of contact                      O↑   U    1   200     none                                           0       1
                CSC   code of sending country               O↑   AN   1   3       value from ISO-3166-1                          0       1


11.005    VRC         VOICE RECORDING CONTENT DESCRIPTOR    O                                                                    0       1
                AVI   assigned voice indicator              O↑   B    0   1       0=questioned voice; 1=assigned voice           0       1
                SPC   speaker plurality code                O↑   A    0   1       S=single speaker; M=multiple speakers          0       1




11.006    REC         AUDIO RECORDING DEVICE                O                                                                    0       1
                RDD   recording device description text     O    U    1   4000    none                                           0       1
                MAK   recording device make                 O    U    1   50      none                                           0       1
                MOD   recording device model                O    U    1   50      none                                           0       1
                SER   recording device serial number        O    U    1   50      none                                           0       1
                COM   comments                              O    U    1   4000    none                                           0       1


11.007    AQS         ACQUISITION SOURCE                    M                                                                    1       1
                AQT   acquisition source type               M    N    1   2       value from Table 88                            1       1
                A2D   analog to digital conversion          D    U    1   200     none                                           0       1
                FDN   radio transmission format description D    U    1   200     none                                           0       1
                AQSC  acquisition special characteristics   O    U    1   200     none                                           0       1


11.008    RCD         RECORD CREATION DATE                  M    See Section 7.7.2.4 Local date and time; encoding specific: see Annex B: Traditional encoding or Annex C: NIEM-conformant encoding rules    1       1


11.009    VRD         VOICE RECORDING CREATION DATE         O    See Section 7.7.2.4 Local date and time; encoding specific: see Annex B: Traditional encoding or Annex C: NIEM-conformant encoding rules    0       1




11.010    TRD         TOTAL RECORDING DURATION              O                                                                    0       1
                TIM   total time                            O↑   N    1   11      1 ≤ TIM ≤ 99999999999 (in microseconds) (no commas)   0       1
                CBY   compressed bytes                      O↑   N    1   14      1 ≤ CBY ≤ 99999999999999 (no commas)           0       1
                TSM   total digital samples                 O↑   N    1   14      1 ≤ TSM ≤ 9999999999999 (no commas)            0       1


11.011    PMO         PHYSICAL MEDIA OBJECT                 D                                                                    0       1
                MTD   media type description                M↑   U    1   300     none                                           1       1
                RSP   recording speed                       O↑   NS   1   9       0.9999999 ≤ RSP ≤ 999999999; value may include a decimal point or be an integer (no commas)   0       1
                RSU   recording speed measurement units description text   D↑   U    1   300     none                            0       1
                EQ    equalization                          O↑   AN   1   100     none                                           0       1
                TRC   track count                           O↑   N    1   2       1 ≤ TRC ≤ 99                                   0       1
                STK   speaker track number                  O↑   NS   1   200     list of integer values between 1 and 99 inclusive that are separated by commas   0       99
                COM   comments                              O↑   U    1   4000    none                                           0       1


11.012    CFT         CONTAINER FORMAT                      O    N    1   2       See Supplement Table 3                         0       1


11.013    CDC         CODEC                    D                                                        0       1






                CDT   codec type code                            N    1   3       See Supplement Table 4                         1       1
                SRT   digital sampling rate number               N    1   9       0 ≤ SRT < 100000000 (Hz) integer value; 0 = variable or unknown   0       1
                BIT   bit depth count                            N    1   2       0 ≤ BIT ≤ 60 positive integer; 0 = variable or unknown   0       1
                EDN   endian code                                N    1   1       0=big; 1=little; 2=native                      0       1
                PNT   fixed point indicator                      N    1   1       0=floating point; 1=fixed point                0       1
                CHC   channel count                              N    1   2       1 ≤ CHC ≤ 99                                   0       1
                COM   comments                                   U    1   4000    none                                           0       1


11.014    PSQ         PRELIMINARY SIGNAL QUALITY            O                                                                    0       1
                      Subfields: Repeating sets of information items                                                             1       9
                QVU   quality value                         M↑   N    1   3       0 ≤ QVU ≤ 100 or 255 = quality not assessed; integer   1       1
                QAV   algorithm vendor identification       M↑   H    4   4       0x00 ≤ QAV ≤ FFFF                              1       1
                QAP   algorithm product identification      M↑   N    1   5       0 ≤ QAP ≤ 65534 positive integer               1       1
                COM   comments                              D    U    1   300     none                                           0       1


11.015 – 11.020       RESERVED FOR FUTURE USE only by ANSI/NIST-ITL



11.021     RED         REDACTION           O                                                        0       1






                RDI   redaction indicator                   M↑   B    1   1       0=no; 1=yes                                    1       1
                RDA   redaction authority organization name O↑   U    1   300     none                                           0       1
                COM   comments                              O↑   U    1   4000    none                                           0       1


11.022    RDD         REDACTION DIARY                       O                                                                    0       1
                      Subfields: Repeating sets of information items   M↑                                                        1       600,000
                RID   redaction identifier                  M↑   N    1   6       1 ≤ RID ≤ 600000                               1       1
                TRK   tracks                                D↑   NS   1   297     List of integers separated by commas           0       1
                RST   relative start time                   M↑   N    1   11      1 ≤ RST ≤ 99999999998                          1       1
                RET   relative end time                     M↑   N    1   11      99999999999 ≥ RET > RST                        1       1
                COM   comments                              O↑   U    1   4000    none                                           0       1


11.023    SNP         SNIPPING SEGMENTATION                 O                                                                    0       1
                SGI   snipping indicator                    M↑   B    1   1       0=no; 1=yes                                    1       1
                SPA   snipping authority organization name  O↑   U    1   300     none                                           0       1
                COM   comments                              O↑   U    1   4000    none                                           0       1


11.024    SPD         SNIPPING DIARY                        O                                                                    0       1
                      Subfields: Repeating sets of information items                                                             1       600000
                SPI   snip identifier                       M↑   N    1   6       1 ≤ SPI ≤ 600000                               1       1





                TRK   tracks                                D↑   NS   1   297     List of integers separated by commas           0       1
                RST   relative start time                   M↑   N    1   11      99999999998 ≥ RST ≥ 0                          1       1
                RET   relative end time                     M↑   N    1   11      99999999999 > RET > RST                        1       1


                COM   comments                              O↑   U    1   4000    none                                           0       1


11.025    DIA         DIARIZATION                           D                                                                    0       1
                DII   diarization indicator                 M↑   B    1   1       0=no; 1=yes                                    1       1
                DAU   diarization authority                 O↑   U    1   300     none                                           0       1
                COM   comments                              O↑   U    1   4000    none                                           0       1


11.026    SGD         SEGMENT DIARY                         D                                                                    0       1
                      Subfields: Repeating sets of information items   M↑                                                        1       600,000
                SID   segment identifier                    M↑   N    1   6       1 ≤ SID ≤ 600000                               1       1
                TRK   tracks                                D↑   NS   1   297     List of integers separated by commas           0       1
                RST   relative start time                   M↑   N    1   11      99999999998 ≥ RST ≥ 0                          1       1
                RET   relative end time                     M↑   N    1   11      99999999999 ≥ RET > RST                        1       1
                COM   comments                              O↑   U    1   10000   none                                           0       1


11.027 – 11.030       RESERVED FOR FUTURE USE only by ANSI/NIST-ITL






11.031    TME         TIME OF SEGMENT RECORDING             D                                                                    0       1
                      Subfields: Repeating sets of information items   M↑                                                        1       *
                DIA   diary identifier                      M↑   B    1   1       0=snip diary; 1=segment diary                  1       1
                SID   segment identifier                    M↑   N    1   6       1 ≤ SID ≤ 600000                               1       1
                ORD   original recording date               O↑   encoding specific: see Annex B or Annex C                       0       1
                TDT   tagged date                           O↑   encoding specific: see Annex B or Annex C                       0       1
                SRT   segment recording start time          O↑   encoding specific: see Annex B or Annex C                       0       1
                TST   tagged start time                     O↑   encoding specific: see Annex B or Annex C                       0       1
                END   segment recording end time            O↑   encoding specific: see Annex B or Annex C                       0       1


                TET   tagged end time                       O↑   encoding specific: see Annex B or Annex C                       0       1


                TMD   time source description text          O↑   U    1   300     none                                           0       1
                COM   comments                              O↑   U    1   4000    none                                           0       1







11.032    GEO         SEGMENT GEOGRAPHICAL INFORMATION (about person of interest at start of segment)   D                        0       1
                      Subfields: Repeating sets of information items   M↑                                                        1       *
                DIA   diary identifier                      M↑   B    1   1       0=snip diary; 1=segment diary                  1       1
                SID   segment identifiers                   M↑   NS   1   *       0 or a list of integers separated by commas    1       1
                SCT   segment cell phone tower code         O↑   U    1   100     none                                           0       1
                LTD   latitude degree value                 D    NS   1   9       -90 ≤ LTD ≤ 90                                 0       1
                LTM   latitude minute value                 D    NS   1   8       0 ≤ LTM < 60                                   0       1
                LTS   latitude second value                 D    NS   1   8       0 ≤ LTS < 60                                   0       1
                LGD   longitude degree value                D    NS   1   10      -180 ≤ LGD ≤ 180                               0       1
                LGM   longitude minute value                D    NS   1   8       0 ≤ LGM < 60                                   0       1
                LGS   longitude second value                D    N    1   2       0 ≤ LGS < 60 positive integer                  0       1
                ELE   elevation                             O↑   NS   1   8       -442.000 < ELE < 8848.000; decimal point is the allowed special character   0       1
                GDC   geodetic datum code                   O↑   AN   3   6       value from Supplement Table??                  0       1


               GCM   geographic             D    AN     2    3       one or two integers followed    0       1




                     coordinate                                           by a single letter
                     universal
                     transverse
                     mercator zone


                     geographic
                     coordinate
               GCE   universal             D    N      1    6                  integer                0       1
                     transverse
                     mercator easting


                     geographic
                     coordinate
               GCN   universal             D    N      1    8                  integer                0       1
                     transverse
                     mercator northing



                     geographic
               GRT                         O↑   U      1    150                 none                  0       1
                     reference text



                     geographic
                     coordinate other
               OSI                         O↑   U      1    10                  none                  0       1
                     system identifier
                     (or landmark)


                     geographic
               OCV   coordinate other      D    U      1    126                 none                  0       1
                     system value


                     SEGMENT
         SQV         QUALITY               D                                                      0       1
                     VALUES


                     Subfields:
                     Repeating sets of     M↑                                                     1       *
                     information items


                                                                    0=snip diary                  1       1
11.033         DIA   diary identifier      M↑   B      1    1
                                                                    1=segment diary


                                                                        0 or a list of positive   1       1
               SID   segment identifiers   M↑   NS     1    *         integers, each ≤ 600000,
                                                                        separated by commas


                                                                    positive integer, 0 ≤ QVU ≤
               QVU   quality value         M↑   N      1    3         100 or 255 = quality not    1       1
                                                                              assessed






                     algorithm vendor
               QAV                         M↑   H      4           4        0x0000 ≤ QAV ≤ 0xFFFF         1    1
                     identification


                     algorithm product                                     positive integer, 0 ≤ QAP ≤
               QAP                         M↑   N      1           5                                      1         1
                     identification                                                   65534


               COM   comments              D    U      0           300                none                0         1


                     VOCAL
         VCI         COLLISION             D                                                              0        1
                     IDENTIFIER


                     Subfields:
                     Repeating sets of                                                                    1        2
                     information items
11.034

                                                                                  0=snip diary
               DIA   diary identifier      M↑   B          1       1                                      1         1
                                                                                1=segment diary


                                                                              0 or a list of positive
               SID   segment identifiers   M↑   NS     1           *        integers, each ≤ 600000,      1         1
                                                                              separated by commas


                     PROCESSING
         PPY                               D                                                              0        1
                     PRIORITY


                     Subfields:
                     Repeating sets of                                                                    1        *
                     information items


                                                                                  0=snip diary
11.035         DIA   diary identifier      M↑   B      1           1                                      1         1
                                                                                1=segment diary


                                                                              0 or a list of positive
               SID   segment identifiers   M↑   NS     1           *        integers, each ≤ 600000,      1         1
                                                                              separated by commas


                                                                           positive integer, 1 ≤ PTY ≤
               PTY   priority              M↑   N      1           1                                      1         1
                                                                                         9


                     SEGMENT
         SCN                               D                                                              0        1
                     CONTENT


                     Subfields:
11.036
                     Repeating sets of     M↑                                                             0        TBD
                     information items


               DIA   diary identifier      M↑   B      1       1                   0=snip diary           1         1




                                                                                   1=segment diary


                                                                                  0 or a list of positive     1       1
               SID   segment identifiers       M↑   NS     1       *            integers, each ≤ 600000,
                                                                                  separated by commas


               TRN   transcript text           O↑   U      1       100,000                none                0       1


                     phonetic transcript
               PTT                             O↑   U      1       100,000                none                0       1
                     text


               TLT   translation text          O↑   U      1       100,000                none                0       1


                     segment content
               COM                             O↑   U      1       100,000                none                0       1
                     comments


                     transcript authority
               TAC                             O↑   U      1       10,000                 none                0       1
                     comment text


                     SEGMENT
                     SPEAKER
         SCC                                   D                                                              0       1
                     CHARACTERISTICS


                     Subfields:
                     Repeating sets of         M↑                                                             1      TBD
                     information items


                                                                                     0=snip diary             1       1
               DIA   diary identifier          M↑   B      1        1
                                                                                   1=segment diary


                                                                                 0 or a list of positive      1       1
                                                                   *
11.037         SID   segment identifiers       M↑   NS     1                   integers, each ≤ 600000,
                                                                                 separated by commas


                     impairment        level
               IMP                             O↑   N      1       1         positive integer, 0 ≤ IMP ≤ 5    0       1
                     number


                     dominant spoken                                                 Value from
               DSL                             O↑   A      3       3                                          0       1
                     language code                                                   ISO 639-3


                     language
               LPS   proficiency       scale   O↑   N      1       1         positive integer, 0 ≤ LPS ≤ 9    0       1
                     number


               STY   speech style code         O↑   N      1       2          See Supplement Table 5          0       1






                     intelligibility scale
               INT                           O↑   N      0       1           positive integer, 0 ≤ INT ≤ 9    0       1
                     code


                     familiarity degree
               FDC                           O↑   N      0       1           positive integer, 0 ≤ FDC ≤ 5    0       1
                     code


               HCM   health comment          O↑   U      0       4000                    none                 0       1


                     emotional       state
               EMC                           O↑   N      1       2            See Supplement Table 6          0       1
                     code


                     vocal effort scale
               VES                           O↑   N      1       1           positive integer, 0 ≤ VES ≤ 5    0       1
                     number


               VSC   vocal style code        O↑   N      1       2            See Supplement Table 7          0       1


                     recording                                                       0=unknown
               RAI   awareness               O↑   N      1       1                    1=aware                 0       1
                     indicator                                                       2=unaware


               SCR   script text             O↑   U      0       9999                    none                 0       1


               COM   comments                O↑   U      1       4000                    none                 0       1


                     SEGMENT
         SCH                                 D                                                                0       1
                     CHANNEL


                     Subfields:
                     Repeating sets of       M↑                                                               1    TBD
                     information items


                                                                                     0=snip diary             1       1
               DIA   diary identifier        M↑   B     1            1
                                                                                   1=segment diary

11.038                                                                           0 or a list of positive      1       1
               SID   segment identifiers     M↑   NS    1         *            integers, each ≤ 600000,
                                                                                 separated by commas


                     audio capture
               ACD                           O↑   N     1         2           See Supplement Table 8          0       1
                      device code


                                                                                     unknown=0
                                                                                      carbon=1
                     microphone type
               MTC                           O↑   N     1         1                   electret=2              0       1
                     code
                                                                                     dynamic=3
                                                                                       other=4




                                  capture
                      ENV         environment          O↑      U        1        4000                Text                    0         1
                                  description text


                                  transducer                                              positive integer, 0 ≤ DST ≤
                      DST                              O↑      N        1        5                                           0         1
                                  distance                                                           99999


                      ACS         acquisition source   O↑      N        1        2              See Table 88                 0         1


                                 voice modification
                      VMT                              O↑      U        1        400                 none                    0         1
                                 description text


                      COM        comments              O↑      U        1        4000                none                    0         1



                                  RESERVED
                                  FOR FUTURE
11.039-11.050
                                  USE only by
                                  ANSI/NIST-ITL



11.051                COM        COMMENTS              O↑      U            1    4000               None                     0         1


                                  RESERVED
                                  FOR FUTURE
11.052-11.099                                          Not to be used
                                  USE only by
                                  ANSI/NIST-ITL


                                  USER-DEFINED
11.100-11.900   UDF                                    O      user-defined                       user-defined           user-defined
                                  FIELDS


                                  RESERVED
                                  FOR FUTURE
11.901                                                 Not to be used
                                  USE only by
                                  ANSI/NIST-ITL


                                  ANNOTATION
                ANN                                    O                                                                      0        1
                                  INFORMATION


                                  Subfields:
                                  Repeating sets of    M↑                                                                     1        *
                                  information items
11.902
                                  Greenwich     mean           encoding specific: see    encoding specific: see Annex
                      GMT                              M↑                                                                     1        1
                                  time                         Annex B or Annex C        B or Annex C


                                  processing
                      NAV                              M↑      U        1        64                 None                      1        1
                                  algorithm     name




                            version


                      OWN   algorithm owner      M↑   U         1       64                  None      1         1


                            process
                      PRO                        M↑   U         1       255                 None      1         1
                             description


                            RESERVED
11.903-11.992               FOR FUTURE
                                                                                  Not to be used
                            USE only by
                            ANSI/NIST-ITL


                            SOURCE
11.993          SAN         AGENCY               O    U        1        125                None        0        1
                            NAME


                            EXTERNAL
11.994          EFR         FILE                 D    U        1        200                None        0        1
                            REFERENCE


                            ASSOCIATED
                ACN                              O                                                     0        1
                            CONTEXT

11.995
                            Subfields:
                            Repeating sets of    M↑                                                    1       255
                            information items


                            associated context                                  1 ≤ ACN ≤ 255
                      ACN                        M↑       N     1       3                                  1        1
                            number                                              positive integer


                            associated segment                                  1 ≤ ASP ≤ 99
                      ASP                        O↑       N     1       2                                  0        1
                            position                                            positive integer


11.996          HAS         HASH                 O        H     64      64                   none          0        1


                            SOURCE
                SOR         REPRESENTA-          O                                                         0        1
                            TION


                            Subfields:
                            Repeating sets of    M↑                                                        1      255
11.997                      information items


                            source
                                                                                  1 ≤ SRN ≤ 255
                      SRN   representation       M↑       N     1       3                                  1        1
                                                                                  positive integer
                            number






                                reference segment                              1 ≤ RSP ≤ 99
                    RSP                             O↑      N     1    2                                     0       1
                                position                                       positive integer


                                RESERVED FOR
                                FUTURE USE
11.998                                                                           Not to be used
                                only by
                                ANSI/NIST-ITL


11.999         DATA             VOICE DATA          D      B      1    22                  none           0          1
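
Several of the value constraints in the table above lend themselves to a mechanical check. The following is a minimal sketch, not part of the standard, of how the SID, QVU and geographic-coordinate constraints might be tested. All function names are hypothetical, and the strict parsing (no spaces, no leading zeros in SID entries) is an assumption made for illustration.

```python
import re

MAX_SEGMENT_ID = 600000  # upper bound on each entry of a segment identifiers (SID) item


def parse_sid(value: str) -> list:
    """Parse an SID item: either "0" or comma-separated positive integers <= 600000."""
    if value == "0":
        return [0]
    ids = []
    for part in value.split(","):
        # Assumption: entries are plain decimal digits without spaces or leading zeros.
        if not re.fullmatch(r"[1-9][0-9]*", part):
            raise ValueError(f"not a positive integer: {part!r}")
        number = int(part)
        if number > MAX_SEGMENT_ID:
            raise ValueError(f"segment identifier exceeds {MAX_SEGMENT_ID}: {number}")
        ids.append(number)
    return ids


def is_valid_qvu(value: int) -> bool:
    """QVU is 0..100, or 255 meaning 'quality not assessed'."""
    return 0 <= value <= 100 or value == 255


def is_valid_latitude(ltd: float, ltm: float = 0.0, lts: float = 0.0) -> bool:
    """-90 <= LTD <= 90, 0 <= LTM < 60, 0 <= LTS < 60."""
    return -90 <= ltd <= 90 and 0 <= ltm < 60 and 0 <= lts < 60


def is_valid_longitude(lgd: float, lgm: float = 0.0, lgs: float = 0.0) -> bool:
    """-180 <= LGD <= 180, 0 <= LGM < 60, 0 <= LGS < 60."""
    return -180 <= lgd <= 180 and 0 <= lgm < 60 and 0 <= lgs < 60


def is_valid_elevation(ele: float) -> bool:
    """-442.000 < ELE < 8848.000 (both bounds exclusive, as written in the table)."""
    return -442.0 < ele < 8848.0
```

These checks cover only the numeric ranges. They do not enforce the character-count limits or the conditional (D) dependencies listed in the table.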


                         1. Field 11.001: Record header

The content of this mandatory field is dependent upon the encoding used. See the relevant annex
of this standard for details. See Section 7.1.

                         2. Field 11.002: Information Designation Character/IDC

This mandatory field shall contain the IDC assigned to this Type-11 record as listed in the
information item IDC for this record in Field 1.003 Transaction content/CNT. See Section
7.3.1. This field can be used to identify, within the Type-1 record, the various Type-11 records
in a single transaction.

                         3. Field 11.003: Audio Object Descriptor/AOD

This mandatory field shall be a numeric entry selected from the attribute code column of
Supplement Table 2. Only one value is allowed; it indicates the type of audio object containing
the voice recording that is the focus of this Type-11 record. Attribute code 0 indicates that the
audio object of this record is a digital voice data file in Field 11.999. Attribute code 1
indicates that the audio object is a digital voice data file at the location specified in Field 11.994.
Attribute codes 2-4 indicate that the audio object is a physical media object at a location
described in Field 11.994.

If the Type-11 record contains only metadata (such as in a response to a voice recording
submission), attribute code 5 shall be selected.

                                        Table 2
                                 Audio Object Descriptor

     Audio Object                                                Attribute Code
     Internal digital voice data file                            0
     External digital voice data file                            1
     Physical Media Object containing digital data               2
     Physical Media Object containing analog signals             3
     Physical Media Object containing unknown data or signals    4
     No audio object associated with this record                 5
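
As an illustration of how the attribute codes in Table 2 relate to the fields that locate the audio object, the following sketch maps an AOD value to the field number named in the text for that code. The enum member names and the helper `audio_location_field` are hypothetical and are not defined by this supplement.

```python
from enum import IntEnum
from typing import Optional


class AudioObject(IntEnum):
    """Attribute codes from Table 2 (member names are illustrative only)."""
    INTERNAL_DIGITAL_FILE = 0
    EXTERNAL_DIGITAL_FILE = 1
    PHYSICAL_MEDIA_DIGITAL = 2
    PHYSICAL_MEDIA_ANALOG = 3
    PHYSICAL_MEDIA_UNKNOWN = 4
    NO_AUDIO_OBJECT = 5


def audio_location_field(aod: int) -> Optional[str]:
    """Return the field that holds or locates the audio object, or None for code 5."""
    code = AudioObject(aod)  # raises ValueError for any value outside 0-5
    if code is AudioObject.INTERNAL_DIGITAL_FILE:
        return "11.999"  # voice data carried inside the record
    if code is AudioObject.NO_AUDIO_OBJECT:
        return None      # metadata-only record
    return "11.994"      # codes 1-4: location given in the external file reference
```

For example, `audio_location_field(3)` returns `"11.994"`, since a physical media object containing analog signals is described there rather than carried in the record.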

                         4. Field 11.004: Voice Recording Source Organization/VRSO

This is an optional field and shall contain information about the site or agency that created the
voice recording pointed to or included in this record. In the case of files created from previous
recordings, this is not necessarily the source of the original transduction of the acoustic
vocalizations from the person to whom the Type-11 record pertains. This need not be the same
as the Source agency/SRC or the Originating agency of Field 1.008 or the Destination agency
of Field 1.007.

   o The first information item, the source organization type code/STC, is mandatory if
     this field is used. There may be no more than one occurrence of this item. This
     information item contains a single character describing the site or agency that created
     the voice recording:

          U = Unknown
          P = Private individual
          I = Industry/Commercial
          G = Government
          O = Other

   o The second information item (source organization name/SON) is optional and shall be
     the name of the group, organization or agency that created the voice recording. There
     may be no more than one occurrence of this item. It is expressed in Unicode characters
     and is limited to 400 characters in length.

   o The third information item is the point of contact/POC who composed the voice
     recording. This is an optional information item that could include the name, telephone
     number and e-mail address of the person or persons responsible for the creation of the
     voice recording. This information item may be up to 200 Unicode characters.

   o The fourth information item is optional. It is the ISO-3166-1 code of the sending
     country/CSC. This is the code of the country where the voice recording was created,
     not necessarily the nation of the agency entered in Field 11.993: Source agency/SRC. All
     three formats specified in ISO-3166-1 are allowed (Alpha2, Alpha3 and Numeric). A
     country code is either 2 or 3 characters long.
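
The four information items of Field 11.004 can be checked against the limits stated above. The sketch below is one possible arrangement under stated assumptions: the class name and its `problems` method are hypothetical, "characters" is taken to mean Unicode code points as counted by Python's `len`, and the CSC check tests only the length, not membership in the ISO-3166-1 code lists.

```python
from dataclasses import dataclass
from typing import Optional

# Source organization type codes (STC) as listed in the text.
STC_CODES = {
    "U": "Unknown",
    "P": "Private individual",
    "I": "Industry/Commercial",
    "G": "Government",
    "O": "Other",
}


@dataclass
class VoiceRecordingSource:
    """Hypothetical container for the information items of Field 11.004."""
    stc: str                    # mandatory if the field is used
    son: Optional[str] = None   # source organization name, up to 400 characters
    poc: Optional[str] = None   # point of contact, up to 200 characters
    csc: Optional[str] = None   # ISO-3166-1 code, 2 or 3 characters

    def problems(self) -> list:
        """Return a list of human-readable constraint violations (empty if none)."""
        found = []
        if self.stc not in STC_CODES:
            found.append(f"STC must be one of {sorted(STC_CODES)}")
        if self.son is not None and len(self.son) > 400:
            found.append("SON exceeds 400 characters")
        if self.poc is not None and len(self.poc) > 200:
            found.append("POC exceeds 200 characters")
        if self.csc is not None and len(self.csc) not in (2, 3):
            found.append("CSC must be 2 or 3 characters long")
        return found
```

A record with `stc="G"` and `csc="USA"` yields an empty list, while an unknown type code or an over-long name is reported as a separate entry.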

                         5. Field 11.005: Voice Recording Content Descriptor/VRC

This field is optional and shall describe the content of the voice recording. It consists of 2
information items, the first of which must be included if this field is used:

   o The first information item (assigned voice indicator/AVI) is a binary indicator
     and is mandatory if this field is used. It indicates whether the voice recording sample
     was obtained from a known subject. 0 indicates that the recording contains a questioned
     voice; 1 indicates that the recording contains an assigned voice.

   o The second information item (speaker plurality code/SPC) is optional and indicates
     the plurality of speakers represented on the voice recording: M = multiple speakers;
     S = single speaker.
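
The two items of Field 11.005 take only a handful of values, so a decoder is short. The following sketch assumes that AVI arrives as the character "0" or "1" and SPC as "M" or "S". The function name `describe_vrc` is hypothetical.

```python
from typing import Optional

AVI_MEANINGS = {"0": "questioned voice", "1": "assigned voice"}
SPC_MEANINGS = {"M": "multiple speakers", "S": "single speaker"}


def describe_vrc(avi: str, spc: Optional[str] = None) -> str:
    """Turn the AVI and optional SPC items into a short description."""
    if avi not in AVI_MEANINGS:
        # AVI is mandatory whenever the field is present.
        raise ValueError("AVI must be '0' or '1'")
    description = AVI_MEANINGS[avi]
    if spc is not None:
        if spc not in SPC_MEANINGS:
            raise ValueError("SPC must be 'M' or 'S'")
        description += ", " + SPC_MEANINGS[spc]
    return description
```

For instance, `describe_vrc("1", "S")` gives `"assigned voice, single speaker"`, and omitting SPC gives the AVI meaning alone.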

                         6. Field 11.006: Audio Recording Device/REC

This field is optional and shall indicate information about the recording equipment that created
the voice recording contained in or pointed to by this record. There may be no more than one
occurrence of this field.

NOTE: As recordings or data files may be transcoded from previously recorded or broadcast
content, this equipment may or may not be the equipment used to record the original acoustic
vocalization of the person to whom the Type-11 record pertains.

   o The first information item (recording device descriptive text/RDD) is an optional text
     field of up to 4000 characters describing the recording device that created the voice
     recording. An example would be “Home telephone answering device”.

   o The second, third and fourth information items (recording device make/MAK,
     recording device model/MOD, recording device serial number/SER) are optional
     items of up to 50 characters each and shall contain the make, model and serial number,
     respectively, for the recording device. There may be no more than one entry for each
     of these items. See Section 7.7.1.2 for details.
592
593
594                          7. Field 11.007: Acquisition source / AQS
595
596   This mandatory field shall specify and describe the acquisition source.
597
598         o The first information item, acquisition source type/AQT, is mandatory and it
599            shall be a numeric entry selected from the “attribute code” column of Table 88.
600
601         o The second information item is mandatory if the acquisition source is analog and the
602            data is stored in digital format. It is a text field, analog to digital conversion/A2D,
603            that describes the analog to digital equipment used to transform the source. This field
604            should address parameters used, such as sample rate, if known.
605
606         o The third information item is mandatory if the AQT is 23 or 24. It is a text field, radio
607            transmission format description/FDN. It is optional for other radio transmission
608            codes.
609
610         o The fourth information item is optional. It is a free text field, acquisition special
611            characteristics/AQSC, that is used to describe any specific conditions not mentioned
612            in the table.
613
614                          8. Field 11.008: Record Creation Date/RCD
615
616   This mandatory field shall contain the date and time of creation of this Type-11 record. This
617   date will generally be different from the voice recording creation date and may be different from
618   the date at which the acoustic vocalization originally occurred. See Section 7.7.2.4 Local date
619   and time for details.
620
621
622                          9. Field 11.009: Voice Recording Creation Date/VRD
623
624   This optional field shall contain the date and time of creation of the voice recording contained in
625   the record. If pre-recorded or transcoded materials were used, this date may be different from
626   the date at which the acoustic vocalization originally occurred. See Section 7.7.2.4 Local date
627   and time for details.
628
629
630                          10. Field 11.010: Total Recording Duration/TRD
631
632   This field is optional and gives the total length of the voice recording in time, compressed bytes
633   and total digital samples. At least one of the three information items must be entered if this field
634   is used.
635
636      o The first information item (time/TIM) is optional and gives the total time of the voice
637         recording in microseconds. The size of this item is limited to 11 digits, limiting the total
638         time duration of the signal to 99,999 seconds, which is approximately 28 hours.
639
640      o The second information item (compressed bytes/CBY) is optional and gives the total
641         number of compressed bytes in the voice data file. Consequently, this information item
642         applies only to digital voice recordings stored as voice data files. The size of this item is
643         limited to 14 digits, limiting the total size of the voice data file to 99 terabytes.
644
645      o The third information item (total digital samples/TSM) is optional and gives the total
646          number of digital samples in the voice data file after any decompression of the
647          compressed signal. This information item applies only to digital voice recordings stored
648          as voice data files. The size of this item is limited to 14 digits.
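
The digit limits quoted for TIM and CBY can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the helper name max_value is hypothetical and not part of this standard, and terabytes are assumed to be decimal (10^12 bytes).

```python
US_PER_SECOND = 1_000_000


def max_value(digits: int) -> int:
    """Largest non-negative integer that fits in `digits` decimal digits."""
    return 10 ** digits - 1


# TIM holds microseconds in at most 11 digits.
max_tim_us = max_value(11)
max_tim_seconds = max_tim_us / US_PER_SECOND
max_tim_hours = max_tim_seconds / 3600

# CBY holds a byte count in at most 14 digits.
max_cby_bytes = max_value(14)
max_cby_terabytes = max_cby_bytes / 10 ** 12  # assumption: decimal terabytes

print(f"TIM limit: {int(max_tim_seconds)} whole seconds, about {max_tim_hours:.1f} hours")
print(f"CBY limit: just under {max_cby_terabytes:.0f} terabytes")
```

The 11-digit microsecond limit works out to just under 100,000 seconds, which is the source of the "approximately 28 hours" figure used throughout this record type.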
649
650                          11. Field 11.011: Physical Media Object/PMO
651
652   This field is optional and identifies the characteristics of the physical media containing the voice
653   recording. There can be only one physical media object per Type-11 record, but multiple Type-
654   11 records can point to the same physical media object. This field only applies if Field 11.003
655   has an attribute code of 2, 3 or 4. The location of the physical media object is given in Field
656   11.994.
657
658      o The first information item (media type description/MTD) is mandatory if this field is
659         used and contains text of up to 300 characters describing the general type of media (e.g.,
660         analog cassette tape, reel-to-reel tape, CD, DVD, phonograph record) upon which the
661         voice recording is stored. If an analog medium is used for storage, and AQT of Field
662         11.007 is 14, then a description of the digital to analog procedure should be noted in
663         Field 11.902 and the reasons for such a conversion noted in COM of Field 11.011.
664
665      o The second information item (recording speed/RSP) is optional and gives a numerical
666         value to the speed at which the physical media object must be played to reproduce the
667         voice signal content. This value may be integer or floating point and shall not exceed 9
668         characters.
669
670      o The third information item (recording speed measurement units description text/RSU)
671         is mandatory if the second information item, RSP, is entered and contains text of up to
672         300 characters to indicate the units of measure to which RSP refers.
673
674      o The fourth information item (equalization description/EQ) is an optional text field
675         containing up to 1000 characters and indicating the equalization that should be
676         applied for faithful rendering of the voice recording on the physical media object.
677
678      o The fifth information item (track count/TRC) is an optional integer between 1 and 99,
679         inclusive, that gives the number of tracks on the physical media object. For example, a
680         stereo phonograph record will have 2 tracks.
681
682      o The sixth information item (speaker track number/STK) is an optional list of integers
683         which indicate which tracks carry the voices of the speaker(s).
684
685      o The seventh information item (comments/COM) is optional and allows for additional
686          comments of up to 4000 Unicode characters in length describing the physical media
687          object.
688
689                            12. Field 11.012: Container Format/CFT
690
691   This is an optional field (container format/CFT) that gives information about the container
692   format, if any, which encapsulates the audio data of the electronic file used to carry the voice
693   data in the digital recording. This field is not used if the voice recording is stored on a physical
694   media object as an analog signal. If present, this field overrides the CDC Field 11.013. This field
695   does not accommodate multiple Container Formats in a single Type‐11 record. The Container
696   Format shall be entered as the appropriate integer code from Table 3 below.
697
698   Container files incorporate audio samples and specifications to properly decode the audio, such
699   as the codec, and its parameters, e.g., number of channels, sample rate, bit/byte depth, and
700   big/little endian. More generally, the container formats can specify a codec, or simply
701   encapsulate one or more audio channels as Linear PCM.
702
703   The well‐known Wave container specification has fields such as chunk ID, chunk size, audio
704   format (codec), sampling rate, number of channels, space for extra parameters (for the codec or
705   other uses).
706
707                                              Table 3
708                                  Table of Audio Visual Container Types
                              Container Type                            Windows Extension(s)      Attribute Code
                              RAW format (no Container)                                           0
                              WAV (RIFF audio)                          .wav                      1
                              3GP and 3G2 mobile video                  .3gp, .3g2                1
                              AIFF                                      .aiff, .aif               1
                              MP3 (MPEG-1, Layer 3 audio)               .mp3                      1
                              QuickTime (Apple VBR-audio/video/image)   .mov, .qt                 1
                              Video for Windows                         .avi                      1
                              Vorbis (OGG audio)                        .ogg                      1
                              Windows Media                             .wmv, .wma, .asf, .asx    1
                              Other                                                               2
709
710   All the audio characteristics required to properly interpret RAW format data must be provided
711   elsewhere, so if RAW is specified, then Field 11.013 is mandatory since the codec type and its
712   parameters (SRT, BIT, EDN, PNT, and CHC) must be specified for retrieval of the audio.
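
The dependency between a RAW container and the codec field can be expressed as a simple completeness check. The Python sketch below is an illustration under stated assumptions: the function name missing_codec_items and the dictionary representation of the codec field are hypothetical, and only the item mnemonics (CDT, SRT, BIT, EDN, PNT, CHC) and the attribute code 0 for RAW come from this document.

```python
from typing import Optional

CFT_RAW = 0  # attribute code for "RAW format (no Container)" in Table 3

# Codec type plus the parameters listed above for RAW data.
REQUIRED_FOR_RAW = ("CDT", "SRT", "BIT", "EDN", "PNT", "CHC")


def missing_codec_items(cft: Optional[int], cdc: Optional[dict]) -> list:
    """Return the codec item names that RAW data needs but `cdc` lacks.

    `cft` is the container format code (None if the field is absent) and
    `cdc` is a hypothetical dict view of the codec field (None if absent).
    """
    if cft != CFT_RAW:
        return []  # the rule above only constrains RAW data
    present = cdc or {}
    return [name for name in REQUIRED_FOR_RAW if name not in present]


print(missing_codec_items(0, {"CDT": 1, "SRT": 8000, "BIT": 16}))
```

An empty result means the record carries enough information to interpret the RAW samples; any names returned identify the items still to be supplied.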
713
714   A Container Type of Other (CFT=2) indicates that the Container is not given in Table 3 and is
715   specified externally to the Type-11 Standard. Containers not specified in Table 3 are optional,
716   are not guaranteed to exist in a given implementation of the standard, and should be used with
717   caution. Optional Containers are specified in Table 3-External, as published in the document
718   External Container Formats, available: http://xyx.gov.
719
720
721                            13. Field 11.013: Codec/CDC
722
723   This is an optional field that gives information about the codec used to encode the voice and
724   audio data in the digital recording. This field is not used if the voice recording is stored on a
725   physical media object as an analog signal. This field is only used if the digital audio file lacks a
726   Container. Information in Field 11.012 (Container Format/CFT) overrides this Field if both are
727   present. The following information types can be specified.
728
729      o The first information item (Codec type code/CDT) is mandatory if this field is
730         used and indicates the single codec type used for all audio segments in the record. This
731         format does not accommodate multiple codec types within a single record. It shall be a
732         numeric entry selected from the Attribute Code column of Table 4. If the codec type is
733         identified as Other (CDT=7), the final information item (comments/COM) shall be used
734         to describe the codec.
735
736                                                     Table 4
737                                             Table of Codec Types
                              Codec Type                                                       Attribute Code
                              Linear PCM                                                       1
                              Floating-point linear PCM                                        2
                              ITU-T G.711 (PCM): μ-law with forward order digital samples      3
                              ITU-T G.711 (PCM): μ-law with reverse order digital samples      4
                              ITU-T G.711 (PCM): A-law with forward order digital samples      5
                              ITU-T G.711 (PCM): A-law with reverse order digital samples      6
                              Other                                                            7
755
756
757      o The second information item (Sampling rate number/SRT) is ?????? and indicates the
758         number of digital samples that represent one second of analog voice data upon
759         conversion to an acoustic signal. The sampling rate is expressed in Hz and must be an
760         integer value. Acceptable values are between 1 and 100,000,000 Hz, but unknown or
761         variable sampling rates shall be given the value of 0. Common values of SRT are 8000,
762         11025, 16000, 22050, 32000, 44100, and 48000 Hz. The value of 0 shall only be used to
763         indicate unknown or variable sampling rate.
764


765     o The third information item (Bit depth count/BIT) is ?????? and indicates the number of
766         bits that are used to represent a single digital sample of voice data. Acceptable values are
767         between 1 and 64, inclusive. Encoders of unknown or variable bit depth shall be given
768         the value of 0. (This field is not intended to be an indication of the actual dynamic range
769         of the voice data.) Changes to the bit depth should be logged in Type-98 or Field 11.902
770         audit logs. Common values for BIT are 8, 16, 24, and 32 bits.
771
772     o The fourth information item (Endian code/EDN) is ?????? and indicates which byte goes
773         first for digital samples containing two or more bytes. The values for EDN are 0=big,
774         1=little, or 2=native endian. (EDN is optional and ignored for digital samples that do not
775         contain two or more integer multiples of bytes.)
776
777     o The fifth information item (Fixed point indicator/PNT) is ?????? and indicates the digital
778         sample representation. The value is 0 if the digital samples are represented as fixed-point
779         or 1 if the samples are floating-point.
780
781     o The sixth information item (Channel count/CHC) is ????? and gives the integer number of
782         channels of data represented in the digital voice data file. The number of channels must
783         be between 1 and 99, inclusive. Common values for CHC are 1 and 2 channels.
784
785     o The seventh information item (Comments/COM) is an optional, unrestricted text string of
786         up to 4000 characters in length. It is required if the Codec Type is Other (CDT=7). For
787         Codec Types other than Other, COM is optional and it can contain additional information
788         about the codec or additional instructions for reconstruction of audio output from the
789         stored digital data. Codec parameters shall be specified in this field when required for
790         unambiguous decoding. This item should include a description of any noise reduction
791         processing or equalization that must be applied to faithfully render the voice recording.
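
To illustrate how EDN and CHC affect interpretation of the stored data, the following Python sketch decodes Linear PCM samples. It is not a normative decoder. It assumes 16-bit signed fixed-point samples (BIT=16, PNT=0) and assumes that channels are interleaved sample by sample, which this document does not state; the function name decode_pcm16 is hypothetical.

```python
import struct

# EDN codes from this field mapped to struct byte-order prefixes.
EDN_PREFIX = {0: ">", 1: "<", 2: "="}  # 0=big, 1=little, 2=native


def decode_pcm16(raw: bytes, edn: int, chc: int) -> list:
    """Split 16-bit signed PCM bytes into one list of samples per channel.

    Assumption: channels are interleaved (ch1, ch2, ..., ch1, ch2, ...).
    """
    if edn not in EDN_PREFIX:
        raise ValueError("EDN must be 0, 1 or 2")
    if not 1 <= chc <= 99:
        raise ValueError("CHC must be between 1 and 99")
    if len(raw) % (2 * chc) != 0:
        raise ValueError("data length is not a whole number of sample frames")
    count = len(raw) // 2  # two bytes per 16-bit sample
    samples = struct.unpack(f"{EDN_PREFIX[edn]}{count}h", raw)
    return [list(samples[channel::chc]) for channel in range(chc)]


stereo = struct.pack("<4h", 1, -1, 2, -2)  # two little-endian stereo frames
print(decode_pcm16(stereo, edn=1, chc=2))
```

Reading the same bytes with the wrong EDN value yields different sample values rather than an error, which is why the byte order must be recorded whenever samples span two or more bytes.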
792
793
794                           14. Field 11.014: Preliminary Signal Quality/PSQ
795
796 This field is optional and gives an assessment of the general “quality” of the voice recording.
797 There may be as many as 9 PSQ subfields for the audio file to indicate different types of quality
798 assessments.
799
800      o The first information item (quality value/QVU) is mandatory if this field is used and
801           shall indicate the general quality as an integer value between 0 (low quality) and 100
802           (high quality). A value of 255 indicates that quality was not assessed.
803
804      o A second information item is mandatory if this field is used and shall specify the ID of
805           the vendor of the quality assessment algorithm used to calculate the quality score, which
806           is an algorithm vendor identification/QAV. This 4-digit hex value (See Section 5.5
807           Character types) is assigned by IBIA and expressed as four characters. The IBIA
808           maintains the Vendor Registry of CBEFF Biometric Organizations that map the value in
809           this field to a registered organization. For algorithms not registered with the IBIA, the
810           value of 0x00 shall be used.

811
812        o A third information item is mandatory if this field is used and shall specify a numeric
813           product code assigned by the vendor of the quality assessment algorithm, which may be
814           registered with the IBIA, but registration is not required. This is the algorithm product
815           identification/QAP that indicates which of the vendor’s algorithms was used in the
816           calculation of the quality score. This information item contains the integer product code
817           and should be within the range 1 to 65,534. For products not registered with the IBIA,
818           the code 0 shall be used.
819
820        o The fourth information item (comments/COM) is optional and should be used to give
821           additional information about the quality assessment process. It shall be used to describe
822           unregistered algorithms.
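
The numeric constraints on the PSQ items can be checked mechanically. The Python sketch below is a minimal illustration; the function name check_psq and the dictionary layout are hypothetical. QAV is deliberately left unchecked here, and only the ranges stated above for QVU and QAP and the limit of 9 subfields are applied.

```python
def check_psq(subfields: list) -> list:
    """Return a list of problems found in PSQ subfields.

    Each subfield is a hypothetical dict with integer 'QVU' and 'QAP' keys.
    """
    problems = []
    if len(subfields) > 9:
        problems.append("at most 9 PSQ subfields are allowed")
    for index, sub in enumerate(subfields, start=1):
        qvu, qap = sub["QVU"], sub["QAP"]
        # QVU: 0 (low) to 100 (high), or 255 when quality was not assessed.
        if not (0 <= qvu <= 100 or qvu == 255):
            problems.append(f"subfield {index}: QVU {qvu} is not 0-100 or 255")
        # QAP: 1 to 65,534, or 0 for products not registered with the IBIA.
        if not 0 <= qap <= 65534:
            problems.append(f"subfield {index}: QAP {qap} is outside 0-65534")
    return problems


print(check_psq([{"QVU": 87, "QAP": 0}, {"QVU": 150, "QAP": 12}]))
```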
823
824                          15. Fields 11.015-020: Reserved Fields
825
826   These fields are reserved for future use by ANSI/NIST-ITL.
827
828                          16. Field 11.021: Redaction/RED
829
830   This field is optional and indicates whether the voice recording has been redacted, meaning that
831   some of the audio record has been overwritten (“Beeped”) or erased to delete speech content
832   without altering the relative timings within, or the length of, the segments. This field is not to be
833   used to indicate that audio content has been snipped with the alteration of the relative timings in,
834   or length of, the segment.
835
836      o   The first information item (redaction indicator/RDI) is a binary indicator and is
837          mandatory if this field is used. It indicates whether the voice recording contains
838          overwritten or erased sections intended to remove, without altering the length of the
839          segment, semantic content deemed not suitable for transmission or storage. 0 indicates
840          no redaction and 1 indicates that redaction has occurred.
841
842      o   The second information item (redaction authority organization name/RDA) is an
843          optional text field of up to 300 characters in length containing information about the
844          agency that directed, authorized or performed the redaction. Agencies undertaking
845          redaction activities on the original speech should log their actions by appending to this
846          item and noting the change of field contents in the Type-98 record and/or Field 11.902 of
847          this record.
848
849      o   The third information item (comments/COM) is an optional unrestricted text string of up
850          to 4000 characters in length that may contain text information about the redactions
851          affecting the stored voice data.
852
853
854                          17. Field 11.022: Redaction Diary/RDD
855


856   This optional field (redaction diary/RDD) indicates the timings within the voice recording of
857   redacted (overwritten) audio segments. The redactions need not be dominated by speech from
858   the subject of this transaction or record. Four items (uniquely numbering the redactions
859   identified by recording track and giving relative start and end times of each) are mandatory if
860   this field is used and shall repeat for each redaction. A fifth item is optional and
861   accommodates comments on the individual redactions. The record type accommodates up to
862   600,000 redactions by repeating the subfield.
863
864      o The first information item (redaction identifier/RID) is mandatory if this field is used and
865         uniquely numbers the redactions to which the following items in the field apply. There is
866         no requirement that the redactions be numbered sequentially. The RID may contain up to
867         6 digits. The number of redactions is limited to 600,000.
868
869      o The second information item (tracks/TRK) is mandatory if item PMO_TRC in Field
870         11.011 or CDC_CHC of Field 11.013 is greater than one and lists all tracks or channels
871         on the recording to which the redaction identifier applies. The track numbers are
872         separated by commas. No value in this list should be greater than the value of
873         PMO_TRC or CDC_CHC, whichever applies. For example, in the case of a two-track
874         stereo recording where both tracks contain a redaction at the same start and end times,
875         this item will be “1,2”.
876
877      o The third information item (relative start time/RST) is a mandatory integer for every
878         redaction identified by an RID and indicates in microseconds the time of the start of the
879         redaction relative to the beginning of the voice recording. The item can contain up to 11
880         digits, meaning that the start of a redaction might occur anywhere within a voice
881         recording limited to about 28 hours. It is not expected that redactions on the same track
882         of the audio object will overlap, meaning that the RST of a redaction is not expected to
883         occur between the RST and RET of any other redaction on the same track, although this
884         is not prohibited. If the Type-11 record refers to an analog recording, the method of
885         determining the start time shall be given in the comment item of this field.
886
887     o The fourth information item (relative end time/RET) is a mandatory integer for every
888       redaction identified by an RID and indicates in microseconds the time of the end of the
889       redaction relative to the beginning of the voice recording. The item can contain up to 11
890       digits, meaning that the end of a redaction might occur anywhere within a voice recording
891       limited to about 28 hours. As with the RST, it is not expected that redactions on the same
892       track of the audio object will overlap, although this is not prohibited.
893
894     o The fifth information item (comments/COM) is an optional unrestricted text string of up to
895       4000 characters in length that allows for comments of any type to be made on a redaction.
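
The TRK item described above is a comma-separated list that must stay within the track or channel count of the recording. A minimal Python sketch of that check follows. The function name parse_trk is hypothetical, and track numbering is assumed to start at 1, as suggested by the “1,2” example for a two-track recording.

```python
def parse_trk(trk: str, track_count: int) -> list:
    """Parse a TRK item such as "1,2" and validate it against the track count.

    `track_count` is the value of PMO_TRC or CDC_CHC, whichever applies.
    Assumption: tracks are numbered from 1.
    """
    tracks = [int(part) for part in trk.split(",")]
    for track in tracks:
        if not 1 <= track <= track_count:
            raise ValueError(f"track {track} is outside 1..{track_count}")
    return tracks


print(parse_trk("1,2", track_count=2))
```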
896
897
898                         18. Field 11.023: Snipping Segmentation/SNP
899
900   This field is optional and indicates whether the voice recording referenced in this Type-11 record
901   has had segments removed, meaning that the voice signal is not a continuous recording in time.
902   This field is used to indicate removal, for any reason, of audio signal from the original recording
903   of the acoustic vocalizations in a way that disrupts time references.
904
905      o   The first information item (snip indicator/SGI) is a binary variable and is mandatory if
906          this field is used. It indicates whether the voice recording contains temporal
907          discontinuities caused by snipping of segments from a longer original recording. 0
908          indicates no snipping and 1 indicates that snipping has occurred.
909
910      o   The second information item (snipping authority organization name/SPA) is an
911          optional text field of up to 300 characters containing information about the agency that
912          performed the snipping segmentation. Agencies undertaking snipping activities on the
913          original speech should log their actions by appending to this item and noting the change
914          of field contents in the Type-98 record and/or Field 11.902 of this record.
915
916      o   The third information item (comments/COM) is an optional unrestricted text string of up
917          to 4000 characters that may contain text information about the snip activities affecting the
918          voice recording.
919
920
921                         19. Field 11.024: Snipping Diary/SPD
922
923   This optional field (snipping diary/SPD) allows the documentation of snips obtained from
924   larger voice recordings, which might themselves be included in the transaction as Type-20
925   records. There may be up to 600,000 snips diarized in repeating subfields. Each snip shall be
926   dominated by speech from the subject of this Type-11 record. Four items (uniquely numbering
927   the snips by track and giving relative start and end times of each) are mandatory in each
928   subfield. A fifth item is optional within each subfield and allows for comments on the identified
929   snip. If there is no snipping (Field 11.023) indicated, then all of the data in the voice recording
930   will be considered as in toto and the subfields will not repeat. There can be at most one
931   snipping diary for each Type-11 record.
932
933     o The first information item (snip identifier/SPI) is mandatory in each subfield and uniquely
934       numbers the snip to which the following items in the subfield apply. There is no
935       requirement that the snips be numbered sequentially. The SPI may contain up to 6 digits
936       and up to 600,000 snips may be identified. If Field 11.023 indicates snipping, the voice
937       recording must consist of at least one snip.
938
939      o The second information item (tracks/TRK) is mandatory if item PMO_TRC in Field
940         11.011 or CDC_CHC of Field 11.013 is greater than one and lists all tracks or channels
941         on the recording to which the snip identifier applies. The track numbers are separated by
942         commas. No value in this list should be greater than the value of PMO_TRC or
943         CDC_CHC, whichever applies. For example, in the case of a two-track stereo recording
944         where both tracks contain a snip at the same start and end times, this item will be “1,2”.
945
946     o The third information item (relative start time/RST) is a mandatory integer for every snip
947       identified by an SPI and indicates in microseconds the time of the start of the snip relative
948        to the beginning of the voice recording. The item can contain up to 11 digits, meaning
949        that the RST might occur anywhere within a voice recording limited to about 28 hours.
950        Because each snip is obtained independently from a larger voice recording, snips from a
951        single track on the audio object described in Field 11.003 shall not overlap, meaning that
952        the RST of a snip shall not occur between the RST and RET of any other snip on the
953        same track. If the Type-11 record refers to an analog recording, the method of determining
954        the start time shall be given in the comment item of this field.
955
956    o The fourth information item (relative end time/RET) is a mandatory integer for every snip
957        identified by an SPI and indicates in microseconds the time of the end of the snip relative
958        to the beginning of the voice recording. The item can contain up to 11 digits, meaning that
959        the snip may end anywhere within the 28 hour voice recording. Because each snip is
960        obtained independently from a larger voice recording, snips from the same track of the
961        audio object of Field 11.003 shall not overlap, meaning that the RET of a snip shall not
962        occur between the RST and RET of any other snip from the same track.
963
964    o The fifth information item (comments/COM) is an optional unrestricted text string of up to
965        4000 characters in length that allows for comments of any type to be made on a snip. This
966        allows for comments on a snip-by-snip basis. This comment field could contain word or
967        phonic level transcriptions, language translations or security classification markings, as
968        specified in exchange agreements.
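
The requirement that snips on the same track shall not overlap can be verified from the SPI, TRK, RST and RET items alone. The Python sketch below shows one way to do so; the function name find_overlapping_snips and the tuple layout are hypothetical. It assumes that a snip starting exactly at the RET of another snip does not count as overlapping, which this document does not state explicitly.

```python
from collections import defaultdict


def find_overlapping_snips(snips: list) -> list:
    """Return (track, spi_a, spi_b) for each snip that starts inside an earlier one.

    `snips` holds tuples (spi, tracks, rst, ret); times are in microseconds
    and `tracks` is the list of track numbers from the TRK item.
    """
    by_track = defaultdict(list)
    for spi, tracks, rst, ret in snips:
        for track in tracks:
            by_track[track].append((rst, ret, spi))

    overlaps = []
    for track, items in sorted(by_track.items()):
        items.sort()  # order by start time
        latest_end, latest_spi = None, None
        for rst, ret, spi in items:
            # Compare against the furthest end time seen so far on this track.
            if latest_end is not None and rst < latest_end:
                overlaps.append((track, latest_spi, spi))
            if latest_end is None or ret > latest_end:
                latest_end, latest_spi = ret, spi
    return overlaps


diary = [(1, [1], 0, 100), (2, [1], 10, 20), (3, [1], 30, 40), (4, [2], 0, 50)]
print(find_overlapping_snips(diary))
```

The same routine could be applied to the redaction diary of Field 11.022, where overlaps are unexpected but not prohibited, by treating the result as a warning rather than an error.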
969
970
971                         20. Field 11.025: Diarization/DIA
972
973 This field (Diarization/DIA) is optional and indicates whether the voice recording has been
974 diarized, meaning that time markings are included in Field 11.026 to indicate the speech
975 segments of interest pertaining to the subject of this Type-11 record.
976
977     o The first information item (diarization indicator/DII) is mandatory if this field is used.
978         It is a binary indicator that indicates whether the voice recording is accompanied by a
979         segment diary in Field 11.026 indicating speech segments from the voice signal subject
980         of the Type-11 record. 0 indicates no accompanying diary and 1 indicates one or more
981         accompanying diaries.
982
983     o The second information item (diarization authority/DAU) is an optional text field of up
984         to 300 characters containing information about the agency that performed the diarization.
985         Agencies undertaking diarization activities on the original speech should log their actions
986         by appending to this item and noting the change of field contents in the Type-98 record
987         and/or Field 11.902 of this record.
988
989
990     o The third information item (comments/COM) is an optional unrestricted text string of up
991         to 4000 characters that may contain text information about the diarization activities
992         undertaken on the voice data.
993
994
 995                         21. Field 11.026: Segment Diary/SGD
 996
 997   This field only appears if Field 11.025 is present and DII = 1. This field (segment diary/SGD)
 998   contains repeating subfields that name and locate the segments within the voice recording of this
 999   Type-11 record associated with a single speaker. In a conversational setting, a speaker “turn”
1000   might be divided into several segments as the content, speaking style and collection conditions
1001   change. Within a Type-11 record, there may be only one segment diary describing a single
1002   speaker within the single voice recording. If additional diarizations of this voice recording are
1003   necessary (for example, to locate segments of speech from a second speaker in the voice
1004   recording), additional Type-11 records must be created. Each segment diarized shall contain
1005   speech from the subject of this record, although a segment may contain speech collisions. The
1006   first four items (uniquely identifying the segments, identifying the tracks from the audio media
1007   object of Field 11.003 to which the segment number applies, and giving start and end times of
1008   each relative to the absolute beginning of the voice recording) are mandatory if this field is used
1009   and shall repeat for each speech segment identified. A fifth item is optional and accommodates
1010   comments on the individual segments. This record type accommodates up to 600,000 speech
1011   segments as repeating subfields. For voice recordings consisting of snips, the snipping diary
1012   SPD of Field 11.024 may be included in the SGD as a subset and may be identical.
1013
1014     o The first information item (segment identifier/SID) is mandatory in each subfield and
1015       uniquely numbers the segment to which the following items in the subfield apply. There
1016       is no requirement that the segments be numbered sequentially in sequential subfields. The
1017       SID may contain up to 6 digits, but the number of segments identified in the field (the
1018       total number of recurring subfields) is limited to 600,000.
1019
1020      o The second information item (tracks/TRK) is mandatory if item PMO_TRC in Field
1021         11.010 or CDC_CHC of Field 11.013 is greater than one and lists all tracks or channels
1022         on the recording to which the segment identifier applies. The track numbers are
1023         separated by commas. No value in this list should be greater than the value of
1024         PMO_TRC or CDC_CHC, whichever applies. For example, in the case of a two-track
1025         stereo recording where both tracks contain a segment at the same start and end times, this
1026         item will be “1,2”.
1027
1028      o The third information item (relative start time/RST) is a mandatory integer for every
1029         segment identified and indicates in microseconds the time of the start of the segment
1030         relative to the absolute beginning of the voice recording. The item can contain up to 11
1031         digits, meaning that the segment can start at any time within the 28 hour voice recording.
1032         Because each segment is expected to be dominated by the primary subject of this Type-
1033         11 record, it is expected that segments from the same track of the audio object identified
1034         in Field 11.003 will not overlap, meaning that the RST of a segment is not expected to
1035         occur earlier than the end of a previous segment from the same track, although this is not
1036         prohibited. In multiple ANSI/NIST-ITL transactions involving multiple speakers using
1037         the same voice data record, segments on the same track across the transactions may
1038         overlap during periods of voice collision. If the Type-11 record refers to an analog
1039         recording, the method of determining the start time shall be given in the comment item of
1040         this subfield.

1041
1042     o The fourth information item (relative end time/RET) is mandatory for every segment and
1043       indicates in microseconds the time of the end of the segment relative to the absolute
1044       beginning of the voice recording. The item can contain up to 11 digits, meaning that the
1045       segment can end at any time within the 28 hour voice recording. As with the RST, it is
1046       expected that segments from the subject of this Type-11 record will not overlap, although
1047       this is not prohibited.
1048
1049     o The fifth information item (comments/COM) is an optional unrestricted text string of a
1050       maximum of 10,000 characters in length that allows for comments of any type to be made
1051       on a segment. This comment item could contain word- or phonetic-level transcriptions,
1052       language translations or security classification markings, as specified in exchange
1053       agreements.
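
The constraints on the items of Field 11.026 can be illustrated with a short
sketch. It is not part of the record definition. The names Segment,
check_segment and overlapping_pairs are hypothetical, and two checks are
assumptions rather than stated requirements: that SID starts at 1 (later fields
use 0 as a default marker) and that RET is later than RST. Overlaps on a shared
track are only reported, since the text above does not prohibit them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MAX_TIME_US = 10**11 - 1   # largest 11-digit value, in microseconds
MAX_COM_CHARS = 10_000     # limit on the comments/COM item


@dataclass
class Segment:
    sid: int                   # segment identifier/SID
    trk: List[int]             # tracks/TRK, track numbers starting at 1
    rst: int                   # relative start time/RST, microseconds
    ret: int                   # relative end time/RET, microseconds
    com: Optional[str] = None  # comments/COM


def check_segment(seg: Segment, track_count: int) -> List[str]:
    """Return a list of problems found in one SGD subfield."""
    problems = []
    # Assumption: SID 0 is excluded because later fields use 0 as a default.
    if not 1 <= seg.sid <= 999_999:
        problems.append("SID must fit in 6 digits")
    if any(not 1 <= t <= track_count for t in seg.trk):
        problems.append("TRK lists a track beyond PMO_TRC or CDC_CHC")
    if not (0 <= seg.rst <= MAX_TIME_US and 0 <= seg.ret <= MAX_TIME_US):
        problems.append("RST and RET must fit in 11 digits")
    # Assumption: a segment ends after it starts.
    if seg.ret <= seg.rst:
        problems.append("RET should be later than RST")
    if seg.com is not None and len(seg.com) > MAX_COM_CHARS:
        problems.append("COM exceeds 10,000 characters")
    return problems


def overlapping_pairs(segments: List[Segment]) -> List[Tuple[int, int]]:
    """Report SID pairs that overlap in time on a shared track (a warning only)."""
    pairs = []
    ordered = sorted(segments, key=lambda s: s.rst)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.rst >= a.ret:
                break  # later segments start even later, so none overlap a
            if set(a.trk) & set(b.trk):
                pairs.append((a.sid, b.sid))
    return pairs


# 11 digits of microseconds cover a little under 28 hours.
HOURS_COVERED = MAX_TIME_US / 3_600_000_000
```

A two-track segment such as Segment(1, [1, 2], 0, 5_000_000) passes the checks,
and its TRK item would be written as the string "1,2".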
1054
1055                         22. Field 11.027-030: Reserved Fields
1056
1057   These fields are reserved for future use by ANSI/NIST-ITL.
1058
1059                         23. Field 11.031: Time of Segment Recording/TME
1060
1061   This optional field (Time of Segment Recording/TME) contains subfields, each referring to a
1062   segment identified in either the snip diary SPD of Field 11.024 or the segment diary SGD of
1063   Field 11.026 and gives the date, start, and end times of the original transduction of the
1064   contemporaneous vocalizations in the identified segment. This field is only present if Field
1065   11.024 or Field 11.026 is present in this record. This field also accommodates circumstances in
1066   which the original voice recording was tagged with a time and date field. There is no
1067   requirement that the date and times for the original recording match the dates and times of the
1068   tags, if the tags have been determined to be inaccurate.
1069
1070      o The first information item (diary identifier/DIA) is mandatory in each subfield and is a
1071         binary value that indicates the diary to which this subfield refers. If this item refers to a
1072         segment in the SPD of Field 11.024, the value is 0. If this item refers to a segment in the
1073         SGD of Field 11.026, the value is 1.
1074
1075      o The second information item (segment identifier/SID) is mandatory and gives the
1076         segment identifier from the diary given in DIA to which the values in this subfield
1077         pertain. Together, the first and second information items of each subfield uniquely
1078         identify the segment to which the following items apply.
1079
1080      o The third information item (original recording date/ORD) is optional and gives the date
1081         of the original, contemporaneous capture of the voice data in the segment identified. See
1082         Section 7.7.2.3.
1083
1084      o The fourth information item (tagged date/TDT) is optional and gives the date indicated on
1085         the original, contemporaneous capture of the voice data in the segment identified. This


1086          item may be different from the value of the ORD above, if the tag is determined to be
1087          inaccurate. See Section 7.7.2.3.
1088
1089     o The fifth information item (segment recording start time/SRT) is optional and gives the
1090       local start time of the original, contemporaneous capture of the voice data in the segment
1091       identified. See Section 7.7.2.4 Local date and time for details.
1092
1093     o The sixth information item (tagged start time/TST) is optional and gives the time tagged
1094       on the original, contemporaneous capture of the voice data at the start of the segment
1095       identified. This item may be different from the value of the SRT above, if the tag is
1096       determined to be inaccurate. See Section 7.7.2.4 Local date and time for details.
1097
1098     o The seventh information item (segment recording end time/END) is optional and gives the
1099       local end time of the original, contemporaneous capture of the voice data in the segment
1100       identified. See Section 7.7.2.4 Local date and time for details.
1101
1102     o The eighth information item (tagged end time/TET) is optional and gives the time tagged on
1103       the original, contemporaneous capture of the voice data at the end of the segment identified.
1104       This item may be different from the value of the END above, if the tag is determined to be
1105       inaccurate. See Section 7.7.2.4 Local date and time for details.
1106
1107     o The ninth information item (time source description text/TMD) is an optional string of up
1108       to 300 characters that gives the reference for the values used for ORD, SRT and END.
1109
1110     o The tenth information item (comments/COM) is an unrestricted text string of up to 4000
1111       characters in length that allows for comments of any type to be made on the timings of the
1112       segment recording, including the perceived accuracy of the values of ORD, SRT and
1113       END.
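
As an illustration only, the way DIA and SID together identify a segment can be
sketched as a lookup keyed on that pair, with the determined and tagged dates
held side by side. The class name SegmentTime and its attribute names are
hypothetical. Dates are held as Python date objects because the encoded date
format is defined in Section 7.7.2.3, not here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional, Tuple

SPD, SGD = 0, 1  # DIA values: 0 = snip diary (11.024), 1 = segment diary (11.026)


@dataclass
class SegmentTime:
    recorded: Optional[date] = None  # original recording date/ORD
    tagged: Optional[date] = None    # tagged date/TDT

    def tag_disagrees(self) -> bool:
        """True when both dates are present and the tag differs from ORD."""
        return (self.recorded is not None
                and self.tagged is not None
                and self.recorded != self.tagged)


# (DIA, SID) is the key: the same SID may appear once per diary.
tme: Dict[Tuple[int, int], SegmentTime] = {
    (SGD, 7): SegmentTime(recorded=date(2013, 2, 12), tagged=date(2013, 2, 11)),
    (SPD, 7): SegmentTime(recorded=date(2013, 2, 12)),
}
```

The dates shown are invented sample values. A disagreement between the two
dates is permitted and is the case that the TMD and COM items are meant to
document.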
1114
1115
1116                         24. Field 11.032: Segment Geographical Information/GEO
1117
1118   This field (Segment Geographical Information/GEO) contains repeating subfields, each
1119   referring to a segment identified in either the snip diary SPD of Field 11.024 or the segment
1120   diary SGD of Field 11.026 and giving geographical location of the primary subject of the Type-
1121   11 record at the beginning of that segment. This field is only present if Field 11.024 or Field
1122   11.026 is present in this record.
1123
1124      o The first information item (diary identifier/DIA) is mandatory in each subfield and is a
1125         binary indicator of the diary to which this subfield refers. If this item refers to a segment
1126         in the SPD of Field 11.024, the value is 0. If this item refers to a segment in the SGD of
1127         Field 11.026, the value is 1.
1128
1129      o The second information item (segment identifiers/SID) is mandatory in each subfield and
1130         gives the segment identifiers from the diary to which the values in this subfield pertain. The
1131         number of segment identifiers listed is limited to 600,000. A value of 0 in this subfield

1132          indicates the segment geographical information in this subfield shall be considered the
1133          default value for all segments not specifically identified in other occurrences of this
1134          subfield. If multiple segments are identified, they are designated as integers separated by
1135          commas.
1136
1137      o The third information item (segment cell phone tower code/SCT) is optional and
1138         identifies the cell phone tower, if any, that relayed the audio data at the start of the
1139         segment or segments referred to in this subfield. It is a text field of up to 100 unrestricted
1140         characters.
1141
1142      o The next six information items are latitude and longitude values. See Section 7.7.3.
1143
1144     o The tenth information item (elevation/ELE) is optional. It is expressed in meters. See
1145       Section 7.7.3. Permitted values are in the range of -442 to 8848 meters. For elevations
1146       outside of this range, the lowest or highest values shall be used, as appropriate.
1147
1148     o The eleventh information item (geodetic datum code/GDC) is optional. See Section 7.7.3.
1149
1150     o The twelfth, thirteenth and fourteenth information items (GCM/GCE/GCN) are treated as a
1151       group and are optional. Together, these three information items represent a location as a
1152       Universal Transverse Mercator (UTM) coordinate. If any of these three information items
1153       is present, all shall be present. See Section 7.7.3.
1154
1155     o The fifteenth information item (geographic reference text/GRT) is optional. See Section
1156        7.7.3.
1157
1158     o A sixteenth information item (geographic coordinate other system identifier/OSI) is
1159        optional and allows for other coordinate systems and the inclusion of geographic
1160        landmarks. See Section 7.7.3.
1161
1162     o A seventeenth information item (geographic coordinate other system value/OCV) is
1163        optional and shall only be present if OSI is present in the record. See Section 7.7.3.
1164         The Geographic entry may be modified slightly based upon some issues related to
1165         National Information Exchange Model (NIEM) in the XML encoding.
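
Two of the rules above lend themselves to a brief sketch: clamping ELE to the
permitted range, and the all-or-none rule for the GCM/GCE/GCN group. The
function names are hypothetical and the values passed in the usage note are
placeholders, not encodings taken from Section 7.7.3.

```python
from typing import Optional

ELE_MIN, ELE_MAX = -442, 8848  # permitted elevation range, meters


def clamp_elevation(meters: float) -> float:
    """Substitute the lowest or highest permitted value for out-of-range input."""
    return max(ELE_MIN, min(ELE_MAX, meters))


def utm_group_ok(gcm: Optional[str], gce: Optional[int],
                 gcn: Optional[int]) -> bool:
    """GCM, GCE and GCN must all be present or all be absent."""
    present = [value is not None for value in (gcm, gce, gcn)]
    return all(present) or not any(present)
```

For example, clamp_elevation(9000) gives 8848, and utm_group_ok("zone", None, 2)
is False because only two of the three items are supplied.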
1166
1167
1168                         25. Field 11.033: Segment Quality Values/SQV
1169
1170   This field (Segment Quality Values/SQV) contains repeating subfields, each referring to a list
1171   of segments identified in either the snip diary SPD of Field 11.024 or the segment diary SGD of
1172   Field 11.026. The items in each subfield give an assessment of the quality of the voice data
1173   within the segments identified in the subfield. This field is present only if Field 11.024 or Field
1174   11.026 exists in the record. This contrasts with Field 11.014 that gives the general quality across
1175   the entire audio recording. Values in this field dominate any values given in Field 11.014. It is
1176   possible for each segment given in the associated diary to have different quality. The subfields


1177 accommodate only a single quality value. If segments have multiple quality values based on
1178 different types of quality assessments, then multiple subfields are entered for those segments.
1179
1180     o The first information item (diary identifier/DIA) is mandatory and is a binary indicator of
1181         the diary to which this subfield refers. If this item refers to a segment in the SPD of Field
1182         11.024, the value is 0. If this item refers to a segment in the SGD of Field 11.026, the
1183         value is 1.
1184
1185     o The second information item (segment identifiers/SID) is a mandatory list of integers and
1186         gives the segment identifiers from the diary to which the values in this subfield pertain.
1187         The number of segment identifiers listed is limited to 600,000. A value of 0 in this
1188         subfield indicates the segment quality information in this subfield shall be considered the
1189         default value for all segments not specifically identified in other subfields of this field. If
1190         multiple segments are entered, they are listed as integers separated by commas.
1191
1192     o The third information item (quality value/QVU) is mandatory and shall indicate the
1193         segment quality value between 0 (low quality) and 100 (high quality). A value of 255
1194         indicates that quality was not assessed. An example of a quality measure is the Speech
1195         Intelligibility Index, ANSI S3.5-1997.
1196
1197     o A fourth information item is mandatory and shall specify the ID of the vendor of the
1198         quality assessment algorithm used to calculate the quality score, which is an algorithm
1199         vendor identification/QAV. This 4-digit hex value (see Section 5.5 Character types)
1200         is assigned by IBIA and expressed as four characters. The IBIA maintains the Vendor
1201         Registry of CBEFF Biometric Organizations that map the value in this subfield to a
1202         registered organization. A value of 0x00 indicates a vendor without a designation by
1203         IBIA. In such case, an entry shall be made in COM of this subfield describing the
1204         algorithm and its owner/vendor.
1205
1206     o A fifth information item is mandatory and shall specify a numeric product code assigned
1207         by the vendor of the quality assessment algorithm, which may be registered with the
1208         IBIA, but registration is not required. This is the algorithm product identification/QAP
1209         that indicates which of the vendor’s algorithms was used in the calculation of the quality
1210         score. This information item contains the integer product code and should be within the
1211         range 0 to 65,534. A value of 0 indicates a vendor without a designation by IBIA. In
1212         such case, an entry shall be made in COM of this subfield describing the algorithm and
1213         its owner/vendor.
1214
1215      o The sixth information item (comments/COM) is optional but shall be used to provide
1216           information about the quality assessment process, including a description of any
1217           unregistered quality assessment algorithms used (if QAV = 0x00 or QAP = 0).
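
The value rules for one SQV subfield can be sketched as a small check. This is
an illustration under assumptions, not a normative test: the function name is
hypothetical, QAV is taken to be a string of four hexadecimal characters, and an
all-zero QAV is assumed to be the "no IBIA designation" value written above as
0x00.

```python
import string
from typing import List, Optional


def check_quality_subfield(qvu: int, qav: str, qap: int,
                           com: Optional[str] = None) -> List[str]:
    """Return a list of problems found in one SQV subfield."""
    problems = []
    if not (0 <= qvu <= 100 or qvu == 255):
        problems.append("QVU must be 0-100, or 255 for 'not assessed'")
    qav_ok = len(qav) == 4 and all(c in string.hexdigits for c in qav)
    if not qav_ok:
        problems.append("QAV must be four hexadecimal characters")
    if not 0 <= qap <= 65534:
        problems.append("QAP should be within 0 to 65,534")
    # Assumption: an all-zero QAV means the vendor has no IBIA designation.
    unregistered = (qav_ok and int(qav, 16) == 0) or qap == 0
    if unregistered and not com:
        problems.append("COM shall describe the algorithm and its owner/vendor")
    return problems
```

A subfield with QVU 255, QAV "0000" and QAP 0 is reported as incomplete until a
COM entry describing the algorithm is supplied.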
1218
1219
1220                         26. Field 11.034: Vocal Collision Identifier/VCI
1221


1222   This optional field (Vocal Collision Identifier/VCI) contains 2 mandatory information items,
1223   each referring to a list of segments identified in either the snip diary SPD of Field 11.024 or the
1224   segment diary SGD of Field 11.026 and indicating that a vocal collision (two or more persons
1225   talking at once) occurs within the segment. This field shall only appear if Field 11.024 or Field
1226   11.026 exists in this record.
1227
1228      o The first information item (diary identifier/DIA) is mandatory and is a binary indicator of
1229         the diary to which this subfield refers. If this item refers to a segment in the SPD of Field
1230         11.024, the value is 0. If this item refers to a segment in the SGD of Field 11.026, the
1231         value is 1.
1232
1233      o The second information item (segment identifiers/SID) is a mandatory list of integers
1234          separated by commas and gives the segment identifiers from the diary named in the item
1235          above in which vocal collisions occur. There may be up to 600,000 segments identified
1236          in this subfield.
1237
1238
1239                          27. Field 11.035: Processing Priority/PPY
1240
1241   This optional field (Processing Priority/PPY) contains repeating subfields, each referring to a
1242   list of segments identified in either the snip diary SPD of Field 11.024 or the segment diary SGD
1243   of Field 11.026 and indicating the priority with which the segments named in those diaries
1244   should be processed. If this field exists, segments not identified should be given the lowest
1245   priority.
1246
1247      o The first information item (diary identifier/DIA) is mandatory and is a binary indicator of
1248         the diary to which this subfield refers. If this item refers to a segment in the SPD of Field
1249         11.024, the value is 0. If this item refers to a segment in the SGD of Field 11.026, the
1250         value is 1.
1251
1252      o The second information item (segment identifiers/SID) is a mandatory list of integers,
1253         separated by commas, and gives the segment identifiers from the diary named in the first
1254         information item above to which the values in this subfield pertain. There may be up to
1255         600,000 values in this item, one for each segment identified in the diaries of Field 11.024
1256         or Field 11.026. A value of 0 in this item indicates the processing priority in
1257         this subfield shall be considered the default value for all segments not specifically identified
1258         in other subfields of this field.
1259
1260      o The third information item (processing priority/PTY) is mandatory if this field is used
1261         and indicates the priority with which the segments identified in this subfield should be
1262         processed. Priority values shall be between 1 and 9 inclusive. A value of 1 will indicate
1263         the highest priority and 9 the lowest.
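
How the default rules combine can be shown with a minimal sketch for the
subfields that refer to one diary. The function names are hypothetical. The
sketch assumes that a subfield listing SID 0 supplies the default, and that
without such a subfield unidentified segments fall to the lowest priority, 9.

```python
from typing import Dict, Iterable, List, Tuple

LOWEST_PRIORITY = 9  # priority 1 is highest, 9 is lowest


def resolve_priorities(subfields: Iterable[Tuple[List[int], int]],
                       diary_sids: Iterable[int]) -> Dict[int, int]:
    """Map every SID in the diary to its priority.

    subfields holds (SID list, PTY) pairs that all refer to the same diary.
    """
    explicit: Dict[int, int] = {}
    default = LOWEST_PRIORITY
    for sids, pty in subfields:
        if not 1 <= pty <= 9:
            raise ValueError("PTY shall be between 1 and 9 inclusive")
        for sid in sids:
            if sid == 0:
                default = pty        # SID 0 sets the default priority
            else:
                explicit[sid] = pty
    return {sid: explicit.get(sid, default) for sid in diary_sids}


def processing_order(priorities: Dict[int, int]) -> List[int]:
    """Order SIDs from highest priority (1) to lowest, ties broken by SID."""
    return sorted(priorities, key=lambda sid: (priorities[sid], sid))
```

With subfields ([3, 5], 1) and ([4], 2) over a diary holding segments 1, 3, 4
and 5, segment 1 is not identified and therefore receives priority 9.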
1264
1265
1266                          28. Field 11.036: Segment Content/SCN
1267
1268   This optional field (Segment Content/SCN) contains subfields, each referring to a segment
1269   identified in either the snip diary SPD of Field 11.024 or the segment diary SGD of Field
1270   11.026. Each subfield gives an assessment of the content of the voice data within the identified
1271   segment and includes provision for semantic transcripts, phonetic transcriptions and translations
1272   of the segment. It may only appear if Field 11.024 or Field 11.026 is present in this record. At
1273   least one of the third, fourth, fifth, sixth or seventh information items must be used if this field is
1274   used.
1275
1276       o The first information item (diary identifier/DIA) is mandatory and is a binary indicator of
1277          the diary to which this subfield refers. If this item refers to a segment in the SPD of Field
1278          11.024, the value is 0. If this item refers to a segment in the SGD of Field 11.026, the
1279          value is 1.
1280
1281       o The second information item (segment identifiers/SID) is a mandatory list of integers
1282          separated by commas and gives the segment identifiers from the diary to which the values in
1283          this subfield pertain. There may be up to 600,000 values in this item, one for each segment
1284          identified in the related diary. A value of 0 in this item indicates the segment content
1285          information in this subfield shall be considered the default value for all segments not
1286          specifically identified in other subfields of this field.
1287
1288       o The third information item (transcript text/TRN) is mandatory if this field is used and
1289          shall be a text field of up to 100,000 characters. It may contain a semantic transcription
1290          of the segment.
1291
1292       o The fourth information item (phonetic transcript text/PTT) is an optional text field
1293          containing a phonetic transcription of the segment.
1294
1295       o The fifth information item (translation text/TLT) is an optional text field containing a
1296          translation of the segment into a language other than the one in which the original
1297          segment was spoken.
1298
1299       o The sixth information item (segment content comments/COM) is an optional text field
1300          containing comments on the content of the segment.
1301
1302       o The seventh information item (transcript authority comment text/TAC) is an optional
1303          text field and shall state the authority providing the transcription, translation or comments
1304          if TRN, PTT, TLT or COM is used. If an automated process was used to develop the transcript,
1305          information about the process (i.e., the automated algorithm used) should be included in
1306          this text.
1307
1308                           29. Field 11.037: Segment Speaker Characteristics/SCC
1309
1310   This optional field (Segment Speaker Characteristics/SCC) contains subfields, each referring to
1311   a segment identified in either the snip diary SPD of Field 11.024 or the segment diary SGD of
1312   Field 11.026. Each subfield gives an assessment of the characteristics of the voice within the


1313 segment, including intelligibility, emotional state and impairment. This field shall only appear if
1314 Field 11.024 or Field 11.026 exists in the record.
1315
1316     o The first information item (diary identifier/DIA) is mandatory and is a binary indicator of
1317         the diary to which this subfield refers. If this item refers to a segment in the SPD of Field
1318         11.024, the value is 0. If this item refers to a segment in the SGD of Field 11.026, the
1319         value is 1.
1320
1321     o The second information item (segment identifiers/SID) is a mandatory list of integers
1322         separated by commas and gives the segment identifiers from the diary to which the
1323         values in this subfield pertain. There may be up to 600,000 values in this item, one for
1324         each segment identified in the diary. A value of 0 indicates the speaker characteristics
1325         information in this subfield shall be considered the default value for all segments not
1326         specifically identified in other subfields of this field.
1327
1328    o The third information item (impairment level number/IMP) is optional and shall indicate
1329        an observed level of neurological diminishment, whether from fatigue, disease, trauma, or
1330        the influence of medication/substances, across the speech segments identified. No attempt
1331        is made to differentiate the sources of impairment. The value shall be an integer between 0
1332        (no noticed impairment) and 5 (significant), inclusive.
1333
1334    o The fourth information item (dominant spoken language code/DSL) is optional and gives
1335        the 3-character ISO 639-3 code for the dominant language in the segments identified in
1336        this subfield.
1337
1338    o The fifth information item (language proficiency scale number/LPS) is an optional integer
1339        and rates the fluency of the language being spoken on a scale of 0 (no proficiency) to 9
1340        (high proficiency).
1341
1342    o The sixth information item (speech style code/STY) is optional and shall be an integer as
1343        given in Supplement Table 5. There may be no more than one value for each of the
1344        segments identified in this subfield, and it indicates the dominant style of speech within
1345        the segments. If attribute code “12” is chosen to indicate “other”, additional explanation
1346        should be included in the fifteenth item (comments/COM) below.
1347
1348                                                     Table 5
1349                                                 Speech Style
                     Speech Style                                       Attribute Code
                     Unknown                                            0
                     Public speech (oratory)                            1
                     Conversational telephone                           2
                     Conversation face-to-face                          3
                     Read                                               4
                     Prompted/repeated                                  5
                     Storytelling/Picture description                   6
                     Task induced speech                                7
                     Interview                                          8
                     Recited/memorized                                  9
                     Spontaneous/free                                   10
                     Variable                                           11
                     Other                                              12
                     RESERVED FOR FUTURE USE only by ANSI/NIST-ITL      13-20
1350
1351
1352   o The seventh information item (intelligibility scale code/INT) is optional and shall be an
1353     integer from 0 (unintelligible) to 9 (clear and fully intelligible).
1354
1355   o The eighth information item (familiarity degree code/FDC) is an optional integer
1356      between 0 and 5, inclusive, and indicates the degree of familiarity between the data
1357      subject and the interlocutor, which ranges from 0 indicating no familiarity to 5 indicating
1358      high familiarity/intimacy.
1359
1360   o The ninth information item (health comment/HCM) is optional text noting any observable
1361     health issues impacting the data subject during the speech segment, such as symptoms of
1362     the common cold (hoarse voice, pitch lowering, increased nasality) and an indicator if the
1363     data subject regularly smokes tobacco products.
1364
1365   o The tenth information item (emotional state code/EMC) is an optional integer giving an
1366      estimation of the emotional state of the data subject across the segments identified in this
1367      subfield. Admissible attribute values are given in Supplement Table 6. Only one value
1368      for this item is allowed across all of the segments identified in this subfield. If attribute
1369      code “9” or “10” is chosen to indicate “variable” or “other”, additional explanation may be
1370      included in the fifteenth information item (comments/COM) below.
1371
1372
1373
1374
1375
1376
1377
1378
1379                                           Table 6
1380                                       Emotional State
                     Emotional State                                    Attribute Code
                     Unknown                                            0
                     Calm                                               1
                     Hurried                                            2
                     Happy/joyful                                       3
                     Angry                                              4
                     Fearful                                            5
                     Agitated/Combative                                 6
                     Defensive                                          7
                     Crying                                             8
                     Variable                                           9
                     Other                                              10
                     RESERVED FOR FUTURE USE only by ANSI/NIST-ITL      11-20
1381
1382
1383   o The eleventh information item (vocal effort scale number/VES) is an optional integer
1384     between 0 (very low vocal effort) and 5 (screaming/crying) which reports perceived vocal
1385     effort of the data subject across the identified segments. Only one value is allowed for this
1386     item in each subfield.
1387
1388   o The twelfth information item (vocal style code/VSC) is an optional integer assessing the
1389     predominant vocal style of the data subject across the identified segments. The attribute
1390     value shall be chosen from Supplement Table 7. Only one value is allowed for this item
1391     in each subfield.
1392
1393                                                Table 7
1394                                              Vocal Style
                     Vocal Style                                        Attribute Code
                     Unknown                                            0
                     Spoken                                             1
                     Whispered                                          2
                     Sung                                               3
                     Chanted                                            4
                     Rapped                                             5
                     Mantra                                             6
                     Falsetto/Head voice                                7
                     Spoken with laughter                               8
                     Megaphone/Public Address System                    9
                     Shouting/yelling                                   10
                     Other                                              11
                     RESERVED FOR FUTURE USE only by ANSI/NIST-ITL      12-20
1395
1396
1397     o The thirteenth information item (recording awareness indicator/RAI) is optional and
1398        indicates whether the data subject is aware that a recording is being made. 0 indicates
1399        unknown, 1 indicates aware and 2 indicates unaware.
1400
1401     o The fourteenth information item (script text/SCR) is optional and may be used to give
1402        the script used for read, prompted or repeated speech. This item may have up to 9,999
1403        characters.
1404
1405     o The fifteenth information item (comments/COM) is optional and may be used to give
1406        additional information about the characteristic assessment process, including a
1407        description of any characteristic assessment algorithms used, notes on any known
1408        external stresses applicable to the data subject, such as extreme environmental
1409           conditions or heavy physical or cognitive load, and a description of how the values in
1410           the items of this subfield were assigned. If the sixth information item indicates read or
1411           prompted speech, this item may contain the read or prompted text. This item may have
1412           up to 4,000 characters.
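
The integer ranges given for the coded items of Field 11.037 can be collected
into one table-driven check, sketched below. The names RANGES and check_scc are
hypothetical. The upper bounds for STY, EMC and VSC are taken from the assigned
codes in Supplement Tables 5, 6 and 7, excluding the reserved values. Only the
STY rule on code 12 is flagged, because the text says that explanation "should"
be included in that case.

```python
from typing import Dict, List, Tuple

# Inclusive (low, high) bounds for the integer-coded items of one SCC subfield.
RANGES: Dict[str, Tuple[int, int]] = {
    "IMP": (0, 5),   # impairment level number
    "LPS": (0, 9),   # language proficiency scale number
    "STY": (0, 12),  # speech style code, Supplement Table 5
    "INT": (0, 9),   # intelligibility scale code
    "FDC": (0, 5),   # familiarity degree code
    "EMC": (0, 10),  # emotional state code, Supplement Table 6
    "VES": (0, 5),   # vocal effort scale number
    "VSC": (0, 11),  # vocal style code, Supplement Table 7
    "RAI": (0, 2),   # recording awareness indicator
}

STY_OTHER = 12


def check_scc(items: dict) -> List[str]:
    """Return a list of problems for the items supplied in one SCC subfield."""
    problems = []
    for name, value in items.items():
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                problems.append(f"{name} must be between {low} and {high}")
    if items.get("STY") == STY_OTHER and not items.get("COM"):
        problems.append("STY 12 (Other) should be explained in COM")
    return problems
```

All items are optional, so an item that is absent from the input is simply not
checked.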
1413
1414
1415                         30. Field 11.038: Segment Channel/SCH
1416
1417   This field (Segment Channel/SCH) contains subfields, each referring to a segment identified in
1418   either the snip diary SPD of Field 11.024 or the segment diary SGD of Field 11.026. Each
1419   subfield describes the transducer and transmission channel within the identified segments. This
1420   field shall only be present if Field 11.024 or Field 11.026 appears in this record.
1421
1422      o The first information item (diary identifier/DIA) is mandatory and is a binary indicator of
1423         the diary to which this subfield refers. If this item refers to a segment in the SPD of Field
1424         11.024, the value is 0. If this item refers to a segment in the SGD of Field 11.026, the
1425         value is 1.
1426
1427      o The second information item (segment identifiers/SID) is a mandatory list of integers
1428         separated by commas, and gives the segment identifiers from the diary to which the
1429         values in this subfield pertain. There may be up to 600,000 values in this item. A value
1430         of 0 in this item indicates the segment content information in this subfield shall be
1431         considered the default value for all segments not specifically identified in other subfields
1432         of this field.
1433
1434      o The third information item (audio capture device type code/ACD) is an optional integer
1435         with attribute values given in Supplement Table 8. A value of “2” indicates that more
1436         than one type of microphone is being used simultaneously to collect the audio signal. It is
1437         recognized that for most of the acquisition sources in Field 11.006 REC_AQS, as
1438         specified by Table 88, the transducer type will not be known.
1439
1440
1441
1442
1443                                              Table 8
1444                                     Audio Capture Device Type Code
                        Device Type                                     Attribute Code
                        Unknown                                         0
                        Array                                           1
                        Multiple style microphones                      2
                        Earbud                                          3
                        Body Wire                                       4
                        Microphone                                      5
                        Handset                                         6
                        Headset                                         7
                        Speaker phone                                   8
                        Lapel Microphone                                9
                        Other                                           10
                        RESERVED FOR FUTURE USE only by ANSI/NIST-ITL   11-99
1445
1446
1447   o The fourth information item (microphone type code/MTC) is an optional integer that
1448       specifies the transducer type as unknown=0, carbon=1, electret=2, dynamic=3, or other=4.
1449       Transducer arrays using mixed transducer types shall be designated “other”.
1450
1451   o The fifth information item (capture environment description text/ENV) is an optional text
1452       field of up to 4000 characters to describe the acoustic environment of the recording.
1453       Examples of text placed in this item would be “reverberant busy restaurant”, “urban
1454       street”, “public park during day”.
1455
1456   o The sixth information item (transducer distance/DST) is an optional integer and specifies
1457       the approximate distance in centimeters, rounded to the nearest integer number of
1458       centimeters, between the speaker in the identified segments and the transducer. A value of
1459       0 will be used if the distance is less than one centimeter. Some example distances, to be
1460       used unless other information is available: handheld = 5 cm; throat mic = 0 cm; mobile
1461       telephone = 15 cm; Voice-over-internet-protocol (VOIP) with a computer = 80 cm.
1462
1463   o The seventh information item (acquisition source/ACS) is an optional integer that specifies
1464       the source from which the voice in the identified segments was received. Only one value
1465       is allowed. Permissible values are given in Table 88 of the Type-20 record. Any conflict
1466       between this value and Field 11.006 REC_AQS shall be resolved by taking this item to
1467       be correct for all segments identified in the subfield, SCH_DIA and SCH_SID, of this
1468       occurrence of Field 11.038.
1469
1470   o The eighth information item (voice modification description text/VMT) is an optional,
1471       unrestricted string for a description of any digital masking between transducer and
1472       recording, disguisers or other attempts to change the voice quality. Any processing
1473       techniques used on the recording should be indicated, such as Automated Gain Control
1474       (AGC), noise reduction, etc.
1475
1476   o The ninth information item (comments/COM) is an optional, unrestricted string for
1477       additional information to identify or describe the transduction and transmission channels
1478       of the identified segments.
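
The nine information items of a Field 11.038 subfield, the default rule for a segment identifier of 0, and the rule that ACS takes precedence over the record-level acquisition source can be sketched as follows. This is an illustrative sketch only: the class and function names are hypothetical, the in-memory representation is an assumption (the standard defines the items, not a programming interface), and scoping the default subfield to a single diary is an interpretation rather than a stated requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SchSubfield:
    """One subfield of Field 11.038 (hypothetical in-memory form)."""
    dia: int                    # 0 = SPD of Field 11.024, 1 = SGD of Field 11.026
    sid: List[int]              # segment identifiers; 0 marks the default subfield
    acd: Optional[int] = None   # Supplement Table 8 code (0-10 assigned, 11-99 reserved)
    mtc: Optional[int] = None   # 0 unknown, 1 carbon, 2 electret, 3 dynamic, 4 other
    env: Optional[str] = None   # up to 4000 characters
    dst: Optional[int] = None   # transducer distance in whole centimeters
    acs: Optional[int] = None   # acquisition source for the identified segments
    vmt: Optional[str] = None   # unrestricted string
    com: Optional[str] = None   # unrestricted string

    def problems(self) -> List[str]:
        """Return a list of constraint violations; empty if none were found."""
        found = []
        if self.dia not in (0, 1):
            found.append("DIA must be 0 or 1")
        if not self.sid or len(self.sid) > 600_000:
            found.append("SID must hold between 1 and 600,000 values")
        if self.acd is not None and not 0 <= self.acd <= 10:
            found.append("ACD must be an assigned Table 8 code (0-10)")
        if self.mtc is not None and not 0 <= self.mtc <= 4:
            found.append("MTC must be between 0 and 4")
        if self.env is not None and len(self.env) > 4000:
            found.append("ENV exceeds 4000 characters")
        if self.dst is not None and self.dst < 0:
            found.append("DST must not be negative")
        return found


def distance_to_dst(cm: float) -> int:
    """Convert a measured distance to a DST value.

    Distances under one centimeter map to 0, as the text requires.
    Rounding exact halves upward is an assumption; the text only says
    "rounded to the nearest integer".
    """
    if cm < 1:
        return 0
    return int(cm + 0.5)


def find_subfield(subfields: List[SchSubfield], dia: int,
                  segment_id: int) -> Optional[SchSubfield]:
    """Return the subfield naming the segment, else the default (SID 0), else None."""
    default = None
    for sf in subfields:
        if sf.dia != dia:
            continue
        if segment_id in sf.sid:
            return sf
        if 0 in sf.sid:
            default = sf
    return default


def effective_acquisition_source(sf: Optional[SchSubfield],
                                 record_aqs: Optional[int]) -> Optional[int]:
    """ACS, when present, overrides the record-level acquisition source."""
    if sf is not None and sf.acs is not None:
        return sf.acs
    return record_aqs
```

For example, with one default subfield (SID 0) and one subfield naming segments 3 and 4 of the same diary, a lookup for segment 3 returns the specific subfield, while a lookup for segment 7 falls back to the default.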
1479
1480
1481                        31. Field 11.039-050: Reserved Fields
1482
1483 These fields are reserved for future use by ANSI/NIST-ITL.
1484
1485
1486                        32. Field 11.051: Comments/COM
1487
1488   This field (Comments/COM) is an optional unrestricted text string of up to 4000 characters in
1489   length that may contain comments of any type on the Type-11 record as a whole. Comments on
1490   individual segments shall be given in Field 11.024, SNP_COM, or in Field 11.026,
1491   SGD_COM. This field should record any intellectual property rights associated with any of the
1492   segments in the voice recording, any court orders related to the voice recording and any
1493   administrative data not included in other fields.
1494
1495
1496                         33. Fields 11.052-099: Reserved Fields
1497
1498   These fields are reserved for future use by ANSI/NIST-ITL.
1499
1500
1501                         34. Fields 11.100-900: User-defined fields/UDF
1502
1503   These fields are user-defined fields. Their size and content shall be defined by the user and be in
1504   accordance with the receiving agency.
1505
1506
1507                         35. Field 11.901: Reserved field
1508   This field is reserved for future use by ANSI/NIST-ITL.
1509
1510
1511                         36. Field 11.902: Annotation information/ANN
1512
1513   This is an optional field, listing the operations performed on the original source in order to
1514   prepare it for inclusion in a biometric record type. This field logs information pertaining to this
1515   Type-11 record and the voice recording pointed to or included herein. See Section 7.4.1. This
1516   section is not intended to contain any transcriptions or translations themselves, but may contain
1517   information about the source of such fields in the record.
1518
1519
1520
1521                         37. Fields 11.903-992: Reserved Fields
1522
1523   These fields are reserved for future use by ANSI/NIST-ITL.
1524
1525
1526                         38. Field 11.993: Source agency name/SAN
1527
1528   This is an optional field. It may contain up to 125 Unicode characters. This is the name of the
1529   agency referred to in Field 11.004 using the identifier given by the domain administrator.
1530
1531

1532                          39. Field 11.994: External file reference/EFR
1533
1534   This conditional field shall be used to enter the URL/URI or other unique reference to a storage
1535   location for all source representations, if the data is not contained in Field 11.999. If this field is
1536   used, Field 11.999 shall not be set. However, one of the two fields shall be present in all
1537   instances of this record type. A non-URL reference might be similar to: “Case 2009:1468 AV
1538   Tape 5”. It is highly recommended that the user state the format of the external file in Field
1539   11.051: Comments/COM.
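
The requirement that exactly one of Field 11.994 and Field 11.999 be present can be expressed as a simple check. The sketch below is illustrative only; the function name and the use of a string and a byte sequence for the two fields are assumptions.

```python
from typing import Optional


def check_voice_data_location(efr: Optional[str],
                              data: Optional[bytes]) -> str:
    """Return "external" or "embedded"; raise ValueError if the rule is broken.

    efr  -- content of Field 11.994 (URL/URI or other unique reference), if any
    data -- content of Field 11.999, if any
    (Hypothetical helper; empty values are treated as absent.)
    """
    has_efr = bool(efr)
    has_data = bool(data)
    if has_efr and has_data:
        raise ValueError("Field 11.999 shall not be set when Field 11.994 is used")
    if not has_efr and not has_data:
        raise ValueError("one of Field 11.994 or Field 11.999 shall be present")
    return "external" if has_efr else "embedded"
```

A non-URL reference such as the "Case 2009:1468 AV Tape 5" example above is accepted, since the check tests only for presence and does not parse the reference.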
1540
1541
1542                          40. Field 11.995: Associated Context/ACN
1543
1544   This optional field applies to all audio object type records, not just ones with included binary
1545   files as Type-21 Records. See Section 7.3.3. Record Type-21 contains audio, video and images
1546   that are NOT used to derive the biometric data in Field 11.999: Voice Record/DATA but that
1547   may be relevant to the collection of that data.
1548
1549
1550                          41. Field 11.996: Hash/HAS
1551
1552   This optional field applies to all digital audio records, whether stored in Field 11.999 or
1553   referenced at an external storage location in Field 11.994, and shall contain the hash value of the
1554   data in Field 11.999: Voice Data of this record, calculated using SHA-256. See Section 7.5.2.
1555   Use of the hash enables the receiver of the data to check that the data has been transmitted
1556   correctly, and may also be used for quick searches of large databases to determine if the data
1557   already exist in the database. It is not intended as an information assurance check, which is
1558   handled by Record Type-98.
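
A receiver could recompute and compare the hash along the following lines. This is a minimal sketch using the Python standard library; the function names are hypothetical, and representing the hash as a hexadecimal string is an assumption here (see Section 7.5.2 for the governing requirements).

```python
import hashlib


def voice_data_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the voice data as lowercase hexadecimal."""
    return hashlib.sha256(data).hexdigest()


def hash_matches(data: bytes, has_value: str) -> bool:
    """Check received voice data against the value carried in Field 11.996.

    The comparison ignores letter case and surrounding whitespace
    (an assumption about how the value may have been written).
    """
    return voice_data_hash(data) == has_value.strip().lower()
```

As the text notes, a match indicates only that the data was transmitted without alteration; it is not an information assurance check.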
1559
1560
1561                          42. Field 11.997: Source representation/SOR
1562
1563   This optional field refers to a representation in Record Type-20 with the same SRN.
1564
1565
1566
1567                            43. Field 11.998: Reserved field
1568
1569   This field is reserved for future use by ANSI/NIST-ITL.
1570
1571
1572                          44. Field 11.999: Voice record/DATA
1573
1574   This field contains the voice data. See Section 7.2 for details.
1575
1576
1577   Annex B of ANSI/NIST-ITL 1-2011 Table 97 is updated as follows:
1578
1579   Record Identifier    Logical record contents          Type of Data
1580      11                Voice                           ASCII/Binary
1581
1582   Annex B Section B.2.7 is updated:
1583
1584   There are no special requirements for this record type.
1585
1586   Annex G, Insert table Type-11
1587
1588   DEVELOP TABLE FOR XML REPRESENTATION HERE
1589



