									Chapter 4


Fields




Tasks Map for Basic Parameters of Fields

When you want to ...      Use ...                        And/Or          Described in ...
                                                         Related
                                                         Definition...

Create a field definition DMDBA screen 11 or 12                          “Fields,” “Field
                          or, in statement mode, FIELD                   Validation,” “Field
                                                                         Display,” “Field Indexing,
                                                                         Searching, and Groupings
                                                                         of Characteristics,” “Virtual
                                                                         Fields and Expressions,”
                                                                         “ADM Screens,” “ADM
                                                                         Definitions,” and “REVL
                                                                         Screen and Statements”

Define a required or an   DMDBA screen 11 - page 4 or    PARAMETER       “Required or Optional” and
optional field,           screen 12                      _SET            “Simple or Compound”
Define a simple or a      or, in statement mode,                         sections¹
compound field            FIELD OCCURS





Establish a field’s data   DMDBA screen 11 - page 2 or    DOMAIN          “Data Types” section
type                       screen 12                      _CONTROL
                           or, in statement mode,         _SET
                           FIELD TYPE

Define characteristics     DMDBA screen 11 - pages 1 or   DOMAIN          “Defining Numeric Fields”
of a numeric field         2 or screen 12                 _CONTROL        section
                           or, in statement mode,         _SET
                           FIELD
                             TYPE,
                             PRECISION,
                             SCALE,
                             USAGE

Define characteristics     DMDBA screen 11 - page 1 or 2  DOMAIN          “Defining Non-numeric
of a character field       or screen 12                   _CONTROL        Fields” section
                           or, in statement mode,         _SET
                           FIELD
                             TYPE,
                             SIZE,
                             USAGE

Define automatic           DMDBA screen 11 - page 2       DOMAIN          “Automatic Editing of Data
editing of data for        or, in statement mode,         _CONTROL        for Storage” section and
storage                    FIELD                          _SET,           “ADM Definitions”
                             RAISE_DATA,                  PARAMETER
                             BLANK_CONTROL_DATA,          _SET
                             EDIT_BLANKS,
                             HIDDEN_STRING
                             _DELIMITERS,
                             COMPRESS

Define a text stream or    DMDBA screen 11 - pages 1, 2,  PARAMETER       “Defining Text Image and
text image field           and 5 or, in statement mode,   _SET            Text Stream Fields” section
                           FIELD                                          and “ADM Definitions”
                             FORMAT=TEXT_IMAGE
                             CONTEXT_PARSER
                             CONTEXT_UNIT
                             USAGE=TEXT_STREAM





Define fields in              DMDBA screen 11 - pages 1        DOMAIN             “Defining Fields in
sectioned record¹             and 2 or, in statement mode,     _CONTROL           Continuous and Sectioned
                              FIELD                            _SET               Records” section
                                SECTION_FIELD=YES                                 and
                                USAGE=SECTION_NAME                                “ADM Definitions”
                                USAGE=SECTION_TITLE
                                USAGE=SECTION_NUMBER
                                USAGE=SECTION_LEVEL

Define complex fields         DMDBA screen 27 - page 1 or,     FIELD              “Defining Complex Fields
and arrays                    in statement mode,               SEARCH             and Arrays” section and
                              SEARCH_CONTROL_SET               _CONTROL           “ADM Definitions”

¹ Sectioned records are not supported on Windows.




Basic Requirements
                     Each record in a BASIS database consists of fields that contain data. Each field
                     in a record must be defined in the ADM. Although the syntax for the FIELD definition in
                     “ADM Definitions” includes more than 35 parameters, only four are required:
                        FIELD=field_name
                        OCCURS=integer | integer:integer
                        SIZE=integer | integer:integer        (for a character field)
                        TYPE=data_type

                     If the field is defined with a USAGE, DOMAIN, or PARAMETER_SET, one or more of
                     the required parameters may be implicitly included in such a parameter. For example, if
                     you define a field with USAGE=TEXT_STREAM, that usage implies
                     TYPE=CHARACTER and OCCURS=0:1. For information about usages, see the
                     “Specifying Standard Attributes (Usages)” topic in the “Defining Numeric Fields” and
                     “Defining Character Fields” sections later in this chapter. For information about domains
                     and parameter sets, see “Field Indexing, Searching, and Groupings of Characteristics.”




              Required or Optional
              The integer value assigned to the OCCURS parameter designates whether the field is
              required or optional. OCCURS can be assigned an integer value or a range of integer
              values—integer or integer:integer. A simple field that is required has OCCURS=1. A
              simple field that is optional has OCCURS=0:1, indicating that the field may have 0 or 1
              occurrence of its value.

              OCCURS=1 means that the field can have one and only one value; it is a required simple
              field.

              OCCURS=0:1 means that the field can have zero or one value; it is an optional simple
              field.

              Simple or Compound
              A simple field contains one value. A compound field contains a list of values. Each value
              in a compound field is called a subfield.

              The integer value assigned to the OCCURS parameter not only designates whether the
              field is required or optional but also whether the field is simple or compound. A simple
              field has OCCURS=1 or OCCURS=0:1, depending on whether it is required or optional.
              A compound field that is required has an integer for OCCURS that is greater than 1, or a
              range that starts at 1 or more and ends above 1. A compound field that is optional has an
              integer range from 0 to an integer greater than 1.

              OCCURS=7 means that the field must have no more and no less than seven values; it is a
              required compound field.

              OCCURS=1:12 means that the field can have from 1 to 12 values; it is a required
              compound field.

              OCCURS=0:20 means that the field can have 0 or as many as 20 values; it is an optional
              compound field.

              A compound field can have as many as 4000 subfields or occurrences.
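
The OCCURS rules above can be summarized in a short Python sketch. The function name classify_occurs and its return format are illustrative assumptions and are not part of BASIS; only the rules themselves and the 4000-subfield limit come from the text.

```python
MAX_SUBFIELDS = 4000  # upper limit on occurrences stated in the text


def classify_occurs(spec: str) -> tuple[str, str]:
    """Classify an OCCURS value such as "1", "0:1", "7", or "0:20".

    Hypothetical helper: returns ("required" | "optional", "simple" | "compound").
    """
    parts = spec.split(":")
    if len(parts) == 1:
        low = high = int(parts[0])
    elif len(parts) == 2:
        low, high = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"malformed OCCURS value: {spec!r}")
    if not 0 <= low <= high or not 1 <= high <= MAX_SUBFIELDS:
        raise ValueError(f"OCCURS out of range: {spec!r}")
    presence = "optional" if low == 0 else "required"   # a minimum of 0 makes it optional
    shape = "simple" if high == 1 else "compound"       # more than one value makes it compound
    return presence, shape
```

Under these assumptions, classify_occurs("1:12") reports a required compound field and classify_occurs("0:1") reports an optional simple field.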

              Data Types
              The data type, which must be declared for each field, specifies the format of the data for
              the field in computer storage and implies how the field is to be used in comparisons and
              calculations.




BASIS supports a rich set of data types. Types that can be specified with the TYPE
parameter are listed below in all uppercase. (Numeric, Non-numeric, and Unformatted
below are not data types but words used to group data types.)


 Numeric                        Non-numeric
   EXACT_BINARY                   LOGICAL
     INTEGER                      CHARACTER
   EXACT_DECIMAL
   APPROXIMATE                  Unformatted
     REAL                         BYTE_STRING
     DOUBLE                       CELL
   COMPLEX

Note: If you are entering data types for fields in DMDBA screen mode on screen 12
(Records and Fields), data types are assigned the following codes:

 EXACT_BINARY                   E
 INTEGER                        I
 EXACT_DECIMAL                  P
 APPROXIMATE                    A
 REAL                           R
 DOUBLE                         D
 COMPLEX                        X
 LOGICAL                        L
 CHARACTER                      C
 BYTE_STRING                    B
 CELL                           K

              The computer stores data in units known as bytes and words. On many computers, a byte
              is 8 bits, and a word is 32 bits or 4 bytes. A character is normally stored in 1 byte, and
              numeric data requires from 1 to 16 bytes. BASIS efficiently stores fixed-length numeric
              data and fixed- or variable-length character data.

              Numeric data type classifications are described in the “Defining Numeric Fields” section,
              and non-numeric data type classifications are described in the “Defining Character
              Fields” section. Text stream and text image fields, special forms of character data, are
              described in the “Defining Text Stream and Text Image Fields” section.




Defining Numeric Fields
              Numeric fields can have any of the following data types, each of which is described in
              this section:

              EXACT_BINARY
                INTEGER
              EXACT_DECIMAL
              APPROXIMATE
                REAL
                DOUBLE
              COMPLEX

              The most frequently used numeric data types are EXACT_BINARY for whole numbers
              and EXACT_DECIMAL for numbers with a decimal point.




Numeric Data Types
EXACT_BINARY

The EXACT_BINARY data type stores an exact whole number in binary form. The
default precision is 9 on a 32-bit machine and 18 on a 64-bit machine. The maximum
precision is 18. All EXACT_BINARY data is stored with a sign (+ or -) because most
languages do not support unsigned exact binary data.

Generally, instructions that perform arithmetic operations on EXACT_BINARY data
have faster execution times than instructions that operate on other data types. However,
the system must convert EXACT_BINARY data to CHARACTER data before it can
display the data.

Dates are stored as EXACT_BINARY with PRECISION = 8 in the format
YYYYMMDD, where YYYY is the year, MM is the month, and DD is the day.
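
As an illustration of the YYYYMMDD layout, the following Python sketch converts between a calendar date and the 8-digit whole number. The function names are hypothetical and the code only models the digit layout described above, not how BASIS itself performs the conversion.

```python
from datetime import date


def date_to_stored(d: date) -> int:
    """Encode a date as the 8-digit whole number YYYYMMDD."""
    return d.year * 10000 + d.month * 100 + d.day


def stored_to_date(value: int) -> date:
    """Decode a YYYYMMDD whole number back into a date."""
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)
```

One consequence of this layout is that comparing two stored values as numbers gives the same result as comparing the dates chronologically.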

INTEGER

A field defined as an INTEGER data type will be handled as an EXACT_BINARY with a
precision of either 9 or 18, depending on the word size of the host machine. If the word
size is 32 bits, 1 word of precision is 9 digits. If the word size is 64 bits, 1 word of
precision is 18 digits. Since the INTEGER data type has machine dependent precision, it
should not be used to define fields within databases that may be moved from one machine
to another. INTEGER is a language-type variation of EXACT_BINARY. It implies that
a full word is used to store the number.

EXACT_DECIMAL

The EXACT_DECIMAL data type stores an exact fixed point number in packed decimal
form. The default precision is 15. The maximum precision is 31 digits, but most
languages do not support more than 18 digits.

A packed decimal number is stored as a sequence of bytes. Every byte is divided into two
4-bit fields called nibbles. Each nibble contains one decimal digit, except the last
nibble, which contains a sign code (+ is 12 and - is 13). Packed decimal values are
always stored with a trailing separate sign. If the number of digits n is odd, the digits and
the sign fit into (n+1)/2 bytes. If the number of digits is even, an extra 0 digit must be
stored in the first nibble, and the value occupies (n/2)+1 bytes. The number 123 uses 3
nibbles to store the value plus 1 trailing (sign code) nibble, totaling 4 nibbles or 2 bytes.
The number 1234 uses 4 nibbles to store the value plus 1 leading (0) nibble plus 1 trailing
(sign code) nibble, totaling 6 nibbles or 3 bytes.
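
The nibble layout described above can be sketched in Python as follows. The function pack_decimal is a hypothetical helper that follows only the description in this section (digit nibbles, trailing sign nibble, leading 0 pad for an even digit count); the actual on-disk representation used by BASIS may differ in details not covered here.

```python
def pack_decimal(value: int) -> bytes:
    """Pack a whole number as packed decimal with a trailing sign nibble."""
    sign = 0xD if value < 0 else 0xC           # 13 for minus, 12 for plus
    digits = [int(ch) for ch in str(abs(value))]
    if len(digits) % 2 == 0:
        digits.insert(0, 0)                    # pad so digits plus sign fill whole bytes
    nibbles = digits + [sign]
    # Combine each pair of nibbles into one byte, high nibble first.
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))
```

With this sketch, pack_decimal(123) yields the 2 bytes 12 3C (hexadecimal) and pack_decimal(1234) yields the 3 bytes 01 23 4C, matching the byte counts worked out in the text.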




              You can declare a scale factor for EXACT_DECIMAL fields. The SCALE parameter is
              used to specify the number of digits that are assumed to the right of the decimal point.
              The default scale factor is 0. The maximum scale factor is equal to the precision
              specified for the field; that is, all digits of the field are assumed to be right of the decimal
              point.

              Exact numbers of more than 9 digits and exact numbers that need to use a scale factor
              should be defined as EXACT_DECIMAL data. Monetary values are usually stored in
              EXACT_DECIMAL. The system can convert EXACT_DECIMAL data to characters
              faster than it can convert EXACT_BINARY data to characters. It is, however, faster to
              perform calculations with EXACT_BINARY than with EXACT_DECIMAL. Division
              on EXACT_DECIMAL numbers will result in a rounded, not truncated, value.

              APPROXIMATE

              The APPROXIMATE data type stores approximate quantities using scientific notation
              (consisting of a sign, an exponent, and a fraction) in floating point format. The default
              precision for all operating systems is 6 digits. For the VMS operating system the
              maximum precision is 28 digits. For Windows and UNIX the maximum precision is 14
              digits.

              Because of the range of the exponent, this data type is useful for working with very large
              or very small numbers with an accuracy of up to 28 digits. This data type is not suitable
              for calculations where exact accuracy is required, such as in monetary calculations.
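
The following Python sketch illustrates why floating point data is a poor fit for monetary amounts. Python's float and Decimal types are used here only as stand-ins for approximate and exact decimal storage; they are not BASIS data types and the example makes no claim about BASIS arithmetic.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so the sum carries a small representation error.
float_total = 0.10 + 0.20
exact_total = Decimal("0.10") + Decimal("0.20")

float_matches = float_total == 0.30              # False: the sum is slightly off
exact_matches = exact_total == Decimal("0.30")   # True: decimal digits are kept exactly
```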

              Table 4-1 shows characteristics of APPROXIMATE on various operating systems.

              Table 4-1: Characteristics of APPROXIMATE data type


               Precision      Bytes      NT             UNIX            VMS

               1:6            4          REAL           REAL            F-FLOAT

               7:14           8          DOUBLE         DOUBLE          D-FLOAT

               15:28          16         –              –               H-FLOAT

              REAL

              A field defined as REAL has an APPROXIMATE data type with a precision of either 6 or
              14, depending on the word size of the host machine. If the word size is 32 bits, 1 word of
              floating point precision is 6 digits. If the word size is 64 bits, 1 word of floating point
              precision is 14 digits. Since the REAL data type has machine-dependent precision, you
              should not use it in databases that will be moved from one machine to another.



DOUBLE

A field defined as DOUBLE is an approximate number with a precision of either 14 or
28, depending on the word size of the host machine. If the word size is 32 bits, 2 words
of floating point precision are 14 digits. If the word size is 64 bits, 2 words of floating
point precision are 28 digits. Since the DOUBLE data type has machine-dependent
precision, you should not use it in databases that will be moved from one machine to
another.

COMPLEX

A complex number has two parts and is mathematically written as A+Bi. The first part
(A) is the “real part” of the number. The second part (Bi) is the “imaginary part,”
where “B” is a real coefficient and “i” equals the square root of -1. The complex number
is represented by using a pair of approximate numbers. Complex numbers are useful
because the pair of values can describe a quantity that has both magnitude and
direction, and the pair can easily be mapped to an X-Y coordinate system to
display both values. The default precision for all operating systems is 6 digits. The
maximum precision for these operating systems is 14 digits.

Two complex numbers are equal if both parts are equal. Greater than and less than
comparisons (> and < operators) are not defined for complex numbers. Since complex
numbers cannot be compared or sorted, BASIS cannot index or perform search operations
on complex numbers. Only a few functions have meaning for complex numbers. They
are addition, subtraction, multiplication, division, equal (EQ), not equal (NE), absolute
value (ABS), square root (SQRT), exponent (EXP), logarithm (LOG), sine (SIN) and
cosine (COS).
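
The comparison rules for complex numbers are mathematical rather than specific to BASIS, so they can be illustrated with Python's built-in complex type. This is only an analogy; it does not show BASIS syntax or functions.

```python
a = complex(3.0, 4.0)   # 3+4i
b = complex(3.0, 4.0)

same = a == b           # equal because both parts are equal
magnitude = abs(a)      # absolute value of 3+4i

try:
    a < b               # ordering is not defined for complex numbers
    ordered = True
except TypeError:
    ordered = False
```

Python rejects the less-than comparison for the same reason the text gives: without an ordering, values cannot be sorted.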

In BASIS, a complex constant is written as C#(real,imaginary) where both parts are
integer (exact) or real (approximate) constants. Any number (exact or approximate) can
also be thought of as a complex number with the “imaginary” part equal to 0, that is, as
A+0i.

As Table 4-2 indicates, data type affects the values you can assign to PRECISION and
SCALE.




              Table 4-2: Numeric data types and related parameters


Data Type           Related Parameters   Storage          Comments
                                         Format

EXACT_BINARY        PRECISION=1:18       signed binary    Default decimal precision =
                                         integer          9 (32-bit machine),
                                                          18 (64-bit machine).

INTEGER             PRECISION=9 or 18    one word of      Depends on machine one-word
                                         storage          size.

EXACT_DECIMAL       PRECISION=1:31       packed           Default decimal precision = 15.
                    SCALE=0:31           decimal          SCALE is the number of digits to
                                                          the right of the decimal point.

APPROXIMATE         PRECISION=1:28       floating point   Default decimal precision =
                                                          6 (32-bit machine),
                                                          14 (64-bit machine).

REAL                PRECISION=6 or 14    floating point   Depends on machine one-word
                                                          size.

DOUBLE              PRECISION=14 or 28   floating point   Depends on machine two-word
                                                          size.

REAL_BIG            PRECISION=14 or 28   floating point   Depends on machine two-word
                                                          size.

COMPLEX             PRECISION=1:14       two floating     Default decimal precision =
                                         point            6 (32-bit machine),
                                                          14 (64-bit machine).

COMPLEX_BIG         PRECISION=6 or 14                     Depends on host machine.




Precision
Numeric fields can represent only a certain range of values using limited precision. Use
the PRECISION parameter to specify the number of decimal digits to allocate for the
number. If a number has a type of EXACT_BINARY or EXACT_DECIMAL, assign it
the value of the total number of digits needed to store the number (omitting the decimal
point for EXACT_DECIMAL). For example, if the largest EXACT_BINARY number
expected for a field is 9999999999, its PRECISION can be 10. If the largest
EXACT_DECIMAL number expected for a field is 99999.99, its PRECISION can be 7.

In fields of EXACT_BINARY, EXACT_DECIMAL, APPROXIMATE, and COMPLEX
data types, the system sometimes uses a precision that is greater than the one you supply
with the PRECISION parameter. This is because of the way computers represent data
internally (in bits and bytes).

For fields of EXACT_BINARY, EXACT_DECIMAL, APPROXIMATE, or COMPLEX
data type, the actual precision used is the maximum allowed for the number of bytes used.
For example, if you define a field with TYPE=EXACT_BINARY and PRECISION=6,
the system will accept numbers with precisions up to 9 as valid field values because the
maximum precision that can be stored in 4 bytes is 9.
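
The link between byte count and usable precision follows from the range of a signed binary integer, as the Python sketch below shows. The helper names are hypothetical, and the storage sizes of 1, 2, 4, and 8 bytes are taken from the EXACT_BINARY rows of Table 4-3.

```python
def max_digits(num_bytes: int) -> int:
    """Largest digit count d for which every d-digit number fits in a
    signed binary integer of num_bytes bytes (for the sizes used here)."""
    largest = 2 ** (8 * num_bytes - 1) - 1   # for example 2147483647 for 4 bytes
    return len(str(largest)) - 1             # the top digit position is only partly usable


def effective_precision(declared: int) -> int:
    """Precision actually available for a declared EXACT_BINARY precision,
    assuming the smallest of 1, 2, 4, or 8 bytes that can hold it is used."""
    if declared < 1:
        raise ValueError("precision must be at least 1")
    for num_bytes in (1, 2, 4, 8):
        capacity = max_digits(num_bytes)
        if declared <= capacity:
            return capacity
    raise ValueError("precision exceeds 18 digits")
```

For a declared precision of 6, the sketch selects 4 bytes and reports an effective precision of 9, which agrees with the example in the text.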

As Table 4-3 shows, minimum and maximum precisions depend on the data type of a
field. They also depend on word size.




              Table 4-3: Data types, precision range, and bytes used


               Data Type                   Minimum           Maximum            Number of
                                                                                Bytes Used

               EXACT_BINARY                1                 2                  1
                                           3                 4                  2
                                           5                 9                  4
                                           10                18                 8

               INTEGER                     –                 –                  word

               EXACT_DECIMAL               1                 7                  4
                                           8                 15                 8
                                           16                31                 16

               APPROXIMATE                 1                 6                  4
                                           7                 14                 8
                                           15                28                 16

               REAL                        –                 –                  word

               DOUBLE                      –                 –                  2 words

               COMPLEX                     1                 6                  8
                                           7                 14                 16

               LOGICAL                     –                 –                  bit

               CHARACTER                   –                 –                  0:15000

               BYTE_STRING                 –                 –                  1:15000

               CELL                        –                 –                  word

              You can use the LEGAL parameter to define a more limited range that stays consistent
              with the specified precision. Be sure that the precision is sufficient for the legal values
              defined for the field.




Scale
Use the SCALE parameter to specify the number of digits that belong to the right of the
decimal point in a field having an EXACT_DECIMAL data type. For example, if the
largest EXACT_DECIMAL number expected for a field is 999999.9999, its PRECISION
can be 10 and its SCALE can be 4.
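
A quick way to check PRECISION and SCALE for a given largest value is to count its digits. The Python sketch below does this with the standard decimal module; precision_and_scale is a hypothetical helper, not a BASIS facility.

```python
from decimal import Decimal


def precision_and_scale(largest: str) -> tuple[int, int]:
    """Return (PRECISION, SCALE) needed to hold the given decimal literal."""
    _sign, digits, exponent = Decimal(largest).as_tuple()
    if exponent > 0:
        # Literals such as "1E+2" carry implied trailing zeros and no fraction.
        return len(digits) + exponent, 0
    scale = -exponent                        # digits to the right of the decimal point
    return max(len(digits), scale), scale    # a value like 0.05 still needs 2 digits
```

For the value 999999.9999 this gives a precision of 10 and a scale of 4, as in the example above.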

Legal
Use the LEGAL parameter to limit the range of a number. For example, if a field can
have values ranging from 0 to 99999, its range can be set to LEGAL=0:99999. When
defining a field, be sure the precision and range are sufficient to accommodate the legal
values of the field.
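
The consistency check between a LEGAL range and the precision can be sketched in Python. The helper below is hypothetical and handles only whole-number ranges written as low:high; it is meant as a way to reason about the rule, not as a description of how BASIS validates definitions.

```python
def precision_covers_legal(precision: int, legal: str) -> bool:
    """Check that a whole-number LEGAL range such as "0:99999" fits
    within the given number of decimal digits."""
    low, high = (int(part) for part in legal.split(":"))
    widest = max(abs(low), abs(high))        # the bound with the most digits
    return len(str(widest)) <= precision
```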

Specifying Standard Numeric Attributes (Usages)
The following usages or sets of standard attributes can be used for numeric fields.
Specifying USAGE=DATE, for example, automatically adds the implied parameters for
the field; you do not also need to specify TYPE, PRECISION, or LEGAL parameters for
the field.




              Table 4-4: Implied parameters for numeric usages


              Usage                           Implied Parameters

              DATE                            TYPE=EXACT_BINARY, PRECISION=8,
                                              LEGAL=DATE

              DATE_KEY (date_field)           TYPE=EXACT_BINARY, PRECISION=9,
                                              UNIQUE=YES, OCCURS=1,
                                              SECTION_FIELD=NO²

              DUPCHECK_KEY¹                   UNIQUE=NO,
                                              SEARCHED_FREQUENTLY=YES,
                                              OCCURS=0:1

              TIMESTAMP                       TYPE=EXACT_BINARY, PRECISION=18,
                                              OCCURS=0:1

              MONEY                           TYPE=EXACT_DECIMAL, SCALE=2

              NUMBER                          TYPE=EXACT_BINARY or
                                              EXACT_DECIMAL, depending on whether
                                              scale is given

              SCIENTIFIC                      TYPE=APPROXIMATE

              SYSTEM_KEY (seed)               TYPE=EXACT_BINARY, PRECISION=9,
                                              UNIQUE=YES, OCCURS=1,
                                              SECTION_FIELD=NO²

              TIME                            TYPE=EXACT_BINARY, PRECISION=6

              USER_KEY                        TYPE=EXACT_BINARY, PRECISION=9,
                                              UNIQUE=YES, OCCURS=1,
                                              SECTION_FIELD=NO²

              ¹ If you define a field with the usage DUPCHECK_KEY, the system issues a warning
                message when you enter a duplicate record.
              ² Sectioned records are not supported on Windows.
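
One way to picture implied parameters is as a set of defaults that a usage contributes to the field definition. The Python sketch below copies a few rows of Table 4-4 into a dictionary. The names IMPLIED and expand_usage are hypothetical, and the rule that explicitly stated parameters take priority over implied ones is an assumption made for this illustration; the text does not say how BASIS resolves such overlaps.

```python
# Implied parameters copied from Table 4-4 for a few usages.
IMPLIED = {
    "DATE": {"TYPE": "EXACT_BINARY", "PRECISION": "8", "LEGAL": "DATE"},
    "TIME": {"TYPE": "EXACT_BINARY", "PRECISION": "6"},
    "MONEY": {"TYPE": "EXACT_DECIMAL", "SCALE": "2"},
    "SCIENTIFIC": {"TYPE": "APPROXIMATE"},
}


def expand_usage(usage: str, explicit: dict[str, str]) -> dict[str, str]:
    """Combine a usage's implied parameters with explicitly given ones.

    Assumption: explicit parameters override implied ones.
    """
    return {**IMPLIED[usage], **explicit}
```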




Key-Related Usages

SYSTEM_KEY, DATE_KEY, and USER_KEY are usages that can be applied to a field
to help ensure that it contains unique values.

If usage is SYSTEM_KEY, you can supply the seed value. This value is assigned to the
first record added for the record type. The values assigned are unique and ascending.
The starting value can be a number from 0 to 33554431. The default seed value is 1.

If usage is DATE_KEY, you need to supply the name of the date field that contains the
value the system will use to make a unique key. The date field must be defined with
USAGE=DATE.

If usage is USER_KEY, the user must supply a different value for this field for each
added record. The system prevents users from entering a value already used in the field
by another record.

A field with a DATE_KEY or USER_KEY usage cannot be a virtual field or a field
calculated by REVL from the value of another field.

Note: If you plan to use an indirect index structure for the record type, the primary key
field must be defined with a USAGE of SYSTEM_KEY, DATE_KEY, or
USER_KEY. For more information about indirect index structure, see “Indexes.”

Using DATE_KEY for Sorting in Descending Order

To sort result sets in descending order without having to use the ORDER BY phrase as
part of your FQM FIND command, you can define a field with DATE_KEY usage and
use indirect index structure for the record type. The DATE_KEY values, and therefore
the sort order, are based on a supplied date. A field with DATE_KEY usage cannot be a
virtual field or a field calculated by REVL from the value of another field.

To display a result set for news stories in reverse chronological order, for example, you
could use a date key based on the publication date.
    FIELD=STORY_NUMBER, +
       USAGE=DATE_KEY(PUBLICATION_DATE);
    FIELD=PUBLICATION_DATE, USAGE=DATE, OCCURS=1:1;




              To generate STORY_NUMBER, the value for PUBLICATION_DATE is converted to
              the number of days until November 11, 2158. This number is the starting base number of
              the current day’s story numbers. This will handle stories from January 1, 1800, to
              November 11, 2158, and can be stored in 17 bits. (Standard Julian dates produce 10-digit
              document numbers and cannot be used.) An INDIRECT index must be used on the
              record type containing the STORY_NUMBER and PUBLICATION_DATE fields. Story
              numbers for up to 16,383 stories per day require 14 bits. Each USAGE=DATE_KEY
              field comprises 3 parts:
                 • an unused portion
                 • a count of days until November 11, 2158. This count ranges from 0 to 131071 for
                   dates as far back as January 1, 1800.
                 • a portion for the article sequence number, which ranges from 1 to 16383
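
The day count and bit widths described above can be checked with a short Python sketch. The reference date, the 17-bit day count, and the 14-bit sequence range come from the text. The exact way the parts are packed into one number (day count in the high bits, sequence number in the low 14 bits) is an assumption made for illustration, and the function names are hypothetical.

```python
from datetime import date

END_DATE = date(2158, 11, 11)   # reference date named in the text
SEQUENCE_BITS = 14              # room for sequence numbers 1 to 16383


def days_until_end(publication: date) -> int:
    """Count of days from the publication date until November 11, 2158."""
    return (END_DATE - publication).days


def date_key(publication: date, sequence: int) -> int:
    """Assumed packing: day count in the high bits, sequence in the low 14 bits."""
    days = days_until_end(publication)
    if not 0 <= days <= 131071:
        raise ValueError("date outside January 1, 1800 to November 11, 2158")
    if not 1 <= sequence <= 16383:
        raise ValueError("sequence must be between 1 and 16383")
    return (days << SEQUENCE_BITS) | sequence
```

Because the day count shrinks as the publication date gets later, a more recent date produces a smaller key, so an index in ascending key order lists the newest dates first.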




Defining Non-numeric Fields
              Non-numeric Data Types
              Non-numeric fields consist of the following BASIS data types: LOGICAL,
              CHARACTER, BYTE_STRING, and CELL. Each of these is described in
              this section.

              LOGICAL

              A logical value is either true or false and is stored as one bit. True is represented as 1.
              False is represented as 0.

              CHARACTER

              The character data type is used to store any character string. Textual items are always
              stored using this data type. Very long numbers (more than 31 digits) are also stored using
              this data type. They are called numeric strings. Each character is stored as one byte
              using the host machine’s character set. Trailing blanks are always removed. The SIZE
              parameter is used to specify the minimum and maximum length of a character field.
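
The effect of trailing-blank removal combined with a SIZE range can be sketched in Python. The helper store_character is hypothetical, and checking the length after the blanks are removed is an assumption; the text does not state the order in which BASIS applies these steps.

```python
def store_character(value: str, min_size: int, max_size: int) -> str:
    """Strip trailing blanks, then check the length against SIZE=min:max."""
    stored = value.rstrip(" ")               # trailing blanks are always removed
    if not min_size <= len(stored) <= max_size:
        raise ValueError(
            f"length {len(stored)} outside SIZE={min_size}:{max_size}")
    return stored
```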

              Unformatted Data

              Two ways to define a field as an unformatted data type are BYTE_STRING and CELL.
              Since the unformatted data may represent any value, BASIS cannot index or perform
              search operations on these data types. The following paragraphs explain the difference
              between these two unformatted data types.




BYTE_STRING

A byte string is a sequence of bytes that can be used to hold any type of data. Because
users can define any substructures they want for byte string values, they can be used to
hold any kind of information, such as graphical representations of objects. The SIZE
parameter is used to specify the size, in bytes, of the byte string.

Because the system cannot compare two byte strings, it cannot index a byte string or
perform search operations on a byte string. The only parameters you can declare for a
byte string are TYPE, SIZE, OCCURS, ALIAS, LABEL, COMMENT,
PRIVACY_CODE, and PRIVACY_TEST.

CELL

A cell is one word of computer storage that can be used to hold data in any format. It
may occur (see OCCURS parameter) more than one time in order to form a cell array that
can be used to store various data structures.

Because the system cannot compare two cells, it cannot index a cell or perform search
operations on a cell. The only parameters you can declare for a cell are TYPE,
OCCURS, ALIAS, LABEL, COMMENT, PRIVACY_CODE, and PRIVACY_TEST.

Table 4-5 summarizes the related parameters and storage formats for non-numeric fields.

Table 4-5: Non-numeric data types and related parameters


 Data Type      Related Parameters   Storage Format         Comments                         Searchable?

 LOGICAL                             bit (true or false)    One bit. True = 1. False = 0.    Yes

 CHARACTER      SIZE=1:16000         character string       String of characters.            Yes

 BYTE_STRING    SIZE=1:16000         byte string            String of bytes of unformatted   No
                                                            data.

 CELL           OCCURS=1:n           one word of storage    One word of unformatted data     No
                                                            that occurs (n) times.



              Defining CHARACTER Fields
              SIZE

              The only required parameter for fields having a CHARACTER or BYTE_STRING data
              type is SIZE, which specifies the length of the field. It can be assigned an integer value or
              a range of integer values—integer | integer:integer. If a single integer value is assigned, it
              sets the maximum size or number of characters (bytes) used to store the field’s value. If a
              range of integer values is assigned, they set the minimum and maximum sizes for the
              field’s value. If the field is compound, SIZE declares the length of each subfield. The
              default is SIZE=1:500.

              To define a variable-length string, use the form SIZE=min:max, where min is from 0 to
              8000 and max is from 1 to 16000. If you specify SIZE as SIZE=n, the system treats it as
              SIZE=1:n.
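
As an illustration of the rules above, the following Python sketch interprets a SIZE value. The function name and the check that the minimum does not exceed the maximum are assumptions. The limits themselves (minimum 0 to 8000, maximum 1 to 16000, and SIZE=n treated as SIZE=1:n) come from the text.

```python
def parse_size(spec):
    """Return (minimum, maximum) for a SIZE value written as 'n' or 'min:max'."""
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
        low, high = int(low_text), int(high_text)
    else:
        low, high = 1, int(spec)        # SIZE=n is treated as SIZE=1:n
    if not 0 <= low <= 8000:
        raise ValueError("minimum size must be from 0 to 8000")
    if not 1 <= high <= 16000:
        raise ValueError("maximum size must be from 1 to 16000")
    if low > high:                      # assumed consistency check
        raise ValueError("minimum size exceeds maximum size")
    return low, high
```

For example, parse_size("500") yields the same result as parse_size("1:500"), which is the documented default.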

              To store an all-blank string in a field, that field’s minimum size must be 0. Note that an
              all-blank or zero-length string is not null. Only a field that occurs zero times is null (see
              the OCCURS parameter).

              BASIS automatically trims trailing blanks before adding a value to the database. When a
              user program attempts to store an all-blank string in a character field, the system will
              reject the all-blank string as invalid unless one of the following two conditions is true.
              1.   The field is optional (OCCURS=0:n) and the minimum size is zero (SIZE=0:m).
              2.   The field is required (OCCURS=1:n), the minimum size is zero (SIZE=0:m), and the
                   initial value is blank (INIT=' ').
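
The two conditions can be restated as a small Python predicate. This is only a sketch: the function and argument names are hypothetical, and it assumes that any initial value consisting solely of blanks counts as blank.

```python
def accepts_all_blank(occurs_min, size_min, init=None):
    """Return True if an all-blank string would be accepted for the field.

    occurs_min -- minimum from OCCURS (0 = optional, 1 or more = required)
    size_min   -- minimum from SIZE
    init       -- the INIT value, or None if no initial value is declared
    """
    if size_min != 0:
        return False                    # both conditions require SIZE=0:m
    if occurs_min == 0:
        return True                     # condition 1: optional field
    # condition 2: required field whose initial value is blank
    return init is not None and init != "" and init.strip(" ") == ""
```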

              The following points describe how the SIZE parameter interacts with other FIELD
              parameters:
                  In fields that use a CODE_LIST, SIZE declares the length of the codes.
                  A field with EDIT_BLANKS=SQUEEZE or REMOVE must be large enough to hold
                   the embedded blanks before they are removed.
                  The system checks the minimum size after it performs the EDIT_BLANKS
                   procedure.




Specifying Standard Character Attributes (Usages)

The following usages or sets of standard attributes can be used for character fields.
Specifying USAGE=TITLE, for example, automatically adds the implied parameters for
the field; you do not also need to specify SORT_STOPWORDS, FORMAT, or TYPE
parameters for the field.

Table 4-6: Implied parameters for character usages


Usage                             Implied Parameters

CHARACTER                         TYPE=CHARACTER

DUPCHECK_KEY [1]                  UNIQUE=NO,
                                  SEARCHED_FREQUENTLY=YES,
                                  OCCURS=0:1

MIMETYPE                          TYPE=CHARACTER, OCCURS=0:1

PERSON_NAME                       TYPE=CHARACTER

SHORT_TEXT [2]                    TYPE=CHARACTER

TITLE [2]                         SORT_STOPWORDS=DM_LEAD_WORDS,
                                  FORMAT=STRING, TYPE=CHARACTER

TRANSLATE(code_field)             TYPE=CHARACTER,
                                  TRANSLATE=code_field

    [1] If you define a field with the usage DUPCHECK_KEY, the system issues a
        warning message when you enter a duplicate record.
    [2] This usage also implies TEXT=YES for indexing purposes.
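
One way to picture a usage is as a named bundle of parameters that is merged into the field definition. The Python sketch below models Table 4-6 that way. The dictionary and function names are hypothetical, TRANSLATE(code_field) is left out because it takes an argument, and the rule that explicitly declared parameters are layered on top of the implied ones is an assumption, not something the table states.

```python
IMPLIED_PARAMETERS = {
    "CHARACTER": {"TYPE": "CHARACTER"},
    "DUPCHECK_KEY": {"UNIQUE": "NO", "SEARCHED_FREQUENTLY": "YES",
                     "OCCURS": "0:1"},
    "MIMETYPE": {"TYPE": "CHARACTER", "OCCURS": "0:1"},
    "PERSON_NAME": {"TYPE": "CHARACTER"},
    "SHORT_TEXT": {"TYPE": "CHARACTER", "TEXT": "YES"},
    "TITLE": {"SORT_STOPWORDS": "DM_LEAD_WORDS", "FORMAT": "STRING",
              "TYPE": "CHARACTER", "TEXT": "YES"},
}


def expand_usage(usage, **explicit):
    """Return the parameters of a field declared with the given usage."""
    parameters = dict(IMPLIED_PARAMETERS[usage])   # copy the implied set
    parameters.update(explicit)                    # assumed: explicit values are added on top
    return parameters
```

Under these assumptions, expand_usage("TITLE", OCCURS="1") returns the four implied TITLE parameters plus OCCURS=1.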

Automatic Editing of Data for Storage

Table 4-7 summarizes parameters to use to control how data is edited for storage.




              Table 4-7: Parameters related to automatic editing of data for storage


              To do this ...                               Use this parameter

              Raise case; change all lowercase data to     RAISE_DATA=NO | YES
              uppercase

              Replace control characters—non-              BLANK_CONTROL_DATA
              displayable, non-printable characters like   =NO | YES
              carriage returns and tabs—with blanks

              Remove trailing blanks                       EDIT_BLANKS=
                                                           TRIM_TRAILING

              Remove leading blanks                        EDIT_BLANKS=
                                                           TRIM_LEADING

              Remove leading and trailing blanks           EDIT_BLANKS= TRIM

              Remove leading and trailing blanks and       EDIT_BLANKS= SQUEEZE
              change multiple embedded blanks to one
              blank

              Remove all blanks                            EDIT_BLANKS= REMOVE

              Compress data according to the BLANKS        COMPRESS=BLANKS |
              or UPPER_CASE_ASCII compression              UPPER_CASE_ASCII
              algorithm

              Hide character sequences delimited by        HIDDEN_STRING
              special characters                           _DELIMITERS=G1 | C0 |
                                                           char_cons | (ascii,ascii)

              Table 4-8 summarizes the effects of the RAISE_DATA, BLANK_CONTROL_DATA, and
              EDIT_BLANKS parameters on the following line of data, where <FF> represents a form
              feed, a non-printable control character:
                  <FF>    This is      a line       of     text.




In this table, default values for the other parameters related to automatic editing of data are
in effect. For example, trailing blanks are removed for both settings of RAISE_DATA
and BLANK_CONTROL_DATA because the default for EDIT_BLANKS, namely
TRIM_TRAILING, is in effect.

Table 4-8: Effects of various parameters on storage of data


 Parameter                          Stored data

 RAISE_DATA=NO                      <FF>    This is      a line       of     text.

 RAISE_DATA=YES                     <FF>    THIS IS      A LINE       OF     TEXT.

 BLANK_CONTROL_DATA=NO              <FF>    This is      a line       of     text.

 BLANK_CONTROL_DATA=YES                  This is      a line       of     text.

 EDIT_BLANKS=TRIM_TRAILING          <FF>    This is      a line       of     text.

 EDIT_BLANKS=TRIM_LEADING           <FF>    This is      a line       of     text.

 EDIT_BLANKS=TRIM                   <FF>    This is      a line       of     text.

 EDIT_BLANKS=SQUEEZE                <FF> This is a line of text.

 EDIT_BLANKS=REMOVE                 <FF>Thisisalineoftext.

Note that because the leading character is a form feed, not a blank,
EDIT_BLANKS=TRIM_LEADING and EDIT_BLANKS=TRIM do not remove the
blanks between the <FF> and This.
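
The behavior summarized in Table 4-8 can be approximated in Python as follows. This is a sketch, not the BASIS implementation: the function name is hypothetical, the order in which the three edits are applied is an assumption, control characters are taken to be code points below 32 plus 127, and the trailing blanks on the sample line are added here so that trimming has a visible effect.

```python
import re

FF = "\x0c"   # form feed
LINE = FF + "    This is      a line       of     text.   "


def edit_for_storage(value, raise_data=False, blank_control_data=False,
                     edit_blanks="TRIM_TRAILING"):
    """Apply the three editing parameters to a value before storage."""
    if blank_control_data:
        # replace each control character with one blank
        value = "".join(" " if ord(c) < 32 or ord(c) == 127 else c
                        for c in value)
    if raise_data:
        value = value.upper()
    if edit_blanks == "TRIM_TRAILING":
        value = value.rstrip(" ")
    elif edit_blanks == "TRIM_LEADING":
        value = value.lstrip(" ")
    elif edit_blanks == "TRIM":
        value = value.strip(" ")
    elif edit_blanks == "SQUEEZE":
        value = re.sub(" +", " ", value.strip(" "))
    elif edit_blanks == "REMOVE":
        value = value.replace(" ", "")
    else:
        raise ValueError("unknown EDIT_BLANKS setting: " + edit_blanks)
    return value
```

Because the first character of LINE is a form feed, lstrip(" ") and strip(" ") leave the blanks that follow it in place, which matches the note above.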

By using the COMPRESS parameter on fields containing character data, you can save
storage space. If you set the value of COMPRESS to BLANKS, the system converts non-
printable characters to blanks and stores special codes for sequences of embedded blanks.
If you set the value of COMPRESS to UPPER_CASE_ASCII, the system converts non-
printable characters to blanks, translates lowercase letters to uppercase letters, and
substitutes blanks for the characters { | } ~ `. Uppercase ASCII characters require only 6
bits per character, as opposed to the usual 8 bits. This provides significant storage
savings but requires processing overhead because the values must be translated each time
they are manipulated. You cannot use the COMPRESS parameter for fields with
FORMAT=TEXT_IMAGE or USAGE=TEXT_STREAM because in such cases the
system will automatically set this parameter to its default value.
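
To see why 6 bits suffice, note that after the conversions described above only the 64 ASCII codes from 32 (blank) through 95 (underscore) remain. The Python sketch below illustrates this. The mapping of each character to its code minus 32 and the big-endian packing are assumptions for illustration; the text does not describe the actual storage encoding.

```python
def normalize_upper_case_ascii(text):
    """Apply the character conversions described for UPPER_CASE_ASCII."""
    result = []
    for ch in text:
        code = ord(ch)
        if code < 32 or code > 126 or ch in "{|}~`":
            result.append(" ")          # non-printable or substituted character
        else:
            result.append(ch.upper())   # lowercase letters become uppercase
    return "".join(result)


def pack_6bit(text):
    """Pack the normalized text at 6 bits per character (assumed encoding)."""
    bits = 0
    for ch in normalize_upper_case_ascii(text):
        bits = (bits << 6) | (ord(ch) - 32)     # every value fits in 0..63
    bit_count = 6 * len(text)
    byte_count = (bit_count + 7) // 8
    bits <<= byte_count * 8 - bit_count         # pad the final byte with zeros
    return bits.to_bytes(byte_count, "big")
```

A 4-character value packs into 3 bytes, which is the 25 percent saving implied by storing 6 bits instead of 8.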




              If you want to include hidden strings in your data, use the
              HIDDEN_STRING_DELIMITERS parameter to specify the means for delimiting the
              hidden strings. If you are using an ASCII character set, you have three options:
              specify HIDDEN_STRING_DELIMITERS=C0 to use ASCII codes 0 through 31 as hidden
              string delimiters, specify HIDDEN_STRING_DELIMITERS=G1 to use ASCII codes 128
              through 255 as hidden string delimiters, or specify a pair of ASCII codes. If you are
              not using an ASCII character set, you can specify a two-character literal.

              For example, if HIDDEN_STRING_DELIMITERS='[ ]', the string ABC[DEF]GHI
              would contain the hidden string DEF, and the application would never display DEF.
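
The example can be mimicked with a short Python sketch that separates the visible text from the hidden strings. The function name is hypothetical, and the handling of an unterminated hidden string (its content is simply dropped) is an assumption.

```python
def split_hidden(value, open_delim="[", close_delim="]"):
    """Return (visible_text, list_of_hidden_strings) for a delimited value."""
    visible = []
    hidden = []
    current = []
    inside = False
    for ch in value:
        if not inside and ch == open_delim:
            inside = True               # start of a hidden string
            current = []
        elif inside and ch == close_delim:
            inside = False              # end of a hidden string
            hidden.append("".join(current))
        elif inside:
            current.append(ch)
        else:
            visible.append(ch)
    return "".join(visible), hidden
```

For the string ABC[DEF]GHI, the sketch returns the visible text ABCGHI and the single hidden string DEF.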

              The following guidelines describe the way the BASIS system processes hidden strings:
                 An initial value can contain hidden strings.
                 You can use the ENCRYPTION parameter with a field that contains hidden strings.
                 The system does not raise hidden strings to uppercase or modify control characters.
                  See the BLANK_CONTROL_DATA and RAISE_DATA parameters.
                 Fields that contain hidden strings cannot be compressed.
                 The system ignores hidden strings when it performs a PATTERN check and when it
                  edits blanks. (See the PATTERN parameter and the EDIT_BLANKS parameter.)
                 The system does not treat a hidden string as a break character when it produces
                  tokens for an inclusive index.
                 The system removes hidden strings when it produces index terms, tests a character
                  field via the FIND command, compares a field value to a legal list or a code list, or
                  sorts a field value.

              Note: The HIDDEN_STRING_DELIMITERS parameter cannot be used with
              FORMAT=TEXT_IMAGE or USAGE=TEXT_STREAM because in such cases the
              system will automatically set this parameter.

              Hidden string capabilities are often implemented in a more elegant fashion by using
              SGML tags. For more information about SGML and markup tags, see Markup and Style
              Guide.




 Defining Text Image and Text Stream Fields
 As illustrated in Figure 4-1, BASIS architecture accommodates several object classes that
 contain textual information.

 [Figure: hierarchy diagram. A Document comprises a Field Group and a Text Stream
 Field. A Field Group comprises Standard Fields (made up of Subfields) and Text Image
 Fields (made up of Text Image Lines). A Text Stream Field comprises Physical Text and
 Indexed Text (made up of Text Image Lines). A Section has the same structure: a Field
 Group and a Text Stream Field.]

               Figure 4-1: Object classes comprising documents in BASIS


 Conventional records contain a field group and its object classes: one or more standard
 fields, and optionally one or more text image fields.

 Continuous records or documents contain a field group and a text stream field.

 Sectioned records or documents contain a field group and several sections, each of which
 contains a field group and text stream field. (Sectioned records are not supported on
 Windows.) Text image fields and text stream fields are special kinds of non-numeric
 fields.




              Text Image Field

              A text image field is a character field that contains markup characters such as Standard
              Generalized Markup Language (SGML). It is composed of text image lines. The
              maximum size of a text image line is 512 characters. Normally, a text image line is the
              same as it appears on a printed page. Text image is an internal storage format that
              preserves the formatting and enhancements found in the source data.

              To identify a field as a text image field, define it with the FORMAT=TEXT_IMAGE
              parameter. A field defined with FORMAT=TEXT_IMAGE automatically takes on
              default values for the following parameters:

                  COMPRESS=none
                  HIDDEN_STRING_DELIMITERS=none
                  NORMALIZE=NO
                  RAISE_DATA=NO
                  BLANK_CONTROL_DATA=NO

              You cannot use the above parameters when you define FORMAT as TEXT_IMAGE.
              Furthermore, the system ignores the setting or default for these two parameters:
              1. EDIT_BLANKS: The system retains leading, trailing, and multiple embedded blanks.
              2. SIZE: The system imposes no specific limit on the size of the field. However, you may
              use the REVL LENGTH function to enforce a limit yourself.

              In addition to being defined by FORMAT=TEXT_IMAGE, text image fields are also
              defined by CONTEXT_UNIT and CONTEXT_PARSER parameters.

                  CONTEXT_UNIT defines the portion of text that BASIS uses to define the
                  boundaries in a proximity search—a search for specified terms that occur within a
                  given number of context units. A CONTEXT_UNIT can be SENTENCE,
                  PARAGRAPH, or USER_DEFINED. The default value for CONTEXT_UNIT is
                  SENTENCE.

                  If sentences and paragraphs are not suitable for the purpose of proximity searching,
                  you can delineate context units yourself by specifying CONTEXT_PARSER=NONE
                  and CONTEXT_UNIT=USER_DEFINED. You will then need to delineate the
                  context units by inserting context delimiters, if any, into the text before loading the
                  documents into the database.

                  Context delimiters are useful even when you select SENTENCE (or PARAGRAPH)
                  as the context unit and parser. Say you have a sentence (or paragraph) you want to
                  divide into separate context units, in particular, a list of bulleted items. Although
                  each item does not end with a period (or other conventional sentence punctuation), it
                  represents a complete thought for proximity-searching purposes. In this case, you
                  can insert a context delimiter after each bulleted item, and the system will interpret
                  each bulleted item as a separate context unit.
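
The idea of user-defined context units can be sketched in Python. Everything specific here is an assumption made for illustration: the "|" character stands in for whatever context delimiter is actually inserted into the text, the function names are hypothetical, and "within n context units" is taken to mean that the units containing the two terms are at most n positions apart (0 meaning the same unit).

```python
def split_context_units(text, delimiter="|"):
    """Split text into context units, each a list of lowercase words."""
    units = []
    for chunk in text.split(delimiter):
        words = chunk.lower().split()
        if words:                       # ignore empty units
            units.append(words)
    return units


def occurs_within(text, term_a, term_b, max_units, delimiter="|"):
    """Return True if both terms occur no more than max_units units apart."""
    units = split_context_units(text, delimiter)
    hits_a = [i for i, words in enumerate(units) if term_a.lower() in words]
    hits_b = [i for i, words in enumerate(units) if term_b.lower() in words]
    return any(abs(a - b) <= max_units for a in hits_a for b in hits_b)
```

With a bulleted list stored as "Bring a tent|Bring a stove|Bring water", each item is its own context unit, so "tent" and "stove" are one unit apart while "tent" and "water" are two units apart.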


    The maximum number of words in a context unit is 4,000. The maximum number of
    characters is about 16,000 (the maximum length of a subfield). You can use the
    CAPACITY parameter of the RECORD_STORAGE definition to define the
    appropriate maximum number of context units per document and the words per
    context unit. Select the value of CAPACITY based, in part, on the values you have
    chosen for CONTEXT_UNIT and CONTEXT_PARSER. For example, setting
    CAPACITY=C4 (500,000 context units per continuous document, 4000 words per
    context unit) would be reasonable given CONTEXT_UNIT=USER_DEFINED and
    CONTEXT_PARSER=NONE. For information about the RECORD_STORAGE
    definition, see “SDM Definitions.”

    CONTEXT_PARSER defines how the end of a context unit is located. The default
    value for CONTEXT_PARSER is PARAGRAPH if PARAGRAPH has been
    specified for CONTEXT_UNIT. Otherwise, the default context parser is
    SENTENCE, meaning the rules of English punctuation for determining the end of a
    sentence are used to locate the end of a context unit. For more information about
    context parsers, see Markup and Style Guide, “Context Parsers.”

Text Stream Field

A text stream field is a character field that contains the text of a continuous document or
the text of a section in a sectioned document. (Sectioned records are not supported on
Windows.) The text stream field can contain as many as 128,000,000 characters. Each
text stream field can consist of physical and/or indexed text.

    Physical Text is the form of the text stream field that contains text and formatting
    information normally output from a word processor. It can also be a binary large
    object (BLOB). Sometimes this is called the original form, revisable form, or the
    actual document. BASIS only stores physical text; it does not display or enable
    updates to it. Changes to it are made by exporting it from the database, changing text
    and/or formatting by means of word processing software, and importing it back into
    the database. For more information about these import and export processes, see
    Database Loading and Maintenance, “Import/Export.”

    Indexed Text is the form of the text stream field that contains the ASCII text used
    for indexing data and displaying information in BASIS modules. Sometimes this is
    called the readable form.

Both physical text and indexed text forms can be stored in a BASIS database. The field
containing the physical and/or indexed text must have USAGE=TEXT_STREAM as one
of its parameters. This usage automatically attributes to the field the following implied
parameters: TYPE=CHARACTER, FORMAT=TEXT_IMAGE, and OCCURS=0:1. For
more information about how the physical text and indexed text are stored in the database,
see Database Loading and Maintenance, “Import/Export.”




Defining Fields in Continuous and Sectioned Records
              IMPORTANT: Sectioned records are not supported on Windows.

              For the system to create the necessary data structures for continuous and sectioned
              records and the table of contents for sectioned records, you need to define the RECORD
              STYLE parameter as CONTINUOUS or SECTIONED, and you need to define various
              FIELD USAGE parameters.

              Defining Fields in Continuous Records
              For continuous records, you need to identify and define document-level fields (Document
              Field Group of Figure 4-1) and a text stream field (Document Text Stream field of Figure
              4-1). Some examples of document-level field names are Title, Author(s), Date, Abstract,
              Publisher, and Source.

              Defining Fields in Sectioned Records
              For sectioned records, you need to identify and define document-level fields (Document
              Field Group of Figure 4-1) and section-level fields (Section Field Group and Section Text
              Stream of Figure 4-1). A section field with USAGE=SECTION_NAME must be defined.
              Section fields with USAGE=SECTION_LEVEL, USAGE=SECTION_NUMBER, and
              USAGE=SECTION_TITLE are optional. For each field that you want to include in
              sections, you will need to specify SECTION_FIELD=YES in the FIELD definition.

              The section name is a unique name that you make up to indicate the contents of a section.
              The system needs the section name to identify each section in a document. Section level
              numbers (0, 1, 2, etc.) determine indentation when the table of contents is displayed, and
              optionally are used to calculate section numbers.

              If you have a document that falls under the continuous style category but requires frequent
              updates and could be broken into sections, then give it sectioned style. Because sections
              are updated individually, updates are much more easily performed on sectioned records.
              Several people can update different sections of the same document at the same time.




You need to define one set of section-level fields for the whole document. The data for
each section that goes in these fields will be different, but the field names will all be the
same for each section.

Suppose you have a document with the following organization (table of contents):
1.   Retrieving Records

     1.1 Using FIND Commands

     1.2 Using TYPE Commands
2.   Components of Searches
3.   Manipulating Result Sets

For the system to keep the correct organization of your document and automatically
generate a table of contents for a sectioned record, it needs at least three and up to five
pieces of information:

Section Name

The section name, which is required, is the name that is displayed in the table of contents.
In the above example, Retrieving Records, Using FIND Commands, Using TYPE
Commands, and so forth are section names. Each section name can contain a maximum
of 31 characters. If your section name is longer than that, it will be truncated. A section
name must be unique within a document, but it does not need to be unique within the
database. The section name, therefore, should not have a unique index, because that
would prevent the same section name from being used by other documents in the
database.

Section Level

The section level, which is optional, tells the system how your sections are indented,
distinguishing sections from subsections and lower-level divisions of your document. In
the above example, chapters 1, 2, and 3 are level-1 sections, and sections 1.1 and 1.2 are
level-2 sections. You can have up to 60 levels.




              Section Number

              The optional section number field identifies a chapter number, section number, etc. In the
              example above, the section numbers are 1 for Retrieving Records, 1.1 for Using FIND
              Commands, and so on.
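
              The relationship between section levels and section numbers can be sketched in Python. This is a
              hypothetical illustration of how numbers such as 1, 1.1, and 1.2 could be derived from
              levels; it assumes 1-based levels as in the example and is not a description of the
              BASIS algorithm.

```python
def section_numbers(levels):
    """Derive dotted section numbers from a list of 1-based section levels."""
    counters = []
    numbers = []
    for level in levels:
        if level < 1:
            raise ValueError("this sketch expects levels starting at 1")
        while len(counters) < level:
            counters.append(0)          # open deeper levels as needed
        del counters[level:]            # close any deeper levels
        counters[level - 1] += 1
        numbers.append(".".join(str(c) for c in counters))
    return numbers


def table_of_contents(sections):
    """Format (name, level) pairs as indented, numbered lines."""
    numbers = section_numbers([level for _, level in sections])
    return ["  " * (level - 1) + number + " " + name
            for (name, level), number in zip(sections, numbers)]
```

              Applied to the sample document, the levels 1, 2, 2, 1, 1 produce the numbers 1, 1.1, 1.2,
              2, and 3, and each level-2 entry is indented one step in the table of contents.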

              Section Title

              The optional section title is used for display in output other than the table of contents. If
              you have a long (over 31 characters) section name, you can put an abbreviated version of
              your section name in the section name field and then put the extended version in the
              section title field. The section title field can be up to 500 characters long.

              Text Stream

              The text stream usage identifies characteristics of the text stream, the field designed to
              hold most of the text of a document. This usage enables the database to retain the format
              of the text stream, including spacing, blank lines, and tabs. The text stream field stores
              the text body; document-level information such as chapter titles, revision dates and author
              names is stored in other fields.

              To store all the information about a sectioned style record, in your ADM you need to
              define five fields with their associated USAGES. The names you give the fields can vary,
              of course, from those listed below:
                  FIELD=NAME, USAGE=SECTION_NAME...
                  FIELD=LEVEL, USAGE=SECTION_LEVEL...
                  FIELD=NUMBER, USAGE=SECTION_NUMBER...
                  FIELD=TITLE, USAGE=SECTION_TITLE...
                  FIELD=TEXT, USAGE=TEXT_STREAM...

              Note: For the field that you define with a USAGE of TEXT_STREAM, you should
              also define the CONTEXT_UNIT and CONTEXT_PARSER parameters.

              The following usages or sets of standard attributes can be used for fields specific to
              sectioned records.




Table 4-9: Implied parameters for sectioned record usages


Usage                       Implied Parameters

SECTION_NAME                TYPE=CHARACTER, SIZE=1:31, OCCURS=1,
                            FORMAT=STRING, SECTION_FIELD=YES

SECTION_LEVEL               TYPE=EXACT_BINARY, PRECISION=2,
                            OCCURS=0:1, SECTION_FIELD=YES

SECTION_NUMBER              TYPE=CHARACTER, SIZE=1:240,
                            OCCURS=0:1, SECTION_FIELD=YES

SECTION_TITLE               TYPE=CHARACTER, SIZE=1:500,
                            OCCURS=0:1, SECTION_FIELD=YES


Sample Sectioned Record Definition
Following is the DDL for the TOPIC record type in the DOC Database. Only syntax
directly related to defining this document in the ADM is shown. For more examples of
DDL for long text record types, and definitions of the PLACE and SCHED records, see
“TOUR Demo Database Description.”




              ACTUAL_DATA_MODEL;
              RECORD=TOPIC,+
                STYLE=SECTIONED;

              FIELD=ID,LABEL='DOCUMENT ID',+
                USAGE=SYSTEM_KEY;

              FIELD=RELEASE,+
                LABEL='SOFTWARE RELEASE', OCCURS=1,USAGE=CHARACTER;

              FIELD=MANUAL,+
                USAGE=TITLE,OCCURS=1;

              FIELD=PTITLE,LABEL='PART TITLE',+
                USAGE=TITLE,OCCURS=0:1;

              FIELD=PNUM,LABEL='PART NUMBER',+
                USAGE=NUMBER,OCCURS=0:1;

              FIELD=AUTHOR,LABEL='AUTHOR',+
                USAGE=PERSON_NAME,OCCURS=0:5;

              FIELD=RDATE,LABEL='REVIEW DATES',+
                USAGE=DATE,OCCURS=0:10;

              FIELD=REVISION,LABEL='LAST REVISION',+
                USAGE=DATE,OCCURS=0:1;

              FIELD=ABS,LABEL='ABSTRACT',+
                USAGE=SHORT_TEXT;

              FIELD=STITLE,LABEL='SECTION TITLE',+
                USAGE=SECTION_TITLE;

              FIELD=SNUM,LABEL='SECTION NUMBER',+
                USAGE=SECTION_NUMBER;

              FIELD=SECTION_LEVEL,LABEL='SECTION LEVEL',+
                USAGE=SECTION_LEVEL;

              FIELD=SECTION_NAME,+
                LABEL='SECTION NAME',+
                USAGE=SECTION_NAME;

              FIELD=TEXT,+
                USAGE=TEXT_STREAM;

              Callouts:
                 USAGE: Usage implies several field parameters. It is like a system-defined
                  DOMAIN.
                 FIELD=ABS: SHORT_TEXT usage provides for phrase and proximity searching for
                  text fields other than the text stream.
                 FIELD=SECTION_NAME: Section name is required by the system to identify
                  sections in a document.
                 FIELD=TEXT: The text stream field is identified by USAGE=TEXT_STREAM.


                         Figure 4-2: Sample DDL for a sectioned record




Defining Complex Fields and Arrays
          A complex field contains data values that represent a table or multi-dimensional array. A
          calendar page for a MONTH can be considered a two-dimensional array with DAYS
          represented as columns and WEEKS represented by rows. MONTH(3,4) could identify
          the third week’s Wednesday. An annual calendar can be considered a three-dimensional
          array: YEAR(1,3,4) could identify the first month’s third week’s Wednesday. A
          collection of diaries spanning a range of years could represent a four-dimensional array.
          DIARY(2,1,3,4) could represent the second year’s first month’s third week’s Wednesday.

          Traditional relational database design guidelines discourage use of complex fields or
          arrays, preferring instead a separation of data into individual fields and, if necessary,
          reorganizing the data into several record types normalized to third normal form.
          However, there are some applications where complex fields or arrays are the best kind of
          data structure.

          Complex fields in BASIS are considered CHARACTER data. They consist of character
          strings embedded with one kind of delimiter to separate subfields and another kind of
          delimiter to separate subitems. BASIS supports arrays of up to twelve dimensions.
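
          The two-level delimiter scheme can be illustrated with a Python sketch of the MONTH example.
          The delimiter characters (";" between subfields and "," between subitems), the mapping
          of subfields to rows, and the function names are all assumptions chosen for
          illustration; the text does not say which characters BASIS uses.

```python
def parse_complex(value, subfield_delim=";", subitem_delim=","):
    """Split a delimited character string into rows (subfields) of subitems."""
    return [row.split(subitem_delim) for row in value.split(subfield_delim)]


def element(table, row, column):
    """Return the element at 1-based (row, column), as in MONTH(3,4)."""
    return table[row - 1][column - 1]


# A month that starts on a Sunday: one subfield per week, one subitem per day.
MONTH = ("1,2,3,4,5,6,7;8,9,10,11,12,13,14;"
         "15,16,17,18,19,20,21;22,23,24,25,26,27,28")
```

          With weeks as rows and days (Sunday through Saturday) as columns,
          element(parse_complex(MONTH), 3, 4) picks out the third week's Wednesday.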



