October 7, 2006 J4/06-0159
Page 1 of 23
Subject: Project editor recommendations on responses to XML TR HOD
poll
Author: Don Schricker – Project editor for the XML TR
References:
1. ISO/IEC 1989:2002, Programming language COBOL
2. J4/06-0129 (WG4N0249, DTR 24716) - Native COBOL Syntax for XML Support
3. J4/06-0145 - US position on XML DTR for WG4 HoD straw poll (Schricker for J4)
4. J4/06-0147 - German response to XML TR HOD poll (Augustin/Bennett)
5. J4/06-0149 – UK response to XML TR HOD poll (Grealish/Bennett)
6. J4/06-0151 – Japan's response to XML TR HOD poll (Takagi/Bennett)
7. ISO/IEC JTC1 Directives, Edition 5, Version 2.0
This document reflects the recommendations of the project editor for resolution of the
responses received from the WG4 Heads of Delegation during the recent straw poll on the
XML DTR and should not be construed as the opinion of J4 or WG4. Some comments that
represent drastic changes in direction were considered by the project editor to be contrary to
the direction given by WG4 at its most recent meeting and not given further consideration. It is
the author's hope that these recommendations will expedite the processing of the items on
which there is no controversy.
The comments in this document are listed in the same order as the J4 documents are ordered.
US Response:
The U.S. position on the XML DTR HoD straw poll is YES WITH COMMENTS. The U.S. offers
the following comments for resolution before forwarding the document:
1) In accordance with JTC1 directives add the following text in a box on the cover page:
“Recipients of this draft are invited to submit, with their comments, notification of any
relevant patent rights of which they are aware and to provide supporting
documentation.”
Response: Accept comment and make recommended change.
2) The JTC 1 Directives, in 16.2.4, Contents of Type 1 and Type 2 TRs, states
“TRs of types 1 and 2 shall contain the following parts: …
Explanation of the reasons why JTC 1 has considered it necessary to publish a
TR instead of an IS;"
The current DTR does not include (or does not clearly include) this required part.
Response: Accept comment and add the following text to the introduction:
"The decision was made to publish the specification in a Type 2 Technical Report so
that the specification can be available for implementation as soon as possible and so
that implementations can be undertaken on an experimental basis. The experience
gained is expected to result in an improved specification that can progress to
standardization in the next revision.
In order to provide as much stability as possible to implementors and users,
ISO/IEC JTC 1 Subcommittee 22 intends that the syntax and semantics be changed for
purposes of standardization only as necessary to address issues arising in
implementation or use of the feature and to integrate this facility with other new facilities
in the next revision."
3) The DTR is inconsistent on its indication of published document titles. In some cases, it
places “Extensible Markup Language (XML) 1.0 Third Edition” in italics, but in the “terms
and definitions” section, it does not. The entire DTR should be reviewed for consistent title
indications for all published documents.
Response: Accept comment. The ISO/IEC Directives indicate that the title should be
italicized in 2, Normative references. In the references throughout the document, the title
should not be italicized and should be as brief as possible. The 2002 ISO COBOL
standard should be added as a reference.
4) In some sections of the TR, the changes are numbered “1)” “2)” etc. In other sections,
they are numbered “1.” “2.” etc. Select a style and make all sections identical. The
indentation is also inconsistent. Even more serious is the fact that the numbering
conventions for subrules used in the standard are not always followed.
Response: Accept comment and make recommended changes for consistency with the
standard.
5) The term “white space” appears in a number of places within the DTR, but is never defined
or explained. When it is talking about what is not processed, this may not be a problem,
but when it is talking about what is trimmed, this needs to be well defined. Does it include
tabs, em-space, and other filling-type characters supported in a national character set?
Response: XML 1.1 states that "white space consists of one or more space (#x20)
characters, carriage returns, line feeds, or tabs". It is not clear to me that we need a
definition in the TR. However, we may want to indicate that all white space, not just
spaces (blanks), is trimmed.
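To illustrate the distinction, the following Python sketch trims only the four characters that the XML definition quoted above treats as white space. The helper name `trim_xml_white_space` is hypothetical and nothing here is taken from the DTR; it only shows that, under that definition, a filling character such as an em space (U+2003) would not be trimmed.

```python
# The four white space characters in the XML definition quoted above:
# space (#x20), tab (#x9), carriage return (#xD), line feed (#xA).
XML_WHITE_SPACE = "\x20\x09\x0d\x0a"


def trim_xml_white_space(value: str) -> str:
    """Remove leading and trailing XML white space only (hypothetical helper)."""
    # str.strip with an explicit argument removes only the listed characters,
    # so other Unicode spacing characters are left in place.
    return value.strip(XML_WHITE_SPACE)
```

Under these assumptions, tabs and line ends around a value are removed, embedded spaces are kept, and an em space at either end is preserved.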
6) Throughout the document, set the formatting to "justify" so that text will align on the right
margin as in the base document.
Response: Accept comment and make recommended change.
7) Page iv
a. In the third paragraph, last sentence, change “75 %” to “75%”
b. In the penultimate paragraph, last sentence, delete "Programming language".
Response: Accept comment and make recommended change.
8) 4.10 XML Schema (in terms and definitions), can “by constraining the structures and data
types that instance documents that conform to the schema comprise” be worded any
clearer? (Avoid “that instance ... that conforms ...comprise”.) Changing "that conforms" to
"conforming" would be an improvement, but perhaps even more can be done.
Response: Accept comment and change to "by constraining the structures and data types
of instance documents that conform to the schema".
9) 6.1, Changes to 8, Language Fundamentals,
a. There is a problem with making DOCUMENT a context-sensitive word. Consider the
two formats of the CLOSE statement: if the compiler sees "CLOSE DOCUMENT FILE-1",
it won't know whether this is an XML-document format CLOSE statement for file-1 or a file
format CLOSE statement for files named document and file-1. One solution is to make
DOCUMENT a reserved word, rather than a context-sensitive word.
Response: Accept and make DOCUMENT a reserved word.
b. Add END-OPEN to the list of reserved words, both here and in 6.6, Substantive
changes.
Response: Accept and make recommended change.
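The ambiguity described in comment 9a can be shown with a toy model. The Python sketch below is not a COBOL parser and does not reproduce the DTR syntax; the function name `close_readings` and its parameters are hypothetical. It only enumerates the ways the words following CLOSE could be read, depending on whether DOCUMENT may still be used as a user-defined word.

```python
def close_readings(operands, document_is_reserved):
    """List the ways the operands of a CLOSE statement could be read.

    Toy model only: operands is the list of words following CLOSE.
    """
    words = [word.upper() for word in operands]
    readings = []
    # Reading 1: DOCUMENT is a keyword introducing the XML-document format.
    if len(words) >= 2 and words[0] == "DOCUMENT":
        readings.append(("XML-document format", words[1:]))
    # Reading 2: every word is a file-name (file format). This is possible
    # only if DOCUMENT may still be used as a user-defined word.
    if not (document_is_reserved and "DOCUMENT" in words):
        readings.append(("file format", words))
    return readings
```

With DOCUMENT treated as context-sensitive, `close_readings(["DOCUMENT", "FILE-1"], False)` yields two readings; with DOCUMENT reserved, only the XML-document format reading remains.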
10) 6.2 Changes to 9, I-O, objects, and user-defined functions, [c] 2)
a. Second paragraph, first sentence, add a serial comma before "or".
Response: Accept and make recommended change.
b. Second paragraph, last sentence add a comma after "Therefore".
Response: Accept and make recommended change.
c. Third paragraph, fifth sentence, change in part to read:
“The START statement positions the element position vector to attributes or
elements…”
Response: Accept and make recommended change.
d. Sixth paragraph, first sentence, change
“COBOL by default writes XML documents ...”
To
“COBOL, by default, writes XML documents ...”
Response: Accept and make recommended change.
e. last paragraph, first sentence, change
“... CDATA sections, processing instructions or white space inserted for readability.”
To
“... CDATA sections, processing instructions, or white space.”
Response: Accept and make recommended change.
11) 6.2 Changes to 9, I-O, objects, and user-defined functions, [h2] through [i]
All of the new descriptions use past tense, e.g. “attempted” while the existing (pre-DTR)
rules use present tense, e.g. “is attempted”. Change new rules to match existing ones.
Response: Accept and make recommended change.
12) 6.2 Changes to 9, I-O, objects, and user-defined functions, [l], 9.1.12.6, Logic error
condition …, item 14, I-O status = 4E, change,
“...statement attempted to create XML with an ...”
to
“...statement attempted to create XML text with an ...”
Response: Accept and make recommended change.
13) 6.3 Changes to 12, Environment division [b]
There are restrictions on data-name-9, -10, -11, and -12 not being subordinate to “the file
description entry for file-name-1”. Should these also be restricted from being subordinate
to the file description of any file specified in the same “file-area format SAME clause” as
file-name-1? (They can’t both be OPEN at the same time, but this still might be a problem.)
Response: There is no need for a change in the DTR. These new rules parallel the
existing rules in the section being changed. The reason for these restrictions is that
there would not be a valid value in the data items when the file was closed.
14) 6.3 Changes to 12, Environment division [c], 2
a. Change
“the following additional subrule”
to
“the following additional subrules”.
Response: Accept and make recommended change.
b. In rule n, change “where” to “when”.
Response: Accept and make recommended change.
15) 6.3 Changes to 12, Environment division [e]
Change
“new file control entry general rules”
to
“new file control entry general rule”
Response: Accept and make recommended change.
16) 6.3 Changes to 12, Environment division [j], the new rule should be rule 4a, not 6.
Response: Accept and make recommended change.
17) 6.3 Changes to 12, Environment division [l]
a. Data-name-1 in the TYPE clause may not be qualified. Is this intentional?
Response: Accept and allow data-name-1 to be qualified.
b. First sentence of 12.3.4.17, VERSION-XML clause, change
“...of the XML specification to which created files conform.”
to
“...of the XML specification to which created XML files conform.”
Response: Accept and make recommended change.
c. There are no restrictions on the literal in the VERSION-XML clause. Shouldn’t syntax
rule 1 be changed from
“Literal-1 shall be either 1.0 or 1.1.”
to
“Literal-1 shall be an alphanumeric or national literal with the value '1.0' or '1.1'.”
Response: Accept and make recommended change.
d. VERSION-XML clause general rule 1, change
“...when a XML-document format CLOSE statement”
to
“...when an XML-document format CLOSE statement”
Response: Accept and make recommended change.
18) 6.4 Changes to 13, Data division, [a], add a colon after “(XML)” in the format header.
Response: Accept and make recommended change.
19) 6.4 Changes to 13, Data division, [d]
a. Should it be possible to qualify data-name-1 in the CODE-SET clause?
Response: Accept and allow data-name-1 to be qualified.
b. General rule 8, change,
“If the file connector is open in the extend or output mode ...”
to (something like)
“If the file connector associated with the file description in which this clause is
specified is open in the extend or output mode ...”
Make a similar change to the beginning of general rule 9.
Response: Accept and make recommended changes.
c. General rule 9, last paragraph, make the last sentence a separate paragraph and move
it before this paragraph because the file status should always be set. Check for RESUME
and other file statuses to see if they have this same problem.
Response: Move the first sentence of this paragraph to be the last sentence of general
rule 9a in the same paragraph as the existing sentence. Move the last sentence of this
paragraph to the OPEN statement as the second paragraph of new general rule 31.
With regard to the rest of the document: (1) In new general rule 25 of the REWRITE
statement, switch the order of the last two paragraphs. (2) Do the same thing for new
general rule 38 of the WRITE statement.
20) 6.4 Changes to 13, Data division, [e], in the introductory paragraph, last sentence, change
“... to determine the number of occurrences to write.”
to
“... to determine the number of occurrences to write or rewrite.”
Response: Accept and make recommended change.
21) 6.4 Changes to 13, Data division, e2
Is there a problem with both subrules b and c now ending with
“... subject to the following rules”?
Could it incorrectly be interpreted as subrule b only referring to subrule c by this phrase?
Response: Accept and change both subrules to read: "…subject to general rules 3 and 4"
22) 6.4 Changes to 13, Data division, [f], IDENTIFIED clause,
a. Second introductory paragraph, change,
“...is to be transferred into the data item without decomposition and out of the data
item intact”
to (something like)
“...is to be transferred into the data item without decomposition and transferred out of
the data item intact”
Response: Accept and make recommended change.
b. Syntax rule 6, change
“XML Namespaces 1.1”
to
“Namespaces in XML 1.1”
Response: Accept and make recommended change.
23) 6.4 Changes to 13, Data division, [i] replace the entire change with the following:
“In 13.16.42.2, syntax rules for the REDEFINES clause, replace syntax rule 3 with
“This clause shall not be specified in level 1 entries in the file section, in any entry
subordinate to a file description entry that contains a FORMAT clause, or in any entry
subordinate to a file description entry associated with a file that has organization
XML.”
Response: Accept and make recommended change.
24) 6.4 Changes to 13, Data division, [j] change
“... subordinate to a file description entry with organization XML”
to
“... subordinate to a file description entry associated with a file with organization XML”
NOTE: Check for any other occurrences of this problem in the DTR.
Response: Accept and make recommended change. I found no other occurrence of this
problem.
25) 6.5 Changes to 14, Procedure Division, the header should have “division” (lowercase “d”).
Response: Accept and make recommended change.
26) 6.5 Changes to 14, Procedure Division [c]
a. Description of EC-DATA-INFINITY, change "The value infinity" to "INF (infinity)".
Response: Accept and make recommended change.
b. Description of EC-DATA-NEGATIVE-INFINITY, change "The value negative infinity" to
"-INF (minus infinity)".
Response: Accept and make recommended change.
c. Description of EC-DATA-NOT-A-NUMBER, change “The value not a number” to
“NaN (not a number)”.
Response: Accept and make recommended change.
d. Remove the periods at the end of the descriptions of EC-XML-COUNT,
EC-XML-IMPLICIT-CLOSE, and EC-XML-STACKED-OPEN.
Response: Accept and make recommended change.
e. Description of EC-XML-STACKED-OPEN, change
“The file containing the XML document for a stacked open is closed.”
to something like
"An XML-document format OPEN statement with the STACK phrase failed because
the specified identifier was not associated with a valid element position by the
element position vector.”
Response: Accept and make recommended change.
27) 6.5 Changes to 14, Procedure Division [ca]
a. 14.6a.1.1, Alphabetic receiving items, second paragraph, first sentence, change
“...a temporary data item of class and category alphabetic of a length to hold the
entire converted value ...”
to
“...a temporary data item of class and category alphabetic of the exact length
necessary to hold the entire converted value ...”
Response: Accept and make recommended change.
b. 14.6a.1.1, Alphabetic receiving items, second paragraph, last sentence, delete “class
and” from the statement
“...MOVE statement for an elementary sending data item of class and category
alphabetic ...”.
Note - Make the same change in all the other rules that use this term.
Response: Accept and make the recommended change here and in 14.6a.1.2, 14.6a.1.3,
and 14.6a.1.4.
c. 14.6a.1.2, Alphanumeric receiving items, second paragraph, first sentence, change
“...a temporary data item of class and category alphanumeric of a length to hold
the entire converted value.”
to
“...a temporary data item of class and category alphanumeric of the exact length
necessary to hold the entire converted value.”
Response: Accept and make recommended change.
d. 14.6a.2, Transfer of output data, first paragraph, last sentence, change
“...of a length necessary to hold the ...”
to
“...of the exact length necessary to hold the ...”
Response: Accept and make recommended change.
e. 14.6a.2.1, Alphabetic or alphanumeric send items, second paragraph, change
“If after trimming the resulting XML-value ...”
to
“If the trimmed XML-value ...”
Note – Fix this also in 14.6a.2.3, National sending items.
Response: Accept and make recommended change in both places.
f. 14.6a.2.4, Fixed-point numeric sending items, does text such as
PICTURE -(n)9.9(m)
assume that a “Decimal-Point is COMMA” clause has not been specified? If so, does this
need to be made explicit somewhere (or is it already there)?
Response: J4 needs to discuss the solution. It is my understanding that XML always uses
the period as the decimal separator.
g. The text is inconsistent on whether the subclause, 14.6a.4, is called
“Codeset correspondence for output data transfer”
Or
“Codeset conversion for output data transfer”
The first one is correct. Check for all occurrences of the second and change them to the
correct name.
Response: Accept and make recommended changes.
28) 6.5 Changes to 14, Procedure Division [d], 14.8.6, CLOSE statement:
a. Change 1, change "subdocument" to "portion of a document".
Response: Accept and make recommended change.
b. Change 4, change
“The other general rules apply only to format 1.”
to something like:
“The other existing general rules apply only to format 1.”
Response: Accept and make recommended change.
c. Change 5, is there a problem with the interaction of new general rule 11 of the CLOSE
statement and the existing rule 1 in “14.5.10 Normal run unit termination”? 14.5.10 starts:
“An implicit CLOSE statement without any phrases is executed for each file that is in the
open mode.”
Response: J4 needs to discuss this.
d. Change 6, general rule 16a, last sentence, change
“they had when that OPEN statement was initiated.”
to
"they had at the start of the execution of that OPEN statement."
Response: Accept and make recommended change.
29) 6.5 Changes to 14, Procedure Division [f], 14.8.9.1, DELETE statement general format,
delete "sequential-" in the name of format 1.
Response: Accept and make recommended change.
30) 6.5 Changes to 14, Procedure Division [h], 14.8.9.3, DELETE statement general rules,
general rule 15, last paragraph, change
“...satisfies this general rule, the delete statement is ...”
To
“...satisfies this general rule, the DELETE statement is ...”
Response: Accept and make recommended change.
31) 6.5 Changes to 14, Procedure Division [h3], 14.8.26.1, OPEN statement general format,
when the TR adds “END-OPEN” for the new format, it should also add it for the old one,
even though it doesn’t have an AT END phrase. See COMPUTE, “Format 2 (boolean-
compute):” or format 1 of the ACCEPT statement for the precedence. There are
nesting/matching rules that explain why this needs to be done.
Response: Accept and make recommended change.
32) 6.5 Changes to 14, Procedure Division [j], 14.8.26.3, OPEN statement general rules,
general rule 29 is missing from the sequence of added rules. Beware when this is
corrected because specific rule numbers are referenced in this change.
Response: Accept and carefully make the recommended change.
33) 6.5 Changes to 14, Procedure Division [n], 14.8.28.3, READ statement general rules,
a. In change 1, change
“General rules 1, 2, 12, 14, 15, and 21 apply to this new format.”
to
“General rules 1, 2, 12, 14, 15, and 21 apply to both this new format and the
existing formats.”
Response: Accept and make recommended change.
b. In change 2, general rule 31, second paragraph,
i. Start a new paragraph with the last sentence.
ii. Use bullets instead of letters for the subparagraphs to avoid having two rules 31a
and 31b.
Response: Accept and make recommended change.
34) 6.5 Changes to 14, Procedure Division [r], 14.8.33.3, REWRITE statement general rules,
change 1, change
“Old general rules 1, 2, 12, 13, and 14 apply to this new format.”
to
“Old general rules 1, 2, 12, 13, and 14 apply to both formats.”
Response: Accept and make recommended change.
35) 6.5 Changes to 14, Procedure Division [s], 14.8.37.1, START statement general format,
make certain that the entire new general format appears on a single page.
Response: Accept and make recommended change.
36) 6.5 Changes to 14, Procedure Division [u], 14.8.37.3, START statement general rules,
general rule 25b references “the collating sequence of the XML document”. This term is
never used anywhere else within the DTR and it is not clear how or when this is
established.
Response: Help – what should this say?
37) 6.5 Changes to 14, Procedure Division [y], 14.8.47.3, WRITE statement general rules, rule
38, first paragraph, last sentence, change
“If ONLY is specified, ...”
to
“If the ONLY phrase is specified, ...”
NOTE – this problem occurs in the REWRITE statement, as well.
Response: Accept and make recommended changes.
38) A.1 Implementor-defined element list
The introductory information is missing the following paragraph from the base document,
“A short header and informative optional parenthetical text provide a paraphrase of the
normative detailed specification located in the body of this International Standard
<Technical Report> and direct the reader to that detail. A cross-reference is provided for
all items.”
Response: Accept and make recommended change. This new paragraph will go between
the bulleted list and the numbered list.
39) A.2 Undefined language element list, add “content of the receiving data item” when the
- EC-DATA-INFINITY
- EC-DATA-NEGATIVE-INFINITY
- EC-DATA-NOT-A-NUMBER
- EC-XML-RANGE
exception conditions are set to exist.
Response: Accept. Add the following entry to A.2:
2) Transfer of XML input data to floating-point numeric receiving items. The content of
the receiving data item is undefined when XML-value is plus infinity (INF), minus
infinity (-INF), or Not a Number (NaN) or when XML-value contains an exponent
that has a value greater than the implementation supports. (14.6a.1.6,
Floating-point numeric receiving items)
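The cases named in this entry can be sketched as a simple classification. The Python below is illustrative only: the function name `classify_xml_float` is hypothetical, the pairing of exponent overflow with EC-XML-RANGE is an assumption inferred from the order in which the comment lists the conditions, and Python's own floating-point range stands in for "what the implementation supports". It also assumes the text is already a valid XML floating-point lexical form.

```python
import math


def classify_xml_float(xml_value: str):
    """Return the exception condition name for an XML float, or None.

    Hypothetical helper; assumes xml_value is a valid XML float lexical form.
    """
    text = xml_value.strip(" \t\r\n")
    if text == "INF":
        return "EC-DATA-INFINITY"
    if text == "-INF":
        return "EC-DATA-NEGATIVE-INFINITY"
    if text == "NaN":
        return "EC-DATA-NOT-A-NUMBER"
    # Assumption: a finite lexical value whose exponent exceeds the supported
    # range is the EC-XML-RANGE case. Python converts such text to infinity.
    if math.isinf(float(text)):
        return "EC-XML-RANGE"
    return None
```

In each of the four non-None cases the content of the receiving data item would be undefined under the proposed A.2 entry; the sketch only identifies which case applies.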
40) Address any editorial errors that are discovered in the resolution of these and other
comments.
Response: Accept.
German Response
The German position on reference 2 is "YES with comment".
This vote is explicitly restricted to publishing this specification as a TR only and does not
endorse including it in a future revision of reference 1. We think, the inclusion of this feature in
the standard must depend on the acceptance of this feature by implementers and users and
the feedback expected from those.
We are concerned about the size and complexity of this proposal, which will result in further
decreasing the comprehensiveness and attractiveness of the COBOL language even beyond
its present state.
We also question the decision to define this feature on the basis of the COBOL file system,
which carries along a large overhead, substantially contributing to the complexity of the
language, and may cause a large number of questions and defect reports.
We nevertheless agree with publishing this TR, because XML support in COBOL seems to be
a high level user requirement, and this documentation provides a good basis for discussion
and evaluation by implementers and users.
Response: We acknowledge your concerns and will proceed with the publication of the TR.
Much of the XML in existence today is in files, so some file handling overhead is necessarily
involved with XML processing. An effort has been made to keep the complexity to a minimum.
UK Response
The UK position on the XML DTR HoD straw poll is NO.
A) Introduction
There are substantial worries about this proposal that go beyond corrections to the
document. Since XML is of the utmost importance strategically, it is imperative to bring it
wholeheartedly into COBOL with some fundamental new concepts. The proposal requires
chopping an XML document into manageable pieces, using artificial 01 levels, that can be
handled by old-fashioned COBOL, instead of elevating COBOL to encompass the new
Digital Revolution. The proposal is much too complex and, because the low-level
procedural operations interact in complex ways, there must be a host of pitfalls and
misinterpretations waiting to be discovered which will occupy the committee for a long time
hence and may never be adequately resolved. The complexity will make the feature
difficult to understand by present-day programmers and will put its usability into very
serious doubt.
This paper expresses the strong opinion that the use of file definitions for XML and the
hijacking of COBOL’s i/o verbs to parse a record which is already in memory are very ill-
advised and should be changed to a simple in-memory verb such as a move.
It is recognised that the proposal would benefit from the inclusion of dynamic OCCURS and
ANY LENGTH items but that the TR can only be applied against the current standard.
As a result, some of the methods, which are only sketched in the examples, are finicky in
the extreme and will tie programmers in knots. It only needs one half sentence (“except in
files of organization XML”) to allow this flexible OCCURS, since it’s already in the CD, and
greatly simplify most of the procedures. The same applies to ANY LENGTH items, which
are mentioned in the proposal.
The XML feature went to a formal proposal very early and this makes it hard to see how the
features work amidst all the insertions into the Standard. There have been few detailed
working papers that pursue all the implications and illustrate how every practical case is
resolved. The examples in the Concepts part of the proposal are much too brief and are
flawed (see later section), giving the impression that the proposed features have not been
worked through. Without this kind of detailed research, it is unknown whether the feature
will ever work.
Despite these fundamental problems, there now follows a list of comments, as requested
by the recent response poll. Answers to the more minor concerns may in fact be hidden in
the proposal, which is now so complex that it is difficult to find one’s way around it. Several
of the major comments were made several years ago when the proposal was fairly new,
and must now be reconsidered in the light of recent developments.
Response: The questioning of the use of I/O verbs is addressed by the response to the
next UK comment. An effort has been made to keep the complexity to a minimum.
The third and fourth paragraphs request the use of dynamic tables and the ANY LENGTH
clause which are in the draft revision, but not in the standard. From this request, I would
draw the opposite conclusion from that of the author of 06-0153, XML design - too early (too
tentative) to include in full revision. It would seem to me that the UK would like to skip
publication of a technical report, which is based on the current standard, and instead
include XML support in the full revision to make use of these features.
Before XML support was formatted as a technical report, the design was explained at a
workshop and fully endorsed by WG4. I disagree that the examples in the technical report
are flawed, and encourage anyone who has additional examples to put them forward for
inclusion. It is known that this feature will work because Micro Focus has customers who
are currently using this facility.
B) Major concerns.
1) A fateful decision of the proposal was to view XML as a type of file organisation.
Instead, it could have been viewed as just another way of representing information,
similar in effect to COBOL’s traditional way but using tags. XML data is often passed
via linkage, such as by an object method, or as a pointer, or placed in working storage
by some means, such as a CALL to a server routine or a database. The proposal calls
this an “in-memory” document. The dilemma for the programmer is that he/she is
forced to turn the 01-level record, conceptually, into a “file” and navigate the
complexity of all the file operations such as read, write, rewrite, start with a bewildering
range of optional phrases like invalid key. Instead of defining the XML using data
clauses, the programmer is in effect forced to write a kind of “file handler” where the
document structure is concealed and, except in simple cases, is spewed out in
fragments.
No doubt the reason behind the read-write approach was that it seems easier for the
implementor. This is a false economy because it means the difficulties are shifted to
the programmer. Ten implementors do a better job than ten thousand programmers
(especially the current breed), and the bugs only have to be cleared once.
The final example in C.2 shows the problems. We have an 01-level containing the
document, assumed to be in memory. The proposal should have provided something
like this, using a complete view of the record and a MOVE to/from any of its lower-
level entries:
01  book-document identified by "book-document"
        (some clause to say that this is an XML record).
    ... (some data)
    05  root-tag identified by "library".
        10  root-data pic x(80).
        10  books.
            15  book identified by "book" occurs dynamic
                    count in kount.
                20  book-data pic x(80).
    ... (some more data)
MOVE books to conventional-COBOL-data
The record layout above is “raw” XML, not traditional COBOL contiguous data. It can
have many levels of hierarchy and it may be in any COBOL section. If the record is in
a file, a conventional read is done first. The move statement “knows” that the sending
item is XML and performs all the necessary parsing to find the “book” data. (The
implementor will of course optimise this process when more than one move is used.)
The move is doing what the read and write do in the current proposal, except that the
implied navigation is done by the logic behind the move statement, rather than by the
programmer.
As the current proposal stands, instead of that simple move, the programmer has to
provide the following:
a SELECT ... ASSIGN for an in-memory file
an FD statement in the File Section
a record describing the root-tag data in a separate 01 record
a file-level OPEN
a document-level OPEN
an element-level READ for root-tag
a series of READs for book
a series of MOVEs
a document-level CLOSE
a file-level CLOSE
The description of the document is lost because the data is viewed broken up into
many 01-levels and the logical organisation of the document vanishes in complex
procedure.
To pick a particular occurrence of the book, the proposal provides a START verb,
instead of simply moving a value to a subscript.
The proposal should therefore be changed to separate the physical file operations and
the logical parsing and generating of XML data. In fact very little needs to be added at
the physical level. XML data is usually held in what COBOL would call a variable-
length serial file, but it could be held within any kind of existing file structure.
Response: There are many reasons why in-memory processing of XML files would be
a less desirable approach than the one in the technical report. It's easy enough to
see if you consider reading a single XML file. For the in-memory approach, you would
need to:
1) stat the file to find its size
2) allocate a buffer of the appropriate size
3) use file handling to read the file into memory
4) use XML processing to then handle the file
This would double the amount of memory required to process an XML file and fail to
acknowledge that XML is simply a data representation, just like indexed files, relative
key files and line sequential files.
On output, in-memory only processing would be disastrous, because you would need
to trap buffer overrun, reallocate buffers explicitly and then continue XML processing
where you were interrupted (presuming you could clear or reset the buffer overrun
condition).
XML is commonly used in configuration and tuning files, state storage files, general
data files and inter-process communication pipes. It very rarely appears 'naturally' in
memory. Why make the rarest case the easiest processing? Why make the most
common cases the most difficult and error prone processing? The approach in the
technical report caters well to all cases, with very little distinction between them.
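The trade-off described in this response can be sketched outside COBOL. The following is a minimal illustration in Python, not TR syntax; the function names and the sample document are hypothetical. The first function reads the whole document into a buffer before parsing it, so the raw buffer and the parsed tree are held at the same time. The second handles each element as it arrives from the stream and releases it once used.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = b"<library><book>A</book><book>B</book><book>C</book></library>"

def titles_in_memory(stream):
    """Read the whole document into a buffer, then parse the buffer."""
    data = stream.read()              # entire document held in memory
    root = ET.fromstring(data)        # parsed tree held in addition to the buffer
    return [book.text for book in root.iter("book")]

def titles_streaming(stream):
    """Handle each <book> as it arrives from the stream."""
    titles = []
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "book":
            titles.append(elem.text)
            elem.clear()              # drop the element's content once used
    return titles
```

Both functions return the same result for the same input; they differ only in how much of the document must be resident at once.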
2) Items of indeterminate length, such as remarks, descriptions and comments are
essential to XML. It is mentioned in B.1 under Unresolved technical issues that the
ANY LENGTH clause will be available. However, ANY LENGTH is not allowed in the
File Section at the moment, and this restriction needs to be removed now.
Response: Consider lifting the restriction on ANY LENGTH in the File Section when
XML is added to the full revision. Ramifications of this for the TR have been
discussed.
3) Items that occur (repeat) an indeterminate number of times are also essential to
XML. So, since any-length items are to be allowed in XML-type records, it’s even
more important to allow the OCCURS DYNAMIC clause in order to handle items that
repeat any number of times. OCCURS DYNAMIC is made for XML and exactly
corresponds with the DTD * (asterisk) symbol which says “this item is repeated any
number of times”. (And the minoccurs and maxoccurs of the DTD correspond with
the FROM and TO of the OCCURS clause.) The COUNT (or CAPACITY) clause is
already there to hold the number of occurrences actually found. This is not a future
enhancement: it must go in now with ANY LENGTH. We don’t want programmers to
risk hitting the limit and then having to read tag after tag to find the remainder (see
example below). The change can be done in a half sentence!
Response: Ramifications of this for the TR have been discussed.
4) COUNT clause.
The COUNT clause is essentially the same as the CAPACITY phrase of dynamic-
capacity tables, and it seems a pity that one of them is not changed to make them the
same keyword. When these big new features appear together in the next Standard, it
will look as though different parts of the committee disagreed.
For a non-occurring item it seems wrong to use the same COUNT keyword and a
“count” that can only be 0 or 1. Better would be to use a boolean item or add a new
COBOL condition such as identifier IS [NOT] PRESENT … or simply to state that a
blank (i.e. space-filled) item that is defined as optional will correspond to an item
omitted in the XML.
Response: The COUNT clause and the CAPACITY phrase do not serve the same
purpose. The CAPACITY phrase is concerned with the allocated size of a table,
whereas the COUNT clause is concerned with the number of elements within the
allocated table, which may be less than the number that that allocation could hold.
5) OCCURS DEPENDING. This should be prohibited in an XML record. XML
documents do not usually store counts as data items and we have the COUNT (or
CAPACITY) clause anyway. An ODO would conflict with it and there are too many
questions arising.
Response: Accept and make change.
6) Absent items. It’s not clear what happens to an item that is absent in the XML
document (i.e. optional and omitted from the document or an excess occurrence in the
case of an OCCURS clause).
What goes into a COBOL data item on READ if the XML item is absent? Is the data
item initialized or space-filled? This should be made clear. (All the items in a File
Section record are changed after a READ so something must be placed in all of them.)
There does not seem to be a requirement for a COUNT clause for items that might be
absent in the XML. On REWRITE, if a data item was absent but has no COUNT and
the program has not moved a value into it, is the item still absent in the XML? This
should be specified.
If a WRITE takes place (to insert new data into a document) and an alphanumeric item
defined as optional in the DTD contains spaces in the COBOL program but there is no
COUNT, is the item omitted from the XML document? (Hopefully yes. It should not
appear as a blank tag when it is not required.) This should be explained.
Response: Accept.
It should be made clear in the rules of the READ statement that when there is no
associated XML attribute or element, the effect on the receiving item is as though the
following statement were processed: INITIALIZE receiving-data-item TO DEFAULT.
REWRITE will write the absent element unless there is a COUNT phrase that prevents
this.
WRITE would work like REWRITE. Note that the tag would not be blank. The tag
would be whatever was defined for the tag. The data associated with the tag would be
one space. When this is folded into the full revision, the data associated with the tag
could be of zero-length. This issue could be added to Annex B, Unresolved technical
issues, for exploration during the development of the full revision.
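The READ behaviour proposed in this response can be pictured with a small sketch. It is written in Python rather than COBOL and is only an analogy: the record layout, the field names and the choice of spaces and zero as default values are assumptions made for illustration, standing in for the effect of INITIALIZE ... TO DEFAULT. The second returned value plays the role of a COUNT item that is 0 or 1.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: field name -> default used when the element is absent.
# Spaces for alphanumeric items and zero for numeric items are assumed defaults.
DEFAULTS = {"title": " " * 10, "pages": 0}

def read_record(xml_text):
    """Return (record, counts); absent elements receive their default value."""
    elem = ET.fromstring(xml_text)
    record, counts = {}, {}
    for name, default in DEFAULTS.items():
        child = elem.find(name)
        counts[name] = 0 if child is None else 1
        if child is None:
            record[name] = default                 # element absent: default value
        elif isinstance(default, int):
            record[name] = int(child.text)         # numeric item
        else:
            width = len(default)                   # fixed-length alphanumeric item
            record[name] = (child.text or "").ljust(width)[:width]
    return record, counts
```

For the input `<book><title>COBOL</title></book>` the sketch fills `pages` with zero and reports a count of 0 for it, which is the distinction the comment asks the rules to make explicit.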
7) Ordering of items. Since XML depends on tags rather than contiguous ordering (see
the IBM document Principles of XML design: “when the order of XML elements matters“) it
should be stated explicitly that the “XML order” need not be the same as the “COBOL
order”. For example, “boy”, “boy”, “girl”, “girl” could appear in the document as “boy”,
“girl”, “boy”, “girl”, etc. etc. unless intermediate XML elements (“boys” and “girls”) are
defined, or the DTD has a sequence keyword to enforce the order.
As it stands, it’s not clear that the proposal can handle these different orderings. For
example, would a (group level) READ of the following data description work in all
possible orderings?
01 children IDENTIFIED BY "children".
   03 boy occurs 2 IDENTIFIED BY "boy".
   03 girl occurs 2 IDENTIFIED BY "girl".
The proposal does not seem to rule this out but it should explicitly say that the XML
elements, in whatever order they arrive, are rearranged if necessary to the order given
in the COBOL record.
Again we have the question as to what happens to the order if the record is now
rewritten. (Is it retained? or re-ordered after the COBOL sequence? is this
undefined? does it matter?) Maybe it should be explicitly stated that COBOL writes
data in the order specified in its own data description.
Response: Accept. The proposal will make clear that all possible orderings can be
handled on input and that the data is written in the order specified in the data
description.
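The intended behaviour for the "children" record can be sketched as follows. This is a Python analogy under stated assumptions, not TR syntax: FIELD_ORDER and both function names are hypothetical, and FIELD_ORDER stands in for the order of the items in the COBOL data description. On input, occurrences are gathered per tag whatever order they arrive in; on output, they are written in data-description order.

```python
import xml.etree.ElementTree as ET

FIELD_ORDER = ["boy", "girl"]   # assumed order of the items in the data description

def read_children(xml_text):
    """Collect occurrences per tag, whatever order they arrive in."""
    table = {name: [] for name in FIELD_ORDER}
    for child in ET.fromstring(xml_text):
        if child.tag in table:
            table[child.tag].append(child.text)
    return table

def write_children(table):
    """Emit elements in data-description order, not arrival order."""
    root = ET.Element("children")
    for name in FIELD_ORDER:
        for value in table[name]:
            ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")
```

A document that interleaves boy and girl elements is read into the same table as one that groups them, and a subsequent write groups them in the order boy, girl.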
8) It’s not clear what the status of the element position vector is after an unsuccessful
READ. It should be stated that the element position vector is “established for data-
name-2” even though no data-name-2 was found. This is to prevent the situation:
(1) Read myxml element only root
(2) Read myxml element rare-element
(3) Read myxml element regular-element
For this code to be correct, another Read (1) should be inserted before Read (3). But
if “rare-element” is absent, the element position vector might be left unchanged by
some implementations so that Read (3) works as expected. Then, on the rare
occasion that “rare-element” is present, Read (3) will give an “at end” condition,
according to GR31, as though there were no occurrences of regular-element present.
In other words, this is a serious pitfall that programmers may fall into, yielding incorrect
results under certain true-life conditions.
Response: No change is necessary. According to the last paragraph of new general
rule 31, an unsuccessful READ statement sets the element position vector to indicate
that no next element exists. In order for Read (3) to always work in this scenario, a
START statement can be inserted between Read (2) and Read (3).
9) Characters > (greater than), < (less than) and “ (quote). These are presumably
permitted within the contents and get converted automatically to &lt;, &gt;, and &quot;
in the XML document. This should be stated. However, if RAW is specified, for the
purpose of building up from scratch or repairing an XML format, a GT, LT or QUOTE
might be a “semantic” one and need to stay as it is. (The introduction to 13.16.28a
IDENTIFIED clause may in fact say this.) But the programmer using RAW may need
to store a “non-semantic” LT etc. as &lt; etc. This needs a general rule.
Response: Any suggestions? Are these assumptions correct?
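One possible reading of the comment is sketched below in Python, as an analogy only. The function names and the raw flag are hypothetical and merely stand in for the RAW phrase; nothing here is TR syntax. Without raw, the markup characters in the content are converted to entity references on output and converted back on input. With raw, the content passes through untouched, so any conversion is the programmer's responsibility.

```python
from xml.sax.saxutils import escape, unescape

# escape() already handles &, < and >; the quote needs an extra mapping.
QUOTE = {'"': "&quot;"}
UNQUOTE = {"&quot;": '"'}

def to_xml_text(value, raw=False):
    """Convert item content for writing; raw content is passed through as-is."""
    return value if raw else escape(value, QUOTE)

def from_xml_text(value, raw=False):
    """Convert document content for reading; raw content is passed through as-is."""
    return value if raw else unescape(value, UNQUOTE)
```

Under this reading, a program using raw mode that wants a non-semantic less-than sign in the document would have to store the entity reference itself.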
10) C.2, “slightly more complicated example”. This example will be confusing for readers.
The namespace yourname only occurs once and is rightly defined on its own, but the
attribute name and value would surely occur several times. So why is there no
OCCURS clause? Programmers will need to be told that they have to obtain each one
by successive READs. But this example should use OCCURS … DYNAMIC to define
the attributes:
15 yourname PIC X ANY LENGTH.
15 root-tag-attr-name OCCURS DYNAMIC.
20 root-tag-attr-name PIC X ANY LENGTH.
20 root-tag-attr-val PIC X ANY LENGTH.
Response: See previous response regarding dynamic tables. The example will be
improved to say that multiple READ statements will be necessary, as illustrated by the
example that follows.
11) C.2, final example (Consider an example to illustrate the COUNT clause). This seems
a very bad example and illustrates a problem with the syntax as it stands. How does
the program know that there are more than 5 occurrences of the tag “book”? How
does it know that it has to do that “subsequent read”? Of course, it carries on reading
if it gets a maximum count “just in case” and the next count will be zero if there were
exactly 5, but the example must explain this point and must be generalised for any
number. Note that in the following we cannot insert a group item above “book”,
because the item would not have an equivalent in the XML document (and must have
an IDENTIFIED clause by 13.16.28a.2 SR(2)), and so we have to do single MOVEs
which is ugly. Nevertheless, the example as it stands must be changed to this (if it is
correct):
01 root-tag identified by "library".
   10 root-data pic x(80).
   10 book identified by "book" occurs 5 count in kount.
...
01 ws-kount pic s9(4) comp.
01 book-sub pic s9(4) comp.
01 ws-book-table.
   10 ws-book occurs dynamic.
...
Read myxml element root-tag *> initial read
Move 0 to ws-kount
Perform until kount = 0
   Perform varying book-sub from 1 by 1
         until book-sub > kount
      add 1 to ws-kount
      move book (book-sub) to ws-book (ws-kount)
   end-Perform
   Read myxml element book *> subsequent read
end-Perform
The best solution is now to allow the dynamic syntax that is already in the Draft:
01 root-tag identified by "library".
   10 root-data pic x(80).
   10 book identified by "book" occurs dynamic
         count in kount.
      15 book-data pic x(80).
Response: See previous response regarding dynamic tables.
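The control flow of the fixed-size example in comment 11 can be modelled in Python to show why a further read is needed even when the number of occurrences is an exact multiple of the table size. This is an analogy only: CAPACITY mirrors "occurs 5", the length of each batch plays the role of kount, and both function names are hypothetical.

```python
CAPACITY = 5   # mirrors "occurs 5" in the record description

def read_batch(source, capacity=CAPACITY):
    """Return up to `capacity` items; the batch length plays the role of COUNT."""
    batch = []
    for item in source:
        batch.append(item)
        if len(batch) == capacity:
            break
    return batch

def read_all_books(books):
    """Keep reading batches until one comes back empty (count of zero)."""
    source = iter(books)
    collected = []
    reads = 0
    while True:
        batch = read_batch(source)
        reads += 1
        if not batch:            # count of zero: nothing left to move
            break
        collected.extend(batch)
    return collected, reads
```

With exactly five books the loop performs two reads, the second returning a count of zero, which is the "just in case" read the comment describes.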
12) 13.16.28a.2 SR(2): this says that COBOL cannot introduce its own group levels that are
not reflected in the XML. There is no reason for this and it is important to remove this
restriction. Consider the layout:
01 account IDENTIFIED BY "account".
   05 receipt occurs 10 IDENTIFIED BY "receipt".
   05 payment occurs 10 IDENTIFIED BY "payment".
Because of this unnecessary restriction, the programmer who wants to move all the
receipts or all the payments has to move them one-by-one. There is absolutely no
reason why he should not write:
01 account IDENTIFIED BY "account".
   03 receipts.
      05 receipt occurs 10 IDENTIFIED BY "receipt".
   03 payments.
      05 payment occurs 10 IDENTIFIED BY "payment".
Here, receipts and payments are purely COBOL items which do not exist in the XML
(indicated by the lack of an IDENTIFIED clause). The programmer can now do a
MOVE receipts and MOVE payments etc.
Response: This rule is necessary to prevent unworkable situations.
13) 6.4 item [i] (REDEFINES clause). There is absolutely no reason why REDEFINES
should not be used, provided that a redefinition does not have an IDENTIFIED clause
at the same level or below. For example, if we want to break up a 10-character code,
the following should definitely be possible:
01 cust-rec IDENTIFIED BY "cust-rec".
   03 cust-code IDENTIFIED BY "cust-code" PIC X(10).
   03 cust-code-2 REDEFINES cust-code.
      05 cust-code-head PIC 99.
      05 cust-code-middle PIC X(7).
      05 cust-code-tail PIC 9.
Only items with an IDENTIFIED clause correspond to items in the XML document, so
the REDEFINES has a completely neutral effect.
Response: Does the committee want to consider this suggestion?
14) Item [h3]. The OPEN and CLOSE DOCUMENT verb form is simply horrible. It turns
the OPEN (this form, at least) into a conditional statement and requires the
programmer to do two OPENs for the same name! The example in C4.4 brings this
home:
open i-o quote-info
open document quote-info
...
close document quote-info
close quote-info
A double OPEN goes right against the principles. Either the document should have a
different name from the file or the OPEN and CLOSE DOCUMENT should be some
other statement. Also, OPEN … AT END is out of the question. OPEN does not
produce an AT END! The question should be resolved in the traditional way by
means of a file status code. (This OPEN is really a READ – see the introduction. But
using files is the wrong approach anyway.)
Response: No change will be made. The syntax and direction in the TR have been
endorsed by WG4.
15) 12.3.4.16 The CHECK VALIDITY phrase should be on the OPEN or CLOSE verb
respectively (WITH VALIDITY CHECK). It is a procedural concept, it may not need to
be done on every open or close, and it may not be appropriate, such as when
OUTPUT is specified but the file is only opened for INPUT.
Response: Does the committee want to consider this suggestion?
16) Invalid Key phrase. This goes against the principle that INVALID KEY refers to a file
that has KEY (as a keyword and as a concept). XML elements are not keys. The
phrase should be changed when we understand its purpose. The proposal itself
seems to neglect this phrase and rules need to be added (assuming we have the latest
version):
Item [p] REWRITE statement: There is no mention of imperative-statement-1 or
imperative-statement-2 in the GRs.
Item [w] WRITE statement: There is nothing in the GRs to describe the purpose of the
Invalid Key phrases and no mention of imperative-statement-1 or imperative-
statement-2.
Response: The rules for imperative-statement-1 and imperative-statement-2 are in
the COBOL standard and apply to these new formats. The committee wishes to apply
existing syntax, rather than invent new reserved words.
C) Minor concerns.
1) Item 4: add definitions for these terms which are referred to continually: namespace,
attribute in an XML context, subdocument (mentioned only once under CLOSE
statement) and possibly CDATA (or add a passing reference or explanation “Character
Data (CDATA)”).
Response: The term subdocument will be removed. The other terms are defined by
XML so there is no need to define them here.
2) Item 6.1: Shouldn’t DISCARD and VERSION-XML be context-sensitive? (After all,
DOCUMENT is context-sensitive in the CLOSE statement.)
Response: DOCUMENT needs to be a reserved, rather than a context-sensitive,
word, as explained in the US response to this straw poll. DISCARD could be context-
sensitive, rather than reserved. The committee should consider this. VERSION-XML
needs to remain reserved.
3) Item 6.1: END-OPEN is also a new context-sensitive word.
Response: END-OPEN will be added to the list of reserved words.
4) Item 6.5: EC-DATA-NOT-A-NUMBER is misspelt.
Response: This spelling will be corrected.
5) EC-XML-CODESET-CONVERSION is misspelt in two places as EX-XML-CODESET-
CONVERSION.
Response: This will be corrected.
6) Item 6.5: under EC-XML-COUNT, “COUNT phrase” should be “COUNT clause”.
Response: This will be corrected.
7) 12.3.4.17: 1.1 or 1.0 will not be written on CLOSE if the document was OPEN for
INPUT.
Response: This will be corrected.
8) 13.16.15a COUNT clause, Note after GR4: this is a major processing rule and should
not be a Note.
Response: No change will be made. This note is just a summary of what is stated in
the rules for the READ statement.
9) Item 6.5 [c]: infinity is not available as a value in COBOL, so it surely cannot be
assigned to an item!
Response: No change will be made. Infinity is a value available to the new floating-
point binary and decimal types.
10) C.4 in the example: address and name are reserved words!
Response: Address is a reserved word and will be changed. Name is not a reserved
word.
11) C.4 same example: it’s a good idea to explain why you would do this piecemeal rather
than simply using an OCCURS 20 and an OCCURS 50 with COUNT clauses. Also
explain why the element “root” has to be written separately.
Response: Accept. More explanation will be added to this example.
D) Other Issues (for consideration)
1) Consider using DTD to specify the XML structure. Is new syntax needed for
something that already has established syntax, i.e. the DTD? If new syntax is specified
then:
a) it has to be able to specify the same things, so will, presumably, be isomorphic
b) any vendor is going to create a conversion program to map DTD onto the new
syntax.
Response: The XML structure may be specified either with a DTD or with a Schema.
In either case, a description is needed of the portion that COBOL will work with. At
least one vendor has a utility to create the COBOL description from a DTD.
2) Consider using XPath to navigate through XML.
Response: J4 considered XPath and decided that using it in this proposal would make
the proposal too complex.
3) Having rules to map an XML element (with sub-elements and attributes) to some
COBOL data structure would be very convenient (and save a lot of MOVE statements),
but it looks like you're jumping through a host of hoops you don't really need to.
Response: I don't understand this comment.
E) Conclusion
Markup Language processing is too important to accept this proposal as it stands. It must
be changed urgently to analyse the XML in memory only and in a schematic (whole record
view) way, with drastic simplification of the syntax, a few additions to the data division to
model XML structures, and a resultant vastly greater appeal and acceptability to the public.
Response: The issue of in-memory-only XML processing was addressed in the
response to UK comment B1. The syntax and direction in the TR were endorsed by
WG4 at their most recent meeting.
Japan's Response
Japan's position is "YES with comment".
1. Page 1, 2, Normative references.
The editions of the XML specifications have been updated. It is better to reference the
updated editions unless doing so would require significant changes to the TR.
XML 1.0 (Fourth Edition), W3C Recommendation, 16 August 2006
XML 1.1 (Second Edition), W3C Recommendation, 16 August 2006
Response: Accept comment and make recommended change.