COBOL-Tool by huanghengdong


									12/16/2011Confidential                        Page 1                                   12/16/2011


            KCS COBOL Tool: An Overview and Current Status

Introduction

The KCS COBOL Tool (sometimes called the COBOL Gopher) is based on the Knowledge-Centric
Software (KCS) technology, a framework for developing tools for software evolution and
maintenance. An overview of the KCS technology and its application to developing COBOL tools
is given in [1]. This document gives an overview of the internals of the KCS COBOL Tool and its
current status, to help readers understand the evolution path for the tool. The COBOL tool has
three major categories of capabilities: analysis (e.g. data modeling), knowledge extraction (e.g.
extracting business rules), and restructuring (e.g. eliminating GO TO statements and dead code).
This report focuses only on the analysis capabilities.

Internal Structure

Figure 1 shows the organization of the key components of the tool. The eXtensible Common
Intermediate Language (XCIL) format in the KCS technology enables the creation of an integrated
set of tools for different programming languages. Currently, C, C++, COBOL, FORTRAN, and Java
are supported. Different types of graph objects are produced depending on the type of analysis.
The graph objects currently available are the Paragraph Flow Diagram (PFD) for viewing
intra-program structure, the Component View Diagram (CVD) for viewing inter-program
interactions, and the Variable Trace Diagram (VTD) for a pictorial view of the trace. The
Database Generator can support any SQL-based database. The parser and some aspects of the
analyzer (e.g. the analysis of the different types of MOVE, RENAME, and REDEFINE) are language
specific; the others are standard components used across KCS tools for different languages.


   SOURCE          COBOL               XCIL          LOADER            ANALYZER
    CODE           PARSER            FORMAT



                                                                          GRAPH
                                                                          OBJECT


                                        DATABASE            VISUALIZER           REPORT
                                       GENERATOR                                GENERATOR

                   Figure 1: Internal Structure of the COBOL Tool



Current Status
This section briefly reports the current status of the components and their integration, and
gives an idea of how the COBOL tool will evolve to the next stage. The evolution path will
change if new pressing needs are identified.

A. Parser
Currently we use our own parser for COBOL. We plan to change to a commercial-grade parser and
build an XCIL converter for it. We use Edison Design Group (EDG) front ends for other
languages, but EDG does not support COBOL. We are currently examining the alternatives for
parsers; an open COBOL parser appears to be one viable alternative at this point. The parsing
capabilities of the current parser are summarized in Table 1. Based on the hand tracing and
testing we have done, we believe that the current parsing capabilities are adequate for
performing the analysis needed for data modeling.


Supported Syntax         IFCOND, ELSECOND, ENDIF, GOTO, PERFORM, PERFORM-THRU,
                         STARTPROC, CALL
                         EXEC CICS, EXEC CICS (READ | WRITE | REWRITE | SEND |
                         RECEIVE | XCTL | LINK), ENDEXEC
                         READ, WRITE, REWRITE, FILE, DATASET, MAP, MAPSET, INTO,
                         FROM
                         COMMAREA, PROGRAM, PROGRAM-ID
                         MOVE, SMOVE, NMOVE, COPY
                         EVALUATE, WHEN
                         RENAME, REDEFINE, COPY REPLACE
Syntax not supported     STRING, UNSTRING, DELIMITED, SET, COMPUTE, MOVE LENGTH,
                         ERR, EXEC CICS UNLOCK
       Table 1: A Summary of Current Capabilities of the Parser

B. Analyzer
All the COBOL-specific algorithms have been developed, but not all of them are implemented. A
summary of the pending implementations is provided in Table 2.


Loop Analysis      In transaction-processing programs, many loops iterate over transactions,
                   and data modeling does not require an analysis of loop iteration. We
                   believe this to be the case for the given programs, and we would like
                   confirmation from domain experts.
MOVE analysis      A MOVE to an unstructured buffer is currently analyzed in a conservative
                   way, which is likely to give false positives. We believe that a more
                   accurate analysis is possible if domain knowledge is used, and we will
                   implement it after discussions with the domain experts. We ran a test to
                   check whether a large number of false positives are produced when the
                   given program is analyzed. We found that no false positives are produced
                   beyond those generated by the conservative IF analysis, discussed next.
Domain-specific    In a conservative static analysis, one has to assume that any of the
IF analysis        execution paths generated by IF conditions may be taken, and one has to
                   take the union of the results so that no possible result is missed. This
                   type of conservative analysis can produce false positives. In the current
                   program this is the dominant factor, and it completely overshadows the
                   conservative MOVE analysis. We believe that a precise analysis can be
                   performed instead if the meanings of the IF flags are clarified by domain
                   experts.
       Table 2: A Summary of Pending Enhancements for the Analyzer





We support batch processing, so hundreds of variables can be analyzed in a matter of minutes.
This can save significant time compared to manual processing. The systematic automated tracing
can be verified intrinsically by checking the algorithm used for the analysis. This is an
important advantage because it is a formal check, as opposed to heuristics or random sampling,
neither of which is completely reliable. Errors in coding the algorithm are still possible, so
we sampled a few variables and traced them by hand to check the results. Manual tracing,
however, is tedious and prone to errors. Another way to verify the results would be to use
domain knowledge about the application.
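
The batch tracing described above can be pictured with a small sketch. It is not the tool's
actual algorithm: it is a minimal, flow-insensitive illustration that treats each MOVE as an
edge from source to target and follows those edges for every requested variable. The function
names (build_flow_graph, trace, batch_trace) are hypothetical, and the MOVE pairs are copied
by hand from the example at the end of this document.

```python
from collections import defaultdict, deque


def build_flow_graph(moves):
    """Map each source variable to the set of variables it is moved into."""
    graph = defaultdict(set)
    for source, target in moves:
        graph[source].add(target)
    return graph


def trace(graph, start):
    """Return every variable reachable from `start` by following MOVE edges."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in graph.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def batch_trace(moves, variables):
    """Trace many variables in one pass over a shared flow graph."""
    graph = build_flow_graph(moves)
    return {name: sorted(trace(graph, name)) for name in variables}


# (source, target) pairs taken from the MOVE statements in the example listing.
MOVES = [
    ("KEY-LPO", "PA-LPO-NBR"),
    ("KEY-LOAN", "PA-NEW-LOAN-NBR"),
    ("NMI-RCD-AREA", "BORS1001-DEF"),
    ("NMI-RCD-AREA", "BORS1001-INPT"),
    ("NMI-RCD-AREA", "BORS1101-DEF"),
    ("NMI-RCD-AREA", "BORS1101-INPT"),
]

results = batch_trace(MOVES, ["KEY-LPO", "NMI-RCD-AREA"])
for name, targets in results.items():
    print(name, "->", ", ".join(targets))
```

Because the graph is built once and shared, adding more variables to the batch costs only one
extra traversal each, which is why a batch of hundreds of variables stays cheap in a sketch
like this one.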
An example is given at the end of this document to illustrate the difference between
conservative IF analysis and a more precise form of analysis. The example is taken from the
actual code, and it shows how some false positives may be generated in the absence of
domain-specific knowledge. One important aspect of the KCS technology is its support for
customizing the analysis using domain knowledge. The example illustrates an opportunity for
such customization.

C. Visualization
The visualization components have been tested separately and are used in many of our other
tools. We have encountered some integration problems, and the current visualization in the
COBOL tool needs further improvement. If visualization becomes a priority, this can be done
fairly quickly.

D. Report Generation
This capability is flexible and easily extensible. The available reporting is based on the current
needs identified by the domain experts.

E. Database Generation
Since our output is XML-based, it is very easy for us to store the analysis results in any
standard SQL-based database. We have tested Microsoft Access and a Linux-based public-domain
database as two possibilities. We have designed a database schema that we believe will be
useful for the data modeling exercise. This database facility can easily be customized to suit
the specific needs of a given analysis or knowledge extraction project.
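
As an illustration of why XML output maps easily onto a SQL database, the sketch below loads
a small XML fragment into a table. Everything specific in it is an assumption made for the
example: the element and attribute names (analysis, trace, target), the table variable_trace,
and the use of SQLite as a stand-in for the databases mentioned above. The tool's real XML
format and schema are not described in this document.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout; the tool's actual output format is not shown here.
SAMPLE_XML = """
<analysis program="A000-MainLine">
  <trace variable="NMI-RCD-AREA">
    <target name="BORS1001-DEF"/>
    <target name="BORS1101-DEF"/>
  </trace>
</analysis>
""".strip()


def load_traces(conn, xml_text):
    """Insert one row per (program, source, target) found in the XML text."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS variable_trace ("
        "program TEXT, source TEXT, target TEXT)"
    )
    rows = [
        (root.get("program"), trace.get("variable"), target.get("name"))
        for trace in root.iter("trace")
        for target in trace.iter("target")
    ]
    conn.executemany("INSERT INTO variable_trace VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
inserted = load_traces(conn, SAMPLE_XML)

# Query the stored results with ordinary SQL.
targets = [
    row[0]
    for row in conn.execute(
        "SELECT target FROM variable_trace WHERE source = ? ORDER BY target",
        ("NMI-RCD-AREA",),
    )
]
print(inserted, targets)
```

Only standard SQL statements are used, so the same approach should carry over to other
SQL-based databases with a different connection library.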

REFERENCE

    1. An Overview of Knowledge-Centric Software Technology with Applications to Legacy
       COBOL Code.







Example: Substituting a More Precise IF Analysis for the Conservative Analysis

047370 A000-MainLine.
.....
.....
047530     MOVE 'LNFILE' TO DATASET-NM.
047540     String '01ML' KEY-LPO KEY-LOAN ' 01'
047550        delimited size into NMI-RCD-KEY.
047560     MOVE KEY-LPO TO PA-LPO-NBR.
047570     MOVE KEY-LOAN TO PA-NEW-LOAN-NBR.
047580     Exec CICS Read
047590         DATASET (DATASET-NM)
047600         RIDFLD   (NMI-RCD-KEY)
047610         LENGTH   (RECORD-LEN)
047620         INTO     (NMI-RCD-AREA)
047630         KEYLENGTH (22)
047640         RESP     (CICS-RESP)
047650     END-Exec.
.....
.....
047770*>>PROCESS-FILES
047780     IF CB-DATA-FLAG (001) = SPACE
047790        GO TO R001-END.
047800     MOVE 'BORSEL ' TO DATASET-NM.
047810     MOVE 804 TO RECORD-LEN.
.....
.....
047850     MOVE NMI-RCD-AREA TO BORS1001-DEF.                                  1
047860     PERFORM P005-READ-RECORD THRU P005-EXIT.
047870     MOVE NMI-RCD-AREA TO BORS1001-INPT.
.....
.....
051930 R001-END.
051940
051950     IF CB-DATA-FLAG (002) = SPACE
051960        GO TO R002-END.
051970     MOVE 'BORSEL ' TO DATASET-NM.
.....
.....
052020     MOVE NMI-RCD-AREA TO BORS1101-DEF.
052030     PERFORM P005-READ-RECORD THRU P005-EXIT.                        2
052040     MOVE NMI-RCD-AREA TO BORS1101-INPT.




NOTE: The conservative analysis will assume that either of the two marked possibilities can
occur, and it will treat both LNFILE-NMI-RCD-AREA and BORSEL-NMI-RCD-AREA as saved to
BORS1101-DEF. Most likely, only one of the CB-DATA-FLAG entries is non-blank at run time, so
only one of the two possibilities occurs, and hence only LNFILE-NMI-RCD-AREA should be treated
as saved to BORS1101-DEF.
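
The difference described in the NOTE can be reproduced with a small sketch. It is a toy model
of the listing above, not the tool's implementation: it enumerates the execution paths created
by the two IF tests and collects the origin of the data that reaches BORS1101-DEF. Two things
are assumptions made for illustration: that P005-READ-RECORD (whose body is not shown) refills
NMI-RCD-AREA from the BORSEL dataset, and that the domain rule is "at most one CB-DATA-FLAG
entry is non-blank". The function names are hypothetical.

```python
from itertools import product


def run(flag1_set, flag2_set):
    """Simulate one path; return the origin of the data saved to BORS1101-DEF."""
    area = "LNFILE"      # NMI-RCD-AREA after the initial EXEC CICS READ
    saved = None
    if flag1_set:        # CB-DATA-FLAG (001) is not SPACE
        area = "BORSEL"  # assumed effect of PERFORM P005-READ-RECORD
    if flag2_set:        # CB-DATA-FLAG (002) is not SPACE
        saved = area     # MOVE NMI-RCD-AREA TO BORS1101-DEF
    return saved


def origins(path_allowed):
    """Union of origins over every flag combination accepted by path_allowed."""
    result = set()
    for flags in product([False, True], repeat=2):
        if path_allowed(flags):
            origin = run(*flags)
            if origin is not None:
                result.add(origin)
    return result


# Conservative analysis: every combination of IF outcomes is considered possible.
conservative = origins(lambda flags: True)

# Domain-specific analysis: assume at most one flag is non-blank at run time.
precise = origins(lambda flags: sum(flags) <= 1)

print(sorted(conservative))
print(sorted(precise))
```

Under these assumptions the conservative run reports both LNFILE and BORSEL as possible
origins, while the constrained run reports only LNFILE; the extra BORSEL entry is the false
positive that domain knowledge would remove.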



