Cray Research has long recognized that high-performance hardware must be
complemented with high-performance software to achieve the ultimate in
high-speed scientific computing. Having pioneered the development of automatic
optimizing and vectorizing compilers with the CFT Fortran compiler, Cray
Research now proudly offers the CFT77 compiler, which represents the
leading edge of compiler technology.
CFT77 is a multipass, optimizing, vectorizing, and multitasking compiler that
adheres to the American National Standards Institute (ANSI) standard X3.9-1978
(often called Fortran 77). CFT77 processes existing standard Fortran programs
without modification.
The CFT77 compiler is available for the CRAY X-MP series of computer
systems, the CRAY-2 computer system, and for CRAY-1 computer systems -
and it operates under both COS and UNICOS, the Cray operating systems.
CFT77 takes full advantage of the unique hardware architecture of Cray computer
systems and by doing so greatly enhances their performance. Thus, users benefit
from hardware and software that work together to achieve maximum performance.
The high degree of software portability and superior performance offered by
the CFT77 compiler result in increased productivity of the programming staff
and efficient use of computing resources.
As users of the first Cray Fortran compiler (CFT) know, application perfor-
mance and portability are the top priorities for Cray compiler developers.
These goals are paramount to CFT77, which applies the latest techniques in
software design to continue in the tradition of excellence established by Cray
Research with the CFT Fortran compiler.
CFT77 uses three techniques to improve the execution time of a Fortran
program: vectorization, scalar optimization, and multitasking. These three
techniques are key to the performance of Fortran programs.

Vectorization
The compiler automatically generates code that uses the vector registers and
functional units of the Cray hardware. Speedups in the area of 10 to 1 are
common when comparing vector processing to scalar processing. The programmer
does not need to know the details of vectorization; CFT77 automatically
vectorizes Fortran programs.

Scalar optimization
Even when CFT77 cannot vectorize code, it still optimizes scalar code using a
variety of optimization techniques to improve execution time.

Multitasking
CFT77 permits the partitioning of a program among multiple processors,
enabling different parts to execute at the same time. Future plans include the
ability for CFT77 to partition programs automatically. Multitasking teamed with
vectorization is a powerful combination.
Vectorization is a method for reducing the execution time of repetitive code.
Following is an overview of the difference between scalar and vector
processing.

Vectorization means that specialized hardware is used for greatly increasing
program performance. CFT77 takes care of vectorizing for the users; without a
vectorizing compiler, a programmer would have to use assembly language to
manipulate the hardware.

Vectorized loops include those containing nested IF statements, loops that use
indirect (gather/scatter) addressing, and search loops, among others.

CFT77 combines the practical knowledge gained in Cray Research's decade of
vectorization experience with successful research programs from several
universities.

CFT77 also provides an extensive set of vectorization diagnostics to indicate
vectorized and unvectorized areas of code. Simple code changes or compiler
directives often can help the compiler fully vectorize the unvectorized
sections.
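
The kinds of loops discussed above can be illustrated with a short sketch. It
is written in C rather than Fortran purely for illustration, and the function
names (saxpy, gather_add, prefix_sum) are hypothetical, not part of CFT77 or
any Cray library. The first two loops have independent iterations, which is
the sort of repetitive code a vectorizing compiler can map onto vector
hardware. The third carries a value from one iteration to the next, so as
written it would generally run as scalar code.

```c
#include <stddef.h>

/* Independent iterations: each y[i] depends only on x[i] and y[i],
   so many elements can be processed by one vector operation. */
void saxpy(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Indirect (gather) addressing: elements of x are selected through an
   index array. Iterations are still independent of one another. */
void gather_add(size_t n, const float *x, const size_t *idx, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += x[idx[i]];
}

/* Loop-carried dependence: y[i] needs the y[i - 1] just computed,
   so the iterations cannot simply be executed side by side. */
void prefix_sum(size_t n, float *y)
{
    for (size_t i = 1; i < n; i++)
        y[i] += y[i - 1];
}
```

Whether a particular compiler vectorizes any given loop depends on the
compiler and the hardware; the sketch only shows the dependence patterns
involved.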
CFT77 efficiently optimizes scalar code. As with vectorization, CFT77
approaches scalar optimization by analyzing a complete program unit.

Scalar optimization transforms the internal representation of the Fortran
program into a more efficient but functionally equivalent program. This is
achieved by simplifying expressions and by detecting and eliminating redundant
operations. The following optimization techniques, recognized as being
state-of-the-art by today's compiler developers, are incorporated into CFT77:

- Common subexpression elimination
- Forward propagation of constants and expressions
- Extracting invariant expressions from loops
- Strength reductions
- Hoisting and sinking
- Moving stores out of loops
- Store elimination
- Dead code elimination
- Arithmetic simplification
- Short circuiting of logical expressions
- Constant expression evaluation
- Bottom loading of loops

These scalar optimizations are always transparent to the user.
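
As a rough illustration of three of the techniques named above (common
subexpression elimination, extraction of invariant expressions from loops, and
strength reduction), the sketch below shows a loop as a programmer might write
it and a hand-transformed equivalent. It is written in C for illustration
only; the function names and parameters are hypothetical, and the second
function is an assumption about the general shape of such transformations,
not a description of the code CFT77 actually generates.

```c
#include <stddef.h>

/* As written: (a + b) is computed twice per iteration, scale * k is
   recomputed every iteration, and the index needs a multiplication. */
double sum_naive(const double *v, size_t rows, size_t stride,
                 double a, double b, double scale, double k)
{
    double total = 0.0;
    for (size_t i = 0; i < rows; i++) {
        double t = (a + b) * v[i * stride] + (a + b);
        total += t * (scale * k);
    }
    return total;
}

/* Hand-transformed, functionally equivalent version. */
double sum_optimized(const double *v, size_t rows, size_t stride,
                     double a, double b, double scale, double k)
{
    const double ab = a + b;          /* common subexpression, computed once */
    const double factor = scale * k;  /* loop-invariant expression, hoisted  */
    double total = 0.0;
    size_t offset = 0;                /* strength reduction: i * stride
                                         becomes a running addition          */
    for (size_t i = 0; i < rows; i++) {
        double t = ab * v[offset] + ab;
        total += t * factor;
        offset += stride;
    }
    return total;
}
```

Both functions perform the same arithmetic in the same order, so they return
the same result; the second simply does less redundant work per iteration.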
The multitasking capabilities of CFT77 permit the programmer to divide a
single program among the multiple central processing units offered on CRAY
X-MP and CRAY-2 computer systems.

The speedup possible with multitasking is a function of the number of central
processors available, the degree of parallel processing in the program, and
the overhead inherent in multitasking. Speedup factors in the range of 3.6 to
3.8 have been achieved on four-processor systems, and factors of up to 1.8 on
two processors.

Cray Research currently supports two approaches to multitasking. In the first
approach, the user multitasks a Fortran program by inserting calls to library
routines that implement a basic set of multitasking functions. In the second
approach, called microtasking, the user invokes the PREMULT preprocessor by
inserting directives in the Fortran source code. PREMULT then generates the
appropriate library calls.

CFT77 supports both approaches to multitasking, as does CFT. When stack
storage allocation is specified, CFT77 generates reentrant code. The TASK
COMMON statement allows the declaration of COMMON blocks known only to a
single task; this is often useful in the first multitasking approach.

Currently under development and expected to be available in the near future
as a feature of CFT77 is the ability to multitask some Fortran code
automatically. Emphasis is placed on multitasking at the DO-loop level. Cray
Research is exploring how best to implement multitasking so that the use of
multiple processors is as easy as the use of vector processors.
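
The dependence of speedup on processor count, degree of parallelism, and
overhead can be sketched with a simple model. The function below uses the
standard Amdahl's law formula extended with an overhead term; the model, the
function name multitask_speedup, and the way overhead is expressed (as a
fraction of single-processor run time) are assumptions made for illustration
and do not come from Cray Research.

```c
/* Estimated speedup on `processors` CPUs for a program whose fraction
   `parallel_fraction` (0 to 1) can execute in parallel. `overhead` is the
   multitasking cost expressed as a fraction of single-processor run time.
   Model: speedup = 1 / (serial + parallel / processors + overhead). */
double multitask_speedup(int processors, double parallel_fraction,
                         double overhead)
{
    if (processors < 1)
        processors = 1;  /* guard against a meaningless processor count */

    double serial = 1.0 - parallel_fraction;
    double parallel = parallel_fraction / (double)processors;
    return 1.0 / (serial + parallel + overhead);
}
```

Under this model, a program that is 98 percent parallel with no overhead
yields a speedup of about 3.77 on four processors, which shows how even a
small serial portion keeps the result below the processor count.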
CFT77 is a language rich in features. It contains all features described in
the Fortran 77 standard as well as a number of extensions to the language.
Some of these extensions, such as stack storage allocation and TASK COMMON,
are necessary to support multitasking. Other features such as NAMELIST I/O
and Hollerith constants are frequently used in existing Fortran programs;
CFT77 supports these features so that existing codes can be moved to CFT77
without extensive conversions. A few features that are expected to be in the
next Fortran standard have been implemented in CFT77; these include a subset
of the array syntax (see example above).

The same language features are supported in CFT77 on all Cray computer
systems. The extensions supported include the following:

- Array processing, which permits operations on whole arrays or array
  sections (a subset of the proposed Fortran 8X standard array processing)
- Automatic arrays, with flexible bounds
- Recursive functions and subroutines
- Pointer data type
- Hollerith constants
- Boolean constants (octal and hexadecimal)
- Variable names of up to 31 characters and external and COMMON block names
  containing up to 8 characters
- Extra edit descriptors, including those for right justification and octal
  or ...
- Comments embedded within a line
- Compiler directives for listing output control, vectorization control,
  dynamic common blocks, and array bounds checking
- A choice of static or stack storage allocation methods
- TASK COMMON storage for multitasking
- On the CRAY-2, COMMON blocks allocated to local memory, permitting faster
  access to frequently used variables
- Asynchronous I/O, which allows I/O operations to execute simultaneously
  with other program statements
- Mixed formatted and unformatted records in a file under the COS operating
  system
The Cray Fortran environment
The environment surrounding the CFT77 compiler contains a wealth of library
routines and tools that make the user's job both easier and faster.

Library routines
Supporting the CFT77 compiler and the high-performance hardware inherent in a
Cray computer system is a library of highly optimized subroutines to aid
scientific and engineering computation.

Regardless of the machine and operating system, a wide variety of library
routines are callable from CFT77. They include:

- Mathematical routines that are intrinsic to Fortran
- Scientific application routines
- I/O and utility routines

Routines in these libraries perform random number generation, Fourier
analysis, sorting, and many other operations. Fortran programs that need a
frequently used operation can be served by Cray's libraries.

All of the library routines are optimized. They have been coded to keep
execution time to a minimum; many are coded in assembly language to maximize
efficiency.

Linking to non-Fortran routines
CFT77 is compatible with other Cray Research language processors. Routines
compiled with CFT77 may call or be called by routines compiled by the Pascal,
C, or CFT compilers, or routines assembled by the CAL assembler.

Segment loader
SEGLDR, the segment loader, allows control over memory use at run time. This
is particularly useful for large codes with several distinct sections, such as
initialization, computation, and output.

Symbolic debug package
Included with CFT77 is a debug package to help users locate errors in their
applications. The package consists of the following parts:

- DEBUG, which analyzes a memory dump of a job and provides listings of
  variable names and values
- DRD, which is a powerful interactive symbolic debugger that analyzes the
  memory of an executing job based on user directives
- DDA, which allows interactive analysis of a memory dump of a job using a
  subset of DRD directives

Multitasking tools
A multitasking history trace buffer provides for the accumulation of a history
of multitasking events. An associated tool, MTDUMP, interprets this data and
reports the sequence of execution, task history, and processor history. These
tools aid the user in understanding multitasking behavior, identifying
bottlenecks, and debugging programs.
Non-ANSI flags
At the user's request, CFT77 will flag features that are not part of Fortran
77.

List options
Many options are available for generating output listings, including a source
statement listing with any of five levels of error messages and a listing of
assembly code generated by CFT77. Diagnostic messages are issued on the source
listing for loops that are not vectorized.

Cross referencing
CFT77 has an extensive cross-reference facility. The listing includes
addresses, references and definitions of variables, statement labels,
subroutine names, and so on. All are keyed to the Fortran line number.

FTREF
The FTREF program, a global cross-reference utility, provides a static
analysis of program flow and common block use. The latter is provided in both
summary and detailed formats. FTREF also has options specifically oriented to
multitasked applications.

FLOWTRACE
The FLOWTRACE option is a useful tool for fine-tuning program performance. It
shows where the code spends its time and helps locate the sections where
special optimization could be applied for increased performance.

Hardware performance monitor
On CRAY X-MP computer systems, the hardware performance monitor allows users
to identify bottlenecks and to compute MFLOPS (millions of floating-point
operations per second). The monitor accumulates statistics on the following
hardware activities:

- Instructions executed
- Floating-point operations
- Hold issue conditions
- Reference conflicts
- Vector operations

SPY
This is a code-level profiler available for the CRAY X-MP computer systems.
Like FLOWTRACE, it is useful for fine-tuning program performance. SPY samples
the hardware program address register to build a map of where the program
spends its time and can provide information at a lower level of detail than
that provided by FLOWTRACE.

Data conversion
I/O library routines convert data and tape/disk formats during Fortran I/O
operations. Data is converted to and from Cray formats and IBM, CDC, or DEC
VAX formats. Users may also disable data conversion during I/O operations and
perform the conversion by calls to special routines.
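
The MFLOPS figure mentioned in connection with the hardware performance
monitor is simple arithmetic on two quantities: a count of floating-point
operations and the elapsed time. The sketch below shows that calculation in
C. The function name mflops and its parameters are hypothetical; it does not
represent the interface of the hardware performance monitor, which is not
described here.

```c
/* Millions of floating-point operations per second, given a count of
   floating-point operations and the elapsed time in seconds.
   Returns 0.0 when the elapsed time is not positive. */
double mflops(unsigned long long flop_count, double elapsed_seconds)
{
    if (elapsed_seconds <= 0.0)
        return 0.0;
    return (double)flop_count / (elapsed_seconds * 1.0e6);
}
```

For example, 200 million floating-point operations completed in 2 seconds
corresponds to 100 MFLOPS.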
CFT77 design philosophy
Transportability
Through its many features and because of its compliance with the 1978 ANSI
standard, CFT77 assures that programs written for other computer systems have
maximum portability with a minimum of effort.

Additionally, CFT77 contains a number of extensions to the ANSI standard,
including those already supported in CFT. Some of the extensions add helpful
features that make Fortran richer and more flexible. Others enhance
portability by reflecting features added to Fortran by other computer
manufacturers, such as IBM and CDC.

Cray Research took portability one step beyond ANSI compliance by designing
CFT77 to run on all of its machines and under all Cray supported operating
systems. It runs on CRAY-1, CRAY X-MP, and CRAY-2 computer systems and
executes under COS and UNICOS (the Cray operating systems).

A Fortran program that compiles and runs on one Cray system will compile and
run on all Cray systems. Different codes do not need to be maintained for each
machine. Upgrading to a new Cray system, therefore, is easy.

Changing from CFT to CFT77 is also easy. In general, programs that compile and
execute correctly with the old CFT compiler also compile and execute correctly
with CFT77.

Cray Research will implement CFT77 on future generations of its computers. The
compiler has been structured so this can be done quickly, without sacrificing
the performance of generated code. Therefore, the program optimized today for
a CRAY X-MP or CRAY-2 computer system will move easily to the new Cray systems
of tomorrow.

Structure of the compiler
CFT77 is designed for the future. The compiler is structured for easy
adaptation to new Cray hardware as it becomes available and to new
optimization techniques as they evolve. Because it is written in Cray's
extended Pascal, CFT77 is both portable and maintainable.

The structure of CFT77 is organized around three major functions: source input
and semantic analysis; optimization and vectorization; and code generation.

In the first phase of the compilation, CFT77 reads the Fortran statements and
translates them into an intermediate form used in later processing. This
section of CFT77 is virtually the same on all Cray machines, meaning source
code that compiles on one machine will compile on the others.
The intermediate code consists of text and a dictionary. The text is a
representation of the executable Fortran statements. The dictionary is a
collection of the attributes associated with the text items.

During the second phase, CFT77 performs optimization transformations on the
intermediate text and determines the vectorizable sections of the code. This
phase is optional. Bypassing it slows down the execution speed of the
generated code, but the corresponding speedup in compilation time can be
valuable during development and debugging.

In its final phase, CFT77 generates machine instructions from the intermediate
text and dictionary. The instructions are scheduled to take advantage of the
asynchronous execution of the independent functional units common to all Cray
computers. Each code generator also takes advantage of specific hardware
features, such as chaining, local memory, or gather/scatter operations. Upon
completion of this third phase, the machine language code is ready for loading
and execution.

Documentation and training
Cray Research supports all of its software products with technical manuals and
training. Programmers may be interested in the following:

- The CFT77 Reference Manual, which describes the entire CFT77 language and
  its interface to the Cray operating systems
- The Programmer's Library Reference Manual, which describes the routines
  available and how they can be called from CFT77
- A course on CFT77 offered by Cray Research at the Mendota Heights,
  Minnesota, training facility, which provides information and practical
  experience in code ... and programming to take advantage of vector
  processing and other basic ...
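
The division of the intermediate code into text and a dictionary, with an
optional optimization phase between input and code generation, can be
sketched in a few lines of C. Everything in this sketch is hypothetical: the
type names (DictEntry, TextEntry), the two-operation instruction set, and the
assumption that each dictionary item is assigned at most once are chosen for
illustration and do not reflect the actual internal format of CFT77. The
optimization shown is constant expression evaluation only.

```c
#include <stddef.h>

/* Dictionary entry: attributes of an item referenced by the text. */
typedef struct {
    const char *name;
    int is_constant;   /* 1 if the value is known at compile time */
    double value;      /* meaningful only when is_constant is 1   */
} DictEntry;

typedef enum { OP_ADD, OP_MUL } OpCode;

/* Text entry: dest = lhs op rhs; operands are dictionary indices. */
typedef struct {
    OpCode op;
    size_t dest, lhs, rhs;
    int folded;        /* set when the optimizer evaluated the entry */
} TextEntry;

/* Optional second phase: evaluate entries whose operands are both
   constants. Assumes each item is assigned at most once.
   Returns the number of entries folded. */
size_t fold_constants(TextEntry *text, size_t n, DictEntry *dict)
{
    size_t folded = 0;
    for (size_t i = 0; i < n; i++) {
        const DictEntry *l = &dict[text[i].lhs];
        const DictEntry *r = &dict[text[i].rhs];
        if (!text[i].folded && l->is_constant && r->is_constant) {
            DictEntry *d = &dict[text[i].dest];
            d->value = (text[i].op == OP_ADD) ? l->value + r->value
                                              : l->value * r->value;
            d->is_constant = 1;
            text[i].folded = 1;
            folded++;
        }
    }
    return folded;
}

/* Stand-in for the final phase: count the entries that would still
   need machine instructions generated for them. */
size_t count_emitted(const TextEntry *text, size_t n)
{
    size_t emitted = 0;
    for (size_t i = 0; i < n; i++)
        if (!text[i].folded)
            emitted++;
    return emitted;
}
```

Skipping fold_constants leaves every entry to be emitted, which mirrors the
trade-off between faster compilation and faster generated code.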
Additional information on CFT77
is available from any Cray
Research sales office.
608 Second Avenue South
Minneapolis, MN 55402
Domestic sales offices
Albuquerque, New Mexico
Colorado Springs, Colorado
Los Angeles, California
Rochester, New York
St. Louis, Missouri
MP-1009 ©1986, Cray Research, Inc.