An ANDF Based Ada 95 Compiler System
DDC-I A/S
Gl. Lundtoftevej 1B
DK-2800 Lyngby, Denmark
Phone: +45 45 87 11 44
Fax: +45 45 87 22 17
By Jørgen Bundgaard
The ANDF Concept
The ANDF Language
ANDF Compared to Other Compiler Intermediate Languages
Usefulness of the ANDF Technology
Will ANDF be Useful for Ada 95 as Well?
About the OMI/GLUE Project
Efficient Support for Ada 95 in ANDF
Components of the ANDF Based Ada 95 Compiler System
The Compiler
The Linker
The Standard Libraries
Current Status
How to get ANDF Information and Documentation
References
Abstract
An Ada 95 compiler system which uses the Architecture Neutral Distribution Format (ANDF) as an intermediate program representation form is
presented.
ANDF was originally designed and developed for use in C compilers, but has now
been extended to support other high-level languages
efficiently, including Ada 95.
The main difference between ANDF and conventional intermediate program forms is
that ANDF represents an abstraction of the high-level
language while the others typically represent an abstraction of the target
architecture.
One of the advantages of using ANDF is that high quality final code generators
already exist for a number of popular platforms. That gives a
potential for bringing Ada 95 rapidly to these platforms.
The ANDF Concept
Traditionally, supporting N high-level languages on M target machines
requires N times M compilers. However, given a
universal intermediate language that supports the N high-level languages and
covers the M target machines, the number of compilers can be
reduced to N plus M. ANDF is such an intermediate language.
High-level language
|
Producer
|
ANDF
|
Installer
|
Target architecture
In the ANDF terminology a compiler translating from a high-level language to
ANDF is called a producer, and a compiler translating from
ANDF to target machine code is called an installer.
Currently, installers are available (or under development) for the most popular
platforms, as well as for some less popular ones:
Architecture   Operating Systems
386/486        SVR4.2, SCO, Solaris, NT, Linux
Pentium        SVR4.2, SCO, Solaris, NT, Linux, Sinix
680x0          HP-UX
HP-PA          HP-UX
SPARC          SunOS, SVR4.2, Solaris, NT
MIPS           Ultrix, Iris, NT
Alpha          OSF/1
RS6000         AIX
PowerPC        AIX PowerOPEN
ARM            -
Currently, a producer is available for C. Producers are under development for
C++, Ada 95, Dylan/Lisp, and FORTRAN 77.
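As a rough worked example of the N times M versus N plus M argument, the following Python sketch counts the components needed for the languages and architectures listed above. It is only an illustration: it counts one installer per architecture and ignores the operating system variants, which is a simplifying assumption not made in the text.

```python
# Languages and architectures as listed in this paper
# (entries that are still "under development" are included).
languages = ["C", "C++", "Ada 95", "Dylan/Lisp", "FORTRAN 77"]
architectures = ["386/486", "Pentium", "680x0", "HP-PA", "SPARC",
                 "MIPS", "Alpha", "RS6000", "PowerPC", "ARM"]


def direct_compilers(n_languages: int, m_targets: int) -> int:
    """One dedicated compiler per (language, target) pair."""
    return n_languages * m_targets


def andf_components(n_languages: int, m_targets: int) -> int:
    """One producer per language plus one installer per target."""
    return n_languages + m_targets


n, m = len(languages), len(architectures)
print(direct_compilers(n, m))  # 50 dedicated compilers
print(andf_components(n, m))   # 15 producers and installers
```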
The ANDF Language
ANDF is an abstraction of high-level languages. Its definition contains
abstractions for common programming language concepts such as data
structures, procedures, numbers, conditionals, loops, labels, jumps, etc. The
intention of ANDF is to retain all the information in the
source program that code optimizers in the installers can use.
In this way ANDF is as applicable
to new architectures as to existing ones.
ANDF constructs have been designed so as to be able to accommodate the
particular variants found in different programming languages.
However, ANDF cannot guarantee coverage of new programming languages as it can
for architectures. New languages might contain novel
features that are not efficiently implementable using existing features of ANDF.
Extensive research has therefore been undertaken to ensure that
ANDF is an efficient target for the compilation of Ada 95.
ANDF contains, for example, EXPs (abstractions of expressions and statements),
SHAPEs (abstractions of types) and TAGs (abstractions of
variable identifiers). In general form it is an abstract syntax tree which is
flattened and encoded as a series of bits, called a CAPSULE. A number
of CAPSULEs may be combined to form a single CAPSULE. This fairly high level of
definition (for a compiler intermediate language) means that
ANDF is architecture neutral in the sense that it makes no assumptions about the
underlying processor architecture.
ANDF has a parameterization mechanism via TOKENs.
Virtually any node of the ANDF tree may be a TOKEN: a place holder which stands
for a subtree. Before the ANDF can be decoded fully, the
definition of this TOKEN must be provided. The TOKEN definition is then macro
substituted for the token in the decoding process to form the
complete tree. Tokens may also take arguments.
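The TOKEN mechanism described above can be pictured with a minimal Python sketch of macro substitution over a tree. This is not the ANDF encoding or any installer's actual data structure; the classes Node and TokenUse, the function expand, and the token names c_int and twice are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union


@dataclass(frozen=True)
class Node:
    """An ordinary tree node: a construct name plus child subtrees."""
    name: str
    children: Tuple["Tree", ...] = ()


@dataclass(frozen=True)
class TokenUse:
    """A place holder standing for a subtree; it may carry arguments."""
    token: str
    args: Tuple["Tree", ...] = ()


Tree = Union[Node, TokenUse]
# A token definition maps its (already expanded) arguments to a subtree.
TokenDefs = Dict[str, Callable[..., Tree]]


def expand(tree: Tree, defs: TokenDefs) -> Tree:
    """Replace every token use by its definition to form the complete tree."""
    if isinstance(tree, TokenUse):
        if tree.token not in defs:
            raise KeyError(f"no definition for token {tree.token!r}")
        args = [expand(arg, defs) for arg in tree.args]
        # The substituted body may itself contain tokens, so expand it too.
        return expand(defs[tree.token](*args), defs)
    return Node(tree.name, tuple(expand(c, defs) for c in tree.children))


# Hypothetical definitions: one without arguments, one with an argument.
defs = {
    "c_int": lambda: Node("integer_32"),
    "twice": lambda e: Node("plus", (e, e)),
}
tree = Node("variable", (TokenUse("c_int"), TokenUse("twice", (Node("x"),))))
full = expand(tree, defs)
```

In this picture, a tree that still contains a TokenUse cannot be fully decoded; supplying a different definition for c_int on another platform would yield a different complete tree from the same original.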
The specification of ANDF takes the form of an abstract syntax with the
semantics described in English. Here is an extract of the specification
of the constructs assign, top, pointer, and alignment:
assign
    arg1: EXP POINTER(x)
    arg2: EXP y
    -> EXP TOP
The value produced by arg2 will be put in the space indicated by arg1.

top
    -> SHAPE
TOP is the SHAPE that describes program pieces which return no useful value.

pointer
    arg: ALIGNMENT
    -> SHAPE
A POINTER is a value which points to space allocated in a computer's memory. The
pointer constructor takes an ALIGNMENT argument.

alignment
An ALIGNMENT gives information about the layout of data in memory and hence is a
parameter for the POINTER and OFFSET SHAPES.
The construct assign takes two arguments each of sort EXP; the sort of assign is
itself an EXP; the SHAPE of assign is TOP, it performs the
assignment but does not deliver any useful value.
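The relationship between sorts and SHAPEs in the extract can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the specification: the Shape and Exp classes and the construct labels are hypothetical, and the extract does not spell out how the ALIGNMENT x relates to the SHAPE y, so that relation is deliberately left unchecked here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """A simplified SHAPE: a kind plus, for pointers, an alignment."""
    kind: str            # e.g. "TOP", "POINTER", "INTEGER"
    alignment: str = ""  # only meaningful for POINTER in this sketch


TOP = Shape("TOP")


def pointer(alignment: str) -> Shape:
    """The pointer constructor takes an ALIGNMENT and delivers a SHAPE."""
    return Shape("POINTER", alignment)


@dataclass(frozen=True)
class Exp:
    """A simplified EXP: a construct label and the SHAPE of its value."""
    construct: str
    shape: Shape


def assign(arg1: Exp, arg2: Exp) -> Exp:
    """assign: arg1 must be an EXP of POINTER shape; the result is EXP TOP."""
    if arg1.shape.kind != "POINTER":
        raise TypeError("assign: arg1 must have a POINTER shape")
    # The assignment is performed for its effect and delivers no useful value.
    return Exp("assign", TOP)


# Hypothetical operands: the address of a variable and an integer value.
destination = Exp("variable_address", pointer("integer_alignment"))
value = Exp("integer_constant", Shape("INTEGER"))
result = assign(destination, value)
```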
There are more than 100 different EXPs, and there are ten different basic
SHAPEs.
The specification of ANDF also concerns the memory model (that is what the
alignment constructors are used for), error handling, order of
evaluation, representation of integers and floating point values.
The complete specification of the ANDF language is found in [1]. This informal
specification of ANDF is supplemented by a formal
specification [2], using the style of Action Semantics [3].
The ANDF concept was originally invented by the British MoD Defence Research
Agency (DRA) for solving portability problems with
software written in C, including the problems of name clashes in different APIs
and much, much more. The paper "TDF and Portability" [4]
provides a deep insight.
These true powers of ANDF are really not unleashed until the Ada 95 vendor, in
this case DDC-I, attempts to port an Ada binding from one
platform to another, using the available system libraries. In all other
respects, ANDF can be regarded as Yet Another Compiler Intermediate
Language, but one with the thrilling feature that a host of code generators
happen to exist.
ANDF Compared to Other Compiler Intermediate Languages
Intermediate Languages (IL) are typically compared on the basis of the
achievable efficiency of the final code, the ease of targeting from a
high-level language, the ease of mapping to a new architecture, the cost and
availability of the technology, and finally on the level of
standardization. For compiler system vendors who support multiple high-level
languages and/or multiple target platforms, the choice of IL is of
strategic importance.
There are several alternative IL designs: Three-address code, abstract stack
machine code, Register Transfer Lists (RTL) (the interface between
the GNU NYU Ada Translator (GNAT) and the GNU gcc back end) and even shrouded C!
The main difference in design between the usual IL choices and ANDF is that the
usual choices represent an abstraction of the architecture
while ANDF represents an abstraction of the high-level language.
No IL design has yet proven to be superior with respect to efficiency of the
final code because some information present in the source code is
bound to be lost in either end of the abstraction scale. However, ANDF based C
compilers are as good as the best traditional C compilers. This
is documented in [5].
The ease of targeting from a high-level language favors ANDF once ANDF has been
properly extended for the purpose. The cost of extending
ANDF for a new high-level language effectively includes the cost of enhancing
all dependent installers.
The development time for a new high quality installer (for a new architecture)
is one to two person-years. A lot of code can be reused for similar
architectures. The retargeting time for the GNU gcc back end is known to depend
very much on the implementors' experience with that very
same component. Anything from a couple of days to a year has been quoted. The
retargeting time for a proprietary IL is typically a well guarded
secret!
There is an important difference between proprietary ILs, ANDF, and RTL in terms
of availability: Code generators for proprietary ILs are
typically either not commercially available, or are only available from a single
source. The GNU back end is free "Copyleft" software, but using it
implies that the entire compiler source code must also be given away for free to
anyone who asks for it. Highly optimizing ANDF installers are available on
commercial terms from at least two vendors, while
the "standard" ones are in the public domain.
ANDF is the only IL for which a formal standardization has begun (ISO and
X/Open). RTL is a sort of de facto standard set by the GNU
community.
All in all, the choice of IL lies with the compiler vendor, and the combinations
of high-level languages and architectures to support now and in
the future must be used to guide decisions.
It is clearly interesting for DDC-I to build a compiler system which includes a
translator from Ada 95 to ANDF, because the availability of
installers will reduce the cost and time for bringing DDC-I Ada 95 technology to
many platforms.
Usefulness of the ANDF Technology
ANDF has proven to be a very versatile technology for C. OSF Research Institute,
France, and Defence Research Agency (DRA), UK, as well
as other collaborators around the world, have made numerous advances through the
use of ANDF. Until now the technology has proven to be
very useful in the following areas:
a. For developing and testing portable software (the ANDF technology requires
conformance to API specifications, so it also serves as a good
portability checker).
b. As an open compiler technology (a baseline for developing a compiler family).
In particular for new non-interpretive languages, portable
compilers may find ANDF a better intermediate language than, say, C
source code.
c. For software distribution (the original ANDF objective).
d. As a more formal and machine manipulable way of expressing the static aspects
of APIs.
OSF also foresees interesting usages in the following areas:
e. ANDF as a representation of parallel programs.
f. As a format to represent and manipulate code in CASE tools.
Will ANDF be Useful for Ada 95 as Well?
It has yet to be demonstrated that ANDF is also a very versatile technology for
Ada 95. This will partly depend on the quality of the Ada 95
front end for ANDF (being developed as part of the OMI/GLUE project), partly on
the quality and availability of installers.
The most important message is that there is definitely a great potential for
bringing Ada 95 rapidly to a lot of popular platforms via the ANDF
technology.
This is important because there are clear indicators that ANDF is a candidate for an
important role in multi-language development environments
and in projects which deal with porting legacy software to new hardware
platforms:
- New architectures emerge (Alpha, PowerPC)
- COTS software is (re)used through interfacing
- Applications are more long-lived than hardware
- APIs are being offered on more platforms
About the OMI/GLUE Project
The Open Microprocessor Initiative (OMI) is a 250 million dollar technology
programme which involves many large IT companies within the
European Community. The hardware part of OMI operates with a "macrocell" concept
which enables VLSI designers to combine different CPU
architectures, storage units, communication devices, and other function units on
single chips.
The OMI/Global Language and Uniform Environment (GLUE) project is a 17 million
dollar project in the software part of OMI. It aims at
providing a choice of advanced languages for the software developers within the
OMI and allows the most suitable language to be used in a
given application. The development of an Ada 95 producer is one example. The
project also aims at consolidating the ANDF technology
within OMI. This means development of installers for the architectures supported
by the OMI.
The partners in the OMI/GLUE project are:
Etnoteam SPA, Italy,
DDC-I A/S, Denmark
Harlequin Ltd, UK
Defence Research Agency (DRA), UK
OSF Research Institute, France
The starting point for the OMI/GLUE project was a production quality C producer
and several installers of the same quality developed by DRA.
In addition to development of new producers and installers, the project includes
formal and informal specification of ANDF, development of
test suites and validation capabilities, portability checking tools, technology
consolidation for conventional languages (C & Fortran 77),
extension of ANDF for advanced languages (Ada 95 and Dylan/Lisp), and proposing
ANDF extensions for parallelism. From the diagram below
you can probably see why the project's logo is an hourglass!
      Programming languages supported
     Ada 95   C/C++   F77   Dylan/Lisp
            \     |     /
             Producers
              \   |   /
                ANDF
              /   |   \
             Installers
            /     |     \
 i386/486  680x0  SPARC  MIPS  PowerPC
 Pentium   RS6000  HP-PA  Alpha  ARM
      Target architectures supported
The OMI/GLUE project will be finished by September 1995. Several follow-up
projects, under the Esprit programme as well as funded by private
companies, are under preparation.
International standardization of ANDF will be handled by the newly formed ISO
working group: ISO/IEC JTC1/SC22 WG15 ANDF/Virtual
Binary Interface.
Standardization under X/Open (which is likely to be achieved much faster than
ISO standardization) is currently under consideration.
Efficient Support for Ada 95 in ANDF
As part of the OMI/GLUE project, ANDF has been extended in order to support
advanced languages like Ada 95 better. It was found that
improvements were required in order to provide efficient support for the
following areas:
- Exceptions (stack overflow)
- Dynamic Types (functions returning dynamically sized objects)
- Out parameters (postlude)
Other areas did not require changes. Dispatching, for example, is already
supported indirectly in ANDF through subprogram pointers (which are
needed for C).
ANDF has no heap concept or tasking concept. Heap storage management and tasking
will therefore be handled through external calls to a
Run-time System.
Error handling for numeric overflow and stack overflow is (now) built into
ANDF, while subtype checks are not. Explicit ANDF code must
therefore be generated for subtype checks. This has the advantage that a
producer can omit a check if it can be determined at compile time that
the check is superfluous [6].
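The idea of omitting a superfluous subtype check can be sketched as follows in Python. This is a simplified illustration using static integer ranges only, not the technique of [6] or the actual DDC-I producer; the function names and the textual pseudo-instructions are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (low, high) bounds known at compile time


def needs_range_check(value_range: Range, subtype_range: Range) -> bool:
    """True if a value in value_range might fall outside subtype_range."""
    v_low, v_high = value_range
    s_low, s_high = subtype_range
    return v_low < s_low or v_high > s_high


def emit_assignment(target: str, source: str,
                    source_range: Range, target_range: Range) -> List[str]:
    """Return pseudo-instructions; the explicit check is emitted only if needed."""
    code = []
    if needs_range_check(source_range, target_range):
        low, high = target_range
        code.append(f"check {source} in {low}..{high}")
    code.append(f"assign {target} := {source}")
    return code


# Source known to lie in 1..10, target subtype 1..31: the check is superfluous.
print(emit_assignment("Day", "N", (1, 10), (1, 31)))
# Source may lie anywhere in 0..100: explicit check code must be generated.
print(emit_assignment("Day", "N", (0, 100), (1, 31)))
```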
ANDF (now) has the data types, in the form of SHAPEs, required to model
the Ada 95 data types: top, bottom, bitfields (bit-strings), compounds
(records), floatings, integers, nofs (one-dimensional arrays), offsets,
pointers, and procedures.
Not all issues are equally important. DDC-I had, for example, suggested
introduction of a fixed SHAPE (for fixed point types). That would have
made the job simpler for the Ada producer, but there would be no differences at
run-time. The additional cost imposed on all installers to
support SHAPE fixed could therefore not be justified.
There are, of course, trade-offs: DDC-I has chosen to let the installer handle
alignment of objects, including record components. This implies
that it is the responsibility of the installer to choose the kind of alignment
that gives the most efficient access to objects on the given target
machine. The benefit is that the producer can be simpler and that the result in
theory should be better. The downside is that there may be (yet
to be shown) cases where the installer makes a bad choice of alignment with a
resulting negative effect on the run-time performance.
Components of the ANDF Based Ada 95 Compiler System
The ANDF based Ada 95 Compiler System consists of the following major
components:
- Compiler
- Linker
- Standard libraries
The design of the compiler system is presented in [7].
As we shall see below, the components are a little more complicated than
usual because of the use of ANDF as intermediate language and
the use of third party installers. Also, the CPU time and other resources used
to run the ANDF linker come in addition to what is
normally required.
However, from a functional point of view, the user should not be able to tell a
difference between an ANDF based and a "normal", native Ada
compiler system.
The Compiler
The compiler has a front end which maps Ada 95 source to a High-Level
Intermediate Language (HLIL) which is essentially an abstract syntax
tree where all name resolution and type matching has been done and where a
number of attributes have been set to guide the code generation.
The compiler also has a back end which first maps HLIL to ANDF and then calls an
installer to map ANDF to object code for the desired target
architecture.
When a compilation unit is compiled, an ANDF capsule is generated. If the
compilation unit contains entities that may become visible to other
units, then a second capsule is generated for these entities (this corresponds
to a .h (=header) file in C).
Conversely, when the installer generates object code for a compilation unit, it
is necessary to have the ANDF context capsules available for the
Ada units mentioned (transitively) in the context clause of the compilation
unit, including any parents.
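Collecting the units mentioned transitively in a context clause amounts to a reachability computation over the "with" relation. The Python sketch below shows one way to do this; the dictionary representation and the unit names are assumptions for illustration, and parent units would simply be recorded as additional dependencies of their children.

```python
from typing import Dict, List, Set


def context_closure(unit: str, withs: Dict[str, List[str]]) -> Set[str]:
    """All units reachable from `unit` through context clauses."""
    seen: Set[str] = set()
    stack = list(withs.get(unit, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            # Follow the context clause of the unit just reached.
            stack.extend(withs.get(current, ()))
    return seen


# Hypothetical program: Main withs Parser and Text_IO, Parser withs Lexer, ...
withs = {
    "Main": ["Parser", "Text_IO"],
    "Parser": ["Lexer"],
    "Lexer": ["Text_IO"],
}
needed = context_closure("Main", withs)
```

Here the context capsules of Parser, Lexer and Text_IO would all have to be available when object code is generated for Main, even though Main names only two of them directly.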
In addition, it is necessary to have a header capsule available for describing
target entities (e.g. size of predefined type Integer). This is the
ANDF counterpart to the definitions in the predefined package Standard.
Finally, it is necessary to have header capsules available describing the APIs
used on the platform, including the Ada 95 RTS.
The ANDF context capsules are identified (in the Ada 95 program library) by the
compiler front end as part of locating the corresponding HLIL
representations of the Ada context units. The remaining ANDF capsules reside in
the compiler's system directory.
Several ANDF capsules can be combined by an ANDF linker to form a single
capsule. This means that only a single capsule is fed into the
installer.
All current installers generate assembly code which is fed directly into the
system assembler.
To summarize the steps in the compile process:
Ada compile:
    translator:
        Ada 95 compilation unit ->
            ANDF source capsule +
            ANDF header capsule (if required)
    ANDF linker:
        API headers +
        source capsule +
        context headers +
        target entity header -> installable capsule
    installer:
        installable capsule -> assembly source file
    system assembler:
        assembly source file -> object code file
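The order of the compile steps can also be read as a chain of functions, each consuming the output of the previous one. The Python sketch below only passes file names along and performs no real translation; every function name and file suffix in it is hypothetical and chosen for illustration.

```python
from typing import List, Optional, Tuple


def translate(unit: str, exports_entities: bool) -> Tuple[str, Optional[str]]:
    """Translator: compilation unit -> source capsule (+ header capsule)."""
    source = f"{unit}.src.capsule"
    header = f"{unit}.hdr.capsule" if exports_entities else None
    return source, header


def andf_link(unit: str, inputs: List[str]) -> str:
    """ANDF linker: several capsules in, a single installable capsule out."""
    if not inputs:
        raise ValueError("the ANDF linker needs at least the source capsule")
    return f"{unit}.installable.capsule"


def install(capsule: str) -> str:
    """Installer: installable capsule -> assembly source file."""
    return capsule.replace(".installable.capsule", ".s")


def assemble(assembly_file: str) -> str:
    """System assembler: assembly source file -> object code file."""
    return assembly_file[:-2] + ".o"


def compile_unit(unit: str, exports_entities: bool,
                 context_headers: List[str], api_headers: List[str],
                 target_header: str) -> Tuple[str, Optional[str]]:
    """Run the four steps in order; return the object file and header capsule."""
    source, header = translate(unit, exports_entities)
    capsule = andf_link(
        unit, [*api_headers, source, *context_headers, target_header])
    return assemble(install(capsule)), header


obj, hdr = compile_unit("stack", True, ["text_io.hdr.capsule"],
                        ["posix.hdr.capsule"], "target.hdr.capsule")
```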
The Linker
The Ada 95 linker has a front end which determines the units necessary to build
a partition, extracts the required object files from the program
library, and builds a main ANDF capsule that deals with elaboration code and
elaboration order. Object code is generated for the main capsule
and all the object files are given to the system linker which generates the
executable.
To summarize the steps in the link process:
Ada link:
    extractor:
        partition specification ->
            elaboration order +
            unit object files of partition
    ANDF main generator:
        elaboration order -> main capsule
    installer:
        main capsule -> main assembly file
    system assembler:
        main assembly file -> main.o
    system linker:
        rts.o +
        main.o +
        unit object files of partition -> executable
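The extractor step can be approximated by a reachability pass followed by a topological sort, so that every unit is elaborated after the units it depends on. The Python sketch below uses the standard graphlib module (Python 3.9 or later). It is a deliberate simplification: the real Ada 95 elaboration rules (specifications versus bodies, elaboration pragmas) are more involved, and the unit names are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def elaboration_order(main_unit: str, withs: Dict[str, List[str]]) -> List[str]:
    """Units needed for the partition, dependencies first, main unit last."""
    # Step 1: determine the units necessary to build the partition.
    needed: Set[str] = set()
    stack = [main_unit]
    while stack:
        unit = stack.pop()
        if unit not in needed:
            needed.add(unit)
            stack.extend(withs.get(unit, ()))
    # Step 2: order them so that each unit follows the units it depends on.
    graph = {unit: set(withs.get(unit, ())) for unit in needed}
    return list(TopologicalSorter(graph).static_order())


withs = {
    "Main": ["Parser", "Text_IO"],
    "Parser": ["Lexer"],
    "Lexer": ["Text_IO"],
}
order = elaboration_order("Main", withs)
```

The resulting list is what a main capsule generator would walk when emitting calls to the elaboration code of each unit.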
The Standard Libraries
The standard libraries are supported via an Ada 95 "root" program library
containing HLIL and ANDF header capsules for all compilation units
(and for package Standard) as defined in Annex A of the Ada Reference Manual
[8].
In addition, but hidden to the user, there are ANDF capsules representing the
target entities (numeric types, addresses, etc.), the system header
files and the header files of the APIs supported on the platform. The target
entity capsule is essentially derived from the predefined package
Standard. System header files and API header files are supplied (for the
platform) by the system vendor, while the corresponding capsules are
generated with a tool set accompanying the ANDF technology.
Finally, and also hidden to the user, there is the Ada 95 Run-time System which
includes the tasking kernel, heap storage manager, exception
manager, and other DDC-I specific code generator support.
The use of ANDF capsules to represent System header files and API header files
makes it simpler for DDC-I to move the Standard Libraries to
other platforms because the capsules can be made to be architecture neutral.
Current Status
At this time the development of the Ada 95 to ANDF mapping is well under way.
The environment issues, such as library management,
automatic recompilation, linking and integration of installers are working.
The compiler does not yet produce code for enough Ada constructs to
allow meaningful run-time performance measurements.
However, installers used as back ends in C compilers developed by DRA and others
have shown performance results that are fully comparable
with native C compilers. So why not be optimistic about Ada 95? On several
occasions the ANDF based system outperformed the "default"
native C compilers [5].
How to get ANDF Information and Documentation
There is now an ANDF home page located on the OSF RI World Wide Web server at
URL:
http://riwww.osf.org:8001/andf/index.html
Here you will find a brief overview of ANDF, additional information about OSF
experience with ANDF, as well as pointers to a collection of
ANDF related papers which may be either viewed on-line or printed. All of the
papers from the "ANDF Technology Collected Papers" series,
currently four volumes, are contained on the server.
For those who are not Mosaic users, the ANDF papers may also be obtained from
the OSF RI anonymous FTP server:
ftp://riftp.osf.org/pub/andf_coll_papers
Requests for ANDF information may be sent to:
andf-tech-request@osf.org.
There is also an ANDF Frequently Asked Questions document [9].
References
[1] Edwards, Peter. Foster, Michael. Currie, Ian. "TDF Specification 4.0",
Defence Research Agency, Malvern, UK. 1995.
[2] Toft, Jens-Ulrik. "Formal specification of ANDF semantics", ESPRIT Project
6062 OMI/GLUE, DDC-I, 1995.
[3] Mosses, Peter. "Action Semantics", Number 26 in Cambridge Tracts in
Theoretical Computer Science. Cambridge University Press, 1992.
[4] Andrews, Robert. "TDF and Portability", Defence Research Agency, Malvern, UK. 1994.
[5] DRA. "TDF Facts and Figures", Defence Research Agency, Malvern, UK. 1995.
[6] Møller, Peter. "Run-Time Check Elimination for Ada 95", Proceedings of the
TRI-Ada '94 Conference, Baltimore, ACM, 1994.
[7] Bundgaard, Jørgen. "The Design of an Ada 95 Compilation Environment",
Proceedings of the Fourth "Ada in Aerospace" Symposium in Brussels, 1993.
[8] ISO/IEC 8652:1995(E) "Ada Reference Manual", 1995
[9] Peeling, N.E. et al. "Frequently Asked Questions about ANDF", Defence
Research Agency, Malvern, UK. 1993.