Gela - An Ada Compiler

The goal of Gela project is creation of portable Ada compiler. This document contains project overview, user guide and some implementation details which could be interesting for deep sight.

Content

  1. Gela - An Ada Compiler
    1. Project Overview
    2. Standard ASIS
    3. Structure of the compiler
    4. Architecture neutral compilation
    5. Gela ASIS
      1. Installation
      2. Usage
    6. Internal details
      1. Scanner
      2. Parser
      3. Semantic analysis
      4. Run Time Library (RTL)
    7. Conclusion

Project Overview

Goal of Gela project is to create an Ada compiler. Ada was originally designed with three concerns: program reliability and maintenance, programming as a human activity, and efficiency. Our intention is to follow these concerns in compiler design.

In our project we evolve follow ideas:

  • Compile to machine-independent format and transform it to target architecture by compiler independent tool possible in different time and node.
  • Using ASIS as intermediate representation of parsing tree.
  • Using XML-based methods to build some compiler parts.

Ada has strict precision standard - ISO 8652. The standard specifies the form and meaning of Ada program, environment, etc. The strict standard makes compiler creation more easier.

The other standard we based our compiler on is ASIS (Ada Semantic Interface Specification). ASIS is an open and published callable interface which gives access to semantic and syntactic information about Ada compilation units. We use ASIS as foundation of our compiler. For example, code generator will communicate with frontend over ASIS interface.

Generation of target machine code isn't in our current goals. This is a tedious job and we would like to avoid it now. Happily, there are some open source projects that provide such functionality. One of them, namely TenDRA contains "installers" - converters that take program in ANDF (Architecture Neutral Distribution Format) and produce native code for various platforms.

Standard ASIS

Ada Semantic Interface Specification, ASIS is an interface between an Ada environment and any tool requiring lexical, syntax and semantic information from it. For instance translators, source navigators, style checkers, source formatters, documentation generators, reverse engineering tools, metrics calculator, test case generators, correction proofer, etc. ASIS has been designed to be independent of underlying Ada environment implementations, thus supporting portability of software engineering tools while relieving tool developers from needing to understand the complexities of an Ada environment's proprietary internal representation.

Structure of the compiler

There are three parts in our compiler: Gela ASIS, Ada checker and code generator. Gela ASIS includes scanner, parser AST (Abstract Syntax Tree) builder, overload resolution and calculation of semantic attributes. This part catches a few possible errors that make AST building impossible. Second part, Ada checker, validate program checking violation of all other Ada rules. After validation code generator starts its job. Code generator and Ada checker get access to program over ASIS interface.

Now we are working on Gela ASIS. Our next step is code generator integration. After then we will be able to compile correct Ada program. Ada checker will be implemented next

In Ada the same identifier can denote different entities in different place. After AST building compiler uses Visibility Rules to resolve names. This step is name resolution.

Some Ada constructions have the same syntax. For instance, function call, taking array element, type convertion and slice are undistinguished without context. Resolve such ambiguity them compiler use Overload Resolution Rules. Overload resolution and name resolution phases are part of Gela ASIS.

Architecture neutral compilation

Ada program could be very portable. Let's see how this feature could help in architecture neutral compilation.

Typical portable program defines its own data types and uses them to operate on data. This types are mapped to different low level types on different architectures. In TenDRA case result of compilation is ANDF capsule where data types expressed in requirements of range and precision of its values. External tool, installer, reads ANDF, substitutes data types with suitable low level types, do optimisation and produce target code. Predefined data types are processed in the same way.

But not any Ada program could be translated to architecture neutral format. Ada standard requires exact calculation of static expressions at compile time. Installer could do it because it has enough information of target types. But in some cases frontend need value of static expressions to overload resolution. For example:

Example 1. Target dependent program

declare
   type Int_1 is range 1 .. 10;
   type Int_2 is range 1 .. 20;
   procedure F (X : Int_1);
   procedure F (X : Int_2);

   Arr : array (Int_1, Int_2) of Boolean;
begin
   for K in Arr'Range (Integer'Size / 16) loop
      F (K);
   end loop;
end;

If size of Integer is unknown to frontend (because it platform depended) there is ambiguity in type of K. So it is impossible to resolve name of function F and this program can't be translated in architecture neutral format.

There are only two kind of context where frontend need value of static expression: as argument of 'Range() attribute and as discriminant value of record aggregate. Here is example of aggregate:

Example 2 - Target dependent aggregate

declare
   type Rec (Value : Integer) is record
      case Value is
         when 1 =>
            X : Int_1;
         when others =>
            Y : Int_2;
      end case;
   end record;

   Var : Rec := (Integer'Size / 16, 1);

We presume such usage of target depended static expression are very rare and we won't translate them.

Gela ASIS

For now Gela ASIS is available with follow restrictions:

  • It accepts only correct Ada program. Result of processing incorrect code is unpredictable.
  • Static expression evaluation is not implemented. So overload resolution may fail if depend on static expression value. Function Representation_Value_Image doesn't work.
  • Package Asis.Text is partial implemented. Only function works is Element_Span.
  • Function Discriminant_Associations (Normalized => True) is not implemented.
  • Function Record_Component_Associations (Normalized => True) is not implemented.
  • Function Is_Dispatching_Call is not implemented.

Installation

TODO. Gela ASIS library contains a lot of auto-generated files, so it compilation takes over 20 minutes on Athlon 2.5GHz.

Usage

Set of accessible through ASIS Ada units make up Context. Gela ASIS Context contains one or more user provided units and all units them semantically depend on. Opening Context (through Asis.Ada_Environments.Associate) user provide file name as parameter. This file contains one or more units. Gela ASIS search for others units to satisfy semantic dependencies. It uses file naming follow rules.

File name is equal to unit name in lower case where symbol '.' replaced by '-'. Unit specification has extension '.ads' and unit body - '.adb'. Gela ASIS looks for this files in current directory and predefined library directory. Last locates in directory of executable and named "lib".

We provide predefined unit specifications from core standard in directory "rtl".

One can provide follow options to Asis.Ada_Environments.Associate besides file name to open:

  • -I/search/path - zero or more option to populate search path. Initial value of search path contains sinlge directory pointed by GELA_INCLUDE_PATH environment variable.
  • -A:assertion_expression - zero or more expression to debug internal execution.

Internal details

Scanner

There is full syntax of Ada in Annex P "Syntax Summary" of RM. We converted it to XML format by a mechanical transformation. XML format preserves all information from Annex P, but more easy for process. We can extract needed information with XSLT style. Here is syntax in HTML as example. We generate alfex input in the same way. Resulted scanner accepts Ada program in Latin_1 encoding. Parser

We use a type hierarchy to represent nodes of parsing tree. Types from this hierarchy correspond to ASIS element kinds. Parent types collect common attributes of children. Hierarchy is stored in its-own XML file.

Parser

We use a type hierarchy to represent nodes of parsing tree. Types from this hierarchy correspond to ASIS element kinds. Parent types collect common attributes of children. Hierarchy is stored in its-own XML file.

Figure 1. Base element of hierarchy Base_Element_Node and its children.

Figure 1 - Base element of hierarchy Base_Element_Node and its children.

We could build parser using the same XML as for building scanner. But syntax from Annex P is not in LR(1) form and couldn't processed by parser generator ayacc. Indeed, X (Y) may be

  • a call of function X with argument Y
  • element Y of array X
  • type conversion Y to X
  • slice of array X if Y is discrete type

We fix syntax from Annex P to bring it to form suitable for yacc. We record fixes in file fix.xml. Here is its HTML version of fix.xml . Here is complete fixed Ada syntax. We compensate this corrections in a pass over parsing tree.

There should be action descriptions in yacc file to build parsing tree. Because names of ASIS elements are identical to names of syntax element generation of such actions are usually trivial. For instance, rule "selected_component ::= prefix.selector_name" need action like this:

Example 3. Element creation action

selected_component :
     prefix dot selector_name
{
   declare
      New_Node : constant Selected_Component_Ptr :=
        New_Selected_Component (The_Context);
   begin
      $$ := YYSTYPE (New_Node);
      Set_Start_Position (New_Node.all, Start_Position ($1.all));
      Set_Prefix (New_Node.all, $1);
      Set_Selector (New_Node.all, $3);
      Set_End_Position (New_Node.all, End_Position ($3.all));
   end;
}

In some case we need more complex action. These cases are described in file code.xml. An Ada program translate this file together with fixed syntax file in yacc input file.

Semantic analysis

After parser build parsing tree it we can use procedure Iterate to walk over tree.

First pass over parsing tree compensates simplifications needed to bring syntax to LR(1) form. Procedure Normalize does such compensations.

Next pass (Asis.Gela.Resolver) builds declarative regions, names resolutions, introduce implicit declarations and search for complete overload resolutions contexts. Overload resolution procedure takes two passes over complete context. First pass down-to-up collects possible interpretations. Second one modifies parsing tree according to chosen interpretation.

Next pass named Polish runs after any parser tree modifications. It collects semantic attributes which wasn't calculated before.

Run Time Library (RTL)

Conclusion

Our compiler has plain architecture based on well known ASIS standard. It's highly portable. These its features makes it very attractive to develop in research and production. We hope it will evolve into full functional Ada compiler.

Attachments