$TenDRA: book.xml 2447 2006-03-23 21:15:51Z verm $
Copyright © 2002, 2003, 2004, 2005, 2006 TenDRA Documentation Team
Copyright © 1997, 1998 Defence Evaluation and Research Agency (DERA)
Extensions to this document from the original TenDRA-4.1.2-doc.tar.gz source distribution are covered by the BSDL, while all prior modifications remain under the Crown Copyright.
Berkeley Systems Design License
Redistribution and use in source (SGML DocBook) and 'compiled' forms (SGML, HTML, PDF, PostScript, RTF and so forth) with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code (SGML DocBook) must retain the above copyright notice, this list of conditions and the following disclaimer as the first lines of this file unmodified.
Redistributions in compiled form (transformed to other DTDs, converted to PDF, PostScript, RTF and other formats) must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS DOCUMENTATION IS PROVIDED BY THE TENDRA DOCUMENTATION TEAM "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE TENDRA DOCUMENTATION TEAM BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Crown Copyright (c) 1997, 1998
This TenDRA(r) Computer Program is subject to Copyright owned by the United Kingdom Secretary of State for Defence acting through the Defence Evaluation and Research Agency (DERA). It is made available to Recipients with a royalty-free licence for its use, reproduction, transfer to other parties and amendment for any purpose not excluding product development provided that any such use et cetera shall be deemed to be acceptance of the following conditions:
Its Recipients shall ensure that this Notice is reproduced upon any copies or amended versions of it;
Any amended version of it shall be clearly marked to show both the nature of and the organisation responsible for the relevant amendment or amendments;
Its onward transfer from a recipient to another party shall be deemed to be that party's acceptance of these conditions;
DERA gives no warranty or assurance as to its quality or suitability for any purpose and DERA accepts no liability whatsoever in relation to any use to which it may be put.
This document was generated on 2006-09-07 15:12:56
Abstract
Please email us at <docs@ten15.org> if you see any errors
Table of Contents
As explained in reference 1, TDF may be regarded as an abstract target machine which can be used to facilitate the separation of target independent and target dependent code which characterises portable programs. An important aspect of this separation is the Application Programming Interface, or API, of the program. Just as, for a conventional machine, the API needs to be implemented on that machine before the program can be ported to it, so for that program to be ported to the abstract TDF machine, an "abstract implementation" of the API needs to be provided.
But of course, an "abstract implementation" is precisely what is provided by the API specification - it is an abstraction of all the possible API implementations. Therefore the TDF representation of an API must reflect the API specification. As a consequence, compiling a program to the abstract TDF machine is to check it against the API specification rather than, as with compiling to a conventional machine, against at best a particular implementation of that API.
In this document we address the problem of how to translate a standard API
specification into its TDF representation, by describing a tool, tspec, which has been developed for this purpose.
The low level form which is used to represent APIs to the C to TDF producer
is the #pragma token syntax described in reference
3. However this is not a suitable form in which to describe API specifications.
The #pragma token syntax is necessarily complex,
and can only be checked through extensive testing using the producer. Instead
an alternative form, close to C, has been developed for this purpose. API
specifications in this form are transformed by tspec into the corresponding #pragma
token statements, while it applies various internal checks to the API
description.
Another reason for introducing tspec is that
the #pragma token syntax is currently limited in
some areas. For example, at present it has very limited support for expressing
constancy of expressions. By allowing the tspec
syntax to express this information, the API description will contain all the
information which may be needed in future upgrades to the #pragma token syntax. Thus describing an API using tspec should be a one-off process, whereas describing
it directly in the #pragma token syntax could
require periodic reworking. Improvements in the #pragma
token syntax will be reflected in the translations produced by future
versions of tspec.
The tspec syntax is not designed to be a formal
specification language. Instead it is a pragmatic attempt to capture the common
specification idioms of standard API specifications. A glance at these
specifications shows that they are predominantly C based, but with an added
layer of abstraction - instead of saying that t is
a specific C type, they say, there exists a type t, and so on. The tspec syntax is
designed to reflect this.
Let us begin by examining the various levels of specification with which
tspec is concerned. At the lowest level it is
concerned with objects - the types, expressions, constants etc. which comprise
the API - and indeed most of this document is concerned with how tspec describes these objects. At the highest level, tspec is concerned with APIs. We could just describe an
API as being a set of objects, but this would ignore the internal structure
of APIs.
At the most obvious level the objects in an API are spread over a number of
different system headers. For example, in ANSI, the objects concerned with file
input and output are grouped in stdio.h, whereas
those concerned with string manipulation are in string.h. But a further level of refinement is also required.
For example, ANSI specifies that the type size_t
is defined in both stdio.h and string.h. Therefore tspec needs
to be able to represent subsets of headers in order to express this
intersection relation.
To conclude, tspec distinguishes four levels of
specification - APIs (which are sets of headers), headers (which are sets of
objects), subsets of headers, and objects. It identifies APIs by an identifying
name chosen by the person performing the API description. The (purely
arbitrary) convention is for short, lower case names, for example:
ansi refers to ANSI C (X3.159),
posix refers to POSIX 1003.1,
xpg3 refers to X/Open Portability Guide 3.
In this document, headers are identified by the API they belong to and the
header name. Thus ansi:stdio.h refers to the stdio.h header of the ANSI API. Finally subsets of
headers are identified by the header and the subset name. If, for example, the
stdio.h header of ANSI has a subset named file, then this is referred to as ansi:stdio.h:file.
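The colon notation used in this document can be modelled as a small record. The following is only an illustrative sketch: `SetName` and `parse_set_name` are hypothetical names, and the notation is this document's naming convention rather than a documented tspec input format.

```python
from typing import NamedTuple, Optional


class SetName(NamedTuple):
    """One of the three addressable levels: API, header, or header subset."""
    api: str
    header: Optional[str] = None
    subset: Optional[str] = None


def parse_set_name(text: str) -> SetName:
    """Split 'api', 'api:header' or 'api:header:subset' into its parts."""
    parts = text.split(":")
    # Between one and three non-empty components are allowed.
    if not 1 <= len(parts) <= 3 or not all(parts):
        raise ValueError(f"malformed set name: {text!r}")
    return SetName(*parts)
```

For instance, `parse_set_name("ansi:stdio.h:file")` yields the API `ansi`, the header `stdio.h` and the subset `file`, while `parse_set_name("ansi")` leaves the header and subset unset.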
The tspec representation of an API is arranged
as a directory with the same name as the API, containing a number of files, one
for each API header. For example, the ANSI API is represented by a directory
ansi containing files ansi/stdio.h, ansi/string.h etc.
In addition each API directory contains a master file (for ANSI it would be
called ansi/MASTER) which lists all the headers
comprising that API.
When tspec needs to find an API directory it
does so by searching along its input directory path. This is a colon separated
list of directories to be searched. This may be specified in a number of ways.
A default search list is built into tspec, however
this may be overridden by the system variable TSPEC_INPUT. Directories may be added to the start of the path
using the -I dir command-line option (see Section 1.5, “Command-line
Options” for a complete list of options). The current working
directory is always added to the start of the path.
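The search order described above might be modelled as follows. This is a sketch under stated assumptions, not tspec's actual code: the function names are hypothetical, the built-in default list is passed in as a parameter because its real value is not given here, and the relative order of the current directory and the -I directories is an assumption.

```python
import os


def input_search_path(default_path, include_dirs=(), environ=None, cwd=None):
    """Return the list of directories to search for an API directory.

    default_path -- colon-separated built-in list (real value not known here)
    include_dirs -- directories given with -I (assumed to keep their order)
    """
    environ = os.environ if environ is None else environ
    cwd = os.getcwd() if cwd is None else cwd
    # TSPEC_INPUT, when set, replaces the built-in default list.
    base = environ.get("TSPEC_INPUT", default_path)
    base_dirs = [d for d in base.split(":") if d]
    # Assumption: current directory first, then the -I directories.
    return [cwd, *include_dirs, *base_dirs]


def find_api_dir(api, search_path):
    """Return the first directory named `api` found along the path, or None."""
    for directory in search_path:
        candidate = os.path.join(directory, api)
        if os.path.isdir(candidate):
            return candidate
    return None
```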
tspec actually outputs two sets of output
files, the include output files, containing the #pragma
token directives corresponding to the input API, and the source output
files, which provide a rig for TDF library building (see Section 5.4, “TDF Library
Building”). These output files and directories are built up under two
standard output directories - the include output directory, incl_dir say, and the source output directory, src_dir say. tspec has default values for these directories built in, but
these may be overridden in a number of ways. Firstly, if the system variable
TSPEC_OUTPUT is defined to be dir, say, then incl_dir is dir/include and src_dir is dir/src. Secondly,
incl_dir and src_dir can be set independently using the system
variables TSPEC_INCL_OUTPUT and TSPEC_SRC_OUTPUT respectively. Finally, they may also be set
using the -Odir and -Sdir command-line options respectively.
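One way to picture how these settings combine is the sketch below. It assumes that each mechanism overrides the ones mentioned before it (defaults, then TSPEC_OUTPUT, then the two specific variables, then the command-line options); the text lists them in that order but does not state the precedence explicitly. The function and parameter names are hypothetical.

```python
import os


def output_dirs(default_incl, default_src, environ=None,
                include_opt=None, source_opt=None):
    """Return (incl_dir, src_dir) under the assumed precedence order.

    include_opt / source_opt stand for the directories given with -O and -S.
    """
    environ = os.environ if environ is None else environ
    incl_dir, src_dir = default_incl, default_src

    # TSPEC_OUTPUT sets both directories relative to one root.
    root = environ.get("TSPEC_OUTPUT")
    if root:
        incl_dir, src_dir = root + "/include", root + "/src"

    # The specific variables set each directory independently.
    incl_dir = environ.get("TSPEC_INCL_OUTPUT", incl_dir)
    src_dir = environ.get("TSPEC_SRC_OUTPUT", src_dir)

    # Command-line options are assumed to win over the environment.
    if include_opt:
        incl_dir = include_opt
    if source_opt:
        src_dir = source_opt
    return incl_dir, src_dir
```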
As an example of the mapping from input files to output files, the header
ansi:stdio.h is mapped to the include output file
incl_dir/ansi.api/stdio.h and the source output file src_dir/ansi.api/stdio.c. The header subset ansi:stdio.h:file is mapped to its own pair of output files,
incl_dir/shared/ansi.api/file.h and src_dir/ansi.api/file.c.
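The two mappings given in this example can be written down as a small function. This is a sketch that reproduces only those two cases; how tspec names the output for headers in subdirectories (such as sys/stat.h) is not covered by the text, so the behaviour for such inputs is an assumption. `output_files` is a hypothetical name.

```python
import posixpath


def output_files(incl_dir, src_dir, api, header, subset=None):
    """Return (include output file, source output file) for a header or subset."""
    api_dir = api + ".api"
    if subset is None:
        # Whole header: same name under incl_dir, ".c" counterpart under src_dir.
        stem = posixpath.splitext(header)[0]
        incl = posixpath.join(incl_dir, api_dir, header)
        src = posixpath.join(src_dir, api_dir, stem + ".c")
    else:
        # Subset: named after the subset; include file goes under "shared".
        incl = posixpath.join(incl_dir, "shared", api_dir, subset + ".h")
        src = posixpath.join(src_dir, api_dir, subset + ".c")
    return incl, src
```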
The default output file names can be overridden by means of the INCLNAME and SOURCENAME file
properties described in Section 4.4, “File
Properties”.
By default, tspec only creates an output file
if the date stamps on all the input files it depends on indicate that it needs
updating. In effect, tspec creates an internal
makefile from the dependencies it deduces. This behaviour can be overridden by
means of the -f command-line option, which forces
all output files to be created.
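The date-stamp test is the familiar make-style comparison. A minimal sketch, assuming modification times are what is compared and that a missing output file always counts as out of date; `needs_update` is a hypothetical name and `force` stands for the effect of the -f option:

```python
import os


def needs_update(output_file, input_files, force=False):
    """Return True if output_file should be (re)created.

    input_files are the input files the output depends on.
    """
    if force or not os.path.exists(output_file):
        return True
    out_time = os.path.getmtime(output_file)
    # Rebuild if any dependency is newer than the existing output.
    return any(os.path.getmtime(f) > out_time for f in input_files)
```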
In addition, tspec only creates the source
output file if it is needed for TDF library building. If the corresponding
include output file does not contain any token specifications then the source
output file is suppressed (see Section 5.4, “TDF Library
Building”).
tspec will optionally add a copyright message
to the start of each include output file. This message is copied from a file
which may be specified either using the TSPEC_COPYRIGHT system variable, or by the -Cfile command-line
option.
The basic form of the tspec description of an
API has already been explained in Section 1.2, “Input Layout” - it
is a directory containing a set of files corresponding to the headers in that
API. Each file basically consists of a list of the objects declared in that
header. Each object specification is part of a tspec construct. These constructs are identified by keywords.
These keywords always begin with + to avoid
conflict with C identifiers. Comments may be inserted at any point. These are
prefixed by # and run to the end of the line.
In addition to the basic object specification constructs, tspec also has constructs for imposing structure on the API
description. It is these constructs that we consider first.
A list of tspec constructs within a header can
be grouped into a named subset by enclosing them within:
+SUBSET "name" := {
....
} ;
where name is the subset name. These named
subsets can be nested, but are still regarded as subsets of the parent
header.
Subsets are intended to give a layer of resolution beyond that of the entire header (see Section 1.1, “Specification Levels”). Each subset is mapped onto a separate pair of output files, so unwary use of subsets is discouraged.
tspec has two import constructs which allow one
API, or header, or subset of a header to be included in another. The first
construct is used to indicate that the given set of objects is also declared in
the including header, and takes one of the forms:
+IMPLEMENT "api" ;
+IMPLEMENT "api", "header" ;
+IMPLEMENT "api", "header", "subset" ;
The second construct is used to indicate that the objects are only used in the including header, and takes one of the forms:
+USE "api" ;
+USE "api", "header" ;
+USE "api", "header", "subset" ;
For example, posix:stdio.h is an extension of
ansi:stdio.h, so, rather than duplicate all the
object specifications from the latter in the former, it is easier and clearer
to use the construct:
+IMPLEMENT "ansi", "stdio.h" ;
and just add the extra objects specified by POSIX. Note that this makes the
relationship between the APIs ansi and posix absolutely explicit. tspec is as much concerned with the relationships between APIs
as their actual contents.
Objects which are specified as being declared in more than one header of an
API should also be treated using +IMPLEMENT. For
example, the type size_t is declared in a number
of ansi headers, namely stddef.h, stdio.h, string.h and time.h. This can be
handled by declaring size_t as part of a named
subset of, say, ansi:stddef.h:
+SUBSET "size_t" := {
+TYPE (unsigned) size_t ;
} ;
and including this in each of the other headers:
+IMPLEMENT "ansi", "stddef.h", "size_t" ;
Another use of +IMPLEMENT is in the MASTER file used to list the headers in an API (see Section 1.2,
“Input Layout”). This basically consists of a list of +IMPLEMENT commands, one per header. For example, with
ansi it consists of:
+IMPLEMENT "ansi", "assert.h" ;
+IMPLEMENT "ansi", "ctype.h" ;
....
+IMPLEMENT "ansi", "time.h" ;
To illustrate +USE, posix:sys/stat.h uses some types from posix:sys/types.h but does not define them. To avoid the user
having to include both headers it makes sense for the description to include
the latter in the former (provided there are no namespace restrictions imposed
by the API). This would be done using the construct:
+USE "posix", "sys/types.h" ;
On the command-line tspec is given one set of
objects, be it an API, a header, or a subset of a header. This causes it to
read that set, which may contain +IMPLEMENT or
+USE commands. It then reads the sets indicated by
these commands, which again may contain +IMPLEMENT
or +USE commands, and so on. It is possible for
this process to lead to infinite cycles, but in this case tspec raises an error and aborts. In the legal case, the
collection of sets read by tspec is the closure of
the set given on the command-line under +IMPLEMENT
and +USE. Some of these sets will be implemented -
that is to say, connected to the top level by a chain of +IMPLEMENT commands - others will merely be used. By default
tspec produces output for all these sets, but
specifying the -r command-line option restricts it
to the implemented sets.
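The closure computation and the implemented/used distinction can be illustrated with a small graph walk. This is a sketch of the idea only, not tspec's algorithm: the `commands` table is a hypothetical stand-in for the +IMPLEMENT and +USE commands found in each set, and the set names use this document's colon notation.

```python
def collect_sets(top, commands):
    """Return (implemented, used_only) for the set named on the command line.

    commands maps a set name to a list of (kind, target) pairs, where kind
    is "IMPLEMENT" or "USE".
    """
    implemented, used = set(), set()
    active = []  # chain of sets currently being read, for cycle detection

    def read(name, is_implemented):
        if name in active:
            cycle = " -> ".join(active[active.index(name):] + [name])
            raise ValueError("cyclic inclusion: " + cycle)
        # Skip sets already read with at least this status.
        if name in implemented or (name in used and not is_implemented):
            return
        (implemented if is_implemented else used).add(name)
        active.append(name)
        for kind, target in commands.get(name, []):
            # A set stays "implemented" only along an unbroken +IMPLEMENT chain.
            read(target, is_implemented and kind == "IMPLEMENT")
        active.pop()

    read(top, True)
    return implemented, used - implemented
```

With -r, only the first of the two returned collections would produce output; by default both would.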
For further information on the +IMPLEMENT and
+USE commands see Section 5.1, “Fine
Control of Included Files”.
The main body of any tspec description of an
API consists of a list of object specifications. Most of this section is
concerned with the various tspec constructs for
specifying objects of various kinds, however we start with a few remarks on
object names.
All objects specified using tspec actually have
two names. The first is the internal name by which it is identified within the
program, the second is the external name by which the TDF construct (actually a
token) representing this object is referred to for the purposes of TDF linking.
The internal names are normal C identifiers and obey the normal C namespace
rules (indeed one of the roles of tspec is to keep
track of these namespaces). The external token name is constructed by tspec from the internal name.
tspec has two strategies for making up these
token names. The first, which is default, is to use the internal name as the
external name (there is an exception to this simple rule, namely field
selectors - see Section 3.9,
“+FIELD”). The second, which is preferred for standard APIs, is
to construct a "unique name" from the API name, the header and the internal
name. For example, under the first strategy, the external name of the type
FILE specified in ansi:stdio.h would be FILE,
whereas under the second it would be ansi.stdio.FILE. The unique name strategy may be specified by
passing the -u command-line option to tspec (see Section 1.5, “Command-line
Options”) or by setting the UNIQUE
property to 1 (see Section 4.4, “File
Properties”).
Both strategies involve flattening the several C namespaces into the single
TDF token namespace, which can lead to clashes. For example, in posix:sys/stat.h both a structure, struct stat, and a procedure, stat, are specified. In C the two uses of stat are in different namespaces and so present no difficulty,
however they are mapped onto the same name in the TDF token namespace. To work
round such difficulties, tspec allows an
alternative external form to be specified. When the object is specified the
form:
iname | ename
may be used to specify the internal name iname
and the external name ename.
For example, in the stat case above we could
distinguish between the two uses as follows:
+TYPE struct stat | struct_stat ;
+FUNC int stat ( const char *, struct stat * ) ;
With simple token names the token corresponding to the structure would be
called struct_stat, whereas that corresponding to
the procedure would still be stat. With unique
token names the names would be posix.stat.struct_stat and posix.stat.stat respectively.
Very occasionally it may be necessary to precisely specify an external token name. This can be done using the form:
iname | "ename"
which makes the object iname have external name
ename regardless of the naming strategy used.
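The naming rules described in this section can be summarised in a short sketch. It is an illustration rather than tspec's implementation: `external_name` is a hypothetical helper, and deriving the header component by dropping the directory and the `.h` suffix is inferred from the examples in this section (ansi.stdio.FILE, posix.stat.stat), not from a stated rule. Field selectors, which are named differently, are not handled.

```python
import posixpath


def external_name(api, header, spec, unique=False):
    """Compute the external token name for 'iname', 'iname | ename'
    or 'iname | "ename"'."""
    iname, sep, ename = (part.strip() for part in spec.partition("|"))
    if not sep:
        # No alternative given: the external form is the internal name.
        ename = iname
    elif len(ename) >= 2 and ename[0] == ename[-1] == '"':
        # Quoted form: exact token name, regardless of naming strategy.
        return ename[1:-1]
    if not unique:
        return ename
    # Inferred from the examples: header basename without its extension.
    stem = posixpath.splitext(posixpath.basename(header))[0]
    return f"{api}.{stem}.{ename}"
```

For example, `external_name("posix", "sys/stat.h", "struct stat | struct_stat", unique=True)` gives `posix.stat.struct_stat`, matching the names quoted above.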
Basically the legal identifiers in tspec (for
both internal and external names) are the same as those in C - strings of upper
and lower case letters, decimal digits or underscores, which do not begin with
a decimal digit. However there is a second class of local identifiers - those
consisting of a tilde followed by any number of letters, digits or underscores
- which are intended to indicate objects which are local to the API description
and should not be visible to any application using the API. For example, to
express the specification that t is a pointer
type, we could say that there is a locally named type to which t is a pointer:
+TYPE ~t ;
+TYPEDEF ~t *t ;
Finally it is possible to cheat the tspec
namespaces. It may actually be legal to have two objects of the same name in an
API - they may lie in different branches of a conditional compilation, or not
be allowed to coexist. To allow for this, tspec
allows version numbers, consisting of a decimal point plus a number of
digits, to be appended to an identifier name when it is first introduced. These
version numbers are purely to tell tspec that this
version of the object is different from a previous version with a different
version number (or indeed without any version number). If more than one version
of an object is specified then which version is retrieved by tspec in any look-up operation is undefined.
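The identifier rules above can be captured with a regular expression. The sketch below is an approximation under stated assumptions: it accepts a bare tilde as a local identifier (the text says "any number" of following characters), and the example name `errno.1` in the usage note is invented for illustration. `classify` is a hypothetical helper.

```python
import re

_NORMAL = r"[A-Za-z_][A-Za-z0-9_]*"   # ordinary C-style identifier
_LOCAL = r"~[A-Za-z0-9_]*"            # local identifier: tilde prefix
_IDENT = re.compile(rf"(?P<name>{_NORMAL}|{_LOCAL})(?:\.(?P<version>[0-9]+))?")


def classify(text):
    """Return a description of the identifier, or None if it is not legal."""
    match = _IDENT.fullmatch(text)
    if match is None:
        return None
    name = match.group("name")
    return {
        "name": name,
        "local": name.startswith("~"),
        "version": match.group("version"),  # digits after the point, or None
    }
```

Here `classify("~t")` is reported as local, `classify("errno.1")` carries the version `"1"`, and `classify("1abc")` is rejected.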
The simplest form of object to specify is a procedure. This is done by means of:
+FUNC prototype ;
where prototype is the full C prototype of the
procedure being declared. For example, ansi:string.h contains:
+FUNC char *strcpy ( char *, const char * ) ;
+FUNC int strcmp ( const char *, const char * ) ;
+FUNC size_t strlen ( const char * ) ;
Strictly speaking, +FUNC means that the
procedure may be implemented by a macro, but that there is an underlying
library function with the same effect. The exception is for procedures which
take a variable number of arguments, such as:
+FUNC int fprintf ( FILE *, const char *, ... ) ;
which cannot be implemented by macros. Occasionally it may be necessary to specify that a procedure is only a library function, and cannot be implemented by a macro. In this case the form:
+FUNC (extern) prototype ;
should be used. Thus:
+FUNC (extern) char *strcpy ( char *, const char * ) ;
would mean that strcpy was only a library
function and not a macro.
Increasingly, standard APIs are using prototypes to express their procedures.
However it still may be necessary on occasion to specify procedures declared
using old style declarations. In most cases these can be easily transcribed
into prototype declarations, however things are not always that simple. For
example, xpg3:stdlib.h declares malloc by the old style declaration:
void *malloc ( sz )
size_t sz ;
which is in general different from the prototype:
void *malloc ( size_t ) ;
In the first case the argument is passed as the integral promotion of size_t, whereas in the second it is passed as a size_t. In general we only know that size_t is an unsigned integral type, so we cannot assert that
it is its own integral promotion. One possible solution would be to use the C
to TDF producer's weak prototypes (see reference 3). The form:
+FUNC (weak) void *malloc ( size_t ) ;
means that malloc is a library function
returning void * which is declared using an old
style declaration with a single argument of type size_t. (For an alternative approach see Section 3.8, “+TYPEDEF”.)
Expressions correspond to constants, identities and variables. They are specified by:
+EXP type exp1, ..., expn ;
where type is the base type of the expressions
expi as in a normal C declaration list. For
example, in ansi:stdio.h:
+EXP FILE *stdin, *stdout, *stderr ;
specifies three expressions of type FILE *.
By default all expressions are rvalues, that is, values which cannot be
assigned to. If an lvalue (assignable) expression is required its type should
be qualified using the keyword lvalue. This is an
extension to the C type syntax which is used in a similar fashion to const. For example, ansi:errno.h says that errno is
an assignable lvalue of type int. This is
expressed as follows:
+EXP lvalue int errno ;
On the other hand, posix:errno.h states that
errno is an external value of type int. As with procedures the (extern) qualifier may be used to express this as:
+EXP (extern) int errno ;
This automatically means that errno is an
lvalue, so the lvalue qualifier is optional in
this case.
If all the expressions are guaranteed to be literal constants then one of the equivalent forms:
+EXP (const) type exp1, ..., expn ;
+CONST type exp1, ..., expn ;
should be used. For example, in ansi:errno.h we
have:
+CONST int EDOM, ERANGE ;
The +MACRO construct is similar in form to the
+FUNC construct, except that it means that only a
macro exists, and no underlying library function. For example, in xpg3:ctype.h we have:
+MACRO int _toupper ( int ) ;
+MACRO int _tolower ( int ) ;
since these are explicitly stated to be macros and not functions. Of course
the (extern) qualifier cannot be used with +MACRO.
One thing which macros can do which functions cannot is to return assignable
values or to assign to their arguments. Thus it is legitimate for +MACRO constructs to have their return type or argument types
qualified by lvalue, whereas this is not allowed
for +FUNC constructs. For example, in svid3:curses.h, a macro getyx is
specified which takes a pointer to a window and two integer variables and
assigns the cursor position of the window to those variables. This may be
expressed by:
+MACRO void getyx ( WINDOW *win, lvalue int y, lvalue int x ) ;
The +STATEMENT construct is very similar to the
+MACRO construct except that, instead of being a C
expression, it is a C statement (i.e. something ending in a semicolon). As such
it does not have a return type and so takes one of the forms:
+STATEMENT stmt ;
+STATEMENT stmt ( arg1, ..., argn ) ;
depending on whether or not it takes any arguments. (A +MACRO without any arguments is an +EXP, so the no argument form does not exist for +MACRO.) As with +MACRO, the
argument types argi can be qualified using lvalue.
It is possible to insert macro definitions directly into tspec using the +DEFINE
construct. This has two forms depending on whether the macro has arguments:
+DEFINE name %% text %% ;
+DEFINE name ( arg1, ..., argn ) %% text %% ;
These translate directly into:
#define name text
#define name( arg1, ..., argn ) text
The macro definition, text, consists of any
string of characters delimited by double percents. If text is a simple number or a single identifier then the double
percents may be omitted. Thus in ansi:stddef.h we
have:
+DEFINE NULL 0 ;
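The translation from +DEFINE to #define is mechanical enough to sketch. The following is a simplified model, not tspec's parser: it assumes the whole construct is available as one string, that the quoted text contains no double percents of its own, and the `max` macro in the usage note is an invented example.

```python
import re

_DEFINE = re.compile(
    r"\+DEFINE\s+(?P<name>\w+)\b\s*"
    r"(?:\((?P<args>[^)]*)\)\s*)?"                   # optional argument list
    r"(?:%%(?P<quoted>.*?)%%|\b(?P<simple>\w+))\s*;",  # quoted or simple text
    re.DOTALL,
)


def translate_define(construct):
    """Translate one +DEFINE construct into the corresponding #define line."""
    match = _DEFINE.fullmatch(construct.strip())
    if match is None:
        raise ValueError("not a +DEFINE construct")
    text = match.group("quoted")
    if text is None:
        text = match.group("simple")
    text = text.strip()
    name = match.group("name")
    if match.group("args") is None:
        return f"#define {name} {text}"
    args = ", ".join(arg.strip() for arg in match.group("args").split(","))
    return f"#define {name}( {args} ) {text}"
```

For example, `translate_define("+DEFINE NULL 0 ;")` gives `#define NULL 0`, and `translate_define("+DEFINE max ( a, b ) %% ( ( a ) > ( b ) ? ( a ) : ( b ) ) %% ;")` gives `#define max( a, b ) ( ( a ) > ( b ) ? ( a ) : ( b ) )`.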
New types may be specified using the +TYPE
construct. This has the form:
+TYPE type1, ..., typen ;
where each typei has one of the forms:
name for a general type (about which we know nothing more),
(struct) name for a structure type,
(union) name for a union type,
struct name for a structure tag,
union name for a union tag,
(int) name for an integral type,
(signed) name for a signed integral type,
(unsigned) name for an unsigned integral type,
(float) name for a floating type,
(arith) name for an arithmetic (integral or floating) type,
(scalar) name for a scalar (arithmetic or pointer) type.
To make clear the distinction between structure types and structure tags, if we have in C:
typedef struct tag { int x, y ; } type ;
then type is a structure type and tag is a structure tag.
For example, in ansi we have:
+TYPE FILE ;
+TYPE struct lconv ;
+TYPE (struct) div_t ;
+TYPE (signed) ptrdiff_t ;
+TYPE (unsigned) size_t ;
+TYPE (arith) time_t ;
+TYPE (int) wchar_t ;
It is also possible to define new types in terms of existing types. This is
done using the +TYPEDEF construct, which is
identical in form to the C typedef construct. This
construct can be used to define pointer, procedure and array types, but not
compound structure and union types. For these see Section 3.9, “+FIELD” below.
For example, in xpg3:search.h we have:
+TYPE struct entry ;
+TYPEDEF struct entry ENTRY ;
There are a couple of special forms. To understand the first, note that C
uses void function returns for two purposes.
Firstly to indicate that the function does not return a value, and secondly to
indicate that the function does not return at all (exit is an example of this second usage). In TDF terms, in the
first case the function returns TOP, in the second
it returns BOTTOM . tspec allows types to be introduced which have the second
meaning. For example, we could have:
+TYPEDEF ~special ( "bottom" ) ~bottom ;
+FUNC ~bottom exit ( int ) ;
meaning that the local type ~bottom is the
BOTTOM form of void.
The procedure exit, which never returns, can then
be declared to return ~bottom rather than void. Other such special types may be added in
future.
The second special form:
+TYPEDEF ~promote ( x ) y ;
means that y is an integral type which is the
integral promotion of x. x must have previously been declared as an integral type. This
gives an alternative approach to the old style procedure declaration problem
described in Section 3.2,
“+FUNC”. Recall that:
void *malloc ( sz )
size_t sz ;
means that malloc has one argument which is
passed as the integral promotion of size_t. This
could be expressed as follows:
+TYPEDEF ~promote ( size_t ) ~size_t ;
+FUNC void *malloc ( ~size_t ) ;
introducing a local type to stand for the integral promotion of size_t.
Having specified a structure or union type, or a structure or union tag, we
may wish to specify certain fields of this structure or union. This is done
using the +FIELD construct. This takes the
form:
+FIELD type {
ftype field1, ..., fieldn ;
....
} ;
where type is the structure or union type and
field1, ..., fieldn
are field selectors derived from the base type ftype as in a normal C structure definition. type may have one of the forms:
(struct) name for a structure type,
(union) name for a union type,
struct name for a structure tag,
union name for a union tag,
name for a previously declared structure or union type.
Except in the final case (where it is not clear if type is a structure or a union), it is not necessary to have
previously introduced type using a +TYPE construct - this declaration is implicit in the +FIELD construct.
For example, in ansi:time.h we have:
+FIELD struct tm {
int tm_sec ;
int tm_min ;
int tm_hour ;
int tm_mday ;
int tm_mon ;
int tm_year ;
int tm_wday ;
int tm_yday ;
int tm_isdst ;
} ;
meaning that there exists a structure with tag tm with various fields of type int. Any implementation must have these corresponding fields,
but they need not be in the given order, nor do they have to comprise the whole
structure.
As was mentioned above (in Section 3.1.1, “Internal
and External Names”), field selectors form a special case when tspec is making up external token names. For example, in
the case above, the token name for the tm_sec
field is either tm.tm_sec or ansi.time.tm.tm_sec, depending on whether or not unique token
names are used.
It is possible to have several +FIELD
constructs referring to the same structure or union. For example, posix:dirent.h declares a structure with tag dirent and one field, d_name, of
this structure. xpg3:dirent.h extends this by
adding another field, d_ino.
There is a second form of the +FIELD construct
which has more in common with the +TYPEDEF
construct. The form:
+FIELD type := {
ftype field1, ..., fieldn ;
....
} ;
means that the type type is defined to be
exactly the given structure or union type, with precisely the given fields in
the given order.
In the example given in Section 3.9, “+FIELD”, posix:dirent.h specifies that the d_name field of struct dirent is
a fixed sized array of characters, but that the size of this array is
implementation dependent. We therefore have to introduce a value to stand for
the size of this array using the +NAT construct.
This has the form:
+NAT nat1, ..., natn ;
where nat1, ..., natn are the array sizes to be declared. The example thus
becomes:
+NAT ~dirent_d_name_size ;
+FIELD struct dirent {
char d_name [ ~dirent_d_name_size ] ;
} ;
Note the use of a local variable to stand for a value, namely the array size, which is invisible to the user (see Section 3.1.2, “More on Object Names”).
As another example, in ansi:setjmp.h we know
that jmp_buf is an array type. We therefore
introduce objects to stand for the type which it is an array of and for the
size of the array, and define jmp_buf by a +TYPEDEF command:
+NAT ~jmp_buf_size ;
+TYPE ~jmp_buf_elt ;
+TYPEDEF ~jmp_buf_elt jmp_buf [ ~jmp_buf_size ] ;
Again, local variables have been used for the introduced objects.
Currently tspec only has limited support for
enumeration types. A +ENUM construct is translated
directly into a C definition of an enumeration type. The +ENUM construct has the form:
+ENUM etype := {
entry,
....
} ;
where etype is the enumeration type being
defined - either a type name or enum etag for some
enumeration tag etag - and each entry has one of the forms:
name
name = number
as in a C enumeration type. For example, in xpg3:search.h we have:
+ENUM ACTION := { FIND, ENTER } ;
As was mentioned in the Introduction, the #pragma token
syntax is highly complex, and the token descriptions output by tspec form only a small subset of those possible. It is
possible to directly access the full #pragma token
syntax from tspec using the construct:
+TOKEN name %% text %% ;
where the token name is defined by the sequence
of characters text, which is delimited by double
percents. This is turned into the token description:
#pragma token text name #
No checks are applied to text. A more
sophisticated mechanism for defining complex tokens may be introduced in a
later version of tspec.
For example, in ansi:stdarg.h a token va_arg is defined which takes a variable of type va_list and a type t and
returns a value of type t. This is given by:
+TOKEN va_arg %% PROC ( EXP lvalue : va_list : e, TYPE t ) EXP rvalue : t : %% ;
See reference 3 for more details on the token syntax.
Although most tspec constructs are concerned
either with specifying new objects or imposing structure upon various sets of
objects, there are a few which do not fall into these categories.
It is possible to introduce conditional compilation into the API description by means of the constructs:
+IF %% text %%
+IFDEF %% text %%
+IFNDEF %% text %%
+ELSE
+ENDIF
which are translated into:
#if text
#ifdef text
#ifndef text
#else /* text */
#endif /* text */
respectively. If text is just a simple number
or a single identifier the double percent delimiters may be omitted.
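The following C sketch shows the kind of output this translation gives for a single +IF ... +ELSE ... +ENDIF group. The macro DEMO_LEVEL and the declarations between the directives are hypothetical placeholders; only the directive lines and their trailing comments follow the scheme above.

```c
#include <assert.h>

#define DEMO_LEVEL 2                /* hypothetical macro, for illustration */

/* Assumed tspec input:  +IF %% DEMO_LEVEL > 1 %%  ...  +ELSE  ...  +ENDIF */
#if DEMO_LEVEL > 1
enum { demo_bufsize = 512 };        /* placeholder for the first branch  */
#else /* DEMO_LEVEL > 1 */
enum { demo_bufsize = 128 };        /* placeholder for the second branch */
#endif /* DEMO_LEVEL > 1 */
```

The repeated condition in the comments costs nothing at compile time and makes long conditional regions in the generated headers easier to follow.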
A couple of special +IFDEF (and also +IFNDEF) forms are available which are useful on
occasion. These are:
+IFDEF ~building_libs
+IFDEF ~protect ( "api", "header" )
The macros in these constructs expand respectively to __BUILDING_LIBS, which by convention is defined if and only if
TDF library building is taking place (see Section 5.4, “TDF Library
Building”), and to the protection macro which tspec makes up to protect the file api:header against multiple inclusion (see Section 5.2,
“Protection Macros”).
It is sometimes desirable to include text in the specification file which will be copied directly into one of the output files - for example, sections of C. This can be done by enclosing the text for copying into the include output file in double percents:
%% text %%
and text for copying into the source output file in triple percents:
%%% text %%%
In fact more percents may be used. An even number always indicates text for
the include output file, and an odd number the source output file. Note that
any # characters in text are copied as normal, and not treated as comments. This
also applies to the other cases where percent delimiters are used.
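The parity rule can be stated in a few lines of C. This is a sketch of the rule as described, not of tspec's own code: classify_delimiter and the quote_target type are hypothetical names, and treating a run of fewer than two percents as "not a delimiter" is an assumption.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { QUOTE_NONE, QUOTE_INCLUDE, QUOTE_SOURCE } quote_target;

/* Classifies a delimiter by counting its leading '%' characters. */
static quote_target classify_delimiter(const char *delim)
{
    size_t n = 0;

    assert(delim != NULL);
    while (delim[n] == '%')
        n++;
    if (n < 2)
        return QUOTE_NONE;          /* assumption: a lone '%' opens no quote */
    /* Even count: include output file. Odd count: source output file. */
    return (n % 2 == 0) ? QUOTE_INCLUDE : QUOTE_SOURCE;
}
```

Under this rule a four-percent delimiter targets the include output file, just like the two-percent form.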
A special case of quoted text is the C-style comment:
/* text */
which are copied directly into the include output file.
Various properties of individual sets of objects or global properties can be set using file properties. These take the form:
$property = number ;
for numeric (or boolean) properties, and:
$property = "string" ;
for string properties.
The valid property names are as follows:
APINAME is a string property which may be used
to override the API name of the current set of objects.
FILE is a string property which is used by the
tspec preprocessor to indicate the current input
file name.
FILENAME is a string property which may be used
to override the header name of the current set of objects.
INCLNAME is a string property which may be used
to set the name of the include output file in place of the default name given
in Section 1.3,
“Output Layout”. Setting the property to the empty string
suppresses the output of this file.
INTERFACE is a numeric property which may be
set to force the creation of the source output file and cleared to suppress
it.
LINE is a numeric property which is used by the
tspec preprocessor to indicate the current input
file line number.
METHOD is a string property which may be used
to specify alternative construction methods for TDF library building (see Section 5.4,
“TDF Library Building”).
PREFIX is a string property which may be used
as a prefix to unique token names in place of the API and header names (see Section 3.1.1, “Internal
and External Names”).
PROTECT is a string property which may be used
to set the macro used by tspec to protect the
include output file against multiple inclusions (see Section 5.2, “Protection
Macros”). Setting the property to the empty string suppresses this
macro.
SOURCENAME is a string property which may be
used to set the name of the source output file in place of the default name
given in Section 1.3, “Output Layout”.
Setting the property to the empty string suppresses the output of this
file.
SUBSETNAME is a string property which may be
used to override the subset name of the current set of objects.
UNIQUE is a numeric property which may be used
to switch the unique token name flag on and off (see Section 3.1.1,
“Internal and External Names”). For standard APIs it is
recommended that this property is set to 1 in the API MASTER file.
VERBOSE is a numeric property which may be used
to set the level of the verbose option (see Section 1.5, “Command-line
Options”).
VERSION is a string property which may be used
to assign a version number or other identification to a tspec description. This information is reproduced in the
corresponding include output file.
In this section we round up a few miscellaneous topics.
The +IMPLEMENT and +USE commands described in Section 2.2, “+IMPLEMENT and
+USE” are capable of further refinement. Normally each such command
is translated into a corresponding inclusion command in both the include and
source output files. Occasionally this is not desirable - in particular the
inclusion in the source output file can cause problems during TDF library
building. For this reason the tspec syntax has
been extended to allow for fine control of the output corresponding to +IMPLEMENT and +USE
commands. This takes the forms:
+IMPLEMENT "api" (key) ;
+IMPLEMENT "api", "header" (key) ;
+IMPLEMENT "api", "header", "subset" (key) ;
with corresponding forms for +USE. key specifies which output files the inclusion commands should
appear in. It can be:
??, indicating neither output file,
!?, indicating the include output file
only,
?!, indicating the source output file only,
!!, indicating both output files (this is the
same as the normal form).
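Reading the four keys together, the first character appears to control the include output file and the second the source output file, with an exclamation mark meaning "emit" and a question mark meaning "omit". The C sketch below encodes that reading; decode_key and inclusion_targets are hypothetical names, and the function is an illustration of the table rather than tspec's own parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef struct {
    bool in_include;    /* inclusion command appears in the include output file */
    bool in_source;     /* inclusion command appears in the source output file  */
} inclusion_targets;

/* Decodes a two-character key; returns false if it is not one of the four keys. */
static bool decode_key(const char *key, inclusion_targets *out)
{
    assert(key != NULL && out != NULL);
    if (strlen(key) != 2 || strspn(key, "?!") != 2)
        return false;
    out->in_include = (key[0] == '!');
    out->in_source  = (key[1] == '!');
    return true;
}
```

With this reading, the normal form of +IMPLEMENT or +USE (no key) behaves as if both characters were exclamation marks.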
The second refinement comes from the fact that APIs fall into two categories
- the base APIs, such as ansi, posix and xpg3, and the extension
APIs, such as x11, the X Windows API. The latter
can be used to extend the former, so that we can form ansi plus x11, posix plus x11, and so on. Base
APIs may be distinguished in tspec by including
the command:
+BASE_API ;
in their MASTER file. Occasionally, in an
extension API, we may wish to include a version of a header from the base API
but, because the base API is not fixed, cannot use a simple +USE command. Instead the special form:
+USE ( "api" ), "header" ;
is provided for this purpose (this is the only permitted form). It indicates
that tspec should use the api version of header for
checking purposes, but allow the inclusion of the version from the base API in
normal use.
Each include output file is surrounded by a construct of the form:
#ifndef MACRO
#define MACRO
....
#endif /* MACRO */
to protect it against multiple inclusions. Normally tspec will generate the macro name, MACRO, but it can be set using the PROTECT file property (see Section 4.4, “File
Properties”). Setting PROTECT to the
empty string suppresses the protection construct altogether. (Also see Section 4.1, “+IF,
+ELSE and +ENDIF”)
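The effect of the protection construct can be seen by writing out what the compiler sees when a header is included twice. In the C sketch below, DEMO_API_HEADER_PROTECT and demo_limit are hypothetical names; the macro name tspec actually generates is not shown here.

```c
#include <assert.h>

/* First copy of a hypothetical header body, protected as described above. */
#ifndef DEMO_API_HEADER_PROTECT
#define DEMO_API_HEADER_PROTECT
enum { demo_limit = 42 };
#endif /* DEMO_API_HEADER_PROTECT */

/* Second copy, as a repeated #include would produce. The macro is already
   defined, so the body is skipped and demo_limit is not redeclared (which
   would otherwise be a compile-time error). */
#ifndef DEMO_API_HEADER_PROTECT
#define DEMO_API_HEADER_PROTECT
enum { demo_limit = 42 };
#endif /* DEMO_API_HEADER_PROTECT */
```

Suppressing the construct by setting PROTECT to the empty string is therefore only safe for files whose contents may legitimately be seen more than once.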
If tspec is invoked with the -i command-line
option then, instead of creating its output files, it
prints an index of all the objects it has read to the standard output. This
information includes the external token name associated with the object,
whether the object is implemented or used, and where in the API description it
is defined. It also includes a brief description of the object. It is intended
that these indexes should be usable as quick reference guides to the underlying
APIs.
As was explained in reference 1, the #pragma
token headers output by tspec are used for
two purposes - checking applications against the API during normal compilation
and checking implementations against the API during TDF library building. This
dual use does necessitate some extra work for tspec. It is not always possible to use exactly the same code
in the two cases (usually because the C rules on, for example, structure
definitions get in the way during library building). tspec uses a standard macro, __BUILDING_LIBS, to distinguish between the two cases. It is
assumed to be defined if and only if library building is taking place. tspec descriptions can access this macro directly using
~building_libs (see Section 4.1, “+IF, +ELSE and
+ENDIF”).
The actual library building process consists of compiling the #pragma token descriptions of the objects comprising the API
along with the implementation of that API from the system headers (or
wherever). This creates the local token definitions for this API, which may be
stored in a token library. To facilitate this process tspec creates the source output files for each implemented
header api:header containing something like:
#pragma implement interface <../api/header>
#include <header>
together with a makefile to compile all these programs to token definitions
and to combine these token definitions into a token library. In fact two
makefiles are created in the source output directory (see Section 1.3, “Output
Layout”). The first is called M_api and
is designed for stand-alone library construction. The second is called Makefile and is designed for use with the library
building script MAKE_LIBS provided with tspec.
There are other methods whereby the source output file may be changed into a
set of token definitions. For example, in c:sys.h
the METHOD file property (see Section 4.4, “File
Properties”) is set to TDP, causing the
tdp program to be invoked to produce the
definitions for the basic C tokens for the system. As another example
consider:
$METHOD = "TNC" ;
+MACRO double fl_abs ( double ) ;
%%%
( make_tokdef fl_abs ( exp x ) exp
( floating_abs impossible x ) )
%%%
The include output file will specify a token fl_abs which takes a double and
returns a double. The TNC method tells MAKE_LIBS that
the source output file, which will just contain the quoted text:
( make_tokdef fl_abs ( exp x ) exp
( floating_abs impossible x ) )
is an input file for the TDF notation compiler, tnc (see reference 2). Thus we have defined a token which
directly accesses the TDF floating_abs
construct.
This document describes tspec version 2.0.
tspec 2.0 contains significant changes from
previous releases. For convenience the main changes which are visible to the
tspec user are listed here:
A new specification level, named subsets of headers, has been
introduced (see Section 1.1, “Specification
Levels”). This has been done by introducing the +SUBSET construct and extending the +IMPLEMENT and +USE constructs,
as well as the command-line options. The previous method of dealing with such
subsets - namely shared headers - is now obsolete and its use is
discouraged.
A number of new command-line options have been added, and some of the existing options have been modified slightly (see Section 1.5, “Command-line Options”).
The suffix .api has been added to the output
directories (see Section 1.3, “Output Layout”) to
avoid possible confusion with other include file directories.
The use of identifiers beginning with ~ as
local variables is new (see Section 3.1.2, “More on Object
Names”).
The +STATEMENT and +DEFINE constructs (see Section 3.5, “+STATEMENT” and Section 3.6,
“+DEFINE”) are new.
The (extern), (weak) and (const) qualifiers for
+FUNC and +EXP (see
Section 3.2,
“+FUNC” and Section 3.3, “+EXP and
+CONST”) are new.
The (signed) and (unsigned) qualifiers for +TYPE
(see Section 3.7,
“+TYPE”) are new.
The ~special type constructor (see Section 3.8,
“+TYPEDEF”) is new.
The ~abstract type constructor has been
abandoned.
The +BASE_API command described in Section 5.1, “Fine
Control of Included Files” is new.
The indexing routines (see Section 5.3, “Index Printing”) have been greatly improved.
1. TDF and Portability, DRA, 1993.
2. The TDF Notation Compiler, DRA, 1993.
3. The C to TDF Producer, DRA, 1993.
This chapter describes revisions to this document.
Only major changes are listed in the revision history. Please see http://www.ten15.org/log/trunk/doc/en/tspec/book.xml?action=follow_copy&rev=&stop_rev=1&mode=follow_copy&verbose=on for a complete list of changes.
CVS revision numbers follow the date, in the format rXX.
Revision History

| Revision | Date and CVS revision | Author | Description |
|---|---|---|---|
| 1.1 | 2003/03/01 r1138 | stefanf | Cross references were fixed. |
| 1.0 | 2002/10/06 r36 | verm | Converted to SGML from the TenDRA 4.1.2 Documentation. |