$TenDRA: book.xml 2447 2006-03-23 21:15:51Z verm $
Copyright © 2002, 2003, 2004 TenDRA Documentation Team
Copyright © 1997, 1998 Defence Evaluation and Research Agency (DERA)
Extensions to this document from the original TenDRA-4.1.2-doc.tar.gz source distribution are covered by the BSDL, while all prior modifications remain under the Crown Copyright.
Berkeley Systems Design License
Redistribution and use in source (SGML DocBook) and 'compiled' forms (SGML, HTML, PDF, PostScript, RTF and so forth) with or without modification, are permitted provided that the following conditions are met:
Redistributions of source code (SGML DocBook) must retain the above copyright notice, this list of conditions and the following disclaimer as the first lines of this file unmodified.
Redistributions in compiled form (transformed to other DTDs, converted to PDF, PostScript, RTF and other formats) must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
THIS DOCUMENTATION IS PROVIDED BY THE TENDRA DOCUMENTATION TEAM "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE TENDRA DOCUMENTATION TEAM BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Crown Copyright (c) 1997, 1998
This TenDRA(r) Computer Program is subject to Copyright owned by the United Kingdom Secretary of State for Defence acting through the Defence Evaluation and Research Agency (DERA). It is made available to Recipients with a royalty-free licence for its use, reproduction, transfer to other parties and amendment for any purpose not excluding product development provided that any such use et cetera shall be deemed to be acceptance of the following conditions:
Its Recipients shall ensure that this Notice is reproduced upon any copies or amended versions of it;
Any amended version of it shall be clearly marked to show both the nature of and the organisation responsible for the relevant amendment or amendments;
Its onward transfer from a recipient to another party shall be deemed to be that party's acceptance of these conditions;
DERA gives no warranty or assurance as to its quality or suitability for any purpose and DERA accepts no liability whatsoever in relation to any use to which it may be put.
This document was generated on 2006-09-07 16:05:30
Abstract
Please email us at <docs@ten15.org> if you see any
errors.
Table of Contents
The TDF notation compiler, tnc, is a tool for
translating TDF capsules to and from text. This paper gives a brief
introduction to how to use this utility and the syntax of the textual form of
TDF. The version here described is that supporting version 3.1 of the TDF
specification.
tnc has four modes, two input modes and two
output modes. These are as follows:
decode - translate an input TDF capsule into
the tnc internal representation,
read - translate an input text file into the
internal representation,
encode - translate the internal representation
into an output TDF capsule,
write - translate the internal representation
into an output text file.
Due to the modular nature of the program it is possible to form versions of
tnc in which not all the modes are available.
Passing the -version flag to tnc causes it to report which modes it has implemented.
Any application of tnc consists of the
composite of an input mode and an output mode. The default action is read-encode, i.e. translate
an input text file into an output TDF capsule. Other modes may be specified by
passing the following command line options to tnc:
-decode or -d,
-read or -r,
-encode or -e,
-write or -w.
The only other really useful action is decode-write, i.e. translate an
input TDF capsule into an output text file. This may also be specified by the
-print or -p option.
The actions decode-encode and read-write are not precise identities; they do, however, give
equivalent input and output files.
In addition, the decode mode may be modified to
accept a TDF library as input rather than a TDF capsule by passing the additional
flag:
-lib or -l,
to tnc.
The overall syntax for tnc is as follows:
tnc [ options ] [ input_file ] [ output_file ]
If the output file is not specified, the standard output is used.
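As an illustration of how the mode options combine, the following Python sketch assembles an argument list for tnc. It is not part of tnc; the function name tnc_command, its parameters and the file names are hypothetical, and the order in which the options are emitted is an assumption rather than a documented requirement.

```python
# Sketch only: assembles an argument list for tnc from the options named above.
# The helper name, its parameters and the option ordering are assumptions.

INPUT_FLAGS = {"decode": "-decode", "read": "-read"}
OUTPUT_FLAGS = {"encode": "-encode", "write": "-write"}


def tnc_command(input_mode="read", output_mode="encode",
                input_file=None, output_file=None, library=False):
    """Return the argv list for one run of tnc."""
    if input_mode not in INPUT_FLAGS:
        raise ValueError("unknown input mode: " + input_mode)
    if output_mode not in OUTPUT_FLAGS:
        raise ValueError("unknown output mode: " + output_mode)
    if library and input_mode != "decode":
        # -lib is described only as a modifier of decode mode.
        raise ValueError("-lib only modifies decode mode")
    if output_file is not None and input_file is None:
        raise ValueError("an output file requires an input file")

    argv = ["tnc", INPUT_FLAGS[input_mode], OUTPUT_FLAGS[output_mode]]
    if library:
        argv.append("-lib")
    if input_file is not None:
        argv.append(input_file)
    if output_file is not None:
        # When omitted, tnc writes to the standard output.
        argv.append(output_file)
    return argv
```

The resulting list could be handed to subprocess.run on a system where tnc is installed. With no arguments the sketch spells out the default read-encode action explicitly.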
The rest of this paper is concerned with the form required of the input text file. The input can be divided into eight classes.
The characters ( and ) are used as delimiters to impose a syntactic structure on
the input.
White space comprises sequences of space, tab and newline characters, together with comments (see below). It is not significant to the output (TDF notation is completely free-form), and serves only to separate syntactic units. Every identifier, number etc. must be terminated by a white space or a delimiter.
Comments may be inserted in the input at any point. They begin with a # character and run to the end of the line.
An identifier consists of any sequence of characters drawn from the
following set: upper case letters, lower case letters, decimal digits,
underscore (_), dot (.), and tilde (~), which does not
begin with a decimal digit. tnc generates names
beginning with double tilde (~~) for unnamed
objects when in decode mode, so the use of such
identifiers is not recommended.
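The identifier rule can be expressed as a regular expression. The Python sketch below is an illustration only; the names is_identifier and is_generated_style are hypothetical, and letters are assumed to mean the ASCII letters.

```python
import re

# First character: letter, underscore, dot or tilde (not a decimal digit).
# Later characters: the same set plus decimal digits.
IDENTIFIER = re.compile(r"[A-Za-z_.~][A-Za-z0-9_.~]*")


def is_identifier(text):
    """True if text is a valid identifier under the rule above."""
    return IDENTIFIER.fullmatch(text) is not None


def is_generated_style(text):
    """True for identifiers in the ~~ style that decode mode generates."""
    return is_identifier(text) and text.startswith("~~")
```

A check such as is_generated_style could be used to warn about hand-written names that may clash with those generated in decode mode.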
Numbers can be given in octal (prefixed by 0),
decimal, or hexadecimal (prefixed by 0x or 0X). Both upper and lower case letters can be used for
hex digits. A number can be preceded by any number of + or - signs.
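The number syntax can be sketched in Python as follows. The function name parse_number is hypothetical. The text does not say how repeated signs combine; this sketch assumes that each - sign negates the value and each + sign leaves it unchanged.

```python
import re

# Any run of signs, then a hexadecimal, octal or decimal literal.
NUMBER = re.compile(
    r"([+-]*)(?:0[xX]([0-9a-fA-F]+)|(0[0-7]*)|([1-9][0-9]*))"
)


def parse_number(text):
    """Convert the textual form of a number to a Python int."""
    match = NUMBER.fullmatch(text)
    if match is None:
        raise ValueError("not a number: %r" % text)
    signs, hex_digits, octal_digits, decimal_digits = match.groups()
    if hex_digits is not None:
        value = int(hex_digits, 16)
    elif octal_digits is not None:
        value = int(octal_digits, 8)      # a leading 0 selects octal
    else:
        value = int(decimal_digits, 10)
    # Assumption: an odd number of minus signs negates the value.
    if signs.count("-") % 2 == 1:
        value = -value
    return value
```

Note that a lone - is rejected by this sketch, since it contains no digits.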
A string consists of a sequence of characters enclosed in double quotes
("). The following escape sequences are
recognised:
\n represents a newline character,
\t represents a tab character,
\xxx, where xxx
consists of three octal digits, represents the character with ASCII code xxx.
Newlines are not allowed in strings unless they are escaped. For all other
escaped characters, \x represents x.
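The escape rules may be illustrated by the following Python sketch, which decodes the text between the double quotes. The function name decode_string_body is hypothetical. Treating a backslash followed by fewer than three octal digits as an ordinary escaped character is an assumption drawn from the final rule.

```python
def decode_string_body(body):
    """Decode the characters found between the double quotes of a string."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\n":
            raise ValueError("unescaped newline in string")
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(body):
            raise ValueError("backslash at end of string")
        octal = body[i + 1:i + 4]
        if len(octal) == 3 and all(c in "01234567" for c in octal):
            # \xxx with exactly three octal digits gives a character code.
            out.append(chr(int(octal, 8)))
            i += 4
            continue
        following = body[i + 1]
        if following == "n":
            out.append("\n")
        elif following == "t":
            out.append("\t")
        else:
            # For all other escaped characters, \x represents x.
            out.append(following)
        i += 2
    return "".join(out)
```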
A single minus character (-) has a special
meaning. It may be used to indicate the absence of an optional argument or
optional group of arguments.
The basic input syntax is very simple. A construct consists of an identifier followed by a list of arguments, all enclosed in brackets in a Lisp-like fashion. Each argument can be an identifier, a number, a string, a blank, a bar, or another construct. There are further restrictions on this basic syntax, described below.
construct : ( identifier arglist )
argument : construct
| identifier
| number
| string
| blank
| bar
arglist : (empty)
| argument arglist
The construct ( identifier ), with an empty
argument list, is equivalent to the identifier argument identifier. The two may be used interchangeably.
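The grammar above is small enough to sketch as a recursive reader. The Python below is an illustration, not the tnc implementation: the names tokenize, parse_construct and normalise are hypothetical, constructs are held as nested lists, strings keep their quotes, and no sort checking is attempted. It assumes that a double quote or a # also ends the preceding word.

```python
def tokenize(text):
    """Split input text into delimiters, strings and other words."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in " \t\n":
            i += 1
        elif ch == "#":                      # comment runs to end of line
            while i < len(text) and text[i] != "\n":
                i += 1
        elif ch in "()":
            tokens.append(ch)
            i += 1
        elif ch == '"':
            j = i + 1
            while j < len(text) and text[j] != '"':
                j += 2 if text[j] == "\\" else 1
            if j >= len(text):
                raise SyntaxError("unterminated string")
            tokens.append(text[i:j + 1])
            i = j + 1
        else:                                # identifier, number, blank, bar
            j = i
            while j < len(text) and text[j] not in " \t\n()#\"":
                j += 1
            tokens.append(text[i:j])
            i = j
    return tokens


def parse_construct(tokens, pos=0):
    """Parse ( identifier arglist ); return (nested list, next position)."""
    if pos >= len(tokens) or tokens[pos] != "(":
        raise SyntaxError("expected '('")
    pos += 1
    if pos >= len(tokens) or tokens[pos] in ("(", ")"):
        raise SyntaxError("expected an identifier after '('")
    node = [tokens[pos]]
    pos += 1
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse_construct(tokens, pos)
            node.append(child)
        else:
            node.append(tokens[pos])
            pos += 1
    if pos >= len(tokens):
        raise SyntaxError("missing ')'")
    return node, pos + 1


def normalise(node):
    """Rewrite ( identifier ) with no arguments as the bare identifier."""
    if isinstance(node, str):
        return node
    if len(node) == 1:
        return node[0]
    return [node[0]] + [normalise(arg) for arg in node[1:]]
```

The normalise step reflects the equivalence of ( identifier ) and identifier, so that both spellings compare equal after reading.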
Except at the outermost level, which forms a special case discussed below,
every construct and argument has an associated sort. This is one of the basic
TDF sorts: access, al_tag, alignment, bitfield_variety, bool, callees, error_code, error_treatment, exp, floating_variety, label,
nat, ntest, procprops, rounding_mode,
shape, signed_nat,
string, tag, transfer_mode, variety,
tdfint or tdfstring.
Ignoring for the moment the shorthands discussed below, the ways of creating
constructs of sort exp say, correspond to the TDF
constructs delivering an exp. For example, contents takes a shape and
an exp and delivers an exp. Thus:
( contents arg1 arg2 )
where arg1 is an argument of sort shape and arg2 is an argument of
sort exp, is a sort-correct construct. Only
constructs which are sort correct in this sense are allowed.
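The idea of sort correctness can be illustrated with a toy checker. In the Python sketch below, the table SIGNATURES holds only the constructs mentioned in this section, and known_sorts is a hypothetical mapping that stands in for arguments whose sorts are already known. The real table of constructs is given by the TDF specification, not by this sketch.

```python
# Toy signature table: argument sorts and result sort for a few constructs.
SIGNATURES = {
    "contents": (("shape", "exp"), "exp"),
    "true": ((), "bool"),
    "false": ((), "bool"),
}


def sort_of(node, known_sorts):
    """Return the sort of a construct held as a nested list, or raise."""
    if isinstance(node, str):
        if node in known_sorts:
            return known_sorts[node]
        node = [node]          # a bare identifier equals ( identifier )
    head, args = node[0], node[1:]
    if head not in SIGNATURES:
        raise ValueError("unknown construct: " + head)
    expected, result = SIGNATURES[head]
    if len(args) != len(expected):
        raise ValueError("wrong number of arguments for " + head)
    for arg, wanted in zip(args, expected):
        found = sort_of(arg, known_sorts)
        if found != wanted:
            raise ValueError(
                "%s expects %s but was given %s" % (head, wanted, found))
    return result
```

For example, with known_sorts mapping arg1 to shape and arg2 to exp, the construct ( contents arg1 arg2 ) is accepted with sort exp, while swapping the two arguments is rejected.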
As another example, because of the rule concerning constructs with no arguments, both
( true )
and
false
are valid constructs of sort bool.
TDF constructs which take lists of arguments are easily dealt with. For example:
( make_nof arg1 ... argn )
where arg1, ..., argn are all arguments of sort exp, is valid. A vertical bar may be used to indicate the end
of a sequence of repeated arguments.
Optional arguments should be entered normally if they are present. Their absence may be indicated by means of a blank (minus sign), or by simply omitting the argument.
The vertical bar and blank should be used whenever the input is potentially
ambiguous. Particular care should be taken with apply_proc (which is genuinely ambiguous) and labelled.
The TDF specification should be consulted for a full list of valid TDF
constructs and their argument sorts. Alternatively the tnc help facility may be used. The command:
tnc -help cmd1 ... cmdn
prints sort information on the constructs or sorts cmd1, ..., cmdn.
Alternatively:
tnc -help
prints this information for all constructs. (To obtain help on the sort
alignment as opposed to the construct alignment use alignment_sort.
This confusion cannot occur elsewhere.)
Numbers can occur in two contexts, as the argument to the TDF constructs
make_nat and make_signed_nat. In the former case the number must be
positive. The following shorthands are understood by tnc:
number for ( make_nat number )
number for ( make_signed_nat number )
depending on whether a construct of sort nat or
signed_nat is expected.
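The choice between the two expansions depends only on the sort expected at that point, as the following Python sketch shows. The function name expand_number is hypothetical. The text requires the number to be positive for make_nat; whether zero is accepted is not stated, and this sketch rejects only negative values.

```python
def expand_number(number, expected_sort):
    """Return the full construct that a bare number stands for."""
    if expected_sort == "nat":
        if number < 0:
            # Assumption: zero is allowed, negative values are not.
            raise ValueError("make_nat cannot take a negative number")
        return "( make_nat %d )" % number
    if expected_sort == "signed_nat":
        return "( make_signed_nat %d )" % number
    raise ValueError("a number cannot stand for sort " + expected_sort)
```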
Strings are nominally of sort tdfstring. They
are taken to be simple strings (8 bits per character). Multibyte strings (those
with other than 8 bits per character) may be represented by means of the multi_string construct. This takes the form:
( multi_string b c1 ... cn )
where b is the number of bits per character and
c1, ..., cn are the
codes of the characters comprising the string. These multibyte strings cannot
be used as external names.
In addition, a simple (8 bit) string can be used as a shorthand for a TDF
construct of sort string, as follows:
string for ( make_string string )
In TDF simple tokens, tags, alignment tags and labels are represented by
numbers which may, or may not, be associated with external names. In tnc however they are represented by identifiers. This
brings the problem of scoping which does not occur in TDF. The rules are that
all tokens, tags, alignment tags and labels must be declared before they are
used. Externally defined objects have global scope, and the scope of a formal
argument in a token definition is the definition body. For those constructs
which introduce a local tag or label - for example, identify, make_proc, make_general_proc and variable
for tags and conditional, labelled and repeat for labels -
the scope of the object is as set out in the TDF specification.
The following shorthands are understood by tnc,
according to the argument sort expected:
tag_id for ( make_tag tag_id )
al_tag_id for ( make_al_tag al_tag_id )
label_id for ( make_label label_id )
The syntax for token applications is as follows:
( apply_construct ( token_id arg1 ... argn ) )
where apply_construct is the appropriate TDF
token application construct, for example, exp_apply_token for tokens declared to deliver exp's. The token arguments arg1,
..., argn must be of the sorts indicated in the
token declaration or definition. For tokens without any arguments the
alternative form:
( apply_construct token_id )
is allowed.
The token application above may be abbreviated to:
( token_id arg1 ... argn )
the result sort being known from the token declaration. This in turn may be abbreviated to:
token_id
when there are no token arguments.
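The relationship between the full and abbreviated forms of a token application can be sketched in Python as below. Both function names are hypothetical, arguments are passed as ready-made text, and the name of the application construct is supplied by the caller rather than derived from the token declaration as tnc itself would do.

```python
def full_application(apply_construct, token_id, args=()):
    """Return the unabbreviated token application."""
    if args:
        inner = "( %s %s )" % (token_id, " ".join(args))
    else:
        inner = token_id       # alternative form for tokens without arguments
    return "( %s %s )" % (apply_construct, inner)


def short_application(token_id, args=()):
    """Return the abbreviated form, relying on the declared result sort."""
    if args:
        return "( %s %s )" % (token_id, " ".join(args))
    return token_id
```

For a token tok declared to deliver an exp, short_application("tok", ["a", "b"]) and full_application("exp_apply_token", "tok", ["a", "b"]) denote the same application.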
Care needs to be taken with these shorthands, as they can lead to confusion,
particularly when, due to optional arguments or lists of arguments, tnc is not sure what sort is coming next. The five
categories of objects represented by identifiers - TDF constructs, tokens,
tags, alignment tags and labels - occupy separate name spaces, but it is a good
idea to try to avoid duplication of names.
By default all these shorthands are used by tnc
in write mode. If this causes problems, the -V flag should be passed to tnc.
At the outer level tnc is expecting a sequence
of constructs of the following forms:
an included file,
a token declaration,
a token definition,
an alignment tag declaration,
an alignment tag definition,
a tag declaration,
a tag definition.
Included files may be of three types - text, TDF capsule or TDF library. For TDF capsules and libraries there are two include modes. The first just decodes the given capsule or set of capsules. The second scans through them to extract token declaration information. These declarations appear in the output file only if they are used elsewhere.
The syntax for an included text file is:
( include string )
where string is a string giving the pathname of
the file to be included. tnc applies read to this sub-file before continuing with the present
file.
Similarly, the syntaxes for included TDF capsules and libraries are:
( code string )
( lib string )
respectively. tnc applies decode to this capsule or set of capsules (provided this mode
is available) before continuing with the present file.
The syntaxes for extracting the token declaration information from a TDF capsule or library are:
( use_code string )
( use_lib string )
Again, these rely on the decode mode being
available.
All tokens, tags and alignment tags have an internal name, namely the associated identifier, but this name does not necessarily appear in the corresponding TDF capsule. There must firstly be an associated declaration or definition at the outer level - tags internal to a piece of TDF do not have external names. Even then we may not wish this name to appear at the outer level, because it is local to this file and is not required for linking purposes. Alternatively we may wish a different external name to be associated with it in the TDF capsule.
As an example of how tnc allows for this,
consider token declarations (although similar remarks apply to token
definitions, alignment tag definitions etc.). The basic form of the token
declaration is:
( make_tokdec token_id ... )
This creates a token with both internal and external names equal to token_id. Alternatively:
( local make_tokdec token_id ... )
creates a token with internal name token_id but
no external name. This allows the creation of tokens local to the current file.
Again:
( make_tokdec ( string_extern string ) token_id ... )
creates a token with internal name token_id and
external name given by the string string. For
example, to create a token whose external name is not a valid identifier, it
would be necessary to use this construct. Finally:
( make_tokdec ( unique_extern string1 ... stringn ) token_id ... )
creates a token with internal name token_id and
external name given by the unique name consisting of the strings string1, ..., stringn.
The local quantifier should be used
consistently on all declarations and definitions of the token, tag or alignment
tag. The alternative external name should only be given on the first occasion
however. Thereafter the object is identified by its internal name.
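The four ways of naming a token can be compared side by side using a small generator. The Python sketch below is illustrative only: the function names and parameters are hypothetical, the remainder of the declaration is passed through unchanged as rest, and treating the three naming options as mutually exclusive is an assumption.

```python
def quote(text):
    """Write text as a string, escaping backslash, quote and newline."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped.replace("\n", "\\n") + '"'


def make_tokdec(token_id, rest, local=False,
                string_extern=None, unique_extern=None):
    """Return a token declaration using one of the naming forms above."""
    chosen = [local, string_extern is not None, unique_extern is not None]
    if sum(chosen) > 1:
        # Assumption: at most one naming option applies at a time.
        raise ValueError("choose at most one naming option")
    if local:
        return "( local make_tokdec %s %s )" % (token_id, rest)
    if string_extern is not None:
        return "( make_tokdec ( string_extern %s ) %s %s )" % (
            quote(string_extern), token_id, rest)
    if unique_extern is not None:
        names = " ".join(quote(part) for part in unique_extern)
        return "( make_tokdec ( unique_extern %s ) %s %s )" % (
            names, token_id, rest)
    return "( make_tokdec %s %s )" % (token_id, rest)
```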
The basic form of a token declaration is:
( make_tokdec token_id ( arg1 ... argn ) res )
where the token token_id is declared to take
argument sorts arg1, ..., argn and deliver the result sort res. These sorts are given by their sort names, al_tag, alignment, bitfield_variety etc. For a token with no arguments the
declaration may be given in the form:
( make_tokdec token_id res )
A token may be declared any number of times, provided the declarations are consistent.
This basic declaration may be modified in the ways outlined above to specify the external token name.
The basic form of a token definition is:
( make_tokdef token_id ( arg1 id1 ... argn idn ) res def )
where the token token_id is defined to take
formal arguments id1, ..., idn of sorts arg1, ..., argn respectively and have the value def, which is a construct of sort res. The scope of the tokens id1,
..., idn is def.
For a token with no arguments the definition may be given in the form:
( make_tokdef token_id res def )
A token may be defined more than once. All definitions must be consistent with any previous declarations and definitions (the renaming of formal arguments is allowed however).
This basic definition may be modified in the ways outlined above to specify the external token name.
The basic form of an alignment tag declaration is:
( make_al_tagdec al_tag_id )
where the alignment tag al_tag_id is declared
to exist.
This basic declaration may be modified in the ways outlined above to specify the external alignment tag name.
The basic form of an alignment tag definition is:
( make_al_tagdef al_tag_id def )
where the alignment tag al_tag_id is defined to
be def, which is a construct of sort alignment. An alignment tag may be declared or defined more
than once, provided the definitions are consistent.
This basic definition may be modified in the ways outlined above to specify the external alignment tag name.
The basic forms of a tag declaration are:
( make_id_tagdec tag_id info dec )
( make_var_tagdec tag_id info dec )
( common_tagdec tag_id info dec )
where the tag tag_id is declared to be an
identity, variable or common tag with access information info, which is an optional construct of sort access, and shape dec, which is a
construct of sort shape. A tag may be declared
more than once, provided all declarations and definitions are consistent
(including agreement of whether the tag is an identity, a variable or
common).
These basic declarations may be modified in the ways outlined above to specify the external tag name.
The basic forms of a tag definition are:
( make_id_tagdef tag_id def )
( make_var_tagdef tag_id info def )
( common_tagdef tag_id info def )
where the tag tag_id is defined to be an
identity, variable or common tag with value def,
which is a construct of sort exp. Non-identity tag
definitions also have an optional access
construct, info. A tag must have been declared
before it is defined, but may be defined any number of times. All declarations
and definitions must be consistent (except that common tags may be defined
inconsistently) and agree on whether the tag is an identity, a variable, or
common.
These basic definitions may be modified in the ways outlined above to specify the external tag name.
The input in read (and to a lesser extent decode) mode is checked for shape correctness if the
-check or -c flag is
passed to tnc. This is not guaranteed to pick up
all shape errors, but is better than nothing.
When in write mode the results of the shape
checking may be viewed by passing the -cv flag to
tnc. Each expression is associated with its shape
by means of the:
( exp_with_shape exp shape ) -> exp
pseudo-construct. Unknown shapes are indicated by an ellipsis (...).
The target independent TDF capsules produced by the C -> TDF compiler,
tcc, do not contain declarations or definitions
for all the tokens they use. Thus tnc cannot fully
decode them as they stand. However the necessary token declaration information
may be made available to tnc by using the use_lib construct. The commands:
( use_lib library ) ( code capsule )
will decode the TDF capsule capsule which uses
tokens defined in the TDF library library.
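One way to apply this is to generate a short text file containing the two constructs and pass it to tnc in read mode. The Python sketch below only builds the text of such a file; the function names and the path names are hypothetical, and the paths are written as strings on the assumption that use_lib and code take string arguments as described for included files.

```python
def quote(text):
    """Write text as a string, escaping backslash and double quote."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def decode_driver(library, capsule):
    """Return input text that declares tokens from library, then decodes capsule."""
    lines = [
        "# token declarations are taken from the library",
        "( use_lib %s )" % quote(library),
        "# the capsule itself is then decoded",
        "( code %s )" % quote(capsule),
    ]
    return "\n".join(lines) + "\n"
```

Saving the returned text to a file and running tnc on it with the -read and -write options would then be expected to produce the textual form of the capsule, provided the decode mode is available.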
The main limitations in the current version of tnc are as follows:
There is no error recovery,
There is no support for foreign sorts,
The support for tokenised tokens is limited and undocumented.
In addition, far more of the checks (scopes, shape checking, checking of
consistency of declarations and definitions etc.) are implemented for read mode than for decode mode. To shape check a TDF capsule, it will almost
certainly be more effective to translate it into text and check that.
Another limitation is that the scoping rules for local tags do not allow
such tags to be accessed outside their scopes using env_offset.
This chapter describes revisions to this document.
Only major changes are listed in the revision history. Please see http://www.ten15.org/log/trunk/doc/en/tnc/book.xml?action=follow_copy&rev=&stop_rev=1&mode=follow_copy&verbose=on for a complete list of changes.
CVS revision numbers follow the date, in the format rXX.
Revision History
| Revision | Date | Author | Description |
|---|---|---|---|
| 1.0 | 2002/10/07 r41 | verm | Converted to SGML from the TenDRA 4.1.2 Documentation. |