tcc User's Guide

The TenDRA Documentation Team

$TenDRA: book.xml 2447 2006-03-23 21:15:51Z verm $

Extensions to this document from the original TenDRA-4.1.2-doc.tar.gz source distribution are covered by the BSDL, while all prior modifications remain under the Crown Copyright.

Berkeley Systems Design License

Redistribution and use in source (SGML DocBook) and 'compiled' forms (SGML, HTML, PDF, PostScript, RTF and so forth) with or without modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code (SGML DocBook) must retain the above copyright notice, this list of conditions and the following disclaimer as the first lines of this file unmodified.

  2. Redistributions in compiled form (transformed to other DTDs, converted to PDF, PostScript, RTF and other formats) must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

Important

THIS DOCUMENTATION IS PROVIDED BY THE TENDRA DOCUMENTATION TEAM "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE TENDRA DOCUMENTATION TEAM BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Crown Copyright (c) 1997, 1998

This TenDRA(r) Computer Program is subject to Copyright owned by the United Kingdom Secretary of State for Defence acting through the Defence Evaluation and Research Agency (DERA). It is made available to Recipients with a royalty-free licence for its use, reproduction, transfer to other parties and amendment for any purpose not excluding product development provided that any such use et cetera shall be deemed to be acceptance of the following conditions:

  1. Its Recipients shall ensure that this Notice is reproduced upon any copies or amended versions of it;

  2. Any amended version of it shall be clearly marked to show both the nature of and the organisation responsible for the relevant amendment or amendments;

  3. Its onward transfer from a recipient to another party shall be deemed to be that party's acceptance of these conditions;

  4. DERA gives no warranty or assurance as to its quality or suitability for any purpose and DERA accepts no liability whatsoever in relation to any use to which it may be put.

Alpha, ULTRIX is a registered trademark of Digital Equipment Corp. http://www.dec.com

HP-UX is a registered trademark of Hewlett-Packard Company. http://www.hp.com

MIPS and R4000 are registered trademarks of MIPS Technologies, Inc. in the United States and other countries. http://www.mips.com/

NeXT is a registered trademark and NeXTSTEP is a trademark of NeXT Computer, Inc. Please note that NeXT was purchased by Apple Computer, Inc. http://www.apple.com/

Motif, OSF/1, and UNIX are registered trademarks of The Open Group in the United States and other countries. http://www.opengroup.org/

Sun, Sun Microsystems, Solaris and SunOS are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. http://www.sun.com/

Sparc, Sparc64, SPARCEngine, and UltraSPARC are trademarks of SPARC International, Inc in the United States and other countries. Products bearing SPARC trademarks are based upon architecture developed by Sun Microsystems, Inc. http://www.sun.com/

This document was generated on 2006-09-07 15:01:19

Abstract

Please email us if you see any errors.


Table of Contents

Introduction
1. The TDF Compilation Strategy
2. The Overall Design of tcc
2.1. Specifying the API
2.2. The Main Compilation Path
2.3. Input File Types
2.4. Intermediate and Output Files
2.5. Other Compilation Paths
2.5.1. Preprocessing
2.5.2. TDF Archives
2.5.3. TDF Notation
2.5.4. Merging TDF Capsules
2.6. Finding out what tcc is doing
3. tcc Environments
3.1. The Environment Search Path
3.2. The Default Environment: Configuring tcc
3.3. Using Environments to Specify APIs
3.4. Using Environments to Implement tcc Options
3.5. User-Defined Environments
4. The Components of the TDF System
4.1. The C to TDF Producer
4.1.1. Include File Directories
4.1.2. Start-up Files and End-up Files
4.1.3. Compilation Modes and Portability Tables
4.1.4. Description of Compilation Modes
4.2. The TDF Linker
4.2.1. The Linker and TDF Libraries
4.2.2. Combining TDF Capsules
4.2.3. Constructing TDF Libraries
4.2.4. Useful tld Options
4.3. The TDF to Target Translator
4.3.1. tcc Options Affecting the Translator
4.3.2. Useful trans Options
4.3.3. Optimisation in TDF Translators
4.3.4. The MIPS® Translator and Assembler
4.4. The System Assembler
4.5. The System Linker
4.5.1. The System Linker and tcc Environments
4.5.2. The Effect of Command-Line Options on the System Linker
4.6. The C Preprocessor
4.7. The TDF Pretty Printer
4.8. The TDF Archiver
5. Miscellaneous Topics
5.1. Intermodular Checks
5.2. Debugging and Profiling
5.3. The System Environment
5.4. The Temporary Directory
5.5. The tcc Option Interpreter
6. tcc Reference Guide
6.1. Input and Output Files
6.2. Compilation Phases
6.3. Command-line Options
6.4. Compilation Modes
6.5. Supported APIs
6.6. Environment Identifiers
6.7. Standard Environments
7. Manual Pages
8. Revision History

List of Figures

1.1. Traditional Compilation Path
1.2. TDF Compilation Path
2.1. Basic tcc Compilation Path
2.2. Full tcc Compilation Path
4.1. MIPS® Compilation Path

List of Tables

6.1. Input File Suffixes
6.2. Default Output Filenames
6.3. Compilation Phase
6.4. Archiver Options
6.5. Supported APIs
6.6. Environment Identifiers
6.7. Standard Environments

Introduction

Like most compilation systems, the TDF C system consists of a number of components which transform or combine various types of input files into output files. tcc is designed to be a compilation manager, coordinating these various components into a coherent compilation scheme. It is also the normal user's interface to the TDF system on Unix machines: direct use of the various components of the system is not recommended. Therefore, it is worth familiarising oneself with tcc before attempting to use the TDF system. To aid this familiarisation, tcc has been designed to have the same look and feel as the system C compiler cc, but with added functionality to deal with the additional features of the TDF system. This does not mean that tcc can necessarily be regarded as a direct replacement for cc; the extra portability checks performed by the TDF system require the precise compilation environment to be specified to tcc in a way that it cannot be to cc.

There are two basic components to this paper. The first describes the TDF compilation strategy and how it is implemented by tcc. The second is a Quick Reference section at the end of the paper, which is intended to be a tcc user's manual. For even quicker reference, tcc will print a list of all its command-line options (with a brief description) if invoked with the -query option.

Chapter 1. The TDF Compilation Strategy

Figure 1.1. Traditional Compilation Path


Before discussing tcc itself in detail, it is necessary to explain the compilation strategy that it is designed to implement. This was discussed at length in [1] and is summarised in Figure 1.2, which is taken from that paper. Readers are urged to consult [1].

Figure 1.2. TDF Compilation Path


Chapter 2. The Overall Design of tcc

Having discussed the compilation strategy tcc is designed to implement, let us move on to describe the details of this implementation. The basic compilation path is shown in Figure 2.1, which corresponds to Figure 1.2.

Figure 2.1. Basic tcc Compilation Path


2.1. Specifying the API

As we have seen, the API plays a far more concrete role in the TDF compilation strategy than in the traditional scheme. Therefore the API needs to be explicitly specified to tcc before any compilation takes place. As can be seen from Figure 2.1, the API has three components. Firstly, in the target independent (or production) half of the compilation, there are the target independent headers which describe the API. Secondly, in the target dependent (or installation) half, there is the API implementation for the particular target machine. This is divided between the TDF libraries, derived from the system headers, and the system libraries. Specifying the API to tcc essentially consists of telling it what target independent headers, TDF libraries and system libraries to use. The precise way in which this is done is discussed below (in section 3.3).

2.2. The Main Compilation Path

Once the API has been specified, the actual compilation can begin. The default action of tcc is to perform production and installation consecutively on the same machine; any other action needs to be explicitly specified. So let us describe the entire compilation path from C source to executable shown in Figure 2.1.

  1. The first stage is production. The C to TDF producer transforms each input C source file into a target independent TDF capsule, using the target independent headers to describe the API in abstract terms. These target independent capsules will contain tokens to represent the uses of objects from the API, but these tokens will be left undefined.

  2. The second stage, which is also the first stage of the installation, is TDF linking. Each target independent capsule is combined with the TDF library describing the API implementation to form a target dependent TDF capsule. Recall that the TDF libraries contain the local definitions of the tokens left undefined by the producer, so the resultant target dependent capsule will contain both the uses of these tokens and the corresponding token definitions.

  3. The third stage of the compilation is for the TDF translator to transform each target dependent TDF capsule into an assembly source file for the appropriate target machine. Some TDF translators output not an assembly source file, but a binary object file. In this case the following assembler stage is redundant and the compilation skips to the system linking.

  4. The next stage of the compilation is for each assembly source file to be translated into a binary object file by the system assembler.

  5. The final compilation phase is for the system linker to combine all the binary object files with the system libraries to form a single, final executable. Recall that the system libraries are the final constituent of the API implementation, so this stage completes the combination of the program with the API implementation started in stage 2.

Let us, for convenience, tabulate these stages, giving the name of each compilation tool (plus the corresponding executable name), a code letter which tcc uses to refer to this stage, and the input and output file types for the stage (see also section 6.2).

    Number  Tool                    Letter  Input            Output
    1       C producer (tdfc)       c       C source         target ind. TDF
    2       TDF linker (tld)        L       target ind. TDF  target dep. TDF
    3       TDF translator (trans)  t       target dep. TDF  assembly source
    4       assembler (as)          a       assembly source  binary object
    5       system linker (ld)      l       binary object    executable

The executable name of the TDF translator varies, depending on the target machine. It will, however, normally start or end with trans. These stages are documented in more detail in sections 4.1 to 4.5.
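The stage table can be read as a small data structure. The following Python sketch is purely illustrative and is not how tcc is implemented; the function name stages_from and the flag translator_emits_objects are hypothetical names introduced here. It lists the tools a file would pass through, given the stage at which it joins the path, and drops the assembler when the translator is assumed to emit binary object files directly.

```python
# Illustrative model of the main compilation path; not tcc's own code.
# Each entry: (stage number, tool, code letter, input type, output type).
STAGES = [
    (1, "tdfc", "c", "C source", "target ind. TDF"),
    (2, "tld", "L", "target ind. TDF", "target dep. TDF"),
    (3, "trans", "t", "target dep. TDF", "assembly source"),
    (4, "as", "a", "assembly source", "binary object"),
    (5, "ld", "l", "binary object", "executable"),
]


def stages_from(input_type, translator_emits_objects=False):
    """Return the tools applied, in order, to a file of the given type."""
    start = next(i for i, stage in enumerate(STAGES) if stage[3] == input_type)
    tools = [stage[1] for stage in STAGES[start:]]
    if translator_emits_objects and "trans" in tools and "as" in tools:
        # The translator already produced a binary object file,
        # so the assembler stage is redundant.
        tools.remove("as")
    return tools


if __name__ == "__main__":
    print(stages_from("C source"))
    print(stages_from("C source", translator_emits_objects=True))
```

Keeping the stages in one ordered list means that joining the path part-way through is just a matter of choosing a starting index.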

The code letters for the various compilation stages can be used in the -Wtool,opt,... command-line option to tcc. This passes the option(s) opt directly to the executable in the compilation stage identified by the letter tool. For example, -Wl,-x will cause the system linker to be invoked with the -x option. Similarly, the -Etool:file option allows the executable to be invoked at the compilation stage tool to be specified as file. This gives the tcc user very direct access to the compilation tools.
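To make the shape of these two options concrete, here is a minimal Python sketch that splits them into their parts. It assumes the options are written without embedded spaces (as in -Wl,-x) and that the tool is always a single code letter; the function names parse_w_option and parse_e_option are hypothetical, and the sketch says nothing about how tcc itself parses its arguments.

```python
def parse_w_option(arg):
    """Split '-Wtool,opt,...' into (tool letter, list of options)."""
    if not arg.startswith("-W") or len(arg) < 4 or arg[3] != ",":
        raise ValueError("expected -Wtool,opt,...: " + arg)
    tool = arg[2]
    # Empty fields (e.g. from a trailing comma) are dropped.
    opts = [opt for opt in arg[4:].split(",") if opt]
    return tool, opts


def parse_e_option(arg):
    """Split '-Etool:file' into (tool letter, executable name)."""
    if not arg.startswith("-E") or len(arg) < 5 or arg[3] != ":":
        raise ValueError("expected -Etool:file: " + arg)
    return arg[2], arg[4:]


if __name__ == "__main__":
    print(parse_w_option("-Wl,-x"))
    # The path below is a made-up example, not a documented location.
    print(parse_e_option("-El:/hypothetical/bin/ld"))
```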

2.3. Input File Types

This compilation path may be joined at any point, and terminated at any point. The latter possibility is discussed below. For the former, tcc determines, for each input file it is given, to which of the file types it knows (C source, target independent TDF, etc.) this file belongs. This determines where in the compilation path described this file will start. The method used to determine the type of a file is the normal filename suffix convention:

  • files ending in .c are understood to be C source files,

  • files ending in .j are understood to be target independent TDF capsules,

  • files ending in .t are understood to be target dependent TDF capsules,

  • files ending in .s are understood to be assembly source files,

  • files ending in .o are understood to be binary object files,

  • files whose type cannot otherwise be determined are assumed to be binary object files,

(for a complete list see section 6.1). Thus, for example, we speak of ".j files" as a shorthand for "target independent TDF capsules". Each file type recognised by tcc is assigned an identifying letter. For convenience, this corresponds to the suffix identifying the file type (c for C source files, j for target independent TDF capsules etc.).

There is an alternative method of specifying input files, by means of the -Stype,file,... command-line option. This specifies that the file file should be treated as an input file of the type corresponding to the letter type, regardless of its actual suffix. Thus, for example, -Sc,file specifies that file should be regarded as a C source (or .c) file.
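The suffix convention and the -S override can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it covers just the five suffixes listed above, the function names file_type and parse_s_option are hypothetical, and the option is assumed to be written without embedded spaces.

```python
import os

# Suffix-to-type-letter mapping for the file types listed above.
SUFFIX_TYPES = {".c": "c", ".j": "j", ".t": "t", ".s": "s", ".o": "o"}


def file_type(name):
    """Return the type letter for an input file, judged by its suffix."""
    suffix = os.path.splitext(name)[1]
    # Files whose type cannot be determined are assumed to be binary objects.
    return SUFFIX_TYPES.get(suffix, "o")


def parse_s_option(arg):
    """Split '-Stype,file,...' into a list of (file, type letter) pairs."""
    if not arg.startswith("-S") or "," not in arg:
        raise ValueError("expected -Stype,file,...: " + arg)
    type_letter, *files = arg[2:].split(",")
    return [(name, type_letter) for name in files if name]


if __name__ == "__main__":
    for name in ["prog.c", "prog.j", "libfoo.a", "README"]:
        print(name, "->", file_type(name))
    print(parse_s_option("-Sc,file"))
```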

2.4. Intermediate and Output Files

During the compilation, tcc makes up names for the output files of each of the compilation phases. These names are based on the input file name, but with the input file suffix replaced by the output file suffix (unless the -make_up_names command-line option is given, in which case the intermediate files are given names of the form _tccnnnn.x, where nnnn is a number which is incremented for each intermediate file produced, and x is the suffix corresponding to the output file type). Thus if the input file file.c is given, this will be transformed into file.j by the producer, which in turn will be transformed into file.t by the TDF linker, and so on. The system linker output file name cannot be deduced in the same way, since it is the result of linking a number of .o files. By default, as with cc, this file is called a.out.
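The two naming schemes can be illustrated as follows. This Python sketch is an approximation only: the zero-padded four-digit counter starting at 1 is an assumption (the text says only that nnnn is an incrementing number), directory placement is not modelled, and the names default_output_name and made_up_names are hypothetical.

```python
import itertools
import os

# Output suffix produced from each input suffix on the main path.
NEXT_SUFFIX = {".c": ".j", ".j": ".t", ".t": ".s", ".s": ".o"}


def default_output_name(input_name):
    """Replace the input suffix by the suffix of the next file type."""
    base, suffix = os.path.splitext(input_name)
    return base + NEXT_SUFFIX[suffix]


def made_up_names(prefix="_tcc"):
    """Return a function that generates names of the form _tccnnnn.x."""
    counter = itertools.count(1)  # assumed starting value

    def next_name(out_suffix):
        # Four zero-padded digits are an assumption made for this sketch.
        return "%s%04d%s" % (prefix, next(counter), out_suffix)

    return next_name


if __name__ == "__main__":
    name = "file.c"
    while os.path.splitext(name)[1] in NEXT_SUFFIX:
        name = default_output_name(name)
        print(name)
```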

For most purposes these intermediate files are not required to be preserved; if we are compiling a single C source file to an executable, then the only output file we are interested in is the executable, not the intermediate files created during the compilation process. For this reason tcc creates a temporary directory in which to put these intermediate files, and removes this directory when the compilation is complete. All intermediate files are put into this temporary directory except:

  • those which are an end product of the compilation (such as the executable),

  • those which are explicitly ordered to be preserved by means of command-line options,

  • binary object files, when more than one such file is produced (this is for compatibility with cc).

tcc can be made to preserve intermediate files of various types by means of the -Ptype... command-line option, which specifies a list of letters corresponding to the file types to be preserved. Thus for example -Pjt specifies that all TDF capsules produced, whether target independent or target dependent, (i.e. all .j and .t files) should be preserved. The special form -Pa specifies that all intermediate files should be preserved. It is also possible to specify that a certain file type should not be preserved by preceding the corresponding letter by - in the -P option. The only really useful application of this is to use -P-o to cancel the cc convention on preserving binary object files mentioned above.
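As a rough illustration of how a -Ptype... option decomposes, the Python sketch below separates the letters to preserve from those preceded by -. It is a sketch under assumptions: the letter a is treated as "all" only when it is the whole option body, a bare -P is rejected (it is not a preserve request), and the function name parse_preserve is hypothetical.

```python
def parse_preserve(arg):
    """Split '-Ptype...' into (types to keep, types to drop, keep_all)."""
    body = arg[2:]
    if not arg.startswith("-P") or not body:
        raise ValueError("expected -Ptype...: " + arg)
    if body == "a":
        # Special form: preserve all intermediate files.
        return set(), set(), True
    keep, drop = set(), set()
    negate = False
    for letter in body:
        if letter == "-":
            negate = True  # the next letter names a type NOT to preserve
        elif negate:
            drop.add(letter)
            negate = False
        else:
            keep.add(letter)
    return keep, drop, False


if __name__ == "__main__":
    for option in ["-Pjt", "-P-o", "-Pa"]:
        print(option, parse_preserve(option))
```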

By default, all preserved files are stored in the current working directory. However, the -work dir command-line option specifies that they should be stored in the directory dir.

The compilation can also be halted at any stage. The -Ftype option to tcc tells it to stop the compilation after creating the files of the type corresponding to the letter type. Because any files of this type which are produced will be an end product of the compilation, they will automatically be preserved. For example, -Fo halts the compilation after the creation of the binary object, or .o, files (i.e. just before the system linking), and preserves all such files produced. A number of other tcc options are equivalent to options of the form -Ftype:

  • -i is equivalent to -Fj (i.e. just apply the producer),

  • -S is equivalent to -Fs (cc compatibility),

  • -c is equivalent to -Fo (cc compatibility).

If more than one -F option (including the equivalent options just listed) is given, then tcc issues a warning. The stage coming first in the compilation path takes priority.
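The priority rule for multiple halting options can be sketched as below. The Python code is illustrative only: it considers just the four file types on the main path (j, t, s, o), the function name halt_type is hypothetical, and the boolean it returns merely signals that a warning would be due.

```python
# File types in the order they are produced along the main path.
PATH_ORDER = ["j", "t", "s", "o"]

# Options documented as equivalent to a -Ftype option.
EQUIVALENTS = {"-i": "-Fj", "-S": "-Fs", "-c": "-Fo"}


def halt_type(args):
    """Return (type letter at which to halt, whether a warning is due)."""
    letters = []
    for arg in args:
        arg = EQUIVALENTS.get(arg, arg)
        if arg.startswith("-F") and len(arg) == 3:
            letters.append(arg[2])
    if not letters:
        return None, False
    # The stage coming first in the compilation path takes priority.
    earliest = min(letters, key=PATH_ORDER.index)
    return earliest, len(letters) > 1


if __name__ == "__main__":
    print(halt_type(["-c", "-S", "file.c"]))
```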

If the compilation has precisely one end product output file, then the name of this file can be specified to be file by means of the -o file command-line option. If a -o file option is given when there is more than one end product, then the first such file produced will be called file, and all such files produced subsequently will cause tcc to issue a warning.

Figure 2.2. Full tcc Compilation Path


2.5. Other Compilation Paths

So far we have been discussing the main tcc compilation path from C source to executable. This is, however, only part of the picture. The full complexity (almost) of all the possible compilation paths supported by tcc is shown in Figure 2.2. This differs from Figure 2.1 in that it only shows the left hand, or program, half of the main compilation diagram. The solid arrows show the default compilation paths; the shaded arrows are only followed if tcc is so instructed by means of command-line options. Let us consider those paths in this diagram which have not so far been mentioned.

2.5.1. Preprocessing

The first paths to be considered involve preprocessed C source files. These form a distinct file type which tcc recognises by means of the .i file suffix. Input .i files are treated in exactly the same way as .c files; that is, they are fed into the producer.

tcc can be made to preprocess the C source files it is given by means of the -P and -E options. If the -P option is given then each .c file is transformed into a corresponding .i file by the TDF C preprocessor, tdfcpp. If the -E option is given then the output of tdfcpp is sent instead to the standard output. In both cases the compilation halts after the preprocessor has been applied. Preprocessing is discussed further in section 4.6.

2.5.2. TDF Archives

The second new file type introduced in Figure 2.2 is the TDF archive. This is recognised by tcc by means of the .ta file suffix. Basically a TDF archive is a set of target independent TDF capsules (this is slightly simplified, see section 4.2.3 for more details). Any input TDF archives are automatically split into their constituent target independent capsules. These then join the main compilation path in the normal way.

In order to create a TDF archive, tcc must be given the -prod command-line option. It will combine all the target independent TDF capsules it has into an archive, and the compilation will then halt. By default this archive is called a.ta, but another name may be specified using the -o option.

The routines for splitting and building TDF archives are built into tcc, and are not implemented by a separate compilation tool (in particular, TDF archives are not ar archives). Really TDF archives are a tcc-specific construction; they are not part of TDF proper.

2.5.3. TDF Notation

TDF has the form of an abstract syntax tree which is encoded as a series of bits. In order to examine the contents of a TDF capsule it is necessary to translate it into an equivalent human readable form. Two tools are provided which do this. The TDF pretty printer, disp, translates TDF into text, whereas the TDF notation compiler, tnc, translates both TDF to text and text to TDF. The two textual forms of TDF are incompatible - disp output cannot be used as tnc input. disp is in many ways the more sophisticated decoder - it understands the TDF extensions used to handle diagnostics, for example - but it does not handle the text to TDF translation which tnc does. By default tnc is a text to TDF translator; it needs to be passed the -p flag in order to translate TDF into text. We refer to the textual form of TDF supported by tnc as TDF notation.

By default, tcc uses disp. If it is given the -disp command-line option then all target independent TDF capsules (.j files) are transformed into text using disp. The -disp_t option causes all target dependent TDF capsules (.t files) to be transformed into text. In both cases the output files have a .p suffix, and the compilation halts after they are produced.

In order for tnc to be used, the -Ytnc flag should be passed to tcc. In this case the -disp and -disp_t options cause tnc -p, rather than disp, to be invoked. This flag also causes tcc to recognise files with a .p suffix as TDF notation source files. These are translated by tnc into target independent TDF capsules, which join the main compilation path in the normal way.

Similarly if the -Ypl_tdf flag is passed to tcc then it recognises files with a .pl suffix as PL_TDF source files. These are translated by the PL_TDF compiler, pl, into target independent TDF capsules.

disp and tnc are further discussed in section 4.7.

2.5.4. Merging TDF Capsules

The final unexplored path in Figure 2.2 is the ability to combine all the target independent TDF capsules into a single capsule. This is specified by means of the -M command-line option to tcc. The combination of these capsules is performed by the TDF linker, tld. Whereas in the main compilation path tld is used to combine a single target independent TDF capsule with the TDF libraries to form a target dependent TDF capsule, in this case it is used to combine several target independent capsules into a single target independent capsule. By default the combined capsule is called a.j. The compilation will continue after the combination phase, with the resultant capsule rejoining the main compilation path. This merging operation is further discussed in section 4.2.2.

The only unresolved issue in this case is, if the -M option is given, to what .j files do the -Fj and the -Pj options refer? In fact, tcc takes them to refer to the merged TDF capsule rather than the capsules which are merged to form it. The -Pa option, however, will cause both sets of capsules to be preserved.

To summarise, tcc has an extra three file types, and an extra three compilation tools (not including the TDF archive creating and splitting routines which are built into tcc). These are:

  • files ending in .i are understood to be preprocessed C source files,

  • files ending in .ta are understood to be TDF archives,

  • files ending in .p are understood to be TDF notation source files,

and:

    Number  Tool                       Letter  Input         Output
    6       C preprocessor (tdfcpp)    c       C source      preproc. C source
    7a      pretty printer (disp)      d       TDF capsule   TDF notation
    7b      reverse notation (tnc -p)  d       TDF capsule   TDF notation
    8       notation compiler (tnc)    d       TDF notation  TDF capsule

(see sections 6.1 and 6.2 for complete lists).

2.6. Finding out what tcc is doing

With so many different file types and alternative compilation paths, it is often useful to be able to keep track of what tcc is doing. There are several command-line options which do this. The simplest is -v which specifies that tcc should print each command in the compilation process on the standard output before it is executed. The -vb option is similar, but only causes the name of each input file to be printed as it is processed. Finally the -dry option specifies that the commands should be printed (as with -v) but not actually executed. This can be used to experiment with tcc to find out what it would do in various circumstances.

Occasionally an unclear error message may be printed by one of the compilation tools. In this case the -show_errors option to tcc might be useful. It causes tcc to print the command it was executing when the error occurred. By default, if an error occurs during the construction of an output file, the file is removed by tcc. It can however be preserved for examination using the -keep_errors option. This applies not only to normal errors, but also to exceptional errors such as the user interrupting tcc by pressing ^C, or one of the compilation tools crashing. In the latter case, tcc will also remove any core file produced, unless the -keep_errors option is specified.

For purposes of configuration control, the -version flag will cause tcc to print its version number. This will typically be of the form:

          tcc: Version: 4.0, Revision: 1.5, Machine: hp
       

giving the version and revision number, plus the target machine identifier. The -V flag will also cause each compilation tool to print its version number (if appropriate) as it is invoked.

Chapter 3. tcc Environments

In addition to command-line options, there is a second method of specifying tcc's behaviour, namely tcc environments. An environment is just a file consisting of lines of the form:

    *IDENTIFIER "text"
     

where * stands for one of the environment prefixes +, < and >. (In fact ? is also a valid environment prefix: it is used to query the values represented by environmental identifiers. If tcc is invoked with the -Ystatus command-line option, it will print the values of all the environmental identifiers it recognises.) Any line in the environment not beginning with one of these characters is ignored. IDENTIFIER will be one of the environmental identifiers recognised by tcc; the environment prefix tells tcc how to modify the value given by this identifier, and text what to modify it by.
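The line format lends itself to a small parser. The Python sketch below is an illustration rather than a description of tcc's own reader: the regular expression, the rule that identifiers consist of upper-case letters, digits and underscores, and the function name parse_environment are all assumptions made here. Lines that do not start with a prefix character, or that do not fit the pattern, are simply skipped.

```python
import re

# prefix, identifier, then the quoted text; nothing may precede the prefix.
LINE = re.compile(r'^([+<>?])([A-Z0-9_]+)\s+"(.*)"\s*$')


def parse_environment(text):
    """Return a list of (prefix, identifier, text) entries."""
    entries = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:  # any other line is ignored
            entries.append(match.groups())
    return entries


if __name__ == "__main__":
    sample = '\n'.join([
        'This line is ignored because it has no prefix.',
        '+FLAG "-v"',
        '<INCL "-Isome/dir"',  # hypothetical directory
    ])
    for entry in parse_environment(sample):
        print(entry)
```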

The simplest environmental identifiers are those which are used to pass flags to tcc and the various components of the compilation system. The line:

    +FLAG "text"
     

causes text to be interpreted by tcc as if it were a command-line option. Similarly:

    +FLAG_TDFC "text"
     

causes text to be passed as an option to tdfc. There are similar environmental identifiers for each of the components of the compilation system (see section 6.6 for a complete list).

The second class of environmental identifiers are those corresponding to simple string variables. Only the form:

    +IDENTIFIER "text"
     

is allowed. This will set the corresponding variable to text. The permitted environmental identifiers and the corresponding variables are:

    ENVDIR      the default environments directory (see section 3.1),
    MACHINE     the target machine type (see section 3.2),
    PORTABILITY the producer portability table (see section 4.1.3),
    TEMP        the default temporary directory (see section 5.4),
    VERSION     the target machine version (MIPS® only, see section 4.3.4).
     

The final class of environmental identifiers are those corresponding to lists of strings. Firstly text is transformed into a list of strings, b say, by splitting at any spaces, then the list corresponding to the identifier, a say, is modified by this value. How this modification is done depends on the environment prefix:

  • if the prefix is + then a = b,

  • if the prefix is > then a = a + b,

  • if the prefix is < then a = b + a,

where + denotes concatenation of lists. The lists represented in this way include those giving the pathnames of the executables of the various compilation components (plus default flags). These are given by the identifiers TDFC, TLD, etc. (see section 6.6 for a complete list). The other lists can be divided between those affecting the producer, the TDF linker, and the system linker respectively (see sections 4.1, 4.2 and 4.5 for more details):

    INCL        list of default producer include file directories (as -I options),
    STARTUP     list of default producer start-up files (as -f options),
    STARTUP_DIR list of default producer start-up directories (as -I options),

    LIB         list of default TDF libraries (as -l options),
    LINK        list of default TDF library directories (as -L options),

    CRT0        list of default initial .o files,
    CRT1        second list of default initial .o files,
    CRTN        list of default final .o files,
    SYS_LIB     list of default system libraries (as -l options),
    SYS_LINK    list of default system library directories (as -L options),
    SYS_LIBC    list of default standard system libraries (as -l options).
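The three prefix rules for list-valued identifiers can be written down directly. The following Python sketch mirrors them; the function name modify is hypothetical and the option strings in the usage example are made up.

```python
def modify(current, prefix, text):
    """Apply one environment line to a list-valued identifier.

    current is the existing list (a), and text is split at spaces
    to give the new list (b).
    """
    new = text.split()
    if prefix == "+":
        return new            # a = b
    if prefix == ">":
        return current + new  # a = a + b
    if prefix == "<":
        return new + current  # a = b + a
    raise ValueError("unknown environment prefix: " + prefix)


if __name__ == "__main__":
    incl = ["-Ia"]
    print(modify(incl, "+", "-Ib -Ic"))
    print(modify(incl, ">", "-Ib -Ic"))
    print(modify(incl, "<", "-Ib -Ic"))
```

Because + replaces the list while < and > extend it, the order in which environments are read affects the final value.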
       

3.1. The Environment Search Path

The command-line option -Yenv tells tcc to read the environment env. If env is not a full pathname, then it is searched for along the environment search path, which is a colon-separated list of directories. The initial segment of this path is given by the system variable TCCENV, if this is defined. (We use the term "system variable" to describe TCCENV rather than the more normal "environmental variable" to avoid confusion with tcc environments.) The final segment consists of the default environments directory, which is built into tcc at compile-time, and the current working directory. The option -show_env causes tcc to print this environment search path. If the environment cannot be found, then a warning is issued.
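The search can be sketched as follows. This Python code is a model under assumptions, not tcc's implementation: it assumes the directories are tried in order and the first match wins, that the default environments directory precedes the current working directory, and that a "full pathname" means one beginning with /. The function names and all directory names in the example are hypothetical.

```python
import os
import posixpath


def environment_search_path(tccenv, default_dir, cwd="."):
    """Build the list of directories searched for an environment."""
    dirs = []
    if tccenv:  # TCCENV may be undefined or empty
        dirs.extend(d for d in tccenv.split(":") if d)
    dirs.append(default_dir)
    dirs.append(cwd)
    return dirs


def find_environment(env, dirs, exists=os.path.isfile):
    """Return the pathname of the environment env, or None if not found."""
    if env.startswith("/"):
        return env if exists(env) else None
    for directory in dirs:
        candidate = posixpath.join(directory, env)
        if exists(candidate):
            return candidate
    return None  # tcc issues a warning in this case


if __name__ == "__main__":
    path = environment_search_path(os.environ.get("TCCENV"),
                                   "/hypothetical/tcc/env")
    print(":".join(path))
    print(find_environment("posix", path))
```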

3.2. The Default Environment: Configuring tcc

The most important environment is the default environment, which is built into tcc at compile-time. This does not mean that the default environment is read every time that tcc is invoked, but rather that it is read once (at compile-time) to determine the default configuration of tcc.

The information included in the default environment includes: the pathnames and default flags of the various components of the compilation system; the target machine type; the default temporary directory; the specification of the target independent headers, TDF libraries and system libraries comprising the default API (which is always ANSI); the variables specifying the default compilation mode; the default environments directory (mentioned above).

The target machine type, defined by the MACHINE environmental identifier, actually plays a very minor role in dealing with the very real target dependency problems in tcc. These problems are caused by the fact that tcc is designed to work on many different target machines. All the information on where the executables, include files, TDF libraries etc. are located on a particular machine is stored in the standard environments, and in particular, the default environment. The interaction with the system assembler and, more importantly, the system linker is also expressed using environments. The only target dependencies for which the machine type needs to be known are genuine aberrations. For example, the TDF to MIPS® translator and the MIPS® assembler are completely different from most other translator-assembler pairs in that they pass two files, a .G and a .T file, between them, rather than the more normal single .s file. Thus it is important for tcc to know that the machine type is MIPS® in this case (see section 4.3.4 for more details).

3.3. Using Environments to Specify APIs

Another important use of environments concerns their use in specifying APIs. As was mentioned above, an API may be considered to have three components: the target independent headers, giving an abstract description of the API to the producer, and the TDF libraries and system libraries, giving the details of the API implementation to the installer. Environments are an ideal medium for expressing this information. The INCL environmental identifier can be used to specify the location of the target independent headers, LIB and LINK the location of the TDF libraries, and SYS_LIB and SYS_LINK the location of the system libraries. Moreover, all this information can be canned into a single command-line option.

A number of standard APIs have been described as target independent headers and are provided with the TDF system. A tcc environment is provided for each of these APIs (for example, ansi, posix, xpg3 - see 7.5 for a complete list, also see section 6.3). There is an important distinction to be made between base APIs (for example, POSIX) and extension APIs (for example, X11 Release 5). The command-line option -Yposix sets the API to be precisely POSIX, whereas the option -Yx5_lib sets it to the existing API plus the X11 Release 5 basic X library. This is done by using +INCL etc. in the posix environment to set the various variables corresponding to these environmental identifiers to precisely the values for POSIX, but <INCL etc. in the x5_lib environment to extend these variables by the values for X11 Release 5. Thus, to specify the API POSIX plus X11 Release 5, the command-line options -Yposix -Yx5_lib are required (in that order).

All the standard API environments provided also contain lines which set, or modify, the INFO environmental identifier. This contains textual information on the API, including API names and version numbers. This information can be printed by invoking tcc with the -info command-line option. For example, the command-line options:

        > tcc -info -Yposix -Yx5_lib
       

cause the message:

        tcc: API is X11 Release 5 Xlib plus POSIX (1003.1).
       

to be printed.
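
The difference between the + and < forms can be modelled in a few lines of C. The following is only a sketch, not tcc's own code: the type env_var and the function env_apply are hypothetical names, and it assumes that < places the new text in front of the existing value, which is consistent with the message printed above.

```c
#include <stdio.h>
#include <string.h>

#define ENV_VAR_MAX 256

/* Hypothetical model of one tcc variable, such as the one behind INFO. */
struct env_var {
    char value[ENV_VAR_MAX];
};

/* Apply one environment line to a variable.
   op '+' replaces the value with text.
   op '<' puts text in front of the existing value (assumed semantics).
   Returns 0 on success, -1 if op is unknown or the result does not fit. */
int env_apply(struct env_var *v, char op, const char *text)
{
    char tmp[ENV_VAR_MAX];
    int n;

    if (op == '+') {
        n = snprintf(tmp, sizeof tmp, "%s", text);
    } else if (op == '<') {
        n = snprintf(tmp, sizeof tmp, "%s%s", text, v->value);
    } else {
        return -1;
    }
    if (n < 0 || (size_t)n >= sizeof tmp) {
        return -1;
    }
    memcpy(v->value, tmp, (size_t)n + 1);
    return 0;
}
```

Applying + with the text "POSIX (1003.1)" and then < with the text "X11 Release 5 Xlib plus " leaves the variable holding the API description seen in the message. Reversing the order would discard the X11 text, which is why the order of the -Y options matters.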

As was mentioned above, the default API is ANSI. Thus invoking tcc without specifying an API environment is equivalent to giving the -Yansi command-line option. On the basis that, when it comes to portability, explicit decisions are better than implicit ones, the use of -Yansi is recommended.

3.4. Using Environments to Implement tcc Options

Another use to which environments are put is to implement certain tcc command-line options. In particular, some options require different actions depending on the target machine. It is far easier to implement these by means of an environment, which can be defined differently on each target machine, rather than by trying to build all the alternative definitions into tcc.

An important example is the -g flag, which causes the generation of information for symbolic debugging. Depending on the target machine, different flags may need to be passed to the assembler and system linker when -g is specified, or the default .o files and libraries used by the linker may need to be changed. For this reason tcc uses a standard environment, tcc_diag, to implement the -g option.

For a complete list of those options which are implemented by means of environments, see 7.7. If the given option is not supported on a particular target machine, then the corresponding environment will not exist, and tcc will issue a warning to that effect.

3.5. User-Defined Environments

The tcc user can also set up and use environments. It is anticipated that this facility will be used mainly to group a number of tcc command-line options into an environment using the FLAG environmental identifier and to set up environments corresponding to user-defined APIs.

Chapter 4. The Components of the TDF System

4.1. The C to TDF Producer

We now turn to the individual components of the TDF system. Most of the command-line options to tcc so far discussed have been concerned with controlling the behaviour of tcc itself. Another, even more important, class of options concerns the ways in which the behaviour of the components can be specified. The -Wtool, opt, ... command-line option for communicating directly with the components has already been mentioned. This however is not recommended for normal purposes; the other tcc command-line options give a more controlled access to the components.

The first component to be considered is the C --> TDF producer, tdfc. This translates an input C source file (a .c file or a .i file) into an output target independent TDF capsule (a .j file).

4.1.1. Include File Directories

The most important producer options are those which tell it where to search for files included using a #include preprocessing directive. As with cc, the user can specify a directory, dir, to search for these files using the -Idir command-line option. However, unlike cc, the producer does not search /usr/include by default. Instead, the default search directories are those containing the target independent headers for the API selected, as given by the INCL identifier in the environment describing the API. In addition, the directories to search for the default start-up files (see below), as given by the STARTUP_DIR environmental identifier, are also passed to the producer.

If the -H option is passed to tcc then it will cause the producer to print the name of each file it opens. This is often helpful if a multiplicity of -I options leads to confusion.

4.1.2. Start-up Files and End-up Files

The producer has a useful feature of start-up and end-up files. The tcc command-line option -ffile is equivalent to inserting the line:

            #include "file"
           

at the start of each input C source file. Similarly -efile is equivalent to inserting this line at the end of each such file. These included files are searched for along the directories specified by the -I options in the normal manner.

tcc generates a producer start-up file, called tcc_startup.h, in order to implement certain command-line options. The cc-compatible options:

            -Dname
            -Dname=value
            -Uname
            -Astr
           
are translated into the lines:

            #define name 1
            #define name value
            #undef name
            #assert str
           

respectively. tcc does not check that these lines are valid C preprocessing directives since this will be done by the producer. So any producer error message referring to tcc_startup.h is likely actually to refer to the -D, -U and -A command-line options. In case of difficulties, tcc_startup.h can be preserved for closer examination using the -Ph option to tcc.
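
The translation just described is mechanical, and can be sketched as follows. This is an illustration rather than the code tcc actually uses; the function name startup_line is hypothetical. Like tcc, the sketch makes no attempt to check that the resulting line is a valid preprocessing directive.

```c
#include <stdio.h>
#include <string.h>

/* Turn one cc-compatible option into the line that would be written to
   tcc_startup.h. Returns 0 on success, -1 if opt is not a -D, -U or -A
   option with an argument, or if the line does not fit in out. */
int startup_line(const char *opt, char *out, size_t size)
{
    const char *arg;
    const char *eq;
    int n;

    if (opt[0] != '-' || opt[1] == '\0' || opt[2] == '\0') {
        return -1;
    }
    arg = opt + 2;

    switch (opt[1]) {
    case 'D':
        eq = strchr(arg, '=');
        if (eq == NULL) {
            /* -Dname defines name as 1 */
            n = snprintf(out, size, "#define %s 1", arg);
        } else {
            /* -Dname=value: split at the first '=' */
            n = snprintf(out, size, "#define %.*s %s",
                         (int)(eq - arg), arg, eq + 1);
        }
        break;
    case 'U':
        n = snprintf(out, size, "#undef %s", arg);
        break;
    case 'A':
        if (strcmp(arg, "-") == 0) {
            return -1; /* -A- is ignored by tcc, so no line is produced */
        }
        n = snprintf(out, size, "#assert %s", arg);
        break;
    default:
        return -1;
    }
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Under these assumptions, startup_line("-DN=10", buf, sizeof buf) leaves the text "#define N 10" in buf, and any error the producer reports for that line really refers to the -DN=10 option.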

There may be default start-up options specified by the STARTUP environmental identifier. The purpose of these is discussed below. The order in which the start-up options are passed to the producer is: firstly, the default start-up options; secondly, the start-up option for the tcc built-in start-up file, tcc_startup.h; thirdly, any command-line start-up options. (For technical reasons, a -no_startup_options command-line option is provided which causes no start-up or end-up options to be passed to tdfc. This is not likely to prove useful in normal use.)

4.1.3. Compilation Modes and Portability Tables

We have already described how one aspect of the compilation environment, the API, is specified to the producer by means of the default -I options. But another aspect, the control of the syntax and portability checks applied by the producer, can also be specified in a fairly precise manner.

The producer accepts a number of #pragma statements which tell it which portability checks to apply and which syntactic extensions to ISO/ANSI C to allow (see [3] and [2]). These can be inserted into the main C source, but the ideal place for them is in a start-up file. This is the purpose of the STARTUP environmental identifier, to give a list of default start-up files containing #pragma statements which specify the default behaviour of the producer.

In fact not all the information the producer requires is obtained through start-up files. The basic information on the minimum sizes which can be assumed for the basic integer types is passed to the producer by means of another type of file, the portability table. This is specified by means of the PORTABILITY environmental identifier. There are in fact only two portability tables provided, Ansi_Max.pf, which specifies the minimum sizes permitted by the ISO/ANSI standard, and Common.pf, which specifies the minimum sizes found on most 32-bit machines. The main difference between the two is that in ISO/ANSI it is stated that int is only guaranteed to have 16 bits, whereas on 32-bit machines it has at least 32 bits.
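
To make the difference between the two tables concrete, the following sketch tests whether a value fits in an int under each assumption. The names fits_in_int, MODEL_ISO_MINIMUM and MODEL_32BIT are hypothetical, and the limits are the usual 16-bit and 32-bit magnitudes rather than values read from Ansi_Max.pf or Common.pf.

```c
/* Largest magnitude guaranteed for int by ISO/ANSI C (16 bits). */
#define ISO_INT_LIMIT 32767L

/* Largest magnitude assumed for int on a typical 32-bit machine. */
#define COMMON_INT_LIMIT 2147483647L

/* Hypothetical stand-ins for the two portability tables. */
enum int_model {
    MODEL_ISO_MINIMUM, /* in the spirit of Ansi_Max.pf */
    MODEL_32BIT        /* in the spirit of Common.pf */
};

/* Returns 1 if value can be stored in an int on every machine that
   satisfies the given model, and 0 otherwise. */
int fits_in_int(long value, enum int_model model)
{
    long limit = (model == MODEL_ISO_MINIMUM) ? ISO_INT_LIMIT
                                              : COMMON_INT_LIMIT;

    return value >= -limit && value <= limit;
}
```

A constant such as 100000 passes under the 32-bit model but fails under the ISO/ANSI minimum, which is the kind of construct that a check against the stricter table can be expected to report.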

A number of tcc command-line options are concerned with specifying the compilation environment to the producer. The main option for setting the compilation mode is -Xmode. A number of different modes are available:

  • -Xs specifies strict ISO/ANSI C with extra portability checks,

  • -Xp specifies strict ISO/ANSI C with minimal portability checks,

  • -Xc specifies strict ISO/ANSI C with no extra portability checks,

  • -Xa specifies ISO/ANSI C with various syntactic extensions,

  • -Xt specifies "traditional" C.

The default is -Xc. For a precise description of each of these modes, see [3] (tchk is just tcc in disguise). In addition the command-line options -not_ansi and -nepc can be used to modify the basic compilation modes. -not_ansi specifies that certain non-ANSI syntactic constructions should be allowed. -nepc switches off the producer's extra portability checks (it also suppresses certain constant overflow checks in the TDF translators). All these options are implemented by start-up files.

The portability table to be used is specified separately by means of an environment. The default is the ISO/ANSI portability table, but -Y32bit or -Ycommon can be used to specify 32-bit checking. -Y16bit will restore the portability table to the default. Note that all checks involving the portability table are switched off by the -nepc command-line option, so in this case no portability table is specified to the producer.
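
The selection rule in this paragraph can be summarised by a short sketch. The function portability_table is a hypothetical name, and the mapping of -Y32bit and -Ycommon to Common.pf is an assumption drawn from the description of the two tables above rather than from the environment files themselves.

```c
#include <string.h>

/* Decide which portability table would be named to the producer for a
   given list of tcc options. Later -Y options override earlier ones.
   Returns NULL when -nepc is present, since no table is passed then. */
const char *portability_table(int argc, const char *const argv[])
{
    const char *table = "Ansi_Max.pf"; /* the ISO/ANSI default */
    int nepc = 0;
    int i;

    for (i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-Y32bit") == 0 ||
            strcmp(argv[i], "-Ycommon") == 0) {
            table = "Common.pf";
        } else if (strcmp(argv[i], "-Y16bit") == 0) {
            table = "Ansi_Max.pf";
        } else if (strcmp(argv[i], "-nepc") == 0) {
            nepc = 1;
        }
    }
    return nepc ? NULL : table;
}
```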

4.1.4. Description of Compilation Modes

Let us briefly describe the compilation modes introduced in the previous section. The following tables describe some of the main features of each mode. The list of pre-defined macros is complete (other than the built-in macros, __FILE__, __LINE__, __DATE__ and __TIME__); because the producer is designed to be target independent, it does not define any of the machine name macros which are built into cc. The cc-compatible option, -A-, which is meant to cause all pre-defined macros (other than those beginning with __) to be undefined, and all pre-assertions to be unasserted, is ignored by tcc. In the standard compilation modes there are no such macros and no such assertions. The integer promotion rules are either the arithmetic rules specified by ISO/ANSI or the "traditional" signed promotion rules. The precise set of syntactic relaxations to the ISO/ANSI standard allowed by each mode varies. For a complete list see [3]. The -not_ansi command-line option can be used to allow further relaxations. The extra prototype checks cause the producer to construct a prototype for procedures which are actually traditionally defined. This is very useful for getting prototype-like checking without having to use prototypes in function definitions. This, and other portability checks, are switched off by the -nepc option. Finally, the additional checks are lint-like checks which are useful in detecting possible portability problems.

            compilation mode: -Xs
            description: strict ISO/ANSI with additional checks
            pre-defined macros: __STDC__ = 1, __ANDF__ = 1, __TenDRA__ = 1
            integer promotions: ISO/ANSI
            syntactic relaxations: no
            extra prototype checks: yes
            additional checks: yes
       
            compilation mode: -Xp
            description: strict ISO/ANSI with minimal extra checks
            pre-defined macros: __STDC__ = 1, __ANDF__ = 1, __TenDRA__ = 1
            integer promotions: ISO/ANSI
            syntactic relaxations: no
            extra prototype checks: yes
            additional checks: some
       
            compilation mode: -Xc
            description: strict ISO/ANSI with no extra checks
            pre-defined macros: __STDC__ = 1, __ANDF__ = 1, __TenDRA__ = 1
            integer promotions: ISO/ANSI
            syntactic relaxations: no
            extra prototype checks: no
            additional checks: no
       
            compilation mode: -Xa (default)
            description: lenient ISO/ANSI with no extra checks
            pre-defined macros: __STDC__ = 1, __ANDF__ = 1, __TenDRA__ = 1
            integer promotions: ISO/ANSI
            syntactic relaxations: yes
            extra prototype checks: no
            additional checks: no
       
            compilation mode: -Xt
            description: traditional C
            pre-defined macros: __STDC__ = 0, __ANDF__ = 1, __TenDRA__ = 1
            integer promotions: signed
            syntactic relaxations: yes
            extra prototype checks: no
            additional checks: no
         

The choice of compilation mode very much depends on the level of checking required. -Xa is suitable for general compilation, and -Xc, -Xp and -Xs for serious program checking (although some may find the latter Xs-ive). -Xt is provided for cc compatibility only; its use is discouraged.
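
The pre-defined macros listed in the tables can be tested from source code in the usual way. The following sketch uses two hypothetical helper functions, compiled_by_tendra and stdc_value, to report what the translation unit sees; with another compiler the first simply returns 0.

```c
/* Returns 1 when the source is compiled by a producer that defines
   __TenDRA__ (as every mode in the tables does), and 0 otherwise. */
int compiled_by_tendra(void)
{
#ifdef __TenDRA__
    return 1;
#else
    return 0;
#endif
}

/* Returns the value of __STDC__ seen by this translation unit: 1 in the
   ISO/ANSI modes and 0 in -Xt, according to the tables. Returns -1 if a
   compiler does not define the macro at all. */
int stdc_value(void)
{
#ifdef __STDC__
    return __STDC__;
#else
    return -1;
#endif
}
```

Since no machine name macros are defined by the producer, tests of this kind are the only pre-defined hooks available to target independent source.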

The recommended method of proceeding is to define your own compilation mode. In this way any choices about syntax and portability checking are made into conscious decisions. One still needs to select a basic mode to form the basis for this user-defined mode. -Xc is probably best; it is a well-defined mode (the definition being the ISO/ANSI standard) and so forms a suitable baseline. Suppose that, on examining the program to be compiled, we decide that we need to do the following:

  • allow the #ident directive,

  • allow through unknown escape sequences with a warning,

  • warn of uses of undeclared procedures,

  • warn of incorrect uses of simple return statements.

The first two of these are syntactic in nature. The third is more interesting. ISO/ANSI says that any undeclared procedures are assumed to return int. However for strict API checking we really need to know about these undeclared procedures, because they may be library routines which are not part of the declared API. The fourth condition is a simple lint-like check that no procedure which is declared to return a value contains a simple return statement (without a return value).

To tell the producer about these options, it is necessary to have them included in every source file. The easiest way of doing this is by using a start-up file, check.h say, containing the lines:

            #pragma TenDRA begin
            #pragma TenDRA directive ident allow
            #pragma TenDRA unknown escape warning
            #pragma TenDRA implicit function declaration warning
            #pragma TenDRA incompatible void return warning
         

The second, third, fourth and fifth lines correspond to the statements above (see [3]). The first line indicates that this file is defining a new checking scope.

Once the compilation mode has been described in this way, it needs to be specified to tcc in the form of the command-line options -Xc -fcheck.h.

4.2. The TDF Linker

The next component of the system to be considered is the TDF linker, tld. This is used to combine several TDF capsules or TDF libraries into a single TDF capsule. It is put to two distinct purposes in the tcc compilation scheme. Firstly, in the main compilation path, it is used in the installer half to combine a target independent TDF capsule (a .j file) with the TDF libraries representing the API implementation on the target machine, to form a target dependent TDF capsule (a .t file). Secondly, if the -M option is given to tcc, it is used in the producer half to combine all the target independent TDF capsules (.j files) into a single target independent capsule. Let us consider these two cases separately.

4.2.1. The Linker and TDF Libraries

In the main TDF linking phase, combining target independent capsules with TDF libraries to form target dependent capsules, two pieces of information need to be specified to tld. Firstly, the TDF libraries to be linked with, and, secondly, the directories to search for these libraries. For standard APIs, the location of the TDF libraries describing the API implementation is given in the environment corresponding to the API. The LIB identifier gives the names of the TDF libraries, and the LINK identifier the directories to be searched for these libraries. The user can also specify libraries and library directories by means of command-line options to tcc. The option -jstr indicates that the TDF library str.tl should be used for linking (.tl is the standard suffix for TDF libraries). The option -Jdir indicates that the directory dir should be added to the TDF library search path. Libraries and directories specified by command-line options are searched before those given in the API environment.

There is a potential source of confusion in that the tld options specifying the TDF library str.tl and the library directory dir are respectively -lstr and -Ldir. tcc automatically translates command-line -j options into tld -l options, and command-line -J options into tld -L options. However the LIB and LINK identifiers are actually lists of tld options, so they should use the -l and -L forms.
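
The renaming of options between tcc and tld can be sketched as below. The function tld_option is a hypothetical name used only for illustration; it covers just the two options discussed here.

```c
#include <stdio.h>

/* Translate a tcc -jstr or -Jdir option into the corresponding tld
   option (-lstr or -Ldir). Returns 0 on success, -1 if tcc_opt is not
   a -j or -J option with an argument, or if the result does not fit. */
int tld_option(const char *tcc_opt, char *out, size_t size)
{
    char letter;
    int n;

    if (tcc_opt[0] != '-' || tcc_opt[1] == '\0' || tcc_opt[2] == '\0') {
        return -1;
    }
    if (tcc_opt[1] == 'j') {
        letter = 'l';
    } else if (tcc_opt[1] == 'J') {
        letter = 'L';
    } else {
        return -1;
    }
    n = snprintf(out, size, "-%c%s", letter, tcc_opt + 2);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

Entries in the LIB and LINK identifiers need no such translation, since they are already written in the tld form.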

4.2.2. Combining TDF Capsules

The second use of tld is to combine all the .j files in the producer half of the compilation into a single capsule. This is specified by means of the -M ("merge") command-line option to tcc described in section 3.5.4. By default, the resultant capsule is called a.j. If the -M option is used to merge all the .j files from a very large program, the resultant TDF capsule can in turn be very large. It may in fact become too large for the installer to handle. Interestingly, it is often the system assembler rather than the TDF translator which has problems.

The -MA ("merge all") option is similar to -M, but will in addition "hide" all the external tag and token names in the resultant capsule, except for the token names required for linking with the TDF libraries and the tag names required for linking with the system libraries (plus main). In effect, all the names which are internal to the program are removed. This means that the -MA option should only be used to merge complete programs. For details on how to use tld for more selective name hiding, see below.

4.2.3. Constructing TDF Libraries

There is a final use of the TDF linker supported by tcc which has not so far been mentioned, namely the construction of TDF libraries. As has been mentioned, TDF libraries are an indexed set of TDF capsules. tld, in addition to its linking mode, also has routines for constructing and manipulating TDF libraries. The library construction mode is supported by tcc by means of the makelib environment. This tells tcc to merge all the .j files and then to stop. But it also passes an option to tld which causes the merged file to be, not a TDF capsule, but a TDF library. Thus the command-line options:

            > tcc -Ymakelib -o a.tl a.j b.j c.j
         

cause the TDF capsules a.j, b.j and c.j to be combined into a TDF library, a.tl.

4.2.4. Useful tld Options

tld has a number of options which may be useful to the general user. The -w option, which causes warnings to be printed about undefined tags and tokens, can often provide interesting information; other options are concerned with the hiding of tag and token names. These options can be passed directly to tld by means of the -WL, opt, ... command-line option to tcc. The tld options are fully documented on the appropriate manual page.

4.3. The TDF to Target Translator

The next compilation tool to be considered is the TDF translator. This translates an input target dependent TDF capsule (a .t file) into an assembly source file (a .s file) for the appropriate target machine. This is the main code generation phase of the compilation process; most of the optimisation of code which occurs happens in the translator (some machines also have optimising assemblers).

Although referred to by the generic name of trans, the TDF translators for different target machines in fact have different names. The main division between translators is in the supported processor. However, operating system dependent features such as the precise form of the assembler input, and the symbolic debugger to be supported, may also cause different versions of the basic translator to be required for different machines of the same processor group. The current generation of translators includes the following:

  1. The TDF --> i386/i486 translator is called trans386. This exists in two versions, one running on SVR4.2 and one on SCO. The two versions differ primarily in the symbolic debugger they support. trans386 has also been ported to several other i386-based machines, including MS-DOS.

  2. The TDF --> Sparc® (Version 7) translator is called sparctrans. This again exists in two versions, one running on SVR4.2 and one on SunOS™ and Solaris™ 1. These versions again differ primarily in the symbolic debugger supported.

  3. The TDF --> MIPS® (R2000/R3000, little-endian) translator is called mipstrans. This differs from the other translators in that instead of outputting a single .s file, it outputs two files, a binasm file (with a .G suffix) and a symbol table file (with a .T suffix). This is discussed in more detail below. mipstrans runs on Ultrix®, but again has two versions. One runs on Ultrix® 4.1 and earlier, the other on 4.2 and later. This is necessary because of a change in the format of the binasm file between these two releases.

  4. The TDF --> 68030/68040 translator also exists in two versions. One runs on HP-UX® and is called hptrans; the other runs on NeXTSTEP™ and is called nexttrans (however the NeXT® is not a supported platform because of its lack of standard API coverage). These differ, not only in the symbolic debugger supported, but also in the format of the assembly source output.

This list is not intended to be definitive. Development work is proceeding on new translators all the time. Existing translators are also updated to support new operating system releases when this is necessary.

4.3.1. tcc Options Affecting the Translator

A number of tcc command-line options are aimed at controlling the behaviour of the TDF translator. The cc-compatible option -Kitem, ... specifies the behaviour indicated by the argument item. Possible values for item, together with the behaviour they specify, include:

            PIC     causes position independent code to be produced,
            ieee        causes strict conformance to the IEEE floating point standard,
            noieee      allows non-strict conformance to the IEEE standard,
            frame       specifies that a frame pointer should always be used,
            no_frame        specifies that frame pointers need not always be used,
            i386        causes code generation to be tuned for the i386 processor,
            i486        causes code generation to be tuned for the i486 processor,
            P5      causes code generation to be tuned for the P5 processor.
         

Obviously not all of these options are appropriate for all versions of trans. Therefore all -K options are implemented by means of environments which translate item into the appropriate trans options. If a certain item is not applicable on a particular target machine then the corresponding environment will not exist, and tcc will print a warning to this effect.

The cc-compatible -Zstr option is similarly implemented by means of environments. On those machines which support this option it can be used to specify the packing of structures. If str is p1 then they are tightly packed, with no padding. Values of p2 and p4 specify padding to 2 or 4 byte boundaries when appropriate.
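
The effect of the packing values on structure layout can be illustrated with a small calculation. The sketch below assumes that the option behaves like a cap on member alignment, as with the packing controls of other compilers; the function packed_offset is a hypothetical name, and the actual layout rules are those of the translator for the target machine.

```c
#include <stddef.h>

/* Offset at which a member would be placed, given the first free offset
   in the structure, the member's natural alignment in bytes, and the
   packing value (1, 2 or 4 for p1, p2 and p4). The member is aligned to
   the smaller of its natural alignment and the packing value. Both
   natural_align and pack must be at least 1. */
size_t packed_offset(size_t offset, size_t natural_align, size_t pack)
{
    size_t align = natural_align < pack ? natural_align : pack;

    /* Round offset up to the next multiple of align. */
    return (offset + align - 1) / align * align;
}
```

For a structure consisting of a char followed by a member with a natural alignment of 4 bytes, the second member would start at offset 1 under p1, offset 2 under p2, and offset 4 under p4.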

Finally, the tcc command-line option -wsl causes the translator to make all string literals writable. Again, this is implemented by an environment. For many machines this behaviour is default; for others it requires an option to be passed to the translator.

4.3.2. Useful trans Options

For further specifying the behaviour of trans it may be necessary to pass options to it directly. The command-line options implemented by trans vary from machine to machine. The following options are however common to all translators and may prove useful:

            -E      switches off certain constant overflow checks,
            -X      switches off most optimisations,
            -Z      prints the version number(s) of the input capsule.
         

These options may be passed directly to trans by means of the -Wt, opt, ... command-line option to tcc. The -E option is also automatically invoked when the -nepc command-line option to tcc is used. The manual page for the appropriate version of trans should be consulted for more details on these and other, machine dependent, options.

4.3.3. Optimisation in TDF Translators

As has been mentioned, the TDF translator is the main optimising phase of the TDF compilation scheme. All optimisations are believed to be correct and are switched on by default. Thus the standard cc -O option, which is intended to switch optimisations on, has no effect in tcc except to cancel any previous -g option. If, due to a translator bug, a certain piece of code is being optimised incorrectly, then the optimisations can be switched off by means of the -Wt, -X option mentioned above. However this should not normally be necessary.

4.3.4. The MIPS® Translator and Assembler

As has been mentioned, the TDF --> MIPS® translator, mipstrans, is genuinely exceptional in that it outputs a pair of files for each input TDF capsule, rather than a single assembly source file. The general scheme is shown in Figure 4.1.

Figure 4.1. MIPS® Compilation Path


mipstrans translates each input target dependent TDF capsule, a.t, into a binasm source file, a.G, and an assembler symbol table file, a.T. It may optionally output an assembly source file, a.s, which combines all the information from a.G with part of the information from a.T (it is the remainder of the information in a.T which is the reason why this scheme has to be adopted). The .s file is only produced if tcc is explicitly told to preserve .s files by means of one of the command-line options, -Ps, -Pa, -Fs or -S. The two main mipstrans output files, a.G and a.T, are then transformed by the auxiliary MIPS® assembler, as1, into a binary object file, a.o.
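
The difference between the MIPS® scheme and the normal one can be summarised as a rule for naming translator output files. The following is a sketch only: translator_outputs and OUT_NAME_LEN are hypothetical names, and the keep_s flag stands for any of the options which preserve .s files.

```c
#include <stdio.h>
#include <string.h>

#define OUT_NAME_LEN 64

/* Fill names with the files the translator writes for the capsule
   base.t and return how many there are. On MIPS these are base.G and
   base.T, plus base.s if .s files are being preserved; elsewhere the
   single file base.s is written. Returns -1 if base is too long. */
int translator_outputs(const char *base, int is_mips, int keep_s,
                       char names[3][OUT_NAME_LEN])
{
    int count = 0;

    /* Each name needs room for base, a two-character suffix and '\0'. */
    if (strlen(base) + 3 > OUT_NAME_LEN) {
        return -1;
    }
    if (is_mips) {
        sprintf(names[count++], "%s.G", base);
        sprintf(names[count++], "%s.T", base);
    }
    if (!is_mips || keep_s) {
        sprintf(names[count++], "%s.s", base);
    }
    return count;
}
```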

Although they can be preserved for examination, the .G and .T files output by mipstrans cannot subsequently be processed using tcc. If a break in compilation is required at this stage, a .s file should be produced, and then entered as a tcc input file in the normal way. The information lost from the symbol table in this process is only important when symbolic debugging information is required. Input .s files are translated into binary object files by the main MIPS® assembler, as, in the normal way.

So, in addition to the main assembler, which is given by the AS environmental identifier, the location of the auxiliary assembler also needs to be specified to tcc. This is done using the AS1 environmental identifier, which is normally defined in the default environment. There is a further piece of information required for the compilation scheme to operate correctly: mipstrans needs to know the as1 version number. This number can be specified by means of the VERSION environmental identifier.

4.4. The System Assembler

The system assembler is the stage in the tcc compilation path which is likely to be of least interest to normal users. The assembler translates an assembly source (or .s) file into a binary object (or .o) file. (The exception to this is the MIPS® auxiliary assembler discussed above.) Most assemblers are straight translation phases, although some also offer peephole optimisation and scheduling facilities. No tcc command-line options are directly concerned with the assembler; however, options can be passed to it directly by means of the -Wa, opt, ... command-line option.

4.5. The System Linker

The final stage in the main tcc compilation path is the system linking. The system linker, ld, combines all the binary object files with the system libraries to form a final executable image. By default this executable is called a.out, although this can be changed using the -o command-line option to tcc. In terms of the differences between target machines, the system linker is the most complex of the tools which are controlled by tcc. Our discussion can be divided between those aspects of the linker's behaviour which are controlled by tcc environments, and those which are controlled by command-line options.

4.5.1. The System Linker and tcc Environments

The general form of tcc's calls to ld is as follows:

            ld (linker options) -o (output file)
                (initial .o files) (binary object files)
                (final .o files) (default system library directories)
                (default system libraries) (default standard libraries)
         

The linker may require certain default binary object files to be linked into every executable created. These are divided between the initial .o files, which come before the main list of binary object files, and the final .o files, which come after. For technical reasons, the list of initial .o files is split into two; the first list is given by the CRT0 environmental identifier, and the second by CRT1. The list of final .o files is given by the CRTN environmental identifier.

The information on the default system libraries the linker requires is given by three environmental identifiers. SYS_LINK gives a list of directories to be searched for system libraries. This will exclude /lib and /usr/lib which are usually built into ld. These directories will be given as a list of options of the form -Ldir. The default system libraries are divided into two lists. The environmental identifier SYS_LIBC gives the "standard" library options (usually just -lc), and SYS_LIB gives any other default library options. Both of these are given by lists of options of the form -lstr. This option specifies that the linker should search for the library libstr.a if linking statically, or libstr.so if linking dynamically.
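
The rule for turning a -lstr option into a file name can be written down directly. The function library_file_name is a hypothetical name; the search through the library directories, which the linker performs itself, is not modelled.

```c
#include <stdio.h>

/* Build the file name the linker looks for when given -lstr:
   libstr.a for static linking, libstr.so for dynamic linking.
   Returns 0 on success, -1 if the name does not fit in out. */
int library_file_name(const char *str, int dynamic, char *out, size_t size)
{
    int n = snprintf(out, size, "lib%s.%s", str, dynamic ? "so" : "a");

    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```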

So the main target dependencies affecting the system linker are described in these six environmental variables: CRT0, CRT1, CRTN, SYS_LINK, SYS_LIB and SYS_LIBC. For a given machine these will be given once and for all in the default environment. Standard API environments may modify SYS_LINK and SYS_LIB to specify the location of the system libraries containing the API implementation, although at present this has not been done.
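
The way these six identifiers fit into the general form of the ld call can be sketched as follows. The structure link_env and the function build_ld_command are hypothetical, as are any file names used with them; the no_libc flag models the -b option described in section 4.5.2. The position dependent -l and -L options given on the command-line are assumed to be part of the objects argument.

```c
#include <string.h>

/* Hypothetical holder for the six identifiers discussed above. Each
   field is a space-separated list, or "" if the list is empty. */
struct link_env {
    const char *crt0;
    const char *crt1;
    const char *crtn;
    const char *sys_link;
    const char *sys_lib;
    const char *sys_libc;
};

/* Append words to buf, preceded by a space if buf is not empty. Empty
   lists are skipped. Returns -1 if the result would not fit. */
static int append_words(char *buf, size_t size, const char *words)
{
    size_t used = strlen(buf);
    size_t need = strlen(words);

    if (need == 0) {
        return 0;
    }
    if (used + need + 2 > size) {
        return -1;
    }
    if (used > 0) {
        buf[used++] = ' ';
    }
    memcpy(buf + used, words, need + 1);
    return 0;
}

/* Assemble an ld command in the order given in the general form. */
int build_ld_command(char *buf, size_t size, const struct link_env *env,
                     const char *ld_opts, const char *output,
                     const char *objects, int no_libc)
{
    const char *parts[11];
    size_t i;

    if (size == 0) {
        return -1;
    }
    buf[0] = '\0';
    parts[0] = "ld";
    parts[1] = ld_opts;
    parts[2] = "-o";
    parts[3] = output;
    parts[4] = env->crt0;      /* initial .o files, first list */
    parts[5] = env->crt1;      /* initial .o files, second list */
    parts[6] = objects;        /* binary object files */
    parts[7] = env->crtn;      /* final .o files */
    parts[8] = env->sys_link;  /* default system library directories */
    parts[9] = env->sys_lib;   /* default system libraries */
    parts[10] = no_libc ? "" : env->sys_libc; /* standard libraries */

    for (i = 0; i < sizeof parts / sizeof parts[0]; i++) {
        if (append_words(buf, size, parts[i]) != 0) {
            return -1;
        }
    }
    return 0;
}
```

Because everything machine specific lives in the link_env values, the same assembly routine serves every target; only the environment changes.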

4.5.2. The Effect of Command-Line Options on the System Linker

The most important tcc command-line options affecting the system linker are those which specify the use of certain system libraries. The option -lstr indicates that the system library libstr.a (or libstr.so) should be searched for. The option -Ldir indicates that the directory dir should be added to the list of directories searched for system libraries. Both these options are position dependent. They are passed to the system linker in exactly the same position relative to the input files as they were given on the command-line. Thus normally -l (and to a lesser extent -L) options should be the final command-line options given.

The following tcc command-line options are passed directly to ld. A brief description is given of the purpose of each option, however whether or not ld supports this option depends on the target machine. The local ld manual page should be consulted for details.

            -Bstr       sets library type: str can be dynamic or static,
            -G      causes a shared object rather than an executable to be produced,
            -dn     causes dynamic linking to be switched off,
            -dy     causes dynamic linking to be switched on,
            -hstr       causes str to be marked as dynamic in a shared object,
            -s      causes the resultant executable to be stripped,
            -ustr       causes str to be marked as undefined,
            -zstr       specifies error behaviour, depending on str.
         

The position of any -Bstr options on the command-line is significant. These positions are therefore preserved. The position of the other options is not significant. In addition to these options, the -b command-line option causes the default standard system libraries (i.e. those given by the SYS_LIBC environmental identifier) not to be passed to ld.

Other command-line options may affect the system linker indirectly. For example, the -g option may require different default .o files and system libraries, the precise details of which are target dependent. Such options will be implemented by means of environments which will change the values of the environmental identifiers controlling the linker.

4.6. The C Preprocessor

The TDF C preprocessor, tdfcpp, is invoked only when tcc is passed the -E or -P command-line option, as described in section 3.5.1. These both cause all input .c files to be preprocessed, but in the former case the output is sent to the standard output, whereas in the latter it is sent to the corresponding .i files.

The TDF system differs from most C compilation systems in that preprocessing is an integral part of the producer, tdfc, rather than a preliminary textual substitution phase. This is because of difficulties involved with trying to perform the preprocessing in a target independent manner. Therefore tdfcpp is merely a modified version of tdfc which halts after the preprocessing phase and prints what it has read. This means that the tdfcpp output, while being equivalent to its input, will not correspond at the textual level to the degree which is normal in C preprocessors.

4.7. The TDF Pretty Printer

The TDF pretty printer, disp, and the TDF notation compiler, tnc, have already been discussed in some detail in section 3.5.3. The TDF decoding command-line options, -disp and -disp_t, cause respectively all .j files and all .t files to be decoded into .p files. This decoding is done using disp by default, and with tnc -p if the -Ytnc command-line option is specified. The -Ytnc option also causes any input .p files to be encoded into .j files by tnc.

The pretty printer, disp, can be used as a useful check that a given .j or .t file is a legal TDF capsule. The TDF decoding routines in the TDF linker and the TDF translator assume that their input is a legal capsule. The pretty printer performs more checks and has better diagnostics for illegal capsules. By default disp only decodes capsule units which belong to "core" TDF. Options to decode other unit types can be passed directly to disp by means of the -Wd, opt, ... command-line option to tcc. The potentially useful disp options include:

        -A      causes all known unit types to be decoded,
        -g      causes diagnostic information units to be decoded,
        -D      causes a binary dump of the capsule to be printed,
        -U      causes link information units to be decoded,
        -V      causes the input not to be rationalised,
        -W      causes a warning to be printed if a token is used before it is declared.
     

The manual page for disp should be consulted for more details.

The TDF notation compiler, tnc, is fully documented in [4].

4.8. The TDF Archiver

A TDF archive is a tcc-specific form intended for software distribution. It consists of a set of target independent TDF capsules (.j files) and a set of tcc command-line options. It is intended that a TDF archive can be produced on one machine, and distributed to, and installed on, a number of target machines.

If a TDF archive is given as an input file to tcc (it will be recognised by its .ta suffix), then it is split into its constituent capsules and options. The options are interpreted as if they had been given on the command-line (unless the -WJ, -no_options flag is specified), and the capsules are treated as input files in the normal way. The archive splitting and archive building routines are both built into tcc; there is no separate TDF archiver tool. Options passed to the archiver using -WJ, opt are interpreted by tcc.

In order to specify that a TDF archive should be created, the -prod flag should be used. This specifies that all target independent capsules (.j files) and all options opt given by a tcc option of the form -WI, opt, ... should be combined into a TDF archive. The compilation process halts after producing this archive. By default the TDF archive created is called a.ta, but this can be changed using the -o option. Normally the names of the capsules comprising the archive are inserted into the archive, but this may be suppressed by the use of the -WJ, -no_names option.

As an example of the kind of option that might be included in an archive, suppose that the production has been done using the POSIX API. Then the installation should also be done using this same API. Alternatively expressed, if a TDF archive has been constructed using the posix environment, then the -Yposix flag should be included in the archive to ensure that the installation also takes place in this same environment. In fact the environments describing the standard APIs have been set up so that this happens automatically. For example, the posix environment contains the line:

            +FLAG "-WI,-Yposix"
       

Another kind of option that it might be useful to include in an archive is a -lstr option. In this way all the information on the install-time options can be specified at produce-time.

A final example of an option which might be included in an archive is the -message option. The command-line option -message str causes tcc to print the message str with any @ characters in str replaced by spaces (there are problems with escaping spaces). So, by using the command-line option:

            -WI,-message"Installing@TDF@archive@..."
       

one can produce an archive which prints a message as it is installed. This option is also useful in environments. By inserting the line:

            +FLAG "-message Reading@tcc@environment@..."
       

one can produce an environment which prints a message whenever it is read.
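The substitution performed on the argument of -message is simple enough to state as code. The following Python sketch, with the hypothetical function name render_message, shows only the replacement of @ characters by spaces described above; it is not taken from the tcc sources.

```python
def render_message(text):
    """Replace every '@' in a -message argument with a space."""
    return text.replace("@", " ")

# The archive example from the text:
print(render_message("Installing@TDF@archive@..."))  # prints: Installing TDF archive ...
```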

Chapter 5. Miscellaneous Topics

In this chapter we draw together a number of miscellaneous topics not so far covered.

5.1. Intermodular Checks

All of the extra compiler checks described in Section 4.1.3, “Compilation Modes and Portability Tables” refer to a single C source file. However, tcc also supports a number of intermodular checks. These checks are enabled by means of the -im command-line option. This causes the producer to create for each C source file, in addition to its TDF capsule output file, a second output file, called a C spec file, containing a description of the C objects declared in that file. This C spec file is kept associated with the target independent TDF as it is transformed to a target dependent capsule, an assembly source file, and a binary object file. When these binary object files are linked, the associated C spec files are also linked, using the C spec linker, spec_linker, into a single C spec file. This file is named a.k by default. It is this linking process which constitutes the intermodular checking (in fact spec_linker may also be invoked at the TDF merging level when the -M option is used).

When intermodular checks are specified, tcc will also recognise input files with a .k suffix as C spec files and pass them to the C spec linker.

The nature of the association between a C spec file and its binary object file needs discussion. While these files are internal to a single call of tcc it can keep track of the association; however, if the compilation is halted before linking, tcc needs to preserve this association. For example in:

            > tcc -im -c a.c
       

the binary object file a.o and the C spec file a.k need to be kept together. This is done by forming them into a single archive file named a.o. When a.o is subsequently linked, tcc recognises that it is an archive and splits it into its two components, one of which it passes to the system linker, and one to the C spec linker.

Intermodular checking is described in more detail in [3]. In tchk intermodular checking is on by default, but may be switched off using -im0.

5.2. Debugging and Profiling

tcc supports options for both symbolic debugging using the target machine's default debugger, and profiling using prof on those machines which have it.

The -g command-line option causes the producer to output extra debugging information in its output TDF capsule, and the TDF translator to translate this information into the appropriate assembler directives for its supported debugger (for details of which debuggers are supported by which translators, consult the appropriate manual pages). For the translator to have all the diagnostic information it requires, not only the TDF capsules output by the producer, but also those linked in by the TDF linker from the TDF libraries, need to contain this debugging information. This is ensured for the standard TDF libraries by having two versions of each library, one containing diagnostics and one not. By default the environmental identifier LINK, which gives the directories which the TDF linker should search, is set so that the non-diagnostic versions are found. However the -g option modifies LINK so that the diagnostic versions are found first.

Depending on the target machine, the -g option may also need to modify the behaviour of the system assembler and the system linker. Like all potentially target dependent options, -g is implemented by means of a standard environment, in this case tcc_diag.

The -p option is likewise implemented by means of a standard environment, tcc_prof. It causes the producer to output extra information on the names of statically declared objects, and the TDF translator to output assembler directives which enable prof to profile the number of calls to each procedure (including static procedures). The behaviour of the system assembler and system linker may also be modified by -p, depending on the target machine.

5.3. The System Environment

In section 4.3 we discussed how tcc environments can be used to specify APIs. There is one API environment however which is so exceptional that it needs to be treated separately. This is the system environment, which specifies that tcc should emulate cc on the machine on which it is being run. The system environment specifies that tcc should use the system headers directory, /usr/include, as its default include file directory, and should define all the machine dependent macros which are built into cc. It will also specify the 32-bit portability table on 32-bit machines.

Despite the differences from the normal API environments, the system environment is indeed specifying an API, namely the system headers and libraries on the host machine. This means that the .j files produced when using this environment are only "target independent" in the sense that they can be ported successfully to machines which have exactly the same system headers and predefined macros.

Using the system headers is fraught with difficulties. In particular, they tend to be very cc-specific. It is often necessary to use the -not_ansi and -nepc options together with -Ysystem merely to negotiate the system headers. Even then, tcc may still reject some constructs. Of course, the number of problems encountered will vary considerably between different machines.

To conclude, the system environment is at best only suitable for experimental compilation. There are also potential problems involved with its use. It should therefore be used with care.

5.4. The Temporary Directory

As we have said, tcc creates a temporary directory in which to put all the intermediate files which are created, but are not to be preserved. By default, these intermediate files are left in the temporary directory until the end of the compilation, when the temporary directory is removed. However, if disk space is short, or a particularly large compilation is taking place, the -tidy command-line option may be appropriate. This causes tcc to remove each unwanted intermediate file immediately when it is no longer required.

The name of the temporary directory created by tcc to store the intermediate files is generated by the system library routine tempnam. It takes the form TEMP/tcc????, where TEMP is the main tcc temporary directory, and ???? is an automatically generated unique suffix. There are three methods of specifying TEMP, listed in order of increasing precedence:

  • by the TEMP environmental identifier (usually in the default environment),

  • by the -temp dir command-line option,

  • by the TMPDIR system variable.

Normally TEMP will be an ordinary temporary directory, /tmp or /usr/tmp for example, but any directory to which the user has write permission may be used. In general, the more spare disk space which is available in TEMP, the better.
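The precedence rule for TEMP can be sketched as follows. This Python fragment is an illustration only: the function name resolve_temp and its parameter names are hypothetical, and it assumes that an unset or empty source is simply skipped.

```python
def resolve_temp(env_identifier=None, temp_option=None, tmpdir=None):
    """Pick the main temporary directory TEMP.

    The three sources are examined in order of increasing precedence
    (TEMP environmental identifier, -temp dir option, TMPDIR system
    variable), so a later value overrides an earlier one.
    """
    chosen = None
    for value in (env_identifier, temp_option, tmpdir):
        if value:  # assumption: unset or empty sources are ignored
            chosen = value
    return chosen

# The -temp option overrides the default environment...
print(resolve_temp(env_identifier="/tmp", temp_option="/usr/tmp"))
# ...and TMPDIR overrides both.
print(resolve_temp(env_identifier="/tmp", temp_option="/usr/tmp", tmpdir="/scratch"))
```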

5.5. The tcc Option Interpreter

All tcc command-line options and environmental directives are actually processed by the same method, namely the tcc option interpreter. A simple pattern matching algorithm is applied to the input and, if a match is found, the corresponding instructions are sent to the low-level option interpreter. The command-line option --str, ... causes str to be passed directly to the option interpreter. This is intended primarily to help in debugging tcc and not for use by the general user. However, if you are interested, --1DB is a good place to start.

            [1] TDF and Portability, DRA, 1994.
            [2] The C to TDF Producer, DRA, 1993.
            [3] The TenDRA Static Checker, DRA, 1994.
            [4] The TDF Notation Compiler, DRA, 1994.
       

Chapter 6. tcc Reference Guide

6.1. Input and Output Files

tcc identifies the file type of the input files it is passed by means of their file suffix. The recognised file suffixes are as follows:

Table 6.1. Input File Suffixes

Suffix Code Description
.c c C source file
.i i Preprocessed C source file
.j j Target independent TDF capsule
.t t Target dependent TDF capsule
.s s Assembly source file
.o o Binary object file
.k k C spec file (only if intermodular checks are enabled)
.p p TDF notation source file (only if -Ytnc is specified)
.pl P PL_TDF source file (only if -Ypl_tdf is specified)
.ta A TDF archive
.G G Binasm source file (MIPS® only)
.T T Assembler symbol table (MIPS® only)
other - Binary object file

Each file type is assigned an identifying letter, usually corresponding to its file suffix, which may be used in various command-line options. For example, -Fs instructs tcc to halt the compilation after creating the assembly source files, and is therefore equivalent to -S. Similarly -Po instructs it to preserve any binary object files it creates. There are a couple of special file type codes which may be used with the -P option. The option -Pa causes all intermediate files to be preserved, whereas -Ph causes the start-up file, tcc_startup.h, used by tcc to be preserved. The -P option can also be used to specify that intermediate files of various forms should not be preserved. For example, the option -P-o indicates that binary object files should not be preserved.
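The suffix rules of Table 6.1 can be summarised by a small lookup. The Python sketch below is a transcription of the table, not of the tcc sources; the function name file_type_code and its keyword flags are hypothetical. It assumes that a conditional suffix whose condition is not met falls into the "other" row, and it returns o for that row because the table treats such files as binary object files.

```python
import posixpath

# Suffixes that are always recognised, transcribed from Table 6.1.
ALWAYS = {".c": "c", ".i": "i", ".j": "j", ".t": "t",
          ".s": "s", ".o": "o", ".ta": "A"}

def file_type_code(filename, intermodular=False, tnc=False,
                   pl_tdf=False, mips=False):
    """Return the file type code tcc would assign to filename."""
    suffix = posixpath.splitext(filename)[1]
    # Suffixes that are recognised only under the stated condition.
    conditional = {
        ".k": ("k", intermodular),  # intermodular checks enabled
        ".p": ("p", tnc),           # -Ytnc specified
        ".pl": ("P", pl_tdf),       # -Ypl_tdf specified
        ".G": ("G", mips),          # MIPS only
        ".T": ("T", mips),          # MIPS only
    }
    if suffix in ALWAYS:
        return ALWAYS[suffix]
    if suffix in conditional:
        code, enabled = conditional[suffix]
        if enabled:
            return code
    # "other" row: treated as a binary object file (assumed code "o").
    return "o"

print(file_type_code("main.c"), file_type_code("dist.ta"))
```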

Most output file names are derived from the input file names by a simple substitution of file suffix. However, certain output files (and other files) have default names. These are as follows:

Table 6.2. Default Output Filenames

Filename Description
a.j Default merged TDF capsule name (with -M option)
a.out Default executable name in binary object linking phase
a.ta Default output TDF archive name
a.k Default output file in intermodular checks linking phase
tcc_startup.h Start-up file used by tcc for -D and -U options
tcc_endup.h End-up file used by tcc

If there is a single output file, its name may be specified using the -o option. The default output filenames can also be overridden. For example, -doj b.j sets the default merged TDF capsule name to b.j.

6.2. Compilation Phases

The various compilation phases under the control of tcc may be summarised as follows:

Table 6.3. Compilation Phase

Executable   Code  Action              Description
tdfc         c     .c → .j             C to TDF producer
                   .i → .j
                   .c → .j + .k        (in intermodular checking mode)
                   .i → .j + .k
tld          L     .j → .t             TDF linker
                   n × .j → .j         (in TDF merging mode)
trans        t     .t → .s             TDF translator
as           a     .s → .o             System assembler
ld           l     n × .o → executable System linker
(archiver)   J     n × .j → .ta        Archive builder (built into tcc)
(archiver)   J     .ta → n × .j        Archive splitter (built into tcc)
tdfcpp       p     .c → .i             C preprocessor
spec_linker  S     n × .k → .k         C spec linker
disp         d     .j → .p             TDF pretty printer
                   .t → .p
tnc          n     .p → .j             TDF notation compiler
tnc -p       n     .j → .p             Reverse TDF notation compiler
                   .t → .p
pl           P     .pl → .j            PL_TDF compiler
mipstrans    t     .t → .G + .T        TDF to MIPS® translator
asl          A     .G + .T → .o        Auxiliary assembler (MIPS® only)
cc           C     (various)           System C compiler (specified with -cc option)

Each compilation phase is assigned a code letter which is used to identify that phase in various command-line options. For example, in order to pass the -x option to tdfc the -Wc, -x command-line option may be used. Similarly, to set the tld executable an option of the form -EL: /usr/local/bin/tld may be used.
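The way phase codes are combined into option names can be sketched as below. The code letters are those listed in the compilation phase table; the function names pass_option and set_executable are hypothetical, and the strings are built without the optional spaces.

```python
# Phase code letters, transcribed from the compilation phase table.
PHASE_CODES = {
    "tdfc": "c", "tld": "L", "trans": "t", "as": "a", "ld": "l",
    "archiver": "J", "tdfcpp": "p", "spec_linker": "S", "disp": "d",
    "tnc": "n", "pl": "P", "mipstrans": "t", "asl": "A", "cc": "C",
}

def pass_option(phase, option):
    """Build the tcc option which passes `option` to the given phase."""
    return "-W" + PHASE_CODES[phase] + "," + option

def set_executable(phase, path):
    """Build the tcc option which sets the executable for the given phase."""
    return "-E" + PHASE_CODES[phase] + ":" + path

print(pass_option("tdfc", "-x"))                     # prints: -Wc,-x
print(set_executable("tld", "/usr/local/bin/tld"))   # prints: -EL:/usr/local/bin/tld
```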

6.3. Command-line Options

The following options are accepted by tcc. They can be given by command-line options or the TCCOPTS system variable. The spaces in the option descriptions are optional; they show where two-part or multi-part options can be split over more than one command-line argument.

Please see the manpage for a complete list of command-line options.

The following table lists the archiver options which can be passed using the -WJ, opt option. Producer options can be passed in the same way using -Wc, opt.

Table 6.4. Archiver Options

Option Description
-copy or -c copies files into archives
-full or -f uses full pathnames for archive links (default)
-link or -l links files into archives (default)
-names or -n uses real names for copied files (default)
-no_names or -nn makes up names for copied files
-no_options or -no does not interpret any archived options
-options or -o interprets any archived options (default)
-short or -s uses short pathnames for archive links

6.4. Compilation Modes

The built-in compilation modes are as follows:

  • Xs ("strict checks") denotes strict ISO/ANSI C with most extra checks enabled as warnings.

  • Xp ("partial checks") denotes strict ISO/ANSI C with some extra checks enabled.

  • Xc ("conformance") denotes strict ISO/ANSI C with no extra checks enabled (this is the default).

  • Xa ("ANSI-ish") denotes ISO/ANSI C with syntactic relaxations and no extra checks.

  • Xt ("traditional") denotes traditional C with no extra checks.

The mode Xs is specified by passing the -Xs command-line option to tcc, and so on.

6.5. Supported APIs

The following standard APIs are supported in the form of TenDRA headers:

Table 6.5. Supported APIs

API Name Description Comments
ansi ANSI X3.159 base API (default)
iso ISO MSE 9899:1990 (Amendment 1:1993 (E)) base API
posix POSIX 1003.1 base API
posix2 POSIX 1003.2 base API
xpg3 X/Open Portability Guide 3 base API
xpg4 X/Open Portability Guide 4 base API
cose [a] COSE 1170 base API
svid3 System V Interface Definition 3rd Edition base API
aes AES Revision A base API
bsd_extn [b] BSD-like extension for use with POSIX etc. extension API
x5_lib X11 (Release 5) X library extension API
x5_t [c] X11 (Release 5) Intrinsics Toolkit extension API
x5_mu X11 (Release 5) Miscellaneous Utilities extension API
x5_aw X11 (Release 5) Athena Widgets extension API
x5_mit [d] X11 (Release 5) MIT implementation extension API
x5_proto X11 (Release 5) Protocol Extension extension API
x5_ext X11 (Release 5) Extensions extension API
motif Motif® 1.1 extension API
system System headers as main API
system+ System headers as last resort extension API

  [a] This API description is based on an early version of the COSE 1170 specification and may be subject to revision.

  [b] The BSD extension API consists of a pragmatic collection of BSD types and functions which are commonly found on non-BSD machines. It roughly corresponds to the BSD component of COSE (contains sockets, select, etc.).

  [c] The X11 private headers are further protected. If the private headers are required the option -Yx5_private should also be given.

  [d] This API is designed to cover some of the commonly used features from MIT-based X11 implementations which are not actually part of the X11 specification.

Each API is specified to tcc by means of an environment with the same name as the API. Thus, for example, -Yposix specifies POSIX 1003.1. APIs are divided into two types, base APIs, such as POSIX 1003.1, and extension APIs, such as the X11 (Release 5) Toolkit. A program API consists of a base API plus a number of extension APIs, for example, POSIX plus the X11 Toolkit. This example would be specified by means of the options -Yposix -Yx5_t, in that order (base APIs override the previous API, extension APIs add to it).
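The ordering rule for API options (a base API overrides, an extension API adds) can be modelled in a few lines. The following Python sketch is illustrative only: the function name program_api is hypothetical, the set of base APIs is taken from the Comments column of the table of supported APIs, and every other name is assumed to be an extension API. The system and system+ environments are deliberately left out of the model.

```python
# Names marked "base API" in the table of supported APIs.
BASE_APIS = {"ansi", "iso", "posix", "posix2", "xpg3",
             "xpg4", "cose", "svid3", "aes"}

def program_api(names):
    """Fold a sequence of -Y API names into the resulting program API."""
    current = []
    for name in names:
        if name in BASE_APIS:
            current = [name]      # a base API overrides the previous API
        else:
            current.append(name)  # an extension API adds to it
    return current

print(program_api(["posix", "x5_t"]))  # prints: ['posix', 'x5_t']
print(program_api(["x5_t", "posix"]))  # prints: ['posix']
```

The second call shows why the order matters: giving the extension API first means that it is discarded when the base API is read.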

Information on the current API may be printed by passing the -info option to tcc.

6.6. Environment Identifiers

The following tcc environment identifiers are recognised:

Table 6.6. Environment Identifiers

Identifier Description
*AS modifies the system assembler executable
*AS1 modifies the auxiliary assembler executable (MIPS® only)
*CC modifies the system compiler executable
*CRT0 modifies the first list of initial default .o files
*CRT1 modifies the second list of initial default .o files
*CRTN modifies the list of final default .o files
*DISP modifies the TDF pretty printer executable
+ENVDIR sets the main environment directory
+FLAG passes a flag to tcc
+FLAG_AS passes a flag to the assembler
+FLAG_CC passes a flag to the system compiler
+FLAG_DISP passes a flag to the TDF pretty printer
+FLAG_INSTALL passes a flag to the TDF archive builder
+FLAG_LD passes a flag to the system linker
+FLAG_SPEC_LINK passes a flag to the C spec linker
+FLAG_TDFC passes a flag to the producer
+FLAG_TDFCPP passes a flag to the preprocessor
+FLAG_TLD passes a flag to the TDF linker
+FLAG_TNC passes a flag to the TDF notation compiler
+FLAG_TRANS passes a flag to the TDF translator
*INCL modifies the list of default include file directories
*INFO modifies the list of API information
*LD modifies the system linker executable
*LIB modifies the list of default TDF libraries
+LINE_START inserts a line in the tcc built-in start-up file
+LINE_END inserts a line in the tcc built-in end-up file
*LINK modifies the list of default TDF library directories
+MACHINE sets the target machine type
+PORTABILITY sets the producer portability table
*SPEC_LINK modifies the C spec linker executable
*STARTUP modifies the list of default producer start-up files
*STARTUP_DIR modifies the list of default start-up directories
*SYS_LIB modifies the list of default system libraries
*SYS_LIBC modifies the list of standard system libraries
*SYS_LINK modifies the list of default system library directories
*TDFC modifies the producer executable
*TDFCPP modifies the preprocessor executable
+TEMP sets the temporary directory
*TLD modifies the TDF linker executable
*TNC modifies the TDF notation compiler executable
*TRANS modifies the TDF translator executable
+VERSION sets the target machine version (MIPS® only)

* stands for any of the allowed environment modifiers +, < or >.
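The effect of the three modifiers on a list-valued identifier can be sketched as follows. This reference section does not restate what each modifier does, so the sketch rests on an assumption that should be checked against the description of environments earlier in this guide: + replaces the current value, < adds to the start of the list, and > adds to the end. The function name apply_modifier and the directory names are hypothetical.

```python
def apply_modifier(current, modifier, text):
    """Return the new value of a list-valued environment identifier.

    Assumed semantics (not restated in this section):
      +  replace the current value,
      <  add text at the start of the list,
      >  add text at the end of the list.
    """
    if modifier == "+":
        return [text]
    if modifier == "<":
        return [text] + current
    if modifier == ">":
        return current + [text]
    raise ValueError("unknown environment modifier: " + modifier)

link = ["/hypothetical/lib/tdf"]
# Putting a directory first means that it is searched first.
print(apply_modifier(link, "<", "/hypothetical/lib/tdf/diag"))
```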

6.7. Standard Environments

In addition to the environments implementing the supported APIs (see section 6.5), the following environments are standard:

Table 6.7. Standard Environments

Environment Description
default default settings (built into tcc)
tcc_diag used to implement the -g option
tcc_pp used to implement the -E and -P options
tcc_prof used to implement the -p option
tcc_time used to implement the -time option
wsl used to implement the -wsl option
Goption used to implement the -G option
K-item used to implement the -K item option
Versions used to implement the -V option
Xmode used to implement the -X mode option
Z-str used to implement the -Z str option
16bit specifies the minimal integer sizes allowed by ANSI
32bit specifies the integer sizes found on most 32-bit machines
common equivalent to 32bit
makelib used to construct TDF libraries
pl_tdf used to specify the use of the PL_TDF compiler
status used to make tcc report its environment status
tdp used in TDF library building
tnc used to specify the use of the TDF notation compiler

Chapter 7. Manual Pages

FIXME: tcc.manpage

Chapter 8. Revision History

This chapter describes revisions to this document.

Only major changes are listed in the revision history. Please see http://www.ten15.org/log/trunk/doc/en/tcc/book.xml?action=follow_copy&rev=&stop_rev=1&mode=follow_copy&verbose=on for a complete list of changes.

Note

CVS revision numbers are located behind the date in the format rXX

Revision History
Revision 1.0 2003/07/30 r1229 verm
Converted to SGML from the TenDRA 4.1.2 Documentation.