
C trigraph

In the C family of programming languages, a trigraph is a sequence of three characters, the first two of which are question marks, that represents a single character. Trigraphs exist because the basic character set of C (a subset of the ASCII character set) includes nine characters that lie outside the ISO 646 invariant character set. This poses a problem for writing source code when the keyboard being used lacks one or more of these nine characters. The ANSI C committee invented trigraphs as a way of entering source code using keyboards that supported any version of the ISO 646 character set. Non-ASCII ISO 646 character sets are not much used today, but trigraphs remain in the C99 standard[1]. Trigraphs may also be useful with some EBCDIC code pages that lack characters such as { and }.

Trigraphs are not commonly encountered outside compiler test suites. Some compilers have an option to turn recognition of trigraphs off; others disable trigraphs by default and require an option to turn them on. Some can issue warnings when they encounter trigraphs in source files. Borland supplied a separate program, the trigraph preprocessor, to be used only when trigraph processing is desired.

Trigraph sequences

The C preprocessor replaces all occurrences of the following nine trigraph sequences by their single-character equivalents before any other processing.

    Trigraph     Equivalent
    ========     ==========
      ??=            #
      ??/            \
      ??'            ^
      ??(            [
      ??)            ]
      ??!            |
      ??<            {
      ??>            }
      ??-            ~
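
The replacement step described by the table can be sketched as a small C function. This is only an illustration, not how any particular compiler implements it: the names trigraph_equivalent and replace_trigraphs are hypothetical, and the caller is assumed to supply an output buffer at least as large as the input. Scanning proceeds left to right, so ???= becomes ?# because the first three question marks do not form a trigraph.

```c
#include <stddef.h>

/* Hypothetical helper: maps the third character of a candidate trigraph
   to its single-character equivalent, or returns 0 if there is none. */
static char trigraph_equivalent(char third)
{
    switch (third) {
    case '=':  return '#';
    case '/':  return '\\';
    case '\'': return '^';
    case '(':  return '[';
    case ')':  return ']';
    case '!':  return '|';
    case '<':  return '{';
    case '>':  return '}';
    case '-':  return '~';
    default:   return 0;
    }
}

/* Hypothetical helper: copies src to dst, replacing each trigraph with
   its equivalent. Assumes dst holds at least strlen(src) + 1 bytes.
   Returns the length of the resulting string. */
size_t replace_trigraphs(const char *src, char *dst)
{
    size_t out = 0;
    while (*src != '\0') {
        char eq = 0;
        /* Two question marks must be followed by a listed character. */
        if (src[0] == '?' && src[1] == '?')
            eq = trigraph_equivalent(src[2]);
        if (eq != 0) {
            dst[out++] = eq;
            src += 3;
        } else {
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return out;
}
```

When calling this from C source, the input string itself should be spelled with the escape \? between the question marks so that a compiler with trigraphs enabled does not replace the test data first.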

Note that ??? is not a trigraph sequence. Note also that the problematic characters are nevertheless required to exist within the implementation, in both the source and execution character sets. The ??/ trigraph can be used to introduce an escaped newline for line splicing; this must be taken into account for correct and efficient handling of trigraphs within the preprocessor. It can also cause surprises, particularly within comments. For example:

 // Will the next line be executed????????????????/
 a++;

which is a single logical comment line, and

 /??/
 * A comment *??/
 /

which is a correctly formed block comment.
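
The effect on comments comes from line splicing, which can be sketched on its own. The function below is a simplified illustration under stated assumptions: splice_lines is a hypothetical name, it only removes splices (a backslash, or its trigraph spelling, directly followed by a newline) and leaves every other trigraph untouched, and the caller provides an output buffer at least as large as the input.

```c
#include <stddef.h>

/* Hypothetical helper: copies src to dst with line splices removed.
   A splice is a backslash, or the trigraph spelling of a backslash,
   immediately followed by a newline character.
   Assumes dst holds at least strlen(src) + 1 bytes. */
size_t splice_lines(const char *src, char *dst)
{
    size_t out = 0;
    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;                       /* plain backslash-newline */
        } else if (src[0] == '?' && src[1] == '?' &&
                   src[2] == '/' && src[3] == '\n') {
            src += 4;                       /* trigraph form of the same */
        } else {
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return out;
}
```

Applied to the first example above, the newline after the comment disappears, so a++; becomes part of the comment line and is never compiled.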

Example

An example of a C program that uses all the defined trigraphs:

??=include <stdio.h>                         /* #          */
int main(void)
??<                                          /* {          */
        char n??(5??);                       /* [ and ]    */
        n??(4??) = '0' - (??-0 ??' 1 ??! 2); /* ~, ^ and | */
        printf("%c??/n", n??(4??));          /* \, [ and ] */
        return 0;
??>                                          /* }          */

Disambiguation

A programmer may want to place two question marks together yet not have the compiler treat them as introducing a trigraph. The C grammar does not permit two consecutive ? tokens, so the only places in a C file where two question marks in a row may appear are multi-character constants, string literals, and comments. To safely place two consecutive question marks within a string literal, the programmer can use string concatenation "...?""?..." or an escape sequence "...?\?...".
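
A minimal sketch of both techniques follows. The function names query_concat and query_escape are hypothetical and exist only to make the two spellings comparable; each returns a string whose fifth and sixth characters are question marks followed by an exclamation mark, a combination that would otherwise be read as the trigraph for |.

```c
/* Spelling 1: adjacent string literals are concatenated after trigraph
   replacement has already run, so the question marks never touch in
   the source text. */
const char *query_concat(void)
{
    return "what?" "?!";
}

/* Spelling 2: the escape sequence \? denotes a plain question mark and
   breaks up the pair in the source text. */
const char *query_escape(void)
{
    return "what?\?!";
}
```

Both functions yield the same seven-character string whether or not the compiler has trigraph recognition enabled.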

Alternatives

In 1994 a normative amendment to the C standard, included in C99, supplied so-called digraphs as more readable alternatives to trigraphs. They are:

    Digraph     Equivalent
    =======     ==========
      <:             [
      :>             ]
      <%             {
      %>             }
      %:             #
      %:%:           ##

Unlike trigraphs, digraphs are handled during tokenization, and each digraph must always represent a full token by itself. If a digraph sequence occurs inside another token, for example a quoted string or a character constant, it is not replaced.
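
As an illustration, the fragment below is written with digraphs in place of #, {, }, [ and ]. The function names sum_digraph and digraph_text are hypothetical, and the sketch assumes a compiler that accepts the digraph spellings (which follows from support for the 1994 amendment or C99).

```c
%:include <stddef.h>

/* Sums the first n elements of a. Every bracket and brace in this
   function is spelled as a digraph token. */
int sum_digraph(const int *a, size_t n)
<%
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a<:i:>;
    return total;
%>

/* Inside a string literal the same two characters are not a token of
   their own, so they stay exactly as written. */
const char *digraph_text(void)
<%
    return "<:";
%>
```

The string returned by digraph_text still consists of the two characters < and :, whereas the same pair in sum_digraph acts as an opening square bracket.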

References

  1. https://www.securecoding.cert.org/confluence/x/nAE_

Copyrights
C trigraph from Wikipedia. ©2006 by Wikipedia. Licensed under the GNU Free Documentation License.
