Committee Draft -- August 3, 1998    WG14/N843

Programming languages -- C

1. Scope

[#1] This International Standard specifies the form and establishes the interpretation of programs written in the C programming language.1) It specifies

  -- the representation of C programs;
  -- the syntax and constraints of the C language;
  -- the semantic rules for interpreting C programs;
  -- the representation of input data to be processed by C programs;
  -- the representation of output data produced by C programs;
  -- the restrictions and limits imposed by a conforming implementation of C.

[#2] This International Standard does not specify

  -- the mechanism by which C programs are transformed for use by a data-processing system;
  -- the mechanism by which C programs are invoked for use by a data-processing system;
  -- the mechanism by which input data are transformed for use by a C program;
  -- the mechanism by which output data are transformed after being produced by a C program;
  -- the size or complexity of a program and its data that will exceed the capacity of any specific data-processing system or the capacity of a particular processor;
  -- all minimal requirements of a data-processing system that is capable of supporting a conforming implementation.

____________________
1) This International Standard is designed to promote the portability of C programs among a variety of data-processing systems. It is intended for use by implementors and programmers.

2. Normative references

[#1] The following normative documents contain provisions which, through reference in this text, constitute provisions of this International Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below.
For undated references, the latest edition of the normative document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.

[#2] ISO/IEC 646:1991, Information technology -- ISO 7-bit coded character set for information interchange.

[#3] ISO/IEC 2382-1:1993, Information technology -- Vocabulary -- Part 1: Fundamental terms.

[#4] ISO 4217:1995, Codes for the representation of currencies and funds.

[#5] ISO 8601:1988, Data elements and interchange formats -- Information interchange -- Representation of dates and times.

[#6] ISO/IEC 10646:1993, Information technology -- Universal Multiple-Octet Coded Character Set (UCS).

[#7] IEC 60559:1989, Binary floating-point arithmetic for microprocessor systems, second edition (previously designated IEC 559:1989).

3. Terms and definitions

[#1] For the purposes of this International Standard, the following definitions apply. Other terms are defined where they appear in italic type or on the left side of a syntax rule. Terms explicitly defined in this International Standard are not to be presumed to refer implicitly to similar terms defined elsewhere. Terms not defined in this International Standard are to be interpreted according to ISO/IEC 2382-1.

3.1

[#1] alignment
requirement that objects of a particular type be located on storage boundaries with addresses that are particular multiples of a byte address

3.2

[#1] argument
actual argument
actual parameter (deprecated)
expression in the comma-separated list bounded by the parentheses in a function call expression, or a sequence of preprocessing tokens in the comma-separated list bounded by the parentheses in a function-like macro invocation

3.3

[#1] bit
unit of data storage in the execution environment large enough to hold an object that may have one of two values

[#2] NOTE It need not be possible to express the address of each individual bit of an object.
3.4

[#1] byte
addressable unit of data storage large enough to hold any member of the basic character set of the execution environment

[#2] NOTE 1 It is possible to express the address of each individual byte of an object uniquely.

[#3] NOTE 2 A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined. The least significant bit is called the low-order bit; the most significant bit is called the high-order bit.

3.5

[#1] character
bit representation that fits in a byte

3.6

[#1] constraints
restrictions, both syntactic and semantic, by which the exposition of language elements is to be interpreted

3.7

[#1] correctly rounded result
a representation in the result format that is nearest in value, subject to the effective rounding mode, to what the result would be given unlimited range and precision

3.8

[#1] diagnostic message
message belonging to an implementation-defined subset of the implementation's message output

3.9

[#1] forward references
references to later subclauses of this International Standard that contain additional information relevant to this subclause

3.10

[#1] implementation
a particular set of software, running in a particular translation environment under particular control options, that performs translation of programs for, and supports execution of functions in, a particular execution environment

3.11

[#1] implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made

[#2] EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.
3.12

[#1] implementation limits
restrictions imposed upon programs by the implementation

3.13

[#1] locale-specific behavior
behavior that depends on local conventions of nationality, culture, and language that each implementation documents

[#2] EXAMPLE An example of locale-specific behavior is whether the islower function returns true for characters other than the 26 lowercase Latin letters.

3.14

[#1] multibyte character
sequence of one or more bytes representing a member of the extended character set of either the source or the execution environment

[#2] NOTE The extended character set is a superset of the basic character set.

3.15

[#1] object
region of data storage in the execution environment, the contents of which can represent values

[#2] NOTE When referenced, an object may be interpreted as having a particular type; see 6.3.2.1.

3.16

[#1] parameter
formal parameter
formal argument (deprecated)
object declared as part of a function declaration or definition that acquires a value on entry to the function, or an identifier from the comma-separated list bounded by the parentheses immediately following the macro name in a function-like macro definition

3.17

[#1] recommended practice
specifications that are strongly recommended as being in keeping with the intent of the standard, but that may be impractical for some implementations

3.18

[#1] undefined behavior
behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements

[#2] NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
[#3] EXAMPLE An example of undefined behavior is the behavior on integer overflow.

3.19

[#1] unspecified behavior
behavior where this International Standard provides two or more possibilities and imposes no requirements on which is chosen in any instance

[#2] EXAMPLE An example of unspecified behavior is the order in which the arguments to a function are evaluated.

Forward references: bitwise shift operators (6.5.7), expressions (6.5), function calls (6.5.2.2), the islower function (7.4.1.6), localization (7.11).

4. Conformance

[#1] In this International Standard, ``shall'' is to be interpreted as a requirement on an implementation or on a program; conversely, ``shall not'' is to be interpreted as a prohibition.

[#2] If a ``shall'' or ``shall not'' requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ``undefined behavior'' or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ``behavior that is undefined''.

[#3] A program that is correct in all other aspects, operating on correct data, containing unspecified behavior shall be a correct program and act in accordance with 5.1.2.3.

[#4] The implementation shall not successfully translate a preprocessing translation unit containing a #error preprocessing directive unless it is part of a group skipped by conditional inclusion.

[#5] A strictly conforming program shall use only those features of the language and library specified in this International Standard.2) It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.

[#6] The two forms of conforming implementation are hosted and freestanding.
A conforming hosted implementation shall accept any strictly conforming program. A conforming freestanding implementation shall accept any strictly conforming program that does not use complex types and in which the use of the features specified in the library clause (clause 7) is confined to the contents of the standard headers <float.h>, <iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>. A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program.3)

[#7] A conforming program is one that is acceptable to a conforming implementation.4)

[#8] An implementation shall be accompanied by a document that defines all implementation-defined and locale-specific characteristics and all extensions.

Forward references: conditional inclusion (6.10.1), characteristics of floating types (7.7), alternative spellings (7.9), sizes of integer types (7.10), variable arguments (7.15), boolean type and values (7.16), common definitions (7.17), integer types (7.18).

____________________
2) A strictly conforming program can use conditional features (such as those in annex F) provided the use is guarded by a #ifdef directive with the appropriate macro. For example:

        #ifdef __STDC_IEC_559__ /* FE_UPWARD defined */
           /* ... */
           fesetround(FE_UPWARD);
           /* ... */
        #endif

3) This implies that a conforming implementation reserves no identifiers other than those explicitly reserved in this International Standard.

4) Strictly conforming programs are intended to be maximally portable among conforming implementations. Conforming programs may depend upon nonportable features of a conforming implementation.

5. Environment

[#1] An implementation translates C source files and executes C programs in two data-processing-system environments, which will be called the translation environment and the execution environment in this International Standard. Their characteristics define and constrain the results of executing conforming C programs constructed according to the syntactic and semantic rules for conforming implementations.

Forward references: In this clause, only a few of many possible forward references have been noted.

5.1 Conceptual models

5.1.1 Translation environment

5.1.1.1 Program structure

[#1] A C program need not all be translated at the same time. The text of the program is kept in units called source files (or preprocessing files) in this International Standard. A source file together with all the headers and source files included via the preprocessing directive #include is known as a preprocessing translation unit. After preprocessing, a preprocessing translation unit is called a translation unit. Previously translated translation units may be preserved individually or in libraries. The separate translation units of a program communicate by (for example) calls to functions whose identifiers have external linkage, manipulation of objects whose identifiers have external linkage, or manipulation of data files. Translation units may be separately translated and then later linked to produce an executable program.

Forward references: conditional inclusion (6.10.1), linkages of identifiers (6.2.2), source file inclusion (6.10.2), external definitions (6.9), preprocessing directives (6.10).

5.1.1.2 Translation phases

[#1] The precedence among the syntax rules of translation is specified by the following phases.5)

  1. Physical source file multibyte characters are mapped to the source character set (introducing new-line characters for end-of-line indicators) if necessary.
     Trigraph sequences are replaced by corresponding single-character internal representations.

  2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines. If, as a result, a character sequence that matches the syntax of a universal character name is produced, the behavior is undefined. Only the last backslash on any physical source line shall be eligible for being part of such a splice. A source file that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing takes place.

  3. The source file is decomposed into preprocessing tokens6) and sequences of white-space characters (including comments). A source file shall not end in a partial preprocessing token or in a partial comment. Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other than new-line is retained or replaced by one space character is implementation-defined.

  4. Preprocessing directives are executed, macro invocations are expanded, and _Pragma unary operator expressions are executed. If a character sequence that matches the syntax of a universal character name is produced by token concatenation (6.10.3.3), the behavior is undefined. A #include preprocessing directive causes the named header or source file to be processed from phase 1 through phase 4, recursively. All preprocessing directives are then deleted.

  5. Each source character set member, escape sequence, and universal character name in character constants and string literals is converted to the corresponding member of the execution character set; if there is no corresponding member, it is converted to an implementation-defined member.

  6. Adjacent string literal tokens are concatenated.

  7. White-space characters separating tokens are no longer significant. Each preprocessing token is converted into a token. The resulting tokens are syntactically and semantically analyzed and translated as a translation unit.

  8. All external object and function references are resolved. Library components are linked to satisfy external references to functions and objects not defined in the current translation. All such translator output is collected into a program image which contains information needed for execution in its execution environment.

Forward references: universal character names (6.4.3), lexical elements (6.4), preprocessing directives (6.10), trigraph sequences (5.2.1.1), external definitions (6.9).

____________________
5) Implementations shall behave as if these separate phases occur, even though many are typically folded together in practice.

6) As described in 6.4, the process of dividing a source file's characters into preprocessing tokens is context-dependent. For example, see the handling of < within a #include preprocessing directive.

5.1.1.3 Diagnostics

[#1] A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined.
Diagnostic messages need not be produced in other circumstances.7)

[#2] EXAMPLE An implementation shall issue a diagnostic for the translation unit:

        char i;
        int i;

because in those cases where wording in this International Standard describes the behavior for a construct as being both a constraint error and resulting in undefined behavior, the constraint error shall be diagnosed.

____________________
7) The intent is that an implementation should identify the nature of, and where possible localize, each violation. Of course, an implementation is free to produce any number of diagnostics as long as a valid program is still correctly translated. It may also successfully translate an invalid program.

5.1.2 Execution environments

[#1] Two execution environments are defined: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment. All objects in static storage shall be initialized (set to their initial values) before program startup. The manner and timing of such initialization are otherwise unspecified. Program termination returns control to the execution environment.

Forward references: initialization (6.7.8).

5.1.2.1 Freestanding environment

[#1] In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined.

[#2] The effect of program termination in a freestanding environment is implementation-defined.

5.1.2.2 Hosted environment

[#1] A hosted environment need not be provided, but shall conform to the following specifications if present.

5.1.2.2.1 Program startup

[#1] The function called at program startup is named main.
The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

        int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

        int main(int argc, char *argv[]) { /* ... */ }

or equivalent;8) or in some other implementation-defined manner.

____________________
8) Thus, int can be replaced by a typedef name defined as int, or the type of argv can be written as char ** argv, and so on.

[#2] If they are declared, the parameters to the main function shall obey the following constraints:

  -- The value of argc shall be nonnegative.

  -- argv[argc] shall be a null pointer.

  -- If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup. The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment. If the host environment is not capable of supplying strings with letters in both uppercase and lowercase, the implementation shall ensure that the strings are received in lowercase.

  -- If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.

  -- The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.
5.1.2.2.2 Program execution

[#1] In a hosted environment, a program may use all the functions, macros, type definitions, and objects described in the library clause (clause 7).

5.1.2.2.3 Program termination

[#1] If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument;9) reaching the } that terminates the main function returns a value of 0. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.

Forward references: definition of terms (7.1.1), the exit function (7.20.4.3).

____________________
9) In accordance with 6.2.4, objects with automatic storage duration declared in main will no longer have storage guaranteed to be reserved in the former case even where they would in the latter.

5.1.2.3 Program execution

[#1] The semantic descriptions in this International Standard describe the behavior of an abstract machine in which issues of optimization are irrelevant.

[#2] Accessing a volatile object, modifying an object, modifying a file, or calling a function that does any of those operations are all side effects,10) which are changes in the state of the execution environment. Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place. (A summary of the sequence points is given in annex C.)

[#3] In the abstract machine, all expressions are evaluated as specified by the semantics.
An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or accessing a volatile object).

____________________
10) The IEC 60559 standard for binary floating-point arithmetic requires certain user-accessible status flags and control modes. Floating-point operations implicitly set the status flags; modes affect result values of floating-point operations. Implementations that support such floating-point state are required to regard changes to it as side effects -- see annex F for details. The floating-point environment library provides a programming facility for indicating when these side effects matter, freeing the implementations in other cases.

[#4] When the processing of the abstract machine is interrupted by receipt of a signal, only the values of objects as of the previous sequence point may be relied on. Objects that may be modified between the previous sequence point and the next sequence point need not have received their correct values yet.

[#5] An instance of each object with automatic storage duration is associated with each entry into its block. Such an object exists and retains its last-stored value during the execution of the block and while the block is suspended (by a call of a function or receipt of a signal).

[#6] The least requirements on a conforming implementation are:

  -- At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.

  -- At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.

  -- The input and output dynamics of interactive devices shall take place as specified in 7.19.3.
     The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.

[#7] What constitutes an interactive device is implementation-defined.

[#8] More stringent correspondences between abstract and actual semantics may be defined by each implementation.

[#9] EXAMPLE 1 An implementation might define a one-to-one correspondence between abstract and actual semantics: at every sequence point, the values of the actual objects would agree with those specified by the abstract semantics. The keyword volatile would then be redundant.

[#10] Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. In such an implementation, at the time of each function entry and function return where the calling function and the called function are in different translation units, the values of all externally linked objects and of all objects accessible via pointers therein would agree with the abstract semantics. Furthermore, at the time of each such function entry the values of the parameters of the called function and of all objects accessible via pointers therein would agree with the abstract semantics. In this type of implementation, objects referred to by interrupt service routines activated by the signal function would require explicit specification of volatile storage, as well as other implementation-defined restrictions.

[#11] EXAMPLE 2 In executing the fragment

        char c1, c2;
        /* ... */
        c1 = c1 + c2;

the ``integer promotions'' require that the abstract machine promote the value of each variable to int size and then add the two ints and truncate the sum.
Provided the addition of two chars can be done without overflow, or with overflow wrapping silently to produce the correct result, the actual execution need only produce the same result, possibly omitting the promotions.

[#12] EXAMPLE 3 Similarly, in the fragment

        float f1, f2;
        double d;
        /* ... */
        f1 = f2 * d;

the multiplication may be executed using single-precision arithmetic if the implementation can ascertain that the result would be the same as if it were executed using double-precision arithmetic (for example, if d were replaced by the constant 2.0, which has type double).

[#13] EXAMPLE 4 Implementations employing wide registers have to take care to honor appropriate semantics. Values are independent of whether they are represented in a register or in memory. For example, an implicit spilling of a register is not permitted to alter the value. Also, an explicit store and load is required to round to the precision of the storage type. In particular, casts and assignments are required to perform their specified conversion. For the fragment

        double d1, d2;
        float f;
        d1 = f = expression;
        d2 = (float) expressions;

the values assigned to d1 and d2 are required to have been converted to float.

[#14] EXAMPLE 5 Rearrangement for floating-point expressions is often restricted because of limitations in precision as well as range. The implementation cannot generally apply the mathematical associative rules for addition or multiplication, nor the distributive rule, because of roundoff error, even in the absence of overflow and underflow. Likewise, implementations cannot generally replace decimal constants in order to rearrange expressions. In the following fragment, rearrangements suggested by mathematical rules for real numbers are often not valid (see F.8).

        double x, y, z;
        /* ... */
        x = (x * y) * z;  // not equivalent to x *= y * z;
        z = (x - y) + y ; // not equivalent to z = x;
        z = x + x * y;    // not equivalent to z = x * (1.0 + y);
        y = x / 5.0;      // not equivalent to y = x * 0.2;

[#15] EXAMPLE 6 To illustrate the grouping behavior of expressions, in the following fragment

        int a, b;
        /* ... */
        a = a + 32760 + b + 5;

the expression statement behaves exactly the same as

        a = (((a + 32760) + b) + 5);

due to the associativity and precedence of these operators. Thus, the result of the sum (a + 32760) is next added to b, and that result is then added to 5 which results in the value assigned to a. On a machine in which overflows produce an explicit trap and in which the range of values representable by an int is [-32768, +32767], the implementation cannot rewrite this expression as

        a = ((a + b) + 32765);

since if the values for a and b were, respectively, -32754 and -15, the sum a + b would produce a trap while the original expression would not; nor can the expression be rewritten either as

        a = ((a + 32765) + b);

or

        a = (a + (b + 32765));

since the values for a and b might have been, respectively, 4 and -8 or -17 and 12. However, on a machine in which overflow silently generates some value and where positive and negative overflows cancel, the above expression statement can be rewritten by the implementation in any of the above ways because the same result will occur.

[#16] EXAMPLE 7 The grouping of an expression does not completely determine its evaluation. In the following fragment

        #include <stdio.h>
        int sum;
        char *p;
        /* ... */
        sum = sum * 10 - '0' + (*p++ = getchar());

the expression statement is grouped as if it were written as

        sum = (((sum * 10) - '0') + ((*(p++)) = (getchar())));

but the actual increment of p can occur at any time between the previous sequence point and the next sequence point (the ;), and the call to getchar can occur at any point prior to the need of its returned value.

Forward references: compound statement, or block (6.8.2), expressions (6.5), files (7.19.3), sequence points (6.5, 6.8), the signal function (7.14), type qualifiers (6.7.3).

5.2 Environmental considerations

5.2.1 Character sets

[#1] Two sets of characters and their associated collating sequences shall be defined: the set in which source files are written, and the set interpreted in the execution environment. The values of the members of the execution character set are implementation-defined; any additional members beyond those required by this subclause are locale-specific.

[#2] In a character constant or string literal, members of the execution character set shall be represented by corresponding members of the source character set or by escape sequences consisting of the backslash \ followed by one or more characters. A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string.

[#3] Both the basic source and basic execution character sets shall have at least the following members: the 26 uppercase letters of the Latin alphabet

        A B C D E F G H I J K L M
        N O P Q R S T U V W X Y Z

the 26 lowercase letters of the Latin alphabet

        a b c d e f g h i j k l m
        n o p q r s t u v w x y z

the 10 decimal digits

        0 1 2 3 4 5 6 7 8 9

the following 29 graphic characters

        ! " # % & ' ( ) * + , - . / : ; < = > ?
        [ \ ] ^ _ { | } ~

the space character, and control characters representing horizontal tab, vertical tab, and form feed. The representation of each member of the source and execution basic character sets shall fit in a byte. In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous. In source files, there shall be some way of indicating the end of each line of text; this International Standard treats such an end-of-line indicator as if it were a single new-line character. In the execution character set, there shall be control characters representing alert, backspace, carriage return, and new line. If any other characters are encountered in a source file (except in an identifier, a character constant, a string literal, a header name, a comment, or a preprocessing token that is never converted to a token), the behavior is undefined.

[#4] The universal character name construct provides a way to name other characters.

Forward references: universal character names (6.4.3), character constants (6.4.4.4), preprocessing directives (6.10), string literals (6.4.5), comments (6.4.9), string (7.1.1).

5.2.1.1 Trigraph sequences

[#1] All occurrences in a source file of the following sequences of three characters (called trigraph sequences11)) are replaced with the corresponding single character.

        ??=  #        ??)  ]        ??!  |
        ??(  [        ??'  ^        ??>  }
        ??/  \        ??<  {        ??-  ~

No other trigraph sequences exist. Each ? that does not begin one of the trigraphs listed above is not changed.

[#2] EXAMPLE The following source line

        printf("Eh???/n");

becomes (after replacement of the trigraph sequence ??/)

        printf("Eh?\n");

5.2.1.2 Multibyte characters

[#1] The source character set may contain multibyte characters, used to represent members of the extended character set.
The execution character set may also contain multibyte characters, which need not have the same encoding as for the source character set. For both character sets, the following shall hold: -- The single-byte characters defined in 5.2.1 shall be present. ____________________ 11)The trigraph sequences enable the input of characters that are not defined in the Invariant Code Set as described in ISO/IEC 646, which is a subset of the seven- bit US ASCII code set. 5.2.1 Environment 5.2.1.2 20 Committee Draft -- August 3, 1998 WG14/N843 -- The presence, meaning, and representation of any additional members is locale-specific. -- A multibyte character set may have a state-dependent encoding, wherein each sequence of multibyte characters begins in an initial shift state and enters other locale-specific shift states when specific multibyte characters are encountered in the sequence. While in the initial shift state, all single-byte characters retain their usual interpretation and do not alter the shift state. The interpretation for subsequent bytes in the sequence is a function of the current shift state. -- A byte with all bits zero shall be interpreted as a null character independent of shift state. -- A byte with all bits zero shall not occur in the second or subsequent bytes of a multibyte character. [#2] For source files, the following shall hold: -- An identifier, comment, string literal, character constant, or header name shall begin and end in the initial shift state. -- An identifier, comment, string literal, character constant, or header name shall consist of a sequence of valid multibyte characters. 5.2.2 Character display semantics [#1] The active position is that location on a display device where the next character output by the fputc or fputwc function would appear. 
The intent of writing a printable character (as defined by the isprint or iswprint function) to a display device is to display a graphic representation of that character at the active position and then advance the active position to the next position on the current line. The direction of writing is locale-specific. If the active position is at the final position of a line (if there is one), the behavior is unspecified. [#2] Alphabetic escape sequences representing nongraphic characters in the execution character set are intended to produce actions on display devices as follows: \a (alert) Produces an audible or visible alert. The active position shall not be changed. \b (backspace) Moves the active position to the previous position on the current line. If the active position is at the initial position of a line, the behavior is unspecified. 5.2.1.2 Environment 5.2.2 WG14/N843 Committee Draft -- August 3, 1998 21 \f (form feed) Moves the active position to the initial position at the start of the next logical page. \n (new line) Moves the active position to the initial position of the next line. \r (carriage return) Moves the active position to the initial position of the current line. \t (horizontal tab) Moves the active position to the next horizontal tabulation position on the current line. If the active position is at or past the last defined horizontal tabulation position, the behavior is unspecified. \v (vertical tab) Moves the active position to the initial position of the next vertical tabulation position. If the active position is at or past the last defined vertical tabulation position, the behavior is unspecified. [#3] Each of these escape sequences shall produce a unique implementation-defined value which can be stored in a single char object. The external representations in a text file need not be identical to the internal representations, and are outside the scope of this International Standard. 
Forward references: the isprint function (7.4.1.7), the fputc function (7.19.7.3), the fputwc functions (7.24.3.3), the iswprint function (7.25.2.1.7). 5.2.3 Signals and interrupts [#1] Functions shall be implemented such that they may be interrupted at any time by a signal, or may be called by a signal handler, or both, with no alteration to earlier, but still active, invocations' control flow (after the interruption), function return values, or objects with automatic storage duration. All such objects shall be maintained outside the function image (the instructions that compose the executable representation of a function) on a per-invocation basis. 5.2.4 Environmental limits [#1] Both the translation and execution environments constrain the implementation of language translators and libraries. The following summarizes the language-related environmental limits on a conforming implementation; the library-related limits are discussed in clause 7. 5.2.2 Environment 5.2.4 22 Committee Draft -- August 3, 1998 WG14/N843 5.2.4.1 Translation limits [#1] The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:12) -- 127 nesting levels of compound statements, iteration statements, and selection statements -- 63 nesting levels of conditional inclusion -- 12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or incomplete type in a declaration -- 63 nesting levels of parenthesized declarators within a full declarator -- 63 nesting levels of parenthesized expressions within a full expression -- 63 significant initial characters in an internal identifier or a macro name (each universal character name or extended source character is considered a single character) -- 31 significant initial characters in an external identifier (each universal character name specifying a character short identifier of 0000FFFF or less is considered 6 
characters, each universal character name specifying a character short identifier of 00010000 or more is considered 10 characters, and each extended source character is considered the same number of characters as the corresponding universal character name, if any)
-- 4095 external identifiers in one translation unit
-- 511 identifiers with block scope declared in one block
-- 4095 macro identifiers simultaneously defined in one preprocessing translation unit
-- 127 parameters in one function definition
-- 127 arguments in one function call
-- 127 parameters in one macro definition
-- 127 arguments in one macro invocation
-- 4095 characters in a logical source line
-- 4095 characters in a character string literal or wide string literal (after concatenation)
-- 65535 bytes in an object (in a hosted environment only)
-- 15 nesting levels for #included files
-- 1023 case labels for a switch statement (excluding those for any nested switch statements)
-- 1023 members in a single structure or union
-- 1023 enumeration constants in a single enumeration
-- 63 levels of nested structure or union definitions in a single struct-declaration-list
____________________
12)Implementations should avoid imposing fixed translation limits whenever possible.

5.2.4.2 Numerical limits

[#1] A conforming implementation shall document all the limits specified in this subclause, which are specified in the headers <limits.h> and <float.h>. Additional limits are specified in <stdint.h>.

5.2.4.2.1 Sizes of integer types

[#1] The values given below shall be replaced by constant expressions suitable for use in #if preprocessing directives. Moreover, except for CHAR_BIT and MB_LEN_MAX, the following shall be replaced by expressions that have the same type as would an expression that is an object of the corresponding type converted according to the integer promotions.
Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
-- number of bits for smallest object that is not a bit-field (byte)
        CHAR_BIT        8
-- minimum value for an object of type signed char
        SCHAR_MIN       -127 // -(2^{7}-1)
-- maximum value for an object of type signed char
        SCHAR_MAX       +127 // 2^{7}-1
-- maximum value for an object of type unsigned char
        UCHAR_MAX       255 // 2^{8}-1
-- minimum value for an object of type char
        CHAR_MIN        see below
-- maximum value for an object of type char
        CHAR_MAX        see below
-- maximum number of bytes in a multibyte character, for any supported locale
        MB_LEN_MAX      1
-- minimum value for an object of type short int
        SHRT_MIN        -32767 // -(2^{15}-1)
-- maximum value for an object of type short int
        SHRT_MAX        +32767 // 2^{15}-1
-- maximum value for an object of type unsigned short int
        USHRT_MAX       65535 // 2^{16}-1
-- minimum value for an object of type int
        INT_MIN         -32767 // -(2^{15}-1)
-- maximum value for an object of type int
        INT_MAX         +32767 // 2^{15}-1
-- maximum value for an object of type unsigned int
        UINT_MAX        65535 // 2^{16}-1
-- minimum value for an object of type long int
        LONG_MIN        -2147483647 // -(2^{31}-1)
-- maximum value for an object of type long int
        LONG_MAX        +2147483647 // 2^{31}-1
-- maximum value for an object of type unsigned long int
        ULONG_MAX       4294967295 // 2^{32}-1
-- minimum value for an object of type long long int
        LLONG_MIN       -9223372036854775807 // -(2^{63}-1)
-- maximum value for an object of type long long int
        LLONG_MAX       +9223372036854775807 // 2^{63}-1
-- maximum value for an object of type unsigned long long int
        ULLONG_MAX      18446744073709551615 // 2^{64}-1

[#2] If the value of an object of type char is treated as a signed integer when used in an expression, the value of CHAR_MIN shall be the same as that of SCHAR_MIN and the value of CHAR_MAX shall be the same as that of SCHAR_MAX.
Otherwise, the value of CHAR_MIN shall be 0 and the value of CHAR_MAX shall be the same as that of UCHAR_MAX.13) The value UCHAR_MAX+1 shall equal 2 raised to the power CHAR_BIT.

5.2.4.2.2 Characteristics of floating types

[#1] The characteristics of floating types are defined in terms of a model that describes a representation of floating-point numbers and values that provide information about an implementation's floating-point arithmetic.14) The following parameters are used to define the model for each floating-point type:
        s       sign (±1)
        b       base or radix of exponent representation (an integer > 1)
        e       exponent (an integer between a minimum $e_{\min}$ and a maximum $e_{\max}$)
        p       precision (the number of base-b digits in the significand)
        $f_k$   nonnegative integers less than b (the significand digits)

[#2] A normalized floating-point number x ($f_1 > 0$ if x != 0) is defined by the following model:
        $x = s \times b^{e} \times \sum_{k=1}^{p} f_k \times b^{-k}, \quad e_{\min} \le e \le e_{\max}$

[#3] Floating types may include values that are not normalized floating-point numbers, for example subnormal floating-point numbers ($x \ne 0$, $e = e_{\min}$, $f_1 = 0$), infinities, and NaNs.15) A NaN is an encoding signifying Not-a-Number. A quiet NaN propagates through almost every arithmetic operation without raising an exception; a signaling NaN generally raises an exception when occurring as an arithmetic operand.16)

[#4] The accuracy of the floating-point operations (+, -, *, /) and of the library functions in <math.h> and <complex.h> that return floating-point results is implementation-defined. The implementation may state that the accuracy is unknown.

[#5] All integer values in the <float.h> header, except
____________________
13)See 6.2.5.
14)The floating-point model is intended to clarify the description of each floating-point characteristic and does not require the floating-point arithmetic of the implementation to be identical.
15)Although they are stored in floating types, infinities and NaNs are not floating-point numbers.
16)IEC 60559:1989 specifies quiet and signaling NaNs. For implementations that do not support IEC 60559:1989, the terms quiet NaN and signaling NaN are intended to apply to encodings with similar behavior.

FLT_ROUNDS, shall be constant expressions suitable for use in #if preprocessing directives; all floating values shall be constant expressions. All except DECIMAL_DIG, FLT_EVAL_METHOD, FLT_RADIX, and FLT_ROUNDS have separate names for all three floating-point types. The floating-point model representation is provided for all values except FLT_EVAL_METHOD and FLT_ROUNDS.

[#6] The rounding mode for floating-point addition is characterized by the value of FLT_ROUNDS:17)
        -1      indeterminable
         0      toward zero
         1      to nearest
         2      toward positive infinity
         3      toward negative infinity
All other values for FLT_ROUNDS characterize implementation-defined rounding behavior.

[#7] The values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type. The use of evaluation formats is characterized by the value of FLT_EVAL_METHOD:18)
        -1      indeterminable;
         0      evaluate all operations and constants just to the range and precision of the type;
         1      evaluate operations and constants of type float and double to the range and precision of the double type, evaluate long double operations and constants to the range and precision of the long double type;
         2      evaluate all operations and constants to the range and precision of the long double type.
All other negative values for FLT_EVAL_METHOD characterize implementation-defined behavior.
____________________
17)Evaluation of FLT_ROUNDS correctly reflects any execution-time change of rounding mode through the function fesetround in <fenv.h>.
18)The evaluation method determines evaluation formats of expressions involving all floating types, not just real types. For example, if FLT_EVAL_METHOD is 1, then the product of two float _Complex operands is represented in the double _Complex format, and its parts are evaluated to double.

[#8] The values given in the following list shall be replaced by implementation-defined constant expressions with values that are greater or equal in magnitude (absolute value) to those shown, with the same sign:
-- radix of exponent representation, b
        FLT_RADIX               2
-- number of base-FLT_RADIX digits in the floating-point significand, p
        FLT_MANT_DIG
        DBL_MANT_DIG
        LDBL_MANT_DIG
-- number of decimal digits, n, such that any floating-point number in the widest supported floating type with $p_{\max}$ radix b digits can be rounded to a floating-point number with n decimal digits and back again without change to the value,
        $p_{\max} \times \log_{10} b$                       if b is a power of 10
        $\lceil 1 + p_{\max} \times \log_{10} b \rceil$     otherwise
        DECIMAL_DIG             10
-- number of decimal digits, q, such that any floating-point number with q decimal digits can be rounded into a floating-point number with p radix b digits and back again without change to the q decimal digits,
        $p \times \log_{10} b$                              if b is a power of 10
        $\lfloor (p-1) \times \log_{10} b \rfloor$          otherwise
        FLT_DIG                 6
        DBL_DIG                 10
        LDBL_DIG                10
-- minimum negative integer such that FLT_RADIX raised to one less than that power is a normalized floating-point number, $e_{\min}$
        FLT_MIN_EXP
        DBL_MIN_EXP
        LDBL_MIN_EXP
-- minimum negative integer such that 10 raised to that power is in the range of normalized floating-point numbers, $\lceil \log_{10} b^{e_{\min}-1} \rceil$
        FLT_MIN_10_EXP          -37
        DBL_MIN_10_EXP          -37
        LDBL_MIN_10_EXP         -37
-- maximum integer such that FLT_RADIX raised to one less than that power is a representable finite floating-point number, $e_{\max}$
        FLT_MAX_EXP
        DBL_MAX_EXP
        LDBL_MAX_EXP
-- maximum integer such that 10 raised to that power
is in the range of representable finite floating-point numbers, $\lfloor \log_{10}((1 - b^{-p}) \times b^{e_{\max}}) \rfloor$
        FLT_MAX_10_EXP          +37
        DBL_MAX_10_EXP          +37
        LDBL_MAX_10_EXP         +37

[#9] The values given in the following list shall be replaced by implementation-defined constant expressions with values that are greater than or equal to those shown:
-- maximum representable finite floating-point number, $(1 - b^{-p}) \times b^{e_{\max}}$
        FLT_MAX                 1E+37
        DBL_MAX                 1E+37
        LDBL_MAX                1E+37

[#10] The values given in the following list shall be replaced by implementation-defined constant expressions with (positive) values that are less than or equal to those shown:
-- the difference between 1 and the least value greater than 1 that is representable in the given floating point type, $b^{1-p}$
        FLT_EPSILON             1E-5
        DBL_EPSILON             1E-9
        LDBL_EPSILON            1E-9
-- minimum normalized positive floating-point number, $b^{e_{\min}-1}$
        FLT_MIN                 1E-37
        DBL_MIN                 1E-37
        LDBL_MIN                1E-37

[#11] EXAMPLE 1 The following describes an artificial floating-point representation that meets the minimum requirements of this International Standard, and the appropriate values in a <float.h> header for type float:
        $x = s \times 16^{e} \times \sum_{k=1}^{6} f_k \times 16^{-k}, \quad -31 \le e \le +32$
        FLT_RADIX               16
        FLT_MANT_DIG            6
        FLT_EPSILON             9.53674316E-07F
        FLT_DIG                 6
        FLT_MIN_EXP             -31
        FLT_MIN                 2.93873588E-39F
        FLT_MIN_10_EXP          -38
        FLT_MAX_EXP             +32
        FLT_MAX                 3.40282347E+38F
        FLT_MAX_10_EXP          +38

[#12] EXAMPLE 2 The following describes floating-point representations that also meet the requirements for single-precision and double-precision normalized numbers in IEC 60559,19) and the appropriate values in a <float.h> header for types float and double:
        $x_f = s \times 2^{e} \times \sum_{k=1}^{24} f_k \times 2^{-k}, \quad -125 \le e \le +128$
        $x_d = s \times 2^{e} \times \sum_{k=1}^{53} f_k \times 2^{-k}, \quad -1021 \le e \le +1024$
        FLT_RADIX               2
        DECIMAL_DIG             17
        FLT_MANT_DIG            24
        FLT_EPSILON             1.19209290E-07F // decimal constant
        FLT_EPSILON             0X1P-23F // hex constant
        FLT_DIG                 6
        FLT_MIN_EXP             -125
        FLT_MIN                 1.17549435E-38F // decimal constant
        FLT_MIN                 0X1P-126F //
hex constant
        FLT_MIN_10_EXP          -37
        FLT_MAX_EXP             +128
        FLT_MAX                 3.40282347E+38F // decimal constant
        FLT_MAX                 0X1.fffffeP127F // hex constant
        FLT_MAX_10_EXP          +38
        DBL_MANT_DIG            53
        DBL_EPSILON             2.2204460492503131E-16 // decimal constant
        DBL_EPSILON             0X1P-52 // hex constant
        DBL_DIG                 15
        DBL_MIN_EXP             -1021
        DBL_MIN                 2.2250738585072014E-308 // decimal constant
        DBL_MIN                 0X1P-1022 // hex constant
        DBL_MIN_10_EXP          -307
        DBL_MAX_EXP             +1024
        DBL_MAX                 1.7976931348623157E+308 // decimal constant
        DBL_MAX                 0X1.fffffffffffffP1023 // hex constant
        DBL_MAX_10_EXP          +308
If a type wider than double were supported, then DECIMAL_DIG would be greater than 17. For example, if the widest type were to use the minimal-width IEC 60559 double-extended format (64 bits of precision), then DECIMAL_DIG would be 21.

Forward references: conditional inclusion (6.10.1), complex arithmetic (7.3), mathematics (7.12), integer types (7.18).
____________________
19)The floating-point model in that standard sums powers of b from zero, so the values of the exponent limits are one less than shown here.

6. Language

6.1 Notation

[#1] In the syntax notation used in this clause, syntactic categories (nonterminals) are indicated by italic type, and literal words and character set members (terminals) by bold type. A colon (:) following a nonterminal introduces its definition. Alternative definitions are listed on separate lines, except when prefaced by the words ``one of''. An optional symbol is indicated by the suffix ``-opt'', so that { expression-opt } indicates an optional expression enclosed in braces.

[#2] A summary of the language syntax is given in annex A.

6.2 Concepts

6.2.1 Scopes of identifiers

[#1] An identifier can denote an object; a function; a tag or a member of a structure, union, or enumeration; a typedef name; a label name; a macro name; or a macro parameter. The same identifier can denote different entities at different points in the program.
A member of an enumeration is called an enumeration constant. Macro names and macro parameters are not considered further here, because prior to the semantic phase of program translation any occurrences of macro names in the source file are replaced by the preprocessing token sequences that constitute their macro definitions. [#2] For each different entity that an identifier designates, the identifier is visible (i.e., can be used) only within a region of program text called its scope. Different entities designated by the same identifier either have different scopes, or are in different name spaces. There are four kinds of scopes: function, file, block, and function prototype. (A function prototype is a declaration of a function that declares the types of its parameters.) [#3] A label name is the only kind of identifier that has function scope. It can be used (in a goto statement) anywhere in the function in which it appears, and is declared implicitly by its syntactic appearance (followed by a : and a statement). Label names shall be unique within a function. [#4] Every other identifier has scope determined by the placement of its declaration (in a declarator or type specifier). If the declarator or type specifier that declares the identifier appears outside of any block or list 6 Language 6.2.1 32 Committee Draft -- August 3, 1998 WG14/N843 of parameters, the identifier has file scope, which terminates at the end of the translation unit. If the declarator or type specifier that declares the identifier appears inside a block or within the list of parameter declarations in a function definition, the identifier has block scope, which terminates at the } that closes the associated block. If the declarator or type specifier that declares the identifier appears within the list of parameter declarations in a function prototype (not part of a function definition), the identifier has function prototype scope, which terminates at the end of the function declarator. 
If an identifier designates two different entities in the same name space, the scopes might overlap. If so, the scope of one entity (the inner scope) will be a strict subset of the scope of the other entity (the outer scope). Within the inner scope, the identifier designates the entity declared in the inner scope; the entity declared in the outer scope is hidden (and not visible) within the inner scope. [#5] Unless explicitly stated otherwise, where this International Standard uses the term identifier to refer to some entity (as opposed to the syntactic construct), it refers to the entity in the relevant name space whose declaration is visible at the point the identifier occurs. [#6] Two identifiers have the same scope if and only if their scopes terminate at the same point. [#7] Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. Each enumeration constant has scope that begins just after the appearance of its defining enumerator in an enumerator list. Any other identifier has scope that begins just after the completion of its declarator. Forward references: compound statement, or block (6.8.2), declarations (6.7), enumeration specifiers (6.7.2.2), function calls (6.5.2.2), function declarators (including prototypes) (6.7.5.3), function definitions (6.9.1), the goto statement (6.8.6.1), labeled statements (6.8.1), name spaces of identifiers (6.2.3), scope of macro definitions (6.10.3.5), source file inclusion (6.10.2), tags (6.7.2.3), type specifiers (6.7.2). 6.2.2 Linkages of identifiers [#1] An identifier declared in different scopes or in the same scope more than once can be made to refer to the same object or function by a process called linkage. There are three kinds of linkage: external, internal, and none. 
[#2] In the set of translation units and libraries that constitutes an entire program, each declaration of a 6.2.1 Language 6.2.2 WG14/N843 Committee Draft -- August 3, 1998 33 particular identifier with external linkage denotes the same object or function. Within one translation unit, each declaration of an identifier with internal linkage denotes the same object or function. Each declaration of an identifier with no linkage denotes a unique entity. [#3] If the declaration of a file scope identifier for an object or a function contains the storage-class specifier static, the identifier has internal linkage.20) [#4] For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible,21) if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage. [#5] If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external. [#6] The following identifiers have no linkage: an identifier declared to be anything other than an object or a function; an identifier declared to be a function parameter; a block scope identifier for an object declared without the storage-class specifier extern. [#7] If, within a translation unit, the same identifier appears with both internal and external linkage, the behavior is undefined. Forward references: compound statement, or block (6.8.2), declarations (6.7), expressions (6.5), external definitions (6.9). 
____________________ 20)A function declaration can contain the storage-class specifier static only if it is at file scope; see 6.7.1. 21)As specified in 6.2.1, the later declaration might hide the prior declaration. 6.2.2 Language 6.2.2 34 Committee Draft -- August 3, 1998 WG14/N843 6.2.3 Name spaces of identifiers [#1] If more than one declaration of a particular identifier is visible at any point in a translation unit, the syntactic context disambiguates uses that refer to different entities. Thus, there are separate name spaces for various categories of identifiers, as follows: -- label names (disambiguated by the syntax of the label declaration and use); -- the tags of structures, unions, and enumerations (disambiguated by following any22) of the keywords struct, union, or enum); -- the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator); -- all other identifiers, called ordinary identifiers (declared in ordinary declarators or as enumeration constants). Forward references: enumeration specifiers (6.7.2.2), labeled statements (6.8.1), structure and union specifiers (6.7.2.1), structure and union members (6.5.2.3), tags (6.7.2.3). 6.2.4 Storage durations of objects [#1] An object has a storage duration that determines its lifetime. There are three storage durations: static, automatic, and allocated. Allocated storage is described in 7.20.3. [#2] An object whose identifier is declared with external or internal linkage, or with the storage-class specifier static has static storage duration. For such an object, storage is reserved and its stored value is initialized only once, prior to program startup. 
The object exists, has a constant address, and retains its last-stored value throughout the execution of the entire program.23) [#3] An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration. For objects that do not have a variable length array type, storage is guaranteed to be reserved for a new instance of the object on each entry into the block with which it is associated; the initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached. Storage for the object is no longer guaranteed to be reserved when execution of the block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) WG14/N843 Committee Draft -- August 3, 1998 35 [#4] For objects that do have a variable length array type, storage is guaranteed to be reserved for a new instance of the object each time the declaration is reached in the execution of the program. The initial value of the object is indeterminate. Storage for the object is no longer guaranteed to be reserved when the execution of the program leaves the scope of the declaration.24) [#5] If an object is referred to when storage is not reserved for it, the behavior is undefined. The value of a pointer that referred to an object whose storage is no longer reserved is indeterminate. During the time that its storage is reserved, an object has a constant address. Forward references: compound statement, or block (6.8.2), function calls (6.5.2.2), declarators (6.7.5), array declarators (6.7.5.2), initialization (6.7.8). 6.2.5 Types [#1] The meaning of a value stored in an object or returned by a function is determined by the type of the expression used to access it. 
(An identifier declared to be an object is the simplest such expression; the type is specified in the declaration of the identifier.) Types are partitioned into object types (types that describe objects), function types (types that describe functions), and incomplete types (types that describe objects but lack information needed to determine their sizes). [#2] An object declared as type _Bool is large enough to store the values 0 and 1. [#3] An object declared as type char is large enough to store any member of the basic execution character set. If a member of the required source character set enumerated in 5.2.1 is stored in a char object, its value is guaranteed to ____________________ 22)There is only one name space for tags even though three are possible. 23)The term constant address means that two pointers to the object constructed at possibly different times will compare equal. The address may be different during two different executions of the same program. In the case of a volatile object, the last store need not be explicit in the program. 24)Leaving the innermost block containing the declaration, or jumping to a point in that block or an embedded block prior to the declaration, leaves the scope of the declaration. 6.2.4 Language 6.2.5 36 Committee Draft -- August 3, 1998 WG14/N843 be positive. If any other character is stored in a char object, the resulting value is implementation-defined but shall be within the range of values that can be represented in that type. [#4] There are five standard signed integer types, designated as signed char, short int, int, long int, and long long int. (These and other types may be designated in several additional ways, as described in 6.7.2.) There may also be implementation-defined extended signed integer types.25) The standard and extended signed integer types are collectively called signed integer types.26) [#5] An object declared as type signed char occupies the same amount of storage as a ``plain'' char object. 
A ``plain'' int object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>).

[#6] For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements. The type _Bool and the unsigned integer types that correspond to the standard signed integer types are the standard unsigned integer types. The unsigned integer types that correspond to the extended signed integer types are the extended unsigned integer types.

[#7] The standard signed integer types and standard unsigned integer types are collectively called the standard integer types; the extended signed integer types and extended unsigned integer types are collectively called the extended integer types.

[#8] For any two types with the same signedness and different integer conversion rank (see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.

[#9] The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same.27) A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.
____________________
25)Implementation-defined keywords shall have the form of an identifier reserved for any use as described in 7.1.3.
26)Therefore, any statement in this Standard about signed integer types also applies to the extended signed integer types.
[#10] There are three real floating types, designated as float, double, and long double. The set of values of the type float is a subset of the set of values of the type double; the set of values of the type double is a subset of the set of values of the type long double.
[#11] There are three complex types, designated as float _Complex, double _Complex, and long double _Complex.28) The real floating and complex types are collectively called the floating types.
[#12] For each floating type there is a corresponding real type, which is always a real floating type. For real floating types, it is the same type. For complex types, it is the type given by deleting the keyword _Complex from the type name.
[#13] Each complex type has the same representation and alignment requirements as an array type containing exactly two elements of the corresponding real type; the first element is equal to the real part, and the second element to the imaginary part, of the complex number.
[#14] The type char, the signed and unsigned integer types, and the floating types are collectively called the basic types. Even if the implementation defines two or more basic types to have the same representation, they are nevertheless different types.29)
[#15] The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.30)
____________________
27)The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.
28)A specification for imaginary types is in informative annex G.
29)An implementation may define new keywords that provide alternative ways to designate a basic (or any other) type; this does not violate the requirement that all basic types be different. Implementation-defined keywords shall have the form of an identifier reserved for any use as described in 7.1.3.
30)CHAR_MIN, defined in <limits.h>, will have one of the values 0 or SCHAR_MIN, and this can be used to distinguish the two options. Irrespective of the choice made, char is a separate type from the other two and is not compatible with either.

[#16] An enumeration comprises a set of named integer constant values. Each distinct enumeration constitutes a different enumerated type.
[#17] The type char, the signed and unsigned integer types, and the enumerated types are collectively called integer types. The integer and real floating types are collectively called real types.
[#18] The void type comprises an empty set of values; it is an incomplete type that cannot be completed.
[#19] Any number of derived types can be constructed from the object, function, and incomplete types, as follows:
-- An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.31) Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T, the array type is sometimes called ``array of T''. The construction of an array type from an element type is called ``array type derivation''.
-- A structure type describes a sequentially allocated nonempty set of member objects (and, in certain circumstances, an incomplete array), each of which has an optionally specified name and possibly distinct type.
-- A union type describes an overlapping nonempty set of member objects, each of which has an optionally specified name and possibly distinct type.
-- A function type describes a function with specified return type. A function type is characterized by its return type and the number and types of its parameters. A function type is said to be derived from its return type, and if its return type is T, the function type is sometimes called ``function returning T''. The construction of a function type from a return type is called ``function type derivation''.
-- A pointer type may be derived from a function type, an object type, or an incomplete type, called the referenced type. A pointer type describes an object whose value provides a reference to an entity of the referenced type. A pointer type derived from the referenced type T is sometimes called ``pointer to T''. The construction of a pointer type from a referenced type is called ``pointer type derivation''.
____________________
31)Since object types do not include incomplete types, an array of incomplete type cannot be constructed.

[#20] These methods of constructing derived types can be applied recursively.
[#21] Integer and floating types are collectively called arithmetic types. Arithmetic types and pointer types are collectively called scalar types. Array and structure types are collectively called aggregate types.32)
[#22] Each arithmetic type belongs to one type domain. The real type domain comprises the real types. The complex type domain comprises the complex types.
[#23] An array type of unknown size is an incomplete type. It is completed, for an identifier of that type, by specifying the size in a later declaration (with internal or external linkage). A structure or union type of unknown content (as described in 6.7.2.3) is an incomplete type. It is completed, for all declarations of that type, by declaring the same structure or union tag with its defining content later in the same scope. A structure type containing a flexible array member is an incomplete type that cannot be completed.
[#24] Array, function, and pointer types are collectively called derived declarator types.
A declarator type derivation from a type T is the construction of a derived declarator type from T by the application of an array-type, a function-type, or a pointer-type derivation to T.
[#25] A type is characterized by its type category, which is either the outermost derivation of a derived type (as noted above in the construction of derived types), or the type itself if the type consists of no derived types.
[#26] Any type so far mentioned is an unqualified type. Each unqualified type has several qualified versions of its type,33) corresponding to the combinations of one, two, or all three of the const, volatile, and restrict qualifiers. The qualified or unqualified versions of a type are distinct types that belong to the same type category and have the same representation and alignment requirements.27) A derived type is not qualified by the qualifiers (if any) of the type from which it is derived.
____________________
32)Note that aggregate type does not include union type because an object with union type can only contain one member at a time.

[#27] A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements.27) All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.
[#28] EXAMPLE 1 The type designated as ``float *'' has type ``pointer to float''. Its type category is pointer, not a floating type.
The const-qualified version of this type is designated as ``float * const'' whereas the type designated as ``const float *'' is not a qualified type -- its type is ``pointer to const-qualified float'' and is a pointer to a qualified type.
[#29] EXAMPLE 2 The type designated as ``struct tag (*[5])(float)'' has type ``array of pointer to function returning struct tag''. The array has length five and the function has a single parameter of type float. Its type category is array.
Forward references: character constants (6.4.4.4), compatible type and composite type (6.2.7), declarations (6.7), tags (6.7.2.3), type qualifiers (6.7.3).
6.2.6 Representations of types
[#1] The representations of all types are unspecified except as stated in this subclause.
6.2.6.1 General
[#1] Except for bit-fields, objects are composed of contiguous sequences of one or more bytes, the number, order, and encoding of which are either explicitly specified or implementation-defined.
[#2] Values stored in objects of type unsigned char shall be represented using a pure binary notation.34)
____________________
33)See 6.7.3 regarding qualified array and function types.

[#3] Values stored in objects of any other object type consist of n×CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value. Two values (other than NaNs) with the same object representation compare equal, but values that compare equal may have different object representations.
[#4] Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is accessed by an lvalue expression that does not have character type, the behavior is undefined.
If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.35) Such a representation is called a trap representation.
[#5] When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.36) The values of padding bytes shall not affect whether the value of such an object is a trap representation. Those bits of a structure or union object that are in the same byte as a bit-field member, but are not part of that member, shall similarly not affect whether the value of such an object is a trap representation.
____________________
34)A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position. (Adapted from the American National Dictionary for Information Processing Systems.) A byte contains CHAR_BIT bits, and the values of type unsigned char range from 0 to $2^{\mathrm{CHAR\_BIT}}-1$.
35)Thus an automatic variable can be initialized to a trap representation without causing undefined behavior, but the value of the variable cannot be used until a proper value is stored in it.
36)Thus, for example, structure assignment may be implemented element-at-a-time or via memcpy.

[#6] When a value is stored in a member of an object of union type, the bytes of the object representation that do not correspond to that member but do correspond to other members take unspecified values, but the value of the union object shall not thereby become a trap representation.
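As an informal illustration (not part of the normative text), the notion of an object representation introduced in 6.2.6.1 can be sketched by copying an int into an array of unsigned char with memcpy and back again. The function names below are hypothetical and chosen only for this example; int is used purely as a convenient object type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of value
 * into out and returns the number of bytes copied. */
size_t int_object_representation(int value, unsigned char out[sizeof(int)])
{
    memcpy(out, &value, sizeof value);
    return sizeof value;
}

/* Hypothetical helper: rebuilds an int from a previously copied
 * object representation. */
int int_from_representation(const unsigned char in[sizeof(int)])
{
    int value;
    memcpy(&value, in, sizeof value);
    return value;
}

/* Hypothetical helper: nonzero if a and b have identical object
 * representations. Identical representations imply equal values;
 * the converse is not guaranteed. */
int same_representation(int a, int b)
{
    return memcmp(&a, &b, sizeof a) == 0;
}

/* Self-check: a copied representation restores the original value. */
void check_object_representation(void)
{
    unsigned char bytes[sizeof(int)];
    size_t n = int_object_representation(-12345, bytes);
    assert(n == sizeof(int));
    assert(int_from_representation(bytes) == -12345);
}
```

Accessing the bytes through unsigned char is what keeps this well defined: only lvalues of character type may inspect a representation that might not be a valid value of the object type.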
[#7] Where an operator is applied to a value which has more than one object representation, which object representation is used shall not affect the value of the result. Where a value is stored in an object using a type that has more than one object representation for that value, it is unspecified which representation is used, but a trap representation shall not be generated.
6.2.6.2 Integer types
[#1] For unsigned integer types other than unsigned char, the bits of the object representation shall be divided into two groups: value bits and padding bits (there need not be any of the latter). If there are N value bits, each bit shall represent a different power of 2 between 1 and $2^{N-1}$, so that objects of that type shall be capable of representing values from 0 to $2^{N}-1$ using a pure binary representation; this shall be known as the value representation. The values of any padding bits are unspecified.37)
[#2] For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits; there shall be exactly one sign bit. Each bit that is a value bit shall have the same value as the same bit in the object representation of the corresponding unsigned type (if there are M value bits in the signed type and N in the unsigned type, then M<=N). If the sign bit is zero, it shall not affect the resulting value. If the sign bit is one, then the value shall be modified in one of the following ways:
-- the corresponding value with sign bit 0 is negated;
-- the sign bit has the value $-2^{N}$;
-- the sign bit has the value $1-2^{N}$.
[#3] The values of any padding bits are unspecified.37) A valid (non-trap) object representation of a signed integer type where the sign bit is zero is a valid object representation of the corresponding unsigned type, and shall represent the same value.
____________________
37)Some combinations of padding bits might generate trap representations, for example, if one padding bit is a parity bit. Regardless, no arithmetic operation on valid values can generate a trap representation other than as part of an exception such as an overflow, and this cannot occur with unsigned types. All other combinations of padding bits are alternative object representations of the value specified by the value bits.

[#4] The precision of an integer type is the number of bits it uses to represent values, excluding any sign and padding bits. The width of an integer type is the same but including any sign bit; thus for unsigned integer types the two values are the same, while for signed integer types the width is one greater than the precision.
6.2.7 Compatible type and composite type
[#1] Two types have compatible type if their types are the same. Additional rules for determining whether two types are compatible are described in 6.7.2 for type specifiers, in 6.7.3 for type qualifiers, and in 6.7.5 for declarators.38) Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed types, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types, and such that if one member of a corresponding pair is declared with a name, the other member is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values.
[#2] All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
[#3] A composite type can be constructed from two types that are compatible; it is a type that is compatible with both of the two types and satisfies the following conditions:
-- If one type is an array of known constant size, the composite type is an array of that size; otherwise, if one type is a variable length array, the composite type is that type.
-- If only one type is a function type with a parameter type list (a function prototype), the composite type is a function prototype with the parameter type list.
-- If both types are function types with parameter type lists, the type of each parameter in the composite parameter type list is the composite type of the corresponding parameters.
These rules apply recursively to the types from which the two types are derived.
____________________
38)Two types need not be identical to be compatible.

[#4] For an identifier with internal or external linkage declared in a scope in which a prior declaration of that identifier is visible,39) if the prior declaration specifies internal or external linkage, the type of the identifier at the later declaration becomes the composite type.
[#5] EXAMPLE Given the following two file scope declarations:
        int f(int (*)(), double (*)[3]);
        int f(int (*)(char *), double (*)[]);
The resulting composite type for the function is:
        int f(int (*)(char *), double (*)[3]);
Forward references: declarators (6.7.5), enumeration specifiers (6.7.2.2), structure and union specifiers (6.7.2.1), type definitions (6.7.7), type qualifiers (6.7.3), type specifiers (6.7.2).
____________________
39)As specified in 6.2.1, the later declaration might hide the prior declaration.
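As an informal illustration (not part of the normative text), the composite type rules of 6.2.7 can be sketched with two pairs of compatible file scope declarations. The identifiers table, table_length, and sum3 are hypothetical; the sketch restricts itself to prototyped functions and to array types of known and unknown size.

```c
#include <assert.h>
#include <stddef.h>

/* First declaration: array of unknown size (an incomplete type). */
extern int table[];
/* Compatible redeclaration: the composite type is int[4]. */
extern int table[4];

int table[4] = { 10, 20, 30, 40 };

/* sizeof is permitted here because the composite type is complete. */
size_t table_length(void)
{
    return sizeof table / sizeof table[0];
}

/* The parameter types differ only in whether the array size is
 * known; the composite parameter type is double (*)[3]. */
double sum3(double (*row)[3]);
double sum3(double (*row)[]);

double sum3(double (*row)[3])
{
    return (*row)[0] + (*row)[1] + (*row)[2];
}

/* Self-check of both composite types. */
void check_composite_types(void)
{
    double r[3] = { 1.0, 2.0, 3.0 };
    assert(table_length() == 4);
    assert(sum3(&r) == 6.0);
}
```

In both pairs the later declaration adds or omits only the array size, which is why the declarations remain compatible and the size information from either one survives in the composite type.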
6.3 Conversions
[#1] Several operators convert operand values from one type to another automatically. This subclause specifies the result required from such an implicit conversion, as well as those that result from a cast operation (an explicit conversion). The list in 6.3.1.8 summarizes the conversions performed by most ordinary operators; it is supplemented as required by the discussion of each operator in 6.5.
[#2] Conversion of an operand value to a compatible type causes no change to the value or the representation.
Forward references: cast operators (6.5.4).
6.3.1 Arithmetic operands
6.3.1.1 Boolean, characters, and integers
[#1] Every integer type has an integer conversion rank defined as follows:
-- No two signed integer types shall have the same rank, even if they have the same representation.
-- The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.
-- The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
-- The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
-- The rank of any standard integer type shall be greater than the rank of any extended integer type with the same width.
-- The rank of char shall equal the rank of signed char and unsigned char.
-- The rank of _Bool shall be less than the rank of all other standard integer types.
-- The rank of any enumerated type shall equal the rank of the compatible integer type.
-- The rank of any extended signed integer type relative to another extended signed integer type with the same precision is implementation-defined, but still subject to the other rules for determining the integer conversion rank.
-- For all integer types T1, T2, and T3, if T1 has greater rank than T2 and T2 has greater rank than T3, then T1 has greater rank than T3.
[#2] The following may be used in an expression wherever an int or unsigned int may be used:
-- An object or expression with an integer type whose integer conversion rank is less than the rank of int and unsigned int.
-- A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.40) All other types are unchanged by the integer promotions.
[#3] The integer promotions preserve value including sign. As discussed earlier, whether a ``plain'' char is treated as signed is implementation-defined.
Forward references: enumeration specifiers (6.7.2.2), structure and union specifiers (6.7.2.1).
6.3.1.2 Boolean type
[#1] When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.
6.3.1.3 Signed and unsigned integers
[#1] When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
[#2] Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
____________________
40)The integer promotions are applied only: as part of the usual arithmetic conversions, to certain argument expressions, to the operands of the unary +, -, and ~ operators, and to both operands of the shift operators, as specified by their respective subclauses.

[#3] Otherwise, the new type is signed and the value cannot be represented in it; the result is implementation-defined.
6.3.1.4 Real floating and integer
[#1] When a finite value of real floating type is converted to integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.41)
[#2] When a value of integer type is converted to real floating type, if the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower value, chosen in an implementation-defined manner. If the value being converted is outside the range of values that can be represented, the behavior is undefined.
6.3.1.5 Real floating types
[#1] When a float is promoted to double or long double, or a double is promoted to long double, its value is unchanged.
[#2] When a double is demoted to float or a long double to double or float, if the value being converted is outside the range of values that can be represented, the behavior is undefined. If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is either the nearest higher or nearest lower value, chosen in an implementation-defined manner.
6.3.1.6 Complex types
[#1] When a value of complex type is converted to another complex type, both the real and imaginary parts follow the conversion rules for the corresponding real types.
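As an informal illustration (not part of the normative text), several of the conversions specified in 6.3.1.1 through 6.3.1.4 can be sketched with small helper functions. All function names are hypothetical. The sketch assumes the common case in which int can represent every value of unsigned char, so that the integer promotions yield int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* 6.3.1.1: c is promoted to int before the addition, so the result
 * does not wrap at UCHAR_MAX (assuming int can hold UCHAR_MAX + 1). */
int promoted_increment(unsigned char c)
{
    return c + 1;
}

/* 6.3.1.2: any scalar that compares unequal to 0 converts to 1. */
bool to_bool(double x)
{
    return (bool)x;
}

/* 6.3.1.3: an out-of-range value is reduced modulo UCHAR_MAX + 1. */
unsigned char to_uchar(int v)
{
    return (unsigned char)v;
}

/* 6.3.1.4: the fractional part is discarded. The caller must ensure
 * that the integral part is representable in int; otherwise the
 * behavior is undefined. */
int truncate_toward_zero(double x)
{
    return (int)x;
}

/* Self-check covering one case of each conversion. */
void check_conversions(void)
{
    assert(promoted_increment(UCHAR_MAX) == UCHAR_MAX + 1);
    assert(to_bool(0.5) == true);
    assert(to_bool(0.0) == false);
    assert(to_uchar(-1) == UCHAR_MAX);
    assert(truncate_toward_zero(3.9) == 3);
    assert(truncate_toward_zero(-3.9) == -3);
}
```

Note the contrast between to_bool(0.5), which yields 1, and truncate_toward_zero(0.5), which yields 0: conversion to _Bool tests for equality with zero rather than truncating.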
6.3.1.7 Real and complex
[#1] When a value of real type is converted to a complex type, the real part of the complex result value is determined by the rules of conversion to the corresponding real type and the imaginary part of the complex result value is a positive zero or an unsigned zero.
[#2] When a value of complex type is converted to a real type, the imaginary part of the complex value is discarded and the value of the real part is converted according to the conversion rules for the corresponding real type.
____________________
41)The remaindering operation performed when a value of integer type is converted to unsigned type need not be performed when a value of real floating type is converted to unsigned type. Thus, the range of portable real floating values is (-1, Utype_MAX+1).

6.3.1.8 Usual arithmetic conversions
[#1] Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is determined by the operator. This pattern is called the usual arithmetic conversions:
    First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
    Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
    Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.42)
    Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
        If both operands have the same type, then no further conversion is needed.
        Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
        Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
        Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
        Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
____________________
42)For example, addition of a double _Complex and a float entails just the conversion of the float operand to double (and yields a double _Complex result).

[#2] The values of floating operands and of the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.43)
6.3.2 Other operands
6.3.2.1 Lvalues and function designators
[#1] An lvalue is an expression with an object type or an incomplete type other than void;44) if an lvalue does not designate an object when it is evaluated, the behavior is undefined.
When an object is said to have a particular type, the type is specified by the lvalue used to designate the object. A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including, recursively, any member or element of all contained aggregates or unions) with a const-qualified type.
____________________
43)The cast and assignment operators are still required to perform their specified conversions as described in 6.3.1.4 and 6.3.1.5.
44)The name ``lvalue'' comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object ``locator value''. What is sometimes called ``rvalue'' is in this International Standard described as the ``value of an expression''. An obvious example of an lvalue is an identifier of an object. As a further example, if E is a unary expression that is a pointer to an object, *E is an lvalue that designates the object to which E points.

[#2] Except when it is the operand of the sizeof operator, the unary & operator, the ++ operator, the -- operator, or the left operand of the . operator or an assignment operator, an lvalue that does not have array type is converted to the value stored in the designated object (and is no longer an lvalue). If the lvalue has qualified type, the value has the unqualified version of the type of the lvalue; otherwise, the value has the type of the lvalue. If the lvalue has an incomplete type and does not have array type, the behavior is undefined.
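As an informal illustration (not part of the normative text), the distinction between an lvalue and the value obtained from it can be sketched as follows. The names set_through_pointer, read_const, and check_lvalues are hypothetical and serve only this example.

```c
#include <assert.h>

/* *p is an lvalue designating the object p points to. As the left
 * operand of an assignment it is not converted to a value, so the
 * store modifies that object. */
void set_through_pointer(int *p, int v)
{
    *p = v;
}

/* limit is an lvalue, but not a modifiable one, because its type is
 * const-qualified. Reading it converts the lvalue to a value of the
 * unqualified type int, which may be stored in a modifiable object. */
int read_const(void)
{
    const int limit = 10;
    int copy = limit;   /* lvalue converted to the stored value */
    copy += 1;          /* copy is a modifiable lvalue */
    return copy;
}

/* Self-check of both helpers. */
void check_lvalues(void)
{
    int x = 1;
    set_through_pointer(&x, 5);
    assert(x == 5);
    assert(read_const() == 11);
}
```

An attempt to write limit += 1 inside read_const would violate a constraint, since limit does not satisfy the definition of a modifiable lvalue given above.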
[#3] Except when it is the operand of the sizeof operator or the unary & operator, or is a string literal used to initialize an array, an expression that has type ``array of type'' is converted to an expression with type ``pointer to type'' that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.
[#4] A function designator is an expression that has function type. Except when it is the operand of the sizeof operator45) or the unary & operator, a function designator with type ``function returning type'' is converted to an expression that has type ``pointer to function returning type''.
Forward references: address and indirection operators (6.5.3.2), assignment operators (6.5.16), common definitions (7.17), initialization (6.7.8), postfix increment and decrement operators (6.5.2.4), prefix increment and decrement operators (6.5.3.1), the sizeof operator (6.5.3.4), structure and union members (6.5.2.3).
6.3.2.2 void
[#1] The (nonexistent) value of a void expression (an expression that has type void) shall not be used in any way, and implicit or explicit conversions (except to void) shall not be applied to such an expression. If an expression of any other type is evaluated as a void expression, its value or designator is discarded. (A void expression is evaluated for its side effects.)
6.3.2.3 Pointers
[#1] A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
____________________
45)Because this conversion does not occur, the operand of the sizeof operator remains a function designator and violates the constraint in 6.5.3.4.
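As an informal illustration (not part of the normative text), the array and function designator conversions of 6.3.2.1 and the pointer to void round trip of 6.3.2.3 can be sketched together. All function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* As the operand of sizeof the array is not converted to a pointer,
 * so sizeof a yields the size of the whole array. */
size_t element_count(void)
{
    int a[5] = { 0 };
    return sizeof a / sizeof a[0];
}

/* In most other contexts the array is converted to a pointer to its
 * initial element. Returns nonzero if that is what p holds. */
int decays_to_first_element(void)
{
    int a[3] = { 1, 2, 3 };
    int *p = a;
    return p == &a[0];
}

/* A pointer to an object type converted to pointer to void and back
 * compares equal to the original. Returns nonzero on success. */
int void_round_trip(void)
{
    double d = 1.5;
    double *original = &d;
    void *generic = original;
    double *back = generic;
    return back == original && *back == 1.5;
}

/* A function designator is converted to a pointer to the function,
 * so no explicit & is needed to initialize fp. */
int call_via_designator(void)
{
    int (*fp)(void) = void_round_trip;
    return fp();
}

/* Self-check of all four helpers. */
void check_pointer_conversions(void)
{
    assert(element_count() == 5);
    assert(decays_to_first_element());
    assert(void_round_trip());
    assert(call_via_designator() == 1);
}
```

The two exceptions are what make the idiom sizeof a / sizeof a[0] work: without the sizeof exception the expression would measure a pointer rather than the array.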
[#2] For any qualifier q, a pointer to a non-q-qualified type may be converted to a pointer to the q-qualified version of the type; the values stored in the original and converted pointers shall compare equal.
[#3] An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.46) If a null pointer constant is assigned to or compared for equality to a pointer, the constant is converted to a pointer of that type. Such a pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
[#4] Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal.
[#5] An integer may be converted to any pointer type. The result is implementation-defined, might not be properly aligned, and might not point to an entity of the referenced type.47)
[#6] Any pointer type may be converted to an integer type; the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
[#7] A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned48) for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.
____________________
46)The macro NULL is defined in <stddef.h> as a null pointer constant; see 7.17.
47)The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to be consistent with the addressing structure of the execution environment.
48)In general, the concept ``correctly aligned'' is transitive: if a pointer to type A is correctly aligned for a pointer to type B, which in turn is correctly aligned for a pointer to type C, then a pointer to type A is correctly aligned for a pointer to type C.

[#8] A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.
Forward references: cast operators (6.5.4), equality operators (6.5.9), simple assignment (6.5.16.1).
6.4 Lexical elements
Syntax
[#1]
        token:
                keyword
                identifier
                constant
                string-literal
                punctuator
        preprocessing-token:
                header-name
                identifier
                pp-number
                character-constant
                string-literal
                punctuator
                each universal-character-name that cannot be one of the above
                each non-white-space character that cannot be one of the above
Constraints
[#2] Each preprocessing token that is converted to a token shall have the lexical form of a keyword, an identifier, a constant, a string literal, or a punctuator.
Semantics
[#3] A token is the minimal lexical element of the language in translation phases 7 and 8. The categories of tokens are: keywords, identifiers, constants, string literals, and punctuators. A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6.
The categories of preprocessing token are: header names, identifiers, preprocessing numbers, character constants, string literals, punctuators, and single non-white-space characters that do not lexically match the other preprocessing token categories.49) If a ' or a " character matches the last category, the behavior is undefined. Preprocessing tokens can be separated by white space; this consists of comments (described later), or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both. As described in 6.10, in certain circumstances during translation phase 4, white space (or the absence thereof) serves as more than preprocessing token separation. White space may appear within a preprocessing ____________________ 49)An additional category, placemarkers, is used internally in translation phase 4 (see 6.10.3.3); it cannot occur in source files. 6.4 Language