C Style -- Formatting

1. Introduction

.scope: This document describes the Ravenbrook conventions for the general format of C source code in the MPS.

.readership: This document is intended for anyone working on or with the C source code.

2. General Formatting Conventions

Line Width

.width: Lines should be no wider than 72 characters. .width.why: Many people use 80 column terminal windows so that multiple windows can be placed side by side. Restricting lines to 72 characters allows line numbering to be used (in vi for example) and also allows diffs to be displayed without overflowing the terminal.

White Space

.space.notab: No tab characters should appear in the source files. Ordinary spaces should be used to indent and format the sources. .space.notab.why: Tab characters are displayed differently on different platforms, and sometimes translated back and forth, destroying layout information.

.space.punct: There should always be whitespace after commas and semicolons and similar punctuation.

.space.op: Put white space around operators in expressions, except when removing it would make the expression clearer by binding certain sub-expressions more tightly. For example:

  foo = x + y*z;

.space.control.not: No space between the keywords if, switch, while, for and the following paren.

.space.function.not: No space between a function name and the following paren beginning its argument list.

.space.function.not: No space between a function name and the following paren beginning its argument list.

Sections and Paragraphs

.section: Source files can be thought of as breaking down into "sections" and "paragraphs". A section might be the leader comment of a file, the imports, or a set of declarations which are related.

.section.space: Precede sections by two blank lines (except the first one in the file, which should be the leader comment in any case).

.section.comment: Each section should start with a banner comment (see .comment.banner) describing what the section contains.

.para: Within sections, code often breaks down into natural units called "paragraphs". A paragraph might be a set of strongly related declarations (Init and Finish, for example), or a few lines of code which it makes sense to consider together (the assignment of fields into a structure, for example).

.para.space: Precede paragraphs by a single blank line.

Statements

.statement.one: Generally only have at most one statement per line. In particular the following are deprecated:

  if(thing) return;  

  a=0; b=0;

  case 0: f = inRampMode ? AMCGen0RampmodeFrequency : AMCGen0Frequency;

.statement.one.why: Debuggers can often only place breakpoints on lines, not expressions or statements within a line. The "if(thing) return;" is a particularly important case, if thing is a reasonably rare return condition then you might want to breakpoint it in a debugger session. Annoying because "if(thing) return;" is quite compact and pleasing otherwise.

Indentation

.indent: Indent the body of a block by two spaces. For formatting purposes, the "body of a block" means:

  • statements between braces,
  • a single statement following a lone if;
  • statements in a case-clause; see .switch.

(.indent.logical: The aim is to group what we think of as logical blocks, even though they may not exactly match how "block" is used in the definition of C syntax).

Some examples:

  if(res != ResOK) {
    SegFinish(&span->segStruct);
    PoolFreeP(MV->spanPool, span, sizeof(SpanStruct));
    return res;
  }

  if(res != ResOK)
    goto error;

  if(j == block->base) {
    if(j+step == block->limit) {
      if(block->thing)
        putc('@', stream);
    }
  } else if(j+step == block->limit) {
    putc(']', stream);
    pop_bracket();
  } else {
    putc('.', stream);
  }

  case 'A': {
    c = 'A';
    p += 1;
    break;
  }

.indent.goto-label: Place each goto-label on a line of its own, outdented to the same level as the surrounding block. Then indent the non-label part of the statement normally.

  result foo(void)
  {
    statement();
    if(error)
      goto foo;
    statement();
    return OK;

  foo:
    unwind();
    return ERROR;
  }

.indent.case-label.not: Do not outdent case- and default-labels in a switch statement. See .switch.

.indent.cont: If an expression or statement won't fit on a single line, indent the continuation lines by two spaces, apart from the following exception:

.indent.cont.parens: if you break a statement inside a parameter list or other parenthesized expression, indent so that the continuation lines up just after the open parenthesis. For example:

  PoolClassInit(&PoolClassMVStruct,
                "MV", init, finish, allocP, freeP,
                NULL, NULL, describe, isValid);

.indent.cont.expr: Note that when breaking an expression it is clearer to place the operator at the start of the continuation line:

  CHECKL(AddrAdd((Addr)chunk->pageTableMapped,
                 BTSize(chunk->pageTablePages))
         <= AddrAdd(chunk->base, chunk->ullageSize));

This is particularly useful in long conditional expressions that use && and ||. For example:

  } while(trace->state != TraceFINISHED
          && (trace->emergency || traceWorkClock(trace) < pollEnd));

.indent.hint: Usually, it is possible to determine the correct indentation for a line by looking to see if the previous line ends with a semicolon. If it does, indent to the same amount, otherwise indent by two more spaces. The main exceptions are lines starting with a close brace, goto-labels, and line-breaks between parentheses.

Positioning of Braces

.brace.otb: Use the "One True Brace" (or OTB) style. This places the open brace after the control word or expression, separated by a space, and when there is an else, places that after the close brace. For example:

  if(isBase) {
    new->base = limit;
    new->limit = block->limit;
    block->limit = base;
    new->next = block->next;
    block->next = new;
  } else {
    new->base = block->base;
    new->limit = base;
    block->base = limit;
    new->next = block;
    *prev = new;
  }

The same applies to struct, enum, union.

.brace.otb.function.not: OTB is never used for function definitions.

.brace.always: Braces are always required after if, else, switch, while, do, and for.

.brace.always.except: Except that a lone if with no else is allowed to drop its braces when its body is a single simple statement. Typically this will be a goto or an assignment. For example:

  if(res != ResOK)
    goto failStart;

Note in particular that an if with an else must have braces on both paths.

Switch Statements

.switch: format switch statements like this:

  switch(action) {
    case WIBBLE:
    case WOBBLE: {
      int angle;
      err = move(plate, action, &angle);
      break;
    }
    case QUIESCENT: {
      err = 0;
      break;
    }
    default: {
      NOTREACHED;
      break;
    }
  }

The component rules that result in this style are:

.switch.two-levels: The switch statement as a whole is a block, and (for formatting purposes) each case- or default-label starts a nested block within that. Always use two levels. Do not try to 'save a level' by outdenting case-labels: it's an anachronism from when BSD used 8-column indents, and it causes havoc with brace positioning.

.switch.brace: Put braces around every case-clause body. Put the open brace on the same line as the last case-label (following .brace.otb). Put the close brace on a line on its own.

.switch.break: The last line of every case-clause body, before the close brace, must be an unconditional jump statement (usually break, but may be goto, continue, or return), or if a fall-through is intended, the comment /* fall-through */. (Note: if the unconditional jump should never be taken, because of previous conditional jumps, use NOTREACHED on the line before it). This rule is to prevent accidental fall-throughs, even if someone makes a editing mistake that causes a conditional jump to be missed.

(.switch.break.outside.not: Do not put the break outside the case-clause braces: there is no need. Placing it on the last line of the case-clause body is less weird and equally clear.)

.switch.default: It is usually a good idea to have a default-clause, even if all it contains is NOTREACHED and break.

.switch.new-rules: The above rules are simple, consistent, and not onerous. But they are also new, so existing MPS source code may not yet follow them.

.switch.horrors: There are currently around 5 different formatting styles for switch statements in MPS source code: this is madness. Variation is mostly caused by inventive violations of .switch.two-levels (either case-label outdenting, or something worse: see root.c), compounded by inventive violations of OTB for the braces around the case-clause. The following transgressions are deprecated:

  switch(action) {
  case WOBBLE: {   /* BAD: open brace on outdented case-label line! */
    ...
  }  /* BAD: outdented close brace! */

  } break;  /* BAD: break sharing line with close brace! */

Formatting Comments

.comment: There are three types of comments: banners, paragraph comments, and columnar comments.

.comment.banner: Banner comments come at the start of sections. A banner comment consists of a heading usually composed of a symbol, an em-dash (--) and an short explanation, followed by English text which is formatted using conventional text documentation guidelines (see guide.text). The open and close comment tokens ("/*" and "*/") are placed at the top and bottom of a column of asterisks. The text is separated from the asterisks by one space. Place a blank line between the banner comment and the section it comments. For example:

/* BlockStruct --  Block descriptor
 *
 * The pool maintains a descriptor structure for each 
 * contiguous allocated block of memory it manages.  
 * The descriptor is on a simple linked-list of such 
 * descriptors, which is in ascending order of address.
 */

typedef struct BlockStruct {

.comment.para: Paragraph comments come at the start of paragraphs in the code. A paragraph comment consists of formatted English text, with each line wrapped by the open and close comment tokens ("/*" and "*/"). (This avoids problems when cutting and pasting comments.) For example:

  /* If the freed area is in the base sentinel then insert */
  /* the new descriptor after it, otherwise insert before. */
  if(isBase) {

.comment.para.precede: Paragraph comments, even one-liners, precede the code to which they apply.

.comment.column: Columnar comments appear in a column to the right of the code. They should be used sparingly, since they clutter the code and make it hard to edit. Use them on variable declarations and structure, union, or enum declarations. They should start at least at column 32 (counting from 0, that is, on a tab-stop), and should be terse descriptive text. Abandon English sentence structure if this makes the comment clearer. Don't write more than one line. Here's an example:

  typedef struct PoolMVStruct
  {
    Pool blockPool;           /* for block descriptors */
    Pool spanPool;            /* for span descriptors */
    size_t extendBy;          /* size to extend pool by */
    size_t avgSize;           /* estimate of allocation size */
    size_t maxSize;           /* estimate of maximum size */
    Addr space;               /* total free space in pool */
    Addr lost;                /* lost when free can't allocate */
    struct SpanStruct *spans; /* span chain */
  } PoolMVStruct;

Macros

.macro.careful: Macros in C are a real horror bag, be extra careful. There's lots that could go here, but proper coverage probably deserves a separate document. Which isn't written yet.

.macro.general: Do try and follow the other formatting conventions for code in macro definitions.

.macro.backslash: Backslashes used for continuation lines in macro definitions should be put on the right somewhere where they will be less in the way. Example:

#define RAMP_RELATION(X)                       \
  X(RampOUTSIDE,        "outside ramp")        \
  X(RampBEGIN,          "begin ramp")          \
  X(RampRAMPING,        "ramping")             \
  X(RampFINISH,         "finish ramp")         \
  X(RampCOLLECTING,     "collecting ramp")

B. History

2007-06-04 DRJ Adopted from //info.ravenbrook.com/project/mps/doc/2002-06-18/obsolete-mminfo/mminfo/guide/impl/c/format/index.txt and edited.
2007-06-04 DRJ Changed .width from 80 to 72. Banned space between "if" and "(". Required braces on almost everything. Clarified that paragraph comments precede the code.
2007-06-13 RHSK Removed .brace.block, because MPS source always uses .brace.otb. Remove .indent.elseif because it is obvious (ahem) and showing an example is sufficient. New rules for .switch.*: current MPS practice is a mess, so lay down a neat new law.
2007-06-27 RHSK Added .space.function.not.
2007-07-17 DRJ Added .macro.*