A library of procedures for formatting Scheme objects to text in
various ways, and for easily concatenating, composing and extending
these formatters efficiently without resorting to capturing and
manipulating intermediate strings.
For Gauche run
For MzScheme you can download and install the latest
http://synthcode.com/scheme/fmt/fmt.plt
To build the
For Scheme48 the package descriptions are in
For other implementations you'll need to load SRFIs 1, 6, 13, 33
(sample provided) and 69 (also provided), and then load the following
files:
The traditional approach is to use templates - typically strings,
though in theory any object could be used and indeed Emacs' mode-line
format templates allow arbitrary sexps. Templates can use either
escape sequences (as in C's
This library takes a combinator approach. Formats are nested chains
of closures, which are called to produce their output as needed.
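As a rough illustration of the idea (a minimal sketch, not the library's actual implementation), each formatter below is a closure that takes a state and returns the updated state. The names sketch-dsp, sketch-cat, sketch-space-to and sketch-fmt->string are hypothetical, the state is assumed to be just a (port . column) pair, and open-output-string and get-output-string are assumed to be available from SRFI 6 or R7RS.

```scheme
;; Hypothetical state: a pair of (output-port . current-column).
(define (make-state port) (cons port 0))
(define (state-port st) (car st))
(define (state-col st) (cdr st))

;; Write STR to the port and return a new state with the column updated.
(define (emit str st)
  (display str (state-port st))
  (let loop ((i 0) (col (state-col st)))
    (cond ((= i (string-length str)) (cons (state-port st) col))
          ((char=? (string-ref str i) #\newline) (loop (+ i 1) 0))
          (else (loop (+ i 1) (+ col 1))))))

;; A formatter is a closure from state to state.
(define (sketch-dsp str) (lambda (st) (emit str st)))

;; Concatenation threads the state through each formatter in turn.
(define (sketch-cat . formats)
  (lambda (st)
    (let loop ((fs formats) (st st))
      (if (null? fs) st (loop (cdr fs) ((car fs) st))))))

;; Pad with spaces up to COL using only the column kept in the state.
(define (sketch-space-to col)
  (lambda (st)
    (let ((n (- col (state-col st))))
      (if (> n 0) (emit (make-string n #\space) st) st))))

;; Run the formatters against a string port and return the result.
(define (sketch-fmt->string . formats)
  (let ((port (open-output-string)))
    ((apply sketch-cat formats) (make-state port))
    (get-output-string port)))

(sketch-fmt->string (sketch-dsp "ab") (sketch-space-to 5) (sketch-dsp "c"))
;; => "ab   c"
```

Because the column is carried in the state, sketch-space-to can pad without capturing the output of the formatters that ran before it.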
The primary goal of this library is to have, first and foremost, a
maximally expressive and extensible formatting library. The next
most important goal is scalability - to be able to handle
arbitrarily large output and not build intermediate results except
where necessary. The third goal is brevity and ease of use.
where
Each
would return the string
A
These parameters may seem unwieldy, but they can also take their
defaults from state variables, described below.
The
See http://www.bipm.org/en/si/si_brochure/chapter3/prefixes.html.
As
Note these are column-oriented padders, so won't necessarily work
with multi-line output (padding doesn't seem a likely operation for
multi-line output).
If a truncation ellipse is set (e.g. with the
A convenience control structure can be useful in combination with
these states:
Many of the previously mentioned combinators have behavior which can
be altered with state variables. Although
will return the string
would return
Note that fixed point formatting supports arbitrary precision in
implementations with exact non-integral rationals. When trying to
print inexact numbers more than the machine precision you will
typically get results like
but with an exact rational it will give you as many digits as you
request:
would output
would output
outputs
assuming a 16-char width (the left side gets half the width, or 8
spaces, and is left aligned). Note that we explicitly use DSP instead
of the strings directly. This is because
would output
You may also prefix any column with any of the symbols
You can further prefix any column with a width modifier. Any
positive integer is treated as a fixed width, ignoring the available
width. Any real number between 0 and 1 indicates a fraction of the
available width (after subtracting out any fixed widths). Columns
with unspecified width divide up the remaining width evenly.
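The width rules above can be sketched in plain Scheme. This is only an illustration under stated assumptions: the procedure name column-widths is hypothetical, #f stands for an unspecified width, and the rounding (flooring fractions, integer division of the remainder) is a guess rather than the library's documented behavior.

```scheme
;; Keep the elements of LS that satisfy PRED (avoids depending on SRFI 1).
(define (keep pred ls)
  (cond ((null? ls) '())
        ((pred (car ls)) (cons (car ls) (keep pred (cdr ls))))
        (else (keep pred (cdr ls)))))

(define (fixed-width? s) (and (integer? s) (> s 0)))
(define (fraction? s) (and (number? s) (< 0 s 1)))

;; TOTAL is the available width; SPECS is a list whose elements are a
;; fixed width, a fraction between 0 and 1, or #f for "unspecified".
(define (column-widths total specs)
  (let* ((available (- total (apply + (keep fixed-width? specs))))
         (frac-width (lambda (s) (inexact->exact (floor (* s available)))))
         (frac-total (apply + (map frac-width (keep fraction? specs))))
         (unspecified (length (keep not specs)))
         (share (if (> unspecified 0)
                    (quotient (- available frac-total) unspecified)
                    0)))
    (map (lambda (s)
           (cond ((fixed-width? s) s)
                 ((fraction? s) (frac-width s))
                 (else share)))
         specs)))

(column-widths 16 '(#f #f))        ; => (8 8)
(column-widths 80 '(6 0.25 #f #f)) ; => (6 18 28 28)
```

The first call reproduces the even split of a 16-char width into two 8-char columns; the second subtracts the fixed 6 first, gives the fractional column a quarter of the remaining 74, and splits what is left between the two unspecified columns.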
Note that
As an implementation detail,
outputs
This makes it easier to generate tables without knowing widths in
advance. However, because it requires generating the entire output in
advance to determine the correct column widths,
where
outputs
The Unix
There are two approaches to using the C formatting extensions -
procedural and sexp-oriented (described in 6.7). In the
procedural interface, C operators are made available as formatters
with a "c-" prefix, literals are converted to their C equivalents and
symbols are output as-is (you're responsible for making sure they are
valid C identifiers). Indentation is handled automatically.
outputs
In addition, the formatter knows when you're in an expression and
when you're in a statement, and behaves accordingly, so that
outputs
Similarly,
Moreover, we also keep track of the final expression in a function
and insert returns for you:
outputs
although it knows that void functions don't return.
Switch statements insert breaks by default if they don't return:
though you can explicitly fallthrough if you want:
Operators are available with just a "c" prefix, e.g. c+, c-, c*, c/,
etc.
Function applications are written with
When a C formatter encounters an object it doesn't know how to write
(including lists and records), it outputs them according to the
format state's current
If the
Macros can be handled with
As with all C formatters, the CPP output is pretty printed as
needed, and if it wraps over several lines the lines are terminated
with a backslash.
To write a C header file that is included at most once, you can wrap
the entire body in
The C formatters also respect the
Types are typically just symbols, or lists of symbols such as
These can also be accessed as %fun and %prototype at the head of a list.
Typically a type is just a symbol such as
Pointers may be written as
Unnamed structs, classes, unions and enums may be used directly as
types, using their respective keywords at the head of a list.
Two special types are the %array type and function pointer type. An
array is written:
where
A function pointer is written:
For example:
Wherever a type is expected but not given, the value of the
Type declarations work uniformly for variables and parameters, as well
as for casts and typedefs.
Rather than building formatting closures by hand, it can be more
convenient to just build a normal s-expression and ask for it to be
formatted as C code. This can be thought of as a simple Scheme->C
compiler without any runtime support.
In an s-expression, strings and characters are printed as C strings and
characters, booleans are printed as 0 or 1, symbols are displayed
as-is, and numbers are printed as C numbers (using the current
formatting radix if specified). Vectors are printed as
comma-separated lists wrapped in braces, which can be used for
initializing arrays or structs.
A list indicates a C expression or statement. Any of the existing C
keywords can be used to pretty-print the expression as described with
the c-keyword formatters above. Thus, the example above
could also be written
C constructs that are dependent on the underlying syntax and have no
keyword are written with a % prefix (
For example, the following definition of the Fibonacci sequence, which
apart from the return type of
prints the working C definition:
Currently expressions will all be terminated with a semi-colon, but
that will be made optional in a later release.
and more generally
where color can be a symbol name or
can be used to mark the format state as being inside HTML, which the
above color formats will understand and output HTML
It also recognizes and ignores ANSI escapes, in particular useful if
you want to combine this with the fmt-color utilities.
2 Installation
Available for Chicken as the fmt egg, providing the fmt, fmt-c,
fmt-color and fmt-unicode extensions. To install manually for Chicken
just run "chicken-setup" in the fmt directory.
For Gauche run "make gauche && make install-gauche". The modules
are installed as text.fmt, text.fmt.c, text.fmt.color and
text.fmt.unicode.
For MzScheme you can download and install the latest fmt.plt yourself
from:
http://synthcode.com/scheme/fmt/fmt.plt
To build the fmt.plt for yourself you can run "make mzscheme".
For Scheme48 the package descriptions are in fmt-scheme48.scm:
> ,config ,load fmt-scheme48.scm
> ,open fmt
(load "let-optionals.scm") ; if you don't have LET-OPTIONALS*
(load "read-line.scm") ; if you don't have READ-LINE
(load "string-ports.scm") ; if you don't have CALL-WITH-OUTPUT-STRING
(load "make-eq-table.scm")
(load "mantissa.scm")
(load "fmt.scm")
(load "fmt-pretty.scm") ; optional pretty printing
(load "fmt-column.scm") ; optional columnar output
(load "fmt-c.scm") ; optional C formatting utilities
(load "fmt-color.scm") ; optional color utilities
(load "fmt-unicode.scm") ; optional Unicode-aware formatting,
; also requires SRFI-4 or SRFI-66
3 Background
There are several approaches to text formatting. Building strings to
display is not acceptable, since it doesn't scale to very large
output. The simplest realistic idea, and what people resort to in
typical portable Scheme, is to interleave display and write and
manual loops, but this is both extremely verbose and doesn't
compose well. A simple concept such as padding space can't be
achieved directly without somehow capturing intermediate output.
The traditional approach is to use templates - typically strings,
though in theory any object could be used and indeed Emacs' mode-line
format templates allow arbitrary sexps. Templates can use either
escape sequences (as in C's printf and CL's format) or pattern
matching (as in Visual Basic's Format, Perl6's form, and SQL date
formats). The primary disadvantages of templates are the relative
difficulty (usually impossibility) of extending them, their
opaqueness, and the unreadability that arises with complex formats.
Templates are not without their advantages, but they are already
addressed by other libraries such as SRFI-28 and SRFI-48.
4 Usage
The primary interface is the fmt procedure:
(fmt <output-dest> <format> ...)
where <output-dest> has the same semantics as with format -
specifically it can be an output-port, #t to indicate the current
output port, or #f to accumulate output into a string.
Each <format> should be a format closure as discussed below. As a
convenience, non-procedure arguments are also allowed and are
formatted similarly to display, so that
(fmt #f "Result: " res nl)
would return the string "Result: 42\n", assuming RES is bound
to 42. nl is the newline format combinator.
5 Specification
The procedure names have gone through several variations, and I'm
still open to new suggestions. The current approach is to use
abbreviated forms of standard output procedures when defining an
equivalent format combinator (thus display becomes dsp and
write becomes wrt), and to use an fmt- prefix for
utilities and less common combinators. Variants of the same formatter
get a /<variant> suffix.
5.1 Formatting Objects
(dsp <obj>)
Outputs <obj> using display semantics. Specifically, strings
are output without surrounding quotes or escaping and characters are
written as if by write-char. Other objects are written as with
write (including nested strings and chars inside <obj>). This
is the default behavior for top-level formats in fmt, cat and
most other higher-order combinators.
(wrt <obj>)
Outputs <obj> using write semantics. Handles shared
structures as in SRFI-38.
(wrt/unshared <obj>)
As above, but doesn't handle shared structures. Infinite loops can
still be avoided if used inside a combinator that truncates data (see
trim and fit below).
(pretty <obj>)
Pretty-prints <obj>. Also handles shared structures. Unlike many
other pretty printers, vectors and data lists (lists that don't begin
with a (nested) symbol), are printed in tabular format when there's
room, greatly saving vertical space.
(pretty/unshared <obj>)
As above but without sharing.
(slashified <str> [<quote-ch> <esc-ch> <renamer>])
Outputs the string <str>, escaping any quote or escape characters.
If <esc-ch> is #f, escapes only the <quote-ch> by
doubling it, as in SQL strings and CSV values. If <renamer> is
provided, it should be a procedure of one character which maps that
character to its escape value, e.g. #\newline => #\n, or #f if
there is no escape value.
(fmt #f (slashified "hi, \"bob!\""))
=> "hi, \"bob!\""
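The two escaping modes can be illustrated on plain strings. This is a minimal sketch under stated assumptions: sketch-slashify is a hypothetical helper, it returns a string instead of writing to the format state, and it ignores the <renamer> argument.

```scheme
;; Escape QUOTE-CH (and ESC-CH itself) in STR. When ESC-CH is #f, the
;; quote character is doubled instead, as in SQL strings and CSV values.
(define (sketch-slashify str quote-ch esc-ch)
  (let loop ((chars (string->list str)) (acc '()))
    (cond ((null? chars) (list->string (reverse acc)))
          ;; Escape mode: prefix quote and escape characters with ESC-CH.
          ((and esc-ch (or (char=? (car chars) quote-ch)
                           (char=? (car chars) esc-ch)))
           (loop (cdr chars) (cons (car chars) (cons esc-ch acc))))
          ;; Doubling mode: emit the quote character twice.
          ((and (not esc-ch) (char=? (car chars) quote-ch))
           (loop (cdr chars) (cons quote-ch (cons quote-ch acc))))
          (else (loop (cdr chars) (cons (car chars) acc))))))

(sketch-slashify "it's" #\' #f)            ; => "it''s"
(sketch-slashify "hi, \"bob!\"" #\" #\\)   ; => "hi, \\\"bob!\\\""
```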
(maybe-slashified <str> <pred> [<quote-ch> <esc-ch> <renamer>])
Like slashified, but first checks if any quoting is required (by
the existence of either any quote or escape characters, or any
character matching <pred>), and if so outputs the string in quotes
and with escapes. Otherwise outputs the string as is.
(fmt #f (maybe-slashified "foo" char-whitespace? #\"))
=> "foo"
(fmt #f (maybe-slashified "foo bar" char-whitespace? #\"))
=> "\"foo bar\""
(fmt #f (maybe-slashified "foo\"bar\"baz" char-whitespace? #\"))
=> "\"foo\\\"bar\\\"baz\""
5.2 Formatting Numbers
(num <n> [<radix> <precision> <sign> <comma> <comma-sep> <decimal-sep>])
Formats a single number <n>. You can optionally specify any
<radix> from 2 to 36 (even if <n> isn't an integer).
<precision> forces a fixed-point format.
A <sign> of #t indicates to output a plus sign (+) for positive
integers. However, if <sign> is a character, it means to wrap the
number with that character and its mirror opposite if the number is
negative. For example, #\( prints negative numbers in parentheses,
financial style: -3.14 => (3.14)
<comma> is an integer specifying the number of digits between
commas. Variable length, as in subcontinental-style, is not yet
supported.
<comma-sep> is the character to use for commas, defaulting to #\,.
<decimal-sep> is the character to use for decimals, defaulting to
#\., or to #\, (European style) if <comma-sep> is already #\..
(num/comma <n> [<base> <precision> <sign>])
Shortcut for num to print with commas.
(fmt #f (num/comma 1234567))
=> "1,234,567"
(num/si <n> [<base> <suffix>])
Abbreviates <n> with an SI suffix as in the -h or --si option to
many GNU commands. The base defaults to 1024, using suffix names
like Ki, Mi, Gi, etc. Other bases (e.g. the standard 1000) have the
suffixes k, M, G, etc.
The <suffix> argument is appended only if an abbreviation is used.
(fmt #f (num/si 608))
=> "608"
(fmt #f (num/si 3986))
=> "3.9Ki"
(fmt #f (num/si 3986 1000 "B"))
=> "4kB"
(num/fit <width> <n> . <ARGS>)
Like num, but if the result doesn't fit in <width>, output
instead a string of hashes (with the current <precision>) rather
than showing an incorrectly truncated number. For example
(fmt #f (fix 2 (num/fit 4 12.345)))
=> "#.##"
(num/roman <n>)
Formats the number as a Roman numeral:
(fmt #f (num/roman 1989))
=> "MCMLXXXIX"
(num/old-roman <n>)
Formats the number as an old-style Roman numeral, without the
subtraction abbreviation rule:
(fmt #f (num/old-roman 1989))
=> "MDCCCCLXXXVIIII"
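The difference between the two styles comes down to whether the subtractive pairs (CM, CD, XC, XL, IX, IV) are in the conversion table. A minimal sketch in plain Scheme, using hypothetical names (sketch-roman, roman-table, old-roman-table) and a greedy conversion that is not necessarily how the library implements it:

```scheme
;; Value/numeral pairs, largest first. The modern table includes the
;; subtractive pairs; the old-style table omits them.
(define roman-table
  '((1000 . "M") (900 . "CM") (500 . "D") (400 . "CD")
    (100 . "C") (90 . "XC") (50 . "L") (40 . "XL")
    (10 . "X") (9 . "IX") (5 . "V") (4 . "IV") (1 . "I")))

(define old-roman-table
  '((1000 . "M") (500 . "D") (100 . "C") (50 . "L")
    (10 . "X") (5 . "V") (1 . "I")))

;; Greedily subtract the largest value that still fits, collecting
;; the corresponding numerals.
(define (sketch-roman n table)
  (let loop ((n n) (table table) (acc '()))
    (cond ((or (<= n 0) (null? table))
           (apply string-append (reverse acc)))
          ((>= n (caar table))
           (loop (- n (caar table)) table (cons (cdar table) acc)))
          (else (loop n (cdr table) acc)))))

(sketch-roman 1989 roman-table)     ; => "MCMLXXXIX"
(sketch-roman 1989 old-roman-table) ; => "MDCCCCLXXXVIIII"
```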
5.3 Formatting Space
nl
Outputs a newline.
fl
Short for "fresh line," outputs a newline only if we're not already
at the start of a line.
(space-to <column>)
Outputs spaces up to the given <column>. If the current column is
already >= <column>, does nothing.
(tab-to [<tab-width>])
Outputs spaces up to the next tab stop, using tab stops of width
<tab-width>, which defaults to 8. If already on a tab stop, does
nothing. If you want to ensure you always tab at least one space, you
can use (cat " " (tab-to width)).
fmt-null
Outputs nothing (useful in combinators and as a default noop in
conditionals).
5.4 Concatenation
(cat <format> ...)
Concatenates the output of each <format>.
(apply-cat <list>)
Equivalent to (apply cat <list>) but may be more efficient.
(fmt-join <formatter> <list> [<sep>])
Formats each element <elt> of <list> with (<formatter> <elt>),
inserting <sep> in between. <sep> defaults to the
empty string, but can be any format.
(fmt #f (fmt-join dsp '(a b c) ", "))
=> "a, b, c"
(fmt-join/prefix <formatter> <list> [<sep>])
(fmt-join/suffix <formatter> <list> [<sep>])
(fmt #f (fmt-join/prefix dsp '(usr local bin) "/"))
=> "/usr/local/bin"
As fmt-join, but inserts <sep> before/after every element.
(fmt-join/last <formatter> <last-formatter> <list> [<sep>])
As fmt-join, but the last element of the list is formatted with
<last-formatter> instead.
(fmt-join/dot <formatter> <dot-formatter> <list> [<sep>])
As fmt-join, but if the list is a dotted list, then formats the dotted
value with <dot-formatter> instead.
5.5 Padding and Trimming
(pad <width> <format> ...)
(pad/left <width> <format> ...)
(pad/both <width> <format> ...)
Analogs of SRFI-13 string-pad, these add extra space to the left,
right or both sides of the output generated by the <format>s to
pad it to <width>. If <width> is exceeded, this has no effect.
pad/both will include an extra space on the right side of the
output if the difference is odd.
pad does not accumulate any intermediate data.
(trim <width> <format> ...)
(trim/left <width> <format> ...)
(trim/both <width> <format> ...)
Analogs of SRFI-13 string-trim, these truncate the output of the
<format>s to force it in under <width> columns. As soon as
any of the <format>s exceed <width>, stop formatting and
truncate the result, returning control to whoever called trim. If
<width> is not exceeded, this has no effect.
If a truncation ellipse is set (e.g. with the ellipses procedure
below), then when any truncation occurs trim and trim/left
will append and prepend the ellipse, respectively. trim/both will
both prepend and append. The length of the ellipse will be considered
when truncating the original string, so that the total width will
never be longer than <width>.
(fmt #f (ellipses "..." (trim 5 "abcde")))
=> "abcde"
(fmt #f (ellipses "..." (trim 5 "abcdef")))
=> "ab..."
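The ellipse arithmetic can be illustrated on plain strings. This is a hypothetical helper (sketch-trim is not part of the library) that works on a finished string rather than on streaming output, and it assumes the ellipse is no longer than the width.

```scheme
;; Truncate STR on the right to at most WIDTH characters. When
;; truncation happens, ELL replaces the tail, so the kept prefix is
;; shortened by the length of ELL.
(define (sketch-trim width str ell)
  (if (<= (string-length str) width)
      str
      (let ((keep (max 0 (- width (string-length ell)))))
        (string-append (substring str 0 keep) ell))))

(sketch-trim 5 "abcde" "...")  ; => "abcde"
(sketch-trim 5 "abcdef" "...") ; => "ab..."
```

With a width of 5 and a three-character ellipse, only two characters of the original string survive, which is why the second result is "ab..." rather than "abcde...".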
(trim/length <width> <format> ...)
A variant of trim which acts on the actual character count rather
than columns, useful for truncating potentially cyclic data.
(fit <width> <format> ...)
(fit/left <width> <format> ...)
(fit/both <width> <format> ...)
A combination of pad and trim, ensures the output width is
exactly <width>, truncating if it goes over and padding if it goes
under.
5.6 Format Variables
You may have noticed many of the formatters are aware of the current
column. This is because each combinator is actually a procedure of
one argument, the current format state, which holds basic
information such as the row, column, and any other information that
a format combinator may want to keep track of. The basic interface
is:
(fmt-let <name> <value> <format> ...)
(fmt-bind <name> <value> <format> ...)
fmt-let sets the name for the duration of the <format>s, and
restores it on return. fmt-bind sets it without restoring it.
A convenience control structure can be useful in combination with
these states:
(fmt-if <pred> <pass> [<fail>])
<pred> takes one argument (the format state) and returns a boolean
result. If true, the <pass> format is applied to the state,
otherwise <fail> (defaulting to the identity) is applied.
Many of the previously mentioned combinators have behavior which can
be altered with state variables. Although fmt-let and fmt-bind
could be used, these common variables have shortcuts:
(radix <k> <format> ...)
(fix <k> <format> ...)
These alter the radix and fixed point precision of numbers output with
dsp, wrt, pretty or num. These settings apply
recursively to all output data structures, so that
(fmt #f (radix 16 '(70 80 90)))
will return the string "(#x46 #x50 #x5a)". Note that read/write
invariance is essential, so for dsp, wrt and pretty the
radix prefix is always included when not decimal. Use num if you
want to format numbers in alternate bases without this prefix. For
example,
(fmt #f (radix 16 "(" (fmt-join num '(70 80 90) " ") ")"))
would return "(46 50 5a)", the same output as above without the
"#x" radix prefix.
(fmt #f (fix 30 #i2/3))
=> "0.666666666666666600000000000000"
(fmt #f (fix 30 2/3))
=> "0.666666666666666666666666666667"
(decimal-align <k> <format> ...)
Specifies an alignment for the decimal place when formatting numbers,
useful for outputting tables of numbers.
(define (print-angles x)
  (fmt-join num (list x (sin x) (cos x) (tan x)) " "))
(fmt #t (decimal-align 5 (fix 3 (fmt-join/suffix print-angles (iota 5) nl))))
0.000 0.000 1.000 0.000
1.000 0.842 0.540 1.557
2.000 0.909 -0.416 -2.185
3.000 0.141 -0.990 -0.142
4.000 -0.757 -0.654 1.158
(comma-char <k> <format> ...)
(decimal-char <k> <format> ...)
comma-char and decimal-char set the defaults for number
formatting.
(pad-char <k> <format> ...)
The pad-char sets the character used by space-to, tab-to,
pad/*, and fit/*, and defaults to #\space.
(define (print-table-of-contents alist)
  (define (print-line x)
    (cat (car x) (space-to 72) (pad/left 3 (cdr x))))
  (fmt #t (pad-char #\. (fmt-join/suffix print-line alist nl))))
(print-table-of-contents
 '(("An Unexpected Party" . 29)
   ("Roast Mutton" . 60)
   ("A Short Rest" . 87)
   ("Over Hill and Under Hill" . 100)
   ("Riddles in the Dark" . 115)))
An Unexpected Party.....................................................29
Roast Mutton............................................................60
A Short Rest............................................................87
Over Hill and Under Hill...............................................100
Riddles in the Dark....................................................115
(ellipses <ell> <format> ...)
Sets the truncation ellipse to <ell>, which should be a string or
character.
(with-width <width> <format> ...)
Sets the maximum column width used by some formatters. The default
is 78.
5.7 Columnar Formatting
Although tab-to, space-to and padding can be used to manually
align columns to produce table-like output, these can be awkward to
use. The optional extensions in this section make this easier.
(columnar <column> ...)
Formats each <column> side-by-side, i.e. as though each were
formatted separately and then the individual lines concatenated
together. The current column width is divided evenly among the
columns, and all but the last column are right-padded. For example
(fmt #t (columnar (dsp "abc\ndef\n") (dsp "123\n456\n")))
outputs
abc     123
def     456
assuming a 16-char width (the left side gets half the width, or 8
spaces, and is left aligned).
columnar treats raw strings as literals inserted into the given
location on every line, to be used as borders, for example:
(fmt #t (columnar "/* " (dsp "abc\ndef\n")
                  " | " (dsp "123\n456\n")
                  " */"))
/* abc | 123 */
/* def | 456 */
You may also prefix any column with any of the symbols 'left,
'right or 'center to control the justification. The symbol
'infinite can be used to indicate the column generates an infinite
stream of output.
columnar builds its output incrementally, interleaving
calls to the generators until each has produced a line, then
concatenating that line together and outputting it. This is important
because as noted above, some columns may produce an infinite stream of
output, and in general you may want to format data larger than can fit
into memory. Thus columnar would be suitable for line numbering a
file of arbitrary size, or implementing the Unix yes(1) command,
etc.
columnar uses first-class continuations to interleave the column
output. The core fmt itself has no knowledge of or special support
for columnar, which could complicate and potentially slow down
simpler fmt operations. This is a testament to the power of call/cc -
it can be used to implement coroutines or arbitrary control
structures even where they were not planned for.
(tabular <column> ...)
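Equivalent to columnar except that each column is padded at least
to the minimum width required on any of its lines. Thus
(fmt #t (tabular "|" (dsp "a\nbc\ndef\n") "|" (dsp "123\n45\n6\n") "|"))
outputs
|a  |123|
|bc |45 |
|def|6  |
This makes it easier to generate tables without knowing widths in
advance. However, because it requires generating the entire output in
advance to determine the correct column widths, tabular cannot
format a table larger than would fit in memory.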
(fmt-columns <column> ...)
The low-level formatter on which columnar is based. Each <column>
must be a list of 2-3 elements:
(<line-formatter> <line-generator> [<infinite?>])
where <line-generator> is the column generator as above, and the
<line-formatter> is how each line is formatted. Raw concatenation
of each line is performed, without any spacing or width adjustment.
<infinite?>, if true, indicates this generator produces an
infinite number of lines and termination should be determined without
it.
(wrap-lines <format> ...)
Behaves like cat, except text is accumulated and lines are optimally
wrapped to fit in the current width as in the Unix fmt(1) command.
(justify <format> ...)
Like wrap-lines except the lines are full-justified.
(define func
  '(define (fold kons knil ls)
     (let lp ((ls ls) (acc knil))
       (if (null? ls) acc (lp (cdr ls) (kons (car ls) acc))))))
(define doc
  (string-append
   "The fundamental list iterator. Applies KONS to each element "
   "of LS and the result of the previous application, beginning "
   "with KNIL. With KONS as CONS and KNIL as '(), equivalent to REVERSE."))
(fmt #t (columnar (pretty func) " ; " (justify doc)))
(define (fold kons knil ls) ; The fundamental list iterator.
(let lp ((ls ls) (acc knil)) ; Applies KONS to each element of
(if (null? ls) ; LS and the result of the previous
acc ; application, beginning with KNIL.
(lp (cdr ls) ; With KONS as CONS and KNIL as '(),
(kons (car ls) acc))))) ; equivalent to REVERSE.
(fmt-file <pathname>)
Simply displays the contents of the file <pathname> a line at a
time, so that in typical formatters such as columnar only constant
memory is consumed, making this suitable for formatting files of
arbitrary size.
(line-numbers [<start>])
A convenience utility, just formats an infinite stream of numbers (in
the current radix) beginning with <start>, which defaults to 1.
The Unix nl(1) utility could be implemented as:
(fmt #t (columnar 6 'right 'infinite (line-numbers)
                  " " (fmt-file "read-line.scm")))
     1
     2 (define (read-line . o)
     3   (let ((port (if (pair? o) (car o) (current-input-port))))
     4     (let lp ((res '()))
     5       (let ((c (read-char port)))
     6         (if (or (eof-object? c) (eqv? c #\newline))
     7             (list->string (reverse res))
     8             (lp (cons c res)))))))
6 C Formatting
6.1 C Formatting Basics
For purposes such as writing wrappers, code-generators, compilers or
other language tools, people often need to generate or emit C code.
Without a decent library framework it's difficult to maintain proper
indentation. In addition, for the Scheme programmer it's tedious to
work with all the context sensitivities of C, such as the expression
vs. statement distinction, special rules for writing preprocessor
macros, and when precedence rules require parentheses. Fortunately,
context is one thing this formatting library is good at keeping
track of. The C formatting interface tries to make it as easy as
possible to generate C code without getting in your way.
(fmt #t (c-if 1 2 3))
if (1) {
    2;
} else {
    3;
}
(fmt #t (c-if (c-if 1 2 3) 4 5))
if (1 ? 2 : 3) {
    4;
} else {
    5;
}
Similarly, c-begin, used for sequencing, will separate with
semi-colons in a statement and commas in an expression.
(fmt #t (c-fun 'int 'foo '() (c-if (c-if 1 2 3) 4 5)))
int foo () {
    if (1 ? 2 : 3) {
        return 4;
    } else {
        return 5;
    }
}
(fmt #t (c-switch 'y
          (c-case 1 (c+= 'x 1))
          (c-default (c+= 'x 2))))
switch (y) {
    case 1:
        x += 1;
        break;
    default:
        x += 2;
        break;
}
(fmt #t (c-switch 'y
          (c-case/fallthrough 1 (c+= 'x 1))
          (c-default (c+= 'x 2))))
switch (y) {
    case 1:
        x += 1;
    default:
        x += 2;
        break;
}
c++ is a prefix operator, c++/post is postfix. ||, | and
|= are written as c-or, c-bit-or and c-bit-or= respectively.
Function applications are written with c-apply. Other control
structures such as c-for and c-while work as expected. The full
list is in the procedure index below.
When a C formatter encounters an object it doesn't know how to write
(including lists and records), it outputs them according to the
format state's current 'gen variable. This allows you to specify
generators for your own types, e.g. if you are using your own AST
records in a compiler.
If the 'gen variable isn't set it defaults to the c-expr/sexp
procedure, which formats an s-expression as if it were C code. Thus
instead of c-apply you can just use a list. The full API is
available via normal s-expressions - formatters that aren't keywords
in C are prefixed with a % or otherwise made invalid C identifiers so
that they can't be confused with function application.
6.2 C Preprocessor Formatting
C preprocessor formatters also properly handle their surrounding
context, so you can safely intermix them in the normal flow of C
code.
(fmt #t (c-switch 'y
          (c-case 1 (c= 'x 1))
          (cpp-ifdef 'H_TWO (c-case 2 (c= 'x 4)))
          (c-default (c= 'x 5))))
switch (y) {
    case 1:
        x = 1;
        break;
#ifdef H_TWO
    case 2:
        x = 4;
        break;
#endif /* H_TWO */
    default:
        x = 5;
        break;
}
Macros can be handled with cpp-define, which knows to wrap
individual variable references in parentheses:
(fmt #t (cpp-define '(min x y) (c-if (c< 'x 'y) 'x 'y)))
#define min(x, y) (((x) < (y)) ? (x) : (y))
To write a C header file that is included at most once, you can wrap
the entire body in cpp-wrap-header:
(fmt #t (cpp-wrap-header "FOO_H"
          (c-extern (c-prototype 'int 'foo '()))))
#ifndef FOO_H
#define FOO_H
extern int foo ();
#endif /* ! FOO_H */
6.3 Customizing C Style
The output uses a simplified K&R style with 4 spaces for indentation
by default. The following state variables let you override the
style:
'indent-space
how many spaces to indent bodies, default 4
'switch-indent-space
how many spaces to indent switch clauses, also defaults to 4
'newline-before-brace?
insert a newline before an open brace (non-K&R), defaults to #f
'braceless-bodies?
omit braces when we can prove they aren't needed
'non-spaced-ops?
omit spaces between operators and operands for groups of variables and
literals (e.g. "a+b+3" instead of "a + b + 3")
'no-wrap?
don't wrap function calls and long operator groups over multiple
lines. Functions and control structures will still use multiple
lines.
The C formatters also respect the 'radix and 'precision settings.
6.4 C Formatter Index
(c-if <condition> <pass> [<fail> [<condition2> <pass2> ...]])
Print a chain of if/else conditions. Use a final condition of 'else
for a final else clause.
(c-for <init> <condition> <update> <body> ...)
(c-while <condition> <body> ...)
Basic loop constructs.
(c-fun <type> <name> <params> <body> ...)
(c-prototype <type> <name> <params>)
Output a function or function prototype. The parameters should be a
list of 2-element lists of the form (<param-type> <param-name>),
which are output with DSP. A parameter can be abbreviated as just the
symbol name, or #f can be passed as the type, in which case the
'default-type state variable is used. The parameters may be a
dotted list, in which case ellipses for a C variadic are inserted -
the actual name of the dotted value is ignored.
Types are typically just symbols, or lists of symbols such as
'(const char). A complete description is given below in section
6.6. These can also be accessed as %fun and %prototype at the head
of a list.
(c-var <type> <name> [<init-value>])
Declares and optionally initializes a variable. Also accessed as %var
at the head of a list.
(c-begin <expr> ...)
Outputs each of the <expr>s, separated by semi-colons if in a
statement or commas if in an expression.
(c-switch <clause> ...)
(c-case <values> <body> ...)
(c-case/fallthrough <values> <body> ...)
(c-default <body> ...)
Switch statements. In addition to using the clause formatters,
clauses inside a switch may be handled with a Scheme CASE-like list,
with the car a list of case values and the cdr the body.
(c-label <name>)
(c-goto <name>)
(c-return [<result>])
c-break
c-continue
Manual labels and jumps. Labels can also be accessed as a list
beginning with a colon, e.g. '(: label1).
(c-const <expr>)
(c-static <expr>)
(c-volatile <expr>)
(c-restrict <expr>)
(c-register <expr>)
(c-auto <expr>)
(c-inline <expr>)
(c-extern <expr>)
Declaration modifiers. May be nested.
(c-extern/C <body> ...)
Wraps body in an extern "C" { ... } for use with C++.
(c-cast <type> <expr>)
Casts an expression to a type. Also %cast at the head of a list.
(c-typedef <type> <new-name> ...)
Creates a new type definition with one or more names.
(c-struct [<name>] <field-list> [<attributes>])
(c-union [<name>] <field-list> [<attributes>])
(c-class [<name>] <field-list> [<attributes>])
(c-attribute <values> ...)
Composite type constructors. Attributes may be accessed as
%attribute at the head of a list.
(fmt #f (c-struct 'employee
                  '((short age)
                    ((char *) name)
                    ((struct (year month day)) dob))
                  (c-attribute 'packed)))
struct employee {
    short age;
    char* name;
    struct {
        int year;
        int month;
        int day;
    } dob;
} __attribute__ ((packed));
(c-enum [<name>] <enum-list>)
Enumerated types. <enum-list> may be strings, symbols, or lists of
string or symbol followed by the enum's value.
(c-comment <formatter> ...)
Outputs the <formatter>s wrapped in C's /* ... */ comment. Properly
escapes nested comments inside in an Emacs-friendly style.
6.5 C Preprocessor Formatter Index
(cpp-include <file>)
If file is a string, outputs it in "quotes", otherwise (as a symbol
or arbitrary formatter) it outputs it in brackets.
(fmt #f (cpp-include 'stdio.h))
=> "#include <stdio.h>\n"
(fmt #f (cpp-include "config.h"))
=> "#include \"config.h\"\n"
(cpp-define <macro> [<value>])
Defines a preprocessor macro, which may be just a name or a list of
name and parameters. Properly wraps the value in parentheses and
escapes newlines. A dotted parameter list will use the C99 variadic
macro syntax, and will also substitute any references to the dotted
name with __VA_ARGS__:
(fmt #t (cpp-define '(eprintf . args) '(fprintf stderr args)))
#define eprintf(...) (fprintf(stderr, __VA_ARGS__))
(cpp-if <condition> <pass> [<fail> ...])
(cpp-ifdef <condition> <pass> [<fail> ...])
(cpp-ifndef <condition> <pass> [<fail> ...])
(cpp-elif <condition> <pass> [<fail> ...])
(cpp-else <body> ...)
Conditional compilation.
(cpp-line <num> [<file>])
Line number information.
(cpp-pragma <args> ...)
(cpp-error <args> ...)
(cpp-warning <args> ...)
Additional preprocessor directives.
(cpp-stringify <expr>)
Stringifies <expr> by prefixing the # operator.
(cpp-sym-cat <args> ...)
Joins the <args> into a single preprocessor token with the ##
operator.
(cpp-wrap-header <name> <body> ...)
Wrap an entire header to only be included once.
Operators:
c++ c-- c+ c- c* c/ c% c& c^ c~ c! c&& c<< c>> c== c!=
c< c> c<= c>= c= c+= c-= c*= c/= c%= c&= c^= c<<= c>>=
c++/post c--/post c-or c-bit-or c-bit-or=
6.6 C Types
Typically a type is just a symbol such as 'char or 'int. You
can wrap types with modifiers such as c-const, but as a
convenience you can just use a list such as in '(const unsigned char *).
You can also nest these lists, so the previous example is
equivalent to '(* (const (unsigned char))).
Pointers may be written as '(%pointer <type>) for readability -
%pointer is exactly equivalent to * in types.
Unnamed structs, classes, unions and enums may be used directly as
types, using their respective keywords at the head of a list.
Two special types are the %array type and function pointer type. An
array is written:
(%array <type> [<size>])
where <type> is any other type (including another array or
function pointer), and <size>, if given, will print the array
size. For example:
(c-var '(%array (unsigned long) SIZE) 'table '#(1 2 3 4))
unsigned long table[SIZE] = {1, 2, 3, 4};
A function pointer is written:
(%fun <return-type> (<param-types> ...))
For example:
(c-typedef '(%fun double (double double int)) 'f)
typedef double (*f)(double, double, int);
Wherever a type is expected but not given, the value of the
'default-type formatting state variable is used. By default this
is just 'int.
6.7 C as S-Expressions
(fmt #t (c-if (c-if 1 2 3) 4 5))
(fmt #t (c-expr '(if (if 1 2 3) 4 5)))
C constructs that are dependent on the underlying syntax and have no
keyword are written with a % prefix (%fun, %var, %pointer,
%array, %cast), including C preprocessor constructs (%include,
%define, %pragma, %error, %warning, %if, %ifdef, %ifndef,
%elif). Labels are written as (: <label-name>). You can write a
sequence as (%begin <expr> ...).
For example, the following definition of the Fibonacci sequence, which
apart from the return type of #f looks like a Lisp definition:
(fmt #t (c-expr '(%fun #f fib (n)
                   (if (<= n 1)
                       1
                       (+ (fib (- n 1)) (fib (- n 2)))))))
prints the working C definition:
int fib (int n) {
    if (n <= 1) {
        return 1;
    } else {
        return fib((n - 1)) + fib((n - 2));
    }
}
7 JavaScript Formatting
The experimental fmt-js library extends the fmt-c library with
functionality for formatting JavaScript code.
(js-expr x)
Formats a JavaScript expression similarly to c-expr. Inside a
js-expr formatter, you can use the normal c- prefixed
formatters described in the previous section, and they will format
appropriately for JavaScript.
(js-function [<name>] (<params>) <body> ...)
Defines a function (anonymously if no name is provided).
(js-var <name> [<init-value>])
Declares a JavaScript variable, optionally with an initial value.
(js-comment <formatter> ...)
Formats a comment prefixing lines with "// ".
8 Formatting with Color
The fmt-color library provides the following utilities:
(fmt-red <formatter> ...)
(fmt-blue <formatter> ...)
(fmt-green <formatter> ...)
(fmt-cyan <formatter> ...)
(fmt-yellow <formatter> ...)
(fmt-magenta <formatter> ...)
(fmt-white <formatter> ...)
(fmt-black <formatter> ...)
(fmt-bold <formatter> ...)
(fmt-underline <formatter> ...)
(fmt-color <color> <formatter> ...)
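where color can be a symbol name or #xRRGGBB numeric value.
Outputs the formatters colored with ANSI escapes. In addition
(fmt-in-html <formatter> ...)
can be used to mark the format state as being inside HTML, which the
above color formats will understand and output HTML <span> tags with
the appropriate style colors, instead of ANSI escapes.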
9 Unicode
The fmt-unicode library provides the fmt-unicode formatter, which
just takes a list of formatters and overrides the string-length for
padding and trimming, such that Unicode double or full width
characters are considered 2 characters wide (as they typically are in
fixed-width terminals), while treating combining and non-spacing
characters as 0 characters wide.
format | fmt |
~a | dsp |
~c | dsp |
~s | wrt/unshared |
~w | wrt |
~y | pretty |
~x | (radix 16 ...) or (num <n> 16) |
~o | (radix 8 ...) or (num <n> 8) |
~b | (radix 2 ...) or (num <n> 2) |
~f | (fix <digits> ...) or (num <n> <radix> <digits>) |
~% | nl |
~& | fl |
~[...~] | normal if or fmt-if (delayed test) |
~{...~} | (fmt-join ... <list> [<sep>]) |
12 References
[1] R. Kelsey, W. Clinger, J. Rees (eds.)
Revised^5 Report on the Algorithmic Language Scheme
[2] Guy L. Steele Jr. (editor) Common Lisp Hyperspec
[3] Scott G. Miller SRFI-28 Basic Format Strings
[4] Ken Dickey SRFI-48 Intermediate Format Strings
[5] Ray Dillinger SRFI-38 External Representation for Data With Shared Structure
[6] Damian Conway Perl6 Exegesis 7 - formatting