Preface

A utility without a manual is of no utility at all.

This is a guide for writing UNIX manuals in the mdoc language. If you're new to writing UNIX manuals, or you want to learn about best practices for high-quality manuals, this book may benefit your work.

To those unfamiliar with UNIX, mdoc is a language for documenting utilities, programming functions, file and wire formats, hardware device interfaces, and so on. By a language I mean a structured, machine-readable document format such as HTML, the primary language of web pages; or RTF, used by word processors. man is the utility for querying documents in mdoc and other languages, collectively called man pages. The following, for example, is a fragment of man output for the cat command.

NAME
     cat - concatenate and print files

SYNOPSIS
     cat [ -benstuv ] [ file ... ]

DESCRIPTION
     The cat utility reads files sequentially, writing them to the standard output. The file operands are processed in command-line order. If file is a single dash (‘-’) or absent, cat reads from the standard input.

Why mdoc? After all, there are plenty of other UNIX manual languages out there, from the historical man to DocBook. In short, mdoc is:

portable, as any modern UNIX system can format it without needing clumsy toolchains;

expressive, capturing the semantic content of manpages instead of just presentation cues;

concise, making line-based source control painless; and

well-documented, well-supported, and actively maintained by a community of knowledgeable developers.

No other format can boast all of these points at once.

In fact, although I've mentioned UNIX several times already, mdoc isn't exclusively tied to UNIX. Although UNIX and mdoc are historically linked, open source mdoc tools exist for any operating system. Furthermore, the documentation capabilities of mdoc apply to computing systems in general, not just UNIX.

In this book, however, I'll assume you are casually familiar with man and its output. This will allow us to focus on manuals with the same formatted output in mind. Thus, if you're unfamiliar with the man utility, this is a good time to read an introductory text on the subject (such as a UNIX beginner's guide), or at the very least, read the output of man man (the manual page of the man command).

This is not a canonical reference! The mdoc language is not standardised. For official reference, consult the manual distributed with your target computer system with man mdoc. This work primarily addresses the elements of mdoc common to any UNIX deployment, noting common pitfalls in portability.

Last edited by $Author: kristaps $ on $Date: 2011/11/04 01:06:28 $. Copyright © 2011, Kristaps Dzonsons. CC BY-SA.

Tutorial Introduction

Let's begin with practical examples of mdoc. The intended audience of this part is somebody who has never written an mdoc manual. Although you may be tempted to jump to the chapter relevant to your manual type (for example, a command or function library), it's best to read the chapters in order. I'll explain mdoc syntax as we go.

If you've already written a few manuals, you may want to read this part anyway: beyond explaining technical mdoc language concepts by example, I'll also introduce some best practices and discuss portability between various mdoc environments.

I'll frequently refer to the screen output of mdoc documents as displayed with the UNIX man utility. Furthermore, I'll refer to command invocation in the traditional UNIX way: on the command line. In short, a bit of UNIX knowledge will help to avoid confusion, but I'll briefly introduce invocation syntaxes as the need arises.

Commands

Commands are the way in which a user operates her computer. Already I've noted the man command: if you've interacted with a UNIX system, you've probably run at least man intro or man man to learn about your system.

In this chapter, I'll discuss how to document these commands with mdoc. This may be unfamiliar if you're accustomed to graphical interfaces, as all of our examples will refer to command-line, text-based commands. If your target environment isn't a UNIX system, it's a good idea to read these examples anyway, as they will expose the rudimentary structure of the mdoc language. As mentioned before, reading an introductory text on UNIX will help avoid confusion.

Let's begin by making a mental checklist of the criteria that make a good manual for a command. This checklist arises by inverting what a manual reader expects in opening a manual: what does the command do and how do I operate it?

Do I describe the calling syntax of the command?

Do I describe each flag and argument of the command?

Do I describe the command's operation?

Do I describe the command's exit status?

Do I describe referenced files and environment variables?

Above all, the best litmus test is whether a colleague or friend can read your manual and use your command without any assistance on your part. Don't be discouraged if this takes several tries to get right!

I'll begin with a simple command, hi, which prints hello, world to the screen. I'll then add some command-line arguments to this command. By the time you finish this chapter, you should have a grasp of mdoc syntax and some of its more widely-used macros.

In this text, I'll refer to the invocation of commands as cmd [-flag farg] arg. Here, cmd refers to the command invocation name, flag is a flag (or switch) to that command, farg is an argument to a flag (not all flags have arguments), and arg is an argument to the command. The dash in front of flag indicates a flag, while the square brackets around flag farg indicate an optional part of the invocation. Since arg is not bracketed, it is a mandatory part of the invocation. This convention is formalised by the POSIX.1-2008 standard (Base Definitions, sec. 12.1), so you can expect to see it often in the UNIX world.

Last edited by $Author: kristaps $ on $Date: 2011/11/04 22:24:07 $.

Simple Command

Consider a simple UNIX command hi that prints hello, world and exits. Let's create a manual page hi.1 documenting this command. In this example, I'll begin with the full manual. In later examples, we'll build up the manual piece by piece.

.Dd May 30, 2011

.Dt HI 1

.Os

.Sh NAME

.Nm hi

.Nd print \(dqhello, world\(dq

.Sh SYNOPSIS

.Nm

.Sh DESCRIPTION

Print

.Qq hello, world

and exit.

How to display this manual page depends on the system you're using. Traditionally, the command for formatting UNIX manuals for a terminal is nroff. For now, let's stick with that. To display output, you must invoke nroff as nroff -mandoc file. The -mandoc flag indicates that input is in mdoc. Hereafter, I'll refer to nroff simply as the formatter to avoid confusion, as there are many available mdoc formatters.

NAME
     hi - print "hello, world"

SYNOPSIS
     hi

DESCRIPTION
     Print “hello, world” and exit.

Let's start by studying the input and output. We can see most of the text translated into output: for instance, the capitalised NAME input is left-justified and in bold text. The same goes for SYNOPSIS and DESCRIPTION, although the .Sh text before these terms is missing. We can even see the output sentence Print "hello, world" and exit spread over lines 10–12:

Print

.Qq hello, world

and exit.

Let's take a closer look at this fragment. The .Qq is part of mdoc's instruction syntax. Input lines beginning with a dot are instructions to the formatter called mdoc macros, or just macros for short. The macro name is a terse two- or three-character word following the dot, for example, Qq. The name of a macro hints at its function. The words following the Qq to the end of the line are arguments in the scope of the macro.

Scope, a technical term in the field of programming languages, refers to the body of input within the context of an instruction or variable. In mdoc, a macro's scope is the block of text and instructions in the formatting context of that macro. Looking at the input and output, we can infer the scope of Qq by seeing what's surrounded by quotes (the formatting, in this case).

.Qq hello, world

Print “hello, world” and exit.

As we explore more and more macros in this book, we'll see that each macro follows one of a handful of scope rules. It's already clear that Qq is limited in scope to its invocation line. But notice that the formatter recognised the content between Sh macros as requiring indentation. So it's clear that mdoc also has a concept of multi-line scope. In fact, Sh has both line arguments, for the name of the section, and multi-line arguments, for section content.

.Sh SECTION 1

Section text.

.Sh SECTION 2

New section text.

Furthermore, the existence of Qq within the Sh scope means that scopes may be nested. In the next section we'll see how multiple macros may even be specified on a single line.

.Sh SECTION 1

Section text.

.Sh SECTION 2

.Qq Section text nested in a quote.

We can visualise this scoping as follows, with an outer scope and inner scope:

.Sh SECTION 2

.Qq Section text nested in a quote.

Now let's return to hi.1 with this new knowledge of macros and scopes. We see seven macros in total: Dd, Dt, Os, Sh, Nm, Nd, and Qq. We know now that Qq encloses its arguments in double-quotes and that Sh begins a named section with indented multi-line arguments.

Of the remaining macros, Dd accepts the last modification date of the manual in month day, year format. Dt refers to the manual's title, HI, and its category, 1. Numbered manual categories are UNIX conventions, but applicable to any operating system. We'll explore more standard categories throughout this book. Note that HI is uppercase: by convention, Dt should always accept a capitalised document title. We'll talk more about titles and sections in later chapters of this book. For now, let's assume that a category number identifies the topic of the manual, where 1 refers to utilities. Next, Os indicates the operating system of the system running the formatter. If left unspecified, the formatter will fill in the current operating system (e.g., OpenBSD 4.9, Linux 2.6.32-5, or Microsoft Windows XP).

.Dd May 30, 2011

.Dt HI 1

.Os \" Current operating system.

Note that text following the \" marker is an mdoc comment, which has the following syntax:

Text. \" Comment to end of the line.

.\" Extending across the full line.

Comments are line-scoped, like Qq:

.\" .Sh NAME

Moving along, Nm accepts the manual's name. This differs from the title, Dt, in that a single manual may document multiple components. We'll see examples of this in later chapters. Finally, Nd accepts a brief, one-line description of the command.

.Sh NAME

.Nm hi

.Nd print \(dqhello, world\(dq

You can see that we re-invoke Nm in the SYNOPSIS, only without arguments. The formatter is smart enough to fill in its argument with the last supplied argument, in this case being hi. Since our simple command has no command-line arguments, its invocation is simply the command name.

.Sh SYNOPSIS

.Nm

Piecing this all together, we now have the following.

.Dd May 30, 2011

.Dt HI 1

.Os

.Sh NAME

.Nm hi

.Nd print \(dqhello, world\(dq

.Sh SYNOPSIS

.Nm

.Sh DESCRIPTION

Print

.Qq hello, world

and exit.

In this example, you'll have noticed that \(dqhello, world\(dq has the same behaviour as the Qq invocation. In mdoc, quotation marks signify literal strings. Thus, we used the escape sequence \(dq to render ". You may ask why not just use Qq, such as

.Nd print

.Qq hello, world

For the time being, assume that Nd must have its scope on the invocation line. Strictly speaking, we could have written .Nd print "hello, world" but this encourages the dangerous assumption that quoting arguments does not affect output. This isn't always the case! We'll see later how quoted terms on macro lines change the grouping of arguments, at times non-intuitively.

Before moving on to the next section, let's look quickly over our checklist for a well-formed manual.

Did I describe the calling syntax of the command? Yes. It was only the name of the command (no arguments or flags).
Did I describe each flag and argument of the command? There were none, so yes.
Did I describe the command's operation? Yes, it prints hello, world and exits.
Did I describe the command's exit status? No, we only mentioned that it exits.
Did I describe referenced files and environment variables? This is not applicable.

To address the exit status, let's modify the DESCRIPTION slightly for clarity.

.Sh DESCRIPTION

Print

.Qq hello, world

and exit 0.

Of course, our command must actually do so! For simplicity's sake, let's assume that this is the case. With our simple, well-documented example in mind, let's move on to a more realistic UNIX command.
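To recap this chapter, the following is a sketch of hi.1 as it stands after the exit-status change, with each macro annotated by a comment. The comments are my own gloss on the explanations above and are not part of the original example.

```roff
.\" hi.1: the simple manual from this chapter, with explanatory comments.
.\" Dd: date of last modification, in "month day, year" format.
.Dd May 30, 2011
.\" Dt: manual title (uppercase by convention) and category (1: utilities).
.Dt HI 1
.\" Os: operating system; left empty, the formatter fills in the current one.
.Os
.\" Sh: begin a named section; its content extends to the next Sh.
.Sh NAME
.\" Nm: the name being documented.
.Nm hi
.\" Nd: one-line description; \(dq renders a double quote.
.Nd print \(dqhello, world\(dq
.Sh SYNOPSIS
.\" Nm without arguments repeats the last name given, here "hi".
.Nm
.Sh DESCRIPTION
Print
.\" Qq: enclose the rest of the line in double quotes.
.Qq hello, world
and exit 0.
```

Saved as hi.1, this can be displayed with nroff -mandoc hi.1, as described earlier in the chapter. The comment lines are ignored by the formatter, so the output should match the unannotated manual.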

Elaborate Command

Most UNIX commands have flags, arguments, return values, environment variables, and so on. So let's expand upon our example to include arguments for writing to an output file and a flag for outputting in uppercase letters. Furthermore, we'll accept an optional prefix string on the command-line, and return non-zero on failure.

This changes two parts of our manual: the SYNOPSIS section, where we'll record the invocation syntax of our command; and the DESCRIPTION, where we'll describe the command-line options. We'll also add a new section, EXIT STATUS, to describe the non-zero exit on failure. Let's start by documenting our command-line options in the SYNOPSIS section:

.Sh SYNOPSIS

.Nm

.Op Fl C

.Op Fl o Ar output

.Op Ar prefix

The output renders as follows:

SYNOPSIS
     hello [ -C ] [ -o output ] [ prefix ]

Already, we begin to see the output take shape with the C and o characters, and the prefix. It's also clear that the Op macro surrounds its arguments in square brackets, just as Qq surrounded its line in double-quotes. But how did the formatter know to prefix the C and o with a dash, or underline the arguments output and prefix? It's obvious this has something to do with Fl and Ar.

Macro lines may in fact consist of multiple macros, sometimes nesting further macros, sometimes closing prior scopes to begin one anew. The Fl and Ar words are macros nested within the scope of Op. However, while Op contains both of these child scopes, the Ar macro closes out the Fl scope and begins its own.

.Op Fl C
.Op Fl o Ar output
.Op Ar prefix

On each line, Op opens the outer scope, while Fl and Ar open inner scopes within it. Now it's easy to see how Fl prefixes only the C with a dash and not the arguments following: its scope is closed out by Ar. Note that to document a flag Ar, we would need to quote its arguments as Fl "Ar" (we'll later learn how to escape arguments with zero-width spaces to accomplish the same). As there are many mdoc macros, a popular novice mistake is to unknowingly invoke a macro when expecting to print text.

With our command syntax documented, let's document the arguments themselves. To do so, we detail the meaning of flags and arguments in the DESCRIPTION section.

The

.Nm

function prints

.Qq hello, world

and returns.

.Pp

Its arguments are as follows:

.Bl -tag -width Ds

.It Fl C

Print only uppercase letters.

.It Fl o Ar output

Write to file

.Ar output .

.It Ar prefix

Prefix the output with

.Ar prefix .

.El

Immediately, we see the introduction of several new macros: Pp, Bl, It, and El. More interestingly, we notice the text on the Bl line begins with a dash, just as when passing arguments on a command line. This is the first instance of a macro that accepts flags. The rendered output of this fragment is as follows.

Its arguments are as follows:

-C      Print only uppercase letters.

-o output
        Write to file output.

prefix  Prefix the output with prefix.

It should be clear that the Pp macro, which always stands alone, introduces a vertical paragraph break. Earlier, I introduced the concept of a multi-line scope for Sh, which was closed and re-opened by subsequent invocations of Sh. In this fragment, the Bl macro (for begin list) is explicitly closed out by the El macro (end list). This is an example of explicit scope closure, versus the implicit scope closure of Sh sequences.

Predictably, the Bl and El enclosure consists of list items, begun by the multi-line It macro lines. Like Sh, the It macro has its scope closed by subsequent invocations of It. As expected, its scope also closes when the surrounding list is closed with El.

Until now, we've discussed only macros and macro arguments. But a handful of macros, Bl included, also accept flags which themselves may have arguments. In our example, the tag flag to Bl stipulates a tagged list. A tagged list entry consists of two parts: a tag and data, similar to the <DL> description lists in HTML consisting of a key and data. Bl accepts a second flag, width, which accepts the argument Ds. This instructs the formatter that the tag portion of the list has width Ds, which is shorthand for default spacing.

Next, let's look closer at the input line .Ar prefix . Note that it's correctly rendered with the period flushed up against the text, whereas the period is space-separated in the input. (The period itself isn't font-decorated, although this is difficult to see in the medium you're reading.)

prefix  Prefix the output with prefix.

By making the punctuation a separate argument, we distinguish it from the term prefix, and thus it is not underlined. The formatter is smart enough to distinguish standalone punctuation. When writing an mdoc manual, punctuation should always be separated from macro arguments unless it's part of the argument itself. This allows the formatter to correctly intuit end-of-line spacing. If we hadn't done so, the formatter wouldn't distinguish period from word. This is more intuitive when re-using the familiar Qq.

.Qq first .
.Qq second.

We can now see the difference in the placement of punctuation:

“first”. “second.”

Let's piece this all together. You'll recognise the Dd, Dt, and Os macros from the last section, although the Dt argument has changed with our command name.

.Dd May 30, 2011

.Dt HELLO 1

.Os

.Sh NAME

.Nm hello

.Nd print \(dqhello, world\(dq

.Sh SYNOPSIS

.Nm

.Op Fl C

.Op Fl o Ar output

.Op Ar prefix

.Sh DESCRIPTION

The

.Nm

function prints

.Qq hello, world

and returns.

.Pp

Its arguments are as follows:

.Bl -tag -width Ds

.It Fl C

Print only uppercase letters.

.It Fl o Ar output

Write to file

.Ar output .

.It Ar prefix

Prefix the output with

.Ar prefix .

.El

Notice that we don't repeat the Op macros in the DESCRIPTION, although we stipulate them in the SYNOPSIS. This is because we document the flags and arguments themselves in the DESCRIPTION, not the calling syntax of the command.

Finally, let's accommodate command errors by stipulating the exit status of the command. To do this, we add a new section to the end of the manual, EXIT STATUS, consisting of a single macro. We didn't add this to hi.1 because we didn't stipulate any exit state; however, it's good practice to always include this section, even if your command only exits in one way.

.Sh EXIT STATUS

.Ex -std

The Ex macro is special in that it always accepts a flag, std. This is by convention. Although you can specify an argument to Ex, without one it works like Nm without arguments: it reproduces the name of the document as last invoked with Nm. It prints a standardised message about the exit status of the command.

EXIT STATUS
     The hello utility exits 0 on success, and >0 if an error occurs.

With our manual complete, let's go over our checklist.

Did I describe the calling syntax of the command? Yes, including flags and arguments.
Did I describe each flag and argument of the command? Yes, for all flags and arguments.
Did I describe the command's operation? Yes, that it prints hello, world.
Did I describe the command's exit status? Yes, that it returns a non-zero exit code on failure.
Did I describe referenced files and environment variables? This is not applicable to this manual.

Of course, most real manuals have many other useful bits of information, such as author names, referenced standards, files, and so on. I'll describe these in detail in later chapters of this book.

Last edited by $Author: kristaps $ on $Date: 2011/11/04 22:57:49 $.
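The chapter shows the EXIT STATUS section separately from the rest of hello.1. As a sketch, assembling the pieces in order gives the following complete manual. Placing EXIT STATUS after DESCRIPTION follows the instruction above to add it at the end of the manual; the leading comment line is my own addition.

```roff
.\" hello.1: the elaborate command manual, assembled from this chapter.
.Dd May 30, 2011
.Dt HELLO 1
.Os
.Sh NAME
.Nm hello
.Nd print \(dqhello, world\(dq
.Sh SYNOPSIS
.Nm
.Op Fl C
.Op Fl o Ar output
.Op Ar prefix
.Sh DESCRIPTION
The
.Nm
function prints
.Qq hello, world
and returns.
.Pp
Its arguments are as follows:
.Bl -tag -width Ds
.It Fl C
Print only uppercase letters.
.It Fl o Ar output
Write to file
.Ar output .
.It Ar prefix
Prefix the output with
.Ar prefix .
.El
.Sh EXIT STATUS
.Ex -std
```

Because Ex is given no argument here, the exit-status sentence should pick up the name hello from the earlier Nm invocation, as described above.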

Case Study

I now introduce a case study of a real-world manual, in particular the echo utility from OpenBSD. The original file may be viewed on-line at src/bin/echo/echo.1, file version 1.20. I chose it mainly because of its simplicity.

.\" $OpenBSD: echo.1,v 1.20 2010/09/03 09:53:20 jmc Exp $

.\" $NetBSD: echo.1,v 1.7 1995/03/21 09:04:26 cgd Exp $

These initial comments are automatically created by the source-control system cvs, which fills in information about the last editor. I'll talk about revision control and those funny dollar-sign enclosures in Part 3. These particular comments indicate that the file was initially imported from NetBSD in 1995, where it was last edited by cgd (a system name, not the user's real name). It was last edited in OpenBSD, its current form, by jmc in 2010. If you're keeping your manual under source control, it's usually a good idea to begin your file with a similar line.

.\" $Id$

A tab character separates the comment marker from the text. Again, this will be covered later in this book; don't worry if it looks strange.

.\" Copyright (c) 1990, 1993

.\" The Regents of the University of California. All rights reserved.

.\"

.\" This code is derived from software contributed to Berkeley by

.\" the Institute of Electrical and Electronics Engineers, Inc.

.\"

.\" Redistribution and use in source and binary forms, with or without

.\" modification, are permitted provided that the following conditions

.\" are met:

.\" 1. Redistributions of source code must retain the above copyright

.\" notice, this list of conditions and the following disclaimer.

.\" 2. Redistributions in binary form must reproduce the above copyright

.\" notice, this list of conditions and the following disclaimer in the

.\" documentation and/or other materials provided with the distribution.

.\" 3. Neither the name of the University nor the names of its contributors

.\" may be used to endorse or promote products derived from this software

.\" without specific prior written permission.

.\"

.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND

.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE

.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE

.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE

.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL

.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS

.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)

.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT

.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY

.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF

.\" SUCH DAMAGE.

This long comment is the license and copyright of the source file. Of course, our use of this source file is compatible with the license, as may be read from the text itself!

.\" @(#)echo.1 8.1 (Berkeley) 7/22/93

This comment is of historical note. The @(#) sequence was inserted by the sccs utility (Source Code Control System). Although this utility is part of POSIX.1-2008, it has mostly been replaced by cvs. You'll probably never encounter this string in your own manuals, so it's safe to disregard.

At this point the manual content itself begins.

.Dd $Mdocdate: September 3 2010 $

.Dt ECHO 1

.Os

This indicates that the manual's title is ECHO in category 1 (utilities) for the current installed operating system. The $Mdocdate$ enclosure is similar to the $OpenBSD$ enclosure at the top of the file.

.Sh NAME

.Nm echo

.Nd write arguments to the standard output

This documents a single command, the echo command, which does as mentioned.

.Sh SYNOPSIS

.Nm echo

.Op Fl n

.Op Ar string ...

The command accepts a single optional flag, n, and an arbitrary number of optional arguments string. Note that re-stating the command name for the Nm is superfluous in this case.

.Sh DESCRIPTION

The

.Nm

utility writes any specified operands, separated by single blank

.Pq Sq \ \&

characters and followed by a newline

.Pq Sq \en

character, to the standard

output.

When no operands are given, only the newline is written.

The DESCRIPTION opens with a brief explanation of the utility and its output. The strange set of backslash-escaped characters \ \& is required to make the doubly-nested macros Pq and Sq (parenthesise and single-quote, respectively) correctly enclose a single space.

.Pp

The options are as follows:

.Bl -tag -width Ds

.It Fl n

Do not print the trailing newline character.

.El

This follows the standard form of documenting flags and arguments as a term/definition list. Each one (in this case only one) is documented in alphabetical order.

.Sh EXIT STATUS

.Ex -std echo

This notes the standard exit status message. Note that the argument to Ex is superfluous, as only one command is listed for the manual.

.Sh SEE ALSO

.Xr csh 1 ,

.Xr ksh 1 ,

.Xr printf 1

Although these weren't cited in other sections of the manual, the author felt it necessary to link to them. This is probably because both csh and ksh include internal implementations of a function by the same name.

.Sh STANDARDS

The

.Nm

utility is compliant with the

.St -p1003.1-2008

specification.

.Pp

The flag

.Op Fl n

is an extension to that specification.

.Pp

.Nm

also exists as a built-in to

.Xr csh 1

and

.Xr ksh 1 ,

though with a different syntax.

This last section fully describes the utility's conformance to the POSIX standard, which is very important to those writing portable utilities. The St macro expands into the relevant standard's full name, IEEE Std 1003.1-2008 (“POSIX.1”). For a full list of standards, consult your local documentation for the macro.

Last edited by $Author: kristaps $ on $Date: 2011/11/05 00:39:38 $.
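The case study points out two superfluous arguments: the name repeated on Nm in the SYNOPSIS, and the name passed to Ex. As a sketch of the leaner form (my own variation, not the OpenBSD source), both can be dropped, relying on the formatter to fill in the name last given to Nm in the NAME section.

```roff
.Sh NAME
.Nm echo
.Nd write arguments to the standard output
.Sh SYNOPSIS
.\" Nm without arguments repeats "echo" from the NAME section.
.Nm
.Op Fl n
.Op Ar string ...
.\" (DESCRIPTION omitted here for brevity.)
.Sh EXIT STATUS
.\" Ex -std without arguments also uses the name last given to Nm.
.Ex -std
```

This is only a fragment: the Dd, Dt and Os prologue and the remaining sections would stay as in the original file.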

Functions

Programming functions are a significant part of the UNIX canon, from the system call interface to the C library. If you're a C or C++ developer, chances are you've at least glanced through the manuals of functions such as socket, printf, or memmove.

In general, the mdoc macros used for documenting programming functions are the same as those used for Commands; however, there are some domain-specific bits to annotate the various parts of function versus command invocation. You'll see that each command invocation macro, such as Fl for a flag, has an analogue for programming functions, such as Fa for a function argument.

The mdoc format is used primarily for the C language and Fortran, but it works with C++, Perl, Tcl, and other imperative languages. In fact, almost any language with functions (or subroutines) and variables will work, typed or not. In this book, I focus exclusively on the C language. This is due to the overwhelming presence of C libraries and functions documented with mdoc.

Before beginning, we need to change our mental checklist for a complete manual. Since function manuals can document more than just function parts, our manual must grow to account for complexity.

Do I describe the preprocessing and linking information?

Do I describe the calling syntax of each function and variable?

Do I describe the type of each function and variable?

Do I describe each argument in each calling syntax?

Do I describe each function's operation?

Do I describe each function's side effects?

A function is any callable instruction, be it a C function, routine, or macro. A variable may also be a C variable or macro. I'll consistently use the function and variable terminology in this book.

In general, you don't have to be knowledgeable about C to understand this section, but it helps to have a grasp of basic programming structure, such as functions, function prototypes, and header files. In any event, I'll refer to a header file as a text file consisting of function prototypes. Header files for the C language, such as in our examples, end with the .h suffix. A C function prototype indicates the calling syntax of a function, such as the following.

int

isspace(int c);

In this, the C function isspace, notationally referred to as isspace, has a single argument int c (an integer named c) and returns a value of type int (another integer). Multiple arguments are comma-separated.

Last edited by $Author: kristaps $ on $Date: 2011/11/05 16:50:11 $.

Simple Function

Let's study a simple C function, hi, which prints hello, world just like in prior sections. We begin with the familiar first macros.

.Dd May 30, 2011

.Dt HI 3

.Os

.Sh NAME

.Nm hi

.Nd print \(dqhello, world\(dq

All that's changed is the manual category, from 1 to 3. We'll discuss manual categories later in this book. Suffice it to say that programming functions and libraries (not system calls!) are in category 3.

The calling syntax of our function is documented in the SYNOPSIS section. Assume that our function prototype is within the header file hi.h as void hi(void), which, in non-programming terms, declares that a function hi accepts no arguments and does not return a value.

.Sh SYNOPSIS

.In hi.h

.Ft void

.Fn hi

This introduces three unfamiliar macros. The In macro specifies an include file that interfacing programs will need to include. The Ft and Fn macros collectively document the function (return) type and function name. Since not all languages have return types, the Ft macro is optional in this context.

SYNOPSIS
     #include <hi.h>

     void
     hi();

By now it comes as no surprise that Ft is scoped to the end of its line, as is Fn in this example. In fact, both of these macros are syntactically similar to the Ar and Fl found in the first chapter: their scopes are closed by subsequent macros on the same line.

Since our function has no arguments or return values, all we need to do is add some bits in the DESCRIPTION section to complete this manual.

.Dd May 30, 2011

.Dt HI 3

.Os

.Sh NAME

.Nm hi

.Nd print \(dqhello, world\(dq

.Sh SYNOPSIS

.In hi.h

.Ft void

.Fn hi

.Sh DESCRIPTION

The

.Fn hi

function prints

.Qq hello, world

and returns.

.Pp

It has no arguments.

Here, you'll notice a difference between a function and command manual: we refer back to our function using Fn instead of Nm. Why is this? The Nm macro, when used in the body of a manual, refers to the command name, not the manual name, just as we used Nm to annotate the utility name in the SYNOPSIS. For functions, we use the Fn macro. The difference is that Fn won't repeat the manual name if used without arguments. This complexity is simply the result of poor planning in the mdoc language.

Let's visualise the output so far:

NAME
     hi - print "hello, world"

SYNOPSIS
     #include <hi.h>

     void
     hi();

DESCRIPTION
     The hi() function prints “hello, world” and returns.

     It has no arguments.

Lastly, let's stipulate the function return value in its own section, RETURN VALUES. This mirrors the EXIT STATUS introduced for hello.1. Although we don't have a return value, it's good practice to include this section anyway.

.Sh RETURN VALUES

The

.Fn hi

function does not return a value.

Although this example is instructive, it's quite simple. Let's review our checklist before moving on.

Did I describe the preprocessing and linking information? Yes, a header file. There is no linking information.
Did I describe the calling syntax of each function and variable? Yes, the hi function.
Did I describe the type of each function and variable? Yes: hi has the type void and returns no value.
Did I describe each argument in each calling syntax? This does not apply, as it has none.
Did I describe each function's operation? Yes, in that it prints hello, world.
Did I describe each function's side effects? This does not apply, as it has none.

Very few real-world functions are so simple. In the next section, we introduce a more practical function with types and arguments.

Last edited by $Author: kristaps $ on $Date: 2014/04/07 21:27:38 $.
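The RETURN VALUES section is shown above on its own. As a sketch, appending it to the manual built earlier in this chapter gives the following complete hi.3. The file name hi.3 and the leading comment line are my own assumptions, chosen by analogy with hi.1.

```roff
.\" hi.3: the simple function manual, assembled from this chapter.
.Dd May 30, 2011
.Dt HI 3
.Os
.Sh NAME
.Nm hi
.Nd print \(dqhello, world\(dq
.Sh SYNOPSIS
.In hi.h
.Ft void
.Fn hi
.Sh DESCRIPTION
The
.Fn hi
function prints
.Qq hello, world
and returns.
.Pp
It has no arguments.
.Sh RETURN VALUES
The
.Fn hi
function does not return a value.
```

Note that Fn is always given the function name explicitly, since, as described above, it does not repeat the manual name when used without arguments.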

Elaborate Function

Let's also study a function form of the elaborate command example. Again, I'll use C as my example. Since this is a bit more involved, you may feel a little lost if you're not familiar with C programming. I'll keep the technical jargon to a minimum. Let's re-write hi as hello, accepting a Boolean (zero or non-zero) integer indicating whether to capitalise, and an optional character string (a word) prefix. Let's also stipulate an integer return value.

.Sh SYNOPSIS

.In hello.h

.Ft int

.Fo hello

.Fa "int C"

.Fa "const char *prefix"

.Fc

If you're not familiar with C, the const char * and int parts are part of the C language. Note that the C and prefix names haven't changed. The include file (In) and function return type (Ft) macros are used as before, now naming hello.h and int instead of hi.h and void. I've added an explicit-scope macro pair Fo and Fc, syntactically like Bl and El, that encloses the function's arguments. This renders as follows. Note that the formatter is smart enough to comma-separate the Fa macros.

SYNOPSIS

#include <hello.h>

int

hello(int C, const char *prefix);

It's clear that the Fo macro accepts the function name (as Fn did for the simple example), but it also opens a function prototype scope. This scope is closed by Fc. The contained Fa macros are for function arguments. If you're wondering why I didn't use Fn as in the last section, it's a matter of readability. Using Fn puts everything on one long line, such as the following.

.Sh SYNOPSIS

.In hello.h

.Ft int

.Fn hello "int C" "const char *prefix"

This works with two arguments, but can quickly run into long lines. In general, lines in your mdoc manual shouldn't exceed 78 characters. Shorter lines are useful when managing manuals in cvs or other version management systems; we'll discuss this in later sections of this book. The quoted arguments to Fa may seem superfluous, but in C each argument to Fa consists of a type and a variable name. Since one may specify several arguments to a single Fa, the quotes are necessary to mark each type and name pair as a single argument.

.Sh SYNOPSIS

.In hello.h

.Ft int

.Fo hello

.Fa "int C" "const char *prefix"

.Fc

This renders as follows.

SYNOPSIS

#include <hello.h>

int

hello(int C, const char *prefix);

In the C language, function prototypes don't necessarily need named function arguments. However, it's good practise to name arguments when documenting them in the SYNOPSIS so that we can consistently refer to them later on in the manual. Let's refer to them now in the DESCRIPTION, where we document our arguments. Note that there are no set conventions for documenting function arguments in the DESCRIPTION body. Sometimes this is done within the flow of a regular sentence. Other times, as below, we'll introduce each argument as part of a list.

.Sh DESCRIPTION

The

.Fn hello

function prints

.Qq hello, world .

.Pp

It accepts the following arguments:

.Bl -tag -width Ds

.It Fa "int C"

Non-zero if the output should be uppercase.

.It Fa "const char *prefix"

A prefix to precede the output or NULL for no prefix.

.El

Here, we see the familiar Bl and El list enclosure. Notice how I re-use the Fa macro to document individual arguments, just as I re-used Fl and Ar when documenting command-line flags and arguments. In the last section, I mentioned why we use Fn instead of Nm for re-stating the name. This renders as follows.

DESCRIPTION

The hello() function prints “hello, world”.

It accepts the following arguments:

int C

Non-zero if the output should be uppercase.

const char *prefix

A prefix to precede the output or NULL for no prefix.

Finally, let's add a section documenting the return value of this function. This will differ from the simple example in that we actually return a value.

.Sh RETURN VALUES

The

.Fn hello

function returns 1 on success, 0 on failure.

Piecing this example together, we have the following respectable C function manual.

.Dd May 30, 2011

.Dt HELLO 3

.Os

.Sh NAME

.Nm hello

.Nd print \(dqhello, world\(dq

.Sh SYNOPSIS

.In hello.h

.Ft int

.Fo hello

.Fa "int C" "const char *prefix"

.Fc

.Sh DESCRIPTION

The

.Fn hello

function prints

.Qq hello, world .

.Pp

It accepts the following arguments:

.Bl -tag -width Ds

.It Fa "int C"

Non-zero if the output should be uppercase.

.It Fa "const char *prefix"

A prefix to precede the output or NULL for no prefix.

.El

.Sh RETURN VALUES

The

.Fn hello

function returns 1 on success, 0 on failure.

Running through our checklist, we see that we've described preprocessor information with the header file macro In; function calling syntax and types in the SYNOPSIS; and arguments in the DESCRIPTION along with function operation. This contains all we need to know about the function.

Last edited by $Author: kristaps $ on $Date: 2014/04/07 21:27:38 $.
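To make the quoting rule for Fa concrete, here is a small sketch contrasting a quoted and an unquoted invocation inside the Fo and Fc scope. The function name hello and its arguments come from this section; the second, unquoted form is a deliberately wrong variant added only for illustration.

```roff
.\" Intended: two arguments, each a type followed by a name.
.Fo hello
.Fa "int C" "const char *prefix"
.Fc
.\" Mistake: without quotes, each word is a separate argument,
.\" so this describes five arguments: int, C, const, char, *prefix.
.Fo hello
.Fa int C const char *prefix
.Fc
```

Since the formatter comma-separates function arguments, the second form would be expected to render as a prototype with five parameters, which is clearly not what was meant.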

Case Study

I now introduce a case study of a real-world function manual, in particular the manual for the strtonum function, which is an extension to the C Standard Library found in OpenBSD. The original file may be viewed on-line at src/lib/libc/stdlib/strtonum.3, file version 1.14. In this case study, I've chosen a manual with some bad behaviour: not broken, but bad. This is intentional, to show how real-world manuals deviate from recommended forms. I'll explicitly note each instance of bad behaviour as we explore the manual's contents.

.\" $OpenBSD: strtonum.3,v 1.14 2007/05/31 19:19:31 jmc Exp $

.\"

.\" Copyright (c) 2004 Ted Unangst

.\"

.\" Permission to use, copy, modify, and distribute this software for any

.\" purpose with or without fee is hereby granted, provided that the above

.\" copyright notice and this permission notice appear in all copies.

.\"

.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES

.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF

.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR

.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES

.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN

.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF

.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

This is the standard comment header to manual files in OpenBSD. Indeed, most distributed manuals begin with a copyright notice, then a license. The $OpenBSD$ line is automatically updated by the revision control system, cvs, whenever an update to the file is committed. The line following is the copyright message, and following that is the text form of the ISC license.

.Dd $Mdocdate: May 31 2007 $

.Dt STRTONUM 3

.Os

These three standard macros establish the last modified date, the manual title (the same as the single documented function, but capitalised), the manual category 3 (functions), and the default operating system. The $Mdocdate$ line, like the $OpenBSD$ line, is automatically updated by cvs whenever the document is committed to the source repository.

.Sh NAME

.Nm strtonum

.Nd "reliably convert string value to an integer"

This declares a single documented function, strtonum, and its purpose. The quotation marks within the Nd macro are superfluous: like the Qq macro studied earlier, Nd accepts an arbitrary number of arguments to format. Quoting groups the words into one argument, which is only useful for preserving special whitespace, and there is none to preserve here.

.Sh SYNOPSIS

.Fd #include <stdlib.h>

.Ft long long

.Fo strtonum

.Fa "const char *nptr"

.Fa "long long minval"

.Fa "long long maxval"

.Fa "const char **errstr"

.Fc

This declares the function prototype and calling syntax. First, let's examine the new Fd macro. The use of this macro for a header inclusion is obsolete: new manuals should always use In. With In, parsers can recognise the line as a header file (and possibly link to it) instead of treating it as a generic preprocessor statement. The re-written form would begin as follows:

.Sh SYNOPSIS

.In stdlib.h

Moving along, we see that each function argument includes its name (e.g., nptr and minval). While not common in header file prototypes, this allows later references in the manual to refer back to the prototype for type and context information. In the previous section, we discussed the relevance of quotation with Fa: the same is done here. While we could have used Fn, it would have created an overly long input line. Choosing between Fn and Fo is purely a matter of convenience and has no effect on parsing or formatting.

.Sh DESCRIPTION

The

.Fn strtonum

function converts the string in

.Fa nptr

to a

.Li long long

value.

In the SYNOPSIS, the Fa included the full type information. Here, however, we use Fa with just the argument name, nptr. We could have done the same in the SYNOPSIS, but the C language includes all type information in its prototypes. The Li macro here isn't good practise: since long long refers to a type, it should be marked up with Vt. This behaviour (using a presentation macro instead of a semantic one) is a holdover from legacy manual forms that are purely presentational. If you find yourself applying a style, think twice whether it's a good idea!

The

.Fn strtonum

function was designed to facilitate safe, robust programming

and overcome the shortcomings of the

.Xr atoi 3

and

.Xr strtol 3

family of interfaces.

.Pp

The string may begin with an arbitrary amount of whitespace

(as determined by

.Xr isspace 3 )

followed by a single optional

.Ql +

or

.Ql -

sign.

.Pp

The remainder of the string is converted to a

.Li long long

value according to base 10.

.Pp

The value obtained is then checked against the provided

.Fa minval

and

.Fa maxval

bounds.

If

.Fa errstr

is non-null,

.Fn strtonum

stores an error string in

.Fa *errstr

indicating the failure.

The remainder of the DESCRIPTION section completely captures the calling syntax and behaviour of the function. The Ql macro is used simply to set non-alphanumeric characters apart from the regular stream of text.

.Sh RETURN VALUES

The

.Fn strtonum

function returns the result of the conversion,

unless the value would exceed the provided bounds or is invalid.

On error, 0 is returned,

.Va errno

is set, and

.Fa errstr

will point to an error message.

.Fa *errstr

will be set to

.Dv NULL

on success;

this fact can be used to differentiate

a successful return of 0 from an error.

Since this function reports errors in a rather tricky way, it's necessary to describe the effects on both the return value and the passed-in arguments.

.Sh EXAMPLES

Using

.Fn strtonum

correctly is meant to be simpler than the alternative functions.

.Bd -literal -offset indent

int iterations;

const char *errstr;



iterations = strtonum(optarg, 1, 64, &errstr);

if (errstr)

errx(1, "number of iterations is %s: %s", errstr, optarg);

.Ed

.Pp

The above example will guarantee that the value of iterations is between

1 and 64 (inclusive).

Many manual readers jump directly to the EXAMPLES section to gain an understanding of your function. Thus, not only must the example compile and run, it must also demonstrate as many parts of the function as possible. In the case of strtonum, both an error condition and a non-error condition are documented. However, the header file inclusions are missing, which may mislead readers. In particular, the non-standard errx function requires the err.h header file.

.Sh ERRORS

.Bl -tag -width Er

.It Bq Er ERANGE

The given string was out of range.

.It Bq Er EINVAL

The given string did not consist solely of digit characters.

.It Bq Er EINVAL

.Ar minval

was larger than

.Ar maxval .

.El

.Pp

If an error occurs,

.Fa errstr

will be set to one of the following strings:

.Pp

.Bl -tag -width "too largeXX" -compact

.It too large

The result was larger than the provided maximum value.

.It too small

The result was smaller than the provided minimum value.

.It invalid

The string did not consist solely of digit characters.

.El

The ERRORS section will be rigorously covered in the section on System Calls. In brief, since the errno global error variable is set, each possible value must be documented in a list using the Er macro. By convention, each is enclosed within Bq. Furthermore, the error strings stored in errstr must also be documented.

.Sh SEE ALSO

.Xr atof 3 ,

.Xr atoi 3 ,

.Xr atol 3 ,

.Xr atoll 3 ,

.Xr sscanf 3 ,

.Xr strtod 3 ,

.Xr strtol 3 ,

.Xr strtoul 3

This section collects all references to other manuals made elsewhere in this manual, then adds more for completeness. Note that the entries are alphabetically sorted.

.Sh STANDARDS

.Fn strtonum

is an

.Ox

extension.

The existing alternatives, such as

.Xr atoi 3

and

.Xr strtol 3 ,

are either impossible or difficult to use safely.

.Sh HISTORY

The

.Fn strtonum

function first appeared in

.Ox 3.6 .

Since this function is included in OpenBSD's C Standard Library, the fact that the function is not standard must absolutely be documented. Here, the Ox macro indicates the OpenBSD operating system (each BSD UNIX operating system has its own macro).

Last edited by $Author: kristaps $ on $Date: 2011/11/05 16:50:11 $.
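The recommendations made in this case study can be gathered into a set of revised fragments. This is a sketch of how the criticised lines might be rewritten, not the text of any later revision of the OpenBSD manual. The comment lines are added here for orientation, and the choice of headers in the example is an assumption that follows the remark about err.h and the header named in the SYNOPSIS.

```roff
.\" NAME: no quotation marks needed around the description.
.Nd reliably convert string value to an integer
.\" SYNOPSIS: In replaces the obsolete Fd form.
.In stdlib.h
.\" DESCRIPTION: Vt marks up a type; Li is purely presentational.
function converts the string in
.Fa nptr
to a
.Vt long long
value.
.\" EXAMPLES: state the headers the example depends on.
.Bd -literal -offset indent
#include <err.h>
#include <stdlib.h>

int iterations;
const char *errstr;

iterations = strtonum(optarg, 1, 64, &errstr);
if (errstr)
	errx(1, "number of iterations is %s: %s", errstr, optarg);
.Ed
```

These are separate fragments rather than one contiguous manual. A fuller example would probably also include unistd.h, since optarg is declared there on most systems.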

Function Library

I've mentioned several times that the name provided to Nm doesn't necessarily refer to the title of the manual in Dt. Let's study a simple function library, using both hi and hello, which demonstrates this concept. A function library is a collection of object files, which consist mainly of programming functions, within a single file called a library. On most UNIX systems, you can find libraries installed in directories such as /usr/lib or /usr/local/lib, with names ending in .a or .so. This example applies to any number of functions belonging to the same library, not necessarily all functions in the library. In fact, one commonly finds large libraries spread over many manuals, each of which contains several similar functions. For simplicity's sake, I'll call this C function library libgreeting, implying that the installed library is called libgreeting.a or libgreeting.so. It will have two header files, hi.h and hello.h, containing the function prototypes for hi and hello, respectively. Let's begin with the first few macros, which are also called the manual prologue.

.Dd May 30, 2011

.Dt GREETING 3

.Os

Note that I've changed the document title to GREETING instead of choosing between function names. This is because the manual documents the entire function library, not just one particular function. In general, the title of a function library manual should not include the leading lib. It's a good rule of thumb that the Dt title of your document matches its filename. Next, I'll list the names of the functions being documented. I also change the description of the manual to be more generic, just in case I want to add new functions later.

.Sh NAME

.Nm hello ,

.Nm hi

.Nd print greeting messages

Here I've used Nm twice to indicate that the manual documents two functions. In doing so, I'll have to be careful when invoking Nm in later parts of the manual, as it will produce hello, the first name listed, if I don't specify a name, and this is probably not desired (nor should it be depended upon, as I may re-order the names). If we were only documenting a single function in a library, we would only assign Nm and Nd to the relevant function and not that of the library. It's good practise to alphabetise the function names in the NAME section. We must also be sure to comma-separate each name, leaving the last invocation without a comma. Let's look at the output so far.

NAME

hello, hi - print greeting messages

Some operating systems, for example FreeBSD and NetBSD, require a LIBRARY section for base system libraries, even though it is hard to maintain and not very useful. For portable libraries, do not include such a section.

.Sh LIBRARY

.Lb libgreeting

This uses the macro Lb, which accepts the name of the library starting with lib. This macro is not portable because the list of known library names is system dependent, so it will produce different output on different systems, which is not desirable for a manual page.

NAME

hello, hi - print greeting messages

LIBRARY

library “libgreeting”

The SYNOPSIS section will simply be a collection of the calling syntaxes for both functions, which we've already studied. If we were only documenting one function, we would list only that function here.

.Sh SYNOPSIS

.In hello.h

.In hi.h

.Ft int

.Fo hello

.Fa "int C" "const char *prefix"

.Fc

.Ft void

.Fn hi

Note that I've listed both include files prior to the function prototypes. This is familiar to C programmers, where functions may have multiple include files that need a specific order. The functions are listed in the same order as their Nm listing. Let's examine the output so far.

NAME

hello, hi - print greeting messages

LIBRARY

library “libgreeting”

SYNOPSIS

#include <hello.h>

#include <hi.h>

int

hello(int C, const char *prefix);

void

hi();

Already, a manual reader has lots of pertinent information: the name of the library, its header files, and the function calling syntax. Let's continue by documenting the functions and their arguments, but this time, we'll do so in a different style than before. Instead of using lists, we describe each function as a free-form stream of text. We depend on the SYNOPSIS to tell the reader the function argument types; there's no need to re-state them.

.Sh DESCRIPTION

The

.Fn hi

and

.Fn hello

functions print out greeting messages.

.Pp

The

.Fn hi

function accepts no arguments and prints out

.Qq hello, world .

.Pp

The

.Fn hello

function accepts a value

.Fa C ,

which if non-zero indicates output should be uppercase; and

.Fa prefix ,

which, if non-NULL, shall be prefixed to the output.

The

.Fa prefix

argument, if non-NULL, must be nil-terminated.

Notice how each sentence in this fragment ends on its own line, for example:

which, if non-NULL, shall be prefixed to the output.

The

.Fa prefix

By doing so, the formatter is able to recognise the end of a sentence and correctly handle sentence spacing. In most cases, this means adding two spaces between the period and subsequent text. From this follows a rule of thumb: new sentence, new line. In this DESCRIPTION we've captured what each function does and what its arguments are. What remains are return values.

.Sh RETURN VALUES

The

.Fn hi

function does not return a value.

.Pp

The

.Fn hello

function returns 1 on success, 0 on failure.

Let's collect these fragments into a single document and see if it's enough to use as a programming reference.

NAME

hello, hi - print greeting messages

LIBRARY

library “libgreeting”

SYNOPSIS

#include <hello.h>

#include <hi.h>

int

hello(int C, const char *prefix);

void

hi();

DESCRIPTION

The hi() and hello() functions print out greeting messages.

The hi() function accepts no arguments and prints out “hello, world”.

The hello() function accepts a value C, which if non-zero indicates output should be uppercase; and prefix, which, if non-NULL, shall be prefixed to the output. The prefix argument, if non-NULL, must be nil-terminated.

RETURN VALUES

The hi() function does not return a value.

The hello() function returns 1 on success, 0 on failure.

We'll use our mental checklist as a guide. First we stipulated linking information with the Lb macro. Then we introduced the calling syntax of each function, naming their arguments. We also stipulated the necessary header files in the order they'd be included in source files. In the DESCRIPTION, we described each function and its arguments in full. Lastly, we documented return values in the RETURN VALUES section. From this information, a programmer should be able to interface with our library.

Last edited by $Author: schwarze $ on $Date: 2016/03/22 14:28:44 $.
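The collected output above corresponds to source along the following lines. This is simply the fragments from this section concatenated. The LIBRARY section is left out here on the assumption that libgreeting is a portable library, following the advice given earlier, so formatting this source would not produce the LIBRARY lines.

```roff
.Dd May 30, 2011
.Dt GREETING 3
.Os
.Sh NAME
.Nm hello ,
.Nm hi
.Nd print greeting messages
.Sh SYNOPSIS
.In hello.h
.In hi.h
.Ft int
.Fo hello
.Fa "int C" "const char *prefix"
.Fc
.Ft void
.Fn hi
.Sh DESCRIPTION
The
.Fn hi
and
.Fn hello
functions print out greeting messages.
.Pp
The
.Fn hi
function accepts no arguments and prints out
.Qq hello, world .
.Pp
The
.Fn hello
function accepts a value
.Fa C ,
which if non-zero indicates output should be uppercase; and
.Fa prefix ,
which, if non-NULL, shall be prefixed to the output.
The
.Fa prefix
argument, if non-NULL, must be nil-terminated.
.Sh RETURN VALUES
The
.Fn hi
function does not return a value.
.Pp
The
.Fn hello
function returns 1 on success, 0 on failure.
```

Following the rule of thumb that the Dt title matches the filename, a file like this would be saved as greeting.3.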

Case Study

I now introduce a case study of a real-world function library manual, in particular the manual for the getc, fgetc, getw, and getchar functions from OpenBSD. The original file may be viewed on-line at src/lib/libc/stdio/getc.3, file version 1.12. This is not the manual for the full function library, but only a handful of similar functions.

.\" $OpenBSD: getc.3,v 1.12 2007/05/31 19:19:31 jmc Exp $

.\"

.\" Copyright (c) 1990, 1991, 1993

.\" The Regents of the University of California. All rights reserved.

.\"

.\" This code is derived from software contributed to Berkeley by

.\" Chris Torek and the American National Standards Committee X3,

.\" on Information Processing Systems.

.\"

.\" Redistribution and use in source and binary forms, with or without

.\" modification, are permitted provided that the following conditions

.\" are met:

.\" 1. Redistributions of source code must retain the above copyright

.\" notice, this list of conditions and the following disclaimer.

.\" 2. Redistributions in binary form must reproduce the above copyright

.\" notice, this list of conditions and the following disclaimer in the

.\" documentation and/or other materials provided with the distribution.

.\" 3. Neither the name of the University nor the names of its contributors

.\" may be used to endorse or promote products derived from this software

.\" without specific prior written permission.

.\"

.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND

.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE

.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE

.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE

.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL

.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS

.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)

.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT

.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY

.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF

.\" SUCH DAMAGE.

This is the standard comment header to manual files in OpenBSD. The $OpenBSD$ line is automatically updated by the revision control system, cvs, whenever an update to the file is committed. The line following is the copyright message, and following that is the text form of the BSD license.

.Dd $Mdocdate: May 31 2007 $

.Dt GETC 3

.Os

This classifies our manual in category 3 as a function or function library. The title of the manual, GETC, is chosen as the most general of the functions listed below in the NAME section.

.Sh NAME

.Nm fgetc ,

.Nm getc ,

.Nm getchar ,

.Nm getw

.Nd get next character or word from input stream

This lists (alphabetically) all the functions that will be documented, along with a general note on their collective purpose. We next jump down into the SYNOPSIS; since this set of functions is part of the C Standard Library, it needs no special linking information.

.Sh SYNOPSIS

.Fd #include <stdio.h>

.Ft int

.Fn fgetc "FILE *stream"

.Ft int

.Fn getc "FILE *stream"

.Ft int

.Fn getchar "void"

.Ft int

.Fn getw "FILE *stream"

This documents the calling syntax of all functions. Note that the Fd macro is used instead of the In macro. This invocation is historically relevant, but new manuals should always use In.

.In stdio.h

Next, each function and its arguments are explained as a free-flowing paragraph. This was probably chosen instead of using a list item for each argument (with Bl) due to the small number of arguments.

.Sh DESCRIPTION

The

.Fn fgetc

function obtains the next input character (if present) from the stream

pointed at by

.Fa stream ,

or the next character pushed back on the stream via

.Xr ungetc 3 .

.Pp

The

.Fn getc

function acts essentially identically to

.Fn fgetc ,

but is a macro that expands in-line.

.Pp

The

.Fn getchar

function is equivalent to

.Fn getc

with the argument

.Em stdin .

.Pp

The

.Fn getw

function obtains the next

.Li int

(if present)

from the stream pointed at by

.Fa stream .

The usage of the Em macro is not correct: the Va or Dv macro would have been more appropriate for stdin. The same applies to Li, where Vt would suit the int type. The mdoc language is semantic, so using presentation macros such as Li and Em is discouraged.

.Sh RETURN VALUES

If successful, these routines return the next requested object from the

.Fa stream .

If the stream is at end-of-file or a read error occurs, the routines return

.Dv EOF .

The routines

.Xr feof 3 and

.Xr ferror 3

must be used to distinguish between end-of-file and error.

If an error occurs, the global variable

.Va errno

is set to indicate the error.

The end-of-file condition is remembered, even on a terminal, and all

subsequent attempts to read will return

.Dv EOF

until the condition is cleared with

.Xr clearerr 3 .

.Sh SEE ALSO

.Xr ferror 3 ,

.Xr fopen 3 ,

.Xr fread 3 ,

.Xr putc 3 ,

.Xr ungetc 3

All possible return values are correctly documented in the RETURN VALUES section, and relevant functions are cross-linked in the SEE ALSO section. Note that the cross-linked manuals are also alphabetically sorted.

.Sh STANDARDS

The

.Fn fgetc ,

.Fn getc ,

and

.Fn getchar

functions conform to

.St -ansiC .

Noting standards conformance is extremely important: it allows programmers and administrators to depend on your component in a cross-platform fashion. These functions are part of the C Standard Library.

.Sh BUGS

Since

.Dv EOF

is a valid integer value,

.Xr feof 3

and

.Xr ferror 3

must be used to check for failure after calling

.Fn getw .

.Pp

Since the size and byte order of an

.Vt int

may vary from one machine to another,

.Fn getw

is not recommended for portable applications.

The BUGS section should be used very carefully, since bugs should preferably be fixed. In this section, design bugs have been documented. Whether the CAVEATS section would be more appropriate is up to the manual author. We found several inconsistent uses of mdoc in this manual. In general, if you find unusual or erroneous macros or styles in UNIX manuals, notify the authors! A bug in a manual is just as important as a bug in the code.

Last edited by $Author: kristaps $ on $Date: 2011/11/04 01:06:28 $.
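The markup problems noted in this case study could be addressed along the following lines. This is a sketch rather than the text of the actual OpenBSD manual. The case study allows either Va or Dv for stdin, and Dv is chosen here as an assumption; the comment lines are added only for orientation.

```roff
.\" SYNOPSIS: In replaces the historical Fd form.
.In stdio.h
.\" DESCRIPTION: stdin is marked up semantically with Dv rather than Em.
The
.Fn getchar
function is equivalent to
.Fn getc
with the argument
.Dv stdin .
.\" DESCRIPTION: the int type is marked up with Vt rather than Li.
The
.Fn getw
function obtains the next
.Vt int
(if present)
from the stream pointed at by
.Fa stream .
```

The Vt form matches what the manual's own BUGS section already does when it refers to int.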

System Call

A system call differs from a user-land function in that it triggers the operating system kernel to perform some operation. This usually applies to I/O, such as reading from files or sockets with read. Other than that, system calls are no different from regular functions: they're invoked, have return values, and so on. In mdoc, however, a system call is a special function whose manual has at least one section not found in ordinary function manuals. The first difference between ordinary functions and system calls is the manual category. Let's study a function khello (kernel hello), which is similar to the hello function described earlier.

.Dd May 30, 2011

.Dt KHELLO 2

.Os

All system calls are in category 2. Furthermore, except under special circumstances, each system call is accorded its own manual. I'll use the same descriptive text as in the hello example. Note that for system calls, the hello.h header file should be in the compiler's standard include path. This is usually /usr/include on UNIX systems.

.Sh NAME

.Nm hello

.Nd print greeting messages

.Sh SYNOPSIS

.In hello.h

.Ft int

.Fo hello

.Fa "int C" "const char *prefix"

.Fc

.Sh DESCRIPTION

The

.Nm

function prints out a greeting message.

.Pp

It accepts a value

.Fa C ,

which if non-zero indicates output should be uppercase; and

.Fa prefix ,

which, if not

.Dv NULL ,

shall be prefixed to the output.

The

.Fa prefix

argument, if not

.Dv NULL ,

must be nil-terminated.

You'll notice I've omitted the LIBRARY section in this example, as system calls by definition aren't a part of a library. Furthermore, I've used the Dv macro to annotate the term NULL as a constant. Let's examine the output so far.

NAME

hello - print greeting messages

SYNOPSIS

#include <hello.h>

int

hello(int C, const char *prefix);

DESCRIPTION

The hello function prints out a greeting message.

It accepts a value C, which if non-zero indicates output should be uppercase; and prefix, which, if not NULL, shall be prefixed to the output. The prefix argument, if not NULL, must be nil-terminated.

In the hello example, I included a section RETURN VALUES detailing the return value of the function. System calls, however, usually return a standard value and have a side effect of setting the C library errno variable when invoked within a C language context. This is documented with a special macro Rv.

.Sh RETURN VALUES

.Rv -std

The std flag is by convention always specified. This macro will produce standard text regarding the errno value, stating that the function returns -1 on failure and 0 on success. If you have multiple functions specified in your manual, you must list them individually as arguments to Rv. Next, the possible values of errno must be specified in the ERRORS section as a list. Let's assume that EFAULT may be set if the pointer is invalid.

.Sh ERRORS

.Bl -tag -width Er

.It Er EFAULT

.Fa prefix

points outside the allocated address space.

.El

The syntax of this list differs from lists we've already encountered. Earlier we used the special term Ds as an argument to width to specify a generic width. Here, we used Er, which is also the macro at the start of each list tag (lines beginning with It). The macro Er specifies a possible value of errno. There are many standard names for errno values, such as EFAULT used in our example. When we stipulate Er as the argument of width, the formatter is able to translate this into a generic width suitable for most Er macro contents. You should avoid using this construct unless it's in a conventional way, as it is here. If your system call is part of an operating system, it's common to add some lines as to when it was added. Let's assume you're adding the function to a fictional Foo OS. Most modern UNIX operating systems have their own macros, such as Bx for BSD UNIX. Be sure to note the version of the operating system.

.Sh HISTORY

The

.Nm

function call appeared in Foo OS version 1.0.

Let's put all of these sections together and preview the output.

NAME

hello - print greeting messages

SYNOPSIS

#include <hello.h>

int

hello(int C, const char *prefix);

DESCRIPTION

The hello function prints out a greeting message.

It accepts a value C, which if non-zero indicates output should be uppercase; and prefix, which, if not NULL, shall be prefixed to the output. The prefix argument, if not NULL, must be nil-terminated.

RETURN VALUES

The hello() function returns the value 0 if successful; otherwise the value -1 is returned and the global variable errno is set to indicate the error.

ERRORS

EFAULT

prefix points outside the allocated address space.

HISTORY

The hello function call appeared in Foo OS version 1.0.

We can make sure the manual is complete by reviewing the checklist for function documentation. First we implied linking information by using category two (which does not need to be specially linked). Then we introduced the calling syntax of the function, naming its arguments. We also stipulated the necessary header files. In the DESCRIPTION, we described the function and its arguments in full. Lastly, we documented return values in the RETURN VALUES section and the errors set in ERRORS. We also added a HISTORY section, which isn't mentioned as part of our checklist but is considered good practise for system calls. In general, a note on historical information is useful to put your component in the general context of related machinery.

Last edited by $Author: kristaps $ on $Date: 2014/04/07 21:27:38 $.
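The remark that multiple functions must be listed individually as arguments to Rv can be sketched as follows. The second system call, goodbye, is purely hypothetical and is introduced here only so that there is more than one name to list; the omitted sections are marked with a comment.

```roff
.Sh NAME
.Nm goodbye ,
.Nm hello
.Nd print greeting messages
.\" ... SYNOPSIS and DESCRIPTION omitted from this sketch ...
.Sh RETURN VALUES
.Rv -std goodbye hello
```

With both names given, the formatter should produce standard return value text that covers both functions, instead of mentioning only the first name from the NAME section.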

Case Study

I now introduce a case study of a real-world system call manual: the manual for the fsync function from OpenBSD. The original file may be viewed on-line at src/lib/libc/sys/fsync.2, file version 1.9.

.\" $OpenBSD: fsync.2,v 1.9 2011/04/29 07:12:44 jmc Exp $

.\" $​NetBSD: fsync.2,v 1.4 1995/02/27 12:32:38 cgd Exp $

.\"

.\" Copyright (c) 1983, 1993

.\" The Regents of the University of California. All rights reserved.

.\"

.\" Redistribution and use in source and binary forms, with or without

.\" modification, are permitted provided that the following conditions

.\" are met:

.\" 1. Redistributions of source code must retain the above copyright

.\" notice, this list of conditions and the following disclaimer.

.\" 2. Redistributions in binary form must reproduce the above copyright

.\" notice, this list of conditions and the following disclaimer in the

.\" documentation and/or other materials provided with the distribution.

.\" 3. Neither the name of the University nor the names of its contributors

.\" may be used to endorse or promote products derived from this software

.\" without specific prior written permission.

.\"

.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND

.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE

.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE

.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE

.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL

.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS

.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)

.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT

.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY

.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF

.\" SUCH DAMAGE.

.\"

.\" @(#)fsync.2 8.1 (Berkeley) 6/4/93

The cvs identifiers (both from the current system, OpenBSD, and the import source system, NetBSD), copyright, license, and sccs identifier (from the original system) are presented in the usual way: the $OpenBSD$ and $NetBSD$ lines are automatically updated by the revision control system, cvs, whenever a change to the file is committed. The lines following are the copyright message, and following that is the text form of the BSD license.

.Dd $Mdocdate: April 29 2011 $

.Dt FSYNC 2

.Os

The manual's last-modified date is maintained with the automatically-updated $Mdocdate$ sequence. Its title is the single function's name in capitals, its category is 2 (system calls), and the empty Os applies it to the current operating system.

.Sh NAME

.Nm fsync

.Nd "synchronize a file's in-core state with that on disk"

The Nd macro's arguments are superfluously quoted again.

.Sh SYNOPSIS

.Fd #include <unistd.h>

.Ft int

.Fn fsync "int fd"

Again, in historical manuals, Fd is sometimes used instead of the modern In macro. Note also the inclusion of the function argument's name, fd, where regular C prototypes would usually only include the type.

.Sh DESCRIPTION

.Fn fsync

causes all modified data and attributes of

.Fa fd

to be moved to a permanent storage device.

This normally results in all in-core modified copies

of buffers for the associated file to be written to a disk.

.Pp

.Fn fsync

should be used by programs that require a file to be in a known state,

for example, in building a simple transaction facility.

Since fsync is a simple function, its description is fairly straightforward. The single function argument fd is fully described as well.

.Sh RETURN VALUES

A 0 value is returned on success.

A \-1 value indicates an error.

This is incomplete, as it omits the fact that the errno global variable is set on error. The Rv macro should be used instead.

.Sh ERRORS

The

.Fn fsync

fails if:

.Bl -tag -width Er

.It Bq Er EBADF

.Fa fd

is not a valid descriptor.

.It Bq Er EINVAL

.Fa fd

refers to a socket, not to a file.

.It Bq Er EIO

An I/O error occurred while reading from or writing to the file system.

.El

Most (if not all) system calls set the errno global variable upon failure. This was erroneously omitted from the RETURN VALUES section, but is documented here.

.Sh SEE ALSO

.Xr sync 2 ,

.Xr sync 8

.Sh HISTORY

The

.Fn fsync

function call appeared in

.Bx 4.2 .

Note that the cross-references in SEE ALSO are ordered first by section, then alphabetically. Bx is referenced as the origin of the system call. The STANDARDS section is sorely missing, as fsync is specified by the POSIX.1-2008 standard.

We again found several inconsistent uses of mdoc in this case study. Let this serve as a reminder that if you find bad or unusual mdoc in your manuals, notify the authors! A bug in a manual is just as important as a bug in the code.
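The two corrections suggested in this case study might be sketched as follows. This is an illustrative fragment, not text from the OpenBSD manual: it assumes the standard wording produced by Rv -std is acceptable for RETURN VALUES, and that your formatter's St macro recognises the -p1003.1-2008 argument.

```mdoc
.\" Sketch only: not part of the original fsync.2 source.
.Sh RETURN VALUES
.Rv -std fsync
.\" ... ERRORS and SEE ALSO as in the original ...
.Sh STANDARDS
The
.Fn fsync
function is specified by
.St -p1003.1-2008 .
```

By convention, STANDARDS falls between SEE ALSO and HISTORY.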

Manual Syntax and Structure

In the last part, I introduced some mdoc language syntax by way of example. We covered Commands and Functions. In this part, I'll study the structure of the UNIX manual itself.

Historically, the syntax and structure of mdoc derive from roff, a text processing language predating even UNIX. mdoc was in fact a bundle of macros expanded by a formatter into roff, not a separate language. Only recently has mdoc matured enough to be considered a standalone language.

The general syntax of roff (and thus mdoc) can be traced to the RUNOFF command from the mid-sixties! The conventions of section names and manual categories were formalised later, in the early seventies, with the Version 1 AT&T UNIX Programmer's Manual.

Although the focus of this book is on mdoc, a great many of its idiosyncrasies derive from roff, so we'll spend some time discussing seemingly unnecessary complexity in the context of general text processing.

I reiterate that this is not a canonical mdoc reference: mdoc is not a standard, and varies in subtle ways across formatters and operating systems. In this part, I'll discuss only the portable parts of mdoc.

Syntax

Before studying the structure of mdoc manuals, let's review the language we've seen so far. Foremost, we've noticed that mdoc documents consist only of printable ASCII characters. We noted that a period at the beginning of a line indicates an mdoc macro:

.Qq hello, world

It's safe to say, in this case, that mdoc is line-oriented, in that programme flow is in part governed by position on a line. In the case of Qq, we saw how the macro extends to the end of the line. This is also the first notion of scope, specifically scoping to the end of line. We then saw examples where scope covers multiple lines and accommodates nested macros as well as text.

.Sh DESCRIPTION

The

.Nm

utility...

We were briefly introduced to the concept of macros accepting flags and flag arguments.

.Bl -tag -width Ds

.It List key.

List value.

.El

Finally, we noted that double-quotes have special semantic significance, which led to the topic of escaped terms such as \(dq for a double-quote character. We also saw how punctuation is treated specially when it lies at line boundaries.

End of sentence, end of line.

Same goes with

.Em macros .

In this chapter, we'll formalise these concepts. I'll draw my terminology from the literature of formal languages and grammar, but it's not necessary to be familiar with the terms beforehand.

Input Encoding

Without exception, a well-formed mdoc document consists only of ASCII printable characters, the space character, the newline character, and in some cases the tab character. Most modern formatters allow for CR+LF newlines (\r\n), but this is not portable. Modern formatters also accommodate lines of unlimited length; this is not necessarily the case for legacy formatters.

The backslash \ is always interpreted as the beginning of an escape sequence. If a backslash immediately precedes a newline, it escapes the newline, continuing the current line:

.Em This is considered one \

line of input.

Macro Line

Formally speaking, a macro line is one beginning with a control character. In mdoc, this is traditionally the . character, although historical documents may also use the ' character. This notation extends back to the historical RUNOFF utility:

Control Words: Input generally consists of English text, 360 or fewer characters to a line. Control words must begin a new line, and begin with a period so that they may be distinguished from other text. RUNOFF does not print the control words.

A line with only a control character followed by zero or more whitespace characters is stripped from input. A macro line may, in some circumstances, contain more macros. The first macro (the one following the control character) may then be distinguished as the line macro.

On macro lines, the following non-alphanumeric characters are syntactically meaningful. These characters are collectively called reserved characters.

!  punctuation
"  control character (quotation)
(  punctuation
)  punctuation
,  punctuation
-  control character (macro argument)
.  punctuation
:  punctuation
;  punctuation
?  punctuation
[  punctuation
\  control character (escape sequence)
]  punctuation
|  punctuation

To pass these characters along as literal text, you must either escape or quote them.

If an unescaped space character is encountered on a macro line, it is used to delimit macros, macro arguments, and flags. Multiple consecutive space characters have no effect on output.

.Em Hello, World
.Em Hello,     World

The spaces between Hello, and World delimit arguments in this case, and both lines produce the same output, Hello, World, without extra spaces.

Text Line

A text line is any line not beginning with a control character. Text lines are never parsed for macros and may consist of any printable ASCII characters. Text lines are concatenated together when forming output, so except in certain circumstances, newlines are stripped from input. Using a blank text line as a vertical separator is not portable.

If a space character is encountered on a text line, it is reproduced verbatim in the output.

Hello, World
Hello,     World

The spaces between Hello, and World will be reproduced in both cases as-is. However, it is considered non-portable to use spaces on a text line to shape output: HTML, for example, collapses whitespace by default. Also consider whether controlled spacing in an otherwise free-form text sequence is appropriate at all. In most space-retaining cases, such as source code examples, you're better off with a literal display mode, as covered at the end of this section.

Do not use the space-retaining feature to create double spaces following a sentential period! See Sentential Punctuation for how to do this properly.

If the first character of a text line is a space character, the output line shall be preceded by a newline. This creates the effect of an implicit literal display.

Hello, World.

  The newline, leading spaces,   and in-line spacing are retained.

This is free-form text.

The portability of this behaviour is unknown. For greater portability (and semantic annotation), open a literal display instead, for example with Bd -literal:

Hello, World.

.Bd -literal -compact

  The newline and leading spaces are retained.

.Ed

While this is not.

In this example, the -compact flag prevents leading vertical space. To produce vertical space following the literal display, use Pp. Consider the following example:

.Bd -literal

void a_function(int *foo, int bar) {
    *foo += bar;
}

.Ed

.Pp

This is subsequent text.
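To tie the rules of this section together, the following sketch shows the same words first on a macro line, then on a text line, then inside a literal display. The wording of the lines is invented for illustration, and the comments describe the behaviour as explained above rather than the output of any particular formatter.

```mdoc
.\" Macro line: begins with the control character, so Em is invoked.
.Em Hello, World
.\" Text line: no control character, so Em is printed as an ordinary word.
Em Hello, World
.\" Literal display: spacing and newlines are kept as typed.
.Bd -literal
Hello,     World
.Ed
```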

Escape Sequences

An escape sequence is any grouping of characters following a backslash \. It may occur anywhere in input. The syntax of what follows the backslash depends upon the first character. The following sections describe common escape sequences. The use of any other sequence is strongly discouraged for portable manuals; in fact, any escape beyond \& is best avoided, as it makes a manual render inconsistently across output formats, depending on how each renders glyphs.

Special Characters

Special characters allow the encoding of non-ASCII characters and, in macro lines, the use of reserved characters. Special characters may be invoked anywhere in input. There are three forms of special character, distinguished by the number of letters in the sequence.

\n    one-letter
\(nn  two-letter
\[N]  n-letter

The n-letter form may be used to express any of the others. For example, \& (a zero-width space) is equivalent to \[&]. The most common escape sequence is in fact \&, a non-printing, zero-width space. When preceding a word, it causes the word to be rendered as regular text:

The following flags are also macros:

.Fl \&Ar

If the Ar were not preceded by an escape, it would have been interpreted as the Ar macro instead of the flag Ar. An alternative is to quote the argument (see Quotation). The zero-width escape is more commonly found in literal contexts, on lines beginning with a period, such as

.Bd -literal

\&.Fl Ar

.Ed

Predefined Strings

An alternative form of special character is the predefined string. These are legacy roff constructs: escape sequences whose values may be programmatically set or unset. The syntax for predefined strings follows:

\*n    one-letter
\*(nn  two-letter
\*[N]  n-letter

The use of predefined strings is discouraged in portable manuals, as available strings may differ between implementations and formatters.
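As a brief illustration of the special character forms above, the sketch below writes a double-quote with the two-letter and n-letter forms, and uses \& to keep a macro name from being interpreted. The names dq and Ar come from this book's own examples; the surrounding wording is invented.

```mdoc
.\" Two-letter form of the double-quote special character.
A quoted \(dqword\(dq in running text.
.\" The n-letter form of the same character.
A quoted \[dq]word\[dq] in running text.
.\" Zero-width space: Ar is passed to Fl as text, not invoked as a macro.
.Fl \&Ar
```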

Comments

Comments (words in an mdoc document not interpreted by the formatter) are indicated by the special character \".

Regular text. \" In a comment.

.Em A macro . \" Another comment.

The comment extends from the special character to the end of the line. If the newline is escaped, the comment still applies only to the current line; in other words, the newline escape is itself commented out.

Not in a comment, \" in a comment \
Not in a comment.

A comment may span an entire line if it's specified as a pseudo-macro, that is, following the control character:

.\" This is a full-line comment.

Punctuation

The mdoc language, in descending from the type-setting language roff, has significant type-setting capabilities. Punctuation is treated specially in all mdoc documents, on both macro and text lines. The following characters are considered punctuation:

!  ending sentence
"  ending enclosure
(  opening enclosure
)  ending enclosure
,  ending
.  ending sentence
:  ending
;  ending
?  ending sentence
[  opening enclosure
]  ending enclosure
|  intervening

These are treated specially by the formatter when used in macro lines and at the end of text lines.

Sentential Punctuation

End of Sentence, End of Line. The end of a sentence should always be at the end of a line. This way, the formatter can recognise a sentence by the punctuation used and insert the correct amount of space. Where the output medium supports it (HTML, for example, does not), all modern mdoc formatters use English spacing to mark sentence boundaries. The characters marked "ending sentence" in the punctuation table mark an end of sentence. In text lines, sentence punctuation should always occur at the end of the line.

End of sentence.

End of line.

("Even with nested sentences.")

Note, in the last line, that the formatter recognises sentence punctuation even when it is followed by ending enclosure punctuation, as noted in the punctuation table. However, take care that non-sentence punctuation, such as in abbreviations, does not happen to fall at the line boundary.

Paging Dr.

Freud.

In this case, the formatter will interpret Dr. as ending a sentence. To prevent this, you can either restructure your line or add a zero-width escape following the period.

Paging Dr.\&

Freud.

Macro lines are slightly more complicated. The same rules apply, but punctuation marks must be separated by spaces. The formatter will understand the role of the punctuation and remove the spaces accordingly, or reorder sentence and closing punctuation.

Text (parenthesised

.Em text ) .

.Qq Properly period-closed quotation .

The punctuation may be escaped by either a trailing escape, as in the text case, or a preceding escape. It is then considered regular text rather than punctuation. Note that this will also cause an intervening space to be printed.

.Em End of sentence .

.Em Not end of sentence \&.

.Em Not end of sentence .\&

Regular Punctuation

Non-sentential text line punctuation (commas, parentheses, quotes, etc.) is a matter of literal printing.

Some text (punctuation), another "clause".

The rules for macro lines are the same but for in-line macros, which might decorate individual terms with text. In this case, punctuation given as a standalone argument is specially treated: it is not decorated, and whitespace is removed according to the punctuation type (opening, closing).

.Em ( Nicely spaced and decorated . )

.Em (All text decorated, no end-of-sentence.)

.Em ( Text alright , excepting the period \&. )

In the second example, (All and end-of-sentence.) are considered ordinary arguments, so their punctuation is not treated specially. In the third, the period is escaped and thus considered regular text.

Quotation

Several times I've mentioned how to have macro arguments interpreted as text (instead of, say, other macros) by quotation. In this section, I formalise the notion of quoting arguments. The issue of quotation is fairly complex owing to mdoc's predecessor, roff. In short, quoting arguments to macros passes the enclosed text verbatim as a single argument. An obvious case follows:

.Fl "Ar"

By quoting Ar, it is passed verbatim to Fl. If not, it would be interpreted as the macro Ar and open a new macro scope. What's worse is that the syntax is entirely legal! This illustrates a minor short-coming of mdoc: beginners may unwittingly invoke macros (such as Ar in our example). Printing a warning would cause more harm than good with well-formed manuals; thus, it's the responsibility of the document author to double-check that macro instructions are properly treated. This condition could also have been avoided by beginning the argument Ar with a zero-width escape, as in \&Ar.

The need for quotation is more obvious with the Fn macro:

.Fn int foo int bar

The syntax of Fn is that it first accepts an optional function type, then a function name, then arguments to the function. These arguments usually include a type followed by a name. In our example, int refers to the function type, foo to the name, and int and bar are separate arguments. Our intention, however, was to have int bar considered a single argument. To do so, we need to quote.

.Fn int foo "int bar"

The int bar argument is now passed intact to the macro.

To include quotation marks in quoted text, use two quotation marks in a row.

.Li """    "

This artificial invocation passes a quotation mark followed by four spaces to the Li macro. It is, however, unwise to use this language component: it's jarring to those expecting symmetric quotes, and easy to mis-type, leaving runaway quotes. It's safer to use an escape, such as \(dq, instead of pair-wise quotations.
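A short sketch of both recommendations in this section follows. The function foo and its arguments bar and baz are hypothetical names chosen for illustration; the function type is given with a separate Ft line, as in the SYNOPSIS of the earlier case study.

```mdoc
.\" Hypothetical function foo with two arguments, each quoted so that
.\" its type and name stay together as a single argument.
.Ft int
.Fn foo "int bar" "const char *baz"
.\" A literal double-quote written as an escape, not as doubled quotes.
.Li \(dq
```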

Structure

An mdoc manual is divided into two logical parts: the prologue and the document body.

.\" Prologue follows:

.Dd May 26 2011

.Dt MDOC 7

.Os

.\" Document body follows:

.Sh NAME

.Nm mdoc

.Nd mdoc language reference

.Sh DESCRIPTION

The

.Nm mdoc

language is used to format

.Bx

.Ux

manuals.

The prologue specifies information regarding the manual's classification. For the most part, this information does not change over the course of development. It specifies the manual's title (which may encompass multiple documented components) and category, the date of last editing, and other information.

.Dd May 26 2011

.Dt MDOC 7

.Os

The document body consists of the documentation content. This material changes over the course of development, and makes up the bulk of the manual page. It minimally consists of the component name, invocation syntax (if applicable), and a description of operation.

.Sh NAME

.Nm mdoc

.Nd mdoc language reference

.Sh DESCRIPTION

The

.Nm mdoc

language is used to format

.Bx

.Ux

manuals.

Prologue

The prologue consists at most of the Dd, Dt, and Os macros. These always occur at the beginning of a manual.

.Dd May 26 2011

.Dt MDOC 7

.Os

The only firm requirement of the mdoc prologue is that the Dd macro comes first: many formatting systems read up to the first macro to determine the formatting language. If Dd is not encountered first, the mdoc format may not be recognised. Following Dd, the prologue is conventionally ordered as first Dt and then Os. The Os macro is usually left without arguments, meaning that the manual applies to the current system.

After parsing the document prologue, the following is known:

The date of last modification.

The canonical title of the manual.

The manual category (manual section).

Whether the manual relates to a particular hardware architecture.

The relevant operating system.

Date

The date is specified by the Dd macro.

Dd date

While no particular date format is required, it's best to use the form month day, year, where month is the month name in English, day is the day of the month, and year is the four-digit year. Arbitrary white-space may separate the tokens, which may also be quoted.

Example of canonical form:

.Dd June 03, 1991

Example of non-zero-padded form:

.Dd June 3, 1991

Example of quoted-string form:

.Dd "June 3, 1991"

All of the above examples normalise to the third of June, 1991. It's especially important that the month be in English, as not all operating systems support localisation.

Some formatters also support a special date format as follows:

.Dd $Mdocdate: January 1 2012 $

This is usually used in conjunction with source-code control systems that automatically update the date. Consult your formatter's manual for whether it supports this feature.

Title

A manual's title identifies the entire manual document. It is always specified in uppercase as the first argument of the Dt macro, which conventionally follows the initial Dd macro.

Dt TITLE category architecture

The title usually corresponds to the file-name of the document, but this is not necessarily the case. In the case of a single-component manual, such as the manual for a single UNIX command or programming function, the title corresponds to the manual name as specified with the SYNOPSIS Nm macro argument. In the event of multiple components, such as a programming library, the title usually corresponds to the library name. If multiple commands are specified, such as with aliased names, the canonical form should be used.

Example of a title for the ls utility:

.Dt LS 1

Example of a title for the libgreeting function library, consisting of the hi and hello functions:

.Dt GREETING 3

If the title is left unspecified by omitting the Dt macro, behaviour is undefined. Usually a formatter will default to an empty string or LOCAL.
In general, however, a manual without Dt may be considered incomplete.

Category

The category of a manual, sometimes called the manual section, specifies the type of component a manual describes. It is specified in the second argument of the Dt macro.

Dt TITLE category architecture

These categories are dictated by convention extending to the Version 1 AT&T UNIX Programmer's Manual:

This manual is divided into seven sections:

Commands
System calls
Subroutines
Special files
File formats
User-maintained programs
Miscellaneous

Commands are programs intended to be invoked directly by the user, in contradistinction to subroutines, which are intended to be called by the user's programs. Commands generally reside in directory bin (for binary programs).

These sections have been expanded and formalised in the intervening years, amounting to the following modern conventions.

1: user utilities. Most commands fall under this category. A user utility is usable by all operators of a UNIX system. Common examples: ls, man, cat.

2: system calls. These are a special class of programming function, usually in C, that do not need header file or linking information. Common examples: open, close, write.

3: user programming functions. Most functions fall under this category. A user programming function is available as a standalone or library function, although some libraries, such as the C library, need not be explicitly linked. Common examples: strcpy, isascii.

4: device interfaces. This category is not as common as categories 1–3; in fact, not all systems use this section at all. When used, it consists of manuals for hardware device drivers. These manuals are usually tied to a particular architecture.

5: file formats. This category is not as common as categories 1–3. When used, it consists of documentation for structured text files. Common example: passwd.

6: games (and user utility miscellanea). This category is not as common as categories 1–3, as many systems do not come pre-supplied with games. When used, it refers to games or arcane utilities.

7: miscellaneous. Introductory materials or general text. This category is common, but its contents vary from system to system.

8: administrative utilities. This consists of utilities for system administration, which may not be accessible or executable by general users (see category 1). Common examples: dump, restore, fsck.

9: kernel programming functions. This category is found on few operating systems. Where applicable, it consists of those functions used in operating system internal development (kernel development).

There are several refinements to the numerical category convention. Perl, Fortran, and Tcl libraries are often grouped under categories 3p, 3f, and 3tcl, respectively. Perl modules may also fall under 3pm. Tcl libraries are also found in the n category. Although some common libraries are traditionally referred to with a custom suffix, such as 3ssl for the OpenSSL library, this notation is heavily discouraged.

Manuals for the X Window System, traditionally bundled with UNIX systems, are categorised under X11. Manuals for the popular X11R6 distribution of the X Window System may also be listed under X11R6.

The paper category historically consisted of longer papers, the draft category consists of draft manuals, unass consists of uncategorised manuals, and local consists of local system documentation. These categories are rarely used and should be avoided for portable, readable manuals.

Architecture

Some manuals, especially those in category 4 or 9, relate only to a particular hardware architecture. The specifier is particularly useful for machine-dependent category 9 manuals. These use the optional third argument of the Dt macro.

Dt TITLE category architecture

For a list of possible architectures, consult your local documentation. A safe example is i386, for 32-bit x86-based systems, or amd64, for 64-bit AMD systems. A device manual tied to a particular architecture uses this argument to note it explicitly. In normal manuals, this should not be used.

Operating System

Similar to architecture, some manuals only pertain to a particular operating system. This system may be specified as the argument of the Os macro in the prologue.

Os system

If system is unspecified, the manual is assumed to apply to the current operating system. The explicit form is useful when multiple operating systems have access to local-network administrative manuals, such as in a networked file-system environment. Otherwise, it is rarely used.
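Putting the date, title, category, architecture, and operating system together, a prologue for an architecture-specific device manual might look like the sketch below. The device name foo is hypothetical, and i386 is used only because it is the safe example named above.

```mdoc
.\" Hypothetical prologue for a device driver manual named foo.
.\" Dd comes first so that formatters recognise the mdoc format.
.Dd June 3, 1991
.\" Title in uppercase, category 4 (device interfaces), architecture i386.
.Dt FOO 4 i386
.\" No argument: the manual applies to the current system.
.Os
```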

Document Body

The document body begins with the first macro not in the prologue set (Dd, Dt, and Os). It consists of the manual content itself, and varies significantly between categories and, of course, with the material itself.

.Sh NAME

.Nm mdoc

.Nd mdoc language reference

.Sh DESCRIPTION

The

.Nm mdoc

language is used to format

.Bx

.Ux

manuals.

The content of the document body is divided into sections. Sections are indicated by the Sh macro.

Sh SECTION NAME
Text within the section...

As described in the introduction, a section consists of its line arguments and all subsequent lines until the end of file or another Sh macro. By convention, Sh arguments are capitalised. I'll describe conventional sections at length in the next chapter, as they for the most part follow long-standing document conventions.

In general, the document body requires at least the NAME and DESCRIPTION sections, and usually the SYNOPSIS section as well. The first section must be NAME, optionally followed by SYNOPSIS. The DESCRIPTION section must follow either the NAME or SYNOPSIS.
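As a minimal sketch of the rules above, the following manual has a complete prologue followed by the three sections in their required order. The hello utility and its -u flag are hypothetical, invented only to give the sections some content.

```mdoc
.\" Prologue: Dd first, then Dt and Os.
.Dd June 3, 1991
.Dt HELLO 1
.Os
.\" Document body: NAME must come first.
.Sh NAME
.Nm hello
.Nd print greeting messages
.\" SYNOPSIS optionally follows NAME.
.Sh SYNOPSIS
.Nm
.Op Fl u
.\" DESCRIPTION follows NAME or SYNOPSIS.
.Sh DESCRIPTION
The
.Nm
utility prints a greeting message.
```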

Layout An mdoc document body is divided into sections. The names and ordering of these sections is dictated by convention extending to the Version 1 AT&T UNIX Programmer's Manual. The name section repeats the entry name and gives a very short description of its purpose.

section repeats the entry name and gives a very short description of its purpose. The synopsis summarizes the use of the program being described. A few conventions are used, particularly in the Commands section. Underlined words are considered literals, and are typed just as they appear. Square brackets ([]) around an argument indicate that the argument is optional. When an argument is given as name, it always refers to a file name. Ellipses ... are used to show that the previous argument-prototype may be repeated. A final convention is used by the commands themselves. An argument beginning with a minus sign - is often taken to mean some sort of flag argument even if it appears in a position where a file name could appear. Therefore, it is unwise to have files whose names begin with -. The description section discusses in detail the subject at hand.

summarizes the use of the program being described. A few conventions are used, particularly in the Commands section. Underlined words are considered literals, and are typed just as they appear. Square brackets ([]) around an argument indicate that the argument is optional. When an argument is given as name, it always refers to a file name. Ellipses ... are used to show that the previous argument-prototype may be repeated. A final convention is used by the commands themselves. An argument beginning with a minus sign - is often taken to mean some sort of flag argument even if it appears in a position where a file name could appear. Therefore, it is unwise to have files whose names begin with -. The description section discusses in detail the subject at hand. The files section gives the names of files which are built into the program.

A see also section gives pointers to related information.

A diagnostics section discusses the diagnostics that may be produced. This section tends to be as terse as the diagnostics themselves.

The bugs section gives known bugs and sometimes deficiencies. Occasionally also the suggested fix is described.

The owner section gives the name of the person or persons to be consulted in case of difficulty. The rule has been that the last one to modify something owns it, so the owner is not necessarily the author.

These conventional sections haven't changed much over the years, although more sections have been added and several have changed with evolving UNIX operating system conventions. The full set of modern sections, and their order, is as follows.

NAME
    Name of all documented components and a collective description.
SYNOPSIS
    Calling syntax of the components.
DESCRIPTION
    Description of all components. This constitutes the bulk of the manual.
IMPLEMENTATION NOTES
    Specific notes on the implementation of a generic (e.g., standardised) component.
RETURN VALUES
    Return values, if the components are functions.
ENVIRONMENT
    Environment variables affecting the components' operation.
FILES
    Files affecting the components' operation.
EXIT STATUS
    Exit status, if the components are commands.
EXAMPLES
    Brief examples of invocation.
DIAGNOSTICS
    Error conditions, if a command or device driver.
ERRORS
    Error conditions, if a function or library.
SEE ALSO
    Links to other relevant manuals or references.
STANDARDS
    Implemented or referenced standards.
HISTORY
    A brief history of the components.
AUTHORS
    The authors of the components.
CAVEATS
    Caveats regarding the components' operation.
BUGS
    Known bugs in the components.
SECURITY CONSIDERAT