2002-4-10 Haddock User Guide Simon Marlow
simonmar@microsoft.com
2002 Simon Marlow This document describes Haddock, a Haskell documentation tool.
Introduction This is Haddock, a tool for automatically generating documentation from annotated Haskell source code. Haddock was designed with several goals in mind: When documenting APIs, it is desirable to keep the documentation close to the actual interface or implementation of the API, preferably in the same file, to reduce the risk that the two become out of sync. Haddock therefore lets you write the documentation for an entity (function, type, or class) next to the definition of the entity in the source code. There is a tremendous amount of useful API documentation that can be extracted from just the bare source code, including types of exported functions, definitions of data types and classes, and so on. Haddock can therefore generate documentation from a set of straight Haskell 98 modules, and the documentation will contain precisely the interface that is available to a programmer using those modules. Documentation annotations in the source code should be easy on the eye when editing the source code itself, so as not to obscure the code and to make reading and writing documentation annotations easy. The easier it is to write documentation, the more likely the programmer is to do it. Haddock therefore uses lightweight markup in its annotations, taking several ideas from IDoc. In fact, Haddock can understand IDoc-annotated source code. The documentation should not expose any of the structure of the implementation, or to put it another way, the implementer of the API should be free to structure the implementation however he or she wishes, without exposing any of that structure to the consumer. In practical terms, this means that while an API may internally consist of several Haskell modules, we often only want to expose a single module to the user of the interface, where this single module just re-exports the relevant parts of the implementation modules.
Haddock therefore understands the Haskell module system and can generate documentation which hides not only non-exported entities from the interface, but also the internal module structure of the interface. A documentation annotation can still be placed next to the implementation, and it will be propagated to the external module in the generated documentation. Being able to move around the documentation by following hyperlinks is essential. Documentation generated by Haddock is therefore littered with hyperlinks: every type and class name is a link to the corresponding definition, and user-written documentation annotations can contain identifiers which are linked automatically when the documentation is generated. We might want documentation in multiple formats - online and printed, for example. Haddock comes with HTML and DocBook backends, and it is structured in such a way that adding new back-ends is straightforward.
Obtaining Haddock Distributions (source & binary) of Haddock can be obtained from its web site. Up-to-date sources can also be obtained from CVS. The Haddock sources are under fptools/haddock in the fptools CVS repository, which also contains GHC, Happy, and several other projects. See Using The CVS Repository for information on how to access the CVS repository. Note that you need to check out the fpconfig module first to get the generic build system (the fptools directory), and then check out fptools/haddock to get the Haddock sources.
License The following license covers this documentation, and the Haddock source code, except where otherwise indicated.
Copyright 2002, Simon Marlow. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Acknowledgements Several documentation systems provided the inspiration for Haddock, most notably IDoc, HDoc, and Doxygen, and probably several others I've forgotten. Thanks to the following people for useful feedback, discussion, patches, and moral support: Simon Peyton Jones, Mark Shields, Manuel Chakravarty, Ross Patterson, Brett Letner, the members of haskelldoc@haskell.org, and everyone who contributed to the many libraries that Haddock makes use of.
Invoking Haddock Haddock is invoked from the command line, like so:

    haddock option file

where each file is a filename containing a Haskell source module. All the modules specified on the command line will be processed together. When one module refers to an entity in another module being processed, the documentation will link directly to that entity. Entities that cannot be found, for example because they are in a module that isn't being processed as part of the current batch, simply won't be hyperlinked in the generated documentation. Haddock will emit warnings listing all the identifiers it couldn't resolve. The modules should not be mutually recursive, as Haddock doesn't like swimming in circles. The following options are available: Output documentation in SGML DocBook format. NOTE: at time of writing this is only partially implemented and doesn't work. Generate documentation in HTML format. Several files will be generated into the current directory (or the specified directory if the option is given), including the following: index.html The top level page of the documentation: lists the modules available, using indentation to represent the hierarchy if the modules are hierarchical. haddock.css The stylesheet used by the generated HTML. Feel free to modify this to change the colors or layout, or even specify your own stylesheet using the option. module.html An HTML page for each module. doc-index.html doc-index-XX.html The index, split into two (functions/constructors and types/classes, as per Haskell namespaces) and further split alphabetically. dir =dir Generate files into dir instead of the current directory. URL =URL Include links to the source files in the generated documentation, where URL is the base URL where the source files can be found. title =title Use title as the page heading for each page in the documentation. This will normally be the name of the library being documented. The title should be a plain string (no markup please!).
Reserved for future expansion. =filename Specify a stylesheet to use instead of the default one that comes with Haddock. It should specify certain classes: see the default stylesheet for details. Documentation and Markup Haddock understands special documentation annotations in the Haskell source file and propagates these into the generated documentation. The annotations are purely optional: if there are no annotations, Haddock will just generate documentation that contains the type signatures, data type declarations, and class declarations exported by each of the modules being processed.
Documenting a top-level declaration The simplest example of a documentation annotation is for documenting any top-level declaration (function type signature, type declaration, or class declaration). For example, if the source file contains the following type signature:

    square :: Int -> Int
    square x = x * x

Then we can document it like this:

    -- |The 'square' function squares an integer.
    square :: Int -> Int
    square x = x * x

The -- | syntax begins a documentation annotation, which applies to the following declaration in the source file. Note that the annotation is just a comment in Haskell; it will be ignored by the Haskell compiler. The declaration following a documentation annotation should be one of the following: a type signature for a top-level function, a data declaration, a newtype declaration, a type declaration, or a class declaration. If the annotation is followed by a different kind of declaration, it will probably be ignored by Haddock. Some people like to write their documentation after the declaration; this is possible in Haddock too:

    square :: Int -> Int
    -- ^The 'square' function squares an integer.
    square x = x * x

Note that Haddock doesn't contain a Haskell type system: if you don't write the type signature for a function, then Haddock can't tell what its type is and it won't be included in the documentation. Documentation annotations may span several lines; the annotation continues until the first non-comment line in the source file. For example:

    -- |The 'square' function squares an integer.
    -- It takes one argument, of type 'Int'.
    square :: Int -> Int
    square x = x * x

You can also use Haskell's nested-comment style for documentation annotations, which is sometimes more convenient when using multi-line comments:

    {-|
      The 'square' function squares an integer.
      It takes one argument, of type 'Int'.
    -}
    square :: Int -> Int
    square x = x * x
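As a rough sketch of the kinds of declaration listed above, the following fragment attaches a -- | annotation to one declaration of each kind. The names Celsius, Reading, Label, Describable, describe and toFahrenheit are hypothetical and chosen only for illustration; they do not come from Haddock itself.

```haskell
-- A sketch only: every name below is hypothetical, chosen for illustration.

-- | A temperature in degrees Celsius (a newtype declaration).
newtype Celsius = Celsius Double
  deriving (Eq, Show)

-- | A labelled sensor reading (a data declaration).
data Reading = Reading Label Celsius
  deriving (Eq, Show)

-- | A human-readable sensor name (a type declaration).
type Label = String

-- | Things that can be summarised as text (a class declaration).
class Describable a where
  describe :: a -> String

-- Instances are not in the list of documentable declarations, so this
-- one carries an ordinary comment rather than an annotation.
instance Describable Reading where
  describe (Reading label (Celsius t)) = label ++ ": " ++ show t

-- | Convert a temperature to degrees Fahrenheit (a type signature for
-- a top-level function).
toFahrenheit :: Celsius -> Double
toFahrenheit (Celsius t) = t * 9 / 5 + 32
```

Because the annotations are ordinary Haskell comments, the fragment compiles unchanged whether or not Haddock is ever run over it.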
Documenting parts of a declaration In addition to documenting the whole declaration, in some cases we can also document individual parts of the declaration.
Class methods Class methods are documented in the same way as top level type signatures, by using either the -- | or -- ^ annotations:

    class C a where
       -- | This is the documentation for the 'f' method
       f :: a -> Int
       -- | This is the documentation for the 'g' method
       g :: Int -> a

Note that in Haddock documentation annotations are first-class syntactic objects that are subject to the same layout rules as other syntactic objects; thus in the example class declaration above the documentation annotations must begin in the same column as the method signatures. If you use explicit layout, then don't forget the semi-colon after each documentation comment (but don't put the semi-colon on the same line as the documentation comment, because it will be interpreted as part of the documentation!).
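The text above mentions the -- ^ form for class methods without showing it. A minimal sketch, assuming a hypothetical Shape class with a Square instance (neither name comes from Haddock), might look like this; note that each annotation starts in the same column as the method signatures, in line with the layout rule just described.

```haskell
-- A sketch only: Shape, Square, area and perimeter are hypothetical names.

-- | Two-dimensional shapes with a measurable size.
class Shape a where
  area :: a -> Double
  -- ^ The area enclosed by the shape.
  perimeter :: a -> Double
  -- ^ The total length of the shape's boundary.

-- | A square, described by the length of one side.
newtype Square = Square Double

instance Shape Square where
  area (Square s) = s * s
  perimeter (Square s) = 4 * s
```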
Constructors and record fields Constructors are documented like so:

    data T a b
      = -- | This is the documentation for the 'C1' constructor
        C1 a b
      | -- | This is the documentation for the 'C2' constructor
        C2 a b

or like this:

    data T a b
      = C1 a b  -- ^ This is the documentation for the 'C1' constructor
      | C2 a b  -- ^ This is the documentation for the 'C2' constructor

Record fields are documented using one of these styles:

    data R a b =
      C { -- | This is the documentation for the 'a' field
          a :: a,
          -- | This is the documentation for the 'b' field
          b :: b
        }

    data R a b =
      C { a :: a  -- ^ This is the documentation for the 'a' field
                  -- (NOTE: *before* the following comma)
        , b :: b  -- ^ This is the documentation for the 'b' field
        }
Function arguments Individual arguments to a function may be documented like this:

    f  :: Int      -- ^ The 'Int' argument
       -> Float    -- ^ The 'Float' argument
       -> IO ()    -- ^ The return value
The module description A module may contain a documentation comment before the module header, in which case this comment is interpreted by Haddock as an overall description of the module itself, and placed in a section entitled Description in the documentation for the module. For example:

    -- | This is the description for module "Foo"
    module Foo where
    ...
Controlling the documentation structure Haddock produces interface documentation that lists only the entities actually exported by the module. The documentation for a module will include all entities exported by that module, even if they were re-exported by another module. The only exception is when Haddock can't see the declaration for the re-exported entity, perhaps because it isn't part of the batch of modules currently being processed. However, to Haddock the export list has even more significance than just specifying the entities to be included in the documentation. It also specifies the order that entities will be listed in the generated documentation. This leaves the programmer free to implement functions in any order he/she pleases, and indeed in any module he/she pleases, but still specify the order that the functions should be documented in the export list. Indeed, many programmers already do this: the export list is often used as a kind of ad-hoc interface documentation, with headings, groups of functions, type signatures and declarations in comments. You can insert headings and sub-headings in the documentation by including annotations at the appropriate point in the export list. For example:

    module Foo (
      -- * Classes
      C(..),
      -- * Types
      -- ** A data type
      T,
      -- ** A record
      R,
      -- * Some functions
      f, g
      ) where

Headings are introduced with the syntax -- *, -- ** and so on, where the number of *s indicates the level of the heading (section, sub-section, sub-sub-section, etc.). If you use section headings, then Haddock will generate a table of contents at the top of the module documentation for you.
Re-exporting an entire module Haskell allows you to re-export the entire contents of a module (or at least, everything currently in scope that was imported from a given module) by listing it in the export list:

    module A (
      module B,
      module C
     ) where

What will the Haddock-generated documentation for this module look like? Well, it depends on how the modules B and C are imported. If they are imported wholly and without any hiding qualifiers, then the documentation will just contain a cross-reference to the documentation for B and C. However, if the modules are not completely re-exported, for example:

    module A (
      module B,
      module C
     ) where

    import B hiding (f)
    import C (a, b)

then Haddock behaves as if the set of entities re-exported from B and C had been listed explicitly in the export list. (NOTE: this is not fully implemented at the time of writing (version 0.2). At the moment, Haddock always inserts a cross-reference.) The exception to this rule is when the re-exported module is declared with the hide attribute (see Module Attributes), in which case the module is never cross-referenced; the contents are always expanded in place in the re-exporting module.
Omitting the export list If there is no export list in the module, how does Haddock generate documentation? Well, when the export list is omitted, e.g.: module Foo where this is equivalent to an export list which mentions every entity defined at the top level in this module, and Haddock treats it in the same way. Furthermore, the generated documentation will retain the order in which entities are defined in the module. In this special case the module body may also include section headings (normally they would be ignored by Haddock).
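As a hedged illustration of the special case described above, the sketch below omits the export list and places section headings directly in the module body. The module is named Main, rather than Foo, only so that the fragment can be run as a program; the helper double and the headings are hypothetical.

```haskell
-- A sketch only: the module contents are hypothetical.
module Main where

-- * Arithmetic helpers

-- | Doubles its argument.
double :: Int -> Int
double x = x + x

-- * Entry point

-- | Prints the result of doubling 21.
main :: IO ()
main = print (double 21)
```

With no export list, both double and main are exported, and the generated documentation would be expected to list them in this order under the two headings.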
Named chunks of documentation Occasionally it is desirable to include a chunk of documentation which is not attached to any particular Haskell declaration. There are two ways to do this:

The documentation can be included in the export list directly, e.g.:

    module Foo (
       -- * A section heading

       -- | Some documentation not attached to a particular Haskell entity
       ...
     ) where

If the documentation is large and placing it inline in the export list might bloat the export list and obscure the structure, then it can be given a name and placed out of line in the body of the module. This is achieved with a special form of documentation annotation -- $:

    module Foo (
       -- * A section heading

       -- $doc
       ...
     ) where

    -- $doc
    -- Here is a large chunk of documentation which may be referred to by
    -- the name $doc.

The documentation chunk is given a name, which is the sequence of alphanumeric characters directly after the -- $, and it may be referred to by the same name in the export list.
Hyperlinking and re-exported entities When Haddock renders a type in the generated documentation, it hyperlinks all the type constructors and class names in that type to their respective definitions. But for a given type constructor or class there may be several modules re-exporting it, and therefore several modules whose documentation contains the definition of that type or class (possibly including the current module!), so which one do we link to? Let's look at an example. Suppose we have three modules A, B and C defined as follows:

    module A (T) where
    data T a = C a

    module B (f) where
    import A
    f :: T Int -> Int
    f (C i) = i

    module C (T, f) where
    import A
    import B

Module A exports a datatype T. Module B imports A and exports a function f whose type refers to T: the hyperlink in f's signature will point to the definition of T in the documentation for module A. Now, module C exports both T and f. We have a choice about where to point the hyperlink to T in f's type: either the definition exported by module C or the definition exported by module A. Haddock takes the view that in this case pointing to the definition in C is better, because the library author might not wish to expose A to the user at all: A might be a module internal to the implementation of the library in which C is the external interface, so linking to definitions in the current module is preferable to linking to an imported module. The general rule is this: when attempting to link an instance of a type constructor or class to its definition, the link is made to the current module if the current module exports the relevant definition, or otherwise to the module that the entity was imported from. If the entity was imported via multiple routes, then Haddock picks the module listed earliest in the imports of the current module.
Module Attributes Certain attributes may be specified for each module that affect the way Haddock generates documentation for that module. Attributes are specified in a comma-separated list in a -- # (or {- # ... -}) comment at the top of the module, either before or after the module description. For example:

    -- #hide, prune, ignore-exports

    -- |Module description
    module A where
    ...

The following attributes are currently understood by Haddock:

hide: Omit this module from the generated documentation, but nevertheless propagate definitions and documentation from within this module to modules that re-export those definitions.

prune: Omit definitions that have no documentation annotations from the generated documentation.

ignore-exports: Ignore the export list. Generate documentation as if the module had no export list, i.e. all the top-level declarations are exported, and section headings may be given in the body of the module.
Markup Haddock understands certain textual cues inside documentation annotations that tell it how to render the documentation. The cues (or markup) have been designed to be simple and mnemonic in ASCII so that the programmer doesn't have to deal with heavyweight annotations when editing documentation comments.
Paragraphs One or more blank lines separates two paragraphs in a documentation comment.
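A minimal sketch of the paragraph rule, using a hypothetical function clampByte: inside a line-comment annotation, a line containing only -- serves as the blank line between paragraphs.

```haskell
-- | Clamps a value to the range from 0 to 255.
--
-- This second paragraph is separated from the first by a blank comment
-- line, so it should be rendered as a paragraph of its own.
clampByte :: Int -> Int
clampByte = max 0 . min 255
```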
Special characters The following characters have special meanings in documentation comments: /, ', `, ", @, <. To insert a literal occurrence of one of these special characters, precede it with a backslash (\). Additionally, the character > has a special meaning at the beginning of a line, and the following characters have special meanings at the beginning of a paragraph: *, -. These characters can also be escaped using \.
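The escaping rule can be sketched as follows; the function ratio is hypothetical and exists only to give the annotation something to attach to. Each backslash in the comment is meant to make the following character appear literally in the rendered documentation.

```haskell
-- | Computes the ratio a\/b of its two arguments.
--
-- The backslash keeps the slash from being read as the start of
-- emphasis. Other special characters are escaped the same way, for
-- example \@ and \<.
--
-- \* This paragraph begins with a literal asterisk, not a bullet.
ratio :: Double -> Double -> Double
ratio a b = a / b
```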
Code Blocks Displayed blocks of code are indicated by surrounding a paragraph with @...@ or by preceding each line of a paragraph with > (we often call these “bird tracks”). For example:

    -- | This documentation includes two blocks of code:
    --
    -- @
    --     f x = x + x
    -- @
    --
    -- >  g x = x * 42
There is an important difference between the two forms of code block: in the bird-track form, the text to the right of the ‘>’ is interpreted literally, whereas the @...@ form interprets markup as normal inside the code block.
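To make the difference concrete, here is a small sketch using a hypothetical function increment. The same text appears in both kinds of block; by the rule above, the quoted name should become a hyperlink only in the @...@ form.

```haskell
-- | Adds one to its argument.
--
-- In the delimited form, markup is still interpreted, so the quoted
-- name below should become a hyperlink to 'increment':
--
-- @
-- 'increment' 41
-- @
--
-- In the bird-track form, the text is taken literally, quotes included:
--
-- > 'increment' 41
increment :: Int -> Int
increment x = x + 1
```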
Hyperlinked Identifiers Referring to a Haskell identifier, whether it be a type, class, constructor, or function, is done by surrounding it with single quotes:

    -- | This module defines the type 'T'.

If there is an entity T in scope in the current module, then the documentation will hyperlink the reference in the text to the definition of T (if the output format supports hyperlinking, of course; in a printed format it might instead insert a page reference to the definition). It is also possible to refer to entities that are not in scope in the current module, by giving the fully qualified name of the entity:

    -- | The identifier 'M.T' is not in scope

If M.T is not otherwise in scope, then Haddock will simply emit a link pointing to the entity T exported from module M (without checking to see whether either M or M.T exist). To make life easier for documentation writers, a quoted identifier is only interpreted as such if the quotes surround a lexically valid Haskell identifier. This means, for example, that it normally isn't necessary to escape the single quote when used as an apostrophe:

    -- | I don't have to escape my apostrophes; great, isn't it?

For compatibility with other systems, the following alternative form of markup is accepted: `T'. (We chose not to use this as the primary markup for identifiers because, strictly speaking, the ` character should not be used as a left quote; it is a grave accent.)
Emphasis and Monospaced text Emphasis may be added by surrounding text with /.../. Monospaced (or typewriter) text is indicated by surrounding it with @...@. Other markup is valid inside a monospaced span: for example @'f' a b@ will hyperlink the identifier f inside the code fragment.
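A short sketch combining both kinds of inline markup, attached to a hypothetical function largest. The quoted identifier inside the monospaced span should still be hyperlinked, as described above.

```haskell
-- | Returns the larger of two values.
--
-- The result is /always/ one of the two arguments. For example,
-- @'largest' 3 7@ evaluates to @7@.
largest :: Int -> Int -> Int
largest a b = if a >= b then a else b
```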
Linking to modules Linking to a module is done by surrounding the module name with double quotes: -- | This is a reference to the "Foo" module.
Itemized and Enumerated lists A bulleted item is represented by preceding a paragraph with either * or -. A sequence of bulleted paragraphs is rendered as an itemized list in the generated documentation, e.g.:

    -- | This is a bulleted list:
    --
    --     * first item
    --
    --     * second item

An enumerated list is similar, except each paragraph must be preceded by either (n) or n. where n is any integer, e.g.:

    -- | This is an enumerated list:
    --
    --     (1) first item
    --
    --     2. second item
URLs A URL can be included in a documentation comment by surrounding it in angle brackets: <...>. If the output format supports it, the URL will be turned into a hyperlink when rendered.
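As a minimal sketch, a URL in angle brackets might appear in an annotation like this. The function greet is hypothetical, and the address is used purely as an example of the syntax.

```haskell
-- | Greets the reader by name.
--
-- More about the language can be found at <http://www.haskell.org/>.
greet :: String -> String
greet name = "Hello, " ++ name ++ "!"
```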