Programming guidelines shall help to make the code of a project better readable and maintainable by the varying number of contributors.
It takes some programming experience to develop something like a personal "coding style", and guidelines only serve as a rough shape for code. Guidelines should be followed by all members working on the project even if they prefer (or are already used to) different guidelines.
These guidelines were originally set up for the hets-project and are now being put on the HaskellWiki, gradually integrating parts of the old hawiki entries ThingsToAvoid and HaskellStyle (hopefully not hurting someone's copyrights). The other related entry, TipsAndTricks, treats more specific points that are left out here.
Surely some style choices are a bit arbitrary (or "religious") and too restrictive with respect to language extensions. Nevertheless, I hope to keep up these guidelines (at least as a basis) for our project in order to avoid maintaining diverging guidelines. Of course, I want to supply reasons (partly tool-dependent) for certain decisions and also show alternatives by possibly bad examples. At the time of writing I use ghc-6.4.1, haddock-0.7 and (GNU) Emacs with the latest haskell mode.
The following quote and links are taken from HaskellStyle:
Some comments from the GHC team about their internal coding standards can be found at http://hackage.haskell.org/trac/ghc/wiki/WorkingConventions
Also, https://simon.peytonjones.org/publications-2000/#wearing-the-hair-shirt-a-retrospective-on-haskell-2003 contains some brief comments on syntax and style.
What now follows are descriptions of program documentation, file format, naming conventions and good programming practice (adapted from Matt's C/C++ Programming Guidelines and the Linux kernel coding style).
{- |
Module      :  <File name or $Header$ to be replaced automatically>
Description :  <optional short text displayed on contents page>
Copyright   :  (c) <Authors or Affiliations>
License     :  <license>

Maintainer  :  <email>
Stability   :  unstable | experimental | provisional | stable | frozen
Portability :  portable | non-portable (<reason>)

<module description starting at first column>
-}
… $Header$ entry will be automatically expanded.)
… {-# LANGUAGE CPP #-}) may precede this header. The following hierarchical module name must, of course, match the file name.
(custom-set-variables '(indent-tabs-mode nil))
… #endif without newline). Emacs usually asks for a final newline.
Please have a look at the Haddock module header documentation.
… "\\" at the end of a line causes CPP preprocessor problems.)
… \ t -> … instead of \t -> ….
… do, let, where, and case … of …. Make sure that renamings don't destroy your layout. (If you get too far to the right, the code is unreadable anyway and needs to be decomposed.)
case foo of Foo -> "Foo"
            Bar -> "Bar"
case <longer expression> of
  Foo -> "Foo"
  Bar -> "Bar"
… error with a fixed string "<ModuleName>.<function>" to indicate the error position (in case the impossible should happen). Don't invest time to "show" the offending value; only do this temporarily when debugging the code.
… head should be used with care or (even better) be made obsolete by a case statement.
… isJust and fromJust from Data.Maybe) can be avoided by using the maybe function:
maybe (error "<ModuleName>.<function>") id $ Map.lookup key map
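As a minimal sketch of the points above, the following replaces head by a case expression and uses maybe with a fixed error string for a lookup that should never fail. The names firstOr and lookupOrDie, and the module name Example in the error string, are hypothetical and only serve the illustration.

```haskell
import qualified Data.Map as Map

-- A case expression makes the empty-list case explicit instead of using head.
firstOr :: a -> [a] -> a
firstOr def l = case l of
  [] -> def
  x : _ -> x

-- The fixed string names the (hypothetical) module and function,
-- so the error position is known if the impossible should happen.
lookupOrDie :: Map.Map String Int -> String -> Int
lookupOrDie m key =
  maybe (error "Example.lookupOrDie") id $ Map.lookup key m
```

Both functions avoid the isJust/fromJust pair; the caller either supplies a default or gets an error that points at one place in the code.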
Do avoid mixing and nesting let and where. (I prefer the expression-stylistic let.) Use auxiliary top-level functions that you do not export. Export lists also support the detection of unused functions.
If you notice that you're doing the same task again, try to generalize it in order to avoid duplicate code. It is frustrating to change the same error in several places.
Many parentheses can be eliminated using the infix application operator $ with lowest priority. Try at least to avoid unnecessary parentheses in standard infix expressions.
f x : g x ++ h x
a == 1 && b == 1 || a == 0 && b == 0
Rather than putting a large final argument in parentheses (with a distant closing one), consider using $ instead.
f (g x) becomes f $ g x, and consecutive applications f (g (h x)) can be written as f $ g $ h x or f . g $ h x. A function definition like f x = g $ h x can be abbreviated to f = g . h.
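The rewriting steps just described can be checked on a small concrete pipeline. This is only a sketch; the function names are made up for the example, and all four definitions are meant to denote the same function.

```haskell
import Data.Char (toUpper)

shoutNested, shoutDollar, shoutComposed, shoutPointFree :: String -> String

-- Nested parentheses with a distant closing one.
shoutNested s = reverse (map toUpper (filter (/= ' ') s))

-- The same with consecutive applications of ($).
shoutDollar s = reverse $ map toUpper $ filter (/= ' ') s

-- Composition first, a single ($) before the final argument.
shoutComposed s = reverse . map toUpper $ filter (/= ' ') s

-- Dropping the argument entirely: f = g . h.
shoutPointFree = reverse . map toUpper . filter (/= ' ')
```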
Note that the final argument may even be an infix or case-expression:
map id $ c : l
filter (const True) . map id $ case l of …
However, be aware that $-terms cannot be composed further in infix expressions.
Probably wrong:
f $ x ++ g $ x
But the scope of an expression is also limited by the layout rule, so it is usually safe to use $ on right-hand sides:
do f $ l
++
do g $ l
Of course, $ cannot be used in types. GHC also has some primitive functions involving the kind # that cannot be applied using $.
Last warning: always leave spaces around $ (and other mixfix operators) since a clash with Template Haskell is possible.
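To see why the "probably wrong" expression misbehaves: $ is right-associative with the lowest precedence, so f $ x ++ g $ x is read as f $ ((x ++ g) $ x). The sketch below uses two made-up list functions f and g to show the intended meaning and a parenthesized form that keeps $.

```haskell
f, g :: [Int] -> [Int]
f = map (+ 1)
g = map (* 2)

-- Intended meaning: apply f and g separately, then append the results.
intended :: [Int] -> [Int]
intended x = f x ++ g x

-- If ($) is kept, each application needs its own parentheses.
withParens :: [Int] -> [Int]
withParens x = (f $ x) ++ (g $ x)

-- By contrast,  f $ x ++ g $ x  is parsed as  f $ ((x ++ g) $ x),
-- which tries to append the function g to the list x and is rejected
-- by the type checker for these definitions.
```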
Use list comprehensions only when "short and sweet". Prefer map, filter, and foldr!
Instead of:
[toUpper c | c <- s]
write:
map toUpper s
Consider:
[toUpper c | s <- strings, c <- s]
Here it takes some time for the reader to find out which value depends on what other value, and it is not so clear how many times the interim values s and c are used.
In contrast, the following can't be clearer:
map toUpper (concat strings)
When using higher-order functions you can switch more easily to data structures other than lists. Compare:
map (1+) list
and:
Set.map (1+) set
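The comparison above can be put into one small file. This is a sketch with hypothetical function names; it shows that the comprehension and the map/concat version agree, and that moving from lists to sets only changes the qualifier.

```haskell
import Data.Char (toUpper)
import qualified Data.Set as Set

-- Nested comprehension: the reader has to trace s and c.
upperComprehension :: [String] -> String
upperComprehension strings = [toUpper c | s <- strings, c <- s]

-- Higher-order version of the same computation.
upperMap :: [String] -> String
upperMap strings = map toUpper (concat strings)

-- The list version ...
incList :: [Int] -> [Int]
incList = map (1 +)

-- ... and the set version differ only in the qualified name.
incSet :: Set.Set Int -> Set.Set Int
incSet = Set.map (1 +)
```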
For (large) records avoid the use of the constructor directly and remember that the order and number of fields may change.
Take care with (the rare case of) dependent polymorphic fields:
data Fields a = VariantWithTwo { field1 :: a , field2 :: a }
The type of a value v cannot be changed by only setting field1:
v { field1 = f }
Better construct a new value:
VariantWithTwo { field1 = f } -- leaving field2 undefined
Or use a polymorphic element that is instantiated by updating:
empty = VariantWithTwo { field1 = [], field2 = [] }
empty { field1 = [f] }
Several variants with identical fields may avoid some code duplication when selecting and updating, though possibly not in a few dependent polymorphic cases.
However, I doubt if the following is a really good alternative to the data Mode with data BoxOrDiamond.
data Mode f p = Box { formula :: f, positions :: p }
              | Diamond { formula :: f, positions :: p }
… FilePath, String or [Char].)
data Mode f p = Box f p | Diamond f p
data BoxOrDiamond = Box | Diamond
data Mode f p = Mode BoxOrDiamond f p
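The remarks on record updates with dependent polymorphic fields can be tried out as follows. This is a sketch only: the helper names bumpFirst, toStrings, emptyFields and withChar are invented for the example, and emptyFields plays the role of the polymorphic empty value mentioned in the text.

```haskell
-- Both fields share the type parameter, so their types are tied together.
data Fields a = VariantWithTwo
  { field1 :: a
  , field2 :: a
  } deriving (Eq, Show)

-- Updating only field1 keeps the type: the new value must again be an Int.
bumpFirst :: Fields Int -> Fields Int
bumpFirst v = v { field1 = field1 v + 1 }

-- Changing the type requires setting every field that mentions the parameter.
-- Writing  v { field1 = show (field1 v) }  alone is a type error.
toStrings :: Fields Int -> Fields String
toStrings v = v { field1 = show (field1 v), field2 = show (field2 v) }

-- A polymorphic starting value is instantiated by the update.
emptyFields :: Fields [b]
emptyFields = VariantWithTwo { field1 = [], field2 = [] }

withChar :: Fields String
withChar = emptyFields { field1 = "f" }
```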
data Tuple a b = Tuple a b | Undefined
data Tuple a b = Tuple a b
Maybe (Tuple a b)
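Reading the declarations above as "keep the undefined case out of the data type and wrap it in Maybe instead", a use of the second form might look like the sketch below. The helpers headPair and firstOf are hypothetical and not part of the original guidelines.

```haskell
-- The data type itself has no "undefined" constructor.
data Tuple a b = Tuple a b
  deriving (Eq, Show)

-- The possibly missing result is expressed with Maybe at the use site.
headPair :: [a] -> [b] -> Maybe (Tuple a b)
headPair (x : _) (y : _) = Just (Tuple x y)
headPair _ _ = Nothing

-- Standard functions for Maybe, such as fmap, then come for free.
firstOf :: Maybe (Tuple a b) -> Maybe a
firstOf = fmap (\ (Tuple x _) -> x)
```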
Try to strictly separate monadic I/O and pure (without do) function programming (possibly via separate modules).
x <- return y ...
let x = y ...
Don't use Prelude.interact, and make sure your program does not depend on the (not always obvious) order of evaluation; e.g., don't read from and write to the same file!
This will fail:
do s <- readFile f
   writeFile f $ 'a' : s
because of lazy I/O! (Writing starts before reading has finished.)
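If the same file really has to be read and then rewritten, one common workaround is to force the whole input before opening the file for writing. The following is a minimal sketch under that assumption; the function name prependA is made up, and other approaches (such as strict I/O libraries) are not shown.

```haskell
-- Forcing the length reads the file to its end, which closes the read
-- handle before writeFile opens the same file for writing.
prependA :: FilePath -> IO ()
prependA f = do
  s <- readFile f
  length s `seq` writeFile f ('a' : s)
```

This keeps the whole file contents in memory, so it is only suitable for files of moderate size.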
Standard library modules like Char, List, Maybe, Monad, etc. should be imported by their hierarchical module name, i.e. from the base package (so that haddock finds them):
import Data.List
import Control.Monad
import System.Environment
The libraries for Set and Map are to be imported qualified:
import qualified Data.Set as Set
import qualified Data.Map as Map
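A short sketch of why the qualified imports pay off: many Set and Map functions share their names with Prelude functions, and the qualifier keeps them apart. The helpers wordCounts and frequentWords are hypothetical examples.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Count how often each word occurs.
wordCounts :: [String] -> Map.Map String Int
wordCounts = foldr (\ w -> Map.insertWith (+) w 1) Map.empty

-- Map.filter does not clash with the Prelude's filter on lists.
frequentWords :: Int -> Map.Map String Int -> Set.Set String
frequentWords n = Map.keysSet . Map.filter (>= n)
```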
Stay away from extensions as long as possible. Also use classes with care because soon the desire for overlapping instances (like for lists and strings) may arise. Then you may want MPTC (multi-parameter type classes), functional dependencies (FD), undecidable and possibly incoherent instances and then you are "in the wild" (according to SPJ).
Tracing is for debugging purposes only and should not be used as feedback for the user. Clean code is not cluttered by trace calls.
Despite guidelines, writing "correct code" (without formal proof support yet) still remains the major challenge. As motivation to follow these guidelines consider the points that are from the "C++ Coding Standard", where I replaced "C++" with "Haskell".
Good Points:
Bad Points: