
Style guide

JKTKops edited this page Feb 22, 2022 · 9 revisions

Formatting

We use Fourmolu and EditorConfig to enforce style. Pull requests will be formatted automatically, but your editor should be configured to apply the formatters when a file is saved.

Line length

Lines should be no longer than 80 characters. In exceptional circumstances, lines may be as long as 100 characters.
Rationale:

  1. 80 characters is short enough to display two files side-by-side on one screen.
  2. Research in human vision indicates that 45-75 characters is the optimal length for a line of text. Every line break interrupts reading as the eyes search for the beginning of the next line. When the lines are too short, the interruptions are too frequent. When the lines are too long, the interruptions are less frequent but last longer. In either case, the reader expends more effort to read the same text.
  3. Source code is highly structured text. Long lines tend to obscure the structure by discouraging the use of the vertical dimension. Using vertical structure allows similar items to be spatially close.

Whitespace is not allowed at the end of lines.

Declarations

Pragmas

Top-level pragmas immediately follow the corresponding definition.

id :: a -> a
id x = x
{-# INLINE id #-}

Imports

Sort import declarations alphabetically.
Do not separate import declarations into groups.
Rationale: Grouping import declarations interferes with tools that manage them automatically.

Prefer postpositive qualified syntax: import Module qualified as M.

Prefer qualified imports for terms, except between closely related modules.

Prefer unqualified imports for types, except to disambiguate between similar names.
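
The import rules above can be sketched as follows. This is a minimal illustration, not code from the project: the function wordCounts is hypothetical, and the example assumes the containers package is available. Terms are imported qualified, the Map type is imported unqualified, and the declarations form one alphabetically sorted group.

```haskell
{-# LANGUAGE ImportQualifiedPost #-}

import Data.Char qualified as Char
import Data.List qualified as List
import Data.Map.Strict (Map)
import Data.Map.Strict qualified as Map

-- | Count how often each word occurs in a text, ignoring case.
-- (Hypothetical helper, used only to exercise the imports.)
wordCounts :: String -> Map String Int
wordCounts text =
    List.foldl' addWord Map.empty (words (map Char.toLower text))
  where
    addWord counts word = Map.insertWith (+) word 1 counts
```

At each use site, the qualifier (Char, List, Map) shows where a term comes from, while the type signature stays short because Map is unqualified.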

Names

Use CamelCase for names, except in the test suite where prefixes (e.g. test_) have a special meaning.

Names should be expressive and brief.
Rationale:

  1. Good names are no replacement for good documentation, but an expressive name can help recall the documentation.
  2. Names up to 10-12 characters long can be read with a single glance. Longer names take multiple eye movements to read.

Use full words instead of abbreviations, unless the abbreviation is much more common than the full word.
Rule of thumb: Write the abbreviation if you would say it (for example: HTTP, XML, JSON, HTML) but write the full word otherwise.

Don't capitalize all letters when using an abbreviation, except two-letter abbreviations which are the entire name (e.g. IO).
For example, write HttpServer instead of HTTPServer.
Rationale: It is important to preserve word boundaries in CamelCase so that the eye can find them easily.

Do not use prefixes on names to replace namespaces.
Instead, use a qualified import.
Rationale:

  1. Names do not need to be qualified in their native context. If the prefix is part of the name, it cannot be omitted.
  2. The eye first goes to the beginning of the name, 2-3 characters past the word boundary. If this part of the name is unique, then the name can usually be read in one glance. If the first part of the name is a common prefix shared by many identifiers, it may take multiple eye movements to read the name.

Do not use short names, like n, sk, and f, unless they are so abstract that it is difficult to be more precise.

Modules

Module names are singular, for example Data.Map not Data.Maps.

Name test modules after the module they test.
For example, the tests for a module Foo.Bar should be in the module Test.Foo.Bar.

Types

When a data type has only one constructor, the name of the constructor should be the same as the type.
For example,

data User = User Int String

Likewise, the constructor of a newtype should have the same name as the type.

The name of a newtype field should be unNewtype, where Newtype is the name of the type.
For newtypes of monads, the prefix should be run.
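
A minimal sketch of both newtype conventions. The types UserId and Counter and the function freshUserId are hypothetical names chosen for illustration, not types from the project.

```haskell
-- | Identifier of a user. A plain wrapper, so the field is 'unUserId'.
newtype UserId = UserId {unUserId :: Int}
    deriving (Eq, Show)

-- | A computation that threads an 'Int' counter. This is a newtype of a
-- monad, so the field uses the prefix @run@.
newtype Counter a = Counter {runCounter :: Int -> (a, Int)}

instance Functor Counter where
    fmap f (Counter step) =
        Counter $ \count ->
            let (value, count') = step count
             in (f value, count')

instance Applicative Counter where
    pure value = Counter $ \count -> (value, count)
    Counter stepFunction <*> Counter stepValue =
        Counter $ \count ->
            let (f, count') = stepFunction count
                (value, count'') = stepValue count'
             in (f value, count'')

instance Monad Counter where
    Counter step >>= next =
        Counter $ \count ->
            let (value, count') = step count
             in runCounter (next value) count'

-- | Allocate the next unused 'UserId' and advance the counter.
freshUserId :: Counter UserId
freshUserId = Counter $ \count -> (UserId count, count + 1)
```

With these names, unwrapping reads as unUserId userId and running reads as runCounter action 0.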

Record fields

Do not prefix record field names, but rely on DuplicateRecordFields and NamedFieldPuns to disambiguate.

Do not use record selector functions with unqualified names.
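
As a rough sketch of how these two rules combine: the records below share the field name name without any prefix, and the functions read fields through puns instead of selector functions. User, Project, and both describe functions are hypothetical.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE NamedFieldPuns #-}

-- | A person with an account.
data User = User
    { name :: String
    , age :: Int
    }

-- | A project and the user who owns it.
data Project = Project
    { name :: String
    , owner :: User
    }

-- | Render a user as @name (age)@.
describeUser :: User -> String
describeUser User{name, age} = name ++ " (" ++ show age ++ ")"

-- | Render a project together with its owner.
describeProject :: Project -> String
describeProject Project{name, owner} =
    name ++ ", owned by " ++ describeUser owner
```

The constructor in each pattern tells the compiler which name field is meant, so no prefix is needed.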

Tests

Practice test-driven development: every change to the code should be preceded by a failing test.

Write a test case for every user-facing message.

Documentation

Write complete sentences with correct capitalization and punctuation.

Top-level declarations

Write documentation for every top-level declaration (function, class, or type).
Describe the fields and constructors of every data type. Write an explicit type signature for every top-level function and describe the arguments.

Functions

-- | Send a message on a socket.  The socket must be in a connected
-- state.  Returns the number of bytes sent.  Applications are
-- responsible for ensuring that all data has been sent.
send ::
    -- | Connected socket
    Socket ->
    -- | Data to send
    ByteString ->
    -- | Returns number of bytes sent
    IO Int

For functions, the documentation should give enough information to apply the function without looking at the function's definition.

Data types

-- | Bla bla bla.
data Person = Person
    { age  :: !Int     -- ^ Age
    , name :: !String  -- ^ First name
    }

For fields that require longer comments, format them like so:

data Record = Record
    { -- | This is a very very very long comment that is split over
      -- multiple lines.
      field1 :: !Text

      -- | This is a second very very very long comment that is split
      -- over multiple lines.
    , field2 :: !Int
    }

End-of-line Comments

Separate end-of-line comments from the code using 2 spaces.

data Parser = Parser
    !Int         -- Current position
    !ByteString  -- Remaining input

foo :: Int -> Int
foo n = salt * 32 + 9
  where
    salt = 453645243  -- Magic hash salt.

Links

Use in-line links economically. You are encouraged to add links for API names, but it is not necessary to link every API name in a Haddock comment. We therefore recommend adding a link to an API name only if:

  • The user might actually want to click on it for more information (in your judgment), and
  • It is the first occurrence of that API name in the comment (don't bother repeating a link).

Writing code

Strictness

The Strict language feature is enabled by default. Don't use lazy fields or bindings unless the benefit can be demonstrated with a profiling report.
Rationale:

  1. There will always be programs that are too strict or too lazy. The profiling tools for too-strict programs are better than those for too-lazy programs, so we choose to be biased toward writing too-strict programs.
  2. Strict bindings preserve the strictness semantics of types, but lazy bindings do not: a strict binding of a lazy type is lazy, but a lazy binding of a strict type is not strict.
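
The second point can be illustrated with a small sketch, assuming a module compiled with the Strict extension. The names secondOfPair and orElse are hypothetical. The first definition shows that a strict binding of a lazy type (a tuple) forces only the outermost constructor; the second shows the explicit opt-out with a lazy pattern, which per the rule above should be backed by a profiling report in real code.

```haskell
{-# LANGUAGE Strict #-}

-- | A strict binding of a lazy type stays lazy inside: 'pair' is forced
-- only to its outermost constructor, so the first component never runs.
secondOfPair :: Int
secondOfPair =
    let pair = (error "first component forced" :: Int, 2 :: Int)
     in snd pair

-- | Return the wrapped value, or the fallback if there is none.
-- The fallback is marked lazy with @~@, so it is evaluated only in the
-- 'Nothing' case; without the @~@ it would be forced on every call.
orElse :: Maybe a -> a -> a
orElse maybeValue ~fallback =
    case maybeValue of
        Just value -> value
        Nothing -> fallback
```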

Point-free style

Avoid over-using point-free style. For example, this is hard to read:

-- Bad:
f = (g .) . h
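
For comparison, here is a sketch with concrete stand-ins for g and h (length and filter). The function names are hypothetical. Both definitions behave the same, but the second names its arguments and reads directly.

```haskell
-- | Point-free: the reader has to unfold two compositions to see that
-- this takes a predicate and a list.
countMatchesPointFree :: (a -> Bool) -> [a] -> Int
countMatchesPointFree = (length .) . filter

-- | Preferred: the same function with its arguments named.
countMatches :: (a -> Bool) -> [a] -> Int
countMatches predicate items = length (filter predicate items)
```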

Sum, product, and record types

Do not mix sum and record types. Prefer using record types instead of raw product types. Do not make product types where there could be any reasonable ambiguity about what the arguments mean. Do not make product types where it’s not clear what the arguments mean, regardless of ambiguity.

-- Bad:
data Exists = Exists Sort Sort Variable Pattern

-- non-commutative operator, non-obvious operand order.
-- sum expression from lower_limit to upper_limit
data Sum = Sum Exp Exp Exp

-- non-commutative operator, non-obvious operand order,
-- non-obvious operand meaning (i.e. many people wouldn't think
-- about the d(exp) part as being part of the integral).
-- integrate expression from lower_limit to upper_limit d(expression)
data Integral = Integral Exp Exp Exp Exp

-- Acceptable:
data Exists = Exists PatternSort VariableSort Variable Pattern

-- commutative operator, ambiguity does not matter
data Add = Add Exp Exp
-- non-commutative operator, but really obvious operand order
-- that everyone knows.
data Div = Div Exp Exp
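
One way to repair the Sum and Integral examples is to turn them into records, so that every operand is named. This is a sketch only: the Exp type and all field names below are hypothetical, and the shared field names assume DuplicateRecordFields as described under Record fields.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE NamedFieldPuns #-}

-- | Hypothetical expression type, standing in for @Exp@ above.
data Exp
    = Literal Int
    | Variable String
    deriving (Eq, Show)

-- | Sum of the summand from the lower limit to the upper limit.
data Sum = Sum
    { lowerLimit :: Exp
    , upperLimit :: Exp
    , summand :: Exp
    }
    deriving (Eq, Show)

-- | Integral of the integrand from the lower limit to the upper limit,
-- with respect to the differential.
data Integral = Integral
    { integrand :: Exp
    , lowerLimit :: Exp
    , upperLimit :: Exp
    , differential :: Exp
    }
    deriving (Eq, Show)

-- | The limits of a sum, read off by field name rather than by position.
sumLimits :: Sum -> (Exp, Exp)
sumLimits Sum{lowerLimit, upperLimit} = (lowerLimit, upperLimit)
```

Construction sites now spell out each operand, for example Sum{lowerLimit = Literal 1, upperLimit = Literal 10, summand = Variable "i"}, so the operand order can no longer be confused.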

Language extensions

The package description lists many extensions that are enabled throughout the project.

The following language extensions may be enabled on a per-module basis:

  • AllowAmbiguousTypes
  • PolyKinds
  • TemplateHaskell
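
Enabling one of these extensions for a single module means adding a LANGUAGE pragma at the top of that module. The sketch below uses AllowAmbiguousTypes. The class HasTypeName and its instances are hypothetical, and TypeApplications is listed explicitly only so that the example stands alone.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeApplications #-}

-- | Types with a human-readable name. The method does not mention @a@ in
-- its type, which is why this module needs AllowAmbiguousTypes.
class HasTypeName a where
    typeName :: String

instance HasTypeName Int where
    typeName = "Int"

instance HasTypeName Bool where
    typeName = "Bool"

-- | Callers select the instance with a type application.
describeTypes :: String
describeTypes = typeName @Int ++ " and " ++ typeName @Bool
```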

From

Use the class From to define (total) conversions between types. For example,

instance From String Text where
    from = Text.pack

Instances should be homomorphisms preserving some structure of the input. If the preserved structure is not obvious, please document it.

Always use at least one type application or annotation with from. Rationale: The signature of from is very generic and it is difficult to tell how it is being used when its type is inferred. The following examples are all acceptable:

-- acceptable: both types are explicit and adjacent to 'from'
toText :: String -> Text
toText = from

-- acceptable: uses type application
someFunction =
    let toText = from @String
    in _
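
The following self-contained sketch shows the same rule with a complete definition. The class declaration is an assumption about the shape of From (a two-parameter class with a single method), written here only so that the example compiles on its own; the instance and the function total are hypothetical.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeApplications #-}

-- | Assumed shape of the class: a total conversion from @source@ to
-- @target@. The project's actual definition may differ.
class From source target where
    from :: source -> target

-- | Every 'Int' is an 'Integer'; the conversion preserves ordering.
instance From Int Integer where
    from = toInteger

-- | Add up machine integers without overflow. Both type applications
-- make it clear which conversion 'from' performs here.
total :: [Int] -> Integer
total = sum . map (from @Int @Integer)
```

Without the type applications, a reader would have to infer the conversion from the signature of total and the types of sum and map.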

Pull requests

The history of a pull request should tell a reasonable story, rather than recording an exact history. At the tip of the pull request, the build must succeed with the --pedantic Stack option, i.e. with the -Wall -Werror GHC options. All tests must pass at the tip of the pull request.

If a pull request introduces a bug that is later fixed in the same pull request, the bug should be removed from history by squashing (using git rebase) its resolution into its introduction. The --fixup option to git commit, combined with git rebase --autosquash, is useful for squashing such fixes automatically. A bug may be preserved in the pull request history if it is especially tricky or demonstrates a problem in other code or tools; in this case the resolution must be rebased to immediately follow the commit that introduced the bug.

Pull requests will be squashed and merged after approval.

Every pull request will be reviewed by another member before merging. The reviewer will use the review checklist in the pull request template to evaluate the pull request.