Authors: | Andreas Rumpf, Zahary Karadjov |
---|---|
Version: | 2.2.4 |
"Complexity" seems to be a lot like "energy": you can transfer it from the end-user to one/some of the other players, but the total amount seems to remain pretty much constant for a given task. -- Ran
Note: This document is a draft! Several of Nim's features may need more precise wording. This manual is constantly evolving into a proper specification.
Note: The experimental features of Nim are covered here.
Note: Assignments, moves, and destruction are specified in the destructors document.
This document describes the lexis, the syntax, and the semantics of the Nim language.
To learn how to compile Nim programs and generate documentation see the Compiler User Guide and the DocGen Tools Guide.
The language constructs are explained using an extended BNF, in which (a)* means 0 or more a's, a+ means 1 or more a's, and (a)? means an optional a. Parentheses may be used to group elements.
& is the lookahead operator; &a means that an a is expected but not consumed. It will be consumed in the following rule.
The | and / symbols are used to mark alternatives and have the lowest precedence. / is the ordered choice that requires the parser to try the alternatives in the given order. / is often used to ensure the grammar is not ambiguous.
Non-terminals start with a lowercase letter, abstract terminal symbols are in UPPERCASE. Verbatim terminal symbols (including keywords) are quoted with '. An example:
ifStmt = 'if' expr ':' stmts ('elif' expr ':' stmts)* ('else' stmts)?
The binary ^* operator is used as a shorthand for 0 or more occurrences separated by its second argument; likewise ^+ means 1 or more occurrences: a ^+ b is short for a (b a)* and a ^* b is short for (a (b a)*)?. Example:
arrayConstructor = '[' expr ^* ',' ']'
Other parts of Nim, like scoping rules or runtime semantics, are described informally.
Nim code specifies a computation that acts on a memory consisting of components called locations. A variable is basically a name for a location. Each variable and location is of a certain type. The variable's type is called static type, the location's type is called dynamic type. If the static type is not the same as the dynamic type, it is a super-type or subtype of the dynamic type.
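The distinction between static and dynamic type can be illustrated with a small sketch. The types Animal and Dog are hypothetical names introduced only for this example; they are not part of the manual or the standard library.

```nim
type
  Animal = ref object of RootObj   # hypothetical base type
  Dog = ref object of Animal       # hypothetical subtype of Animal

# The variable's static type is Animal; the location it refers to
# was created as a Dog, so its dynamic type is Dog.
let pet: Animal = Dog()
let plain: Animal = Animal()

doAssert pet of Dog            # runtime test against the dynamic type
doAssert not (plain of Dog)    # same static type, different dynamic type
```

Here both variables share the static type Animal, and only the `of` test at runtime tells the two locations apart.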
An identifier is a symbol declared as a name for a variable, type, procedure, etc. The region of the program over which a declaration applies is called the scope of the declaration. Scopes can be nested. The meaning of an identifier is determined by the smallest enclosing scope in which the identifier is declared unless overloading resolution rules suggest otherwise.
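A minimal sketch of nested scopes and of overloading follows. The names x and describe are made up for illustration.

```nim
let x = 1              # declared in the outer (module) scope

block:
  let x = 2            # nested scope: this declaration shadows the outer x
  doAssert x == 2      # the smallest enclosing scope determines the meaning

doAssert x == 1        # outside the block, the outer declaration applies

# Overloading: two procs share one name; the argument type selects one.
proc describe(v: int): string = "int"
proc describe(v: string): string = "string"

doAssert describe(x) == "int"
doAssert describe("x") == "string"
```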
An expression specifies a computation that produces a value or location. Expressions that produce locations are called l-values. An l-value can denote either a location or the value the location contains, depending on the context.
A Nim program consists of one or more text source files containing Nim code. It is processed by a Nim compiler into an executable. The nature of this executable depends on the compiler implementation; it may, for example, be a native binary or JavaScript source code.
In a typical Nim program, most of the code is compiled into the executable. However, some code may be executed at compile-time. This can include constant expressions, macro definitions, and Nim procedures used by macro definitions. Most of the Nim language is supported at compile-time, but there are some restrictions -- see Restrictions on Compile-Time Execution for details. We use the term runtime to cover both compile-time execution and code execution in the executable.
The compiler parses Nim source code into an internal data structure called the abstract syntax tree (AST). Then, before executing the code or compiling it into the executable, it transforms the AST through semantic analysis. This adds semantic information such as expression types, identifier meanings, and in some cases expression values. An error detected during semantic analysis is called a static error. Errors described in this manual are static errors when not otherwise specified.
A panic is an error that the implementation detects and reports at runtime. The method for reporting such errors is via raising exceptions or dying with a fatal error. However, the implementation provides a means to disable these runtime checks. See the section Pragmas for details.
Whether a panic results in an exception or in a fatal error is implementation specific. Thus, the following program is invalid; even though the code purports to catch the IndexDefect from an out-of-bounds array access, the compiler may instead choose to allow the program to die with a fatal error.
var a: array[0..1, char]
let i = 5
try:
  a[i] = 'N'
except IndexDefect:
  echo "invalid index"
The current implementation allows switching between these different behaviors via --panics:on|off. When panics are turned on, the program dies with a panic; if they are turned off, the runtime errors are turned into exceptions. The benefit of --panics:on is that it produces smaller binary code and the compiler has more freedom to optimize the code.
An unchecked runtime error is an error that is not guaranteed to be detected and can cause the subsequent behavior of the computation to be arbitrary. Unchecked runtime errors cannot occur if only safe language features are used and if no runtime checks are disabled.
A constant expression is an expression whose value can be computed during a semantic analysis of the code in which it appears. It is never an l-value and never has side effects. Constant expressions are not limited to the capabilities of semantic analysis, such as constant folding; they can use all Nim language features that are supported for compile-time execution. Since constant expressions can be used as an input to semantic analysis (such as for defining array bounds), this flexibility requires the compiler to interleave semantic analysis and compile-time code execution.
It is mostly accurate to picture semantic analysis proceeding top to bottom and left to right in the source code, with compile-time code execution interleaved when necessary to compute values that are required for subsequent semantic analysis. We will see much later in this document that macro invocation not only requires this interleaving, but also creates a situation where semantic analysis does not entirely proceed top to bottom and left to right.
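The interleaving described above can be sketched as follows. The proc fib and the names size and buf are hypothetical; the point is only that an ordinary procedure is executed at compile time so that its result can serve as an array bound during semantic analysis.

```nim
# An ordinary proc; nothing marks it as compile-time only.
proc fib(n: int): int =
  if n < 2: n else: fib(n - 1) + fib(n - 2)

# The initializer of a const is a constant expression, so the compiler
# has to run fib during semantic analysis.
const size = fib(6)

# The result then feeds back into semantic analysis as an array length.
var buf: array[size, int]

doAssert size == 8
doAssert buf.len == 8
```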
All Nim source files are in the UTF-8 encoding (or its ASCII subset). Other encodings are not supported. Any of the standard platform line termination sequences can be used - the Unix form using ASCII LF (linefeed), the Windows form using the ASCII sequence CR LF (return followed by linefeed), or the old Macintosh form using the ASCII CR (return) character. All of these forms can be used equally, regardless of the platform.
Nim's standard grammar describes an indentation sensitive language. This means that all the control structures are recognized by indentation. Indentation consists only of spaces; tabulators are not allowed.
The indentation handling is implemented as follows: The lexer annotates the following token with the preceding number of spaces; indentation is not a separate token. This trick allows parsing of Nim with only 1 token of lookahead.
The parser uses a stack of indentation levels: the stack consists of integers counting the spaces. The indentation information is queried at strategic places in the parser but ignored otherwise: The pseudo-terminal IND{>} denotes an indentation that consists of more spaces than the entry at the top of the stack; IND{=} an indentation that has the same number of spaces. DED is another pseudo-terminal that describes the action of popping a value from the stack; IND{>} then implies pushing onto the stack.
With this notation we can now easily define the core of the grammar: A block of statements (simplified example):
ifStmt = 'if' expr ':' stmt
         (IND{=} 'elif' expr ':' stmt)*
         (IND{=} 'else' ':' stmt)?

simpleStmt = ifStmt / ...

stmt = IND{>} stmt ^+ IND{=} DED  # list of statements
     / simpleStmt                 # or a simple statement
Comments start anywhere outside a string or character literal with the hash character #. Comments consist of a concatenation of comment pieces. A comment piece starts with # and runs until the end of the line. The end of line characters belong to the piece. If the next line only consists of a comment piece with no other tokens between it and the preceding one, it does not start a new comment:
i = 0     # This is a single comment over multiple lines.
  # The lexer merges these two pieces.
  # The comment continues here.
Documentation comments are comments that start with two ##. Documentation comments are tokens; they are only allowed at certain places in the input file as they belong to the syntax tree.
Starting with version 0.13.0 of the language Nim supports multiline comments. They look like:
#[Comment here.
Multiple lines
are not a problem.]#
Multiline comments support nesting:
#[  #[ Multiline comment in already
   commented out code. ]#
proc p[T](x: T) = discard
]#
Multiline documentation comments also exist and support nesting too:
proc foo =
  ##[Long documentation comment
     here.
  ]##
You can also use the discard statement together with triple quoted string literals to create multiline comments:
discard """ You can have any Nim code text commented
out inside this with no indentation restrictions.
      yes("May I ask a pointless question?") """
This was how multiline comments were done before version 0.13.0, and it is used to provide specifications to the testament test framework.
Identifiers in Nim can be any string of letters, digits and underscores, with the following restrictions:
begins with a letter
does not end with an underscore _
two immediately following underscores __ are not allowed:
letter ::= 'A'..'Z' | 'a'..'z' | '\x80'..'\xff'
digit ::= '0'..'9'
IDENTIFIER ::= letter ( ['_'] (letter | digit) )*
Currently, any Unicode character with an ordinal value > 127 (non-ASCII) is classified as a letter and may thus be part of an identifier, but later versions of the language may assign some Unicode characters to belong to the operator characters instead.
The following keywords are reserved and cannot be used as identifiers:
addr and as asm
bind block break
case cast concept const continue converter
defer discard distinct div do
elif else end enum except export
finally for from func
if import in include interface is isnot iterator
let
macro method mixin mod
nil not notin
object of or out
proc ptr
raise ref return
shl shr static
template try tuple type
using
var
when while
xor
yield
Some keywords are unused; they are reserved for future developments of the language.
Two identifiers are considered equal if the following algorithm returns true:
proc sameIdentifier(a, b: string): bool =
  a[0] == b[0] and
    a.replace("_", "").toLowerAscii == b.replace("_", "").toLowerAscii
That means only the first letters are compared in a case-sensitive manner. Other letters are compared case-insensitively within the ASCII range and underscores are ignored.
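The comparison algorithm can be exercised directly once std/strutils is imported for replace and toLowerAscii. The sample identifiers below are made up for illustration; the last three lines show the same rule applied by the compiler itself to a hypothetical variable myValue.

```nim
import std/strutils

proc sameIdentifier(a, b: string): bool =
  # First character: case-sensitive. Remainder: case-insensitive,
  # with underscores ignored.
  a[0] == b[0] and
    a.replace("_", "").toLowerAscii == b.replace("_", "").toLowerAscii

doAssert sameIdentifier("fooBar", "foo_bar")
doAssert sameIdentifier("fooBar", "fOOBAR")
doAssert not sameIdentifier("Foo", "foo")   # first letters differ in case

# The compiler applies the same rule when it looks up names:
let myValue = 1
doAssert my_value == 1
doAssert myvalue == 1
```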
This rather unorthodox way to do identifier comparisons is called partial case-insensitivity and has some advantages over the conventional case sensitivity:
It allows programmers to mostly use their own preferred spelling style, be it humpStyle or snake_style, and libraries written by different programmers cannot use incompatible conventions. A Nim-aware editor or IDE can show the identifiers as preferred. Another advantage is that it frees the programmer from remembering the exact spelling of an identifier. The exception with respect to the first letter allows common code like var foo: Foo to be parsed unambiguously.
Note that this rule also applies to keywords, meaning that notin is the same as notIn and not_in (the all-lowercase version (notin, isnot) is the preferred way of writing keywords).
Historically, Nim was a fully style-insensitive language. This meant that it was not case-sensitive and underscores were ignored and there was not even a distinction between foo and Foo.
If a keyword is enclosed in backticks it loses its keyword property and becomes an ordinary identifier.
Examples
var `var` = "Hello Stropping"

type Obj = object
  `type`: int

let `object` = Obj(`type`: 9)
assert `object` is Obj
assert `object`.`type` == 9

var `var` = 42
let `let` = 8
assert `var` + `let` == 50

const `assert` = true
assert `assert`
Terminal symbol in the grammar: STR_LIT.
String literals can be delimited by matching double quotes, and can contain the following escape sequences:
Escape sequence | Meaning |
---|---|
\p | platform specific newline: CRLF on Windows, LF on Unix |
\r,\c | carriage return |
\n,\l | line feed (often called newline) |
\f | form feed |
\t | tabulator |
\v | vertical tabulator |
\\ | backslash |
\" | quotation mark |
\' | apostrophe |
\ '0'..'9'+ | character with decimal value d; all decimal digits directly following are used for the character |
\a | alert |
\b | backspace |
\e | escape [ESC] |
\x HH | character with hex value HH; exactly two hex digits are allowed |
\u HHHH | unicode codepoint with hex value HHHH; exactly four hex digits are allowed |
\u {H+} | unicode codepoint; all hex digits enclosed in{} are used for the codepoint |
Strings in Nim may contain any 8-bit value, even embedded zeros. However, some operations may interpret the first binary zero as a terminator.
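A few of the escape sequences from the table can be checked with assertions. This is only a sketch; the byte counts for the \u forms assume that the code point is stored in its UTF-8 encoding.

```nim
doAssert "\x41" == "A"             # hex escape, exactly two hex digits
doAssert "\65" == "A"              # decimal escape
doAssert "tab\there".len == 8      # \t counts as one character
doAssert "\u00E9".len == 2         # U+00E9 occupies two bytes in UTF-8
doAssert "\u{1F600}".len == 4      # U+1F600 occupies four bytes in UTF-8
doAssert "a\0b".len == 3           # an embedded zero is part of the string
```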
Terminal symbol in the grammar: TRIPLESTR_LIT.
String literals can also be delimited by three double quotes """ ... """. Literals in this form may run for several lines, may contain " and do not interpret any escape sequences. For convenience, when the opening """ is followed by a newline (there may be whitespace between the opening """ and the newline), the newline (and the preceding whitespace) is not included in the string. The ending of the string literal is defined by the pattern """[^"], so this:
""""long string within quotes""""
Produces:
"long string within quotes"
Terminal symbol in the grammar: RSTR_LIT.
There are also raw string literals that are preceded with the letter r (or R) and are delimited by matching double quotes (just like ordinary string literals) and do not interpret the escape sequences. This is especially convenient for regular expressions or Windows paths:
var f = openFile(r"C:\texts\text.txt") # a raw string, so ``\t`` is no tab
To produce a single" within a raw string literal, it has to be doubled:
r"a""b"
Produces:
a"b
r"""" is not possible with this notation, because the three leading quotes introduce a triple quoted string literal. r""" is the same as """ since triple quoted string literals do not interpret escape sequences either.
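The rules for triple quoted and raw string literals can be checked with a short sketch. The variable names are hypothetical; the literals are taken from, or modeled on, the examples in the text.

```nim
# Triple quoted: the newline right after the opening quotes is dropped,
# and inner double quotes need no escaping.
let text = """
first line
second "quoted" line"""
doAssert text == "first line\nsecond \"quoted\" line"

# A fourth quote before the closing delimiter becomes part of the string.
let quoted = """"long string within quotes""""
doAssert quoted == "\"long string within quotes\""

# Raw strings keep backslashes as they are.
let path = r"C:\texts\text.txt"
doAssert path.len == 17

# A doubled quote inside a raw string yields one quote.
let doubled = r"a""b"
doAssert doubled == "a\"b"
```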
Terminal symbols in the grammar: GENERALIZED_STR_LIT, GENERALIZED_TRIPLESTR_LIT.
The construct identifier"string literal" (without whitespace between the identifier and the opening quotation mark) is a generalized raw string literal. It is a shortcut for the construct identifier(r"string literal"), so it denotes a routine call with a raw string literal as its only argument. Generalized raw string literals are especially convenient for embedding mini languages directly into Nim (for example regular expressions).
The construct identifier"""string literal""" exists too. It is a shortcut for identifier("""string literal""").
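As a small sketch of both forms, the hypothetical proc tag below stands in for any routine that takes a single string argument.

```nim
proc tag(s: string): string =
  ## Hypothetical helper: wraps its argument in angle brackets.
  "<" & s & ">"

# tag"a\tb" is short for tag(r"a\tb"), so \t stays a backslash and a 't'.
let single = tag"a\tb"
doAssert single == "<a\\tb>"

# The triple quoted form may contain a double quote.
let triple = tag"""x"y"""
doAssert triple == "<x\"y>"
```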
Character literals are enclosed in single quotes '' and can contain the same escape sequences as strings - with one exception: the platform dependent newline (\p) is not allowed as it may be wider than one character (it can be the pair CR/LF). Here are the valid escape sequences for character literals:
Escape sequence | Meaning |
---|---|
\r,\c | carriage return |
\n,\l | line feed |
\f | form feed |
\t | tabulator |
\v | vertical tabulator |
\\ | backslash |
\" | quotation mark |
\' | apostrophe |
\ '0'..'9'+ | character with decimal value d; all decimal digits directly following are used for the character |
\a | alert |
\b | backspace |
\e | escape [ESC] |
\x HH | character with hex value HH; exactly two hex digits are allowed |
A character is not a Unicode character but a single byte.
Rationale: It enables the efficient support of array[char, int] or set[char].
The Rune type can represent any Unicode character. Rune is declared in the unicode module.
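The difference between bytes and Unicode characters can be sketched with std/unicode. The example relies on runeLen from that module; the variable names are illustrative.

```nim
import std/unicode

let s = "\u00E9"              # one Unicode character, two UTF-8 bytes

doAssert s.len == 2           # len counts chars, i.e. bytes
doAssert s.runeLen == 1       # runeLen counts Unicode characters
doAssert s[0] == '\xC3'       # a char is a single byte of the encoding
doAssert sizeof(char) == 1

# Because char is one byte, sets of char are cheap to represent.
let vowels: set[char] = {'a', 'e', 'i', 'o', 'u'}
doAssert 'e' in vowels
```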
A character literal that does not end in ' is interpreted as ' if there is a preceding backtick token. There must be no whitespace between the preceding backtick token and the character literal. This special case ensures that a declaration like proc `'customLiteral`(s: string) is valid. proc `'customLiteral`(s: string) is the same as proc `'\''customLiteral`(s: string).
See also custom numeric literals.
Numeric literals have the form:
hexdigit = digit | 'A'..'F' | 'a'..'f'
octdigit = '0'..'7'
bindigit = '0'..'1'
unary_minus = '-' # See the section about unary minus
HEX_LIT = unary_minus? '0' ('x' | 'X' ) hexdigit ( ['_'] hexdigit )*
DEC_LIT = unary_minus? digit ( ['_'] digit )*
OCT_LIT = unary_minus? '0' 'o' octdigit ( ['_'] octdigit )*
BIN_LIT = unary_minus? '0' ('b' | 'B' ) bindigit ( ['_'] bindigit )*

INT_LIT = HEX_LIT | DEC_LIT | OCT_LIT | BIN_LIT

INT8_LIT = INT_LIT ['\''] ('i' | 'I') '8'
INT16_LIT = INT_LIT ['\''] ('i' | 'I') '16'
INT32_LIT = INT_LIT ['\''] ('i' | 'I') '32'
INT64_LIT = INT_LIT ['\''] ('i' | 'I') '64'

UINT_LIT = INT_LIT ['\''] ('u' | 'U')
UINT8_LIT = INT_LIT ['\''] ('u' | 'U') '8'
UINT16_LIT = INT_LIT ['\''] ('u' | 'U') '16'
UINT32_LIT = INT_LIT ['\''] ('u' | 'U') '32'
UINT64_LIT = INT_LIT ['\''] ('u' | 'U') '64'

exponent = ('e' | 'E' ) ['+' | '-'] digit ( ['_'] digit )*
FLOAT_LIT = unary_minus? digit (['_'] digit)* (('.' digit (['_'] digit)* [exponent]) |exponent)
FLOAT32_SUFFIX = ('f' | 'F') ['32']
FLOAT32_LIT = HEX_LIT '\'' FLOAT32_SUFFIX | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] FLOAT32_SUFFIX
FLOAT64_SUFFIX = ( ('f' | 'F') '64' ) | 'd' | 'D'
FLOAT64_LIT = HEX_LIT '\'' FLOAT64_SUFFIX | (FLOAT_LIT | DEC_LIT | OCT_LIT | BIN_LIT) ['\''] FLOAT64_SUFFIX

CUSTOM_NUMERIC_LIT = (FLOAT_LIT | INT_LIT) '\'' CUSTOM_NUMERIC_SUFFIX

# CUSTOM_NUMERIC_SUFFIX is any Nim identifier that is not
# a pre-defined type suffix.
As can be seen in the productions, numeric literals can contain underscores for readability. Integer and floating-point literals may be given in decimal (no prefix), binary (prefix 0b), octal (prefix 0o), and hexadecimal (prefix 0x) notation.
The fact that the unary minus - in a number literal like -1 is considered to be part of the literal is a late addition to the language. The rationale is that an expression -128'i8 should be valid, and without this special case this would be impossible: 128 is not a valid int8 value, only -128 is.
For the unary_minus rule there are further restrictions that are not covered in the formal grammar. For - to be part of the number literal the immediately preceding character has to be in the set {' ', '\t', '\n', '\r', ',', ';', '(', '[', '{'}. This set was designed to cover most cases in a natural manner.
In the following examples, -1 is a single token:
echo -1
echo(-1)
echo [-1]
echo 3,-1

"abc";-1
In the following examples, -1 is parsed as two separate tokens (as - 1):
echo x-1
echo (int)-1
echo [a]-1
"abc"-1
The suffix starting with an apostrophe (') is called a type suffix. Literals without a type suffix are of an integer type unless the literal contains a dot or E|e, in which case it is of type float. This integer type is int if the literal is in the range low(int32)..high(int32), otherwise it is int64. For notational convenience, the apostrophe of a type suffix is optional if it is not ambiguous (only hexadecimal floating-point literals with a type suffix can be ambiguous).
The pre-defined type suffixes are:
Type Suffix | Resulting type of literal |
---|---|
'i8 | int8 |
'i16 | int16 |
'i32 | int32 |
'i64 | int64 |
'u | uint |
'u8 | uint8 |
'u16 | uint16 |
'u32 | uint32 |
'u64 | uint64 |
'f | float32 |
'd | float64 |
'f32 | float32 |
'f64 | float64 |
Floating-point literals may also be in binary, octal or hexadecimal notation: 0B0_10001110100_0000101001000111101011101111111011000101001101001001'f64 is approximately 1.72826e35 according to the IEEE floating-point standard.
Literals must match the datatype, for example, 333'i8 is an invalid literal. Non-base-10 literals are used mainly for flags and bit pattern representations, therefore the checking is done on bit width and not on value range. Hence: 0b10000000'u8 == 0x80'u8 == 128, but 0b10000000'i8 == 0x80'i8 == -128 instead of causing an overflow error.
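The literal forms and type suffixes above can be checked with a few assertions. This is a sketch only; it relies on typeof and on the is operator to inspect the type of each literal.

```nim
# Underscores and the different bases denote the same values.
doAssert 1_000_000 == 1000000
doAssert 0xFF == 255
doAssert 0o17 == 15
doAssert 0b1010 == 10

# Literals without a suffix: int, or float if a dot or exponent is present.
doAssert typeof(1) is int
doAssert typeof(1.0) is float
doAssert typeof(1e3) is float

# Type suffixes select the type of the literal.
doAssert typeof(5'i16) is int16
doAssert typeof(7'u8) is uint8
doAssert typeof(2.5'f32) is float32

# Non-base-10 literals are checked on bit width, not on value range.
doAssert 0x80'u8 == 128'u8
doAssert 0x80'i8 == -128'i8
```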
If the suffix is not predefined, then the suffix is assumed to be a call to a proc, template, macro or other callable identifier that is passed the string containing the literal. The callable identifier needs to be declared with a special ' prefix:
import std/strutils
type u4 = distinct uint8 # a 4-bit unsigned integer aka "nibble"
proc `'u4`(n: string): u4 =
  # The leading ' is required.
  result = (parseInt(n) and 0x0F).u4

var x = 5'u4
More formally, a custom numeric literal 123'custom is transformed to r"123".`'custom` in the parsing step. There is no AST node kind that corresponds to this transformation. The transformation naturally handles the case that additional parameters are passed to the callee:
import std/strutils
type u4 = distinct uint8 # a 4-bit unsigned integer aka "nibble"
proc `'u4`(n: string; moreData: int): u4 =
  result = (parseInt(n) and 0x0F).u4

var x = 5'u4(123)
Custom numeric literals are covered by the grammar rule named CUSTOM_NUMERIC_LIT. A custom numeric literal is a single token.
Nim allows user defined operators. An operator is any combination of the following characters:
=     +     -     *     /     <     >
@     $     ~     &     %     |
!     ?     ^     .     :     \
(The grammar uses the terminal OPR to refer to operator symbols as defined here.)
These keywords are also operators: and or not xor shl shr div mod in notin is isnot of as from.
The symbols ., =, :, and :: are not available as general operators; they are used for other notational purposes.
*: is, as a special case, treated as the two tokens * and : (to support var v*: T).
The not keyword is always a unary operator; a not b is parsed as a(not b), not as (a) not (b).
These Unicode operators are also parsed as operators:
∙ ∘ × ★ ⊗ ⊘ ⊙ ⊛ ⊠ ⊡ ∩ ∧ ⊓   # same priority as * (multiplication)
± ⊕ ⊖ ⊞ ⊟ ∪ ∨ ⊔             # same priority as + (addition)
Unicode operators can be combined with non-Unicode operator symbols. The usual precedence extensions then apply, for example, ⊠= is an assignment like operator just like *= is.
No Unicode normalization step is performed.
The following strings denote other tokens:
` ( ) { } [ ] , ; [. .] {. .} (. .) [:
The slice operator .. takes precedence over other tokens that contain a dot: {..} are the three tokens {, .., } and not the two tokens {., .}.
This section lists Nim's standard syntax. How the parser handles the indentation is already described in the Lexical Analysis section.
Nim allows user-definable operators. Binary operators have 11 different levels of precedence.
Binary operators whose first character is^ are right-associative, all other binary operators are left-associative.
proc `^/`(x, y: float): float =
  # a right-associative division operator
  result = x / y
echo 12 ^/ 4 ^/ 8 # 24.0 (4 / 8 = 0.5, then 12 / 0.5 = 24.0)
echo 12  / 4  / 8 # 0.375 (12 / 4 = 3.0, then 3 / 8 = 0.375)
Unary operators always bind stronger than any binary operator: $a + b is ($a) + b and not $(a + b).
If a unary operator's first character is @ it is a sigil-like operator which binds stronger than a primarySuffix: @x.abc is parsed as (@x).abc whereas $x.abc is parsed as $(x.abc).
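The binding rules for unary operators can be observed with the built-in $ and @ operators. The tuple t and the result names are hypothetical.

```nim
let t = (name: "nim", id: 7)

# $t.id is $(t.id): the field access is applied before the unary $.
let idText = $t.id
doAssert idText == "7"

# $1 & "2" is ($1) & "2": unary operators bind stronger than binary ones.
let joined = $1 & "2"
doAssert joined == "12"

# @ is sigil-like: @[1, 2, 3].len is (@[1, 2, 3]).len.
let count = @[1, 2, 3].len
doAssert count == 3
```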
For binary operators that are not keywords, the precedence is determined by the following rules:
Operators ending in either ->, ~> or => are called arrow like, and have the lowest precedence of all operators.
If the operator ends with = and its first character is none of <, >, !, =, ~, ?, it is an assignment operator which has the second-lowest precedence.
Otherwise, precedence is determined by the first character.
Precedence level | Operators | First character | Terminal symbol |
---|---|---|---|
10 (highest) | | $ ^ | OP10 |
9 | * / div mod shl shr % | * % \ / | OP9 |
8 | + - | + - ~ \| | OP8 |
7 | & | & | OP7 |
6 | .. | . | OP6 |
5 | == <= < >= > != in notin is isnot not of as from | = < > ! | OP5 |
4 | and | | OP4 |
3 | or xor | | OP3 |
2 | | @ : ? | OP2 |
1 | assignment operator (like +=, *=) | | OP1 |
0 (lowest) | arrow like operator (like ->, =>) | | OP0 |
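As a sketch of how the first character decides precedence, the hypothetical operators below are declared only for illustration; their names were chosen so that they do not clash with operators from the system module.

```nim
proc `+!`(a, b: int): int = a + b     # first character '+': level 8
proc `*!`(a, b: int): int = a * b     # first character '*': level 9
proc `^!`(a, b: int): int = a - b     # first character '^': level 10, right-associative
proc `+!=`(a: var int; b: int) =      # ends with '=': assignment operator, level 1
  a = a +! b

doAssert 2 +! 3 *! 4 == 14            # parsed as 2 +! (3 *! 4)
doAssert 10 ^! 4 ^! 1 == 7            # parsed as 10 ^! (4 ^! 1)

var x = 1
x +!= 2 *! 3                          # parsed as x +!= (2 *! 3)
doAssert x == 7
```

The == comparison sits at level 5, so it binds more loosely than each of the custom operators on its left.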
Whether an operator is used as a prefix operator is also affected by preceding whitespace (this parsing change was introduced with version 0.13.0):
echo $foo
# is parsed as
echo($foo)
Spacing also determines whether(a,b) is parsed as an argument list of a call or whether it is parsed as a tuple constructor:
echo(1, 2) # pass 1 and 2 to echo
echo (1, 2) # pass the tuple (1, 2) to echo
Terminal symbol in the grammar: DOTLIKEOP.
Dot-like operators are operators starting with ., but not with .., e.g. .?; they have the same precedence as ., so that a.?b.c is parsed as (a.?b).c instead of a.?(b.c).
The grammar's start symbol is module.
# This file is generated by compiler/parser.nim.
module = complexOrSimpleStmt ^* (';' / IND{=})
comma = ',' COMMENT?
semicolon = ';' COMMENT?
colon = ':' COMMENT?
colcom = ':' COMMENT?
operator = OP0 | OP1 | OP2 | OP3 | OP4 | OP5 | OP6 | OP7 | OP8 | OP9 | 'or' | 'xor' | 'and' | 'is' | 'isnot' | 'in' | 'notin' | 'of' | 'as' | 'from' | 'div' | 'mod' | 'shl' | 'shr' | 'not' | '..'
prefixOperator = operator
optInd = COMMENT? IND?
optPar = (IND{>} | IND{=})?
simpleExpr = arrowExpr (OP0 optInd arrowExpr)* pragma?
arrowExpr = assignExpr (OP1 optInd assignExpr)*
assignExpr = orExpr (OP2 optInd orExpr)*
orExpr = andExpr (OP3 optInd andExpr)*
andExpr = cmpExpr (OP4 optInd cmpExpr)*
cmpExpr = sliceExpr (OP5 optInd sliceExpr)*
sliceExpr = ampExpr (OP6 optInd ampExpr)*
ampExpr = plusExpr (OP7 optInd plusExpr)*
plusExpr = mulExpr (OP8 optInd mulExpr)*
mulExpr = dollarExpr (OP9 optInd dollarExpr)*
dollarExpr = primary (OP10 optInd primary)*
operatorB = OP0 | OP1 | OP2 | OP3 | OP4 | OP5 | OP6 | OP7 | OP8 | OP9 | 'div' | 'mod' | 'shl' | 'shr' | 'in' | 'notin' | 'is' | 'isnot' | 'not' | 'of' | 'as' | 'from' | '..' | 'and' | 'or' | 'xor'
symbol = '`' (KEYW|IDENT|literal|(operator|'('|')'|'['|']'|'{'|'}'|'=')+)+ '`' | IDENT | 'addr' | 'type' | 'static'
symbolOrKeyword = symbol | KEYW
exprColonEqExpr = expr ((':'|'=') expr / doBlock extraPostExprBlock*)?
exprEqExpr = expr ('=' expr / doBlock extraPostExprBlock*)?
exprList = expr ^+ comma
optionalExprList = expr ^* comma
exprColonEqExprList = exprColonEqExpr (comma exprColonEqExpr)* (comma)?
qualifiedIdent = symbol ('.' optInd symbolOrKeyword)?
setOrTableConstr = '{' ((exprColonEqExpr comma)* | ':' ) '}'
castExpr = 'cast' ('[' optInd typeDesc optPar ']' '(' optInd expr optPar ')') /
parKeyw = 'discard' | 'include' | 'if' | 'while' | 'case' | 'try' | 'finally' | 'except' | 'for' | 'block' | 'const' | 'let' | 'when' | 'var' | 'mixin'
par = '(' optInd ( &parKeyw (ifExpr / complexOrSimpleStmt) ^+ ';' | ';' (ifExpr / complexOrSimpleStmt) ^+ ';' | pragmaStmt | simpleExpr ( (doBlock extraPostExprBlock*) | ('=' expr (';' (ifExpr / complexOrSimpleStmt) ^+ ';' )? ) | (':' expr (',' exprColonEqExpr ^+ ',' )? ) ) ) optPar ')'
literal = | INT_LIT | INT8_LIT | INT16_LIT | INT32_LIT | INT64_LIT | UINT_LIT | UINT8_LIT | UINT16_LIT | UINT32_LIT | UINT64_LIT | FLOAT_LIT | FLOAT32_LIT | FLOAT64_LIT | STR_LIT | RSTR_LIT | TRIPLESTR_LIT | CHAR_LIT | CUSTOM_NUMERIC_LIT | NIL
generalizedLit = GENERALIZED_STR_LIT | GENERALIZED_TRIPLESTR_LIT
identOrLiteral = generalizedLit | symbol | literal | par | arrayConstr | setOrTableConstr | tupleConstr | castExpr
tupleConstr = '(' optInd (exprColonEqExpr comma?)* optPar ')'
arrayConstr = '[' optInd (exprColonEqExpr comma?)* optPar ']'
primarySuffix = '(' (exprColonEqExpr comma?)* ')' | '.' optInd symbolOrKeyword ('[:' exprList ']' ( '(' exprColonEqExpr ')' )?)? generalizedLit? | DOTLIKEOP optInd symbolOrKeyword generalizedLit? | '[' optInd exprColonEqExprList optPar ']' | '{' optInd exprColonEqExprList optPar '}'
pragma = '{.' optInd (exprColonEqExpr comma?)* optPar ('.}' | '}')
identVis = symbol OPR? # postfix position
identVisDot = symbol '.' optInd symbolOrKeyword OPR?
identWithPragma = identVis pragma?
identWithPragmaDot = identVisDot pragma?
declColonEquals = identWithPragma (comma identWithPragma)* comma? (':' optInd typeDescExpr)? ('=' optInd expr)?
identColonEquals = IDENT (comma IDENT)* comma? (':' optInd typeDescExpr)? ('=' optInd expr)?)
tupleTypeBracket = '[' optInd (identColonEquals (comma/semicolon)?)* optPar ']'
tupleType = 'tuple' tupleTypeBracket
tupleDecl = 'tuple' (tupleTypeBracket / COMMENT? (IND{>} identColonEquals (IND{=} identColonEquals)*)?)
paramList = '(' declColonEquals ^* (comma/semicolon) ')'
paramListArrow = paramList? ('->' optInd typeDesc)?
paramListColon = paramList? (':' optInd typeDesc)?
doBlock = 'do' paramListArrow pragma? colcom stmt
routineExpr = ('proc' | 'func' | 'iterator') paramListColon pragma? ('=' COMMENT? stmt)?
routineType = ('proc' | 'iterator') paramListColon pragma?
forStmt = 'for' ((varTuple / identWithPragma) ^+ comma) 'in' expr colcom stmt
forExpr = forStmt
expr = (blockExpr | ifExpr | whenExpr | caseStmt | forExpr | tryExpr) / simpleExpr
simplePrimary = SIGILLIKEOP? identOrLiteral primarySuffix*
commandStart = &('`'|IDENT|literal|'cast'|'addr'|'type'|'var'|'out'| 'static'|'enum'|'tuple'|'object'|'proc')
primary = simplePrimary (commandStart expr (doBlock extraPostExprBlock*)?)? / operatorB primary / routineExpr / rawTypeDesc / prefixOperator primary
rawTypeDesc = (tupleType | routineType | 'enum' | 'object' | ('var' | 'out' | 'ref' | 'ptr' | 'distinct') typeDesc?) ('not' primary)?
typeDescExpr = (routineType / simpleExpr) ('not' primary)?
typeDesc = rawTypeDesc / typeDescExpr
typeDefValue = ((tupleDecl | enumDecl | objectDecl | conceptDecl | ('ref' | 'ptr' | 'distinct') (tupleDecl | objectDecl)) / (simpleExpr (exprEqExpr ^+ comma postExprBlocks?)?)) ('not' primary)?
extraPostExprBlock = ( IND{=} doBlock | IND{=} 'of' exprList ':' stmt | IND{=} 'elif' expr ':' stmt | IND{=} 'except' optionalExprList ':' stmt | IND{=} 'finally' ':' stmt | IND{=} 'else' ':' stmt )
postExprBlocks = (doBlock / ':' (extraPostExprBlock / stmt)) extraPostExprBlock*
exprStmt = simpleExpr postExprBlocks? / simplePrimary (exprEqExpr ^+ comma) postExprBlocks? / simpleExpr '=' optInd (expr postExprBlocks?)
importStmt = 'import' optInd expr ((comma expr)* / 'except' optInd (expr ^+ comma))
exportStmt = 'export' optInd expr ((comma expr)* / 'except' optInd (expr ^+ comma))
includeStmt = 'include' optInd expr ^+ comma
fromStmt = 'from' expr 'import' optInd expr (comma expr)*
returnStmt = 'return' optInd expr?
raiseStmt = 'raise' optInd expr?
yieldStmt = 'yield' optInd expr?
discardStmt = 'discard' optInd expr?
breakStmt = 'break' optInd expr?
continueStmt = 'continue' optInd expr?
condStmt = expr colcom stmt COMMENT? (IND{=} 'elif' expr colcom stmt)* (IND{=} 'else' colcom stmt)?
ifStmt = 'if' condStmt
whenStmt = 'when' condStmt
condExpr = expr colcom stmt optInd ('elif' expr colcom stmt optInd)* 'else' colcom stmt
ifExpr = 'if' condExpr
whenExpr = 'when' condExpr
whileStmt = 'while' expr colcom stmt
ofBranch = 'of' exprList colcom stmt
ofBranches = ofBranch (IND{=} ofBranch)* (IND{=} 'elif' expr colcom stmt)* (IND{=} 'else' colcom stmt)?
caseStmt = 'case' expr ':'? COMMENT? (IND{>} ofBranches DED | IND{=} ofBranches)
tryStmt = 'try' colcom stmt &(IND{=}? 'except'|'finally') (IND{=}? 'except' optionalExprList colcom stmt)* (IND{=}? 'finally' colcom stmt)?
tryExpr = 'try' colcom stmt &(optInd 'except'|'finally') (optInd 'except' optionalExprList colcom stmt)* (optInd 'finally' colcom stmt)?
blockStmt = 'block' symbol? colcom stmt
blockExpr = 'block' symbol? colcom stmt
staticStmt = 'static' colcom stmt
deferStmt = 'defer' colcom stmt
asmStmt = 'asm' pragma? (STR_LIT | RSTR_LIT | TRIPLESTR_LIT)
genericParam = symbol (comma symbol)* (colon expr)? ('=' optInd expr)?
genericParamList = '[' optInd genericParam ^* (comma/semicolon) optPar ']'
pattern = '{' stmt '}'
indAndComment = (IND{>} COMMENT)? | COMMENT?
routine = optInd identVis pattern? genericParamList? paramListColon pragma? ('=' COMMENT? stmt)? indAndComment
commentStmt = COMMENT
section(RULE) = COMMENT? RULE / (IND{>} (RULE / COMMENT)^+IND{=} DED)
enumDecl = 'enum' optInd (symbol pragma? optInd ('=' optInd expr COMMENT?)? comma?)+
objectWhen = 'when' expr colcom objectPart COMMENT? ('elif' expr colcom objectPart COMMENT?)* ('else' colcom objectPart COMMENT?)?
objectBranch = 'of' exprList colcom objectPart
objectBranches = objectBranch (IND{=} objectBranch)* (IND{=} 'elif' expr colcom objectPart)* (IND{=} 'else' colcom objectPart)?
objectCase = 'case' (declColonEquals / pragma)? ':'? COMMENT? (IND{>} objectBranches DED | IND{=} objectBranches)
objectPart = IND{>} objectPart^+IND{=} DED / objectWhen / objectCase / 'nil' / 'discard' / declColonEquals
objectDecl = 'object' ('of' typeDesc)? COMMENT? objectPart
conceptParam = ('var' | 'out' | 'ptr' | 'ref' | 'static' | 'type')? symbol
conceptDecl = 'concept' conceptParam ^* ',' (pragma)? ('of' typeDesc ^* ',')? &IND{>} stmt
typeDef = identVisDot genericParamList? pragma '=' optInd typeDefValue indAndComment?
varTupleLhs = '(' optInd (identWithPragma / varTupleLhs) ^+ comma optPar ')' (':' optInd typeDescExpr)?
varTuple = varTupleLhs '=' optInd expr
colonBody = colcom stmt postExprBlocks?
variable = (varTuple / identColonEquals) colonBody? indAndComment
constant = (varTuple / identWithPragma) (colon typeDesc)? '=' optInd expr indAndComment
bindStmt = 'bind' optInd qualifiedIdent ^+ comma
mixinStmt = 'mixin' optInd qualifiedIdent ^+ comma
pragmaStmt = pragma (':' COMMENT? stmt)?
simpleStmt = ((returnStmt | raiseStmt | yieldStmt | discardStmt | breakStmt | continueStmt | pragmaStmt | importStmt | exportStmt | fromStmt | includeStmt | commentStmt) / exprStmt) COMMENT?
complexOrSimpleStmt = (ifStmt | whenStmt | whileStmt | tryStmt | forStmt | blockStmt | staticStmt | deferStmt | asmStmt | 'proc' routine | 'method' routine | 'func' routine | 'iterator' routine | 'macro' routine | 'template' routine | 'converter' routine | 'type' section(typeDef) | 'const' section(constant) | ('let' | 'var' | 'using') section(variable) | bindStmt | mixinStmt) / simpleStmt
stmt = (IND{>} complexOrSimpleStmt^+(IND{=} / ';') DED) / simpleStmt ^+ ';'
Order of evaluation is strictly left-to-right, inside-out, as is typical for most other imperative programming languages:
var s = ""

proc p(arg: int): int =
  s.add $arg
  result = arg

discard p(p(1) + p(2))

doAssert s == "123"
Assignments are not special: the left-hand-side expression is evaluated before the right-hand side:
var v = 0
proc getI(): int =
  result = v
  inc v

var a, b: array[0..2, int]

proc someCopy(a: var int; b: int) = a = b

a[getI()] = getI()

doAssert a == [1, 0, 0]

v = 0
someCopy(b[getI()], getI())

doAssert b == [1, 0, 0]
Rationale: Consistency with overloaded assignment or assignment-like operations; a = b can be read as performSomeCopy(a, b).
However, the concept of "order of evaluation" is only applicable after the code was normalized: normalization involves template expansions and the reordering of arguments that have been passed to named parameters:
var s = ""

proc p(): int =
  s.add "p"
  result = 5

proc q(): int =
  s.add "q"
  result = 3

# Evaluation order is 'b' before 'a' due to template
# expansion's semantics.
template swapArgs(a, b): untyped =
  b + a

doAssert swapArgs(p() + q(), q() - p()) == 6
doAssert s == "qppq"

# Evaluation order is not influenced by named parameters:
proc construct(first, second: int) =
  discard

# 'p' is evaluated before 'q'!
construct(second = q(), first = p())

doAssert s == "qppqpq"
Rationale: This is far easier to implement than hypothetical alternatives.
A constant is a symbol that is bound to the value of a constant expression. Constant expressions are restricted to depend only on the following categories of values and operations, because these are either built into the language or declared and evaluated before semantic analysis of the constant expression:
A constant expression can contain code blocks that may internally use all Nim features supported at compile time (as detailed in the next section below). Within such a code block, it is possible to declare variables and then later read and update them, or declare variables and pass them to procedures that modify them. However, the code in such a block must still adhere to the restrictions listed above for referencing values and operations outside the block.
The ability to access and modify compile-time variables adds flexibility to constant expressions that may be surprising to those coming from other statically typed languages. For example, the following code echoes the beginning of the Fibonacci series at compile-time. (This is a demonstration of flexibility in defining constants, not a recommended style for solving this problem.)
import std/strformat

var fibN {.compileTime.}: int
var fibPrev {.compileTime.}: int
var fibPrevPrev {.compileTime.}: int

proc nextFib(): int =
  result = if fibN < 2:
    fibN
  else:
    fibPrevPrev + fibPrev
  inc(fibN)
  fibPrevPrev = fibPrev
  fibPrev = result

const f0 = nextFib()
const f1 = nextFib()

const displayFib = block:
  const f2 = nextFib()
  var result = fmt"Fibonacci sequence: {f0}, {f1}, {f2}"
  for i in 3..12:
    add(result, fmt", {nextFib()}")
  result

static:
  echo displayFib
Nim code that will be executed at compile time cannot use the following language features:
The use of wrappers that use FFI and/or cast is also disallowed. Note that these wrappers include the ones in the standard libraries.
Some or all of these restrictions are likely to be lifted over time.
All expressions have a type that is known during semantic analysis. Nim is statically typed. One can declare new types, which is in essence defining an identifier that can be used to denote this custom type.
These are the major type classes:
Ordinal types have the following characteristics:
Integers, bool, characters, and enumeration types (and subranges of these types) belong to ordinal types.
A distinct type is an ordinal type if its base type is an ordinal type.
These integer types are pre-defined:
In addition to the usual arithmetic operators for signed and unsigned integers (+ - * etc.) there are also operators that formally work on signed integers but treat their arguments as unsigned: They are mostly provided for backwards compatibility with older versions of the language that lacked unsigned integer types. These unsigned operations for signed integers use the % suffix as convention:
operation | meaning |
---|---|
a +% b | unsigned integer addition |
a -% b | unsigned integer subtraction |
a *% b | unsigned integer multiplication |
a /% b | unsigned integer division |
a %% b | unsigned integer modulo operation |
a <% b | treat a and b as unsigned and compare |
a <=% b | treat a and b as unsigned and compare |
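As a small illustration of the table above, the following sketch contrasts the signed and the unsigned view of the same bit patterns. It assumes the default 64- or 32-bit `int` from `system`; the variable name `minusOne` is only for illustration.

```nim
# Sketch: the `%`-suffixed operators treat signed operands as unsigned.
let minusOne = -1

# Signed comparison: -1 is smaller than 1.
doAssert minusOne < 1

# Unsigned comparison: the bit pattern of -1 (all bits set) is the
# largest unsigned value, so it is not smaller than 1.
doAssert not (minusOne <% 1)
doAssert 1 <% minusOne

# Unsigned addition wraps around instead of reporting an overflow.
doAssert high(int) +% 1 == low(int)

# Unsigned division: all bits set, divided by 2, is high(int).
doAssert minusOne /% 2 == high(int)
```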
Automatic type conversion is performed in expressions where different kinds of integer types are used: the smaller type is converted to the larger.
A narrowing type conversion converts a larger to a smaller type (for example int32 -> int16). A widening type conversion converts a smaller type to a larger type (for example int16 -> int32). In Nim only widening type conversions are implicit:
var myInt16 = 5i16
var myInt: int
myInt16 + 34     # of type `int16`
myInt16 + myInt  # of type `int`
myInt16 + 2i32   # of type `int32`
However, int literals are implicitly convertible to a smaller integer type if the literal's value fits this smaller type and such a conversion is less expensive than other implicit conversions, so myInt16 + 34 produces an int16 result.
For further details, see Convertible relation.
A subrange type is a range of values from an ordinal or floating-point type (the base type). To define a subrange type, one must specify its limiting values -- the lowest and highest value of the type. For example:
type
  Subrange = range[0..5]
  PositiveFloat = range[0.0..Inf]
  Positive* = range[1..high(int)] # as defined in `system`
Subrange is a subrange of an integer which can only hold the values 0 to 5. PositiveFloat defines a subrange of all positive floating-point values. NaN does not belong to any subrange of floating-point types. Assigning any other value to a variable of type Subrange is a panic (or a static error if it can be determined during semantic analysis). Assignments from the base type to one of its subrange types (and vice versa) are allowed.
A subrange type has the same size as its base type (int in the Subrange example).
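The following sketch illustrates the range check and the size rule described above. The helper `toSubrange` is a hypothetical name introduced only for this example, and catching `RangeDefect` assumes the default settings (range checks enabled, `--panics:off`).

```nim
type
  Subrange = range[0..5]

# Hypothetical helper: the implicit int -> Subrange conversion in the
# result is checked at runtime.
proc toSubrange(v: int): Subrange = v

var x: Subrange = 3
x = toSubrange(5)          # 5 is inside 0..5
doAssert x == 5

# A subrange has the same size as its base type.
doAssert sizeof(Subrange) == sizeof(int)

var failed = false
try:
  x = toSubrange(6)        # 6 is outside 0..5
except RangeDefect:
  failed = true
doAssert failed
doAssert x == 5            # the failed assignment left x unchanged
```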
The following floating-point types are pre-defined:
Automatic type conversion in expressions with different kinds of floating-point types is performed: See Convertible relation for further details. Arithmetic performed on floating-point types follows the IEEE standard. Integer types are not converted to floating-point types automatically and vice versa.
The IEEE standard defines five types of floating-point exceptions:
The IEEE exceptions are either ignored during execution or mapped to the Nim exceptions: FloatInvalidOpDefect, FloatDivByZeroDefect, FloatOverflowDefect, FloatUnderflowDefect, and FloatInexactDefect. These exceptions inherit from the FloatingPointDefect base class.
Nim provides the pragmas nanChecks and infChecks to control whether the IEEE exceptions are ignored or trap a Nim exception:
{.nanChecks: on, infChecks: on.}
var a = 1.0
var b = 0.0
echo b / b # raises FloatInvalidOpDefect
echo a / b # raises FloatOverflowDefect
In the current implementation FloatDivByZeroDefect and FloatInexactDefect are never raised. FloatOverflowDefect is raised instead of FloatDivByZeroDefect. There is also a floatChecks pragma that is a short-cut for the combination of the nanChecks and infChecks pragmas. floatChecks are turned off by default.
The only operations that are affected by the floatChecks pragma are the +, -, *, / operators for floating-point types.
An implementation should always use the maximum precision available to evaluate floating-point values during semantic analysis; this means expressions like 0.09'f32 + 0.01'f32 == 0.09'f64 + 0.01'f64 that are evaluated during constant folding are true.
The boolean type is named bool in Nim and can be one of the two pre-defined values true and false. Conditions in while, if, elif, when-statements need to be of type bool.
This condition holds:
ord(false) == 0 and ord(true) == 1
The operators not, and, or, xor, <, <=, >, >=, !=, == are defined for the bool type. The and and or operators perform short-cut evaluation. Example:
while p != nil and p.name != "xyz":
  # p.name is not evaluated if p == nil
  p = p.next
The size of the bool type is one byte.
The character type is named char in Nim. Its size is one byte. Thus, it cannot represent a UTF-8 character, but only a part of it.
The Rune type is used for Unicode characters; it can represent any Unicode character. Rune is declared in the unicode module.
Enumeration types define a new type whose values consist of the ones specified. The values are ordered. Example:
type
  Direction = enum
    north, east, south, west
Now the following holds:
ord(north) == 0
ord(east) == 1
ord(south) == 2
ord(west) == 3

# Also allowed:
ord(Direction.west) == 3
The implied order is: north < east < south < west. The comparison operators can be used with enumeration types. Instead of north etc., the enum value can also be qualified with the enum type that it resides in, Direction.north.
For better interfacing to other programming languages, the fields of enum types can be assigned an explicit ordinal value. However, the ordinal values have to be in ascending order. A field whose ordinal value is not explicitly given is assigned the value of the previous field + 1.
An explicit ordered enum can have holes:
type
  TokenType = enum
    a = 2, b = 4, c = 89 # holes are valid
However, it is then not ordinal anymore, so it is impossible to use these enums as an index type for arrays. The procedures inc, dec, succ and pred are not available for them either.
The compiler supports the built-in stringify operator $ for enumerations. The stringify's result can be controlled by explicitly giving the string values to use:
type
  MyEnum = enum
    valueA = (0, "my value A"),
    valueB = "value B",
    valueC = 2,
    valueD = (3, "abc")
As can be seen from the example, it is possible to both specify a field's ordinal value and its string value by using a tuple. It is also possible to only specify one of them.
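To make the effect of these declarations concrete, the sketch below repeats the MyEnum declaration from the example above and checks what `$` and `ord` yield for each field. A field without an explicit string value is assumed to stringify to its identifier.

```nim
type
  MyEnum = enum
    valueA = (0, "my value A"),  # ordinal and string given
    valueB = "value B",          # only the string given
    valueC = 2,                  # only the ordinal given
    valueD = (3, "abc")

doAssert $valueA == "my value A"
doAssert $valueB == "value B"
doAssert ord(valueB) == 1        # previous ordinal + 1
doAssert $valueC == "valueC"     # falls back to the identifier
doAssert ord(valueD) == 3
doAssert $valueD == "abc"
```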
An enum can be marked with the pure pragma so that its fields are added to a special module-specific hidden scope that is only queried as the last attempt. Only non-ambiguous symbols are added to this scope. But one can always access these via type qualification written as MyEnum.value:
type
  MyEnum {.pure.} = enum
    valueA, valueB, valueC, valueD, amb

  OtherEnum {.pure.} = enum
    valueX, valueY, valueZ, amb

echo valueA     # MyEnum.valueA
echo amb        # Error: Unclear whether it's MyEnum.amb or OtherEnum.amb
echo MyEnum.amb # OK.
Enum value names are overloadable, much like routines. If both of the enums T and U have a member named foo, then the identifier foo corresponds to a choice between T.foo and U.foo. During overload resolution, the correct type of foo is decided from the context. If the type of foo is ambiguous, a static error will be produced.
type
  E1 = enum
    value1,
    value2
  E2 = enum
    value1,
    value2 = 4

const
  Lookuptable = [
    E1.value1: "1",
    # no need to qualify value2, known to be E1.value2
    value2: "2"
  ]

proc p(e: E1) =
  # disambiguation in 'case' statements:
  case e
  of value1: echo "A"
  of value2: echo "B"

p value2
In some cases, ambiguity of enums is resolved depending on the relation between the current scope and the scope the enums were defined in.
# a.nim
type Foo* = enum abc

# b.nim
import a
type Bar = enum abc
echo abc is Bar # true

block:
  type Baz = enum abc
  echo abc is Baz # true
To implement bit fields with enums see Bit fields.
All string literals are of the type string. A string in Nim is very similar to a sequence of characters. However, strings in Nim are both zero-terminated and have a length field. One can retrieve the length with the builtin len procedure; the length never counts the terminating zero.
The terminating zero cannot be accessed unless the string is converted to the cstring type first. The terminating zero assures that this conversion can be done in O(1) and without any allocations.
The assignment operator for strings always copies the string. The & operator concatenates strings.
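The copy semantics and the length rule stated above can be sketched as follows; the variable names are illustrative only.

```nim
var a = "abc"
var b = a            # assignment copies the string
b.add 'd'            # modifying the copy ...

doAssert a == "abc"  # ... leaves the original untouched
doAssert b == "abcd"

# `&` builds a new string from both operands.
doAssert a & b == "abcabcd"

# `len` never counts the terminating zero.
doAssert len(a) == 3
```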
Most native Nim types support conversion to strings with the special $ proc. When calling the echo proc, for example, the built-in stringify operation for the parameter is called:
echo 3 # calls `$` for `int`
Whenever a user creates a specialized object, implementing this procedure provides its string representation.
type
  Person = object
    name: string
    age: int

proc `$`(p: Person): string = # `$` always returns a string
  result = p.name & " is " &
          $p.age & # we *need* the `$` in front of p.age which
                   # is natively an integer to convert it to
                   # a string
          " years old."
While $p.name can also be used, the $ operation on a string does nothing. Note that we cannot rely on automatic conversion from an int to a string like we can for the echo proc.
Strings are compared by their lexicographical order. All comparison operators are available. Strings can be indexed like arrays (lower bound is 0). Unlike arrays, they can be used in case statements:
case paramStr(i)
of "-v": incl(options, optVerbose)
of "-h", "-?": incl(options, optHelp)
else: write(stdout, "invalid command line option!\n")
Per convention, all strings are UTF-8 strings, but this is not enforced. For example, when reading strings from binary files, they are merely a sequence of bytes. The index operation s[i] means the i-th char of s, not the i-th unichar. The iterator runes from the unicode module can be used for iteration over all Unicode characters.
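The difference between byte indexing and Unicode iteration can be sketched like this. The example string is an assumption chosen for illustration; it uses the `\u` escape so that the result does not depend on the encoding of the source file.

```nim
import std/unicode

# "héllo": the é (U+00E9) occupies two bytes in UTF-8.
let s = "h\u00E9llo"

doAssert s.len == 6        # length in bytes (chars)
doAssert s.runeLen == 5    # length in Unicode code points

doAssert s[0] == 'h'       # s[i] is the i-th byte, not the i-th rune

var count = 0
for r in s.runes:          # iterates over Rune values
  inc count
doAssert count == 5
```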
The cstring type, meaning compatible string, is the native representation of a string for the compilation backend. For the C backend the cstring type represents a pointer to a zero-terminated char array compatible with the type char* in ANSI C. Its primary purpose lies in easy interfacing with C. The index operation s[i] means the i-th char of s; however, no bounds checking for cstring is performed, making the index operation unsafe.
A Nim string is implicitly convertible to cstring for convenience. If a Nim string is passed to a C-style variadic proc, it is implicitly converted to cstring too:
proc printf(formatstr: cstring) {.importc: "printf", varargs,
                                  header: "<stdio.h>".}

printf("This works %s", "as expected")
Even though the conversion is implicit, it is not safe: The garbage collector does not consider a cstring to be a root and may collect the underlying memory. For this reason, the implicit conversion will be removed in future releases of the Nim compiler. Certain idioms like conversion of a const string to cstring are safe and will remain allowed.
A $ proc is defined for cstrings that returns a string. Thus, to get a Nim string from a cstring:
var str: string = "Hello!"
var cstr: cstring = str
var newstr: string = $cstr
cstring literals shouldn't be modified.
var x = cstring"literals"
x[1] = 'A' # This is wrong!!!
If the cstring originates from regular memory (not read-only memory), it can be modified:
var x = "123456"
prepareMutation(x) # call `prepareMutation` before modifying the strings
var s: cstring = cstring(x)
s[0] = 'u' # This is ok
cstring values may also be used in case statements like strings.
A variable of a structured type can hold multiple values at the same time. Structured types can be nested to unlimited levels. Arrays, sequences, tuples, objects, and sets belong to the structured types.
Arrays are a homogeneous type, meaning that each element in the array has the same type. Arrays always have a fixed length specified as a constant expression (except for open arrays). They can be indexed by any ordinal type. A parameter A may be an open array, in which case it is indexed by integers from 0 to len(A)-1. An array expression may be constructed by the array constructor []. The element type of this array expression is inferred from the type of the first element. All other elements need to be implicitly convertible to this type.
An array type can be defined using the array[size, T] syntax, or using array[lo..hi, T] for arrays that start at an index other than zero.
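A brief sketch of the two declaration forms and of indexing by an ordinal type other than int. The type `Color` and all variable names are assumptions made for this illustration.

```nim
var a: array[3, int]        # indexed 0..2
var b: array[1..3, int]     # indexed 1..3

doAssert low(a) == 0 and high(a) == 2
doAssert low(b) == 1 and high(b) == 3
doAssert len(a) == len(b)

b[1] = 10                   # first valid index of b is 1
doAssert b == [10, 0, 0]

# Any ordinal type can serve as the index type, for example an enum.
type Color = enum red, green, blue
let names: array[Color, string] = [red: "r", green: "g", blue: "b"]
doAssert names[green] == "g"
```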
Sequences are similar to arrays but of dynamic length which may change during runtime (like strings). Sequences are implemented as growable arrays, allocating pieces of memory as items are added. A sequence S is always indexed by integers from 0 to len(S)-1 and its bounds are checked. Sequences can be constructed by the array constructor [] in conjunction with the array to sequence operator @. Another way to allocate space for a sequence is to call the built-in newSeq procedure.
A sequence may be passed to a parameter that is of type open array.
Example:
type
  IntArray = array[0..5, int] # an array that is indexed with 0..5
  IntSeq = seq[int] # a sequence of integers
var
  x: IntArray
  y: IntSeq
x = [1, 2, 3, 4, 5, 6]  # [] is the array constructor
y = @[1, 2, 3, 4, 5, 6] # the @ turns the array into a sequence

let z = [1.0, 2, 3, 4] # the type of z is array[0..3, float]
The lower bound of an array or sequence can be obtained with the built-in proc low(), the upper bound with high(). The length can be obtained with len(). low() for a sequence or an open array always returns 0, as this is the first valid index. One can append elements to a sequence with the add() proc or the & operator, and remove (and get) the last element of a sequence with the pop() proc.
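The procs mentioned above can be exercised on a small sequence as follows; this is only a sketch with illustrative variable names.

```nim
var s = @[1, 2, 3]
doAssert low(s) == 0 and high(s) == 2 and len(s) == 3

s.add 4                      # appends in place
doAssert s == @[1, 2, 3, 4]

let t = s & 5                # `&` returns a new sequence
doAssert t == @[1, 2, 3, 4, 5]
doAssert len(s) == 4         # s itself is unchanged by `&`

doAssert s.pop() == 4        # removes and returns the last element
doAssert s == @[1, 2, 3]

let u = newSeq[int](2)       # two zero-initialized elements
doAssert u == @[0, 0]
```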
The notationx[i] can be used to access the i-th element ofx.
Arrays are always bounds checked (statically or at runtime). These checks can be disabled via pragmas or invoking the compiler with the --boundChecks:off command-line switch.
An array constructor can have explicit indexes for readability:
type
  Values = enum
    valA, valB, valC

const
  lookupTable = [
    valA: "A",
    valB: "B",
    valC: "C"
  ]
If an index is left out,succ(lastIndex) is used as the index value:
type
  Values = enum
    valA, valB, valC, valD, valE

const
  lookupTable = [
    valA: "A",
    "B",
    valC: "C",
    "D", "e"
  ]
Often fixed size arrays turn out to be too inflexible; procedures should be able to deal with arrays of different sizes. The openarray type allows this; it can only be used for parameters. Open arrays are always indexed with an int starting at position 0. The len, low and high operations are available for open arrays too. Any array with a compatible base type can be passed to an open array parameter, the index type does not matter. In addition to arrays, sequences can also be passed to an open array parameter.
Theopenarray type cannot be nested: multidimensional open arrays are not supported because this is seldom needed and cannot be done efficiently.
proc testOpenArray(x: openArray[int]) = echo repr(x)

testOpenArray([1,2,3])  # array[]
testOpenArray(@[1,2,3]) # seq[]
A varargs parameter is an open array parameter that additionally allows a variable number of arguments to be passed to a procedure. The compiler converts the list of arguments to an array implicitly:
proc myWriteln(f: File, a: varargs[string]) =
  for s in items(a):
    write(f, s)
  write(f, "\n")

myWriteln(stdout, "abc", "def", "xyz")
# is transformed to:
myWriteln(stdout, ["abc", "def", "xyz"])
This transformation is only done if the varargs parameter is the last parameter in the procedure header. It is also possible to perform type conversions in this context:
proc myWriteln(f: File, a: varargs[string, `$`]) =
  for s in items(a):
    write(f, s)
  write(f, "\n")

myWriteln(stdout, 123, "abc", 4.0)
# is transformed to:
myWriteln(stdout, [$123, $"abc", $4.0])
In this example $ is applied to any argument that is passed to the parameter a. (Note that $ applied to strings is a nop.)
Note that an explicit array constructor passed to a varargs parameter is not wrapped in another implicit array construction:
proc takeV[T](a: varargs[T]) = discard

takeV([123, 2, 1]) # takeV's T is "int", not "array of int"
varargs[typed] is treated specially: It matches a variable list of arguments of arbitrary type but always constructs an implicit array. This is required so that the builtin echo proc does what is expected:
proc echo*(x: varargs[typed, `$`]) {...}

echo @[1, 2, 3]
# prints "@[1, 2, 3]" and not "123"
The UncheckedArray[T] type is a special kind of array where its bounds are not checked. This is often useful to implement customized flexibly sized arrays. Additionally, an unchecked array is translated into a C array of undetermined size:
type
  MySeq = object
    len, cap: int
    data: UncheckedArray[int]
Produces roughly this C code:
typedef struct {
  NI len;
  NI cap;
  NI data[];
} MySeq;
The base type of the unchecked array may not contain any GC'ed memory but this is currently not checked.
Future directions: GC'ed memory should be allowed in unchecked arrays and there should be an explicit annotation of how the GC is to determine the runtime size of the array.
A variable of a tuple or object type is a heterogeneous storage container. A tuple or object defines various named fields of a type. A tuple also defines a lexicographic order of the fields. Tuples are meant to be heterogeneous storage types with few abstractions. The () syntax can be used to construct tuples. The order of the fields in the constructor must match the order of the tuple's definition. Different tuple-types are equivalent if they specify the same fields of the same type in the same order. The names of the fields also have to be the same.
type
  Person = tuple[name: string, age: int] # type representing a person:
                                         # it consists of a name and an age.
var person: Person
person = (name: "Peter", age: 30)
assert person.name == "Peter"
# the same, but less readable:
person = ("Peter", 30)
assert person[0] == "Peter"
assert Person is (string, int)
assert (string, int) is Person
assert Person isnot tuple[other: string, age: int] # `other` is a different identifier
A tuple with one unnamed field can be constructed with the parentheses and a trailing comma:
proc echoUnaryTuple(a: (int,)) =
  echo a[0]

echoUnaryTuple (1,)
In fact, a trailing comma is allowed for every tuple construction.
The implementation aligns the fields for the best access performance. The alignment is compatible with the way the C compiler does it.
For consistency with object declarations, tuples in a type section can also be defined with indentation instead of []:
type
  Person = tuple   # type representing a person
    name: string   # a person consists of a name
    age: Natural   # and an age
Objects provide many features that tuples do not. Objects provide inheritance and the ability to hide fields from other modules. Objects with inheritance enabled have information about their type at runtime so that the of operator can be used to determine the object's type. The of operator is similar to the instanceof operator in Java.
type
  Person = object of RootObj
    name*: string  # the * means that `name` is accessible from other modules
    age: int       # no * means that the field is hidden

  Student = ref object of Person # a student is a person
    id: int                      # with an id field

var
  student: Student
  person: Person
assert(student of Student) # is true
assert(student of Person) # also true
Object fields that should be visible from outside the defining module have to be marked by *. In contrast to tuples, different object types are never equivalent, they are nominal types whereas tuples are structural. Objects that have no ancestor are implicitly final and thus have no hidden type information. One can use the inheritable pragma to introduce new object roots apart from system.RootObj.
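As a sketch of the last two points, the following example introduces a new object root with the inheritable pragma and contrasts nominal object typing with structural tuple typing. All type names (`Base`, `Derived`, `A`, `B`) are hypothetical and exist only for this illustration.

```nim
type
  Base {.inheritable.} = object   # a new root, not derived from RootObj
    id: int
  Derived = object of Base
    name: string

  # Two objects with identical fields are still different types ...
  A = object
    x: int
  B = object
    x: int

# ... whereas tuples with identical fields are equivalent.
doAssert A isnot B
doAssert tuple[x: int] is tuple[x: int]

# Runtime type information is available below the new root.
let d: ref Base = (ref Derived)(id: 1, name: "example")
doAssert d of Derived
doAssert d.id == 1
```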
type
  Person = object # example of a final object
    name*: string
    age: int

  Student = ref object of Person # Error: inheritance only works with non-final objects
    id: int
The assignment operator for tuples and objects copies each component. The methods to override this copying behavior are described here.
Objects can also be created with an object construction expression that has the syntax T(fieldA: valueA, fieldB: valueB, ...) where T is an object type or a ref object type:
type
  Student = object
    name: string
    age: int
  PStudent = ref Student
var a1 = Student(name: "Anton", age: 5)
var a2 = PStudent(name: "Anton", age: 5)
# this also works directly:
var a3 = (ref Student)(name: "Anton", age: 5)
# not all fields need to be mentioned, and they can be mentioned out of order:
var a4 = Student(age: 5)
Note that, unlike tuples, objects require the field names along with their values. For a ref object type system.new is invoked implicitly.
Often an object hierarchy is an overkill in certain situations where simple variant types are needed. Object variants are tagged unions discriminated via an enumerated type used for runtime type flexibility, mirroring the concepts of sum types and algebraic data types (ADTs) as found in other languages.
An example:
# This is an example of how an abstract syntax tree could be modelled in Nim
type
  NodeKind = enum  # the different node types
    nkInt,          # a leaf with an integer value
    nkFloat,        # a leaf with a float value
    nkString,       # a leaf with a string value
    nkAdd,          # an addition
    nkSub,          # a subtraction
    nkIf            # an if statement
  Node = ref NodeObj
  NodeObj = object
    case kind: NodeKind  # the `kind` field is the discriminator
    of nkInt: intVal: int
    of nkFloat: floatVal: float
    of nkString: strVal: string
    of nkAdd, nkSub:
      leftOp, rightOp: Node
    of nkIf:
      condition, thenPart, elsePart: Node

# create a new case object:
var n = Node(kind: nkIf, condition: nil)
# accessing n.thenPart is valid because the `nkIf` branch is active:
n.thenPart = Node(kind: nkFloat, floatVal: 2.0)

# the following statement raises a `FieldDefect` exception, because
# n.kind's value does not fit and the `nkString` branch is not active:
n.strVal = ""

# invalid: would change the active object branch:
n.kind = nkInt

var x = Node(kind: nkAdd, leftOp: Node(kind: nkInt, intVal: 4),
                          rightOp: Node(kind: nkInt, intVal: 2))
# valid: does not change the active object branch:
x.kind = nkSub
As can be seen from the example, an advantage over an object hierarchy is that no casting between different object types is needed. Yet, access to invalid object fields raises an exception.
The syntax of case in an object declaration follows closely the syntax of the case statement: The branches in a case section may be indented too.
In the example, the kind field is called the discriminator: For safety, its address cannot be taken and assignments to it are restricted: The new value must not lead to a change of the active object branch. Also, when the fields of a particular branch are specified during object construction, the corresponding discriminator value must be specified as a constant expression.
Instead of changing the active object branch, replace the old object in memory with a new one completely:
var x = Node(kind: nkAdd, leftOp: Node(kind: nkInt, intVal: 4),
                          rightOp: Node(kind: nkInt, intVal: 2))
# change the node's contents:
x[] = NodeObj(kind: nkString, strVal: "abc")
Starting with version 0.20, system.reset cannot be used anymore to support object branch changes as this never was completely memory safe.
As a special rule, the discriminator kind can also be bounded using a case statement. If possible values of the discriminator variable in a case statement branch are a subset of discriminator values for the selected object branch, the initialization is considered valid. This analysis only works for immutable discriminators of an ordinal type and disregards elif branches. For discriminator values with a range type, the compiler checks if the entire range of possible values for the discriminator value is valid for the chosen object branch.
A small example:
let unknownKind = nkSub

# invalid: unsafe initialization because the kind field is not statically known:
var y = Node(kind: unknownKind, strVal: "y")

var z = Node()
case unknownKind
of nkAdd, nkSub:
  # valid: possible values of this branch are a subset of nkAdd/nkSub object branch:
  z = Node(kind: unknownKind, leftOp: Node(), rightOp: Node())
else:
  echo "ignoring: ", unknownKind

# also valid, since unknownKindBounded can only contain the values nkAdd or nkSub
let unknownKindBounded = range[nkAdd..nkSub](unknownKind)
z = Node(kind: unknownKindBounded, leftOp: Node(), rightOp: Node())
Some restrictions for case objects can be disabled via a {.cast(uncheckedAssign).} section:
type
  TokenKind* = enum
    strLit, intLit
  Token = object
    case kind*: TokenKind
    of strLit:
      s*: string
    of intLit:
      i*: int64

proc passToVar(x: var TokenKind) = discard

var t = Token(kind: strLit, s: "abc")

{.cast(uncheckedAssign).}:
  # inside the 'cast' section it is allowed to pass 't.kind' to a 'var T' parameter:
  passToVar(t.kind)

  # inside the 'cast' section it is allowed to set field 's' even though the
  # constructed 'kind' field has an unknown value:
  t = Token(kind: t.kind, s: "abc")

  # inside the 'cast' section it is allowed to assign to the 't.kind' field directly:
  t.kind = intLit
Object fields are allowed to have a constant default value. The type of field can be omitted if a default value is given.
type
  Foo = object
    a: int = 2
    b: float = 3.14
    c = "I can have a default value"

  Bar = ref object
    a: int = 2
    b: float = 3.14
    c = "I can have a default value"
The explicit initialization uses these defaults which includes an object created with an object construction expression or the procedure default; a ref object created with an object construction expression or the procedure new; an array or a tuple with a subtype which has a default created with the procedure default.
type
  Foo = object
    a: int = 2
    b = 3.0
  Bar = ref object
    a: int = 2
    b = 3.0

block: # created with an object construction expression
  let x = Foo()
  assert x.a == 2 and x.b == 3.0

  let y = Bar()
  assert y.a == 2 and y.b == 3.0

block: # created with the procedure `default`
  let x = default(Foo)
  assert x.a == 2 and x.b == 3.0

  let y = default(array[1, Foo])
  assert y[0].a == 2 and y[0].b == 3.0

  let z = default(tuple[x: Foo])
  assert z.x.a == 2 and z.x.b == 3.0

block: # created with the procedure `new`
  let y = new Bar
  assert y.a == 2 and y.b == 3.0
or equivalent. When constructing a set with signed integer literals, the set's base type is defined to be in the range 0..DefaultSetElements-1 where DefaultSetElements is currently always 2^8. The maximum range length for the base type of a set is MaxSetElements which is currently always 2^16. Types with a bigger range length are coerced into the range 0..MaxSetElements-1.
The reason is that sets are implemented as high performance bit vectors. Attempting to declare a set with a larger type will result in an error:
var s: set[int64] # Error: set is too large; use `std/sets` for ordinal types
                  # with more than 2^16 elements
Note: Nim also offers hash sets (which you need to import with import std/sets), which have no such restrictions.
Sets can be constructed via the set constructor: {} is the empty set. The empty set is type compatible with any concrete set type. The constructor can also be used to include elements (and ranges of elements):
type
  CharSet = set[char]
var
  x: CharSet
x = {'a'..'z', '0'..'9'} # This constructs a set that contains the
                         # letters from 'a' to 'z' and the digits
                         # from '0' to '9'
The module `std/setutils` provides a way to initialize a set from an iterable:
import std/setutils

let myString = "hello world" # hypothetical example input
let uniqueChars = myString.toSet
These operations are supported by sets:
operation | meaning |
---|---|
A + B | union of two sets |
A * B | intersection of two sets |
A - B | difference of two sets (A without B's elements) |
A == B | set equality |
A <= B | subset relation (A is subset of B or equal to B) |
A < B | strict subset relation (A is a proper subset of B) |
e in A | set membership (A contains element e) |
e notin A | A does not contain element e |
contains(A, e) | A contains element e |
card(A) | the cardinality of A (number of elements in A) |
incl(A, elem) | same as A = A + {elem} |
excl(A, elem) | same as A = A - {elem} |
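The operations in the table above can be tried out on two small `set[char]` values; the names `a`, `b` and `c` are only illustrative.

```nim
let a = {'a', 'b', 'c'}
let b = {'b', 'c', 'd'}

doAssert a + b == {'a'..'d'}          # union
doAssert a * b == {'b', 'c'}          # intersection
doAssert a - b == {'a'}               # difference

doAssert {'b'} <= a and {'b'} < a     # subset and proper subset
doAssert not (a <= b)

doAssert 'a' in a and 'd' notin a     # membership
doAssert contains(b, 'd')
doAssert card(a) == 3

var c = a
c.incl 'z'                            # same as c = c + {'z'}
c.excl 'a'                            # same as c = c - {'a'}
doAssert c == {'b', 'c', 'z'}
```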
Sets are often used to define a type for the flags of a procedure. This is a cleaner (and type safe) solution than defining integer constants that have to be or'ed together.
Enum, sets and casting can be used together as in:
type
  MyFlag* {.size: sizeof(cint).} = enum
    A
    B
    C
    D
  MyFlags = set[MyFlag]

proc toNum(f: MyFlags): int = cast[cint](f)
proc toFlags(v: int): MyFlags = cast[MyFlags](v)

assert toNum({}) == 0
assert toNum({A}) == 1
assert toNum({D}) == 8
assert toNum({A, C}) == 5
assert toFlags(0) == {}
assert toFlags(7) == {A, B, C}
Note how the set turns enum values into powers of 2.
If using enums and sets with C, use distinct cint.
For interoperability with C see also the bitsize pragma.
References (similar to pointers in other programming languages) are a way to introduce many-to-one relationships. This means different references can point to and modify the same location in memory (also called aliasing).
Nim distinguishes between traced and untraced references. Untraced references are also called pointers. Traced references point to objects of a garbage-collected heap, untraced references point to manually allocated objects or objects somewhere else in memory. Thus, untraced references are unsafe. However, for certain low-level operations (accessing the hardware) untraced references are unavoidable.
Traced references are declared with the ref keyword, untraced references are declared with the ptr keyword. In general, a ptr T is implicitly convertible to the pointer type.
An empty subscript [] notation can be used to de-refer a reference, the addr procedure returns the address of an item. An address is always an untraced reference. Thus, the usage of addr is an unsafe feature.
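A minimal sketch of `addr`, the `[]` dereference, and the implicit `ptr T` to `pointer` conversion described above; the variable names are illustrative and the pointer is only used while `x` is alive.

```nim
var x = 5

# `addr` yields an untraced reference (ptr int) to x.
let p: ptr int = addr x
p[] = 6                    # write through the pointer
doAssert x == 6

# A traced reference is allocated with `new` and dereferenced the same way.
let r = new int
r[] = 7
doAssert r[] == 7

# A `ptr T` converts implicitly to the untyped `pointer` type.
let raw: pointer = p
doAssert raw != nil
```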
The . (access a tuple/object field operator) and [] (array/string/sequence index operator) operators perform implicit dereferencing operations for reference types:

    type
      Node = ref NodeObj
      NodeObj = object
        le, ri: Node
        data: int

    var n: Node
    new(n)
    n.data = 9
    # no need to write n[].data; in fact n[].data is highly discouraged!

In order to simplify structural type checking, recursive tuples are not valid:

    # invalid recursion
    type MyTuple = tuple[a: ref MyTuple]

Likewise T = ref T is an invalid type.
As a syntactical extension, object types can be anonymous if declared in a type section via the ref object or ptr object notations. This feature is useful if an object should only gain reference semantics:

    type
      Node = ref object
        le, ri: Node
        data: int
To allocate a new traced object, the built-in procedure new has to be used. To deal with untraced memory, the procedures alloc, dealloc and realloc can be used. The documentation of the system module contains further information.
If a reference points to nothing, it has the value nil. nil is the default value for all ref and ptr types. The nil value can also be used like any other literal value. For example, it can be used in an assignment like myRef = nil.
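A short sketch of how nil behaves as a default and as a literal. The type `Item` is a hypothetical example type.

```nim
type
  Item = ref object      # hypothetical example type
    data: int

var it: Item             # no initializer: the default value is nil
doAssert it == nil
doAssert it.isNil

it = Item(data: 1)       # object construction allocates a traced object
doAssert it != nil
doAssert it.data == 1

it = nil                 # nil used like any other literal value
doAssert it.isNil
```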
Dereferencing nil is an unrecoverable fatal runtime error (and not a panic).
A successful dereferencing operation p[] implies that p is not nil. This can be exploited by the implementation to optimize code like:

    p[].field = 3
    if p != nil:
      # if p were nil, `p[]` would have caused a crash already,
      # so we know `p` is always not nil here.
      action()

Into:

    p[].field = 3
    action()

Note: This is not comparable to C's "undefined behavior" for dereferencing NULL pointers.
Special care has to be taken if an untraced object contains traced objects like traced references, strings, or sequences: in order to free everything properly, the built-in procedure reset has to be called before freeing the untraced memory manually:

    type
      Data = tuple[x, y: int, s: string]

    # allocate memory for Data on the heap:
    var d = cast[ptr Data](alloc0(sizeof(Data)))

    # create a new string on the garbage collected heap:
    d.s = "abc"

    # tell the GC that the string is not needed anymore:
    reset(d.s)

    # free the memory:
    dealloc(d)

Without the reset call the memory allocated for the d.s string would never be freed. The example also demonstrates two important features for low-level programming: the sizeof proc returns the size of a type or value in bytes. The cast operator can circumvent the type system: the compiler is forced to treat the result of the alloc0 call (which returns an untyped pointer) as if it had the type ptr Data. Casting should only be done if it is unavoidable: it breaks type safety and bugs can lead to mysterious crashes.
Note: The example only works because the memory is initialized to zero (alloc0 instead of alloc does this): d.s is thus initialized to binary zero which the string assignment can handle. One needs to know low-level details like this when mixing garbage-collected data with unmanaged memory.
A procedural type is internally a pointer to a procedure. nil is an allowed value for a variable of a procedural type.
Examples:

    proc printItem(x: int) = ...

    proc forEach(c: proc (x: int) {.cdecl.}) =
      ...

    forEach(printItem)  # this will NOT compile because calling conventions differ

    type
      OnMouseMove = proc (x, y: int) {.closure.}

    proc onMouseMove(mouseX, mouseY: int) =
      # has default calling convention
      echo "x: ", mouseX, " y: ", mouseY

    proc setOnMouseMove(mouseMoveEvent: OnMouseMove) = discard

    # ok, 'onMouseMove' has the default calling convention, which is compatible
    # to 'closure':
    setOnMouseMove(onMouseMove)

A subtle issue with procedural types is that the calling convention of the procedure influences the type compatibility: procedural types are only compatible if they have the same calling convention. As a special extension, a procedure of the calling convention nimcall can be passed to a parameter that expects a proc of the calling convention closure.
Nim supports these calling conventions:
Most calling conventions exist only for the Windows 32-bit platform.
The default calling convention is nimcall, unless it is an inner proc (a proc inside of a proc). For an inner proc an analysis is performed whether it accesses its environment. If it does so, it has the calling convention closure, otherwise it has the calling convention nimcall.
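The difference between an inner proc that captures its environment and a plain top-level proc can be sketched as follows. The names `makeCounter`, `twice` and `apply` are hypothetical and exist only for this illustration.

```nim
proc makeCounter(): proc (): int =
  var count = 0
  # `next` accesses `count` from the enclosing proc, so the analysis
  # gives it the calling convention closure.
  proc next(): int =
    inc count
    result = count
  result = next

# A top-level proc has the default calling convention nimcall.
proc twice(x: int): int = 2 * x

proc apply(f: proc (x: int): int {.closure.}, v: int): int = f(v)

let counter = makeCounter()
doAssert counter() == 1
doAssert counter() == 2

# A nimcall proc may be passed where a closure is expected.
doAssert apply(twice, 21) == 42
```

Each call to `makeCounter` creates a fresh environment, so independent counters do not share state.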
A distinct type is a new type derived from a base type that is incompatible with its base type. In particular, it is an essential property of a distinct type that it does not imply a subtype relation between it and its base type. Explicit type conversions from a distinct type to its base type and vice versa are allowed. See also distinctBase to get the reverse operation.
A distinct type is an ordinal type if its base type is an ordinal type.
A distinct type can be used to model different physical units with a numerical base type, for example. The following example models currencies.
Different currencies should not be mixed in monetary calculations. Distinct types are a perfect tool to model different currencies:

    type
      Dollar = distinct int
      Euro = distinct int

    var
      d: Dollar
      e: Euro

    echo d + 12
    # Error: cannot add a number with no unit and a `Dollar`

Unfortunately, d + 12.Dollar is not allowed either, because + is defined for int (among others), not for Dollar. So a + for dollars needs to be defined:

    proc `+` (x, y: Dollar): Dollar =
      result = Dollar(int(x) + int(y))

It does not make sense to multiply a dollar with a dollar, but with a number without unit; and the same holds for division:

    proc `*` (x: Dollar, y: int): Dollar =
      result = Dollar(int(x) * y)

    proc `*` (x: int, y: Dollar): Dollar =
      result = Dollar(x * int(y))

    proc `div` ...

This quickly gets tedious. The implementations are trivial and the compiler should not generate all this code only to optimize it away later - after all + for dollars should produce the same binary code as + for ints. The pragma borrow has been designed to solve this problem; in principle, it generates the above trivial implementations:

    proc `*` (x: Dollar, y: int): Dollar {.borrow.}
    proc `*` (x: int, y: Dollar): Dollar {.borrow.}
    proc `div` (x: Dollar, y: int): Dollar {.borrow.}

The borrow pragma makes the compiler use the same implementation as the proc that deals with the distinct type's base type, so no code is generated.
But it seems all this boilerplate code needs to be repeated for the Euro currency. This can be solved with templates.

    template additive(typ: typedesc) =
      proc `+` *(x, y: typ): typ {.borrow.}
      proc `-` *(x, y: typ): typ {.borrow.}

      # unary operators:
      proc `+` *(x: typ): typ {.borrow.}
      proc `-` *(x: typ): typ {.borrow.}

    template multiplicative(typ, base: typedesc) =
      proc `*` *(x: typ, y: base): typ {.borrow.}
      proc `*` *(x: base, y: typ): typ {.borrow.}
      proc `div` *(x: typ, y: base): typ {.borrow.}
      proc `mod` *(x: typ, y: base): typ {.borrow.}

    template comparable(typ: typedesc) =
      proc `<` * (x, y: typ): bool {.borrow.}
      proc `<=` * (x, y: typ): bool {.borrow.}
      proc `==` * (x, y: typ): bool {.borrow.}

    template defineCurrency(typ, base: untyped) =
      type
        typ* = distinct base
      additive(typ)
      multiplicative(typ, base)
      comparable(typ)

    defineCurrency(Dollar, int)
    defineCurrency(Euro, int)

The borrow pragma can also be used to annotate the distinct type to allow certain builtin operations to be lifted:

    type
      Foo = object
        a, b: int
        s: string

      Bar {.borrow: `.`.} = distinct Foo

    var bb: ref Bar
    new bb
    # field access now valid
    bb.a = 90
    bb.s = "abc"

Currently, only the dot accessor can be borrowed in this way.
An SQL statement that is passed from Nim to an SQL database might be modeled as a string. However, using string templates and filling in the values is vulnerable to the famous SQL injection attack:

    import std/strutils

    proc query(db: DbHandle, statement: string) = ...

    var
      username: string

    db.query("SELECT FROM users WHERE name = '$1'" % username)
    # Horrible security hole, but the compiler does not mind!

This can be avoided by distinguishing strings that contain SQL from strings that don't. Distinct types provide a means to introduce a new string type SQL that is incompatible with string:

    type
      SQL = distinct string

    proc query(db: DbHandle, statement: SQL) = ...

    var
      username: string

    db.query("SELECT FROM users WHERE name = '$1'" % username)
    # Static error: `query` expects an SQL string!

It is an essential property of abstract types that they do not imply a subtype relation between the abstract type and its base type. Explicit type conversions from string to SQL are allowed:

    import std/[strutils, sequtils]

    proc properQuote(s: string): SQL =
      # quotes a string properly for an SQL statement
      return SQL(s)

    proc `%` (frmt: SQL, values: openarray[string]): SQL =
      # quote each argument:
      let v = values.mapIt(properQuote(it))
      # we need a temporary type for the type conversion :-(
      type StrSeq = seq[string]
      # call strutils.`%`:
      result = SQL(string(frmt) % StrSeq(v))

    db.query("SELECT FROM users WHERE name = '$1'".SQL % [username])

Now we have compile-time checking against SQL injection attacks. Since "".SQL is transformed to SQL("") no new syntax is needed for nice looking SQL string literals. The hypothetical SQL type actually exists in the library as the SqlQuery type of modules like db_sqlite.
The auto type can only be used for return types and parameters. For return types it causes the compiler to infer the type from the routine body:

    proc returnsInt(): auto = 1984

For parameters it currently creates implicitly generic routines:

    proc foo(a, b: auto) = discard

Is the same as:

    proc foo[T1, T2](a: T1, b: T2) = discard

However, later versions of the language might change this to mean "infer the parameters' types from the body". Then the above foo would be rejected as the parameters' types can not be inferred from an empty discard statement.
The following section defines several relations on types that are needed to describe the type checking done by the compiler.
Nim uses structural type equivalence for most types. Name equivalence is used only for objects, enumerations, distinct types and generic types.
If object a inherits from b, a is a subtype of b.
This subtype relation is extended to the types var, ref, ptr. If A is a subtype of B and A and B are object types, then var A, ref A and ptr A are implicitly convertible to var B, ref B and ptr B, respectively.
Note: One of the above pointer-indirections is required for assignment from a subtype to its parent type to prevent "object slicing".
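The following sketch illustrates the relation for ref types. The types `Animal` and `Dog` and the proc `describe` are hypothetical names used only here.

```nim
type
  Animal = ref object of RootObj   # hypothetical base type
    name: string
  Dog = ref object of Animal       # hypothetical subtype
    tricks: int

proc describe(a: Animal): string = "animal: " & a.name

let d = Dog(name: "Rex", tricks: 3)

# A `Dog` reference converts implicitly to an `Animal` reference.
doAssert describe(d) == "animal: Rex"

# Thanks to the indirection nothing is sliced off:
# `a` still refers to the complete Dog object.
let a: Animal = d
doAssert a of Dog
doAssert Dog(a).tricks == 3
```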
A type a is implicitly convertible to type b iff the following algorithm returns true:

    proc isImplicitlyConvertible(a, b: PType): bool =
      if isSubtype(a, b):
        return true
      if isIntLiteral(a):
        return b in {int8, int16, int32, int64, int, uint, uint8, uint16,
                     uint32, uint64, float32, float64}
      case a.kind
      of int:     result = b in {int32, int64}
      of int8:    result = b in {int16, int32, int64, int}
      of int16:   result = b in {int32, int64, int}
      of int32:   result = b in {int64, int}
      of uint:    result = b in {uint32, uint64}
      of uint8:   result = b in {uint16, uint32, uint64}
      of uint16:  result = b in {uint32, uint64}
      of uint32:  result = b in {uint64}
      of float32: result = b in {float64}
      of float64: result = b in {float32}
      of seq:
        result = b == openArray and typeEquals(a.baseType, b.baseType)
      of array:
        result = b == openArray and typeEquals(a.baseType, b.baseType)
        if a.baseType == char and a.indexType.rangeA == 0:
          result = b == cstring
      of cstring, ptr:
        result = b == pointer
      of string:
        result = b == cstring
      of proc:
        result = typeEquals(a, b) or compatibleParametersAndEffects(a, b)

We used the predicate typeEquals(a, b) for the "type equality" property and the predicate isSubtype(a, b) for the "subtype relation". compatibleParametersAndEffects(a, b) is currently not specified.
Implicit conversions are also performed for Nim's range type constructor.
Let a0, b0 be of type T.
Let A = range[a0..b0] be the argument's type, F the formal parameter's type. Then an implicit conversion from A to F exists if a0 >= low(F) and b0 <= high(F) and both T and F are signed integers or if both are unsigned integers.
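As a sketch of this rule, a range whose bounds fit into int8 should be accepted where an int8 is expected, since both the range's base type (int) and int8 are signed. The names `Small` and `takesInt8` are hypothetical.

```nim
type
  Small = range[0..100]     # hypothetical range type, base type int

proc takesInt8(x: int8): int = int(x)   # hypothetical proc

let s: Small = 42

# 0 >= low(int8) and 100 <= high(int8), and both types are signed,
# so `s` is implicitly converted to int8 here.
doAssert takesInt8(s) == 42
```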
A type a is explicitly convertible to type b iff the following algorithm returns true:

    proc isIntegralType(t: PType): bool =
      result = isOrdinal(t) or t.kind in {float, float32, float64}

    proc isExplicitlyConvertible(a, b: PType): bool =
      result = false
      if isImplicitlyConvertible(a, b): return true
      if typeEquals(a, b): return true
      if a == distinct and typeEquals(a.baseType, b): return true
      if b == distinct and typeEquals(b.baseType, a): return true
      if isIntegralType(a) and isIntegralType(b): return true
      if isSubtype(a, b) or isSubtype(b, a): return true

The convertible relation can be relaxed by a user-defined type converter.

    converter toInt(x: char): int = result = ord(x)

    var
      x: int
      chr: char = 'a'

    # implicit conversion magic happens here
    x = chr
    echo x # => 97
    # one can use the explicit form too
    x = chr.toInt
    echo x # => 97

The type conversion T(a) is an L-value if a is an L-value and typeEqualsOrDistinct(T, typeof(a)) holds.
An expression b can be assigned to an expression a iff a is an l-value and isImplicitlyConvertible(b.typ, a.typ) holds.
In a call p(args) where p may refer to more than one candidate, it is said to be a symbol choice. Overload resolution will attempt to find the best candidate, thus transforming the symbol choice into a resolved symbol. The routine p that matches best is selected following a series of trials explained below. In order: Category matching, Hierarchical Order Comparison, and finally, Complexity Analysis.
If multiple candidates match equally well after all trials have been tested, the ambiguity is reported during semantic analysis.
Every arg in args needs to match and there are multiple different categories of matches. Let f be the formal parameter's type and a the type of the argument.
Each operand may fall into one of the categories above; it is assigned its highest-priority category. The list above is in order of priority. If a candidate has more priority matches than all other candidates, it is selected as the resolved symbol.
For example, if a candidate with one exact match is compared to a candidate with multiple generic matches and zero exact matches, the candidate with an exact match will win.
Below is a pseudocode interpretation of category matching, where count(p, m) counts the number of matches of the matching category m for the routine p.
A routine p matches better than a routine q if the following algorithm returns true:

    for each matching category m in ["exact match", "literal match",
                                    "generic match", "subtype match",
                                    "integral match", "conversion match"]:
      if count(p, m) > count(q, m): return true
      elif count(p, m) == count(q, m):
        discard "continue with next category m"
      else:
        return false
    return "ambiguous"

The hierarchical order of a type is analogous to its relative specificity. Consider the type defined:

    type A[T] = object

Matching formals for this type include T, object, A, A[...] and A[C] where C is a concrete type, A[...] is a generic typeclass composition and T is an unconstrained generic type variable. This list is in order of specificity with respect to A as each subsequent category narrows the set of types that are members of their match set.
In this trial, the formal parameters of candidates are compared in order (1st parameter, 2nd parameter, etc.) to search for a candidate that has an unrivaled specificity. If such a formal parameter is found, the candidate it belongs to is chosen as the resolved symbol.
A slight clarification: While category matching digests all the formal parameters of a candidate at once (order doesn't matter), specificity comparison and complexity analysis operate on each formal parameter at a time. The following is the final trial to disambiguate a symbol choice when a pair of formal parameters have the same hierarchical order.
The complexity of a type is essentially its number of modifiers and depth of shape. The definition with the highest complexity wins. Consider the following types:

    type
      A[T] = object
      B[T, H] = object

Note: The below examples are not exhaustive.
We shall say that:

    proc takesInt(x: int) = echo "int"
    proc takesInt[T](x: T) = echo "T"
    proc takesInt(x: int16) = echo "int16"

    takesInt(4) # "int"
    var x: int32
    takesInt(x) # "T"
    var y: int16
    takesInt(y) # "int16"
    var z: range[0..4] = 0
    takesInt(z) # "T"

If the argument a matches both the parameter type f of p and g of q via a subtyping relation, the inheritance depth is taken into account:

    type
      A = object of RootObj
      B = object of A
      C = object of B

    proc p(obj: A) =
      echo "A"

    proc p(obj: B) =
      echo "B"

    var c = C()
    # not ambiguous, calls 'B', not 'A' since B is a subtype of A
    # but not vice versa:
    p(c)

    proc pp(obj: A, obj2: B) = echo "A B"
    proc pp(obj: B, obj2: A) = echo "B A"

    # but this is ambiguous:
    pp(c, c)

Likewise, for generic matches, the most specialized generic type (that still matches) is preferred:

    proc gen[T](x: ref ref T) = echo "ref ref T"
    proc gen[T](x: ref T) = echo "ref T"
    proc gen[T](x: T) = echo "T"

    var ri: ref int
    gen(ri) # "ref T"

When overload resolution is considering candidates, the type variable's definition is not overlooked as it is used to define the formal parameter's type via variable substitution.
For example:

    type A
    proc p[T: A](param: T)
    proc p[T: object](param: T)

These signatures are not ambiguous for a concrete type of A even though the formal parameters match ("T" == "T"). Instead T is treated as a variable in that (T ?= T) depending on the bound type of T at the time of overload resolution.
If the formal parameter f is of type var T, in addition to the ordinary type checking, the argument is checked to be an l-value. var T matches better than just T then.

    proc sayHi(x: int): string =
      # matches a non-var int
      result = $x

    proc sayHi(x: var int): string =
      # matches a var int
      result = $(x + 10)

    proc sayHello(x: int) =
      var m = x # a mutable version of x
      echo sayHi(x) # matches the non-var version of sayHi
      echo sayHi(m) # matches the var version of sayHi

    sayHello(3) # 3
                # 13

Note: An unresolved expression is an expression for which no symbol lookups and no type checking have been performed.
Since templates and macros that are not declared as immediate participate in overloading resolution, it's essential to have a way to pass unresolved expressions to a template or macro. This is what the meta-type untyped accomplishes:

    template rem(x: untyped) = discard

    rem unresolvedExpression(undeclaredIdentifier)

A parameter of type untyped always matches any argument (as long as there is any argument passed to it).
But one has to watch out because other overloads might trigger the argument's resolution:

    template rem(x: untyped) = discard
    proc rem[T](x: T) = discard

    # undeclared identifier: 'unresolvedExpression'
    rem unresolvedExpression(undeclaredIdentifier)

untyped and varargs[untyped] are the only metatypes that are lazy in this sense; the other metatypes typed and typedesc are not lazy.
See Varargs.
A called iterator yielding type T can be passed to a template or macro via a parameter typed as untyped (for unresolved expressions) or the type class iterable or iterable[T] (after type checking and overload resolution).

    iterator iota(n: int): int =
      for i in 0..<n: yield i

    template toSeq2[T](a: iterable[T]): seq[T] =
      var ret: seq[T]
      assert a.typeof is T
      for ai in a: ret.add ai
      ret

    assert iota(3).toSeq2 == @[0, 1, 2]
    assert toSeq2(5..7) == @[5, 6, 7]
    assert not compiles(toSeq2(@[1,2])) # seq[int] is not an iterable
    assert toSeq2(items(@[1,2])) == @[1, 2] # but items(@[1,2]) is

For routine calls "overload resolution" is performed. There is a weaker form of overload resolution called overload disambiguation that is performed when an overloaded symbol is used in a context where there is additional type information available. Let p be an overloaded symbol. These contexts are:
As usual, ambiguous matches produce a compile-time error.
Routines with the same type signature can be called individually if a parameter has different names between them.

    proc foo(x: int) =
      echo "Using x: ", x
    proc foo(y: int) =
      echo "Using y: ", y

    foo(x = 2) # Using x: 2
    foo(y = 2) # Using y: 2

Not supplying the parameter name in such cases results in an ambiguity error.
Concepts are a mechanism for users to define custom type classes that match other types based on a given set of bindings.

    type
      Comparable = concept # Atomic concept
        proc cmp(a, b: Self): int

      Indexable[I, T] = concept # Container concept
        proc `[]`(x: Self; at: I): T
        proc `[]=`(x: var Self; at: I; newVal: T)
        proc len(x: Self): I

      Index = concept
        proc inc(x: var Self)
        proc `<`(a, b: Self): bool

    proc sort*[I: Index; T: Comparable](x: var Indexable[I, T])

In the above example, Comparable and Indexable are types that will match any type that can bind each definition declared in the concept body. The special Self type defined in the concept body refers to the type being matched, also called the "implementation" of the concept. Implementations that match the concept are generic matches, and the concept typeclasses themselves work in a similar way to generic type variables in that they are never concrete types themselves (even if they have concrete type parameters such as Indexable[int, int]) and expressions like typeof(x) in the body of proc sort from the above example will return the type of the implementation, not the concept typeclass. Concepts are useful for providing information to the compiler in generic contexts, most notably for generic type checking, and as a tool for Overload resolution. Generic type checking is forthcoming, so this will only explain overload resolution for now.
In the example above, "atomic" and "container" concepts are mentioned. These kinds of concept are determined by the generic type variables of the concept. Atomic concepts' definitions contain only concrete types, and the Self type is inferred to be concrete. Container types are the same, under the condition that their generic variables are bound to concrete types and substituted appropriately. The programmer is free to define a concept that breaks these concreteness rules, thus making a "gray" concept:

    type
      Processor = concept
        proc process[T](s: Self; data: T)

The above concept does not have generic variables, and its definition contains T which is not concrete. This kind of concept may disrupt the compiler's ability to type check generic contexts, but it is useful for overload resolution. The difference between Indexable[I, T] and Processor is that a given implementation is effectively described as an instantiation of Indexable (as in Indexable[int, int]) whereas a Processor concept describes an implementation designed to handle multiple different types of data T.
When an operand's type is being matched to a concept, the operand's type is set as the "potential implementation". For each definition in the concept body, overload resolution is performed by substituting Self for the potential implementation to try and find a match for each definition. If this succeeds, the concept matches. Implementations do not need to exactly match the definitions in the concept. For example:

    type
      C1 = concept
        proc p(s: Self; x: int)
      Implementation = object

    proc p(x: Implementation; y: SomeInteger)
    proc spring(x: C1)

    spring(Implementation())

This will bind because p(Implementation(), 0) will bind. Conversely, container types will bind to less specific definitions if the generic constraints and bindings allow it, as per usual generic matching.
Things start to get more complicated when overload resolution starts "Hierarchical Order Comparison", i.e. specificity comparison as per Overload resolution. In this state the compiler may be comparing all kinds of types and typeclasses with concepts as defined in the proc definitions of each overload. This leads to confusing and impractical behavior in most situations, so the rules are simplified. They are:
1. if a concept is being compared with T or any type that accepts all other types (auto) the concept is more specific
This type of matching is simple. When comparing concepts C1 and C2, if all valid implementations of C1 are also valid implementations of C2 but not vice versa then C1 is a subset of C2. This means that C1 will match to C2 and therefore the disambiguation process will prefer C2 as it is more specific. If neither of them are subsets of one another, then the disambiguation proceeds to complexity analysis and the concept with the most definitions wins, if any. No definite winner is an ambiguity error at compile time.
Nim uses the common statement/expression paradigm: Statements do not produce a value in contrast to expressions. However, some expressions are statements.
Statements are separated into simple statements and complex statements. Simple statements are statements that cannot contain other statements like assignments, calls, or the return statement; complex statements can contain other statements. To avoid the dangling else problem, complex statements always have to be indented. The details can be found in the grammar.
Statements can also occur in an expression context that looks like (stmt1; stmt2; ...; ex). This is called a statement list expression or (;). The type of (stmt1; stmt2; ...; ex) is the type of ex. All the other statements must be of type void. (One can use discard to produce a void type.) (;) does not introduce a new scope.
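A statement list expression might look like the sketch below. The helper `logCall` is hypothetical; it only serves as a non-void call whose result has to be discarded.

```nim
proc logCall(): int =
  # hypothetical helper whose return value we want to ignore
  result = 1

# Every statement before the final expression must be of type void,
# hence the `discard`. The type of the whole expression is the type of `w * h`.
let area = (discard logCall(); let w = 3; let h = 4; w * h)
doAssert area == 12

# `(;)` does not introduce a new scope, so `w` and `h` remain visible here.
doAssert w + h == 7
```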
Example:

    proc p(x, y: int): int =
      result = x + y

    discard p(3, 4) # discard the return value of `p`

The discard statement evaluates its expression for side-effects and throws the expression's resulting value away, and should only be used when ignoring this value is known not to cause problems.
Ignoring the return value of a procedure without using a discard statement is a static error.
The return value can be ignored implicitly if the called proc/iterator has been declared with the discardable pragma:

    proc p(x, y: int): int {.discardable.} =
      result = x + y

    p(3, 4) # now valid

However, the discardable pragma does not work on templates as templates substitute the AST in place. For example:

    {.push discardable.}
    template example(): string = "https://nim-lang.org"
    {.pop.}

    example()

This template will resolve into "https://nim-lang.org", which is a string literal, and since {.discardable.} doesn't apply to literals, the compiler reports an error.
An empty discard statement is often used as a null statement:

    proc classify(s: string) =
      case s[0]
      of SymChars, '_': echo "an identifier"
      of '0'..'9': echo "a number"
      else: discard

In a list of statements, every expression except the last one needs to have the type void. In addition to this rule an assignment to the builtin result symbol also triggers a mandatory void context for the subsequent expressions:

    proc invalid*(): string =
      result = "foo"
      "invalid" # Error: value of type 'string' has to be discarded

    proc valid*(): string =
      let x = 317
      "valid"

Var statements declare new local and global variables and initialize them. A comma-separated list of variables can be used to specify variables of the same type:

    var
      a: int = 0
      x, y, z: int

If an initializer is given, the type can be omitted: the variable is then of the same type as the initializing expression. Variables are always initialized with a default value if there is no initializing expression. The default value depends on the type and is always a zero in binary.
Type | default value |
---|---|
any integer type | 0 |
any float | 0.0 |
char | '\0' |
bool | false |
ref or pointer type | nil |
procedural type | nil |
sequence | @[] |
string | "" |
tuple[x:A,y:B,...] | (zeroDefault(A), zeroDefault(B), ...) (analogous for objects) |
array[0...,T] | [zeroDefault(T),...] |
range[T] | default(T); this may be out of the valid range |
T = enum | cast[T](0); this may be an invalid value |
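The defaults listed in the table can be checked directly. This is a small sketch; the types `Color` and `Point` and all variable names are hypothetical.

```nim
type
  Color = enum red, green   # hypothetical enum; `red` has the ordinal value 0
  Point = object            # hypothetical object type
    x, y: int

var
  i: int
  f: float
  c: char
  b: bool
  r: ref int
  s: seq[int]
  str: string
  t: tuple[a: int, b: string]
  arr: array[3, int]
  col: Color
  pt: Point

doAssert i == 0
doAssert f == 0.0
doAssert c == '\0'
doAssert b == false
doAssert r == nil
doAssert s.len == 0
doAssert str == ""
doAssert t == (a: 0, b: "")
doAssert arr == [0, 0, 0]
doAssert col == red
doAssert pt == Point(x: 0, y: 0)
```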
The implicit initialization can be avoided for optimization reasons with the noinit pragma:

    var
      a {.noinit.}: array[0..1023, char]

If a proc is annotated with the noinit pragma, this refers to its implicit result variable:

    proc returnUndefinedValue: int {.noinit.} = discard

The implicit initialization can also be prevented by the requiresInit type pragma. The compiler requires an explicit initialization for the object and all of its fields. However, it does a control flow analysis to prove the variable has been initialized and does not rely on syntactic properties:

    type
      MyObject {.requiresInit.} = object

    proc p() =
      # the following is valid:
      var x: MyObject
      if someCondition():
        x = a()
      else:
        x = a()
      # use x

The requiresInit pragma can also be applied to distinct types.
Given the following distinct type definitions:

    type
      Foo = object
        x: string

      DistinctFoo {.requiresInit, borrow: `.`.} = distinct Foo
      DistinctString {.requiresInit.} = distinct string

The following code blocks will fail to compile:

    var foo: DistinctFoo
    foo.x = "test"
    doAssert foo.x == "test"

    var s: DistinctString
    s = "test"
    doAssert string(s) == "test"

But these will compile successfully:

    let foo = DistinctFoo(Foo(x: "test"))
    doAssert foo.x == "test"

    let s = DistinctString("test")
    doAssert string(s) == "test"

A let statement declares new local and global single assignment variables and binds a value to them. The syntax is the same as that of the var statement, except that the keyword var is replaced by the keyword let. Let variables are not l-values and can thus not be passed to var parameters nor can their address be taken. They cannot be assigned new values.
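The difference between var and let with respect to var parameters can be sketched as follows. The proc `bump` and the variable names are hypothetical.

```nim
proc bump(x: var int) = inc x   # hypothetical proc with a var parameter

var counter = 1
let limit = 10

bump(counter)                   # fine: `counter` is an l-value
doAssert counter == 2

# `limit` is not an l-value, so it cannot be passed to a var parameter.
doAssert not compiles(bump(limit))
```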
For let variables, the same pragmas are available as for ordinary variables.
As let variables are immutable after creation, they need to define a value when they are declared. The only exception to this is if the {.importc.} pragma (or any of the other importX pragmas) is applied; in this case the value is expected to come from native code, typically a C/C++ const.
The identifier _ has a special meaning in declarations. Any definition with the name _ will not be added to scope, meaning the definition is evaluated, but cannot be used. As a result the name _ can be indefinitely redefined.

    let _ = 123
    echo _ # error
    let _ = 456 # compiles

In a var, let or const statement tuple unpacking can be performed. The special identifier _ can be used to ignore some parts of the tuple:

    proc returnsTuple(): (int, int, int) = (4, 2, 3)

    let (x, _, z) = returnsTuple()

This is treated as syntax sugar for roughly the following:

    let
      tmpTuple = returnsTuple()
      x = tmpTuple[0]
      z = tmpTuple[2]

For var or let statements, if the value expression is a tuple literal, each expression is directly expanded into an assignment without the use of a temporary variable.

    let (x, y, z) = (1, 2, 3)
    # becomes
    let
      x = 1
      y = 2
      z = 3

Tuple unpacking can also be nested:

    proc returnsNestedTuple(): (int, (int, int), int, int) = (4, (5, 7), 2, 3)

    let (x, (_, y), _, z) = returnsNestedTuple()

A const section declares constants whose values are constant expressions:

    import std/[strutils]
    const
      roundPi = 3.1415
      constEval = contains("abc", 'b') # computed at compile time!

Once declared, a constant's symbol can be used as a constant expression.
The value part of a constant declaration opens a new scope for each constant, so no symbols declared in the constant value are accessible outside of it.

    const foo = (var a = 1; a)
    const bar = a # error
    let baz = a # error

See Constants and Constant Expressions for details.
A static statement/expression explicitly requires compile-time execution. Even some code that has side effects is permitted in a static block:

    static:
      echo "echo at compile time"

static can also be used like a routine.

    proc getNum(a: int): int = a

    # Below calls "echo getNum(123)" at compile time.
    static:
      echo getNum(123)

    # Below call evaluates the "getNum(123)" at compile time, but its
    # result gets used at run time.
    echo static(getNum(123))

There are limitations on what Nim code can be executed at compile time; see Restrictions on Compile-Time Execution for details. It's a static error if the compiler cannot execute the block at compile time.
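To illustrate how the same proc can be evaluated at compile time and at run time, consider the following sketch. The proc `fib` is a hypothetical example.

```nim
proc fib(n: int): int =
  # hypothetical example proc, usable at compile time and at run time
  if n < 2: n
  else: fib(n - 1) + fib(n - 2)

const fib10 = fib(10)        # evaluated by the compiler

static:
  doAssert fib(10) == 55     # checked while compiling

let atRuntime = fib(10)      # evaluated when the program runs
doAssert fib10 == atRuntime
```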
Example:

    var name = readLine(stdin)

    if name == "Andreas":
      echo "What a nice name!"
    elif name == "":
      echo "Don't you have a name?"
    else:
      echo "Boring name..."

The if statement is a simple way to make a branch in the control flow: The expression after the keyword if is evaluated, if it is true the corresponding statements after the : are executed. Otherwise, the expression after the elif is evaluated (if there is an elif branch), if it is true the corresponding statements after the : are executed. This goes on until the last elif. If all conditions fail, the else part is executed. If there is no else part, execution continues with the next statement.
In if statements, new scopes begin immediately after the if/elif/else keywords and end after the corresponding then block. For visualization purposes the scopes have been enclosed in {| |} in the following example:

    if {| (let m = input =~ re"(\w+)=\w+"; m.isMatch):
      echo "key ", m[0], " value ", m[1]  |}
    elif {| (let m = input =~ re""; m.isMatch):
      echo "new m in this scope"  |}
    else: {|
      echo "m not declared here"  |}

Example:
letline=readline(stdin)caselineof"delete-everything","restart-computer":echo"permission denied"of"go-for-a-walk":echo"please yourself"elifline.len==0:echo"empty"# optional, must come after `of` brancheselse:echo"unknown command"# ditto# indentation of the branches is also allowed; and so is an optional colon# after the selecting expression:casereadline(stdin):of"delete-everything","restart-computer":echo"permission denied"of"go-for-a-walk":echo"please yourself"else:echo"unknown command"
Thecase statement is similar to theif statement, but it represents a multi-branch selection. The expression after the keywordcase is evaluated and if its value is in aslicelist the corresponding statements (after theof keyword) are executed. If the value is not in any givenslicelist, trailingelif andelse parts are executed using same semantics as forif statement, andelif is handled just likeelse:if. If there are noelse orelif parts and not all possible values thatexpr can hold occur in aslicelist, a static error occurs. This holds only for expressions of ordinal types. "All possible values" ofexpr are determined byexpr's type. To suppress the static error anelse:discard should be used.
Only ordinal types, floats, strings and cstrings are allowed as values in case statements.
For non-ordinal types, it is not possible to list every possible value and so these always require an else part. An exception to this rule is for the string type, which currently doesn't require a trailing else or elif branch; it's unspecified whether this will keep working in future versions.
Because case statements are checked for exhaustiveness during semantic analysis, the value in every of branch must be a constant expression. This restriction also allows the compiler to generate more performant code.
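As a rough illustration of the exhaustiveness check, the sketch below uses a hypothetical enum Color (an assumption for this example, not part of the manual). Every value of the enum appears in an of branch, so no else part should be needed:

```nim
type
  Color = enum  # hypothetical enum, for illustration only
    red, green, blue

proc describe(c: Color): string =
  # Every value of Color appears in a branch, so no `else` is required.
  case c
  of red: "warm"
  of green, blue: "cool"

echo describe(green)
```

Removing one of the of branches without adding an else would be expected to trigger the static error described above.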
As a special semantic extension, an expression in an of branch of a case statement may evaluate to a set or array constructor; the set or array is then expanded into a list of its elements:
const
  SymChars: set[char] = {'a'..'z', 'A'..'Z', '\x80'..'\xFF'}

proc classify(s: string) =
  case s[0]
  of SymChars, '_': echo "an identifier"
  of '0'..'9': echo "a number"
  else: echo "other"

# is equivalent to:
proc classify(s: string) =
  case s[0]
  of 'a'..'z', 'A'..'Z', '\x80'..'\xFF', '_': echo "an identifier"
  of '0'..'9': echo "a number"
  else: echo "other"
The case statement doesn't produce an l-value, so the following example won't work:
type
  Foo = ref object
    x: seq[string]

proc get_x(x: Foo): var seq[string] =
  # doesn't work
  case true
  of true:
    x.x
  else:
    x.x

var foo = Foo(x: @[])
foo.get_x().add("asd")
This can be fixed by explicitly using result or return:
proc get_x(x: Foo): var seq[string] =
  case true
  of true:
    result = x.x
  else:
    result = x.x
Example:
when sizeof(int) == 2:
  echo "running on a 16 bit system!"
elif sizeof(int) == 4:
  echo "running on a 32 bit system!"
elif sizeof(int) == 8:
  echo "running on a 64 bit system!"
else:
  echo "cannot happen!"
The when statement is almost identical to the if statement with some exceptions:
The when statement enables conditional compilation techniques. As a special syntactic extension, the when construct is also available within object definitions.
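A minimal sketch of when inside an object definition follows. The constant withId and the type Record are hypothetical names chosen for this example:

```nim
const withId = true  # hypothetical compile-time switch

type
  Record = object
    name: string
    when withId:
      id: int  # this field exists only when withId is true

var r = Record(name: "first")
when withId:
  r.id = 7  # compiled only if the field was declared
```

Setting withId to false should remove both the field and the assignment from the compiled program.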
nimvm is a special symbol that may be used as the expression of a when nimvm statement to differentiate the execution path between compile-time and the executable.
Example:
proc someProcThatMayRunInCompileTime(): bool =
  when nimvm:
    # This branch is taken at compile time.
    result = true
  else:
    # This branch is taken in the executable.
    result = false
const ctValue = someProcThatMayRunInCompileTime()
let rtValue = someProcThatMayRunInCompileTime()
assert(ctValue == true)
assert(rtValue == false)
A when nimvm statement must meet the following requirements:
Example:
return 40 + 2
The return statement ends the execution of the current procedure. It is only allowed in procedures. If there is an expr, this is syntactic sugar for:
result = expr
return result
return without an expression is a short notation for return result if the proc has a return type. The result variable is always the return value of the procedure. It is automatically declared by the compiler. As all variables, result is initialized to (binary) zero:
proc returnZero(): int =
  # implicitly returns 0
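The interaction between return and the implicit result variable can be sketched as follows. The procs sumTo and firstNegative are hypothetical helpers written for this illustration:

```nim
proc returnZero(): int =
  discard  # result is never assigned and keeps its zero default

proc sumTo(n: int): int =
  for i in 1..n:
    result += i  # accumulate directly into the implicit result variable

proc firstNegative(xs: openArray[int]): int =
  for x in xs:
    if x < 0:
      return x  # sugar for: result = x; return result
  # falling off the end returns result, which is still 0

echo returnZero(), " ", sumTo(4), " ", firstNegative([3, -2, -5])
```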
Example:
yield (1, 2, 3)
The yield statement is used instead of the return statement in iterators. It is only valid in iterators. Execution is returned to the body of the for loop that called the iterator. Yield does not end the iteration process, but the execution is passed back to the iterator if the next iteration starts. See the section about iterators (Iterators and the for statement) for further information.
Example:
var found = false
block myblock:
  for i in 0..3:
    for j in 0..3:
      if a[j][i] == 7:
        found = true
        break myblock # leave the block, in this case both for-loops
echo found
The block statement is a means to group statements to a (named) block. Inside the block, the break statement is allowed to leave the block immediately. A break statement can contain a name of a surrounding block to specify which block is to be left.
Example:
break
The break statement is used to leave a block immediately. If symbol is given, it is the name of the enclosing block that is to be left. If it is absent, the innermost block is left.
Example:
echo "Please tell me your password:"
var pw = readLine(stdin)
while pw != "12345":
  echo "Wrong password! Next try:"
  pw = readLine(stdin)
The while statement is executed until the expr evaluates to false. Endless loops are no error. while statements open an implicit block so that they can be left with a break statement.
A continue statement leads to the immediate next iteration of the surrounding loop construct. It is only allowed within a loop. A continue statement is syntactic sugar for a nested block:
while expr1:
  stmt1
  continue
  stmt2
Is equivalent to:
while expr1:
  block myBlockName:
    stmt1
    break myBlockName
    stmt2
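A small runnable sketch of continue, using a hypothetical odds sequence to collect the values that are not skipped:

```nim
var odds: seq[int]
for i in 1..6:
  if i mod 2 == 0:
    continue  # skip the rest of the body for even numbers
  odds.add i

echo odds
```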
The direct embedding of assembler code into Nim code is supported by the unsafe asm statement. Identifiers in the assembler code that refer to Nim identifiers shall be enclosed in a special character which can be specified in the statement's pragmas. The default special character is '`':
{.push stackTrace:off.}
proc addInt(a, b: int): int =
  # a in eax, and b in edx
  asm """
      mov eax, `a`
      add eax, `b`
      jno theEnd
      call `raiseOverflow`
    theEnd:
  """
{.pop.}
If the GNU assembler is used, quotes and newlines are inserted automatically:
proc addInt(a, b: int): int =
  asm """
    addl %%ecx, %%eax
    jno 1
    call `raiseOverflow`
    1:
    :"=a"(`result`)
    :"a"(`a`), "c"(`b`)
  """
Instead of:
proc addInt(a, b: int): int =
  asm """
    "addl %%ecx, %%eax\n"
    "jno 1\n"
    "call `raiseOverflow`\n"
    "1: \n"
    :"=a"(`result`)
    :"a"(`a`), "c"(`b`)
  """
The using statement provides syntactic convenience in modules where the same parameter names and types are used over and over. Instead of:
proc foo(c: Context; n: Node) = ...
proc bar(c: Context; n: Node, counter: int) = ...
proc baz(c: Context; n: Node) = ...
One can tell the compiler about the convention that a parameter of name c should default to type Context, n should default to Node etc.:
using
  c: Context
  n: Node
  counter: int

proc foo(c, n) = ...
proc bar(c, n, counter) = ...
proc baz(c, n) = ...

proc mixedMode(c, n; x, y: int) =
  # 'c' is inferred to be of the type 'Context'
  # 'n' is inferred to be of the type 'Node'
  # But 'x' and 'y' are of type 'int'.
The using section uses the same indentation based grouping syntax as a var or let section.
Note that using is not applied for template since the untyped template parameters default to the type system.untyped.
Mixing parameters that should use the using declaration with parameters that are explicitly typed is possible and requires a semicolon between them.
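The following self-contained sketch shows using with complete proc bodies. Context and Node here are hypothetical stand-in object types, and describe and describeScaled are illustrative names:

```nim
type
  Context = object  # hypothetical stand-in types for this sketch
    depth: int
  Node = object
    label: string

using
  c: Context
  n: Node

# Both parameters take their types from the using section.
proc describe(c, n): string =
  n.label & "@" & $c.depth

# The semicolon separates the using-typed parameters from the explicit one.
proc describeScaled(c, n; factor: int): string =
  n.label & "@" & $(c.depth * factor)

echo describe(Context(depth: 2), Node(label: "x"))
```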
An if expression is almost like an if statement, but it is an expression. This feature is similar to ternary operators in other languages. Example:
var y = if x > 8: 9 else: 10
An if expression always results in a value, so the else part is required. elif parts are also allowed.
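A short sketch of an if expression with an elif part. The names x and label are chosen only for this example:

```nim
let x = 5
let label =
  if x > 8: "large"
  elif x > 3: "medium"
  else: "small"

echo label
```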
Just like an if expression, but corresponding to the when statement.
The case expression is again very similar to the case statement:
var favoriteFood = case animal
  of "dog": "bones"
  of "cat": "mice"
  elif animal.endsWith"whale": "plankton"
  else:
    echo "I'm not sure what to serve, but everybody loves ice cream"
    "ice cream"
As seen in the above example, the case expression can also introduce side effects. When multiple statements are given for a branch, Nim will use the last expression as the result value.
A block expression is almost like a block statement, but it is an expression that uses the last expression under the block as the value. It is similar to the statement list expression, but the statement list expression does not open a new block scope.
let a = block:
  var fib = @[0, 1]
  for i in 0..10:
    fib.add fib[^1] + fib[^2]
  fib
A table constructor is syntactic sugar for an array constructor:
{"key1": "value1", "key2", "key3": "value2"}

# is the same as:
[("key1", "value1"), ("key2", "value2"), ("key3", "value2")]
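The equivalence can be checked with a small sketch. It assumes the standard std/tables module for the conversion procs. The names kv, shared, lookup and ordered are illustrative:

```nim
import std/tables

let kv = {"one": 1, "two": 2}
# The constructor is just an array of (key, value) tuples.
doAssert kv == [("one", 1), ("two", 2)]

# Keys that share a value can be listed together.
let shared = {"a", "b": 0}
doAssert shared == [("a", 0), ("b", 0)]

# Any table implementation can consume the same literal.
let lookup = kv.toTable
let ordered = kv.toOrderedTable
echo lookup["two"], " ", ordered.len
```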
The empty table can be written {:} (in contrast to the empty set which is {}) which is thus another way to write the empty array constructor []. This slightly unusual way of supporting tables has lots of advantages:
Syntactically a type conversion is like a procedure call, but a type name replaces the procedure name. A type conversion is always safe in the sense that a failure to convert a type to another results in an exception (if it cannot be determined statically).
Ordinary procs are often preferred over type conversions in Nim: For instance, $ is the toString operator by convention and toFloat and toInt can be used to convert from floating-point to integer or vice versa.
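A brief sketch contrasting type conversions with the conversion procs mentioned above. The values i and f are arbitrary examples:

```nim
let i = 7
let f = 3.9

doAssert float(i) == 7.0    # type conversion int -> float
doAssert toFloat(i) == 7.0  # the equivalent ordinary proc
doAssert int(f) == 3        # the conversion truncates toward zero
doAssert toInt(f) == 4      # toInt rounds instead
doAssert $i == "7"          # $ converts to a string
```

The difference between int(f) and toInt(f) is one reason to pick the proc whose behaviour matches the intent.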
Type conversion can also be used to disambiguate overloaded routines:
proc p(x: int) = echo "int"
proc p(x: string) = echo "string"

let procVar = (proc(x: string))(p)
procVar("a")
Since operations on unsigned numbers wrap around and are unchecked, so are type conversions to unsigned integers and between unsigned integers. The rationale for this is mostly better interoperability with the C programming language when algorithms are ported from C to Nim.
Note: Historically the operations were unchecked and the conversions were sometimes checked but starting with the revision 1.0.4 of this document and the language implementation the conversions too are now always unchecked.
Type casts are a crude mechanism to interpret the bit pattern of an expression as if it would be of another type. Type casts are only needed for low-level programming and are inherently unsafe.
cast[int](x)
The target type of a cast must be a concrete type, for instance, a target type that is a type class (which is non-concrete) would be invalid:
type Foo = int or float
var x = cast[Foo](1) # Error: cannot cast to a non concrete type: 'Foo'
Type casts should not be confused with type conversions, as mentioned in the prior section. Unlike type conversions, a type cast cannot change the underlying bit pattern of the data being cast (aside from that the size of the target type may differ from the source type). Casting resembles type punning in other languages or C++'s reinterpret_cast and bit_cast features.
If the size of the target type is larger than the size of the source type, the remaining memory is zeroed.
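The difference between a cast and a conversion can be sketched with a 32-bit float. The example assumes the usual IEEE 754 single-precision representation, in which 1.0 has the bit pattern 0x3F800000:

```nim
let one = 1.0'f32

# cast reinterprets the bits of the float as an unsigned integer.
let bits = cast[uint32](one)

# A type conversion instead converts the numeric value.
let value = uint32(one)

echo bits, " ", value
```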
The addr operator returns the address of an l-value. If the type of the location is T, the addr operator result is of the type ptr T. An address is always an untraced reference. Taking the address of an object that resides on the stack is unsafe, as the pointer may live longer than the object on the stack and can thus reference a non-existing object. One can get the address of variables. For easier interoperability with other compiled languages such as C, retrieving the address of a let variable, a parameter, or a for loop variable can be accomplished too:
let t1 = "Hello"
var
  t2 = t1
  t3: pointer = addr(t2)
echo repr(addr(t2))
# --> ref 0x7fff6b71b670 --> 0x10bb81050"Hello"
echo cast[ptr string](t3)[]
# --> Hello
# The following line also works
echo repr(addr(t1))
The unsafeAddr operator is a deprecated alias for the addr operator:
let myArray = [1, 2, 3]
foreignProcThatTakesAnAddr(unsafeAddr myArray)
What most programming languages call methods or functions are called procedures in Nim. A procedure declaration consists of an identifier, zero or more formal parameters, a return value type and a block of code. Formal parameters are declared as a list of identifiers separated by either comma or semicolon. A parameter is given a type by : typename. The type applies to all parameters immediately before it, until either the beginning of the parameter list, a semicolon separator, or an already typed parameter, is reached. The semicolon can be used to make separation of types and subsequent identifiers more distinct.
# Using only commas
proc foo(a, b: int, c, d: bool): int

# Using semicolon for visual distinction
proc foo(a, b: int; c, d: bool): int

# Will fail: a is untyped since ';' stops type propagation.
proc foo(a; b: int; c, d: bool): int
A parameter may be declared with a default value which is used if the caller does not provide a value for the argument. The value will be reevaluated every time the function is called.
# b is optional with 47 as its default value.
proc foo(a: int, b: int = 47): int
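That the default value is reevaluated on every call can be observed when the default is an expression with a side effect. In this sketch nextId, tag and the counter calls are hypothetical names:

```nim
var calls = 0

proc nextId(): int =
  inc calls
  calls

# The default expression nextId() runs each time the argument is omitted.
proc tag(label: string, id: int = nextId()): string =
  label & "#" & $id

let first = tag("a")
let second = tag("b")
let explicit = tag("c", 10)  # default is not evaluated here
echo first, " ", second, " ", explicit
```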
Parameters can be declared mutable and so allow the proc to modify those arguments, by using the type modifier var.
# "returning" a value to the caller through the 2nd argument
# Notice that the function uses no actual return value at all (ie void)
proc foo(inp: int, outp: var int) =
  outp = inp + 47
If the proc declaration doesn't have a body, it is a forward declaration. If the proc returns a value, the procedure body can access an implicitly declared variable named result that represents the return value. Procs can be overloaded. The overloading resolution algorithm determines which proc is the best match for the arguments. Example:
proc toLower(c: char): char = # toLower for characters
  if c in {'A'..'Z'}:
    result = chr(ord(c) + (ord('a') - ord('A')))
  else:
    result = c

proc toLower(s: string): string = # toLower for strings
  result = newString(len(s))
  for i in 0..len(s) - 1:
    result[i] = toLower(s[i]) # calls toLower for characters; no recursion!
Calling a procedure can be done in many ways:
proc callme(x, y: int, s: string = "", c: char, b: bool = false) = ...

# call with positional arguments      # parameter bindings:
callme(0, 1, "abc", '\t', true)       # (x=0, y=1, s="abc", c='\t', b=true)
# call with named and positional arguments:
callme(y=1, x=0, "abd", '\t')         # (x=0, y=1, s="abd", c='\t', b=false)
# call with named arguments (order is not relevant):
callme(c='\t', y=1, x=0)              # (x=0, y=1, s="", c='\t', b=false)
# call as a command statement: no () needed:
callme 0, 1, "abc", '\t'              # (x=0, y=1, s="abc", c='\t', b=false)
A procedure may call itself recursively.
Operators are procedures with a special operator symbol as identifier:
proc `$` (x: int): string =
  # converts an integer to a string; this is a prefix operator.
  result = intToStr(x)
Operators with one parameter are prefix operators, operators with two parameters are infix operators. (However, the parser distinguishes these from the operator's position within an expression.) There is no way to declare postfix operators: all postfix operators are built-in and handled by the grammar explicitly.
Any operator can be called like an ordinary proc with the `opr` notation. (Thus an operator can have more than two parameters):
proc `*+` (a, b, c: int): int =
  # Multiply and add
  result = a * b + c

assert `*+`(3, 4, 6) == `+`(`*`(3, 4), 6)
If a declared symbol is marked with an asterisk it is exported from the current module:
proc exportedEcho*(s: string) = echo s
proc `*`*(a: string; b: int): string =
  result = newStringOfCap(a.len * b)
  for i in 1..b: result.add a

var exportedVar*: int
const exportedConst* = 78
type
  ExportedType* = object
    exportedField*: int
For object-oriented programming, the syntax obj.methodName(args) can be used instead of methodName(obj, args). The parentheses can be omitted if there are no remaining arguments: obj.len (instead of len(obj)).
This method call syntax is not restricted to objects, it can be used to supply any type of first argument for procedures:
echo "abc".len # is the same as echo len "abc"
echo "abc".toUpper()
echo {'a', 'b', 'c'}.card
stdout.writeLine("Hallo") # the same as writeLine(stdout, "Hallo")
Another way to look at the method call syntax is that it provides the missing postfix notation.
The method call syntax conflicts with explicit generic instantiations: p[T](x) cannot be written as x.p[T] because x.p[T] is always parsed as (x.p)[T].
See also: Limitations of the method call syntax.
The [:] notation has been designed to mitigate this issue: x.p[:T] is rewritten by the parser to p[T](x), x.p[:T](y) is rewritten to p[T](x, y). Note that [:] has no AST representation, the rewrite is performed directly in the parsing step.
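As a sketch of the [:] notation, consider a hypothetical generic proc convertTo whose type argument cannot be inferred from its parameter and therefore has to be given explicitly:

```nim
proc convertTo[T](x: int): T =
  # hypothetical helper: T appears only in the return type
  T(x)

let n = 3
let a = convertTo[float](n)    # ordinary call with explicit instantiation
let b = n.convertTo[:float]()  # method call syntax using [:]

echo a, " ", b
```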
Nim has no need for get-properties: Ordinary get-procedures that are called with the method call syntax achieve the same. But setting a value is different; for this, a special setter syntax is needed:
# Module asocket
type
  Socket* = ref object of RootObj
    host: int # cannot be accessed from the outside of the module

proc `host=`*(s: var Socket, value: int) {.inline.} =
  ## setter of hostAddr.
  ## This accesses the 'host' field and is not a recursive call to
  ## `host=` because the builtin dot access is preferred if it is
  ## available:
  s.host = value

proc host*(s: Socket): int {.inline.} =
  ## getter of hostAddr
  ## This accesses the 'host' field and is not a recursive call to
  ## `host` because the builtin dot access is preferred if it is
  ## available:
  s.host
# module B
import asocket
var s: Socket
new s
s.host = 34  # same as `host=`(s, 34)
A proc defined as f= (with the trailing =) is called a setter. A setter can be called explicitly via the common backticks notation:
proc `f=`(x: MyObject; value: string) =
  discard

`f=`(myObject, "value")
f= can be called implicitly in the pattern x.f = value if and only if the type of x does not have a field named f or if f is not visible in the current module. These rules ensure that object fields and accessors can have the same name. Within the module x.f is then always interpreted as field access and outside the module it is interpreted as an accessor proc call.
Routines can be invoked without the () if the call is syntactically a statement. This command invocation syntax also works for expressions, but then only a single argument may follow. This restriction means echo f 1, f 2 is parsed as echo(f(1), f(2)) and not as echo(f(1, f(2))). The method call syntax may be used to provide one more argument in this case:
proc optarg(x: int, y: int = 0): int = x + y
proc singlearg(x: int): int = 20*x

echo optarg 1, " ", singlearg 2  # prints "1 40"

let fail = optarg 1, optarg 8   # Wrong. Too many arguments for a command call
let x = optarg(1, optarg 8)  # traditional procedure call with 2 arguments
let y = 1.optarg optarg 8    # same thing as above, w/o the parenthesis
assert x == y
The command invocation syntax also can't have complex expressions as arguments. For example: anonymous procedures, if, case or try. Function calls with no arguments still need () to distinguish between a call and the function itself as a first-class value.
Procedures can appear at the top level in a module as well as inside other scopes, in which case they are called nested procs. A nested proc can access local variables from its enclosing scope and if it does so it becomes a closure. Any captured variables are stored in a hidden additional argument to the closure (its environment) and they are accessed by reference by both the closure and its enclosing scope (i.e. any modifications made to them are visible in both places). The closure environment may be allocated on the heap or on the stack if the compiler determines that this would be safe.
Since closures capture local variables by reference it is often not wanted behavior inside loop bodies. See closureScope and capture for details on how to change this behavior.
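Capture by reference can be sketched with two hypothetical procs: makeCounter returns a closure that keeps mutating its captured variable, and demo shows that a change made by a nested proc is visible in the enclosing scope:

```nim
proc makeCounter(): proc (): int =
  var count = 0  # captured by the closure below
  result = proc (): int =
    inc count
    count

proc demo(): int =
  var total = 0
  proc add(n: int) =
    total += n  # modifies the variable of the enclosing scope
  add(2)
  add(3)
  total

let next = makeCounter()
let firstCall = next()
let secondCall = next()
echo firstCall, " ", secondCall, " ", demo()
```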
Unnamed procedures can be used as lambda expressions to pass into other procedures:
var cities = @["Frankfurt", "Tokyo", "New York", "Kyiv"]

cities.sort(proc (x, y: string): int =
  cmp(x.len, y.len))
Procs as expressions can appear both as nested procs and inside top-level executable code. The sugar module contains the => macro which enables a more succinct syntax for anonymous procedures resembling lambdas as they are in languages like JavaScript, C#, etc.
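A sketch of the same sort written with the => macro. It assumes the standard std/algorithm and std/sugar modules. The parameter types are spelled out here for clarity:

```nim
import std/[algorithm, sugar]

var cities = @["Frankfurt", "Tokyo", "New York", "Kyiv"]

# Sort by name length using a lambda built with =>.
cities.sort((x: string, y: string) => cmp(x.len, y.len))

echo cities
```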
As a special convenience notation that keeps most elements of a regular proc expression, the do keyword can be used to pass anonymous procedures to routines:
var cities = @["Frankfurt", "Tokyo", "New York", "Kyiv"]

sort(cities) do (x, y: string) -> int:
  cmp(x.len, y.len)

# Less parentheses using the method plus command syntax:
cities = cities.map do (x: string) -> string:
  "City of " & x
do is written after the parentheses enclosing the regular proc parameters. The proc expression represented by the do block is appended to the routine call as the last argument. In calls using the command syntax, the do block will bind to the immediately preceding expression rather than the command call.
do with a parameter list or pragma list corresponds to an anonymous proc, however do without parameters or pragmas is treated as a normal statement list. This allows macros to receive both indented statement lists as an argument in inline calls, as well as a direct mirror of Nim's routine syntax.
# Passing a statement list to an inline macro:
macroResults.add quote do:
  if not `ex`:
    echo `info`, ": Check failed: ", `expString`

# Processing a routine definition in a macro:
rpc(router, "add") do (a, b: int) -> int:
  result = a + b
The func keyword introduces a shortcut for a noSideEffect proc.
func binarySearch[T](a: openArray[T]; elem: T): int
Is short for:
proc binarySearch[T](a: openArray[T]; elem: T): int {.noSideEffect.}
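A minimal sketch of a func, using a hypothetical square routine. The commented-out func indicates the kind of body that the side effect check is expected to reject:

```nim
func square(x: int): int =
  x * x  # depends only on its argument

# func noisy(x: int): int =
#   echo x  # expected to be rejected: echo has side effects
#   x

echo square(4)
```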
A routine is a symbol of kind: proc, func, method, iterator, macro, template, converter.
A type bound operator is a proc or func whose name starts with = but isn't an operator (i.e. containing only symbols, such as ==). These are unrelated to setters (see Properties), which instead end in =. A type bound operator declared for a type applies to the type regardless of whether the operator is in scope (including if it is private).
# foo.nim:
var witness* = 0
type Foo[T] = object
proc initFoo*(T: typedesc): Foo[T] = discard
proc `=destroy`[T](x: var Foo[T]) = witness.inc # type bound operator

# main.nim:
import foo
block:
  var a = initFoo(int)
  doAssert witness == 0
doAssert witness == 1
block:
  var a = initFoo(int)
  doAssert witness == 1
  `=destroy`(a) # can be called explicitly, even without being in scope
  doAssert witness == 2
# will still be called upon exiting scope
doAssert witness == 3
Type bound operators are: =destroy, =copy, =sink, =trace, =deepcopy, =wasMoved, =dup.
These operations can be overridden instead of overloaded. This means that the implementation is automatically lifted to structured types. For instance, if the type T has an overridden assignment operator =, this operator is also used for assignments of the type seq[T].
Since these operations are bound to a type, they have to be bound to a nominal type for reasons of simplicity of implementation; this means an overridden deepCopy for ref T is really bound to T and not to ref T. This also means that one cannot override deepCopy for both ptr T and ref T at the same time, instead a distinct or object helper type has to be used for one pointer type.
For more details on some of those procs, see Lifetime-tracking hooks.
The following built-in procs cannot be overloaded for reasons of implementation simplicity (they require specialized semantic checking):
declared, defined, definedInScope, compiles, sizeof, is, shallowCopy, getAst, astToStr, spawn, procCall
Thus, they act more like keywords than like ordinary identifiers; unlike a keyword however, a redefinition may shadow the definition in the system module. From this list the following should not be written in dot notation x.f since x cannot be type-checked before it gets passed to f:
declared, defined, definedInScope, compiles, getAst, astToStr
The type of a parameter may be prefixed with the var keyword:
proc divmod(a, b: int; res, remainder: var int) =
  res = a div b
  remainder = a mod b

var
  x, y: int

divmod(8, 5, x, y) # modifies x and y
assert x == 1
assert y == 3
In the example, res and remainder are var parameters. Var parameters can be modified by the procedure and the changes are visible to the caller. The argument passed to a var parameter has to be an l-value. Var parameters are implemented as hidden pointers. The above example is equivalent to:
proc divmod(a, b: int; res, remainder: ptr int) =
  res[] = a div b
  remainder[] = a mod b

var
  x, y: int
divmod(8, 5, addr(x), addr(y))
assert x == 1
assert y == 3
In the examples, var parameters or pointers are used to provide two return values. This can be done in a cleaner way by returning a tuple:
proc divmod(a, b: int): tuple[res, remainder: int] =
  (a div b, a mod b)

var t = divmod(8, 5)

assert t.res == 1
assert t.remainder == 3
One can use tuple unpacking to access the tuple's fields:
var (x, y) = divmod(8, 5) # tuple unpacking
assert x == 1
assert y == 3
Note: var parameters are never necessary for efficient parameter passing. Since non-var parameters cannot be modified the compiler is always free to pass arguments by reference if it considers it can speed up execution.
A proc, converter, or iterator may return a var type which means that the returned value is an l-value and can be modified by the caller:
var g = 0

proc writeAccessToG(): var int =
  result = g

writeAccessToG() = 6
assert g == 6
It is a static error if the implicitly introduced pointer could be used to access a location beyond its lifetime:
proc writeAccessToG(): var int =
  var g = 0
  result = g # Error!
For iterators, a component of a tuple return type can have a var type too:
iterator mpairs(a: var seq[string]): tuple[key: int, val: var string] =
  for i in 0..a.high:
    yield (i, a[i])
In the standard library every name of a routine that returns a var type starts with the prefix m per convention.
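The m-prefixed iterators from the system module can be used to modify a container in place. In this sketch the sequence words is a hypothetical example value:

```nim
var words = @["alpha", "beta"]

# mpairs yields the index and a mutable view of each element.
for i, w in mpairs(words):
  w.add($i)

# mitems yields only the mutable elements.
for w in mitems(words):
  w = w & "!"

echo words
```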
Memory safety for returning by var T is ensured by a simple borrowing rule: If result does not refer to a location pointing to the heap (that is in result = X the X involves a ptr or ref access) then it has to be derived from the routine's first parameter:
proc forward[T](x: var T): var T =
  result = x # ok, derived from the first parameter.

proc p(param: var int): var int =
  var x: int
  # we know 'forward' provides a view into the location derived from
  # its first argument 'x'.
  result = forward(x) # Error: location is derived from `x`
                      # which is not p's first parameter and lives
                      # on the stack.
In other words, the lifetime of what result points to is attached to the lifetime of the first parameter and that is enough knowledge to verify memory safety at the call site.
Later versions of Nim can be more precise about the borrowing rule with a syntax like:
proc foo(other: Y; container: var X): var T from container
Here var T from container explicitly exposes that the location is derived from the second parameter (called 'container' in this case). The syntax var T from p specifies a type varTy[T, 2] which is incompatible with varTy[T, 1].
Note: This section describes the current implementation. This part of the language specification will be changed. See https://github.com/nim-lang/RFCs/issues/230 for more information.
The return value is represented inside the body of a routine as the special result variable. This allows for a mechanism much like C++'s "named return value optimization" (NRVO). NRVO means that the stores to result inside p directly affect the destination dest in let/var dest = p(args) (definition of dest) and also in dest = p(args) (assignment to dest). This is achieved by rewriting dest = p(args) to p'(args, dest) where p' is a variation of p that returns void and receives a hidden mutable parameter representing result.
Informally:
proc p(): BigT = ...

var x = p()
x = p()

# is roughly turned into:

proc p(result: var BigT) = ...

var x; p(x)
p(x)
Let T be p's return type. NRVO applies for T if sizeof(T) >= N (where N is implementation dependent), in other words, it applies for "big" structures.
If p can raise an exception, NRVO applies regardless. This can produce observable differences in behavior:
type
  BigT = array[16, int]

proc p(raiseAt: int): BigT =
  for i in 0..high(result):
    if i == raiseAt: raise newException(ValueError, "interception")
    result[i] = i

proc main =
  var x: BigT
  try:
    x = p(8)
  except ValueError:
    doAssert x == [0, 1, 2, 3, 4, 5, 6, 7, 0, 0, 0, 0, 0, 0, 0, 0]

main()
The compiler can produce a warning in these cases, however this behavior is turned off by default. It can be enabled for a section of code via the warning[ObservableStores] and push/pop pragmas. Take the above code as an example:
{.push warning[ObservableStores]: on.}
main()
{.pop.}
The [] subscript operator for arrays/openarrays/sequences can be overloaded for any type (with some exceptions) by defining a routine with the name [].
type Foo = object
  data: seq[int]

proc `[]`(foo: Foo, i: int): int =
  result = foo.data[i]

let foo = Foo(data: @[1, 2, 3])
echo foo[1] # 2
Assignment to subscripts can also be overloaded by naming a routine []=, which has precedence over assigning to the result of [].
type Foo = object
  data: seq[int]

proc `[]`(foo: Foo, i: int): int =
  result = foo.data[i]
proc `[]=`(foo: var Foo, i: int, val: int) =
  foo.data[i] = val

var foo = Foo(data: @[1, 2, 3])
echo foo[1] # 2
foo[1] = 5
echo foo.data # @[1, 5, 3]
echo foo[1] # 5
Overloads of the subscript operator cannot be applied to routine or type symbols themselves, as this conflicts with the syntax for instantiating generic parameters, i.e. foo[int](1, 2, 3) or Foo[int].
Procedures always use static dispatch. Methods use dynamic dispatch. For dynamic dispatch to work on an object it should be a reference type.
type
  Expression = ref object of RootObj ## abstract base class for an expression
  Literal = ref object of Expression
    x: int
  PlusExpr = ref object of Expression
    a, b: Expression

method eval(e: Expression): int {.base.} =
  # override this base method
  raise newException(CatchableError, "Method without implementation override")

method eval(e: Literal): int = return e.x

method eval(e: PlusExpr): int =
  # watch out: relies on dynamic binding
  result = eval(e.a) + eval(e.b)

proc newLit(x: int): Literal =
  new(result)
  result.x = x

proc newPlus(a, b: Expression): PlusExpr =
  new(result)
  result.a = a
  result.b = b

echo eval(newPlus(newPlus(newLit(1), newLit(2)), newLit(4)))
In the example the constructors newLit and newPlus are procs because they should use static binding, but eval is a method because it requires dynamic binding.
As can be seen in the example, base methods have to be annotated with the base pragma. The base pragma also acts as a reminder for the programmer that a base method m is used as the foundation to determine all the effects that a call to m might cause.
Note: Compile-time execution is not (yet) supported for methods.
Note: Starting from Nim 0.20, generic methods are deprecated.
Note: Starting from Nim 0.20, to use multi-methods one must explicitly pass --multimethods:on when compiling.
In a multi-method, all parameters that have an object type are used for the dispatching:
type
  Thing = ref object of RootObj
  Unit = ref object of Thing
    x: int

method collide(a, b: Thing) {.base, inline.} =
  quit "to override!"

method collide(a: Thing, b: Unit) {.inline.} =
  echo "1"

method collide(a: Unit, b: Thing) {.inline.} =
  echo "2"

var a, b: Unit
new a
new b
collide(a, b) # output: 2
Dynamic method resolution can be inhibited via the builtin system.procCall. This is somewhat comparable to the super keyword that traditional OOP languages offer.
type
  Thing = ref object of RootObj
  Unit = ref object of Thing
    x: int

method m(a: Thing) {.base.} =
  echo "base"

method m(a: Unit) =
  # Call the base method:
  procCall m(Thing(a))
  echo "1"
The for statement is an abstract mechanism to iterate over the elements of a container. It relies on an iterator to do so. Like while statements, for statements open an implicit block so that they can be left with a break statement.
The for loop declares iteration variables - their scope reaches until the end of the loop body. The iteration variables' types are inferred by the return type of the iterator.
An iterator is similar to a procedure, except that it can be called in the context of a for loop. Iterators provide a way to specify the iteration over an abstract type. The yield statement in the called iterator plays a key role in the execution of a for loop. Whenever a yield statement is reached, the data is bound to the for loop variables and control continues in the body of the for loop. The iterator's local variables and execution state are automatically saved between calls. Example:
# this definition exists in the system module
iterator items*(a: string): char {.inline.} =
  var i = 0
  while i < len(a):
    yield a[i]
    inc(i)

for ch in items("hello world"): # `ch` is an iteration variable
  echo ch
The compiler generates code as if the programmer had written this:
var i = 0
while i < len(a):
  var ch = a[i]
  echo ch
  inc(i)
If the iterator yields a tuple, there can be as many iteration variables as there are components in the tuple. The i'th iteration variable's type is the type of the i'th component. In other words, implicit tuple unpacking in a for loop context is supported.
If the for loop expression e does not denote an iterator and the for loop has exactly 1 variable, the for loop expression is rewritten to items(e); i.e. an items iterator is implicitly invoked:
for x in [1, 2, 3]: echo x
If the for loop has exactly 2 variables, a pairs iterator is implicitly invoked.
Symbol lookup of the identifiers items/pairs is performed after the rewriting step, so that all overloads of items/pairs are taken into account.
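Both implicit rewrites can be sketched as follows. The Countdown type and its items overload are hypothetical and exist only to show that user-defined overloads are picked up:

```nim
type
  Countdown = object  # hypothetical container type for this sketch
    start: int

iterator items(c: Countdown): int =
  var i = c.start
  while i > 0:
    yield i
    dec i

# One loop variable: rewritten to items(Countdown(start: 3)).
var got: seq[int]
for n in Countdown(start: 3):
  got.add n

# Two loop variables: the pairs iterator for arrays is invoked.
var seen: seq[string]
for i, x in ["a", "b", "c"]:
  seen.add $i & "=" & x

echo got, " ", seen
```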
There are 2 kinds of iterators in Nim: inline and closure iterators. An inline iterator is an iterator that's always inlined by the compiler leading to zero overhead for the abstraction, but may result in a heavy increase in code size.
Caution: the body of a for loop over an inline iterator is inlined into each yield statement appearing in the iterator code, so ideally the code should be refactored to contain a single yield when possible to avoid code bloat.
Inline iterators are second-class citizens; they can be passed as parameters only to other inlining code facilities like templates, macros, and other inline iterators.
In contrast to that, a closure iterator can be passed around more freely:
iterator count0(): int {.closure.} =
  yield 0

iterator count2(): int {.closure.} =
  var x = 1
  yield x
  inc x
  yield x

proc invoke(iter: iterator(): int {.closure.}) =
  for x in iter(): echo x

invoke(count0)
invoke(count2)
Closure iterators and inline iterators have some restrictions:
Iterators that are neither marked {.closure.} nor {.inline.} explicitly default to being inline, but this may change in future versions of the implementation.
The iterator type is always of the calling convention closure implicitly; the following example shows how to use iterators to implement a collaborative tasking system:
# simple tasking:
type
  Task = iterator (ticker: int)

iterator a1(ticker: int) {.closure.} =
  echo "a1: A"
  yield
  echo "a1: B"
  yield
  echo "a1: C"
  yield
  echo "a1: D"

iterator a2(ticker: int) {.closure.} =
  echo "a2: A"
  yield
  echo "a2: B"
  yield
  echo "a2: C"

proc runTasks(t: varargs[Task]) =
  var ticker = 0
  while true:
    let x = t[ticker mod t.len]
    if finished(x): break
    x(ticker)
    inc ticker

runTasks(a1, a2)
The builtin system.finished can be used to determine if an iterator has finished its operation; no exception is raised on an attempt to invoke an iterator that has already finished its work.
Note that system.finished is error-prone to use because it only returns true one iteration after the iterator has finished:
iterator mycount(a, b: int): int {.closure.} =
  var x = a
  while x <= b:
    yield x
    inc x

var c = mycount # instantiate the iterator
while not finished(c):
  echo c(1, 3)

# Produces
# 1
# 2
# 3
# 0
Instead, this code has to be used:
var c = mycount # instantiate the iterator
while true:
  let value = c(1, 3)
  if finished(c): break # and discard 'value'!
  echo value
It helps to think that the iterator actually returns a pair (value, done) and finished is used to access the hidden done field.
Closure iterators are resumable functions and so one has to provide the arguments to every call. To get around this limitation one can capture parameters of an outer factory proc:
proc mycount(a, b: int): iterator (): int =
  result = iterator (): int =
    var x = a
    while x <= b:
      yield x
      inc x

let foo = mycount(1, 4)

for f in foo():
  echo f
The call can be made more like an inline iterator with a for loop macro:
import std/macros
macro toItr(x: ForLoopStmt): untyped =
  let expr = x[0]
  let call = x[1][1] # Get foo out of toItr(foo)
  let body = x[2]
  result = quote do:
    block:
      let itr = `call`
      for `expr` in itr():
        `body`

for f in toItr(mycount(1, 4)): # using early `proc mycount`
  echo f
Because invoking a closure iterator involves the full backend function call machinery, it is typically more costly than invoking an inline iterator. Wrapping the call site in a macro like this can serve as a useful reminder of that cost.
The factory proc, as an ordinary procedure, can be recursive. The above macro allows such recursion to look much like a recursive iterator would. For example:
proc recCountDown(n: int): iterator(): int =
  result = iterator(): int =
    if n > 0:
      yield n
      for e in toItr(recCountDown(n - 1)):
        yield e

for i in toItr(recCountDown(6)): # Emits: 6 5 4 3 2 1
  echo i
See also iterable for passing iterators to templates and macros.
A converter is like an ordinary proc except that it enhances the "implicitly convertible" type relation (see Convertible relation):
# bad style ahead: Nim is not C.
converter toBool(x: int): bool = x != 0

if 4:
  echo "compiles"
A converter can also be explicitly invoked for improved readability. Note that implicit converter chaining is not supported: If there is a converter from type A to type B and from type B to type C, the implicit conversion from A to C is not provided.
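A small sketch of explicit versus implicit invocation, using hypothetical `Feet` and `Meters` types introduced only for this illustration. Both calls go through the same converter; the explicit form simply makes the conversion visible at the call site.

```nim
type
  Feet = object             # hypothetical unit types
    value: float
  Meters = object
    value: float

converter toMeters(f: Feet): Meters =
  Meters(value: f.value * 0.3048)

proc lengthInMeters(m: Meters): float =
  m.value

# Implicit: the compiler inserts `toMeters` because a `Meters` is expected.
let implicitResult = lengthInMeters(Feet(value: 10.0))

# Explicit: the same converter called like an ordinary proc.
let explicitResult = lengthInMeters(toMeters(Feet(value: 10.0)))

echo implicitResult == explicitResult
```

If a second converter from `Meters` to some third type existed, a `Feet` value would still not be accepted where that third type is expected, since converters are not chained implicitly.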
Example:
type # example demonstrating mutually recursive types
  Node = ref object  # an object managed by the garbage collector (ref)
    le, ri: Node     # left and right subtrees
    sym: ref Sym     # leaves contain a reference to a Sym

  Sym = object       # a symbol
    name: string     # the symbol's name
    line: int        # the line the symbol was declared in
    code: Node       # the symbol's abstract syntax tree
A type section begins with the type keyword. It contains multiple type definitions. A type definition binds a type to a name. Type definitions can be recursive or even mutually recursive. Mutually recursive types are only possible within a single type section. Nominal types like objects or enums can only be defined in a type section.
Example:
# read the first two lines of a text file that should contain numbers
# and tries to add them
var
  f: File
if open(f, "numbers.txt"):
  try:
    var a = readLine(f)
    var b = readLine(f)
    echo "sum: " & $(parseInt(a) + parseInt(b))
  except OverflowDefect:
    echo "overflow!"
  except ValueError, IOError:
    echo "catch multiple exceptions!"
  except CatchableError:
    echo "Catchable exception!"
  finally:
    close(f)
The statements after the try are executed in sequential order unless an exception e is raised. If the exception type of e matches any listed in an except clause, the corresponding statements are executed. The statements following the except clauses are called exception handlers.
If there is a finally clause, it is always executed after the exception handlers.
The exception is consumed in an exception handler. However, an exception handler may raise another exception. If the exception is not handled, it is propagated through the call stack. This means that often the rest of the procedure - that is not within a finally clause - is not executed (if an exception occurs).
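The ordering rules above can be traced with a short sketch. The procs `handled` and `unhandled` are hypothetical names used only here; each records the steps it reaches into a `seq[string]`.

```nim
proc handled(trace: var seq[string]) =
  try:
    trace.add "try"
    raise newException(ValueError, "boom")
  except ValueError:
    trace.add "except"        # the exception is consumed here
  finally:
    trace.add "finally"       # runs after the handler
  trace.add "after try"       # reached, because the exception was consumed

proc unhandled(trace: var seq[string]; fail: bool) =
  try:
    trace.add "try"
    if fail:
      raise newException(IOError, "io")
  finally:
    trace.add "finally"       # still runs while the exception propagates
  trace.add "after try"       # skipped when the exception propagates

var t1, t2: seq[string]
handled(t1)
try:
  unhandled(t2, fail = true)
except IOError:
  t2.add "caught by caller"

echo t1
echo t2
```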
Try can also be used as an expression; the type of the try branch then needs to fit the types of except branches, but the type of the finally branch always has to be void:
from std/strutils import parseInt

let x = try: parseInt("133a")
        except ValueError: -1
        finally: echo "hi"
To prevent confusing code there is a parsing limitation; if the try follows a ( it has to be written as a one liner:
from std/strutils import parseInt
let x = (try: parseInt("133a") except ValueError: -1)
Within an except clause it is possible to access the current exception using the following syntax:
try:
  # ...
except IOError as e:
  # Now use "e"
  echo "I/O error: " & e.msg
Alternatively, it is possible to use getCurrentException to retrieve the exception that has been raised:
try:
  # ...
except IOError:
  let e = getCurrentException()
  # Now use "e"
Note that getCurrentException always returns a ref Exception type. If a variable of the proper type is needed (in the example above, IOError), one must convert it explicitly:
try:
  # ...
except IOError:
  let e = (ref IOError)(getCurrentException())
  # "e" is now of the proper type
However, this is seldom needed. The most common case is to extract an error message from e, and for such situations, it is enough to use getCurrentExceptionMsg:
try:
  # ...
except CatchableError:
  echo getCurrentExceptionMsg()
It is possible to create custom exceptions. A custom exception is a custom type:
type
  LoadError* = object of Exception
Ending the custom exception's name with Error is recommended.
Custom exceptions can be raised just like any other exception, e.g.:
raise newException(LoadError, "Failed to load data")
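Since a custom exception is an ordinary object type, it can also carry extra fields. The following sketch assumes a `LoadError` with a hypothetical `path` field and a hypothetical `load` proc, and derives from CatchableError so that a generic handler for recoverable errors would also match it.

```nim
type
  LoadError = object of CatchableError
    path: string              # hypothetical extra field, for illustration

proc load(path: string) =
  # `load` is a made-up proc that always fails, to demonstrate raising.
  var e = newException(LoadError, "Failed to load data")
  e.path = path
  raise e

var report = ""
try:
  load("data.bin")
except LoadError as e:
  report = e.msg & " (" & e.path & ")"

echo report
```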
Instead of a try finally statement a defer statement can be used, which avoids lexical nesting and offers more flexibility in terms of scoping as shown below.
Any statements following the defer will be considered to be in an implicit try block in the current block:
proc main =
  var f = open("numbers.txt", fmWrite)
  defer: close(f)
  f.write "abc"
  f.write "def"
Is rewritten to:
proc main =
  var f = open("numbers.txt", fmWrite)
  try:
    f.write "abc"
    f.write "def"
  finally:
    close(f)
When defer is at the outermost scope of a template/macro, its scope extends to the block where the template/macro is called from:
template safeOpenDefer(f, path) =
  var f = open(path, fmWrite)
  defer: close(f)

template safeOpenFinally(f, path, body) =
  var f = open(path, fmWrite)
  try: body # without `defer`, `body` must be specified as parameter
  finally: close(f)

block:
  safeOpenDefer(f, "/tmp/z01.txt")
  f.write "abc"
block:
  safeOpenFinally(f, "/tmp/z01.txt"):
    f.write "abc" # adds a lexical scope
block:
  var f = open("/tmp/z01.txt", fmWrite)
  try:
    f.write "abc" # adds a lexical scope
  finally: close(f)
Top-level defer statements are not supported since it's unclear what such a statement should refer to.
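To see the execution order without touching the file system, here is a minimal sketch that records steps in a sequence. The names `work` and `order` are hypothetical; the deferred statement runs when the enclosing block (here the proc body) is left.

```nim
var order: seq[string] = @[]

proc work() =
  order.add "open"
  defer: order.add "close"    # runs when the proc body is left
  order.add "use 1"
  order.add "use 2"

work()
echo order
```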
Example:
raise newException(IOError, "IO failed")
Apart from built-in operations like array indexing, memory allocation, etc. the raise statement is the only way to raise an exception.
If no exception name is given, the current exception is re-raised. The ReraiseDefect exception is raised if there is no exception to re-raise. It follows that the raise statement always raises an exception.
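A brief sketch of re-raising: the hypothetical proc `middle` observes an exception, records that it saw it, and passes the same exception on with a bare raise.

```nim
var events: seq[string] = @[]

proc inner() =
  raise newException(ValueError, "bad input")

proc middle() =
  try:
    inner()
  except ValueError:
    events.add "middle saw it"
    raise                     # no exception name: re-raises the current one

try:
  middle()
except ValueError as e:
  events.add "outer caught: " & e.msg

echo events
```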
The exception tree is defined in the system module. Every exception inherits from system.Exception. Exceptions that indicate programming bugs inherit from system.Defect (which is a subtype of Exception) and are strictly speaking not catchable as they can also be mapped to an operation that terminates the whole process. If panics are turned into exceptions, these exceptions inherit from Defect.
Exceptions that indicate any other runtime error that can be caught inherit from system.CatchableError (which is a subtype of Exception).
Exception
|-- CatchableError
|   |-- IOError
|   |   `-- EOFError
|   |-- OSError
|   |-- ResourceExhaustedError
|   `-- ValueError
|       `-- KeyError
`-- Defect
    |-- AccessViolationDefect
    |-- ArithmeticDefect
    |   |-- DivByZeroDefect
    |   `-- OverflowDefect
    |-- AssertionDefect
    |-- DeadThreadDefect
    |-- FieldDefect
    |-- FloatingPointDefect
    |   |-- FloatDivByZeroDefect
    |   |-- FloatInvalidOpDefect
    |   |-- FloatOverflowDefect
    |   |-- FloatUnderflowDefect
    |   `-- InexactDefect
    |-- IndexDefect
    |-- NilAccessDefect
    |-- ObjectAssignmentDefect
    |-- ObjectConversionDefect
    |-- OutOfMemoryDefect
    |-- RangeDefect
    |-- ReraiseDefect
    `-- StackOverflowDefect
It is possible to raise/catch imported C++ exceptions. Types imported using importcpp can be raised or caught. Exceptions are raised by value and caught by reference. Example:
type
  CStdException {.importcpp: "std::exception", header: "<exception>", inheritable.} = object
    ## does not inherit from `RootObj`, so we use `inheritable` instead
  CRuntimeError {.requiresInit, importcpp: "std::runtime_error", header: "<stdexcept>".} = object of CStdException
    ## `CRuntimeError` has no default constructor => `requiresInit`
proc what(s: CStdException): cstring {.importcpp: "((char *)#.what())".}
proc initRuntimeError(a: cstring): CRuntimeError {.importcpp: "std::runtime_error(@)", constructor.}
proc initStdException(): CStdException {.importcpp: "std::exception()", constructor.}

proc fn() =
  let a = initRuntimeError("foo")
  doAssert $a.what == "foo"
  var b = ""
  try: raise initRuntimeError("foo2")
  except CStdException as e:
    doAssert e is CStdException
    b = $e.what()
  doAssert b == "foo2"

  try: raise initStdException()
  except CStdException: discard

  try: raise initRuntimeError("foo3")
  except CRuntimeError as e:
    b = $e.what()
  except CStdException:
    doAssert false
  doAssert b == "foo3"

fn()
Note: getCurrentException() and getCurrentExceptionMsg() are not available for imported exceptions from C++. One needs to use the except ImportedException as x: syntax and rely on functionality of the x object to get exception details.
Note: The rules for effect tracking changed with the release of version 1.6 of the Nim compiler.
Nim supports exception tracking. The raises pragma can be used to explicitly define which exceptions a proc/iterator/method/converter is allowed to raise. The compiler verifies this:
proc p(what: bool) {.raises: [IOError, OSError].} =
  if what: raise newException(IOError, "IO")
  else: raise newException(OSError, "OS")
An empty raises list (raises: []) means that no exception may be raised:
proc p(): bool {.raises: [].} =
  try:
    unsafeCall()
    result = true
  except CatchableError:
    result = false
A raises list can also be attached to a proc type. This affects type compatibility:
type
  Callback = proc (s: string) {.raises: [IOError].}
var
  c: Callback

proc p(x: string) =
  raise newException(OSError, "OS")

c = p # type error
For a routine p, the compiler uses inference rules to determine the set of possibly raised exceptions; the algorithm operates on p's call graph:
Exceptions inheriting from system.Defect are not tracked with the .raises: [] exception tracking mechanism. This is more consistent with the built-in operations. The following code is valid:
proc mydiv(a, b: int): int {.raises: [].} =
  a div b # can raise a DivByZeroDefect
And so is:
proc mydiv(a, b: int): int {.raises: [].} =
  if b == 0: raise newException(DivByZeroDefect, "division by zero")
  else: result = a div b
The reason for this is that DivByZeroDefect inherits from Defect and with --panics:on Defects become unrecoverable errors. (Since version 1.4 of the language.)
Rules 1-2 of the exception tracking inference rules (see the previous section) ensure the following works:
proc weDontRaiseButMaybeTheCallback(callback: proc()) {.raises: [], effectsOf: callback.} =
  callback()

proc doRaise() {.raises: [IOError].} =
  raise newException(IOError, "IO")

proc use() {.raises: [].} =
  # doesn't compile! Can raise IOError!
  weDontRaiseButMaybeTheCallback(doRaise)
As can be seen from the example, a parameter of type proc (...) can be annotated as .effectsOf. Such a parameter allows for effect polymorphism: The proc weDontRaiseButMaybeTheCallback raises the exceptions that callback raises.
So in many cases a callback does not cause the compiler to be overly conservative in its effect analysis:
{.push warningAsError[Effect]: on.}

import std/algorithm

type
  MyInt = distinct int

var toSort = @[MyInt 1, MyInt 2, MyInt 3]

proc cmpN(a, b: MyInt): int =
  cmp(a.int, b.int)

proc harmless {.raises: [].} =
  toSort.sort cmpN

proc cmpE(a, b: MyInt): int {.raises: [Exception].} =
  cmp(a.int, b.int)

proc harmful {.raises: [].} =
  # does not compile, `sort` can now raise Exception
  toSort.sort cmpE
Exception tracking is part of Nim's effect system. Raising an exception is an effect. Other effects can also be defined. A user defined effect is a means to tag a routine and to perform checks against this tag:
type IO = object ## input/output effect
proc readLine(): string {.tags: [IO].} = discard

proc no_effects_please() {.tags: [].} =
  # the compiler prevents this:
  let x = readLine()
A tag has to be a type name. A tags list - like a raises list - can also be attached to a proc type. This affects type compatibility.
The inference for tag tracking is analogous to the inference for exception tracking.
There is also a way which can be used to forbid certain effects:
type IO = object ## input/output effect
proc readLine(): string {.tags: [IO].} = discard
proc echoLine(): void = discard

proc no_IO_please() {.forbids: [IO].} =
  # this is OK because it didn't define any tag:
  echoLine()
  # the compiler prevents this:
  let y = readLine()
The forbids pragma defines a list of illegal effects - if any statement invokes any of those effects, the compilation will fail. Procedure types with any disallowed effect are the subtypes of equal procedure types without such lists:
type MyEffect = object
type ProcType1 = proc (i: int): void {.forbids: [MyEffect].}
type ProcType2 = proc (i: int): void

proc caller1(p: ProcType1): void = p(1)
proc caller2(p: ProcType2): void = p(1)

proc effectful(i: int): void {.tags: [MyEffect].} = echo $i
proc effectless(i: int): void {.forbids: [MyEffect].} = echo $i

proc toBeCalled1(i: int): void = effectful(i)
proc toBeCalled2(i: int): void = effectless(i)

## this will fail because toBeCalled1 uses MyEffect which was forbidden by ProcType1:
caller1(toBeCalled1)
## this is OK because both toBeCalled2 and ProcType1 have the same requirements:
caller1(toBeCalled2)
## these are OK because ProcType2 doesn't have any effect requirement:
caller2(toBeCalled1)
caller2(toBeCalled2)
ProcType2 is a subtype of ProcType1. Unlike with the tags pragma, the parent context - the function which calls other functions with forbidden effects - doesn't inherit the forbidden list of effects.
The noSideEffect pragma is used to mark a proc/iterator that can have only side effects through parameters. This means that the proc/iterator only changes locations that are reachable from its parameters and the return value only depends on the parameters. If none of its parameters have the type var, ref, ptr, cstring, or proc, then no locations are modified.
In other words, a routine has no side effects if it does not access a threadlocal or global variable and it does not call any routine that has a side effect.
It is a static error to mark a proc/iterator to have no side effect if the compiler cannot verify this.
As a special semantic rule, the built-in debugEcho pretends to be free of side effects so that it can be used for debugging routines marked as noSideEffect.
func is syntactic sugar for a proc with no side effects:
func `+` (x, y: int): int
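The following sketch contrasts routines that qualify as side-effect free with one that does not. All names (`counter`, `square`, `appendSquare`, `nextId`) are hypothetical and chosen only for this illustration.

```nim
var counter = 0               # global state

func square(x: int): int =
  x * x                       # depends only on its parameter

func appendSquare(acc: var seq[int]; x: int) =
  # Changing a location reachable through a `var` parameter is permitted.
  acc.add square(x)

proc nextId(): int =
  # Modifies a global variable, so this must remain a `proc`;
  # declaring it as `func` would be rejected by the compiler.
  inc counter
  counter

var xs: seq[int] = @[]
appendSquare(xs, 3)
let firstId = nextId()

echo square(4)
echo xs
echo firstId
```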
To override the compiler's side effect analysis a {.noSideEffect.} cast pragma block can be used:
func f() =
  {.cast(noSideEffect).}:
    echo "test"
Side effects are usually inferred. The inference for side effects is analogous to the inference for exception tracking.
When the compiler cannot infer side effects, as is the case for imported functions, one can annotate them with the sideEffect pragma.
We call a proc p GC safe when it doesn't access any global variable that contains GC'ed memory (string, seq, ref or a closure) either directly or indirectly through a call to a GC unsafe proc.
The GC safety property is usually inferred. The inference for GC safety is analogous to the inference for exception tracking.
The gcsafe annotation can be used to mark a proc to be gcsafe, otherwise this property is inferred by the compiler. Note that noSideEffect implies gcsafe.
Routines that are imported from C are always assumed to be gcsafe.
To override the compiler's gcsafety analysis a {.cast(gcsafe).} pragma block can be used:
var
  someGlobal: string = "some string here"
  perThread {.threadvar.}: string

proc setPerThread() =
  {.cast(gcsafe).}:
    deepCopy(perThread, someGlobal)
The effects pragma has been designed to assist the programmer with the effects analysis. It is a statement that makes the compiler output all inferred effects up to the effects's position:
proc p(what: bool) =
  if what:
    raise newException(IOError, "IO")
    {.effects.}
  else:
    raise newException(OSError, "OS")
The compiler produces a hint message that IOError can be raised. OSError is not listed as it cannot be raised in the branch the effects pragma appears in.
Generics are Nim's means to parametrize procs, iterators or types with type parameters. Depending on the context, the brackets are used either to introduce type parameters or to instantiate a generic proc, iterator, or type.
The following example shows how a generic binary tree can be modeled:
type
  BinaryTree*[T] = ref object # BinaryTree is a generic type with
                              # generic parameter `T`
    le, ri: BinaryTree[T]     # left and right subtrees; may be nil
    data: T                   # the data stored in a node

proc newNode*[T](data: T): BinaryTree[T] =
  # constructor for a node
  result = BinaryTree[T](le: nil, ri: nil, data: data)

proc add*[T](root: var BinaryTree[T], n: BinaryTree[T]) =
  # insert a node into the tree
  if root == nil:
    root = n
  else:
    var it = root
    while it != nil:
      # compare the data items; uses the generic `cmp` proc
      # that works for any type that has a `==` and `<` operator
      var c = cmp(it.data, n.data)
      if c < 0:
        if it.le == nil:
          it.le = n
          return
        it = it.le
      else:
        if it.ri == nil:
          it.ri = n
          return
        it = it.ri

proc add*[T](root: var BinaryTree[T], data: T) =
  # convenience proc:
  add(root, newNode(data))

iterator preorder*[T](root: BinaryTree[T]): T =
  # Preorder traversal of a binary tree.
  # This uses an explicit stack (which is more efficient than
  # a recursive iterator factory).
  var stack: seq[BinaryTree[T]] = @[root]
  while stack.len > 0:
    var n = stack.pop()
    while n != nil:
      yield n.data
      add(stack, n.ri)  # push right subtree onto the stack
      n = n.le          # and follow the left pointer

var
  root: BinaryTree[string]  # instantiate a BinaryTree with `string`
add(root, newNode("hello")) # instantiates `newNode` and `add`
add(root, "world")          # instantiates the second `add` proc
for str in preorder(root):
  stdout.writeLine(str)
The T is called a generic type parameter or a type variable.
Let's consider the anatomy of a generic proc to agree on defined terminology.
p[T: t](arg1: f): y
The use of the word "formal" here is to denote the symbols as they are defined by the programmer, not as they may be at compile time contextually. Since generics may be instantiated and types bound, we have more than one entity to think about when generics are involved.
The usage of a generic will resolve the formally defined expression into an instance of that expression bound to only concrete types. This process is called "instantiation".
Brackets at the site of a generic's formal definition specify the "constraints" as in:
type Foo[T] = object
proc p[H;T: Foo[H]](param: T): H
A constraint definition may have more than one symbol defined by separating each definition by a ;. Notice how T is composed of H and the return type of p is defined as H. When this generic proc is instantiated H will be bound to a concrete type, thus making T concrete and the return type of p will be bound to the same concrete type used to define H.
Brackets at the site of usage can be used to supply concrete types to instantiate the generic in the same order that the symbols are defined in the constraint. Alternatively, type bindings may be inferred by the compiler in some situations, allowing for cleaner code.
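Both forms of instantiation can be sketched with a small generic proc. `firstOr` is a hypothetical helper written for this example, not a standard library routine.

```nim
proc firstOr[T](xs: seq[T]; fallback: T): T =
  ## Returns the first element of `xs`, or `fallback` if `xs` is empty.
  if xs.len > 0: xs[0] else: fallback

# Inferred: `T` is bound to `int` from the argument types.
let inferred = firstOr(@[3, 4], 0)

# Explicit: the concrete type is supplied in brackets at the usage site.
let explicit = firstOr[float](newSeq[float](), 1.5)

echo inferred
echo explicit
```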
The is operator is evaluated during semantic analysis to check for type equivalence. It is therefore very useful for type specialization within generic code:
type
  Table[Key, Value] = object
    keys: seq[Key]
    values: seq[Value]
    when not (Key is string): # empty value for strings used for optimization
      deletedKeys: seq[bool]
A type class is a special pseudo-type that can be used to match against types in the context of overload resolution or the is operator. Nim supports the following built-in type classes:
type class | matches |
---|---|
object | any object type |
tuple | any tuple type |
enum | any enumeration |
proc | any proc type |
iterator | any iterator type |
ref | any ref type |
ptr | any ptr type |
var | any var type |
distinct | any distinct type |
array | any array type |
set | any set type |
seq | any seq type |
auto | any type |
Furthermore, every generic type automatically creates a type class of the same name that will match any instantiation of the generic type.
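As an illustration, the sketch below defines a hypothetical generic type `Box` and then uses the bare name `Box` as a type class, so that a single proc accepts any instantiation.

```nim
type
  Box[T] = object             # hypothetical generic type
    item: T

proc describe(b: Box): string =
  # `Box` without brackets matches Box[int], Box[string], ...
  "Box of " & $typeof(b.item)

let intBox = describe(Box[int](item: 1))
let strBox = describe(Box[string](item: "x"))

echo intBox
echo strBox
echo Box[int] is Box
```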
Type classes can be combined using the standard boolean operators to form more complex type classes:
# create a type class that will match all tuple and object types
type RecordType = (tuple or object)

proc printFields[T: RecordType](rec: T) =
  for key, value in fieldPairs(rec):
    echo key, " = ", value
Type constraints on generic parameters can be grouped with , and propagation stops with ;, similarly to parameters for macros and templates:
proc fn1[T; U, V: SomeFloat]() = discard # T is unconstrained
template fn2(t; u, v: SomeFloat) = discard # t is unconstrained
Whilst the syntax of type classes appears to resemble that of ADTs/algebraic data types in ML-like languages, it should be understood that type classes are static constraints to be enforced at type instantiations. Type classes are not really types in themselves but are instead a system of providing generic "checks" that ultimately resolve to some singular type. Type classes do not allow for runtime type dynamism, unlike object variants or methods.
As an example, the following would not compile:
type TypeClass = int | string
var foo: TypeClass = 2 # foo's type is resolved to an int here
foo = "this will fail" # error here, because foo is an int
Nim allows for type classes and regular types to be specified as type constraints of the generic type parameter:
proc onlyIntOrString[T: int|string](x, y: T) = discard

onlyIntOrString(450, 616) # valid
onlyIntOrString(5.0, 0.0) # type mismatch
onlyIntOrString("xy", 50) # invalid as 'T' cannot be both at the same time
proc and iterator type classes also accept a calling convention pragma to restrict the calling convention of the matching proc or iterator type.
proc onlyClosure[T: proc {.closure.}](x: T) = discard

onlyClosure(proc() = echo "hello") # valid
proc foo() {.nimcall.} = discard
onlyClosure(foo) # type mismatch
A type class can be used directly as the parameter's type.
# create a type class that will match all tuple and object types
type RecordType = (tuple or object)

proc printFields(rec: RecordType) =
  for key, value in fieldPairs(rec):
    echo key, " = ", value
Procedures utilizing type classes in such a manner are considered to be implicitly generic. They will be instantiated once for each unique combination of parameter types used within the program.
By default, during overload resolution, each named type class will bind to exactly one concrete type. We call such type classes bind once types. Here is an example taken directly from the system module to illustrate this:
proc `==`*(x, y: tuple): bool =
  ## requires `x` and `y` to be of the same tuple type
  ## generic `==` operator for tuples that is lifted from the components
  ## of `x` and `y`.
  result = true
  for a, b in fields(x, y):
    if a != b: result = false
Alternatively, the distinct type modifier can be applied to the type class to allow each parameter matching the type class to bind to a different type. Such type classes are called bind many types.
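The difference can be sketched with a hypothetical named type class `Number`. In `bindOnce` both parameters must resolve to the same concrete type, while the distinct modifier in `bindMany` lets the second parameter bind independently.

```nim
type Number = int | float     # hypothetical named type class

proc bindOnce(a, b: Number): string =
  # `Number` binds once: `a` and `b` share one concrete type.
  $typeof(a) & "," & $typeof(b)

proc bindMany(a: Number; b: distinct Number): string =
  # `distinct Number` may bind to a different concrete type than `a`.
  $typeof(a) & "," & $typeof(b)

let once = bindOnce(1, 2)
let many = bindMany(1, 2.5)

echo once
echo many
```

A call such as `bindOnce(1, 2.5)` is expected to be rejected, because `Number` has already been bound to `int` by the first argument.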
Procs written with the implicitly generic style will often need to refer to the type parameters of the matched generic type. They can be easily accessed using the dot syntax:
type Matrix[T, Rows, Columns] = object
  ...

proc `[]`(m: Matrix, row, col: int): Matrix.T =
  m.data[col * high(Matrix.Columns) + row]
Here are more examples that illustrate implicit generics:
proc p(t: Table; k: Table.Key): Table.Value

# is roughly the same as:

proc p[Key, Value](t: Table[Key, Value]; k: Key): Value

proc p(a: Table, b: Table)

# is roughly the same as:

proc p[Key, Value](a, b: Table[Key, Value])

proc p(a: Table, b: distinct Table)

# is roughly the same as:

proc p[Key, Value, KeyB, ValueB](a: Table[Key, Value], b: Table[KeyB, ValueB])
typedesc used as a parameter type also introduces an implicit generic. typedesc has its own set of rules:
proc p(a: typedesc)

# is roughly the same as:

proc p[T](a: typedesc[T])
typedesc is a "bind many" type class:
proc p(a, b: typedesc)

# is roughly the same as:

proc p[T, T2](a: typedesc[T], b: typedesc[T2])
A parameter of type typedesc is itself usable as a type. If it is used as a type, it's the underlying type. In other words, one level of "typedesc"-ness is stripped off:
proc p(a: typedesc; b: a) = discard

# is roughly the same as:
proc p[T](a: typedesc[T]; b: T) = discard

# hence this is a valid call:
p(int, 4)
# as parameter 'a' requires a type, but 'b' requires a value.
The types var T and typedesc[T] cannot be inferred in a generic instantiation. The following is not allowed:
proc g[T](f: proc(x: T); x: T) =
  f(x)

proc c(y: int) = echo y
proc v(y: var int) =
  y += 100
var i: int

# allowed: infers 'T' to be of type 'int'
g(c, 42)

# not valid: 'T' is not inferred to be of type 'var int'
g(v, i)

# also not allowed: explicit instantiation via 'var int'
g[var int](v, i)
The symbol binding rules in generics are slightly subtle: There are "open" and "closed" symbols. A "closed" symbol cannot be re-bound in the instantiation context, an "open" symbol can. Per default, overloaded symbols are open and every other symbol is closed.
Open symbols are looked up in two different contexts: Both the context at definition and the context at instantiation are considered:
type
  Index = distinct int

proc `==` (a, b: Index): bool {.borrow.}

var a = (0, 0.Index)
var b = (0, 0.Index)

echo a == b # works!
In the example, the generic `==` for tuples (as defined in the system module) uses the == operators of the tuple's components. However, the == for the Index type is defined after the == for tuples; yet the example compiles as the instantiation takes the currently defined symbols into account too.
A symbol can be forced to be open by a mixin declaration:
proc create*[T](): ref T =
  # there is no overloaded 'init' here, so we need to state that it's an
  # open symbol explicitly:
  mixin init
  new result
  init result
mixin statements only make sense in templates and generics.
The bind statement is the counterpart to the mixin statement. It can be used to explicitly declare identifiers that should be bound early (i.e. the identifiers should be looked up in the scope of the template/generic definition):
# Module A
var
  lastId = 0

template genId*: untyped =
  bind lastId
  inc(lastId)
  lastId

# Module B
import A

echo genId()
But a bind is rarely useful because symbol binding from the definition scope is the default.
bind statements only make sense in templates and generics.
The following example outlines a problem that can arise when generic instantiations cross multiple different modules:
# module A
proc genericA*[T](x: T) =
  mixin init
  init(x)

import C

# module B
proc genericB*[T](x: T) =
  # Without the `bind init` statement C's init proc is
  # not available when `genericB` is instantiated:
  bind init
  genericA(x)

# module C
type O = object
proc init*(x: var O) = discard

# module main
import B, C

genericB O()
Module B has an init proc from module C in its scope, but that proc is not taken into account when genericB is instantiated, which in turn leads to the instantiation of genericA. The solution is to forward these symbols by a bind statement inside genericB.
A template is a simple form of a macro: It is a simple substitution mechanism that operates on Nim's abstract syntax trees. It is processed in the semantic pass of the compiler.
The syntax to invoke a template is the same as calling a procedure.
Example:
template `!=` (a, b: untyped): untyped =
  # this definition exists in the system module
  not (a == b)

assert(5 != 6) # the compiler rewrites that to: assert(not (5 == 6))
The !=, >, >=, in, notin, isnot operators are in fact templates:
a > b is transformed into b < a.
a in b is transformed into contains(b, a).
notin and isnot have the obvious meanings.
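A practical consequence of these rewrites is that a user-defined type only needs `<` and `contains` to gain `>` and `in`/`notin`. The sketch below uses two hypothetical types, `Version` and `Span`, to show this.

```nim
type
  Version = object            # hypothetical types for illustration
    major, minor: int
  Span = object
    lo, hi: int

proc `<`(a, b: Version): bool =
  a.major < b.major or (a.major == b.major and a.minor < b.minor)

proc contains(s: Span; x: int): bool =
  s.lo <= x and x <= s.hi

let v1 = Version(major: 1, minor: 2)
let v2 = Version(major: 1, minor: 10)
let span = Span(lo: 1, hi: 9)

let newer = v2 > v1           # rewritten to `v1 < v2`
let differs = v1 != v2        # rewritten to `not (v1 == v2)`
let inside = 5 in span        # rewritten to `contains(span, 5)`
let outside = 12 notin span   # rewritten to `not contains(span, 12)`

echo newer, " ", differs, " ", inside, " ", outside
```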
The "types" of templates can be the symbols untyped, typed or typedesc. These are "meta types", they can only be used in certain contexts. Regular types can be used too; this implies that typed expressions are expected.
An untyped parameter means that symbol lookups and type resolution are not performed before the expression is passed to the template. This means that undeclared identifiers, for example, can be passed to the template:
template declareInt(x: untyped) =
  var x: int

declareInt(x) # valid
x = 3
template declareInt(x: typed) =
  var x: int

declareInt(x) # invalid, because x has not been declared and so it has no type
A template where every parameter is untyped is called an immediate template. For historical reasons, templates can be explicitly annotated with an immediate pragma and then these templates do not take part in overloading resolution and the parameters' types are ignored by the compiler. Explicit immediate templates are now deprecated.
Note: For historical reasons, stmt was an alias for typed and expr was an alias for untyped, but both have been removed.
One can pass a block of statements as the last argument to a template following the special : syntax:
template withFile(f, fn, mode, actions: untyped): untyped =
  var f: File
  if open(f, fn, mode):
    try:
      actions
    finally:
      close(f)
  else:
    quit("cannot open: " & fn)

withFile(txt, "ttempl3.txt", fmWrite):  # special colon
  txt.writeLine("line 1")
  txt.writeLine("line 2")
In the example, the two writeLine statements are bound to the actions parameter.
Usually, to pass a block of code to a template, the parameter that accepts the block needs to be of type untyped, because symbol lookups are then delayed until template instantiation time:
template t(body: typed) =
  proc p = echo "hey"
  block:
    body

t:
  p()  # fails with 'undeclared identifier: p'
The above code fails with the error message that p is not declared. The reason for this is that the p() body is type-checked before getting passed to the body parameter and type checking in Nim implies symbol lookups. The same code works with untyped as the passed body is not required to be type-checked:
template t(body: untyped) =
  proc p = echo "hey"
  block:
    body

t:
  p()  # compiles
In addition to the untyped meta-type that prevents type checking, there is also varargs[untyped] so that not even the number of parameters is fixed:
template hideIdentifiers(x: varargs[untyped]) = discard

hideIdentifiers(undeclared1, undeclared2)
However, since a template cannot iterate over varargs, this feature is generally much more useful for macros.
A template is a hygienic macro and so opens a new scope. Most symbols are bound from the definition scope of the template:
# Module A
var
  lastId = 0

template genId*: untyped =
  inc(lastId)
  lastId

# Module B
import A

echo genId() # Works as 'lastId' has been bound in 'genId's defining scope
As in generics, symbol binding can be influenced via mixin or bind statements.
In templates, identifiers can be constructed with the backticks notation:
templatetypedef(name:untyped,typ:typedesc)=type`Tname`*{.inject.}=typ`Pname`*{.inject.}=ref`Tname`typedef(myint,int)varx:PMyInt
In the example,name is instantiated withmyint, so `T name` becomesTmyint.
A parameter p in a template is even substituted in the expression x.p. Thus, template arguments can be used as field names and a global symbol can be shadowed by the same argument name even when fully qualified:

# module 'm'

type
  Lev = enum
    levA, levB

var abclev = levB

template tstLev(abclev: Lev) =
  echo abclev, " ", m.abclev

tstLev(levA)
# produces: 'levA levA'

But the global symbol can properly be captured by a bind statement:

# module 'm'

type
  Lev = enum
    levA, levB

var abclev = levB

template tstLev(abclev: Lev) =
  bind m.abclev
  echo abclev, " ", m.abclev

tstLev(levA)
# produces: 'levA levB'
By default, templates are hygienic: local identifiers declared in a template cannot be accessed in the instantiation context:

template newException*(exceptn: typedesc, message: string): untyped =
  var
    e: ref exceptn  # e is implicitly gensym'ed here
  new(e)
  e.msg = message
  e

# so this works:
let e = "message"
raise newException(IoError, e)

Whether a symbol that is declared in a template is exposed to the instantiation scope is controlled by the inject and gensym pragmas: gensym'ed symbols are not exposed but inject'ed symbols are.
The default for symbols of entity type, var, let and const is gensym. For proc, iterator, converter, template, macro, the default is inject, but if a gensym symbol with the same name is defined in the same syntax-level scope, it will be gensym by default. This can be overridden by marking the routine as inject.
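The difference between the two defaults can be sketched with a pair of hypothetical templates (declareHidden and declareVisible are illustrative names, not from the manual). The first relies on the gensym default for var, the second opts into inject explicitly.

```nim
# Hypothetical templates contrasting the two pragmas.
template declareHidden() =
  var hidden = 1               # `var` symbols default to gensym
  discard hidden

template declareVisible() =
  var visible {.inject.} = 2   # explicitly exposed to the caller's scope

declareHidden()
declareVisible()

echo visible             # the injected symbol is usable here
echo compiles(hidden)    # the gensym'ed symbol is not visible by name
```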
If the name of the entity is passed as a template parameter, it is an inject'ed symbol:

template withFile(f, fn, mode: untyped, actions: untyped): untyped =
  block:
    var f: File  # since 'f' is a template parameter, it's injected implicitly
    ...

withFile(txt, "ttempl3.txt", fmWrite):
  txt.writeLine("line 1")
  txt.writeLine("line 2")

The inject and gensym pragmas are second class annotations; they have no semantics outside a template definition and cannot be abstracted over:

{.pragma myInject: inject.}

template t() =
  var x {.myInject.}: int # does NOT work

To get rid of hygiene in templates, one can use the dirty pragma for a template. inject and gensym have no effect in dirty templates.
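A short sketch of a dirty template follows; declareCounter is a hypothetical name chosen for illustration. Without the pragma, counter would be gensym'ed and the inc call below would not compile.

```nim
# Hypothetical template: with `dirty`, nothing is gensym'ed, so `counter`
# is declared directly in the scope of the call site.
template declareCounter() {.dirty.} =
  var counter = 0

declareCounter()
inc counter
echo counter
```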
gensym'ed symbols cannot be used as field in the x.field syntax. Nor can they be used in the ObjectConstruction(field: value) and namedParameterCall(field = value) syntactic constructs.
The reason for this is that code like

type
  T = object
    f: int

template tmp(x: T) =
  let f = 34
  echo x.f, T(f: 4)

should work as expected.
However, this means that the method call syntax is not available for gensym'ed symbols:

template tmp(x) =
  type
    T {.gensym.} = int

  echo x.T # invalid: instead use: 'echo T(x)'.

tmp(12)

The expression x in x.f needs to be semantically checked (that means symbol lookup and type checking) before it can be decided that it needs to be rewritten to f(x). Therefore, the dot syntax has some limitations when it is used to invoke templates/macros:

template declareVar(name: untyped) =
  const name {.inject.} = 45

# Doesn't compile:
unknownIdentifier.declareVar

It is also not possible to use fully qualified identifiers with module symbol in method call syntax. The order in which the dot operator binds to symbols prohibits this.

import std/sequtils

var myItems = @[1,3,3,7]
let N1 = count(myItems, 3) # OK
let N2 = sequtils.count(myItems, 3) # fully qualified, OK
let N3 = myItems.count(3) # OK
let N4 = myItems.sequtils.count(3) # illegal, `myItems.sequtils` can't be resolved

This means that when for some reason a procedure needs a disambiguation through the module name, the call needs to be written in function call syntax.
A macro is a special function that is executed at compile time. Normally, the input for a macro is an abstract syntax tree (AST) of the code that is passed to it. The macro can then do transformations on it and return the transformed AST. This can be used to add custom language features and implement domain-specific languages.
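Macro invocation is a case where semantic analysis does not entirely proceed top to bottom and left to right. Instead, semantic analysis happens at least twice:
- Semantic analysis recognizes and resolves the macro invocation.
- The compiler executes the macro body (which may invoke other procs).
- It replaces the AST of the macro invocation with the AST returned by the macro.
- It repeats semantic analysis of that region of the code.
- If the AST returned by the macro contains other macro invocations, this process iterates.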
While macros enable advanced compile-time code transformations, they cannot change Nim's syntax.
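Style note: For code readability, it is best to use the least powerful programming construct that remains expressive. So the "check list" is:
- Use an ordinary proc/iterator, if possible.
- Else: Use a generic proc/iterator, if possible.
- Else: Use a template, if possible.
- Else: Use a macro.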
The following example implements a powerful debug command that accepts a variable number of arguments:

# to work with Nim syntax trees, we need an API that is defined in the
# `macros` module:
import std/macros

macro debug(args: varargs[untyped]): untyped =
  # `args` is a collection of `NimNode` values that each contain the
  # AST for an argument of the macro. A macro always has to
  # return a `NimNode`. A node of kind `nnkStmtList` is suitable for
  # this use case.
  result = nnkStmtList.newTree()
  # iterate over any argument that is passed to this macro:
  for n in args:
    # add a call to the statement list that writes the expression;
    # `repr` renders an AST as its string representation:
    result.add newCall("write", newIdentNode("stdout"), newLit(n.repr))
    # add a call to the statement list that writes ": "
    result.add newCall("write", newIdentNode("stdout"), newLit(": "))
    # add a call to the statement list that writes the expression's value:
    result.add newCall("writeLine", newIdentNode("stdout"), n)

var
  a: array[0..10, int]
  x = "some string"
a[0] = 42
a[1] = 45

debug(a[0], a[1], x)
The macro call expands to:
write(stdout, "a[0]")
write(stdout, ": ")
writeLine(stdout, a[0])

write(stdout, "a[1]")
write(stdout, ": ")
writeLine(stdout, a[1])

write(stdout, "x")
write(stdout, ": ")
writeLine(stdout, x)

Arguments that are passed to a varargs parameter are wrapped in an array constructor expression. This is why debug iterates over all of args's children.
The above debug macro relies on the fact that write, writeLine and stdout are declared in the system module and are thus visible in the instantiating context. There is a way to use bound identifiers (aka symbols) instead of using unbound identifiers. The bindSym builtin can be used for that:

import std/macros

macro debug(n: varargs[typed]): untyped =
  result = newNimNode(nnkStmtList, n)
  for x in n:
    # we can bind symbols in scope via 'bindSym':
    add(result, newCall(bindSym"write", bindSym"stdout", toStrLit(x)))
    add(result, newCall(bindSym"write", bindSym"stdout", newStrLitNode(": ")))
    add(result, newCall(bindSym"writeLine", bindSym"stdout", x))

var
  a: array[0..10, int]
  x = "some string"
a[0] = 42
a[1] = 45

debug(a[0], a[1], x)
The macro call expands to:
write(stdout, "a[0]")
write(stdout, ": ")
writeLine(stdout, a[0])

write(stdout, "a[1]")
write(stdout, ": ")
writeLine(stdout, a[1])

write(stdout, "x")
write(stdout, ": ")
writeLine(stdout, x)

In this version of debug, the symbols write, writeLine and stdout are already bound and are not looked up again. As the example shows, bindSym does work with overloaded symbols implicitly.
Note that the symbol names passed to bindSym have to be constant. The experimental feature dynamicBindSym (experimental manual) allows this value to be computed dynamically.
Macros can receive of, elif, else, except, finally and do blocks (including their different forms such as do with routine parameters) as arguments if called in statement form.

macro performWithUndo(task, undo: untyped) = ...

performWithUndo do:
  # multiple-line block of code
  # to perform the task
do:
  # code to undo it

let num = 12
# a single colon may be used if there is no initial block
match (num mod 3, num mod 5):
of (0, 0):
  echo "FizzBuzz"
of (0, _):
  echo "Fizz"
of (_, 0):
  echo "Buzz"
else:
  echo num
A macro that takes as its only input parameter an expression of the special type system.ForLoopStmt can rewrite the entirety of a for loop:

import std/macros

macro example(loop: ForLoopStmt) =
  result = newTree(nnkForStmt)    # Create a new For loop.
  result.add loop[^3]             # This is "item".
  result.add loop[^2][^1]         # This is "[1, 2, 3]".
  result.add newCall(bindSym"echo", loop[0])

for item in example([1, 2, 3]): discard
Expands to:
for item in items([1, 2, 3]):
  echo item
Another example:
import std/macros

macro enumerate(x: ForLoopStmt): untyped =
  expectKind x, nnkForStmt
  # check if the starting count is specified:
  var countStart = if x[^2].len == 2: newLit(0) else: x[^2][1]
  result = newStmtList()
  # we strip off the first for loop variable and use it as an integer counter:
  result.add newVarStmt(x[0], countStart)
  var body = x[^1]
  if body.kind != nnkStmtList:
    body = newTree(nnkStmtList, body)
  body.add newCall(bindSym"inc", x[0])
  var newFor = newTree(nnkForStmt)
  for i in 1..x.len-3:
    newFor.add x[i]
  # transform enumerate(X) to 'X'
  newFor.add x[^2][^1]
  newFor.add body
  result.add newFor
  # now wrap the whole macro in a block to create a new scope
  result = quote do:
    block: `result`

for a, b in enumerate(items([1, 2, 3])):
  echo a, " ", b

# without wrapping the macro in a block, we'd need to choose different
# names for `a` and `b` here to avoid redefinition errors
for a, b in enumerate(10, [1, 2, 3, 5]):
  echo a, " ", b
Macros named `case` can provide implementations of case statements for certain types. The following is an example of such an implementation for tuples, leveraging the existing equality operator for tuples (as provided in system.==):

import std/macros

macro `case`(n: tuple): untyped =
  result = newTree(nnkIfStmt)
  let selector = n[0]
  for i in 1 ..< n.len:
    let it = n[i]
    case it.kind
    of nnkElse, nnkElifBranch, nnkElifExpr, nnkElseExpr:
      result.add it
    of nnkOfBranch:
      for j in 0..it.len-2:
        let cond = newCall("==", selector, it[j])
        result.add newTree(nnkElifBranch, cond, it[^1])
    else:
      error "custom 'case' for tuple cannot handle this node", it

case ("foo", 78)
of ("foo", 78): echo "yes"
of ("bar", 88): echo "no"
else: discard

case macros are subject to overload resolution. The type of the case statement's selector expression is matched against the type of the first argument of the case macro. Then the complete case statement is passed in place of the argument and the macro is evaluated.
In other words, the macro needs to transform the full case statement but only the statement's selector expression is used to determine which macro to call.
As their name suggests, static parameters must be constant expressions:
proc precompiledRegex(pattern: static string): RegEx =
  var res {.global.} = re(pattern)
  return res

precompiledRegex("/d+") # Replaces the call with a precompiled
                        # regex, stored in a global variable

precompiledRegex(paramStr(1)) # Error, command-line options
                              # are not constant expressions
For the purposes of code generation, all static parameters are treated as generic parameters - the proc will be compiled separately for each unique supplied value (or combination of values).
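A minimal sketch of this per-value instantiation, using a hypothetical proc zeros: because n is a compile-time constant inside each instantiation, it can be used as an array length, which would not be possible for an ordinary runtime int parameter.

```nim
# Hypothetical example: `n` is known at compile time in every instantiation,
# so it can serve as the length of a fixed-size array.
proc zeros(n: static int): seq[int] =
  var buf: array[n, int]   # requires a constant length
  result = @buf

echo zeros(3)
echo zeros(5).len
```

Here zeros(3) and zeros(5) are compiled as two separate procs, one per supplied value.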
Static parameters can also appear in the signatures of generic types:
type
  Matrix[M,N: static int; T: Number] = array[0..(M*N - 1), T]
    # Note how `Number` is just a type constraint here, while
    # `static int` requires us to supply an int value

  AffineTransform2D[T] = Matrix[3, 3, T]
  AffineTransform3D[T] = Matrix[4, 4, T]

var m1: AffineTransform3D[float]  # OK
var m2: AffineTransform2D[string] # Error, `string` is not a `Number`

Please note that static T is just a syntactic convenience for the underlying generic type static[T]. The type parameter can be omitted to obtain the type class of all constant expressions. A more specific type class can be created by instantiating static with another type class.
One can force an expression to be evaluated at compile time as a constant expression by coercing it to a corresponding static type:

import std/math

echo static(fac(5)), " ", static[bool](16.isPowerOfTwo)
The compiler will report any failure to evaluate the expression or a possible type mismatch error.
In many contexts, Nim treats the names of types as regular values. These values exist only during the compilation phase, but since all values must have a type, typedesc is considered their special type.
typedesc acts as a generic type. For instance, the type of the symbol int is typedesc[int]. Just like with regular generic types, when the generic parameter is omitted, typedesc denotes the type class of all types. As a syntactic convenience, one can also use typedesc as a modifier.
Procs featuring typedesc parameters are considered implicitly generic. They will be instantiated for each unique combination of supplied types, and within the body of the proc, the name of each parameter will refer to the bound concrete type:

proc new(T: typedesc): ref T =
  echo "allocating ", T.name
  new(result)

var n = Node.new
var tree = new(BinaryTree[int])
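The example above depends on types declared elsewhere. A self-contained sketch of the same idea, using a hypothetical proc describe and the name helper from std/typetraits, could look like this:

```nim
import std/typetraits  # provides `name` for typedesc values

# Hypothetical proc: inside the body, `T` refers to the concrete type that
# was bound at the call site, so each call below uses its own instantiation.
proc describe(T: typedesc): string =
  result = T.name & " uses " & $sizeof(T) & " byte(s)"

echo describe(int32)
echo describe(char)
```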
When multiple type parameters are present, they will bind freely to different types. To force a bind-once behavior, one can use an explicit generic parameter:
proc acceptOnlyTypePairs[T, U](A, B: typedesc[T]; C, D: typedesc[U])
Once bound, type parameters can appear in the rest of the proc signature:
template declareVariableWithType(T: typedesc, value: T) =
  var x: T = value

declareVariableWithType int, 42
Overload resolution can be further influenced by constraining the set of types that will match the type parameter. This works in practice by attaching attributes to types via templates. The constraint can be a concrete type or a type class.
template maxval(T: typedesc[int]): int = high(int)
template maxval(T: typedesc[float]): float = Inf

var i = int.maxval
var f = float.maxval
when false:
  var s = string.maxval # error, maxval is not implemented for string

template isNumber(t: typedesc[object]): string = "Don't think so."
template isNumber(t: typedesc[SomeInteger]): string = "Yes!"
template isNumber(t: typedesc[SomeFloat]): string = "Maybe, could be NaN."

echo "is int a number? ", isNumber(int)
echo "is float a number? ", isNumber(float)
echo "is RootObj a number? ", isNumber(RootObj)

Passing typedesc to a macro is almost identical, with the difference that the macro is not instantiated generically. The type expression is simply passed as a NimNode to the macro, like everything else.

import std/macros

macro forwardType(arg: typedesc): typedesc =
  # `arg` is of type `NimNode`
  let tmp: NimNode = arg
  result = tmp

var tmp: forwardType(int)
Note: typeof(x) can for historical reasons also be written as type(x) but type(x) is discouraged.
One can obtain the type of a given expression by constructing a typeof value from it (in many other languages this is known as the typeof operator):

var x = 0
var y: typeof(x) # y has type int

If typeof is used to determine the result type of a proc/iterator/converter call c(X) (where X stands for a possibly empty list of arguments), the interpretation, where c is an iterator, is preferred over the other interpretations, but this behavior can be changed by passing typeOfProc as the second argument to typeof:

iterator split(s: string): string = discard
proc split(s: string): seq[string] = discard

# since an iterator is the preferred interpretation, this has the type `string`:
assert typeof("a b c".split) is string

assert typeof("a b c".split, typeOfProc) is seq[string]
Nim supports splitting a program into pieces by a module concept. Each module needs to be in its own file and has its own namespace. Modules enable information hiding and separate compilation. A module may gain access to the symbols of another module by the import statement. Recursive module dependencies are allowed, but are slightly subtle. Only top-level symbols that are marked with an asterisk (*) are exported. A valid module name can only be a valid Nim identifier (and thus its filename is identifier.nim).
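The algorithm for compiling modules is:
- Compile the whole module as usual, following import statements recursively.
- If there is a cycle, only import the already parsed symbols (that are exported); if an unknown identifier occurs then abort.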
This is best illustrated by an example:
# Module A
type
  T1* = int  # Module A exports the type `T1`
import B     # the compiler starts parsing B

proc main() =
  var i = p(3) # works because B has been parsed completely here

main()

# Module B
import A  # A is not parsed here! Only the already known symbols
          # of A are imported.

proc p*(x: A.T1): A.T1 =
  # this works because the compiler has already
  # added T1 to A's interface symbol table
  result = x + 1

After the import keyword, a list of module names can follow or a single module name followed by an except list to prevent some symbols from being imported:

import std/strutils except `%`, toUpperAscii

# doesn't work then:
echo "$1" % "abc".toUpperAscii

It is not checked that the except list is really exported from the module. This feature allows us to compile against different versions of the module, even when one version does not export some of these identifiers.
The import statement is only allowed at the top level.
String literals can be used for import/include statements. The compiler performs path substitution when used.
The include statement does something fundamentally different than importing a module: it merely includes the contents of a file. The include statement is useful to split up a large module into several files:

include fileA, fileB, fileC

The include statement can be used outside the top level, as such:

# Module A
echo "Hello World!"

# Module B
proc main() =
  include A

main() # => Hello World!
A module alias can be introduced via the as keyword, after which the original module name is inaccessible:

import std/strutils as su, std/sequtils as qu

echo su.format("$1", "lalelu")

The notations path/to/module or "path/to/module" can be used to refer to a module in subdirectories:

import lib/pure/os, "lib/pure/times"

Note that the module name is still strutils and not lib/pure/strutils, thus one cannot do:

import lib/pure/strutils
echo lib/pure/strutils.toUpperAscii("abc")

Likewise, the following does not make sense as the name is strutils already:

import lib/pure/strutils as strutils

The syntax import dir/[moduleA, moduleB] can be used to import multiple modules from the same directory.
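As a brief illustration of the bracket form, the sketch below imports two standard library modules from the std pseudo directory in one statement and uses a symbol from each. The variable name upper is chosen only for this example.

```nim
import std/[strutils, sequtils]

# `split` and `toUpperAscii` come from strutils, `mapIt` from sequtils.
let upper = "a,b".split(',').mapIt(it.toUpperAscii)
echo upper
```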
Path names are syntactically either Nim identifiers or string literals. If the path name is not a valid Nim identifier it needs to be a string literal:
import "gfx/3d/somemodule" # in quotes because '3d' is not a valid Nim identifier
A directory can also be a so-called "pseudo directory". They can be used to avoid ambiguity when there are multiple modules with the same path.
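There are two pseudo directories:
- std: The std pseudo directory is the abstract location of Nim's standard library. For example, the syntax import std / strutils is used to unambiguously refer to the standard library's strutils module.
- pkg: The pkg pseudo directory is used to unambiguously refer to a Nimble package. However, for technical details that lie outside the scope of this document, its semantics are: Use the search path to look for module name but ignore the standard library locations. In other words, it is the opposite of std.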
It is recommended and preferred but not currently enforced that all stdlib module imports include the std/ "pseudo directory" as part of the import name.
After the from keyword comes a module name, followed by import and a list of the symbols one wants to use without explicit full qualification:

from std/strutils import `%`

echo "$1" % "abc"
# always possible: full qualification:
echo strutils.replace("abc", "a", "z")

It's also possible to use from module import nil if one wants to import the module but wants to enforce fully qualified access to every symbol in module.
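A small sketch of the import nil form, using std/strutils as the imported module (the variable name shout is illustrative): qualified access works, while the unqualified name is not in scope.

```nim
from std/strutils import nil

# Qualified access is available ...
let shout = strutils.toUpperAscii("abc")
echo shout

# ... but the unqualified name was not imported:
echo compiles(toUpperAscii("abc"))
```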
An export statement can be used for symbol forwarding so that client modules don't need to import a module's dependencies:

# module B
type MyObject* = object

# module A
import B
export B.MyObject

proc `$`*(x: MyObject): string = "my object"

# module C
import A

# B.MyObject has been imported implicitly here:
var x: MyObject
echo $x

When the exported symbol is another module, all of its definitions will be forwarded. One can use an except list to exclude some of the symbols.
Notice that when exporting, one needs to specify only the module name:

import foo/bar/baz
export baz
Identifiers are valid from the point of their declaration until the end of the block in which the declaration occurred. The range where the identifier is known is the scope of the identifier. The exact scope of an identifier depends on the way it was declared.
The scope of a variable declared in the declaration part of a block is valid from the point of declaration until the end of the block. If a block contains a second block, in which the identifier is redeclared, then inside this block, the second declaration will be valid. Upon leaving the inner block, the first declaration is valid again. An identifier cannot be redefined in the same block, except if valid for procedure or iterator overloading purposes.
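The shadowing rule can be sketched as follows. The identifiers seen and name are chosen only for this illustration.

```nim
var seen: seq[string]

let name = "outer"
block:
  let name = "inner"   # redeclaration is allowed in the nested block
  seen.add name        # refers to the inner declaration
seen.add name          # the outer declaration is valid again

echo seen
```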
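The field identifiers inside a tuple or object definition are valid in the following places:
- To the end of the tuple/object definition.
- Field designators of a variable of the given tuple/object type.
- In all descendant types of the object type.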
All identifiers of a module are valid from the point of declaration until the end of the module. Identifiers from indirectly dependent modules are not available. The system module is automatically imported in every module.
If a module imports the same identifier from two different modules, the identifier is considered ambiguous, which can be resolved in the following ways:
Using the identifier in a context where the compiler can infer the type of the identifier resolves ambiguity in the case that one definition matches the type stronger than the others.
# Module A
var x*: string
proc foo*(a: string) =
  echo "A: ", a

# Module B
var x*: int
proc foo*(b: int) =
  echo "B: ", b

# Module C
import A, B

foo("abc") # A: abc
foo(123) # B: 123
let inferred: proc (x: string) = foo
foo("def") # A: def

write(stdout, x) # error: x is ambiguous
write(stdout, A.x) # no error: qualifier used

proc bar(a: int): int = a + 1
assert bar(x) == x + 1 # no error: only B.x of type int matches

var x = 4
write(stdout, x) # not ambiguous: uses the module C's x
Modules can share their name. However, when trying to qualify an identifier with the module name, the compiler will fail with an ambiguous identifier error. One can qualify the identifier by aliasing the module.

# Module A/C
proc fb* = echo "fizz"

# Module B/C
proc fb* = echo "buzz"

import A/C
import B/C

C.fb() # Error: ambiguous identifier: 'C'

import A/C as fizz
import B/C

fizz.fb() # Works

A collection of modules in a file tree with an identifier.nimble file in the root of the tree is called a Nimble package. A valid package name can only be a valid Nim identifier and thus its filename is identifier.nimble where identifier is the desired package name. A module without a .nimble file is assigned the package identifier: unknown.
The distinction between packages allows diagnostic compiler messages to be scoped to the current project's package vs foreign packages.
The Nim compiler emits different kinds of messages: hint, warning, and error messages. An error message is emitted if the compiler encounters any static error.
Pragmas are Nim's method to give the compiler additional information / commands without introducing a massive number of new keywords. Pragmas are processed on the fly during semantic checking. Pragmas are enclosed in the special {. and .} curly brackets. Pragmas are also often used as a first implementation to play with a language feature before a nicer syntax to access the feature becomes available.
The deprecated pragma is used to mark a symbol as deprecated:
proc p() {.deprecated.}
var x {.deprecated.}: char
This pragma can also take in an optional warning string to relay to developers.
proc thing(x: bool) {.deprecated: "use thong instead".}

The compileTime pragma is used to mark a proc or variable to be used only during compile-time execution. No code will be generated for it. Compile-time procs are useful as helpers for macros. Since version 0.12.0 of the language, a proc that uses system.NimNode within its parameter types is implicitly declared compileTime:

proc astHelper(n: NimNode): NimNode =
  result = n

Is the same as:

proc astHelper(n: NimNode): NimNode {.compileTime.} =
  result = n
compileTime variables are available at runtime too. This simplifies certain idioms where variables are filled at compile-time (for example, lookup tables) but accessed at runtime:
import std/macros

var nameToProc {.compileTime.}: seq[(string, proc (): string {.nimcall.})]

macro registerProc(p: untyped): untyped =
  result = newTree(nnkStmtList, p)

  let procName = p[0]
  let procNameAsStr = $p[0]
  result.add quote do:
    nameToProc.add((`procNameAsStr`, `procName`))

proc foo: string {.registerProc.} = "foo"
proc bar: string {.registerProc.} = "bar"
proc baz: string {.registerProc.} = "baz"

doAssert nameToProc[2][1]() == "baz"

The noreturn pragma is used to mark a proc that never returns.
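A minimal sketch of the pragma in use follows; fail and parseDigit are hypothetical names. The proc fail always raises, so control never comes back to its caller, and the annotation records that fact for the compiler.

```nim
# Hypothetical helper that never returns normally: it always raises.
proc fail(msg: string) {.noreturn.} =
  raise newException(ValueError, msg)

proc parseDigit(c: char): int =
  if c in {'0'..'9'}:
    result = ord(c) - ord('0')
  else:
    fail("not a digit: " & c)

echo parseDigit('7')
```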
The acyclic pragma can be used for object types to mark them as acyclic even though they seem to be cyclic. This is an optimization for the garbage collector to not consider objects of this type as part of a cycle:

type
  Node = ref NodeObj
  NodeObj {.acyclic.} = object
    left, right: Node
    data: string

Or if we directly use a ref object:

type
  Node {.acyclic.} = ref object
    left, right: Node
    data: string

In the example, a tree structure is declared with the Node type. Note that the type definition is recursive and the GC has to assume that objects of this type may form a cyclic graph. The acyclic pragma passes the information that this cannot happen to the GC. If the programmer uses the acyclic pragma for data types that are in reality cyclic, this may result in memory leaks, but memory safety is preserved.
The final pragma can be used for an object type to specify that it cannot be inherited from. Note that inheritance is only available for objects that inherit from an existing object (via the object of SuperType syntax) or that have been marked as inheritable.
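The two pragmas mentioned above can be combined as in the following sketch. Shape and Circle are hypothetical types: Shape is opened up for inheritance with inheritable, and Circle closes the hierarchy again with final.

```nim
type
  # `inheritable` allows other object types to derive from Shape.
  Shape {.inheritable.} = object
    name: string
  # `final` forbids further `object of Circle` declarations.
  Circle {.final.} = object of Shape
    radius: float

let c = Circle(name: "circle", radius: 1.5)
echo c.name, " ", c.radius
```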
The shallow pragma affects the semantics of a type: The compiler is allowed to make a shallow copy. This can cause serious semantic issues and break memory safety! However, it can speed up assignments considerably, because the semantics of Nim require deep copying of sequences and strings. This can be expensive, especially if sequences are used to build a tree structure:

type
  NodeKind = enum nkLeaf, nkInner
  Node {.shallow.} = object
    case kind: NodeKind
    of nkLeaf:
      strVal: string
    of nkInner:
      children: seq[Node]

An object type can be marked with the pure pragma so that its type field which is used for runtime type identification is omitted. This used to be necessary for binary compatibility with other compiled languages.
An enum type can be marked as pure. Then access of its fields always requires full qualification.
A proc can be marked with the asmNoStackFrame pragma to tell the compiler it should not generate a stack frame for the proc. There are also no exit statements like return result; generated and the generated C function is declared as __declspec(naked) or __attribute__((naked)) (depending on the used C compiler).
Note: This pragma should only be used by procs which consist solely of assembler statements.
The error pragma is used to make the compiler output an error message with the given content. The compilation does not necessarily abort after an error though.
The error pragma can also be used to annotate a symbol (like an iterator or proc). The usage of the symbol then triggers a static error. This is especially useful to rule out that some operation is valid due to overloading and type conversions:

## check that underlying int values are compared and not the pointers:
proc `==`(x, y: ptr int): bool {.error.}
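The same technique can be sketched with a hypothetical conversion proc. Without the second overload, toCelsius(212) would compile because the integer literal converts to float; the error overload is a better match for int arguments and turns that call into a static error.

```nim
# Hypothetical example: forbid calling `toCelsius` with an int argument.
proc toCelsius(f: float): float = (f - 32.0) * 5.0 / 9.0
proc toCelsius(f: int): float {.error: "pass a float, not an int".}

echo toCelsius(212.0)
echo compiles(toCelsius(212))   # using the `error` overload does not compile
```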
The fatal pragma is used to make the compiler output an error message with the given content. In contrast to the error pragma, the compilation is guaranteed to be aborted by this pragma. Example:

when not defined(objc):
  {.fatal: "Compile this program with the objc command!".}

The warning pragma is used to make the compiler output a warning message with the given content. Compilation continues after the warning.
The hint pragma is used to make the compiler output a hint message with the given content. Compilation continues after the hint.
The line pragma can be used to affect line information of the annotated statement, as seen in stack backtraces:

template myassert*(cond: untyped, msg = "") =
  if not cond:
    # change run-time line information of the 'raise' statement:
    {.line: instantiationInfo().}:
      raise newException(AssertionDefect, msg)

If the line pragma is used with a parameter, the parameter needs to be a tuple[filename: string, line: int]. If it is used without a parameter, system.instantiationInfo() is used.
The linearScanEnd pragma can be used to tell the compiler how to compile a Nim case statement. Syntactically it has to be used as a statement:

case myInt
of 0:
  echo "most common case"
of 1:
  {.linearScanEnd.}
  echo "second most common case"
of 2: echo "unlikely: use branch table"
else: echo "unlikely too: use branch table for ", myInt
In the example, the case branches 0 and 1 are much more common than the other cases. Therefore, the generated assembler code should test for these values first so that the CPU's branch predictor has a good chance to succeed (avoiding an expensive CPU pipeline stall). The other cases might be put into a jump table for O(1) overhead but at the cost of a (very likely) pipeline stall.
The linearScanEnd pragma should be put into the last branch that should be tested against via linear scanning. If put into the last branch of the whole case statement, the whole case statement uses linear scanning.
The computedGoto pragma can be used to tell the compiler how to compile a Nim case in a while true statement. Syntactically it has to be used as a statement inside the loop:

type
  MyEnum = enum
    enumA, enumB, enumC, enumD, enumE

proc vm() =
  var instructions: array[0..100, MyEnum]
  instructions[2] = enumC
  instructions[3] = enumD
  instructions[4] = enumA
  instructions[5] = enumD
  instructions[6] = enumC
  instructions[7] = enumA
  instructions[8] = enumB

  instructions[12] = enumE
  var pc = 0
  while true:
    {.computedGoto.}
    let instr = instructions[pc]
    case instr
    of enumA:
      echo "yeah A"
    of enumC, enumD:
      echo "yeah CD"
    of enumB:
      echo "yeah B"
    of enumE:
      break
    inc(pc)

vm()

As the example shows, computedGoto is mostly useful for interpreters. If the underlying backend (C compiler) does not support the computed goto extension the pragma is simply ignored.
The immediate pragma is obsolete. See Typed vs untyped parameters.
Redefinition of template symbols with the same signature is allowed. This can be made explicit with the redefine pragma:

template foo: int = 1
echo foo() # 1
template foo: int {.redefine.} = 2
echo foo() # 2
# warning: implicit redefinition of template
template foo: int = 3
This is mostly intended for macro generated code.
The pragmas listed here can be used to override the code generation options for a proc/method/converter.
The implementation currently provides the following possible options (various others may be added later).
pragma | allowed values | description |
---|---|---|
checks | on\|off | Turns the code generation for all runtime checks on or off. |
boundChecks | on\|off | Turns the code generation for array bound checks on or off. |
overflowChecks | on\|off | Turns the code generation for over- or underflow checks on or off. |
nilChecks | on\|off | Turns the code generation for nil pointer checks on or off. |
assertions | on\|off | Turns the code generation for assertions on or off. |
warnings | on\|off | Turns the warning messages of the compiler on or off. |
hints | on\|off | Turns the hint messages of the compiler on or off. |
optimization | none\|speed\|size | Optimize the code for speed or size, or disable optimization. |
patterns | on\|off | Turns the term rewriting templates/macros on or off. |
callconv | cdecl\|... | Specifies the default calling convention for all procedures (and procedure types) that follow. |
Example:
{.checks: off, optimization: speed.}
# compile without runtime checks and optimize for speed

The push/pop pragmas are very similar to the option directive, but are used to override the settings temporarily. Example:

{.push checks: off.}
# compile this section without runtime checks as it is
# speed critical
# ... some code ...
{.pop.} # restore old settings

push/pop can also switch some standard library pragmas on and off, for example:

{.push inline.}
proc thisIsInlined(): int = 42
func willBeInlined(): float = 42.0
{.pop.}
proc notInlined(): int = 9

{.push discardable, boundChecks: off, compileTime, noSideEffect, experimental.}
template example(): string = "https://nim-lang.org"
{.pop.}

{.push deprecated, used, stackTrace: off.}
proc sample(): bool = true
{.pop.}

For third-party pragmas, the effect depends on their implementation, but the syntax is the same.
The register pragma is for variables only. It declares the variable as register, giving the compiler a hint that the variable should be placed in a hardware register for faster access. C compilers usually ignore this though and for good reasons: Often they do a better job without it anyway.
However, in highly specific cases (a dispatch loop of a bytecode interpreter for example) it may provide benefits.
The global pragma can be applied to a variable within a proc to instruct the compiler to store it in a global location and initialize it once at program startup.

proc isHexNumber(s: string): bool =
  var pattern {.global.} = re"[0-9a-fA-F]+"
  result = s.match(pattern)
When used within a generic proc, a separate unique global variable will be created for each instantiation of the proc. The order of initialization of the created global variables within a module is not defined, but all of them will be initialized after any top-level variables in their originating module and before any variable in a module that imports it.
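The basic effect of the global pragma can be sketched without any library dependency. The proc nextId below is a hypothetical, non-generic example: its counter is stored in a global location, so it is initialized once and keeps its value between calls.

```nim
# Hypothetical example: `counter` lives in a global location and is
# initialized only once, so its value persists across calls.
proc nextId(): int =
  var counter {.global.} = 0
  inc counter
  result = counter

let first = nextId()
let second = nextId()
let third = nextId()
echo first, " ", second, " ", third
```

In a generic proc, each instantiation would get its own such variable, as described above.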
Nim generates some warnings and hints that may annoy the user. A mechanism for disabling certain messages is provided: Each hint and warning message is associated with a symbol. This is the message's identifier, which can be used to enable or disable the message by putting it in brackets following the pragma:
{.hint[XDeclaredButNotUsed]: off.} # Turn off the hint about declared but not used symbols.
This is often better than disabling all warnings at once.
Nim produces a warning for symbols that are not exported and not used either. Theused pragma can be attached to a symbol to suppress this warning. This is particularly useful when the symbol was generated by a macro:
templateimplementArithOps(T)=procechoAdd(a,b:T){.used.}=echoa+bprocechoSub(a,b:T){.used.}=echoa-b# no warning produced for the unused 'echoSub'implementArithOps(int)echoAdd3,5
used can also be used as a top-level statement to mark a module as "used". This prevents the "Unused import" warning:
# module: debughelper.nim
when defined(nimHasUsed):
  # 'import debughelper' is so useful for debugging
  # that Nim shouldn't produce a warning for that import,
  # even if currently unused:
  {.used.}
The experimental pragma enables experimental language features. Depending on the concrete feature, this means that the feature is either considered too unstable for an otherwise stable release or that the future of the feature is uncertain (it may be removed at any time). See the experimental manual for more details.
Example:
import std/threadpool
{.experimental: "parallel".}
proc threadedEcho(s: string, i: int) =
  echo(s, " ", $i)
proc useParallel() =
  parallel:
    for i in 0..4:
      spawn threadedEcho("echo in parallel", i)
useParallel()
As a top-level statement, the experimental pragma enables a feature for the rest of the module it's enabled in. This is problematic for macro and generic instantiations that cross a module scope. Currently, these usages have to be put into a .push/pop environment:
# client.nim
proc useParallel*[T](unused: T) =
  # use a generic T here to show the problem.
  {.push experimental: "parallel".}
  parallel:
    for i in 0..4:
      echo "echo in parallel"
  {.pop.}
# main module
import client
useParallel(1)
This section describes additional pragmas that the current Nim implementation supports but which should not be seen as part of the language specification.
The bitsize pragma is for object field members. It declares the field as a bitfield in C/C++.
type
  mybitfield = object
    flag {.bitsize: 1.}: cuint
generates:
struct mybitfield {
  unsigned int flag:1;
};
Nim automatically determines the size of an enum. But when wrapping a C enum type, it needs to be of a specific size. The size pragma allows specifying the size of the enum type.
type
  EventType* {.size: sizeof(uint32).} = enum
    QuitEvent,
    AppTerminating,
    AppLowMemory
doAssert sizeof(EventType) == sizeof(uint32)
When used for enum types, the size pragma accepts only the values 1, 2, 4 or 8.
The size pragma can also specify the size of an importc incomplete object type so that one can get the size of it at compile time even if it was declared without fields.
type
  AtomicFlag* {.importc: "atomic_flag", header: "<stdatomic.h>", size: 1.} = object
static:
  # if AtomicFlag didn't have the size pragma, this code would result in a compile time error.
  echo sizeof(AtomicFlag)
The align pragma is for variables and object field members. It modifies the alignment requirement of the entity being declared. The argument must be a constant power of 2. Valid non-zero alignments that are weaker than other align pragmas on the same declaration are ignored. Alignments that are weaker than the alignment requirement of the type are ignored.
type
  sseType = object
    sseData {.align(16).}: array[4, float32]
  # every object will be aligned to 128-byte boundary
  Data = object
    x: char
    cacheline {.align(128).}: array[128, char] # over-aligned array of char,
proc main() =
  echo "sizeof(Data) = ", sizeof(Data), " (1 byte + 127 bytes padding + 128-byte array)"
  # output: sizeof(Data) = 256 (1 byte + 127 bytes padding + 128-byte array)
  echo "alignment of sseType is ", alignof(sseType)
  # output: alignment of sseType is 16
  var d {.align(2048).}: Data # this instance of data is aligned even stricter
main()
This pragma has no effect on the JS backend.
Since version 1.4 of the Nim compiler, there is a .noalias annotation for variables and parameters. It is mapped directly to C/C++'s restrict keyword and means that the underlying pointer is pointing to a unique location in memory; no other aliases to this location exist. It is unchecked that this alias restriction is followed. If the restriction is violated, the backend optimizer is free to miscompile the code. This is an unsafe language feature.
Ideally in later versions of the language, the restriction will be enforced at compile time. (This is also why the name noalias was chosen instead of a more verbose name like unsafeAssumeNoAlias.)
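A minimal sketch of how the annotation might be used on parameters, assuming the C backend. The Buffer alias and the addInto proc are hypothetical names chosen for this illustration. The caller is responsible for passing buffers that really do not overlap:

```nim
# Hypothetical alias and proc, for illustration only.
type Buffer = ptr UncheckedArray[float32]

# Both parameters promise that no other pointer refers to the same memory.
# Nothing checks this promise; overlapping buffers would be a bug.
proc addInto(dst {.noalias.}: Buffer; src {.noalias.}: Buffer; n: int) =
  for i in 0 ..< n:
    dst[i] += src[i]

var a = [1.0'f32, 2.0'f32, 3.0'f32]
var b = [10.0'f32, 20.0'f32, 30.0'f32]

# `a` and `b` are distinct arrays, so the no-alias promise holds here.
addInto(cast[Buffer](addr a[0]), cast[Buffer](addr b[0]), a.len)
echo a
```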
The volatile pragma is for variables only. It declares the variable as volatile, whatever that means in C/C++ (its semantics are not well-defined in C/C++).
Note: This pragma will not exist for the LLVM backend.
The nodecl pragma can be applied to almost any symbol (variable, proc, type, etc.) and is sometimes useful for interoperability with C: It tells Nim that it should not generate a declaration for the symbol in the C code. For example:
var
  EACCES {.importc, nodecl.}: cint # pretend EACCES was a variable, as
                                   # Nim does not know its value
However, the header pragma is often the better alternative.
Note: This will not work for the LLVM backend.
The header pragma is very similar to the nodecl pragma: It can be applied to almost any symbol and specifies that it should not be declared and instead, the generated code should contain an #include:
type
  PFile {.importc: "FILE*", header: "<stdio.h>".} = distinct pointer
    # import C's FILE* type; Nim will treat it as a new pointer type
The header pragma always expects a string constant. The string constant contains the header file: As usual for C, a system header file is enclosed in angle brackets: <>. If no angle brackets are given, Nim encloses the header file in "" in the generated C code.
Note: This will not work for the LLVM backend.
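As a small sketch of the header pragma applied to a proc, the following imports C's strlen from the system header <string.h>. It assumes the C (or C++) backend:

```nim
# The declaration of strlen comes from <string.h>; Nim emits an #include
# for it instead of generating its own prototype.
proc strlen(s: cstring): csize_t {.importc: "strlen", header: "<string.h>".}

# A Nim string literal converts implicitly to cstring.
let n = strlen("hello")
echo n
```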
The incompleteStruct pragma tells the compiler to not use the underlying C struct in a sizeof expression:
type
  DIR* {.importc: "DIR", header: "<dirent.h>",
         pure, incompleteStruct.} = object
The compile pragma can be used to compile and link a C/C++ source file with the project:
This pragma can take three forms. The first is a simple file input:
{.compile: "myfile.cpp".}
The second form is a tuple whose second element is a strutils format string for the output file name:
{.compile: ("file.c", "$1.o").}
Note: Nim computes a SHA1 checksum and only recompiles the file if it has changed. One can use the -f command-line option to force the recompilation of the file.
Since 1.4 the compile pragma is also available with this syntax:
{.compile("myfile.cpp", "--custom flags here").}
As can be seen in the example, this new variant allows for custom flags that are passed to the C compiler when the file is recompiled.
The link pragma can be used to link an additional file with the project:
{.link: "myfile.o".}
The passc pragma can be used to pass additional parameters to the C compiler like one would use the command-line switch --passc:
{.passc: "-Wall -Werror".}
Note that one can use gorge from the system module to embed parameters from an external command that will be executed during semantic analysis:
{.passc: gorge("pkg-config --cflags sdl").}
The localPassC pragma can be used to pass additional parameters to the C compiler, but only for the C/C++ file that is produced from the Nim module the pragma resides in:
# Module A.nim
# Produces: A.nim.cpp
{.localPassC: "-Wall -Werror".} # Passed when compiling A.nim.cpp
The passl pragma can be used to pass additional parameters to the linker like one would use the command-line switch --passl:
{.passl: "-lSDLmain -lSDL".}
Note that one can use gorge from the system module to embed parameters from an external command that will be executed during semantic analysis:
{.passl: gorge("pkg-config --libs sdl").}
The emit pragma can be used to directly affect the output of the compiler's code generator. The code is then unportable to other code generators/backends. Its usage is highly discouraged! However, it can be extremely useful for interfacing with C++ or Objective C code.
Example:
{.emit: """
static int cvariable = 420;
""".}
{.push stackTrace: off.}
proc embedsC() =
  var nimVar = 89
  # access Nim symbols within an emit section outside of string literals:
  {.emit: ["""fprintf(stdout, "%d\n", cvariable + (int)""", nimVar, ");"].}
{.pop.}
embedsC()
nimbase.h defines the NIM_EXTERNC C macro, which can be used for extern "C" code to work with both nim c and nim cpp, e.g.:
proc foobar() {.importc: "$1".}
{.emit: """
#include <stdio.h>
NIM_EXTERNC
void fun(){}
""".}
For a top-level emit statement, the section where in the generated C/C++ file the code should be emitted can be influenced via the prefixes /*TYPESECTION*/ or /*VARSECTION*/ or /*INCLUDESECTION*/:
{.emit: """/*TYPESECTION*/
struct Vector3 {
public:
  Vector3(): x(5) {}
  Vector3(float x_): x(x_) {}
  float x;
};
""".}
type Vector3 {.importcpp: "Vector3", nodecl} = object
  x: cfloat
proc constructVector3(a: cfloat): Vector3 {.importcpp: "Vector3(@)", nodecl}
Note: c2nim can parse a large subset of C++ and knows about the importcpp pragma pattern language. It is not necessary to know all the details described here.
Similar to the importc pragma for C, the importcpp pragma can be used to import C++ methods or C++ symbols in general. The generated code then uses the C++ method calling syntax: obj->method(arg). In combination with the header and emit pragmas this allows sloppy interfacing with libraries written in C++:
# Horrible example of how to interface with a C++ engine ... ;-)
{.link: "/usr/lib/libIrrlicht.so".}
{.emit: """
using namespace irr;
using namespace core;
using namespace scene;
using namespace video;
using namespace io;
using namespace gui;
""".}
const
  irr = "<irrlicht/irrlicht.h>"
type
  IrrlichtDeviceObj {.header: irr,
                      importcpp: "IrrlichtDevice".} = object
  IrrlichtDevice = ptr IrrlichtDeviceObj
proc createDevice(): IrrlichtDevice {.
  header: irr, importcpp: "createDevice(@)".}
proc run(device: IrrlichtDevice): bool {.
  header: irr, importcpp: "#.run(@)".}
The compiler needs to be told to generate C++ (command cpp) for this to work. The conditional symbol cpp is defined when the compiler emits C++ code.
The sloppy interfacing example uses .emit to produce using namespace declarations. It is usually much better to instead refer to the imported name via the namespace::identifier notation:
type
  IrrlichtDeviceObj {.header: irr,
                      importcpp: "irr::IrrlichtDevice".} = object
When importcpp is applied to an enum type the numerical enum values are annotated with the C++ enum type, like in this example: ((TheCppEnum)(3)). (This turned out to be the simplest way to implement it.)
Note that the importcpp variant for procs uses a somewhat cryptic pattern language for maximum flexibility:
For example:
proc cppMethod(this: CppObj, a, b, c: cint) {.importcpp: "#.CppMethod(@)".}
var x: ptr CppObj
cppMethod(x[], 1, 2, 3)
Produces:
x->CppMethod(1, 2, 3)
As a special rule to keep backward compatibility with older versions of the importcpp pragma, if there is no special pattern character (any of # ' @) at all, C++'s dot or arrow notation is assumed, so the above example can also be written as:
proc cppMethod(this: CppObj, a, b, c: cint) {.importcpp: "CppMethod".}
Note that the pattern language naturally also covers C++'s operator overloading capabilities:
proc vectorAddition(a, b: Vec3): Vec3 {.importcpp: "# + #".}
proc dictLookup(a: Dict, k: Key): Value {.importcpp: "#[#]".}
For example:
type Input {.importcpp: "System::Input".} = object
proc getSubsystem*[T](): ptr T {.importcpp: "SystemManager::getSubsystem<'*0>()", nodecl.}
let x: ptr Input = getSubsystem[Input]()
Produces:
x = SystemManager::getSubsystem<System::Input>()
For example, C++'s new operator can be "imported" like this:
proc cnew*[T](x: T): ptr T {.importcpp: "(new '*0#@)", nodecl.}
# constructor of 'Foo':
proc constructFoo(a, b: cint): Foo {.importcpp: "Foo(@)".}
let x = cnew constructFoo(3, 4)
Produces:
x = new Foo(3, 4)
However, depending on the use case new Foo can also be wrapped like this instead:
proc newFoo(a, b: cint): ptr Foo {.importcpp: "new Foo(@)".}
let x = newFoo(3, 4)
Sometimes a C++ class has a private copy constructor and so code like Class c = Class(1,2); must not be generated but instead Class c(1,2);. For this purpose the Nim proc that wraps a C++ constructor needs to be annotated with the constructor pragma. This pragma also helps to generate faster C++ code since construction then doesn't invoke the copy constructor:
# a better constructor of 'Foo':
proc constructFoo(a, b: cint): Foo {.importcpp: "Foo(@)", constructor.}
Since Nim generates C++ directly, any destructor is called implicitly by the C++ compiler at the scope exits. This means that often one can get away with not wrapping the destructor at all! However, when it needs to be invoked explicitly, it needs to be wrapped. The pattern language provides everything that is required:
proc destroyFoo(this: var Foo) {.importcpp: "#.~Foo()".}
Generic importcpp'ed objects are mapped to C++ templates. This means that one can import C++'s templates rather easily without the need for a pattern language for object types:
type
  StdMap[K, V] {.importcpp: "std::map", header: "<map>".} = object
proc `[]=`[K, V](this: var StdMap[K, V]; key: K; val: V) {.
  importcpp: "#[#] = #", header: "<map>".}
var x: StdMap[cint, cdouble]
x[6] = 91.4
Produces:
std::map<int, double> x;
x[6] = 91.4;
If more precise control is needed, the apostrophe ' can be used in the supplied pattern to denote the concrete type parameters of the generic type. See the usage of the apostrophe operator in proc patterns for more details.
type
  VectorIterator[T] {.importcpp: "std::vector<'0>::iterator".} = object
var x: VectorIterator[cint]
Produces:
std::vector<int>::iterator x;
Similar to the importcpp pragma for C++, the importjs pragma can be used to import JavaScript methods or symbols in general. The generated code then uses the JavaScript method calling syntax: obj.method(arg).
Similar to the importc pragma for C, the importobjc pragma can be used to import Objective C methods. The generated code then uses the Objective C method calling syntax: [obj method param1: arg]. In combination with the header and emit pragmas this allows sloppy interfacing with libraries written in Objective C:
# horrible example of how to interface with GNUStep ...
{.passl: "-lobjc".}
{.emit: """
#include <objc/Object.h>
@interface Greeter:Object
{
}
- (void)greet:(long)x y:(long)dummy;
@end
#include <stdio.h>
@implementation Greeter
- (void)greet:(long)x y:(long)dummy
{
  printf("Hello, World!\n");
}
@end
#include <stdlib.h>
""".}
type
  Id {.importc: "id", header: "<objc/Object.h>", final.} = distinct int
proc newGreeter: Id {.importobjc: "Greeter new", nodecl.}
proc greet(self: Id, x, y: int) {.importobjc: "greet", nodecl.}
proc free(self: Id) {.importobjc: "free", nodecl.}
var g = newGreeter()
g.greet(12, 34)
g.free()
The compiler needs to be told to generate Objective C (command objc) for this to work. The conditional symbol objc is defined when the compiler emits Objective C code.
The codegenDecl pragma can be used to directly influence Nim's code generator. It receives a format string that determines how the variable, proc or object type is declared in the generated code.
For variables, $1 in the format string represents the type of the variable, $2 is the name of the variable, and each appearance of $# represents $1/$2 respectively according to its position.
The following Nim code:
var
  a {.codegenDecl: "$# progmem $#".}: int
will generate this C code:
int progmem a
For procedures, $1 is the return type of the procedure, $2 is the name of the procedure, $3 is the parameter list, and each appearance of $# represents $1/$2/$3 respectively according to its position.
The following Nim code:
proc myinterrupt() {.codegenDecl: "__interrupt $# $#$#".} =
  echo "realistic interrupt handler"
will generate this code:
__interrupt void myinterrupt()
For object types, $1 represents the name of the object type, $2 is the list of fields and $3 is the base type.
const strTemplate = """
  struct $1 {
    $2
  };
"""
type Foo {.codegenDecl: strTemplate.} = object
  a, b: int
will generate this code:
struct Foo {
  NI a;
  NI b;
};
The cppNonPod pragma should be used for non-POD importcpp types so that they work properly (in particular regarding constructor and destructor) for threadvar variables. This requires --tlsEmulation:off.
type Foo {.cppNonPod, importcpp, header: "funs.h".} = object
  x: cint
proc main() =
  var a {.threadvar.}: Foo
The pragmas listed here can be used to optionally accept values from the -d/--define option at compile time.
The implementation currently provides the following possible options (various others may be added later).
pragma | description |
---|---|
intdefine | Reads in a build-time define as an integer |
strdefine | Reads in a build-time define as a string |
booldefine | Reads in a build-time define as a bool |
const FooBar {.intdefine.}: int = 5
echo FooBar
nim c -d:FooBar=42 foobar.nim
In the above example, providing the -d flag causes the symbol FooBar to be overwritten at compile-time, printing out 42. If the -d:FooBar=42 were to be omitted, the default value of 5 would be used. To see if a value was provided, defined(FooBar) can be used.
The syntax -d:flag is actually just a shortcut for -d:flag=true.
These pragmas also accept an optional string argument for qualified define names.
const FooBar {.intdefine: "package.FooBar".}: int = 5
echo FooBar
nim c -d:package.FooBar=42 foobar.nim
This helps disambiguate define names in different packages.
See also the generic `define` pragma for a version of these pragmas that detects the type of the define based on the constant value.
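The three define pragmas can be combined in one const section. The sketch below uses hypothetical option names (logLevel, maxRetries, enableTracing) that are not defined by Nim itself; compiled without any -d flags it falls back to the defaults:

```nim
# Hypothetical option names. They could be overridden with, for example:
#   nim c -d:logLevel=debug -d:maxRetries=5 -d:enableTracing app.nim
const
  logLevel {.strdefine.}: string = "info"
  maxRetries {.intdefine.}: int = 3
  enableTracing {.booldefine.}: bool = false

# defined() reports whether a value was supplied on the command line.
when defined(logLevel):
  echo "logLevel was overridden at build time"

echo logLevel, " ", maxRetries, " ", enableTracing
```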
The pragma pragma can be used to declare user-defined pragmas. This is useful because Nim's templates and macros do not affect pragmas. User-defined pragmas are in a different module-wide scope than all other symbols. They cannot be imported from a module.
Example:
when appType == "lib":
  {.pragma: rtl, exportc, dynlib, cdecl.}
else:
  {.pragma: rtl, importc, dynlib: "client.dll", cdecl.}
proc p*(a, b: int): int {.rtl.} =
  result = a + b
In the example, a new pragma named rtl is introduced that either imports a symbol from a dynamic library or exports the symbol for dynamic library generation.
It is possible to define custom typed pragmas. Custom pragmas do not affect code generation directly, but their presence can be detected by macros. Custom pragmas are defined using templates annotated with the pragma pragma:
template dbTable(name: string, table_space: string = "") {.pragma.}
template dbKey(name: string = "", primary_key: bool = false) {.pragma.}
template dbForeignKey(t: typedesc) {.pragma.}
template dbIgnore {.pragma.}
Consider this stylized example of a possible Object Relation Mapping (ORM) implementation:
const tblspace {.strdefine.} = "dev" # switch for dev, test and prod environments
type
  User {.dbTable("users", tblspace).} = object
    id {.dbKey(primary_key = true).}: int
    name {.dbKey"full_name".}: string
    is_cached {.dbIgnore.}: bool
    age: int
  UserProfile {.dbTable("profiles", tblspace).} = object
    id {.dbKey(primary_key = true).}: int
    user_id {.dbForeignKey: User.}: int
    read_access: bool
    write_access: bool
    admin_access: bool
In this example, custom pragmas are used to describe how Nim objects are mapped to the schema of the relational database. Custom pragmas can have zero or more arguments. In order to pass multiple arguments, use one of the template call syntaxes. All arguments are typed and follow standard overload resolution rules for templates. Therefore, it is possible to have default values for arguments, pass by name, varargs, etc.
Custom pragmas can be used in all locations where ordinary pragmas can be specified. It is possible to annotate procs, templates, type and variable definitions, statements, etc.
The macros module includes helpers which can be used to simplify custom pragma access: hasCustomPragma and getCustomPragmaVal. Please consult the macros module documentation for details. These macros are not magic; everything they do can also be achieved by walking the AST of the object representation.
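A minimal sketch of the two helpers in use. The pragmas dbColumn and dbIgnore and the User type are hypothetical, simplified variants of the ORM example above, chosen so that each pragma takes at most one argument:

```nim
import std/macros

# Hypothetical custom pragmas for illustration.
template dbColumn(name: string) {.pragma.}
template dbIgnore {.pragma.}

type
  User = object
    id {.dbColumn: "user_id".}: int
    cached {.dbIgnore.}: bool

var u: User

# hasCustomPragma reports whether a field carries the given pragma.
let idIsMapped = u.id.hasCustomPragma(dbColumn)
let cachedIsIgnored = u.cached.hasCustomPragma(dbIgnore)
let cachedIsMapped = u.cached.hasCustomPragma(dbColumn)

# getCustomPragmaVal yields the argument the pragma was given.
let columnName = u.id.getCustomPragmaVal(dbColumn)

echo idIsMapped, " ", cachedIsIgnored, " ", cachedIsMapped, " ", columnName
```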
More examples with custom pragmas:
Better serialization/deserialization control:
type MyObj = object
  a {.dontSerialize.}: int
  b {.defaultDeserialize: 5.}: int
  c {.serializationKey: "_c".}: string
Adapting a type for a GUI inspector in a game engine:
type MyComponent = object
  position {.editable, animatable.}: Vector3
  alpha {.editRange: [0.0..1.0], animatable.}: float32
Macros and templates can sometimes be called with the pragma syntax. Cases where this is possible include when attached to routine (procs, iterators, etc.) declarations or routine type expressions. The compiler will perform the following simple syntactic transformations:
template command(name: string, def: untyped) = discard
proc p() {.command("print").} = discard
This is translated to:
command("print"):
  proc p() = discard
type
  AsyncEventHandler = proc (x: Event) {.async.}
This is translated to:
type
  AsyncEventHandler = async(proc (x: Event))
When multiple macro pragmas are applied to the same definition, the first one from left to right will be evaluated. This macro can then choose to keep the remaining macro pragmas in its output, and those will be evaluated in the same way.
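The transformation on routine declarations can be tried with a small macro. The sketch below is an assumption-laden illustration: the macro traced, the global calls and the proc twice are hypothetical names, and the macro simply prepends a statement that records the proc's name:

```nim
import std/macros

var calls: seq[string]  # hypothetical log of the traced procs that ran

macro traced(def: untyped): untyped =
  # `def` is the complete proc definition the pragma was attached to.
  # Build the statement `calls.add("<proc name>")`.
  let record = newCall(newDotExpr(ident"calls", ident"add"), newLit($def.name))
  result = def
  # Run the recording statement first, then the original body.
  result.body = newStmtList(record, def.body)

# Equivalent to writing `traced:` followed by the proc definition.
proc twice(x: int): int {.traced.} =
  x * 2

let answer = twice(21)
echo answer, " ", calls
```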
There are a few more applications of macro pragmas, such as in type, variable and constant declarations, but this behavior is considered to be experimental and is documented in the experimental manual instead.
Nim's FFI (foreign function interface) is extensive and only the parts that scale to other future backends (like the LLVM/JavaScript backends) are documented here.
The importc pragma provides a means to import a proc or a variable from C. The optional argument is a string containing the C identifier. If the argument is missing, the C name is the Nim identifier exactly as spelled:
proc printf(formatstr: cstring) {.header: "<stdio.h>", importc: "printf", varargs.}
When importc is applied to a let statement it can omit its value which will then be expected to come from C. This can be used to import a C const:
{.emit: "const int cconst = 42;".}
let cconst {.importc, nodecl.}: cint
assert cconst == 42
Note that this pragma has been abused in the past to also work in the JS backend for JS objects and functions. Other backends do provide the same feature under the same name. Also, when the target language is not set to C, other pragmas are available: importcpp, importobjc and importjs.
The string literal passed to importc can be a format string:
proc p(s: cstring) {.importc: "prefix$1".}
In the example, the external name of p is set to prefixp. Only $1 is available and a literal dollar sign must be written as $$.
The exportc pragma provides a means to export a type, a variable, or a procedure to C. Enums and constants can't be exported. The optional argument is a string containing the C identifier. If the argument is missing, the C name is the Nim identifier exactly as spelled:
proc callme(formatstr: cstring) {.exportc: "callMe", varargs.}
Note that this pragma is somewhat of a misnomer: Other backends do provide the same feature under the same name.
The string literal passed to exportc can be a format string:
proc p(s: string) {.exportc: "prefix$1".} =
  echo s
In the example, the external name of p is set to prefixp. Only $1 is available and a literal dollar sign must be written as $$.
If the symbol should also be exported to a dynamic library, the dynlib pragma should be used in addition to the exportc pragma. See Dynlib pragma for export.
The exportcpp pragma works like the exportc pragma but it requires the cpp backend. When compiled with the cpp backend, the exportc pragma adds extern "C" to the declaration in the generated code so that it can be called from both C and C++ code. The exportcpp pragma doesn't add extern "C".
Like exportc or importc, the extern pragma affects name mangling. The string literal passed to extern can be a format string:
proc p(s: string) {.extern: "prefix$1".} =
  echo s
In the example, the external name of p is set to prefixp. Only $1 is available and a literal dollar sign must be written as $$.
The bycopy pragma can be applied to an object or tuple type or a proc param. It instructs the compiler to pass the type by value to procs:
type
  Vector {.bycopy.} = object
    x, y, z: float
The Nim compiler automatically determines whether a parameter is passed by value or by reference based on the parameter type's size. If a parameter must be passed by value or by reference (such as when interfacing with a C library), use the bycopy or byref pragmas. Note that params marked as byref take precedence over types marked as bycopy.
The byref pragma can be applied to an object or tuple type or a proc param. When applied to a type it instructs the compiler to pass the type by reference (hidden pointer) to procs. When applied to a param it will take precedence, even if the type was marked as bycopy. When an importc type has a byref pragma or parameters are marked as byref in an importc proc, these params translate to pointers. When an importcpp type has a byref pragma, these params translate to C++ references &.
{.emit: """/*TYPESECTION*/
typedef struct {
  int x;
} CStruct;
""".}
{.emit: """
#ifdef __cplusplus
extern "C"
#endif
int takesCStruct(CStruct* x) {
  return x->x;
}
""".}
type
  CStruct {.importc, byref.} = object
    x: cint
proc takesCStruct(x: CStruct): cint {.importc.}
or
type
  CStruct {.importc.} = object
    x: cint
proc takesCStruct(x {.byref.}: CStruct): cint {.importc.}
{.emit: """/*TYPESECTION*/
struct CppStruct {
  int x;
  int takesCppStruct(CppStruct& y) {
    return x + y.x;
  }
};
""".}
type
  CppStruct {.importcpp, byref.} = object
    x: cint
proc takesCppStruct(x, y: CppStruct): cint {.importcpp.}
The varargs pragma can be applied to procedures only (and procedure types). It tells Nim that the proc can take a variable number of parameters after the last specified parameter. Nim string values will be converted to C strings automatically:
proc printf(formatstr: cstring) {.header: "<stdio.h>", varargs.}
printf("hallo %s", "world") # "world" will be passed as C string
The union pragma can be applied to any object type. It means all of an object's fields are overlaid in memory. This produces a union instead of a struct in the generated C/C++ code. The object declaration then must not use inheritance or any GC'ed memory but this is currently not checked.
Future directions: GC'ed memory should be allowed in unions and the GC should scan unions conservatively.
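A minimal sketch of the overlay effect, assuming the C backend and IEEE 754 single-precision floats. The FloatBits type is a hypothetical name used only for this illustration:

```nim
# Hypothetical type: both fields occupy the same four bytes.
type
  FloatBits {.union.} = object
    f: float32
    bits: uint32

var v: FloatBits
v.f = 1.0'f32   # write through one field ...

echo sizeof(FloatBits)
echo v.bits     # ... and read the same bytes through the other
```

The size of the object is that of its largest field rather than the sum of all fields.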
The packed pragma can be applied to any object type. It ensures that the fields of an object are packed back-to-back in memory. It is useful to store packets or messages from/to network or hardware drivers, and for interoperability with C. Combining the packed pragma with inheritance is not defined, and it should not be used with GC'ed memory (ref's).
Future directions: Using GC'ed memory in packed pragma will result in a static error. Usage with inheritance should be defined and documented.
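The effect on layout can be observed with sizeof and offsetOf. The sketch below uses hypothetical type names and compares a packed object with an otherwise identical unpacked one. The exact size of the unpacked variant depends on the target ABI:

```nim
# Hypothetical types: same fields, with and without the packed pragma.
type
  PacketHeader {.packed.} = object
    kind: uint8
    length: uint32
  PaddedHeader = object
    kind: uint8
    length: uint32

# Packed: `length` follows `kind` directly, with no padding in between.
echo sizeof(PacketHeader), " ", offsetOf(PacketHeader, length)
# Unpacked: the compiler is free to insert padding before `length`.
echo sizeof(PaddedHeader), " ", offsetOf(PaddedHeader, length)
```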
With the dynlib pragma, a procedure or a variable can be imported from a dynamic library (.dll files for Windows, lib*.so files for UNIX). The non-optional argument has to be the name of the dynamic library:
proc gtk_image_new(): PGtkWidget
  {.cdecl, dynlib: "libgtk-x11-2.0.so", importc.}
In general, importing a dynamic library does not require any special linker options or linking with import libraries. This also implies that no devel packages need to be installed.
The dynlib import mechanism supports a versioning scheme:
proc Tcl_Eval(interp: pTcl_Interp, script: cstring): int {.cdecl,
  importc, dynlib: "libtcl(|8.5|8.4|8.3).so.(1|0)".}
At runtime, the dynamic library is searched for (in this order):
libtcl.so.1
libtcl.so.0
libtcl8.5.so.1
libtcl8.5.so.0
libtcl8.4.so.1
libtcl8.4.so.0
libtcl8.3.so.1
libtcl8.3.so.0
The dynlib pragma supports not only constant strings as an argument but also string expressions in general:
import std/os
proc getDllName: string =
  result = "mylib.dll"
  if fileExists(result): return
  result = "mylib2.dll"
  if fileExists(result): return
  quit("could not load dynamic library")
proc myImport(s: cstring) {.cdecl, importc, dynlib: getDllName().}
Note: Patterns like libtcl(|8.5|8.4).so are only supported in constant strings, because they are precompiled.
Note: Passing variables to the dynlib pragma will fail at runtime because of order of initialization problems.
Note: A dynlib import can be overridden with the --dynlibOverride:name command-line option. The Compiler User Guide contains further information.
With the dynlib pragma, a procedure can also be exported to a dynamic library. The pragma then has no argument and has to be used in conjunction with the exportc pragma:
proc exportme(): int {.cdecl, exportc, dynlib.}
This is only useful if the program is compiled as a dynamic library via the --app:lib command-line option.
The --threads:on command-line switch is enabled by default. The typedthreads module then contains several threading primitives. See spawn for further details.
The only ways to create a thread are via spawn or createThread.
A proc that is executed as a new thread of execution should be marked by the thread pragma for reasons of readability. The compiler checks for violations of the no heap sharing restriction: This restriction implies that it is invalid to construct a data structure that consists of memory allocated from different (thread-local) heaps.
A thread proc can be passed to createThread or spawn.
A variable can be marked with the threadvar pragma, which makes it a thread-local variable; additionally, this implies all the effects of the global pragma.
var checkpoints* {.threadvar.}: seq[string]
Due to implementation restrictions, thread-local variables cannot be initialized within the var section. (Every thread-local variable needs to be replicated at thread creation.)
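The following sketch shows a thread-local variable being set from inside a thread proc rather than in its var section. It assumes Nim 2.x with threads enabled. The names counter, seenByWorker and worker are hypothetical:

```nim
import std/typedthreads

var counter {.threadvar.}: int  # one zero-initialized copy per thread
var seenByWorker: int           # ordinary global, used to report back

proc worker(start: int) {.thread.} =
  # This assigns the worker thread's own copy of `counter`.
  counter = start
  inc counter
  seenByWorker = counter

var t: Thread[int]
createThread(t, worker, 41)
joinThread(t)  # wait for the worker before reading `seenByWorker`

# The main thread's copy of `counter` was never touched.
echo counter, " ", seenByWorker
```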
The interaction between threads and exceptions is simple: A handled exception in one thread cannot affect any other thread. However, an unhandled exception in one thread terminates the whole process.
Nim provides common low level concurrency mechanisms like locks, atomic intrinsics or condition variables.
Nim significantly improves on the safety of these features via additional pragmas:
Object fields and global variables can be annotated via a guard pragma:
import std/locks
var glock: Lock
var gdata {.guard: glock.}: int
The compiler then ensures that every access of gdata is within a locks section:
proc invalid =
  # invalid: unguarded access:
  echo gdata
proc valid =
  # valid access:
  {.locks: [glock].}:
    echo gdata
Top level accesses to gdata are always allowed so that it can be initialized conveniently. It is assumed (but not enforced) that every top level statement is executed before any concurrent action happens.
The locks section deliberately looks ugly because it has no runtime semantics and should not be used directly! It should only be used in templates that also implement some form of locking at runtime:
template lock(a: Lock; body: untyped) =
  pthread_mutex_lock(a)
  {.locks: [a].}:
    try:
      body
    finally:
      pthread_mutex_unlock(a)
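A runnable variant of the same idea can be sketched with the withLock template from std/locks, which acquires the lock, opens a locks section and releases the lock again. The procs increment and current are hypothetical names for this illustration:

```nim
import std/locks

var glock: Lock
var gdata {.guard: glock.}: int

proc increment() =
  # withLock acquires `glock`, so the guarded access is accepted.
  withLock glock:
    inc gdata

proc current(): int =
  withLock glock:
    result = gdata

initLock(glock)
increment()
increment()
let seen = current()
deinitLock(glock)
echo seen
```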
The guard does not need to be of any particular type. It is flexible enough to model low level lockfree mechanisms:
var dummyLock {.compileTime.}: int
var atomicCounter {.guard: dummyLock.}: int
template atomicRead(x): untyped =
  {.locks: [dummyLock].}:
    memoryReadBarrier()
    x
echo atomicRead(atomicCounter)
The locks pragma takes a list of lock expressions locks: [a, b, ...] in order to support multi lock statements.
The guard annotation can also be used to protect fields within an object. The guard then needs to be another field within the same object or a global variable.
Since objects can reside on the heap or on the stack, this greatly enhances the expressiveness of the language:
import std/locks
type
  ProtectedCounter = object
    v {.guard: L.}: int
    L: Lock
proc incCounters(counters: var openArray[ProtectedCounter]) =
  for i in 0..counters.high:
    lock counters[i].L:
      inc counters[i].v
The access to field x.v is allowed since its guard x.L is active. After template expansion, this amounts to:
proc incCounters(counters: var openArray[ProtectedCounter]) =
  for i in 0..counters.high:
    pthread_mutex_lock(counters[i].L)
    {.locks: [counters[i].L].}:
      try:
        inc counters[i].v
      finally:
        pthread_mutex_unlock(counters[i].L)
There is an analysis that checks that counters[i].L is the lock that corresponds to the protected location counters[i].v. This analysis is called path analysis because it deals with paths to locations like obj.field[i].fieldB[j].
The path analysis is currently unsound, but that doesn't make it useless. Two paths are considered equivalent if they are syntactically the same.
This means the following compiles (for now) even though it really should not:
{.locks: [a[i].L].}:
  inc i
  access a[i].v