Standard ML

From Wikipedia, the free encyclopedia

General-purpose functional programming language

Standard ML
Paradigm	Multi-paradigm: functional, imperative, modular^[1]
Family	ML
First appeared	1983^[2]
Stable release	Standard ML '97^[2] / 1997
Typing discipline	Inferred, static, strong
Filename extensions	.sml
Website	smlfamily.github.io
Major implementations	SML/NJ, MLton, Poly/ML
Dialects	Alice, Concurrent ML, Dependent ML
Influenced by	ML, Hope, Pascal
Influenced	Elm, F#, F*, Haskell, OCaml, Python,^[3] Rust,^[4] Scala

Standard ML (SML) is a general-purpose, high-level, modular, functional programming language with compile-time type checking and type inference. It is popular for writing compilers, for programming language research, and for developing theorem provers.

Standard ML is a modern dialect of ML, the language used in the Logic for Computable Functions (LCF) theorem-proving project. It is distinctive among widely used languages in that it has a formal specification, given as typing rules and operational semantics in The Definition of Standard ML.^[5]

Language

Standard ML is a functional programming language with some impure features. Programs written in Standard ML consist of expressions, in contrast to statements or commands, although some expressions of type unit are evaluated only for their side effects.

Functions

As in all functional languages, a key feature of Standard ML is the function, which is used for abstraction. The factorial function can be expressed as follows:

    fun factorial n = if n = 0 then 1 else n * factorial (n - 1)

Type inference

An SML compiler must infer the static type val factorial : int -> int without user-supplied type annotations. It has to deduce that n is only used with integer expressions, and must therefore itself be an integer, and that all terminal expressions are integer expressions.
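
As a minimal sketch of the inference described above (the names factorial' and rfactorial are illustrative and not part of the original text), the first definition carries no annotations, the second spells out the type the compiler would infer anyway, and the third uses an annotation to select the real versions of the overloaded arithmetic operators:

```sml
(* No annotations: the literals 0 and 1 are ints, so n : int is inferred. *)
fun factorial n = if n = 0 then 1 else n * factorial (n - 1)

(* Explicit annotations are permitted; here they only restate the inferred type. *)
fun factorial' (n : int) : int =
  if n = 0 then 1 else n * factorial' (n - 1)

(* Annotating with real picks the real versions of <=, * and -. *)
fun rfactorial (x : real) : real =
  if x <= 0.0 then 1.0 else x * rfactorial (x - 1.0)
```

Without the annotation and the real literals, the overloaded operators would default to int, which is why the first definition is typed int -> int.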

Declarative definitions

The same function can be expressed with clausal function definitions, where the if-then-else conditional is replaced with templates of the factorial function evaluated for specific values:

    fun factorial 0 = 1
      | factorial n = n * factorial (n - 1)

Imperative definitions

The function can also be written iteratively, using mutable references and a while loop:

    fun factorial n =
      let
        val i = ref n and acc = ref 1
      in
        while !i > 0 do (acc := !acc * !i; i := !i - 1);
        !acc
      end

Lambda functions

The function can also be written as a lambda function:

    val rec factorial = fn 0 => 1 | n => n * factorial (n - 1)

Here, the keyword val introduces a binding of an identifier to a value, fn introduces an anonymous function, and rec allows the definition to be self-referential.

Local definitions

Using a local function, factorial can be rewritten in a more efficient tail-recursive style:

    local
      fun loop (0, acc) = acc
        | loop (m, acc) = loop (m - 1, m * acc)
    in
      fun factorial n = loop (n, 1)
    end

The encapsulation of an invariant-preserving tail-recursive tight loop with one or more accumulator parameters within an invariant-free outer function, as seen here, is a common idiom in Standard ML.

Type synonyms

A type synonym is defined with the keyword type. Here is a type synonym for points on a plane, and functions computing the distance between two points and the area of a triangle with the given corners, as per Heron's formula. (These definitions will be used in subsequent examples.)

    type loc = real * real

    fun square (x : real) = x * x

    fun dist (x, y) (x', y') =
      Math.sqrt (square (x' - x) + square (y' - y))

    fun heron (a, b, c) =
      let
        val x = dist a b
        val y = dist b c
        val z = dist a c
        val s = (x + y + z) / 2.0
      in
        Math.sqrt (s * (s - x) * (s - y) * (s - z))
      end

Algebraic datatypes

Standard ML provides strong support for algebraic datatypes (ADT). A datatype can be thought of as a disjoint union of tuples (or a "sum of products"). Datatypes are easy to define and easy to use, largely because of pattern matching and the pattern-exhaustiveness and pattern-redundancy checking performed by most Standard ML implementations.

In object-oriented programming languages, a disjoint union can be expressed as a class hierarchy. However, in contrast to class hierarchies, ADTs are closed. Thus, the extensibility of ADTs is orthogonal to the extensibility of class hierarchies. Class hierarchies can be extended with new subclasses which implement the same interface, while the functions of ADTs can be extended for the fixed set of constructors. See expression problem.

A datatype is defined with the keyword datatype, as in:

    datatype shape
      = Circle   of loc * real      (* center and radius *)
      | Square   of loc * real      (* upper-left corner and side length; axis-aligned *)
      | Triangle of loc * loc * loc (* corners *)

Note that a type synonym cannot be recursive; datatypes are necessary to define recursive constructors. (This is not at issue in this example.)
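
To illustrate the remark about recursion, here is a small sketch of a recursive datatype. The names tree, E, T and size are chosen for illustration; the constructor T mentions the type being defined, which a plain type synonym could not do.

```sml
(* A binary tree: either empty (E) or a node holding a value and two subtrees. *)
datatype 'a tree = E | T of 'a * 'a tree * 'a tree

(* Count the elements by structural recursion over the constructors. *)
fun size E = 0
  | size (T (_, l, r)) = 1 + size l + size r

val t = T (2, T (1, E, E), T (3, E, E))
val n = size t
```

Here n evaluates to 3, one for each T node in t.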

Pattern matching

Patterns are matched in the order in which they are defined. C programmers can use tagged unions, dispatching on tag values, to do what ML does with datatypes and pattern matching. Nevertheless, while a C program decorated with appropriate checks will, in a sense, be as robust as the corresponding ML program, those checks will of necessity be dynamic; ML's static checks provide strong guarantees about the correctness of the program at compile time.

Function arguments can be defined as patterns as follows:

    fun area (Circle (_, r)) = Math.pi * square r
      | area (Square (_, s)) = square s
      | area (Triangle p) = heron p (* see above *)

The so-called "clausal form" of function definition, where arguments are defined as patterns, is merely syntactic sugar for a case expression:

    fun area shape =
      case shape of
        Circle (_, r) => Math.pi * square r
      | Square (_, s) => square s
      | Triangle p => heron p

Exhaustiveness checking

Pattern-exhaustiveness checking will make sure that each constructor of the datatype is matched by at least one pattern.

The following pattern is not exhaustive:

    fun center (Circle (c, _)) = c
      | center (Square ((x, y), s)) = (x + s / 2.0, y + s / 2.0)

There is no pattern for the Triangle case in the center function. The compiler will issue a warning that the case expression is not exhaustive, and if a Triangle is passed to this function at runtime, exception Match will be raised.
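
One way to make center exhaustive is to add a clause for Triangle. The sketch below repeats the type definitions so that it stands alone, and it assumes that the "center" of a triangle means its centroid; that choice is an assumption made here, not something stated in the text.

```sml
type loc = real * real

datatype shape
  = Circle   of loc * real
  | Square   of loc * real
  | Triangle of loc * loc * loc

(* Every constructor of shape now has a matching clause,
   so no non-exhaustiveness warning is expected. *)
fun center (Circle (c, _)) = c
  | center (Square ((x, y), s)) = (x + s / 2.0, y + s / 2.0)
  | center (Triangle ((x1, y1), (x2, y2), (x3, y3))) =
      (* assumption: use the centroid, the mean of the three corners *)
      ((x1 + x2 + x3) / 3.0, (y1 + y2 + y3) / 3.0)

val (cx, cy) = center (Triangle ((0.0, 0.0), (3.0, 0.0), (0.0, 3.0)))
```

With these corners, cx and cy are both 1.0.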

Redundancy checking

The pattern in the second clause of the following (meaningless) function is redundant:

    fun f (Circle ((x, y), r)) = x + y
      | f (Circle _) = 1.0
      | f _ = 0.0

Any value that would match the pattern in the second clause would also match the pattern in the first clause, so the second clause is unreachable. Therefore, this definition as a whole exhibits redundancy, and causes a compile-time warning.

The following function definition is exhaustive and not redundant:

    val hasCorners = fn (Circle _) => false | _ => true

If control gets past the first pattern (Circle), we know the shape must be either a Square or a Triangle. In either of those cases, we know the shape has corners, so we can return true without discerning the actual shape.

Higher-order functions

Functions can consume functions as arguments:

    fun map f (x, y) = (f x, f y)

Functions can produce functions as return values:

    fun constant k = (fn _ => k)

Functions can also both consume and produce functions:

    fun compose (f, g) = (fn x => f (g x))

The function List.map from the basis library is one of the most commonly used higher-order functions in Standard ML:

    fun map _ [] = []
      | map f (x :: xs) = f x :: map f xs

A more efficient implementation with tail-recursive List.foldl:

    fun map f = List.rev o List.foldl (fn (x, acc) => f x :: acc) []
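
A short usage sketch for the functions above. The definitions of constant and compose are repeated so that the block stands alone; inc, double and the result names are introduced here purely for illustration.

```sml
fun constant k = (fn _ => k)
fun compose (f, g) = (fn x => f (g x))

val inc = fn n => n + 1
val double = fn n => n * 2

(* compose (f, g) applies g first and then f. *)
val incThenDouble = compose (double, inc)
val eight = incThenDouble 3

(* The built-in infix operator o performs the same composition. *)
val alsoEight = (double o inc) 3

(* constant 7 ignores its argument, so every element maps to 7. *)
val sevens = List.map (constant 7) [1, 2, 3]
```

Both eight and alsoEight evaluate to (3 + 1) * 2 = 8, and sevens is [7, 7, 7].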

Exceptions

Exceptions are raised with the keyword raise and handled with the pattern matching handle construct. The exception system can implement non-local exit; this optimization technique is suitable for functions like the following.

    local
      exception Zero;
      val p = fn (0, _) => raise Zero | (a, b) => a * b
    in
      fun prod xs = List.foldl p 1 xs handle Zero => 0
    end

When exception Zero is raised, control leaves the function List.foldl altogether. Consider the alternative: the value 0 would be returned, it would be multiplied by the next integer in the list, the resulting value (inevitably 0) would be returned, and so on. The raising of the exception allows control to skip over the entire chain of frames and avoid the associated computation. Note the use of the underscore (_) as a wildcard pattern.

The same optimization can be obtained with a tail call.

    local
      fun p a (0 :: _) = 0
        | p a (x :: xs) = p (a * x) xs
        | p a [] = a
    in
      val prod = p 1
    end
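
Exceptions can also carry values, which the handle construct can bind by pattern matching. The following is a small sketch; the names Negative, checkedSqrt and describe are hypothetical and chosen only for illustration.

```sml
(* An exception constructor that carries the offending integer. *)
exception Negative of int

fun checkedSqrt n =
  if n < 0 then raise Negative n
  else Math.sqrt (Real.fromInt n)

(* The handler pattern binds k to the value carried by the exception. *)
fun describe n =
  Real.toString (checkedSqrt n)
  handle Negative k => "negative input: " ^ Int.toString k

val message = describe ~4
```

Since Int.toString renders negative numbers with a leading tilde, message is the string "negative input: ~4".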

Module system

Standard ML's advanced module system allows programs to be decomposed into hierarchically organized structures of logically related type and value definitions. Modules provide not only namespace control but also abstraction, in the sense that they allow the definition of abstract data types. Three main syntactic constructs comprise the module system: signatures, structures and functors.

Signatures

A signature is an interface, usually thought of as a type for a structure; it specifies the names of all entities provided by the structure, the arity of each type component, the type of each value component, and the signature of each substructure. The definitions of type components are optional; type components whose definitions are hidden are abstract types.

For example, the signature for a queue may be:

    signature QUEUE =
    sig
      type 'a queue
      exception QueueError;
      val empty     : 'a queue
      val isEmpty   : 'a queue -> bool
      val singleton : 'a -> 'a queue
      val fromList  : 'a list -> 'a queue
      val insert    : 'a * 'a queue -> 'a queue
      val peek      : 'a queue -> 'a
      val remove    : 'a queue -> 'a * 'a queue
    end

This signature describes a module that provides a polymorphic type 'a queue, exception QueueError, and values that define basic operations on queues.

Structures

A structure is a module; it consists of a collection of types, exceptions, values and structures (called substructures) packaged together into a logical unit.

A queue structure can be implemented as follows:

    structure TwoListQueue :> QUEUE =
    struct
      type 'a queue = 'a list * 'a list

      exception QueueError;

      val empty = ([], [])

      fun isEmpty ([], []) = true
        | isEmpty _ = false

      fun singleton a = ([], [a])

      fun fromList a = ([], a)

      fun insert (a, ([], [])) = singleton a
        | insert (a, (ins, outs)) = (a :: ins, outs)

      fun peek (_, []) = raise QueueError
        | peek (ins, outs) = List.hd outs

      fun remove (_, []) = raise QueueError
        | remove (ins, [a]) = (a, ([], List.rev ins))
        | remove (ins, a :: outs) = (a, (ins, outs))
    end

This definition declares that structure TwoListQueue implements signature QUEUE. Furthermore, the opaque ascription denoted by :> states that any types which are not defined in the signature (i.e. type 'a queue) should be abstract, meaning that the definition of a queue as a pair of lists is not visible outside the module. The structure implements all of the definitions in the signature.

The types and values in a structure can be accessed with "dot notation":

    val q : string TwoListQueue.queue = TwoListQueue.empty
    val q' = TwoListQueue.insert (Real.toString Math.pi, q)
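
The effect of opaque ascription (:>) compared with transparent ascription (:) can be sketched with a much smaller signature. COUNTER, Opaque and Transparent are hypothetical names introduced only for this illustration.

```sml
signature COUNTER =
sig
  type t                 (* the definition of t is left unspecified *)
  val zero  : t
  val next  : t -> t
  val toInt : t -> int
end

(* Opaque ascription: outside the structure, t is abstract. *)
structure Opaque :> COUNTER =
struct
  type t = int
  val zero = 0
  fun next n = n + 1
  fun toInt n = n
end

(* Transparent ascription: the identity t = int remains visible. *)
structure Transparent : COUNTER =
struct
  type t = int
  val zero = 0
  fun next n = n + 1
  fun toInt n = n
end

val a = Opaque.toInt (Opaque.next Opaque.zero)
val b = Transparent.next 41     (* accepted, because Transparent.t = int *)
(* val c = Opaque.next 41          rejected: Opaque.t is not known to be int *)
```

Here a is 1 and b is 42. The commented-out binding is expected to fail type checking, which is the abstraction that opaque ascription provides.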

Functors

A functor is a function from structures to structures; that is, a functor accepts one or more arguments, which are usually structures of a given signature, and produces a structure as its result. Functors are used to implement generic data structures and algorithms.

One popular algorithm^[6] for breadth-first search of trees makes use of queues. Here is a version of that algorithm parameterized over an abstract queue structure:

    (* after Okasaki, ICFP, 2000 *)
    functor BFS (Q: QUEUE) =
    struct
      datatype 'a tree = E | T of 'a * 'a tree * 'a tree

      local
        fun bfsQ q =
          if Q.isEmpty q then [] else search (Q.remove q)
        and search (E, q) = bfsQ q
          | search (T (x, l, r), q) = x :: bfsQ (insert (insert q l) r)
        and insert q a = Q.insert (a, q)
      in
        fun bfs t = bfsQ (Q.singleton t)
      end
    end

    structure QueueBFS = BFS (TwoListQueue)

Within functor BFS, the representation of the queue is not visible. More concretely, there is no way to select the first list in the two-list queue, if that is indeed the representation being used. This data abstraction mechanism makes the breadth-first search truly agnostic to the queue's implementation. This is in general desirable; in this case, the queue structure can safely maintain any logical invariants on which its correctness depends behind the bulletproof wall of abstraction.
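
To make the point about implementation independence concrete, the sketch below applies the same functor to a different queue. It is self-contained, so it repeats a reduced QUEUE signature (only the operations BFS needs) and the functor itself. ListQueue is a hypothetical, deliberately naive single-list implementation and is not part of the original text.

```sml
signature QUEUE =
sig
  type 'a queue
  exception QueueError
  val isEmpty   : 'a queue -> bool
  val singleton : 'a -> 'a queue
  val insert    : 'a * 'a queue -> 'a queue
  val remove    : 'a queue -> 'a * 'a queue
end

(* Hypothetical alternative: a single list, with O(n) insertion at the back. *)
structure ListQueue :> QUEUE =
struct
  type 'a queue = 'a list
  exception QueueError
  fun isEmpty q = List.null q
  fun singleton a = [a]
  fun insert (a, q) = q @ [a]
  fun remove [] = raise QueueError
    | remove (a :: q) = (a, q)
end

functor BFS (Q : QUEUE) =
struct
  datatype 'a tree = E | T of 'a * 'a tree * 'a tree

  local
    fun bfsQ q =
      if Q.isEmpty q then [] else search (Q.remove q)
    and search (E, q) = bfsQ q
      | search (T (x, l, r), q) = x :: bfsQ (insert (insert q l) r)
    and insert q a = Q.insert (a, q)
  in
    fun bfs t = bfsQ (Q.singleton t)
  end
end

structure B = BFS (ListQueue)

(*      1
       / \
      2   3
         /
        4      *)
val t = B.T (1, B.T (2, B.E, B.E), B.T (3, B.T (4, B.E, B.E), B.E))
val order = B.bfs t
```

The traversal visits the tree level by level, so order is [1, 2, 3, 4]. Because BFS only uses the operations named in the signature, the functor body is unchanged when a different queue is supplied.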

Code examples

Snippets of SML code are most easily studied by entering them into an interactive top-level.

Hello, world!

The following is a "Hello, World!" program:

hello.sml

    print "Hello, world!\n";

sh

    $ mlton hello.sml
    $ ./hello
    Hello, world!

Algorithms

Insertion sort

Insertion sort for int list (ascending) can be expressed concisely as follows:

    fun insert (x, []) = [x]
      | insert (x, h :: t) = sort x (h, t)
    and sort x (h, t) = if x < h then [x, h] @ t else h :: insert (x, t)

    val insertionsort = List.foldl insert []

Mergesort

Main article: Merge sort

Here, the classic mergesort algorithm is implemented in three functions: split, merge and mergesort. Also note the absence of types, with the exception of the syntax op:: and [] which signify lists. This code will sort lists of any type, so long as a consistent ordering function cmp is defined. Using Hindley–Milner type inference, the types of all variables can be inferred, even complicated types such as that of the function cmp.

Split

The function split is implemented with a stateful closure which alternates between true and false, ignoring the input:

    fun alternator {} =
      let val state = ref true
      in fn a => !state before state := not (!state) end

    (* Split a list into near-halves which will either be the same length,
     * or the first will have one more element than the other.
     * Runs in O(n) time, where n = |xs|.
     *)
    fun split xs = List.partition (alternator {}) xs

Merge

Merge uses a local function loop for efficiency. The inner loop is defined in terms of cases: when both lists are non-empty (x :: xs) and when one list is empty ([]).

This function merges two sorted lists into one sorted list. Note how the accumulator acc is built backwards, then reversed before being returned. This is a common technique, since 'a list is represented as a linked list; this technique requires more clock time, but the asymptotics are not worse.

    (* Merge two ordered lists using the order cmp.
     * Pre: each list must already be ordered per cmp.
     * Runs in O(n) time, where n = |xs| + |ys|.
     *)
    fun merge cmp (xs, []) = xs
      | merge cmp (xs, y :: ys) =
          let
            fun loop (a, acc) (xs, []) = List.revAppend (a :: acc, xs)
              | loop (a, acc) (xs, y :: ys) =
                  if cmp (a, y)
                  then loop (y, a :: acc) (ys, xs)
                  else loop (a, y :: acc) (xs, ys)
          in
            loop (y, []) (ys, xs)
          end

Mergesort

The main function:

    fun ap f (x, y) = (f x, f y)

    (* Sort a list according to the given ordering operation cmp.
     * Runs in O(n log n) time, where n = |xs|.
     *)
    fun mergesort cmp [] = []
      | mergesort cmp [x] = [x]
      | mergesort cmp xs = (merge cmp o ap (mergesort cmp) o split) xs

Quicksort

Main article: Quicksort

Quicksort can be expressed as follows. The function part is a closure that consumes an order operator op <<.

    infix <<

    fun quicksort (op <<) =
      let
        fun part p = List.partition (fn x => x << p)
        fun sort [] = []
          | sort (p :: xs) = join p (part p xs)
        and join p (l, r) = sort l @ p :: sort r
      in
        sort
      end
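
A brief usage sketch showing how different order operators can be passed to quicksort. The definition is repeated so that the block stands alone; the result names and the length-based ordering are illustrative assumptions.

```sml
infix <<

fun quicksort (op <<) =
  let
    fun part p = List.partition (fn x => x << p)
    fun sort [] = []
      | sort (p :: xs) = join p (part p xs)
    and join p (l, r) = sort l @ p :: sort r
  in
    sort
  end

(* The built-in comparison operators can be passed directly with op. *)
val ascending  = quicksort (op <) [3, 1, 2]
val descending = quicksort (op >) [3, 1, 2]

(* Any function of type 'a * 'a -> bool will do, e.g. ordering strings by length. *)
val byLength =
  quicksort (fn (a, b) => String.size a < String.size b) ["ccc", "a", "bb"]
```

These evaluate to [1, 2, 3], [3, 2, 1] and ["a", "bb", "ccc"] respectively.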

Expression interpreter

Note the relative ease with which a small expression language can be defined and processed:

    exception TyErr;

    datatype ty = IntTy | BoolTy

    fun unify (IntTy, IntTy) = IntTy
      | unify (BoolTy, BoolTy) = BoolTy
      | unify (_, _) = raise TyErr

    datatype exp
      = True
      | False
      | Int of int
      | Not of exp
      | Add of exp * exp
      | If  of exp * exp * exp

    fun infer True = BoolTy
      | infer False = BoolTy
      | infer (Int _) = IntTy
      | infer (Not e) = (assert e BoolTy; BoolTy)
      | infer (Add (a, b)) = (assert a IntTy; assert b IntTy; IntTy)
      | infer (If (e, t, f)) = (assert e BoolTy; unify (infer t, infer f))
    and assert e t = unify (infer e, t)

    fun eval True = True
      | eval False = False
      | eval (Int n) = Int n
      | eval (Not e) = if eval e = True then False else True
      | eval (Add (a, b)) = (case (eval a, eval b) of (Int x, Int y) => Int (x + y))
      | eval (If (e, t, f)) = eval (if eval e = True then t else f)

    fun run e = (infer e; SOME (eval e)) handle TyErr => NONE

Example usage on well-typed and ill-typed expressions:

    val SOME (Int 3) = run (Add (Int 1, Int 2))       (* well-typed *)
    val NONE = run (If (Not (Int 1), True, False))    (* ill-typed *)

Arbitrary-precision integers

The IntInf module provides arbitrary-precision integer arithmetic. Moreover, integer literals may be used as arbitrary-precision integers without the programmer having to do anything.

The following program implements an arbitrary-precision factorial function:

fact.sml

    fun fact n : IntInf.int = if n = 0 then 1 else n * fact (n - 1);

    fun printLine str = TextIO.output (TextIO.stdOut, str ^ "\n");

    val () = printLine (IntInf.toString (fact 120));

bash

    $ mlton fact.sml
    $ ./fact
    6689502913449127057588118054090372586752746333138029810295671352301633557244962989366874165271984981308157637893214090552534408589408121859898481114389650005964960521256960000000000000000000000000000

Partial application

Curried functions have many applications, such as eliminating redundant code. For example, a module may require functions of type a -> b, but it is more convenient to write functions of type a * c -> b where there is a fixed relationship between the objects of type a and c. A function of type c -> (a * c -> b) -> a -> b can factor out this commonality. This is an example of the adapter pattern.^{[citation needed]}
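
A function of the type mentioned above might look like the following sketch. The names adapt, clamp and clampTo100 are hypothetical and serve only to illustrate the idea.

```sml
(* adapt : 'c -> ('a * 'c -> 'b) -> 'a -> 'b
   Fixes the second component of the pair once and for all. *)
fun adapt c f a = f (a, c)

(* A two-argument function written in the convenient tupled form. *)
fun clamp (x, limit) = if x > limit then limit else x

(* Partially applying adapt yields the one-argument function a caller needs. *)
val clampTo100 = adapt 100 clamp

val clamped = List.map clampTo100 [5, 250, 100]
```

Here clamped is [5, 100, 100]: only the value above the limit is changed.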

In this example, the function d computes the numerical derivative of a given function f at point x:

    - fun d delta f x = (f (x + delta) - f (x - delta)) / (2.0 * delta)
    val d = fn : real -> (real -> real) -> real -> real

The type of d indicates that it maps a real onto a function with the type (real -> real) -> real -> real. Because d is curried, its arguments can be supplied one at a time, which is known as partial application. In this case, d can be specialised by partially applying it to the argument delta. A good choice for delta when using this algorithm is the cube root of the machine epsilon.^{[citation needed]}

    - val d' = d 1E~8;
    val d' = fn : (real -> real) -> real -> real

The inferred type indicates that d' expects a function with the type real -> real as its first argument. We can compute an approximation to the derivative of $f(x)=x^{3}-x-1$ at $x=3$. The correct answer is $f'(3)=27-1=26$.

    - d' (fn x => x * x * x - x - 1.0) 3.0;
    val it = 25.9999996644 : real
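
The suggestion about choosing delta as the cube root of the machine epsilon can be tried out as follows. This is only a sketch: it assumes that real is a 64-bit IEEE double, so that the machine epsilon is 2^-52, and the names eps, delta and approx are introduced here for illustration.

```sml
fun d delta f x = (f (x + delta) - f (x - delta)) / (2.0 * delta)

(* Assumption: 64-bit IEEE reals, for which machine epsilon is 2^-52. *)
val eps = Math.pow (2.0, ~52.0)

(* Cube root of machine epsilon, roughly 6E~6 under that assumption. *)
val delta = Math.pow (eps, 1.0 / 3.0)

(* Approximate f'(3) for f(x) = x^3 - x - 1; the exact value is 26. *)
val approx = d delta (fn x => x * x * x - x - 1.0) 3.0
```

Under these assumptions, approx should agree with the exact derivative 26 to well within 1E~6.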

Libraries

Standard

The Basis Library^[7] has been standardized and ships with most implementations. It provides modules for trees, arrays, and other data structures, and input/output and system interfaces.

Third party

For numerical computing, a Matrix module exists (but is currently broken), https://www.cs.cmu.edu/afs/cs/project/pscico/pscico/src/matrix/README.html.

For graphics, cairo-sml is an open source interface to the Cairo graphics library. For machine learning, a library for graphical models exists.

Implementations

Implementations of Standard ML include the following:

Standard

HaMLet: a Standard ML interpreter that aims to be an accurate and accessible reference implementation of the standard
MLton (mlton.org): a whole-program optimizing compiler which strictly conforms to the Definition and produces very fast code compared to other ML implementations, including backends for LLVM and C
Moscow ML: a light-weight implementation, based on the Caml Light runtime engine, which implements the full Standard ML language, including modules and much of the basis library
Poly/ML: a full implementation of Standard ML that produces fast code and supports multicore hardware (via Portable Operating System Interface (POSIX) threads); its runtime system performs parallel garbage collection and online sharing of immutable substructures
Standard ML of New Jersey (smlnj.org): a full compiler, with associated libraries, tools, an interactive shell, and documentation with support for Concurrent ML
SML.NET: a Standard ML compiler for the Common Language Runtime with extensions for linking with other .NET framework code
ML Kit: an implementation based very closely on the Definition, integrating a garbage collector (which can be disabled) and region-based memory management with automatic inference of regions, aiming to support real-time applications

Derivative

Alice: an interpreter for Standard ML by Saarland University with support for parallel programming using futures, lazy evaluation, distributed computing via remote procedure calls and constraint programming
SML#: an extension of SML providing record polymorphism and C language interoperability. It is a conventional native compiler and its name is not an allusion to running on the .NET framework
SOSML: an implementation written in TypeScript, supporting most of the SML language and select parts of the basis library

Research

CakeML is a REPL version of ML with formally verified runtime and translation to assembler.
Isabelle (Isabelle/ML) integrates parallel Poly/ML into an interactive theorem prover, with a sophisticated IDE (based on jEdit) for official Standard ML (SML'97), the Isabelle/ML dialect, and the proof language. Starting with Isabelle2016, there is also a source-level debugger for ML.
Poplog implements a version of Standard ML, along with Common Lisp and Prolog, allowing mixed language programming; all are implemented in POP-11, which is compiled incrementally.
TILT is a full certifying compiler for Standard ML which uses typed intermediate languages to optimize code and ensure correctness, and can compile to typed assembly language.

All of these implementations are open-source and freely available. Most are implemented themselves in Standard ML. There are no longer any commercial implementations; Harlequin, now defunct, once produced a commercial IDE and compiler called MLWorks which passed on to Xanalys and was later open-sourced after it was acquired by Ravenbrook Limited on April 26, 2013.

Major projects using SML

The IT University of Copenhagen's entire enterprise architecture is implemented in around 100,000 lines of SML, including staff records, payroll, course administration and feedback, student project management, and web-based self-service interfaces.^[8]

The proof assistants HOL4, Isabelle, LEGO, and Twelf are written in Standard ML. It is also used by compiler writers and integrated circuit designers such as ARM.^[9]

References

[1] "Programming in Standard ML: Hierarchies and Parameterization". Retrieved 2020-02-22.
[2] "SML '97". www.smlnj.org.
[3] "itertools — Functions creating iterators for efficient looping — Python 3.7.1rc1 documentation". docs.python.org.
[4] "Influences - The Rust Reference". The Rust Reference. Retrieved 2023-12-31.
[5] Milner, Robin; Tofte, Mads; Harper, Robert; MacQueen, David (1997). The Definition of Standard ML (Revised). MIT Press. ISBN 0-262-63181-4.
[6] Okasaki, Chris (2000). "Breadth-First Numbering: Lessons from a Small Exercise in Algorithm Design". International Conference on Functional Programming 2000. ACM.
[7] "Standard ML Basis Library". smlfamily.github.io. Retrieved 2022-01-10.
[8] Tofte, Mads (2009). "Standard ML language". Scholarpedia. 4 (2): 7515. Bibcode:2009SchpJ...4.7515T. doi:10.4249/scholarpedia.7515.
[9] Alglave, Jade; Fox, Anthony C. J.; Ishtiaq, Samin; Myreen, Magnus O.; Sarkar, Susmit; Sewell, Peter; Nardelli, Francesco Zappa (2009). The Semantics of Power and ARM Multiprocessor Machine Code (PDF). DAMP 2009. pp. 13–24. Archived (PDF) from the original on 2017-08-14.

External links

About Standard ML

Revised definition
Standard ML Family GitHub Project
What is SML?
What is SML '97?

About successor ML

successor ML (sML): evolution of ML using Standard ML as a starting point
HaMLet on GitHub: reference implementation for successor ML
