Infunctional programming, ageneralized algebraic data type (GADT, alsofirst-class phantom type,[1]guarded recursive datatype,[2] orequality-qualified type[3]) is a generalization of aparametricalgebraic data type (ADT).
In a GADT, the product constructors (calleddata constructors inHaskell) can provide an explicit instantiation of the ADT as the type instantiation of their return value. This allows defining functions with a more advanced type behaviour. For a data constructor of Haskell 2010, the return value has the type instantiation implied by the instantiation of the ADT parameters at the constructor's application.
-- A parametric ADT that is not a GADTdataLista=Nil|Consa(Lista)integers::ListIntintegers=Cons12(Cons107Nil)strings::ListStringstrings=Cons"boat"(Cons"dock"Nil)-- A GADTdataExprawhereEBool::Bool->ExprBoolEInt::Int->ExprIntEEqual::ExprInt->ExprInt->ExprBooleval::Expra->aevale=caseeofEBoola->aEInta->aEEqualab->(evala)==(evalb)expr1::ExprBoolexpr1=EEqual(EInt2)(EInt3)ret=evalexpr1-- False
They are currently implemented in theGlasgow Haskell Compiler (GHC) as a non-standard extension, used by, among others,Pugs andDarcs.OCaml supports GADT natively since version 4.00.[4]
The GHC implementation provides support for existentially quantified type parameters and for local constraints.
An early version of generalized algebraic data types were described byAugustsson & Petersson (1994) and based onpattern matching inALF.
Generalized algebraic data types were introduced independently byCheney & Hinze (2003) and prior byXi, Chen & Chen (2003) as extensions to thealgebraic data types ofML andHaskell.[5] Both are essentially equivalent to each other. They are similar to theinductive families of data types (orinductive datatypes) found inRocq'sCalculus of Inductive Constructions and otherdependently typed languages, modulo the dependent types and except that the latter have an additionalpositivity restriction which is not enforced in GADTs.[6]
Sulzmann, Wazny & Stuckey (2006) introducedextended algebraic data types which combine GADTs together with theexistential data types andtype class constraints.
Type inference in the absence of any programmer suppliedtype annotation, isundecidable[7] and functions defined over GADTs do not admitprincipal types in general.[8]Type reconstruction requires several design trade-offs and is an area of active research (Peyton Jones, Washburn & Weirich 2004;Peyton Jones et al. 2006).
In spring 2021, Scala 3.0 was released.[9] This major update ofScala introduced the possibility to write GADTs[10] with the same syntax as algebraic data types, which is not the case in otherprogramming languages according toMartin Odersky.[11]
Applications of GADTs includegeneric programming, modelling programming languages (higher-order abstract syntax), maintaininginvariants indata structures, expressing constraints inembedded domain-specific languages, and modelling objects.[12]
An important application of GADTs is to embedhigher-order abstract syntax in atype safe fashion. Here is an embedding of thesimply typed lambda calculus with an arbitrary collection ofbase types,product types (tuples) and afixed point combinator:
dataLam::*->*whereLift::a->Lama-- ^ lifted valuePair::Lama->Lamb->Lam(a,b)-- ^ productLam::(Lama->Lamb)->Lam(a->b)-- ^ lambda abstractionApp::Lam(a->b)->Lama->Lamb-- ^ function applicationFix::Lam(a->a)->Lama-- ^ fixed point
And a type safe evaluation function:
eval::Lamt->teval(Liftv)=veval(Pairlr)=(evall,evalr)eval(Lamf)=\x->eval(f(Liftx))eval(Appfx)=(evalf)(evalx)eval(Fixf)=(evalf)(eval(Fixf))
The factorial function can now be written as:
fact=Fix(Lam(\f->Lam(\y->Lift(ifevaly==0then1elseevaly*(evalf)(evaly-1)))))eval(fact)(10)
Problems would have occurred using regular algebraic data types. Dropping the type parameter would have made the lifted base types existentially quantified, making it impossible to write the evaluator. With a type parameter, it is still restricted to one base type. Further, ill-formed expressions such asApp (Lam (\x -> Lam (\y -> App x y))) (Lift True) would have been possible to construct, while they are type incorrect using the GADT. A well-formed analogue isApp (Lam (\x -> Lam (\y -> App x y))) (Lift (\z -> True)). This is because the type ofx isLam (a -> b), inferred from the type of theLam data constructor.