This chapter gives an overview of the object-oriented features of OCaml.
Note that the relationship between object, class and type in OCaml is different from that in mainstream object-oriented languages such as Java and C++, so you shouldn’t assume that similar keywords mean the same thing. Object-oriented features are used much less frequently in OCaml than in those languages. OCaml has alternatives that are often more appropriate, such as modules and functors. Indeed, many OCaml programs do not use objects at all.
The class point below defines one instance variable x and two methods get_x and move. The initial value of the instance variable is 0. The variable x is declared mutable, so the method move can change its value.
We now create a new point p, an instance of the point class.
Note that the type of p is point. This is an abbreviation automatically defined by the class definition above. It stands for the object type <get_x : int; move : int -> unit>, listing the methods of class point along with their types.
We now invoke some methods of p:
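The steps described so far (defining the class, creating p, and invoking its methods) can be sketched as follows. The names before and after are ours, introduced only to record the results of the calls.

```ocaml
(* A class with one mutable instance variable and two methods. *)
class point =
  object
    val mutable x = 0
    method get_x = x
    method move d = x <- x + d
  end

(* Objects are created with [new]; methods are invoked with [#]. *)
let p = new point
let before = p#get_x      (* initial value of x *)
let () = p#move 3
let after = p#get_x       (* value of x after the move *)
```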
The evaluation of the body of a class only takes place at object creation time. Therefore, in the following example, the instance variable x is initialized to different values for two different objects.
The class point can also be abstracted over the initial values of the x coordinate.
As in function definitions, the definition above can be abbreviated as:
An instance of the class point is now a function that expects an initial parameter to create a point object:
The parameter x_init is, of course, visible in the whole body of the definition, including methods. For instance, the method get_offset in the class below returns the position of the object relative to its initial position.
Expressions can be evaluated and bound before defining the object body of the class. This is useful to enforce invariants. For instance, points can be automatically adjusted to the nearest point on a grid, as follows:
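A minimal sketch of such a class, assuming a grid spacing of 10 (an illustrative choice) and rounding downwards with integer division; the name adjusted_point is hypothetical.

```ocaml
(* The let-binding is evaluated before the object is built, so every
   object starts on the grid. The spacing of 10 is an assumption. *)
class adjusted_point x_init =
  let origin = (x_init / 10) * 10 in
  object
    val mutable x = origin
    method get_x = x
    method get_offset = x - origin
    method move d = x <- x + d
  end

let p = new adjusted_point 31   (* starts at 30, not 31 *)
```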
(One could also raise an exception if the x_init coordinate is not on the grid.) In fact, the same effect could be obtained here by calling the definition of class point with the value of the origin.
An alternate solution would have been to define the adjustment in a special allocation function:
However, the former pattern is generally more appropriate, since the code for adjustment is part of the definition of the class and will be inherited.
This ability provides class constructors as can be found in other languages. Several constructors can be defined this way to build objects of the same class but with different initialization patterns; an alternative is to use initializers, as described below in section 3.4.
There is another, more direct way to create an object: create it without going through a class.
The syntax is exactly the same as for class expressions, but the result is a single object rather than a class. All the constructs described in the rest of this section also apply to immediate objects.
Unlike classes, which cannot be defined inside an expression, immediate objects can appear anywhere, using variables from their environment.
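As an illustration, the function below builds an immediate object inside an expression and captures its arguments. The name minmax and its two methods are chosen for this sketch.

```ocaml
(* Each branch returns an immediate object that closes over x and y. *)
let minmax x y =
  if x < y then object method min = x method max = y end
  else object method min = y method max = x end

let m = minmax 3 1
```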
Immediate objects have two weaknesses compared to classes: their types are not abbreviated, and you cannot inherit from them. But these two weaknesses can be advantages in some situations, as we will see in sections 3.3 and 3.10.
A method or an initializer can invoke methods on self (that is, the current object). For that, self must be explicitly bound, here to the variable s (s could be any identifier, even though we will often choose the name self.)
Dynamically, the variable s is bound at the invocation of a method. In particular, when the class printable_point is inherited, the variable s will be correctly bound to the object of the subclass.
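The sketch below illustrates this late binding of self. To keep the result observable as a value, it uses a hypothetical method describe that returns a string instead of printing, and a hypothetical subclass doubled_point that overrides get_x.

```ocaml
class printable_point x_init =
  object (s)
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
    (* s#get_x is resolved when the method is invoked *)
    method describe = string_of_int s#get_x
  end

(* Hypothetical subclass: describe is inherited, but the call to
   s#get_x now reaches the overriding definition. *)
class doubled_point x_init =
  object
    inherit printable_point x_init
    method! get_x = 2 * x
  end

let base = (new printable_point 4)#describe
let derived = (new doubled_point 4)#describe
```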
A common problem with self is that, as its type may be extended in subclasses, you cannot fix it in advance. Here is a simple example.
You can ignore the first two lines of the error message. What matters is the last one: putting self into an external reference would make it impossible to extend it through inheritance. We will see in section 3.12 a workaround to this problem. Note however that, since immediate objects are not extensible, the problem does not occur with them.
Let-bindings within class definitions are evaluated before the object is constructed. It is also possible to evaluate an expression immediately after the object has been built. Such code is written as an anonymous hidden method called an initializer. Therefore, it can access self and the instance variables.
Initializers cannot be overridden. On the contrary, all initializers are evaluated sequentially. Initializers are particularly useful to enforce invariants. Another example can be seen in section 8.1.
It is possible to declare a method without actually defining it, using the keyword virtual. This method will be provided later in subclasses. A class containing virtual methods must be flagged virtual, and cannot be instantiated (that is, no object of this class can be created). It still defines type abbreviations (treating virtual methods as other methods.)
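A possible sketch: the hypothetical virtual class abstract_point declares get_x and move without defining them, and a concrete subclass supplies the definitions.

```ocaml
(* get_x and move are only declared here; get_offset is defined
   in terms of the not-yet-defined get_x. *)
class virtual abstract_point x_init =
  object (self)
    method virtual get_x : int
    method get_offset = self#get_x - x_init
    method virtual move : int -> unit
  end

(* The subclass provides the missing methods and can be instantiated. *)
class point x_init =
  object
    inherit abstract_point x_init
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

let p = new point 5
let () = p#move 2
```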
Instance variables can also be declared as virtual, with the same effect as with methods.
Private methods are methods that do not appear in object interfaces. They can only be invoked from other methods of the same object.
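For instance, in the sketch below the method move is private and is reached only through the public method bump. Both the class name restricted_point and the method bump are illustrative names.

```ocaml
class restricted_point x_init =
  object (self)
    val mutable x = x_init
    method get_x = x
    (* move does not appear in the object type *)
    method private move d = x <- x + d
    (* a public method may call the private one through self *)
    method bump = self#move 1
  end

let p = new restricted_point 0
let () = p#bump
(* p#move 10 would be rejected by the type checker *)
```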
Note that this is not the same thing as private and protected methods in Java or C++, which can be called from other objects of the same class. This is a direct consequence of the independence between types and classes in OCaml: two unrelated classes may produce objects of the same type, and there is no way at the type level to ensure that an object comes from a specific class. However, a possible encoding of friend methods is given in section 3.17.
Private methods are inherited (they are by default visible in subclasses), unless they are hidden by signature matching, as described below.
Private methods can be made public in a subclass.
The annotation virtual here is only used to mention a method without providing its definition. Since we didn’t add the private annotation, this makes the method public, keeping the original definition.
An alternative definition is
The constraint on self’s type requires a public move method, and this is sufficient to override private.
One could think that a private method should remain private in a subclass. However, since the method is visible in a subclass, it is always possible to pick its code and define a method of the same name that runs that code, so yet another (heavier) solution would be:
Of course, private methods can also be virtual. Then, the keywords must appear in this order: method private virtual.
Class interfaces are inferred from class definitions. They may also be defined directly and used to restrict the type of a class. Like class declarations, they also define a new type abbreviation.
In addition to program documentation, class interfaces can be used to constrain the type of a class. Both concrete instance variables and concrete private methods can be hidden by a class type constraint. Public methods and virtual members, however, cannot.
Or, equivalently:
The interface of a class can also be specified in a module signature, and used to restrict the inferred signature of a module.
We illustrate inheritance by defining a class of colored points that inherits from the class of points. This class has all instance variables and all methods of class point, plus a new instance variable c and a new method color.
A point and a colored point have incompatible types, since a point has no method color. However, the function get_succ_x below is a generic function applying method get_x to any object p that has this method (and possibly some others, which are represented by an ellipsis in the type). Thus, it applies to both points and colored points.
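A sketch of the classes and the generic function just described. Representing colors as strings is an assumption made for this example.

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

(* colored_point inherits x, get_x and move, and adds c and color. *)
class colored_point x (c : string) =
  object
    inherit point x
    val c = c
    method color = c
  end

(* Inferred type: < get_x : int; .. > -> int *)
let get_succ_x p = p#get_x + 1

let a = get_succ_x (new point 1)
let b = get_succ_x (new colored_point 5 "red")
```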
Methods need not be declared previously, as shown by the example:
Multiple inheritance is allowed. Only the last definition of a method is kept: the redefinition in a subclass of a method that was visible in the parent class overrides the definition in the parent class. Previous definitions of a method can be reused by binding the related ancestor. Below, super is bound to the ancestor printable_point. The name super is a pseudo value identifier that can only be used to invoke a super-class method, as in super#print.
A private method that has been hidden in the parent class is no longer visible, and is thus not overridden. Since initializers are treated as private methods, all initializers along the class hierarchy are evaluated, in the order they are introduced.
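The binding of an ancestor can be sketched as follows. So that the result is a value rather than console output, this sketch overrides a hypothetical method describe that returns a string; the text's print method would be overridden in the same way.

```ocaml
class printable_point x_init =
  object (s)
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
    method describe = string_of_int s#get_x
  end

class printable_colored_point y (c : string) =
  object (self)
    val c = c
    method color = c
    (* super names the inherited definitions *)
    inherit printable_point y as super
    (* method! asserts that an inherited method is being overridden *)
    method! describe = "(" ^ super#describe ^ ", " ^ self#color ^ ")"
  end

let s = (new printable_colored_point 17 "red")#describe
```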
Note that for clarity’s sake, the method print is explicitly marked as overriding another definition by annotating the method keyword with an exclamation mark !. If the method print were not overriding the print method of printable_point, the compiler would raise an error:
This explicit overriding annotation also works for val and inherit:
Reference cells can be implemented as objects. The naive definition fails to typecheck:
The reason is that at least one of the methods has a polymorphic type (here, the type of the value stored in the reference cell), thus either the class should be parametric, or the method type should be constrained to a monomorphic type. A monomorphic instance of the class could be defined by:
Note that since immediate objects do not define a class type, they have no such restriction.
On the other hand, a class for polymorphic references must explicitly list the type parameters in its declaration. Class type parameters are listed between [ and ]. The type parameters must also be bound somewhere in the class body by a type constraint.
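A minimal sketch of such a parametric class; the name oref is an assumption, and the constraint (x_init : 'a) is what binds the type parameter in the body.

```ocaml
(* The type parameter is listed between brackets and bound by the
   constraint on x_init. *)
class ['a] oref x_init =
  object
    val mutable x = (x_init : 'a)
    method get = x
    method set y = x <- y
  end

let r = new oref 1          (* r : int oref *)
let () = r#set 2
let s = new oref "hello"    (* s : string oref *)
```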
The type parameter in the declaration may actually be constrained in the body of the class definition. In the class type, the actual value of the type parameter is displayed in the constraint clause.
Let us consider a more complex example: define a circle, whose center may be any kind of point. We put an additional type constraint in method move, since no free variables must remain unaccounted for by the class type parameters.
An alternate definition of circle, using a constraint clause in the class definition, is shown below. The type #point used below in the constraint clause is an abbreviation produced by the definition of class point. This abbreviation unifies with the type of any object belonging to a subclass of class point. It actually expands to < get_x : int; move : int -> unit; .. >. This leads to the following alternate definition of circle, which has slightly stronger constraints on its argument, as we now expect center to have a method get_x.
The class colored_circle is a specialized version of class circle that requires the type of the center to unify with #colored_point, and adds a method color. Note that when specializing a parameterized class, the instance of the type parameter must always be explicitly given. It is again written between [ and ].
While parameterized classes may be polymorphic in their contents, they are not enough to allow polymorphism of method use.
A classical example is defining an iterator.
At first look, we seem to have a polymorphic iterator; however, this does not work in practice.
Our iterator works, as its first use for summation shows. However, since objects themselves are not polymorphic (only their constructors are), using the fold method fixes its type for this individual object. Our next attempt to use it as a string iterator fails.
The problem here is that quantification was wrongly located: it is not the class we want to be polymorphic, but the fold method. This can be achieved by giving an explicitly polymorphic type in the method definition.
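A sketch of a list class whose fold method carries an explicit quantifier. The class name intlist and the use of List.fold_left are choices made for this example.

```ocaml
class ['a] intlist (l : 'a list) =
  object
    method empty = (l = [])
    (* 'b is quantified at the method, not at the class *)
    method fold : 'b. ('b -> 'a -> 'b) -> 'b -> 'b =
      fun f accu -> List.fold_left f accu l
  end

let l = new intlist [1; 2; 3]
(* The same object is folded at two different result types. *)
let sum = l#fold (fun x y -> x + y) 0
let str = l#fold (fun s x -> s ^ string_of_int x ^ " ") ""
```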
As you can see in the class type shown by the compiler, while polymorphic method types must be fully explicit in class definitions (appearing immediately after the method name), quantified type variables can be left implicit in class descriptions. Why require types to be explicit? The problem is that (int -> int -> int) -> int -> int would also be a valid type for fold, and it happens to be incompatible with the polymorphic type we gave (automatic instantiation only works for toplevel type variables, not for inner quantifiers, where it becomes an undecidable problem.) So the compiler cannot choose between those two types, and must be helped.
However, the type can be completely omitted in the class definition if it is already known, through inheritance or type constraints on self. Here is an example of method overriding.
The following idiom separates description and definition.
Note here the (self : int #iterator) idiom, which ensures that this object implements the interface iterator.
Polymorphic methods are called in exactly the same way as normal methods, but you should be aware of some limitations of type inference. Namely, a polymorphic method can only be called if its type is known at the call site. Otherwise, the method will be assumed to be monomorphic, and given an incompatible type.
The workaround is easy: you should put a type constraint on the parameter.
Of course the constraint may also be an explicit method type. Only occurrences of quantified variables are required.
Another use of polymorphic methods is to allow some form of implicit subtyping in method arguments. We have already seen in section 3.8 how some functions may be polymorphic in the class of their argument. This can be extended to methods.
Note here the special syntax (#point0 as 'a) we have to use to quantify the extensible part of #point0. As for the variable binder, it can be omitted in class specifications. If you want polymorphism inside an object field, it must be quantified independently.
In method m1, o must be an object with at least a method n1, itself polymorphic. In method m2, the argument of n2 and x must have the same type, which is quantified at the same level as 'a.
Subtyping is never implicit. There are, however, two ways to perform subtyping. The most general construction is fully explicit: both the domain and the codomain of the type coercion must be given.
We have seen that points and colored points have incompatible types. For instance, they cannot be mixed in the same list. However, a colored point can be coerced to a point, hiding its color method:
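A sketch of the fully explicit coercion, restating the two classes so that the fragment stands alone (colors as strings is again an assumption):

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

class colored_point x (c : string) =
  object
    inherit point x
    val c = c
    method color = c
  end

(* Both the domain (colored_point) and the codomain (point) are given. *)
let colored_point_to_point cp = (cp : colored_point :> point)

(* After coercion, both objects fit in one list of type point list. *)
let points = [new point 1; colored_point_to_point (new colored_point 5 "red")]
let xs = List.map (fun p -> p#get_x) points
```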
An object of type t can be seen as an object of type t' only if t is a subtype of t'. For instance, a point cannot be seen as a colored point.
Indeed, narrowing coercions without runtime checks would be unsafe. Runtime type checks might raise exceptions, and they would require the presence of type information at runtime, which is not the case in the OCaml system. For these reasons, there is no such operation available in the language.
Be aware that subtyping and inheritance are not related. Inheritance is a syntactic relation between classes while subtyping is a semantic relation between types. For instance, the class of colored points could have been defined directly, without inheriting from the class of points; the type of colored points would remain unchanged and thus still be a subtype of points.
The domain of a coercion can often be omitted. For instance, one can define:
In this case, the function colored_point_to_point is an instance of the function to_point. This is not always true, however. The fully explicit coercion is more precise and is sometimes unavoidable. Consider, for example, the following class:
The object type c0 is an abbreviation for <m : 'a; n : int> as 'a. Consider now the type declaration:
The object type c1 is an abbreviation for the type <m : 'a> as 'a. The coercion from an object of type c0 to an object of type c1 is correct:
However, the domain of the coercion cannot always be omitted. In that case, the solution is to use the explicit form. Sometimes, a change in the class-type definition can also solve the problem.
While class types c1 and c2 are different, both object types c1 and c2 expand to the same object type (same method names and types). Yet, when the domain of a coercion is left implicit and its co-domain is an abbreviation of a known class type, then the class type, rather than the object type, is used to derive the coercion function. This allows leaving the domain implicit in most cases when coercing from a subclass to its superclass. The type of a coercion can always be seen as below:
Note the difference between these two coercions: in the case of to_c2, the type #c2 = < m : 'a; .. > as 'a is polymorphically recursive (according to the explicit recursion in the class type of c2); hence the success of applying this coercion to an object of class c0. On the other hand, in the first case, c1 was only expanded and unrolled twice to obtain < m : < m : c1; .. >; .. > (remember #c1 = < m : c1; .. >), without introducing recursion. You may also note that the type of to_c2 is #c2 -> c2 while the type of to_c1 is more general than #c1 -> c1. This is not always true, since there are class types for which some instances of #c are not subtypes of c, as explained in section 3.16. Yet, for parameterless classes the coercion (_ :> c) is always more general than (_ : #c :> c).
A common problem may occur when one tries to define a coercion to a class c while defining class c. The problem is due to the type abbreviation not being completely defined yet, and so its subtypes are not clearly known. Then, a coercion (_ :> c) or (_ : #c :> c) is taken to be the identity function, as in
As a consequence, if the coercion is applied to self, as in the following example, the type of self is unified with the closed type c (a closed object type is an object type without ellipsis). This would constrain the type of self to be closed and is thus rejected. Indeed, the type of self cannot be closed: this would prevent any further extension of the class. Therefore, a type error is generated when the unification of this type with another type would result in a closed object type.
However, the most common instance of this problem, coercing self to its current class, is detected as a special case by the type checker, and properly typed.
This allows the following idiom, keeping a list of all objects belonging to a class or its subclasses:
This idiom can in turn be used to retrieve an object whose type has been weakened:
The type < m : int > we see here is just the expansion of c, due to the use of a reference; we have succeeded in getting back an object of type c.
The previous coercion problem can often be avoided by first defining the abbreviation, using a class type:
It is also possible to use a virtual class. Inheriting from this class simultaneously forces all methods of c to have the same type as the methods of c'.
One could think of defining the type abbreviation directly:
However, the abbreviation #c' cannot be defined directly in a similar way. It can only be defined by a class or a class-type definition. This is because a #-abbreviation carries an implicit anonymous variable .. that cannot be explicitly named. The closest you can get to it is:
with an extra type variable capturing the open object type.
It is possible to write a version of class point without assignments on the instance variables. The override construct {< ... >} returns a copy of “self” (that is, the current object), possibly changing the value of some instance variables.
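A minimal sketch of such an assignment-free class, using the name functional_point. The method move_to is included to show the punned form {< x >}, where the parameter x shadows the instance variable.

```ocaml
class functional_point y =
  object
    val x = y                           (* immutable instance variable *)
    method get_x = x
    (* returns a copy of self in which x is replaced *)
    method move d = {< x = x + d >}
    (* {< x >} is short for {< x = x >}; here x is the parameter *)
    method move_to x = {< x >}
  end

let p = new functional_point 7
let q = p#move 3
let r = p#move_to 15
```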
As with records, the form {< x >} is an elided version of {< x = x >} which avoids the repetition of the instance variable name. Note that the type abbreviation functional_point is recursive, which can be seen in the class type of functional_point: the type of self is 'a and 'a appears inside the type of the method move.
The above definition of functional_point is not equivalent to the following:
While objects of either class will behave the same, objects of their subclasses will be different. In a subclass of bad_functional_point, the method move will keep returning an object of the parent class. On the contrary, in a subclass of functional_point, the method move will return an object of the subclass.
Functional update is often used in conjunction with binary methods as illustrated in section 8.2.1.
Objects can also be cloned, whether they are functional or imperative. The library function Oo.copy makes a shallow copy of an object. That is, it returns a new object that has the same methods and instance variables as its argument. The instance variables are copied but their contents are shared. Assigning a new value to an instance variable of the copy (using a method call) will not affect instance variables of the original, and conversely. A deeper assignment (for example if the instance variable is a reference cell) will of course affect both the original and the copy.
The type of Oo.copy is the following:
The keyword as in that type binds the type variable 'a to the object type < .. >. Therefore, Oo.copy takes an object with any methods (represented by the ellipsis), and returns an object of the same type. The type of Oo.copy is different from type < .. > -> < .. > as each ellipsis represents a different set of methods. Ellipsis actually behaves as a type variable.
In fact, Oo.copy p will behave as p#copy assuming that a public method copy with body {< >} has been defined in the class of p.
Objects can be compared using the generic comparison functions = and <>. Two objects are equal if and only if they are physically equal. In particular, an object and its copy are not equal.
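The behaviour of Oo.copy and of equality on objects can be sketched with the imperative point class, restated here so that the fragment stands alone. The names same_state and distinct are ours.

```ocaml
class point x_init =
  object
    val mutable x = x_init
    method get_x = x
    method move d = x <- x + d
  end

let p = new point 5
let q = Oo.copy p            (* q has its own copy of x *)
let () = q#move 7            (* only the copy is affected *)

let same_state = (p#get_x = 5) && (q#get_x = 12)
(* An object equals itself but not its copy. *)
let distinct = (p = p) && (p <> q)
```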
Other generic comparisons such as (<, <=, ...) can also be used on objects. The relation < defines an unspecified but strict ordering on objects. The ordering relationship between two objects is fixed permanently once the two objects have been created, and it is not affected by mutation of fields.
Cloning and override have a non-empty intersection. They are interchangeable when used within an object and without overriding any field:
Only the override can be used to actually override fields, and only the Oo.copy primitive can be used externally.
Cloning can also be used to provide facilities for saving and restoring the state of objects.
The above definition will only back up one level. The backup facility can be added to any class by using multiple inheritance.
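A sketch of a one-level backup class and of its combination with another class through multiple inheritance. The reference-cell class oref and the combined class backup_ref are assumptions made for this illustration.

```ocaml
(* One-level backup: save stores a copy whose own backup is erased. *)
class backup =
  object (self : 'mytype)
    val mutable copy = None
    method save = copy <- Some {< copy = None >}
    method restore = match copy with Some x -> x | None -> self
  end

(* An assumed reference-cell class to which backup is added. *)
class ['a] oref x_init =
  object
    val mutable x = (x_init : 'a)
    method get = x
    method set y = x <- y
  end

class ['a] backup_ref x =
  object
    inherit ['a] oref x
    inherit backup
  end

let p = new backup_ref 0
let () = p#save; p#set 1
let current = p#get
let restored = p#restore#get
```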
We can define a variant of backup that retains all copies. (We also add a method clear to manually erase all copies.)
Recursive classes can be used to define objects whose types are mutually recursive.
Although their types are mutually recursive, the classes widget and window are themselves independent.
A binary method is a method which takes an argument of the same type as self. The class comparable below is a template for classes with a binary method leq of type 'a -> bool where the type variable 'a is bound to the type of self. Therefore, #comparable expands to < leq : 'a -> bool; .. > as 'a. We see here that the binder as also allows writing recursive types.
We then define a subclass money of comparable. The class money simply wraps floats as comparable objects. We will extend money below with more operations. We have to use a type constraint on the class parameter x because the primitive <= is a polymorphic function in OCaml. The inherit clause ensures that the type of objects of this class is an instance of #comparable.
Note that the type money is not a subtype of type comparable, as the self type appears in contravariant position in the type of method leq. Indeed, an object m of class money has a method leq that expects an argument of type money since it accesses its value method. Considering m of type comparable would allow a call to method leq on m with an argument that does not have a method value, which would be an error.
Similarly, the type money2 below is not a subtype of type money.
It is however possible to define functions that manipulate objects of type either money or money2: the function min will return the minimum of any two objects whose type unifies with #comparable. The type of min is not the same as #comparable -> #comparable -> #comparable, as the abbreviation #comparable hides a type variable (an ellipsis). Each occurrence of this abbreviation generates a new variable.
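The pieces described above can be sketched together as follows. Note that this definition of min shadows the standard library function of the same name within the fragment.

```ocaml
(* Template for classes with a binary method leq. *)
class virtual comparable =
  object (_ : 'a)
    method virtual leq : 'a -> bool
  end

(* money wraps a float; the constraint on x is needed because <=
   is polymorphic. *)
class money (x : float) =
  object
    inherit comparable
    val repr = x
    method value = repr
    method leq p = repr <= p#value
  end

(* Works for any object whose type unifies with #comparable. *)
let min (x : #comparable) y = if x#leq y then x else y

let smallest = (min (new money 1.3) (new money 3.1))#value
```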
This function can be applied to objects of type money or money2.
More examples of binary methods can be found in sections 8.2.1 and 8.2.3.
Note the use of override for method times. Writing new money2 (k *. repr) instead of {< repr = k *. repr >} would not behave well with inheritance: in a subclass money3 of money2 the times method would return an object of class money2 but not of class money3 as would be expected.
The class money could naturally carry another binary method. Here is a direct definition:
The above class money reveals a problem that often occurs with binary methods. In order to interact with other objects of the same class, the representation of money objects must be revealed, using a method such as value. If we remove all binary methods (here plus and leq), the representation can easily be hidden inside objects by removing the method value as well. However, this is not possible as soon as some binary method requires access to the representation of objects of the same class (other than self).
Here, the representation of the object is known only to a particular object. To make it available to other objects of the same class, we are forced to make it available to the whole world. However, we can easily restrict the visibility of the representation using the module system.
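One way to do this is sketched below, with an assumed module name Euro and signature name MONEY. Inside the module the representation type t is float; the signature makes it abstract, so the method value is still callable from outside but its result cannot be inspected.

```ocaml
module type MONEY =
  sig
    type t                           (* abstract outside the module *)
    class c : float ->
      object ('a)
        val repr : t
        method value : t
        method leq : 'a -> bool
        method plus : 'a -> 'a
      end
  end

module Euro : MONEY =
  struct
    type t = float
    class c x =
      object (self : 'a)
        val repr = x
        method value = repr
        (* binary methods read the representation of their argument *)
        method leq (p : 'a) = self#value <= p#value
        method plus (p : 'a) = {< repr = repr +. p#value >}
      end
  end

let a = new Euro.c 1.0
let b = new Euro.c 2.0
let ordered = a#leq b
```

Outside Euro, the result of a#value has the abstract type Euro.t, so it can only be passed back to objects of the same class.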
Another example of friend functions may be found in section 8.2.3. These examples occur when a group of objects (here objects of the same class) and functions should see each other’s internal representation, while their representation should be hidden from the outside. The solution is always to define all friends in the same module, give access to the representation and use a signature constraint to make the representation abstract outside the module.