thma/WhyHaskellMattersPublic

NotificationsYou must be signed in to change notification settings
Fork14
Star479

In this article I try to explain why Haskell keeps being such an important language by presenting some of its most important and distinguishing features and detailing them with working code examples. The presentation aims to be self-contained and does not require any previous knowledge of the language.

License

Apache-2.0 license

479 stars 14 forks Branches Tags Activity

Star

Notifications

You must be signed in to change notification settings

Branches Tags

Folders and files

Name		Name	Last commit message	Last commit date
Latest commit History 89 Commits
.github/workflows		.github/workflows
img		img
src		src
test		test
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
Setup.hs		Setup.hs
WhyHaskellMatters.iml		WhyHaskellMatters.iml
functor-proof.md		functor-proof.md
package.yaml		package.yaml
stack.yaml		stack.yaml
zettelkasten.md		zettelkasten.md

Repository files navigation

Why Haskell Matters

Haskell doesn't solve different problems than other languages.But it solves them differently.
-- unknown

Abstract

In this article I try to explain why Haskell keeps being such an important language by presenting someof its most important and distinguishing features and detailing them with working code examples.

The presentation aims to be self-contained and does not require any previous knowledge of the language.

The target audience are Haskell freshmen and developers with a background in non-functional languages who are eagerto learn about concepts of functional programming and Haskell in particular.

Introduction

Exactly thirty years ago, on April 1st 1990, a small group of researchers in the field of non-strict functionalprogramming published the original Haskell language report.

Haskell never became one of the most popular languages in the software industry or part of the mainstream,but it has been and still is quite influential in the software development community.

In this article I try to explain why Haskell keeps being such an important language by presenting someof its most distinguishing features and detailing them with working code examples.

The presentation aims to be self-contained and does not require any previous knowledge of the language.I will also try to keep the learning curve moderate and to limit the scope of the presentation;nevertheless this article is by no means a complete introduction to the language.

(If you are looking for thorough tutorials have a look atHaskell Wikibook orLearn You a Haskell

Before diving directly into the technical details I'd like to first have a closer look on the reception ofHaskell in the software developers community:

A strange development over time

In a talk in 2017 onthe Haskell journeysince its beginnings in the 1980ies Simon Peyton Jones speaks about therather unusual life story of Haskell.

First he talks about the typical life cycle of research languages. They are often created bya single researcher (who is also the single user), and most of them will be abandonedafter just a few years.

A more successful research language might gain some interest in a larger communitybut will still not escape the ivory tower and typically will be given up within ten years.

On the other hand we have all those popular programming languages that are quickly adopted bylarge numbers of developers and thus reach "the threshold of immortality".That is the base of existing code will grow so large that the language willbe in use for decades.

A little jokingly he then depicts the sad fate of languages designed bycommittees by flat line through zero: They simply never take off.

Finally, he presents a chart showing the Haskell timeline:

The development shown in this chart seems rather unexpected:Haskell started as a research language and was even designed by a committee;so in all probability it should have been abandoned long before the millennium!

Instead, it gained some momentum in its early years followed by a rather quiet phase duringthe decade of OO hype (Java being released in 1995).And then again we see a continuous growth of interest since about 2005.I'm writing this in early 2020, and we still see this trend!

Being used versus being discussed

Then Simon Peyton Jones points out another interesting characteristic of the reception of Haskellin recent years:In statistics that rank programming languages by actual usage Haskell is typically not under the 30 most active languages.But in statistics that instead rank languages by the volume of discussions on the internetHaskell typically scores much better (often in the top ten).

So why does Haskell keep being a hot topic in the software development community?

A veryshort answer might be:Haskell has a number of features that are clearly different from those of most other programming languages.Many of these features have proven to be powerful tools to solve basic problems of software development elegantly.

Therefore, over time other programming languages have adopted parts of these concepts (e.g. pattern matching or type classes).In discussions about such concepts the Haskell heritage is mentionedand differences between the original Haskell concepts and those of other languages are discussed.Sometimes people feel encouraged to have a closer look at the source of these concepts to get a deeper understanding oftheir original intentions. That's why we see a growing number of developers working inPython, Typescript, Scala, Rust, C++, C# or Java starting to dive into Haskell.

A further essential point is that Haskell is still an experimental laboratory for research in areas such ascompiler construction, programming language design, theorem-provers, type systems etc.So inevitably Haskell will be a topic in the discussion about these approaches.

In the following sections we will try to find thelonger answer bystudying some of the most distinguishing features of Haskell.

Functions are First-class

In computer science, a programming language is said to have first-class functions if it treats functions asfirst-class citizens. This means the language supportspassing functions as arguments to other functions,returning them as the values from other functions, andassigning them to variables or storing them in datastructures.[1] Some programming language theorists requiresupport for anonymous functions (function literals)as well.[2] In languages with first-class functions, the names of functions do not have any special status;they are treated like ordinary variables with a function type.
quoted fromWikipedia

We'll go through this one by one:

Functions can be assigned to variables exactly as any other values

Let's have a look how this looks like in Haskell. First we define some simple values:

-- define constant `aNumber` with a value of 42.aNumber::IntegeraNumber=42-- define constant `aString` with a value of "hello world"aString::StringaString="Hello World"

In the first line we see a type signature that defines the constantaNumber to be of typeInteger.In the second line we define the value ofaNumber to be42.In the same way we define the constantaString to be of typeString.

Haskell is a statically typed language: all type checks happen at compile time.Static typing has the advantage that type errors don't happen at runtime.This is especially useful if a function signature is changed and this changeaffects many dependent parts of a project: the compiler will detect the breaking changesat all affected places.

The Haskell Compiler also providestype inference, which allows the compiler to deduce the concrete data typeof an expression from the context.Thus, it is usually not required to provide type declarations.Nevertheless, using explicit type signatures is considered good style as they are an important element of acomprehensive documentation.

Next we define a functionsquare that takes an integer argument and returns the square value of the argument:

square::Integer->Integersquare x= x* x

Definition of a function works exactly in the same way as the definition of any other value.The only thing special is that we declare the type to be afunction type by using the-> notation.So:: Integer -> Integer represents a function fromInteger toInteger.In the second line we define functionsquare to computex * x for anyInteger argumentx.

Ok, seems not too difficult, so let's define another functiondouble that doubles its input value:

double::Integer->Integerdouble n=2* n

Support for anonymous functions

Anonymous functions, also known as lambda expressions, can be defined in Haskell like this:

\x-> x* x

This expression denotes an anonymous function that takes a single argument x and returns the square of that argument.The backslash is read as λ (the greek letter lambda).

You can use such as expressions everywhere where you would use any other function. For example you could apply theanonymous function\x -> x * x to a number just like the named functionsquare:

-- use named function:result= square5-- use anonymous function:result'= (\x-> x* x)5

We will see more useful applications of anonymous functions in the following section.

Functions can be returned as values from other functions

Function composition

Do you rememberfunction composition from your high-school math classes?Function composition is an operation that takes two functionsf andg and produces a functionh such thath(x) = g(f(x))The resulting composite function is denotedh = g ∘ f where(g ∘ f )(x) = g(f(x)).Intuitively, composing functions is a chaining process in which the output of functionf is used as input of functiong.

So looking from a programmers perspective the∘ operator is a function thattakes two functions as arguments and returns a new composite function.

In Haskell this operator is represented as the dot operator.:

(.):: (b->c)-> (a->b)->a->c(.) f g x= f (g x)

The brackets around the dot are required as we want to use a non-alphabetical symbol as an identifier.In Haskell such identifiers can be used as infix operators (as we will see below).Otherwise(.) is defined as any other function.Please also note how close the syntax is to the original mathematical definition.

Using this operator we can easily create a composite function that first doublesa number and then computes the square of that doubled number:

squareAfterDouble::Integer->IntegersquareAfterDouble= square. double

Currying and Partial Application

In this section we look at another interesting example of functions producingother functions as return values.We start by defining a functionadd that takes twoInteger arguments and computes their sum:

-- function adding two numbersadd::Integer->Integer->Integeradd x y= x+ y

This look quite straightforward. But still there is one interesting detail to note:the type signature ofadd is not something like

add:: (Integer,Integer)->Integer

Instead it is:

add::Integer->Integer->Integer

What does this signature actually mean?It can be read as "A function taking an Integer argument and returning a function of typeInteger -> Integer".Sounds weird? But that's exactly what Haskell does internally.So if we calladd 2 3 firstadd is applied to2 which return a new function of typeInteger -> Integer which is then applied to3.

This technique is calledCurrying

Currying is widely used in Haskell as it allows another cool thing:partial application.

In the next code snippet we define a functionadd5 by partially applying the functionadd to only one argument:

-- partial application: applying add to 5 returns a function of type Integer -> Integeradd5::Integer->Integeradd5= add5

The trick is as follows:add 5 returns a function of typeInteger -> Integer which will add5 to any Integer argument.

Partial application thus allows us to write functions that return functions as result values.This technique is frequently used toprovide functions with configuration data.

Functions can be passed as arguments to other functions

I could keep this section short by telling you that we have already seen an example for this:the function composition operator(.).Itaccepts two functions as arguments and returns a new one as in:

squareAfterDouble::Integer->IntegersquareAfterDouble= square. double

But I have another instructive example at hand.

Let's imagine we have to implement a function that doubles any odd Integer:

ifOddDouble::Integer->IntegerifOddDouble n=ifodd nthen double nelse n

The Haskell code is straightforward: new ingredients are theif ... then ... else ... and theoddodd which is a predicate from the Haskell standard librarythat returnsTrue if an integral number is odd.

Now let's assume that we also need another function that computes the square for any odd number:

ifOddSquare::Integer->IntegerifOddSquare n=ifodd nthen square nelse n

As vigilant developers we immediately detect a violation of theDon't repeat yourself principle asboth functions only vary in the usage of a different growth functionsdouble versussquare.

So we are looking for a way to refactor this code by a solution that keeps the originalstructure but allows to vary the used growth function.

What we need is a function that takes a growth function (of type(Integer -> Integer))as first argument, anInteger as second argumentand returns anInteger. The specified growth function will be applied in thethen clause:

ifOdd:: (Integer->Integer)->Integer->IntegerifOdd growthFunction n=ifodd nthen growthFunction nelse n

With this approach we can refactorifOddDouble andifOddSquare as follows:

ifOddDouble::Integer->IntegerifOddDouble n= ifOdd double nifOddSquare::Integer->IntegerifOddSquare n= ifOdd square n

Now imagine that we have to implement new functionifEvenDouble andifEvenSquare, thatwill work only on even numbers. Instead of repeating ourselves we come up with a functionifPredGrow that takes a predicate function of type(Integer -> Bool) as first argument,a growth function of type(Integer -> Integer) as second argument and an Integer as third argument,returning anInteger.

The predicate function will be used to determine whether the growth function has to be applied:

ifPredGrow:: (Integer->Bool)-> (Integer->Integer)->Integer->IntegerifPredGrow predicate growthFunction n=if predicate nthen growthFunction nelse n

Using thishigher order functionthat even takes two functions as arguments we can write the two new functions andfurther refactor the existing ones without breaking the DRY principle:

ifEvenDouble::Integer->IntegerifEvenDouble n= ifPredGroweven double nifEvenSquare::Integer->IntegerifEvenSquare n= ifPredGroweven square nifOddDouble''::Integer->IntegerifOddDouble'' n= ifPredGrowodd double nifOddSquare''::Integer->IntegerifOddSquare'' n= ifPredGrowodd square n

Pattern matching

With the things that we have learnt so far, we can now start to implement some more interesting functions.So what about implementing the recursivefactorial function?

The factorial function can be defined as follows:

For all n ∈ ℕ₀:
0! = 1n! = n * (n-1)!

With our current knowledge of Haskell we can implement this as follows:

factorial::Natural->Naturalfactorial n=if n==0then1else n* factorial (n-1)

We are using the Haskell data typeNatural to denote the set of non-negative integers ℕ₀.Using the literalfactorial within the definition of the functionfactorial works as expected and denotes arecursive function call.

As these kind of recursive definition of functions are typical for functional programming, the language designers haveadded a useful feature calledpattern matching that allows to define functions by a set of equations:

fac::Natural->Naturalfac0=1fac n= n* fac (n-1)

This style comes much closer to the mathematical definition and is typically more readable, as it helps to avoidnestedif ... then ... else ... constructs.

Pattern matching can not only be used for numeric values but for any other data types.We'll see some more examples shortly.

Algebraic Data Types

Haskell supports user-defined data types by making use of a well thought out concept.Let's start with a simple example:

dataStatus=Green |Yellow |Red

This declares a data typeStatus which has exactly three different instances. For each instance adata constructor is defined that allows to create a new instance of the data type.

Each of those data constructors is a function (in this simple case a constant) that returns aStatus instance.

The typeStatus is a so calledsum type as it is represents the set defined by the sum of all threeinstancesGreen,Yellow,Red. In Java this corresponds to Enumerations.

Let's assume we have to create a converter that maps ourStatus values toSeverity valuesrepresenting severity levels in some other system.This converter can be written using the pattern matching syntax that we already have seen above:

-- another sum type representing severity:dataSeverity=Low |Middle |Highderiving (Eq,Show)severity::Status->SeverityseverityGreen=LowseverityYellow=MiddleseverityRed=High

The compiler will tell us when we did not cover all instances of theStatus type(by making use of the-fwarn-incomplete-patterns pragma).

Now we look at data types that combine multiple different elements, like pairs n-tuples, etc.Let's start with aPairStatusSeverity type that combines two different elements:

dataPairStatusSeverity=PStatusSeverity

This can be understood as: data typePairStatusSeverity can be constructed from adata constructorP that takes a value of typeStatus and a value of typeSeverity and returns aPair instance.

So for exampleP Green High returns aPairStatusSeverity instance(the data constructorP has the signatureP :: Status -> Severity -> PairStatusSeverity).

The typePairStatusSeverity can be interpreted as the set of all possible ordered pairs of Status and Severity values,that is thecartesian product ofStatus andSeverity.

That's why such a data type is calledproduct type.

Haskell allows you to create arbitrary data types by combiningsum types andproduct types. The completerange of data types that can be constructed in this way is calledalgebraic data types or ADT in short.

Using algebraic data types has several advantages:

Pattern matching can be used to analyze any concrete instance to select different behaviour based on input data.as in the example that mapsStatus toSeverity there is no need to useif..then..else.. constructs.
The compiler can detect incomplete patterns matching or other flaws.
The compiler can derive many complex functionality automatically for ADTs as they are constructed insuch a regular way.

We will cover the interesting combination of ADTs and pattern matching in the following sections.

Polymorphic Data Types

Forming pairs or more generally n-tuples is a very common task in programming.Therefore it would be inconvenient and repetitive if we were forced to create new Pair or Tuple typesfor each concrete usage. consider the following example:

dataPairStatusSeverity=PStatusSeveritydataPairStatusString=P'StatusStringdataPairSeverityStatus=P''SeverityStatus

Luckily data type declarations allow to use type variables to avoid this kind of cluttered code.So we can define a generic data typePair that allows us to freely combine different kinds of arguments:

-- a simple polymorphic typedataPairab=Pab

This can be understood as: data typePair uses two elements of (potentially) different typesa andb; thedata constructorP takes a value of typea and a value of typeb and returns aPair a b instance(the data constructorP has the signatureP :: a -> b -> Pair a b).The typePair can now be used to create many different concrete data types it is thuscalled apolymorphic data type.As the Polymorphism is defined by type variables, i.e. parameters to the type declarations, this mechanism iscalledparametric polymorphism.

As pairs and n-tuples are so frequently used, the Haskell language designers have added some syntactic sugar towork effortlessly with them.

So you can simply write tuples like this:

tuple:: (Status,Severity,String)tuple= (Green,Low,"All green")

Lists

Another very useful polymorphic type is theList.

A list can either be the empty list (denoted by the data constructor[])or some element of a data typea followed by a list with elements of typea, denoted by[a].

This intuition is reflected in the following data type definition:

data [a]= [] | a : [a]

The cons operator(:) (which is an infix operator like(.) from the previous section) is declared as adata constructor to construct a list from a single element of typea and a list of type[a].

So a list containing only a single element1 is constructed by:

1:[]

A list containing the three numbers 1, 2, 3 is constructed like this:

1:2:3:[]

Luckily the Haskell language designers have been so kind to offer some syntactic sugar for this.So the first list can simply be written as[1] and the second as[1,2,3].

Polymorphic type expressions describefamilies of types.For example,(forall a)[a] is the family of types consisting of,for every typea, the type of lists ofa.Lists of integers (e.g.[1,2,3]), lists of characters (['a','b','c']),even lists of lists of integers, etc., are all members of this family.

Function that work on lists can use pattern matching to select behaviour for the[] and thea:[a] case.

Take for instance the definition of the functionlength that computes the length of a list:

length:: [a]->Integerlength[]=0length (x:xs)=1+length xs

We can read these equations as: The length of the empty list is 0,and the length of a list whose first element is x and remainder is xsis 1 plus the length of xs.

In our next example we want to work with a of some random integers:

someNumbers:: [Integer]someNumbers= [49,64,97,54,19,90,934,22,215,6,68,325,720,8082,1,33,31]

Now we want to select all even or all odd numbers from this list.We are looking for a functionfilter that takes twoarguments: first a predicate function that will be used to check each elementand second the actual list of elements. The function will return a list with all matching elements.And of course our solution should work not only for Integers but for any other types as well.Here is the type signature of such a filter function:

filter:: (a->Bool)-> [a]-> [a]

In the implementation we will use pattern matching to provide different behaviour for the[] and the(x:xs) case:

filter:: (a->Bool)-> [a]-> [a]filterpred[]=[]filterpred (x:xs)|pred x= x:filterpred xs|otherwise=filterpred xs

The[] case is obvious. To understand the(x:xs) case we have to know that in addition to simple matching of the type constructorswe can also usepattern guards to perform additional testing on the input data.In this case we computepred x if it evaluates toTrue,x is a match and will be cons'ed with the result offilter pred xs.If it does not evaluate toTrue,we will not addx to the result list and thus simply call filter recursively on the remainder of the list.

Now we can usefilter to select elements from our sample list:

someEvenNumbers:: [Integer]someEvenNumbers=filtereven someNumbers-- predicates may also be lambda-expresssionssomeOddNumbers:: [Integer]someOddNumbers=filter (\n-> n`rem`2/=0) someNumbers

Of course we don't have to invent functions likefilter on our own but can rely on theextensive set ofpredefined functions working on listsin the Haskell base library.

Arithmetic sequences

There is a nice feature that often comes in handy when dealing with lists of numbers. It's calledarithmetic sequences andallows you to define lists of numbers with a concise syntax:

upToHundred:: [Integer]upToHundred= [1..100]

As expected this assignsupToHundred with a list of integers from 1 to 100.

It's also possible to define a step width that determines the increment between the subsequent numbers.If we want only the odd numbers we can construct them like this:

oddsUpToHundred:: [Integer]oddsUpToHundred= [1,3..100]

Arithmetic sequences can also be used in more dynamic cases. For example we can define thefactorial function like this:

$$n! = 1 * 2 * 3 ... (n-2) * (n-1) * n, for integers > 0$$

In Haskell we can use an arithmetic sequence to define this function:

fac' n= prod [1..n]

Immutability

In object-oriented and functional programming, an immutable object is an objectwhose state cannot be modified after it is created. This is in contrast to a mutable object(changeable object), which can be modified after it is created.
Quoted fromWikipedia

This is going to be a very short section. In Haskell all data is immutable. Period.

Let's look at some interactions with the Haskell GHCi REPL (whenever you see theλ> prompt in this articleit is from a GHCi session):

λ> a= [1,2,3]λ> a[1,2,3]λ>reverse a[3,2,1]λ> a[1,2,3]

In Haskell there is no way to change the value ofa after its initial creation. There are nodestructiveoperations available unlike some other functional languages such as Lisp, Scheme or ML.

The huge benefit of this is that refactoring becomes much simpler than in languages where every function or methodmight mutate data. Thus it will also be easier to reason about a given piece of code.

Of course this also makes programming of concurrent operations much easier. With ashared nothing approach,Haskell programs are automatically thread-safe.

Declarative programming

In this section I want to explain how programming withhigher order functions can be used tofactor out many basic control structures and algorithms from the user code.

This will result in a moredeclarative programming style where the developer can simplydeclarewhat she wants to achieve but is not required to write downhow it is to be achieved.

Code that applies this style will be much denser, and it will be more concerned with the actual elementsof the problem domain than with the technical implementation details.

Mapping

We'll demonstrate this with some examples working on lists.First we get the task to write a function that doubles all elements of a[Integer] list.We want to reuse thedouble function we have already defined above.

With all that we have learnt so far, writing a functiondoubleAll isn't that hard:

-- compute the double value for all list elementsdoubleAll:: [Integer]-> [Integer]doubleAll[]=[]doubleAll (n:rest)= double n: doubleAll rest

Next we are asked to implement a similar functionsquareAll that will usesquare to compute the square of all elements in a list.The naive way would be to implement it in theWET (We Enjoy Typing) approach:

-- compute squares for all list elementssquareAll:: [Integer]-> [Integer]squareAll[]=[]squareAll (n:rest)= square n: squareAll rest

Of course this is very ugly:both function use the same pattern matching and apply the same recursive iteration strategy.They only differ in the function applied to each element.

As role model developers we don't want to repeat ourselves. We are thus looking for something thatcaptures the essence of mapping a given function over a list of elements:

map:: (a->b)-> [a]-> [b]map f[]=[]map f (x:xs)= f x:map f xs

This function abstracts away the implementation details of iterating over a list and allows to provide a user definedmapping function as well.

Now we can usemap to simplydeclare our intention (the 'what') and don't have to detail the 'how':

doubleAll':: [Integer]-> [Integer]doubleAll'=map doublesquareAll':: [Integer]-> [Integer]squareAll'=map square

Folding

Now let's have a look at some related problem.Our first task is to add up all elements of a[Integer] list.First the naive approach which uses the already familiar mix of pattern matching plus recursion:

sumUp:: [Integer]->IntegersumUp[]=0sumUp (n:rest)= n+ sumUp rest

By looking at the code for a function that computes the product of all elements of a[Integer] list we can again see thatwe are repeating ourselves:

prod:: [Integer]->Integerprod[]=1prod (n:rest)= n* prod rest

So what is the essence of both algorithms?At the core of both algorithms we have a recursive function which

takes a binary operator ((+)or(*) in our case),
an initial value that is used as a starting point for the accumulation(typically the identity element (or neutral element) of the binary operator),
the list of elements that should be reduced to a single return value
performs the accumulation by recursively applying the binary operator to all elements of the list until the[] is reached,where the neutral element is returned.

This essence is contained in the higher order functionfoldr which again is part of the Haskell standard library:

foldr:: (a->b->b)->b-> [a]->bfoldr f acc[]=  accfoldr f acc (x:xs)=  f x (foldr f acc xs)

Now we can usefoldr to simplydeclare our intention (the 'what') and don't have to detail the 'how':

sumUp':: [Integer]->IntegersumUp'=foldr(+)0prod':: [Integer]->Integerprod'=foldr(*)1

With the functionsmap andfoldr (orreduce) we have now two very powerful tools at hand that can be used inmany situation where list data has to be processed.

Both functions can even be composed to form yet another very important programming concept:Map/Reduce.In Haskell this operation is provided by the functionfoldMap.

I won't go into details here as it would go beyond the scope of this article, but I'll invite you to read myintroduction to Map/Reduce in Haskell.

Non-strict Evaluation

Now we come to topic that was one of the main drivers for the Haskell designers: they wanted to getaway from the then standard model of strict evaluation.

Non-Strict Evaluation (aka. normal order reduction) has one very important property.

If a lambda expression has a normal form, then normal order reduction will terminate and find that normal form.
Church-Rosser Theorem II

This property does not hold true for other reduction strategies (like applicative order or call-by-value reduction).

This result from mathematical research on thelambda calculusis important as Haskell maintains the semantics of normal order reduction.

The real-world benefits of lazy evaluation include:

Avoid endless loops in certain edge cases
The ability to define control flow (structures) as abstractions instead of primitives.
The ability to define potentially infinite data structures. This allows for more straightforward implementation of some algorithms.

So let's have a closer look at those benefits:

Avoid endless loops

Consider the following example function:

ignoreY::Integer->Integer->IntegerignoreY x y= x

It takes two integer arguments and returns the first one unmodified. The second argument issimply ignored.

In most programming languages both arguments will beevaluated before the function body is executed:they use applicative order reduction aka. eager evaluation or call-by-value semantics.

In Haskell on the other hand it is safe to call the function with a non-terminating expression in the second argument.First we create a non-terminating expressionviciousCircle. Any attempt to evaluate it will result in an endless loop:

-- it's possible to define non-terminating expressions likeviciousCircle::aviciousCircle= viciousCircle

But if we useviciousCircle as second argument to the functionignoreY it will simply be ignored and the first argumentis returned:

-- trying it in GHCi:λ> ignoreY42 viciousCircle42

Define potentially infinite data structures

In thesection on lists we have already metarithmetic sequences like[1..10].

Arithmetic sequences can also be used to define infinite lists of numbers.Here are a few examples:

-- all natural numbersnaturalNumbers= [1..]-- all even numbersevens= [2,4..]-- all odd numbersodds= [1,3..]

Defining those infinite lists is rather easy. But what can we do with them? Are they useful for any purpose? In theviciousCircle example above we have learnt thatdefining that expression is fine but any attempt to evaluate it will result in an infinite loop.

If we try to printnaturalNumbers we will also end up in an infinite loop of integers printed to the screen.

But if we are bit less greedy than asking for all natural numbers everything will be OK.

λ>take10 naturalNumbers[1,2,3,4,5,6,7,8,9,10]λ>take10 evens[2,4,6,8,10,12,14,16,18,20]λ>take10 odds[1,3,5,7,9,11,13,15,17,19]

We can also peek at a specific position in such an infinite list, using the(!!) operator:

λ> odds!!500010001λ> evens!!1000020002

List comprehension

Do you rememberset comprehension notation from your math classes?

As simple example would be the definition of the set of even numbers:

Evens = {i | i = 2n ∧ n ∊ ℕ}

Which can be read as: Evens is defined as the set of alli wherei = 2*n andn is an element of the set of natural numbers.

The Haskelllist comprehension allows us to define - potentially infinite - lists with a similar syntax:

evens'= [2*n| n<- [1..]]

Again we can avoid infinite loops by evaluating only a finite subset ofevens':

λ>take10 evens'[2,4,6,8,10,12,14,16,18,20]

List comprehension can be very useful for defining numerical sets and series in a (mostly) declarative way that comesclose to the original mathematical definitions.

Take for example the setPT of all pythagorean triples

PT = { (a,b,c) | a,b,c ∊ ℕ ∧ a² + b² = c² }

The Haskell definition looks like this:

pt:: [(Natural,Natural,Natural)]pt= [(a,b,c)| c<- [1..],                b<- [1..c],                a<- [1..b],                a^2+ b^2== c^2]

Define control flow structures as abstractions

In most languages it is not possible to define new conditional operations, e.g. your ownmyIf statement.A conditional operation will evaluate some of its arguments only if certain conditions are met.This is very hard - if not impossible - to implement in language with call-by-value semantics which evaluates all function arguments beforeactually evaluating the function body.

As Haskell implements call-by-need semantics, it is possible to define new conditional operations.In fact this is quite helpful when writingdomain specific languages.

Here comes a very simple version ofmyIf:

myIf::Bool->b->b->bmyIf p x y=if pthen xelse y λ> myIf (4>2)"true" viciousCircle"true"

A somewhat more useful control-structure is thecond (for conditional) function that stems from LISP and Scheme languages.It allows you to define a more table-like decision structure, somewhat resembling aswitch statement from C-style languages:

cond:: [(Bool,a)]->acond[]=error"make sure that at least one condition is true"cond ((True,  v):rest)= vcond ((False, _):rest)= cond rest

With this function we can implement a signum functionsign as follows:

sign:: (Orda,Numa)=>a->asign x= cond [(x>0     ,1 )              ,(x<0     ,-1)              ,(otherwise ,0 )]λ> sign51λ> sign00λ> sign (-4)-1

Type Classes

Now we come to one of the most distinguishing features of Haskell:type classes.

In the sectionPolymorphic Data Types we have seen that type variables (or parameters) allowtype declarations to be polymorphic like in:

data [a]= [] | a : [a]

This approach is calledparametric polymorphism and is used in several programming languages.

Type classes on the other hand addressad hoc polymorphism of data types. This approach is also known asoverloading.

To get a first intuition let's start with a simple example.

We would like to be able to use characters (represented by the data typeChar) as if they were numbers.E.g. we would like to be able to things like:

λ>'A'+25'Z'-- please note that in Haskell a string is List of characters: type String = [Char]λ>map (+5)"hello world""mjqqt%|twqi"λ>map (\c-> c-5)"mjqqt%|twqi""hello world"

To enable this we will have tooverload the infix operators(+) and(-) to work not only on numbers but also on characters.Now, let's have a look at the type signature of the(+) operator:

λ>:type(+)(+)::Numa=>a->a->a

So(+) is not just declared to be of type(+) :: a -> a -> a but it contains aconstraint on the type variablea,namelyNum a =>.The whole type signature of(+) can be read as: for all typesa that are members of the type classNum the operator(+) has the typea -> a -> a.

Next we obtain more information on the type classNum:

λ>:infoNumclassNumawhere(+)::a->a->a(-)::a->a->a(*)::a->a->anegate::a->aabs::a->asignum::a->afromInteger::Integer->a  {-#MINIMAL (+), (*), abs, signum, fromInteger, (negate | (-)) #-}-- Defined in `GHC.Num'instanceNumWord-- Defined in `GHC.Num'instanceNumInteger-- Defined in `GHC.Num'instanceNumInt-- Defined in `GHC.Num'instanceNumFloat-- Defined in `GHC.Float'instanceNumDouble-- Defined in `GHC.Float'

This information details what functions a typea has to implement to be used as an instance of theNum type class.The line{-# MINIMAL (+), (*), abs, signum, fromInteger, (negate | (-)) #-} tells us what a minimal complete implementationhas to provide.It also tells us that the typesWord,Integer,Int,Float andDouble are instances of theNum type class.

This is all we need to know to make the typeChar an instance of theNum type class, so without further ado wedive into the implementation (please note thatfromEnum converts aChar into anInt andtoEnum convertsanInt into anChar):

instanceNumCharwhere  a+ b=toEnum (fromEnum a+fromEnum b)  a- b=toEnum (fromEnum a-fromEnum b)  a* b=toEnum (fromEnum a*fromEnum b)abs c= csignum=toEnum.signum.fromEnumfromInteger=toEnum.fromIntegernegate c= c

This piece of code makes the typeChar an instance of theNum type class. We can then use(+) and `(-) as demonstratedabove.

Originally the idea for type classes came up to provide overloading of arithmetic operatorsin order to use the same operators across all numeric types.

But the type classes concept proved to be useful in a variety of other cases as well.This has lead to a rich sets of type classes provided by the Haskell base library anda wealth of programming techniques that make use of this powerful concept.

Here comes a graphic overview of some of the most important type classes in the Haskell base library:

I won't go over all of these but I'll cover some of the most important ones.

Let's start with Eq:

classEqawhere(==),(/=)::a->a->Bool-- Minimal complete definition:--      (==) or (/=)   x/= y=not (x== y)   x== y=not (x/= y)

This definition states two things:

if a typea is to be made an instance of the classEq it must support thefunctions(==) and(/=) both of them having typea -> a -> Bool.
Eq providesdefault definitions for(==) and(/=) in terms of each other.As a consequence, there is no need for a type inEq to provide both definitions -given one of them, the other will work automatically.

Now we can turn some of the data types that we defined in the section onAlgebraic Data Types into instances of theEq type class.

Here the type declarations as a recap:

dataStatus=Green |Yellow |ReddataSeverity=Low |Middle |HighdataPairStatusSeverity=PSSStatusSeverity

First, we create Eq instances for the simple typesStatus andSeverity by defining the(==)operator for each of them:

instanceEqStatuswhereGreen==Green=TrueYellow==Yellow=TrueRed==Red=True  _== _=FalseinstanceEqSeveritywhereLow==Low=TrueMiddle==Middle=TrueHigh==High=True  _== _=False

Next, we create anEq instance forPairStatusSeverity by defining the(==) operator:

instanceEqPairStatusSeveritywhere   (PSS sta1 sev1)== (PSS sta2 sev2)= (sta1== sta2)&& (sev1== sev2)

With these definitions it is now possible to use the(==) and(/=) on our three types.

As you will have noticed, the code for implementingEq is quite boring. Even a machine could do it!

That's why the language designers have provided aderiving mechanism to let the compiler automatically implementtype class instances if it's automatically derivable as in theEq case.

With this syntax it much easier to let a type implement theEq type class:

dataStatus=Green |Yellow |Redderiving (Eq)dataSeverity=Low |Middle |Highderiving (Eq)dataPairStatusSeverity=PSSStatusSeverityderiving (Eq)

This automatic deriving of type class instances works for many cases and reduces a lof of repetitive code.

For example, its possible to automatically derive instances of theOrd type class, which providesordering functionality:

class (Eqa)=>Ordawherecompare::a->a->Ordering(<),(<=),(>),(>=)::a->a->Boolmax,min::a->a->a...

If you are usingderiving for theStatus andSeverity types, the Compiler will implement theordering according to the ordering of the constructors in the type declaration.That isGreen < Yellow < Red andLow < Middle < High:

dataStatus=Green |Yellow |Redderiving (Eq,Ord)dataSeverity=Low |Middle |Highderiving (Eq,Ord)

Read and Show

Two other quite useful type classes areRead andShow that also support automatic deriving.

Show provides a functionshow with the following type signature:

show::Showa=>a->String

This means that any type implementingShow can be converted (ormarshalled) into aString representation.Creation of aShow instance can be achieved by adding aderiving (Show) clause to the type declaration.

dataPairStatusSeverity=PSSStatusSeverityderiving (Show)λ>show (PSSGreenLow)"PSS Green Low"

TheRead type class is used to do the opposite:unmarshalling data from a String with the functionread:

read::Reada=>String->a

This signature says that for any typea implementing theRead type class the functionread canreconstruct an instance ofa from its String representation:

dataPairStatusSeverity=PSSStatusSeverityderiving (Show,Read)dataStatus=Green |Yellow |Redderiving (Show,Read)dataSeverity=Low |Middle |Highderiving (Show,Read)λ> marshalled=show (PSSGreenLow)λ>read marshalled::PairStatusSeverityPSSGreenLow

Please note that it is required to specify the expected target type with the:: PairStatusSeverity clause.Haskell uses static compile time typing. At compile time there is no way to determine which typean expressionread "some string content" will return. Thus the expected type must be specified at compile time.Either by an implicit declaration given by some function type signature, or as in the example above,by an explicit declaration.

Togethershow andread provide a convenient way to serialize (marshal) and deserialize (unmarshal) Haskelldata structures.This mechanism does not provide any optimized binary representation, but it is still good enough formany practical purposes, the format is more compact than JSON, and it does not require a parser library.

Functor and Foldable

The most interesting type classes are those derived from abstract algebra or category theory.Studying them is a very rewarding process that I highly recommend. However, it is definitelybeyond the scope of this article. Thus, I'm only pointing to two resources covering this part of the Haskelltype class hierarchy.The first one is the legendaryTypeclassopedia by Brent Yorgey.The second one isLambda the ultimate Pattern Factory by myself.This text relates the algebraic type classes to software design patterns, and therefore we will only cover some of these type classes.

In the section ondeclarative programming we came across two very useful concepts:

mapping a function over all elements in a list (map :: (a -> b) -> [a] -> [b])
reducing a list with a binary operation and the neutral (identity) element of that operation(foldr :: (a -> b -> b) -> b -> [a] -> b)

These concepts are not only useful for lists, but also for many other data structures. So it doesn't come as asurprise that there are type classes that abstract these concepts.

Functor

TheFunctor type class generalizes the functionality of applying a function to a value in a context without altering the context,(e.g. mapping a function over a list[a] which returns a new list[b] of the same length):

classFunctorfwherefmap:: (a->b)->fa->fb

Let's take a closer look at this idea by playing with a simple binary tree:

dataTreea=Leafa |Node (Treea) (Treea)deriving (Show)-- a simple instance binary tree:statusTree::TreeStatusstatusTree=Node (LeafGreen) (Node (LeafRed) (LeafYellow))-- a function mapping Status to SeveritytoSeverity::Status->SeveritytoSeverityGreen=LowtoSeverityYellow=MiddletoSeverityRed=High

We want to use the functiontoSeverity :: Status -> Severity to convert allStatus elements of thestatusTreeintoSeverity instances.

Therefore, we letTree instantiate theFunctor class:

instanceFunctorTreewherefmap f (Leaf a)=Leaf (f a)fmap f (Node a b)=Node (fmap f a) (fmap f b)

We can now usefmap onTree data structures:

λ>fmap toSeverity statusTreeNode (LeafLow) (Node (LeafHigh) (LeafMiddle))λ>:type itit::TreeSeverity

As already described above, fmap maintains the tree structure unchanged but converts the type of eachLeaf element,which effectively changes the type of the tree toTree Severity.

As derivation ofFunctor instances is a boring task, it is again possible to use thederiving clause tolet data types instantiateFunctor:

{-#LANGUAGE DeriveFunctor #-}-- this pragma allows automatic deriving of Functor instancesdataTreea=Leafa |Node (Treea) (Treea)deriving (Show,Functor)

Foldable

As already mentioned,Foldable provides the ability to performfolding operations on any data type instantiating theFoldable type class:

classFoldabletwherefold::Monoidm=>tm->mfoldMap::Monoidm=> (a->m)->ta->mfoldr:: (a->b->b)->b->ta->bfoldr':: (a->b->b)->b->ta->bfoldl:: (b->a->b)->b->ta->bfoldl':: (b->a->b)->b->ta->bfoldr1:: (a->a->a)->ta->afoldl1:: (a->a->a)->ta->atoList::ta-> [a]null::ta->Boollength::ta->Intelem::Eqa=>a->ta->Boolmaximum::Orda=>ta->aminimum::Orda=>ta->asum::Numa=>ta->aproduct::Numa=>ta->a

besides the abstraction of thefoldr function,Foldable provides several other useful operations when dealing withcontainer-like structures.

Because of the regular structure algebraic data types it is again possible to automatically deriveFoldable instancesby using thederiving clause:

{-#LANGUAGE DeriveFunctor, DeriveFoldable #-}-- allows automatic deriving of Functor and FoldabledataTreea=Leafa |Node (Treea) (Treea)deriving (Eq,Show,Read,Functor,Foldable)

Of course, we can also implement thefoldr function on our own:

instanceFoldableTreewherefoldr f acc (Leaf a)= f a accfoldr f acc (Node a b)=foldr f (foldr f acc b) a

We can now usefoldr and other class methods ofFoldable:

statusTree::TreeStatusstatusTree=Node (LeafGreen) (Node (LeafRed) (LeafYellow))maxStatus=foldrmaxGreen statusTreemaxStatus'=maximum statusTree-- using length from Foldable type classtreeSize=length statusTree-- in GHCi:λ>:tmaxmax::Orda=>a->a->aλ>foldrmaxGreen statusTreeRed-- using maximum from Foldable type class:λ>maximum statusTreeRedλ> treeSize3-- using toList from Foldable type class:λ> toList statusTree[Green,Red,Yellow]

The Maybe Monad

Now we will take the data typeMaybe as an example to dive deeper into the more complex parts of theHaskell type class system.

TheMaybe type is quite simple, it can be either a null value, calledNothing or a value of typeaconstructed byJust a:

dataMaybea=Nothing |Justaderiving (Eq,Ord)

The Maybe type is helpful in situations where certain operationmay return a valid result.Take for instance the functionlookup from the Haskell base library. It looks up a key in a list ofkey-value pairs. If it finds the key, the associated valueval is returned - but wrapped in a Maybe:Just val.If it doesn't find the key,Nothing is returned:

lookup:: (Eqa)=>a-> [(a,b)]->Maybeblookup _key[]=Nothinglookup  key ((k,val):rest)| key== k=Just val|otherwise=lookup key rest

TheMaybe type is a simple way to avoid NullPointer errors or similar issues with undefined results.Thus, many languages have adopted it under different names. In Java for instance, it is calledOptional.

Total functions

In Haskell, it is considered good practise to usetotal functions - that is functions that have definedreturn values for all possible input values - where ever possible to avoid runtime errors.

Typical examples forpartial (i.e. non-total) functions are division and square roots.We can useMaybe to make them total:

safeDiv:: (Eqa,Fractionala)=>a->a->MaybeasafeDiv _0=NothingsafeDiv x y=Just (x/ y)safeRoot:: (Orda,Floatinga)=>a->MaybeasafeRoot x| x<0=Nothing|otherwise=Just (sqrt x)

In fact, there are alternative base libraries that don't provide any partial functions.

Composition of Maybe operations

Now let's consider a situation where we want to combine several of those functions.Say for example we first want to lookup the divisor from a key-value table, then perform adivision with it and finally compute the square root of the quotient:

findDivRoot::Double->String-> [(String,Double)]->MaybeDoublefindDivRoot x keymap=caselookup keymapofNothing->NothingJust y->case safeDiv x yofNothing->NothingJust d->case safeRoot dofNothing->NothingJust r->Just r-- and then in GHCi:λ> findDivRoot27"val" [("val",3)]Just3.0λ> findDivRoot27"val" [("val",0)]Nothingλ> findDivRoot27"val" [("val",-3)]Nothing

The resulting control flow is depicted in the following diagram, which was inspired by theRailroad Oriented Programming presentation:

In each single step we have to check forNothing, in that case we directly short circuit to an overallNothing result value.In theJust case we proceed to the next processing step.

This kind of handling is repetitive and buries the actual intention under a lot of boilerplate.As Haskell uses layout (i.e. indentation) instead of curly brackets to separate blocks the code willend up in what is called thedreaded staircase: it marches to the right of the screen.

So we are looking for a way to improve the code by abstracting away the chaining of functions that returnMaybe values and providing a way toshort circuit theNothing cases.

We need an operatorandThen that takes theMaybe result of a first functionapplication as first argument, and a function as second argument that will be used in theJust x case and againreturns aMaybe result.In case that the input isNothing the operator will directly returnNothing without any further processing.In case that the input isJust x the operator will apply the argument functionfun tox and return its result:

andThen::Maybea-> (a->Maybeb)->MaybebandThenNothing _fun=NothingandThen (Just x) fun= fun x

We can then rewritefindDivRoot as follows:

findDivRoot'''' x keymap=lookup keymap`andThen`\y->  safeDiv x y`andThen`\d->  safeRoot d

(Side note: In Java theOptional type has a corresponding method:Optional.flatmap)

This kind of chaining of functions in the context of a specific data type is quite common. So, it doesn't surprise us thatthere exists an even more abstractandThen operator that works for arbitrary parameterized data types:

(>>=)::Monadm=>ma-> (a->mb)->mb

When we compare thisbind operator with the type signature of theandThen operator:

andThen::Maybea-> (a->Maybeb)->Maybeb

We can see that both operators bear the same structure.The only difference is that instead of the concrete typeMaybe the signature of(>>=)uses a type variablem with aMonad type class constraint. We can read this type signature as:

For any typem of the type classMonad the operator(>>=) is defined asm a -> (a -> m b) -> m bBased on(>>=) we can rewrite thefindDivRoot function as follows:

findDivRoot' x keymap=lookup keymap>>=\y->  safeDiv x y>>=\d->  safeRoot d

Monads are a central element of the Haskell type class ecosystem. In fact the monadic composition based on(>>=) is sofrequently used that there exists some specific syntactic sugar for it. It's called the do-Notation.Using do-NotationfindDivRoot looks like this:

findDivRoot''' x keymap=do  y<-lookup keymap  d<- safeDiv x y  safeRoot d

This looks quite like a sequence of statements (including variable assignments) in an imperative language.Due to this similarity Monads have been aptly calledprogrammable semicolons.But as we have seen: below the syntactic sugar it's a purely functional composition!

Purity

A function is called pure if it corresponds to a function in the mathematical sense: it associates each possible inputvalue with an output value, and does nothing else. In particular,

it has no side effects, that is to say, invoking it produces no observable effect other than the result it returns;it cannot also e.g. write to disk, or print to a screen.
it does not depend on anything other than its parameters, so when invoked in a different context or at a differenttime with the same arguments, it will produce the same result.

Purity makes it easy to reason about code, as it is so close to mathematical calculus.The properties of a Haskell program can thus often be determined with equational reasoning.(As an example I have provided anexample for equational reasoning in Haskell).

Purity also improves testability: It is much easier to set up tests without worrying about mocks or stubs to factor outaccess to backend layers.

All the functions that we have seen so far are allpure code that is free from side effects.

So how can we achieve side effects like writing to a database or serving HTTP requests in Haskell?

The Haskell language designers came up with a solution that distinguishes Haskell from most other languages:Side effects are always explicitly declared in the function type signature.In the next section we will learn how exactly this works.

Explicit side effects with the IO Monad

Monadic I/O is a clever trick for encapsulating sequential, imperative computation, so that it can “do no evil”to the part that really does have precise semantics and good compositional properties.
Conal Elliott

The most prominent Haskell Monad is theIO monad. It is used to compose operations that perform I/O.We'll study this with a simple example.

In an imperative language, reading a String from the console simply returns a String value (e.g.BufferedReader.readline() in Java:public String readLine() throws IOException).

In Haskell the functiongetLine does not return aString value but anIO String:

getLine::IOString

This could be interpreted as:getLine returns a String in an IO context.In Haskell, it is not possible to extract the String value from its IO context (In Java on the other hand you could alwayscatch away theIOException).

So how can we use the result ofgetLine in a function that takes aString value as input parameter?

We need the monadic bind operation(>>=) to do this in the same as we already saw in theMaybe monad:

-- convert a string to upper casestrToUpper::String->StringstrToUpper=map toUpperup::IO() up=getLine>>=\str->print (strToUpper str)-- and then in GHCi:λ>:tprintprint::Showa=>a->IO()λ> uphello world"HELLO WORLD"

or with do-Notation:

up'::IO() up'=do  str<-getLineprint (strToUpper str)

Making side effects explicit in function type signatures is one of the most outstanding achievements of Haskell.This feature will lead to a very rigid distinction between code that is free of side effects (akapure code) and codethat has side effects (akaimpure code).

Keeping domain logicpure - particularly when working only withtotal functions - will dramatically improvereliability and testability as tests can be run without setting up mocks or stubbed backends.

It's not possible to introduce side effects without making them explicit in type signatures.There is nothing like theinvisible JavaRuntimeExceptions.So you can rely on the compiler to detect any violations of a rule like "No impure code in domain logic".

I've written a simple Restaurant Booking REST Service API that explains how Haskell helps you to keep domain logic pure byorganizing your code according to theports and adapters pattern.

The section on type classes (and on Monads in particular) have been quite lengthy. Yet, they have hardly shown more thanthe tip of the iceberg. If you want to dive deeper into type classes, I recommendThe Typeclassopedia.

Conclusion

We have covered quite a bit of terrain in the course of this article.

It may seem that Haskell has invented an intimidating mass of programming concepts.But in fact, Haskell inherits much from earlier functional programming languages.

Features like first class functions, comprehensive list APIs or declarative programminghad already been introduced with Lisp and Scheme.

Several others, like pattern matching, non-strict evaluation, immutability, purity, static and strong typing,type inference, algebraic data types and polymorphic data typeshave been invented in languages like Hope, Miranda and ML.

Only a few features like type classes and explicit side effects / monadic I/O were first introduced in Haskell.

So if you already know some functional language concepts, Haskell shouldn't seem too alien to you.For developers with a background in OO languages, the conceptual gap will be much larger.

I hope that this article helped to bridge that gap a bit and to better explainwhyfunctional programming - and Haskell in particular - matters.

Using functional programming languages - or applying some of their techniques - will helpto create designs that are closer to the problem domain (as intented by domain driven design),more readable (due to their declarative character), allow equational reasoning, will provide more rigidseparation of business logic and side effects,are more flexible for future changes or extensions, provide better testability (supporting BDD, TDD and property based testing),will need much less debugging, are better to maintain and, last but not least, will be more fun to write.

About

Languages

Haskell100.0%

Movatterモバイル変換

License

thma/WhyHaskellMatters

Folders and files

Latest commit

History

Repository files navigation

Why Haskell Matters

Abstract

Table of contents

Introduction

A strange development over time

Being used versus being discussed

So why does Haskell keep being a hot topic in the software development community?

Functions are First-class

Functions can be assigned to variables exactly as any other values

Support for anonymous functions

Functions can be returned as values from other functions

Function composition

Currying and Partial Application

Functions can be passed as arguments to other functions

Pattern matching

Algebraic Data Types

Polymorphic Data Types

Lists

Arithmetic sequences

Immutability

Declarative programming

Mapping

Folding

Non-strict Evaluation

Avoid endless loops

Define potentially infinite data structures

List comprehension

Define control flow structures as abstractions

Type Classes

Read and Show

Functor and Foldable

Functor

Foldable

The Maybe Monad

Total functions

Composition of Maybe operations

Purity

Explicit side effects with the IO Monad

Conclusion

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages0

Uh oh!

Contributors5

Uh oh!

Languages

Packages