Return to Question

Tweetedtwitter.com/StackCodeReview/status/1440964165133275136

occurredSep 23, 2021 at 9:00

added 233 characters in body

editedSep 8, 2021 at 12:49

I recently wrote code to parse lambda expressions written in a lisp like format, using a small hand-rolled parsing library.

I do not implementThe code does a single pass without any sort of tokenizer since. This might be a little odd, but because the domain I think that such would make the code more complex, and we have very rigid white-space rules.

Isdo notation a good choice given my goals?
Was it a good choice toforgo tokenizinguse a single pass method? Might it be more presentable to parse into a tree and then verify certain properties of the tree?
Are there any subtle Haskell-isms that I have that could easily be removed to make the more approachable to a general audience?

of course keeping my initial goals in mind.

I recently wrote code to parse lambda expressions in a lisp like format, using a small hand-rolled parsing library.

I do not implement any sort of tokenizer since I think that such would make the code more complex, and we have very rigid white-space rules.

Isdo notation a good choice?
Was it a good choice toforgo tokenizing?
Are there any subtle Haskell-isms that I have that could easily be removed to make the more approachable to a general audience?

I recently wrote code to parse lambda expressions written in a lisp like format, using a small hand-rolled parsing library.

The code does a single pass without any tokenizer. This might be a little odd, but because the domain I think that such would make the code more complex, and we have very rigid white-space rules.

Isdo notation a good choice given my goals?
Was it a good choice touse a single pass method? Might it be more presentable to parse into a tree and then verify certain properties of the tree?
Are there any subtle Haskell-isms that I have that could easily be removed to make the more approachable to a general audience?

of course keeping my initial goals in mind.

Source Link

askedAug 20, 2021 at 18:16

Sriotchilism O'Zaic

askedAug 20, 2021 at 18:16

Sriotchilism O'Zaic

Small combinator parsing library in Haskell

I recently wrote code to parse lambda expressions in a lisp like format, using a small hand-rolled parsing library.

The goal

This code was written as a part ofthis code-golf challenge not as code-golf, but as a demo for the concept. My goals for this were

It should be simple enough that by looking over it one should be able to understand what is being parsed.
It should be self-contained so that if someone wants to dig into the nitty-gritty and understand how it works they can.

I chose Haskell for this because:

In my experience imperative parsers are more difficult to understand than combinator parsers.
In my experience combinator parsing has a smaller footprint since it can assemble complex parsers from simple components.
In my experience it is a joy to write and read parsers in Haskell.

The code then consists of a minimalist "library" of simple helper functions, and the definition of the actual parser. If it is important, you should be able to see exactly how it is presented via the link above.

I'd like to get an idea of how well I achieved these goals and what steps could be taken to better present the idea through code.

Expressions

The goal is to verify that a string is a properly formatted lambda expression, consisting of

abs (abstraction) to introduce a new variable.
var (variable) to use a variable.
app (application) to apply one lambda to another.

formatted as a lisp list.

For example

(abs x (app (var x) (var x)))

Is a valid expression representing\$\lambda x. x x\$.

Additionally we want to verify the expression doesn't use undeclared variables and doesn't shadow any variables.

(abs y (var x))

is not valid becausex is not declared. And

(abs y (abs y (abs x (var x))))

is not valid becausey is shadowed.

A precise grammar for the valid expressions is given

\$S\left(C\right) := \left\{\begin{array}[rr] \\color{green}{\texttt{(var }}x\color{green}{\texttt{)}} & \textrm{ where } x\in C \\\color{green}{\texttt{(abs }} x\texttt{ }S(C\cup\{x\})\color{green}{\texttt{)}} & \textrm{ where } x\notin C\\\color{green}{\texttt{(app }} S(C)\texttt{ }S(C)\color{green}{\texttt{)}} &\end{array}\right.\$

Where valid expressions are strings spanned by this grammar starting from\$S(\{\})\$.

There are two extra things to note:

The whitespace is intentionally inflexible in this specification.
Our Parse is not producing a parse tree, only confirming whether an expression meets the specifications.

Library

import Control.Applicative  ( liftA2  , Alternative (..)  )import Control.Monad  ( guard  )newtype Parser a =  Parser (String -> [(String, a)])apply :: Parser a -> String -> [(String, a)]apply (Parser p) s =  p s instance Functor Parser where  fmap func (Parser p) =    Parser $ fmap (map (fmap func)) pinstance Applicative Parser where  pure x =    Parser ( \ string -> [(string, x)] )  liftA2 f (Parser p1) (Parser p2) =    Parser      ( \ string -> do        (rem1, res1) <- p1 string        (rem2, res2) <- p2 rem1        return (rem2, f res1 res2)      )instance Monad Parser where  Parser p >>= f =    Parser      ( \ string -> do        (rem1, res1) <- p string        apply (f res1) rem1      )instance Alternative Parser where  empty =    Parser ( \ _ -> [] )    Parser p1 <|> Parser p2 =    Parser $ ( \ string -> p1 string ++ p2 string )charBy :: (Char -> Bool) -> Parser CharcharBy predicate =  Parser go  where    go (a : b) =      [ (b, a) | predicate a ]    go [] =      []char :: Char -> Parser Charchar x =  charBy (x==)string :: String -> Parser Stringstring =  traverse charchoice :: [Parser a] -> Parser achoice =  foldr (<|>) empty

Parser

lambda :: [String] -> Parser ()lambda variables =  do    char '('    choice      [ do        string "abs "        name <- some $ charBy (`elem` ['a'..'z'])        guard $ notElem name variables        char ' '        lambda (name : variables)      , do        string "var "        choice (map string variables)        return ()      , do        string "app "        lambda variables        char ' '        lambda variables      ]    char ')'    return ()

Description

This uses monadic parsing to get the job done. I build a parser type implement the necessary type classes and a couple of helper functions. The actual parser follows the same structure as the given grammar.

I do not implement any sort of tokenizer since I think that such would make the code more complex, and we have very rigid white-space rules.

Review

I welcome all feedback on how to make this code more presentable. However the specific questions I would like to prompt are:

Isdo notation a good choice?
Was it a good choice to forgo tokenizing?
Are there any subtle Haskell-isms that I have that could easily be removed to make the more approachable to a general audience?

lang-hs

Movatterモバイル変換

Stack Exchange Network