Parser-combinators library.

besok/parsit

Description

This library provides a very simple and lightweight parser (recursive descent, LL(1)) to combine rules and express a grammar.

The library uses Logos as a lexical analyzer and tokenizer.

The premise

The main motivations for this library were:

  • lightweight : very small and does not require a deep dive
  • transparency : literally 3 structs with a handful of methods
  • speed : good speed (thanks to Logos)

The steps to implement

Create a set of tokens using Logos

Add logos to the dependencies

    logos = "*"

    use logos::Logos;

    #[derive(Logos, Debug, PartialEq)]
    enum Token {
        // Tokens can be literal strings, of any length.
        #[token("fast")]
        Fast,

        #[token(".")]
        Period,

        // Or regular expressions.
        #[regex("[a-zA-Z]+")]
        Text,

        // Logos requires one token variant to handle errors,
        // it can be named anything you wish.
        #[error]
        // We can also use this variant to define whitespace,
        // or any other matches we wish to skip.
        #[regex(r"[ \t\n\f]+", logos::skip)]
        Error,
    }

Create a parser that will be able to parse the given set of tokens

The library provides a Parsit<'a, T> instance that encompasses the set of tokens and the auxiliary methods.

    struct Parser<'a> {
        inner: Parsit<'a, Token<'a>>,
    }

Implement the parsing functions using the Parsit instance and the auxiliary methods from Step

The helpers:

  • the token! macro, which simplifies comparing and matching single tokens
  • the methods then, then_zip and others from Step
  • the methods one_or_more and zero_or_more from Parsit

Transform the result into Result<Structure, ParseError<'a>>

    fn text(&self, pos: usize) -> Result<Vec<Sentence<'a>>, ParseError<'a>> {
        self.inner.zero_or_more(pos, |p| self.sentence(p)).into()
    }
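
The final `.into()` turns a step into a standard `Result`. The sketch below shows one way such a conversion can be wired with `From`. It is a standalone model, not parsit's code: `Step`, the `ParseError` variants and the `finish` helper are simplified or hypothetical names, and the real types also carry a lifetime.

```rust
// Simplified stand-ins; parsit's real Step and ParseError are richer.
#[derive(Debug, PartialEq)]
enum Step<T> {
    Success(T, usize), // parsed value and the next position to read
    Fail(usize),       // the rule did not match at this position
    Error(ParseError), // an unrecoverable error
}

#[derive(Debug, PartialEq)]
enum ParseError {
    FailedOnPosition(usize), // hypothetical variant
    Message(String),         // hypothetical variant
}

// Step is a local type, so this impl is allowed even though
// both From and Result come from the standard library.
impl<T> From<Step<T>> for Result<T, ParseError> {
    fn from(step: Step<T>) -> Self {
        match step {
            Step::Success(value, _) => Ok(value),
            Step::Fail(pos) => Err(ParseError::FailedOnPosition(pos)),
            Step::Error(e) => Err(e),
        }
    }
}

// Hypothetical helper that plays the role of the trailing `.into()`.
fn finish<T>(step: Step<T>) -> Result<T, ParseError> {
    step.into()
}
```

With this in place, a top-level rule can build its result with combinators and convert once at the end, so callers only see `Result`.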

Complete example

    use crate::parser::Parsit;
    use crate::token;
    use crate::step::Step;
    use crate::parser::EmptyToken;
    use crate::error::ParseError;
    use logos::Logos;

    #[derive(Logos, Debug, Copy, Clone, PartialEq)]
    pub enum Token<'a> {
        #[regex(r"[a-zA-Z-]+")]
        Word(&'a str),

        #[token(",")]
        Comma,
        #[token(".")]
        Dot,

        #[token("!")]
        Bang,
        #[token("?")]
        Question,

        #[regex(r"[ \t\r\n\u000C\f]+", logos::skip)]
        Whitespace,
        #[error]
        Error,
    }

    #[derive(Debug, Copy, Clone, PartialEq)]
    enum Item<'a> {
        Word(&'a str),
        Comma,
    }

    #[derive(Debug, Clone, PartialEq)]
    enum Sentence<'a> {
        Sentence(Vec<Item<'a>>),
        Question(Vec<Item<'a>>),
        Exclamation(Vec<Item<'a>>),
    }

    struct Parser<'a> {
        inner: Parsit<'a, Token<'a>>,
    }

    impl<'a> Parser<'a> {
        fn new(text: &'a str) -> Parser<'a> {
            let delegate: Parsit<Token> = Parsit::new(text).unwrap();
            Parser { inner: delegate }
        }

        fn sentence(&self, pos: usize) -> Step<'a, Sentence<'a>> {
            let items = |p| self.inner.one_or_more(p, |p| self.word(p));

            let sentence = |p| items(p)
                .then_zip(|p| token!(self.inner.token(p) => Token::Dot))
                .take_left()
                .map(Sentence::Sentence);

            let exclamation = |p| items(p)
                .then_zip(|p| token!(self.inner.token(p) => Token::Bang))
                .take_left()
                .map(Sentence::Exclamation);

            let question = |p| items(p)
                .then_zip(|p| token!(self.inner.token(p) => Token::Question))
                .take_left()
                .map(Sentence::Question);

            sentence(pos)
                .or_from(pos)
                .or(exclamation)
                .or(question)
                .into()
        }

        fn word(&self, pos: usize) -> Step<'a, Item<'a>> {
            token!(self.inner.token(pos) =>
                Token::Word(v) => Item::Word(v),
                Token::Comma => Item::Comma
            )
        }

        fn text(&self, pos: usize) -> Result<Vec<Sentence<'a>>, ParseError<'a>> {
            self.inner.zero_or_more(pos, |p| self.sentence(p)).into()
        }
    }

    #[test]
    fn test() {
        let parser = Parser::new(r#"
            I have a strange addiction,
            It often sets off sparks!
            I really cannot seem to stop,
            Using exclamation marks!
            Anyone heard of the interrobang?
            The poem is for kids.
        "#);

        let result = parser.text(0).unwrap();
        println!("{:?}", result);
    }

The base auxiliary methods

On parser

  • token - pulls out the current token
  • one_or_more - applies a rule one or more times
  • zero_or_more - applies a rule zero or more times
  • validate_eof - ensures the parser has reached the end of the input
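
To make the repetition semantics concrete, here is a minimal standalone sketch. It does not use parsit: `Step`, `zero_or_more`, `one_or_more` and the `number` rule are simplified stand-ins written for illustration, and the library's real signatures may differ.

```rust
// Simplified stand-in for a parsing step; NOT parsit's actual type.
#[derive(Debug, PartialEq)]
enum Step<T> {
    Success(T, usize), // parsed value and the next position to read
    Fail(usize),       // the rule did not match at this position
}

// Applies `rule` repeatedly starting at `pos`. It never fails:
// zero matches produce an empty list at the same position.
fn zero_or_more<T>(mut pos: usize, rule: impl Fn(usize) -> Step<T>) -> Step<Vec<T>> {
    let mut items = Vec::new();
    while let Step::Success(value, next) = rule(pos) {
        items.push(value);
        pos = next;
    }
    Step::Success(items, pos)
}

// Same as above, but fails when the rule does not match even once.
fn one_or_more<T>(pos: usize, rule: impl Fn(usize) -> Step<T>) -> Step<Vec<T>> {
    match zero_or_more(pos, rule) {
        Step::Success(items, _) if items.is_empty() => Step::Fail(pos),
        other => other,
    }
}

// Hypothetical rule: matches a token that parses as a u32.
fn number(tokens: &[&str], pos: usize) -> Step<u32> {
    match tokens.get(pos).and_then(|t| t.parse().ok()) {
        Some(n) => Step::Success(n, pos + 1),
        None => Step::Fail(pos),
    }
}
```

For the tokens `["1", "2", "x"]`, `zero_or_more(0, |p| number(&tokens, p))` collects `1` and `2` and stops in front of `x`, while `one_or_more` started at `x` fails without consuming anything.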

Macros

  • token! - parses the current token. Typical usage: token!(p.token(pos) => T::Bang => "!")
  • wrap! - implements a simple grammar pattern like left value right, for instance [1,2,3] or (a,b)
    • can handle a default value, like wrap!(0 => left; value or default; right)
    • can handle an optional value, like wrap!(0 => left; value ?; right)
  • seq! - implements a simple sequence pattern like el sep el ..., for instance 1,2,3
    • can have a , at the end, signaling that the separator may appear at the end of the sequence, like 1,2,3 (,)?
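
The two patterns behind wrap! and seq! can be spelled out as plain functions. The sketch below is not the expansion of either macro; `seq`, `wrapped_seq` and `is_number` are hypothetical helpers over a slice of string tokens, shown only to illustrate `left value right` and `el sep el ...` with an optional trailing separator.

```rust
// Parses `el sep el ...` starting at `pos`. Returns the elements and the
// next position, or None when not even one element matches.
fn seq<'t>(
    tokens: &[&'t str],
    pos: usize,
    is_elem: impl Fn(&str) -> bool,
    sep: &str,
    trailing_sep: bool,
) -> Option<(Vec<&'t str>, usize)> {
    let first = *tokens.get(pos).filter(|t| is_elem(t))?;
    let mut items = vec![first];
    let mut next = pos + 1;
    // Every further element must be introduced by a separator.
    while tokens.get(next) == Some(&sep) {
        match tokens.get(next + 1) {
            Some(t) if is_elem(t) => {
                items.push(*t);
                next += 2;
            }
            // A separator with no element after it: consume it only if allowed.
            _ => {
                if trailing_sep {
                    next += 1;
                }
                break;
            }
        }
    }
    Some((items, next))
}

// Hypothetical element test: a non-empty run of ASCII digits.
fn is_number(t: &str) -> bool {
    !t.is_empty() && t.chars().all(|c| c.is_ascii_digit())
}

// Parses `left value right` where the value is a sequence, e.g. `[ 1 , 2 , 3 ]`.
fn wrapped_seq<'t>(tokens: &[&'t str], pos: usize) -> Option<(Vec<&'t str>, usize)> {
    if *tokens.get(pos)? != "[" {
        return None;
    }
    let (items, next) = seq(tokens, pos + 1, is_number, ",", true)?;
    if *tokens.get(next)? != "]" {
        return None;
    }
    Some((items, next + 1))
}
```

With `trailing_sep` set to `false`, a dangling separator is left unconsumed, so an enclosing rule that expects the closing bracket right after the last element would fail on it.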

On step

To alternate
  • or - gives an alternative within a horizon of one token
  • or_from - gives a backtracking option
To combine
  • then - combines with the next rule, dropping the current result
  • then_zip - combines the current result and the next one into a pair
  • then_skip - parses the next one but drops its result, keeping only the current one
  • then_or_none - combines the next one, as an option, with the current one, or returns none otherwise
To collect
  • take_left - drops the right value from a pair
  • take_right - drops the left value from a pair
  • merge - merges a value into a list
  • to_map - transforms a list of pairs into a map
To transform
  • or_val - replaces a value with a default value if it is not present
  • or_none - replaces a value with none if it is not present
To work with value
  • ok - transforms a value into an option
  • error - transforms an error into an option
  • map - transforms a value
  • combine - combines a value with another value from a given step
  • validate - validates a given value and turns it into an error if the validation fails
To print
  • print - prints a step
  • print_with - prints a step with a given prefix
  • print_as - prints a step with a transformation of the value
  • print_with_as - prints a step with a transformation of the value and a given prefix
  • parsit.env - prints the position and its environment from the source text (within a radius of 3 tokens)
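
The difference between or and or_from is easiest to see on a rule that spans several tokens. The sketch below is one plausible reading of the descriptions above, written as a standalone model rather than taken from the library. It assumes that a failure carries the position where it happened, that `or` retries from that position, and that `or_from` retries from an explicitly remembered start. In the library, or_from takes only the position and is followed by or (as in the complete example); the sketch folds both into one call for brevity.

```rust
// Simplified stand-in for a parsing step; NOT parsit's actual type.
#[derive(Debug, PartialEq)]
enum Step<T> {
    Success(T, usize), // parsed value and the next position to read
    Fail(usize),       // position where the rule stopped matching
}

impl<T> Step<T> {
    fn map<U>(self, f: impl FnOnce(T) -> U) -> Step<U> {
        match self {
            Step::Success(v, pos) => Step::Success(f(v), pos),
            Step::Fail(at) => Step::Fail(at),
        }
    }
    // Runs `next` after a success and pairs both values.
    fn then_zip<U>(self, next: impl FnOnce(usize) -> Step<U>) -> Step<(T, U)> {
        match self {
            Step::Success(left, pos) => next(pos).map(|right| (left, right)),
            Step::Fail(at) => Step::Fail(at),
        }
    }
    // Tries `alt` at the position where the failure was reported.
    fn or(self, alt: impl FnOnce(usize) -> Step<T>) -> Step<T> {
        match self {
            Step::Fail(at) => alt(at),
            ok => ok,
        }
    }
    // Tries `alt` from an explicit earlier position (backtracking).
    fn or_from(self, start: usize, alt: impl FnOnce(usize) -> Step<T>) -> Step<T> {
        match self {
            Step::Fail(_) => alt(start),
            ok => ok,
        }
    }
}

impl<L, R> Step<(L, R)> {
    // Keeps the left value of a pair and drops the right one.
    fn take_left(self) -> Step<L> {
        self.map(|(l, _)| l)
    }
}

// Hypothetical rules over string tokens.
fn word<'t>(tokens: &[&'t str], pos: usize) -> Step<&'t str> {
    match tokens.get(pos) {
        Some(t) if t.chars().all(char::is_alphabetic) => Step::Success(*t, pos + 1),
        _ => Step::Fail(pos),
    }
}

fn tok(tokens: &[&str], pos: usize, expected: &str) -> Step<()> {
    if tokens.get(pos) == Some(&expected) {
        Step::Success((), pos + 1)
    } else {
        Step::Fail(pos)
    }
}

// `word word end`, keeping only the two words.
fn phrase<'t>(tokens: &[&'t str], pos: usize, end: &str) -> Step<(&'t str, &'t str)> {
    word(tokens, pos)
        .then_zip(|p| word(tokens, p))
        .then_zip(|p| tok(tokens, p, end))
        .take_left()
}
```

For the tokens `["hello", "world", "!"]`, the first alternative `phrase(.., ".")` fails on the third token. In this model a plain `or` retries the second alternative from that third token and fails again, while `or_from(0, ..)` restarts from the beginning and succeeds.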

Testing

Lexer

To test a lexer, use the helper methods from crate::parsit::test::lexer_test::*

    use logos::Logos;
    use crate::parsit::test::lexer_test::*;

    #[derive(Logos, Debug, PartialEq)]
    pub enum T<'a> {
        #[regex(r"[a-zA-Z-]+")]
        Word(&'a str),

        #[token(",")]
        Comma,
        #[token(".")]
        Dot,

        #[token("!")]
        Bang,
        #[token("?")]
        Question,

        #[regex(r"[ \t\r\n]+", logos::skip)]
        Whitespace,
        #[error]
        Error,
    }

    #[test]
    fn test() {
        expect::<T>("abc, bcs!", vec![T::Word("abc"), T::Comma, T::Word("bcs"), T::Bang]);
        expect_succeed::<T>("abc, bcs!");
        expect_failed::<T>("abc, bcs >> !");
        expect_failed_with::<T, _>("abc, bcs > !", |e| e.is_bad_token_on(">"));
    }

Parser

To test a parser, use the helper methods from crate::parsit::test::parser_test::*

  • expect : expect to parse a given value
  • expect_or_env : expect to parse a given value otherwise it will print an env (parsit.env)
  • expect_pos : expect to parse and get a cursor on a given pos
  • expect_pos_or_env : expect to parse and get a cursor on a given pos otherwise it will print an env (parsit.env)
  • fail : should fail parsing
  • fail_on : should fail parsing on a given position

    use logos::Logos;
    // Glob import: the tests below call expect, expect_or_env, expect_pos,
    // expect_pos_or_env, fail, fail_on and parsit from this module.
    use crate::parsit::test::parser_test::*;
    use crate::parsit::token;
    use crate::parsit::parser::Parsit;
    use crate::parsit::step::Step;

    #[derive(Logos, Debug, PartialEq)]
    pub enum T<'a> {
        #[regex(r"[a-zA-Z-]+")]
        Word(&'a str),

        #[token(",")]
        Comma,
        #[token(".")]
        Dot,

        #[token("!")]
        Bang,
        #[token("?")]
        Question,

        #[regex(r"[ \t\r\n]+", logos::skip)]
        Whitespace,
        #[error]
        Error,
    }

    #[test]
    fn test_expect() {
        let p = parsit("abc!");
        let bang = |pos: usize| token!(p.token(pos) => T::Bang => "!");
        let word = |pos: usize| token!(p.token(pos) => T::Word(v) => *v);
        let step = word(0)
            .then_or_val_zip(bang, "")
            .map(|(a, b)| format!("{}{}", a, b));

        expect(step, "abc!".to_string());
    }

    #[test]
    fn test_expect_or_env() {
        let p = parsit("abc!");
        let bang = |pos: usize| token!(p.token(pos) => T::Bang => "!");
        let word = |pos: usize| token!(p.token(pos) => T::Word(v) => *v);
        let step = word(0)
            .then_or_val_zip(bang, "")
            .map(|(a, b)| format!("{}{}", a, b));

        expect_or_env(p, step, "abc!".to_string());
    }

    #[test]
    fn test_pos() {
        let p = parsit("abc!");
        let bang = |pos: usize| token!(p.token(pos) => T::Bang => "!");
        let word = |pos: usize| token!(p.token(pos) => T::Word(v) => v);
        let step = word(0).then_or_val_zip(bang, "");

        expect_pos(step, 2); // the next position to parse
    }

    #[test]
    fn test_pos_or_env() {
        let p = parsit("abc!");
        let bang = |pos: usize| token!(p.token(pos) => T::Bang => "!");
        let word = |pos: usize| token!(p.token(pos) => T::Word(v) => v);
        let step = word(0).then_or_val_zip(bang, "");

        expect_pos_or_env(p, step, 2); // the next position to parse
    }

    #[test]
    fn test_fail() {
        let p = parsit("abc?!");
        let bang = |pos: usize| token!(p.token(pos) => T::Bang => "!");
        let word = |pos: usize| token!(p.token(pos) => T::Word(v) => v);
        let step = word(0).then_zip(bang);

        fail(step);
    }

    #[test]
    fn test_fail_on() {
        let p = parsit("abc?!");
        let bang = |pos: usize| token!(p.token(pos) => T::Bang => "!");
        let word = |pos: usize| token!(p.token(pos) => T::Word(v) => v);
        let step = word(0).then_zip(bang);

        fail_on(step, 1);
    }
