Description
Consider:

    use regex_syntax::ast::parse::ParserBuilder;

    fn main() {
        let parse = |pattern| {
            ParserBuilder::new()
                .ignore_whitespace(true)
                .build()
                .parse_with_comments(pattern)
                .unwrap()
        };
        let wc_1 = parse("a #c\n|b");
        let wc_2 = parse("a|#c\n b");
        // This assertion fails: both patterns produce the same output.
        assert_ne!(wc_1, wc_2);
    }

The comment `#c` is attached to different alternatives in the two regexes, but the parse output is identical for both:

    WithComments {
        ast: Alternation(Alternation {
            span: Span(Position(o: 0, l: 1, c: 1), Position(o: 7, l: 2, c: 3)),
            asts: [
                Literal(Literal {
                    span: Span(Position(o: 0, l: 1, c: 1), Position(o: 1, l: 1, c: 2)),
                    kind: Verbatim,
                    c: 'a',
                }),
                Literal(Literal {
                    span: Span(Position(o: 6, l: 2, c: 2), Position(o: 7, l: 2, c: 3)),
                    kind: Verbatim,
                    c: 'b',
                }),
            ],
        }),
        comments: [
            Comment {
                span: Span(Position(o: 2, l: 1, c: 3), Position(o: 5, l: 2, c: 1)),
                comment: "c",
            },
        ],
    }

Without the span of the `|` punctuation, the output of `parse_with_comments()` alone cannot tell us whether the comment belongs to `a` or to `b`. We have to refer back to the original pattern, at which point it is perhaps easier to just write the parser ourselves 🤷
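
For reference, the "refer back to the original pattern" workaround could look roughly like the sketch below. It uses only the standard library and takes plain byte offsets, which are assumed to be copied out of the `Span`s shown above. `Side`, `find_bar` and `comment_side` are hypothetical helper names, not part of `regex_syntax`, and the sketch assumes the gap between two adjacent alternatives contains nothing but whitespace, comments and the `|` itself.

```rust
/// Which neighbouring alternative a comment sits next to.
#[derive(Debug, PartialEq)]
enum Side {
    Left,
    Right,
}

/// Finds the byte offset of the `|` between two adjacent alternatives.
///
/// `left_end` is the end offset of the left branch and `right_start` the
/// start offset of the right branch (both assumed to come from AST spans).
/// `comments` holds the `(start, end)` byte offsets of every comment span.
fn find_bar(
    pattern: &str,
    left_end: usize,
    right_start: usize,
    comments: &[(usize, usize)],
) -> Option<usize> {
    let bytes = pattern.as_bytes();
    let mut i = left_end;
    while i < right_start && i < bytes.len() {
        // Jump over comments so a `|` inside `# ...` is not mistaken
        // for the separator.
        if let Some(&(_, end)) = comments.iter().find(|&&(s, e)| s <= i && i < e) {
            i = end;
            continue;
        }
        if bytes[i] == b'|' {
            return Some(i);
        }
        i += 1;
    }
    None
}

/// Decides whether `comment` lies before or after the `|` in the gap.
fn comment_side(
    pattern: &str,
    left_end: usize,
    right_start: usize,
    comments: &[(usize, usize)],
    comment: (usize, usize),
) -> Option<Side> {
    let bar = find_bar(pattern, left_end, right_start, comments)?;
    Some(if comment.0 < bar { Side::Left } else { Side::Right })
}
```

With the offsets from the output above (`a` ends at 1, `b` starts at 6, the comment spans 2 to 5), `comment_side("a #c\n|b", 1, 6, &[(2, 5)], (2, 5))` yields `Some(Side::Left)` and the same call on `"a|#c\n b"` yields `Some(Side::Right)`. It works, but it re-scans text the parser has already seen.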
I think the `Ast` type itself should include the `Span` of these marks when their position cannot be inferred, like the `|` in `a|b|c` or the `,` in `a{3,100}`.
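
To make the request concrete, here is a simplified, purely illustrative shape for an alternation that records its separators. None of this is the existing `regex_syntax` API: the `Span` below is a stand-in holding byte offsets only, and the `bars` field and `branch_of` method are hypothetical names.

```rust
/// Stand-in for a span, reduced to byte offsets for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// Hypothetical alternation node that also stores where each `|` is.
#[derive(Debug)]
struct Alternation {
    /// Spans of the branches (a real node would hold full ASTs here).
    branches: Vec<Span>,
    /// One span per `|`, in source order; one fewer than `branches`.
    bars: Vec<Span>,
}

impl Alternation {
    /// Index of the branch a comment falls into: the number of `|`
    /// separators that end at or before the start of the comment.
    fn branch_of(&self, comment: Span) -> usize {
        self.bars
            .iter()
            .filter(|bar| bar.end <= comment.start)
            .count()
    }
}
```

With such a field, `a #c\n|b` would carry a bar at offsets 5 to 6 and `a|#c\n b` a bar at offsets 1 to 2, so the two parse results would differ and `branch_of` would return 0 and 1 respectively for the comment at offsets 2 to 5.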