ptx: implement -S/--sentence-regexp #9682
  }
- if matches.contains_id(options::SENTENCE_REGEXP) {
-     return Err(PtxError::NotImplemented("-S").into());
+ if let Some(regex) = matches.get_one::<String>(options::SENTENCE_REGEXP) {
I am not entirely sure about this, and I don't think it can be resolved within the scope of this PR, but I was investigating why we use the onig crate, and it appears to be because GNU regex has some fundamental differences from the regex used in the expr utility.
If that regex is the same one used here, it would be good to leave a comment noting the remaining TODO: the regex engine in use is not fully compatible with the GNU implementation.
Description

This PR implements the -S / --sentence-regexp flag for ptx, bringing it closer to full GNU compatibility.

Previously, ptx only supported splitting input by lines. This change allows users to define a custom regular expression to split the input into sentences, as specified in the GNU documentation.

Tests

- GNU Compatibility: This fixes the previously failing tests/ptx/ptx.pl test case S-infloop.
- Unit Tests: Added new Rust unit tests in tests/by-util/test_ptx.rs.
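
To make the sentence splitting described above concrete, here is a minimal sketch of a splitting loop. It is not the code from this PR. The function names (split_sentences, find_terminator, find_empty) are hypothetical, a plain closure stands in for the regex engine so the example needs only the standard library, and keeping the terminator with the preceding sentence is an arbitrary choice for illustration. The empty-match check assumes, from the test name S-infloop alone, that the concern is a pattern that can match the empty string and never advance.

```rust
/// Split `input` into sentences. `find` is a hypothetical stand-in for a regex
/// search: it returns the byte range (start, end) of the first terminator
/// match in the text it is given, or None if there is no match.
fn split_sentences<'a>(
    input: &'a str,
    find: impl Fn(&str) -> Option<(usize, usize)>,
) -> Result<Vec<&'a str>, String> {
    let mut sentences = Vec::new();
    let mut rest = input;
    while !rest.is_empty() {
        match find(rest) {
            // An empty match would never consume input, so the loop would not
            // terminate. This sketch rejects it (assumption, not PR behavior).
            Some((start, end)) if start == end => {
                return Err("sentence regexp matched an empty string".to_string());
            }
            // Keep the terminator with the sentence it ends, then continue
            // after it.
            Some((_, end)) => {
                sentences.push(&rest[..end]);
                rest = &rest[end..];
            }
            // No further terminator: the remainder is the last sentence.
            None => {
                sentences.push(rest);
                break;
            }
        }
    }
    Ok(sentences)
}

/// Stand-in for a pattern like `[.?!]`: matches one ASCII terminator character.
fn find_terminator(text: &str) -> Option<(usize, usize)> {
    text.find(|c: char| matches!(c, '.' | '?' | '!'))
        .map(|i| (i, i + 1))
}

/// Stand-in for a pattern like `^`: always matches the empty string at offset 0.
fn find_empty(_text: &str) -> Option<(usize, usize)> {
    Some((0, 0))
}

fn main() {
    let text = "Hello there. How are you? Fine!";
    println!("{:?}", split_sentences(text, find_terminator));
    println!("{:?}", split_sentences("a\n", find_empty));
}
```

With a real regex engine, the closure would wrap that engine's search call. The loop structure and the empty-match guard would stay the same.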