KEMBAR78
Fix panics with whitespace in extended mode by being more strict by robinst · Pull Request #354 · rust-lang/regex · GitHub
Skip to content

Conversation

@robinst
Copy link
Contributor

@robinst robinst commented Apr 7, 2017

Instead of ignoring space in all the bump/peek methods (as proposed in
pull request #349), have an explicit ignore_space method that can be
used in places where space/comments should be allowed.

This makes parsing a bit stricter than before as well.

Instead of ignoring space in all the bump/peek methods (as proposed in
pull request rust-lang#349), have an explicit `ignore_space` method that can be
used in places where space/comments should be allowed.

This makes parsing a bit stricter than before as well.

#[test]
fn ignore_space_escape_space() {
assert_eq!(p(r"(?x)a\ # hi there"), Expr::Concat(vec![
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allowing to escape whitespace is in line with what PCRE2 does, see here: http://www.pcre.org/current/doc/html/pcre2pattern.html#SEC5

An escaping backslash can be used to include a white space or # character as part of the pattern.

And it seems less surprising than allowing \ d or the following to mean digits:

r"(?x)\ # comment
    d"

@BurntSushi
Copy link
Member

@robinst Thanks so much for working on this! This behavior is indeed a bit more in line with what I'd expect. I will say though that doing it this way leads to a lot more implementation complexity. :-( I'm not sure I can immediately think of a better way though.

Allowing to escape whitespace is in line with what PCRE2 does

Hmmm, yes, I'll need to think about this. I'm a bit distracted at the moment.

@robinst
Copy link
Contributor Author

robinst commented Apr 21, 2017

I will say though that doing it this way leads to a lot more implementation complexity. :-( I'm not sure I can immediately think of a better way though.

FWIW, fancy-regex's parser does it the same way as I discovered recently, see the calls to optional_whitespace: https://github.com/google/fancy-regex/blob/master/src/parse.rs#L103

@BurntSushi
Copy link
Member

@robinst Sorry for taking so long to get around this. For the most part, I've avoid this because I've been internally waffling over this, but that's silly to do. Some thoughts:

  1. I'm pretty unhappy with this solution, because it makes the parser a lot more complex. My guess is that new bugs will be introduced.
  2. This does however fix a real bug.
  3. I am in the process of rewriting regex-syntax. When I get to supporting x, I'll sit down and re-evaluate our approach here.

Thanks so much for finding and fixing this bug. It is much appreciated!

@BurntSushi
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented May 20, 2017

📌 Commit 6f32d2f has been approved by BurntSushi

@bors
Copy link
Contributor

bors commented May 20, 2017

⌛ Testing commit 6f32d2f with merge c15170a...

bors added a commit that referenced this pull request May 20, 2017
…e-strict, r=BurntSushi

Fix panics with whitespace in extended mode by being more strict

Instead of ignoring space in all the bump/peek methods (as proposed in
pull request #349), have an explicit `ignore_space` method that can be
used in places where space/comments should be allowed.

This makes parsing a bit stricter than before as well.
@bors
Copy link
Contributor

bors commented May 20, 2017

☀️ Test successful - status-appveyor, status-travis
Approved by: BurntSushi
Pushing c15170a to master...

@bors bors merged commit 6f32d2f into rust-lang:master May 20, 2017
@robinst robinst deleted the fix-panics-in-extended-mode-by-being-more-strict branch May 22, 2017 08:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants