I'm working on a syntax highlighting engine in Rust that requires an Oniguruma-compatible regex engine. I'm trying to port it from the onig crate to fancy-regex, but there are some features it doesn't support yet (see trishume/syntect#34).
One of these features is the && operator and nesting in character sets, for example [a-w&&[^c-g]z]. I had assumed this would be added to fancy-regex, but @robinst pointed out this comment, which suggests that you plan to support both in the regex crate.
It would be nice if the regex crate supported UTS#18 RL1.3 in full, but the && operator and nesting are all that Oniguruma-compatibility of fancy-regex requires.
I imagine this would take some changes to regex-syntax and then a pass to convert the fancy character sets down to basic character sets. I haven't thought enough about it to know whether there are any Unicode-related issues that might make this more complex, perhaps by making a tiny fancy character set compile to an enormous basic character set.
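To make the "convert down to basic character sets" idea concrete, here is a rough sketch of how such a pass might work on sorted lists of inclusive code point ranges. None of this is taken from regex-syntax: the `Range` alias and the helpers `normalize`, `negate`, `union`, `intersect`, `r` and `example` are hypothetical names made up for illustration, and the sketch ignores the surrogate gap and case folding.

```rust
/// Inclusive range of code points (hypothetical representation,
/// not the one regex-syntax uses).
type Range = (u32, u32);

/// Largest Unicode code point.
const MAX: u32 = 0x10FFFF;

/// Sort ranges and merge any that overlap or touch.
fn normalize(mut ranges: Vec<Range>) -> Vec<Range> {
    ranges.sort();
    let mut out: Vec<Range> = Vec::new();
    for (lo, hi) in ranges {
        if let Some(last) = out.last_mut() {
            if lo <= last.1.saturating_add(1) {
                last.1 = last.1.max(hi);
                continue;
            }
        }
        out.push((lo, hi));
    }
    out
}

/// Complement of a set within 0..=MAX (handles `[^...]`).
fn negate(ranges: Vec<Range>) -> Vec<Range> {
    let mut out = Vec::new();
    let mut next = 0u32; // first code point not yet accounted for
    for (lo, hi) in normalize(ranges) {
        if lo > next {
            out.push((next, lo - 1));
        }
        next = hi + 1;
    }
    if next <= MAX {
        out.push((next, MAX));
    }
    out
}

/// Union of two sets (handles juxtaposition such as `[^c-g]z`).
fn union(mut a: Vec<Range>, b: Vec<Range>) -> Vec<Range> {
    a.extend(b);
    normalize(a)
}

/// Intersection of two sets (handles `&&`), by walking both sorted lists.
fn intersect(a: Vec<Range>, b: Vec<Range>) -> Vec<Range> {
    let (a, b) = (normalize(a), normalize(b));
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        let lo = a[i].0.max(b[j].0);
        let hi = a[i].1.min(b[j].1);
        if lo <= hi {
            out.push((lo, hi));
        }
        // Advance whichever range ends first.
        if a[i].1 < b[j].1 {
            i += 1;
        } else {
            j += 1;
        }
    }
    out
}

/// Build a range from two chars.
fn r(lo: char, hi: char) -> Range {
    (lo as u32, hi as u32)
}

/// Flatten `[a-w&&[^c-g]z]`, read as [a-w] AND ([^c-g] OR z).
fn example() -> Vec<Range> {
    let right = union(negate(vec![r('c', 'g')]), vec![r('z', 'z')]);
    intersect(vec![r('a', 'w')], right)
}
```

Under these assumptions, `example()` returns `[(97, 98), (104, 119)]`, i.e. the basic set [a-bh-w]. With this representation an intersection can produce at most roughly as many ranges as its two inputs combined, so it at least looks plausible that flattening would not blow up, though I haven't checked how that interacts with Unicode classes or case-insensitive matching.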
@BurntSushi do you have any insight on how difficult you think this would be to add for a contributor not familiar with the internals of regex?