Today, if you try to compile a regex with an empty alternation, e.g., a||b, then you'll get this error message:
alternations cannot currently contain empty sub-expressions
When I initially built the regex crate, I don't think I was clear on what an empty alternation meant, so I simply made them illegal. However, an empty alternation should have the same match semantics as an empty regex. That is, a||b should match a, b or the empty string.
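To make the intended semantics concrete, here is a minimal sketch using only the standard library. It is not how the regex crate works internally: `match_alternation` is a hypothetical helper that models an alternation of plain literals, anchored at the start of the haystack, and assumes leftmost-first preference among the alternates (the first alternate that matches wins).

    /// Hypothetical helper, not part of the regex crate.
    /// Tries each literal alternate in order at the start of `haystack` and
    /// returns the length of the first one that matches. An empty alternate
    /// is the empty string, which is a prefix of every haystack, so it
    /// always matches with length 0.
    fn match_alternation(alternates: &[&str], haystack: &str) -> Option<usize> {
        alternates
            .iter()
            .find(|alt| haystack.starts_with(**alt))
            .map(|alt| alt.len())
    }

    /// The empty regex, modeled as an alternation with one empty alternate.
    fn match_empty_regex(haystack: &str) -> Option<usize> {
        match_alternation(&[""], haystack)
    }

Under these assumptions, `a||b` is modeled as `["a", "", "b"]`. It matches `a` when the haystack starts with `a`, and otherwise falls through to the empty alternate, which behaves exactly like the empty regex. Because the empty alternate comes before `b` and always matches, the `b` branch is never reached in this model.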
When I rewrote the regex-syntax crate, I specifically made sure to support empty alternations, which I believe were forbidden in the previous version of regex-syntax. The intent was to propagate that through the regex compiler. However, when I did that, I discovered that it did not implement the correct match semantics. Fixing it did not seem easy, so I simply made the compiler return an error if it found an empty alternate:
    if prev_entry == self.insts.len() {
        // TODO(burntsushi): It is kind of silly that we don't support
        // empty-subexpressions in alternates, but it is supremely
        // awkward to support them in the existing compiler
        // infrastructure. This entire compiler needs to be thrown out
        // anyway, so don't feel too bad.
        return Err(Error::Syntax(
            "alternations cannot currently contain \
             empty sub-expressions".to_string()));
    }
Part of my plan for the future is to rethink a lot of the regex internals, and the compiler is at the top of that list. So I plan to tackle this problem when I rework the compiler.
The snippet above is from regex/src/compile.rs, lines 491 to 500 in 488fe56.