Add small validator utility for PEG grammars #23519
This is really great, @pablogsal! Thanks for putting in the time, again! There are indeed many things we could do here and many different approaches we could follow. I'd certainly be okay with merging this as is and then building on top of it to support things like rule expansion.
Another thought is to link, in the error message, to some explanation of why the alternative will never be reached, since most people who fall into these kinds of traps won't be too familiar with the PEG formalism.
I am currently working on a document for the devguide on how to work with PEG grammars.
@lysnikolaou I plan to go ahead with this so we can start building up afterwards if that is ok with you. |
Of course! Go for it. |
Merged 3bcc4ea into python:master
One common gotcha of PEG grammars is having two alternatives in a rule where one that appears earlier is contained in one that appears later. In this scenario, the later alternative will never match: any input text that matches it also matches the earlier alternative, and the earlier alternative is tried first.
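To make the gotcha concrete, here is a minimal sketch of ordered choice over token lists. It is not pegen code: `match_alternative`, `ordered_choice`, and the toy rule are hypothetical names used only for illustration, and alternatives are assumed to be plain sequences of literal tokens.

```python
from typing import List, Optional, Sequence, Tuple


def match_alternative(alt: Sequence[str], tokens: Sequence[str]) -> Optional[int]:
    """Return how many tokens `alt` consumes from the start of `tokens`, or None."""
    if list(tokens[: len(alt)]) == list(alt):
        return len(alt)
    return None


def ordered_choice(
    alts: List[Sequence[str]], tokens: Sequence[str]
) -> Optional[Tuple[int, int]]:
    """Try the alternatives in order, PEG style: the first one that matches wins.

    Returns (index of the winning alternative, tokens consumed) or None.
    """
    for index, alt in enumerate(alts):
        consumed = match_alternative(alt, tokens)
        if consumed is not None:
            return index, consumed
    return None


# Hypothetical rule with two alternatives: A | A B
rule = [["A"], ["A", "B"]]

# Alternative 0 wins even though alternative 1 would match more input.
print(ordered_choice(rule, ["A", "B"]))  # (0, 1)
```

Swapping the two alternatives so that the longer one comes first makes both reachable again.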
To avoid this problem, this PR creates an initial version of a small utility that detects these cases and raises an error to alert the user.
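As a rough idea of what such a check can look like, the sketch below compares the string form of each alternative with the ones that follow it. This is an assumption-laden illustration, not the code in this PR: the grammar is modelled as a plain dict of rule names to alternative strings, `ShadowedAlternativeError` and `check_grammar` are hypothetical names, and "contained" is interpreted as "is a prefix, item by item".

```python
from typing import Dict, List


class ShadowedAlternativeError(Exception):
    """Hypothetical error raised when an alternative can never be reached."""


def check_grammar(grammar: Dict[str, List[str]]) -> None:
    """Raise if, in any rule, an earlier alternative is a prefix of a later one.

    Alternatives are compared item by item (split on whitespace) so that
    an item named "A" is not mistaken for a prefix of an item named "AB".
    """
    for name, alts in grammar.items():
        for index, first in enumerate(alts):
            first_items = first.split()
            for later in alts[index + 1 :]:
                if later.split()[: len(first_items)] == first_items:
                    raise ShadowedAlternativeError(
                        f"In rule {name!r}, alternative {later!r} will never "
                        f"be reached because {first!r} comes first"
                    )


# Fine: the longer alternative is tried first.
check_grammar({"expr": ["A B", "A"]})

# Problematic: "A" always wins before "A B" is tried.
try:
    check_grammar({"expr": ["A", "A B"]})
except ShadowedAlternativeError as exc:
    print(exc)
```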
We don't need to detect all cases, only the ones that are easy enough to implement. The current form is just an initial prototype that does the matching based on the string representation. To make this better we need to turn the matching algorithm into a visitor, so that we can check rules that contain options. For example:
One "straightforward enough" way to do this is to generate all possible string representations of each alternative, with and without each optional item, and do substring matching on those. At this point, though, a visitor that visits both alternatives at the same time and advances accordingly is probably better.
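The "generate all representations" approach could be sketched as follows. This is only an illustration under simplifying assumptions: an alternative is a whitespace-separated string, an optional item is a single token written as `[X]`, and `expand_optionals` and `is_shadowed` are hypothetical helpers rather than anything in pegen. The later alternative is treated as unreachable only when every one of its expansions starts with some expansion of the earlier alternative.

```python
from itertools import product
from typing import List, Tuple


def expand_optionals(alt: str) -> List[Tuple[str, ...]]:
    """Return every concrete item sequence of `alt`.

    Each `[X]` item contributes two variants: one with X present, one without.
    """
    choices = []
    for item in alt.split():
        if item.startswith("[") and item.endswith("]"):
            choices.append([(item[1:-1],), ()])  # present, then absent
        else:
            choices.append([(item,)])
    return [
        tuple(token for part in combo for token in part)
        for combo in product(*choices)
    ]


def is_shadowed(first: str, later: str) -> bool:
    """True if every expansion of `later` starts with some expansion of `first`."""
    first_variants = expand_optionals(first)
    return all(
        any(variant[: len(prefix)] == prefix for prefix in first_variants)
        for variant in expand_optionals(later)
    )


print(expand_optionals("A [B] C"))    # [('A', 'B', 'C'), ('A', 'C')]
print(is_shadowed("A", "A [B]"))      # True
print(is_shadowed("A B", "A [B]"))    # False
```

The number of expansions doubles with each optional item, which is one reason a visitor that walks both alternatives together may scale better.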
We could also support some rule expansion so we also detect something like this: