gh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin #94386

Merged
pablogsal merged 2 commits into python:main from gh-94360 on Jul 5, 2022

@pablogsal pablogsal commented Jun 28, 2022

pablogsal commented Jun 28, 2022

I'm not adding a test (for now) because this doesn't reproduce nicely on the test suite.

@pablogsal pablogsal marked this pull request as draft Jun 28, 2022
… syntax errors from stdin

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
@pablogsal pablogsal marked this pull request as ready for review Jun 28, 2022

zware commented Jun 28, 2022

Confirmed that this resolves the crash as produced by my reduced reproducer in main and 3.11, and without the change to Parser/pegen_errors.c (which doesn't exist in 3.10) it also resolves both reproducers for 3.10.

pablogsal commented Jun 28, 2022

Thanks for checking @zware ! ♥️

@@ -259,15 +259,15 @@ get_error_line_from_tokenizer_buffers(Parser *p, Py_ssize_t lineno)
    const char* buf_end = p->tok->fp_interactive ? p->tok->interactive_src_end : p->tok->inp;

    for (int i = 0; i < relative_lineno - 1; i++) {
        char *new_line = strchr(cur_line, '\n') + 1;

@ambv ambv Jul 5, 2022

NULL + 1 is beautifully strongly typed, huh? 😎

@pablogsal pablogsal Jul 5, 2022

[image: 6lujn3.jpg]
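
For readers following the review thread: `strchr` returns `NULL` when the character is not found, so `strchr(cur_line, '\n') + 1` produces an invalid pointer whenever the buffer holds fewer newlines than the loop expects. The following is a minimal sketch of a bounds-checked alternative, not the code merged in this PR. The helper name `skip_lines` and its parameters are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not CPython's actual code: return a pointer to the
 * start of the line that is `skip` lines after `start`, never moving to or
 * past `end`. If the buffer holds fewer lines than requested, the last line
 * start that was found is returned instead. */
static const char *
skip_lines(const char *start, const char *end, int skip)
{
    assert(start != NULL && end != NULL && start <= end);
    const char *cur_line = start;
    for (int i = 0; i < skip; i++) {
        /* memchr is bounded by an explicit length, unlike strchr, which
         * relies on a terminating NUL byte. */
        const char *newline = memchr(cur_line, '\n', (size_t)(end - cur_line));
        if (newline == NULL || newline + 1 >= end) {
            break;  /* no further line in the buffer: keep the last good pointer */
        }
        cur_line = newline + 1;
    }
    return cur_line;
}
```

With the buffer `"a\nb\nc"`, asking to skip 1 line yields a pointer to `b`, and asking to skip 5 lines stops at `c` rather than walking off the buffer. The trade-off in this sketch is that a caller may get an earlier line than requested instead of a crash.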

Parser/tokenizer.c (review thread outdated and resolved)
@pablogsal pablogsal merged commit 36fcde6 into python:main Jul 5, 2022
14 checks passed
@pablogsal pablogsal deleted the gh-94360 branch Jul 5, 2022

miss-islington commented Jul 5, 2022

Thanks @pablogsal for the PR 🌮🎉. I'm working now to backport this PR to: 3.10, 3.11.
🐍🍒🤖

bedevere-bot commented Jul 5, 2022

GH-94573 is a backport of this pull request to the 3.11 branch.

miss-islington commented Jul 5, 2022

Sorry, @pablogsal, I could not cleanly backport this to 3.10 due to a conflict.
Please backport using cherry_picker on the command line.
cherry_picker 36fcde61ba48c4e918830691ecf4092e4e3b9b99 3.10

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Jul 5, 2022
… syntax errors from stdin (pythonGH-94386)

* pythongh-94360: Fix a tokenizer crash when reading encoded files with syntax errors from stdin

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>

* nitty nit

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
(cherry picked from commit 36fcde6)

Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
miss-islington added a commit that referenced this pull request Jul 5, 2022
…x errors from stdin (GH-94386)


bedevere-bot commented Jul 5, 2022

GH-94574 is a backport of this pull request to the 3.10 branch.

pablogsal added a commit to pablogsal/cpython that referenced this pull request Jul 5, 2022
…es with syntax errors from stdin (pythonGH-94386)

ambv pushed a commit that referenced this pull request Jul 5, 2022
…h syntax errors from stdin (GH-94386) (GH-94574)

Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
Co-authored-by: Pablo Galindo Salgado <Pablogsal@gmail.com>
Co-authored-by: Łukasz Langa <lukasz@langa.pl>

(cherry picked from commit 36fcde6)
Development

Successfully merging this pull request may close these issues.

Tokenizer crash when redirecting input to stdin
5 participants