
bpo-17852: Fixed the documentation about closing files #23135

Open · wants to merge 7 commits into base: master
Conversation

@Volker-Weissmann (Contributor) commented Nov 3, 2020

Most programming languages, including Python 2, call the destructors of global variables if the program exits successfully.
Python 3 is not guaranteed to.
This can result in the arguments of file.write() never being written to the disk.
The guys in this issue also noticed this problem and discussed whether it should be this way, but apparently all of them were completely brain-dead idiots, because they did not check whether the documentation of Python matches the actual behaviour. Instead, the documentation stated that Python will call f.close() (and I quote) "eventually", which is complete BS, because Python may never call f.close().
I can't fix the (imho stupid) behavior of Python 3 not calling destructors, but at least I can fix the documentation (this PR).
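
For reference, a minimal sketch of the reliable pattern (the file name is a placeholder): the with statement closes the file deterministically when the block exits, so the buffered data is flushed without relying on __del__ at interpreter shutdown.

with open("out.txt", "w") as f:   # "out.txt" is a placeholder name
    f.write("data")
# f is closed here, so "data" has been flushed and handed to the OS.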

Especially for a programming language like Python, which is meant to be usable by people with limited CS knowledge, it is important that no language feature yields surprising behavior, and correct documentation is just as important.

Excuse my language, but if one of the most popular programming languages doesn't manage to fix a simple but important documentation error within a year of someone reporting it, that is a disgrace and a failure of the open source model. If you support this kind of misleading documentation, you are 100 % responsible for every bug written by someone who thought that f.close() is called when the program exits. And they are 0 % responsible for the bug, because they trusted the documentation.

https://bugs.python.org/issue17852

@nascheme (Member) commented Nov 3, 2020

I understand you are upset but calling the developers names is probably not helping. I agree that either the documentation should note this issue or we should actually try to fix the bug. Having buffered file objects randomly lose data is not a friendly behavior.

Your documentation change is not quite correct, I think. The problem is not that __del__ isn't called, it is that the ordering of the __del__ calls can be unexpected. For this bug, it happens if your buffered file object is part of a garbage reference cycle. The ordering of __del__ calls is essentially random at that point. It can happen that the buffer object gets cleaned up after the file object has been cleaned (closed). In that case, the buffer cannot be written because the file is already closed.
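
A minimal sketch of that scenario (names assumed): once the writer sits inside a reference cycle, the cycle collector, not reference counting, decides the teardown order.

class Holder:
    pass

h = Holder()
h.f = open("out.txt", "w")   # buffered text writer; name is a placeholder
h.cycle = h                  # the file object is now part of a reference cycle
h.f.write("data")            # sits in Python-level buffers until a flush
del h
# From here on only the cycle collector can reclaim these objects, and the
# order in which it tears them down is not guaranteed.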

AFAIK, there is no way to fix that in the Python GC logic. We would need to re-structure the _io classes to prevent it (merge the buffer object with the file descriptor object or use a weakref).
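
For illustration only, a hedged sketch of the weakref idea (hypothetical class, not the actual _io design): registering the flush via weakref.finalize makes it run when the writer is collected, even from inside a cycle, and at the latest at interpreter exit.

import weakref

def _flush_and_close(raw, buf):
    # Kept at module level on purpose: a finalizer callback must not
    # reference the writer itself, or it would keep the writer alive.
    raw.write("".join(buf))
    raw.close()

class CycleSafeWriter:
    # Hypothetical name; a sketch, not the real _io implementation.
    def __init__(self, raw):
        self._raw = raw    # underlying file object opened for writing
        self._buf = []     # pending writes
        weakref.finalize(self, _flush_and_close, raw, self._buf)

    def write(self, data):
        self._buf.append(data)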

@Volker-Weissmann (Contributor, Author) commented Nov 4, 2020

> I understand you are upset but calling the developers names is probably not helping.

I know, I'm sorry. I'm just tired of this "You should have expected this surprising behavior that is not documented!" victim-blaming. Multiple people told me that the old documentation was correct.

> I agree that either the documentation should note this issue or we should actually try to fix the bug. Having buffered file objects randomly lose data is not a friendly behavior.

Thank you for saying that.

> Your documentation change is not quite correct, I think. The problem is not that __del__ isn't called, it is that the ordering of the __del__ calls can be unexpected. For this bug, it happens if your buffered file object is part of a garbage reference cycle. The ordering of __del__ calls is essentially random at that point. It can happen that the buffer object gets cleaned up after the file object has been cleaned (closed). In that case, the buffer cannot be written because the file is already closed.

> AFAIK, there is no way to fix that in the Python GC logic. We would need to re-structure the _io classes to prevent it (merge the buffer object with the file descriptor object or use a weakref).

Fixing this bug would not prevent data loss, because the docs say that Python can exit without calling __del__. The other question is whether

f = open("path", "w")
f.write("data")
del f
exit(0)

is guaranteed to write to the disk.

Three review threads on Doc/tutorial/inputoutput.rst (outdated, resolved)
Co-authored-by: Inada Naoki <songofacandy@gmail.com>

.. warning::
   Calling ``f.write()`` without using the :keyword:`!with` keyword or calling
   ``f.close()``, ``f.flush()`` or ``del f`` **might** result in the arguments


@methane (Member) commented Nov 8, 2020

Suggested change:
- ``f.close()``, ``f.flush()`` or ``del f`` **might** result in the arguments
+ ``f.close()`` **might** result in the arguments

del f doesn't guarantee that __del__ is called, so it should be removed (see the sketch after this list).
And I suggest removing f.flush() too, because:

  • At this point the tutorial has taught with and f.close(); f.flush() has not been introduced yet.
  • Even if f.flush() is called, not closing the file is a bad idea.
  • There is no need to cover how to flush buffered data here; otherwise we would have to list every method that causes a flush.
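
A minimal sketch of the first point (file name assumed): del f only removes one name binding; __del__ runs when the last reference disappears, and another reference can postpone that indefinitely.

f = open("out.txt", "w")   # "out.txt" is a placeholder name
f.write("data")
g = f    # a second reference to the same file object
del f    # unbinds the name, but the object survives through g
# __del__ has not run here; the buffered "data" may still be unflushed.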