This repository was archived by the owner on Feb 25, 2025. It is now read-only.
Use UTF-16 for C++ text input model #17831
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
The C++ text input model used by Windows and Linux currently uses UTF-32. The intention was to facilitate handling of arrow keys, backspace/delete, etc., however since part of what is synchronized with the engine is cursor+selection offsets, and those offsets are defined in terms of UTF-16 code units, this causes very bad interactions with the framework-side model.
This converts to using UTF-16, rather than UTF-32, so that the offsets align with the framework. It also adds surrogate pair handling to the operations that adjust indexes, to avoid breaking surrogate pairs. (Arbitrary grapheme cluster handling is out of scope for this PR; while definitely desirable in the long term, surrogate pair handling is much more critical since improper handling yields invalid UTF-16, which breaks the text field).
This partially fixes flutter/flutter#55014. A framework-side fix is also necessary (since currently both the engine and the framework attempt to handle arrow keys, which is another out-of-scope-for-this-PR issue), but even without the framework fix this dramatically improves the cursor behavior on Windows when there are surrogate pairs somewhere in the string since at least the two sides agree on what indexes mean.
Includes minor plumbing changes to the text input plumbing on Windows so that we're not pointlessly converting from UTF-16 to UTF-32 and then back to UTF-16.