bpo-46218: Change long_pow() to sliding window algorithm #30319
Merged
For large exponents in long_pow(), use the sliding window algorithm instead. Also boost the window size from 5 bits to 6, which should yield a modest but significant speedup for long exponents. The precomputed table remains the same size, though, because the sliding window algorithm only stores results for odd exponents. long_pow() no longer requires that the number of bits in a CPython long digit be a multiple of 5. It no longer cares at all what the digit width is.
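For readers unfamiliar with the technique, here is a minimal Python sketch of left-to-right sliding window exponentiation. It is not the C code from this PR: the function name `sliding_window_pow` and its parameters are hypothetical, chosen only for illustration. It shows the two properties the description relies on: only odd powers of the base are precomputed, and the exponent is scanned bit by bit, so nothing depends on the width of an internal digit.

```python
def sliding_window_pow(base, exp, mod=None, window=5):
    """Illustrative sketch only; names and structure are assumptions,
    not the implementation in longobject.c."""
    if exp < 0:
        raise ValueError("exponent must be non-negative")

    def mul(x, y):
        r = x * y
        return r % mod if mod is not None else r

    # Precompute odd powers only: table[i] == base ** (2*i + 1).
    base2 = mul(base, base)
    table = [base % mod if mod is not None else base]
    for _ in range((1 << (window - 1)) - 1):
        table.append(mul(table[-1], base2))

    result = 1 % mod if mod is not None else 1
    i = exp.bit_length() - 1          # index of the current exponent bit
    while i >= 0:
        if not (exp >> i) & 1:
            # A 0 bit outside any window costs one squaring.
            result = mul(result, result)
            i -= 1
            continue
        # Take up to `window` bits starting at i, then shrink the window
        # from the low end until it ends in a 1 bit (so its value is odd).
        j = max(i - window + 1, 0)
        while not (exp >> j) & 1:
            j += 1
        width = i - j + 1
        chunk = (exp >> j) & ((1 << width) - 1)
        for _ in range(width):
            result = mul(result, result)
        result = mul(result, table[chunk >> 1])
        i = j - 1
    return result
```

Each window costs one multiplication by a table entry regardless of its width, which is where the saving over plain square-and-multiply comes from on long exponents.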
rhettinger
reviewed
Jan 1, 2022
sweeneyde
reviewed
Jan 1, 2022
Good catch! The code is clearer the new way too.
trailing zero logic entirely into ABSORB_PENDING. These native int bit manipulations are dirt cheap in comparison to the bigint squaring needed for each exponent bit. And boost the size of exponents tested to (probabilistically) stress a greater mix of exponent bit patterns.
a million straight 1 bits, the timing difference seems insignificant. So I think it better to cut the table size in half, to cut the precomputation overhead time in half for "saner" (smaller exponent) cases.
Note that I cut the window size back to 5 bits. Short explanation on the bpo report (nothing to do with the number of bits in a "digit").
Since the dynamic table of small powers needed is half the size now, the overhead of trying to use the k-ary method has been cut accordingly, which allows it to pay off at smaller exponent bit lengths.
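The table-size arithmetic can be checked with a small Python sketch. Both helper names below are hypothetical, and this only counts table entries; it does not model the actual C table or its timing. A fixed-width k-ary table with a w-bit window holds every power from 0 to 2^w - 1, while a sliding window table holds only the odd ones.

```python
def full_power_table(base, window):
    # Fixed k-ary method: base**0 .. base**(2**window - 1).
    table = [1]
    for _ in range((1 << window) - 1):
        table.append(table[-1] * base)
    return table


def odd_power_table(base, window):
    # Sliding window method: base**1, base**3, ..., base**(2**window - 1).
    base2 = base * base
    table = [base]
    for _ in range((1 << (window - 1)) - 1):
        table.append(table[-1] * base2)
    return table


print(len(full_power_table(3, 5)))  # 32 entries for a 5-bit k-ary window
print(len(odd_power_table(3, 6)))   # 32 entries for a 6-bit sliding window
print(len(odd_power_table(3, 5)))   # 16 entries for a 5-bit sliding window
```

So a 6-bit sliding window fits in the same table as the old 5-bit k-ary method, and dropping back to 5 bits halves the number of entries that must be computed before the main loop starts.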
sweeneyde
reviewed
Jan 2, 2022
arhadthedev
reviewed
Jan 2, 2022
Use shortcut initializer for the k-ary table.
Co-authored-by: Oleg Iarygin <dralife@yandex.ru>
https://bugs.python.org/issue46218