-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-100239: specialize bitwise logical binary ops on ints #128927
Conversation
Python/specialize.c
Outdated
return (is_nonnegative_compactlong(lhs) && is_nonnegative_compactlong(rhs)); | ||
} | ||
|
||
#define NONNEGATIVE_LONGS_ACTION(NAME, OP) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why restrict to nonnegative longs here? If we do restrict, then we can replace the calls _PyLong_CompactValue with direct access to op->long_value.ob_digit[0]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because negative ints need more work at runtime and I don't think they're common with bitwise logical ops.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extra work is already done by _PyLong_CompactValue
or am missing something? The output of ls OP rhs
might not be a compact int, but there are no guards for the output type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The extra work is already done by
_PyLong_CompactValue
or am missing something?
No, I think you're right. Good point.
The output of
ls OP rhs
might not be a compact int, but there are no guards for the output type.
For bitwise logical operators we should expect the results to have the same size as the inputs.
Misc/NEWS.d/next/Core_and_Builtins/2025-01-16-22-54-12.gh-issue-100239.7_HpBU.rst
Outdated
Show resolved
Hide resolved
…e-100239.7_HpBU.rst Co-authored-by: Pieter Eendebak <[email protected]>
The stats are weird with this PR, lot of misses: https://github.com/faster-cpython/benchmarking-public/blob/main/results/bm-20250116-3.14.0a4%2B-c57fe46/bm-20250116-azure-x86_64-iritkatriel-binaryops-3.14.0a4%2B-c57fe46-pystats-vs-base.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think non-negative
should be dropped in the news entry now?
Misc/NEWS.d/next/Core_and_Builtins/2025-01-16-22-54-12.gh-issue-100239.7_HpBU.rst
Outdated
Show resolved
Hide resolved
And maybe in the title of this pull request, too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we are seeing a high proportion of specialization failures for non-compact ints with the &
operator.
I don't know why that would be. My guess is that some of the benchmarks are using ints as bit vectors and using more than one digit.
@@ -2556,6 +2609,7 @@ binary_op_extended_specialization(PyObject *lhs, PyObject *rhs, int oparg, | |||
|
|||
LOOKUP_SPEC(compactlong_float_specs, oparg); | |||
LOOKUP_SPEC(float_compactlong_specs, oparg); | |||
LOOKUP_SPEC(compactlongs_specs, oparg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why three tables, rather than one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single table would need to have a list of (guard, action) pairs for each OP. So it's a table of tables. Same thing basically.
Just a wild guess: could those be from enums, which derive from int? Especially flag enums? |
I don't think so. We check for int with |
I suspect the misses might be Other python constructions that would causes misses (but are not in pyperformance afaics) are |
(Would be nice if the report was rendered so that it's easy to see which benchmark contributed to a stat.) |
Do we have benchmarking numbers for this? |
(mdboom edited URL to the public one) |
Performance looks neutral within the noise. Were we expecting a speedup on any particular benchmark, or is there a micro-benchmark that shows a speedup? |
Here are some microbenchmark numbers: Old:
New:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My interpretation of the results is:
- Shows a measurable speedup for the operators and classes it targets
- Doesn't show an overall slowdown
- This specialization's effectiveness is limited by the small range of compact ints. Something like More efficient implementation of integers faster-cpython/ideas#548 should mostly fix that.
@iritkatriel do you agree?
If so, let's merge this.
This adds specialisations for bitwise |, &, ^ on non-negative ints.
I'm not adding more in the same PR so we can more easily bisect in the future if we need to.