gh-100239: specialize bitwise logical binary ops on ints #128927

Merged: 13 commits into python:main on Jan 29, 2025


Member

@iritkatriel iritkatriel commented Jan 16, 2025

This adds specialisations for bitwise |, &, ^ on non-negative ints.

I'm not adding more in the same PR so we can more easily bisect in the future if we need to.

return (is_nonnegative_compactlong(lhs) && is_nonnegative_compactlong(rhs));
}

#define NONNEGATIVE_LONGS_ACTION(NAME, OP) \
Contributor

Why restrict to nonnegative longs here? If we do restrict, then we can replace the calls to _PyLong_CompactValue with direct access to op->long_value.ob_digit[0].

Member Author

Because negative ints need more work at runtime and I don't think they're common with bitwise logical ops.

Contributor

The extra work is already done by _PyLong_CompactValue, or am I missing something? The output of lhs OP rhs might not be a compact int, but there are no guards for the output type.

Member Author

The extra work is already done by _PyLong_CompactValue, or am I missing something?

No, I think you're right. Good point.

The output of lhs OP rhs might not be a compact int, but there are no guards for the output type.

For bitwise logical operators we should expect the results to have the same size as the inputs.
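A quick way to sanity-check this expectation is an exhaustive test over a small two's-complement range. The sketch below is only an illustration: the width `K` and the helpers `fits` and `results_stay_in_range` are hypothetical names, and the range `[-2**K, 2**K)` is a simplified model, not CPython's actual compact-int layout.

```python
import operator

K = 6  # hypothetical small width so the exhaustive check stays cheap


def fits(value: int, k: int = K) -> bool:
    """Return True if value is representable in (k + 1)-bit two's complement."""
    return -(1 << k) <= value < (1 << k)


def results_stay_in_range(k: int = K) -> bool:
    """Check that &, | and ^ never leave the range for any in-range operands."""
    values = range(-(1 << k), 1 << k)
    ops = (operator.and_, operator.or_, operator.xor)
    return all(fits(op(a, b), k) for op in ops for a in values for b in values)


if __name__ == "__main__":
    print(results_stay_in_range())
```

In this model the lower bound `-2**K` is reachable from operands of smaller magnitude (for example `(-2**K + 1) & -2`), so a representation that stores sign and magnitude separately may still need a result constructor that does not assume the magnitude stays the same width.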

Contributor

@chris-eibl chris-eibl left a comment

I think non-negative should be dropped in the news entry now?

@chris-eibl
Contributor

I think non-negative should be dropped in the news entry now?

And maybe in the title of this pull request, too?

@iritkatriel iritkatriel changed the title from "gh-100239: specialize bitwise logical binary ops on non-negative int" to "gh-100239: specialize bitwise logical binary ops on ints" on Jan 21, 2025
Member

@markshannon markshannon left a comment

It looks like we are seeing a high proportion of specialization failures for non-compact ints with the & operator.
I don't know why that would be. My guess is that some of the benchmarks are using ints as bit vectors and using more than one digit.
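To make the bit-vector guess concrete, here is a minimal sketch of how an int used as a set of flags can outgrow a single internal digit. The helpers `needs_multiple_digits` and `make_bit_vector` are hypothetical, and the check only approximates "compact" by comparing the bit length against `sys.int_info.bits_per_digit`.

```python
import sys

# Width of one internal digit: 30 on most builds, 15 on some.
BITS_PER_DIGIT = sys.int_info.bits_per_digit


def needs_multiple_digits(n: int) -> bool:
    """Approximate check: True if abs(n) does not fit in one internal digit."""
    return abs(n).bit_length() > BITS_PER_DIGIT


def make_bit_vector(positions) -> int:
    """Build an int with the given bit positions set."""
    vec = 0
    for p in positions:
        vec |= 1 << p
    return vec


if __name__ == "__main__":
    small = make_bit_vector([0, 3])
    large = make_bit_vector([0, 3, 40])
    print(needs_multiple_digits(small), needs_multiple_digits(large))
```

Under this approximation, any bit vector with a flag at position 30 or higher (on a 30-bit-digit build) would fall outside a specialization that only handles single-digit operands.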

@@ -2556,6 +2609,7 @@ binary_op_extended_specialization(PyObject *lhs, PyObject *rhs, int oparg,

LOOKUP_SPEC(compactlong_float_specs, oparg);
LOOKUP_SPEC(float_compactlong_specs, oparg);
LOOKUP_SPEC(compactlongs_specs, oparg);
Member

Why three tables, rather than one?

Member Author

A single table would need to have a list of (guard, action) pairs for each OP. So it's a table of tables. Same thing basically.
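The "table of tables" shape described here can be sketched in Python as a mapping from operator to a list of (guard, action) pairs. This is an illustration of the idea only, not the C code in the PR: `SPEC_TABLE`, `lookup_spec`, and the guard functions are hypothetical names.

```python
import operator


def both_exact_ints(lhs, rhs) -> bool:
    """Guard: both operands are exactly int (no subclasses)."""
    return type(lhs) is int and type(rhs) is int


def int_and_float(lhs, rhs) -> bool:
    """Guard: exact int on the left, exact float on the right."""
    return type(lhs) is int and type(rhs) is float


# Hypothetical layout: one list of (guard, action) pairs per operator symbol.
SPEC_TABLE = {
    "&": [(both_exact_ints, operator.and_)],
    "|": [(both_exact_ints, operator.or_)],
    "^": [(both_exact_ints, operator.xor)],
    "+": [(both_exact_ints, operator.add), (int_and_float, operator.add)],
}


def lookup_spec(op: str, lhs, rhs):
    """Return the first action whose guard accepts the operands, else None."""
    for guard, action in SPEC_TABLE.get(op, ()):
        if guard(lhs, rhs):
            return action
    return None
```

Whether the pairs live in one nested structure or in several flat tables that are searched in turn, the lookup does the same work: find the entries for the operator, then test each guard in order.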

@chris-eibl
Contributor

It looks like we are seeing a high proportion of specialization failures for non-compact ints with the & operator. I don't know why that would be. My guess is that some of the benchmarks are using ints as bit vectors and using more than one digit.

Just a wild guess: could those be from enums, which derive from int? Especially flag enums?

@iritkatriel
Member Author

Just a wild guess: could those be from enums, which derive from int? Especially flag enums?

I don't think so. We check for int with PyLong_CheckExact.
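The distinction between an exact-type check and a subclass-tolerant check can be seen from Python. The sketch below uses `type(obj) is int` as a rough Python-level analogue of an exact check; the `Perm` enum and `is_exact_int` helper are hypothetical examples.

```python
import enum


class Perm(enum.IntFlag):
    """Hypothetical flag enum; its members are int subclass instances."""
    READ = 1
    WRITE = 2


def is_exact_int(obj) -> bool:
    """Rough analogue of an exact-type check: subclasses do not qualify."""
    return type(obj) is int


if __name__ == "__main__":
    print(isinstance(Perm.READ, int), is_exact_int(Perm.READ), is_exact_int(3))
```

Flag enum members pass `isinstance(..., int)` but fail the exact check, so they would not reach an int-only specialization in the first place.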

@eendebakpt
Contributor

I suspect the misses might be bm_pyflate:

https://github.com/python/pyperformance/blob/1d9261a7da8fcaa642a36181db8e7c4a306a1303/pyperformance/data-files/benchmarks/bm_pyflate/run_benchmark.py#L153-L163

Other Python constructs that would cause misses (but are not in pyperformance, as far as I can see) are uuid.uuid4 (and variants) and bitwise logical operations on the hash of Python objects.
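As a small illustration of the uuid case, the integer behind a version 4 UUID is a 128-bit value, so masking it involves a multi-digit operand. The helper `exceeds_one_digit` and the variable names are hypothetical; hash values are left out of the check because their magnitude varies by object and platform.

```python
import sys
import uuid

DIGIT_BITS = sys.int_info.bits_per_digit  # 30 on most builds


def exceeds_one_digit(n: int) -> bool:
    """Approximate check: True if abs(n) needs more than one internal digit."""
    return abs(n).bit_length() > DIGIT_BITS


# The version bits of a UUID4 sit high in the 128-bit value, so it is always wide.
uuid_value = uuid.uuid4().int
low_byte = uuid_value & 0xFF  # small result, but the left operand is wide

if __name__ == "__main__":
    print(uuid_value.bit_length(), exceeds_one_digit(uuid_value), low_byte)
```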

@iritkatriel
Member Author

(Would be nice if the report was rendered so that it's easy to see which benchmark contributed to a stat.)

@markshannon
Member

Do we have benchmarking numbers for this?

@iritkatriel
Member Author

iritkatriel commented Jan 28, 2025

Do we have benchmarking numbers for this?

https://github.com/faster-cpython/benchmarking-public/tree/main/results/bm-20250121-3.14.0a4%2B-6476205

(mdboom edited URL to the public one)

@markshannon
Member

Performance looks neutral within the noise.

Were we expecting a speedup on any particular benchmark, or is there a micro-benchmark that shows a speedup?

@iritkatriel
Member Author

Performance looks neutral within the noise.

Were we expecting a speedup on any particular benchmark, or is there a micro-benchmark that shows a speedup?

Here are some microbenchmark numbers:

Old:

>>> from timeit import timeit
>>> for i in range(5):
...     timeit("for i in range(10000):\n\tb = a&i", number=10000, setup="a = 1")
...
2.517115215305239
2.476386756170541
2.484560175333172
2.4904085099697113
2.4953759447671473

New:

>>> from timeit import timeit
>>> for i in range(5):
...     timeit("for i in range(10000):\n\tb = a&i", number=10000, setup="a = 1")
...
2.2215038347058
2.1580611928366125
2.1611814140342176
2.154480631928891
2.1730910362675786
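For readers who want a single figure from the two runs above, the timings can be averaged and compared. This is a minimal sketch using the ten numbers quoted in the comment; the helper name `relative_reduction` is hypothetical, and five runs per side is too few to say much about variance.

```python
from statistics import mean

# Timings copied from the "Old" and "New" runs above (seconds per timeit call).
old = [
    2.517115215305239,
    2.476386756170541,
    2.484560175333172,
    2.4904085099697113,
    2.4953759447671473,
]
new = [
    2.2215038347058,
    2.1580611928366125,
    2.1611814140342176,
    2.154480631928891,
    2.1730910362675786,
]


def relative_reduction(before, after) -> float:
    """Fraction by which the mean time dropped from before to after."""
    return (mean(before) - mean(after)) / mean(before)


if __name__ == "__main__":
    print(f"mean old: {mean(old):.3f}s, mean new: {mean(new):.3f}s")
    print(f"reduction: {relative_reduction(old, new):.1%}")
```

On these numbers the mean time for the `a & i` loop drops by roughly 13%, which includes the loop overhead that both versions share.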

@markshannon markshannon self-requested a review January 29, 2025 09:24
Member

@markshannon markshannon left a comment

My interpretation of the results is:

@iritkatriel do you agree?

If so, let's merge this.

@iritkatriel iritkatriel merged commit 4815131 into python:main Jan 29, 2025
61 checks passed
5 participants