Embrace the frozen set #3

Draft: wants to merge 94 commits into main

Conversation

nineteendo
Owner

@nineteendo nineteendo commented Jun 4, 2024

Frozen set literals and comprehensions

We should embrace the frozen set, literally: {{...}}.

Motivation

Currently you need too many characters to construct a frozen set:

foo = frozenset({1, 2, 3})
bar = frozenset({a, b, c})
baz = frozenset({c * 2 for c in "abc"}) 

Additionally, this is very inefficient:

>>> from dis import dis
>>> dis('foo = frozenset({1, 2, 3})')
  0           RESUME                   0

  1           LOAD_NAME                0 (frozenset)
              PUSH_NULL
              BUILD_SET                0
              LOAD_CONST               0 (frozenset({1, 2, 3}))
              SET_UPDATE               1
              CALL                     1
              STORE_NAME               1 (foo)
              RETURN_CONST             1 (None)
>>> dis('bar = frozenset({a, b, c})')
  0           RESUME                   0

  1           LOAD_NAME                0 (frozenset)
              PUSH_NULL
              LOAD_NAME                1 (a)
              LOAD_NAME                2 (b)
              LOAD_NAME                3 (c)
              BUILD_SET                3
              CALL                     1
              STORE_NAME               4 (bar)
              RETURN_CONST             0 (None)
>>> dis('baz = frozenset({c * 2 for c in "abc"})')
   0           RESUME                   0

   1           LOAD_NAME                0 (frozenset)
               PUSH_NULL
               LOAD_CONST               0 ('abc')
               GET_ITER
               LOAD_FAST_AND_CLEAR      0 (c)
               SWAP                     2
       L1:     BUILD_SET                0
               SWAP                     2
       L2:     FOR_ITER                 7 (to L3)
               STORE_FAST_LOAD_FAST     0 (c, c)
               LOAD_CONST               1 (2)
               BINARY_OP                5 (*)
               SET_ADD                  2
               JUMP_BACKWARD            9 (to L2)
       L3:     END_FOR
               POP_TOP
       L4:     SWAP                     2
               STORE_FAST               0 (c)
               CALL                     1
               STORE_NAME               1 (baz)
               RETURN_CONST             2 (None)

  --   L5:     SWAP                     2
               POP_TOP

   1           SWAP                     2
               STORE_FAST               0 (c)
               RERAISE                  0
ExceptionTable:
  L1 to L4 -> L5 [4]
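The overhead shown above can be checked on a current interpreter without reading the full disassembly. The sketch below is an illustration and not part of the pull request: `opnames` is a hypothetical helper, and because exact opcode names differ between CPython versions it only looks for a name lookup and some kind of call instruction.

```python
import dis


def opnames(source):
    """Return the opcode names CPython emits for a module-level statement."""
    code = compile(source, "<example>", "exec")
    return [instruction.opname for instruction in dis.get_instructions(code)]


names = opnames("foo = frozenset({1, 2, 3})")

# The name frozenset has to be looked up at runtime ...
assert "LOAD_NAME" in names
# ... and then called; the opcode is CALL or CALL_FUNCTION depending on the version.
assert any(name.startswith("CALL") for name in names)
```

The lookup cannot be optimised away by the compiler, since `frozenset` may be shadowed by another object at runtime.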

That's why it could be useful to have frozen set literals and comprehensions:

foo = {{1, 2, 3}}
bar = {{a, b, c}}
baz = {{c * 2 for c in "abc"}}

Then this is all we need to do:

>>> from dis import dis
>>> dis('foo = {{1, 2, 3}}')
  0           RESUME                   0

  1           LOAD_CONST               0 (frozenset({1, 2, 3}))
              STORE_NAME               0 (foo)
              RETURN_CONST             1 (None)
>>> dis('bar = {{a, b, c}}')
  0           RESUME                   0

  1           LOAD_NAME                0 (a)
              LOAD_NAME                1 (b)
              LOAD_NAME                2 (c)
              BUILD_FROZENSET          3
              STORE_NAME               3 (bar)
              RETURN_CONST             0 (None)
>>> dis('baz = {{c * 2 for c in "abc"}}')
   0           RESUME                   0

   1           LOAD_CONST               0 ('abc')
               GET_ITER
               LOAD_FAST_AND_CLEAR      0 (c)
               SWAP                     2
       L1:     BUILD_FROZENSET          0
               SWAP                     2
       L2:     FOR_ITER                 7 (to L3)
               STORE_FAST_LOAD_FAST     0 (c, c)
               LOAD_CONST               1 (2)
               BINARY_OP                5 (*)
               SET_ADD                  2
               JUMP_BACKWARD            9 (to L2)
       L3:     END_FOR
               POP_TOP
       L4:     SWAP                     2
               STORE_FAST               0 (c)
               STORE_NAME               0 (baz)
               RETURN_CONST             2 (None)

  --   L5:     SWAP                     2
               POP_TOP

   1           SWAP                     2
               STORE_FAST               0 (c)
               RERAISE                  0
ExceptionTable:
  L1 to L4 -> L5 [2]
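For comparison, CPython can already keep a frozenset as a code-object constant in one situation: a membership test against a set display made only of constants. The sketch below is CPython-specific and only an illustration; `frozen_constants` is a hypothetical helper that inspects `co_consts`.

```python
def frozen_constants(source):
    """Return the frozenset constants stored in the compiled expression."""
    code = compile(source, "<example>", "eval")
    return [const for const in code.co_consts if isinstance(const, frozenset)]


# The set display is replaced by a frozenset constant at compile time.
assert frozenset({1, 2, 3}) in frozen_constants("x in {1, 2, 3}")
```

A frozen set literal would make the same kind of constant available outside membership tests.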

Syntax

frozenset_display ::= "{{" (`starred_list` | `comprehension`) "}}"

Note

Technically the syntax is this:

frozenset_display ::= "{{" (`starred_list` | `comprehension`) "}" "}"

But that's an implementation detail that shouldn't be documented.
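For context, the tokenizer in current Python releases emits every brace as its own token, which is what lets nested displays such as `{1: {2: 3}}` close with two adjacent braces. A plausible reading (an assumption, not stated in the pull request) is that keeping the closing braces as two tokens avoids misreading such code. The sketch below only demonstrates today's behaviour; `op_tokens` is a hypothetical helper.

```python
import io
import tokenize


def op_tokens(source):
    """Return the operator and punctuation tokens of a source string."""
    readline = io.StringIO(source).readline
    return [
        token.string
        for token in tokenize.generate_tokens(readline)
        if token.type == tokenize.OP
    ]


# The two closing braces of a nested dict are separate tokens.
assert op_tokens("{1: {2: 3}}") == ["{", ":", "{", ":", "}", "}"]
```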

Examples

Example 1

assert {{1, 2, 3}}         == frozenset({1, 2, 3})
assert { {{1, 2, 3}} }     == {frozenset({1, 2, 3})}
assert {{{{1, 2, 3}}}}     == frozenset({frozenset({1, 2, 3})})
assert { {{{{1, 2, 3}}}} } == {frozenset({frozenset({1, 2, 3})})}
...

Note

These are still type errors:

foo = { {1, 2, 3} }     # TypeError: unhashable type: 'set'
bar = {{{1, 2, 3}}}     # TypeError: unhashable type: 'set'
baz = {{ {1, 2, 3} }}   # TypeError: unhashable type: 'set'
qux = { { {1, 2, 3} } } # TypeError: unhashable type: 'set'
...

And these are now syntax errors:

foo = {{}}
bar = {{"a": 1, "b": 2, "c": 3}}

Example 2

assert {{c * 2 for c in "abc"}} == frozenset({c * 2 for c in "abc"})

Note

This is still a type error:

foo = { {c * 2 for c in "abc"} } # TypeError: unhashable type: 'set'

And this is now a syntax error:

foo = {{c: c * 2 for c in "abc"}}

Backwards compatibility

These statements currently raise a type error and would behave differently with this proposal:

foo = {{1, 2, 3}}          # TypeError: unhashable type: 'set'
bar = { {{1, 2, 3}} }      # TypeError: unhashable type: 'set'
baz = {{{{1, 2, 3}}}}      # TypeError: unhashable type: 'set'
qux = { {{{{1, 2, 3}}}} }  # TypeError: unhashable type: 'set'
...
foo = {{c * 2 for c in "abc"}}   # TypeError: unhashable type: 'set'

But I don't think they can be used for anything useful; -'' is a shorter way to raise a type error.
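The current behaviour can be confirmed on an interpreter without this proposal. The sketch below is an illustration only; `error_name` is a hypothetical helper that evaluates an expression and reports which exception it raises.

```python
def error_name(source):
    """Evaluate an expression and return the name of the exception it raises, if any."""
    try:
        eval(source)
    except Exception as exc:
        return type(exc).__name__
    return None


affected = [
    "{{1, 2, 3}}",
    "{ {{1, 2, 3}} }",
    "{{{{1, 2, 3}}}}",
    '{{c * 2 for c in "abc"}}',
]

# Without the proposal, each expression tries to hash a set and fails.
for source in affected:
    assert error_name(source) == "TypeError"

# The shorter way to raise the same exception type.
assert error_name("-''") == "TypeError"
```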

Pros

  • Symmetrical and only uses brackets
  • Doesn't raise syntax error in previous versions
  • Doesn't reserve any new syntax
  • Double braces intuitively read as a hardened set
  • Some other languages already give special meaning to {{...}}
  • Some text editors already support embracing selected text

Cons

  • Potentially breaks backwards compatibility
  • Double punctuation isn't Pythonic, although there's a precedent with triple quotes: '''''+'''''
  • Suffers from brace overflow
  • Requires a lot of changes to tokenisation and string representations

GitHub usage

Other suggestions for frozenset literals


{1, 2, 3}.freeze()

Example:

assert {1, 2, 3}.freeze() == frozenset({1, 2, 3})

Pros:

  • Intuitive
  • Doesn't raise syntax error in previous versions

Cons:

  • Hard to maintain
  • Unclear that it wouldn't be copied at runtime
  • Doesn't improve representation

{1, 2, 3}

Example:

assert {1, 2, 3} == frozenset({1, 2, 3})
assert {1, 2, 3} != set({1, 2, 3})

Pros:

  • Symmetrical and only uses brackets

Cons:

  • Not backwards compatible
  • Inconsistent way to get mutable collections
  • No unambiguous notation for set literals

|1, 2, 3|

Example:

assert |1, 2, 3| == frozenset({1, 2, 3})

Cons:

  • Not directional (opening and closing delimiters are identical)
  • Can't keep track of nesting

<1, 2, 3>

Example:

assert <1, 2, 3> == frozenset({1, 2, 3})

Pros:

  • Symmetrical and only uses brackets

Cons:

  • Can't keep track of nesting
  • Hard to read
  • Already used as operator

f{1, 2, 3}

Example:

assert f{1, 2, 3} == frozenset({1, 2, 3})

Pros:

  • Prefix can be easily added and removed

Cons:

  • Rules out possibility of foo{...}
  • Not obvious
  • Looks like slicing operator or function call
  • Endless arguing over s{} for empty set

{{1, 2, 3}}

Example:

assert {{1, 2, 3}}       == frozenset({1, 2, 3})
assert {{{1, 2, 3}}}     == {frozenset({1, 2, 3})}
assert {{{{1, 2, 3}}}}   == frozenset({frozenset({1, 2, 3})})
assert {{{{{1, 2, 3}}}}} == {frozenset({frozenset({1, 2, 3})})}
...

Pros:

  • Symmetrical and only uses brackets
  • Doesn't raise syntax error in previous versions
  • Doesn't reserve any new syntax
  • Double braces intuitively read as a hardened set
  • Some other languages already give special meaning to {{...}}
  • Some text editors already support embracing selected text

Cons:

  • Potentially breaks backwards compatibility
  • Hard to read and parse
  • Double punctuation isn't Pythonic, although there's a precedent with triple quotes: '''''+'''''
  • Suffers from brace overflow
  • Requires a lot of changes to tokenisation and string representations

|{1, 2, 3}|

Example:

assert |{1, 2, 3}| == frozenset({1, 2, 3})

Cons:

  • PEP 351 was rejected
  • Not directional (opening and closing delimiters are identical)

Links

  1. https://peps.python.org/pep-0351
  2. https://mail.python.org/pipermail/python-3000/2008-January/thread.html#11798
  3. https://mail.python.org/archives/list/[email protected]/thread/MVIIUMQZYTTSGZSYJFGKPHTOF5Y4RI6I
  4. https://mail.python.org/archives/list/[email protected]/thread/AMWKPS54ZK6X2FI7NICDM6DG7LERIJFV
  5. https://mail.python.org/archives/list/[email protected]/thread/SOGSM2KVVNYLD2U2EUJHOPZW7BUNOOF2
  6. https://mail.python.org/archives/list/[email protected]/thread/M6TMP3HRNA7HHF2S6R4VCZCTRDZ4W6WX
  7. https://mail.python.org/archives/list/[email protected]/thread/GRMNMWUQXG67PXXNZ4W7W27AQTCB6UQQ
  8. https://discuss.python.org/t/make-using-immutable-datatypes-more-pleasant-by-adding-a-little-syntactic-sugar/23588
  9. https://discuss.python.org/t/alternative-call-syntax/53126
  10. https://discuss.python.org/t/frozen-set-literals/53489

📚 Documentation preview 📚: https://cpython-previews--3.org.readthedocs.build/

@nineteendo
Owner Author

nineteendo commented Jun 5, 2024

@blhsing, I managed to implement frozen set literals. If the syntax needs to be modified, that's very easy; all other code is already in place. I'll discuss this in 2025 on Discourse.

@blhsing

blhsing commented Jun 5, 2024

@blhsing, I managed to implement frozen set literals. If the syntax needs to be modified, that's very easy; all other code is already in place. I'll discuss this in 2025 on Discourse.

Great work. Looks very good to me so far. Will try to poke holes in it later.

@nineteendo nineteendo changed the title Frozen set literals and comprehensions Embrace the frozen set Jun 12, 2024
@nineteendo
Owner Author

nineteendo commented Jun 12, 2024

I made it symmetrical: { {{1}}: 1 }. I didn't like how it looked before. It also simplified the implementation.

@nineteendo
Owner Author

Should we use frozenset literals in the representation of subclasses? Using them is faster than regular sets.

>>> class Set(set): pass
... 
>>> Set({1, 2, 3})
Set({1, 2, 3}) # use Set({{1, 2, 3}}) instead

It also simplifies the logic a bit.
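For reference, this is how the representations look today. The sketch is an illustration only and `FrozenSet` is a hypothetical subclass; under the suggestion above the inner display would become `{{1, 2, 3}}`.

```python
class Set(set):
    pass


class FrozenSet(frozenset):
    pass


# Current reprs wrap a regular set display in the class name.
assert repr(Set({1, 2, 3})) == "Set({1, 2, 3})"
assert repr(FrozenSet({1, 2, 3})) == "FrozenSet({1, 2, 3})"
assert repr(frozenset({1, 2, 3})) == "frozenset({1, 2, 3})"
```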

@nineteendo
Owner Author

nineteendo commented Jun 12, 2024

I've used the new syntax throughout the standard library (excluding tests). That made me find a library I forgot to update.

@nineteendo
Owner Author

nineteendo commented Jun 12, 2024

By changing the format of frozenset's repr, we would break tons of existing unit tests and decrease the chances of the proposal getting accepted. It would be best to roll back all my changes then.

The pull request is already way too large to review, so I'll have to split it up anyway. But this way we can show what the Python codebase could look like in a couple years and I'll have less work to figure things out if we decide to implement it anyway. We might be able to do this with a __future__ import though.

I had no idea { {} } is allowed in a type hint, and could not find any documentation about it even after you pointed it out. Can you show me where this behavior is documented, and what it means to have { {} } as a type hint (I see that it is somehow evaluated as a string rather than a set)?

It's not, but I didn't like the fact that this would raise a syntax error when evaluated and that { {1} } would result in a frozen set instead of a type error.

Could you ask for some initial feedback on Discourse in the help category? (My account was suspended)

@blhsing

blhsing commented Jun 13, 2024

By changing the format of frozenset's repr, we would break tons of existing unit tests and decrease the chances of the proposal getting accepted. It would be best to roll back all my changes then.

The pull request is already way too large to review, so I'll have to split it up anyway. But this way we can show what the Python codebase could look like in a couple years and I'll have less work to figure things out if we decide to implement it anyway. We might be able to do this with a __future__ import though.

Fair enough. Perhaps the old repr output can go through a deprecation process to give people time to update their codebase.

I had no idea { {} } is allowed in a type hint, and could not find any documentation about it even after you pointed it out. Can you show me where this behavior is documented, and what it means to have { {} } as a type hint (I see that it is somehow evaluated as a string rather than a set)?

It's not, but I didn't like the fact that this would raise a syntax error when evaluated and that { {1} } would result in a frozen set instead of a type error.

Could you ask for some initial feedback on Discourse in the help category? (My account was suspended)

Ahh I've just found where this behavior is documented. It's called postponed evaluation of annotations as documented in PEP 563, where with from __future__ import annotations all annotations become strings at runtime as long as they're syntactically valid.
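A minimal sketch of that behaviour, as an illustration only: the function `f` and the `namespace` dictionary are hypothetical, and the source is run through `exec` because the `__future__` import has to be the first statement of its module.

```python
source = """
from __future__ import annotations

def f(x: { {} }):
    pass
"""

namespace = {}
exec(source, namespace)

# With postponed evaluation the annotation is stored as a string and never evaluated,
# so the unhashable dict inside the set display does not raise.
annotation = namespace["f"].__annotations__["x"]
assert isinstance(annotation, str)
```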

Sorry to hear about your suspended account by the way. I was wondering why you have to wait until 2025 to present this idea but I suppose that's when your account can be active again.

@blhsing

blhsing commented Jun 13, 2024

Should we use frozenset literals in the representation of subclasses? Using them is faster than regular sets.

>>> class Set(set): pass
... 
>>> Set({1, 2, 3})
Set({1, 2, 3}) # use Set({{1, 2, 3}}) instead

It also simplifies the logic a bit.

That makes total sense. Still slightly annoyed that frozenset literals have to cost 2 more characters but the benefit in simplified bytecode is clearly there.

@blhsing

blhsing commented Jun 13, 2024

I've used the new syntax throughout the standard library (excluding tests). That made me find a library I forgot to update.

Great job finding usages in the stdlib, signifying the usefulness of the change.

@nineteendo
Owner Author

Sorry to hear about your suspended account by the way. I was wondering why you have to wait until 2025 to present this idea but I suppose that's when your account can be active again.

My account is suspended until September, but because I made too many half-baked suggestions, I was asked to wait until 2025.

@nineteendo
Owner Author

nineteendo commented Jul 14, 2024

@Gouvernathor, would it make sense to already think about frozendict literals? Because changing the repr in the next version would be a breaking change. Most of what I did in this proof of concept could be adjusted for frozen dict literals. And the PEP could be called: "Embrace the frozen dict" (literally).

It might also be better to add frozen dict literals first. Otherwise people will ask for {{}} to create an empty frozen set.

@Gouvernathor

I'm not sure how I feel about this proposition in itself. I'd be intuitively in favor of an f{} syntax, but if the consensus is this way, I can get behind it.
Aside from that, since {} is already taken by dicts, it makes no sense to use {{}} for frozen sets; even if frozen dicts never happen, it would just make the language inconsistent. And you can indeed argue that frozen dicts may be coming, in order to "reserve" the empty double-brace syntax.
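The point about empty braces mirrors how regular sets already work today. A short sketch of the current behaviour, as an illustration only (the variable names are hypothetical):

```python
empty_braces = {}

# Empty braces are a dict, so sets have no empty display of their own.
assert type(empty_braces) is dict

# Empty sets and frozen sets are created through their constructors instead.
empty_set = set()
empty_frozenset = frozenset()
assert type(empty_set) is set and len(empty_set) == 0
assert type(empty_frozenset) is frozenset and len(empty_frozenset) == 0
```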

@nineteendo
Owner Author

I'm not sure how I feel about this proposition in itself. I'd be intuitively in favor of an f{} syntax, but if the consensus is this way, I can get behind it.

Opinions vary: some find it intuitive, others find it very unintuitive. From what I have gathered, there are simply fewer issues with {{...}} (see Other suggestions for frozenset literals). I'm planning to do a poll in January.

3 participants