Make struct.Struct() really immutable #143715

@skirpichev

Bug report

Bug description:

The Struct constructor permits the creation of half-initialized Struct objects, e.g.:

>>> from _struct import Struct
>>> s = Struct.__new__(Struct)
>>> s.unpack_from(b'boo!')  # this might be a crash!
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    s.unpack_from(b'boo!')
    ~~~~~~~~~~~~~^^^^^^^^^
SystemError: Objects/tupleobject.c:40: bad argument to internal function
>>> s = Struct.__new__(Struct, 1, 2, 3)  # anything is accepted
>>> s.unpack_from(b'boo!')
Traceback (most recent call last):
  File "<python-input-6>", line 1, in <module>
    s.unpack_from(b'boo!')
    ~~~~~~~~~~~~~^^^^^^^^^
SystemError: Objects/tupleobject.c:40: bad argument to internal function

cf.:

>>> int.__new__(int, 1, 2, 3)
Traceback (most recent call last):
  File "<python-input-7>", line 1, in <module>
    int.__new__(int, 1, 2, 3)
    ~~~~~~~~~~~^^^^^^^^^^^^^^
TypeError: int() takes at most 2 arguments (3 given)

The Struct.__new__() dunder handles only memory allocation; everything else is left to Struct.__init__(). That doesn't make sense for an immutable type (which Struct in fact pretends to be), and it introduces a number of issues.

The proper way to fix all this is probably to move all initialization logic into the Struct.__new__() dunder:

It is more convenient to initialize the Struct instance in __new__ than in __init__, and it makes sense, since Struct instances are cached and therefore can be considered immutable like ints or tuples. But the possibility of creating subclasses and the existence of subclasses in the wild makes this a breaking change.

Originally posted by @serhiy-storchaka in #112358

From docs:

A good rule of thumb is that for immutable types, all initialization should take place in tp_new, while for mutable types, most initialization should be deferred to tp_init.
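As a rough pure-Python analogy of that rule of thumb (the class name `Celsius` is hypothetical and unrelated to the `_struct` C code), an immutable type that does all its work in `__new__` cannot be obtained in a half-initialized state, because `__new__` itself rejects a call without the required argument:

```python
class Celsius:
    """Hypothetical immutable value type; the name is illustrative only."""

    __slots__ = ("_degrees",)

    def __new__(cls, degrees):
        # All initialization happens here, so every instance that exists
        # is already fully initialized.
        self = super().__new__(cls)
        # Bypass the read-only __setattr__ defined below.
        object.__setattr__(self, "_degrees", float(degrees))
        return self

    def __setattr__(self, name, value):
        raise AttributeError(f"{type(self).__name__} objects are immutable")

    @property
    def degrees(self):
        return self._degrees


if __name__ == "__main__":
    c = Celsius(21.5)
    print(c.degrees)
    try:
        Celsius.__new__(Celsius)  # no argument: rejected by __new__ itself
    except TypeError as exc:
        print("TypeError:", exc)
```

This mirrors the `int.__new__(int, 1, 2, 3)` example above: argument validation lives in `__new__`, so there is no window in which an uninitialized object is visible.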

This was done in #94532, which was then reverted because of the breakage it introduced (#112358).

I propose:

  1. deprecate repeated calls of Struct.__init__() on an already initialized Struct (this will eventually be a no-op)
  2. move all initialization logic to Struct.__new__(), and make self.__init__() a no-op if __new__() got one argument
  3. deprecate calls of Struct.__new__() without the required argument.

The Struct.__init__() dunder method will be removed in CPython 3.20. I suggest closing all open issues referenced above as duplicates of this one.
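The three proposed steps could be modelled in pure Python roughly as follows. This is only a sketch under assumptions: `ModelStruct`, its `_impl` and `_init_pending` attributes, the warning messages, and the `RuntimeError` are all hypothetical, and the real change would live in the C implementation. In particular, the model assumes that a repeated `__init__()` call still re-initializes the object during the deprecation period, which the proposal does not spell out.

```python
import struct
import warnings


class ModelStruct:
    """Hypothetical pure-Python model of the proposal; not the _struct C code."""

    __slots__ = ("_impl", "_init_pending")

    def __new__(cls, format=None):
        self = super().__new__(cls)
        if format is None:
            # Step 3: creating an instance without a format is deprecated.
            warnings.warn("__new__() without a format is deprecated",
                          DeprecationWarning, stacklevel=2)
            self._impl = None
        else:
            # Step 2: all real initialization happens in __new__.
            self._impl = struct.Struct(format)
        # The first __init__ call after __new__ is still expected.
        self._init_pending = True
        return self

    def __init__(self, format=None):
        first_call = self._init_pending
        self._init_pending = False
        if self._impl is None:
            # Legacy two-step construction: __new__ got no format.
            if format is not None:
                self._impl = struct.Struct(format)
            return
        if first_call:
            # Step 2: __new__ already did the work, so this is a no-op.
            return
        # Step 1: re-initializing an initialized object is deprecated.
        warnings.warn("repeated __init__() call is deprecated",
                      DeprecationWarning, stacklevel=2)
        if format is not None:
            # Assumption: old behaviour is kept while the deprecation lasts.
            self._impl = struct.Struct(format)

    def unpack_from(self, buffer, offset=0):
        if self._impl is None:
            raise RuntimeError("object is not initialized")
        return self._impl.unpack_from(buffer, offset)


if __name__ == "__main__":
    warnings.simplefilter("always")
    s = ModelStruct("<i")                     # normal path, no warning
    print(s.unpack_from(b"\x01\x00\x00\x00"))
    s.__init__("<h")                          # warns (step 1)
    t = ModelStruct.__new__(ModelStruct)      # warns (step 3)
```

The `_init_pending` flag is just one way for the model to tell the implicit `__init__()` call made during normal construction apart from a later, explicit one.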

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Labels

extension-modules: C modules in the Modules dir
type-bug: An unexpected behavior, bug, or error
