python/cpython

Code objects have incorrect hash/equality when code is instrumented for sys.monitoring #111984

Closed

Labels

3.12 (only security fixes), type-bug (An unexpected behavior, bug, or error)

Description

nedbat opened on Nov 11, 2023

Bug report

Bug description:

A thread stress test in the coverage.py test suite was failing with seemingly impossible behavior: a code object is used as a key in a dict, and then cannot be found. I eventually tracked it down to a mismatch between how bytecodes are examined in code_hash and in code_richcompare.

I have instructions for demonstrating the problem if you like, though it's a thread race condition, so it's a bit involved: https://gist.github.com/nedbat/2fb43527d102b8bb6e7fb7f3400fa3a9

Hash computation

When computing the hash, each opcode is checked for instrumentation, and if it is instrumented, the original opcode is used instead:

```c
static Py_hash_t
code_hash(PyCodeObject *co)
{
    // ...
    for (int i = 0; i < Py_SIZE(co); i++) {
        int deop = _Py_GetBaseOpcode(co, i);
        SCRAMBLE_IN(deop);
        SCRAMBLE_IN(_PyCode_CODE(co)[i].op.arg);
        i += _PyOpcode_Caches[deop];
    }
```

```c
/* Get the underlying opcode, stripping instrumentation */
int _Py_GetBaseOpcode(PyCodeObject *code, int i)
{
    int opcode = _PyCode_CODE(code)[i].op.code;
    if (opcode == INSTRUMENTED_LINE) {
        opcode = code->_co_monitoring->lines[i].original_opcode;
    }
    if (opcode == INSTRUMENTED_INSTRUCTION) {
        opcode = code->_co_monitoring->per_instruction_opcodes[i];
    }
    CHECK(opcode != INSTRUMENTED_INSTRUCTION);
    CHECK(opcode != INSTRUMENTED_LINE);
    int deinstrumented = DE_INSTRUMENT[opcode];
    if (deinstrumented) {
        return deinstrumented;
    }
    return _PyOpcode_Deopt[opcode];
}
```

```c
#define _PyCode_CODE(CO) _Py_RVALUE((_Py_CODEUNIT *)(CO)->co_code_adaptive)
```
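To make the de-instrumentation step concrete, here is a toy Python model of the lookup that `_Py_GetBaseOpcode` performs. The opcode numbers, the `ToyMonitoring` side table, and the `toy_code_hash` helper are invented for illustration and are not CPython's real tables or API; only the order of the checks follows the C excerpt above. Inline cache entries and opargs are ignored.

```python
from dataclasses import dataclass, field

# Hypothetical opcode numbers, chosen only for this sketch.
LOAD_FAST = 1
RETURN_VALUE = 2
CALL = 3
INSTRUMENTED_CALL = 100
INSTRUMENTED_LINE = 101
INSTRUMENTED_INSTRUCTION = 102

# Toy stand-in for DE_INSTRUMENT: instrumented opcode -> base opcode.
DE_INSTRUMENT = {INSTRUMENTED_CALL: CALL}


@dataclass
class ToyMonitoring:
    # Saved opcodes keyed by instruction offset (toy version of _co_monitoring).
    line_original: dict = field(default_factory=dict)
    per_instruction: dict = field(default_factory=dict)


def get_base_opcode(opcodes, monitoring, i):
    """Return the opcode at offset i with instrumentation stripped."""
    opcode = opcodes[i]
    if opcode == INSTRUMENTED_LINE:
        opcode = monitoring.line_original[i]
    if opcode == INSTRUMENTED_INSTRUCTION:
        opcode = monitoring.per_instruction[i]
    assert opcode not in (INSTRUMENTED_LINE, INSTRUMENTED_INSTRUCTION)
    # The C function also maps specialized opcodes through _PyOpcode_Deopt;
    # this toy has no specialized opcodes, so that step is the identity.
    return DE_INSTRUMENT.get(opcode, opcode)


def toy_code_hash(opcodes, monitoring):
    """Hash the base opcodes, so instrumentation does not change the result."""
    return hash(tuple(get_base_opcode(opcodes, monitoring, i)
                      for i in range(len(opcodes))))


plain = [LOAD_FAST, CALL, RETURN_VALUE]
instrumented = [INSTRUMENTED_LINE, INSTRUMENTED_CALL, RETURN_VALUE]
mon = ToyMonitoring(line_original={0: LOAD_FAST})

# Stripping instrumentation recovers the plain opcode sequence.
base = [get_base_opcode(instrumented, mon, i) for i in range(len(instrumented))]
same_hash = toy_code_hash(plain, ToyMonitoring()) == toy_code_hash(instrumented, mon)
```

In this model the hash of the instrumented sequence equals the hash of the plain one, which is the behavior the hash excerpt aims for.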

Equality

When checking equality, opcodes are not checked for instrumentation; the raw opcodes are compared:

```c
static PyObject *
code_richcompare(PyObject *self, PyObject *other, int op)
{
    // ...
    for (int i = 0; i < Py_SIZE(co); i++) {
        _Py_CODEUNIT co_instr = _PyCode_CODE(co)[i];
        _Py_CODEUNIT cp_instr = _PyCode_CODE(cp)[i];
        co_instr.op.code = _PyOpcode_Deopt[co_instr.op.code];
        cp_instr.op.code = _PyOpcode_Deopt[cp_instr.op.code];
        eq = co_instr.cache == cp_instr.cache;
        if (!eq) {
            goto unequal;
        }
        i += _PyOpcode_Caches[co_instr.op.code];
    }
```

It seems like the hash and the equality will be well matched if the code hasn't been instrumented. But if it has been, then they don't match.
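One way such a mismatch can make a dict key unfindable can be reproduced with a toy class. `ToyCode`, its opcode numbers, and its `instrument()` method are hypothetical; the class models only the asymmetry described above (hash over base opcodes, equality over raw opcodes), not CPython's actual code objects or the exact race in the coverage.py test. The `ConsistentToyCode` variant sketches one way to restore the invariant, by stripping instrumentation in the equality check too; it is not necessarily the fix applied in CPython.

```python
CALL = 3
INSTRUMENTED_CALL = 100
BASE = {INSTRUMENTED_CALL: CALL}  # hypothetical de-instrumentation table


class ToyCode:
    """Hashes base opcodes but compares raw opcodes, like the reported bug."""

    def __init__(self, opcodes):
        self.opcodes = list(opcodes)

    def instrument(self):
        # Swap each CALL for its instrumented form, in place.
        self.opcodes = [INSTRUMENTED_CALL if op == CALL else op
                        for op in self.opcodes]

    def base_opcodes(self):
        return tuple(BASE.get(op, op) for op in self.opcodes)

    def __hash__(self):
        return hash(self.base_opcodes())

    def __eq__(self, other):
        if not isinstance(other, ToyCode):
            return NotImplemented
        return self.opcodes == other.opcodes  # raw comparison


class ConsistentToyCode(ToyCode):
    """Same hash, but equality also strips instrumentation."""

    def __eq__(self, other):
        if not isinstance(other, ToyCode):
            return NotImplemented
        return self.base_opcodes() == other.base_opcodes()

    # Defining __eq__ in a class body resets __hash__ to None, so restore it.
    __hash__ = ToyCode.__hash__


def lookup_after_instrumenting(cls):
    """Return (found before instrumenting, found after instrumenting)."""
    stored, probe = cls([1, CALL, 2]), cls([1, CALL, 2])
    table = {stored: "value"}
    found_before = probe in table
    stored.instrument()  # the stored key is instrumented while in the dict
    return found_before, probe in table


buggy_result = lookup_after_instrumenting(ToyCode)            # (True, False)
fixed_result = lookup_after_instrumenting(ConsistentToyCode)  # (True, True)
```

With `ToyCode`, the probe lands in the right hash bucket both times, but after instrumentation the raw comparison fails and the lookup misses. With `ConsistentToyCode`, hash and equality look at the same de-instrumented opcodes, so the lookup keeps succeeding.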

CPython versions tested on:

3.12

Operating systems tested on:

No response
