Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork34k
gh-143196: fix heap-buffer-overflow in JSON encoder indent cache#143246
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Changes fromall commits
File filter
Filter by extension
Conversations
Uh oh!
There was an error while loading.Please reload this page.
Jump to
Uh oh!
There was an error while loading.Please reload this page.
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -1319,14 +1319,47 @@ encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) | ||||||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| /* Convert indent to str if it's not None or already a string */ | ||||||||||||||||||||||||||
| if (indent != Py_None && !PyUnicode_Check(indent)) { | ||||||||||||||||||||||||||
Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. Instead of this complex checks (argument 1 also is checked) I'd suggest that we convert Author There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. Performance considerationsFor context, my original proposal was to use a simpler change staticPyObject*encoder_new(PyTypeObject*type,PyObject*args,PyObject*kwds){ ...if (indent!=Py_None&& !PyUnicode_Check(indent)) {PyErr_Format(PyExc_TypeError,"make_encoder() argument 4 must be str or None, ""not %.200s",Py_TYPE(indent)->tp_name);returnNULL; } ...} Let me call that After comment/suggestion in the reported issue I implemented the unidecode check with the additional logical statements (that is the current branch), let me call that And finally we have start=time.perf_counter()json.dumps(data,indent=2)end=time.perf_counter() These are the results
My local is a bit outdated, and I'm pretty confident these numbers can reduced in a faster cpu, but I do believe that performance is not impacted, or at least my changes are statistically consistent with Using the argument clinicI'm rabbit-holing a bit into that approach, not entirely sure yet how it works. But my take on it: it would require some major changes to the module, that I don't feel comfortable messing with. If that is the desired path, I will be happy to document my findings in the reported issue, and leave it to more expert/capable hands to complete. "How many times do we call this function?"I am not entirely sure what do you mean with this question, sorry 😢 it's called once per encoding operation. "We" don't call it, the user does and, as I showed above, they are not going to experience any measurable changes in performance. My apologies again if I misunderstood your question Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. The "we" was a generic "we" (academical one). But you answered my question well. I wondered whether Conversion to AC may introduce an overhead in how the arguments are pre-processed but if you're not comfortable with this, we can first fix this (and backport it) and then switch to AC for | ||||||||||||||||||||||||||
| Py_ssize_t indent_level; | ||||||||||||||||||||||||||
| if (!PyIndex_Check(indent)) { | ||||||||||||||||||||||||||
| PyErr_Format(PyExc_TypeError, | ||||||||||||||||||||||||||
| "make_encoder() argument 4 must be str, int, or None, " | ||||||||||||||||||||||||||
| "not %.200s", Py_TYPE(indent)->tp_name); | ||||||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| indent_level = PyNumber_AsSsize_t(indent, PyExc_ValueError); | ||||||||||||||||||||||||||
| if (indent_level == -1 && PyErr_Occurred()) { | ||||||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| if (indent_level < 0) { | ||||||||||||||||||||||||||
| PyErr_SetString(PyExc_ValueError, | ||||||||||||||||||||||||||
| "make_encoder() argument 4 must be a non-negative int"); | ||||||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| /* Create a string of spaces: ' ' * indent_level */ | ||||||||||||||||||||||||||
| indent = PyUnicode_New(indent_level, ' '); | ||||||||||||||||||||||||||
| if (indent == NULL) { | ||||||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| if (indent_level > 0) { | ||||||||||||||||||||||||||
| memset(PyUnicode_1BYTE_DATA(indent), ' ', indent_level); | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
Comment on lines +1345 to +1347 Member There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others.Learn more. Suggested change
| ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| else { | ||||||||||||||||||||||||||
| Py_INCREF(indent); | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| s = (PyEncoderObject *)type->tp_alloc(type, 0); | ||||||||||||||||||||||||||
| if (s == NULL) { | ||||||||||||||||||||||||||
| Py_DECREF(indent); | ||||||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| s->markers = Py_NewRef(markers); | ||||||||||||||||||||||||||
| s->defaultfn = Py_NewRef(defaultfn); | ||||||||||||||||||||||||||
| s->encoder = Py_NewRef(encoder); | ||||||||||||||||||||||||||
| s->indent = indent; /* Already incref'd or newly created */ | ||||||||||||||||||||||||||
| s->key_separator = Py_NewRef(key_separator); | ||||||||||||||||||||||||||
| s->item_separator = Py_NewRef(item_separator); | ||||||||||||||||||||||||||
| s->sort_keys = sort_keys; | ||||||||||||||||||||||||||
| @@ -1453,6 +1486,12 @@ encoder_call(PyObject *op, PyObject *args, PyObject *kwds) | ||||||||||||||||||||||||||
| PyObject *indent_cache = NULL; | ||||||||||||||||||||||||||
| if (self->indent != Py_None) { | ||||||||||||||||||||||||||
| if (indent_level != 0) { | ||||||||||||||||||||||||||
| PyErr_SetString(PyExc_ValueError, | ||||||||||||||||||||||||||
| "_current_indent_level must be 0 when indent is set"); | ||||||||||||||||||||||||||
| PyUnicodeWriter_Discard(writer); | ||||||||||||||||||||||||||
| return NULL; | ||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||
| indent_cache = create_indent_cache(self, indent_level); | ||||||||||||||||||||||||||
| if (indent_cache == NULL) { | ||||||||||||||||||||||||||
| PyUnicodeWriter_Discard(writer); | ||||||||||||||||||||||||||
Uh oh!
There was an error while loading.Please reload this page.