Movatterモバイル変換


[0]ホーム

URL:


Comparison of cyclic objects (was RE: [Python-Dev] trashcan and PR#7)

Jeremy Hyltonjeremy@cnri.reston.va.us
Thu, 13 Apr 2000 15:08:12 -0400 (EDT)


>>>>> "TP" == Tim Peters <tim_one@email.msn.com> writes:  TP> [Jeremy Hylton]>  >> So the real problem is defining some reasonable semantics for  >> comparison of recursive objects.  TP> I think this is exactly a graph isomorphism problem, since  TP> Python always compares "by value" (so isomorphism is the natural  TP> generalization).I'm not familiar with any algorithms for the graph isomorphismproblem, but I took a stab at a simple comparison algorithm.  The ideais to detect comparisons that would cross back-edges in the objectgraphs.  Instead of starting a new comparison, assume they are thesame.  If, in fact, the objects are not the same, they must differ insome other way; some other part of the comparison will fail.  TP> This isn't hard (!= tedious, alas) to define or to implement  TP> naively, but a straightforward implementation would be very  TP> expensive at runtime compared to the status quo.  That's why  TP> "real languages" <wink> would rather suffer an infinite loop.  TP> It's expensive because there's no cheap way to know whether you  TP> have a loop in an object.My first attempt at implementing this is expensive.  I maintain adictionary that contains all the object pairs that are currently being compared.  Specifically, the dictionary is used to implement a set ofobject id pairs.  Every call to PyObject_Compare will add a newpair to the dictionary when it is called and remove it when it returns (except for a few trivial cases).A naive patch is included below.  It does seem to involve a bigperformance hit -- more than 10% slower on pystone.  It also uses alot of extra space.Note that the patch has all its initialization code inline inPyObject_Compare; moving that elsewhere will help a little.  It alsouse a bunch of function calls where macros would be more efficient.  TP> An anal compromise would be to run comparisons full speed  TP> without trying to detect loops, but if the recursion got "too  TP> deep" break out and start over with an expensive alternative  TP> that does check for loops.  The later requires machinery similar  TP> to copy.deepcopy's.It looks like the anal compromise might be necessary.  I'llre-implement the patch more carefully and see what the realeffect on performance is.JeremyIndex: object.c===================================================================RCS file: /projects/cvsroot/python/dist/src/Objects/object.c,vretrieving revision 2.67diff -r2.67 object.c239c239<                                    "__repr__ returned non-string (type %.200s)",--->                                    "__repr__ returned non-string (type %s)",276c276<                            "__str__ returned non-string (type %.200s)",--->                            "__str__ returned non-string (type %s)",300a301,328> static PyObject *cmp_state_key = NULL;>> static PyObject*> cmp_state_make_pair(v, w)>       PyObject *v, *w;> {>       PyObject *pair = PyTuple_New(2);>       if (pair == NULL)>               return NULL;>       if ((long)v <= (long)w) {>               PyTuple_SET_ITEM(pair, 0, PyInt_FromLong((long)v));>               PyTuple_SET_ITEM(pair, 1, PyInt_FromLong((long)w));>       } else {>               PyTuple_SET_ITEM(pair, 0, PyInt_FromLong((long)w));>               PyTuple_SET_ITEM(pair, 1, PyInt_FromLong((long)v));>       }>       return pair;> }>> void> cmp_state_clear_pair(dict, key)>       PyObject *dict, *key;> {>       PyDict_DelItem(dict, key);>       Py_DECREF(key);> }>>305a334,336>       PyObject *tstate_dict, *cmp_dict, *pair;>       int result;>311a343,376>       tstate_dict = PyThreadState_GetDict();>       if (tstate_dict == NULL) {>               PyErr_BadInternalCall();>               return -1;>       }> /*    fprintf(stderr, "PyObject_Compare(%X: %s, %X: %s)\n", (long)v,>               v->ob_type->tp_name, (long)w, w->ob_type->tp_name);> */>       /* XXX should initialize elsewhere */>       if (cmp_state_key == NULL) {>               cmp_state_key = PyString_InternFromString("compare_state");>               cmp_dict = PyDict_New();>               if (cmp_dict == NULL)>                       return -1;>               PyDict_SetItem(tstate_dict, cmp_state_key, cmp_dict);>       } else {>               cmp_dict = PyDict_GetItem(tstate_dict, cmp_state_key);>               if (cmp_dict == NULL)>                       return NULL;>               PyDict_SetItem(tstate_dict, cmp_state_key, cmp_dict);>       }>>       pair = cmp_state_make_pair(v, w);>       if (pair == NULL) {>               PyErr_BadInternalCall();>               return -1;>       }>       if (PyDict_GetItem(cmp_dict, pair)) {>               /* already comparing these objects.  assume they're>                  equal until shown otherwise>               */>               Py_DECREF(pair);>               return 0;>       }316a382,384>               if (PyDict_SetItem(cmp_dict, pair, pair) == -1) {>                       return -1;>               }317a386>               cmp_state_clear_pair(cmp_dict, pair);329a399,401>       if (PyDict_SetItem(cmp_dict, pair, pair) == -1) {>               return -1;>       }344a417>                               cmp_state_clear_pair(cmp_dict, pair);350,364c423,425<               else if (PyUnicode_Check(v) || PyUnicode_Check(w)) {<                       int result = PyUnicode_Compare(v, w);<                       if (result == -1 && PyErr_Occurred() && <                           PyErr_ExceptionMatches(PyExc_TypeError))<                               /* TypeErrors are ignored: if Unicode coercion<                               fails due to one of the arguments not<                               having the right type, we continue as<                               defined by the coercion protocol (see<                               above). Luckily, decoding errors are<                               reported as ValueErrors and are not masked<                               by this technique. */<                               PyErr_Clear();<                       else<                               return result;<               }--->               cmp_state_clear_pair(cmp_dict, pair);>               if (PyUnicode_Check(v) || PyUnicode_Check(w))>                       return PyUnicode_Compare(v, w);372c433,434<       if (vtp->tp_compare == NULL)--->       if (vtp->tp_compare == NULL) {>               cmp_state_clear_pair(cmp_dict, pair);374c436,439<       return (*vtp->tp_compare)(v, w);--->       }>       result = (*vtp->tp_compare)(v, w);>       cmp_state_clear_pair(cmp_dict, pair);>       return result;


[8]ページ先頭

©2009-2025 Movatter.jp