ANN: mxNumber -- Experimental Number Types, Version 0.2.0
Brian Kelley kelley at bioreason.com
Mon Apr 30 11:07:20 EDT 2001
> That's fine for C, but makes no sense in a Python interface; i.e., wtf is
> MAX_ULONG in Python terms? Python doesn't even have an unsigned integral
> type.
>
> So that's where the silly arguments start. Just pick *something*. For
> example, sys.maxint is closest in spirit to MAX_ULONG, but shares the defect
> of the GMP definition that it's ambiguous whether it means "infinity" or "a
> whole lot but nevertheless finite" in this context. -1 would make more sense
> for Python, and is not ambiguous; GMP doesn't have that choice, though, since
> it returns an unsigned result.
>

Hmmm. Looks like I missed most of the previous discussions, I'll have to hunt
dejanews.

> > more good stuff at
> > http://www.swox.com/gmp/manual/gmp_6.html#SEC30
>
> Right, they have lots of good stuff. The functions aren't all well-defined
> in Python terms, though, and sometimes not even in C terms; e.g.,
>
>   Function: unsigned long int
>   mpz_scan1 (mpz_t op, unsigned long int starting_bit)
>       Scan op, starting with bit starting_bit, towards more significant
>       bits, until the first set bit is found. Return the index of the
>       found bit.
>
> The docs there really don't define what "starting_bit" or "index" mean
> (perhaps 0-based, with index i being bit 2**i? i.e., starting with 0 "from
> the right"?). Then what do you think mpz_scan1(0, 0) returns? That is,
> there are no 1 bits in 0 for scan1 to find. I can guess that they return
> MAX_ULONG again in such cases, but they don't say so, and as above -1 is
> probably a better result for Python to return.
>

I was confused by this as well, I had to expose the function and play with it
to figure out what they meant.

>
> > This is more what I meant:
> >
> > >>i = mx.Number.Integer("100101011101010")
> > >>pickle.dump(i,0)
> > "cmx.Number\n_I\np0\n(S'10101010101010'\np1\ntp2\nRp3\n."
> >
> > The string S'10101010101010' is a fairly wasteful encoding for a
> > bit vector.
>
> Sure. Is it actually a problem for you in practice, or is just something
> that offends because it's provably less than optimal? Note that text-mode
> pickles are *meant* to be easily human-readable too, and there's no clearer
> way to "encode" the decimal integer 100101011101010 than as the string
> "10101010101010"

It is a problem in practice. I am writing a caching system for bit vectors and
response time is important. I have no problem with text mode pickles, it just
seems slightly odd that the binary mode uses (essentially) the same encoding
while marshal seems to have a much more efficient binary encoding.

> -- Python does the same for its own native long (unbounded
> int) pickles. A mild compromise would be to use a hex string instead (still
> easily readable, encodes 4 bits per byte instead of ~3.3, and should be very
> much faster for pickle<->internal conversions of very long ints).

I was thinking along these same lines.

Anyway, it seems like I can avoid the whole problem by renaming
mx.Number.Integer as "BitVector". This is what I am using this structure for
anyway. Then I can avoid all of these problems. So let me ask this question,
would anyone mind a contributed type BitVector to mx.Number? Then I can add
all of the fun stuff like Tanimoto, Euclidean and Jaccard distances...

Thanks for listening.

Brian Kelley
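
For reference, the three ideas raised in this thread (a scan1 that returns -1
when no set bit exists, a hex-string encoding for long integers, and a
Tanimoto coefficient on integers used as bit vectors) can be sketched in plain
Python. This is only an illustrative sketch under stated assumptions, not the
mx.Number or GMP API: it uses the built-in int of Python 3, bits are assumed
to be 0-based with index i meaning 2**i, and the names scan1, to_hex, from_hex
and tanimoto are hypothetical.

```python
def scan1(op, starting_bit=0):
    """Return the index of the first set bit of op at or above starting_bit.

    Assumes a non-negative op and 0-based bit indices (index i is 2**i).
    Returns -1 when there is no such bit, the convention suggested in the
    thread, instead of an ambiguous "maximum unsigned long" value.
    """
    if op < 0 or starting_bit < 0:
        raise ValueError("op and starting_bit must be non-negative")
    shifted = op >> starting_bit
    if shifted == 0:
        return -1
    # shifted & -shifted isolates the lowest set bit; bit_length() - 1
    # converts that single bit into its 0-based index.
    return starting_bit + (shifted & -shifted).bit_length() - 1


def to_hex(n):
    """Encode a non-negative integer as a hex string (4 bits per character)."""
    return format(n, "x")


def from_hex(s):
    """Decode a hex string produced by to_hex back into an integer."""
    return int(s, 16)


def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient of two non-negative ints as bit sets.

    Computed as popcount(a & b) / popcount(a | b). Returning 1.0 for two
    empty vectors is a convention chosen here, not something from the thread.
    """
    common = bin(a & b).count("1")
    total = bin(a | b).count("1")
    if total == 0:
        return 1.0
    return common / total


if __name__ == "__main__":
    print(scan1(0, 0))               # no set bits to find
    print(scan1(0b1010, 2))          # first set bit at or above index 2
    n = int("100101011101010")
    print(to_hex(n), from_hex(to_hex(n)) == n)
    print(tanimoto(0b1100, 0b1010))
```

A Tanimoto or Jaccard distance would then be 1 - tanimoto(a, b). The hex form
keeps the pickle readable while avoiding the decimal conversion cost that
grows quickly for very long integers.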
More information about the Python-list mailing list