Movatterモバイル変換


[0]ホーム

URL:


Jump to content
WikipediaThe Free Encyclopedia
Search

Hamming weight

From Wikipedia, the free encyclopedia
Number of nonzero symbols in a string
"popcnt" redirects here. For the CPU instruction in Intel x86 assembler, seeSSE4 § POPCNT_and_LZCNT.

TheHamming weight of astring is the number of symbols that are different from the zero-symbol of thealphabet used. It is thus equivalent to theHamming distance from the all-zero string of the same length. For the most typical case, a given set ofbits, this is the number of bits set to 1, or thedigit sum of thebinary representation of a given number and the₁ norm of a bit vector. In this binary case, it is also called thepopulation count,[1]popcount,sideways sum,[2] orbit summation.[3]

Examples
StringHamming weight
111014
111010004
000000000
67801234056710
A plot of Hamming weight for numbers 0 to 256.[4]

History and usage

[edit]

The Hamming weight is named after the American mathematicianRichard Hamming, although he did not originate the notion.[5] The Hamming weight of binary numbers was already used in 1899 byJames W. L. Glaisher to give a formula forthe number of odd binomial coefficients in a single row ofPascal's triangle.[6]Irving S. Reed introduced a concept, equivalent to Hamming weight in the binary case, in 1954.[7]

Hamming weight is used in several disciplines includinginformation theory,coding theory, andcryptography. Examples of applications of the Hamming weight include:

  • In modularexponentiation by squaring, the number of modular multiplications required for an exponente is log2e + weight(e). This is the reason that the public key valuee used inRSA is typically chosen to be a number of low Hamming weight.[8]
  • The Hamming weight determines path lengths between nodes inChord distributed hash tables.[9]
  • IrisCode lookups in biometric databases are typically implemented by calculating theHamming distance to each stored record.[10]
  • Incomputer chess programs using abitboard representation, the Hamming weight of a bitboard gives the number of pieces of a given type remaining in the game, or the number of squares of the board controlled by one player's pieces, and is therefore an important contributing term to the value of a position.[11]
  • Hamming weight can be used to efficiently computefind first set using the identity ffs(x) = pop(x ^ (x - 1)). This is useful on platforms such asSPARC that have hardware Hamming weight instructions but no hardware find first set instruction.[12][1]
  • The Hamming weight operation can be interpreted as a conversion from theunary numeral system tobinary numbers.[13]

Efficient implementation

[edit]

The population count of abitstring is often needed in cryptography and other applications. TheHamming distance of two wordsA andB can be calculated as the Hamming weight ofAxorB.[1]

The problem of how to implement it efficiently has been widely studied. A single operation for the calculation, or parallel operations on bit vectors areavailable on some processors. For processors lacking those features, the best solutions known are based on adding counts in a tree pattern. For example, to count the number of 1 bits in the 16-bit binary number a = 0110 1100 1011 1010, these operations can be done:

ExpressionBinaryDecimalComment
a011011001011101027834The original number
b0 = (a >> 0) & 01 01 01 01 01 01 01 0101000100000100001, 0, 1, 0, 0, 1, 0, 0Every other bit from a
b1 = (a >> 1) & 01 01 01 01 01 01 01 0100010100010101010, 1, 1, 0, 1, 1, 1, 1The remaining bits from a
c = b0 + b101011000011001011, 1, 2, 0, 1, 2, 1, 1Count of 1s in each 2-bit slice of a
d0 = (c >> 0) & 0011 0011 0011 001100010000001000011, 0, 2, 1Every other count from c
d2 = (c >> 2) & 0011 0011 0011 001100010010000100011, 2, 1, 1The remaining counts from c
e = d0 + d200100010001100102, 2, 3, 2Count of 1s in each 4-bit slice of a
f0 = (e >> 0) & 00001111 0000111100000010000000102, 2Every other count from e
f4 = (e >> 4) & 00001111 0000111100000010000000112, 3The remaining counts from e
g = f0 + f400000100000001014, 5Count of 1s in each 8-bit slice of a
h0 = (g >> 0) & 000000001111111100000000000001015Every other count from g
h8 = (g >> 8) & 000000001111111100000000000001004The remaining counts from g
i = h0 + h800000000000010019Count of 1s in entire 16-bit word

Here, the operations are as inC programming language, soX >> Y means to shift X right by Y bits, X & Y means the bitwise AND of X and Y, and + is ordinary addition. The best algorithms known for this problem are based on the concept illustrated above and are given here:[1]

//types and constants used in the functions below//uint64_t is an unsigned 64-bit integer variable type (defined in C99 version of C language)constuint64_tm1=0x5555555555555555;//binary: 0101...constuint64_tm2=0x3333333333333333;//binary: 00110011..constuint64_tm4=0x0f0f0f0f0f0f0f0f;//binary:  4 zeros,  4 ones ...constuint64_tm8=0x00ff00ff00ff00ff;//binary:  8 zeros,  8 ones ...constuint64_tm16=0x0000ffff0000ffff;//binary: 16 zeros, 16 ones ...constuint64_tm32=0x00000000ffffffff;//binary: 32 zeros, 32 onesconstuint64_th01=0x0101010101010101;//the sum of 256 to the power of 0,1,2,3...//This is a naive implementation, shown for comparison,//and to help in understanding the better functions.//This algorithm uses 24 arithmetic operations (shift, add, and).intpopcount64a(uint64_tx){x=(x&m1)+((x>>1)&m1);//put count of each  2 bits into those  2 bitsx=(x&m2)+((x>>2)&m2);//put count of each  4 bits into those  4 bitsx=(x&m4)+((x>>4)&m4);//put count of each  8 bits into those  8 bitsx=(x&m8)+((x>>8)&m8);//put count of each 16 bits into those 16 bitsx=(x&m16)+((x>>16)&m16);//put count of each 32 bits into those 32 bitsx=(x&m32)+((x>>32)&m32);//put count of each 64 bits into those 64 bitsreturnx;}//This uses fewer arithmetic operations than any other known//implementation on machines with slow multiplication.//This algorithm uses 17 arithmetic operations.intpopcount64b(uint64_tx){x-=(x>>1)&m1;//put count of each 2 bits into those 2 bitsx=(x&m2)+((x>>2)&m2);//put count of each 4 bits into those 4 bitsx=(x+(x>>4))&m4;//put count of each 8 bits into those 8 bitsx+=x>>8;//put count of each 16 bits into their lowest 8 bitsx+=x>>16;//put count of each 32 bits into their lowest 8 bitsx+=x>>32;//put count of each 64 bits into their lowest 8 bitsreturnx&0x7f;}//This uses fewer arithmetic operations than any other known//implementation on machines with fast multiplication.//This algorithm uses 12 arithmetic operations, one of which is a multiply.intpopcount64c(uint64_tx){x-=(x>>1)&m1;//put count of each 2 bits into those 2 bitsx=(x&m2)+((x>>2)&m2);//put count of each 4 bits into those 4 bitsx=(x+(x>>4))&m4;//put count of each 8 bits into those 8 bitsreturn(x*h01)>>56;//returns left 8 bits of x + (x<<8) + (x<<16) + (x<<24) + ...}

The above implementations have the best worst-case behavior of any known algorithm. However, when a value is expected to have few nonzero bits, it may instead be more efficient to use algorithms that count these bits one at a time. As Wegner described in 1960,[14] thebitwise AND ofx withx − 1 differs fromx only in zeroing out the least significant nonzero bit: subtracting 1 changes the rightmost string of 0s to 1s, and changes the rightmost 1 to a 0. Ifx originally hadn bits that were 1, then after onlyn iterations of this operation,x will be reduced to zero. The following implementation is based on this principle.

//This is better when most bits in x are 0//This algorithm works the same for all data sizes.//This algorithm uses 3 arithmetic operations and 1 comparison/branch per "1" bit in x.intpopcount64d(uint64_tx){intcount;for(count=0;x;count++)x&=x-1;returncount;}

Of interest here is theclose relationship between Popcount, FFS and CLZ.

If greater memory usage is allowed, we can calculate the Hamming weight faster than the above methods. With unlimited memory, we could simply create a large lookup table of the Hamming weight of every 64 bit integer. If we can store a lookup table of the hamming function of every 16 bit integer, we can do the following to compute the Hamming weight of every 32 bit integer.

staticuint8_twordbits[65536]={/* bitcounts of integers 0 through 65535, inclusive */};//This algorithm uses 3 arithmetic operations and 2 memory reads.intpopcount32e(uint32_tx){returnwordbits[x&0xFFFF]+wordbits[x>>16];}
//Optionally, the wordbits[] table could be filled using this functionintpopcount32e_init(void){uint32_ti;uint16_tx;intcount;for(i=0;i<=0xFFFF;i++){x=i;for(count=0;x;count++)// borrowed from popcount64d() abovex&=x-1;wordbits[i]=count;}}

A recursive algorithm is given in Donovan & Kernighan[15]

/* The weight of i can differ from the weight of i / 2only in the least significant bit of i */intpopcount32e_init(void){inti;for(i=1;sizeofwordbits/sizeof*wordbits>i;++i)wordbits[i]=wordbits[i>>1]+(1&i);}

Muła et al.[16] have shown that a vectorized version of popcount64b can run faster than dedicated instructions (e.g., popcnt on x64 processors).

The Harley–Seal algorithm[17] is one of the fastest that also only needs integer operations.[18]

Minimum weight

[edit]

Inerror-correcting coding, the minimum Hamming weight, commonly referred to as theminimum weightwmin of a code is the weight of the lowest-weight non-zero code word. The weightw of a code word is the number of 1s in the word. For example, the word 11001010 has a weight of 4.

In alinear block code the minimum weight is also theminimum Hamming distance (dmin) and defines the error correction capability of the code. Ifwmin = n, thendmin = n and the code will correct up todmin/2 errors.[19]

Language support

[edit]

Some C compilers provide intrinsic functions that provide bit counting facilities. For example,GCC (since version 3.4 in April 2004) includes a builtin function__builtin_popcount that will use a processor instruction if available or an efficient library implementation otherwise.[20]LLVM-GCC has included this function since version 1.5 in June 2005.[21]

In theC++ Standard Library, the bit-array data structurebitset has acount() method that counts the number of bits that are set. InC++20, a new header<bit> was added, containing functionsstd::popcount andstd::has_single_bit, taking arguments of unsigned integer types.

In Java, the growable bit-array data structureBitSet has aBitSet.cardinality() method that counts the number of bits that are set. In addition, there areInteger.bitCount(int) andLong.bitCount(long) functions to count bits in primitive 32-bit and 64-bit integers, respectively. Also, theBigInteger arbitrary-precision integer class also has aBigInteger.bitCount() method that counts bits.

InPython, theint type has abit_count() method to count the number of bits set. This functionality was introduced in Python 3.10, released in October 2021.[22]

InCommon Lisp, the functionlogcount, given a non-negative integer, returns the number of 1 bits. (For negative integers it returns the number of 0 bits in 2's complement notation.) In either case the integer can be a BIGNUM.

Starting inGHC 7.4, theHaskell base package has apopCount function available on all types that are instances of theBits class (available from theData.Bits module).[23]

MySQL version ofSQL language providesBIT_COUNT() as a standard function.[24]

Fortran 2008 has the standard, intrinsic, elemental functionpopcnt returning the number of nonzero bits within an integer (or integer array).[25]

Some programmable scientific pocket calculators feature special commands to calculate the number of set bits, e.g.#B on theHP-16C.[3]

FreePascal implements popcnt since version 3.0.[26]

Processor support

[edit]

See also

[edit]

References

[edit]
  1. ^abcdefgWarren Jr., Henry S. (2013) [2002].Hacker's Delight (2 ed.).Addison Wesley -Pearson Education, Inc. pp. 81–96.ISBN 978-0-321-84268-8. 0-321-84268-5.
  2. ^Knuth, Donald Ervin (2009). "Bitwise tricks & techniques; Binary Decision Diagrams".The Art of Computer Programming. Vol. 4, Fascicle 1.Addison–Wesley Professional.ISBN 978-0-321-58050-4. (NB. Draft ofFascicle 1bArchived 2016-03-12 at theWayback Machine available for download.)
  3. ^abHewlett-Packard HP-16C Computer Scientist Owner's Handbook(PDF).Hewlett-Packard Company. April 1982. 00016-90001.Archived(PDF) from the original on 2017-03-28. Retrieved2017-03-28.
  4. ^R.Ugalde, Laurence."Population count the Fōrmulæ programming language".Fōrmulæ. Retrieved2024-06-02.
  5. ^Thompson, Thomas M. (1983).From Error-Correcting Codes through Sphere Packings to Simple Groups. The Carus Mathematical Monographs #21.The Mathematical Association of America. p. 33.
  6. ^Glaisher, James Whitbread Lee (1899)."On the residue of a binomial-theorem coefficient with respect to a prime modulus".The Quarterly Journal of Pure and Applied Mathematics.30:150–156. (NB. See in particular the final paragraph of p. 156.)
  7. ^Reed, Irving Stoy (1954). "A Class of Multiple-Error-Correcting Codes and the Decoding Scheme".IRE Professional Group on Information Theory. PGIT-4.Institute of Radio Engineers (IRE):38–49.
  8. ^Cohen, Gérard D.; Lobstein, Antoine; Naccache, David; Zémor, Gilles (1998). "How to improve an exponentiation black-box". In Nyberg, Kaisa (ed.).Advances in Cryptology – EUROCRYPT '98, International Conference on the Theory and Application of Cryptographic Techniques, Espoo, Finland, May 31 – June 4, 1998, Proceeding. Lecture Notes in Computer Science. Vol. 1403. Springer. pp. 211–220.doi:10.1007/BFb0054128.ISBN 978-3-540-64518-4.
  9. ^Stoica, I.; Morris, R.; Liben-Nowell, D.; Karger, D. R.; Kaashoek, M. F.; Dabek, F.; Balakrishnan, H. (February 2003). "Chord: a scalable peer-to-peer lookup protocol for internet applications".IEEE/ACM Transactions on Networking.11 (1):17–32.Bibcode:2003ITNet..11...17S.doi:10.1109/TNET.2002.808407.S2CID 221276912.Section 6.3: "In general, the number of fingers we need to follow will be the number of ones in the binary representation of the distance from node to query."
  10. ^Kong, A.W.K.; Zhang, D.; Kamel, M.S. (February 2010). "An analysis of IrisCode".IEEE Transactions on Image Processing.19 (2):522–532.Bibcode:2010ITIP...19..522K.doi:10.1109/tip.2009.2033427.PMID 20083454.
  11. ^Heinz, E.A. (September 1997). "How Darkthought plays chess".ICGA Journal.20 (3):166–176.doi:10.3233/icg-1997-20304. Updated and reprinted inScalable Search in Computer Chess (Vieweg+Teubner Verlag, 2000), pp. 185–198,doi:10.1007/978-3-322-90178-1_13
  12. ^abSPARC International, Inc. (1992). "A.41: Population Count. Programming Note".The SPARC architecture manual: version 9 (Version 9 ed.). Englewood Cliffs, New Jersey, USA:Prentice Hall. pp. 205.ISBN 0-13-825001-4.
  13. ^Blaxell, David (1978). Hogben, David; Fife, Dennis W. (eds.)."Record linkage by bit pattern matching".Computer Science and Statistics--Tenth Annual Symposium on the Interface. NBS Special Publication.503.U.S. Department of Commerce /National Bureau of Standards:146–156.
  14. ^Wegner, Peter (May 1960)."A technique for counting ones in a binary computer".Communications of the ACM.3 (5): 322.doi:10.1145/367236.367286.S2CID 31683715.
  15. ^Donovan, Alan; Kernighan, Brian (2016).The Go Programming Language. Addison-Weseley.ISBN 978-0-13-419044-0.
  16. ^Muła, Wojciech; Kurz, Nathan; Lemire, Daniel (January 2018). "Faster Population Counts Using AVX2 Instructions".Computer Journal.61 (1):111–120.arXiv:1611.07612.doi:10.1093/comjnl/bxx046.S2CID 540973.
  17. ^"Sse-popcount/Popcnt-harley-seal.CPP at master · WojciechMula/Sse-popcount".GitHub.
  18. ^Muła, Wojciech; Kurz, Nathan; Lemire, Daniel (2018). "Faster Population Counts Using AVX2 Instructions".The Computer Journal.61:111–120.arXiv:1611.07612.doi:10.1093/comjnl/bxx046.
  19. ^Stern & Mahmoud,Communications System Design,Prentice Hall, 2004, p 477ff.
  20. ^"GCC 3.4 Release Notes".GNU Project.
  21. ^"LLVM 1.5 Release Notes".LLVM Project.
  22. ^"What's New In Python 3.10".python.org.
  23. ^"GHC 7.4.1 release notes". GHC documentation.
  24. ^"Chapter 12.11. Bit Functions — MySQL 5.0 Reference Manual".
  25. ^Metcalf, Michael; Reid, John; Cohen, Malcolm (2011).Modern Fortran Explained.Oxford University Press. p. 380.ISBN 978-0-19-960142-4.
  26. ^"Free Pascal documentation popcnt". Retrieved2019-12-07.
  27. ^"JDK-6378821: bitCount() should use POPC on SPARC processors and AMD+10h".Java bug database. 2006-01-30.
  28. ^Blackfin Instruction Set Reference (Preliminary ed.).Analog Devices. 2001. pp. 8–24. Part Number 82-000410-14.
  29. ^Wolf, Claire (2019-03-22)."RISC-V "B" Bit Manipulation Extension for RISC-V, Draft v0.37"(PDF).Github.

Further reading

[edit]

External links

[edit]
Retrieved from "https://en.wikipedia.org/w/index.php?title=Hamming_weight&oldid=1306107874"
Category:
Hidden categories:

[8]ページ先頭

©2009-2025 Movatter.jp