Provide a simple, non-locale aware way to format a number with a thousands separator.
Adding thousands separators is one of the simplest ways to humanize a program’s output, improving its professional appearance and readability.
In the finance world, output with thousands separators is the norm. Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious.
The locale module presents two other challenges. First, it is a global setting and not suitable for multi-threaded apps that need to serve up requests in multiple locales. Second, the name of a relevant locale (such as “de_DE”) can vary from platform to platform or may not be defined at all. The docs for the locale module describe these and many other challenges in detail.
It is not the goal to replace the locale module, to perform internationalization tasks, or accommodate every possible convention. Such tasks are better suited to robust tools like Babel. Instead, the goal is to make a common, everyday task easier for many users.
A comma will be added to the format() specifier mini-language:
[[fill]align][sign][#][0][width][,][.precision][type]
The ‘,’ option indicates that commas should be included in the output as a thousands separator. As with locales which do not use a period as the decimal point, locales which use a different convention for digit separation will need to use the locale module to obtain appropriate formatting.
The proposal works well with floats, ints, and decimals. It also allows easy substitution for other separators. For example:
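As a minimal sketch of where the ‘,’ sits in the specifier, the snippet below combines it with the sign, fill/align, width and precision fields. The variable names are illustrative only, and the outputs shown assume a Python version that implements this option.

```python
# Illustrative values; the names are not part of the proposal.
count = 1234567
amount = 1234567.891

# Comma alone, then combined with a sign.
plain = format(count, ",d")            # '1,234,567'
signed = format(count, "+,d")          # '+1,234,567'

# Fill '*', centre alignment, width 20, comma, two decimal places.
centred = format(amount, "*^20,.2f")   # '****1,234,567.89****'

# The same specifier works inside str.format() replacement fields.
via_str_format = "{:,d}".format(count)

print(plain, signed, centred, via_str_format, sep="\n")
```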
format(n, "6,d").replace(",", "_")
This technique is completely general but it is awkward in the one case where the commas and periods need to be swapped:
format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".")
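One way to avoid the temporary placeholder is to swap both characters in a single pass with str.translate(). This is only a sketch of an alternative to the chained replace() calls; the helper name swap_separators is hypothetical and not part of the proposal.

```python
def swap_separators(text):
    """Exchange commas and periods in one pass (hypothetical helper)."""
    return text.translate(str.maketrans(",.", ".,"))


n = 1234567.891

# Format with the ',' option first, then swap the two separators.
us_style = format(n, ",.2f")           # '1,234,567.89'
swapped = swap_separators(us_style)    # '1.234.567,89'

# The chained-replace technique from the text gives the same result.
chained = us_style.replace(",", "X").replace(".", ",").replace("X", ".")

print(us_style, swapped, chained, sep="\n")
```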
The width argument means the total length including the commas and decimal point:
format(1234, "08,d")     -->    '0001,234'
format(1234.5, "08,.1f") -->    '01,234.5'
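The width rule can be checked directly by measuring the formatted strings. The sketch below uses space padding only, since the exact placement of zero padding is left to the examples above. It also shows that width acts as a minimum, so a value that needs more room is not truncated.

```python
# Each pair is (value, format specifier); all use a width of 8.
cases = [(1234, "8,d"), (1234.5, "8,.1f"), (1234567, "8,d")]

results = []
for value, spec in cases:
    text = format(value, spec)
    results.append(text)
    # The commas and the decimal point count toward the total length.
    print(repr(text), len(text))
```

The third case is nine characters long because 1,234,567 does not fit in eight columns.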
The ‘,’ option is defined as shown above for types ‘d’, ‘e’, ‘f’, ‘g’, ‘E’, ‘G’, ‘%’, ‘F’ and ‘’. To allow future extensions, it is undefined for other types: binary, octal, hex, character, etc.
This proposal has the virtue of being simpler than the alternative proposal but is much less flexible and meets the needs of fewer users right out of the box. It is expected that some other solution will arise for specifying alternative separators.
Scanning the web, I’ve found that thousands separators are usually one of COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE.
C-Sharp provides both styles (picture formatting and type specifiers). The type specifier approach is locale aware. The picture formatting only offers a COMMA as a thousands separator:
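A short sketch of how the option behaves across a few of the listed types follows. The handling of the undefined types is an assumption about the implementation rather than something the text specifies: the CPython releases I am aware of reject ‘,’ with ‘x’ by raising ValueError.

```python
# Types for which the ',' option is defined.
as_int = format(1234567, ",d")          # '1,234,567'
as_fixed = format(1234567.891, ",.2f")  # '1,234,567.89'
as_percent = format(12.5, ",.0%")       # '1,250%'
as_default = format(1234567, ",")       # '' type: '1,234,567'

# Hex is one of the types left undefined; CPython is assumed to
# raise ValueError here rather than guess at a grouping.
hex_rejected = False
try:
    format(255, ",x")
except ValueError:
    hex_rejected = True

print(as_int, as_fixed, as_percent, as_default, hex_rejected)
```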
String.Format("{0:n}", 12400)     ==>    "12,400"
String.Format("{0:0,0}", 12400)   ==>    "12,400"
Common Lisp uses a COLON before the ~D decimal type specifier to emit a COMMA as a thousands separator. The general form of ~D is ~mincol,padchar,commachar,commaintervalD. The padchar defaults to SPACE. The commachar defaults to COMMA. The commainterval defaults to three.
(format nil "~:D" 229345007)  =>  "229,345,007"
Visual Basic and its brethren (like MS Excel) use a completely different style and have ultra-flexible custom format specifiers like:
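To make the four ~D parameters concrete, here is a rough Python emulation for integers. It is a sketch only: the function lisp_style_decimal is hypothetical, it always groups (as if the COLON modifier were given), and it does not attempt to cover the rest of the Common Lisp directive.

```python
def lisp_style_decimal(n, mincol=0, padchar=" ", commachar=",", commainterval=3):
    """Group the digits of an integer the way ~:D is described above.

    Hypothetical helper; parameter names mirror the Common Lisp ones.
    """
    sign = "-" if n < 0 else ""
    digits = str(abs(n))
    groups = []
    # Peel off commainterval digits at a time, starting from the right.
    while digits:
        groups.append(digits[-commainterval:])
        digits = digits[:-commainterval]
    body = sign + commachar.join(reversed(groups))
    # Left-pad with padchar up to mincol columns.
    return body.rjust(mincol, padchar)


print(lisp_style_decimal(229345007))
print(lisp_style_decimal(1234, mincol=8, padchar="*", commachar="_"))
print(lisp_style_decimal(-1234567, commainterval=4))
```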
"_($* #,##0_)".
COBOL uses picture clauses like:
PICTURE $***,**9.99CR
Java offers a DecimalFormat class that uses picture patterns (one for positive numbers and an optional one for negatives) such as: "#,##0.00;(#,##0.00)". It allows arbitrary groupings including hundreds and ten-thousands and uneven groupings. The special pattern characters are non-localized (using a DOT for a decimal separator and a COMMA for a grouping separator). The user can supply an alternate set of symbols using the formatter’s DecimalFormatSymbols object.
Make both the thousands separator and decimal separator user specifiable but not locale aware. For simplicity, limit the choices to a COMMA, DOT, SPACE, APOSTROPHE or UNDERSCORE. The SPACE can be either U+0020 or U+00A0.
Whenever a separator is followed by a precision, it is a decimal separator and an optional separator preceding it is a thousands separator. When the precision is absent, a lone specifier means a thousands separator:
[[fill]align][sign][#][0][width][tsep][dsep precision][type]
Examples:
format(1234, "8.1f")     -->    '  1234.0'
format(1234, "8,1f")     -->    '  1234,0'
format(1234, "8.,1f")    -->    ' 1.234,0'
format(1234, "8 ,f")     -->    ' 1 234,0'
format(1234, "8d")       -->    '    1234'
format(1234, "8,d")      -->    '   1,234'
format(1234, "8_d")      -->    '   1_234'
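The alternative syntax above is not what format() accepts, but its intended results can be approximated on top of the ‘,’ option. The sketch below is an emulation under that assumption: format_with_separators and its parameters are hypothetical names, and it only handles integers (no precision) and fixed-point output.

```python
def format_with_separators(value, width=0, tsep="", dsep=".", precision=None):
    """Emulate the tsep/dsep idea using the ',' option (hypothetical helper)."""
    if precision is None:
        # No precision: integer formatting, thousands separator only.
        body = format(value, ",d")
    else:
        body = format(value, f",.{precision}f")
    # Map the comma and period to the requested separators in one pass.
    table = str.maketrans({",": tsep, ".": dsep})
    # Width is the total length, separators included.
    return body.translate(table).rjust(width)


print(repr(format_with_separators(1234, 8, tsep=".", dsep=",", precision=1)))
print(repr(format_with_separators(1234, 8, tsep=" ", dsep=",", precision=1)))
print(repr(format_with_separators(1234, 8, tsep="_")))
print(repr(format_with_separators(1234, 8)))
```

Translating both characters at once sidesteps the placeholder needed when commas and periods are swapped with chained replace() calls.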
This proposal meets most needs, but it comes at the expense of taking a bit more effort to parse. Not every possible convention is covered, but at least one of the options (spaces or underscores) should be readable, understandable, and useful to folks from many diverse backgrounds.
As shown in the examples, the width argument means the total length including the thousands separators and decimal separators.
No change is proposed for the locale module.
The thousands separator is defined as shown above for types ‘d’, ‘e’, ‘f’, ‘g’, ‘%’, ‘E’, ‘G’ and ‘F’. To allow future extensions, it is undefined for other types: binary, octal, hex, character, etc.
The drawback to this alternative proposal is the difficulty of mentally parsing whether a single separator is a thousands separator or decimal separator. Perhaps it is too arcane to link the decimal separator with the precision specifier.
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/peps/pep-0378.rst
Last modified: 2025-02-01 08:59:27 GMT