🏣📮〠 Japanese postal code data.

polm/posuto

Posuto is a wrapper for the postal code data distributed by Japan Post. It makes mapping Japanese postal codes to addresses easier than working with the raw CSV.

You can read more about the motivations for posuto in Parsing the Infamous Japanese Postal CSV.

Issues do not need to be written in English.

Postbox character by Irasutoya

Features:

  • multi-line neighborhoods are joined
  • parenthetical notes are put in a separate field
  • change reasons are converted from flags to labels
  • kana records are unified for easy access
  • codes with multiple areas provide a list of alternates

Romaji provided by JP Post were previously included in this library, but they are extremely low quality and hard to sync because they are updated separately. If you need romaji, it is recommended that you use cutlet instead.

To install:

pip install posuto

Example usage:

    import posuto as 〒

    🗼 = 〒.get('〒105-0011')

    print(🗼)
    # "東京都港区芝公園"
    print(🗼.prefecture)
    # "東京都"
    print(🗼.kana)
    # "トウキョウトミナトクシバコウエン"
    print(🗼.note)
    # None

Note: Unfortunately 〒 and 🗼 are not valid identifiers in Python, so the above is pseudocode. See examples/sample.py for an executable version.

You can provide a postal code with basic formatting, and postal data will be returned as a named tuple with a few convenience functions. Read on for details of how quirks in the original data are handled.
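As a rough illustration of what "basic formatting" can involve, the sketch below reduces an input string to seven ASCII digits before a lookup. It is not posuto's own parsing code: the function name `normalize_postal_code` is hypothetical, and no claim is made here about which input forms posuto itself accepts beyond those shown in this README.

```python
import re
import unicodedata


def normalize_postal_code(raw: str) -> str:
    """Reduce a loosely formatted postal code to seven ASCII digits.

    Hypothetical helper for illustration; not part of posuto.
    """
    # NFKC folds full-width digits and hyphens into their ASCII forms.
    folded = unicodedata.normalize("NFKC", raw)
    # Drop everything that is not an ASCII digit (postal mark, hyphen, spaces).
    digits = re.sub(r"[^0-9]", "", folded)
    if len(digits) != 7:
        raise ValueError(f"expected 7 digits, got {digits!r}")
    return digits


print(normalize_postal_code("〒105-0011"))  # 1050011
```

Rejecting anything that does not reduce to exactly seven digits keeps malformed input from silently turning into a failed lookup.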

Details

The original CSV files are managed in source control here but are not distributed as part of the pip package. Instead, the CSV is converted to JSON, which is then put into an sqlite db and included in the package distribution. That means most of the code complexity in this package is in the build step, not at runtime.
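The last two stages of that pipeline (JSON records into an sqlite db, then lookup by code) can be sketched with the standard library alone. The table name `postal_data`, the one-JSON-blob-per-code layout, and the record fields are assumptions made for this example; the schema posuto actually ships is not described in this README.

```python
import json
import sqlite3

# Illustrative record only; the real JSON layout may differ.
records = {
    "1050011": {"prefecture": "東京都", "city": "港区", "neighborhood": "芝公園"},
}


def build_db(conn, records):
    """Store each record as a JSON string keyed by postal code."""
    conn.execute("CREATE TABLE postal_data (code TEXT PRIMARY KEY, data TEXT)")
    conn.executemany(
        "INSERT INTO postal_data VALUES (?, ?)",
        [(code, json.dumps(rec, ensure_ascii=False)) for code, rec in records.items()],
    )
    conn.commit()


def lookup(conn, code):
    """Return the decoded record for a code, or None if it is absent."""
    row = conn.execute(
        "SELECT data FROM postal_data WHERE code = ?", (code,)
    ).fetchone()
    return json.loads(row[0]) if row else None


conn = sqlite3.connect(":memory:")
build_db(conn, records)
print(lookup(conn, "1050011")["city"])  # 港区
```

Doing the cleanup once at build time means a runtime lookup is a single indexed query followed by a JSON decode.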

The postal code data has many irregularities and quirks. The sections below explain how each is handled.

As another note, posuto requires no dependencies in normal usage. When actually building the postal data from the raw CSVs, mojimoji is used for character conversion and iconv for encoding conversion.

Field names

The primary fields of an address and the translations preferred here for each are:

  • 都道府県: prefecture
  • 市区町村: city
  • 町域名: neighborhood

    import posuto

    # 🗼
    tt = posuto.get('〒105-0011')
    print(tt.prefecture, tt.city, tt.neighborhood)
    # "東京都 港区 芝公園"

Notes

The postal data often includes notes in the neighborhood field. These are always in parentheses, with one exception: "以下に掲載がない場合" ("when not listed below"). All notes are put in the note field, and no attempt is made to extract their yomigana or romaji (which are often not available anyway).

    import posuto

    minatoku = posuto.get('1050000')
    print(minatoku.note)
    # "以下に掲載がない場合"
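The rule described above (a parenthetical, or the one special phrase, becomes the note) might be sketched as follows. This is not posuto's build code: `split_note` is a hypothetical name, the raw data is assumed to use full-width parentheses (ASCII ones are accepted too), and returning an empty neighborhood for the special phrase is a choice made for this example.

```python
import re
from typing import Optional, Tuple

NO_LISTING = "以下に掲載がない場合"
# Name, then one parenthetical running to the end of the field.
PAREN_NOTE = re.compile(r"^(?P<name>[^（(]*)[（(](?P<note>.*)[）)]$")


def split_note(neighborhood: str) -> Tuple[str, Optional[str]]:
    """Split a raw neighborhood field into (neighborhood, note)."""
    if neighborhood == NO_LISTING:
        # The one note that is not wrapped in parentheses.
        return "", NO_LISTING
    match = PAREN_NOTE.match(neighborhood)
    if match:
        return match.group("name"), match.group("note")
    return neighborhood, None


print(split_note("大宮町（今出川通河原町西入）"))
```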

Yomigana

Yomigana are converted to full-width kana.
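For a feel of what this conversion does, the sketch below uses Unicode NFKC normalization from the standard library as a stand-in. This is a swapped-in technique, not the mojimoji-based conversion the build uses, and NFKC also changes other characters (for example, it narrows full-width ASCII), so the two are not interchangeable in general. The function name `to_fullwidth_kana` is hypothetical.

```python
import unicodedata


def to_fullwidth_kana(text: str) -> str:
    """Convert half-width katakana to full-width using NFKC normalization.

    NFKC also composes a kana followed by a half-width voicing mark
    into a single voiced character.
    """
    return unicodedata.normalize("NFKC", text)


print(to_fullwidth_kana("ﾄｳｷｮｳﾄ"))  # トウキョウト
```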

Long Neighborhood Names

The postal data README explains that when the neighborhood field is over 38 characters it will be continued onto multiple lines. This is not explicitly marked in the data, and where line breaks are inserted in long neighborhoods appears to be random (it's often neither after the 38th character nor at a reasonable word boundary). The only indicator of long lines is an unclosed parenthesis on the first line. Such long lines are always in order in the original file.

In posuto, the parenthetical information is considered a note and put in the note field.

    import posuto

    omiya = posuto.get('6020847')
    print(omiya)
    # "京都府京都市上京区大宮町"
    print(omiya.note)
    # "今出川通河原町西入、今出川通寺町東入、今出川通寺町東入下る、河原町通今出川下る、河原町通今出川下る西入、寺町通今出川下る東入、中筋通石薬師上る"
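The unclosed-parenthesis rule suggests a simple way to rejoin continued lines: keep concatenating rows while more parentheses have been opened than closed. The sketch below illustrates that idea and is not posuto's build code. The name `join_continued` is hypothetical, rows are simplified to `(code, neighborhood)` pairs, and the sample rows are a shortened, made-up split of the 大宮町 entry rather than lines copied from the CSV.

```python
OPEN, CLOSE = "（", "）"  # full-width parentheses, assumed for the raw data


def join_continued(rows):
    """Merge rows whose neighborhood text continues onto following rows.

    rows: iterable of (code, neighborhood) pairs in original file order.
    """
    merged = []
    pending = None  # (code, text so far) while a parenthesis is still open
    for code, text in rows:
        if pending is not None:
            # Continuation line: append to the text collected so far.
            code, text = pending[0], pending[1] + text
            pending = None
        if text.count(OPEN) > text.count(CLOSE):
            pending = (code, text)
        else:
            merged.append((code, text))
    if pending is not None:
        merged.append(pending)  # never closed; keep what was collected
    return merged


rows = [
    ("6020847", "大宮町（今出川通河原町西入、今出川通寺町東入、今出"),
    ("6020847", "川通寺町東入下る、河原町通今出川下る）"),
    ("1050011", "芝公園"),
]
for code, text in join_continued(rows):
    print(code, text)
```

This relies on continuation lines appearing in order, which the paragraph above says holds for the original file.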

Multiple Regions in One Code

Sometimes a postal code covers multiple regions. Often the city is the same and just the neighborhood varies, but sometimes part of the city field varies, or even the whole city field. Codes like this are indicated by the "一つの郵便番号で二以上の町域を表す場合の表示" field in the original CSV data, which is called multi here.

For now, if more than one region shares a single code, the main entry is for the first region listed in the main CSV, and the other regions are stored as a list in the alternates property. There may be a better way to do this.
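A minimal sketch of that "first entry wins, the rest become alternates" grouping is shown below. The `Entry` class and `group_by_code` function are hypothetical and the sample values are placeholders, not real postal data; posuto's own record type is a named tuple whose full field list is not given here.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Entry:
    code: str
    city: str
    neighborhood: str
    alternates: List["Entry"] = field(default_factory=list)


def group_by_code(entries: Iterable[Entry]) -> Dict[str, Entry]:
    """Keep the first entry per code and attach later ones as alternates."""
    main: Dict[str, Entry] = {}
    for entry in entries:
        if entry.code in main:
            main[entry.code].alternates.append(entry)
        else:
            main[entry.code] = entry
    return main


# Placeholder values for illustration only.
grouped = group_by_code([
    Entry("0000000", "CityA", "North"),
    Entry("0000000", "CityA", "South"),
    Entry("1111111", "CityB", "East"),
])
print([alt.neighborhood for alt in grouped["0000000"].alternates])  # ['South']
```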

Programming Notes

This section is for notes on the use of the library itself as opposed to notes about the data structure.

Multi-threaded Environments

By default, posuto creates a DB connection and cursor on startup and reuses it for all requests. In the typical single-threaded, read-only scenario this is not a problem, but it causes warnings (and may cause problems) in a multi-threaded scenario. In that case you can manage db connections manually using a context manager object.

    from posuto import Posuto

    with Posuto() as pp:
        tower = pp.get('〒105-0011')

Using the object this way, the connection will be automatically closed when the with block is exited.
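The underlying pattern, one short-lived connection per unit of work instead of one shared connection, can be shown with plain `sqlite3` and a thread pool. This is a generic sketch and not posuto's internals: the demo table, the file name, and the helper names are all made up for the example.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing


def build_demo_db(path):
    """Create a tiny on-disk database to query from several threads."""
    with closing(sqlite3.connect(path)) as conn:
        conn.execute("CREATE TABLE postal (code TEXT PRIMARY KEY, address TEXT)")
        conn.execute("INSERT INTO postal VALUES (?, ?)", ("1050011", "東京都港区芝公園"))
        conn.commit()


def lookup(path, code):
    """Open a connection, run one query, and close it again."""
    with closing(sqlite3.connect(path)) as conn:
        row = conn.execute(
            "SELECT address FROM postal WHERE code = ?", (code,)
        ).fetchone()
    return row[0] if row else None


def lookup_many(path, codes, workers=4):
    """Run lookups on worker threads; no connection is shared between them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda code: lookup(path, code), codes))


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.db")
    build_demo_db(db_path)
    results = lookup_many(db_path, ["1050011", "0000000"])

print(results)  # ['東京都港区芝公園', None]
```

Opening a connection per lookup costs a little more than reusing one, but it avoids sharing a single connection and cursor across threads.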

License

The original postal data is provided by JP Post with an indication that they will not assert copyright. The code in this repository is released under the MIT or WTFPL license.

