A parser for Japanese numbers (kanji and Arabic numerals) in natural-language text.


takumakanari/japanese-numbers-python



The japanese_numbers module finds numbers in natural-language text and converts them to Arabic numerals. The following are examples of patterns that can be parsed:

  • 二千万百一円
  • 5百万
  • 一を聞いて十を知る
  • 五〇六号室
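To illustrate how patterns like these map to values, here is a minimal sketch of positional kanji-numeral evaluation. It is an assumption-laden illustration, not the japanese_numbers implementation: the function name kanji_to_int and its tables are hypothetical, it handles only a single numeral (no surrounding text such as 円 or 号室), and it does not search for numbers inside a sentence.

```python
# Illustrative sketch only; NOT the japanese_numbers implementation.
# kanji_to_int and the tables below are hypothetical names.
DIGITS = {'〇': 0, '一': 1, '二': 2, '三': 3, '四': 4,
          '五': 5, '六': 6, '七': 7, '八': 8, '九': 9}
DIGITS.update({str(d): d for d in range(10)})  # ASCII Arabic digits
SMALL_UNITS = {'十': 10, '百': 100, '千': 1000}
LARGE_UNITS = {'万': 10**4, '億': 10**8, '兆': 10**12}


def kanji_to_int(text):
    """Convert one numeral made of kanji and/or Arabic digits to an int."""
    total = 0       # sum of completed large-unit groups (万, 億, 兆)
    group = 0       # value accumulated below the next large unit
    pending = None  # digits read but not yet attached to a unit
    for ch in text:
        if ch in DIGITS:
            # Consecutive digits are positional, e.g. 五〇六 -> 506.
            pending = (pending or 0) * 10 + DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit such as 百 means 1 * 100.
            group += (1 if pending is None else pending) * SMALL_UNITS[ch]
            pending = None
        elif ch in LARGE_UNITS:
            group += pending or 0
            total += (group or 1) * LARGE_UNITS[ch]
            group, pending = 0, None
        else:
            raise ValueError('not a numeral character: %r' % ch)
    return total + group + (pending or 0)


print(kanji_to_int('二千万百一'))  # 20000101
print(kanji_to_int('5百万'))       # 5000000
print(kanji_to_int('五〇六'))      # 506
```

The sketch keeps three accumulators so that small units (十, 百, 千) multiply only the digits directly before them, while large units (万, 億, 兆) multiply the whole group collected so far.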

Installation

pip install japanese-numbers-python

Usage

The functions to_arabic and to_arabic_numbers are almost stable.

to_arabic returns a list of japanese_numbers.result.ParsedResult instances.

    import japanese_numbers

    japanese_numbers.to_arabic('銀河の向こう、六千三百二十一億千五百十一万二千百八十一光年彼方。')
    # => [<ParsedResult 632115112181 : "六千三百二十一億千五百十一万二千百八十一" index=7>]

    japanese_numbers.to_arabic('一を聞いて十を知る。')
    # => [<ParsedResult 1 : "一" index=0>, <ParsedResult 10 : "十" index=5>]

Each ParsedResult instance exposes the numeric value, the matched text, and the position where the number was found:

    result = japanese_numbers.to_arabic('一を聞いて十を知る。')

    result[0].number  # => 1
    result[0].text    # => '一'
    result[0].index   # => 0 (position where the number was found)

    result[1].number  # => 10
    result[1].text    # => '十'
    result[1].index   # => 5
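One use for the text and index attributes is rewriting the original string with Arabic numerals. The sketch below assumes that index is a character offset into the Unicode input, which is consistent with the examples above. The replace_numbers helper and the Match stand-in are hypothetical names, not part of the library. Match only mimics the three attributes so that the example runs without the package installed; in practice the list returned by to_arabic could presumably be passed instead.

```python
from collections import namedtuple

# Hypothetical stand-in exposing the same three attributes as ParsedResult.
Match = namedtuple('Match', ['number', 'text', 'index'])


def replace_numbers(source, results):
    """Return source with each matched numeral replaced by its Arabic form.

    Assumes each result has .number, .text and .index, where .index is a
    character offset into source and matches do not overlap.
    """
    pieces = []
    cursor = 0
    for r in sorted(results, key=lambda r: r.index):
        pieces.append(source[cursor:r.index])  # text before the match
        pieces.append(str(r.number))           # Arabic replacement
        cursor = r.index + len(r.text)         # skip the matched numeral
    pieces.append(source[cursor:])             # trailing text
    return ''.join(pieces)


source = '一を聞いて十を知る。'
matches = [Match(1, '一', 0), Match(10, '十', 5)]
print(replace_numbers(source, matches))  # 1を聞いて10を知る。
```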

to_arabic_numbers returns a tuple of numbers directly.

    import japanese_numbers

    japanese_numbers.to_arabic_numbers('一を聞いて十を知る。')
    # => (1, 10)

Charsets

Both to_arabic_numbers and to_arabic accept an encode option that specifies the encoding of the input.

It is utf8 by default. If you pass a non-Unicode string to these functions, it is first converted to Unicode using that encoding.

    japanese_numbers.to_arabic_numbers('一を聞いて十を知る。')  # utf8 by default
    japanese_numbers.to_arabic('一を聞いて十を知る。', encode='eucjp')  # set another charset
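The conversion described above can be pictured as decoding a byte string with the named codec before parsing. The following is a sketch of that idea using only the standard library. The ensure_text helper is a hypothetical name, and whether the library performs this step in exactly this way is an assumption.

```python
def ensure_text(value, encode='utf8'):
    """Return value as a Unicode str, decoding bytes with the given codec.

    Hypothetical helper: it mirrors the described behaviour, not the
    library's actual code.
    """
    if isinstance(value, bytes):
        return value.decode(encode)
    return value  # already Unicode, nothing to do


# Bytes as they might arrive from an EUC-JP encoded file or socket.
raw = '一を聞いて十を知る。'.encode('euc_jp')

print(ensure_text(raw, encode='eucjp'))  # 一を聞いて十を知る。
```

On Python 3, ordinary string literals are already Unicode, so an option like this would mainly matter for bytes input.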

TODO

  • support float/double types
  • support negative numbers

Patch

Welcome!
