
A tool that divides Japanese full names into family and given names.


rskmoi/namedivider-python


NameDivider is a tool that divides Japanese full names into family and given names.

🚀 Try Live Demo | 📖 Documentation (Japanese) | 🐳 Docker API | ⚡ Rust Version


💡 Why NameDivider?

Japanese full names like "菅義偉" are typically stored as single strings with no explicit boundary between the family name and the given name. NameDivider finds that boundary with up to 99.9% accuracy (GBDT algorithm).

Unlike cloud-based AI solutions, NameDivider processes all data locally: no external API calls, no data transmission, and full privacy control.

    # Before
    person_name = "菅義偉"  # How do you know where to divide?

    # After
    from namedivider import BasicNameDivider

    divider = BasicNameDivider()
    result = divider.divide_name("菅義偉")
    print(f"Family: {result.family}, Given: {result.given}")
    # Family: 菅, Given: 義偉

✨ Key Features

  • 🎯 99.91% accuracy - Tested on real-world Japanese names
  • Multiple algorithms - Choose between speed (Basic) or accuracy (GBDT)
  • 🔐 Privacy-first - Local-only processing, ideal for sensitive data
  • 🔧 Production ready - CLI, Python library, and Docker support
  • 🎨 Interactive demo - Try it live with Streamlit
  • 📊 Confidence scoring - Know when to trust the results
  • 🛠️ Customizable rules - Add domain-specific patterns

🚀 Quick Start

Installation

pip install namedivider-python

Basic Usage

    from namedivider import BasicNameDivider, GBDTNameDivider

    # Fast, with good accuracy (99.3%)
    basic_divider = BasicNameDivider()
    result = basic_divider.divide_name("菅義偉")
    print(result)
    # 菅 義偉

    # Slower, with the best accuracy (99.9%)
    gbdt_divider = GBDTNameDivider()
    result = gbdt_divider.divide_name("菅義偉")
    print(result.to_dict())
    # {
    #   'algorithm': 'gbdt',
    #   'family': '菅',
    #   'given': '義偉',
    #   'score': 0.7300634880343344,
    #   'separator': ' '
    # }
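The `score` field in the output above can feed a simple review queue. The following is a minimal sketch that works on plain dictionaries shaped like `to_dict()` output, so it runs without NameDivider installed. The `split_by_confidence` helper and the `0.5` threshold are hypothetical, and the sketch assumes that a higher score means a more reliable split; check that assumption against your own labeled data before relying on it.

```python
# Hypothetical cut-off; NameDivider does not prescribe one.
REVIEW_THRESHOLD = 0.5


def split_by_confidence(results, threshold=REVIEW_THRESHOLD):
    """Partition to_dict()-style results into (trusted, needs_review)."""
    trusted, needs_review = [], []
    for item in results:
        # Assumption: a higher score means a more reliable split.
        bucket = trusted if item["score"] >= threshold else needs_review
        bucket.append(item)
    return trusted, needs_review


# Sample records copied from outputs shown in this README.
sample = [
    {"algorithm": "gbdt", "family": "菅", "given": "義偉",
     "score": 0.7300634880343344, "separator": " "},
    {"algorithm": "kanji_feature", "family": "竈門", "given": "炭治郎",
     "score": 0.3004587452426102, "separator": ""},
]

trusted, needs_review = split_by_confidence(sample)
print([r["family"] for r in trusted])
print([r["family"] for r in needs_review])
```

Scores from different algorithms may not be on the same scale, so a per-algorithm threshold is probably a safer starting point than a single global one.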

🔧 Multiple Interfaces

🖥️ Command Line Interface

Perfect for batch processing and automation:

    # Single name
    $ nmdiv name 菅義偉
    菅 義偉

    # Process file with progress bar
    $ nmdiv file customer_names.txt
    100%|██████████| 1000/1000 [00:02<00:00, 431.2it/s]

    # Check accuracy on labeled data
    $ nmdiv accuracy test_data.txt
    Accuracy: 99.1%

🐳 REST API (Docker)

For environments where Python cannot be used, we provide a containerized REST API:

    # Run the API server
    docker run -d -p 8000:8000 rskmoi/namedivider-api

    # Send batch requests
    curl -X POST localhost:8000/divide \
      -H "Content-Type: application/json" \
      -d '{"names": ["竈門炭治郎", "竈門禰豆子"]}'

Response:

    {
      "divided_names": [
        {"family": "竈門", "given": "炭治郎", "separator": "", "score": 0.3004587452426102, "algorithm": "kanji_feature"},
        {"family": "竈門", "given": "禰豆子", "separator": "", "score": 0.30480429696983175, "algorithm": "kanji_feature"}
      ]
    }
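The same request can be sent from Python using only the standard library. This is a sketch under a few assumptions: the container is reachable at `http://localhost:8000/divide` as in the `docker run` command above, and the response has the `divided_names` shape shown above. The helper names `build_request`, `parse_response`, and `divide_names` are made up for this example and are not part of NameDivider.

```python
import json
import urllib.request

# Assumption: the API container from the docker command is listening here.
API_URL = "http://localhost:8000/divide"


def build_request(names, url=API_URL):
    """Build the POST request with a {"names": [...]} JSON body."""
    body = json.dumps({"names": names}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_response(raw):
    """Return (family, given) pairs from a response body (str or bytes)."""
    payload = json.loads(raw)
    return [(d["family"], d["given"]) for d in payload["divided_names"]]


def divide_names(names, url=API_URL, timeout=10):
    """Send names to the API and return (family, given) pairs."""
    request = build_request(names, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_response(response.read())


# Offline check of the parser against a trimmed copy of the sample response.
sample_body = '{"divided_names": [{"family": "竈門", "given": "炭治郎"}]}'
print(parse_response(sample_body))
```

With the container running, `divide_names(["竈門炭治郎", "竈門禰豆子"])` performs the actual HTTP call. The network call is kept in its own function so the request building and parsing can be tested without a server.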

🎯 Interactive Web Demo

Try NameDivider instantly in your browser: Live Demo →

Run locally:

    cd examples/demo
    pip install -r requirements.txt
    streamlit run example_streamlit.py

📊 Performance & Benchmarks

| Algorithm | Accuracy | Speed (names/sec) | Use Case |
|---|---|---|---|
| BasicNameDivider / backend=python | 99.3% | 4152.8 | Stable & compatible |
| BasicNameDivider / backend=rust | 99.3% | 18597.7 | Max performance (if available) |
| GBDTNameDivider / backend=python | 99.9% | 1143.3 | Best accuracy, guaranteed |
| GBDTNameDivider / backend=rust | 99.9% | 6277.4 | Fast + accurate (if available) |

Run your own benchmarks:

bash scripts/benchmark_sample.sh
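To turn the published throughput figures into a rough capacity estimate, the arithmetic can be sketched as below. The numbers are copied from the benchmark table in this section; the one-million-name workload is a hypothetical example, and the estimate assumes throughput stays constant on your hardware, which it may not.

```python
# Throughput figures (names/sec) copied from the benchmark table.
THROUGHPUT = {
    ("basic", "python"): 4152.8,
    ("basic", "rust"): 18597.7,
    ("gbdt", "python"): 1143.3,
    ("gbdt", "rust"): 6277.4,
}


def estimated_seconds(n_names, algorithm, backend):
    """Estimate wall-clock seconds, assuming constant throughput."""
    return n_names / THROUGHPUT[(algorithm, backend)]


def rust_speedup(algorithm):
    """Ratio of Rust to Python throughput for one algorithm."""
    return THROUGHPUT[(algorithm, "rust")] / THROUGHPUT[(algorithm, "python")]


N = 1_000_000  # hypothetical workload size

for algorithm, backend in THROUGHPUT:
    minutes = estimated_seconds(N, algorithm, backend) / 60
    print(f"{algorithm:5s} / {backend:6s}: ~{minutes:.1f} min")

for algorithm in ("basic", "gbdt"):
    print(f"{algorithm}: rust is ~{rust_speedup(algorithm):.1f}x the python backend")
```

Treat the result as an order-of-magnitude guide and rerun the benchmark script on the target machine before sizing a batch job.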

🛠️ Advanced Features

Custom Rules

Handle domain-specific names with custom patterns:

    from namedivider import BasicNameDivider, BasicNameDividerConfig
    from namedivider import SpecificFamilyNameRule

    config = BasicNameDividerConfig(
        custom_rules=[
            SpecificFamilyNameRule(family_names=["竜胆"]),  # Rare family names
        ]
    )
    divider = BasicNameDivider(config=config)
    result = divider.divide_name("竜胆尊")
    # DividedName(family='竜胆', given='尊', separator=' ', score=1.0, algorithm='rule_specific_family')
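Conceptually, a specific-family-name rule behaves like a prefix match that takes priority over the statistical algorithm. The sketch below illustrates that idea in plain Python. It is not NameDivider's implementation: `split_on_known_family` is a hypothetical helper, and the longest-match-first ordering is an assumption made for this example.

```python
def split_on_known_family(full_name, family_names):
    """Return (family, given) if full_name starts with a listed family name.

    Returns None when nothing matches, so a caller can fall back to a
    statistical divider. Longer family names are tried first.
    """
    for family in sorted(family_names, key=len, reverse=True):
        # Require at least one character left over for the given name.
        if full_name.startswith(family) and len(full_name) > len(family):
            return family, full_name[len(family):]
    return None


print(split_on_known_family("竜胆尊", ["竜胆"]))
print(split_on_known_family("菅義偉", ["竜胆"]))
```

Returning `None` on a miss keeps the rule composable: only names that match a listed family name bypass the general algorithm.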

Speed Up

For high-volume processing, NameDivider offers several optimization options:

    from namedivider import BasicNameDivider, BasicNameDividerConfig

    # Load your names
    with open("names.txt", "r", encoding="utf-8") as f:
        names = [line.strip() for line in f]

    # Option 1: Enable caching (faster repeated processing)
    config = BasicNameDividerConfig(cache_mask=True)
    divider = BasicNameDivider(config=config)
    results = [divider.divide_name(name) for name in names]

    # Option 2: (beta) Use Rust backend (up to 4x faster)
    # First install: pip install namedivider-core
    config = BasicNameDividerConfig(backend="rust")
    divider = BasicNameDivider(config=config)
    results = [divider.divide_name(name) for name in names]
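Independently of the options above, datasets with many duplicate names can skip repeated work by memoizing results at the call site. This is a different technique from `cache_mask`, whose internals are not described here. The sketch uses `functools.lru_cache` from the standard library; `make_cached_divider` is a hypothetical helper, and `fake_divide` is a stand-in so the example runs without NameDivider installed.

```python
from functools import lru_cache


def make_cached_divider(divide, maxsize=100_000):
    """Wrap a one-argument divide function so repeated names are computed once."""
    return lru_cache(maxsize=maxsize)(divide)


# Stand-in for divider.divide_name; records how often it is really called.
calls = []


def fake_divide(name):
    calls.append(name)
    return name[:1], name[1:]  # NOT a real splitting rule


cached_divide = make_cached_divider(fake_divide)

names = ["菅義偉", "菅義偉", "菅義偉"]
results = [cached_divide(name) for name in names]
print(len(results), len(calls))
```

In real use the wrapped callable would be something like `divider.divide_name`. Whether memoization helps depends on how many duplicates the data contains, so measure before adopting it.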

🏢 Typical Use Cases

  • Customer Data Processing - Clean and standardize name databases
  • Form Validation - Real-time name splitting in web applications
  • Analytics & Reports - Generate family name statistics
  • Data Migration - Convert legacy systems with combined name fields
  • Government & Municipal - Process citizen registration data
  • Security-sensitive Environments - Process names without sending data to external APIs
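As an illustration of the data migration case, the sketch below copies a CSV with a combined name column into one with separate family and given columns. Everything specific here is an assumption for the example: the `migrate_names` helper, the `full_name`, `family_name`, and `given_name` column names, and the `KNOWN` lookup table that stands in for a real divider so the code runs on its own.

```python
import csv
import io


def migrate_names(src, dst, divide, column="full_name"):
    """Copy CSV rows from src to dst, adding family_name and given_name.

    `divide` is any callable mapping a full name to a (family, given) pair.
    """
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames) + ["family_name", "given_name"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row["family_name"], row["given_name"] = divide(row[column])
        writer.writerow(row)


# Stand-in lookup so the sketch runs without NameDivider installed.
KNOWN = {"菅義偉": ("菅", "義偉"), "竈門炭治郎": ("竈門", "炭治郎")}

legacy = io.StringIO("id,full_name\n1,菅義偉\n2,竈門炭治郎\n")
migrated = io.StringIO()
migrate_names(legacy, migrated, KNOWN.__getitem__)
print(migrated.getvalue())
```

To use NameDivider instead of the lookup table, pass a small wrapper that calls `divider.divide_name(name)` and returns `(result.family, result.given)`.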

📚 Examples & Tutorials

📄 License

Source code and gbdt_model_v1.txt

MIT License

bert_katakana_v0_3_0.pt

cc-by-sa-4.0

family_name_repository.pickle

English

(1) Purpose of use

family_name_repository.pickle is available for commercial and non-commercial use when you use this software to divide names or to develop algorithms for dividing names.

Any other use of family_name_repository.pickle is prohibited.

(2) Liability

The author or copyright holder assumes no responsibility for the software.

Japanese / 日本語

(1) 利用目的

このソフトウェアを用いて姓名分割、および姓名分割アルゴリズムの開発をする場合、family_name_repository.pickleは商用/非商用問わず利用可能です。

それ以外の目的でのfamily_name_repository.pickleの利用を禁じます。

(2) 責任

作者または著作権者は、family_name_repository.pickleに関して一切の責任を負いません。

The family name data used in family_name_repository.pickle is provided by Myoji-Yurai.net (名字由来net).

🔗 Related Projects

📈 Project Stats


Trusted by developers worldwide


Made with ❤️ by @rskmoi • Contact: @rskmoi
