🐍 mecab-python. You can find the original version here: http://taku910.github.io/mecab/

SamuraiT/mecab-python3

mecab-python3

This is a Python wrapper for the MeCab morphological analyzer for Japanese text. It currently works with Python 3.8 and greater.

Note: If using macOS Big Sur, you'll need to upgrade pip to version 20.3 or higher to use wheels due to a pip issue.

You do not need to write issues in English.

Note that Windows wheels require a Microsoft Visual C++ Redistributable, so be sure to install that.

Basic usage

>>> import MeCab
>>> wakati = MeCab.Tagger("-Owakati")
>>> wakati.parse("pythonが大好きです").split()
['python', 'が', '大好き', 'です']
>>> tagger = MeCab.Tagger()
>>> print(tagger.parse("pythonが大好きです"))
python	python	python	python	名詞-普通名詞-一般
が	ガ	ガ	が	助詞-格助詞
大好き	ダイスキ	ダイスキ	大好き	形状詞-一般
です	デス	デス	です	助動詞	助動詞-デス	終止形-一般
EOS

The API for mecab-python3 closely follows the API for MeCab itself, even when this makes it not very “Pythonic.” Please consult the official MeCab documentation for more information.
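Because the default output is plain text, it is often convenient to split it into rows before further processing. The following is a minimal sketch, not part of mecab-python3: `parse_mecab_output` is a hypothetical helper, and it assumes one token per line with tab-separated fields and a closing `EOS` line, as in the output shown above. Other dictionaries or output formats may use a different layout.

```python
def parse_mecab_output(raw):
    """Split MeCab-style text output into (surface, fields) pairs.

    Assumes one token per line, tab-separated fields, and "EOS" as the
    end-of-sentence marker. This is an illustrative helper, not a MeCab API.
    """
    rows = []
    for line in raw.splitlines():
        if not line or line == "EOS":
            continue  # skip blank lines and the end-of-sentence marker
        surface, *fields = line.split("\t")
        rows.append((surface, fields))
    return rows


# Sample text shaped like the tagger output above (tab separation is assumed).
sample = (
    "python\tpython\tpython\tpython\t名詞-普通名詞-一般\n"
    "大好き\tダイスキ\tダイスキ\t大好き\t形状詞-一般\n"
    "です\tデス\tデス\tです\t助動詞\t助動詞-デス\t終止形-一般\n"
    "EOS\n"
)
rows = parse_mecab_output(sample)
for surface, fields in rows:
    print(surface, fields[-1] if fields else "")
```

With a real tagger, the same helper could be applied to the string returned by `tagger.parse(...)`.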

Installation

Binary wheels are available for MacOS X, Linux, and Windows (64bit), and are installed by default when you use pip:

pip install mecab-python3

These wheels include a copy of the MeCab library, but not a dictionary. In order to use MeCab you'll need to install a dictionary. unidic-lite is a good one to start with:

pip install unidic-lite

To build from source using pip:

pip install --no-binary :all: mecab-python3
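After installing, a quick way to confirm that both the wrapper and a dictionary are visible to the current interpreter is to look up their import names without importing them. This is a small sketch; `check_install` is a hypothetical helper, and it assumes the packages are importable as `MeCab` and `unidic_lite` respectively.

```python
import importlib.util


def check_install(modules=("MeCab", "unidic_lite")):
    """Return {module_name: bool} telling whether each module can be located.

    The default names are assumptions about how mecab-python3 and
    unidic-lite are imported; pass other names to check other packages.
    """
    return {name: importlib.util.find_spec(name) is not None for name in modules}


status = check_install()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

A `missing` entry only means the module was not found on the import path of this interpreter, which is a common symptom of installing into a different virtual environment.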

Dictionaries

In order to use MeCab, you must install a dictionary. There are many different dictionaries available for MeCab. These UniDic packages, which include slight modifications for ease of use, are recommended:

  • unidic: The latest full UniDic.
  • unidic-lite: A slightly modified UniDic 2.1.2, chosen for its small size.

The dictionaries below are not recommended due to being unmaintained for many years, but they are available for use with legacy applications.

For more details on the differences between dictionaries see here.

Common Issues

If you get a RuntimeError when you try to run MeCab, here are some things to check:

Windows Redistributable

You have to install the Microsoft Visual C++ Redistributable to use this package on Windows.

Installing a Dictionary

Run pip install unidic-lite and confirm that works. If that fixes your problem, you either don't have a dictionary installed, or you need to specify your dictionary path like this:

tagger = MeCab.Tagger('-r /dev/null -d /usr/local/lib/mecab/dic/mydic')

Note: on Windows, use nul instead of /dev/null. Alternatively, if you have a mecabrc, you can pass its path after -r.
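To avoid hard-coding `/dev/null` versus `nul`, the argument string can be built with `os.devnull`, which Python sets to the right value for the current platform. This is a sketch under assumptions: `tagger_args` is a hypothetical helper, and the dictionary path is the example path from above, not a location that necessarily exists on your machine.

```python
import os


def tagger_args(dicdir, rcfile=None):
    """Build an argument string for MeCab.Tagger with an explicit dictionary.

    If rcfile is None, os.devnull is used as an empty mecabrc
    ('/dev/null' on POSIX systems, 'nul' on Windows).
    """
    rc = rcfile if rcfile is not None else os.devnull
    return f"-r {rc} -d {dicdir}"


args = tagger_args("/usr/local/lib/mecab/dic/mydic")
print(args)
# With mecab-python3 installed and the dictionary present:
# import MeCab
# tagger = MeCab.Tagger(args)
```

Paths containing spaces may need extra care, since the options are passed as a single string.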

Specifying a mecabrc

If you get this error:

error message: [ifs] no such file or directory: /usr/local/etc/mecabrc

You need to specify a mecabrc file. It's OK to specify an empty file; it just has to exist. You can specify a mecabrc with -r. This may be necessary on Debian or Ubuntu, where the mecabrc is in /etc/mecabrc.

You can specify an empty mecabrc like this:

tagger = MeCab.Tagger('-r/dev/null -d/home/hoge/mydic')
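One way to pick a value for -r is to look for an existing mecabrc in a few likely places and fall back to an empty stand-in when none is found. The sketch below is illustrative only: `find_mecabrc` is a hypothetical helper, and the candidate list contains just the two paths mentioned in this section, so other systems may need different entries.

```python
import os

# Candidate locations taken from this section; adjust for your system.
CANDIDATES = ("/usr/local/etc/mecabrc", "/etc/mecabrc")


def find_mecabrc(candidates=CANDIDATES):
    """Return the first existing mecabrc path, or os.devnull if none exists."""
    for path in candidates:
        if os.path.isfile(path):
            return path
    return os.devnull  # acts as an empty mecabrc


rc = find_mecabrc()
print(f"-r {rc}")
```

The returned path can then be combined with a -d option when constructing the tagger.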

Using Unsupported Output Modes like -Ochasen

Chasen output is not a built-in feature of MeCab; you must specify it in your dicrc or mecabrc. Notably, UniDic does not include the Chasen output format. Please see the MeCab documentation.

Alternatives

  • fugashi is a Cython wrapper for MeCab with a Pythonic interface, by the current maintainer of this library
  • SudachiPy is a modern tokenizer with an actively maintained dictionary
  • pymecab-ko is a wrapper of the Korean MeCab fork mecab-ko based on mecab-python3
  • KoNLPy is a library for Korean NLP that includes a MeCab wrapper

Licensing

Like MeCab itself, mecab-python3 is copyrighted free software by Taku Kudo <taku@chasen.org> and Nippon Telegraph and Telephone Corporation, and is distributed under a 3-clause BSD license (see the file BSD). Alternatively, it may be redistributed under the terms of the GNU General Public License, version 2 (see the file GPL) or the GNU Lesser General Public License, version 2.1 (see the file LGPL).

