cld3

R Wrapper for Google’s Compact Language Detector 3

Project Status: Active – The project has reached a stable, usable state and is being actively developed.

Google’s Compact Language Detector 3 is a neural network model for language identification and the successor of CLD2 (available from CRAN). This version is still experimental and uses a novel algorithm with different properties and outcomes. For more information see: https://github.com/google/cld3#readme

Example

The function detect_language() is vectorised: it guesses the language of each string in text, or returns NA if the language could not be reliably determined.

> library(cld3)
> example(cld3)

cld3> # Vectorized best guess
cld3> detect_language(c("To be or not to be?", "Ce n'est pas grave.", "猿も木から落ちる"))
[1] "en" "fr" "ja"
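Because detect_language() returns NA for strings it cannot identify reliably, downstream code usually has to handle missing guesses explicitly. The following is a minimal sketch of one way to do that. The helper fill_undetected() and its "und" placeholder are assumptions made for illustration and are not part of cld3.

```r
# Hypothetical helper (not part of cld3): replace NA guesses with a placeholder.
fill_undetected <- function(langs, placeholder = "und") {
  ifelse(is.na(langs), placeholder, langs)
}

texts <- c("To be or not to be?", "Ce n'est pas grave.", "猿も木から落ちる")

# Only call the detector when the cld3 package is installed.
if (requireNamespace("cld3", quietly = TRUE)) {
  langs <- cld3::detect_language(texts)
  # One row per input string, with NA guesses replaced by the placeholder.
  print(data.frame(text = texts, language = fill_undetected(langs)))
}
```

Keeping the NA handling in a separate function makes it easy to switch to another policy, such as dropping undetected strings instead of labelling them.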

The function detect_language_mixed() is not vectorised: it treats the entire character vector as a single text and detects all languages that occur in it.

cld3> # Multiple languages in one text
cld3> detect_language_mixed("This piece of text is in English. Този текст е на Български.", size = 3)
  language probability reliable proportion
1       bg   0.9173891     TRUE  0.5853658
2       en   0.9999790     TRUE  0.4146341
3      und   0.0000000    FALSE  0.0000000
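The result above is a data frame with one row per candidate language, padded with und rows up to size. A minimal sketch of post-processing it, assuming the columns shown in that output (language, probability, reliable, proportion), could look like this. The helper reliable_languages() is hypothetical and not part of cld3.

```r
# Hypothetical helper (not part of cld3): keep only rows that are flagged as
# reliable and are not the "und" (undetermined) filler.
reliable_languages <- function(result) {
  keep <- result$reliable & result$language != "und"
  result[keep, c("language", "proportion")]
}

# Only call the detector when the cld3 package is installed.
if (requireNamespace("cld3", quietly = TRUE)) {
  mixed <- cld3::detect_language_mixed(
    "This piece of text is in English. Този текст е на Български.",
    size = 3
  )
  print(reliable_languages(mixed))
}
```

Filtering on the reliable flag rather than on a probability threshold keeps the decision with the detector itself; a stricter cut-off on probability could be added if needed.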

Installation

Binary packages for OS-X or Windows can be installed directly from CRAN:

install.packages("cld3")

Installation from source on Linux or OSX requires Google’s Protocol Buffers library. On Debian or Ubuntu install libprotobuf-dev and protobuf-compiler:

sudo apt-get install -y libprotobuf-dev protobuf-compiler

On Fedora we need protobuf-devel:

sudo yum install protobuf-devel

On CentOS / RHEL we install protobuf-devel (https://src.fedoraproject.org/rpms/protobuf) via EPEL:

sudo yum install epel-release
sudo yum install protobuf-devel

On OS-X use protobuf from Homebrew:

brew install protobuf

License

  • Apache License 2.0


