A lemmatizer implemented in Go
aaaton/golem
This project is a dictionary-based lemmatizer written in Go.
Since v4, each dictionary must be fetched individually.
go get github.com/aaaton/golem/v4
A lemmatizer is a tool that finds the base form of words.
Lang | Input | Output |
---|---|---|
English | aligning | align |
Swedish | sprungit | springa |
French | abattaient | abattre |
It's based on the dictionaries found on michmech/lemmatization-lists, which are available under the Open Database License. This project would not be feasible without them.
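To illustrate what "dictionary-based" means here, the following is a minimal sketch of a lookup from inflected form to base form. It is not golem's actual implementation: `toyDict` and `lookupLemma` are hypothetical names, the three entries are taken from the table above, and returning the lowercased input for unknown words is an assumption made for this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// toyDict maps inflected forms to lemmas. The three entries come from the
// table above; a real dictionary holds far more entries, one set per language.
var toyDict = map[string]string{
	"aligning":   "align",
	"sprungit":   "springa",
	"abattaient": "abattre",
}

// lookupLemma returns the lemma for word, or the lowercased word itself when
// the dictionary has no entry for it.
// Hypothetical helper for illustration, not part of golem's API.
func lookupLemma(dict map[string]string, word string) string {
	key := strings.ToLower(word)
	if lemma, ok := dict[key]; ok {
		return lemma
	}
	return key
}

func main() {
	for _, w := range []string{"aligning", "Sprungit", "abattaient", "golem"} {
		fmt.Printf("%s -> %s\n", w, lookupLemma(toyDict, w))
	}
}
```

Lowercasing before the lookup makes the match case-insensitive, so "Sprungit" resolves to "springa" while the unknown word "golem" is returned unchanged.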
At the moment golem supports English, Swedish, French, Spanish, Italian and German, but adding another language should be no more trouble than getting the dictionary for that language, and some of these are already available on lexiconista. Please let me know if there is something you would like to see here, or fork the project and create a pull request.
English
go get github.com/aaaton/golem/v4/dicts/en
Swedish
go get github.com/aaaton/golem/v4/dicts/sv
French
go get github.com/aaaton/golem/v4/dicts/fr
German
go get github.com/aaaton/golem/v4/dicts/de
Spanish
go get github.com/aaaton/golem/v4/dicts/es
Italian
go get github.com/aaaton/golem/v4/dicts/it
package main

import (
	"github.com/aaaton/golem/v4"
	"github.com/aaaton/golem/v4/dicts/en"
)

func main() {
	// the language packages are available under golem/dicts
	// "en" is for english
	lemmatizer, err := golem.New(en.New())
	if err != nil {
		panic(err)
	}
	word := lemmatizer.Lemma("Abducting")
	if word != "abduct" {
		panic("The output is not what is expected!")
	}
}