Towards a Japanese verb conjugator and deconjugator based on Taeko Kamiya's *The Handbook of Japanese Verbs* and *The Handbook of Japanese Adjectives and Adverbs* opuses.
fasiha/kamiya-codec
Table of contents
- kamiya-codec
- Install
- Usage for verbs
  - conjugate(verb: string, conj: Conjugation, typeII: boolean = false): string[]
  - conjugateAuxiliaries(verb: string, auxs: Auxiliary[], conj: Conjugation, typeII: boolean = false): string[]
  - type Conjugation and conjugations
  - type Auxiliary and auxiliaries
  - verbDeconjugate(conjugated: string, dictionaryForm: string, typeII = false, maxAuxDepth = Infinity)
- Usage for adjectives
- Development
- Changelog
A dependency-free browser/Node JavaScript/TypeScript library to conjugate and deconjugate Japanese
- verbs,
- auxiliary verbs, and
- adjectives
based on Taeko Kamiya's *The Handbook of Japanese Verbs* (Kodansha) and *The Handbook of Japanese Adjectives and Adverbs* (Kodansha). The idea is, you have a verb, say 書く (to write), and maybe an auxiliary like たい (wanting to do something), and finally a conjugation, like negative. Then,
    var codec = require('kamiya-codec');
    codec.conjugateAuxiliaries('書く', ['Tai'], 'Negative')
    // [ '書きたくない' ]
gives us what we want: 書きたくない, or, “doesn’t want to write”.
Similarly, you can ask the library to attempt to reverse this conjugation:
    codec.verbDeconjugate('書きたくない', '書く')
    // [ { conjugation: 'Negative', auxiliaries: [ 'Tai' ], result: [ '書きたくない' ] } ]
This library will make most sense if you have the book(s) for reference. It currently implements the first part of each book.
Install
Node.js developers: `npm install --save kamiya-codec` will add this package to your current project.
Depending on what module system you use, you can either `require` (CommonJS):
    var codec = require("kamiya-codec");
    console.log(codec.conjugateAuxiliaries("書く", ["Tai"], "Negative"));
or you can `import` (ESM, i.e., ECMAScript Modules), which will work for TypeScript:
    import * as codec from "kamiya-codec";
    console.log(codec.conjugateAuxiliaries("書く", ["Tai"], "Negative"));
    // or alternatively
    import { conjugateAuxiliaries } from "kamiya-codec";
    console.log(conjugateAuxiliaries("書く", ["Tai"], "Negative"));
Similarly for the browser you have two choices: ESM (ECMAScript modules) or a globally-defined variable.
If you use ESM (widely supported by modern browsers), drop `kamiya.min.mjs` somewhere your HTML can see and then
    <script type="module">
    import * as codec from "./kamiya.min.mjs";
    console.log(codec.conjugateAuxiliaries("書く", ["Tai"], "Negative"));
    </script>
Alternatively, if you want just a plain JavaScript file defining a global variable, then put `kamiya.min.js` (so with `.js` instead of `.mjs`) somewhere your HTML can see, then
    <script src="kamiya.min.js"></script>
    <script>
    console.log(kamiya.conjugateAuxiliaries("書く", ["Tai"], "Negative"));
    </script>
The first `<script>` will load this library under the `kamiya` global variable name.
For everyone else who just wants to poke around:
    $ git clone https://github.com/fasiha/kamiya-codec.git
    $ cd kamiya-codec
    $ npm install
    $ npm run build
where, in the above, each line is one command, and the `$` represents your terminal's prompt (not to be typed in: the first letters you type should be "git…"). This makes a copy of this repository on your computer (`git …`), changes into the new directory (`cd …`), installs a few JavaScript dependencies (`npm install`; `npm` is the Node.js package manager that was installed when you installed Node.js), and finally builds the TypeScript source code to Node.js-ready JavaScript (`npm run…`).
Then you can start a new Node.js shell (run `node` in the terminal) or create a new JavaScript or TypeScript program to exercise this library:
    var codec = require('./index');
    codec.conjugateAuxiliaries('書く', ['Tai'], 'Negative')
    // [ '書きたくない' ]
Usage for verbs
conjugate(verb: string, conj: Conjugation, typeII: boolean = false): string[]
Conjugates a `verb` in dictionary form with a given conjugation (see below for list of allowed values). Returns an array of strings (guaranteed to contain at least one element; a single element is the most common case).
This library doesn't yet have a perfect way to tell type I (五段) verbs from type II (一段) ones, so all functions including `conjugate` accept a `typeII` boolean to let you specify that the incoming verb is or isn't type II. (I'm not very fond of opaque names like type I and type II but to maximally take advantage of Taeko Kamiya's book, we use her notation.)
Irregular verbs
- する
- 来る・くる
are handled specially and ignore `typeII`.
conjugateAuxiliaries(verb: string, auxs: Auxiliary[], conj: Conjugation, typeII: boolean = false): string[]
Given a `verb` as well as an array of auxiliary verbs (`auxs`, see below for list of allowed values), plus the final `conj`ugation and the optional `typeII` boolean (false if 五段 (default), true if 一段), apply each of the auxiliaries to the verb and conjugate the result.
Note that the following two calls are equivalent:
    conjugate(verb, conj, typeII)
    // deepEquals
    conjugateAuxiliaries(verb, [], conj, typeII)
As above, する and 来る・くる irregular verbs will be conjugated correctly and will ignore `typeII`.
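The equivalence above can be spot-checked with a small helper. This is only a sketch: `checkNoAuxEquivalence` and `sameForms` are hypothetical names, and `codec` is passed in as an argument so the helper can be exercised with a stand-in object as well as with the real library.

```javascript
// Compare two arrays of strings element by element.
function sameForms(a, b) {
  return a.length === b.length && a.every((form, i) => form === b[i]);
}

// Returns true when conjugate(...) and conjugateAuxiliaries(..., [], ...)
// agree for the given verb and conjugation.
// `codec` is expected to expose `conjugate` and `conjugateAuxiliaries`
// with the signatures documented above.
function checkNoAuxEquivalence(codec, verb, conj, typeII = false) {
  const direct = codec.conjugate(verb, conj, typeII);
  const viaAuxiliaries = codec.conjugateAuxiliaries(verb, [], conj, typeII);
  return sameForms(direct, viaAuxiliaries);
}

// With the real library (not run here):
//   const codec = require("kamiya-codec");
//   checkNoAuxEquivalence(codec, "書く", "Negative");
```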
type Conjugation and conjugations
Conjugations must be one of the following:
    | "Negative"
    | "Conjunctive"
    | "Dictionary"
    | "Conditional"
    | "Imperative"
    | "Volitional"
    | "Te"
    | "Ta"
    | "Tara"
    | "Tari"
    | "Zu" // not in Kamiya
    | "Nu" // not in Kamiya
`conjugations` is an array containing all allowed values (for looping, etc.).
Note that `Zu` (the traditional variant of `Negative`) is not included in Kamiya's book but I have included it here. I have also added `Nu`, the ぬ-form of ず.
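As an illustration of looping over `conjugations`, the sketch below builds a table of every conjugation for one verb. `conjugationTable` is a hypothetical helper, not part of the library, and the `try`/`catch` reflects an assumption that some verb and conjugation pairs might throw.

```javascript
// Build a { conjugationName: forms[] } table for one verb.
// `codec` is expected to expose `conjugate` and `conjugations` as documented above.
function conjugationTable(codec, verb, typeII = false) {
  const table = {};
  for (const conj of codec.conjugations) {
    try {
      table[conj] = codec.conjugate(verb, conj, typeII);
    } catch (err) {
      // Assumption: unsupported combinations may throw; record them as empty.
      table[conj] = [];
    }
  }
  return table;
}

// With the real library (not run here):
//   const codec = require("kamiya-codec");
//   console.log(conjugationTable(codec, "書く"));
```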
type Auxiliary and auxiliaries
Auxiliaries must be one of the following:
    | "Potential"
    | "Masu"
    | "Nai"
    | "Tai"
    | "Tagaru"
    | "Hoshii"
    | "Rashii"
    | "SoudaHearsay"
    | "SoudaConjecture"
    | "SeruSaseru"
    | "ShortenedCausative"
    | "ReruRareru"
    | "CausativePassive"
    | "ShortenedCausativePassive"
    | "Ageru" // Kamiya section 7.15
    | "Sashiageru"
    | "Yaru"
    | "Morau" // 7.16
    | "Itadaku"
    | "Kureru" // 7.17
    | "Kudasaru"
    | "TeIru" // 7.5 - 7.6
    | "TeAru" // 7.7
    | "Miru" // 7.22
    | "Iku" // 7.23
    | "Kuru" // 7.24
    | "Oku" // 7.25
    | "Shimau" // 7.26
    | "TeOru" // Not in Kamiya
`auxiliaries` is an array of all allowed values.
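Because both `auxiliaries` and `conjugations` are plain arrays, they can be used to validate untyped input (for example, strings from a web form) before calling the library. A minimal sketch, where `validateRequest` is a hypothetical helper and `codec` is passed in so a stand-in object can be used for testing:

```javascript
// Check that every requested auxiliary and the final conjugation are
// among the allowed values exported by the library.
function validateRequest(codec, auxs, conj) {
  const badAuxiliaries = auxs.filter((aux) => !codec.auxiliaries.includes(aux));
  const conjugationOk = codec.conjugations.includes(conj);
  return {
    ok: badAuxiliaries.length === 0 && conjugationOk,
    badAuxiliaries,
    badConjugation: conjugationOk ? null : conj,
  };
}

// With the real library (not run here):
//   const codec = require("kamiya-codec");
//   validateRequest(codec, ["Tai"], "Negative").ok;
```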
verbDeconjugate(conjugated: string, dictionaryForm: string, typeII = false, maxAuxDepth = Infinity)
Given a `conjugated` form of a verb, and its `dictionaryForm` (ending in る or one of the other うくぐ⋯) and that dictionary form's `typeII` boolean (false if 五段 (default), true if 一段), attempt to deconjugate: find the list of auxiliaries and the final conjugation that produce the first argument when put through `conjugate` or `conjugateAuxiliaries` (above).
`maxAuxDepth` can meaningfully be 0 (don't check for auxiliaries), 1, 2, or 3, and for increasing values will look for more and more auxiliaries that might lead from `dictionaryForm` to the `conjugated` form.
The returned object has this type:
    interface Deconjugated {
      auxiliaries: Auxiliary[];
      conjugation: Conjugation;
      result: string[];
    }
As you might imagine, given the inputs required, I expect you to use this alongside a morphological parser like MeCab that can give you the lemma (dictionary form) and whether or not your conjugated phrase is type I or II, etc.
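As a rough sketch of that pairing, the helper below pulls the lemma and a `typeII` guess out of one line of MeCab output. It assumes MeCab's default output with the IPADIC dictionary, where the comma-separated features are part of speech (four fields), conjugation type, conjugation form, lemma, reading, and pronunciation; other dictionaries such as UniDic use a different layout. `parseMecabLine` is a hypothetical name, and a conjugated phrase usually spans several MeCab tokens, so only the first (verb) token is relevant here.

```javascript
// Parse one MeCab output line of the form "surface\tfeature,feature,...".
// Assumes the IPADIC feature order; index 4 is the conjugation type
// (e.g. "一段" or "五段・カ行イ音便") and index 6 is the dictionary form.
function parseMecabLine(line) {
  const [surface, featureString = ""] = line.split("\t");
  const features = featureString.split(",");
  const conjugationType = features[4] || "";
  return {
    surface,
    partOfSpeech: features[0] || "",
    lemma: features[6] || surface,
    // Type II in Kamiya's notation corresponds to 一段 verbs.
    typeII: conjugationType.startsWith("一段"),
  };
}

const token = parseMecabLine("書き\t動詞,自立,*,*,五段・カ行イ音便,連用形,書く,カキ,カキ");
console.log(token.lemma, token.typeII); // 書く false
```

The resulting `lemma` and `typeII` values are what `verbDeconjugate` expects as its second and third arguments.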
This is very brute-force and might fail for your input. Please open an issue with examples that don't deconjugate.
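To make "brute-force" concrete, here is a minimal sketch of the general idea: try every sequence of auxiliaries up to a given depth with every final conjugation, and keep the combinations whose output contains the target string. This is not the library's actual implementation; `bruteForceDeconjugate` and `sequences` are hypothetical names, and the sketch defaults `maxAuxDepth` to 1 because exhaustive enumeration with an unbounded depth would never finish.

```javascript
// Yield every ordered sequence of `length` items drawn from `items`.
function* sequences(items, length) {
  if (length === 0) {
    yield [];
    return;
  }
  for (const head of items) {
    for (const tail of sequences(items, length - 1)) {
      yield [head, ...tail];
    }
  }
}

// `codec` is expected to expose `auxiliaries`, `conjugations`, and
// `conjugateAuxiliaries` as documented above.
function bruteForceDeconjugate(codec, conjugated, dictionaryForm, typeII = false, maxAuxDepth = 1) {
  const hits = [];
  for (let depth = 0; depth <= maxAuxDepth; depth++) {
    for (const auxs of sequences(codec.auxiliaries, depth)) {
      for (const conjugation of codec.conjugations) {
        let result;
        try {
          result = codec.conjugateAuxiliaries(dictionaryForm, auxs, conjugation, typeII);
        } catch (err) {
          continue; // Assumption: invalid combinations may throw; skip them.
        }
        if (result.includes(conjugated)) {
          hits.push({ auxiliaries: auxs, conjugation, result });
        }
      }
    }
  }
  return hits;
}
```

The number of combinations grows as the number of auxiliaries raised to the depth, which is one reason to keep the depth small.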
Usage for adjectives
The adjective conjugator, given the dictionary form of an adjective (e.g., 楽しい or 簡単; note that な adjectives should *not* be given with な added on), a conjugation (see below), and whether this is an い-adjective or not, returns an array of strings with that conjugation.
Adjective conjugations must be one of the following:
    | "Negative"
    | "Conditional"
    | "Tari"
    | "Present"
    | "Prenomial"
    | "Past"
    | "NegativePast"
    | "ConjunctiveTe"
    | "Adverbial"
    | "TaraConditional"
    | "Noun"
    | "StemSou" // Section 4.5
    | "StemNegativeSou" // Section 4.5
`adjConjugations` is an array of all valid values.
`StemSou` and `StemNegativeSou` are from §4.5 "Adj stem + sō da" of *Handbook of Japanese Adjectives and Adverbs*, and mean "look" or "look like". They are separated into positive vs negative forms because they are quite irregular and both yield な-adjectives.
With
    interface AdjDeconjugated {
      conjugation: AdjConjugation;
      result: string[];
    }
the adjective deconjugator attempts to deconjugate a string given its dictionary form and its い vs な status. It is brute force. Again, the expectation is you would use this with MeCab or similar.
Development
Run tests with `npm test`. We use `tape` and all exported functions have tests in the `tests/` directory. Tests currently happen to all be in JavaScript.
Changelog
Neverending TypeScript woes 😅
"じゃなく" is correctly identified as the copula だ + `Nai` negative auxiliary + `Conjunctive` connective final conjugation.
- `Oku`: "verb + ておく"'s abbreviation, "とく", can undergo 連濁 to become "どく" (see, e.g., Bunpro)
- `Iku`: similarly, てく (でく) is an abbreviation for "verb + ていく" (see, e.g., Jisho.org)
Add `Nu`, the classical/literary negative like `Zu` (see Bunpro and JLPTSensei)
Just cleaning up JavaScript/Node/TypeScript export behavior.
Let `Yaru` be the final auxiliary when deconjugating, e.g., handle させてやる (する + `SeruSaseru` + `Yaru` in `Dictionary` form).
Deconjugate `ReruRareru` + `Nai` + finally `Conjunctive`, e.g., おさえる to おさえられなく.
Allow deconjugator to work with "かいていただけません", i.e., Itadaku + Potential + Masu (and finally Negative).
Allow deconjugator to work with "してもらいたい", i.e., Morau + Tai auxiliaries. We might need to think of a more long-term solution than the bandaid I used but for now, this is a quick tactical fix.
This is only a housekeeping update for nerds: using esbuild to export IIFE and ESM things for others to use. Hopefully this breaks nothing.
Ugh I had a typo all this time: instead of `ReruRareru` I was missing an `r`. I know this is a breaking change but I cannot even.
The `Oku` auxiliary's ~ておく can be colloquially shortened to ~とく per mattb on Japanese Stack Exchange.
Allow だ + Nai + Te = じゃなくて.
Added `StemSou` and `StemNegativeSou` conjugations to adjectives.
Added `Zu` conjugation (old form of `Negative`).
Added polite です endings to `SoudaConjecture`, so we can do 読む + Potential + SoudaConjecture (polite) + Ta (past tense) = 読めそうでした. (I haven't added it to `SoudaHearsay`; I haven't encountered that yet.)
Add a few contractions of the `Shimau` auxiliary, ~てしまう:
- ~ちゃう (chau)
- ~ちまう (chimau; or ~じまう and ~ぢまう (jimau/dimau) with rendaku when て becomes で)
See SLJ FAQ on this topic.
Add the てる contraction of ている
Kamiya only mentions one form of ない's て-form: on page 37, e.g., (買わ)なくて. But we also encounter (買わ)ないで, see, e.g., this from JapanesePod101:
Nakute indicates “cause and effect”. Naide means “without”.
This version adds this second nai+te form.
This isn't in Kamiya's verbs book, but I added `TeOru`, 居る, a kenjougo (humble) synonym for いる.
Renames
- `TeAruNoun` → `TeAru`
- `TeIruNoun` → `TeIru`
and deconjugates these as well.
Adds sparse support for copulas だ and です: pages 34-35 of *Verbs*.
3.0 replaced `conjugateAuxiliary` with the more robust `conjugateAuxiliaries` which can take an array of auxiliaries. Check it out: start with
- 知る
- → causative form (`SeruSaseru`)
- → "do something" (for me or someone, `Kureru`)
- → polite (`Masu`)
- → past tense (`Ta`)
- ➜ 知らせてくれました 💪! (Example from page 164 of *Handbook of Japanese Verbs*, section 7.17, example 2.)
conjugateAuxiliaries('知る', ['SeruSaseru', 'Kureru', 'Masu'], 'Ta') // [ '知らせてくれました' ]
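To see how such a chain builds up, one option is to conjugate each growing prefix of the auxiliary list in `Dictionary` form before applying the final conjugation. This is only a sketch: `auxiliarySteps` is a hypothetical helper, and it assumes every prefix of the list is a combination the library accepts, which has not been verified for all auxiliaries.

```javascript
// List the intermediate forms produced by each prefix of `auxs`,
// followed by the full chain in its final conjugation.
// `codec` is expected to expose `conjugateAuxiliaries` as documented above.
function auxiliarySteps(codec, verb, auxs, finalConj, typeII = false) {
  const steps = [];
  for (let n = 1; n <= auxs.length; n++) {
    const prefix = auxs.slice(0, n);
    steps.push({
      auxiliaries: prefix,
      conjugation: "Dictionary",
      forms: codec.conjugateAuxiliaries(verb, prefix, "Dictionary", typeII),
    });
  }
  steps.push({
    auxiliaries: auxs,
    conjugation: finalConj,
    forms: codec.conjugateAuxiliaries(verb, auxs, finalConj, typeII),
  });
  return steps;
}

// With the real library (not run here):
//   const codec = require("kamiya-codec");
//   console.log(auxiliarySteps(codec, "知る", ["SeruSaseru", "Kureru", "Masu"], "Ta"));
```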
Consolidated deconjugator also.
2.0 converted from enums to discriminated unions; added adjectives; added brute force deconjugators.