Commit 1575a31

updated javadoc, changelog and readme

1 parent 19b7b73, commit 1575a31

File tree

4 files changed, +114 -60 lines changed

CHANGELOG.md
README.md
src/main/java/com/indoqa/fsa
- Acceptor.java
- morfologik
  - MorfologikAcceptor.java

`‎CHANGELOG.md‎`

Lines changed: 2 additions & 2 deletions

Original file line number	Diff line number	Diff line change
`@@ -1,15 +1,15 @@`
`1`	`1`	`#v0.2.2 \| 2019-03-11`
`2`	`2`	`* introduced version number for file format`
`3`	`3`	`* changed replacement detection to use sorting instead of a map`
`4`		`-* mproved minifying of acceptors`
	`4`	`+* improved minifying of acceptors`
`5`	`5`	`* bug fixes`
`6`	`6`
`7`	`7`	`#v0.2.1 \| 2018-11-21`
`8`	`8`	`* optimized minifying and remapping`
`9`	`9`	`* added transducer methods`
`10`	`10`
`11`	`11`	`#v0.2.0 \| 2018-06-15`
`12`		`-* moved char reading/writing method into separator class`
	`12`	`+* moved char reading/writing method into separate class`
`13`	`13`	`* improved memory footprint and execution time for creating CharAcceptors`
`14`	`14`	`* fixed issue with serializing CharAcceptor`
`15`	`15`	`* performance optimization of case-insensitive char comparison`

`‎README.md‎`

Lines changed: 7 additions & 1 deletion

Original file line number	Diff line number	Diff line change
`@@ -1 +1,7 @@`
`1`		`-#indoqa-fsa`
	`1`	`+#indoqa-fsa`
	`2`	`+`
	`3`	`+Provides an abstraction layer for acceptors and transducers from [Morfologik](https://github.com/morfologik/) as well as alternative implementations.`
	`4`	`+`
	`5`	`+The abstraction layer handles the conversion between Strings and bytes, offers support for case-insensitive operations and easier construction of acceptors and transducers.`
	`6`	`+`
	`7`	`+The alternative implementations work directly on characters, which results in better runtime behaviour and greatly reduced need for garbage collection.`

`‎src/main/java/com/indoqa/fsa/Acceptor.java‎`

Lines changed: 105 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -20,32 +20,137 @@`
`20`	`20`
`21`	`21`	`public interface Acceptor {`
`22`	`22`
	`23`	`+/**`
	`24`	`+ * Checks whether or not this {@link Acceptor} accepts the given <code>sequence</code>.`
	`25`	`+ *`
	`26`	`+ * @param sequence The {@link CharSequence} to check.`
	`27`	`+ * @return <code>true</code> if and only if this {@link Acceptor} accepts the given <code>sequence</code>`
	`28`	`+ */`
`23`	`29`	`boolean accepts(CharSequence sequence);`
`24`	`30`
	`31`	`+/**`
	`32`	`+ * Performs the same operation as {@link #accepts(CharSequence)} but on the part of <code>sequence</code> denoted by`
	`33`	`+ * <code>start</code> and <code>length</code>.`
	`34`	`+ *`
	`35`	`+ * @see #accepts(CharSequence)`
	`36`	`+ */`
`25`	`37`	`boolean accepts(CharSequence sequence, int start, int length);`
`26`	`38`
	`39`	`+/**`
	`40`	`+ * Find all accepted inputs at the beginning of given <code>charSequence</code>.<br/>`
	`41`	`+ * <br/>`
	`42`	`+ * Given the sequence <code>aa bbb cccc ddddd</code><br/>`
	`43`	`+ * and the accepted inputs <code>a</code>, <code>aa</code>, <code>aaa</code>, <code>b</code>, <code>bb</code>,`
	`44`	`+ * <code>bbb</code><br/>`
	`45`	`+ * the matches will be <code>a</code> and <code>aa</code>`
	`46`	`+ *`
	`47`	`+ * @param charSequence The {@link CharSequence} to examine.`
	`48`	`+ * @return All accepted inputs at the beginning of the charSequence.`
	`49`	`+ */`
`27`	`50`	`String[] getAllMatches(CharSequence sequence);`
`28`	`51`
	`52`	`+/**`
	`53`	`+ * Performs the same operation as {@link #getAllMatches(CharSequence)} but on the part of <code>sequence</code> denoted by`
	`54`	`+ * <code>start</code> and <code>length</code>.`
	`55`	`+ *`
	`56`	`+ * @see #getAllMatches(CharSequence)`
	`57`	`+ */`
`29`	`58`	`String[] getAllMatches(CharSequence sequence, int start, int length);`
`30`	`59`
	`60`	`+/**`
	`61`	`+ * Find all accepted inputs in the given <code>charSequence</code>.<br/>`
	`62`	`+ * <p>`
	`63`	`+ * The only difference to {@link #getAllTokens(CharSequence)} is that the accepted input may occur at any position within the`
	`64`	`+ * <code>charSequence</code> (specifically start and end inside a token).`
	`65`	`+ * </p>`
	`66`	`+ *`
	`67`	`+ * @param charSequence The {@link CharSequence} to examine.`
	`68`	`+ * @return all occurrences of accepted input`
	`69`	`+ */`
`31`	`70`	`List<Token> getAllOccurrences(CharSequence sequence);`
`32`	`71`
	`72`	`+/**`
	`73`	`+ * Performs the same operation as {@link #getAllOccurrences(CharSequence)} but on the part of <code>sequence</code> denoted by`
	`74`	`+ * <code>start</code> and <code>length</code>.`
	`75`	`+ *`
	`76`	`+ * @see #getAllOccurrences(CharSequence)`
	`77`	`+ */`
`33`	`78`	`List<Token> getAllOccurrences(CharSequence sequence, int start, int length);`
`34`	`79`
	`80`	`+/**`
	`81`	`+ * Find all accepted inputs that are tokens in the given <code>charSequence</code>.<br/>`
	`82`	`+ * <p>`
	`83`	`+ * A part of the given sequence is considered to be a <code>token</code>, when it starts and ends at a token boundary.<br/>`
	`84`	`+ * A token boundary is the change from a non-word character to a word character (or vice-versa), as well as the beginning and end`
	`85`	`+ * of the whole sequence.<br/>`
	`86`	`+ * Please note that a token may contain token boundaries.`
	`87`	`+ * </p>`
	`88`	`+ *`
	`89`	`+ * @param charSequence The {@link CharSequence} to examine.`
	`90`	`+ * @return All tokens of accepted inputs.`
	`91`	`+ */`
`35`	`92`	`List<Token> getAllTokens(CharSequence sequence);`
`36`	`93`
	`94`	`+/**`
	`95`	`+ * Performs the same operation as {@link #getAllTokens(CharSequence)} but on the part of <code>sequence</code> denoted by`
	`96`	`+ * <code>start</code> and <code>length</code>.`
	`97`	`+ *`
	`98`	`+ * @see #getAllTokens(CharSequence)`
	`99`	`+ */`
`37`	`100`	`List<Token> getAllTokens(CharSequence sequence, int start, int length);`
`38`	`101`
	`102`	`+/**`
	`103`	`+ * Find the longest accepted input at the beginning of given <code>charSequence</code>.<br/>`
	`104`	`+ * <br/>`
	`105`	`+ * Given the sequence <code>aa bbb cccc ddddd</code><br/>`
	`106`	`+ * and the accepted inputs <code>a</code>, <code>aa</code>, <code>aaa</code>, <code>b</code>, <code>bb</code>,`
	`107`	`+ * <code>bbb</code><br/>`
	`108`	`+ * the longest match will be <code>aa</code>`
	`109`	`+ *`
	`110`	`+ * @param charSequence The charSequence to examine.`
	`111`	`+ * @return The longest accepted input at the beginning of the charSequence.`
	`112`	`+ */`
`39`	`113`	`String getLongestMatch(CharSequence sequence);`
`40`	`114`
	`115`	`+/**`
	`116`	`+ * Performs the same operation as {@link #getLongestMatch(CharSequence)} but on the part of <code>sequence</code> denoted by`
	`117`	`+ * <code>start</code> and <code>length</code>.`
	`118`	`+ *`
	`119`	`+ * @see #getLongestMatch(CharSequence)`
	`120`	`+ */`
`41`	`121`	`String getLongestMatch(CharSequence sequence, int start, int length);`
`42`	`122`
	`123`	`+/**`
	`124`	`+ * Performs {@link #getAllOccurrences(CharSequence)} and then eliminates overlapping {@link Token Tokens} by only keeping the`
	`125`	`+ * longest.`
	`126`	`+ *`
	`127`	`+ * @param charSequence The {@link CharSequence} to examine.`
	`128`	`+ * @return The longest occurrences of accepted input.`
	`129`	`+ */`
`43`	`130`	`List<Token> getLongestOccurrences(CharSequence sequence);`
`44`	`131`
	`132`	`+/**`
	`133`	`+ * Performs the same operation as {@link #getLongestOccurrences(CharSequence)} but on the part of <code>sequence</code> denoted by`
	`134`	`+ * <code>start</code> and <code>length</code>.`
	`135`	`+ *`
	`136`	`+ * @see #getLongestOccurrences(CharSequence)`
	`137`	`+ */`
`45`	`138`	`List<Token> getLongestOccurrences(CharSequence sequence, int start, int length);`
`46`	`139`
	`140`	`+/**`
	`141`	`+ * Performs {@link #getAllTokens(CharSequence)} and then eliminates overlapping {@link Token Tokens} by only keeping the longest.`
	`142`	`+ *`
	`143`	`+ * @param charSequence The {@link CharSequence} to examine.`
	`144`	`+ * @return The longest tokens of accepted input.`
	`145`	`+ */`
`47`	`146`	`List<Token> getLongestTokens(CharSequence sequence);`
`48`	`147`
	`148`	`+/**`
	`149`	`+ * Performs the same operation as {@link #getLongestTokens(CharSequence)} but on the part of <code>sequence</code> denoted by`
	`150`	`+ * <code>start</code> and <code>length</code>.`
	`151`	`+ *`
	`152`	`+ * @see #getLongestTokens(CharSequence)`
	`153`	`+ */`
`49`	`154`	`List<Token> getLongestTokens(CharSequence sequence, int start, int length);`
`50`	`155`
`51`	`156`	`}`
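
The Javadoc example for `getAllMatches` and `getLongestMatch` can be reproduced with a small stand-alone sketch. This is not the library's implementation: `PrefixMatchSketch` is a hypothetical class that stores the accepted inputs in a plain `HashSet` instead of an automaton, and returning `null` when nothing matches is an assumption the diff does not confirm. It only illustrates the documented contract, including the `start`/`length` variant that delegates through `subSequence`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Naive illustration of the prefix-matching contract; not the library's implementation.
class PrefixMatchSketch {

    // Hypothetical stand-in for an acceptor: a plain set of accepted inputs.
    private final Set<String> accepted;

    PrefixMatchSketch(String... inputs) {
        this.accepted = new HashSet<>(Arrays.asList(inputs));
    }

    // Every accepted input that is a prefix of the sequence, shortest first.
    String[] getAllMatches(CharSequence sequence) {
        List<String> result = new ArrayList<>();
        for (int end = 1; end <= sequence.length(); end++) {
            String candidate = sequence.subSequence(0, end).toString();
            if (accepted.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result.toArray(new String[0]);
    }

    // Range variant: restricts the input to [start, start + length) via subSequence.
    String[] getAllMatches(CharSequence sequence, int start, int length) {
        return getAllMatches(sequence.subSequence(start, start + length));
    }

    // The longest accepted prefix, or null if there is none (null is an assumption).
    String getLongestMatch(CharSequence sequence) {
        String[] matches = getAllMatches(sequence);
        return matches.length == 0 ? null : matches[matches.length - 1];
    }

    public static void main(String[] args) {
        PrefixMatchSketch sketch = new PrefixMatchSketch("a", "aa", "aaa", "b", "bb", "bbb");
        String input = "aa bbb cccc ddddd";
        System.out.println(Arrays.toString(sketch.getAllMatches(input)));       // [a, aa]
        System.out.println(sketch.getLongestMatch(input));                      // aa
        System.out.println(Arrays.toString(sketch.getAllMatches(input, 3, 3))); // [b, bb, bbb]
    }
}
```

Note that `aaa` is accepted but never reported, because the input only begins with two `a` characters before the space.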

`‎src/main/java/com/indoqa/fsa/morfologik/MorfologikAcceptor.java‎`

Lines changed: 0 additions & 57 deletions

Original file line number	Diff line number	Diff line change
`@@ -60,17 +60,6 @@ public boolean accepts(CharSequence sequence, int start, int length) {`
`60`	`60`	`return this.accepts(sequence.subSequence(start, start + length));`
`61`	`61`	`}`
`62`	`62`
`63`		`-/**`
`64`		`- * Find the all accepted inputs at the beginning of given <code>charSequence</code>.<br/>`
`65`		`- * <br/>`
`66`		`- * Given the sequence <code>aa bbb cccc ddddd</code><br/>`
`67`		`- * and the accepted inputs <code>a</code>, <code>aa</code>, <code>aaa</code>, <code>b</code>, <code>bb</code>,`
`68`		`- * <code>bbb</code><br/>`
`69`		`- * the matches will be <code>a</code> and <code>aa</code>`
`70`		`- *`
`71`		`- * @param charSequence The {@link CharSequence} to examine.`
`72`		`- * @return All accepted inputs at the beginning of the charSequence.`
`73`		`- */`
`74`	`63`	`@Override`
`75`	`64`	`public String[] getAllMatches(CharSequence sequence) {`
`76`	`65`	`byte[] bytes = this.getBytes(sequence);`
`@@ -89,16 +78,6 @@ public String[] getAllMatches(CharSequence sequence, int start, int length) {`
`89`	`78`	`return this.getAllMatches(sequence.subSequence(start, start + length));`
`90`	`79`	`}`
`91`	`80`
`92`		`-/**`
`93`		`- * Find all accepted inputs in the given <code>charSequence</code>.<br/>`
`94`		`- * <p>`
`95`		`- * The only difference to {@link #getAllTokens(CharSequence)} is that the accepted input may occur at any position within the`
`96`		`- * <code>charSequence</code> (specifically start and end inside a token).`
`97`		`- * </p>`
`98`		`- *`
`99`		`- * @param charSequence The {@link CharSequence} to examine.`
`100`		`- * @return all occurrences of accepted input`
`101`		`- */`
`102`	`81`	`@Override`
`103`	`82`	`public List<Token> getAllOccurrences(CharSequence charSequence) {`
`104`	`83`	`List<Token> result = new ArrayList<>();`
`@@ -124,18 +103,6 @@ public List<Token> getAllOccurrences(CharSequence sequence, int start, int lengt`
`124`	`103`	`return this.getAllOccurrences(sequence.subSequence(start, start + length));`
`125`	`104`	`}`
`126`	`105`
`127`		`-/**`
`128`		`- * Find all accepted inputs that are tokens in the given <code>charSequence</code>.<br/>`
`129`		`- * <p>`
`130`		`- * A part of the given sequence is considered to be a <code>token</code>, when it starts and ends at a token boundary.<br/>`
`131`		`- * A token boundary is the change from a non-word character to a word character (or vice-versa), as well as the beginning and end`
`132`		`- * of the whole sequence.<br/>`
`133`		`- * Please note that a token may contain token boundaries.`
`134`		`- * </p>`
`135`		`- *`
`136`		`- * @param charSequence The {@link CharSequence} to examine.`
`137`		`- * @return All tokens of accepted inputs.`
`138`		`- */`
`139`	`106`	`@Override`
`140`	`107`	`public List<Token> getAllTokens(CharSequence charSequence) {`
`141`	`108`	`List<Token> result = new ArrayList<>();`
`@@ -168,17 +135,6 @@ public List<Token> getAllTokens(CharSequence sequence, int start, int length) {`
`168`	`135`	`return this.getAllTokens(sequence.subSequence(start, start + length));`
`169`	`136`	`}`
`170`	`137`
`171`		`-/**`
`172`		`- * Find the longest accepted input at the beginning of given <code>charSequence</code>.<br/>`
`173`		`- * <br/>`
`174`		`- * Given the sequence <code>aa bbb cccc ddddd</code><br/>`
`175`		`- * and the accepted inputs <code>a</code>, <code>aa</code>, <code>aaa</code>, <code>b</code>, <code>bb</code>,`
`176`		`- * <code>bbb</code><br/>`
`177`		`- * the longest match will be <code>aa</code>`
`178`		`- *`
`179`		`- * @param charSequence The charSequence to examine.`
`180`		`- * @return The longest accepted input at the beginning of the charSequence.`
`181`		`- */`
`182`	`138`	`@Override`
`183`	`139`	`public String getLongestMatch(CharSequence charSequence) {`
`184`	`140`	`byte[] bytes = this.getBytes(charSequence);`
`@@ -196,13 +152,6 @@ public String getLongestMatch(CharSequence sequence, int start, int length) {`
`196`	`152`	`return this.getLongestMatch(sequence.subSequence(start, start + length));`
`197`	`153`	`}`
`198`	`154`
`199`		`-/**`
`200`		`- * Performs {@link #getAllOccurrences(CharSequence)} and then eliminates overlapping {@link Token Tokens} by only keeping the`
`201`		`- * longest.`
`202`		`- *`
`203`		`- * @param charSequence The {@link CharSequence} to examine.`
`204`		`- * @return The longest occurrences of accepted input.`
`205`		`- */`
`206`	`155`	`@Override`
`207`	`156`	`public List<Token> getLongestOccurrences(CharSequence charSequence) {`
`208`	`157`	`return eliminateOverlapping(this.getAllOccurrences(charSequence));`
`@@ -213,12 +162,6 @@ public List<Token> getLongestOccurrences(CharSequence sequence, int start, int l`
`213`	`162`	`return this.getLongestOccurrences(sequence.subSequence(start, start + length));`
`214`	`163`	`}`
`215`	`164`
`216`		`-/**`
`217`		`- * Performs {@link #getAllTokens(CharSequence)} and then eliminates overlapping {@link Token Tokens} by only keeping the longest.`
`218`		`- *`
`219`		`- * @param charSequence The {@link CharSequence} to examine.`
`220`		`- * @return The longest tokens of accepted input.`
`221`		`- */`
`222`	`165`	`@Override`
`223`	`166`	`public List<Token> getLongestTokens(CharSequence charSequence) {`
`224`	`167`	`return eliminateOverlapping(this.getAllTokens(charSequence));`
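
The token rules in the `Acceptor` Javadoc, and the "keep only the longest" step behind `getLongestTokens`, can be sketched as follows. Everything here is hypothetical: `TokenSketch` and `Span` are illustration-only names (the real `Token` class is not shown in this diff), "word character" is assumed to mean `Character.isLetterOrDigit`, and the greedy longest-first strategy is only one plausible reading of "eliminates overlapping Tokens by only keeping the longest". The body of the real `eliminateOverlapping` is not visible in this commit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Naive illustration of token matching; not the library's implementation.
class TokenSketch {

    // Hypothetical stand-in for the library's Token: a half-open range [start, end).
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }

        int length() {
            return end - start;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ")";
        }
    }

    // Assumption: a "word character" is a letter or a digit.
    static boolean isWord(char c) {
        return Character.isLetterOrDigit(c);
    }

    // A boundary is the start or end of the sequence, or a word/non-word change.
    static boolean isBoundary(CharSequence s, int index) {
        if (index == 0 || index == s.length()) {
            return true;
        }
        return isWord(s.charAt(index - 1)) != isWord(s.charAt(index));
    }

    // All accepted inputs that both start and end at a token boundary.
    static List<Span> getAllTokens(Set<String> accepted, CharSequence s) {
        List<Span> result = new ArrayList<>();
        for (int start = 0; start < s.length(); start++) {
            if (!isBoundary(s, start)) {
                continue;
            }
            for (int end = start + 1; end <= s.length(); end++) {
                if (isBoundary(s, end) && accepted.contains(s.subSequence(start, end).toString())) {
                    result.add(new Span(start, end));
                }
            }
        }
        return result;
    }

    // Greedy assumption: visit spans longest first, drop any that overlap a kept span.
    static List<Span> eliminateOverlapping(List<Span> spans) {
        List<Span> byLength = new ArrayList<>(spans);
        byLength.sort((a, b) -> a.length() != b.length()
            ? Integer.compare(b.length(), a.length())
            : Integer.compare(a.start, b.start));
        List<Span> kept = new ArrayList<>();
        for (Span candidate : byLength) {
            boolean overlaps = false;
            for (Span k : kept) {
                if (candidate.start < k.end && k.start < candidate.end) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                kept.add(candidate);
            }
        }
        kept.sort((a, b) -> Integer.compare(a.start, b.start));
        return kept;
    }

    public static void main(String[] args) {
        Set<String> accepted = new HashSet<>(Arrays.asList("new", "york", "new york", "ork"));
        List<Span> tokens = getAllTokens(accepted, "in new york");
        System.out.println(tokens);                       // [[3,6), [3,11), [7,11)]
        System.out.println(eliminateOverlapping(tokens)); // [[3,11)]
    }
}
```

In this run `ork` is accepted but is not a token because it starts inside a word, while `new york` shows that a token may itself contain token boundaries.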

0 commit comments
