Grapheme Cluster and Word boundaries according to UAX#29 rules

License

Apache-2.0 (LICENSE-APACHE) and MIT (LICENSE-MIT) licenses found

unicode-rs/unicode-segmentation


Iterators which split strings on Grapheme Cluster or Word boundaries, according to the Unicode Standard Annex #29 rules.


    use unicode_segmentation::UnicodeSegmentation;

    fn main() {
        let s = "a̐éö̲\r\n";
        let g = s.graphemes(true).collect::<Vec<&str>>();
        let b: &[_] = &["a̐", "é", "ö̲", "\r\n"];
        assert_eq!(g, b);

        let s = "The quick (\"brown\") fox can't jump 32.3 feet, right?";
        let w = s.unicode_words().collect::<Vec<&str>>();
        let b: &[_] = &["The", "quick", "brown", "fox", "can't", "jump", "32.3", "feet", "right"];
        assert_eq!(w, b);

        let s = "The quick (\"brown\")  fox";
        let w = s.split_word_bounds().collect::<Vec<&str>>();
        let b: &[_] = &["The", " ", "quick", " ", "(", "\"", "brown", "\"", ")", "  ", "fox"];
        assert_eq!(w, b);
    }
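For contrast, the standard library on its own iterates over bytes or Unicode scalar values (`char`), not grapheme clusters. The sketch below uses only `std` and is not part of this crate. The helper name `scalar_and_byte_counts` is hypothetical, and the code assumes the first cluster in the example above is `a` followed by U+0310. It shows that such a cluster is a single user-perceived character yet counts as two `char`s and three bytes, which is the gap the grapheme iterator above is meant to close.

```rust
/// Hypothetical helper (not part of unicode-segmentation): returns the number
/// of Unicode scalar values and the number of UTF-8 bytes in `s`.
fn scalar_and_byte_counts(s: &str) -> (usize, usize) {
    (s.chars().count(), s.len())
}

fn main() {
    // 'a' followed by U+0310 (a combining mark): one user-perceived character.
    let cluster = "a\u{0310}";
    let (scalars, bytes) = scalar_and_byte_counts(cluster);
    assert_eq!(scalars, 2); // 'a' and the combining mark are separate chars
    assert_eq!(bytes, 3); // 1 byte for 'a', 2 bytes for U+0310 in UTF-8

    // "\r\n" is two scalar values, while the example above yields it as one cluster.
    assert_eq!(scalar_and_byte_counts("\r\n"), (2, 2));

    println!("{cluster:?}: {scalars} chars, {bytes} bytes");
}
```

Counting or slicing by `char` can therefore split a cluster in the middle, whereas iterating with `graphemes(true)` keeps each cluster intact.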

no_std

unicode-segmentation does not depend on libstd, so it can be used in crates with the #![no_std] attribute.

crates.io

You can use this package in your project by adding the following to your Cargo.toml:

    [dependencies]
    unicode-segmentation = "1.10.1"

Change Log

1.11.0

  • #124 Update data to Unicode 15.1
  • #128 Add size_hint to iterators

1.10.1

  • #113 Use criterion.rs for word benchmarks
  • #112 Improve table search speed through lookups

1.10.0

  • #107 Upgrade to Unicode 15.0.0
  • #104 Supersedes and fixes #75

1.9.0

  • #101 Upgrade to Unicode 14.0.0

1.8.0

  • #100 Increase #[inline] opportunities, resulting in 15-40% performance improvement.
  • #95 Implement debug for Graphemes
  • #94 Add Initial fuzzer for oss-fuzz integration
  • #93 Fix unused imports and deprecated pattern warnings
  • #91 Made local variable immutable by moving it into loop
  • #91 Add new iterator UnicodeWordIndices and unicode_word_indices

1.7.1

  • Update docs on version number

1.7.0

  • #87 Upgrade to Unicode 13
  • #79 Implement a special-case lookup for ascii grapheme categories
  • #77 Optimization for grapheme iteration

1.6.0

  • #72 Upgrade to Unicode 12

1.5.0

  • #68 Upgrade to Unicode 11

1.4.0

  • #56 Upgrade to Unicode 10

1.3.0

  • #24 Add support for sentence boundaries
  • #44 Treat gc=No as a subset of gc=N

1.2.1

  • #37: Fix panic in provide_context.
  • #40: Fix crash in prev_boundary.

1.2.0

  • New GraphemeCursor API allows random access and bidirectional iteration.
  • Fixed incorrect splitting of certain emoji modifier sequences.

1.1.0

  • Add as_str methods to the iterator types.

1.0.3

  • Code cleanup and additional tests.

1.0.1

  • Fix a bug affecting some grapheme clusters containing Prepend characters.

1.0.0

  • Upgrade to Unicode 9.0.0.
