#ngram

Here are 161 public repositories matching this topic...

Four word embedding models implemented in Python. Supporting arbitrary context features

  • Updated Aug 22, 2019
  • Python

A Lite Bert For Self-Supervised Learning Language Representations

  • Updated May 13, 2020
  • Python

A TUI tool to help you type faster and learn new layouts. Includes a free cat.

  • Updated Nov 22, 2024
  • Rust

Touch typing trainer using N-grams as its data source, with options to customize the auto-generated lessons and set the minimum typing performance required. Includes sound and color effects.

  • Updated Aug 12, 2024
  • JavaScript

DataGrand 2019 information extraction competition, ranked 9th

  • Updated Dec 29, 2019
  • Python

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller`` which allows you to build, view, manipulate a…

  • Updated Dec 17, 2024
  • C++

Cluster and merge similar string values: an R implementation of the OpenRefine clustering algorithms

  • Updated Mar 14, 2024
  • C++

Python implementation of an N-gram language model with Laplace smoothing and sentence generation.

  • Updated Feb 9, 2018
  • Python

A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.

  • Updated Apr 25, 2022
  • Scala

Fast n-gram tokenization

  • Updated Dec 10, 2023
  • C

Cleaning and quality evaluation of Chinese corpora for large-model pre-training

  • Updated Jul 25, 2024
  • Java

Mirror of SRILM

  • Updated Aug 11, 2020
  • Roff

natural language processing

  • Updated Jul 3, 2018
  • C++

Create n-grams from wordlists based on words, characters, or charsets, for use in offline password attacks and data analysis

  • Updated Jun 27, 2024
  • Python

Chinese word segmentation implemented with traditional methods (N-gram, HMM, etc.), neural network methods (CNN, LSTM, etc.), and pre-trained methods (BERT, etc.)

  • Updated Jun 15, 2022
  • Python

Calculating n-grams with PySpark for Wikipedia text

  • Updated Jun 3, 2024
  • Jupyter Notebook

Spider: a web crawler and local wordlist processor that generates frequency-sorted wordlists and n-grams

  • Updated Oct 11, 2025
  • Go

multiprocess unsupervised chinese_detect_words ngram_combination

  • Updated Jan 2, 2019
  • Python
