text-data
Here are 65 public repositories matching this topic...
Large-scale pretraining for dialogue
- Updated
Oct 17, 2022 - Python
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
- Updated
Aug 26, 2021 - Python
Large-scale pretrained models for goal-directed dialog
- Updated
Dec 10, 2023 - Python
Integrating the Best of TF into PyTorch, for Machine Learning, Natural Language Processing, and Text Generation. This is part of the CASL project: http://casl-project.ai/
- Updated
Apr 14, 2022 - Python
Forte is a flexible and powerful ML workflow builder. This is part of the CASL project: http://casl-project.ai/
- Updated
Feb 5, 2024 - Python
Conversational Toolkit. An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation
- Updated
Aug 31, 2020 - Python
Cleans Reddit Text Data 📜 🧹
- Updated
Apr 14, 2020 - Python
Tools to uniformly read in text data including semi-structured transcripts
- Updated
Mar 7, 2023 - R
Tools for reshaping text data
- Updated
Apr 1, 2024 - R
A Python library that enables smooth keyword extraction from any text using the RAKE (Rapid Automatic Keyword Extraction) algorithm.
- Updated
May 3, 2024 - Python
Question classification for the CogComp QC Dataset (http://cogcomp.org/Data/QA/QC/).
- Updated
Nov 10, 2020 - Python
Visualize large text collections with WebGL
- Updated
Sep 4, 2024 - JavaScript
Presents an optimized Apache Beam pipeline for generating sentence embeddings (runnable on Cloud Dataflow).
- Updated
Mar 7, 2022 - Python
Old book pages (with ground truth), formerly used for OCR studies. The set comes in several versions (differing in resolution and binarization). Noised and denoised sets (produced by several methods) will eventually be uploaded.
- Updated
Aug 25, 2017 - HTML
Scrape EDGAR filings from https://www.sec.gov/
- Updated
Mar 10, 2025 - Julia
How Will Your Tweet Be Received? Predicting the Sentiment Polarity of Tweet Replies
- Updated
Aug 29, 2021 - Python
A dataset containing 30k+ so-called "self-help" tweets from 100+ authors.
- Updated
Oct 12, 2019 - Jupyter Notebook
A diverse NLP dataset of 1,000 stories spanning 100 genres, intended for language understanding tasks.
- Updated
Dec 9, 2023
A Python package implementing the Directed LDA model for targeted extraction of specific topics from text data
- Updated
Jan 12, 2025 - Python