Based on the Pytorch-Transformers library by HuggingFace. To be used as a starting point for employing Transformer models in text classification tasks. Contains code to easily train BERT, XLNet, RoBERTa, and XLM models for text classification.


ThilinaRajapakse/pytorch-transformers-classification


This repository is now deprecated. Please use Simple Transformers instead.

Update Notice

The underlying Pytorch-Transformers library by HuggingFace has been updated substantially since this repo was created. As a result, this repo might not be compatible with the current version of the Hugging Face Transformers library. This repo will not be updated further.

I recommend using Simple Transformers (based on the updated Hugging Face library), as it is regularly maintained, feature-rich, and much easier to use.

Pytorch-Transformers-Classification

This repository is based on the Pytorch-Transformers library by HuggingFace. It is intended as a starting point for anyone who wishes to use Transformer models in text classification tasks.

Please refer to this Medium article for further information on how this project works.

Check out the new library simpletransformers for one-line training and evaluation!

Setup

Simple Transformers - Ready to use library

If you want to go directly to training, evaluating, and predicting with Transformer models, take a look at the Simple Transformers library. It is the easiest way to use Transformers for text classification, requiring only 3 lines of code. It is based on this repo but is designed to let you use Transformers without worrying about the low-level details. However, that ease of use comes at the cost of less control over (and visibility into) how everything works.

Quickstart using Colab

Try this Google Colab Notebook for a quick preview. You can run all cells without any modifications to see how everything works. However, due to the 12-hour time limit on Colab instances, the dataset has been undersampled from 500,000 samples to about 5,000 samples. With such a small sample, everything should complete in about 10 minutes.

With Conda

  1. Install the Anaconda or Miniconda package manager from here.
  2. Create a new virtual environment and install packages.
    conda create -n transformers python pandas tqdm jupyter
    conda activate transformers
    If using CUDA:
    conda install pytorch cudatoolkit=10.0 -c pytorch
    else:
    conda install pytorch cpuonly -c pytorch
    conda install -c anaconda scipy
    conda install -c anaconda scikit-learn
    pip install pytorch-transformers
    pip install tensorboardX
  3. Clone the repo.
    git clone https://github.com/ThilinaRajapakse/pytorch-transformers-classification.git
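
After installing, it can be useful to confirm that the packages from step 2 are importable inside the new environment. The snippet below is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the module list assumes the usual import names for the packages installed above (for example, `sklearn` for scikit-learn and `pytorch_transformers` for pytorch-transformers).

```python
import importlib.util

# Import names assumed for the packages installed in step 2.
REQUIRED_MODULES = [
    "torch",
    "pandas",
    "tqdm",
    "scipy",
    "sklearn",
    "pytorch_transformers",
    "tensorboardX",
]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

Run it with the `transformers` environment activated; any name it reports as missing points to an install step that did not complete.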

Usage

Yelp Demo

This demonstration uses the Yelp Reviews dataset.

Linux users can execute data_download.sh to download and set up the data files.

If you are setting up the data manually:

  1. Download the Yelp Reviews Dataset.
  2. Extract train.csv and test.csv and place them in the directory data/.

Once the download is complete, you can run the data_prep.ipynb notebook to get the data ready for training.

Finally, you can run the run_model.ipynb notebook to fine-tune a Transformer model on the Yelp Dataset and evaluate the results.

Current Pretrained Models

The table below shows the currently available model types and their models. You can use any of these by setting the model_type and model_name in the args dictionary. For more information about pretrained models, see the HuggingFace docs.

| Architecture | Model Type | Model Name | Details |
| --- | --- | --- | --- |
| BERT | bert | bert-base-uncased | 12-layer, 768-hidden, 12-heads, 110M parameters. Trained on lower-cased English text. |
| BERT | bert | bert-large-uncased | 24-layer, 1024-hidden, 16-heads, 340M parameters. Trained on lower-cased English text. |
| BERT | bert | bert-base-cased | 12-layer, 768-hidden, 12-heads, 110M parameters. Trained on cased English text. |
| BERT | bert | bert-large-cased | 24-layer, 1024-hidden, 16-heads, 340M parameters. Trained on cased English text. |
| BERT | bert | bert-base-multilingual-uncased | (Original, not recommended) 12-layer, 768-hidden, 12-heads, 110M parameters. Trained on lower-cased text in the top 102 languages with the largest Wikipedias. |
| BERT | bert | bert-base-multilingual-cased | (New, recommended) 12-layer, 768-hidden, 12-heads, 110M parameters. Trained on cased text in the top 104 languages with the largest Wikipedias. |
| BERT | bert | bert-base-chinese | 12-layer, 768-hidden, 12-heads, 110M parameters. Trained on cased Chinese Simplified and Traditional text. |
| BERT | bert | bert-base-german-cased | 12-layer, 768-hidden, 12-heads, 110M parameters. Trained on cased German text by Deepset.ai. |
| BERT | bert | bert-large-uncased-whole-word-masking | 24-layer, 1024-hidden, 16-heads, 340M parameters. Trained on lower-cased English text using Whole-Word-Masking. |
| BERT | bert | bert-large-cased-whole-word-masking | 24-layer, 1024-hidden, 16-heads, 340M parameters. Trained on cased English text using Whole-Word-Masking. |
| BERT | bert | bert-large-uncased-whole-word-masking-finetuned-squad | 24-layer, 1024-hidden, 16-heads, 340M parameters. The bert-large-uncased-whole-word-masking model fine-tuned on SQuAD. |
| BERT | bert | bert-large-cased-whole-word-masking-finetuned-squad | 24-layer, 1024-hidden, 16-heads, 340M parameters. The bert-large-cased-whole-word-masking model fine-tuned on SQuAD. |
| BERT | bert | bert-base-cased-finetuned-mrpc | 12-layer, 768-hidden, 12-heads, 110M parameters. The bert-base-cased model fine-tuned on MRPC. |
| XLNet | xlnet | xlnet-base-cased | 12-layer, 768-hidden, 12-heads, 110M parameters. XLNet English model. |
| XLNet | xlnet | xlnet-large-cased | 24-layer, 1024-hidden, 16-heads, 340M parameters. XLNet Large English model. |
| XLM | xlm | xlm-mlm-en-2048 | 12-layer, 2048-hidden, 16-heads. XLM English model. |
| XLM | xlm | xlm-mlm-ende-1024 | 6-layer, 1024-hidden, 8-heads. XLM English-German Multi-language model. |
| XLM | xlm | xlm-mlm-enfr-1024 | 6-layer, 1024-hidden, 8-heads. XLM English-French Multi-language model. |
| XLM | xlm | xlm-mlm-enro-1024 | 6-layer, 1024-hidden, 8-heads. XLM English-Romanian Multi-language model. |
| XLM | xlm | xlm-mlm-xnli15-1024 | 12-layer, 1024-hidden, 8-heads. XLM Model pre-trained with MLM on the 15 XNLI languages. |
| XLM | xlm | xlm-mlm-tlm-xnli15-1024 | 12-layer, 1024-hidden, 8-heads. XLM Model pre-trained with MLM + TLM on the 15 XNLI languages. |
| XLM | xlm | xlm-clm-enfr-1024 | 12-layer, 1024-hidden, 8-heads. XLM English model trained with CLM (Causal Language Modeling). |
| XLM | xlm | xlm-clm-ende-1024 | 6-layer, 1024-hidden, 8-heads. XLM English-German Multi-language model trained with CLM (Causal Language Modeling). |
| RoBERTa | roberta | roberta-base | 125M parameters. RoBERTa using the BERT-base architecture. |
| RoBERTa | roberta | roberta-large | 24-layer, 1024-hidden, 16-heads, 355M parameters. RoBERTa using the BERT-large architecture. |
| RoBERTa | roberta | roberta-large-mnli | 24-layer, 1024-hidden, 16-heads, 355M parameters. roberta-large fine-tuned on MNLI. |
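
As an illustration of picking a model from the table, the sketch below sets the two keys named above, `model_type` and `model_name`, in an `args` dictionary. Only those two keys come from this README; the `KNOWN_MODELS` lookup (a small subset of the table) and the `select_model` helper are hypothetical additions for the example, and the real `args` dictionary in the notebook is assumed to contain further settings not shown here.

```python
# A small subset of the table above, keyed by model type (illustrative only).
KNOWN_MODELS = {
    "bert": ["bert-base-uncased", "bert-base-cased", "bert-large-uncased"],
    "xlnet": ["xlnet-base-cased", "xlnet-large-cased"],
    "xlm": ["xlm-mlm-en-2048", "xlm-mlm-enfr-1024"],
    "roberta": ["roberta-base", "roberta-large"],
}


def select_model(args, model_type, model_name):
    """Set model_type and model_name in args after checking them against KNOWN_MODELS."""
    if model_name not in KNOWN_MODELS.get(model_type, []):
        raise ValueError(f"{model_name!r} is not listed under model type {model_type!r}")
    args["model_type"] = model_type
    args["model_name"] = model_name
    return args


# Hypothetical args dictionary; other training settings are omitted.
args = {}
select_model(args, "roberta", "roberta-base")
print(args)
```

The check only guards against mismatched pairs such as a `bert` model type with an XLNet model name; extend the lookup with the remaining rows of the table as needed.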

Custom Datasets

When working with your own datasets, you can create a script/notebook similar to data_prep.ipynb that will convert the dataset to a Pytorch-Transformers-ready format.

The data needs to be in tsv format, with four columns and no header.

This is the required structure.

  • guid: An ID for the row.
  • label: The label for the row (should be an int).
  • alpha: A column of the same letter for all rows. Not used in classification but still expected by the DataProcessor.
  • text: The sentence or sequence of text.
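
The four-column layout above can be produced with a few lines of Python. This is a minimal sketch using only the standard library, not the code from data_prep.ipynb: the function name `write_classification_tsv`, the sample rows, and the output file name `train.tsv` are all assumptions made for illustration.

```python
import csv


def write_classification_tsv(samples, path):
    """Write (text, label) pairs as a four-column TSV file with no header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        for guid, (text, label) in enumerate(samples):
            # Collapse newlines and tabs so each sample stays on a single row.
            clean_text = " ".join(str(text).split())
            # Column order: guid, label, alpha (a constant letter), text.
            writer.writerow([guid, int(label), "a", clean_text])


# Hypothetical samples: (text, integer label).
samples = [
    ("Great food and friendly staff.", 1),
    ("Waited an hour\nand the order was wrong.", 0),
]
write_classification_tsv(samples, "train.tsv")
```

Text that contains double quotes is quoted by `csv.writer` by default, so check that the resulting file matches what your reader expects before training.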

Evaluation Metrics

The evaluation process in the run_model.ipynb notebook outputs the confusion matrix and the Matthews correlation coefficient. If you wish to add more evaluation metrics, edit the get_eval_reports() function in the notebook. This function takes the predictions and the ground truth labels as parameters, so you can add any custom metric calculations to it as required.
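
To show the kind of calculation such a function performs, here is a pure-Python sketch for the binary case that computes the confusion-matrix counts, the Matthews correlation coefficient, and one extra metric (F1). It is not the notebook's get_eval_reports() implementation; the function name `binary_eval_report` and the returned dictionary layout are assumptions. In practice, scikit-learn's `confusion_matrix` and `matthews_corrcoef` cover the first two.

```python
import math


def binary_eval_report(preds, labels):
    """Return confusion-matrix counts, MCC, and F1 for binary predictions (0 or 1)."""
    pairs = list(zip(preds, labels))
    tp = sum(1 for p, y in pairs if p == 1 and y == 1)
    tn = sum(1 for p, y in pairs if p == 0 and y == 0)
    fp = sum(1 for p, y in pairs if p == 1 and y == 0)
    fn = sum(1 for p, y in pairs if p == 0 and y == 1)

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # Example of an additional custom metric: F1 score.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn, "mcc": mcc, "f1": f1}


report = binary_eval_report(preds=[1, 0, 1, 1, 0, 0], labels=[1, 0, 0, 1, 1, 0])
print(report)
```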

Acknowledgements

None of this would have been possible without the hard work by the HuggingFace team in developing the Pytorch-Transformers library.
