Transformer Models for MATLAB

This repository implements deep learning transformer models in MATLAB.

Translations

Japanese (日本語): see README_JP.md

Requirements

BERT and FinBERT

MATLAB R2021a or later
Deep Learning Toolbox
Text Analytics Toolbox

GPT-2

MATLAB R2020a or later
Deep Learning Toolbox

Getting Started

Download or clone this repository to your machine and open it in MATLAB.

Functions

bert

mdl = bert loads a pretrained BERT transformer model and, if necessary, downloads the model weights. The output mdl is a structure with fields Tokenizer and Parameters that contain the BERT tokenizer and the model parameters, respectively.

mdl = bert("Model",modelName) specifies which BERT model variant to use:

"base" (default) - A 12-layer model with hidden size 768.
"multilingual-cased" - A 12-layer model with hidden size 768. The tokenizer is case-sensitive. This model was trained on multilingual data.
"medium" - An 8-layer model with hidden size 512.
"small" - A 4-layer model with hidden size 512.
"mini" - A 4-layer model with hidden size 256.
"tiny" - A 2-layer model with hidden size 128.
"japanese-base" - A 12-layer model with hidden size 768, pretrained on Japanese text.
"japanese-base-wwm" - A 12-layer model with hidden size 768, pretrained on Japanese text. Additionally, the model is trained with whole word masking enabled for the masked language modeling (MLM) objective.

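As a brief illustration of the syntax above, the following sketch loads the smallest variant and unpacks the two fields of the returned structure. The variable names tokenizer and parameters are arbitrary choices, and the first call may need to download the weights.

```matlab
% Load the 2-layer "tiny" BERT variant (weights are downloaded if necessary).
mdl = bert("Model","tiny");

% The returned structure holds the tokenizer and the model parameters.
tokenizer  = mdl.Tokenizer;
parameters = mdl.Parameters;
```
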
bert.model

Z = bert.model(X,parameters) performs inference with a BERT model on the input X, a 1-by-numInputTokens-by-numObservations array of encoded tokens, using the specified parameters. The output Z is an array of size (NumHeads*HeadSize)-by-numInputTokens-by-numObservations. The element Z(:,i,j) corresponds to the BERT embedding of input token X(1,i,j).

Z = bert.model(X,parameters,Name,Value) specifies additional options using one or more name-value pairs:

"PaddingCode" - Positive integer corresponding to the padding token. The default is 1.
"InputMask" - Mask indicating which elements to include for computation, specified as a logical array the same size as X or as an empty array. The mask must be false at indices that correspond to padding and true elsewhere. If the mask is [], then the function determines padding according to the PaddingCode name-value pair. The default is [].
"DropoutProb" - Probability of dropout for the output activation. The default is 0.
"AttentionDropoutProb" - Probability of dropout used in the attention layer. The default is 0.
"Outputs" - Indices of the layers to return outputs from, specified as a vector of positive integers, or "last". If "Outputs" is "last", then the function returns outputs from the final encoder layer only. The default is "last".
"SeparatorCode" - Separator token specified as a positive integer. The default is 103.

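To make the relationship between "PaddingCode" and "InputMask" concrete, here is a minimal sketch that pads two sequences to a common length and builds the corresponding logical mask by hand. The token codes are made up for illustration, and the variable names seqs, paddingCode, and inputMask are hypothetical; only the layout of X and the meaning of the mask come from the description above.

```matlab
% Two hypothetical encoded sequences of different lengths (codes are made up).
seqs = {[102 2024 2004 103], [102 7593 103]};
paddingCode = 1;   % same value as the documented "PaddingCode" default

% Right-pad to a common length, arranged as
% 1-by-numInputTokens-by-numObservations.
numTokens = max(cellfun(@numel,seqs));
X = paddingCode*ones(1,numTokens,numel(seqs));
for j = 1:numel(seqs)
    X(1,1:numel(seqs{j}),j) = seqs{j};
end

% Logical mask: true at real tokens, false at padding positions.
inputMask = X ~= paddingCode;

% With a loaded model (mdl = bert), the mask could then be passed as:
% Z = bert.model(X,mdl.Parameters,"InputMask",inputMask);
```

Leaving "InputMask" empty and relying on "PaddingCode" should give the same result whenever the padding value never occurs as a real token.
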
finbert

mdl = finbert loads a pretrained BERT transformer model for sentiment analysis of financial text. The output mdl is a structure with fields Tokenizer and Parameters that contain the BERT tokenizer and the model parameters, respectively.

mdl = finbert("Model",modelName) specifies which FinBERT model variant to use:

"sentiment-model" (default) - The fine-tuned sentiment classifier model.
"language-model" - The FinBERT pretrained language model, which uses a BERT-Base architecture.

finbert.sentimentModel

sentiment = finbert.sentimentModel(X,parameters) classifies the sentiment of the input X, a 1-by-numInputTokens-by-numObservations array of encoded tokens, using the specified parameters. The output sentiment is a categorical array with categories "positive", "neutral", or "negative".

[sentiment, scores] = finbert.sentimentModel(X,parameters) also returns the corresponding sentiment scores in the range [-1 1].
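
A minimal usage sketch follows. It is not taken from this README: it assumes that the tokenizer in mdl.Tokenizer provides an encode function that returns a cell array of token-code row vectors, and a PaddingCode property, which is how the repository's example scripts appear to work. padsequences is a Deep Learning Toolbox function, and the input strings are made up.

```matlab
mdl = finbert;   % default "sentiment-model"

% Hypothetical financial headlines.
textData = [
    "Quarterly profit rose well above analyst expectations."
    "The company issued a warning about falling revenue."];

% ASSUMPTION: encode and PaddingCode are provided by the tokenizer object.
tokens = encode(mdl.Tokenizer,textData);

% Pad along dimension 2 to get a 1-by-numInputTokens-by-numObservations array.
X = padsequences(tokens,2,"PaddingValue",mdl.Tokenizer.PaddingCode);

% Classify and also return the scores in the range [-1 1].
[sentiment,scores] = finbert.sentimentModel(X,mdl.Parameters);
```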

gpt2

mdl = gpt2 loads a pretrained GPT-2 transformer model and, if necessary, downloads the model weights.

generateSummary

summary = generateSummary(mdl,text) generates a summary of the string or char array text using the transformer model mdl. The output summary is a char array.

summary = generateSummary(mdl,text,Name,Value) specifies additional options using one or more name-value pairs:

"MaxSummaryLength" - The maximum number of tokens in the generated summary. The default is 50.
"TopK" - The number of tokens to sample from when generating the summary. The default is 2.
"Temperature" - Temperature applied to the GPT-2 output probability distribution. The default is 1.
"StopCharacter" - Character to indicate that the summary is complete. The default is ".".

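The effect of "TopK" and "Temperature" can be illustrated with a small, self-contained sketch of top-K sampling with temperature scaling. This is a generic version of the technique written in plain MATLAB, not necessarily how the +sampling package in this repository implements it; the logits and all variable names are made up.

```matlab
% Hypothetical logits over a 6-token vocabulary (values are made up).
logits = [2.0 1.0 0.5 -1.0 3.0 0.0];
topK = 2;          % same value as the documented "TopK" default
temperature = 1;   % same value as the documented "Temperature" default

% Scale by temperature, then keep the K largest logits and their indices.
[topLogits,topIdx] = maxk(logits./temperature,topK);

% Softmax over the K candidates only (subtract the max for stability).
p = exp(topLogits - max(topLogits));
p = p./sum(p);

% Sample one candidate by inverting the cumulative distribution.
cdf = cumsum(p);
cdf(end) = 1;      % guard against round-off in the last entry
nextToken = topIdx(find(rand <= cdf,1,"first"));
```

A larger temperature flattens p and makes the less likely candidates more probable, while a larger K widens the set of tokens that can be drawn at all.
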
Example: Classify Text Data Using BERT

The simplest use of a pretrained BERT model is to use it as a feature extractor. In particular, you can use the BERT model to convert documents to feature vectors which you can then use as inputs to train a deep learning classification network.

The example ClassifyTextDataUsingBERT.m shows how to use a pretrained BERT model to classify failure events given a data set of factory reports. This example requires the factoryReports.csv data set from the Text Analytics Toolbox example Prepare Text Data for Analysis.
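
One common way to turn the output of bert.model into a single feature vector per document is to take the embedding of the first token of each observation. Whether ClassifyTextDataUsingBERT.m does exactly this is not stated here, so treat the following as a sketch of the idea only. It uses a random array as a stand-in for Z so that it runs without the model; the sizes and variable names are hypothetical.

```matlab
% Stand-in for Z = bert.model(X,parameters): hiddenSize-by-numTokens-by-numObs.
hiddenSize = 768;   % hidden size of the "base" model
numTokens  = 5;     % hypothetical sequence length
numObs     = 3;     % hypothetical number of documents
Z = rand(hiddenSize,numTokens,numObs);

% Take the embedding of the first token of every observation and arrange
% the result as one row per document (numObs-by-hiddenSize).
features = permute(Z(:,1,:),[3 1 2]);
```

The resulting features matrix has the observations-in-rows layout that a downstream classifier typically expects.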

Example: Fine-Tune Pretrained BERT Model

To get the most out of a pretrained BERT model, you can fine-tune the BERT parameter weights for your task.

The example FineTuneBERT.m shows how to fine-tune a pretrained BERT model to classify failure events given a data set of factory reports. This example requires the factoryReports.csv data set from the Text Analytics Toolbox example Prepare Text Data for Analysis.

The example FineTuneBERTJapanese.m shows the same workflow using a pretrained Japanese-BERT model. This example requires the factoryReportsJP.csv data set from the Text Analytics Toolbox example Analyze Japanese Text Data, available in R2023a or later.

Example: Analyze Sentiment with FinBERT

FinBERT is a BERT model pretrained on financial text data and fine-tuned for sentiment analysis.

The example SentimentAnalysisWithFinBERT.m shows how to classify the sentiment of financial news reports using a pretrained FinBERT model.

Example: Predict Masked Tokens Using BERT and FinBERT

BERT models are trained to perform various tasks. One of these tasks is masked language modeling, which is the task of predicting tokens in text that have been replaced by a mask value.

The example PredictMaskedTokensUsingBERT.m shows how to predict masked tokens and calculate the token probabilities using a pretrained BERT model.

The example PredictMaskedTokensUsingFinBERT.m shows how to predict masked tokens for financial text and calculate the token probabilities using a pretrained FinBERT model.

Example: Summarize Text Using GPT-2

Transformer networks such as GPT-2 can be used to summarize a piece of text. The trained GPT-2 transformer can generate text given an initial sequence of words as input. The model was trained on comments left on various web pages and internet forums.

Because many of these comments themselves contain a summary indicated by the statement "TL;DR" (Too long, didn't read), you can use the transformer model to generate a summary by appending "TL;DR" to the input text. The generateSummary function takes the input text, automatically appends the string "TL;DR", and generates the summary.

The example SummarizeTextUsingTransformersExample.m shows how to summarize a piece of text using GPT-2.
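
For a quick try outside the example script, a call along the following lines should work, based on the generateSummary syntax documented above. The input text is made up, and the option values are arbitrary choices rather than recommendations.

```matlab
% Load the pretrained GPT-2 model (weights are downloaded if necessary).
mdl = gpt2;

% Hypothetical input text.
text = "The factory reported three conveyor failures this week. " + ...
       "Two were traced to worn bearings and one to a loose belt. " + ...
       "Maintenance replaced the bearings and retensioned the belt.";

% Generate a summary of at most 30 tokens, sampling from the top 2 tokens.
summary = generateSummary(mdl,text,"MaxSummaryLength",30,"TopK",2);
disp(summary)
```

Because the tokens are sampled, repeated calls can return different summaries.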
