CN111008517A - A compression method of neural language model based on tensor decomposition technology - Google Patents

A compression method of neural language model based on tensor decomposition technology

Info

Publication number
CN111008517A
Authority
CN
China
Prior art keywords
model
tensor
transformer
language
translation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911043675.3A
Other languages
Chinese (zh)
Inventor
马鑫典
张鹏
张帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University
Priority to CN201911043675.3A
Publication of CN111008517A
Legal status: Pending (current)


Abstract

The invention discloses a neural language model compression method based on tensor decomposition technology. Starting from a linear representation of the original attention function, it is first proved that the attention function can be linearly represented by a group of orthonormal basis vectors; the parameters are then compressed by sharing this group of basis vectors when constructing the multi-head mechanism. Meanwhile, modeling in a tensor-slicing manner gives the neural network model stronger discrimination capability. The invention provides a new idea for developing neural network models with few parameters and high accuracy.

Description

Tensor decomposition technology-based neural language model compression method
Technical Field
The invention relates to the field of neural language model compression, in particular to the compression of the original attention function of the Transformer neural network model.
Background
With the development of artificial intelligence, neural language pre-training (Pre-training) models have demonstrated their effectiveness on most tasks in the field of natural language processing. The Transformer model is based on an attention mechanism and replaces the recurrent neural network and the convolutional neural network. This model is widely used today and plays a key role in many other pre-trained language models, such as the BERT pre-trained model. However, the large number of parameters in these pre-trained models makes them difficult to deploy on limited resources. The compression of pre-trained language models is therefore an important research problem.
There are several existing model compression methods. When training a language model, the size of the vocabulary has an important influence on the number of model parameters; the parameters of the embedding layer (Embedding Layer) can be reduced by a tensorized word-embedding method, which compresses the model through the Tensor Train tensor decomposition and whose main idea is low-rank decomposition. This method can be used at the embedding layer of any language model. Recently, in the field of image processing, researchers have proposed using tensor block decomposition to compress a recurrent neural network, since the input vector representation is relatively long and leads to a large number of parameters in the linear computation. These compression methods only compress the data representation in the input-layer part of the model and do not compress the model structure itself, and the compressed parts cannot be easily embedded into the original model structure for training. Methods have also been proposed that use a higher-order tensor to replace all the convolution kernels in a convolutional network.
Tensor techniques have been used to compress models, but in general they are applied in isolation to the input layer or the fully connected layer of the model. These compressions alleviate the problem of excessive model parameters to some extent; however, work that combines multiple tensor compression methods to jointly compress the internal structure of the model is lacking. We therefore propose the idea of using low-rank decomposition and parameter sharing simultaneously; the tensor decomposition techniques involved are mainly the third-order Tucker decomposition and the third-order Block-Term decomposition. By combining the two methods, the model is reduced by nearly half of its parameters in overall effect.
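As a point of reference, the following NumPy sketch shows the generic Tucker reconstruction of a third-order tensor from a core tensor and three factor matrices, and a Block-Term combination as a sum of such blocks; the sizes, ranks and variable names are illustrative assumptions and not values from the invention.

```python
import numpy as np

# Illustrative sizes and ranks (assumptions, not values from the invention).
I, J, K = 6, 6, 6          # dimensions of the full third-order tensor
R1, R2, R3 = 3, 3, 3       # Tucker ranks

rng = np.random.default_rng(0)
G = rng.random((R1, R2, R3))   # core tensor
A = rng.random((I, R1))        # factor matrices
B = rng.random((J, R2))
C = rng.random((K, R3))

# Tucker reconstruction: X[i,j,k] = sum_{p,q,r} G[p,q,r] * A[i,p] * B[j,q] * C[k,r]
X_tucker = np.einsum('pqr,ip,jq,kr->ijk', G, A, B, C)

# A Block-Term decomposition is a sum of several such Tucker blocks.
def block_term(cores, factor_triples):
    return sum(np.einsum('pqr,ip,jq,kr->ijk', g, a, b, c)
               for g, (a, b, c) in zip(cores, factor_triples))

X_bt = block_term([G, G], [(A, B, C), (A, B, C)])
print(X_tucker.shape, X_bt.shape)   # (6, 6, 6) (6, 6, 6)
```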
Language modeling is one of the most basic tasks and is mainly used to test a model's ability to model language. The language modeling data mainly comprise the following three datasets: the small PTB dataset, the medium WikiText-103 dataset, and the large One-Billion-Word corpus. To test the behavior of the compressed language model on downstream tasks, we selected a German-to-English translation dataset and performed experiments on that dataset.
Disclosure of Invention
The invention aims to solve the technical problem that existing large-scale pre-trained neural language models have too many parameters to be deployed for experiments on limited resources. The tensor Transformer model is trained on a language modeling dataset and a translation dataset; the network model is trained with back propagation and stochastic gradient descent optimization to obtain the prediction results of the optimal model on the test set, so as to obtain more accurate prediction and translation results.
A method for compressing a neural language model based on a tensor decomposition technology comprises the following steps:
constructing a tensor model by linearly representing an original attention function;
constructing a single block attention function by using Tucker in a tensor model;
constructing a Multi-linear attention function by using the Block-Term decomposition and a shared factor matrix in the tensor model, and realizing parameter compression of the Transformer neural network model;
embedding the Multi-linear attention function into the Transformer neural network model structure to obtain a compressed tensor Transformer model;
applying the compressed tensor Transformer model to a language modeling task and a machine translation task; the method comprises the following steps:
acquiring a language model data set and a German-English translation data set;
processing the dataset and cleaning the data, wherein for text data the position of each word in a sentence matters, and a word-vector representation model of the text is constructed by combining the position information of the words in the sentence;
inputting the word-vector representation model into the tensor Transformer model for model training to obtain the training loss function of the neural network model, which is the cross-entropy

L = -(1/n) · Σ_{i=1}^{n} y_i · log(ŷ_i),

where y_i denotes the true category label, ŷ_i denotes the prediction result, and n denotes the length of the sentence. The model is trained by the back-propagation algorithm and batch stochastic gradient descent.
Training the tensor Transformer model on the training set, validating it on the validation set every certain number of batches, and recording and saving the model parameters for which the effect on the validation set is optimal;
testing the samples on the test set with the optimal model saved in the previous step, finally obtaining the prediction result or translation of each test sample, comparing against the test labels, and calculating the accuracy of prediction and translation; wherein:
in the language modeling task, inputting a test set in a language modeling data set into a trained tensor Transformer model for testing, and calculating the probability of each sentence;
in a translation task, inputting a test set in a German-English translation data set into a trained tensor Transformer model for testing, and calculating a BLEU value of each translated sentence and an original sentence;
and recording the experimental results of the tensor Transformer model on a language modeling task and a German-English translation task.
The invention has the beneficial effects that:
(1) A pre-trained language model with few parameters and high accuracy is built, overcoming the dilemma that a model with too many parameters is difficult to deploy.
(2) The two compression ideas are used in combination, so that a better compression model can be obtained.
(3) The compressed model can improve the experimental results, and in the model testing stage the batch size can be increased so as to serve multi-user translation requests. Tables 1 and 2 below show the experimental results of the compressed model on the language modeling and translation tasks.
[Table 1: experimental results on the language modeling task. Note: the lower the Test PPL on the One-Billion dataset, the better.]
[Table 2: experimental results on the translation task. Note: the larger the Test BLEU value on the WMT-16 En-De translation dataset, the better.]
Drawings
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a diagram of a tensor Transformer model;
FIG. 3 is a schematic diagram of the tensor representation that can reconstruct the original attention.
Detailed Description
The technical solutions of the present invention are further described in detail below with reference to the accompanying drawings, but the scope of the present invention is not limited thereto. FIG. 1 shows a flow chart of the compression method proposed by the present invention; FIG. 2 shows a diagram of the neural network model designed by the present invention; FIG. 3 shows a schematic diagram of the tensor representation that can reconstruct the original attention.
The invention discloses a method for compressing the parameters of a neural language model based on tensor decomposition. Owing to the outstanding performance of self-attention language models on natural language processing tasks, pre-training (Pre-training) language models based on the encoder-decoder (Encoder-Decoder) structure with self-attention have become a research hotspot in the field of natural language processing; however, because the model parameters are too large, training is difficult on limited computing resources. We develop a way to combine the tensor decomposition and parameter-sharing compression ideas to compress a general neural network language model, the Transformer, and apply the compressed model to language modeling and machine translation tasks.
The implementation and application of the invention mainly comprise the following steps: finding the structure that needs compression in the Transformer language model and studying its main properties; designing a new model structure using tensor decomposition technology; studying the difference between the new structure and the original structure and proving the rationality of the new structure; collecting language modeling and translation corpus datasets, dividing them into training and test sets, and constructing a word representation for each word in the corpus from its position vector and semantic vector; inputting the text word vectors of the training corpus into the compressed network structure and training the language model; and inputting the text word vectors of the test set into the trained language model, thereby calculating the prediction probability of each sample.
the present invention starts with a linear representation of the original attention function and then demonstrates that the attention function can be linearly represented by a set of orthonormal basis vectors, which we can then share in case of constructing a multi-headed mechanism. Thereby compressing the parameters. Meanwhile, the invention also proves that the original attention function can be reconstructed by the new expression, and the neural network model can have stronger discrimination capability by the tensor slicing mode modeling. The invention provides a new idea for developing a neural network model with low parameters and high accuracy.
The purpose of the invention is realized by the following technical scheme, which comprises the following steps:
constructing a tensor model by linearly representing an original attention function;
constructing a single block attention function by using Tucker in a tensor model;
constructing a Multi-linear attention function by using the Block-Term decomposition and a shared factor matrix in the tensor model, and realizing parameter compression of the Transformer neural network model;
Since there are three encoding matrices in the Transformer neural network model, which are regarded as three factor matrices, a linear attention function can be constructed with the Tucker decomposition of the tensor decomposition technique after initializing a core tensor.
The single-block attention function is constructed using the Tucker decomposition technique, and is of the form:

Atten_TD(𝒢; Q, K, V) = Σ_i Σ_j Σ_m 𝒢_{ijm} · (Q_i ∘ K_j ∘ V_m),

where 𝒢 is the core tensor, i, j and m are the indices of the core tensor, Q_i, K_j and V_m denote vectors taken from the factor matrices Q, K and V, and ∘ is the outer product of vectors. In particular, in the experiments the core tensor 𝒢 can be defined as the diagonal tensor

𝒢_{ijm} = rand(0,1) if i = j = m, and 𝒢_{ijm} = 0 otherwise,

where rand(0,1) is a function that generates random values. Finally, the parameter we store is a vector, on which we perform Softmax normalization.
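The following NumPy sketch illustrates one way to realize such a single-block attention function with a diagonal, Softmax-normalized core tensor; the sequence length, dimension and variable names are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

N, d = 5, 4                                  # sequence length and dimension (assumed)
rng = np.random.default_rng(0)
Q, K, V = [rng.normal(size=(N, d)) for _ in range(3)]   # the three factor matrices

# Diagonal core tensor: only a length-d vector of rand(0,1) values is stored,
# and Softmax normalization is applied to it, as described above.
g = softmax(rng.random(d))
G = np.zeros((d, d, d))
G[np.arange(d), np.arange(d), np.arange(d)] = g

# Single-block attention: sum over i, j, m of G[i,j,m] * (Q_i outer K_j outer V_m),
# where Q_i, K_j, V_m are columns of Q, K, V; the result is an N x N x N tensor.
atten_td = np.einsum('ijm,xi,yj,zm->xyz', G, Q, K, V)
print(atten_td.shape)                        # (5, 5, 5)
```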
Embedding the Multi-linear attention function into the Transformer neural network model structure to obtain a compressed tensor Transformer model;
To construct a multi-head attention mechanism and at the same time compress the model parameters, we use a set of linear mappings and then share the output of this part. In our model this is called the Multi-linear attention function, which can be formalized as:

MultiLinear(𝒢; Q, K, V) = SplitConcat( (1/h) · (T_1 + T_2 + … + T_h) ) · W^O,
with T_j = Atten_TD(𝒢_j; Q, K, V), j = 1, …, h,

where 𝒢_j is the diagonal core tensor of the j-th head, the factor matrices Q, K and V are shared among the heads, SplitConcat denotes splitting the averaged third-order tensor and concatenating the resulting matrices, and W^O is the output parameter matrix. The Multi-linear attention function herein can reconstruct the original attention representation. In multi-head compression, the compression ratio of the model is calculated as the ratio of the number of parameters of the original multi-head attention to the number of parameters of the Multi-linear attention; h is generally 8, so as d increases, the compression ratio of the model increases.
Applying the compressed tensor Transformer model to a language modeling task and a machine translation task; the method comprises the following steps:
acquiring a language model data set and a German-English translation data set;
processing the dataset and cleaning the data, wherein for text data the position of each word in a sentence matters, and a word-vector representation model of the text is constructed by combining the position information of the words in the sentence;
inputting the word-vector representation model into the tensor Transformer model for model training to obtain the training loss function of the neural network model, which is the cross-entropy

L = -(1/n) · Σ_{i=1}^{n} y_i · log(ŷ_i),

where y_i denotes the true category label, ŷ_i denotes the prediction result, and n denotes the length of the sentence. The model is trained by the back-propagation algorithm and batch stochastic gradient descent (a minimal training sketch is given after these steps).
Training the tensor Transformer model on the training set, validating it on the validation set every certain number of batches, and recording and saving the model parameters for which the effect on the validation set is optimal;
testing the samples on the test set with the optimal model saved in the previous step, finally obtaining the prediction result or translation of each test sample, comparing against the test labels, and calculating the accuracy of prediction and translation; wherein:
in the language modeling task, inputting a test set in a language modeling data set into a trained tensor Transformer model for testing, and calculating the probability of each sentence;
in a translation task, inputting a test set in a German-English translation data set into a trained tensor Transformer model for testing, and calculating a BLEU value of each translated sentence and an original sentence;
and recording the experimental results of the tensor Transformer model on a language modeling task and a German-English translation task.
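The training objective and optimization described above can be sketched as follows in PyTorch; the tensor Transformer itself is replaced here by a stand-in module, and the vocabulary size, batch size, sentence length and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Stand-in for the tensor Transformer: any module mapping token ids of shape
# (batch, n) to logits of shape (batch, n, vocab_size) could be plugged in here.
class StandInLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, d_model)
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        return self.proj(self.emb(tokens))

vocab_size = 1000
model = StandInLM(vocab_size)
criterion = nn.CrossEntropyLoss()                       # cross-entropy over the n positions
optimizer = torch.optim.SGD(model.parameters(), lr=0.1) # batch stochastic gradient descent

# One illustrative batch: inputs and the true category labels y_i for each position.
tokens = torch.randint(0, vocab_size, (8, 20))          # batch of 8 sentences, n = 20
labels = torch.randint(0, vocab_size, (8, 20))

logits = model(tokens)                                  # predictions for each position
loss = criterion(logits.reshape(-1, vocab_size), labels.reshape(-1))

optimizer.zero_grad()
loss.backward()                                         # back-propagation
optimizer.step()
print(float(loss), float(torch.exp(loss)))              # loss and the corresponding perplexity
```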
According to the method, the original attention function is compressed by using parameter sharing and low-rank decomposition together; a new attention function, the Multi-linear attention function, is obtained and embedded into the Transformer language model to perform natural language processing tasks.
The specific application steps are as follows:
the method mainly comprises three parts, wherein the first part is how to design an Encoder structure of a Transformer, the second part is that a compressed language model is tested in a language modeling task, and the third part is that the compressed model is tested in a translation task.
First part:
(1): a Single-Block Attention function (Single-Block Attention) is constructed based on the Tucker tensor decomposition, as shown on the left side of fig. 2.
(2): multiple monolithic attention functions are constructed while sharing the same set of factor matrices. The tensor representation of these monolithic attentions was then taken using the average pooling method, resulting in a third order tensor representation. As shown on the right side of figure 2.
(3): in order to enable the third-order tensor expression to be embedded into a Transformer neural network model framework, a tensor splitting method is adopted. The method mainly comprises the steps of segmenting the third-order tensor in the second dimension of the third-order tensor to obtain a plurality of matrixes, splicing the matrixes, and expressing the tensor through a full-connection layerSuccessfully embedded in the transform structure, as shown on the right side of fig. 2, we split (split) an nxnxnxn by N third order tensor, and then use the concatenation (concat) to arrive at the matrix T1,…,Tn.
Second part:
(1) Process the three datasets: remove punctuation marks to obtain the vocabulary of the corpus, then vector-encode each word, and when encoding each sentence splice the relative position vector onto each word vector. The encoding method of the position vector is as follows:
PE(pos, 2i) = sin( pos / 10000^(2i/d_model) )
PE(pos, 2i+1) = cos( pos / 10000^(2i/d_model) ),

where pos is the position of the word, i denotes the dimension of the vector, and d_model is the dimension of the word vector.
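A minimal sketch of this sinusoidal position encoding (the standard Transformer formulation); the maximum length and vector dimension are illustrative assumptions, and the resulting position vectors are spliced onto the word vectors as described above.

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal position vectors: sin on even dimensions, cos on odd dimensions."""
    pos = np.arange(max_len)[:, None]                     # word positions
    i = np.arange(d_model)[None, :]                       # vector dimensions
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])
    pe[:, 1::2] = np.cos(angle[:, 1::2])
    return pe

pe = positional_encoding(max_len=50, d_model=8)
print(pe.shape)        # (50, 8) -- combined with the word vectors of each sentence
```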
(2) The sentences in the corpus are processed in batches (Batch), each word is encoded by the method in (1), and the word-vector representation of each sentence is then input into the tensor Transformer language model we designed. The specific steps are as follows:
the first step is as follows: these inputs are subjected to a linear encoding process three times:
Figure BDA0002253540660000063
w hereinV、WkAnd WQIs the three initialization parameter matrices and E is the vector input for the sentence.
The second step: the three matrices Q, K and V are then input into the Multi-linear attention function we have designed; the specific formula is as follows:

Multilinear(Q, K, V) = Concat(H_1, H_2, …, H_n) · W^O,
where (H_1, H_2, …, H_n) = TensorSplit(G),

G is a third-order tensor, and H_1, H_2, …, H_n are the matrices resulting from the tensor splitting.
The third step: inputting the output of the Multi-linear function into a Feed-Forward network, wherein the network function is as follows:
FFN(x) = max(0, x·W_1 + b_1)·W_2 + b_2
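Putting the first, second and third steps together, the following sketch (reusing the pieces sketched earlier) passes the word vectors of one sentence through the linear encodings, the Multi-linear attention and the feed-forward network; all shapes and initializations are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_linear_attention(E, Wq, Wk, Wv, Wo, cores):
    """Steps one and two: shared Q/K/V encodings, per-head diagonal core tensors,
    averaging, tensor split along the second mode, concatenation, output projection."""
    Q, K, V = E @ Wq, E @ Wk, E @ Wv                       # first step: three linear encodings
    T = sum(np.einsum('i,xi,yi,zi->xyz', g, Q, K, V) for g in cores) / len(cores)
    H = np.concatenate([T[:, j, :] for j in range(T.shape[1])], axis=1)   # Concat(H_1..H_n)
    return H @ Wo

def feed_forward(x, W1, b1, W2, b2):
    """Third step: FFN(x) = max(0, x W1 + b1) W2 + b2."""
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2

N, d, h, d_ff = 5, 4, 2, 16
rng = np.random.default_rng(0)
E = rng.normal(size=(N, d))                                # word vectors of one sentence
Wq, Wk, Wv = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3)]
Wo = rng.normal(size=(N * N, d)) / np.sqrt(N * N)
cores = [softmax(rng.random(d)) for _ in range(h)]
W1, b1 = rng.normal(size=(d, d_ff)) / np.sqrt(d), np.zeros(d_ff)
W2, b2 = rng.normal(size=(d_ff, d)) / np.sqrt(d_ff), np.zeros(d)

attn_out = multi_linear_attention(E, Wq, Wk, Wv, Wo, cores)
out = feed_forward(attn_out, W1, b1, W2, b2)
print(out.shape)                                           # (5, 4)
```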
(3) The last layer applies a full connection to the final output, the loss function is then calculated, back-propagation (BP) is performed, and the model is trained.
(4) Model validation is performed on the validation set after each batch, and the model with the best result on the validation set is saved.
Third part:
(1) The encoding part adopts the same model structure as in language modeling. In this part, English is used as the source text and input into the Encoder part of the Transformer language model, while German, as the target sentence (i.e., the translation), is input into the Decoder part; the Decoder part adopts the original self-attention function:

Attention(Q, K, V) = softmax( Q·K^T / √d_k ) · V,

where d_k is the dimension of the key vectors.
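A minimal NumPy sketch of this original scaled dot-product self-attention used in the Decoder; the shapes are illustrative assumptions.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Original self-attention used in the Decoder: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = [rng.normal(size=(5, 4)) for _ in range(3)]
print(scaled_dot_product_attention(Q, K, V).shape)           # (5, 4)
```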
(2) English passes through the Encoder and German through the Decoder; the matching probability is then calculated at the output layer, and the translation model is trained through a cross-entropy loss function. Translation quality is evaluated with BLEU, as sketched after these steps.
(3) Model validation is performed on the validation set after each batch, and the model with the best result on the validation set is saved.
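For the BLEU evaluation of the translated sentences mentioned above, a sentence-level score can be computed, for example, with NLTK as in the following sketch; the tool choice and the example sentences are assumptions, since the implementation is not specified here, and the corpus-level BLEU reported in Table 2 is computed over the whole test set in the same spirit.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical example: one reference (original target sentence) and one model output.
reference = "the model compresses the attention parameters".split()
hypothesis = "the model compresses attention parameters".split()

# sentence_bleu expects a list of reference token lists and one hypothesis token list.
score = sentence_bleu([reference], hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU = {score:.3f}")
```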
The model can compress away nearly half of the parameters of the original Transformer language model. In addition, with this technique the compressed self-attention module can be embedded back into the original model structure while an improvement in the experimental results is ensured. The main technical support is that the tensor representation, which carries more information, is able to reconstruct the original attention output representation, which is a marginal probability of the tensor representation. FIG. 3 shows this process: the left side is the third-order tensor representation obtained by the technique; by slicing the tensor along the vertical dimension, the N matrices at the top right of FIG. 3 are obtained, and summing these N matrices yields the original attention output X. In our technique, a concatenation operation (concat), as shown in FIG. 2, is adopted instead of summation, so that richer information is modeled and the final effect of the model is improved.
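The slicing, summation and concatenation described above and shown in FIG. 3 can be illustrated with the following NumPy sketch; the tensor here is a random stand-in, so the sketch only demonstrates the shapes of the two aggregation choices, not the reconstruction property itself.

```python
import numpy as np

N = 5
rng = np.random.default_rng(0)
T3 = rng.normal(size=(N, N, N))           # stand-in for the single-block attention tensor

slices = [T3[:, j, :] for j in range(N)]  # N matrices obtained by slicing one mode

# Summing the slices yields one N x N matrix (the aggregation that, per the
# description above, corresponds to the original attention output).
summed = sum(slices)

# Concatenating the slices instead keeps all N*N*N values, i.e. a richer representation.
concatenated = np.concatenate(slices, axis=1)

print(summed.shape, concatenated.shape)   # (5, 5) (5, 25)
```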
The technical means disclosed in the invention scheme are not limited to the technical means disclosed in the above embodiments, but also include the technical scheme formed by any combination of the above technical features. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications are also considered to be within the scope of the present invention.

Claims (2)

Translated from Chinese
1. A compression method for a neural language model based on tensor decomposition technology, characterized in that it comprises the following steps:
constructing a tensor model by linearly representing the original attention function;
constructing a single-block attention function by using the Tucker decomposition in the tensor model;
constructing a Multi-linear attention function by using the Block-Term decomposition and a shared factor matrix in the tensor model, to realize parameter compression of the Transformer neural network model;
embedding the Multi-linear attention function into the Transformer neural network model structure to obtain a compressed tensor Transformer model;
applying the compressed tensor Transformer model to language modeling tasks and machine translation tasks;
training the tensor Transformer model on the training set, validating it on the validation set every certain number of batches, and recording and saving the model parameters for which the effect on the validation set is optimal;
testing the samples on the test set with the optimal model saved in the previous step, finally obtaining the prediction result or translation of each test sample, comparing against the test labels, and calculating the accuracy of prediction and translation; wherein:
in the language modeling task, the test set of the language modeling dataset is input into the trained tensor Transformer model for testing, and the probability of each sentence is calculated;
in the translation task, the test set of the German-English translation dataset is input into the trained tensor Transformer model for testing, and the BLEU value between each translated sentence and the original sentence is calculated;
the experimental results of the tensor Transformer model on the language modeling task and the German-English translation task are recorded.
2. The compression method for a neural language model based on tensor decomposition technology according to claim 1, characterized in that the specific steps of applying the tensor Transformer model to the language modeling task and the machine translation task are as follows:
acquiring a language model dataset and a German-English translation dataset;
processing the dataset and cleaning the data, wherein for text data the position of each word in a sentence matters, and a word-vector representation model of the text is constructed by combining the position information of the words in the sentence;
inputting the word-vector representation model into the tensor Transformer model for training, the training loss function of the neural network model being

L = -(1/n) · Σ_{i=1}^{n} y_i · log(ŷ_i),

where y_i denotes the true category label, ŷ_i denotes the prediction result, and n denotes the length of the sentence; the model is trained by the back-propagation algorithm and batch stochastic gradient descent.
CN201911043675.3A | priority date 2019-10-30 | filing date 2019-10-30 | A compression method of neural language model based on tensor decomposition technology | Pending

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201911043675.3A (CN111008517A (en)) | 2019-10-30 | 2019-10-30 | A compression method of neural language model based on tensor decomposition technology

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201911043675.3A (CN111008517A (en)) | 2019-10-30 | 2019-10-30 | A compression method of neural language model based on tensor decomposition technology

Publications (1)

Publication Number | Publication Date
CN111008517A (en) | 2020-04-14

Family

ID=70111682

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201911043675.3A (Pending, CN111008517A (en)) | A compression method of neural language model based on tensor decomposition technology | 2019-10-30 | 2019-10-30

Country Status (1)

Country | Link
CN (1) | CN111008517A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112084796A (en)* | 2020-09-15 | 2020-12-15 | 南京文图景信息科技有限公司 | Multi-language place name root Chinese translation method based on Transformer deep learning model
CN112434804A (en)* | 2020-10-23 | 2021-03-02 | 东南数字经济发展研究院 | Compression algorithm for deep transform cascade neural network model
CN112925904A (en)* | 2021-01-27 | 2021-06-08 | 天津大学 | Lightweight text classification method based on Tucker decomposition
CN113537485A (en)* | 2020-04-15 | 2021-10-22 | 北京金山数字娱乐科技有限公司 | Neural network model compression method and device
CN115309713A (en)* | 2022-09-29 | 2022-11-08 | 江西锦路科技开发有限公司 | Traffic data compression method and device, electronic equipment and storage medium
CN119494263A (en)* | 2024-10-29 | 2025-02-21 | 哈尔滨理工大学 | Temperature field prediction method and system for soft-pack lithium-ion batteries based on tensor Tucker decomposition

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2018191344A1 (en)* | 2017-04-14 | 2018-10-18 | Salesforce.Com, Inc. | Neural machine translation with latent tree attention
WO2018213763A1 (en)* | 2017-05-19 | 2018-11-22 | Salesforce.Com, Inc. | Natural language processing using context-specific word vectors
US20190130273A1 (en)* | 2017-10-27 | 2019-05-02 | Salesforce.Com, Inc. | Sequence-to-sequence prediction using a neural network model

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2018191344A1 (en)* | 2017-04-14 | 2018-10-18 | Salesforce.Com, Inc. | Neural machine translation with latent tree attention
WO2018213763A1 (en)* | 2017-05-19 | 2018-11-22 | Salesforce.Com, Inc. | Natural language processing using context-specific word vectors
US20190130273A1 (en)* | 2017-10-27 | 2019-05-02 | Salesforce.Com, Inc. | Sequence-to-sequence prediction using a neural network model

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XINDIAN MA: "A Tensorized Transformer for Language Modeling"*

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113537485A (en)* | 2020-04-15 | 2021-10-22 | 北京金山数字娱乐科技有限公司 | Neural network model compression method and device
CN113537485B (en)* | 2020-04-15 | 2024-09-06 | 北京金山数字娱乐科技有限公司 | Compression method and device for neural network model
CN112084796A (en)* | 2020-09-15 | 2020-12-15 | 南京文图景信息科技有限公司 | Multi-language place name root Chinese translation method based on Transformer deep learning model
CN112084796B (en)* | 2020-09-15 | 2021-04-09 | 南京文图景信息科技有限公司 | Multi-language place name root Chinese translation method based on Transformer deep learning model
CN112434804A (en)* | 2020-10-23 | 2021-03-02 | 东南数字经济发展研究院 | Compression algorithm for deep transform cascade neural network model
CN112925904A (en)* | 2021-01-27 | 2021-06-08 | 天津大学 | Lightweight text classification method based on Tucker decomposition
CN112925904B (en)* | 2021-01-27 | 2022-11-29 | 天津大学 | Lightweight text classification method based on Tucker decomposition
CN115309713A (en)* | 2022-09-29 | 2022-11-08 | 江西锦路科技开发有限公司 | Traffic data compression method and device, electronic equipment and storage medium
CN119494263A (en)* | 2024-10-29 | 2025-02-21 | 哈尔滨理工大学 | Temperature field prediction method and system for soft-pack lithium-ion batteries based on tensor Tucker decomposition
CN119494263B (en)* | 2024-10-29 | 2025-07-29 | 哈尔滨理工大学 | Soft package lithium ion battery temperature field prediction method and system based on tensor Tucker decomposition

Similar Documents

Publication | Title
CN111368565B (en) | Text translation method, text translation device, storage medium and computer equipment
CN111008517A (en) | A compression method of neural language model based on tensor decomposition technology
CN110598224B (en) | Training method of translation model, text processing method, device and storage medium
CN110570845B (en) | A Speech Recognition Method Based on Domain Invariant Features
CN111125333B (en) | A Generative Question Answering Method Based on Representation Learning and Multilayer Covering Mechanism
CN117076931A (en) | Time sequence data prediction method and system based on conditional diffusion model
CN114997174B (en) | Intention recognition model training and voice intention recognition method and device and related equipment
CN110852066B (en) | A method and system for multilingual entity relation extraction based on adversarial training mechanism
CN117151173B (en) | Model compression method and system based on meta learning
CN115019785A (en) | Streaming voice recognition method and device, electronic equipment and storage medium
Huai et al. | Zerobn: Learning compact neural networks for latency-critical edge systems
CN114139011A (en) | An Encoder-Dual Decoder-based Image Chinese Description Generation Method
CN117828072B (en) | A conversation classification method and system based on heterogeneous graph neural network
CN110347860A (en) | Depth image based on convolutional neural networks describes method
CN112712855B (en) | Joint training-based clustering method for gene microarray containing deletion value
CN117556009A (en) | Multi-turn dialogue generation method and system based on conditional diffusion model
CN117494815A (en) | Archive-oriented trusted large language model training, inference methods and devices
CN118297131A (en) | Text natural language processing training method and system in operation of electric power system
CN114638905B (en) | Image generation method, device, equipment and storage medium
CN114896969A (en) | Method for extracting aspect words based on deep learning
Zhang et al. | Word-level BERT-CNN-RNN model for Chinese punctuation restoration
WO2020040255A1 (en) | Word coding device, analysis device, language model learning device, method, and program
CN119206098A (en) | A CLIP-based method for reverse engineering properties to microstructures
Zheng et al. | Contrastive auto-encoder for phoneme recognition
Haikun et al. | Speech recognition model based on deep learning and application in pronunciation quality evaluation system

Legal Events

Code | Title | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2020-04-14
