Detailed Description
In the prior art, the basic unit in ShuffleNet v2 is an improvement on the residual unit.
The residual unit is composed of a convolution block, a shortcut connection and an element-wise addition operation: the unit input is fed into the convolution block for convolution processing to obtain a convolution output, and the convolution output is added element-wise to the unit input carried directly by the shortcut connection to obtain the final output of the residual unit. The drawback of the residual unit is that the element-wise addition increases the amount of computation and consumes more time. ShuffleNet v2 therefore proposes a basic unit in which the unit input is evenly divided into two groups by a channel split (Channel Split) operation: one group serves as the input of a convolution block and yields a convolution output through convolution processing, while the other group is directly concatenated with the convolution output along the channel dimension, after which a uniform channel interleaving (Channel Shuffle) operation is performed to obtain the unit output, so that the element-wise addition is avoided. For example, if the convolution outputs are the sequentially arranged feature maps 2, 3, 4 and 5, and the other group consists of the sequentially arranged feature maps a, b, c and d, concatenation yields the sequentially arranged feature maps 2, 3, 4, 5, a, b, c and d, and the channel shuffle operation then yields the sequentially arranged feature maps 2, a, 3, b, 4, c, 5 and d. The purpose of the uniform channel shuffle is to ensure that, when the channels are next split into two groups, each group contains both feature maps that have passed through a convolution block and feature maps that have not.
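The channel split, concatenation and channel shuffle operations described above can be sketched with plain Python lists standing in for channels (a hedged illustration only; `conv_out` stands in for the result of the convolution block, and the channel labels are taken from the example in the text):

```python
def channel_split(channels):
    """Evenly divide the input channels into two groups (Channel Split)."""
    half = len(channels) // 2
    return channels[:half], channels[half:]

def channel_shuffle(channels, groups=2):
    """Uniformly interleave the channels of `groups` concatenated groups
    (Channel Shuffle): read a groups x n layout column by column."""
    n = len(channels) // groups
    return [channels[g * n + i] for i in range(n) for g in range(groups)]

# One group went through the convolution block and produced 2, 3, 4, 5;
# the other group (a, b, c, d) is concatenated with it along the channel
# dimension, then shuffled, avoiding any element-wise addition.
conv_out = ["2", "3", "4", "5"]
other = ["a", "b", "c", "d"]
print(channel_shuffle(conv_out + other))
# ['2', 'a', '3', 'b', '4', 'c', '5', 'd']
```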
The entire ShuffleNet v2 network is formed by repeatedly stacking this basic unit. Analysis of the whole ShuffleNet v2 network shows that the number of feature maps that the convolution blocks in subsequent basic units receive from the convolution block in an earlier basic unit decays in equal proportion. For example, the convolution block in the second basic unit receives 8 feature maps from the convolution block in the first basic unit, the convolution block in the third basic unit receives 4, the convolution block in the fourth basic unit receives 2, and the convolution block in the fifth basic unit receives 1, where 8, 4, 2, 1 decay in equal proportion.
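The geometric attenuation of shared feature-map counts described above can be written out as a small sketch (the starting count 8 and the ratio 1/2 come from the example in the text; the function name is illustrative):

```python
def shared_counts(first_count, num_later_units, ratio=0.5):
    """Numbers of feature maps that units 2, 3, ... receive from unit 1,
    decaying in equal proportion (geometric attenuation)."""
    counts, c = [], first_count
    for _ in range(num_later_units):
        counts.append(c)
        c = int(c * ratio)
    return counts

print(shared_counts(8, 4))  # [8, 4, 2, 1]
```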
From the above analysis, the network in the prior art has a complex overall design and is not flexible enough.
Moreover, the inventors found through research that the required pass rate of high-frequency information differs across the feature extraction stages of a network. In the low-level feature extraction stage, mainly low-level features of objects such as edges, gradients and textures are extracted; this information belongs to high-frequency signals, so the pass rate of high-frequency information needs to be increased at this stage. In the middle-level feature extraction stage, the neural network mainly extracts abstract concepts that contain a certain amount of high-frequency components, but not as many as the low level, so the pass rate of high-frequency information needs to be slightly reduced compared with the low-level feature extraction stage. In the high-level feature extraction stage, the neural network has already extracted various high-level abstract concepts that need to be linearly separable in the vector space in which the network output lies; that is, the signals at this level should not contain complex high-frequency components, so the pass rate of high-frequency information needs to be suppressed as much as possible at this stage.
However, due to the design of the basic unit in ShuffleNet v2, the different feature extraction stages of the whole network all divide feature maps in the same equal-proportion attenuation manner, which ignores the characteristics of the different feature extraction stages and seriously affects the prediction accuracy of the network.
The embodiment of the application provides a lightweight neural network model whose structure is simple and whose design is flexible, so that the relation among the numbers of feature maps in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, thereby improving prediction accuracy.
In order to enable those skilled in the art to better understand the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, rather than all, of the embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort shall fall within the scope of the application.
Furthermore, some of the flows described in the specification, claims and drawings include a plurality of operations occurring in a particular order, and these operations may be performed out of the order in which they appear herein, or in parallel. The sequence numbers of operations such as 101 and 102 are merely used to distinguish the various operations, and the sequence numbers themselves do not represent any order of execution. In addition, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that the descriptions of "first" and "second" herein are used to distinguish different messages, devices, network blocks, etc.; they do not represent a sequence, nor do they require the "first" and "second" to be of different types.
Fig. 2 is a flow chart illustrating a data identification method according to an embodiment of the application. As shown in fig. 2, the method includes:
101. Obtain data to be processed.
102. Input the data to be processed into a trained neural network model to obtain a recognition result.
In practical application, the data to be processed may be an image to be processed, a video to be processed, or an audio to be processed. When the data to be processed is the video to be processed, video frames in the video to be processed can be input into the trained neural network model in a sequence mode. When the data to be processed is the audio to be processed, the audio frames in the audio to be processed can be input into the trained neural network model in a sequence mode.
It should be added that in practical application, the audio to be processed may be sampled to obtain a plurality of audio frames, for example, the audio to be processed may be sampled at equal time intervals or at different time intervals to obtain a plurality of audio frames. Specific sampling techniques may be found in the prior art and will not be described in detail herein.
In addition, in one example, when the data to be processed is audio to be processed, before the audio is input into the trained neural network model, speech signal analysis may be performed on each audio frame of the audio to be processed to obtain a time-frequency spectrum corresponding to each frame. The time-frequency spectra corresponding to the audio frames of the audio to be processed are then input into the neural network model, so that each spectrum is processed by the neural network model as a frame image.
When the data to be processed is an image to be processed, the identification result may be an image classification result, a target detection result, or an image segmentation result. When the data to be processed is a video to be processed, the identification result may be a video classification result, a target tracking result, or the like. When the data to be processed is audio to be processed, the identification result may be an audio classification result.
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block located after the first network block. The neural network model is used for acquiring the group number corresponding to the first network block, dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and allocating the feature map groups to corresponding network blocks among the at least one second network block as input components of those network blocks, wherein the group number is related to the number of the at least one second network block.
The sequence is formed by sequentially connecting a plurality of network blocks. The first network block and the second network block are each composed of at least one neural network layer, which may include a convolution layer. In one example, each network block in the sequence convolves its input to obtain its output. The sequence may include a first downsampling block and a plurality of scale-invariant convolution blocks arranged in order, with the first downsampling block ranked first in the sequence. The number of scale-invariant convolution blocks in the sequence can be determined according to the feature extraction stage in which the sequence is located in the neural network model and the application requirements. The specific structure of the first downsampling block can be seen in fig. 3, and that of the scale-invariant convolution block in fig. 4. The first downsampling block reduces the feature map size, while the scale-invariant convolution block does not change it.
As shown in fig. 3, the first downsampling block includes two branches: the left branch is a combination of a 5x5 depthwise convolution layer (depthwise conv) and a 1x1 convolution layer, and the right branch is a combination of a 1x1 convolution layer, a 5x5 depthwise convolution layer (depthwise conv) and a 1x1 convolution layer. Finally, the outputs of the two branches are added to obtain the output of the whole structure.
As shown in fig. 4, the scale-invariant convolution block is a combination of a 1x1 convolution layer, a 3x3 depthwise convolution layer (depthwise conv) and a 1x1 convolution layer.
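The effect of the two block types on feature map shape can be sketched as simple shape arithmetic (a hedged illustration: a stride-2 depthwise convolution is assumed for the downsampling block, and the concrete channel counts in the example are made up):

```python
def downsampling_block_shape(h, w, c_out):
    """Fig. 3 block: the stride-2 5x5 depthwise convs halve H and W on
    both branches, the 1x1 convs set the channel count, and the two
    branch outputs are added (shapes must match for the addition)."""
    return (h // 2, w // 2, c_out)

def scale_invariant_block_shape(h, w, c):
    """Fig. 4 block: 1x1 conv -> padded 3x3 depthwise conv (stride 1) ->
    1x1 conv leaves spatial size and channel count unchanged."""
    return (h, w, c)

shape = downsampling_block_shape(56, 56, 64)
print(shape)                                # (28, 28, 64)
print(scale_invariant_block_shape(*shape))  # (28, 28, 64)
```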
In practical application, the division mode may be uniform division or non-uniform division, and may be set according to the depth (i.e., stage) of the sequence in the network, the specific task requirements, or the deployment hardware environment. In most cases, uniform division is more hardware-friendly than non-uniform division. In addition, the plurality of feature maps need not be divided contiguously; they may be divided with discontinuous jumps, as long as the resulting feature map groups do not overlap each other. In general, contiguous division is easier to implement in hardware.
In the sequence, the first network block and the at least one second network block are sequentially connected. The group number corresponding to the first network block may be configured in advance, or determined automatically when the model is executed. In one example, when the neural network model is constructed, the group number configured by the user for the first network block according to the number of the at least one second network block is received and recorded. In another example, the neural network model obtains the number of the at least one second network block at execution time and automatically determines the group number based on that number.
In practice, the group number may be equal to the number of the at least one second network block, so that the number of feature map groups obtained by division corresponds to the number of the at least one second network block and each second network block is allocated one of the feature map groups. Alternatively, the difference between the group number and the number of the at least one second network block may be a fixed value, for example 1, so that the number of feature map groups obtained by division is one more than the number of the at least one second network block.
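The two ways of choosing the group number can be sketched as follows (an illustrative helper, not part of the claimed model):

```python
def group_number(num_second_blocks, extra_groups=0):
    """Group number for a first network block: equal to the number of
    following second blocks, or a fixed amount (e.g. 1) more when one
    group is also passed on, for example to the next stage."""
    return num_second_blocks + extra_groups

print(group_number(2))     # 2 -> one group per second network block
print(group_number(2, 1))  # 3 -> one spare group left over
```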
In both the training stage and the application stage of the neural network model, the plurality of feature maps are divided according to the division mode configured for the first network block when the neural network model was constructed, and the plurality of feature map groups are allocated according to the allocation rule configured for the first network block when the neural network model was constructed. In this way, the division and allocation are not performed randomly, and through model training each second network block can be allocated (i.e., have shared with it), from the plurality of feature maps output by the first network block, the feature maps that are useful to it.
In addition, in the embodiment of the application, the relation among the numbers of feature maps in the plurality of feature map groups can be configured according to actual needs when the network is constructed, which is flexible and convenient; for example, the numbers of feature maps in the groups may be equal, decay in equal proportion, or decay linearly. Specifically, the relation can be configured flexibly according to the characteristics of the feature extraction stage in which the sequence is located in the network.
The feature maps in the plurality of feature map groups are mutually disjoint, and the union of the plurality of feature map groups comprises all the feature maps, which ensures that every feature map is utilized.
According to the technical scheme provided by the embodiment of the application, the plurality of feature maps output by the first network block are divided according to a group number related to the number of the at least one second network block located after the first network block, so as to obtain a plurality of feature map groups, and the plurality of feature map groups are allocated in order to the at least one second network block located after the first network block. The network structure is simple to design, and the relation among the numbers of feature maps in the plurality of feature map groups can be conveniently designed according to actual needs; that is, the relation among the numbers of feature maps that each second network block receives from the first network block is easy to design, which improves flexibility. In this way, the relation among the numbers of feature maps in the feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve prediction accuracy.
It should be added that, in practical application, the output of every network block in the sequence can be divided and allocated in the same way according to the above method, so as to realize joint multi-layer feature sharing. The input of each network block in the sequence is thus a combination of parts of the outputs of the network blocks located before it in the sequence, thereby increasing the pass rate of high-frequency signals.
Wherein, for a third network block, the feature map groups allocated to the third network block from all the network blocks located before it in the sequence are concatenated in the channel dimension to obtain the input of the third network block, where the third network block refers to any network block in the sequence other than the first two network blocks. It should be added that one feature map corresponds to one channel.
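The input construction described above can be sketched as pure bookkeeping (strings stand in for feature map groups; a real model would concatenate tensors along the channel dimension, and the allocation table below is a made-up example):

```python
def build_input(allocations, block_index):
    """Concatenate, in order, the groups allocated to `block_index` from
    every network block located before it in the sequence.
    allocations[i][j] is the group that block i shares with later block j."""
    parts = []
    for earlier in range(block_index):
        parts.extend(allocations.get(earlier, {}).get(block_index, []))
    return parts

allocations = {
    0: {1: ["a1"], 2: ["a2"], 3: ["a3"]},  # first block shares three groups
    1: {2: ["b1"], 3: ["b2"]},             # second block shares two groups
}
print(build_input(allocations, 2))  # ['a2', 'b1']
print(build_input(allocations, 3))  # ['a3', 'b2']
```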
The feature map group allocated to the second network block in the sequence from the first network block in the sequence is directly used as the input of the second network block in the sequence.
In another example, the neural network model may also include a fifth network block connected to and located after the last network block in the first sequence. In this example, the number of feature map groups into which the plurality of feature maps output by the first network block are divided is one more than the number of the at least one second network block. The neural network model is further used for concatenating, in the channel dimension, the plurality of feature maps output by the last network block with the feature map groups allocated to the fifth network block from the other network blocks, to obtain the input of the fifth network block, where the other network blocks refer to all network blocks in the first sequence except the last network block. The fifth network block may also consist of at least one neural network layer, such as a convolution layer. In one example, the fifth network block may convolve its input to obtain its output.
In general, in practical application, the feature extraction network in a neural network model is divided into a plurality of feature extraction stages (stages) according to feature map size; for example, the neural network model shown in fig. 6 includes five feature extraction stages, connected in sequence. The size of the feature maps output within each feature extraction stage is unchanged, and the size of the feature maps output in a later feature extraction stage is smaller than that of the feature maps output in an earlier one. The above sequence may be deployed in some of the plurality of feature extraction stages according to actual needs. The inventors have found that the passing of high-frequency information needs to be suppressed in the high-level stage among the plurality of feature extraction stages, so the above sequence is not used in the high-level stage, and is instead deployed in the feature extraction stages other than the high-level stage. Taking a neural network of five feature extraction stages as an example, the fifth of the five feature extraction stages is the high-level stage.
The inventors also found through research that the feature map size is large in the initial stages of the plurality of feature extraction stages, and adopting the multi-layer feature sharing mode there easily inflates the amount of computation; therefore, in the initial stages, convolution blocks are used directly to reduce the feature map size and no feature sharing is performed. Taking a neural network model of five feature extraction stages as an example, the first and second of the five feature extraction stages are both initial stages.
In an example, the neural network model comprises a first feature extraction stage and a second feature extraction stage positioned after the first feature extraction stage, wherein the size of a feature map output in each of the first feature extraction stage and the second feature extraction stage is unchanged, each feature extraction stage comprises one sequence, and the size of the feature map output in the second feature extraction stage is smaller than the size of the feature map output in the first feature extraction stage. The first feature extraction stage belongs to a low-level stage, and the second feature extraction stage belongs to a medium-level stage.
For a neural network model with five feature extraction stages, the first feature extraction stage may be the third feature extraction stage of the five feature extraction stages, and the second feature extraction stage may be the fourth feature extraction stage of the five feature extraction stages.
The inventors have found through research that, since the pass rate of high-frequency information needs to be increased in the low-level stage and the connections between network blocks need to be increased as much as possible there, using uniform division and connecting the blocks together in the low-level stage is the most efficient solution.
Therefore, when the sequence is located in the first feature extraction stage, the step of "dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups" is specifically: uniformly dividing the plurality of feature maps output by the first network block according to the group number to obtain the plurality of feature map groups, wherein the numbers of feature maps in the plurality of feature map groups are equal, and the plurality of feature map groups are allocated in order to the corresponding network blocks among the at least one second network block.
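Uniform division can be sketched as follows (illustrative; contiguous slicing is assumed, matching the hardware-friendly case noted earlier):

```python
def uniform_division(feature_maps, group_number):
    """Divide the feature maps contiguously into `group_number`
    equal-sized, non-overlapping groups."""
    size = len(feature_maps) // group_number
    return [feature_maps[i * size:(i + 1) * size]
            for i in range(group_number)]

print(uniform_division(list(range(6)), 3))  # [[0, 1], [2, 3], [4, 5]]
```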
It should be added that, in order to avoid loss of high-frequency signals across feature extraction stages, the feature maps output by the first network block in the first feature extraction stage need to be shared into the first-ranked network block of the second feature extraction stage. The number of the plurality of feature map groups is therefore one more than the number of the at least one second network block, so that one feature map group can be shared into the next feature extraction stage.
The inventors found through research that, in the middle-level stage, the neural network mainly extracts abstract concepts that contain a certain amount of high-frequency components but not as many as the low-level stage, so the pass rate of high-frequency information needs to be slightly reduced in the middle-level stage compared with the low-level feature extraction stage.
When the sequence is located in the second feature extraction stage, the step of "dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups" is specifically: performing attenuation division on the plurality of feature maps output by the first network block according to the group number to obtain the plurality of feature map groups, wherein the feature map numbers of a part of the feature map groups decrease in sequence. That is, for any two adjacently ordered second network blocks among the at least one second network block, the number of feature maps in the group allocated from the partial feature map groups to the earlier-ordered second network block is larger than the number of feature maps in the group allocated from the partial feature map groups to the later-ordered second network block.
It should be added that, in order to avoid loss of high-frequency signals across feature extraction stages, the feature maps output by the first network block in the second feature extraction stage need to be shared into the first-ranked network block of the next feature extraction stage after the second feature extraction stage. That is, the number of the plurality of feature map groups is one more than the number of the at least one second network block, so that the feature map group other than the partial feature map groups can be shared into the next feature extraction stage.
In order to follow the rule that the pass rate of high-frequency information needs to be reduced in the middle-level stage, the number of feature maps in the feature map group other than the partial feature map groups may be smaller than or equal to the number of feature maps in the smallest of the partial feature map groups.
The inventors found through research that using an exponential decay mode in the second feature extraction stage achieves a good balance between the amount of computation and the network prediction accuracy; that is, for the same amount of computation, using exponential decay in the second feature extraction stage achieves the best network prediction accuracy. Exponential decay means that the feature map numbers of the partial feature map groups decay in equal proportion, i.e., form a geometric progression.
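An attenuation division with exponential decay might look as follows (a sketch under assumptions: 32 input feature maps, a decay ratio of 1/2, and the leftover maps forming the extra group passed to the next stage, which here is no larger than the smallest partial group):

```python
def exponential_division(feature_maps, num_second_blocks, ratio=2):
    """Halving group sizes for the second network blocks, plus one
    remainder group shared into the next feature extraction stage."""
    total = len(feature_maps)
    sizes, size = [], total // ratio
    for _ in range(num_second_blocks):
        sizes.append(size)
        size //= ratio
    sizes.append(total - sum(sizes))  # the extra, carried-over group
    groups, start = [], 0
    for s in sizes:
        groups.append(feature_maps[start:start + s])
        start += s
    return groups

groups = exponential_division(list(range(32)), 3)
print([len(g) for g in groups])  # [16, 8, 4, 4]
```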
Further, the neural network model further comprises a third feature extraction stage positioned after the second feature extraction stage, the size of the feature map output in the third feature extraction stage is unchanged, the size of the feature map output in the third feature extraction stage is smaller than that of the feature map output in the second feature extraction stage, and the third feature extraction stage comprises at least one residual block which is sequentially connected.
The specific structure is shown in fig. 5: the residual block is formed by introducing a residual connection (residual link) on the basis of a scale-invariant convolution block, and the input and the convolution output are added to obtain the output of the whole structure.
The third feature extraction stage belongs to the high-level stage in the neural network model, i.e., the last feature extraction stage. At the high-level stage, the neural network has already extracted various high-level abstract concepts that need to be linearly separable in the vector space in which the output of the neural network lies; that is, the signals at this level should not contain complex high-frequency components. By not using feature sharing, high-frequency signals can be suppressed. In addition, the residual structure is used because this stage is at the final part of the neural network: without a residual structure, when gradient descent is performed, gradient explosion or gradient vanishing is likely to occur as the gradient propagates to the lowest layers according to the chain rule. The residual structure improves the effective propagation of the gradient to a certain extent, thereby improving the training speed and training accuracy of the model.
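The residual block of fig. 5 reduces to "input plus convolution output"; as a minimal element-wise sketch (with a stand-in `conv` function, since the real block's convolutions are not modelled here):

```python
def residual_block(x, conv):
    """Output of the residual block: input + conv(input), element-wise."""
    return [a + b for a, b in zip(x, conv(x))]

# Stand-in "convolution" that just doubles each element.
print(residual_block([1, 2, 3], lambda v: [2 * a for a in v]))  # [3, 6, 9]
```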
The technical scheme of the application will be described in detail with reference to the accompanying drawings:
Assume that the sequence is located in the first feature extraction stage of the neural network model and includes only three network blocks. The feature maps output by the first-ranked network block are called first-layer features, those output by the second-ranked network block are called second-layer features, and those output by the third-ranked network block are called third-layer features. In addition, the third network block is connected to the first network block of the second feature extraction stage. As shown in fig. 1b:
(1) The number n of network blocks ordered after the first network block is 2, and the group number is n+1, i.e., 3. The first-layer features are divided into three non-overlapping groups of features, which will be used by different subsequent network blocks.
(2) The second-ranked network block performs a convolution operation on the first group of the first-layer features to obtain the second-layer features.
(3) Since the number of network blocks ordered after the second network block is 1, the second-layer features output by the second network block are further divided into 2 non-overlapping groups, and these 2 groups of features are likewise used by different subsequent blocks.
(4) The second group of the first-layer features and the first group of the second-layer features are concatenated (concat) to serve as the input of the third-ranked network block, which performs a convolution operation to obtain the third-layer features.
Since no network blocks follow the third-ranked network block, the third-layer features are no longer split.
(5) The third group of the first-layer features and the second group of the second-layer features are concatenated (concat) to serve as the input of the first-ranked network block of the second feature extraction stage, which performs a convolution operation to obtain its features. These features can continue joint multi-layer feature sharing in the second feature extraction stage in the same manner.
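Steps (1)-(5) above can be traced with a toy bookkeeping sketch (labels stand in for feature maps, the "convolution" merely relabels its input, and the six-map layer size is an assumption for the example):

```python
def split_n(maps, n):
    """Divide contiguously into n equal, non-overlapping groups."""
    s = len(maps) // n
    return [maps[i * s:(i + 1) * s] for i in range(n)]

layer1 = ["L1-%d" % i for i in range(6)]   # output of block 1
g1, g2, g3 = split_n(layer1, 3)            # (1) group number n+1 = 3

layer2 = ["conv(%s)" % m for m in g1]      # (2) block 2 convolves group 1
h1, h2 = split_n(layer2, 2)                # (3) split into 2 groups

block3_input = g2 + h1                     # (4) concat for block 3
next_stage_input = g3 + h2                 # (5) carried to the next stage

print(block3_input)      # ['L1-2', 'L1-3', 'conv(L1-0)']
print(next_stage_input)  # ['L1-4', 'L1-5', 'conv(L1-1)']
```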
According to the embodiment of the application, a complete lightweight convolutional neural network is constructed according to the above method and structure. As shown in fig. 6, the network includes:
(1) In the first two feature extraction stages, the input is downsampled twice, using a 3x3 convolution layer and a 3x3 max pooling layer (max pooling) respectively, before entering the subsequent feature joint sharing.
(2) The low-level stage is a combination of one downsampling block (reduction block) and 3 convolution blocks (normal block), and adopts the joint multi-layer feature sharing mode. In this stage, the feature maps are divided uniformly, i.e., the numbers of feature maps shared by the current block with the following blocks are equal.
(3) The middle-level stage (mid-level stage) is a combination of one downsampling block (reduction block) and 7 convolution blocks (normal block), and adopts the joint multi-layer feature sharing mode. In this stage, the feature maps are divided non-uniformly: the numbers of feature maps shared by the current block with the following blocks are not equal, but attenuate sequentially in equal proportion.
(4) The high-level stage is a combination of one downsampling block and 3 residual blocks; unlike the low-level and middle-level stages, the joint multi-layer feature sharing mode is not adopted in this stage.
(5) In the prediction stage, at the end of the network, a 1x1 convolution layer and a global average pooling layer (global avg pooling) map the features into a one-dimensional vector, which then enters a fully connected layer (full connection) for final classification.
In summary, the joint multi-layer feature sharing mode ensures that each feature is used only once subsequently, effectively reduces the amount of computation and the memory occupation of the network, and is conducive to acceleration on mobile devices. Compared with the single feature sharing mode of ShuffleNet v2, this scheme adopts several different feature joint sharing modes within the overall network architecture, can fully utilize the features extracted at different stages of the network, and effectively improves accuracy without increasing the amount of network computation.
The lightweight convolutional neural network provided by the application has prediction accuracy superior to that of current mainstream networks under the same amount of computation, and can be conveniently deployed and run on mobile devices. The advantage is more pronounced at a smaller computation level (40M), where the accuracy is nearly 2% higher than the best ShuffleNet v2. In addition, the joint multi-layer feature sharing mode keeps the computation of each stage relatively dense, without excessive fragmented operations, and is therefore well suited to implementation and operation on general-purpose hardware.
The method for identifying data provided by the embodiment of the application is described below by way of example with reference to fig. 1a, in which a user may input an image to be classified on a client input interface. In response to the user's input operation, the client inputs the image to be classified into the trained neural network model and obtains the classification result, namely class X.
The neural network model comprises a sequence, wherein the sequence comprises three network blocks. The neural network model also comprises a network block connected to and located after the sequence. The feature map a, feature map b and feature map c output by the first network block are divided into three groups, each group containing one feature map. Feature map a serves as the input of the second network block in the sequence, feature map b serves as an input component of the third network block, and feature map c serves as an input component of the network block located after the sequence.
The second network block performs convolution processing on its input, namely feature map a, and outputs feature map 1 and feature map 2. The feature map 1 and feature map 2 output by the second network block are divided into two groups, each group containing one feature map. Feature map 2 serves as an input component of the third network block, and feature map 1 serves as an input component of the network block located after the sequence.
The third network block performs convolution processing on its input, namely the spliced features of feature map 2 and feature map b, and outputs feature map A as an input component of the network block located after the sequence.
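The routing of feature maps a, b, c, 1, 2 and A described above can be traced with a toy sketch. The `conv_block` stand-in below is an assumption for illustration only: it simply emits named outputs instead of applying real convolution layers, so only the sharing topology is shown.

```python
def conv_block(inputs, out_names):
    """Stand-in for a network block: consumes its inputs and emits named
    output feature maps (a real block would apply convolution layers)."""
    return list(out_names)

# Block 1 outputs feature maps a, b, c; each group holds one feature map.
b1_out = conv_block(["x"], ["a", "b", "c"])
g1, g2, g3 = [b1_out[0]], [b1_out[1]], [b1_out[2]]

# Block 2 consumes group 1 (feature map a) and outputs maps 1 and 2.
b2_out = conv_block(g1, ["1", "2"])

# Block 3 consumes the splice of feature map 2 and group 2 (feature map b).
b3_in = [b2_out[1]] + g2
b3_out = conv_block(b3_in, ["A"])

# The block after the sequence receives map 1, map A and group 3 (map c).
after_seq_in = [b2_out[0]] + b3_out + g3
print(after_seq_in)  # -> ['1', 'A', 'c']
```

The final splice shows how the block after the sequence jointly receives features from all three blocks of the sequence, which is the multi-layer feature joint sharing mode in miniature.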
According to the technical scheme provided by the embodiment of the application, the network structure is simple to design, the relation among the number of the feature images in each feature image group in the plurality of feature image groups can be conveniently designed according to actual needs, and the flexibility is improved. In this way, the relation among the feature map numbers of each feature map group in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the prediction accuracy.
It should be added that, in practical application, the neural network model of the prior art can also be deployed on a plurality of clients (for example, mobile clients) respectively. Although the calculation amount of the prior-art neural network model is large, a data identification task can be divided into a plurality of subtasks; for example, the plurality of images in a family shared album can be divided into a plurality of subtasks, and the subtasks can be distributed to the plurality of clients respectively, so that the clients can process their respective subtasks simultaneously using the neural network models deployed on them. This can greatly reduce the task processing time and improve the task processing speed.
Furthermore, a first neural network model (i.e., the improved neural network model provided by embodiments of the present application) and a second neural network model (i.e., the neural network model of the prior art) may be deployed simultaneously on a client (e.g., a mobile client) in advance. The client may determine whether to process a data recognition task with the first neural network model or the second neural network model according to the current load of the terminal. Specifically, the current load of the terminal is detected; when the current load of the terminal is larger than a preset threshold, the first neural network model is selected to process the data recognition task to obtain a recognition result, and when the current load of the terminal is smaller than or equal to the preset threshold, the second neural network model is selected to process the data recognition task to obtain the recognition result. For the specific implementation of processing the image to be processed with the first neural network model, reference may be made to the corresponding content in the embodiments, which is not repeated here; for the specific implementation with the second neural network model, reference may be made to the prior art, which is likewise not repeated here.
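The load-based model selection just described can be sketched as below. This is a minimal illustration under stated assumptions: the threshold value, the function names, and the lambda stand-ins for the two models are all hypothetical, not part of the application.

```python
HIGH_LOAD_THRESHOLD = 0.8  # hypothetical preset threshold on terminal load

def pick_model(current_load, light_model, heavy_model):
    """Select the lightweight first model when the terminal is busy;
    otherwise fall back to the heavier prior-art second model."""
    if current_load > HIGH_LOAD_THRESHOLD:
        return light_model   # first neural network model (this application)
    return heavy_model       # second neural network model (prior art)

def recognize(image, current_load):
    # Lambdas stand in for the two deployed models.
    model = pick_model(current_load,
                       light_model=lambda x: ("light", x),
                       heavy_model=lambda x: ("heavy", x))
    return model(image)

print(recognize("img.png", 0.9))  # -> ('light', 'img.png')
```

Under high load the cheaper first model keeps the client responsive, while under low load the second model can be afforded.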
Fig. 7 is a schematic flow chart of a method for constructing the above model according to another embodiment of the present application. As shown in fig. 7, the method includes:
201. A sequence is obtained.
202. A neural network model for data identification is constructed according to the sequence.
The sequence comprises a first network block and at least one second network block positioned behind the first network block, the constructed neural network model is used for acquiring the group number corresponding to the first network block, dividing a plurality of feature graphs output by the first network block according to the group number to obtain a plurality of feature graph groups, and distributing the feature graph groups to corresponding network blocks in the at least one second network block respectively to serve as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
In 201 above, a plurality of network blocks may be created, each network block being made up of at least one neural network layer. The plurality of network blocks are sequentially connected to obtain the sequence. A first downsampling block and a plurality of scale-invariant convolution blocks may be included in the sequence.
In 202 above, the sequence may be connected to other sequences or network blocks to obtain a neural network model for data identification.
The specific implementation of the steps performed by the neural network model may be referred to the corresponding content in the above embodiments, which is not described herein.
According to the technical scheme provided by the embodiment of the application, the plurality of feature images output by the first network block are divided according to the group number related to the number of at least one second network block positioned behind the first network block, so as to obtain a plurality of feature image groups. The plurality of feature map sets are assigned sequentially to at least one second network block located after the first network block. According to the technical scheme provided by the embodiment of the application, the network structure is simple to design, the relation among the number of the feature images in each of the plurality of feature image groups can be conveniently designed according to actual needs, namely, the relation among the number of the feature images shared by each second network block from the first network block is conveniently designed, and the flexibility is improved. In this way, the relation among the feature map numbers of each feature map group in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the prediction accuracy.
Further, the method further comprises the following steps:
203. Two sequences are constructed from a plurality of network blocks.
Correspondingly, the step 202 is to connect two sequences in turn to obtain a neural network model for data identification;
The two sequences are respectively positioned in a first feature extraction stage and a second feature extraction stage after the first feature extraction stage of the constructed neural network model, the size of a feature image output in each of the first feature extraction stage and the second feature extraction stage is unchanged, and the size of the feature image output in the second feature extraction stage is smaller than that of the feature image output in the first feature extraction stage.
Further, the method further comprises the following steps:
204. At least one residual block is constructed.
Correspondingly, the above-mentioned "connecting two said sequences in turn to obtain a neural network model for data identification", specifically, connecting two said sequences and said at least one residual block in turn to obtain a neural network model for data identification.
The at least one residual block is located in a third feature extraction stage of the neural network model, the third feature extraction stage is located after the second feature extraction stage, the size of the feature map output in the third feature extraction stage is unchanged, and this size is smaller than the size of the feature map output in the second feature extraction stage.
In practical application, the sequence in the second stage is connected with the at least one residual block in the third stage through a second downsampling block, and the second downsampling block is located in the third stage. The specific structure of the second downsampling block is shown in fig. 3, i.e. it is the same as that of the first downsampling block.
It should be noted that, for details of the method provided by the embodiment of the present application (including the steps performed by the neural network model) and of the neural network structure that are not described in full detail, reference may be made to the corresponding details in the foregoing embodiments, which are not repeated here.
Fig. 8 shows a flowchart of the training method of the model according to the embodiment of the present application. As shown in fig. 8, the method includes:
301. Input the sample data into a neural network model to obtain a recognition result.
302. Optimize the neural network model according to the recognition result and the expected recognition result of the sample data.
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block positioned behind the first network block, the neural network model is used for acquiring the group number corresponding to the first network block, dividing a plurality of feature graphs output by the first network block according to the group number to obtain a plurality of feature graph groups, and distributing the feature graph groups to corresponding network blocks in the at least one second network block respectively to serve as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block. The neural network model is used for data identification.
In the above 301, the sample data may be a sample image, a sample video, or sample audio. When the sample data is a sample video, video frames in the sample video can be input into the neural network model in the form of a sequence. When the sample data is sample audio, audio frames in the sample audio can be input into the neural network model in the form of a sequence.
The manner of sampling audio frames and the manner of converting audio frames into a time spectrum may be similar to the corresponding content in the above embodiments, and will not be described here.
The sample data is taken as input of the neural network model, and the recognition result is obtained by using the neural network model. The implementation of the neural network model in this embodiment may be similar to that described in the above embodiments, and will not be described in detail here.
The initial values of the network parameters in the neural network model may be random values.
In 302 above, the neural network model is used to identify the data to be processed after being trained.
The parameter optimization of the neural network model according to the expected recognition result corresponding to the sample data may be specifically implemented by using a loss function (loss function), where the loss function is used to measure the degree of inconsistency between the recognition result of the model and the expected recognition result, and is generally a non-negative real-valued function.
Alternatively, the loss function may be embodied as a cross entropy (Cross Entropy) loss.
Each time parameter optimization is performed on the neural network model, an adjustment coefficient for each model parameter in the neural network model is obtained, and each model parameter is numerically adjusted using its adjustment coefficient to obtain the updated model parameters of the neural network model.
The manner of parameter optimization by using the loss function is the same as that of the prior art, and redundant description is omitted here.
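The cross-entropy loss mentioned above can be made concrete with a small numeric sketch. This is a generic illustration of the standard formula, not the application's training code; the toy logits and the class ordering (cat, dog, background) are assumptions.

```python
import math

def softmax(logits):
    """Convert raw model outputs into class probabilities."""
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy loss for one sample: -log of the probability the
    model assigns to the expected class (the training label)."""
    return -math.log(probs[label])

# Toy logits for classes cat / dog / background; expected label is cat (0).
logits = [2.0, 0.5, 0.1]
probs = softmax(logits)
loss = cross_entropy(probs, label=0)
print(round(loss, 4))
```

The loss shrinks as the model assigns more probability to the expected class; the optimizer uses its gradient to derive the per-parameter adjustment coefficients described above.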
In one practical application, the expected recognition result can be a training label of data, for example, in an image classification scene, the training label can be a cat, a dog, a background and the like. The training samples for model training are the same as the prior art, and the difference is mainly that the neural network model provided by the embodiment of the application has different processing procedures on the training samples.
According to the technical scheme provided by the embodiment of the application, the plurality of feature images output by the first network block are divided according to the group number related to the number of at least one second network block positioned behind the first network block, so as to obtain a plurality of feature image groups. The plurality of feature map sets are assigned sequentially to at least one second network block located after the first network block. According to the technical scheme provided by the embodiment of the application, the network structure is simple to design, the relation among the number of the feature images in each of the plurality of feature image groups can be conveniently designed according to actual needs, namely, the relation among the number of the feature images shared by each second network block from the first network block is conveniently designed, and the flexibility is improved. In this way, the relation among the feature map numbers of each feature map group in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the prediction accuracy.
Further, the neural network model comprises a first feature extraction stage and a second feature extraction stage positioned after the first feature extraction stage, wherein the size of a feature image output in each of the first feature extraction stage and the second feature extraction stage is unchanged, each feature extraction stage comprises one sequence, and the size of the feature image output in the second feature extraction stage is smaller than that of the feature image output in the first feature extraction stage.
Further, the neural network model further comprises a third feature extraction stage positioned after the second feature extraction stage, the size of the feature map output in the third feature extraction stage is unchanged, the size of the feature map output in the third feature extraction stage is smaller than that of the feature map output in the second feature extraction stage, and the third feature extraction stage comprises at least one residual block which is sequentially connected.
It should be noted that, for details of the method provided by the embodiment of the present application (including the steps performed by the neural network model) and of the neural network structure that are not described in full detail, reference may be made to the corresponding details in the foregoing embodiments, which are not repeated here.
Yet another embodiment of the present application provides a neural network system. The system is used for identifying the data to be processed to obtain an identification result, and comprises a sequence and a splitting module, wherein the sequence comprises a first network block and at least one second network block positioned behind the first network block;
The splitting module is used for acquiring the group number corresponding to the first network block, dividing a plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the plurality of feature map groups respectively to corresponding network blocks in the at least one second network block to serve as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
Wherein the first network block and the second network block are each comprised of at least one neural network layer, which may comprise a convolutional layer.
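A network block composed of at least one neural network layer can be sketched as a simple callable pipeline. The class below is a hypothetical stand-in for illustration: the "layers" are plain functions, with a toy scaling function standing in for a convolution layer.

```python
class NetworkBlock:
    """Minimal sketch: a network block holds at least one neural-network
    layer, which may include a convolution layer (modeled as callables)."""
    def __init__(self, *layers):
        assert layers, "a network block has at least one layer"
        self.layers = layers

    def __call__(self, x):
        for layer in self.layers:
            x = layer(x)  # apply each layer in order
        return x

# Toy layers: a "convolution" that scales every value, then a ReLU.
conv = lambda x: [2 * v for v in x]
relu = lambda x: [max(0.0, v) for v in x]
block = NetworkBlock(conv, relu)
print(block([1.0, -1.0]))  # -> [2.0, 0.0]
```

A real first or second network block would replace the lambdas with trained convolution layers, but the composition pattern is the same.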
The specific implementation manner of the steps executed by the splitting module may refer to the corresponding content in the above embodiments, which is not described herein again.
According to the technical scheme provided by the embodiment of the application, the plurality of feature images output by the first network block are divided according to the group number related to the number of at least one second network block positioned behind the first network block, so as to obtain a plurality of feature image groups. The plurality of feature map sets are assigned sequentially to at least one second network block located after the first network block. According to the technical scheme provided by the embodiment of the application, the network structure is simple to design, the relation among the number of the feature images in each of the plurality of feature image groups can be conveniently designed according to actual needs, namely, the relation among the number of the feature images shared by each second network block from the first network block is conveniently designed, and the flexibility is improved. In this way, the relation among the feature map numbers of each feature map group in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the prediction accuracy.
Further, the neural network system comprises a first feature extraction stage and a second feature extraction stage positioned after the first feature extraction stage, wherein the size of a feature image output in each of the first feature extraction stage and the second feature extraction stage is unchanged, each feature extraction stage comprises one sequence, and the size of the feature image output in the second feature extraction stage is smaller than that of the feature image output in the first feature extraction stage.
Further, the neural network system further comprises a third feature extraction stage positioned after the second feature extraction stage, the size of the feature map output in the third feature extraction stage is unchanged, the size of the feature map output in the third feature extraction stage is smaller than that of the feature map output in the second feature extraction stage, and the third feature extraction stage comprises at least one residual block which is sequentially connected.
It should be noted that, in the system provided by the embodiment of the present application, the steps executed by the splitting module and the specific structure of each component may be referred to the corresponding content in the above embodiment, which is not described herein again.
Fig. 9 is a schematic flow chart of a feature extraction method according to another embodiment of the application. As shown in fig. 9, the method includes:
401. Obtain data to be processed.
402. Input the data to be processed into a trained neural network model, and extract features of the data to be processed.
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block positioned behind the first network block, the neural network model is used for acquiring the group number corresponding to the first network block, dividing a plurality of feature graphs output by the first network block according to the group number to obtain a plurality of feature graph groups, and distributing the feature graph groups to corresponding network blocks in the at least one second network block respectively to serve as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
In practical application, the data to be processed may be an image to be processed, a video to be processed, or an audio to be processed. When the data to be processed is the video to be processed, video frames in the video to be processed can be input into the trained neural network model in a sequence mode. When the data to be processed is the audio to be processed, the audio frames in the audio to be processed can be input into the trained neural network model in a sequence mode.
It should be added that in practical application, the audio to be processed may be sampled to obtain a plurality of audio frames, for example, the audio to be processed may be sampled at equal time intervals or at different time intervals to obtain a plurality of audio frames. Specific sampling techniques may be found in the prior art and will not be described in detail herein.
In addition, in an example, when the data to be processed is audio to be processed, before the audio to be processed is input into the trained neural network model, voice signal analysis may be performed on each frame of audio of the audio to be processed, so as to obtain a time spectrum corresponding to each frame of audio. The time spectrum corresponding to the multi-frame audio of the audio to be processed is input into the neural network model, so that the time spectrum is processed as a frame image by the neural network model.
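The framing and per-frame spectrum computation just described can be sketched with a naive discrete Fourier transform. This is a generic illustration of turning audio into a time spectrum, not the application's actual signal-analysis pipeline; the frame length, hop size, and toy waveform are assumptions.

```python
import cmath

def frame_audio(samples, frame_len, hop):
    """Split an audio signal into frames sampled at a fixed hop."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def frame_spectrum(frame):
    """Magnitude spectrum of one frame via a naive DFT; stacking these
    per-frame spectra yields the time spectrum treated as a frame image."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

samples = [0.0, 1.0, 0.0, -1.0] * 4        # toy 16-sample waveform
time_spectrum = [frame_spectrum(f) for f in frame_audio(samples, 4, 4)]
print(len(time_spectrum), len(time_spectrum[0]))  # -> 4 3
```

The resulting 2-D array (time along one axis, frequency along the other) is what the neural network model can then process as a frame image.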
According to the technical scheme provided by the embodiment of the application, the plurality of feature images output by the first network block are divided according to the group number related to the number of at least one second network block positioned behind the first network block, so as to obtain a plurality of feature image groups. The plurality of feature map sets are assigned sequentially to at least one second network block located after the first network block. According to the technical scheme provided by the embodiment of the application, the network structure is simple to design, the relation among the number of the feature images in each of the plurality of feature image groups can be conveniently designed according to actual needs, namely, the relation among the number of the feature images shared by each second network block from the first network block is conveniently designed, and the flexibility is improved. In this way, the relation among the feature map numbers of each feature map group in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the effectiveness of feature extraction.
The specific implementation of the steps performed by the neural network model may be referred to the corresponding content in the above embodiments, and will not be described herein.
Fig. 10 is a block diagram showing a data recognition apparatus according to an embodiment of the present application. As shown in fig. 10, the apparatus includes a first acquisition module 501 and a first input module 502, wherein:
A first obtaining module 501, configured to obtain data to be processed;
the first input module 502 is configured to input the data to be processed into a trained neural network model, and obtain a recognition result;
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block positioned behind the first network block, the neural network model is used for acquiring the group number corresponding to the first network block, dividing a plurality of feature graphs output by the first network block according to the group number to obtain a plurality of feature graph groups, and distributing the feature graph groups to corresponding network blocks in the at least one second network block respectively to serve as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
According to the technical scheme provided by the embodiment of the application, the plurality of feature images output by the first network block are divided according to the group number related to the number of at least one second network block positioned behind the first network block, so as to obtain a plurality of feature image groups. The plurality of feature map sets are assigned sequentially to at least one second network block located after the first network block. According to the technical scheme provided by the embodiment of the application, the network structure is simple to design, the relation among the number of the feature images in each of the plurality of feature image groups can be conveniently designed according to actual needs, namely, the relation among the number of the feature images shared by each second network block from the first network block is conveniently designed, and the flexibility is improved. In this way, the relation among the feature map numbers of each feature map group in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the prediction accuracy.
Further, the neural network model comprises a first feature extraction stage and a second feature extraction stage positioned after the first feature extraction stage, wherein the size of a feature image output in each of the first feature extraction stage and the second feature extraction stage is unchanged, each feature extraction stage comprises one sequence, and the size of the feature image output in the second feature extraction stage is smaller than that of the feature image output in the first feature extraction stage.
Further, when the sequence is located in the first feature extraction stage, the step of dividing the plurality of feature graphs output by the first network block according to the number of groups to obtain a plurality of feature graph groups includes:
According to the group number, uniformly dividing a plurality of feature images output by the first network block to obtain a plurality of feature image groups;
wherein the number of feature images in the plurality of feature image groups is equal.
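The uniform division used in the first feature extraction stage can be sketched as follows. The helper name `uniform_split` is an assumption for illustration; the point is only that every group receives the same number of feature maps.

```python
def uniform_split(feature_maps, num_groups):
    """Evenly divide feature maps into groups of equal size, as in the
    first feature extraction stage described above."""
    assert len(feature_maps) % num_groups == 0, "must divide evenly"
    size = len(feature_maps) // num_groups
    return [feature_maps[i * size:(i + 1) * size]
            for i in range(num_groups)]

groups = uniform_split(list(range(12)), 3)
print([len(g) for g in groups])  # -> [4, 4, 4]
```

In the second feature extraction stage the same interface would instead return groups of decreasing size, per the attenuated division described next.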
Further, when the sequence is in the second feature extraction stage, the neural network model is specifically configured to attenuate and divide, according to the number of groups, a plurality of feature graphs output by the first network block to obtain a plurality of first feature graph groups;
Wherein the number of feature images of part of the feature image groups is sequentially reduced;
Among any two second network blocks adjacent in order, the number of feature maps in the group allocated from the partial feature map groups to the earlier second network block is larger than the number of feature maps in the group allocated from the partial feature map groups to the later second network block.
Further, the numbers of feature maps of the partial feature map groups are sequentially attenuated in equal proportion.
Further, the neural network model also comprises a third feature extraction stage positioned after the second feature extraction stage, wherein the size of the feature map output in the third feature extraction stage is unchanged, and the size of the feature map output in the third feature extraction stage is smaller than that of the feature map output in the second feature extraction stage;
and at least one residual block which is sequentially connected is included in the third feature extraction stage.
Further, the feature images in the plurality of feature image groups are mutually disjoint, and the union of the plurality of feature image groups comprises the plurality of feature images.
It should be noted that, the data identifying device provided in the foregoing embodiment may implement the technical solutions and technical effects described in the foregoing corresponding method embodiments, and specific implementation and principles of each module or the neural network model may refer to corresponding contents in the foregoing corresponding method embodiments, which are not described herein again.
Fig. 11 is a block diagram showing the structure of a model construction apparatus according to still another embodiment of the present application. As shown in fig. 11, the apparatus includes a second acquisition module 601 and a first construction module 602, wherein:
A second obtaining module 601, configured to obtain a sequence, where the sequence includes a first network block and at least one second network block located after the first network block;
A first construction module 602, configured to construct a neural network model for data identification according to the sequence;
The built neural network model is used for acquiring the group number corresponding to the first network block, dividing a plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the plurality of feature map groups respectively to corresponding network blocks in the at least one second network block to serve as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
According to the technical scheme provided by the embodiment of the application, the plurality of feature images output by the first network block are divided according to the group number related to the number of at least one second network block positioned behind the first network block, so as to obtain a plurality of feature image groups. The plurality of feature map sets are assigned sequentially to at least one second network block located after the first network block. According to the technical scheme provided by the embodiment of the application, the network structure is simple to design, the relation among the number of the feature images in each of the plurality of feature image groups can be conveniently designed according to actual needs, namely, the relation among the number of the feature images shared by each second network block from the first network block is conveniently designed, and the flexibility is improved. In this way, the relation among the feature map numbers of each feature map group in the plurality of feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the prediction accuracy.
Further, the first constructing module 602 is further configured to construct two sequences according to a plurality of network blocks;
the first construction module 602 is specifically configured to sequentially connect two sequences to obtain a neural network model for data identification;
The two sequences are respectively located in a first feature extraction stage of the constructed neural network model and a second feature extraction stage after the first feature extraction stage. Within each of the first feature extraction stage and the second feature extraction stage, the size of the output feature maps is unchanged, and the size of the feature maps output in the second feature extraction stage is smaller than that of the feature maps output in the first feature extraction stage.
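The stage structure above can be sketched as follows. A stride-2 downsampling between consecutive stages is an assumed design choice for illustration; the embodiment only requires that a later stage output smaller feature maps while the size stays unchanged within each stage.

```python
def stage_output_sizes(input_size, num_stages):
    """Spatial size of the feature maps output by each feature extraction
    stage. Every network block within a stage keeps the feature-map size
    unchanged; a stride-2 downsampling between consecutive stages is
    assumed here, so each later stage outputs smaller feature maps."""
    sizes, size = [], input_size
    for _ in range(num_stages):
        sizes.append(size)  # all blocks in this stage output this size
        size //= 2          # assumed downsampling before the next stage
    return sizes


# Two stages: the second stage outputs smaller feature maps than the first.
print(stage_output_sizes(56, 2))  # → [56, 28]
```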
Further, the first construction module 602 is further configured to construct at least one residual block;
The first construction module 602 is specifically configured to connect the two sequences and the at least one residual block in sequence to obtain a neural network model for data identification;
The at least one residual block is located in a third feature extraction stage of the neural network model, wherein the third feature extraction stage is located after the second feature extraction stage, the size of the feature maps output within the third feature extraction stage is unchanged, and the size of the feature maps output in the third feature extraction stage is smaller than that of the feature maps output in the second feature extraction stage.
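For reference, a residual block as described in the background combines a convolution branch with a shortcut connection by element-wise addition. A minimal sketch follows; the convolution is replaced by an arbitrary same-length transform for brevity, and all names are hypothetical.

```python
def residual_block(unit_input, conv):
    """Minimal residual block: the unit input is passed through a
    convolution branch `conv`, and the convolution output is added
    element-wise to the shortcut copy of the input. `conv` is any
    function returning a sequence of the same length (a stand-in for a
    real convolution block, used here for brevity)."""
    conv_out = conv(unit_input)
    return [c + s for c, s in zip(conv_out, unit_input)]


# With an identity "convolution", the block simply doubles its input.
print(residual_block([1.0, 2.0, 3.0], lambda v: list(v)))  # → [2.0, 4.0, 6.0]
```

This per-element addition is exactly the extra operation that the prior-art basic unit avoids by splitting channels and concatenating instead.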
It should be noted that the model building device provided in the foregoing embodiments can implement the technical solutions and technical effects described in the foregoing corresponding method embodiments; for the specific implementation and principles of the foregoing modules or neural network models, reference may be made to the corresponding contents in those method embodiments, which are not described herein again.
Fig. 12 is a block diagram showing a structure of a model training apparatus according to still another embodiment of the present application. As shown in fig. 12, the apparatus includes:
A second input module 701, configured to input sample data into the neural network model to obtain a recognition result;
A first optimization module 702, configured to optimize the neural network model according to the recognition result and an expected recognition result of the sample data;
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block located after the first network block. The neural network model is used for acquiring the group number corresponding to the first network block, dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the feature map groups to corresponding network blocks in the at least one second network block as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
According to the technical solution provided by the embodiments of the present application, the plurality of feature maps output by the first network block are divided, according to a group number related to the number of the at least one second network block located after the first network block, to obtain a plurality of feature map groups, and the feature map groups are assigned in sequence to the at least one second network block. The network structure is thus simple to design, and the relation among the numbers of feature maps in the feature map groups, that is, the relation among the numbers of feature maps that each second network block receives from the first network block, can be conveniently designed according to actual needs, which improves flexibility. In this way, the relation among the numbers of feature maps in the feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the prediction accuracy.
Further, the neural network model comprises a first feature extraction stage and a second feature extraction stage located after the first feature extraction stage, wherein each feature extraction stage comprises one sequence. Within each of the first feature extraction stage and the second feature extraction stage, the size of the output feature maps is unchanged, and the size of the feature maps output in the second feature extraction stage is smaller than that of the feature maps output in the first feature extraction stage.
It should be noted that the model training device provided in the foregoing embodiments can implement the technical solutions and technical effects described in the foregoing corresponding method embodiments; for the specific implementation and principles of the foregoing modules or neural network models, reference may be made to the corresponding contents in those method embodiments, which are not described herein again.
Fig. 13 is a block diagram showing a configuration of a feature extraction apparatus according to still another embodiment of the present application. As shown in fig. 13, the apparatus includes a third acquisition module 801 and a third input module 802, wherein,
A third obtaining module 801, configured to obtain data to be processed;
A third input module 802, configured to input the data to be processed into a trained neural network model, and extract features of the data to be processed;
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block located after the first network block. The neural network model is used for acquiring the group number corresponding to the first network block, dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the feature map groups to corresponding network blocks in the at least one second network block as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
According to the technical solution provided by the embodiments of the present application, the plurality of feature maps output by the first network block are divided, according to a group number related to the number of the at least one second network block located after the first network block, to obtain a plurality of feature map groups, and the feature map groups are assigned in sequence to the at least one second network block. The network structure is thus simple to design, and the relation among the numbers of feature maps in the feature map groups, that is, the relation among the numbers of feature maps that each second network block receives from the first network block, can be conveniently designed according to actual needs, which improves flexibility. In this way, the relation among the numbers of feature maps in the feature map groups can be flexibly designed according to the characteristics of each feature extraction stage in the neural network model, so as to improve the effectiveness of feature extraction.
It should be noted that the feature extraction device provided in the foregoing embodiments can implement the technical solutions and technical effects described in the foregoing corresponding method embodiments; for the specific implementation and principles of the foregoing modules or neural network models, reference may be made to the corresponding contents in those method embodiments, which are not described herein again.
Fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 14, the electronic device includes a memory 1101 and a processor 1102. The memory 1101 may be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on an electronic device. The memory 1101 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The memory 1101 is configured to store a program;
The processor 1102 is coupled to the memory 1101 for executing the program stored in the memory 1101 for:
acquiring data to be processed;
Inputting the data to be processed into a trained neural network model to obtain a recognition result;
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block located after the first network block. The neural network model is used for acquiring the group number corresponding to the first network block, dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the feature map groups to corresponding network blocks in the at least one second network block as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
In addition, when executing the program in the memory 1101, the processor 1102 may implement other functions in addition to the above functions; for details, reference may be made to the descriptions in the foregoing embodiments.
Further, as shown in Fig. 14, the electronic device further includes a communication component 1103, a display 1104, a power supply component 1105, an audio component 1106, and other components. Only some of the components are schematically shown in Fig. 14, which does not mean that the electronic device comprises only the components shown in Fig. 14.
Accordingly, the embodiments of the present application also provide a computer-readable storage medium storing a computer program which, when executed by a computer, is capable of implementing the data identification method steps or functions provided in the above respective embodiments.
Fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 14, the electronic device includes a memory 1101 and a processor 1102. The memory 1101 may be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on an electronic device. The memory 1101 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The memory 1101 is configured to store a program;
The processor 1102 is coupled to the memory 1101 for executing the program stored in the memory 1101 for:
acquiring a sequence, wherein the sequence comprises a first network block and at least one second network block located after the first network block;
Constructing a neural network model for data identification according to the sequence;
The constructed neural network model is used for acquiring the group number corresponding to the first network block, dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the feature map groups to corresponding network blocks in the at least one second network block as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
In addition, when executing the program in the memory 1101, the processor 1102 may implement other functions in addition to the above functions; for details, reference may be made to the descriptions in the foregoing embodiments.
Further, as shown in Fig. 14, the electronic device further includes a communication component 1103, a display 1104, a power supply component 1105, an audio component 1106, and other components. Only some of the components are schematically shown in Fig. 14, which does not mean that the electronic device comprises only the components shown in Fig. 14.
Accordingly, the present application also provides a computer-readable storage medium storing a computer program, which when executed by a computer, is capable of implementing the steps or functions of the model building method provided in the foregoing corresponding embodiment.
Fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 14, the electronic device includes a memory 1101 and a processor 1102. The memory 1101 may be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on an electronic device. The memory 1101 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The memory 1101 is configured to store a program;
The processor 1102 is coupled to the memory 1101 for executing the program stored in the memory 1101 for:
inputting the sample data into a neural network model to obtain a recognition result;
optimizing the neural network model according to the recognition result and an expected recognition result of the sample data;
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block located after the first network block. The neural network model is used for acquiring the group number corresponding to the first network block, dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the feature map groups to corresponding network blocks in the at least one second network block as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
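The training procedure above, namely obtaining a recognition result for the sample data, comparing it with the expected recognition result, and optimizing the model accordingly, can be sketched with a toy one-parameter model. The linear model, squared-error loss, and gradient-descent update below are illustrative assumptions, not the method prescribed by the embodiment.

```python
def train_step(w, x, expected, lr=0.1):
    """One optimization step for a toy one-parameter model y = w * x:
    run the model on the sample data, compare the recognition result with
    the expected result, and update w to reduce the squared error."""
    result = w * x               # forward pass: recognition result
    error = result - expected    # deviation from the expected result
    grad = 2.0 * error * x       # gradient of the squared error w.r.t. w
    return w - lr * grad         # gradient-descent update


w = 0.0
for _ in range(100):
    w = train_step(w, x=1.0, expected=3.0)
print(round(w, 3))  # → 3.0 (the model converges toward the expected mapping)
```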
In addition, when executing the program in the memory 1101, the processor 1102 may implement other functions in addition to the above functions; for details, reference may be made to the descriptions in the foregoing embodiments.
Further, as shown in Fig. 14, the electronic device further includes a communication component 1103, a display 1104, a power supply component 1105, an audio component 1106, and other components. Only some of the components are schematically shown in Fig. 14, which does not mean that the electronic device comprises only the components shown in Fig. 14.
Accordingly, the present application also provides a computer readable storage medium storing a computer program, where the computer program is executed by a computer to implement the steps or functions of the model training method provided in the foregoing corresponding embodiments.
Fig. 14 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 14, the electronic device includes a memory 1101 and a processor 1102. The memory 1101 may be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on an electronic device. The memory 1101 may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The memory 1101 is configured to store a program;
The processor 1102 is coupled to the memory 1101 for executing the program stored in the memory 1101 for:
acquiring data to be processed;
inputting the data to be processed into a trained neural network model, and extracting features of the data to be processed;
The neural network model comprises a sequence, wherein the sequence comprises a first network block and at least one second network block located after the first network block. The neural network model is used for acquiring the group number corresponding to the first network block, dividing the plurality of feature maps output by the first network block according to the group number to obtain a plurality of feature map groups, and distributing the feature map groups to corresponding network blocks in the at least one second network block as input components of the corresponding network blocks, wherein the group number is related to the number of the at least one second network block.
In addition, when executing the program in the memory 1101, the processor 1102 may implement other functions in addition to the above functions; for details, reference may be made to the descriptions in the foregoing embodiments.
Further, as shown in Fig. 14, the electronic device further includes a communication component 1103, a display 1104, a power supply component 1105, an audio component 1106, and other components. Only some of the components are schematically shown in Fig. 14, which does not mean that the electronic device comprises only the components shown in Fig. 14.
Accordingly, the embodiments of the present application also provide a computer-readable storage medium storing a computer program which, when executed by a computer, is capable of implementing the feature extraction method steps or functions provided by the above-described respective embodiments.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiments. Those of ordinary skill in the art can understand and implement them without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or certainly by means of hardware. Based on this understanding, the foregoing technical solutions, or the part thereof contributing to the prior art, may essentially be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the respective embodiments or in some parts of the embodiments.
It should be noted that the above embodiments are merely intended to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the above embodiments, those skilled in the art should understand that the technical solutions described in the above embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.