Disclosure of Invention
The invention designs Draft2Code, a front-end code automatic generation algorithm based on hand-drawn webpage images and a deep learning algorithm. The present invention relates to the following two points:
(1) the present invention refers to the model training algorithm of the architecture design of fig. 1. Feature extraction is performed on the input hand-drawn webpage image by a convolutional neural network, the input source code is encoded by a gated recurrent unit (GRU), and finally the output of the convolutional neural network and the output of the gated recurrent unit are combined and trained to obtain a model;
(2) the present invention designs a code generation algorithm with reference to the architecture of fig. 2. The method comprises the steps of performing feature extraction on the input hand-drawn webpage image with a convolutional neural network, generating a DSL from the trained model, and generating front-end engineering code according to a mapping relation, thereby realizing the code generation algorithm.
(1) Designing the front-end code automatic generation algorithm Draft2Code based on hand-drawn webpage images
The algorithm architecture mainly comprises 3 parts:
1) a CNN-based computer vision model, used to extract features from the input page design image;
2) a GRU-based language model, whose function is to encode the source-code feature sequence;
3) a GRU-based decoder, which operates on the combination of the image features obtained in 1) and the encoding obtained in 2) and predicts the next feature in the sequence.
According to the overall architecture of the system, at each time t, the image I is first input into the CNN-based visual model, which outputs an encoded vector p; simultaneously the feature vector x_t is input into the first, GRU-based language model, which outputs an encoded vector q_t. The visual encoding vector p and the language encoding vector q_t are concatenated into a vector r_t and input into a GRU-based decoder, which decodes the representations previously learned by the visual model and the language model into the feature vector y_t and assigns it to x_{t+1} for use at the next time step. The system is summarized by the following formulas:
p = CNN(I)
q_t = GRU(x_t)
r_t = [p, q_t]
y_t = softmax(GRU'(r_t))
x_{t+1} = y_t
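One decoding step of the five formulas above can be sketched with stand-in encoders (a minimal illustration, not the trained model; the 128-dimensional codes and the vocabulary size of 17, counting 7 layout types, 8 element types, &lt;START&gt; and &lt;END&gt;, are assumptions for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_stub(image):
    # stand-in for the CNN visual encoder: maps the image to a fixed-length vector p
    return rng.standard_normal(128)

def gru_stub(x, dim):
    # stand-in for a GRU layer's output at one time step
    return rng.standard_normal(dim)

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

VOCAB = 17  # assumed: 7 layout types + 8 element types + <START> + <END>

image = rng.standard_normal((256, 256, 3))  # input image I
x_t = np.zeros(VOCAB); x_t[0] = 1.0         # current feature vector (one-hot token)

p = cnn_stub(image)                  # p = CNN(I)
q_t = gru_stub(x_t, 128)             # q_t = GRU(x_t)
r_t = np.concatenate([p, q_t])       # r_t = [p, q_t]
y_t = softmax(gru_stub(r_t, VOCAB))  # y_t = softmax(GRU'(r_t))
x_next = y_t                         # x_{t+1} = y_t
```

The concatenation makes r_t carry both what the page looks like and what has been emitted so far, which is what the decoder conditions on.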
(2) Establishing the visual model
The CNN is widely applied in the vision field. As a type of multilayer perceptron, it has strong generalization ability thanks to its local connectivity and weight sharing, and is well suited to recognizing and detecting objects and graphics. In the design of the visual model, the invention adopts a CNN, in an unsupervised manner, to convert the input image into a learned fixed-length vector as output.
The input image is resized to a 256 × 256 color picture; the activation functions are all ReLU (Rectified Linear Unit), and only valid convolutions are performed, without padding at the boundary. The number of convolution kernels is set to 16 in the first layer, 32 in the second, 64 in the third, and 128 in the last. After the four convolutional layers, the vector p is output for subsequent processing.
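A shape walk-through of the four-layer visual model follows; only the filter counts (16/32/64/128), the 256 × 256 input, and the use of valid convolutions are stated above, so the 3 × 3 kernel size used here is an assumption for illustration:

```python
# Valid (unpadded) convolutions shrink each spatial dimension by kernel - 1.
# Kernel size 3 is assumed; the text only fixes input size and filter counts.
def valid_conv_out(size, kernel):
    return size - kernel + 1

size, channels = 256, 3          # 256x256 color input
for filters in (16, 32, 64, 128):
    size = valid_conv_out(size, 3)
    channels = filters
# after four layers: 248x248 spatial extent with 128 channels (under these assumptions)
```

The final activation volume is then flattened (or pooled) into the fixed-length vector p used by the decoder.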
(3) Establishing language model
From the viewpoint of the target output content, whether generating a .vue file based on the Vue framework or a .jsx file based on the React framework, an HTML-based syntax is required to describe the page layout. The invention adopts a lighter-weight language, a DSL (Domain-Specific Language), to participate in training. Unlike HTML, which must accommodate a wide variety of layout requirements, with tags such as <div> for block-level elements and <span> for inline elements, the DSL used herein targets fixed layout forms only, so only the following 7 layout element types are defined to represent block-level elements in different application scenarios:
Row - layout elements placed in the horizontal direction
Stack - layout elements placed in the vertical direction
Header - a layout element placed at the top of the page, occupying the full page width, typically containing content such as the page title or navigation links
Footer - a layout element placed at the bottom of the page, occupying the full page width, typically containing navigation links or contact information
Single - a card layout element nearly filling an entire row of the page
Double - a card layout element of which one row can hold two
Quad - a card layout element of which one row can hold four
Similarly, for elements such as block-level text elements and buttons, 8 types are set as a complement:
Btn-active - button in the activated state
Btn-inactive - button in the inactive state
Btn-success - confirm-operation button
Btn-warning - warning-operation button
Btn-danger - dangerous-operation button
Big-title - main title
Small-title - subtitle
Text - body text
This greatly simplifies the language, reducing both the search space and the vocabulary.
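As a hypothetical illustration (the concrete DSL syntax is not spelled out in this section), a page combining these element types might be described as:

```
header {
  btn-active, btn-inactive
}
row {
  single {
    small-title, text, btn-success
  }
}
footer {
  btn-danger
}
```

A description like this uses only the 15 element names plus nesting braces, in contrast to the open-ended tag and attribute vocabulary of HTML.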
In most markup languages, each element is represented by an opening tag, and when child elements are nested inside it, a closing tag of the parent element must be set so that a parser can understand the hierarchical relationship. A parent element often contains multiple children, which raises the question of where the closing tag is placed; this translates into handling long-range dependencies. Conventional RNN architectures suffer from gradient explosion and gradient vanishing and cannot process long-sequence data, so the GRU, a gated variant related to the LSTM, consisting of 2 layers of GRU neural networks each containing 128 cells, is introduced herein to model such long-sequence data.
New memory in the GRU
The GRU learns memory information through recurrent connections. The vector x_t input at time t and the output vector q_{t-1} produced at the previous step are multiplied by weights and activated by the sigmoid function, i.e., according to the formulas

z_t = σ(W_z · [q_{t-1}, x_t])
r_t = σ(W_r · [q_{t-1}, x_t])

two gate values are obtained: the update gate weight z_t and the reset gate weight r_t, where σ is the sigmoid activation function. After q_{t-1} is multiplied by the weights and by the reset gate r_t, the new (candidate) memory is obtained according to the formula

q̃_t = tanh(W · [r_t * q_{t-1}, x_t])

where W_z is the weight matrix from the hidden layer to the update gate, W_r is the weight matrix from the hidden layer to the reset gate, W is the weight matrix from the hidden layer to the candidate state, and * denotes element-wise multiplication. Finally, the output vector q_t of the current step is obtained according to the formula

q_t = (1 - z_t) * q_{t-1} + z_t * q̃_t

For each time t, the vector q_t is output for subsequent processing.
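The gate and memory updates above can be sketched as a single GRU step in numpy (a minimal illustration with toy dimensions and random weights; the model itself uses 128 cells per layer, and bias terms are omitted as in the formulas):

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def gru_step(q_prev, x_t, Wz, Wr, W):
    """One GRU step following the update-gate / reset-gate formulas above.
    Each weight matrix maps the concatenation [q_prev, x_t] to the hidden size."""
    h = np.concatenate([q_prev, x_t])
    z_t = sigmoid(Wz @ h)                                        # update gate
    r_t = sigmoid(Wr @ h)                                        # reset gate
    q_tilde = np.tanh(W @ np.concatenate([r_t * q_prev, x_t]))   # candidate memory
    q_t = (1 - z_t) * q_prev + z_t * q_tilde                     # step output
    return q_t

# toy dimensions for the sketch (hidden size 4, input size 3)
hid, inp = 4, 3
rng = np.random.default_rng(1)
Wz = rng.standard_normal((hid, hid + inp))
Wr = rng.standard_normal((hid, hid + inp))
W = rng.standard_normal((hid, hid + inp))
q = gru_step(np.zeros(hid), rng.standard_normal(inp), Wz, Wr, W)
```

Because the candidate memory passes through tanh, each component of the step output stays within [-1, 1] when the previous state is zero.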
(4) Building a decoder
The visual encoding vector p and the language encoding vector q_t at time t are concatenated into the vector r_t and input into a second, GRU-based decoder model; this model consists of 2 layers of GRU neural networks each containing 512 cells and is used to decode the representations learned by the visual model and the language model.
Training stage:
The model is trained using a supervised learning approach. To better balance long-term dependencies against computational cost, a sliding window of length 48 is used to segment each DSL input file used for training, obtaining feature sequences. At each time step, a hand-drawn image I and the corresponding feature sequence x_t are input, and the predicted next feature y_t is output. The model uses the cross-entropy cost as its loss function, which compares the model's predicted next feature y_t with the actual next feature x_{t+1}.
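The sliding-window segmentation can be sketched as follows (token values are placeholder integers, and the null-token padding scheme for short contexts is an assumption of the sketch):

```python
# Segment a token sequence into (context, next-token) training pairs using a
# sliding window of length 48, as described above. Contexts shorter than the
# window are left-padded with a null token (0), mirroring empty-vector slots.
def make_windows(tokens, T=48):
    pairs = []
    for i in range(len(tokens) - 1):
        ctx = tokens[max(0, i - T + 1): i + 1]
        ctx = [0] * (T - len(ctx)) + ctx
        pairs.append((ctx, tokens[i + 1]))
    return pairs

pairs = make_windows(list(range(1, 61)))  # a 60-token toy sequence
```

Each pair supplies one supervised example: the model sees the 48-token context (plus the image) and is scored on predicting the next token.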
Since the context for training is updated through the sliding window at each time step, the same input image I is reused for the samples associated with the same page style. Finally, two markers are defined, <START> and <END>, used as placeholder marks for the DSL file's prefix and suffix respectively, to be replaced by the concrete prefix and suffix content in the subsequent compilation process.
Training is performed by computing, with backpropagation, the partial derivatives of the loss with respect to the network weights so as to minimize the multi-class log loss; the loss is computed as follows:

L(I, X) = - Σ_t x_{t+1} · log(y_t)
In the above formula, x_{t+1} is the input vector at the next time step and y_t is the output vector at the current time step. Training uses the RMSProp (Root Mean Square Propagation) algorithm with the learning rate set to 1 × 10^-4, and the output gradient is clipped to the range [-1.0, 1.0] to counter numerical instability. To prevent overfitting of the model, Dropout regularization is introduced: a dropout rate of 0.3 is set after the fully connected layer of the visual model, i.e., 30% of that layer's neurons are randomly dropped in each training pass, so that the model depends less on particular local features and generalizes better. Training uses mini-batches of 64 image-sequence groups per batch. After training, a relation model between the image data and the associated feature sequences expressed in DSL code is established.
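The multi-class log loss described above, comparing each predicted distribution y_t with the one-hot actual next feature x_{t+1}, can be sketched numerically (toy vocabulary of 3 tokens over two time steps):

```python
import numpy as np

# Cross-entropy summed over time: -sum_t x_{t+1} . log(y_t)
def draft2code_loss(y_pred, x_next):
    """y_pred: (T, V) rows of softmax outputs; x_next: (T, V) one-hot targets."""
    return -np.sum(x_next * np.log(y_pred))

y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])
x_next = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
loss = draft2code_loss(y_pred, x_next)  # -(log 0.7 + log 0.8)
```

The loss is minimized when each y_t puts probability mass on the token that actually follows, which is exactly the training signal described above.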
Testing stage:
To generate the DSL code, a hand-drawn webpage image I and a context sequence X of 48 features are input into the Draft2Code model described above. x_1 ... x_{T-1} are initialized to empty (null) vectors and the last element x_T is set to <START>. The predicted feature vector y_t is then used to update the next context feature sequence: x_t ... x_{T-1} are set to x_{t+1} ... x_T respectively, and x_T is set to y_t. This process repeats until the model generates the marker <END>. Finally, the generated DSL feature sequence is compiled into the required target language using conventional compilation methods. The whole process is shown in fig. 2.
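The testing-stage sampling loop can be sketched as follows (the stand-in model, which emits a fixed token script, is hypothetical; in the real system each token is the argmax of the predicted y_t):

```python
# Greedy generation loop: keep a context window of T = 48 features,
# slide it left each step, and stop when <END> is produced.
def generate(model, image, T=48, end_token="<END>", max_len=200):
    context = [None] * (T - 1) + ["<START>"]   # x_1..x_{T-1} null, x_T = <START>
    out = []
    while len(out) < max_len:
        tok = model(image, context)            # predicted next feature/token
        if tok == end_token:
            break
        out.append(tok)
        context = context[1:] + [tok]          # shift window, append prediction
    return out

# hypothetical stand-in model that emits a short DSL-like sequence then <END>
script = iter(["header", "{", "btn-active", "}", "<END>"])
tokens = generate(lambda img, ctx: next(script), image=None)
```

The returned token sequence is what the conventional compilation step then turns into target-language code.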
Aiming at the code specification requirements of different frameworks, the invention writes several mapping relations between the DSL and front-end code and stores them in json-format files, whose content replaces the generated DSL so as to meet development requirements. For all replacement content, three replacement markers are proposed. Braces ({}) replace child-element content; for example, if a <div> element contains a <button> element, the button element replaces the braces inside the div element. Square brackets ([]) replace randomly generated text; the characters in the hand-drawn draft are not analyzed, so some text can be randomly generated and placed into elements, such as titles, whose main content is text. Parentheses (()) replace attributes in the element tag, mainly event bindings; for example, for click-event binding in Vue, the @click attribute content in the tag is replaced: the invention counts the number of buttons, generates empty method functions in sequence as placeholders, and binds them in turn to the attribute of each button.
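A minimal sketch of this mapping-based replacement follows; the mapping entries and the generated handler name are hypothetical (the real mappings are stored per framework in json files and are not reproduced in this section):

```python
# Hypothetical DSL -> front-end-code mapping. Per the three markers above:
# "{}" is the child-content slot, "[]" a random-text slot, "()" an attribute slot.
MAPPING = {
    "header":     '<div class="header">{}</div>',
    "big-title":  '<h1>[]</h1>',
    "btn-active": '<button ()>[]</button>',
}

def render(name, children=""):
    tpl = MAPPING[name]
    tpl = tpl.replace("{}", children)
    tpl = tpl.replace("[]", "lorem")         # random-text stub (fixed for the demo)
    tpl = tpl.replace("()", '@click="fn0"')  # sequentially generated empty handler
    return tpl

html = render("header", render("btn-active"))
```

Nesting is handled by rendering children first and substituting the result into the parent's {} slot, mirroring the div/button example above.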
By converting a hand-drawn webpage design image into engineered front-end code, the requirement of automatic webpage code generation can be met in front-end projects where no webpage screenshot or professional design drawing is available for reference; at the same time, modular, standard code conforming to the Vue and React frameworks can be output, which facilitates secondary development by engineers and markedly improves working efficiency. The BLEU (bilingual evaluation understudy) score of the Draft2Code model system reaches 7.7 points, and the webpage corresponding to the hand-drawn draft can be generated relatively accurately.
The core technology of this patent includes:
(1) A DSL with simplified HTML-like grammar is introduced to optimize the training process, reducing to a certain extent the amount of data required for training;
(2) A front-end code automatic generation algorithm (Draft2Code) for hand-drawn webpage images is constructed, which can accurately output code conforming to front-end engineering practice and facilitates further development.