6) judging whether the expression start character $ exists in the current w:t; if so, marking the position of that w:t within the w:p paragraph; if not, repeating step 6) on the next w:t to search for the start mark;

7) continuing by judging whether the expression end character } exists; if so, recording the position of that w:t within the w:p paragraph; if not, repeating step 7) on the next w:t to search for the end mark;

8) recording the starting position and the ending position, collecting the text node information from the starting w:t text node to the ending w:t text node, and acquiring the text content from the starting node to the ending node; if not all w:t nodes have been analyzed, returning to step 5) to continue searching for expressions;

9) traversing the information collected for the paragraph in which the ${ } expressions are located, acquiring all w:t nodes of an expression, and acquiring its starting w:t node;

10) traversing all w:t nodes of the expression and splicing the text contents of all of them;

11) clearing the text contents of the w:t nodes after the starting w:t node; because the ending w:t node may also hold the start mark of the next expression, clearing in that node only the text content before the end mark; and writing all of the content spliced in step 10) into the starting w:t node;

12) constructing a new empty string, searching for the position of the expression start string, splicing onto the new string the characters up to that position, searching for the position of the expression end string, and extracting the variable name; if the whole string has been searched, returning to step 9) to continue with the analysis of the next expression;

13) obtaining, from the parameter map, the value corresponding to the variable name extracted in step 12), and splicing that value onto the string content;

14) judging whether a start mark ${ exists in the text content after the expression handled in step 12); if so, returning to step 12) and continuing to splice and analyze expressions until all expressions have been analyzed;

15) completing the traversal and analysis of all w:p nodes, generating a new document.xml file, and overwriting the word/document.xml file in the template document with it, thereby completing expression replacement for the document.
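
Steps 6) to 11) can be illustrated with a minimal sketch, assuming Python's standard xml.etree.ElementTree module. The function name merge_split_expressions and the sample paragraph are hypothetical and do not come from the invention; the sketch also makes the simplifying assumption that any $ not yet followed by } in the same w:t is the start of a split expression.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml)
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def merge_split_expressions(paragraph):
    """Move each expression that is split over several w:t nodes
    into the w:t node where it starts (hypothetical helper)."""
    texts = list(paragraph.iter(f"{{{W}}}t"))
    i = 0
    while i < len(texts):
        head = texts[i].text or ""
        open_at = head.rfind("$")
        # Step 6: no start mark here, or the expression already closes here
        if open_at == -1 or "}" in head[open_at:]:
            i += 1
            continue
        # Step 7: look for the end mark in the following w:t nodes
        j = i + 1
        while j < len(texts) and "}" not in (texts[j].text or ""):
            j += 1
        if j == len(texts):
            break  # no end mark in this paragraph; leave the rest untouched
        tail = texts[j].text or ""
        close_at = tail.index("}") + 1
        # Steps 8-10: splice the text from the start node to the end mark
        middle = "".join(t.text or "" for t in texts[i + 1:j])
        # Step 11: write the spliced text into the start node, clear the rest
        texts[i].text = head + middle + tail[:close_at]
        for t in texts[i + 1:j]:
            t.text = ""
        # The end node keeps what follows the end mark, because the next
        # expression may start there
        texts[j].text = tail[close_at:]
        i = j

# Invented sample: two expressions, each split across runs
SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    '<w:r><w:t>Dear $</w:t></w:r>'
    '<w:r><w:t>{na</w:t></w:r>'
    '<w:r><w:t>me}, see $</w:t></w:r>'
    '<w:r><w:t>{date}.</w:t></w:r>'
    '</w:p>'
)

p = ET.fromstring(SAMPLE)
merge_split_expressions(p)
merged = [t.text for t in p.iter(f"{{{W}}}t")]
```

After the call, every expression sits whole inside a single w:t node, so the later replacement can work on one node at a time while the runs and their formatting properties stay in place.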

The invention has the following advantages. Only a template document needs to be prepared; a program subsequently replaces the expressions with data, which makes the method suitable for efficiently generating Word documents in batches. Because a docx document is based on XML and ZIP technologies, the method parses the XML structure, extracts the expressions, and performs text replacement directly, which avoids disturbing the document structure and styles, improves the efficiency of generating formatted documents, and produces output that conforms to the Word document standard.

Drawings

FIG. 1 is a flow chart of the steps of the present invention;

FIG. 2 is a diagram showing the structure of a docx file in the present invention.

Detailed Description

In existing business systems, producing a Word document generally depends on an office suite. Because many business personnel write documents, the format of a generated document may differ from the expected format, and when a large number of documents is needed, problems such as low efficiency, errors, and rework arise. To solve these problems, business personnel write a format document containing variable expressions and store it in the business system; the business system obtains the dynamic data required by the template document and calls the invention to generate the document. The specific scheme is described in detail below.

First, a template manager writes the format document. For the variable content in the document, the writer uses the form ${variable name} as the replacement syntax mark, which completes the template document.
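
As an illustration of how the ${variable name} syntax can be resolved against a parameter map, the following is a minimal sketch in Python of the scan-and-splice approach. The function name substitute and the sample values are hypothetical, and leaving unknown variable names unchanged is an assumption made for this sketch, not something the invention specifies.

```python
def substitute(text, params):
    """Replace each ${name} in text with params[name], scanning left to right."""
    out = []   # pieces of the rebuilt string
    pos = 0    # where the next search starts
    while True:
        start = text.find("${", pos)
        end = text.find("}", start + 2) if start != -1 else -1
        if start == -1 or end == -1:
            # No further complete expression: keep the remaining text as is
            out.append(text[pos:])
            break
        # Keep the literal text that precedes the expression
        out.append(text[pos:start])
        # Extract the variable name between the start and end marks
        name = text[start + 2:end].strip()
        if name in params:
            out.append(str(params[name]))
        else:
            # Assumption: an unknown name is left in place unchanged
            out.append(text[start:end + 1])
        pos = end + 1
    return "".join(out)

result = substitute("Dear ${name}, total ${ amount }.", {"name": "Li", "amount": 42})
```

Scanning with plain string searches keeps the sketch free of any dependency beyond the language itself; a regular expression would be an equally valid choice.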

The invention is further illustrated below with reference to the figures and an implementation:

1) decompressing the docx document, extracting the word/document.xml file, and parsing document.xml to obtain an xml object;

2) in the xml object, preparing to obtain the variable expressions in each paragraph; all text content of the Word document is located in w:p nodes (the position of ${name} is determined by the following analysis):

3) traversing all w:r/w:t child nodes of each w:p paragraph node, acquiring the text content of each w:t, and splicing the text to obtain the content of the paragraph;

4) by regular expressions

judging whether the paragraph content contains an expression; if not, continuing with the analysis of the next paragraph;

6) judging whether the expression start character $ exists in the current w:t; if so, marking the position of that w:t within the w:p paragraph; if not, repeating step 6) on the next w:t to search for the start mark;

7) continuing by judging whether the expression end character } exists; if so, recording the position of that w:t within the w:p paragraph; if not, repeating step 7) on the next w:t to search for the end mark;

8) recording the starting position and the ending position, and collecting the text node information from the starting w:t text node to the ending w:t text node, so that the text content from the starting node to the ending node can be obtained; if not all w:t nodes have been analyzed, returning to step 5) to continue searching for expressions;

9) traversing the expression information collected for the paragraph, acquiring all w:t nodes of an expression, and acquiring its starting w:t node;

10) traversing all w:t nodes of the expression and splicing the text contents of all of them;

11) clearing the text contents of the w:t nodes other than the starting w:t node (the ending w:t node may also contain the start mark of the next expression; for such a shared node, only the text content before the end mark is cleared), and writing all of the content spliced in step 10) into the starting w:t node;

12) constructing a new empty string, searching for the position of the expression start string, splicing onto the new string the characters up to that position, searching for the position of the expression end string, and extracting the variable name; if the whole string has been searched, returning to step 9) to continue with the analysis of the next expression;

13) obtaining, from the parameter map, the value corresponding to the extracted variable name, and splicing that value onto the string content;

14) judging whether an expression start mark exists in the text content after the expression; if so, returning to step 12) and continuing to splice and analyze expressions until all expressions have been analyzed;

15) completing the traversal and analysis of all w:p nodes, generating a new document.xml file, and overwriting the word/document.xml file in the compressed template document with it, thereby completing expression replacement and document generation.
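
Steps 1) and 15), which unpack the docx container and write the modified word/document.xml back, can be sketched with Python's standard zipfile module. This is a minimal sketch under stated assumptions: the function name fill_template and its transform callback are hypothetical, and the callback stands in for the expression analysis of steps 2) to 14).

```python
import zipfile

def fill_template(template, output, transform):
    """Copy a docx container entry by entry, passing word/document.xml
    through transform (bytes in, bytes out). Hypothetical helper;
    template and output may be file paths or binary file objects."""
    with zipfile.ZipFile(template) as src, \
         zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                # Only the main document part is rewritten
                data = transform(data)
            # Every other part (styles, media, relationships) is copied as is
            dst.writestr(item, data)
```

A caller would pass the template path, the path of the document to generate, and a function that performs the replacement on the XML bytes. Writing to a new archive rather than modifying the template in place is a design choice of this sketch that leaves the template reusable for batch generation.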

Because a docx file is produced by compression, generating the Word document depends on neither the operating system environment nor third-party middleware. The method is therefore operating-system independent and can be deployed to generate Word documents in any environment.

The open source software POI (http://poi.apache.org/) supports parsing and editing Word documents and could, in theory, provide the same function; the scheme of the invention was adopted in consideration of execution efficiency, software dependencies, the clear structure of docx documents, and other factors.

Claims

1. A method for generating Word documents from a template, characterized by comprising the following steps:

4) by regular expressions