Disclosure of Invention
In view of the above, the present invention is directed to a method for automatically constructing test data based on multiple database types, so as to improve the execution efficiency of data construction.
To achieve the above purpose, the technical solution of the invention is realized as follows:
A method for automatically constructing test data based on multiple database types comprises the following steps:
S1, inputting configuration information;
S2, a configuration analysis module reads the configuration information;
S3, a scheduling module reads the database type and calls the corresponding code;
S4, connecting to the database and judging whether the database table exists;
A1. if so, skipping step S5;
A2. if not, executing the following steps in sequence;
S5, calling a database operation module to create the database table;
S6, an execution module reads the data volume and the thread number, and evenly distributes the data volume to be handled by each thread;
S7, a data generation module reads the configured data rules to generate data, and combines the generated data of all table fields into one record to be stored;
S8, the database operation module performs the insertion operation, inserting the combined record into the target database table;
S9, judging whether the thread-number loop and the data-volume loop are finished;
B1. if so, executing the following steps in sequence;
B2. if not, jumping to step S7;
S10, outputting a result.
Further, the configuration information in step S1 can be input in two ways:
C1. inputting it directly through a visual interface;
C2. uploading a yaml configuration file containing the configuration information, wherein the configuration information comprises database type configuration, database connection information configuration, data volume configuration, database table structure configuration, table field type configuration, data generation rule configuration and execution thread number configuration.
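As an illustration of the seven configuration items listed above, the following minimal sketch checks that a parsed configuration contains all of them. The key names (`db_type`, `connection`, and so on) and the sample values are hypothetical; the source names the configuration items but does not specify how they are spelled in the yaml file.

```python
# Hypothetical key names for the seven configuration items named in the text.
REQUIRED_KEYS = {
    "db_type",           # database type configuration
    "connection",        # database connection information
    "data_volume",       # total number of records to construct
    "table_structure",   # database table structure
    "field_types",       # table field types
    "generation_rules",  # data generation rules
    "thread_count",      # number of execution threads
}


def missing_keys(config):
    """Return the required configuration items that are absent, sorted by name."""
    return sorted(REQUIRED_KEYS - set(config))


# A sample configuration as it might look after the yaml file has been parsed.
example_config = {
    "db_type": "mysql",
    "connection": {"host": "127.0.0.1", "port": 3306, "user": "test"},
    "data_volume": 1000,
    "table_structure": {"name": "demo"},
    "field_types": {"id": "int", "name": "string"},
    "generation_rules": {"id": {"start": 1, "end": 1000, "step": 1}},
    "thread_count": 4,
}

print(missing_keys(example_config))           # nothing missing
print(missing_keys({"db_type": "mysql"}))     # six items missing
```

A check of this kind could run for both input routes, since the visual interface and the uploaded file carry the same items.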
Further, the method by which the configuration analysis module reads the configuration information in step S2 includes:
the configuration analysis module parses the configuration file, structures the database-related information and stores it in variables so that it can be called directly later; and, as required, generates SQL statements according to the database type and the table structure information for later use when database operations are executed.
Further, the method by which the scheduling module reads the database type and calls the corresponding code in step S3 includes:
the scheduling module reads the database type, calls the corresponding database execution code, and executes the database connection operation.
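One possible way to realize this dispatch is a registry that maps the configured database type to a class wrapping that database's operations. The sketch below is only an illustration: the class names, the `connect` method, and the returned strings are hypothetical stand-ins, and no real database driver is called.

```python
class MysqlOperations:
    """Stand-in for the mysql execution code (no real connection is made)."""

    def connect(self, connection_info):
        return f"mysql://{connection_info['host']}"


class MongoOperations:
    """Stand-in for the mongo execution code."""

    def connect(self, connection_info):
        return f"mongo://{connection_info['host']}"


class HbaseOperations:
    """Stand-in for the hbase execution code."""

    def connect(self, connection_info):
        return f"hbase://{connection_info['host']}"


# The scheduling step: database type -> class holding the matching code.
REGISTRY = {
    "mysql": MysqlOperations,
    "mongo": MongoOperations,
    "hbase": HbaseOperations,
}


def get_operations(db_type):
    """Return the operations object for db_type, or raise ValueError."""
    try:
        return REGISTRY[db_type.lower()]()
    except KeyError:
        raise ValueError(f"unsupported database type: {db_type}") from None


ops = get_operations("MySQL")
print(ops.connect({"host": "127.0.0.1"}))
```

With this layout, supporting a further database type would only require a new class and one more registry entry.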
Further, the method by which the execution module reads the data volume and the thread number and evenly distributes the data volume of each thread in step S6 includes:
the execution module reads the data volume and the thread number, and calculates from them the amount of data each thread must store, so as to achieve balanced concurrent execution.
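The calculation described here can be sketched as an integer division. How a remainder is handled when the data volume is not an exact multiple of the thread number is not stated in the source; the sketch below assumes the leftover records are given one each to the first threads.

```python
def split_volume(total, threads):
    """Return the number of records each thread should store.

    Assumption: when total is not divisible by threads, the first
    (total % threads) threads each take one extra record.
    """
    if total < 0 or threads < 1:
        raise ValueError("total must be >= 0 and threads must be >= 1")
    base, extra = divmod(total, threads)
    return [base + (1 if i < extra else 0) for i in range(threads)]


print(split_volume(1_000_000, 10))  # ten equal shares
print(split_volume(10, 3))          # uneven case, remainder spread out
```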
Further, the method of step S7 includes:
the scheduling module calls the data generation module corresponding to the data type of each table field in the configuration file, the data generation module generates data according to the data generation rule, and once the data of every field has been generated it is combined into one piece of test data to be inserted.
Further, outputting the result in step S10 includes: the system outputs the total amount of data successfully stored and the execution duration.
Compared with the prior art, the invention has the following advantages:
(1) The invention improves the usability of the tool: configuration information is input through a visual interface, and information such as the database type and the data field types is validated; after execution is finished, the system returns the total amount of data stored and the storage duration to the user for review, which improves the interactivity of the tool.
(2) The functions of the data construction tool are further enriched: the data construction functions of multiple database types are integrated; in addition to the common data types, types supported by non-relational databases, such as arrays and Json, are added; and flexible data generation rules are provided.
(3) To control the execution efficiency of data construction effectively, the user can set the thread number according to the data volume, and the system evenly distributes the data volume of each thread according to the thread number and the total data volume.
Detailed Description
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "up", "down", "front", "back", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used only for convenience in describing the present invention and for simplicity in description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and thus, are not to be construed as limiting the present invention. Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," etc. may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted" and "connected" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; a mechanical or an electrical connection; a direct connection, an indirect connection through intervening media, or an internal connection between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art according to the specific situation.
The present invention will be described in detail below with reference to the embodiments with reference to the attached drawings.
As shown in FIG. 1, the method comprises the following steps:
1) Configuration information is input in one of two ways: (1) directly through a visual interface; (2) by uploading a .yaml configuration file containing the configuration information. The configuration information comprises database type configuration, database connection information configuration, data volume configuration, database table structure configuration, table field type configuration, data generation rule configuration and execution thread number configuration;
2) the configuration analysis module parses the configuration file, structures the database-related information and stores it in variables so that it can be called directly later; as required, it generates SQL statements according to the database type and the table structure information for later use when database operations are executed;
3) the scheduling module reads the database type, calls the corresponding database execution code, and executes the database connection operation;
4) after the database is successfully connected, the system automatically judges whether the target database table exists. If not, the table-creation SQL of the database operation module is executed automatically; if the table already exists, the next step is executed directly;
5) the execution module reads the data volume and the thread number, and calculates from them the amount of data each thread must store, so as to achieve balanced concurrent execution;
6) the scheduling module calls the data generation module corresponding to the data type of each table field in the configuration file, the data generation module generates data according to the data generation rule, and once the data of every field has been generated it is combined into one piece of test data to be inserted;
7) the execution module calls the data insertion code via the scheduling module and inserts the data combined in step 6 into the target database table;
8) a data-volume loop is embedded in each thread, and step 6 and step 7 are executed in the loop until the thread-number loop and the data-volume loop are finished;
9) after all threads have finished, the system outputs the total amount of data successfully stored and the execution duration.
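Steps 4, 7 and 9 can be illustrated with a small single-threaded sketch. It uses Python's built-in `sqlite3` module purely as a stand-in so that the example runs without a server; SQLite is not one of the database types named in the source, and the table name `demo`, its two columns, and the function names are hypothetical.

```python
import sqlite3
import time


def table_exists(conn, table):
    """Step 4: check the SQLite catalog for a table with this name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    return row is not None


def run(conn, table, rows):
    """Create the table if needed, insert rows, return (count, seconds)."""
    start = time.perf_counter()
    # The table name is interpolated directly, so it must come from
    # trusted configuration rather than from arbitrary user input.
    if not table_exists(conn, table):
        conn.execute(f"CREATE TABLE {table} (id INTEGER, name TEXT)")
    inserted = 0
    for row in rows:
        # Step 7: insert one combined record.
        conn.execute(f"INSERT INTO {table} (id, name) VALUES (?, ?)", row)
        inserted += 1
    conn.commit()
    return inserted, time.perf_counter() - start


conn = sqlite3.connect(":memory:")
total, seconds = run(conn, "demo", [(i, f"user{i}") for i in range(5)])
# Step 9: report the amount of data stored and the execution duration.
print(f"stored {total} records in {seconds:.4f} s")
```

Calling `run` a second time on the same connection skips table creation and only inserts, which mirrors the "if the table already exists, the next step is executed directly" branch of step 4.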
The modules comprise:
A configuration analysis module: the configuration file follows the yaml format, and its contents differ slightly between database types because their internal storage structures differ. The configuration analysis module comprises a plurality of classes (a configuration file processing class and an SQL statement generation class for each database type), which process the configuration files of the different database types respectively, and has the following main functions:
1. Structuring the configuration information: extracting the database connection information, the database table name and the database field information (type/length/rule), and storing them in a dictionary as key/value pairs.
The configuration file supports the following rules:
1. Numerical data (integer, decimal): corresponding data is generated according to the start and end values (start, end) and the step length (step); if no step length is given, a random value within the range is generated.
2. Time data: corresponding data is generated according to the start and end values (start, end) and the step length (s/min/hour/day/month/year); if no step length is given, a random value within the range is generated.
3. String data: a character string whose length lies in the range 0 to length is generated according to the input length.
4. Array data: corresponding array data is generated according to the array type (array_type), the data length (num) and the corresponding rule (see rules 1, 2 and 3).
5. Boolean data: true/false is generated randomly.
6. Json data: data of random types is generated according to the number of Json levels (num) and written into a template.
2. Generating SQL statements: generating a create SQL statement (which checks whether the database table exists and creates it only if it does not) and an insert SQL statement according to the database table information in the configuration file.
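For a relational target such as mysql, the SQL generation function could look like the sketch below. The function names and the shape of the `fields` dictionary (column name mapped to an SQL type string) are assumptions made for illustration; the `%s` placeholders follow the parameter style used by common MySQL drivers for Python, and the names are interpolated as-is, so they are assumed to come from trusted configuration.

```python
def build_create_sql(table, fields):
    """Build a statement that creates the table only if it does not exist."""
    columns = ", ".join(f"{name} {sql_type}" for name, sql_type in fields.items())
    return f"CREATE TABLE IF NOT EXISTS {table} ({columns})"


def build_insert_sql(table, fields):
    """Build a parameterized insert statement with one placeholder per field."""
    names = ", ".join(fields)
    placeholders = ", ".join(["%s"] * len(fields))
    return f"INSERT INTO {table} ({names}) VALUES ({placeholders})"


# Hypothetical table information as it might be taken from the configuration.
fields = {"id": "INT", "name": "VARCHAR(20)"}
print(build_create_sql("demo", fields))
print(build_insert_sql("demo", fields))
```

Generating both statements once, at parse time, matches the description that the SQL is prepared in advance for later use when database operations are executed.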
A data generation module: data generation is organized by data type; the module obtains the type of each field from the configuration file and generates data according to the corresponding rule.
The rules supported by the configuration file (which basically cover the common data types) are:
1. Numerical data (integer, decimal): corresponding data is generated according to the start and end values (start, end) and the step length (step); if no step length is given, a random value within the range is generated.
2. Time data: corresponding data is generated according to the start and end values (start, end) and the step length (s/min/hour/day/month/year); if no step length is given, a random value within the range is generated.
3. String data: a character string whose length lies in the range 0 to length is generated according to the input length.
4. Array data: corresponding array data is generated according to the array type (array_type), the data length (num) and the corresponding rule (see rules 1, 2 and 3).
5. Boolean data: true/false is generated randomly.
6. Json data: data of random types is generated according to the number of Json levels (num) and written into a template.
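A minimal sketch of rules 1, 3, 4 and 5, together with the combination of all field values into one record, is given below; time and Json data are omitted for brevity. The rule keys `start`, `end`, `step`, `length`, `array_type` and `num` come from the text, while the `type` key, the type names, the function names, and the choice to restart a stepped sequence at `start` once it would pass `end` are assumptions.

```python
import random
import string


def gen_number(rule, index):
    """Rule 1: stepped value if a step is given, otherwise a random value."""
    start, end, step = rule["start"], rule["end"], rule.get("step")
    if step is None:
        if isinstance(start, int) and isinstance(end, int):
            return random.randint(start, end)
        return random.uniform(start, end)
    # Assumption: the sequence wraps back to start after reaching end.
    span = int((end - start) // step) + 1
    return start + (index % span) * step


def gen_string(rule, index):
    """Rule 3: random letters, with a length between 0 and rule['length']."""
    length = random.randint(0, rule["length"])
    return "".join(random.choices(string.ascii_letters, k=length))


def gen_boolean(rule, index):
    """Rule 5: random True/False."""
    return random.choice([True, False])


def gen_array(rule, index):
    """Rule 4: num elements, each generated by the rule for array_type."""
    item_rule = dict(rule, type=rule["array_type"])
    return [generate(item_rule, index + i) for i in range(rule["num"])]


GENERATORS = {
    "int": gen_number,
    "decimal": gen_number,
    "string": gen_string,
    "bool": gen_boolean,
    "array": gen_array,
}


def generate(rule, index):
    """Dispatch to the generator that matches the field's data type."""
    return GENERATORS[rule["type"]](rule, index)


def build_record(fields, index):
    """Generate every field and combine the values into one record."""
    return {name: generate(rule, index) for name, rule in fields.items()}


fields = {
    "id": {"type": "int", "start": 1, "end": 1000, "step": 1},
    "name": {"type": "string", "length": 8},
    "active": {"type": "bool"},
    "scores": {"type": "array", "array_type": "int", "num": 3, "start": 0, "end": 9},
}
print(build_record(fields, 0))
```

Passing the record index into every generator lets stepped fields such as `id` produce non-repeating values, while fields without a step stay random.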
A database execution module: the module encapsulates the different database types and currently comprises data operation classes for database types such as mysql, mongo and hbase. It mainly provides the database connection operation, the table creation operation (the table is created only if it does not already exist), the data insertion operation, the data information query operation and the data volume query operation.
An execution module:
1. The amount of data each thread must store is allocated according to the total data volume and the thread number, so as to achieve balanced concurrent execution. Example: for 1,000,000 (100w) records and 10 threads, the code automatically creates 10 concurrent threads, where the first thread handles records 1 to 100,000, the second thread handles records 100,001 to 200,000, and so on, so that 1,000,000 non-repeated records are imported.
2. During data import, the corresponding database execution module is scheduled according to the database type entered by the user, and the following are executed in turn: connecting to the database, creating the database table, and inserting data concurrently with 10 threads, each inserting 100,000 (10w) records (the data insertion operation); after each thread finishes, the current data volume of the database is printed in real time.
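The range allocation in the example above can be sketched with Python's `concurrent.futures.ThreadPoolExecutor`. The function names are hypothetical, and `fake_insert` is an in-memory stand-in for the database insertion operation, so the sketch shows only the partitioning and the per-thread loop; a smaller volume is used in the demonstration run to keep it fast.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def thread_ranges(total, threads):
    """Return inclusive (first, last) record ranges, one per thread."""
    base, extra = divmod(total, threads)
    ranges, start = [], 1
    for i in range(threads):
        count = base + (1 if i < extra else 0)
        ranges.append((start, start + count - 1))
        start += count
    return ranges


def run_concurrently(total, threads, insert):
    """Run one worker per range and return the total number of inserts."""

    def worker(bounds):
        first, last = bounds
        done = 0
        # The data-volume loop embedded in each thread.
        for record_id in range(first, last + 1):
            insert(record_id)
            done += 1
        return done

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return sum(pool.map(worker, thread_ranges(total, threads)))


stored = set()
lock = threading.Lock()


def fake_insert(record_id):
    """In-memory stand-in for the database insertion operation."""
    with lock:
        stored.add(record_id)


print(thread_ranges(1_000_000, 10)[:2])
inserted = run_concurrently(1000, 10, fake_insert)
print(inserted, len(stored))
```

Because the ranges do not overlap, each record identifier is produced by exactly one thread, which is what makes the imported data non-repeated.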
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.