CN113887189A - Information identification method and device applied to automatic data query - Google Patents

Information identification method and device applied to automatic data query

Publication number
CN113887189A
Authority
CN
China
Prior art keywords
row
information
column
node
nodes
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110919097.6A
Other languages
Chinese (zh)
Inventor
刘强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Hanya Software Technology Co ltd
Original Assignee
Wuxi Hanya Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuxi Hanya Software Technology Co ltd
Priority to CN202110919097.6A
Publication of CN113887189A

Abstract

The application discloses an information identification method and device applied to automatic data query, belonging to the field of report information processing. For an automatic data query scenario, the method automatically identifies, from the received typesetting information of a target report, the row header area and column header area of the target report and the corresponding row node system and column node system. Spatial analysis and sorting then yield identification information comprising the row header information, the column header information and coordinate information of the data to be filled, where the coordinate information of the data to be filled provides the coordinate position of the data to be filled after the automatic data query. The method addresses the difficulty of collecting information before automatic data query (for example, before automatically writing SQL or other database query statements). The automatic data query scenario replaces the related-art practice in which processing personnel manually write SQL or other database query statements to fill in report data, greatly reduces the cost of manual information collection and processing, and is a necessary premise for subsequent automatic processing of the target report.

Description

Information identification method and device applied to automatic data query
Technical Field
The embodiment of the application relates to the field of report information processing, in particular to an information identification method and an information identification device applied to automatic data query.
Background
With the advent of the big data era, data processing and analysis are becoming increasingly intelligent. In many data processing applications (such as banking, auditing, and finance and taxation), report processing involving data information is skilled technical work: besides being familiar with the report rules, the processing person must also master the related report processing software.
In the related art, widely used report processing software mainly follows this operation flow: data source connection, data set setup, and report template design. Some software provides guided prompts for the data source connection, but data set setup requires the processing person to be familiar with database query languages (including but not limited to SQL, NoSQL, Access, MDX, and DAX) and to know the functions and formulas of the reporting tool. In addition, report template design requires skills such as interface editing and report classification and design.
Therefore, big data processing of reports currently still relies mainly on manual operation, with software tools playing only an auxiliary role. The problem to be solved by the present application is how to further improve the data processing capability of report tools and the efficiency of manual data processing.
Disclosure of Invention
The embodiments of the present application provide an information identification method and device applied to automatic data query, which can solve the problem in the related art that big data processing with report tools is complex; the method is suitable for an automatic data query scenario. The technical solution is as follows:
in one aspect, an information identification method applied to automatic data query is provided, the method including:
receiving typesetting information of a target report, wherein the typesetting information comprises at least one of row header information, column header information and slash header information of the target report, the receiving format of the typesetting information consists of four-dimensional coordinates and header names, and the row header information and the column header information are used for providing necessary input information during automatic data query;
identifying a row header area and a column header area corresponding to the target report according to the typesetting information, wherein the four-dimensional coordinates of the row header area correspond to the row header information, and the four-dimensional coordinates of the column header area correspond to the column header information;
identifying a row node system according to the four-dimensional coordinates of the row header area, and identifying a column node system according to the four-dimensional coordinates of the column header area, wherein each node system comprises a root node, child nodes and end nodes, and the root node is determined according to the slash header information;
performing a permutation and combination operation on all end nodes of the row node system and all end nodes of the column node system to obtain a coordinate crossing result;
and sorting the coordinate crossing result to obtain at least one group of identification information, wherein the at least one group of identification information comprises the row header information, the column header information and coordinate information of data to be filled, and the coordinate information of the data to be filled is used for providing the coordinate position of the data to be filled on the target report after automatic data query.
In another aspect, an information recognition apparatus applied to an automatic data query is provided, the apparatus including:
the information receiving module is used for receiving typesetting information of a target report, wherein the typesetting information comprises at least one of row header information and column header information of the target report, the receiving format of the typesetting information consists of four-dimensional coordinates and header names, and the row header information and the column header information are used for providing necessary input information during automatic data query;
the area identification module is used for identifying a row header area and a column header area corresponding to the target report according to the typesetting information, wherein the four-dimensional coordinates of the row header area correspond to the row header information, and the four-dimensional coordinates of the column header area correspond to the column header information;
the system identification module is used for identifying a row node system according to the four-dimensional coordinates of the row header area and identifying a column node system according to the four-dimensional coordinates of the column header area, wherein each node system comprises a root node, child nodes and end nodes;
the coordinate operation module is used for performing a permutation and combination operation on all end nodes of the row node system and all end nodes of the column node system to obtain a coordinate crossing result;
and the information integration module is used for sorting the coordinate crossing result to obtain at least one group of identification information, wherein the at least one group of identification information comprises the row header information, the column header information and coordinate information of data to be filled, and the coordinate information of the data to be filled is used for providing the coordinate position of the data to be filled on the target report after automatic data query.
In another aspect, a computer-readable storage medium is provided, wherein the storage medium stores at least one instruction for execution by a processor to implement the information identification method applied to automatic data query as described in the above aspect.
The invention has the following beneficial effects:
In the embodiments of the present application, an information identification method applied to automatic data query is provided for an automatic data query scenario. A row header area and a column header area corresponding to a target report are automatically identified according to the received typesetting information of the target report, and a row node system and a column node system are further identified according to the header information. At least one group of identification information is then obtained by analyzing spatial positions; the identification information comprises the row header information, the column header information and coordinate information of the data to be filled, and the coordinate information of the data to be filled provides the coordinate position of the data to be filled on the target report after the automatic data query. This overcomes the difficulty of collecting information before automatic data query (for example, before automatically writing SQL or other database query statements). The automatic data query scenario replaces the related-art method in which a processing person manually writes SQL or other database query statements to fill in report data, greatly reduces the cost of manual information collection and processing, and is a necessary premise for subsequent automatic processing of the target report.
Drawings
FIG. 1 illustrates a flow chart of an information identification method applied to an automatic data query, according to an exemplary embodiment of the present application;
FIG. 2 illustrates a layout diagram of a target report according to an exemplary embodiment of the present application;
FIG. 3 illustrates a schematic diagram of a row node architecture, shown in an exemplary embodiment of the present application;
FIG. 4 illustrates a schematic diagram of a column node architecture, shown in an exemplary embodiment of the present application;
FIG. 5 illustrates a flow chart of an information identification method applied to an automatic data query, according to another exemplary embodiment of the present application;
FIG. 6 is a schematic diagram illustrating slash header information as valid information according to an exemplary embodiment of the present application;
FIG. 7 is a diagram illustrating an interface after a target report is populated according to an illustrative embodiment of the present application;
FIG. 8 shows a structural block diagram of an information identification apparatus applied to an automatic data query according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Reference herein to "a plurality" means two or more. "And/or" describes the association relationship of the associated objects and means that three relationships are possible; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship.
For convenience of understanding, terms referred to in the embodiments of the present application are explained below.
Automatic data query: in the related art, most report processing software depends on the processing person's familiarity with SQL statements to interact with the report's database. The present method is suited to an automatic data query scenario, in which the processing person no longer needs to master SQL statements, and the data of the report is determined and subsequently retrieved automatically through automatic data query. To realize this scenario, the present invention provides an information identification method applied to automatic data query, which involves the automatic identification of information.
Target report: in the present application, a report refers to a table with a table header, a table body and table notes, such as a financial report, and the target report is the report currently being processed. It should be noted that the target report only needs to have header information: the header information is analyzed to obtain the input information for the automatic data query, which realizes the automatic data query scenario, and the coordinate position of the data to be filled in the target report is obtained from operations on the spatial coordinates.
Four-dimensional coordinates: the coordinate format of the information. The present application is described using the coordinate structure (row start, row end, column start, column end) as an example, where the column coordinate range is the range between the column start and the column end, and the row coordinate range is the range between the row start and the row end; the present application is not limited to this structure.
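As a minimal sketch of this coordinate structure, the four values can be held in a small Python record that exposes the row coordinate range and the column coordinate range. The class name `FourCoord` and its field names are illustrative assumptions and do not appear in the application.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FourCoord:
    """Hypothetical container for (row start, row end, column start, column end)."""
    row_start: int
    row_end: int
    col_start: int
    col_end: int

    def __post_init__(self) -> None:
        # A range must not end before it starts.
        if self.row_start > self.row_end or self.col_start > self.col_end:
            raise ValueError("start coordinate must not exceed end coordinate")

    @property
    def row_range(self) -> Tuple[int, int]:
        """Range between the row start and the row end."""
        return (self.row_start, self.row_end)

    @property
    def col_range(self) -> Tuple[int, int]:
        """Range between the column start and the column end."""
        return (self.col_start, self.col_end)


cell = FourCoord(1, 2, 1, 4)
print(cell.row_range, cell.col_range)
```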
Example 1
Referring to fig. 1, a flowchart of an information identification method applied to an automatic data query according to an exemplary embodiment of the present application is shown, where the method includes:
Step 101, receiving the typesetting information of the target report.
In practice, the information identification method applied to automatic data query provided by the embodiment of the present application is described taking a computer program as the carrier. FIG. 2 shows a layout diagram of a target report. The interface 200 is a third-party software interface displayed by a terminal device; when a user working in the third-party software provides, for example, the target report 210, the carrier computer program of the present application carries out the information identification method. Optionally, the interface 200 may instead be the interface of the computer program itself when it runs in an embedded manner, which is not limited in this embodiment of the present application.
The typesetting information of the target report is generated automatically by an interface identification module and is then received.
Optionally, the typesetting information includes at least one of row header information and column header information of the target report, where the receiving format of the typesetting information consists of four-dimensional coordinates and a header name, and the row header information and the column header information are used to provide necessary input information for automatic data query.
As shown in FIG. 2, the row header includes the cells containing the header names "region", "product line", "Jingjin", "Shanghai", "South China", "product 1", "product 2", "total 1", "Beijing", "Tianjin", "total 2", "Shenzhen" and "Guangzhou", and the column header includes the cells containing the header names "time", "salesman", "2020", "2021", "group leader", "first name", "second name", "average", "number of people", "amount to be paid", "month 1", "month 2", "total 3", "month 4" and "total 4". The numbering of the months and products is only schematic; the header names to be expanded can be determined by the program according to the actual data read. For example, "month 1" can be expanded according to the months actually involved into month 1, month 2, and so on, where the program performs the expansion automatically, according to the actual data read, over the sub-level concepts under the current header concept. The embodiment of the present application is not limited thereto.
Correspondingly, the receiving format of the row header information and the column header information may further include additional information. In one example, as shown in FIG. 2, for the row header information "Jingjin", the four-dimensional coordinates are (5,5,5,7), the header name is "Jingjin", and there is no additional information (if additional information were added, it could read, for example, "sales/1000"). In another example, as shown in FIG. 2, for the column header information "group leader", the four-dimensional coordinates are (15,16,3,4), the header name is "group leader", and there is no additional information (if additional information were added, it could read, for example, "person name").
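The receiving format described above (four-dimensional coordinates, header name, optional additional information) could be represented as follows. This is only a sketch: the record type `HeaderInfo` and its field names are hypothetical, while the two coordinate tuples and header names are the examples given in the text.

```python
from typing import NamedTuple, Optional, Tuple


class HeaderInfo(NamedTuple):
    """Hypothetical record for one piece of header information."""
    coords: Tuple[int, int, int, int]  # (row start, row end, column start, column end)
    name: str                          # header name
    extra: Optional[str] = None        # additional information, absent in both examples


# The two examples from the text.
row_header_example = HeaderInfo((5, 5, 5, 7), "Jingjin")
column_header_example = HeaderInfo((15, 16, 3, 4), "group leader")

for item in (row_header_example, column_header_example):
    row_start, row_end, col_start, col_end = item.coords
    print(f"{item.name}: rows {row_start}-{row_end}, "
          f"columns {col_start}-{col_end}, extra={item.extra}")
```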
Step 102, identifying a row header area and a column header area corresponding to the target report according to the typesetting information.
As shown in FIG. 2, after step 101 the computer automatically identifies the row header area and the column header area corresponding to the target report according to the typesetting information. Because the header names have hierarchical and inclusion relationships, the row header and the column header are not limited to one or several header names of the same level; each spans a region, namely the row header area or the column header area.
The four-dimensional coordinates of the row header area correspond to the row header information, and the four-dimensional coordinates of the column header area correspond to the column header information.
Tables 1 and 2 show, in table form, the information of the row header area and the column header area. As is clear from Tables 1 and 2, the four-dimensional coordinates of the row header area and the column header area are provided by the four-dimensional coordinates in the row header information and the column header information, respectively.
TABLE 1 (image BDA0003206791180000061 in the original publication; not reproduced)
TABLE 2 (image BDA0003206791180000071 in the original publication; not reproduced)
Step 103, identifying a row node system according to the four-dimensional coordinates of the row header area, and identifying a column node system according to the four-dimensional coordinates of the column header area.
As mentioned in step 102, the header names have hierarchical and inclusion relationships, so the row header and the column header are not limited to one or several header names of the same level but span the row header area and the column header area. In step 103, in view of these hierarchical and inclusion relationships, the row node system is identified according to the four-dimensional coordinates of the row header area, and the column node system is identified according to the four-dimensional coordinates of the column header area.
FIG. 3 is a schematic structural diagram of the row node system provided in an embodiment of the present application, and FIG. 4 is a schematic structural diagram of the column node system provided in an embodiment of the present application. The two node systems correspond to Table 1 and Table 2, respectively.
Step 104, performing a permutation and combination operation on all end nodes of the row node system and all end nodes of the column node system to obtain a coordinate crossing result.
In one example, as shown in FIG. 3, the end nodes of the row node system are "total 1", "Beijing", "Tianjin", "Shanghai", "total 2", "Shenzhen", "Guangzhou", "product 1" and "product 2"; as shown in FIG. 4, the end nodes of the column node system are "month 1", "month 2", "total 3", "month 4", "total 4", "group leader", "first name", "second name", "average", "number of people" and "amount to be paid".
Further, according to the information of the row header area and the column header area shown in Tables 1 and 2, the four-dimensional coordinates of each header name in the above example are obtained and subjected to the permutation and combination operation, so that the coordinate crossing result can be obtained.
As shown in FIG. 2, performing the permutation and combination operation on "total 1" and "month 1" yields cell information with four-dimensional coordinates (7,7,5,5); the remaining combinations are handled in the same way to obtain the coordinate crossing result.
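One way to read this crossing is that the column range of a row end node is combined with the row range of a column end node. The sketch below illustrates that reading; the function name `cross` is hypothetical, and the input coordinates of "total 1" and "month 1" are assumptions chosen for illustration, since only the result (7,7,5,5) is stated in the text.

```python
from typing import Tuple

Coord = Tuple[int, int, int, int]  # (row start, row end, column start, column end)


def cross(row_end_node: Coord, column_end_node: Coord) -> Coord:
    """Combine the column range of a row end node with the row range of a column end node."""
    _, _, col_start, col_end = row_end_node
    row_start, row_end, _, _ = column_end_node
    return (row_start, row_end, col_start, col_end)


# Assumed coordinates: the text gives only the crossing result, not these inputs.
total_1 = (6, 6, 5, 5)  # row end node "total 1", assumed to occupy column 5
month_1 = (7, 7, 4, 4)  # column end node "month 1", assumed to occupy row 7

print(cross(total_1, month_1))
```

Under these assumed inputs the function returns (7, 7, 5, 5), which matches the cell coordinates given in the example.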
Step 105, sorting the coordinate crossing result to obtain at least one group of identification information.
The at least one group of identification information comprises row header information, column header information and coordinate information of data to be filled, wherein the coordinate information of the data to be filled is used for providing the coordinate position of the data to be filled on the target report after automatic data query. In step 104, the cell area obtained by the crossing is the area to be filled with data.
It should be noted that two necessary factors make up one report query: the query against the target data source, and the position of the query result. The first necessary factor is realized by the rows and columns that participate in the crossing in step 104, that is, the target data source is pointed to according to the row header information and the column header information. Step 105 realizes the second necessary factor, namely determining the coordinate position of the data to be filled according to the coordinate information of the data to be filled, thereby completing one report query.
In the related art, for example in the widely used BI report industry, the report processing software relies on the operator to determine these two necessary factors manually (that is, to write, at a designated coordinate position, an SQL query or an SQL-like query formula defined by the software vendor). This approach is time-consuming and its results depend heavily on the operator's skill, so report processing quality and results vary, and the header information goes unused.
To sum up, for an automatic data query scenario, the embodiments of the present application provide an information identification method applied to automatic data query. The method automatically identifies a row header area and a column header area corresponding to a target report according to the received typesetting information of the target report, further identifies a row node system and a column node system according to the header information, and obtains at least one group of identification information by analyzing spatial positions. The identification information comprises row header information, column header information and coordinate information of the data to be filled, and the coordinate information of the data to be filled provides the coordinate position of the data to be filled on the target report after the automatic data query. This overcomes the difficulty of collecting information before automatic data query (for example, before automatically writing SQL or other database query statements). The automatic data query scenario of the present application replaces the related-art method in which a processing person manually writes SQL or other database query statements to fill in report data, greatly reduces the cost of manual information collection and processing, and is a necessary premise for subsequent automatic processing of the target report.
Example 2
Referring to fig. 5, a flowchart of an information identification method applied to an automatic data query according to another exemplary embodiment of the present application is shown, where the method includes:
Step 501, receiving typesetting information of a target report.
Please refer to step 101, which is not described herein again.
Step 502, identifying a row header area and a column header area corresponding to the target report according to the typesetting information.
Please refer to step 102, which is not described herein again.
In one possible implementation, step 502 may be followed by either step 503 or step 505, with the order of execution shown in FIG. 5.
Step 503, traversing the row end coordinate and the column coordinate range of each row node to identify its child nodes, wherein a row node that is not identified as a child node of any other row node is a root node of the row node system, and a row node that has no child nodes is an end node of the row node system.
A child node and its corresponding row node have a row adjacency relationship and a column inclusion relationship; the row adjacency relationship is determined according to the relationship between the row start coordinate of the child node and the row end coordinate of the corresponding row node, and the column inclusion relationship is determined according to the column coordinate range of the child node and the column coordinate range of the corresponding row node.
In one example, the node whose child nodes are currently being identified is row node A, with four-dimensional coordinates (1,2,1,4), and there is also a row node B with four-dimensional coordinates (3,4,1,2). In a possible embodiment, row adjacency is defined as the row end coordinate of the parent plus 1 being equal to the row start coordinate of the child; since the row end coordinate of row node A is 2 and the row start coordinate of row node B is 3, row node A and row node B have a row adjacency relationship. Further, the column coordinate range of row node A is (1,4) and that of row node B is (1,2); in a possible embodiment, column inclusion is defined as the column coordinate range of the parent containing that of the child, so row node A and row node B have a column inclusion relationship. Because both the row adjacency relationship and the column inclusion relationship are satisfied, row node B is identified as a child node of row node A.
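The two conditions in this example can be written as a short predicate. The following is a minimal sketch under the definitions used in the example (row end of the parent plus 1 equals row start of the child; the parent's column range contains the child's); the function name `is_row_child` is hypothetical.

```python
from typing import Tuple

Coord = Tuple[int, int, int, int]  # (row start, row end, column start, column end)


def is_row_child(parent: Coord, candidate: Coord) -> bool:
    """Return True when candidate is row-adjacent to parent and column-contained in it."""
    _, p_row_end, p_col_start, p_col_end = parent
    c_row_start, _, c_col_start, c_col_end = candidate
    # Row adjacency: the child starts on the row right after the parent ends.
    row_adjacent = c_row_start == p_row_end + 1
    # Column inclusion: the parent's column range contains the child's column range.
    column_contained = c_col_start >= p_col_start and p_col_end >= c_col_end
    return row_adjacent and column_contained


row_node_a = (1, 2, 1, 4)
row_node_b = (3, 4, 1, 2)
print(is_row_child(row_node_a, row_node_b))  # True
print(is_row_child(row_node_b, row_node_a))  # False
```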
Step 504, arranging the root nodes and the child nodes to obtain the row node system.
Optionally, considering that the typesetting information may further include slash header information, step 504 may include the following items one to three.
Item one, identifying a preset root node of the row node system according to the slash division of the slash header information.
Item two, determining each root node obtained in the traversal process as a child node of the preset root node.
Item three, arranging the preset root node and the child nodes to obtain the row node system.
Correspondingly, the slash header refers to the area 211 in FIG. 2. If the area 211 currently holds invalid information, the typesetting information includes at least one of the row header information and the column header information of the target report; if the slash header holds valid information, the typesetting information includes at least one of the row header information, the column header information and the slash header information of the target report.
FIG. 6 shows a schematic diagram in which the slash header information is valid information; the slash header information is displayed in the area 211.
In one example, two common slash header cases are provided according to the common slash header types, shown in FIG. 6 (a) and FIG. 6 (b) respectively. Taking (a) as an example, the slash header information includes "header name 1", "header name 2" and "header name 3"; header name 1 is the root node with the highest priority in the row node system (denoted as the preset root node), and header name 3 is the root node with the highest priority in the column node system (also denoted as a preset root node). In FIG. 6 (a) and FIG. 6 (b), the numbering of the header names is merely schematic.
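Items one to three can be sketched as hanging the roots found during traversal under the preset root taken from the slash header. The function name `attach_preset_root`, the child-map representation, and the sample row nodes below are illustrative assumptions; only "header name 1" as the preset root of the row node system comes from the FIG. 6 (a) example.

```python
from typing import Dict, List


def attach_preset_root(preset_root: str, traversal_roots: List[str],
                       children: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a new child map in which the traversal roots hang under the preset root."""
    # Copy the map so the caller's data is left unchanged.
    system = {name: list(kids) for name, kids in children.items()}
    system[preset_root] = list(traversal_roots)
    return system


# Hypothetical row nodes found by traversal, with their children.
children = {"Jingjin": ["Beijing", "Tianjin"], "Shanghai": []}
roots = ["Jingjin", "Shanghai"]

row_system = attach_preset_root("header name 1", roots, children)
print(row_system["header name 1"])
```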
Step 505, traversing the column end coordinate and the row coordinate range of each column node to identify its child nodes, wherein a column node that is not identified as a child node of any other column node is a root node of the column node system, and a column node that has no child nodes is an end node of the column node system.
A child node and its corresponding column node have a column adjacency relationship and a row inclusion relationship; the column adjacency relationship is determined according to the relationship between the column start coordinate of the child node and the column end coordinate of the corresponding column node, and the row inclusion relationship is determined according to the row coordinate range of the child node and the row coordinate range of the corresponding column node.
In one example, the node whose child nodes are currently being identified is column node C, with four-dimensional coordinates (7,10,3,3), and there is also a column node D with four-dimensional coordinates (7,7,4,4). In a possible embodiment, column adjacency is defined as the column end coordinate of the parent plus 1 being equal to the column start coordinate of the child; since the column end coordinate of column node C is 3 and the column start coordinate of column node D is 4, column node C and column node D have a column adjacency relationship. Further, the row coordinate range of column node C is (7,10) and that of column node D is (7,7); in a possible embodiment, row inclusion is defined as the row coordinate range of the parent containing that of the child, so column node C and column node D have a row inclusion relationship. Because both the column adjacency relationship and the row inclusion relationship are satisfied, column node D is identified as a child node of column node C.
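The column-side check mirrors the row-side one with rows and columns swapped. A minimal sketch using the coordinates of column nodes C and D from the example follows; the function name `is_column_child` is hypothetical.

```python
from typing import Tuple

Coord = Tuple[int, int, int, int]  # (row start, row end, column start, column end)


def is_column_child(parent: Coord, candidate: Coord) -> bool:
    """Return True when candidate is column-adjacent to parent and row-contained in it."""
    p_row_start, p_row_end, _, p_col_end = parent
    c_row_start, c_row_end, c_col_start, _ = candidate
    # Column adjacency: the child starts in the column right after the parent ends.
    column_adjacent = c_col_start == p_col_end + 1
    # Row inclusion: the parent's row range contains the child's row range.
    row_contained = c_row_start >= p_row_start and p_row_end >= c_row_end
    return column_adjacent and row_contained


column_node_c = (7, 10, 3, 3)
column_node_d = (7, 7, 4, 4)
print(is_column_child(column_node_c, column_node_d))  # True
print(is_column_child(column_node_d, column_node_c))  # False
```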
Step 506, arranging the root nodes and the child nodes to obtain the column node system.
Optionally, considering that the typesetting information may further include slash header information, step 506 may include the following items one to three.
Item one, identifying a preset root node of the column node system according to the slash division of the slash header information.
Item two, determining each root node obtained in the traversal process as a child node of the preset root node.
Item three, arranging the preset root node and the child nodes to obtain the column node system.
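Steps 505 and 506 (without the optional slash header handling) can be sketched as a pairwise traversal that records each node's children and then reads off the root nodes and end nodes. The function name `build_column_system` is hypothetical, and of the sample coordinates only C and D come from the text; E and F are assumed for illustration.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int, int, int]  # (row start, row end, column start, column end)


def build_column_system(
    nodes: Dict[str, Coord],
) -> Tuple[Dict[str, List[str]], List[str], List[str]]:
    """Return (children map, root nodes, end nodes) for a set of column nodes."""
    children: Dict[str, List[str]] = {name: [] for name in nodes}
    has_parent = set()
    for parent, (p_rs, p_re, _, p_ce) in nodes.items():
        for child, (c_rs, c_re, c_cs, _) in nodes.items():
            if child == parent:
                continue
            # Column adjacency plus row inclusion, as in the C/D example.
            if c_cs == p_ce + 1 and c_rs >= p_rs and p_re >= c_re:
                children[parent].append(child)
                has_parent.add(child)
    roots = [name for name in nodes if name not in has_parent]
    end_nodes = [name for name in nodes if not children[name]]
    return children, roots, end_nodes


nodes = {
    "C": (7, 10, 3, 3),   # from the text
    "D": (7, 7, 4, 4),    # from the text
    "E": (8, 10, 4, 4),   # assumed sibling of D
    "F": (11, 12, 3, 4),  # assumed node spanning both columns, no children
}
children, roots, end_nodes = build_column_system(nodes)
print(children["C"], roots, end_nodes)
```

With this sample data, C has children D and E, the root nodes are C and F, and the end nodes are D, E and F. A node such as F that spans every column of the header area is both a root node and an end node.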
Step 507, acquiring column coordinates of all end nodes in the row node system.
Step 508, acquiring the row coordinates of all end nodes in the column node system.
Step 509, intersecting the column coordinates and the row coordinates to obtain a coordinate crossing result.
Step 510, sorting the coordinate crossing result to obtain at least one group of identification information.
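Steps 507 to 510 amount to pairing every end node of one system with every end node of the other. A minimal sketch using the standard library's `itertools.product` follows; the tuple layout, the function name `cross_end_nodes`, and the coordinate values are hypothetical and serve only to illustrate the idea.

```python
from itertools import product


def cross_end_nodes(row_system_end_nodes, column_system_end_nodes):
    """Pair every end node of the row node system with every end node of the column node system.

    row_system_end_nodes: list of (header name, column coordinate)
    column_system_end_nodes: list of (header name, row coordinate)
    """
    results = []
    for (row_hdr, col), (col_hdr, row) in product(row_system_end_nodes,
                                                  column_system_end_nodes):
        results.append({
            "row_system_header": row_hdr,
            "column_system_header": col_hdr,
            "cell": (row, col),  # coordinate position of the data to be filled
        })
    return results


# Hypothetical end nodes: names follow the sales report example, coordinates are made up.
row_ends = [("Shenzhen", 4)]
column_ends = [("January", 7), ("February", 8)]

crossing = cross_end_nodes(row_ends, column_ends)
for item in crossing:
    print(item)
```

Each resulting record carries both header names and one cell coordinate, which corresponds to one group of identification information in the sense used above.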
In one example, "Shenzhen" is taken from the row node system and "February" is taken from the column node system. According to the node associations of "Shenzhen" in the row node system, the program can automatically infer the information "South China" and "region" (that is, automatically infer the target data source of "Shenzhen"); the same applies to "February". In addition, the program can derive the filtering conditions for the automatic data query from the inference result.
Correspondingly, the target data source automatically inferred by the program and the end node information form a coordinate crossing result, and the identification information is obtained by sorting and summarizing this information. For example, intersecting "Shenzhen" with "February" yields the following SQL statement:
select sum(sales) as sales_statistics
from sales_report
where region = 'Shenzhen' and time = '2020, February'
Further, intersecting "Shenzhen" with "January" yields the corresponding SQL statement:
select sum(sales) as sales_statistics
from sales_report
where region = 'Shenzhen' and time = '2020, January'
Since "January" and "February" are not individually specified months but expanded items of the concept "2020", the two SQL statements may be merged as follows:
select month as month, sum(sales) as sales_statistics
from sales_report
where region = 'Shenzhen' and time = '2020'
group by month
It should be noted that when the grouped statistics return more than the two time items (January and February), the program automatically inserts the corresponding rows or columns into the data filling coordinate area to accommodate them.
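The effect of merging the per-cell statements into one grouped statement can be checked with an in-memory SQLite database. The sketch below is illustrative only: the table layout (separate `year` and `month` columns), the column name `sales`, the sample rows, and the helper names are all assumptions made for the demonstration and are not specified by the application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_report (region TEXT, year INTEGER, month INTEGER, sales REAL)"
)
# Made-up sample rows; the Guangzhou row only shows that the region filter works.
conn.executemany(
    "INSERT INTO sales_report VALUES (?, ?, ?, ?)",
    [
        ("Shenzhen", 2020, 1, 100.0),
        ("Shenzhen", 2020, 1, 50.0),
        ("Shenzhen", 2020, 2, 70.0),
        ("Guangzhou", 2020, 1, 999.0),
    ],
)

POINT_SQL = (
    "SELECT SUM(sales) FROM sales_report "
    "WHERE region = ? AND year = ? AND month = ?"
)
MERGED_SQL = (
    "SELECT month, SUM(sales) FROM sales_report "
    "WHERE region = ? AND year = ? GROUP BY month ORDER BY month"
)


def per_cell(region, year, months):
    """One query per crossing result (one per month end node)."""
    return {m: conn.execute(POINT_SQL, (region, year, m)).fetchone()[0] for m in months}


def merged(region, year):
    """A single grouped query covering every month under the year."""
    return dict(conn.execute(MERGED_SQL, (region, year)).fetchall())


per_cell_result = per_cell("Shenzhen", 2020, [1, 2])
merged_result = merged("Shenzhen", 2020)
print(per_cell_result == merged_result)  # True
```

The grouped query returns one row per month that actually exists in the data, which is why the number of returned time items can exceed the number of month headers in the report template.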
Schematically, FIG. 7 shows a populated data table 700 obtained through automatic information identification by the program in an automatic data query scenario according to an embodiment of the present application. As can be seen from FIG. 7, for the current sales report, the month items under 2020 are expanded to January through December, corresponding to the target report 210 shown in FIG. 2.
The embodiment of the application further discloses the construction principle of the row node system and the column node system, and the analysis of spatial positions is realized through the construction of the node systems; furthermore, slash headers in the report are also taken into account, which further broadens the scenarios that the identification can handle.
Referring to fig. 8, a block diagram of an information identification apparatus for automatic data query according to an embodiment of the present application is shown. The apparatus may be implemented as all or part of a computer device in software, hardware, or a combination of both. The device includes:
an information receiving module 801, configured to receive layout information of a target report, where the layout information includes at least one of row header information and column header information of the target report, the receiving format of the layout information consists of four-dimensional coordinates and a header name, and the row header information and the column header information are used to provide necessary input information for automatic data query;
an area identification module 802, configured to identify a row header area and a column header area corresponding to the target report according to the layout information, where the four-dimensional coordinates of the row header area correspond to the row header information, and the four-dimensional coordinates of the column header area correspond to the column header information;
a system identification module 803, configured to identify a row node system according to the four-dimensional coordinates of the row header area, and identify a column node system according to the four-dimensional coordinates of the column header area, where each node system includes a root node, a child node, and an end node;
a coordinate operation module 804, configured to perform a permutation and combination operation on all end nodes of the row node system and all end nodes of the column node system to obtain a coordinate crossing result;
an information integration module 805, configured to sort the coordinate crossing result to obtain at least one group of identification information, where the at least one group of identification information includes the row header information, the column header information, and coordinate information of data to be filled, and the coordinate information of the data to be filled is used to provide a coordinate position of the data to be filled on the target report after automatic data query.
Optionally, the system identification module 803 includes:
the first identification unit is used for traversing the row end coordinates and the column coordinate ranges of all the row nodes to identify the corresponding child nodes, the child nodes and the corresponding row nodes have row adjacent relation and column containing relation, the row adjacent relation is determined according to the relation between the row start coordinates of the child nodes and the row end coordinates of the corresponding row nodes, the column containing relation is determined according to the column coordinate ranges of the child nodes and the column coordinate ranges of the corresponding row nodes, the row nodes which are not identified as the child nodes are root nodes of the row node system, and the row nodes which do not have the child nodes are end nodes of the row node system;
and the second identification unit is used for sorting the root node and the child nodes to obtain the row node system.
Optionally, the system identification module 803 further includes:
a third identification unit, configured to traverse the column end coordinate and the row coordinate range of each column node to identify the corresponding child nodes, where the child nodes and the corresponding column nodes have a column adjacency relation and a row inclusion relation, the column adjacency relation is determined according to the relation between the column start coordinate of the child node and the column end coordinate of the corresponding column node, and the row inclusion relation is determined according to the row coordinate range of the child node and the row coordinate range of the corresponding column node, where a column node that is not identified as a child node is a root node of the column node system, and a column node that has no child node is an end node of the column node system;
and the fourth identification unit is used for sorting the root node and the child nodes to obtain the column node system.
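The work of the identification units (find child nodes by traversal, then derive root nodes and end nodes) can be sketched for the row node system as follows. This is a hedged illustration rather than the implementation of the apparatus: the `Node` type, the function names, and the sample coordinates are hypothetical, and the coordinate order (row start, row end, column start, column end) is assumed.

```python
from typing import NamedTuple


class Node(NamedTuple):
    name: str
    row_start: int
    row_end: int
    col_start: int
    col_end: int


def is_row_child(parent: Node, candidate: Node) -> bool:
    # Row adjacency: the candidate starts in the row right after the parent ends.
    row_adjacent = candidate.row_start == parent.row_end + 1
    # Column inclusion: the parent's column range covers the candidate's column range.
    col_included = (parent.col_start <= candidate.col_start
                    and candidate.col_end <= parent.col_end)
    return row_adjacent and col_included


def build_row_system(nodes):
    """Traverse all row nodes, then derive the root nodes and the end nodes."""
    children = {p.name: [c.name for c in nodes if is_row_child(p, c)] for p in nodes}
    identified_as_child = {name for kids in children.values() for name in kids}
    roots = [n.name for n in nodes if n.name not in identified_as_child]
    end_nodes = [n.name for n in nodes if not children[n.name]]
    return children, roots, end_nodes


# Hypothetical header cells: "2020" spans columns 3-4 in row 1, the months sit in row 2.
nodes = [
    Node("2020", 1, 1, 3, 4),
    Node("January", 2, 2, 3, 3),
    Node("February", 2, 2, 4, 4),
]
children, roots, end_nodes = build_row_system(nodes)
print(roots)      # ['2020']
print(end_nodes)  # ['January', 'February']
```

The column node system can be built the same way by swapping the roles of rows and columns in the two conditions.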
Optionally, in response to that the layout information further includes slash header information, the second identifying unit is further configured to:
identifying a preset root node of the row node system according to the slash division of the slash header information;
determining the root node obtained in the traversal process as a child node of the preset root node;
and arranging the preset root node and the child nodes to obtain the row node system.
Optionally, in response to the layout information further including slash header information, the fourth identification unit is further configured to:
identifying a preset root node of the column node system according to the slash division of the slash header information;
determining the root node obtained in the traversal process as a child node of the preset root node;
and arranging the preset root node and the child nodes to obtain the column node system.
Optionally, the coordinate operation module 804 includes:
a first operation unit, configured to acquire the column coordinates of all end nodes in the row node system;
a second operation unit, configured to acquire the row coordinates of all end nodes in the column node system;
and a third operation unit, configured to intersect the column coordinates and the row coordinates to obtain the coordinate crossing result.
An embodiment of the present application further provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the information identification method applied to automatic data query provided in the above embodiments.
Optionally, the computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc. The Random Access Memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM).
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments. The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (8)

1. An information identification method applied to automatic data query, characterized in that the method comprises:
receiving layout information of a target report, wherein the layout information comprises at least one of row header information and column header information of the target report, the receiving format of the layout information consists of four-dimensional coordinates and header names, and the row header information and the column header information are used for providing necessary input information during automatic data query;
identifying a row header area and a column header area corresponding to the target report according to the layout information, wherein the four-dimensional coordinates of the row header area correspond to the row header information, and the four-dimensional coordinates of the column header area correspond to the column header information;
identifying a row node system according to the four-dimensional coordinates of the row header area, and identifying a column node system according to the four-dimensional coordinates of the column header area, wherein each node system comprises a root node, a child node and an end node;
performing a permutation and combination operation on all the end nodes of the row node system and all the end nodes of the column node system to obtain a coordinate crossing result;
and sorting the coordinate crossing result to obtain at least one group of identification information, wherein the at least one group of identification information comprises the row header information, the column header information and coordinate information of data to be filled, and the coordinate information of the data to be filled is used for providing a coordinate position of the data to be filled on the target report after automatic data query.
2. The method of claim 1, wherein identifying a row node system according to the four-dimensional coordinates of the row header area comprises:
traversing the row end coordinates and the column coordinate ranges of all the row nodes to identify respectively corresponding child nodes, wherein the child nodes and the corresponding row nodes have row adjacent relation and column containing relation, the row adjacent relation is determined according to the relation between the row start coordinates of the child nodes and the row end coordinates of the corresponding row nodes, the column containing relation is determined according to the column coordinate ranges of the child nodes and the column coordinate ranges of the corresponding row nodes, the row nodes which are not identified as the child nodes are root nodes of the row node system, and the row nodes which do not have the child nodes are tail end nodes of the row node system;
and sorting the root node and the child nodes to obtain the row node system.
3. The method of claim 1, wherein identifying a column node system according to the four-dimensional coordinates of the column header area comprises:
traversing the column end coordinates and the row coordinate ranges of all the column nodes to identify respectively corresponding child nodes, wherein the child nodes and the corresponding column nodes have column adjacent relation and row inclusion relation, the column adjacent relation is determined according to the relation between the column start coordinates of the child nodes and the column end coordinates of the corresponding column nodes, the row inclusion relation is determined according to the row coordinate ranges of the child nodes and the row coordinate ranges of the corresponding column nodes, the column nodes which are not identified as the child nodes are root nodes of the column node system, and the column nodes which do not have the child nodes are end nodes of the column node system;
and arranging the root node and the child nodes to obtain the column node system.
4. The method according to claim 2, wherein in response to the layout information further including slash header information, the sorting the root node and the child nodes to obtain the row node system includes:
identifying a preset root node of the row node system according to the slash division of the slash header information;
determining the root node obtained in the traversal process as a child node of the preset root node;
and arranging the preset root node and the child nodes to obtain the row node system.
5. The method according to claim 3, wherein in response to the layout information further including slash header information, the arranging the root node and the child nodes to obtain the column node system includes:
identifying a preset root node of the column node system according to the slash division of the slash header information;
determining the root node obtained in the traversal process as a child node of the preset root node;
and arranging the preset root node and the child nodes to obtain the column node system.
6. The method according to any one of claims 1 to 5, wherein the performing a permutation and combination operation on all end nodes of the row node system and all end nodes of the column node system to obtain a coordinate crossing result comprises:
in the row node system, acquiring column coordinates of all the end nodes;
acquiring row coordinates of all the end nodes in the column node system;
and intersecting the column coordinates and the row coordinates to obtain the coordinate intersection result.
7. An information recognition apparatus applied to automatic data query, the apparatus comprising:
an information receiving module, configured to receive layout information of a target report, wherein the layout information comprises at least one of row header information and column header information of the target report, the receiving format of the layout information consists of four-dimensional coordinates and header names, and the row header information and the column header information are used for providing necessary input information during automatic data query;
an area identification module, configured to identify a row header area and a column header area corresponding to the target report according to the layout information, wherein the four-dimensional coordinates of the row header area correspond to the row header information, and the four-dimensional coordinates of the column header area correspond to the column header information;
a system identification module, configured to identify a row node system according to the four-dimensional coordinates of the row header area and identify a column node system according to the four-dimensional coordinates of the column header area, wherein each node system comprises a root node, a child node and an end node;
a coordinate operation module, configured to perform a permutation and combination operation on all end nodes of the row node system and all end nodes of the column node system to obtain a coordinate crossing result;
and an information integration module, configured to sort the coordinate crossing result to obtain at least one group of identification information, wherein the at least one group of identification information comprises the row header information, the column header information and coordinate information of data to be filled, and the coordinate information of the data to be filled is used for providing the coordinate position of the data to be filled on the target report after automatic data query.
8. A computer-readable storage medium storing at least one instruction for execution by a processor to implement the information identification method applied to automatic data query of any one of claims 1 to 6.
CN202110919097.6A | Priority date 2021-08-11 | Filing date 2021-08-11 | Information identification method and device applied to automatic data query | Pending | CN113887189A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202110919097.6A (published as CN113887189A) | 2021-08-11 | 2021-08-11 | Information identification method and device applied to automatic data query

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202110919097.6A (published as CN113887189A) | 2021-08-11 | 2021-08-11 | Information identification method and device applied to automatic data query

Publications (1)

Publication Number | Publication Date
CN113887189A (en) | 2022-01-04

Family

ID=79011010

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202110919097.6A (Pending, published as CN113887189A) | Information identification method and device applied to automatic data query | 2021-08-11 | 2021-08-11

Country Status (1)

Country | Link
CN (1) | CN113887189A (en)

Citations (8)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CN101944082A (en) * | 2010-09-10 | 2011-01-12 | 中国恩菲工程技术有限公司 | Excel-like report processing method
CN105095249A (en) * | 2014-05-05 | 2015-11-25 | 中国石油化工股份有限公司 | Method generating multi-dimension report form
JP2016162275A (en) * | 2015-03-03 | 2016-09-05 | 日本電信電話株式会社 | Data structure extraction device, data structure extraction method, and data structure extraction program
US20180203838A1 (en) * | 2015-05-18 | 2018-07-19 | Workiva Inc. | Data storage and retrieval system and method for storing cell coordinates in a computer memory
CN109635011A (en) * | 2018-10-31 | 2019-04-16 | 北京辰森世纪科技股份有限公司 | Multistage gauge outfit report processing method, device and equipment based on data service metadata
CN111753706A (en) * | 2020-06-19 | 2020-10-09 | 西安工业大学 | A Clustering Extraction Method for Intersections of Complex Tables Based on Image Statistics
CN112100546A (en) * | 2020-09-11 | 2020-12-18 | 东软集团股份有限公司 | Form loading method and device, storage medium and electronic equipment
CN112200117A (en) * | 2020-10-22 | 2021-01-08 | 长城计算机软件与系统有限公司 | Form identification method and device

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
