Disclosure of Invention
The main object of the present invention is to provide a multi-source data retrieval method, apparatus, device, and storage medium, aiming to solve the technical problem in the prior art that different data sources have different grammar rules, so that data acquisition is inefficient when developers integrate with multiple data sources.
In order to achieve the above object, the present invention provides a multi-source data retrieval method, including the steps of:
acquiring a pipeline type retrieval statement;
extracting keywords of the pipeline type retrieval statement;
converting the keywords into a uniform expression statement;
acquiring a retrieval data source;
converting the uniform expression statement into a target retrieval statement in a syntax format corresponding to the retrieval data source;
and obtaining target retrieval data according to the target retrieval statement so as to realize multi-source data retrieval through the target retrieval data.
Optionally, the acquiring of a pipeline type retrieval statement includes:
establishing a retrieval channel with user equipment;
and receiving the pipeline type retrieval statement sent by the user equipment through the retrieval channel.
Optionally, the extracting of keywords of the pipeline type retrieval statement includes:
determining retrieval elements of the pipeline type retrieval statement;
and determining the keywords of the pipeline type retrieval statement according to the retrieval elements.
Optionally, before extracting the keywords of the pipeline type retrieval statement, the method further includes:
judging whether the pipeline type retrieval statement conforms to a preset rule or not;
and if the preset rule is met, executing the step of extracting the keywords of the pipeline type retrieval statement.
Optionally, the converting the keyword into a unified expression statement includes:
determining the retrieval relation of the keywords according to the pipeline type retrieval statement;
and converting the keywords into a unified expression statement according to the retrieval relationship.
Optionally, the obtaining a retrieval data source includes:
determining a data source retrieval statement according to the pipeline type retrieval statement;
and determining a retrieval data source according to the data source retrieval statement.
Optionally, the obtaining target retrieval data according to the target retrieval statement includes:
determining the queue sequence of the target retrieval statement according to the sequence of the pipeline type retrieval statement;
constructing the target retrieval statement into a pipeline type retrieval command according to the queue sequence;
obtaining retrieval data according to the retrieval command;
and fusing the retrieval data to obtain target retrieval data.
In addition, in order to achieve the above object, the present invention also provides a multi-source data retrieval apparatus, including:
the acquisition module is used for acquiring a pipeline type retrieval statement;
the extraction module is used for extracting keywords of the pipeline type retrieval statement;
the conversion module is used for converting the keywords into a unified expression statement;
the data source acquisition module is used for acquiring a retrieval data source;
the statement conversion module is used for converting the unified expression statement into a target retrieval statement in the syntax format corresponding to the retrieval data source;
and the retrieval module is used for obtaining target retrieval data according to the target retrieval statement so as to realize multi-source data retrieval through the target retrieval data.
In addition, in order to achieve the above object, the present invention also provides a multi-source data retrieval device, including: a memory, a processor, and a multi-source data retrieval program stored on the memory and executable on the processor, the multi-source data retrieval program configured to implement the steps of the multi-source data retrieval method as described above.
In addition, to achieve the above object, the present invention further provides a storage medium, on which a multi-source data retrieval program is stored, and the multi-source data retrieval program, when executed by a processor, implements the steps of the multi-source data retrieval method as described above.
The method comprises the steps of obtaining a pipeline type retrieval statement; extracting keywords of the pipeline type retrieval statement; converting the keywords into a uniform expression statement; acquiring a retrieval data source; converting the uniform expression statement into a target retrieval statement in a syntax format corresponding to the retrieval data source; and obtaining target retrieval data according to the target retrieval statement so as to realize multi-source data retrieval through the target retrieval data. Through the method, the pipeline type retrieval sentences submitted by the user are obtained, the keywords of the pipeline type retrieval sentences are extracted, the keywords are converted into the sentences in the unified expression mode and finally converted into the sentences corresponding to the grammar rules of the data sources, one retrieval language can be converted into the target retrieval sentences corresponding to multiple data sources, and the data of the multiple data sources are obtained according to the target retrieval sentences, so that the purpose that the data of the multiple data sources are obtained by one retrieval sentence is achieved, developers can obtain the data of the multiple data sources only by learning one retrieval sentence, and the efficiency of developers is greatly improved.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, fig. 1 is a schematic structural diagram of a multi-source data retrieval device in a hardware operating environment according to an embodiment of the present invention.
As shown in fig. 1, the multi-source data retrieval device may include: a processor 1001, such as a Central Processing Unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless-Fidelity (Wi-Fi) interface). The memory 1005 may be a Random Access Memory (RAM), or may be a Non-Volatile Memory (NVM), such as a disk memory. The memory 1005 may alternatively be a storage device separate from the processor 1001.
Those skilled in the art will appreciate that the configuration shown in fig. 1 does not constitute a limitation of the multi-source data retrieval device, which may include more or fewer components than those shown, or some components in combination, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a storage medium, may include therein an operating system, a network communication module, a user interface module, and a multi-source data retrieval program.
In the multi-source data retrieval device shown in fig. 1, the network interface 1004 is mainly used for data communication with a network server; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 of the present invention may be arranged in the multi-source data retrieval device, and the multi-source data retrieval device invokes the multi-source data retrieval program stored in the memory 1005 through the processor 1001 and executes the multi-source data retrieval method provided by the embodiment of the present invention.
An embodiment of the present invention provides a multi-source data retrieval method, and referring to fig. 2, fig. 2 is a schematic flow diagram of a first embodiment of a multi-source data retrieval method according to the present invention.
In this embodiment, the multi-source data retrieval method includes the following steps:
Step S10: acquiring a pipeline type retrieval statement.
It should be noted that the execution subject of this embodiment may be a search server. The user sends the pipeline type retrieval statement to the search server through specific software, a web page, or an interface, and the search server executes the subsequent steps after receiving the pipeline type retrieval statement.
It is understood that the pipeline type retrieval statement is similar to a Linux pipeline command: it includes a plurality of retrieval statements that are executed in sequence. For example, in one pipeline type retrieval statement, the output value of the first retrieval statement is the input value of the second retrieval statement, and all the retrieval statements in the pipeline type retrieval statement are executed in order. For example, a pipeline type retrieval statement may take the form: "query object | operation instruction 1 on data set | operation instruction 2 on data set | operation instruction 3 on data set". A plurality of operations are combined using vertical bars, and each operation instruction is one retrieval statement.
The operation instructions on the data set may be: filter conditions, statistical functions, aggregation functions, merging operations, sorting functions, distribution functions, top-N operations, specific content highlighting operations, multi-data-source fusion operations, and the like. The operation instructions include: SEARCH (search operation), FILTER (filter operation), AGG (aggregation operation), FIELD (merge operation), SORT (sort operation), LIMIT (fetching N items from the data), PAGE (paging operation), TOP (fetching the first N items), HIGHLIGHT (specific content highlighting operation), SCROLL (scroll operation), SCHEMA (schema operation), and so on. For example, given 100 pieces of data, setting the parameters of the LIMIT instruction can select the 20th to 30th pieces of data. The above is merely an example, and the present embodiment is not limited thereto.
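The pipe semantics described above can be illustrated with a minimal Python sketch. It is not the implementation of this embodiment: the in-memory evaluator, the function names, and in particular the argument forms "SORT <field>" and "LIMIT <offset> <count>" are assumptions made only for illustration, since the text does not specify the parameter syntax of these instructions.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def split_stages(statement: str) -> List[List[str]]:
    """Split a pipeline statement on '|' into [keyword, arg, ...] token lists."""
    return [stage.split() for stage in statement.split("|") if stage.strip()]


def op_sort(rows: List[Row], args: List[str]) -> List[Row]:
    # SORT <field>: ascending sort on one field (assumed argument form).
    return sorted(rows, key=lambda r: r[args[0]])


def op_limit(rows: List[Row], args: List[str]) -> List[Row]:
    # LIMIT <offset> <count>: assumed parameter form, not taken from the source.
    offset, count = int(args[0]), int(args[1])
    return rows[offset:offset + count]


OPS: Dict[str, Callable[[List[Row], List[str]], List[Row]]] = {
    "SORT": op_sort,
    "LIMIT": op_limit,
}


def run_pipeline(rows: List[Row], statement: str) -> List[Row]:
    """Feed the output of each stage into the next, left to right."""
    for keyword, *args in split_stages(statement):
        rows = OPS[keyword.upper()](rows, args)
    return rows


# 100 items with ids 100..1; sort them, then take the 20th to 30th items.
data = [{"id": i} for i in range(100, 0, -1)]
picked = run_pipeline(data, "SORT id | LIMIT 19 11")
```

Here the second stage only ever sees the sorted output of the first stage, which is the property that makes the order of the vertical-bar-separated instructions significant.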
Further, the step S10 includes: establishing a retrieval channel with user equipment; and receiving the pipeline type retrieval statement sent by the user equipment through the retrieval channel.
It should be noted that the retrieval channel is used for transmitting a pipeline type retrieval statement submitted by the user equipment, the retrieval channel connects the user equipment and the retrieval server, the retrieval channel may be a preset Web server, the Web server is used as an intermediate node between the user equipment and the retrieval server, the user sends the pipeline type retrieval statement to the retrieval server through the Web server, and the retrieval server sends a retrieval result to the user equipment through the Web server.
It will be appreciated that the user obtains the retrieval submission page by accessing a particular port of the Web server, such as port 80. The user enters the pipeline type retrieval statement into the retrieval submission page and submits it to the Web server. Because data is retrieved through the retrieval submission page, the user can perform retrieval from a browser at any location, which further improves the user experience.
Step S20: extracting keywords of the pipeline type retrieval statement.
It should be understood that the keywords include search command keywords, data set operation instruction keywords, and the like in the pipeline retrieval sentence. For example, in the pipeline retrieval statement Search cluster. The operation object or the operation condition is attached after the keyword.
Further, the step S20 includes: determining retrieval elements of the pipeline type retrieval statement; and determining the keywords of the pipeline type retrieval statement according to the retrieval elements.
The retrieval elements include: the retrieval data source, retrieval condition filtering, retrieval condition statistics, the number of retrieval results, the ordering of retrieval results, and the like. Accordingly, the keywords include: keywords for the retrieval data source, keywords for retrieval condition filtering, keywords for retrieval condition statistics, keywords for the number of retrieval results, keywords for the ordering of retrieval results, and the like. For example, the keyword filter performs filtering according to a condition, so filter is a keyword for retrieval condition filtering.
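As a hedged illustration of determining keywords from retrieval elements, the following Python sketch maps the first word of each pipeline stage to a retrieval element. The mapping table, the function name extract_keywords, and the normalized sample statement are assumptions for illustration; the actual keyword set of this embodiment may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from keyword to the retrieval element it expresses.
KEYWORD_TO_ELEMENT: Dict[str, str] = {
    "search": "data source",
    "filter": "condition filtering",
    "agg": "condition statistics",
    "limit": "result quantity",
    "sort": "result ordering",
}


def extract_keywords(statement: str) -> List[Tuple[str, str, str]]:
    """Return (keyword, retrieval element, remainder) for each pipeline stage."""
    found = []
    for stage in statement.split("|"):
        head, _, rest = stage.strip().partition(" ")
        element = KEYWORD_TO_ELEMENT.get(head.lower())
        if element is None:
            raise ValueError(f"unknown keyword: {head!r}")
        # The remainder is the operation object or condition attached to the keyword.
        found.append((head.lower(), element, rest.strip()))
    return found


keywords = extract_keywords("Search cluster.student | filter age > 10")
```

Each returned tuple keeps the text that follows the keyword, so the operation object or condition remains available for the later conversion steps.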
Step S30: converting the pipeline type retrieval statement into a unified expression statement according to the keywords.
It can be understood that the unified expression statement is an expression obtained by classifying the pipeline type retrieval statement according to its keywords and the operation objects of those keywords. Once the statement is in the unified form, the corresponding retrieval statement can be composed from only the keywords, the operation objects, the conditions following the keywords, and the like, according to the grammar rules of the different data sources.
It should be noted that different data sources must be retrieved with retrieval statements in their own syntax. Therefore, before the pipeline type retrieval statement is converted into the retrieval statement of a given data source, it is first converted into a unified expression statement according to the keywords, which facilitates the subsequent conversion. For example: in the pipeline type retrieval statement Search cluster, descriptor, the keyword is Search, followed by the operation object, and at this time Search cluster, descriptor is expressed as "keyword: student ". The keyword Search denotes a search operation, cluster represents a cluster server, i.e., the object of the search operation, and database represents a database located in the cluster server.
Further, the step S30 includes: determining the retrieval relation of the keywords according to the pipeline type retrieval statement; and converting the keywords into a unified expression statement according to the retrieval relationship.
It should be understood that the Search relationship of the keyword refers to the correspondence between the keyword and the operation object and the operation condition of the keyword, for example, in the pipeline Search statement Search cluster. The unified expression statement of the Search statement Search cluster. student ".
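One way to picture a unified expression statement is as a list of records that pair each keyword with its operation object or condition. The Python sketch below is an assumption-laden illustration: the Relation record, its field names, and the supported comparison operators are hypothetical and are not the data structure defined by this embodiment.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed condition shape: <field> <comparison operator> <value>.
CONDITION = re.compile(r"^\s*(\w+)\s*(>=|<=|!=|=|>|<)\s*(\S+)\s*$")


@dataclass(frozen=True)
class Relation:
    """One keyword together with its operation object or its condition."""
    keyword: str
    obj: Optional[str] = None
    field: Optional[str] = None
    operator: Optional[str] = None
    value: Optional[str] = None


def to_unified(statement: str) -> List[Relation]:
    """Convert a pipeline statement into a list of keyword relations."""
    relations = []
    for stage in statement.split("|"):
        keyword, _, rest = stage.strip().partition(" ")
        keyword = keyword.lower()
        match = CONDITION.match(rest) if keyword == "filter" else None
        if match:
            relations.append(Relation(keyword, field=match.group(1),
                                      operator=match.group(2),
                                      value=match.group(3)))
        else:
            relations.append(Relation(keyword, obj=rest.strip()))
    return relations


unified = to_unified("Search cluster.student | filter age > 10")
```

Because the records no longer carry any source-specific syntax, the same list can later be rendered for whichever data source is selected.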
Step S40: acquiring a retrieval data source.
It will be appreciated that the data sources may include a variety of databases, such as MySQL, Elasticsearch, Oracle, DB2, SQL Server, SQLite, MongoDB, Redis, and the like. Therefore, when retrieving data from a given database, a retrieval statement following the syntax rules of that database must be used. The pipeline type retrieval statement sent by the user includes the data source to be retrieved, which can be obtained by parsing and identifying the pipeline type retrieval statement.
Further, the step S40 includes: determining a data source retrieval statement according to the pipeline type retrieval statement; and determining a retrieval data source according to the data source retrieval statement.
In a specific implementation, for example, in the pipeline type retrieval statement Search cluster.student | filter >10, the data source retrieval statement is Search cluster.student, which can be located by looking for the keyword Search. Here student is a database, i.e., a data source. The specific information of the database is stored in the search server, and the database type, i.e., the data source type, can be obtained by looking up the database name, so that the retrieval grammar rules corresponding to the data source are determined.
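The lookup described above can be sketched in Python as follows. The registry contents, the function name resolve_data_source, and the choice of taking the last dot-separated component as the database name are assumptions for illustration only.

```python
from typing import Dict, Tuple

# Hypothetical registry kept on the search server: database name -> data source type.
SOURCE_REGISTRY: Dict[str, str] = {
    "student": "mysql",
    "logs": "elasticsearch",
}


def resolve_data_source(statement: str) -> Tuple[str, str]:
    """Find the Search stage and return (database name, data source type)."""
    for stage in statement.split("|"):
        keyword, _, target = stage.strip().partition(" ")
        if keyword.lower() == "search":
            # Assumption: the last dot-separated component names the database.
            database = target.strip().split(".")[-1]
            return database, SOURCE_REGISTRY[database]
    raise ValueError("statement has no Search stage")


name, source_type = resolve_data_source("Search cluster.student | filter age > 10")
```

Once the data source type is known, it selects which syntax format the unified expression statement is rendered into.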
Step S50: converting the unified expression statement into a target retrieval statement in the syntax format corresponding to the retrieval data source.
It should be noted that, because the unified expression statement has already disassembled and classified the pipeline type retrieval statement according to the retrieval relationship and the keywords, it only needs to be converted into the corresponding retrieval statement, i.e., the target retrieval statement, according to the syntax format of the data source. For example: if the pipeline type retrieval statement input by the user is Search cluster.student | filter age >10, the retrieval statement converted for a MySQL database is SELECT * FROM student WHERE age > 10, and the retrieval statement converted for the Elasticsearch storage engine is as shown in fig. 3.
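The following Python sketch shows how one unified form might be rendered into two target syntaxes. The dictionary layout of the unified form and the function names are assumptions; the Elasticsearch body uses the standard range query of the Query DSL and is not a reproduction of fig. 3, which is not available in this text.

```python
from typing import Any, Dict

# Unified form assumed for illustration: one table plus one comparison.
unified = {"table": "student", "field": "age", "operator": ">", "value": 10}

# Comparison operators mapped to Elasticsearch range query parameters.
ES_RANGE_OPS = {">": "gt", ">=": "gte", "<": "lt", "<=": "lte"}


def to_mysql(u: Dict[str, Any]) -> str:
    # The value is left as a %s placeholder so that the database driver binds it;
    # table and field names are assumed to have been validated beforehand.
    return f"SELECT * FROM {u['table']} WHERE {u['field']} {u['operator']} %s"


def to_elasticsearch(u: Dict[str, Any]) -> Dict[str, Any]:
    # Request body only; the index to query would be derived from u["table"].
    return {"query": {"range": {u["field"]: {ES_RANGE_OPS[u["operator"]]: u["value"]}}}}


sql = to_mysql(unified)
dsl = to_elasticsearch(unified)
```

Leaving the value as a bound parameter rather than formatting it into the SQL text is a design choice of this sketch, made to avoid injection when user input reaches the database.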
Step S60: obtaining target retrieval data according to the target retrieval statement so as to realize multi-source data retrieval through the target retrieval data.
It is understood that executing the target retrieval statement obtains the data of the corresponding data source, i.e., the target retrieval data.
Before the retrieval operation is performed, connections to the plurality of data sources must be configured. The content retrieved from the plurality of data sources may be duplicated, causing data interference; therefore, a deduplication operation needs to be performed on the data retrieved from the plurality of data sources.
In the embodiment, a pipeline type retrieval statement is obtained; extracting keywords of the pipeline type retrieval statement; converting the keywords into a uniform expression statement; acquiring a retrieval data source; converting the uniform expression statement into a target retrieval statement in a syntax format corresponding to the retrieval data source; and obtaining target retrieval data according to the target retrieval statement so as to realize multi-source data retrieval through the target retrieval data. Through the method, the pipeline type retrieval sentences submitted by the user are obtained, the keywords of the pipeline type retrieval sentences are extracted, the keywords are converted into the sentences in the unified expression mode and finally converted into the sentences corresponding to the grammar rules of the data sources, one retrieval language can be converted into the target retrieval sentences corresponding to multiple data sources, and the data of the multiple data sources are obtained according to the target retrieval sentences, so that the purpose that the data of the multiple data sources are obtained by one retrieval sentence is achieved, developers can obtain the data of the multiple data sources only by learning one retrieval sentence, and the efficiency of developers is greatly improved.
Referring to fig. 4, fig. 4 is a flowchart illustrating a multi-source data retrieval method according to a second embodiment of the present invention.
Based on the first embodiment, before the step S20, the multi-source data retrieval method of this embodiment further includes:
Step S11: judging whether the pipeline type retrieval statement conforms to a preset rule.
It should be noted that the preset rule is the correct syntax rule for the pipeline type retrieval statement in this embodiment. For example, the keyword Search may search a plurality of operation objects at the same time, with the search targets separated from one another by a delimiter; each search target is composed of three parts, namely a cluster name, a database (or an index in ES), and a table name (or a type in ES), with the parts likewise separated by a delimiter. Names may consist of upper and lower case letters, underscores (_), and numbers, and the filter command supports logical expressions, i.e., expressions composed of AND, OR, and NOT. The keywords and their operation objects can be determined only when the statement follows the correct grammar rules.
In a specific implementation, for example, from the correct retrieval statement Search cluster, the keyword Search and the operation object cluster can be determined. If the retrieval statement submitted by the user is Search & ^ student, where "& ^" is unrecognizable garbled text, the statement is treated as an erroneous statement and the user is notified that the retrieval statement needs to be re-entered.
Step S12: if the preset rule is met, executing the step of extracting the keywords of the pipeline type retrieval statement.
It will be appreciated that subsequent operations will only begin if the user enters the correct pipelined search statement.
In a specific implementation, after the user submits an erroneous retrieval statement, the erroneous part is marked, and the user is reminded to re-enter it.
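A rule check that also locates the erroneous part might look like the Python sketch below. The rule used here (a Search target made of letters, digits, underscores, and dots) is a simplified assumption, and the names HEAD, INVALID, and find_error are hypothetical.

```python
import re
from typing import Optional, Tuple

# First stage of the statement: a keyword followed by its target text.
HEAD = re.compile(r"\s*(\w+)\s+([^|]*?)\s*(?:\||$)")
# Assumed rule: targets contain only letters, digits, underscores, and dots.
INVALID = re.compile(r"[^A-Za-z0-9_.]+")


def find_error(statement: str) -> Optional[Tuple[int, str]]:
    """Return (offset, offending text) for the first rule violation, else None."""
    head = HEAD.match(statement)
    if head is None or not head.group(2):
        return 0, statement                      # no keyword or no target at all
    if head.group(1).lower() != "search":
        return head.start(1), head.group(1)      # statement must start with Search
    bad = INVALID.search(statement, head.start(2), head.end(2))
    if bad:
        return bad.start(), bad.group()          # position of the garbled part
    return None


ok = find_error("Search cluster.student | filter age > 10")
error = find_error("Search &^student")
```

The returned offset lets the submission page highlight exactly the characters that need to be re-entered, instead of rejecting the whole statement without explanation.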
In the embodiment, whether the pipeline type retrieval statement conforms to a preset rule is judged; and if the preset rule is met, executing the step of extracting the keywords of the pipeline type retrieval statement. When the user submits the pipeline type retrieval statement which accords with the preset rule, the subsequent steps can be carried out, and when the user inputs error information, the error part of the user can be reminded, so that the user experience is greatly improved.
Referring to fig. 5, fig. 5 is a flowchart illustrating a multi-source data retrieval method according to a third embodiment of the present invention.
Based on the first embodiment, in step S60, the multi-source data retrieval method of this embodiment includes:
Step S61: determining the queue order of the target retrieval statements according to the order of the pipeline type retrieval statement.
It should be noted that the result of a pipeline type retrieval statement depends on the execution order of the retrieval statements it contains, and the output of one retrieval statement is the input of the next, so the retrieval operations must be performed strictly in order.
It is to be understood that, to preserve the order of the retrieval statements in the pipeline type retrieval statement submitted by the user, that order may be determined from the order of the keywords; after the unified expression statement is converted into target retrieval statements, the target retrieval statements are stored in a queue in the order of the keywords.
Step S62: constructing the target retrieval statements into a pipeline type retrieval command according to the queue order.
It should be understood that the target search statements arranged in the order of the pipeline search statements are pipeline search commands, that is, an execution tree queue is constructed, and corresponding data sources are searched according to the execution tree queue.
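The ordering requirement can be sketched in Python with a first-in, first-out queue. The functions build_queue and run_queue, and the stand-in translate and execute callables, are hypothetical; a flat queue is used here as a simplification of the execution tree queue mentioned in the text.

```python
from collections import deque
from typing import Any, Callable, Deque, Tuple


def build_queue(statement: str,
                translate: Callable[[str], str]) -> Deque[Tuple[int, str]]:
    """Translate each stage and enqueue it together with its original position."""
    queue: Deque[Tuple[int, str]] = deque()
    for position, stage in enumerate(statement.split("|")):
        queue.append((position, translate(stage.strip())))
    return queue


def run_queue(queue: Deque[Tuple[int, str]],
              execute: Callable[[str, Any], Any],
              initial: Any = None) -> Any:
    """Run the queued statements first-in, first-out, chaining each output."""
    data = initial
    while queue:
        _, target_statement = queue.popleft()
        data = execute(target_statement, data)
    return data


# str.upper stands in for a real translator; the lambda records execution order.
queue = build_queue("search a | filter b | sort c", str.upper)
order = [position for position, _ in queue]
trace = run_queue(queue, lambda stmt, seen: (seen or []) + [stmt])
```

Storing the position alongside each translated statement keeps the user's original order recoverable even if translation itself were performed out of order.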
Step S63: obtaining retrieval data according to the retrieval command.
In a specific implementation, the retrieval data can be obtained by retrieving the corresponding database through the corresponding retrieval command.
Step S64: fusing the retrieval data to obtain target retrieval data.
It should be noted that data stored in a plurality of data sources may be the same or similar, so the retrieved data may contain repeated content; in addition, different data sources may return data in different forms, which makes the data overly complicated when displayed to the user. Operations such as deduplication, fusion, and unification of data forms therefore need to be performed.
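As a hedged illustration of fusion, the Python sketch below first unifies field names and then removes duplicates across result sets. The alias table, the sample rows, and the exact-match definition of a duplicate are assumptions; the embodiment does not specify how similarity is judged.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]

# Hypothetical per-source field aliases used to bring rows into one shape.
FIELD_ALIASES = {"student_name": "name", "_id": "id"}


def normalize(row: Row) -> Row:
    """Rename source-specific fields to the unified field names."""
    return {FIELD_ALIASES.get(key, key): value for key, value in row.items()}


def fuse(*result_sets: List[Row]) -> List[Row]:
    """Merge result sets, unify their form, and drop exact duplicates."""
    seen = set()
    fused: List[Row] = []
    for rows in result_sets:
        for row in rows:
            unified = normalize(row)
            # Values must be hashable for this simple duplicate check.
            key = tuple(sorted(unified.items()))
            if key not in seen:
                seen.add(key)
                fused.append(unified)
    return fused


mysql_rows = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
es_rows = [{"_id": 1, "student_name": "Ann"}, {"_id": 3, "student_name": "Cy"}]
target_data = fuse(mysql_rows, es_rows)
```

Normalizing before comparing is what allows the record stored under different field names in the two sources to be recognized as the same item.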
The present embodiment determines the queue order of the target search statement according to the order of the pipeline search statement; constructing the target retrieval statement into a pipeline type retrieval command according to the queue sequence; obtaining retrieval data according to the retrieval command; and fusing the retrieval data to obtain target retrieval data. Through the method, the target retrieval command can be constructed according to the sequence of the pipeline type retrieval statement input by the user, and the retrieval result is subjected to operation such as duplication removal, so that the retrieval result can meet the requirement of the user.
As shown in fig. 6, a user submits a pipeline type retrieval statement, which is transmitted to the converter through the Web server. The converter module is responsible for disassembling, classifying, and reassembling the retrieval statement input by the user, generating retrieval statements for the different data sources, and building the execution tree queue to be retrieved. In the converter, the grammar definition extracts the keywords according to the retrieval elements. The grammar analysis parses, checks, and identifies the user input according to the statement input by the user and the defined grammar rules; after the grammar analysis, each retrieval keyword and relationship input by the user is converted into a unified command view. The target language converter converts the command view, according to its data structure, into statements in the syntax format of the target data source, constructs an execution tree queue according to the retrieval statements input by the user, and thereby builds a pipeline type execution sequence. Data is then queried from the target data source, and the results are recorded in the execution tree queue. After all queues of the executor have finished running, final fusion processing, including data merging, deduplication, impurity removal, and the like, is performed on the results according to the execution tree queue, and the final result is returned to the user.
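The overall flow of fig. 6 can be outlined as a small Python class whose stages are supplied from outside. This is a structural sketch only: the Converter class, its attribute names, and the stub lambdas are hypothetical stand-ins, and no real database is contacted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Rows = List[dict]


@dataclass
class Converter:
    """Wires parsing, translation, execution, and fusion together."""
    parse: Callable[[str], dict]                    # statement -> command view
    translators: Dict[str, Callable[[dict], str]]   # source type -> translator
    executors: Dict[str, Callable[[str], Rows]]     # source type -> query runner
    fuse: Callable[[List[Rows]], Rows]              # partial results -> final result

    def retrieve(self, statement: str, source_types: List[str]) -> Rows:
        view = self.parse(statement)
        partial: List[Rows] = []
        for source in source_types:                 # sources are visited in order
            target = self.translators[source](view)
            partial.append(self.executors[source](target))
        return self.fuse(partial)


# Stub stages that only show how data flows between the parts.
converter = Converter(
    parse=lambda s: {"raw": s},
    translators={"mysql": lambda v: "sql:" + v["raw"],
                 "es": lambda v: "dsl:" + v["raw"]},
    executors={"mysql": lambda q: [{"id": 1, "via": q}],
               "es": lambda q: [{"id": 2, "via": q}]},
    fuse=lambda parts: [row for rows in parts for row in rows],
)
result = converter.retrieve("Search a", ["mysql", "es"])
```

Keeping each stage behind a callable means a new data source can be supported by registering one translator and one executor, without changing the retrieval flow.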
In addition, an embodiment of the present invention further provides a storage medium, where the storage medium stores a multi-source data retrieval program, and the multi-source data retrieval program, when executed by a processor, implements the steps of the multi-source data retrieval method described above.
Referring to fig. 7, fig. 7 is a block diagram of a multi-source data retrieval apparatus according to a first embodiment of the present invention.
As shown in fig. 7, the multi-source data retrieval apparatus according to the embodiment of the present invention includes:
an obtaining module 10, configured to obtain a pipeline type retrieval statement;
an extracting module 20, configured to extract keywords of the pipeline type retrieval statement;
a conversion module 30, configured to convert the keywords into a unified expression statement;
a data source obtaining module 40, configured to obtain a retrieval data source;
a statement conversion module 50, configured to convert the unified expression statement into a target retrieval statement in a syntax format corresponding to the retrieval data source;
and a retrieval module 60, configured to obtain target retrieval data according to the target retrieval statement, so as to implement multi-source data retrieval through the target retrieval data.
In an embodiment, the obtaining module 10 is further configured to establish a retrieval channel with a user equipment; and receive the pipeline type retrieval statement sent by the user equipment through the retrieval channel.
In an embodiment, the extracting module 20 is further configured to determine retrieval elements of the pipeline type retrieval statement; and determine the keywords of the pipeline type retrieval statement according to the retrieval elements.
In an embodiment, the extracting module 20 is further configured to determine whether the pipeline type retrieval statement meets a preset rule; and if the preset rule is met, execute the step of extracting the keywords of the pipeline type retrieval statement.
In an embodiment, the conversion module 30 is further configured to determine a retrieval relationship of the keywords according to the pipeline type retrieval statement; and convert the keywords into a unified expression statement according to the retrieval relationship.
In an embodiment, the data source obtaining module 40 is further configured to determine a data source retrieval statement according to the pipeline type retrieval statement; and determine a retrieval data source according to the data source retrieval statement.
In an embodiment, the retrieval module 60 is further configured to determine a queue order of the target retrieval statements according to an order of the pipeline type retrieval statement; construct the target retrieval statements into a pipeline type retrieval command according to the queue order; obtain retrieval data according to the retrieval command; and fuse the retrieval data to obtain target retrieval data.
It should be understood that the above is only an example, and the technical solution of the present invention is not limited in any way, and in a specific application, a person skilled in the art may set the technical solution as needed, and the present invention is not limited thereto.
In the embodiment, a pipeline type retrieval statement is obtained; extracting keywords of the pipeline type retrieval statement; converting the keywords into a uniform expression statement; acquiring a retrieval data source; converting the uniform expression statement into a target retrieval statement in a syntax format corresponding to the retrieval data source; and obtaining target retrieval data according to the target retrieval statement so as to realize multi-source data retrieval through the target retrieval data. Through the method, the pipeline type retrieval sentences submitted by the user are obtained, the keywords of the pipeline type retrieval sentences are extracted, the keywords are converted into the sentences in the unified expression mode and finally converted into the sentences corresponding to the grammar rules of the data sources, one retrieval language can be converted into the target retrieval sentences corresponding to multiple data sources, and the data of the multiple data sources are obtained according to the target retrieval sentences, so that the purpose that the data of the multiple data sources are obtained by one retrieval sentence is achieved, developers can obtain the data of the multiple data sources only by learning one retrieval sentence, and the efficiency of developers is greatly improved.
It should be noted that the above-described work flows are only exemplary, and do not limit the scope of the present invention, and in practical applications, a person skilled in the art may select some or all of them to achieve the purpose of the solution of the embodiment according to actual needs, and the present invention is not limited herein.
In addition, the technical details that are not described in detail in this embodiment may refer to the multi-source data retrieval method provided in any embodiment of the present invention, and are not described herein again.
Further, it is to be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention or portions thereof that contribute to the prior art may be embodied in the form of a software product, where the computer software product is stored in a storage medium (e.g. Read Only Memory (ROM)/RAM, magnetic disk, optical disk), and includes several instructions for enabling a terminal device (e.g. a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.