CN114741279A - SQL statement debugging method, device and equipment - Google Patents

SQL statement debugging method, device and equipment

Info

Publication number
CN114741279A
Authority
CN
China
Prior art keywords
debugging
sql statement
debugged
module
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210297316.6A
Other languages
Chinese (zh)
Inventor
郭健
张乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Application filed by New H3C Big Data Technologies Co Ltd
Priority to CN202210297316.6A
Publication of CN114741279A
Legal status: Pending

Abstract

The invention provides a method, device and equipment for debugging SQL statements, which address the technical problems of difficult SQL statement debugging and unstable service in a big data environment. In the method, a server-side debugging module passes the SQL statement to be debugged as a parameter of a debugging task application and submits the debugging task, in client interaction mode, to a distributed computing engine for execution; the debugging task application writes the debugging result of the SQL statement into a specified storage space, and the debugging result is returned to the user through a query interface. The invention submits the debugging task using a relatively stable task submission mode, and the debugging task application writes the debugging result into the specified storage space so that the query interface can obtain it conveniently, which makes the debugging process easier to use, more stable and more efficient.

Description

Translated from Chinese
SQL statement debugging method, device and equipment

Technical Field

The present invention relates to the technical field of communication and cloud computing, and in particular to a method, device and equipment for debugging SQL statements.

Background

Apache Spark is an in-memory distributed computing engine designed for large-scale data processing. Its most notable feature is that it keeps both the data being computed and the intermediate results in memory, which greatly reduces input/output (IO) overhead. Spark SQL is built on Spark Core and can use SQL commands to process relational tables and Resilient Distributed Datasets (RDDs) in a uniform way, which lowers the barrier to use. With the Spark computing engine and the Hadoop ecosystem, data analysis in many business scenarios can be handled well, and the needs of most businesses can be met.

However, tasks currently processed with Spark SQL generally involve large tables, and their execution flow is relatively cumbersome. In a production environment, verifying the correctness of a Spark SQL statement and the accuracy of its execution result is time-consuming and difficult.

YARN (Yet Another Resource Negotiator) is a cluster resource manager in Apache Hadoop that provides unified resource management and scheduling for upper-layer applications. Before YARN was introduced, part of its work was handled by MapReduce. Spark supports pluggable cluster management modes such as Standalone, Mesos and YARN.

At present, a common way to verify the correctness of a Spark SQL statement and the accuracy of its execution result is as follows: first, the SQL statement to be verified is written into a Spark program, and the Spark program is compiled, packaged and uploaded to the cluster; then, the task that executes the Spark program is submitted through Spark and run; after the run, the log of the task is viewed in the YARN graphical user interface to verify the execution result; if the task fails or the result does not meet expectations, the above steps are repeated until verification succeeds.

With this approach, the Spark program has to be repackaged and rerun for every debugging round, so query efficiency is low. The process is also complex and hard to operate: users must write their own code, run it on the cluster, and then go to the YARN UI to view the logs, which makes the approach hard to use and error-prone.

Another way to verify the correctness of a Spark SQL statement and the accuracy of its execution result is to use Spark Thrift Server: the client first sends a query request to Spark Thrift Server, then obtains the result from Spark Thrift Server and verifies against it. Because Spark Thrift Server is a long-running service submitted to YARN, this approach is unstable and cannot be used in a production environment. In addition, the resources occupied by Spark Thrift Server are determined by the configuration of the Spark component, so when the SQL is complex or the data volume is large, execution takes a long time, performs poorly, and may even fail.

Summary of the Invention

In view of this, the present invention provides a method, device and equipment for debugging SQL statements, to solve the technical problems of difficult SQL statement debugging and unstable service in a big data environment.

According to one aspect of the embodiments of the present invention, a method for debugging an SQL statement is provided. The method is applied to a server-side debugging module and includes:

receiving an SQL debugging request, where the SQL debugging request carries the SQL statement to be debugged;

submitting a task containing a debugging task application to a cluster resource manager (for example, YARN) in client mode, where the parameters carried by the debugging task application include the SQL statement to be debugged;

where the debugging task application is configured to submit the SQL statement to be debugged to a distributed computing engine (for example, the Spark SQL computing engine) for execution, to receive the execution result of the SQL statement to be debugged returned by the distributed computing engine, to generate a debugging result from the execution result, and to write the debugging result into the storage space at a specified storage location (for example, a data file under a specified HDFS storage path).

Further, the method also includes:

based on a debugging result viewing request, obtaining the debugging result of the SQL statement to be debugged that is stored in the storage space at the specified storage location, and sending the debugging result to the client debugging module; or

after the debugging result is generated, actively sending the debugging result to the client debugging module.

Further, the method also includes:

receiving an SQL statement syntax check request, where the syntax check request carries the SQL statement to be checked;

calling the execution plan generation interface provided by the distributed computing engine to obtain the execution plan of the SQL statement to be checked;

parsing the returned execution plan and analyzing whether a syntax error exists, and if a syntax error exists, returning error information to the client debugging module.

Further, before the task containing the debugging task application is submitted to the cluster resource manager, the method also includes:

calling the execution plan generation interface provided by the distributed computing engine to obtain the execution plan of the SQL statement to be debugged;

parsing the returned execution plan and analyzing whether a syntax error exists;

if a syntax error exists, returning error information to the client debugging module;

if no syntax error exists, using the SQL statement to be debugged as a parameter of the debugging task application and submitting the debugging task application together with the SQL statement to be debugged to the cluster resource manager.

Further, before the task containing the debugging task application is submitted to the cluster resource manager, the method also includes the following preprocessing step:

based on a preprocessing parameter, adding to the SQL statement to be debugged a limit clause that restricts the number of records returned.
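
The preprocessing step can be sketched as follows. This is a minimal illustration in Python, not the patent's implementation: the function name `add_limit` and the parameter `max_rows` are hypothetical, and the sketch assumes the statement is a plain query whose outermost clause may already end in `LIMIT n`.

```python
import re


def add_limit(sql: str, max_rows: int) -> str:
    """Append a LIMIT clause unless the outermost query already ends with one."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    # Drop surrounding whitespace and a trailing semicolon before inspecting the end.
    stripped = sql.strip().rstrip(";").rstrip()
    # A LIMIT inside a subquery is followed by ")" and so does not match here.
    if re.search(r"\blimit\s+\d+\s*$", stripped, flags=re.IGNORECASE):
        return stripped
    return f"{stripped} LIMIT {max_rows}"


limited = add_limit("SELECT id, age FROM users;", 100)
```

A real preprocessing module would more likely work on the parsed statement than on its text; the regular expression here only keeps the sketch short.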

According to another aspect of the embodiments of the present invention, a device for debugging SQL statements is also provided. The device includes:

a transceiver module, configured to receive an SQL debugging request, where the SQL debugging request carries the SQL statement to be debugged;

a submission module, configured to submit a task containing a debugging task application to the cluster resource manager in client mode, where the parameters carried by the debugging task application include the SQL statement to be debugged, and the debugging task application is configured to submit the SQL statement to be debugged to the distributed computing engine for execution;

a result processing module, configured to receive the execution result of the SQL statement to be debugged returned by the distributed computing engine, generate a debugging result from the execution result, and write the debugging result into the storage space at the specified storage location.

Further, the device also includes a viewing interface module, configured to obtain, based on a debugging result viewing request (forwarded by the transceiver module), the debugging result of the SQL statement to be debugged that is stored in the storage space at the specified storage location, and to send the debugging result to the client debugging module through the transceiver module; or configured to actively send the debugging result to the client debugging module through the transceiver module after the debugging result is generated.

Further, the device also includes a syntax check interface module, configured to call, based on an SQL statement syntax check request, the execution plan generation interface provided by the distributed computing engine to obtain the execution plan of the SQL statement to be checked, and to parse the returned execution plan and analyze whether a syntax error exists;

the transceiver module is further configured to receive the SQL statement syntax check request, where the syntax check request carries the SQL statement to be checked, and to return error information to the client debugging module when the execution plan contains a syntax error.

Further, before the submission module submits the task containing the debugging task application to the cluster resource manager, the syntax check interface module is called to obtain the execution plan of the SQL statement to be debugged; when the execution plan of the SQL statement to be debugged contains a syntax error, error information is returned to the client debugging module through the transceiver module; when the execution plan contains no syntax error, the SQL statement to be debugged is used as a parameter of the debugging task application, and the debugging task application and the SQL statement to be debugged are submitted together to the cluster resource manager.

Further, the device also includes a preprocessing module, configured to add, based on a preprocessing parameter and before the task containing the debugging task application is submitted to the cluster resource manager, a limit clause to the SQL statement to be debugged that restricts the number of records returned.

In the present invention, the server-side debugging module uses the SQL statement to be debugged as a parameter of the debugging task application and submits the debugging task in client interaction mode to the distributed computing engine for execution; the debugging task application writes the debugging result of the SQL statement into the specified storage space, and the debugging result is returned to the user through a query interface. The invention submits the debugging task using a relatively stable task submission mode, and the debugging task application writes the debugging result into the specified storage space so that the query interface can obtain it conveniently, which makes the debugging process easier to use, more stable and more efficient.

Brief Description of the Drawings

To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments recorded in the present invention; a person of ordinary skill in the art can obtain other drawings from them.

FIG. 1 is a schematic diagram of the system structure and the information interaction between modules when the SQL statement debugging method provided by the present invention is applied, according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the module interaction for viewing the debugging result of a debugged SQL statement through the debugging result viewing interface, according to an embodiment of the present invention;

FIG. 3 illustrates a method for performing a syntax check on an SQL statement and the corresponding module interaction, according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of electronic equipment for implementing the SQL statement debugging method provided by the embodiments of the present invention.

Detailed Description

The terms used in the embodiments of the present invention are for the purpose of describing particular embodiments only and are not intended to limit the embodiments. The singular forms "a", "said" and "the" used in the embodiments are also intended to include the plural forms, unless the context clearly indicates otherwise. The term "and/or" as used herein refers to any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third and so on may be used in the embodiments of the present invention to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the embodiments, first information may also be called second information, and similarly, second information may also be called first information. In addition, depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when" or "in response to determining".

The object of the present invention is to provide a stable and efficient method, device and equipment for debugging SQL statements in a big data environment. The basic idea is as follows: the user submits the SQL statement to be debugged to the debugging server through the client debugging module; the server-side debugging module uses the SQL statement to be debugged as a parameter of the debugging task application and submits the debugging task in client interaction mode to a distributed computing engine (for example, the Spark SQL engine) for execution; the debugging task application writes the debugging result of the SQL statement into a specified storage space (for example, the distributed file system HDFS); and the debugging result is returned to the user through a query interface. The invention submits the debugging task using a relatively stable task submission mode, and the debugging task application writes the debugging result into the specified storage space so that the query interface can obtain it conveniently, which makes the debugging process easier to use, more stable and more efficient.

Based on this basic idea, the specific implementation of the present invention is described below with reference to the drawings and specific embodiments.

FIG. 1 is a schematic diagram of the system structure and the information interaction between modules when the SQL statement debugging method provided by the present invention is applied, according to an embodiment of the present invention. Based on the system structure of this embodiment, the SQL statement debugging method includes the following steps:

Step 101. The client debugging module sends an SQL debugging request to the server-side debugging module, where the SQL debugging request carries the SQL statement to be debugged.

In the embodiments of the present invention, the user may be provided with a client debugging module that integrates the SQL statement debugging function provided by the invention. The client debugging module may be a standalone executable program, or may be integrated as a component into an existing management client; the present invention places no limitation on this.

In an embodiment of the present invention, the distributed computing engine used by the big data environment to which the method is applied is the Spark SQL engine. The proposed solution is also applicable to other similar distributed computing engines, and the present invention places no specific limitation on this. Taking the Spark SQL engine as an example, it supports two ways of running: standard SQL statements and the Spark DataFrame API. In a preferred embodiment, the client debugging module provides the user with a standard SQL debugging interface, so the user only needs to enter standard SQL to initiate a debugging request through the client debugging module, which is friendlier to the user and makes debugging easier.

Step 102. The server-side debugging module receives the SQL debugging request and, under the YARN cluster management mode, submits a task containing the debugging task application to the YARN cluster resource manager in client mode; the parameters carried by the debugging task application include the SQL statement to be debugged.

Step 103. After the debugging task is scheduled to run, the debugging task application submits the SQL statement to be debugged to the Spark SQL engine for execution.

Spark supports three pluggable cluster management modes: Standalone, Mesos and YARN. Under the YARN cluster management mode, job submission through spark-submit in turn has two submission modes: yarn-client (client mode) and yarn-cluster (cluster mode). Of these, yarn-cluster is suited to production environments, while yarn-client is suited to interactive and debugging environments.

In this embodiment, the debugging task application is submitted to the YARN cluster resource manager in yarn-client mode, and YARN schedules the task. When the task is executed, the debugging task application submits the SQL statement to be debugged to the Spark SQL engine for execution; the execution result is returned to the debugging task application, which writes it into the specified HDFS directory.

The debugging task is submitted to the cluster in yarn-client mode, and each SQL statement to be debugged has its own application (Application), so the statements do not affect one another. By comparison, this is more stable than the Spark Thrift Server approach.
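
As a rough illustration of steps 102 and 103, the sketch below builds a `spark-submit` command line that submits one application per SQL statement to YARN in client deploy mode. It is written in Python and rests on assumptions not found in the patent: the JAR name `debug-task.jar`, the main class `com.example.SqlDebugTask`, and the convention that the SQL text and the result path are passed as the first two application arguments are all hypothetical. Only the `spark-submit` options `--master`, `--deploy-mode`, `--name` and `--class` are standard.

```python
import shlex


def build_submit_command(sql, dest_path,
                         app_jar="debug-task.jar",                # hypothetical JAR name
                         main_class="com.example.SqlDebugTask",   # hypothetical main class
                         app_name="sql-debug"):
    """Return the argument list for submitting one debugging task in yarn-client mode."""
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "client",   # client mode, as described for debugging use
        "--name", app_name,
        "--class", main_class,
        app_jar,
        sql,         # application argument 1: the SQL statement to be debugged
        dest_path,   # application argument 2: where the task writes its result
    ]


cmd = build_submit_command("SELECT id, age FROM users", "/debug/results/task-001")
# shlex.join quotes the SQL text so the line can be pasted into a shell.
command_line = shlex.join(cmd)
```

Passing the arguments as a list (for example to `subprocess.run`) avoids shell quoting problems with SQL text that contains quotes or spaces.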

Step 104. The debugging task application receives the execution result of the SQL statement to be debugged returned by the Spark SQL engine and, based on it, writes a debugging result that includes the execution result into the storage space at the specified storage location.

The storage space at the specified storage location may be a data file under a specified path of a distributed file system (for example, HDFS) or a data table of a specified database; the present invention places no specific limitation on this.

In a specific embodiment, the Spark SQL engine executes the submitted SQL statement to be debugged and produces a DataFrame containing the execution result of the SQL statement. The column information of the execution result is obtained with the DataFrame.columns method, the column data of the execution result is then converted into a specified text format with the DataFrame's to-text conversion method, and finally the column information and column data are packaged into the debugging result, which is stored in a data file under the specified HDFS path.

For example, DataFrame.toJSON.collectAsList().toString is used to take the result data out of the DataFrame that holds the execution result of the SQL statement and convert it into a JSON string of the following form:

[{\"id\"(field name 1):\"aaa\"(value 1),\"age\"(field name 2):\"1\"(value 2)}]

The column information and result data of the execution result of the SQL statement are exported in the specified format to a data file under the specified HDFS path. An example of the program statement used is:

sparkSession.createDataFrame(dataList, schema).write.mode(SaveMode.Append).text(params.destHdfsPath)

Here, dataList is the data list parameter containing the execution result data, schema is the data structure parameter, and destHdfsPath is the parameter giving the specified path, in the HDFS cluster, of the data file used to store the debugging result.
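
The packaging described in step 104 can be pictured with the following Python sketch, which uses only the standard library and a temporary local directory as a stand-in for the HDFS path. The result layout (`columns` plus `data`), the function names and the file name `part-0000.txt` are assumptions made for illustration; the patent does not fix a concrete format beyond the JSON row example above.

```python
import json
import os
import tempfile


def build_debug_result(columns, rows):
    """Package column information and row data into one JSON text line."""
    records = [dict(zip(columns, row)) for row in rows]
    return json.dumps({"columns": list(columns), "data": records}, ensure_ascii=False)


def append_result(dest_dir, file_name, result_line):
    """Append one result line to a data file, mirroring the append save mode."""
    os.makedirs(dest_dir, exist_ok=True)
    path = os.path.join(dest_dir, file_name)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(result_line + "\n")
    return path


# A temporary local directory stands in for the specified HDFS path.
with tempfile.TemporaryDirectory() as dest_dir:
    line = build_debug_result(["id", "age"], [("aaa", "1")])
    path = append_result(dest_dir, "part-0000.txt", line)
    with open(path, encoding="utf-8") as fh:
        stored = json.loads(fh.readline())
```

Keeping the column list separate from the rows preserves column order and lets a viewer show headers even when the query returns no rows.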

The specified HDFS path and the name of the data file can be preconfigured on the server and passed to the debugging task application as parameters. Each job can generate a unique subpath or data file name under the specified HDFS path.
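
One possible way to derive a per-job location is sketched below in Python. The directory layout `<base>/<task id>/result.txt` and the use of a random UUID as the task identifier are assumptions for illustration only.

```python
import posixpath
import uuid


def result_path(base_path, task_id=None):
    """Build a per-task result file path under the preconfigured base path."""
    # A random UUID keeps concurrent debugging tasks from colliding.
    task_id = task_id or uuid.uuid4().hex
    # posixpath is used because HDFS paths always use forward slashes.
    return posixpath.join(base_path, task_id, "result.txt")


fixed = result_path("/debug/results", "task-001")
first, second = result_path("/debug/results"), result_path("/debug/results")
```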

In another embodiment of the present invention, in addition to the SQL statement debugging interface described above, a debugging result viewing interface is provided so that the user can view the execution result of a debugged SQL statement promptly and conveniently through the client debugging module. FIG. 2 is a schematic diagram of the module interaction for viewing the debugging result of a debugged SQL statement through the debugging result viewing interface, according to an embodiment of the present invention.

Step 201. The client debugging module sends a debugging result viewing request to the server-side debugging module, where the viewing request carries identification information that uniquely locates the debugging result of the debugged SQL statement.

Through the client debugging module, the user can send to the server-side debugging module a debugging result query request for a particular debugged SQL statement; the request carries identification information that uniquely locates the debugging result of that statement. For example, the identification information may be a unique identifier generated from one or more of the following: the debugged SQL statement, the user name, the session identifier, the debugging time, and so on.
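
A minimal Python sketch of such an identifier follows. Hashing the four fields with SHA-256 and truncating to 16 hexadecimal characters is an illustrative choice, not something the patent specifies, and `make_result_id` is a hypothetical name.

```python
import hashlib


def make_result_id(sql, user, session_id, debug_time):
    """Derive a stable identifier from the fields that describe one debugging run."""
    # The unit separator keeps ("ab", "c") and ("a", "bc") from hashing identically.
    parts = [sql.strip(), user, session_id, debug_time]
    digest = hashlib.sha256("\x1f".join(parts).encode("utf-8"))
    return digest.hexdigest()[:16]


run_a = make_result_id("SELECT 1", "alice", "s-42", "2022-03-24T10:00:00")
run_b = make_result_id("SELECT 1", "alice", "s-42", "2022-03-24T10:05:00")
```

Because the identifier is computed from the request fields, the client and the server can each derive it without exchanging extra state, provided they agree on the field values.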

Step 202. After receiving the debugging result viewing request, the server-side debugging module obtains the debugging result from the specified HDFS path based on the identification information.

Using the identification information carried in the viewing request that corresponds to the debugged SQL statement, the server-side debugging module obtains the column information and result data of the execution result of the debugged SQL statement from the specified HDFS path, and returns the debugging result to the client debugging module.

Step 203. The server-side debugging module packages the debugging result and sends it to the client debugging module.

Step 204. The client debugging module receives and parses the debugging result packet and presents it to the user through the human-computer interaction interface.
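
Step 204 can be illustrated with a small Python sketch that parses a result packet and renders it as a text table. The packet layout (a JSON object with `columns` and `data`) is an assumption made for this example; the patent does not define the packet format.

```python
import json


def render_result(packet):
    """Parse a JSON result packet and format it as an aligned text table."""
    result = json.loads(packet)
    columns = result["columns"]
    rows = [[str(rec.get(col, "")) for col in columns] for rec in result["data"]]
    # Each column is as wide as its longest cell, header included.
    widths = [max([len(col)] + [len(row[i]) for row in rows])
              for i, col in enumerate(columns)]
    lines = [" | ".join(col.ljust(w) for col, w in zip(columns, widths))]
    lines.append("-+-".join("-" * w for w in widths))
    for row in rows:
        lines.append(" | ".join(cell.ljust(w) for cell, w in zip(row, widths)))
    return "\n".join(lines)


table = render_result('{"columns": ["id", "age"], "data": [{"id": "aaa", "age": "1"}]}')
```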

In another embodiment of the present invention, active push may be used instead, with the server-side debugging module actively returning the debugging result of the debugged SQL statement to the client debugging module. After submitting the debugging task to YARN, the server-side debugging module waits. Once the debugging task application has written the debugging result into HDFS, it sends a debugging task notification message to the server-side debugging module, and the notification message carries the storage path of the debugging result. The server-side debugging module obtains the debugging result from the storage path in the notification message, packages it, and returns it to the client debugging module.
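
The waiting and notification pattern of the push mode can be sketched in Python with a thread and a queue standing in for the debugging task application and its notification channel. The message fields `status` and `path`, the in-memory `store` dictionary that replaces HDFS, and the function names are all hypothetical.

```python
import queue
import threading


def debug_task(notifications, result_path):
    """Stand-in for the debugging task application: notify once the result is written."""
    notifications.put({"status": "finished", "path": result_path})


def wait_for_result(notifications, read_result, timeout=5.0):
    """Server side: block until the notification arrives, then read the result."""
    message = notifications.get(timeout=timeout)
    return read_result(message["path"])


# An in-memory dict stands in for the files stored under the HDFS path.
store = {"/debug/results/task-001/result.txt": '[{"id": "aaa"}]'}
notifications = queue.Queue()
worker = threading.Thread(
    target=debug_task,
    args=(notifications, "/debug/results/task-001/result.txt"),
)
worker.start()
pushed = wait_for_result(notifications, store.__getitem__)
worker.join()
```

The timeout keeps the server from waiting forever if a task fails before sending its notification; how failures are reported is not covered by this sketch.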

在开发或日常管理维护过程中,经常会遇到输入的SQL语句有语法错误的情况,因此需要对SQL语句进行语法检测,针对这种语法检测的需求,图3示例了本发明一实施例提供的对SQL语句进行语法检测的方法及模块交互示意图。In the process of development or daily management and maintenance, it is often encountered that the input SQL statement has grammatical errors, so it is necessary to perform grammar detection on the SQL statement. Schematic diagram of the method and module interaction for syntax detection of SQL statements.

Step 301. The client debugging module sends a syntax detection request to the server-side debugging module; the syntax detection request carries the SQL statement to be detected;

Step 302. The server-side debugging module calls the execution plan generation interface provided by the Spark SQL engine to obtain the execution plan of the SQL statement to be detected;

For example, the execution plan of the SQL statement to be debugged is obtained by calling the execution plan generation interface explain$sql provided by the Spark SQL engine.

Step 303. The server-side debugging module parses the execution plan returned by the Spark SQL engine and analyzes whether it contains a syntax error. If a syntax error exists, it sends the error information to the client debugging module; if no syntax error exists, it returns a response to the client debugging module indicating that the check found no errors.
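Steps 301 to 303 amount to asking the engine for a plan without running the statement, then reporting whatever error surfaces. The sketch below shows that idea in a form that runs without a cluster. It rests on two assumptions that are not in the patent: the standard-library `sqlite3` engine stands in for Spark SQL, and the error is detected as an exception raised by `EXPLAIN` rather than by parsing the returned plan text. The function name and the shape of the response dictionary are hypothetical.

```python
import sqlite3


def check_syntax(conn, sql):
    """Ask the engine for a plan only; report any error it raises."""
    statement = sql.strip().rstrip(";")
    try:
        # EXPLAIN compiles the statement but does not execute it.
        conn.execute("EXPLAIN " + statement).fetchall()
    except sqlite3.Error as exc:
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "error": None}


def demo():
    # sqlite3 is a stand-in for the Spark SQL engine (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t_user (id INTEGER, name TEXT)")
    good = check_syntax(conn, "SELECT * FROM t_user;")
    bad = check_syntax(conn, "SELEC * FROM t_user;")
    conn.close()
    return good, bad
```

In this sketch the misspelled `SELEC` statement comes back with `ok` set to `False` and the engine's message in `error`, which corresponds to the error information the server would forward to the client debugging module.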

Although this embodiment speaks of the SQL statement to be detected, those skilled in the art will understand that SQL statement debugging and SQL statement syntax checking can be combined. In that scenario, before the server-side debugging module submits the job to YARN with the SQL statement to be debugged as a parameter of the debugging task application, it can first run syntax detection on that statement. If syntax detection fails, the syntax detection result is fed back to the client debugging module. If syntax detection passes, the SQL statement to be debugged is used as a parameter of the debugging task application, the debugging task application and the statement are submitted to YARN together, and YARN schedules and executes the job. When syntax detection and debugging are combined in this way, the debugging request and the syntax detection request can be merged into one.

When debugging SQL statements, situations arise that consume large amounts of server and network resources, such as complex SQL statements, very large tables, or very large result sets. To avoid consuming excessive server computing and network resources during debugging, another embodiment of the present invention further improves the debugging process: before submitting the debugging task, the server-side debugging module preprocesses the SQL statement to be debugged. The purpose of preprocessing is to make the statement use as few server and network resources as possible when it executes. Two means are used: reducing the table data involved in the query, and reducing the amount of result data returned.

To support preprocessing, the client debugging module can provide the user with an interface for setting preprocessing parameters, which include a returned-record limit parameter. Based on this parameter, the server-side debugging module automatically adds a limit clause, which restricts the number of returned records, to the SQL statement to be debugged (a query statement, or a statement that includes a query clause). An example follows:

Select * from t_user limit 10;

Here, limit 10 means that the query returns only 10 data records.

With this preprocessing, the result data returned by the SQL statement to be debugged is trimmed: the number of records in the execution result is limited, and less data is written to the result data file under the specified HDFS path. This meets the needs of debugging while improving query efficiency and reducing cluster resource consumption.
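The limit-clause preprocessing described above can be sketched as a small string rewrite. This is a deliberately naive illustration, not the patent's implementation: the function name `add_limit` is hypothetical, only statements that begin with `select` are treated as queries (statements that merely contain a query clause are left untouched), and an existing trailing `limit N` is detected with a regular expression rather than a real SQL parser.

```python
import re

# Matches a "limit <number>" at the very end of the statement.
_TRAILING_LIMIT = re.compile(r"\blimit\s+\d+\s*$", re.IGNORECASE)
# Matches statements that start with the SELECT keyword.
_STARTS_WITH_SELECT = re.compile(r"select\b", re.IGNORECASE)


def add_limit(sql, max_rows):
    """Append 'limit <max_rows>' to a query that has no trailing limit yet."""
    statement = sql.strip()
    has_semicolon = statement.endswith(";")
    body = statement.rstrip(";").rstrip()
    is_query = _STARTS_WITH_SELECT.match(body) is not None
    if is_query and not _TRAILING_LIMIT.search(body):
        body = f"{body} limit {max_rows}"
    # Restore the terminating semicolon if the input had one.
    return body + (";" if has_semicolon else "")
```

For instance, `add_limit("Select * from t_user;", 10)` yields the statement shown in the example above, while a statement that already ends in a limit clause, or one that is not a query, is returned unchanged apart from whitespace trimming.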

Users can adjust the returned-record limit parameter to suit each individual debugging task, dynamically customizing resource allocation during debugging. This avoids task failures caused by insufficient resources, makes it easier to tune resource parameters according to SQL complexity and data volume, and improves the practicality and efficiency of debugging.

Because tasks submitted to the cluster take time to run, debugging jobs are executed asynchronously to improve the responsiveness of the SQL debugging function. While a task is executing, the user can run other jobs, monitor the status of the debugging job, and fetch the task's run log dynamically in real time (to learn whether an error occurred and where), performing other operations without waiting for the result to return. If the task succeeds, the viewing interface can be called promptly to view the debugging result; if the task fails, the cause of the failure can be found in the log.
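The asynchronous execution mode described above (submit, keep working, poll status, read the log, fetch the result later) can be sketched as follows. This is only an illustration: a thread pool stands in for the cluster, and the `DebugTask` class, its status strings, the log line format, and the `fake_engine` function are all hypothetical names introduced for the sketch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DebugTask:
    """Runs one SQL statement in the background and records a simple log."""

    def __init__(self, executor, run_sql, sql):
        self._log = []
        self._lock = threading.Lock()
        self._future = executor.submit(self._run, run_sql, sql)

    def _append(self, line):
        with self._lock:
            self._log.append(line)

    def _run(self, run_sql, sql):
        self._append(f"started: {sql}")
        try:
            result = run_sql(sql)
        except Exception as exc:
            self._append(f"failed: {exc}")
            raise
        self._append("finished")
        return result

    def status(self):
        """Non-blocking status check, usable while the task is still running."""
        if not self._future.done():
            return "RUNNING"
        return "FAILED" if self._future.exception() else "SUCCEEDED"

    def log(self):
        with self._lock:
            return list(self._log)

    def result(self, timeout=None):
        return self._future.result(timeout=timeout)


def demo():
    def fake_engine(sql):  # stand-in for the distributed computing engine
        if "boom" in sql:
            raise ValueError("table not found")
        return [[1, "alice"]]

    with ThreadPoolExecutor(max_workers=2) as pool:
        ok = DebugTask(pool, fake_engine, "select * from t_user limit 10")
        bad = DebugTask(pool, fake_engine, "select * from boom")
    # Leaving the with-block waits for both background tasks to finish.
    return ok.status(), ok.result(), bad.status(), bad.log()
```

Keeping the log separate from the result is what allows a failed task to be diagnosed: `status()` reports the failure and `log()` shows where it happened, even though no result was produced.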

The SQL statement debugging method provided by the present invention expands the Spark-based SQL statement debugging process into three main parts: syntax detection, SQL debugging execution (result verification), and result viewing (obtaining and analyzing the cause of execution errors). The core SQL debugging execution part writes the execution result data of the debugged SQL statements directly and uniformly to the Hadoop Distributed File System (HDFS). The user can conveniently initiate a request through the client debugging module, the server-side debugging module reads the content of the debugging result data file from HDFS, and the result can be presented to the user visually on the client, giving good interactivity and high stability.

Fig. 4 is a schematic structural diagram of an electronic device, provided by an embodiment of the present invention, for implementing the SQL statement debugging method provided by the embodiments of the present invention. The device 400 includes a processor 410 such as a central processing unit (CPU), a communication bus 420, a communication interface 440, and a storage medium 430. The processor 410 and the storage medium 430 can communicate with each other through the communication bus 420. A computer program is stored in the storage medium 430; when the computer program is executed by the processor 410, the functions of one or more steps of the SQL statement debugging method provided by the present invention are realized.

The storage medium may include random access memory (RAM) and may also include non-volatile memory (NVM), for example at least one disk storage device. The storage medium may also be at least one storage device located remotely from the aforementioned processor. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.

It should be appreciated that embodiments of the present invention may be realized or implemented by computer hardware, by a combination of hardware and software, or by computer instructions stored in non-transitory memory. The method may be implemented in a computer program using standard programming techniques, including a non-transitory storage medium configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner. Each program may be implemented in a high-level procedural or object-oriented programming language to communicate with a computer system. If desired, however, the program may be implemented in assembly or machine language. In any case, the language may be a compiled or an interpreted language. The program may also run on an application-specific integrated circuit programmed for this purpose. Furthermore, the operations of the processes described herein may be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The processes described herein (or variations and/or combinations thereof) may be performed under the control of one or more computer systems configured with executable instructions, and may be implemented as code (for example, executable instructions, one or more computer programs, or one or more applications) executing collectively on one or more processors, by hardware, or by a combination thereof. The computer program includes a plurality of instructions executable by one or more processors.

Further, the method may be implemented in any suitable type of computing platform to which it is operably connected, including but not limited to a personal computer, minicomputer, mainframe, workstation, networked or distributed computing environment, a standalone or integrated computer platform, or a platform in communication with charged-particle tools or other imaging devices. Aspects of the present invention may be implemented as machine-readable code stored on a non-transitory storage medium or device, whether removable or integrated into the computing platform, such as a hard disk, an optically readable and/or writable storage medium, RAM, ROM, and so on, such that it can be read by a programmable computer; when the storage medium or device is read by the computer, it can be used to configure and operate the computer to perform the processes described herein. In addition, the machine-readable code, or portions of it, may be transmitted over a wired or wireless network. The invention described herein includes these and other types of non-transitory computer-readable storage media when such media contain instructions or programs that, in conjunction with a microprocessor or other data processor, implement the steps described above. The present invention also includes the computer itself when programmed according to the methods and techniques described herein.

The above are merely embodiments of the present invention and are not intended to limit it. Those skilled in the art may make various modifications and variations to the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (11)

Translated from Chinese
1. An SQL statement debugging method, characterized in that the method is applied to a server-side debugging module and comprises:
receiving an SQL debugging request, the SQL debugging request carrying an SQL statement to be debugged;
submitting a task containing a debugging task application to a cluster resource manager in client mode, the parameters carried by the debugging task application including the SQL statement to be debugged;
wherein the debugging task application is used to submit the SQL statement to be debugged to a distributed computing engine for execution, to receive the execution result of the SQL statement to be debugged fed back by the distributed computing engine, to generate a debugging result from the execution result, and to write the debugging result into the storage space at a specified storage location.

2. The method according to claim 1, characterized in that the method further comprises:
based on a debugging result viewing request, obtaining the debugging result of the SQL statement to be debugged stored in the storage space at the specified storage location, and sending the debugging result to a client debugging module; or,
after the debugging result is generated, actively sending the debugging result to the client debugging module.

3. The method according to claim 1, characterized in that the method further comprises:
receiving an SQL statement syntax detection request, the syntax detection request carrying an SQL statement to be detected;
calling an execution plan generation interface provided by the distributed computing engine to obtain the execution plan of the SQL statement to be detected;
parsing the returned execution plan and analyzing whether a syntax error exists, and, if a syntax error exists, feeding error information back to the client debugging module.

4. The method according to claim 1, characterized in that, before the task containing the debugging task application is submitted to the cluster resource manager, the method further comprises:
calling an execution plan generation interface provided by the distributed computing engine to obtain the execution plan of the SQL statement to be debugged;
parsing the returned execution plan and analyzing whether a syntax error exists;
if a syntax error exists, feeding error information back to the client debugging module;
if no syntax error exists, using the SQL statement to be debugged as a parameter of the debugging task application and submitting the debugging task application together with the SQL statement to be debugged to the cluster resource manager.

5. The method according to claim 1, characterized in that, before the task containing the debugging task application is submitted to the cluster resource manager, the method further comprises the following preprocessing step:
based on preprocessing parameters, adding to the SQL statement to be debugged a limit clause that restricts the number of returned records.

6. An SQL statement debugging apparatus, characterized in that the apparatus comprises:
a transceiver module, configured to receive an SQL debugging request, the SQL debugging request carrying an SQL statement to be debugged;
a submission module, configured to submit a task containing a debugging task application to a cluster resource manager in client mode, the parameters carried by the debugging task application including the SQL statement to be debugged, the debugging task application being used to submit the SQL statement to be debugged to a distributed computing engine for execution;
a result processing module, configured to receive the execution result of the SQL statement to be debugged fed back by the distributed computing engine, generate a debugging result from the execution result, and write the debugging result into the storage space at a specified storage location.

7. The apparatus according to claim 6, characterized in that the apparatus further comprises:
a viewing interface module, configured to obtain, based on a debugging result viewing request (forwarded by the transceiver module), the debugging result of the SQL statement to be debugged stored in the storage space at the specified storage location, and to send the debugging result to a client debugging module through the transceiver module; or configured to actively send the debugging result to the client debugging module through the transceiver module after the debugging result is generated.

8. The apparatus according to claim 6, characterized in that the apparatus further comprises:
a syntax detection interface module, configured to call, based on an SQL statement syntax detection request, an execution plan generation interface provided by the distributed computing engine to obtain the execution plan of the SQL statement to be detected, to parse the returned execution plan, and to analyze whether a syntax error exists;
wherein the transceiver module is further configured to receive the SQL statement syntax detection request, the syntax detection request carrying the SQL statement to be detected, and, when the execution plan contains a syntax error, to feed error information back to the client debugging module.

9. The apparatus according to claim 8, characterized in that:
before the submission module submits the task containing the debugging task application to the cluster resource manager, the syntax detection interface module is called to obtain the execution plan of the SQL statement to be debugged;
when the execution plan of the SQL statement to be debugged contains a syntax error, error information is fed back to the client debugging module through the transceiver module;
when the execution plan of the SQL statement to be debugged contains no syntax error, the SQL statement to be debugged is used as a parameter of the debugging task application, and the debugging task application and the SQL statement to be debugged are submitted together to the cluster resource manager.

10. The apparatus according to claim 6, characterized in that the apparatus further comprises:
a preprocessing module, configured to add, based on preprocessing parameters, a limit clause that restricts the number of returned records to the SQL statement to be debugged before the task containing the debugging task application is submitted to the cluster resource manager.

11. An electronic device, characterized by comprising a processor, a communication interface, a storage medium, and a communication bus, wherein the processor, the communication interface, and the storage medium communicate with one another through the communication bus;
the storage medium is configured to store a computer program;
the processor is configured to implement the method according to any one of claims 1 to 5 when executing the computer program stored on the storage medium.
CN202210297316.6A | 2022-03-24 | 2022-03-24 | SQL statement debugging method, device and equipment | Pending | CN114741279A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202210297316.6A, CN114741279A (en) | 2022-03-24 | 2022-03-24 | SQL statement debugging method, device and equipment

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202210297316.6A, CN114741279A (en) | 2022-03-24 | 2022-03-24 | SQL statement debugging method, device and equipment

Publications (1)

Publication Number | Publication Date
CN114741279A (en) | 2022-07-12

Family

ID=82276474

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202210297316.6A (Pending, CN114741279A (en)) | SQL statement debugging method, device and equipment | 2022-03-24 | 2022-03-24

Country Status (1)

Country | Link
CN (1) | CN114741279A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN117171194A (en)* | 2023-08-25 | 2023-12-05 | 浪潮软件科技有限公司 | SQL query engine based on JDBC
WO2025039250A1 (en)* | 2023-08-22 | 2025-02-27 | 深圳计算科学研究院 | Lightweight pl/sql debugger implementation method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN110209422A (en)* | 2018-05-09 | 2019-09-06 | 腾讯科技(深圳)有限公司 | A kind of method for processing business, computer equipment and client
CN111581088A (en)* | 2020-04-29 | 2020-08-25 | 上海中通吉网络技术有限公司 | Spark-based SQL program debugging method, device, device and storage medium
CN113504904A (en)* | 2021-07-26 | 2021-10-15 | 中国平安人寿保险股份有限公司 | User-defined function implementation method and device, computer equipment and storage medium
CN113641572A (en)* | 2021-07-02 | 2021-11-12 | 多点生活(成都)科技有限公司 | Massive big data calculation development debugging method based on SQL
WO2021259367A1 (en)* | 2020-06-24 | 2021-12-30 | 中兴通讯股份有限公司 | Sql unification method, system, and device, and medium
US20220027258A1 (en)* | 2020-07-24 | 2022-01-27 | Paypal, Inc. | Online query execution using a big data framework

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
