Background
As the business evolves, the volume of table data keeps growing. Among these tables, a large number need to be deleted, either for business reasons or because the tables themselves no longer have any value; an example is a table whose underlying data has been archived while its metadata has not been deleted. The tables that need to be deleted are mostly external partitioned tables. When an internal table is deleted, both its metadata and the corresponding folders and files in HDFS are deleted. When an external table is deleted, only the metadata is deleted, and the corresponding folders and files in HDFS are not.
In practice, many tables are designed with complex partitioning for business reasons, so the number of partition directories is large, and a single table can have tens of thousands of partitions. When a drop operation is performed on a table with that many partitions, the underlying operation must delete the metadata of every partition. Hive metadata is generally stored in MySQL, and deleting partition metadata requires deleting data across several associated metadata tables. Deleting tens of thousands of partitions concurrently can easily cause a sudden increase in load on the MySQL service, which degrades performance and can leave the drop table process blocked without returning a result. Consequently, a drop operation cannot practically be performed directly on a table that has too many partitions.
In the prior art, the partitions of such tables are basically deleted manually. Specifically, a request to delete the table is submitted to the data warehouse operation and maintenance personnel, who then delete the partitions of the table. To do so, they obtain the partition information of the table by performing a show partitions operation on the hive client, and splice together the partition deletion commands from that information. When the deletion commands are executed manually, several hive clients need to be started and run as daemon processes. If each table has tens of thousands of partitions, the total number of partitions across the tables can reach hundreds of thousands, so deleting the partitions is very inconvenient, takes a long time, and is very labor-intensive.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for deleting a table. After a table deletion request is received, the partition information of the table to be deleted is sent to a message queue, and the partitions are deleted automatically by processing the messages in the message queue. Table deletion is thus handled automatically and efficiently, which addresses the large labor and time costs incurred in the prior art.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a method of deleting a table.
The method for deleting a table according to an embodiment of the invention comprises the following steps: acquiring a partition information set of a table to be deleted according to a received table deletion request, the partition information set comprising identification information of all partitions of the table to be deleted; sending the partition information set to a message queue; and processing the messages in the message queue to complete the deletion of all partitions of the table to be deleted, wherein, for each message processed from the message queue, the message is parsed to acquire the identification information of the corresponding partition to be deleted, and a deletion operation is executed according to that identification information.
Optionally, before executing the deletion operation according to the identification information of the partition to be deleted, the method further includes: confirming, according to the identification information of the partition to be deleted, that the table to be deleted exists in the database and that no deletion operation has yet been executed on the partition to be deleted.
Optionally, after executing the deletion operation according to the identification information of the partition to be deleted, the method further includes: sending execution result data to the message queue; and periodically collecting statistics on the execution result data corresponding to the table to be deleted in the message queue, and displaying the statistical result.
Optionally, the message queue is a RabbitMQ queue; and/or, the identification information includes at least one of: partition name, name of the database where it is located, or applicant information.
To achieve the above object, according to another aspect of an embodiment of the present invention, there is provided an apparatus for deleting a table.
The apparatus for deleting a table according to an embodiment of the invention comprises:
an information acquisition module, configured to acquire a partition information set of a table to be deleted according to a received table deletion request, the partition information set comprising identification information of all partitions of the table to be deleted;
a message queue sending module, configured to send the partition information set to a message queue;
a partition deleting module, configured to process the messages in the message queue and complete the deletion of all partitions of the table to be deleted, wherein, for each message processed from the message queue, the message is parsed to acquire the identification information of the corresponding partition to be deleted, and a deletion operation is executed according to that identification information.
Optionally, the apparatus for deleting a table in the embodiment of the present invention further includes a checking module, configured to confirm, according to the identification information of the partition to be deleted, that the table to be deleted exists in the database and that no deletion operation has yet been executed on the partition to be deleted.
Optionally, the apparatus for deleting a table in the embodiment of the present invention further includes a monitoring module, configured to send execution result data to the message queue, periodically collect statistics on the execution result data corresponding to the table to be deleted in the message queue, and display the statistical result.
Optionally, the message queue is a RabbitMQ queue; and/or, the identification information includes at least one of: partition name, name of the database where it is located, or applicant information.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided an electronic apparatus.
The electronic device of the embodiment of the invention comprises: one or more processors; a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method of deleting a table of any of the above.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided a computer-readable medium having a computer program stored thereon, wherein the computer program is configured to implement any one of the above methods for deleting a table when executed by a processor.
One embodiment of the above invention has the following advantages or benefits: after a table deletion request is received, the partition information of the table to be deleted is sent to the message queue, and the partitions are deleted automatically by processing the messages in the message queue. Table deletion is thus handled automatically and efficiently, which addresses the large labor and time costs incurred in the prior art.
Further effects of the optional implementations above are described below in connection with the specific embodiments.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of the main flow of a method for deleting a table according to an embodiment of the present invention. As shown in fig. 1, the method mainly includes:
Step S101: Acquire a partition information set of the table to be deleted according to the received table deletion request. The partition information set includes the identification information of all partitions of the table to be deleted, that is, the partition information of a plurality of partitions. The identification information of a partition is information that uniquely identifies the partition, and may include a partition name, the name of the database where the partition is located, or applicant information.
Step S102: Send the partition information set to a message queue. In the embodiment of the invention, the message queue ensures the ordering and reliability of message processing.
Step S103: Process the messages in the message queue to complete the deletion of all partitions of the table to be deleted. For each message processed from the message queue, the message is parsed to acquire the identification information of the corresponding partition to be deleted, and a deletion operation is executed according to that identification information.
According to the embodiment of the invention, after the table deletion request is received, the partition information of the table to be deleted is sent to the message queue, and the partitions are deleted automatically by processing the messages in the message queue. Table deletion is thus handled automatically and efficiently, which addresses the large labor and time costs incurred in the prior art.
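Steps S101 to S103 can be illustrated with a minimal sketch. It assumes one JSON message per partition, uses Python's in-process `queue.Queue` as a stand-in for the message queue, and models the metadata store as a dictionary; the names `enqueue_partitions`, `consume`, and `metastore` are hypothetical and do not come from the embodiment.

```python
import json
import queue


def enqueue_partitions(mq, database, table, partitions, applicant):
    """S101-S102: put one message per partition on the queue."""
    for name in partitions:
        mq.put(json.dumps({
            "database": database,
            "table": table,
            "partition": name,
            "applicant": applicant,
        }))


def consume(mq, delete_partition):
    """S103: drain the queue, parse each message, delete its partition."""
    deleted = []
    while True:
        try:
            body = mq.get_nowait()
        except queue.Empty:
            break
        info = json.loads(body)
        delete_partition(info["database"], info["table"], info["partition"])
        deleted.append(info["partition"])
    return deleted


# Hypothetical in-memory "metastore": (database, table) -> partition names.
metastore = {("dw", "orders"): {"dt=2020-01-01", "dt=2020-01-02"}}


def delete_partition(database, table, partition):
    """Placeholder for the real deletion against the metadata store."""
    metastore[(database, table)].discard(partition)


mq = queue.Queue()  # stand-in for the message queue
enqueue_partitions(mq, "dw", "orders",
                   sorted(metastore[("dw", "orders")]), "alice")
deleted = consume(mq, delete_partition)
```

Because the producer and the consumer only share the queue, the request side returns as soon as the messages are enqueued, and the deletions proceed at the consumer's pace.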
Fig. 2 is a schematic diagram of a method for deleting a table according to an embodiment of the present invention. As shown in fig. 2, the method includes:
Step S201: Acquire a partition information set of the table to be deleted according to the received table deletion request. The table deletion request is parsed, and the corresponding partition information set can be obtained directly from the database according to the parsing result.
Step S202: Send the partition information set of the table to be deleted to a message queue. With a table partition deletion service deployed, an external interface is triggered automatically after the applicant submits a table deletion request, and the partition information set of the table to be deleted is sent to the message queue through that interface. The partition information in the set mainly includes a partition name, a table name, the name of the database where the partition is located, or applicant information. This is done automatically without manual processing, which removes the need in the prior art to contact operation and maintenance personnel (by mail or telephone) to pass on the partition information of the table to be deleted.
The partition information submitted by the applicant is sent to and stored in the message queue, and the messages in the message queue are then consumed by a purpose-built consumer program.
Step S203: Process the messages in the message queue and complete the deletion of all partitions of the table to be deleted. For example, the messages in the message queue may be pulled by the consumer program. For each message processed from the message queue, the message is parsed to acquire the identification information of the corresponding partition to be deleted, and a deletion operation is executed according to that identification information.
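The message carried through the queue in steps S202 and S203 can be sketched as a small record with the fields listed above. The embodiment does not specify a wire format; the JSON encoding and the class name `PartitionMessage` below are assumptions made for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PartitionMessage:
    """Identification information for one partition to be deleted."""
    database: str
    table: str
    partition: str
    applicant: str

    def encode(self) -> bytes:
        # Serialize to UTF-8 JSON so any consumer can parse the message.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @staticmethod
    def decode(body: bytes) -> "PartitionMessage":
        # Parse a message body back into its identification fields.
        return PartitionMessage(**json.loads(body.decode("utf-8")))


msg = PartitionMessage("dw", "orders", "dt=2020-01-01", "alice")
body = msg.encode()
```

The external interface would call `encode` once per partition before sending, and the consumer program would call `decode` on each pulled message before deciding what to delete.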
Fig. 3 is a schematic diagram of a method for deleting a table according to an embodiment of the present invention. As shown in fig. 3, the method includes:
Step S301: Acquire a partition information set of the table to be deleted according to the received table deletion request.
Step S302: Send the partition information set to a message queue.
Step S303: Process the messages in the message queue. For example, the messages in the message queue may be pulled by a consumer program.
Step S304: Parse the current message and acquire the identification information of the partition to be deleted corresponding to the current message.
Step S305: Confirm, according to the identification information of the partition to be deleted corresponding to the current message, that the table to be deleted exists in the database and that no deletion operation has yet been executed on the partition to be deleted. The message content is parsed to obtain partition information such as the table name and the submitter's information, and an automatic verification is performed, for example, whether the table exists and whether a partition deletion for the table has already been submitted and executed. If the table to be deleted exists in the database and no deletion operation has been executed on the partition to be deleted, the verification succeeds and step S306 is performed; otherwise, the verification fails, the deletion request is rejected, and processing state information is submitted to the message queue, for example, indicating that the deletion operation cannot be executed because the verification failed.
Step S306: Execute the deletion operation according to the identification information of the partition to be deleted corresponding to the current message. After the table name is obtained by parsing, the partition information of the table is first obtained, and the partitions are then deleted using multiple threads.
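The verification in step S305 and the multithreaded deletion in step S306 can be sketched as follows. This is an illustration under stated assumptions: the metadata database is modelled as an in-memory dictionary, "already submitted" is tracked in a set, and `verify`, `drop_partition`, and `delete_all_partitions` are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the metadata database.
existing = {("dw", "orders"): {"dt=2020-01-01", "dt=2020-01-02", "dt=2020-01-03"}}
in_progress = set()  # tables whose partition deletion has already been accepted
lock = threading.Lock()


def verify(database, table):
    """S305: accept only if the table exists and was not already submitted."""
    key = (database, table)
    with lock:
        if key not in existing or key in in_progress:
            return False
        in_progress.add(key)
        return True


def drop_partition(database, table, partition):
    """Placeholder for the real deletion; removes the partition from the stand-in."""
    with lock:
        existing[(database, table)].discard(partition)
    return partition


def delete_all_partitions(database, table, workers=4):
    """S306: fetch the table's partitions, then delete them with a thread pool."""
    with lock:
        partitions = sorted(existing[(database, table)])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: drop_partition(database, table, p),
                             partitions))


first = verify("dw", "orders")
dropped = delete_all_partitions("dw", "orders") if first else []
second = verify("dw", "orders")          # duplicate request is rejected
missing = verify("dw", "no_such_table")  # unknown table is rejected
```

Bounding the pool with `workers` limits how many deletions reach the metadata store at once, which matters given the MySQL load concern described in the background.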
According to the embodiment of the invention, only one request needs to be submitted for a table with many partitions, the partitions of the table can be deleted quickly and efficiently, concurrent processing of multiple tables is well supported, and a large amount of staff time is saved. Under the same conditions, compared with the prior art, the technical scheme of the embodiment of the invention can reduce the average staff time spent from more than 30 minutes to less than 1 minute.
Fig. 4 is a schematic diagram of a method for deleting a table according to an embodiment of the present invention. As shown in fig. 4, the method includes:
Step S401: Acquire a partition information set of the table to be deleted according to the received table deletion request.
Step S402: Send the partition information set to a message queue.
Step S403: Process the messages in the message queue and complete the deletion of all partitions of the table to be deleted. For example, the messages in the message queue may be pulled by a consumer program. For each message processed from the message queue, the message is parsed to acquire the identification information of the corresponding partition to be deleted, and a deletion operation is executed according to that identification information.
Step S404: Send the execution result data to the message queue.
Step S405: Periodically collect statistics on the execution result data corresponding to the table to be deleted in the message queue, and display the statistical result.
The embodiment of the invention deletes table partitions quickly and efficiently, and allows the deletion status of each table to be monitored in real time from the running state, which makes it easier to troubleshoot and resolve problems encountered during deletion in a timely manner. Information such as how many tables are being deleted, which deletions succeeded, which failed, and which remain can be monitored through a page. The embodiment of the invention makes table partition deletion process-driven, automated, and visualized, greatly reduces the working time of operation and maintenance personnel, and allows the progress of the partition deletion operation to be monitored in real time.
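The statistics step S405 can be sketched as a pure aggregation over execution result records. The record fields (`table`, `status`) and the status values are assumptions; the embodiment only states that results are counted periodically and displayed, so scheduling and the display page are left out.

```python
from collections import Counter, defaultdict


def summarize(results, totals):
    """Count successes and failures per table and derive what remains.

    results: iterable of {"table": ..., "status": "success" | "failed"}
    totals:  mapping of table name -> number of partitions submitted
    """
    stats = defaultdict(Counter)
    for record in results:
        stats[record["table"]][record["status"]] += 1

    report = {}
    for table, total in totals.items():
        ok = stats[table]["success"]
        failed = stats[table]["failed"]
        report[table] = {
            "success": ok,
            "failed": failed,
            "remaining": total - ok - failed,
        }
    return report


# Hypothetical execution result data read back from the queue.
results = [
    {"table": "orders", "status": "success"},
    {"table": "orders", "status": "success"},
    {"table": "orders", "status": "failed"},
    {"table": "clicks", "status": "success"},
]
report = summarize(results, {"orders": 3, "clicks": 2})
```

A scheduler would call `summarize` at a fixed interval and hand `report` to the monitoring page.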
Fig. 5 is a schematic diagram of a prior-art method of deleting a table; Fig. 6 is a schematic diagram of a method of deleting a table according to an embodiment of the present invention.
To save the storage resources of the warehouse, tables need to be deleted, either for business reasons or because they are worthless in themselves. For some tables, the underlying data has already been removed through operations such as cold-data archiving or deletion, and only the metadata information remains. In the prior art, tables are deleted manually, because table deletions used to be infrequent and tables had few partitions at the time. Now, because the data volume is too large and occupies too much space, many useless tables need to be deleted. For the reasons given above, the partitions must be deleted first: a show partitions operation is performed to find the partition information, and the partitions are then deleted according to that information.
In the process of deleting a table, the data is generally moved to a recycle bin first, and a drop operation is then performed on the table (a drop removes an existing table from a database, or an existing index from a table; the table structure and all data are deleted, and the space occupied by the table is completely released). For a table with too many partitions, the drop operation cannot be performed directly, and the partitions must be deleted first, that is, an alter table T drop partition operation is performed. As shown in fig. 5, after a table deletion request submitted by a service party is received, the corresponding partition information is obtained, and partition deletion commands are executed manually until all partitions are deleted. Specifically, each service party needs to delete some tables according to its own business situation but lacks the permission to do so, and therefore has to submit a table deletion request to the data warehouse operation and maintenance personnel. If a table does not have many partitions, it can be deleted directly. If it has many partitions, it cannot be deleted directly, and the partitions must be deleted first. The partition information of the table is obtained mainly by performing a show partitions operation on the hive client. The partition deletion commands for the table are spliced together from the acquired partition information and then executed on the hive client.
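The command splicing described above can be sketched as a function that turns partition names, in the `key=value/key=value` form printed by show partitions, into `ALTER TABLE ... DROP PARTITION` statements. The sketch assumes simple partition values that contain no quotes, slashes, or encoded characters; the function names and the sample table are hypothetical.

```python
def partition_spec(partition_name):
    """Turn 'dt=2020-01-01/hour=00' into "dt='2020-01-01', hour='00'"."""
    parts = []
    for item in partition_name.split("/"):
        key, _, value = item.partition("=")
        parts.append(f"{key}='{value}'")
    return ", ".join(parts)


def drop_statements(database, table, partition_names):
    """Build one DROP PARTITION statement per partition name."""
    return [
        f"ALTER TABLE {database}.{table} "
        f"DROP IF EXISTS PARTITION ({partition_spec(name)});"
        for name in partition_names
    ]


statements = drop_statements(
    "dw", "orders",
    ["dt=2020-01-01/hour=00", "dt=2020-01-01/hour=01"],
)
```

Emitting one statement per partition keeps each metadata transaction small, at the cost of more round trips than a single statement covering many partitions.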
When the number of partitions reaches tens of thousands, manual deletion as in the prior art is very labor-intensive, and the operator has to wait for all partitions to be deleted. The partition deletion commands are executed manually, which requires starting several hive clients and running them as daemon processes. If each table has tens of thousands of partitions, the total number of partitions across the tables can reach hundreds of thousands, and this approach is very inconvenient for deleting them. In addition, because the deletion runs in background programs, it is difficult to monitor.
As shown in fig. 6, the method for deleting a table according to the embodiment of the present invention includes:
Step S601: Acquire a partition information set of the table to be deleted according to the received table deletion request.
Step S602: Send the partition information in batches to a RabbitMQ queue. RabbitMQ is open-source message broker software (also known as message-oriented middleware) that implements the Advanced Message Queuing Protocol (AMQP). The RabbitMQ server is written in Erlang, and its clustering and failover are built on the Open Telecom Platform (OTP) framework. Client libraries for communicating with the broker are available for all major programming languages. With RabbitMQ serving as the message queue service in a high-availability deployment, the ordering and reliability of message consumption can be better ensured.
Step S603: The consumer program pulls a message from the RabbitMQ queue and parses it to obtain the corresponding partition information.
Step S604: Verify the acquired partition information. If the verification fails, go to step S607; otherwise, go to step S605.
Step S605: Execute the deletion operation on the partition, and return the execution result data to the RabbitMQ queue.
Step S606: Confirm that the table was deleted successfully.
Step S607: Return the verification failure result to the RabbitMQ queue.
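Steps S603 to S607 can be sketched with the `pika` client for RabbitMQ. This is an illustration, not the embodiment's implementation: the queue names `partition.delete` and `partition.result`, the use of a separate result queue, the message fields, and the callbacks `table_exists` and `drop_partition` are all assumptions. The message handling is kept in a plain function so that it can be exercised without a broker; step S606 is not covered.

```python
import json


def handle(body, table_exists, drop_partition):
    """S603-S607: parse one message, verify it, delete, and build the result."""
    info = json.loads(body)                                        # S603
    if not table_exists(info["database"], info["table"]):          # S604
        return {**info, "status": "rejected",
                "reason": "verification failed"}                   # S607
    try:
        drop_partition(info["database"], info["table"],
                       info["partition"])                          # S605
    except Exception as exc:
        return {**info, "status": "failed", "reason": str(exc)}
    return {**info, "status": "success"}


def run_consumer(table_exists, drop_partition, host="localhost",
                 requests="partition.delete", results="partition.result"):
    """Wire handle() to RabbitMQ; needs the pika package and a running broker."""
    import pika  # imported here so handle() is usable without pika installed

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=requests, durable=True)
    channel.queue_declare(queue=results, durable=True)
    channel.basic_qos(prefetch_count=1)  # one unacknowledged message at a time

    def on_message(ch, method, properties, body):
        result = handle(body, table_exists, drop_partition)
        ch.basic_publish(exchange="", routing_key=results,
                         body=json.dumps(result).encode("utf-8"))
        ch.basic_ack(delivery_tag=method.delivery_tag)  # ack after the result is sent

    channel.basic_consume(queue=requests, on_message_callback=on_message)
    channel.start_consuming()
```

Acknowledging only after the result has been published means a message is redelivered if the consumer dies mid-deletion, so `drop_partition` should tolerate being called twice for the same partition.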
The embodiment of the invention deletes table partitions quickly and efficiently, and allows the deletion status of each table to be monitored in real time from the running state, which makes it easier to troubleshoot and resolve problems encountered during deletion in a timely manner. Information such as how many tables are being deleted, which deletions succeeded, which failed, and which remain can be monitored through a page. The embodiment of the invention makes table partition deletion process-driven, automated, and visualized, greatly reduces the working time of operation and maintenance personnel, and allows the progress of the partition deletion operation to be monitored in real time.
Fig. 7 is a schematic diagram of the main modules of an apparatus for deleting a table according to an embodiment of the present invention. As shown in fig. 7, an apparatus 700 for deleting a table according to an embodiment of the present invention includes an information obtaining module 701, a message queue sending module 702, and a partition deleting module 703.
The information obtaining module 701 is configured to obtain a partition information set of a table to be deleted according to the received table deletion request; the partition information set includes identification information of all partitions of the table to be deleted.
The message queue sending module 702 is configured to send the partition information set to a message queue. The message queue is a RabbitMQ queue; and/or, the identification information includes at least one of: partition name, name of the database where the partition is located, or applicant information.
The partition deleting module 703 is configured to process the messages in the message queue and complete the deletion of all partitions of the table to be deleted. For each message processed from the message queue, the message is parsed to acquire the identification information of the corresponding partition to be deleted, and a deletion operation is executed according to that identification information.
The apparatus for deleting a table in the embodiment of the invention further comprises a checking module. Before the partition deleting module executes the deletion operation according to the identification information of the partition to be deleted, the checking module confirms, according to that identification information, that the table to be deleted exists in the database and that no deletion operation has yet been executed on the partition to be deleted.
The apparatus for deleting a table further comprises a monitoring module. After the partition deleting module executes the deletion operation according to the identification information of the partition to be deleted, the monitoring module sends execution result data to the message queue, periodically collects statistics on the execution result data corresponding to the table to be deleted in the message queue, and displays the statistical result.
The embodiment of the invention deletes table partitions quickly and efficiently, and allows the deletion status of each table to be monitored in real time from the running state, which makes it easier to troubleshoot and resolve problems encountered during deletion in a timely manner. Information such as how many tables are being deleted, which deletions succeeded, which failed, and which remain can be monitored through a page. The embodiment of the invention makes table partition deletion process-driven, automated, and visualized, greatly reduces the working time of operation and maintenance personnel, and allows the progress of the partition deletion operation to be monitored in real time.
Fig. 8 illustrates an exemplary system architecture 800 to which the method of deleting a table or the apparatus for deleting a table of the embodiments of the present invention may be applied.
As shown in fig. 8, the system architecture 800 may include terminal devices 801, 802, 803, a network 804, and a server 805. The network 804 serves to provide a medium for communication links between the terminal devices 801, 802, 803 and the server 805. The network 804 may include various types of connections, such as wired links, wireless communication links, or fiber optic cables.
A user may use the terminal devices 801, 802, 803 to interact with the server 805 over the network 804 to receive or send messages or the like. The terminal devices 801, 802, 803 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 801, 802, 803 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.
The server 805 may be a server that provides various services, such as a back-end management server (for example only) that supports shopping websites browsed by users using the terminal devices 801, 802, 803. The back-end management server can analyze and process received data such as a product information query request, and feed the processing result back to the terminal device.
It should be noted that the method for deleting a table provided by the embodiment of the present invention is generally performed by the server 805, and accordingly, the apparatus for deleting a table is generally disposed in the server 805.
It should be understood that the number of terminal devices, networks, and servers in fig. 8 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to fig. 9, shown is a block diagram of a computer system 900 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 9, the computer system 900 includes a Central Processing Unit (CPU) 901 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. The RAM 903 also stores various programs and data necessary for the operation of the system 900. The CPU 901, ROM 902, and RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the Internet. A drive 910 is also connected to the I/O interface 905 as necessary. A removable medium 911, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 910 as necessary, so that a computer program read from it can be installed into the storage section 908 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 909, and/or installed from the removable medium 911. The above-described functions defined in the system of the present invention are executed when the computer program is executed by the Central Processing Unit (CPU) 901.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including an information acquisition module, a message queue sending module, and a partition deletion module. In some cases the names of these modules do not limit the modules themselves; for example, the information acquisition module may also be described as a "module that acquires a partition information set of a table to be deleted according to a received table deletion request".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the device described in the above embodiments, or may exist separately without being incorporated into the device. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: acquiring a partition information set of a table to be deleted according to a received table deletion request, wherein the partition information set comprises identification information of all partitions of the table to be deleted; sending the partition information set to a message queue; and processing the messages in the message queue to complete the deletion operation for all the partitions of the table to be deleted, wherein, for each message processed from the message queue, the message is parsed to acquire the identification information of the partition to be deleted corresponding to that message, and the deletion operation is executed according to that identification information.
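The steps carried by the program can be illustrated with a minimal Python sketch. It is not the implementation of the embodiments: it assumes an in-process `queue.Queue` standing in for the message queue, one JSON message per partition, partition identifiers in the `key=value/key=value` form printed by show partitions, and a caller-supplied `execute` callback in place of a real hive client. The function names `partition_spec`, `enqueue_partitions`, `consume`, and `delete_table_partitions` are hypothetical, and quoting every partition value as a string is likewise an assumption.

```python
import json
import queue
import threading


def partition_spec(partition_id):
    """Turn 'dt=2020-01-01/hour=00' into "dt='2020-01-01', hour='00'"."""
    pairs = (part.split("=", 1) for part in partition_id.split("/"))
    return ", ".join(f"{key}='{value}'" for key, value in pairs)


def enqueue_partitions(table, partition_ids, mq):
    """Send one message per partition of the table to be deleted."""
    for partition_id in partition_ids:
        mq.put(json.dumps({"table": table, "partition": partition_id}))


def consume(mq, execute):
    """Parse each message and run the delete for the partition it names."""
    while True:
        message = mq.get()
        if message is None:  # sentinel: no more work for this worker
            mq.task_done()
            return
        body = json.loads(message)
        statement = (
            f"ALTER TABLE {body['table']} DROP IF EXISTS "
            f"PARTITION ({partition_spec(body['partition'])})"
        )
        execute(statement)  # hypothetical hook; a real system would call a hive client
        mq.task_done()


def delete_table_partitions(table, partition_ids, execute, workers=2):
    """Queue every partition, then drain the queue with a bounded worker pool."""
    mq = queue.Queue()
    enqueue_partitions(table, partition_ids, mq)
    threads = [
        threading.Thread(target=consume, args=(mq, execute))
        for _ in range(workers)
    ]
    for _ in threads:
        mq.put(None)  # one sentinel per worker, queued after all partitions
    for thread in threads:
        thread.start()
    mq.join()
    for thread in threads:
        thread.join()
```

Bounding the number of workers is one way to limit how many partition deletions reach the metadata store at once; the value `workers=2` is arbitrary and would need tuning against the actual mysql load.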
According to the embodiment of the invention, after the table deletion request is received, the partition information of the table to be deleted is sent to the message queue, and the operation of deleting the table partitions is executed automatically by processing the messages in the message queue. Table deletion is thereby handled automatically and efficiently, which solves the problem in the prior art that manual partition deletion wastes a large amount of labor and time.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.