CN119226252A

Publication number: CN119226252A
Application number: CN202310804094.7A
Authority: CN
Inventors: 夏启超
Original assignee: Beijing Kingsoft Cloud Network Technology Co Ltd
Current assignee: Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date: 2023-06-30
Filing date: 2023-06-30
Publication date: 2024-12-31

Abstract

The application relates to a copy data disk-dropping method and device based on distributed block storage. The method comprises: when a write request for a copy data block of a target copy is received, writing the copy data block corresponding to the write request to a physical disk, wherein the write request requests that the copy data block be dropped to disk; returning a reply message that carries the copy length of the target copy already written to the physical disk; when a read request for the target copy is received, returning the copy data block requested by the read request if it lies within the copy length of the target copy; and, if the requested copy data block lies outside the copy length of the target copy, returning a prompt message according to the result of searching for the requested copy data block. The application solves the technical problem of low data disk-dropping efficiency.

Description

Method and device for dropping copy data based on distributed block storage

Technical Field

The application relates to the field of cloud service, in particular to a copy data disk-dropping method and device based on distributed block storage.

Background

With the development of the internet, new demands are placed on how quickly data can be dropped to disk in distributed block storage. In the prior art, to drop a copy to disk, data is written into a raw data area by append writing; indexes are stored and compacted in layered, ordered form using a disk-oriented data structure, the log-structured merge tree (LSM tree); and the copies follow the Raft consistency protocol, under which a slave copy passively receives the append writes initiated by the master copy.

However, the prior art has the problem of double writing, namely, when the data is dropped, the log is written first, and then the data is written, so that the data drop efficiency is low.

Disclosure of Invention

The application provides a copy data disk-dropping method and device based on distributed block storage, which are used for solving the problem of low data disk-dropping efficiency.

In a first aspect, the application provides a copy data disk-dropping method based on distributed block storage. The method comprises: when a write request for a copy data block of a target copy is received, writing the copy data block corresponding to the write request to a physical disk, wherein the write request requests that the copy data block be dropped to disk; returning a reply message that carries the copy length of the target copy already written to the physical disk; when a read request for the target copy is received, returning the copy data block requested by the read request if it lies within the copy length of the target copy; and, if the requested copy data block lies outside the copy length of the target copy, returning a prompt message according to the result of searching for the requested copy data block.

In a second aspect, the application provides a copy data disk-dropping device based on distributed block storage, which comprises a writing module, a reply module and a reading module. The writing module is configured to write the copy data block corresponding to a write request to a physical disk when the write request for a copy data block of a target copy is received, wherein the write request requests that the copy data block be dropped to disk. The reply module is configured to return a reply message that carries the copy length of the target copy already written to the physical disk. The reading module is configured, when a read request for the target copy is received, to return the copy data block requested by the read request if it lies within the copy length of the target copy, and to return a prompt message according to the result of searching for the requested copy data block if it lies outside the copy length of the target copy.

As an alternative example, the writing module comprises a writing unit for writing the copy data blocks onto the physical disk in random writing order when the writing request is received, and a determining unit for determining lengths of a plurality of copy data blocks which are continuous and uninterrupted as the copy length from a first copy data block of the target copy.

As an alternative example, the determining unit comprises a determining subunit configured to: when any copy data block has been written to the physical disk, store the write request corresponding to that copy data block in a message confirmation queue, where the write requests in the message confirmation queue are requests that have been received and whose writes have completed; and, when the 1st to N-th write requests are present in the message confirmation queue and the (N+1)-th write request is not, determine N as the copy length.

As an alternative example, the reading module includes a replying unit configured to: search for the copy data block requested by the read request on a target physical disk, and return it if it is found; otherwise search for the copy data block on other physical disks, namely the other physical disks to which the target copy has been written; return the copy data block if it is found on the other physical disks; and return an error message if it is not found on the other physical disks.

As an optional example, the device further comprises a copy supplementing module, configured to determine, after the target copy is written to the physical disk, a longest target copy among all target copies of other physical disks as a data source if a copy supplementing operation is triggered, set the data source to a write-prohibited state, and perform a copy supplementing operation on the target copy on the target physical disk using the data source.

As an optional example, the copy complement module includes a copy complement unit configured to read data from an offset position of the data source, where the offset position is a position in the data source that is offset by a copy length of the target copy, write the read data to the target copy until the target copy is the same as the data source in length, and update the copy length of the target copy.
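
The copy supplementing steps above can be sketched as follows. This is a minimal in-memory illustration, not the implementation of the application: the `Copy` class, its `writable` flag and the function name `supplement_copy` are hypothetical, and each copy is modeled as a plain list of already acknowledged blocks.

```python
from dataclasses import dataclass, field


@dataclass
class Copy:
    """Hypothetical stand-in for one copy of the data on one physical disk."""
    blocks: list = field(default_factory=list)  # contiguous, acknowledged blocks
    writable: bool = True

    @property
    def length(self) -> int:
        return len(self.blocks)


def supplement_copy(target: Copy, others: list) -> int:
    """Bring `target` up to the longest copy in `others`; return its new length."""
    # The longest copy among the other physical disks is the data source.
    source = max(others, key=lambda c: c.length)
    # The data source is set to a write-prohibited state while it is read.
    source.writable = False
    # Read from the offset equal to the target's current copy length,
    # so that only the missing tail is transferred.
    offset = target.length
    target.blocks.extend(source.blocks[offset:])
    # The updated copy length of the target is returned to the caller.
    return target.length
```

The sketch leaves the data source write-prohibited afterwards, because the text does not state when writes to it resume.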

As an alternative example, the device further comprises an execution module, configured to allocate a first operation type code for the write request if the write request of the copy data block of the target copy is received, insert the first operation type code into a lock-free queue, where the lock-free queue further includes a second operation type code, where the second operation type code is a code for an internal operation of the data server of the disk, and execute the write request and the internal operation according to an order of the operation type codes in the lock-free queue.

In a third aspect, the application provides an electronic device comprising at least one communication interface, at least one bus connected with the at least one communication interface, at least one processor connected with the at least one bus, and at least one memory connected with the at least one bus. The processor is configured to: when a write request for a copy data block of a target copy is received, write the copy data block corresponding to the write request to a physical disk, wherein the write request requests that the copy data block be dropped to the physical disk; return a reply message that carries the copy length of the target copy already written to the physical disk; when a read request for the target copy is received, return the copy data block requested by the read request if it lies within the copy length of the target copy; and, if the requested copy data block lies outside the copy length of the target copy, return a prompt message according to the result of searching for the requested copy data block.

In a fourth aspect, the present application further provides a computer storage medium storing computer executable instructions for performing the method for dropping copy data based on distributed block storage according to any one of the above aspects of the present application.

Compared with the prior art, the technical solution provided by the embodiments of the application has the following advantage. When data is dropped to disk, each copy data block is written to the physical disk when its write request is received. The reply message carries the copy length of the copy, so that on reads only data within the copy length is served, and data beyond it is not guaranteed to be returned successfully. The double write, that is, writing the log first and then the data, is therefore unnecessary when data is dropped to disk, which improves data disk-dropping efficiency.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.

One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which the figures of the drawings are not to be taken in a limiting sense, unless otherwise indicated.

FIG. 1 is a flowchart of a method for dropping copy data based on distributed block storage according to an embodiment of the present application;

FIG. 2 is a schematic diagram of a system for a method for dropping copy data based on distributed block storage according to an embodiment of the present application;

FIG. 3 is an IO path diagram of a copy data disk-drop method based on distributed block storage according to an embodiment of the present application;

FIG. 4 is an interaction diagram of modules of a method for dropping copy data based on distributed block storage according to an embodiment of the present application;

FIG. 5 is a periodic task diagram of the execution of a method for dropping copy data based on distributed block storage according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a data-client writing flow of a copy data disk-dropping method based on distributed block storage according to an embodiment of the present application;

FIG. 7 is a schematic diagram of a data-server writing flow of a copy data disk-dropping method based on distributed block storage according to an embodiment of the present application;

FIG. 8 is a schematic diagram of a data-client read flow of a copy data disk-dropping method based on distributed block storage according to an embodiment of the present application;

FIG. 9 is a schematic diagram of a data-server read flow of a copy data disk-dropping method based on distributed block storage according to an embodiment of the present application;

FIG. 10 is a flowchart of DATA SERVER adding a copy (AddReplica) in a copy data disk-dropping method based on distributed block storage according to an embodiment of the present application;

FIG. 11 is an inheritance diagram of a copy data disk-dropping method based on distributed block storage according to an embodiment of the present application;

FIG. 12 is a block diagram of a copy data disk-drop device based on distributed block storage according to an embodiment of the present application;

FIG. 13 is a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.

In order to solve the technical problem of low data disk-dropping efficiency in the prior art, the application provides a copy data disk-dropping method based on distributed block storage, which can realize the effect of improving the data disk-dropping efficiency.

FIG. 1 is a flowchart of a copy data disk-dropping method based on distributed block storage according to an embodiment of the present application. The method comprises the following steps:

S102, under the condition that a write request of a copy data block of a target copy is received, writing the copy data block corresponding to the write request to a physical disk, wherein the write request is used for requesting that the copy data block is dropped;

S104, returning a reply message, wherein the reply message carries the copy length of the target copy written to the physical disk;

S106-1, when a read request for the target copy is received and the copy data block requested by the read request lies within the copy length of the target copy, returning the copy data block requested by the read request;

S106-2, when the copy data block requested by the read request lies outside the copy length of the target copy, returning a prompt message according to the result of searching for the copy data block requested by the read request.

The method for dropping copy data can be applied to the copy data disk-dropping process of elastically scalable distributed block storage (Elastic Block Store, EBS).

In distributed storage, to ensure that service data is unaffected when a physical server goes down, the service data is usually written in multiple copies, each of which is called a copy (replica). Different copies may be written to different servers, so that when a server goes down the service data is not lost and can still be accessed normally. The target copy may be any one of the copies of the service data, and dropping to disk means writing the copy data from the server to the physical disk.

In order to implement the above-mentioned method for dropping copy data, a system may be built, and a schematic diagram of the system is shown in fig. 2. A user executes read-write operation on a copy through a cloud platform and a virtual machine of a user layer, an input/output (IO) request (read/write request) is generated, a block controller of a distributed block storage logic layer schedules the read/write request, a block server forwards the read/write request, a data controller of a storage layer schedules copy data, and a data server reads and writes the copy data.

In the system, the server that drops data to disk is the data server (DATA SERVER), a single-node server on the data path of a physical disk that drops data IO requests to disk and manages resources. The block server (block-server) calls the data-client interface to initiate IO requests to the DATA SERVER instances where the copies are located, and is responsible for accepting client IO requests, applying for resources, issuing IO requests and building indexes. The block controller (block-master) controls the block-server. The data controller (data-master) manages the resources of all DATA SERVER instances on the physical disk path and handles copy recovery and balancing. The data-client is the client that initiates writes to multiple copies.

In the above system, fig. 3 shows the path of an IO request initiated by the data-client: the IO request is generated in the virtual machine and sent through the distributed block storage engine to the block-server, the block-server forwards it to DATA SERVER, and DATA SERVER performs the disk-drop (write) operation or the read operation on the copy data according to the IO request.

The role of DATA SERVER is to read and write copies in response to IO requests; it also carries other functions. FIG. 4 is an interaction diagram of DATA SERVER with other modules. Viewed in terms of the separation of data flow and control flow, its functions are as follows:

On the data flow, DATA SERVER receives service IO from the block-server and implements append-write semantics; receives data-master requests and implements data completion for a copy after the disk drop and data copying for a supplemented copy; drops data to disk and replies with the position of the written data; and guarantees that data already acknowledged to the upper layer cannot be lost.

On the control flow, DATA SERVER has the following management functions. Node management: registering the node with the data-master, and responding to and completing node offline and query requests. Disk management: responding to and carrying out disk online and offline requests, and collecting disk statistics such as IOPS (input/output operations per second), throughput, latency and capacity. Block management: applying for and releasing blocks as required, and responding to requests such as block queries. Recovery task management: responding to data-master requests to copy or synchronize data, realizing data synchronization or full copying of copies after the disk drop, and ensuring that recovery can resume after a restart.

As for the modules upstream and downstream of DATA SERVER: on the data flow, the main upper-layer module is the ebs server, which distributes IO requests arriving from the front end to multiple DATA SERVER instances (this can be done by calling the client software development kit of the database, dbs client SDK); downward, DATA SERVER writes and reads data by calling the two interfaces UserSpaceBlock::Write and UserSpaceBlock::Read. On the control flow, DATA SERVER reports information about disks and metadata upward to DATA MASTER, and calls the BlockSystemController::GetBlockSystemInfo interface downward to obtain disk information. In addition, DATA SERVER must respond to copy operations, disk add and remove operations, and node start and stop operations sent by DATA MASTER.

DATA SERVER implements the control flow functions through the following sub-modules.

The disk manager module is responsible for monitoring and recording information about the disks in the system, namely storemanager{[storeid, status, capactiy, chunks, units, userblocksystemFs]}. When a physical disk is added or a node is started, if the physical disk holds data blocks, all existing chunks are loaded automatically at process start, their state and size are restored, the physical disk information is registered with the physical disk monitoring program, and the SPDK poller on the corresponding physical disk is started. When a physical disk goes down, the related copy services are shut down, together with the SPDK poller threads on them.

The chunk manager module is responsible for logical-to-physical address translation at the single-machine, single-disk level. According to the design of the storage engine, the information recorded for a single chunk mainly comprises: chunkmanager{[chunkid,status,size,storeid,userblocksystemFs*],…}; dis-order submit && in-order ack logic; ExpiredIOFlushPoller. This layer must implement append writing and in-order ack semantics for sequential chunks (seq chunk); and, so that an IO that times out before being acked can still be answered, timed-out IOs must also be checked periodically and reported back to the data-client.

The recovery service is mainly used for generating, scheduling and executing the Block copy recovery task, and the record of the Block copy recovery task is as follows:

recover manager{[chunkid,task-status,src_id,src_size],...};

The retry thread is retrytaskpoller. The content of each copy recovery (data alignment) task is [chunkId, status, size, store_id, userBlockFs]. A block copy recovery task that fails to be scheduled enters the retry task queue in FIFO order. The recovery service has a periodic task that periodically selects tasks and recovers data. The main function of the recovery service is to provide alarm and repair services after a copy, disk or node is damaged: on the one hand, DATA SERVER reports the abnormality promptly after detecting a hardware fault; on the other hand, after a suitable node and disk have been selected, it responds to the recovery request of the Database (DBS) master.

The manager service module provides the control services related to the physical disk and node; it mainly responds to management, control and query requests from the DBS master and implements the specific functions. Its record is as follows: manage service{registerDS/AllockChunk/DeleteChunk/AddReplica/AlignData/DisableDisk/GetBlocksEnableDisk/Statistic}.

The heartbeat timer module periodically reports information about the disks, copies and so on of the current storage node to the DBS master; reporting takes place when the data-master performs its periodic query. The data reported mainly comprises: 1. all disk information recorded by the disk manager; and 2. all chunk information recorded by the chunk manager.

The Scheduler module is responsible for periodically starting tasks such as Update server statistic / flush output io / Handle recover task; the main periodic tasks it executes are shown in fig. 5. UcxServerNetMasterPolling and UcxServerMsgHandlerPolling are the two access threads of the compensation server (ucx server) and are responsible for receiving requests and sending replies. SpdkThreadPoolExecutor is the core worker thread of DATA SERVER; it executes the context that connects the data flow and the control flow, and it drives the three periodic tasks. When a copy data recovery task is executed, the request is handed to the UcxClientNetPoller (client network poller) to pull the data, which avoids affecting service IO. IO checker checking is implemented in the ucx server to prevent illegal data from being read, for example when a copy under recovery, or an old version of the data, serves a read. IO statistics collects metrics such as latency, IOPS and throughput during IO.

For the DATA SERVER described above, copy data can be written to the physical disk through its read-write function. FIG. 6 is a schematic diagram of the data-client write flow of the DBS database. The user initiates a request; the SPDK thread applies for memory from the memory pool and then queries the copy location; the IO request is sent to DATA MASTER, and DATA MASTER sends it to the corresponding DATA SERVER. As shown in fig. 6, the IO requests are sent to DATA SERVER, DATA SERVER2 and DATA SERVER3 respectively and their responses are obtained; finally the responses are fed back to the user and the memory is released. In the drawing, the sending order of requests 0 to 2 and the order in which responses 0 to 2 are obtained are examples only and are not limiting.

After the IO request of the data-client is sent to DATA SERVER, DATA SERVER can write the target copy that the IO request asks to write onto the physical disk. FIG. 7 is a schematic diagram of the DATA SERVER write flow of the DBS database. The write request is sent to DATA SERVER, which performs the write operation through the UcxServerMsgHandlerPolling thread. The SPDK thread then checks the data block (chunk): whether it exists, whether its state is readable and writable, whether the data addresses in the data block are contiguous, whether the data is out of bounds, and whether the write is an overwrite. A request is then sent to the distributed block system of the user and a response is received, after which the chunk is checked again: whether it exists, whether its state is readable and writable, and whether the IO is contiguous. When DATA SERVER writes data, it writes out of order and, after the writes finish, replies to the user in order.

A target copy may consist of copy data blocks. When DATA SERVER writes a target copy to a physical disk, it writes each copy data block of the target copy to the physical disk. Because a copy data block may fail to be written, DATA SERVER returns the copy length of the target copy after writing the target copy to the physical disk. The copy length is the furthest position, as determined on the DATA SERVER (copy) side, up to which data has been dropped to disk contiguously with no hole before it; in other words, it is the length of contiguous data that has been written to the physical disk and has taken effect. "No hole" means that the data is contiguous with nothing missing. For example, if the copy consists of data blocks 1-10 and the blocks written to the physical disk and taken effect are 1-6, 8 and 9, the copy length is 6. If the copy data blocks written to the physical disk are 1-3, 5-8 and 10, the copy length of the target copy written to the physical disk is 3.

The consistency point of the copy length is a copy length that more than half of the copies have reached. It is defined as follows: after each copy completes an IO, it returns its current copy length to the data-client, and the length that more than half of the copies reach or exceed is called the consistency point. For example, with 3 copies of which two have reached a copy length of 3, the consistency point is 3. After the target copy is written, DATA SERVER replies to the block-server with the copy length and the consistency point of the target copy.

When a client requests to read data, it can initiate a read IO request. FIG. 8 is a schematic diagram of the data-client read flow of the DBS database. The user initiates a request; the SPDK thread applies for memory from the memory pool, then queries DATA MASTER for the copy location, that is, on which DATA SERVER the copy is located, and sends the IO request to that DATA SERVER.

The block-server determines, via the data-master, which DATA SERVER holds the data that the client requests to read, and sends the request to that DATA SERVER, which provides the data. FIG. 9 is a schematic diagram of the data-server read flow of the DBS database. The read request is sent to DATA SERVER, which performs the read operation through the UcxServerMsgHandlerPolling thread. The SPDK thread checks whether the data block (chunk) exists and whether its state is readable and writable. If it exists and its state is readable and writable, the thread checks whether the data addresses in the data block are contiguous and whether the data is out of bounds. DATA SERVER then replies to the user with the data read. Because DATA SERVER returns the copy length after writing the target copy, data requested by the client that lies within the copy length can be returned to the client, whereas data requested outside the copy length is not guaranteed to be returned.
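
The consistency point described above (the length that more than half of the copies reach or exceed) can be computed as in the following sketch. The function name `consistency_point` and the list-of-integers input are assumptions made for illustration; the text does not say where or how this computation is carried out.

```python
def consistency_point(copy_lengths):
    """Largest length that more than half of the copies reach or exceed."""
    if not copy_lengths:
        raise ValueError("at least one copy length is required")
    # "More than half" of n copies means at least n // 2 + 1 copies.
    majority = len(copy_lengths) // 2 + 1
    # After sorting in descending order, the majority-th entry is reached
    # or exceeded by at least `majority` copies, and no larger value is.
    return sorted(copy_lengths, reverse=True)[majority - 1]


# Three copies, two of which have reached length 3 (the third value is made up).
print(consistency_point([3, 3, 1]))  # 3
```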

That is, DATA SERVER implements sequential append writing and, after writing, brings the copy length back to the data-client in its reply; the data-client then reports write success to the upper layer. To improve write concurrency, DATA SERVER writes out of order. Because out-of-order replies could cause exceptions and complexity in the copy supplementing or copy data alignment process, DATA SERVER replies in order according to the copy length, that is, it replies only up to the position of the current copy length.

DATA SERVER records the length of each copy and accepts only read requests that do not exceed that length. The check is made at read time: if the data requested by a read request lies beyond the copy data on the current DATA SERVER, the read is retried on another copy. For example, suppose the target copy is located on physical disk A and physical disk B, with a copy length of 10 on physical disk A and 8 on physical disk B. When a data-client requests copy data block 9 from physical disk B, physical disk B does not provide the data, and the target copy on physical disk A is used instead to provide it. For data that was not acknowledged to the upper layer when written, a read request is not promised a successful return.

Request queue for out-of-order write and ordered reply: to realize out-of-order writing with ordered replies, DATA SERVER introduces an internal ack queue. An IO that has been submitted to the physical disk and finishes ahead of earlier IOs is placed in the ack queue to wait, and the reply to the data-client is sent only after all preceding IOs have been dropped to disk with no hole.
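
The read check and the retry on another copy can be sketched as follows, using the physical disk A and physical disk B example from the text. The dictionary layout of a copy, the function `read_block` and the exception `CopyNotFound` are hypothetical names introduced for this illustration only.

```python
class CopyNotFound(Exception):
    """No copy can serve the requested block (the error-message case)."""


def read_block(block_no, target, others):
    """Serve block `block_no` (1-based) from `target`, else from another copy.

    Each copy is a hypothetical dict: {"disk": str, "length": int, "blocks": dict}.
    Returns (disk name, data) of the copy that served the read.
    """
    # Within the copy length of the requested copy: serve it directly.
    if 1 <= block_no <= target["length"]:
        return target["disk"], target["blocks"][block_no]
    # Outside the copy length: look for the block on the other copies.
    for copy in others:
        if 1 <= block_no <= copy["length"]:
            return copy["disk"], copy["blocks"][block_no]
    raise CopyNotFound(f"block {block_no} is beyond every copy length")


# Copy length 10 on physical disk A, 8 on physical disk B.
disk_a = {"disk": "A", "length": 10, "blocks": {i: f"data{i}" for i in range(1, 11)}}
disk_b = {"disk": "B", "length": 8, "blocks": {i: f"data{i}" for i in range(1, 9)}}
print(read_block(9, disk_b, [disk_a]))  # ('A', 'data9')
```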

In this embodiment, when data is dropped to disk, each copy data block is written directly to the physical disk when its write request is received; it is not necessary to write the log first and then the data. The reply message carries the copy length of the copy, so that on reads only data within the copy length is served, and data beyond it is not guaranteed to be returned successfully. The double write is therefore unnecessary when data is dropped to disk, which improves data disk-dropping efficiency.

As an alternative example, in the case of receiving a write request for a copy data block of a target copy, writing the copy data block corresponding to the write request to the physical disk includes: in the case of receiving the write request, writing the copy data blocks to the physical disk in a random write order; and, starting from the first copy data block of the target copy, determining the length of the plurality of copy data blocks that are continuous and uninterrupted as the copy length.

In this embodiment, copy data blocks may be written out of order; that is, they need not be written in the order in which they were received. Instead, after the copy data blocks are received in order, they are written to the physical disk in a random order, and the random write order is not limited. For example, if the copy data blocks are received in the order 1, 2, 3, 4, 5, 6, the write order may be 1, 3, 5, 4, 2, 6 or 2, 6, 5, 3, 1, 4. A write may also fail, as in 2, 6, 3, _, 5, 1, where block 4 is not written successfully. Out-of-order writing avoids waiting on the receiving order, so a failed write does not hold up the writing of subsequent copy data blocks. The copy length is the length of the continuous, uninterrupted run of copy data blocks, counted from the first block, among those already written. With the order 1, 3, 5, 4, 2, 6, all 6 copy data blocks are written successfully and the copy length is 6; with 2, 6, 3, _, 5, 1, blocks 1 to 3 are written successfully but block 4 is not, so the copy length is 3.
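The copy-length rule in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, assuming blocks are numbered from 1 and that the set of successfully written block numbers is known; the function name `copy_length` is hypothetical and does not come from the application.

```python
def copy_length(written_blocks):
    """Count the continuous run of written blocks starting at block 1."""
    written = set(written_blocks)
    length = 0
    # Advance while the next block in sequence has been written; stop at the first hole.
    while length + 1 in written:
        length += 1
    return length


# Write orders taken from the examples in the text.
all_written = copy_length([1, 3, 5, 4, 2, 6])   # every block reached the disk
block_4_failed = copy_length([2, 6, 3, 5, 1])   # block 4 was not written
```

With these inputs `all_written` is 6 and `block_4_failed` is 3, matching the two cases described above.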

As an alternative example, starting from the first copy data block of the target copy, determining the length of the plurality of continuous and uninterrupted copy data blocks as the copy length includes: in the case that any copy data block is written to the physical disk, storing the write request corresponding to that copy data block in a message confirmation queue, wherein the write requests in the message confirmation queue are requests that have been received and whose writes have completed; and, in the case that the 1st to N-th write requests are present in the message confirmation queue and the (N+1)-th write request is not, determining N as the copy length.

In this embodiment, each of the copy data blocks of a target copy is written to the physical disk by its own write request, so each copy data block corresponds to one write request. The write requests are ordered according to the order of the copy data blocks. After a write request is received, the corresponding copy data block is written to the physical disk and the write request is stored in the message confirmation queue. The message confirmation queue holds only the write requests of copy data blocks that were written successfully; the write request of a data block that was not written successfully is not stored in it.

Since the write requests are ordered, the order can be represented by sequence numbers. When the write requests numbered 1 through N are all present in the message confirmation queue with no gap, the first N data blocks are confirmed as written successfully, and the copy length is N.

That is, in this embodiment replies follow the out-of-order-write, ordered-reply scheme. After a copy data block is written, the reply is not sent immediately; instead, the write request is placed in the message confirmation queue and replied to in order.
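The ordered-reply behaviour of the message confirmation queue can be illustrated as follows. This is a sketch under stated assumptions, not the application's implementation: write requests are represented only by their sequence numbers (starting at 1), and the class name `AckQueue` and method `on_write_done` are hypothetical.

```python
class AckQueue:
    """Holds finished writes until every earlier write has also finished."""

    def __init__(self):
        self.copy_length = 0    # largest N such that writes 1..N are all on disk
        self._pending = set()   # finished writes still waiting on an earlier one

    def on_write_done(self, seq):
        """Record a finished write and return the sequence numbers released for reply."""
        self._pending.add(seq)
        released = []
        # Release replies in order for as long as there is no hole.
        while self.copy_length + 1 in self._pending:
            self.copy_length += 1
            self._pending.remove(self.copy_length)
            released.append(self.copy_length)
        return released


queue = AckQueue()
first = queue.on_write_done(2)    # write 1 is still outstanding, so nothing is released
second = queue.on_write_done(1)   # the hole is closed, so 1 and 2 are released together
```

Here `first` is an empty list and `second` is `[1, 2]`; each released reply would carry the current `copy_length`.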

As an alternative example, in the case that the copy data block requested by the read request is located outside the copy length of the target copy, returning the prompt message according to the lookup result of the copy data block requested by the read request includes: looking up the copy data block requested by the read request on the target physical disk; returning the copy data block if it is found; if it is not found, looking up the copy data block requested by the read request on other physical disks, where the other physical disks are the other physical disks to which the target copy has been written; returning the copy data block if it is found on the other physical disks; and returning an error message if it is not found on the other physical disks.

After the copy data block is written and a reply message is returned, the client may request access to copy data blocks written to the physical disk. The client initiates a read request, which may request any one or more data blocks of the target copy. If the physical disk holds the copy data block requested by the client, DATA SERVER returns it; if it does not, the copy data block of the target copy is looked up on another physical disk and returned by the DATA SERVER corresponding to that disk.

For example, suppose the target copy is located on physical disk 1 and physical disk 2, with a copy length of 10 on physical disk 1 and 12 on physical disk 2. When the client requests copy data block 11 of the target copy, it may send the read request to physical disk 1; since physical disk 1 does not hold copy data block 11, the block is obtained from physical disk 2 and returned to the client. The lookup of copy data block 11 may be performed by the data-master.
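The read path with fallback to another physical disk might look like the sketch below. It assumes each replica is modelled as a copy length plus a dictionary of block number to data; the names `Replica` and `read_block`, the disk labels, and the use of `LookupError` as the error message are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    copy_length: int
    blocks: dict = field(default_factory=dict)   # block number -> data


def read_block(replicas, disk, block_no):
    """Return (serving disk, data) for a block, trying the requested disk first."""
    local = replicas[disk]
    # Within the copy length the block is guaranteed to be present.
    if block_no <= local.copy_length:
        return disk, local.blocks[block_no]
    # Outside the copy length: still look locally, but a hit is not guaranteed.
    if block_no in local.blocks:
        return disk, local.blocks[block_no]
    # Otherwise search the other disks that hold the target copy.
    for other, replica in replicas.items():
        if other != disk and block_no in replica.blocks:
            return other, replica.blocks[block_no]
    raise LookupError(f"block {block_no} not found on any replica")


replicas = {
    "disk1": Replica(10, {n: f"block-{n}" for n in range(1, 11)}),
    "disk2": Replica(12, {n: f"block-{n}" for n in range(1, 13)}),
}
served_by, data = read_block(replicas, "disk1", 11)
```

In this example block 11 is requested from `disk1` and served by `disk2`; requesting block 13 would raise the error, since no replica holds it.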

As an alternative example, after writing the target copy to the physical disk, the method further includes determining a longest target copy among all target copies of other physical disks as a data source if a copy-complement operation is triggered, setting the data source to a write-prohibited state, and performing the copy-complement operation on the target copy on the target physical disk using the data source.

In this embodiment, a copy-complement process may be needed because data of the target copy may be lost while it is being written to the physical disk so that the copy cannot be written completely, or part of the copy may be lost due to a failure of the physical disk. Copy complementing mainly fills in the data of the other target copies from the longest of the target copies. The target copies are located on different physical disks and may have different lengths. The longest copy is taken as the data source copy, and the other target copies are filled up to its length. While copies are being complemented, writing to the source copy is prohibited; after the other copies have been filled, the write-prohibited state may be released.
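Selecting the data source and holding it in the write-prohibited state can be sketched as below. This is an illustration only: copies are reduced to their copy lengths, the write-prohibited state is a plain in-memory set, and the function names are hypothetical.

```python
def start_complement(copy_lengths, target_disk, write_prohibited):
    """Pick the longest copy on another disk as the data source and freeze it."""
    candidates = {d: n for d, n in copy_lengths.items() if d != target_disk}
    source = max(candidates, key=candidates.get)
    write_prohibited.add(source)   # the source must not grow while it is being copied
    return source


def finish_complement(copy_lengths, target_disk, source, write_prohibited):
    """Record that the target now matches the source, then release the source."""
    copy_lengths[target_disk] = copy_lengths[source]
    write_prohibited.discard(source)


lengths = {"A": 10, "B": 8, "C": 12}
prohibited = set()
source = start_complement(lengths, "B", prohibited)   # "C" is the longest other copy
frozen_during_copy = set(prohibited)
finish_complement(lengths, "B", source, prohibited)
```

After the call to `finish_complement`, the copy on disk B has length 12 and no copy remains write-prohibited.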

There are various scenarios that trigger the copy-complement operation, such as DropReplica (discarding a copy), disableStore (disabling storage), KICKSERVER (response server), and balance, all of which require a copy-complement operation. It is understood that an insufficient number of copies of a data block triggers recovery through a new copy-complement operation. The copy-complement operation pulls data from a normal copy. In copy equalization, for both seq chunks and random chunks, src is prohibited from writing, and after the data has been filled, the copy to be deleted must not be longer than the copy that served as the data source. In the copy-complement process, the longest of the available copies is selected as the src data source. The write-prohibited state must be persisted; otherwise, if the state is lost when DATA SERVER restarts, IO (input/output) could be issued against a copy whose data has only been partially recovered, causing data errors or data loss.

When a copy is complemented, the data-master selects the data-server that will hold the new copy and adds one copy to its own metadata (copy state KREPLICACREATING). The data-master then sends an ADDREPLICA request to the selected data-server; the flow by which DATA SERVER processes ADDREPLICA is shown in figure 10. The data-master handles the ADDREPLICA return code from the data-server as follows: if the data-server returns kNotFinish, the data-master sleeps for 1 second and retries ADDREPLICA; if the data-server returns kOk, the data-master treats the operation as successful and updates its own metadata (copy state becomes KREPLICAACTIVE); if the data-server returns any other error code, the data-master treats the operation as failed, updates its own metadata (copy state becomes KREPLICADELETING), and sends a Deletechunk request to the data-server. The data-master then detects that the chunk has too few copies, selects a data-server again, and issues ADDREPLICA to complement the copy.
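The return-code handling on the data-master side can be summarised in a short sketch. The state and code names are taken from the text and represented as plain strings; the function name, the injectable `sleep` callable, and the `max_attempts` bound are assumptions added so that the example terminates (the text does not state a retry limit).

```python
import time


def add_replica(send_add_replica, sleep=time.sleep, max_attempts=10):
    """Drive one ADDREPLICA attempt sequence and return the resulting copy state."""
    state = "KREPLICACREATING"            # the copy is first recorded in the creating state
    for _ in range(max_attempts):         # max_attempts is an assumption, not from the text
        code = send_add_replica()
        if code == "kNotFinish":
            sleep(1)                      # wait 1 second, then retry ADDREPLICA
            continue
        if code == "kOk":
            return "KREPLICAACTIVE"       # success: metadata updated to active
        return "KREPLICADELETING"         # any other code: failure, copy is to be deleted
    return state


# Simulated data-server that needs two retries before it finishes.
replies = iter(["kNotFinish", "kNotFinish", "kOk"])
sleeps = []
final_state = add_replica(lambda: next(replies), sleep=sleeps.append)
```

In this run `final_state` is `"KREPLICAACTIVE"` and `sleeps` records two 1-second waits. The Deletechunk request and the reselection of a data-server after a failure are left out of the sketch.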

The specific flows for equalizing and complementing copies of random and sequential (seq) blocks, as initiated by the data-master, are as follows:

For random block equalization (src -> data-server):

Create task a and task b, where task a is ADDREPLICA on the data-server and task b removes the copy on src. Task a is executed first: the src for ADDREPLICA is selected, and the data-server pulls data from src (the data-server sets src to the read-only state the first time it pulls data). After the data has been pulled, the data-server copy state is set to active; finally the src copy is removed and all remaining copies are set to the normal state.

For random block complement copies:

First select a data-server and select the src for ADDREPLICA; the data-server then pulls data from src (setting src to the read-only state the first time it pulls data). After the data has been pulled, the data-server copy state is set to active and all remaining copies are set to the normal state.

For the seq block complement copy:

First select a data-server and select the src for ADDREPLICA; the data-server then pulls data from src. After the data has been pulled, the data-server copy state is set to active.

For seq block equalization:

Create task a and task b, where task a is ADDREPLICA on the data-server and task b removes the copy on src. Task a is executed first: the selected ADDREPLICA src is set to the read_only state, and the data-server pulls data from src. After the data has been pulled, the data-server copy state is set to active, the src copy is removed, and all remaining copies are set to the normal state.

As an alternative example, performing a copy-complement operation on a target copy on a target physical disk using a data source includes: reading data from an offset position of the data source, wherein the offset position is the position in the data source offset by the copy length of the target copy; writing the read data to the target copy until the target copy and the data source have the same length; and updating the copy length of the target copy.

In this embodiment, the data source and the target copy to be complemented differ in length, and the leading part of the target copy's data is already complete, so when the target copy is complemented the data can be read starting from the offset position of the data source. For example, if the copy length of the target copy to be complemented is 10, its first 10 copy data blocks are complete, so data is read starting from the 11th data block of the data source and filled into the target copy.
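Reading from the offset position amounts to copying only the tail of the data source. The sketch below assumes each copy is a Python list of blocks whose length is its copy length; the function name `complement_from_offset` is hypothetical.

```python
def complement_from_offset(source_blocks, target_blocks):
    """Append the blocks the target is missing and return its new copy length."""
    offset = len(target_blocks)             # copy length of the target copy
    for block in source_blocks[offset:]:    # read the source starting at the offset
        target_blocks.append(block)
    return len(target_blocks)               # updated copy length


source = [f"block-{n}" for n in range(1, 13)]   # data source with 12 blocks
target = source[:10]                            # target copy with copy length 10
new_length = complement_from_offset(source, target)
```

Only blocks 11 and 12 are read from the source; afterwards `new_length` is 12 and the two copies are identical.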

As an alternative example, in the case of receiving a write request of a copy data block of the target copy, the method further includes assigning a first operation type code to the write request, inserting the first operation type code into a lock-free queue, wherein the lock-free queue further includes a second operation type code, the second operation type code being a code for an internal operation of the data server of the drop disk, and executing the write request and the internal operation in an order of the operation type codes in the lock-free queue.

In this embodiment, input/output requests (including read requests and write requests) and the internal operations of DATA SERVER are uniformly assigned operation type codes according to their type. Each input/output request and each internal operation of DATA SERVER has a unique corresponding operation type code. The operation type codes are stored in the lock-free queue in the order in which they are generated; they are then fetched from the lock-free queue in sequence and converted back into the corresponding input/output request or internal operation of DATA SERVER, and the requests and internal operations are executed in fetch order. The input/output requests and the internal operations of DATA SERVER inherit from the same context structure. Fig. 11 is an overall diagram. By executing input/output requests through the lock-free queue, this embodiment achieves a globally lock-free effect.
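The dispatch-by-operation-type-code pattern can be sketched as follows. Python's standard library has no lock-free queue, so `collections.deque` is used purely as a single-threaded stand-in; the `OpType` values, the handler functions, and the payload strings are all illustrative assumptions rather than names from the application.

```python
from collections import deque
from enum import IntEnum


class OpType(IntEnum):
    CLIENT_WRITE = 1   # "first operation type code": a client write request
    INTERNAL_OP = 2    # "second operation type code": a data-server internal operation


def drain(queue, handlers):
    """Pop codes in FIFO order and dispatch each to the handler for its type."""
    executed = []
    while queue:
        code, payload = queue.popleft()
        executed.append(handlers[code](payload))
    return executed


ops = deque()   # stand-in for the lock-free queue
ops.append((OpType.CLIENT_WRITE, "block-1"))
ops.append((OpType.INTERNAL_OP, "flush-metadata"))   # hypothetical internal operation
ops.append((OpType.CLIENT_WRITE, "block-2"))

handlers = {
    OpType.CLIENT_WRITE: lambda payload: f"write {payload}",
    OpType.INTERNAL_OP: lambda payload: f"internal {payload}",
}
order = drain(ops, handlers)
```

Client writes and internal operations share one queue and are executed strictly in insertion order, which is the property the embodiment relies on to avoid per-operation locks.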

In this embodiment, DATA SERVER avoids the double write of the WAL and of the applied append to the data file that standard Raft requires. Out-of-order writing with ordered, order-preserving replies on top of append writes reduces latency, and fast nodes need not wait for slow nodes. To meet the very high performance requirements of ESSD, a majority-dispatch write protocol based on append writes can be realized between the DBS client and DATA SERVER. Advancing the SCL (segment complete LSN) allows a commit to occur as soon as the majority quorum is satisfied, which provides better concurrency. The majority-style protocol of DBS is realized through cooperation between the data-clients and DATA SERVER.

The embodiment also provides a copy data disk-dropping device based on distributed block storage.

As shown in fig. 12, the apparatus includes:

A writing module 1202, configured to, when a writing request of a copy data block of the target copy is received, write a copy data block corresponding to the writing request onto a physical disk, where the writing request is used to request that the copy data block be dropped;

A reply module 1204, configured to return a reply message, where the reply message carries a copy length of the target copy that has been written to the physical disk;

A reading module 1206, configured to: return the copy data block requested by the read request when a read request for reading the target copy is received and the copy data block requested by the read request is located within the copy length of the target copy; and return a prompt message according to the search result of the copy data block requested by the read request when the copy data block requested by the read request is located outside the copy length of the target copy.

The above copy data disk-dropping device can be applied to the copy data disk-dropping process of elastically scalable distributed block storage (Elastic Block Store, EBS).

For DATA SERVER above, in terms of reading and writing, copy data can be written to the physical disk through a read-write function. FIG. 6 is a schematic diagram of the data-client write flow of the DBS database. The user initiates a request; the SPDK THREAD thread applies for memory from the memory pool, then queries the copy location and sends the IO request to DATA MASTER, which forwards the IO request to the corresponding DATA SERVER. As shown in FIG. 6, the IO requests are sent to DATA SERVER1, DATA SERVER2, and DATA SERVER3 respectively and the responses are obtained; finally the responses are fed back to the user and the memory is released. In the drawing, the transmission order of requests 0 to 2 and the acquisition order of responses 0 to 2 are examples and are not limiting. After the IO request of the data-client is sent to DATA SERVER, DATA SERVER writes the target copy that the IO request asks to write onto the physical disk.

FIG. 7 is a schematic diagram of the write flow of DATA SERVER of the DBS database. The write request is sent to DATA SERVER, which performs the write operation through the UcxServerMsgHandlerPolling thread. The SPDK THREAD thread then checks the data block (chunk): whether it exists, whether its state is readable and writable, whether the data addresses in the data block are continuous, whether the data is out of bounds, and whether the write is an overwrite. A request is then sent to, and a response received from, the user's distributed block system, and the chunk is checked again: whether it exists, whether its state is readable and writable, and whether the IO is continuous. When DATA SERVER writes data, it writes out of order and, after the writes complete, replies to the user in order.

A target copy may consist of copy data blocks. When DATA SERVER writes a target copy to a physical disk, each copy data block of the target copy is written to the physical disk. Since the write of a copy data block may fail, DATA SERVER returns the copy length of the target copy after writing it to the physical disk. The copy length is the furthest position, as determined on the copy side by DATA SERVER, up to which data has been continuously dropped to disk with no hole before it; that is, the length of continuous data effectively written to the physical disk. For example, if the copy consists of copy data blocks 1-10 and the blocks written to the physical disk and validated are 1-6, 8, and 9, the copy length is 6. If the copy data blocks written to the physical disk are 1-3, 5-8, and 10, the copy length of the target copy written to the physical disk is 3.

The consistency point of the copy length is the copy length that more than half of the copies have reached. After the IO on each copy completes, the current length of that copy is returned to the data-client, and the length that more than half of the copies reach or exceed is called the consistency point. For example, with 3 copies of which two have reached a copy length of 3, the consistency point is 3. After the target copy is written, DATA SERVER replies to the block-server with the copy length and the consistency point of the target copy.

When a client requests to read data, it initiates a read IO request. FIG. 8 is a schematic diagram of the data-client read flow of the DBS database. The user initiates a request; the SPDK THREAD thread applies for memory from the memory pool, then queries DATA MASTER for the copy location, that is, which DATA SERVER holds the copy, and sends the IO request to that DATA SERVER. The block-server determines through the data-master which DATA SERVER holds the data the client asks to read and sends the request to that DATA SERVER, which provides the data. FIG. 9 is a schematic diagram of the read flow of the data-server of the DBS database. The read request is sent to DATA SERVER, which performs the read operation through the UcxServerMsgHandlerPolling thread. The SPDK THREAD thread checks whether the data block (chunk) exists and whether its state is readable and writable. If the chunk exists and its state is readable and writable, it is further checked whether the data addresses in the data block are continuous and whether the data is out of bounds. DATA SERVER then replies to the user with the data read. Because DATA SERVER returns the copy length after writing the target copy, data requested by the client within the copy length can be returned to the client, whereas data requested beyond the copy length is not guaranteed to be returned to the block-server.
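The consistency point defined above can be computed directly from the copy lengths reported by the copies. The sketch below assumes the lengths are available as a non-empty list; the function name `consistency_point` is hypothetical.

```python
def consistency_point(copy_lengths):
    """Largest length that more than half of the copies have reached or exceeded."""
    majority = len(copy_lengths) // 2 + 1
    # After sorting longest-first, the entry at position `majority` is the
    # greatest length that at least `majority` copies have reached.
    return sorted(copy_lengths, reverse=True)[majority - 1]


three_copies = consistency_point([3, 3, 1])      # two of three copies reached 3
uneven_copies = consistency_point([10, 8, 12])   # two of three copies reached 10
```

Here `three_copies` is 3, as in the example in the text, and `uneven_copies` is 10.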

For other examples of this embodiment, please refer to the above examples, and the description thereof is omitted.

As shown in fig. 13, an embodiment of the present application provides an electronic device including a processor 111, a communication interface 112, a memory 113, and a communication bus 114, wherein the processor 111, the communication interface 112, and the memory 113 perform communication with each other through the communication bus 114,

A memory 113 for storing a computer program;

In one embodiment of the present application, the processor 111 is configured to execute the program stored in the memory 113 so as to implement the copy data disk-dropping method based on distributed block storage provided by any of the foregoing method embodiments, the method including:

writing a copy data block corresponding to a write request to a physical disk under the condition that the write request of the copy data block of the target copy is received, wherein the write request is used for requesting that the copy data block is dropped;

Returning a reply message, wherein the reply message carries the copy length of the target copy written to the physical disk;

returning the copy data block requested by the read request under the condition that the read request for reading the target copy is received and the copy data block requested by the read request is located within the copy length of the target copy;

And returning a prompt message according to the searching result of the copy data block requested by the read request under the condition that the copy data block requested by the read request is located outside the copy length of the target copy.

The embodiment of the application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the copy data disk-dropping method based on distributed block storage provided by any one of the method embodiments described above.

The apparatus embodiments described above are merely illustrative, wherein elements illustrated as separate elements may or may not be physically separate, and elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a general purpose hardware platform, or may be implemented by hardware. Based on such understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the related art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the method described in the respective embodiments or some parts of the embodiments.

It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "includes," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless an order of performance is explicitly stated. It should also be appreciated that additional or alternative steps may be used.

The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A copy data disk-dropping method based on distributed block storage, comprising:

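writing a copy data block corresponding to a write request onto a physical disk under the condition that the write request of the copy data block of a target copy is received, wherein the write request is used for requesting that the copy data block is dropped to disk;

returning a reply message, wherein the reply message carries the copy length of the target copy written to the physical disk;

returning the copy data block requested by a read request under the condition that the read request for reading the target copy is received and the copy data block requested by the read request is located within the copy length of the target copy;

and returning a prompt message according to the search result of the copy data block requested by the read request under the condition that the copy data block requested by the read request is located outside the copy length of the target copy.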

2. The method of claim 1, wherein, in the case of receiving a write request for a copy data block of a target copy, writing the copy data block corresponding to the write request to a physical disk comprises:

writing the copy data blocks to the physical disk according to a random writing sequence under the condition that the writing request is received;

starting from the first copy data block of the target copy, determining the lengths of a plurality of continuous and uninterrupted copy data blocks as the copy lengths.

3. The method of claim 2, wherein determining, starting from the first copy data block of the target copy, the length of a plurality of continuous and uninterrupted copy data blocks as the copy length comprises:

under the condition that any one copy data block is written into the physical disk, storing the write request corresponding to the copy data block into a message confirmation queue, wherein the write requests in the message confirmation queue are requests which have been received and whose writes have completed;

in the case where the 1st through N-th write requests are present in the message confirmation queue and the (N+1)-th write request is not, determining N as the copy length.

4. The method of claim 1, wherein, in the case where the copy data block requested by the read request is located outside the copy length of the target copy, returning a prompt message according to the search result of the copy data block requested by the read request comprises:

searching a copy data block requested by the read request on a target physical disk;

Returning the copy data block under the condition that the copy data block requested by the read request is found;

searching the copy data blocks requested by the read request from other physical disks under the condition that the copy data blocks requested by the read request are not found, wherein the other physical disks are other physical disks written into the target copy;

returning the copy data block under the condition that the copy data block requested by the read request is found in the other physical disks;

And returning an error reporting message under the condition that the copy data block requested by the read request is not found in the other physical disks.

5. The method of claim 1, wherein after writing the target copy to the physical disk, the method further comprises:

under the condition that the copy supplementing operation is triggered, determining the longest target copy in all target copies of other physical disks as a data source;

setting the data source to a write-inhibit state;

and executing a copy supplementing operation on the target copy on the target physical disk by using the data source.

6. The method of claim 5, wherein the performing a complement copy operation on the target copy on the target physical disk using the data source comprises:

reading data from an offset position of the data source, wherein the offset position is a position in the data source offset to the copy length of the target copy;

Writing the read data into the target copy until the lengths of the target copy and the data source are the same;

and updating the copy length of the target copy.

7. The method of claim 1, wherein in the event that a write request for a copy data block of a target copy is received, the method further comprises:

assigning a first operation type code to the write request;

inserting the first operation type code into a lock-free queue, wherein the lock-free queue further comprises a second operation type code, the second operation type code being a code for an internal operation of the data server that performs the disk dropping;

and executing the write request and the internal operation according to the sequence of the operation type codes in the lock-free queue.

8. A replica data disk-drop device based on distributed block storage, comprising:

The writing module is used for writing the copy data block corresponding to the writing request onto a physical disk under the condition that the writing request of the copy data block of the target copy is received, wherein the writing request is used for requesting that the copy data block be dropped to disk;

The reply module is used for returning a reply message, wherein the reply message carries the copy length of the target copy written to the physical disk;

And the reading module is used for returning the copy data block requested by the read request under the condition that the read request for reading the target copy is received and the copy data block requested by the read request is located in the copy length of the target copy, and returning a prompt message according to the search result of the copy data block requested by the read request under the condition that the copy data block requested by the read request is located outside the copy length of the target copy.

9. A computer-readable storage medium, having stored thereon a computer program, characterized in that the computer program, when executed by a processor, performs the method of any of claims 1 to 7.

10. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 7 by means of the computer program.