Disclosure of Invention
An embodiment of the application aims to provide an address conversion method, a computing system and electronic equipment, which are used for relieving resource waste in a computing system comprising a plurality of computing devices.
The embodiment of the application provides an address conversion method, which is applied to any one computing device of a computing system comprising a plurality of computing devices. A root page table of the page table corresponding to each computing device is provided with marking information of each process running in the computing device, and the processes in the computing system include a shared process running in a plurality of computing devices. The marking information of the shared process is a first marking value in the root page tables of the computing devices other than a target computing device, and is not the first marking value in the root page table of the target computing device. All table entries of the shared process are recorded in the page table corresponding to the target computing device, and only the table entries of the shared process recorded in the root page table are retained in the page tables corresponding to the other computing devices.
The method comprises: obtaining a virtual address to be converted and a process identifier corresponding to the virtual address; determining, according to the process identifier, the marking information of the process corresponding to the process identifier in the root page table of the page table corresponding to the computing device; if the marking information is the first marking value, packing the virtual address and the process identifier into an address conversion request and sending the address conversion request to the target computing device of the process corresponding to the process identifier, so that the target computing device finds the physical address corresponding to the virtual address according to the page table corresponding to the target computing device, the virtual address and the process identifier; and if the marking information is not the first marking value, finding the physical address corresponding to the virtual address step by step in the page table corresponding to the computing device according to the virtual address.
In the above implementation manner, in a computing system including a plurality of computing devices, all entries of a shared process are completely saved only in the target computing device among the computing devices running the shared process, and only the entries recorded in the root page table are saved in the remaining computing devices. The marking information of the shared process is set to the first marking value in the root page tables of the remaining computing devices and to a second marking value in the root page table of the target computing device. In this way, when any computing device performs address translation, it can determine, from the process identifier corresponding to the virtual address to be translated, the marking information of the process in its own page table, and then determine from the value of the marking information whether its own page table holds the complete entries of the process. If the complete entries do not exist, the computing device sends an address conversion request to the target computing device, and address translation is carried out through the page table corresponding to the target computing device. Therefore, normal address translation is ensured while the number of complete copies of the entries that must be saved for the shared process is reduced from several to one. A single root page table entry occupies only 64 bits, and a process generally has no more than 255 root page table entries, so for the same process the page table occupation on each computing device other than the target computing device is reduced to less than 2K. This reduces the storage resources required by the shared process, reduces resource waste, and leaves more storage resources for business use.
For example, taking the example that the page table of a single process occupies a single computing device 1M, in a computing system formed by 8 computing devices, the total page table occupation of a sharing process is reduced from 8M to 1M plus a few K or 1M plus a dozen or more K, thereby significantly reducing the occupation requirement of the sharing process on storage resources.
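The arithmetic behind this example can be checked with a short sketch. The 64-bit entry size, the bound of 255 root entries, the 1M full page table and the 8 devices are taken from the text; the function and constant names are illustrative assumptions only.

```python
ROOT_ENTRY_BYTES = 64 // 8          # one root page table entry occupies 64 bits
MAX_ROOT_ENTRIES = 255              # typical upper bound stated in the text
FULL_TABLE_BYTES = 1 * 1024 * 1024  # example figure: 1M of page table per process per device


def total_page_table_bytes(num_devices: int, shared_scheme: bool) -> int:
    """Total page table footprint of one shared process across all devices."""
    if not shared_scheme:
        # Every device keeps a complete copy of the entries.
        return num_devices * FULL_TABLE_BYTES
    # Only the target device keeps a complete copy; the others keep root entries only.
    root_only_bytes = MAX_ROOT_ENTRIES * ROOT_ENTRY_BYTES
    return FULL_TABLE_BYTES + (num_devices - 1) * root_only_bytes


before = total_page_table_bytes(8, shared_scheme=False)
after = total_page_table_bytes(8, shared_scheme=True)
print(f"before: {before} bytes, after: {after} bytes")
```

Under these assumptions each non-target device holds at most 255 x 8 = 2040 bytes for the process, so the seven non-target devices together add roughly 14K to the single 1M copy.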
Further, if the virtual address and the process identifier are obtained from address translation requests sent by other computing devices, after the physical address corresponding to the virtual address is found step by step in a page table corresponding to the computing device according to the virtual address, the method further comprises returning the physical address to the other computing devices.
In the above implementation manner, when the computing device itself receives the address conversion request sent from the other computing device for the target computing device, it may find the physical address of the virtual address to be converted from the page table corresponding to itself, and return the physical address to the other computing device from which the address conversion request is sent, so that the other computing device can realize conversion from the virtual address to the physical address by means of the target computing device, thereby meeting the service requirement.
Further, the marking information of the process is recorded in the reserved bit of the table entry corresponding to the process of the root page table.
In the above implementation manner, the reserved bits in the root page table entries are used to hold the marking information of the process, so that page table entry resources are fully utilized and no additional mechanism for managing the marking information needs to be introduced. No extra overhead is incurred, which facilitates adoption in industrial applications.
Further, the target computing device of the sharing process is a computing device that first runs the sharing process among the plurality of computing devices running the sharing process.
In the above implementation, the computing device that runs the shared process first is defined as the target computing device, so that a computing device that runs the shared process later only needs to create the entries of the root page table when creating page table entries for the shared process, and does not need to create entries in the page tables of the levels after the root page table. Compared with taking a computing device that runs the shared process later as the target computing device, this implementation requires neither creating page table entries other than those of the root page table in the computing devices that run the shared process later, nor deleting page table entries other than those of the root page table in the computing device that ran the shared process earlier. The page table entry management flow of the shared process is thereby simplified, and the related entry management overhead is saved.
The method further comprises: for any shared process, determining the computing device with the largest address translation requirement for the shared process as a new target computing device of the shared process; migrating each target table entry in the page table corresponding to the current target computing device of the shared process into the page table corresponding to the new target computing device; updating the marking information of the shared process in the root page table corresponding to the original target computing device to the first marking value; and updating the marking information of the shared process in the root page table corresponding to the new target computing device to the second marking value. A target table entry is a table entry of the shared process in any page table other than the root page table.
In the above implementation manner, each page table entry of the shared process is migrated to the computing device with the largest address translation requirement for the shared process, so that this computing device no longer needs to perform address translation for the shared process through another computing device. Because this computing device has the largest address translation requirement for the shared process, the overall frequency with which computing devices perform address translation for the shared process through other computing devices is reduced. Performing address translation through another computing device involves interactions such as address conversion requests and therefore consumes more power than performing address translation with the device's own page table, so this implementation manner can reduce the power consumption of the computing system to a certain extent.
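The migration step can be sketched as follows. This is a minimal illustration under stated assumptions: devices are plain dictionaries, the per-device request counts in `demand` stand in for the "address translation requirement", and the names `migrate_target`, `marks` and `lower_tables` are hypothetical rather than part of the described system.

```python
FIRST_MARK = 1   # assumed encoding: only root entries are held locally
SECOND_MARK = 0  # assumed encoding: complete entries are held locally


def migrate_target(pid, devices, demand, current_target):
    """Move the non-root entries of a shared process to the busiest device.

    devices: name -> {"marks": {pid: mark}, "lower_tables": {pid: entries}}
    demand:  name -> number of address translations requested for this process
    """
    new_target = max(demand, key=demand.get)
    if new_target == current_target:
        return current_target
    old, new = devices[current_target], devices[new_target]
    # Migrate the entries of every page table other than the root page table.
    new["lower_tables"][pid] = old["lower_tables"].pop(pid)
    # Swap the marking information in the two root page tables.
    old["marks"][pid] = FIRST_MARK
    new["marks"][pid] = SECOND_MARK
    return new_target


devices = {
    "dev0": {"marks": {7: SECOND_MARK}, "lower_tables": {7: {0x1000: 0x9000}}},
    "dev1": {"marks": {7: FIRST_MARK}, "lower_tables": {}},
}
demand = {"dev0": 10, "dev1": 250}
new_target = migrate_target(7, devices, demand, "dev0")
```

Root page table entries stay in place on both devices; only the lower-level entries and the two marking values change.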
The embodiment of the application also provides a computing system comprising a plurality of computing devices. A root page table of the page table corresponding to each computing device is provided with marking information of each process running in the computing device, and the processes in the computing system include a shared process running in a plurality of computing devices. The marking information of the shared process is a first marking value in the root page tables of the computing devices other than a target computing device. All table entries of the shared process are recorded in the page table corresponding to the target computing device, and only the table entries of the shared process recorded in the root page table are retained in the page tables corresponding to the other computing devices. Each computing device comprises a computing unit and a processing module in communication connection with the computing unit. The processing module is used for: receiving a virtual address to be converted and a process identifier corresponding to the virtual address, reported by the computing unit or by another computing device; determining, according to the process identifier, the marking information of the process corresponding to the process identifier in the root page table of the page table corresponding to the computing device; if the marking information is not the first marking value, finding the physical address corresponding to the virtual address step by step in the page table corresponding to the computing device according to the virtual address; and if the marking information is the first marking value, packing the virtual address and the process identifier into an address conversion request and sending the address conversion request to the target computing device of the process corresponding to the process identifier, so that the target computing device finds the physical address corresponding to the virtual address according to the page table corresponding to the target computing device, the virtual address and the process identifier.
In the above computing system, all the entries of the shared process are completely saved only in the target computing device among the computing devices running the shared process, and only the entries recorded in the root page table are saved in the other computing devices, thereby reducing the page table entry occupation of the shared process. The marking information of the shared process is set to a second marking value in the root page table of the target computing device and to the first marking value in the other computing devices, so that when any computing device performs address translation, it can determine, from the process identifier corresponding to the virtual address to be translated, the marking information of the process in its own page table, and then determine from the value of the marking information whether its own page table holds the complete entries of the process. If the complete entries do not exist, address translation is carried out through the page table corresponding to the target computing device, so that the computing system can perform address translation normally. That is, the computing system provided above reduces the number of complete copies of the entries that must be saved for the shared process from several to one while ensuring that address translation is performed normally, thereby reducing the storage resources required by the shared process, reducing resource waste, and leaving more storage resources for business use.
Further, the processing module comprises a page table conversion module and a page table proxy module, and the page table conversion module is in communication connection with the computing unit and with the page table proxy module. The page table proxy module of each computing device is used for receiving address conversion requests transmitted by other computing devices and parsing out the virtual address and the process identifier, for sending the physical address found by the page table conversion module to the other computing devices, and for packing the virtual address and the process identifier into an address conversion request and sending it to the target computing device. The page table conversion module is used for receiving the virtual address and the process identifier reported by the computing unit, and for receiving the virtual address and the process identifier transmitted by the page table proxy module. The page table conversion module is further used for determining, according to the process identifier, the marking information of the process corresponding to the process identifier in the root page table of the page table corresponding to the computing device; if the marking information is not the first marking value, finding the physical address corresponding to the virtual address step by step in the page table corresponding to the computing device and, where the virtual address came from the page table proxy module, sending the found physical address to the page table proxy module; and if the marking information is the first marking value, sending the virtual address and the process identifier to the page table proxy module for packing.
In the above implementation manner, a page table conversion module and a page table proxy module are provided. Interconnection among the computing devices is realized through the page table proxy module, and the identification of the process marking information and the address translation within the computing device are realized through the page table conversion module. Thus, through cooperation between the page table conversion module and the page table proxy module, each computing device can perform address translation by means of the page table of the target computing device.
The page table proxy module comprises a proxy connection unit, a message receiving and transmitting unit and a conversion request processing unit which are sequentially in communication connection. The proxy connection unit is used for establishing communication connections with the proxy connection units of other computing devices, so as to receive address conversion requests transmitted by the other computing devices or to send the physical address found by the page table conversion module to the other computing devices. The conversion request processing unit is in communication connection with the page table conversion module, and is used for receiving the virtual address and the process identifier transmitted by the page table conversion module and passing them to the message receiving and transmitting unit, and for passing the virtual address and the process identifier parsed by the message receiving and transmitting unit to the page table conversion module. The message receiving and transmitting unit is used for parsing the address conversion request received by the proxy connection unit to obtain the virtual address and the process identifier, and for packing the virtual address and the process identifier from the conversion request processing unit into an address conversion request. The proxy connection unit is also used for sending the address conversion request to the target computing device.
In the implementation manner, the page table proxy module can realize communication interconnection and message receiving and transmitting functions through the proxy connection unit, the message receiving and transmitting unit and the conversion request processing unit, and the functions of the units are relatively simple and are easy to realize in a circuit, so that the page table proxy module can be easily realized.
Further, page table entries whose use frequency is higher than a preset use frequency are cached in the page table proxy module. The page table proxy module is further used for, after receiving an address conversion request transmitted by another computing device and parsing out the virtual address and the process identifier, searching the page table entries cached in the page table proxy module for the physical address corresponding to the virtual address.
In the implementation manner, the page table entries with the use frequency higher than the preset use frequency are cached in the page table proxy module, so that the computing device can prefetch the page table entries which are most likely to be used into the page table proxy module for caching according to the use frequency of the process on the page table entries, and when address conversion is carried out, the physical addresses can be directly searched from the page table proxy module preferentially without searching step by step based on the page table, thereby reducing the acquisition cost of the page table entries.
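One possible reading of this caching behaviour is sketched below. It is illustrative only: the class name `ProxyCache`, the use of a simple use counter as the "use frequency", and the flat dictionary that stands in for the step-by-step page table walk are all assumptions, not details given in the text.

```python
from collections import Counter


class ProxyCache:
    """Caches translations whose use count exceeds a preset threshold."""

    def __init__(self, page_table, threshold):
        self.page_table = page_table  # (pid, va) -> pa; stands in for the multi-level walk
        self.threshold = threshold    # preset use frequency (assumed to be a simple count)
        self.use_count = Counter()
        self.cache = {}
        self.walks = 0                # number of lookups that needed a page table walk

    def lookup(self, pid, va):
        key = (pid, va)
        self.use_count[key] += 1
        if key in self.cache:
            # Served directly from the entries cached in the proxy module.
            return self.cache[key]
        self.walks += 1
        pa = self.page_table.get(key)
        if pa is not None and self.use_count[key] > self.threshold:
            self.cache[key] = pa
        return pa


proxy = ProxyCache({(7, 0x1000): 0x9000}, threshold=2)
results = [proxy.lookup(7, 0x1000) for _ in range(4)]
```

With a threshold of 2, the first three lookups walk the page table and the fourth is answered from the cache.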
Further, each computing device further includes a high-speed memory, and the page table corresponding to each computing device is stored in the high-speed memory of the computing device.
In the above implementation, the page table is stored in the high-speed memory of the computing device, so that the computing device subsequently only needs to read the page table from its own high-speed memory. Compared with a scheme in which the page table is stored outside the computing device, this gives faster access to the page table at lower cost.
The embodiment of the application also provides electronic equipment, which comprises the computing system.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the accompanying drawings in the embodiments of the present application.
Embodiment one:
In order to alleviate the problem of resource waste in a computing system including a plurality of computing devices, an address translation method is provided in an embodiment of the present application, where the address translation method is applied to any one of the computing devices in the computing system including the plurality of computing devices.
In the embodiment of the application, the computing device refers to a device which can run a process, perform virtual-to-physical address translation, and execute a computing task, and can be, but is not limited to, a computing card, a GPU, and the like.
In an embodiment of the application, computing devices within a computing system are communicatively coupled to each other, and each computing device has a corresponding page table. For example, referring to FIG. 1, the computing devices within a computing system may each be coupled to a host and thereby be communicatively coupled to each other via the host. Alternatively, the computing devices may be connected by multiplexing the physical paths between them, or by a physical network card (such as an InfiniBand card or an Ethernet card), which is not limited in the embodiment of the present application. In addition, the computing devices may be connected in a ring or star topology, or through a switch, which is likewise not limited in the embodiments of the present application.
In the embodiment of the present application, the page table of each computing device may be stored in a storage apparatus of that computing device, such as its high-speed memory (for example, HBM (High Bandwidth Memory) or GDDR (Graphics Double Data Rate) video memory). Alternatively, the page tables of the computing devices may be stored in a remote apparatus, for example a host connected to each computing device, a memory provided outside the computing devices, or the high-speed memory of one of the computing devices. In this case, a correspondence between each page table and each computing device is configured in the remote apparatus, so that each computing device can acquire its own page table from the remote apparatus according to the correspondence.
It will be appreciated that the page tables currently in use are usually multi-level tables. When a virtual address is converted to a physical address, the computing device first performs a lookup in the first-level page table, then performs a lookup in the second-level page table according to the entry information found in the first-level page table, and so on until the physical address corresponding to the virtual address is found. In the embodiment of the application, the first-level page table of the multi-level page table is called the root page table.
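The step-by-step lookup can be sketched as follows. The layout is an assumption made purely for illustration (three levels, 9 index bits per level, 4 KiB pages, nested dictionaries in place of in-memory tables); the text does not fix any of these parameters.

```python
from typing import List, Optional, Tuple

PAGE_SHIFT = 12  # assumed 4 KiB pages
INDEX_BITS = 9   # assumed 9 index bits per level
LEVELS = 3       # assumed depth; level 0 is the root page table


def split_virtual_address(va: int) -> Tuple[List[int], int]:
    """Split a virtual address into one index per level plus the page offset."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    indices = []
    for level in range(LEVELS):
        shift = PAGE_SHIFT + INDEX_BITS * (LEVELS - 1 - level)
        indices.append((va >> shift) & ((1 << INDEX_BITS) - 1))
    return indices, offset


def walk(root: dict, va: int) -> Optional[int]:
    """Look up each level in turn, starting from the root page table."""
    indices, offset = split_virtual_address(va)
    node = root
    for idx in indices[:-1]:
        node = node.get(idx)
        if node is None:
            return None  # no entry at this level
    frame = node.get(indices[-1])
    if frame is None:
        return None
    return (frame << PAGE_SHIFT) | offset


root = {1: {2: {3: 0x80}}}  # hypothetical tables: root -> level 1 -> level 2 -> frame number
va = (1 << 30) | (2 << 21) | (3 << 12) | 0x123
pa = walk(root, va)
```

Here the root index selects the second-level table, whose entry in turn selects the last-level table holding the physical frame number.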
In the embodiment of the application, the root page table of each page table is provided with the marking information of the process running in the computing device corresponding to the page table. The marking information at least comprises a first marking value and a second marking value.
In an embodiment of the present application, a concept of a shared process is presented, and a process running in a plurality of computing devices in a computing system is defined as a shared process.
In the embodiment of the present application, one or more of a plurality of computing devices running a shared process are taken as a target computing device, the tag information of the shared process in the root page table of the target computing device is marked as a value other than a first tag value, for example, a second tag value, and the tag information of the shared process in the root page tables of the rest of the computing devices running the shared process except the target computing device is marked as the first tag value. Wherein the first and second flag values are different.
In addition, all entries of the shared process are normally held in the page table of the target computing device, while only entries of the shared process in the root page table are held in the page tables of the rest of the computing devices other than the target computing device among the plurality of computing devices running the shared process.
Thus, by the first flag value and the second flag value in the root page table, each computing device can distinguish whether all entries of the shared process are in its own page table.
In the embodiment of the application, only two flag values, namely a first flag value and a second flag value, may be configured, in which case the second flag value is also used as the flag information of an ordinary process (namely, a process running on only one computing device) that is not a shared process. Thus, when the computing device finds that the marking information of a process in its root page table is the second marking value, this indicates, whether the process is an ordinary process or a shared process, that the computing device holds the complete table entries of the process in its page table and can therefore find the physical address required by the process.
In addition, in the embodiment of the present application, a third flag value may be configured in addition to the first flag value and the second flag value, where the first flag value, the second flag value, and the third flag value are different from each other. At this time, the first flag value characterizes the process as a sharing process and no complete table entry exists in the page table, the second flag value characterizes the process as a sharing process and complete table entry exists in the page table, and the third flag value characterizes the process as a normal process.
It will be appreciated that the entries of the page table have a fixed format, and that bits are reserved in the definition of the page table entries for functional expansion. In the embodiment of the application, the reserved bits in the entries of the root page table can be used for recording the marking information of the process, that is, the marking information of a process can be recorded in the reserved bits of the entry corresponding to that process in the root page table. Page table entry resources are thereby fully utilized, and no additional mechanism for managing the marking information needs to be introduced, so that no extra overhead is incurred, which facilitates adoption in industrial applications.
For example, when the tag information has only two values, the first tag value and the second tag value, bit 5 of the entry may be used for recording the tag information of the process. For example, the first flag value may be set to 1 and the second flag value to 0, or the first flag value may be set to 0 and the second flag value to 1. When the tag information has three values, the first tag value, the second tag value and the third tag value, the 4th bit and the 5th bit of the entry can be used for recording the tag information of the process. For example, the first flag value may be set to 11, the second flag value to 10, the third flag value to 00, and so on.
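The three-value example can be sketched with plain bit operations on a 64-bit entry. This is only an illustration: it assumes that the 4th and 5th bits are counted from bit 0 at the least significant end, and the helper names `set_mark` and `get_mark` are hypothetical.

```python
ENTRY_MASK = (1 << 64) - 1     # root page table entries are 64 bits wide
MARK_SHIFT = 4                 # assumed: marking occupies bits 4 and 5, counted from bit 0
MARK_MASK = 0b11 << MARK_SHIFT

FIRST_MARK = 0b11   # shared process, complete entries are NOT in this page table
SECOND_MARK = 0b10  # shared process, complete entries are in this page table
THIRD_MARK = 0b00   # ordinary (non-shared) process


def set_mark(entry: int, mark: int) -> int:
    """Return the entry with its marking bits replaced; other bits are untouched."""
    return ((entry & ~MARK_MASK) | (mark << MARK_SHIFT)) & ENTRY_MASK


def get_mark(entry: int) -> int:
    """Read the marking information back out of the reserved bits."""
    return (entry & MARK_MASK) >> MARK_SHIFT


entry = 0x0000_0000_8000_0001  # hypothetical entry whose reserved bits are clear
marked = set_mark(entry, FIRST_MARK)
```

Because only the reserved bits are written, the remaining fields of the entry keep their original meaning.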
The address translation method provided in the embodiment of the present application is applied to any one of the computing devices of the computing system, as shown in fig. 2, and includes:
S201, obtaining a virtual address to be converted and a process identifier corresponding to the virtual address.
In an embodiment of the present application, as shown in fig. 3, each computing device may have at least one computing unit therein for performing specific computing tasks assigned by the process. The virtual address to be converted and the process identifier corresponding to the virtual address may be provided by the computing unit when performing the memory access operation.
In addition, for the sharing process, only the target computing device of the sharing process can realize address conversion, so that in the case that a certain computing device is not the target computing device of the sharing process, the virtual address to be converted and the process identifier corresponding to the virtual address obtained from the computing unit of the computing device are packaged into an address conversion request and sent to the target computing device of the sharing process. Therefore, in the case that the computing device executing the method provided by the embodiment of the present application is the target computing device of the shared process, the virtual address to be translated in S201 and the process identifier corresponding to the virtual address may also be obtained from the address translation request sent from the other computing device.
In the embodiment of the present application, the process identifier may be, but is not limited to, any information that uniquely identifies the process, such as a process ID (identifier).
S202, determining the marking information of the process corresponding to the process identifier in a root page table of a page table corresponding to the computing device according to the process identifier.
In the embodiment of the application, after the process identifier is obtained, the table entry corresponding to the process identifier can be found in the root page table, and the value of the marking information recorded in that table entry is then read to determine whether the computing device can perform the address conversion itself.
S203, if the marking information is a first marking value, the virtual address and the process identifier are packaged into an address conversion request and are sent to a target computing device of a process corresponding to the process identifier, so that the target computing device searches a physical address corresponding to the virtual address according to a page table corresponding to the target computing device, the virtual address and the process identifier, and if the marking information is not the first marking value, the physical address corresponding to the virtual address is searched step by step in the page table corresponding to the computing device according to the virtual address.
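Steps S201 to S203 can be sketched as a small simulation. It is a minimal illustration under stated assumptions: the class `Device`, the flat `pages` dictionary that stands in for the step-by-step page table walk, the preset `targets` mapping, and the direct method call that stands in for sending a request between devices are all hypothetical.

```python
from dataclasses import dataclass, field

FIRST_MARK = 1   # assumed encoding: only root entries are held locally
SECOND_MARK = 0  # assumed encoding: complete entries are held locally


@dataclass
class Device:
    name: str
    marks: dict = field(default_factory=dict)    # pid -> marking value in the root page table
    pages: dict = field(default_factory=dict)    # pid -> {virtual address: physical address}
    targets: dict = field(default_factory=dict)  # pid -> target Device of the shared process

    def translate(self, pid, va):
        # S202: read the marking information of the process from the root page table.
        if self.marks[pid] == FIRST_MARK:
            # S203, first branch: pack a request and send it to the target device.
            request = {"pid": pid, "va": va, "sender": self.name}
            return self.targets[pid].handle_request(request)
        # S203, second branch: look the address up in the local page table.
        return self.pages[pid].get(va)

    def handle_request(self, request):
        # The target device unpacks the request and runs the same method.
        return self.translate(request["pid"], request["va"])


gpu0 = Device("gpu0", marks={7: SECOND_MARK}, pages={7: {0x1000: 0x9000}})
gpu1 = Device("gpu1", marks={7: FIRST_MARK}, targets={7: gpu0})
remote_result = gpu1.translate(7, 0x1000)
local_result = gpu0.translate(7, 0x1000)
```

Both devices obtain the same physical address, but only `gpu0` holds the complete entries of process 7.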
In the embodiment of the application, the packed address conversion request can be sent to all the computing devices, and each computing device can analyze the virtual address to be converted and the process identifier corresponding to the virtual address after receiving the address conversion request, so as to determine whether the address conversion of the virtual address can be carried out or not based on the process identifier.
In addition, in the embodiment of the present application, in each computing device running the sharing process, the corresponding relationship between the process identifier and the target computing device of the sharing process may be preset, and then the packaged address conversion request may be directly sent to the target computing device.
In the embodiment of the present application, if the virtual address and the process identifier are obtained from the address translation request sent by the other computing device, after the physical address corresponding to the virtual address is found step by step in the page table corresponding to the computing device according to the virtual address, the physical address may also be returned to the other computing device that sends the address translation request.
In the embodiment of the application, the computing device can add the self identification information or the address information into the address conversion request when the address conversion request is obtained through packaging. After receiving the address conversion request, the target computing device can return the searched physical address to the computing device sending the address conversion request according to the identification information or the address information.
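A request that carries the sender's identification might be packed as in the sketch below. The wire layout (a 16-bit device identifier, a 32-bit process identifier and a 64-bit virtual address, little-endian, without padding) is purely an assumption for illustration; the text does not define a message format.

```python
import struct

# Assumed layout: sender device id (uint16), process id (uint32), virtual address (uint64).
REQUEST_FORMAT = "<HIQ"


def pack_request(sender_id: int, pid: int, va: int) -> bytes:
    """Pack the virtual address, process identifier and sender id into one message."""
    return struct.pack(REQUEST_FORMAT, sender_id, pid, va)


def unpack_request(payload: bytes) -> dict:
    """Parse a received message back into its fields."""
    sender_id, pid, va = struct.unpack(REQUEST_FORMAT, payload)
    return {"sender_id": sender_id, "pid": pid, "va": va}


payload = pack_request(sender_id=3, pid=7, va=0x7F00_0000_1000)
request = unpack_request(payload)
```

The target computing device can then use the `sender_id` field to decide where to return the physical address it finds.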
In the embodiment of the present application, after receiving the physical address returned by the target computing device, the computing device may give the physical address to the computing unit that provides the virtual address to be converted, so that the computing unit completes the memory access operation based on the physical address.
Similarly, in the case that the virtual address to be converted is provided by the computing unit of the computing device, the computing device may, after finding the physical address, give the physical address to the computing unit that provides the virtual address to be converted, so that the computing unit completes the memory access operation based on the physical address.
In some implementations of embodiments of the application, a particular computing device may be designated as the target computing device. In this case, if a computing device runs the shared process before the target computing device does, either of two approaches may be used. In the first approach, the complete table entries of the shared process are stored in the page table of each such computing device and the marking information is set to a second marking value; when the shared process later runs in the target computing device, the complete table entries of the shared process are stored in the page table of the target computing device, the entries previously stored in those computing devices in every page table other than the root page table are removed, and the marking information in their root page tables is changed to the first marking value. In the second approach, the computing device that runs the shared process first notifies the target computing device, so that the target computing device stores the complete table entries of the shared process in its page table and sets the marking information to the second marking value; meanwhile, the computing device that runs the shared process first sets up entries of the shared process only in its root page table and sets the marking information to the first marking value.
In some implementations of embodiments of the application, a computing device that first runs a shared process among a plurality of computing devices that run the shared process may be the target computing device.
For example, when a computing device runs a process, it may query whether other computing devices have already run the process and determine from the query result whether it is the first computing device to run the process. If it is the first, it establishes the complete table entries corresponding to the process in its page table. If it is not the first, it establishes entries corresponding to the process only in its root page table. In this way, computing devices that subsequently run the sharing process never need to establish page table entries outside the root page table, and computing devices that ran the sharing process earlier never need to delete such entries, which simplifies page table entry management for the sharing process and saves the related management overhead.
The query may be performed, for example and without limitation, as follows: the process identifier of the process is sent to each computing device; each computing device checks whether a page table entry corresponding to the process identifier exists in its own page table, where the presence of such an entry indicates that the process has been run and its absence indicates that it has not; each computing device then feeds the query result back to the computing device that sent the process identifier.
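The query-based determination of the first computing device to run a process might be sketched as follows. This is a minimal illustration under assumptions: the class `Device`, the function `start_process`, the numeric encodings of the marking values, and the choice of the second marking value for the first runner are hypothetical; the embodiment only requires that the marking information of the target computing device is not the first marking value.

```python
FIRST_MARK = 1   # assumed encoding: only root entries kept, translation is forwarded
SECOND_MARK = 2  # assumed encoding: complete entries kept locally


class Device:
    def __init__(self, name):
        self.name = name
        self.root_marks = {}       # pid -> marking information in the root page table
        self.full_entries = set()  # pids whose lower-level entries exist locally

    def has_run(self, pid):
        """Answer a query: does this device hold a page table entry for pid?"""
        return pid in self.root_marks


def start_process(device, pid, all_devices):
    """Set up page table entries for pid; return True if device becomes the target."""
    already_run = any(d.has_run(pid) for d in all_devices if d is not device)
    if already_run:
        # Not the first runner: keep root entries only and mark them for forwarding.
        device.root_marks[pid] = FIRST_MARK
    else:
        # First runner: establish the complete entries locally.
        device.root_marks[pid] = SECOND_MARK
        device.full_entries.add(pid)
    return not already_run
```

With two devices, the first call to `start_process` for a given process identifier establishes complete entries, and the second call on the other device establishes root entries only, so no later deletion or re-creation of lower-level entries is needed.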
In some implementations of the embodiments of the present application, once a target computing device has been determined for a sharing process, the computing device with the largest address translation demand for that sharing process may also be determined, periodically or aperiodically, as a new target computing device for the sharing process. Each target entry in the page table corresponding to the original target computing device of the sharing process is then migrated to the page table corresponding to the new target computing device, the marking information of the sharing process in the root page table corresponding to the original target computing device is updated to the first marking value, and the marking information of the sharing process in the root page table corresponding to the new target computing device is updated to the second marking value. A target entry is an entry of the sharing process in any page table other than the root page table.
Thus, since the new target computing device is the computing device with the greatest address translation demand for the shared process, the overall frequency with which address translation for the shared process has to be performed by another computing device is reduced. In addition, address translation performed through another computing device involves interactions such as the address translation request and therefore consumes more power than address translation performed with the device's own page table, so the above embodiment can reduce the power consumption of the computing system to a certain extent.
In the embodiment of the present application, for the same sharing process, the number of address conversion requirements generated by the computing units in each computing device with respect to the sharing process may be counted, and the computing device that generates the largest number of address conversion requirements may be used as the new target computing device of the sharing process. Alternatively, for the same sharing process, the number of address conversion requirements generated locally by the target computing device of the sharing process may be counted, together with the number of address conversion requests sent by each of the other computing devices running the sharing process, and the computing device with the largest number is used as the new target computing device of the sharing process.
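The selection of a new target computing device and the accompanying migration might be sketched as follows. This is an illustrative sketch only: the functions `pick_new_target` and `migrate`, the dictionary layout standing in for the page tables, and the numeric encodings of the marking values are assumptions and are not prescribed by the embodiment.

```python
from collections import Counter

FIRST_MARK, SECOND_MARK = 1, 2  # assumed encodings of the marking values


def pick_new_target(demands):
    """demands: iterable of device names, one per address conversion requirement
    generated for a single sharing process."""
    device, _count = Counter(demands).most_common(1)[0]
    return device


def migrate(lower_entries, root_marks, pid, old, new):
    """Move the non-root (target) entries of pid from old to new and swap the marks."""
    if old == new:
        return  # the current target already has the largest demand
    lower_entries[new][pid] = lower_entries[old].pop(pid)
    root_marks[old][pid] = FIRST_MARK
    root_marks[new][pid] = SECOND_MARK
```

The root page table entries stay in place on both devices in this sketch; only the entries outside the root page table move, and the two marking values are exchanged.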
In an embodiment of the present application, there is further provided a computing system, as shown in fig. 4, which includes a plurality of computing devices, and each computing device includes a computing unit and a processing module, where the processing module is communicatively connected to the computing unit.
Each computing device has a corresponding page table, and the page table may be set in the manner described above, which is not described herein. Similarly, the setting manner of the entries of the sharing process is also referred to above, and will not be described again here.
In the embodiment of the application, the processing module is used for receiving the virtual address to be converted and the process identifier corresponding to the virtual address, which are reported by the computing unit or transmitted by other computing devices, and determining the marking information of the process corresponding to the process identifier in the root page table of the page table corresponding to the computing device according to the process identifier.
The processing module is further used for, when the marking information is not the first marking value, searching step by step in the page table corresponding to the computing device for the physical address corresponding to the virtual address according to the virtual address, and returning the physical address to the computing unit or the other computing device. When the marking information is the first marking value, the processing module packs the virtual address and the process identifier into an address conversion request and sends the address conversion request to the target computing device of the process corresponding to the process identifier, so that the target computing device finds the physical address corresponding to the virtual address according to the page table corresponding to the target computing device, the virtual address and the process identifier.
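The two branches of the processing module might be sketched as follows. This is a minimal illustration under assumptions: the function `process`, the dictionary keys `marks`, `table` and `target_of`, the `forward` callback and the encodings of the marking values are hypothetical, and a flat mapping stands in for the step-by-step page table lookup.

```python
FIRST_MARK, SECOND_MARK = 1, 2  # assumed encodings of the marking values


def process(device, pid, va, forward):
    """Processing-module dispatch: forward the request or translate locally.

    device["marks"][pid] is the marking information in the root page table.
    device["table"][(pid, va)] is a flat stand-in for the step-by-step lookup.
    forward(target, pid, va) is a hypothetical transport that returns the
    physical address found by the target computing device.
    """
    if device["marks"][pid] == FIRST_MARK:
        # Only root entries are kept here: hand the translation to the target.
        return forward(device["target_of"][pid], pid, va)
    # Complete entries are kept here: translate with the local page table.
    return device["table"][(pid, va)]
```

The same function serves both a request reported by the local computing unit and a request received from another computing device, because in either case the marking information in the local root page table decides which branch is taken.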
In an embodiment of the present application, as shown in fig. 5, the processing module may include a page table translation module and a page table proxy module.
The page table conversion module is respectively connected with the calculation unit and the page table proxy module in a communication way. The page table proxy modules of the computing devices are in communication connection.
The page table proxy module is used for receiving address conversion requests transmitted by other computing devices, resolving virtual addresses and process identifications, transmitting the physical addresses found by the page table conversion module to the other computing devices, and packaging the virtual addresses and the process identifications into address conversion requests and transmitting the address conversion requests to the target computing device.
The page table conversion module is used for receiving the virtual address and the process identifier reported by the calculation unit and receiving the virtual address and the process identifier transmitted by the page table proxy module.
The page table conversion module is further used for determining marking information of a process corresponding to the process identification in a root page table of a page table corresponding to the computing device according to the process identification in the address conversion request.
The page table conversion module is further used for searching a physical address corresponding to the virtual address step by step in a page table corresponding to the computing device according to the virtual address when the marking information is not the first marking value, returning the physical address to the computing unit or sending the physical address to the page table proxy module, and sending the virtual address and the process identification to the page table proxy module for packaging when the marking information is the first marking value.
In an embodiment of the present application, the page table translation module may include a page table translation unit and a shared page table processing unit, as shown in fig. 6.
The page table conversion unit is mainly used for searching step by step in the page table according to the virtual address and the process identifier transmitted by the computing unit. When the marking information found in the root page table is the first marking value, the page table conversion unit generates a shared page table conversion request and submits the request to the shared page table processing unit. When the marking information found in the root page table is not the first marking value, the page table conversion unit continues searching step by step in the page table until the final page table entry is found, thereby obtaining the physical address.
The shared page table processing unit is mainly used for receiving the shared page table conversion request sent by the page table conversion unit and submitting it to the page table proxy module for processing. It is also used for functional interaction with the page table proxy module, for example: receiving the result of the shared page table conversion request returned by the page table proxy module and returning the result to the page table conversion unit; and receiving, from the page table proxy module, the virtual addresses and process identifiers carried in address conversion requests sent by other computing devices and submitting them to the page table conversion unit for lookup.
In the embodiment of the present application, a technician may arrange corresponding functional circuits according to the above functions to implement the page table conversion unit and the shared page table processing unit. The circuits implementing the related functions are known in the art and are therefore not described in detail herein.
In the embodiment of the present application, as shown in fig. 7, the page table proxy module may include a proxy connection unit, a messaging unit, and a conversion request processing unit that are communicatively connected in sequence.
The proxy connection unit is used for establishing communication connection with proxy connection units of other computing devices so as to receive address conversion requests transmitted by the other computing devices or send physical addresses searched by the page table conversion module to the other computing devices.
The conversion request processing unit is in communication connection with the page table conversion module, and is used for receiving the virtual address and the process identifier transmitted by the page table conversion module, transmitting the virtual address and the process identifier to the message receiving and transmitting unit, receiving the virtual address and the process identifier transmitted by the message receiving and transmitting unit, and transmitting the virtual address and the process identifier to the page table conversion module.
The message receiving and transmitting unit is used for resolving the address conversion request received by the proxy connection unit to obtain a virtual address and a process identifier, and is used for packaging the virtual address and the process identifier transmitted by the conversion request processing unit into the address conversion request.
The proxy connection unit is also configured to send the address translation request to the target computing device.
In the embodiment of the present application, a technician may arrange corresponding functional circuits according to the above functions to implement the proxy connection unit, the messaging unit, and the conversion request processing unit. The circuits implementing the related functions are known in the art and are therefore not described in detail herein.
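Purely as an illustration of the packing and parsing performed by the messaging unit, a software analogue might look as follows. The wire layout (a one-byte message type, a two-byte sender identifier, a four-byte process identifier and an eight-byte address in network byte order) and the names `pack_message`, `unpack_message`, `REQUEST` and `RESPONSE` are assumptions; the embodiment does not define a message format.

```python
import struct

# Assumed wire layout (not from the embodiment): message type, sender id,
# process identifier, 64-bit address; network byte order, no padding.
_FORMAT = "!BHIQ"
REQUEST, RESPONSE = 1, 2


def pack_message(kind, sender, pid, address):
    """Pack the fields into a fixed-size message."""
    return struct.pack(_FORMAT, kind, sender, pid, address)


def unpack_message(payload):
    """Parse a received message back into its fields."""
    kind, sender, pid, address = struct.unpack(_FORMAT, payload)
    return {"kind": kind, "sender": sender, "pid": pid, "address": address}
```

In this sketch the same layout carries a virtual address in a request and a physical address in the returned message, and the sender field corresponds to the identification information that lets the target computing device return the result.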
In the embodiment of the application, the page table proxy module may further be provided with a buffer for caching page table entries whose use frequency is higher than a preset use frequency. The computing device can therefore prefetch the most frequently used page table entries into the page table proxy module for caching, according to how frequently the process uses the entries. When address conversion is performed, the physical address can preferentially be looked up directly in the page table proxy module without searching step by step through the page table, which reduces the cost of obtaining page table entries.
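A buffer of frequently used entries might be sketched as follows. This is a simplified illustration under assumptions: the class `ProxyCache`, the use of a use count in place of a use frequency, the 4 KiB page size and the `walk_page` callback are hypothetical, and entries are cached on demand here rather than prefetched.

```python
from collections import Counter

PAGE_SHIFT = 12  # assumed 4 KiB pages


class ProxyCache:
    """Buffers page translations whose use count exceeds a preset threshold."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.uses = Counter()   # (pid, virtual page number) -> use count
        self.cached = {}        # (pid, virtual page number) -> frame number

    def translate(self, pid, va, walk_page):
        vpn = va >> PAGE_SHIFT
        offset = va & ((1 << PAGE_SHIFT) - 1)
        key = (pid, vpn)
        frame = self.cached.get(key)
        if frame is None:
            frame = walk_page(pid, vpn)      # step-by-step lookup (or remote request)
            self.uses[key] += 1
            if self.uses[key] > self.threshold:
                self.cached[key] = frame     # hot entry: keep it in the buffer
        return (frame << PAGE_SHIFT) | offset
```

Once a page has been used more often than the threshold, later addresses within the same page are served from the buffer without a further page table walk.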
In order to facilitate understanding of the solution according to the embodiment of the present application, a case where the computing device is a computing card and the computing system includes two computing cards is taken as an example, and further exemplary description is made on the solution according to the embodiment of the present application.
Referring to fig. 8, the computing system includes a computing card 1 and a computing card 2, the computing card 1 and the computing card 2 are both connected with a host, and a page table proxy module of the computing card 1 is communicatively connected with a page table proxy module of the computing card 2. The page table of the computing card 1 is stored in the high-speed memory of the computing card 1, and the page table of the computing card 2 is stored in the high-speed memory of the computing card 2. And as shown in fig. 8, for the sharing process, only entries of the root page table are reserved in the page table of the computing card 2.
For this sharing process:
When the computing unit in the computing card 2 initiates a data access, it inputs the virtual address VA to be converted and the process ID into the page table conversion module of the computing card 2. When the page table conversion module looks up the root page table entry through the VA, it finds that the marking information of the current process is the first marking value, so the VA and the process ID are sent to the shared page table processing unit. The shared page table processing unit in the page table conversion module generates a request and sends the request to the conversion request processing unit in the page table proxy module of the computing card 2.
After receiving the request, the conversion request processing unit of the computing card 2 sends VA and the process ID to the messaging unit to be packaged into a message, and sends the message to the page table proxy module of the computing card 1 through the proxy connection unit.
After receiving the message through the proxy connection unit, the page table proxy module of the computing card 1 analyzes VA and process ID through the message receiving and transmitting unit, generates a shared page table conversion request through the conversion request processing unit, and sends the shared page table conversion request to the shared page table processing unit of the computing card 1.
The shared page table processing unit of the computing card 1 generates a page table conversion request of VA and a process ID, and the page table conversion request is sent to the page table conversion unit of the computing card 1 to carry out page table conversion to obtain a physical address PA corresponding to the VA and the process ID.
The page table conversion unit of the computing card 1 returns the PA it finds to the shared page table processing unit, the shared page table processing unit returns the PA to the conversion request processing unit of the page table proxy module, and the conversion request processing unit encapsulates the PA, through the messaging unit, into a return message for the address conversion request.
The page table proxy module of the computing card 1 sends the return message back to the page table proxy module of the computing card 2.
After the page table proxy module of the computing card 2 parses the return message to obtain the PA, the PA is sent to the page table conversion module of the computing card 2.
At this point, the computing card 2 has had the page table conversion module of the computing card 1 complete the conversion from VA to PA through the page table conversion proxy mechanism.
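The message exchange between the two computing cards walked through above might be modeled in software as follows. This is a rough sketch under assumptions: the class `Card`, its `inbox` queue standing in for the proxy connection, the tuple message format and the flat `table` standing in for the full page table are all hypothetical, and the internal units of each module are collapsed into two methods.

```python
from collections import deque

FIRST_MARK, SECOND_MARK = 1, 2  # assumed encodings of the marking values


class Card:
    def __init__(self, name, marks, table, target_of):
        self.name = name
        self.marks = marks          # pid -> marking information in the root page table
        self.table = table          # (pid, va) -> pa, flat stand-in for the page table
        self.target_of = target_of  # pid -> name of the target computing card
        self.inbox = deque()        # messages delivered over the proxy connection
        self.results = {}           # (pid, va) -> pa handed back for the access

    def access(self, pid, va, cards):
        """A computing unit initiates a data access."""
        if self.marks[pid] == FIRST_MARK:
            target = cards[self.target_of[pid]]
            target.inbox.append(("request", self.name, pid, va))
        else:
            self.results[(pid, va)] = self.table[(pid, va)]

    def poll(self, cards):
        """The page table proxy module handles one received message, if any."""
        if not self.inbox:
            return False
        kind, sender, pid, value = self.inbox.popleft()
        if kind == "request":
            pa = self.table[(pid, value)]  # lookup in the target's own page table
            cards[sender].inbox.append(("response", self.name, pid, (value, pa)))
        else:
            va, pa = value
            self.results[(pid, va)] = pa
        return True
```

In this model, an access on the card holding only root entries leaves its result pending until the target card has processed the request and the requesting card has processed the returned message, which mirrors the round trip from the computing card 2 to the computing card 1 and back.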
When a computing unit in the computing card 1 initiates a data access, it inputs the virtual address VA to be converted and the process ID into the page table conversion module of the computing card 1. When the page table conversion module looks up the root page table entry through the VA, it finds that the marking information of the current process is the second marking value, so it continues with the lookups in the second-level, third-level and fourth-level page tables and finally obtains the physical address PA.
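The step-by-step lookup through the root, second-level, third-level and fourth-level page tables might be sketched as follows. The index width of 9 bits per level, the 4 KiB page size and the nested-dictionary representation are assumptions made for illustration; the embodiment does not fix these parameters.

```python
LEVELS = 4
INDEX_BITS = 9     # assumed: 9 index bits per level
PAGE_SHIFT = 12    # assumed: 4 KiB pages


def indices(va):
    """Split a virtual address into per-level indices, root level first."""
    mask = (1 << INDEX_BITS) - 1
    return [(va >> (PAGE_SHIFT + INDEX_BITS * level)) & mask
            for level in reversed(range(LEVELS))]


def walk(root, va):
    """Follow the root, second-, third- and fourth-level tables to a frame number."""
    node = root
    for index in indices(va):
        node = node[index]  # a KeyError models a missing page table entry
    return (node << PAGE_SHIFT) | (va & ((1 << PAGE_SHIFT) - 1))
```

Each level of nesting in `root` corresponds to one page table level, and the value stored at the fourth level is taken to be the physical frame number, to which the page offset of the virtual address is appended.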
In the embodiment of the application, each entry of the root page table occupies only 64 bits, and one process generally has no more than 255 root page table entries. For the same sharing process, the page table entries on the computing card 2 therefore occupy at most 255 × 8 bytes = 2040 bytes, that is, less than 2 KB. The computing card 2 does not need to keep the second-level, third-level and fourth-level page tables of the sharing process, which greatly reduces the page table footprint of the computing card 2. Overall, the page table footprint of the system in a scenario with multiple computing devices is almost the same as in a scenario with a single computing device, so resource waste is reduced and more storage resources are left for service use.
Based on the same inventive concept, the embodiment of the application also provides an electronic device, which comprises the computing system provided by the embodiment of the application.
In the embodiment of the application, the electronic device may be, but is not limited to, a smart terminal (such as a smartphone, a computer, a tablet, a smart television and the like), a server, a console, an unmanned aerial vehicle, or another device that requires computation.
In the embodiments of the present application, each embodiment or implementation may be freely combined to obtain a new embodiment without conflict.
In the embodiments provided by the present application, the device embodiments described above are merely illustrative.
In this document, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
Herein, a plurality refers to two or more.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.