CN1848096A - Improved apparatus and method for preventing duplicate matching entries in a translation lookaside buffer

Info

Publication number
CN1848096A
Authority
CN
China
Prior art keywords
input
translation lookaside
lookaside buffer
request
avoid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2005100649997A
Other languages
Chinese (zh)
Inventor
莱恩·C·肯特
G·麦可·亚勒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MIPS Tech LLC
Original Assignee
MIPS Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MIPS Technologies Inc
Priority to CNA2005100649997A
Publication of CN1848096A
Legal status: Pending

Abstract

The present invention relates to an improved apparatus and method for preventing duplicate matching entries in a translation lookaside buffer (TLB). Each entry in the TLB has an Include bit that specifies whether the entry is to be included in, or excluded from, the tag match comparison. The invention also provides the steps of a method that uses the Include bit to prevent duplicate matching entries in the TLB.

Description

Improved apparatus and method for preventing duplicate matching entries in a translation lookaside buffer
Technical field
The present invention relates to the technical field of translation lookaside buffers (TLBs) in microprocessors, and in particular to techniques for preventing duplicate matching entries in a TLB.
Background art
Most modern microprocessors support the concept of virtual memory. In a virtual memory system, program instructions executing on the microprocessor refer to data using virtual addresses within the microprocessor's virtual address space; the instructions themselves are likewise referenced by virtual addresses in that space. The virtual address space is typically much larger than the physical memory actually present in the system. To access system memory or other devices such as I/O devices, the microprocessor translates the virtual addresses it generates into physical addresses, which are presented on the system bus.
The virtual memory scheme typically supported by microprocessors is a paged memory system, which uses a paging mechanism to translate, or map, virtual addresses to physical addresses. The physical address space is divided into physical pages of fixed size; a common page size is 4 KB. A virtual address comprises a virtual page address portion and a page offset portion. The virtual page address identifies a virtual page in the virtual address space, and the paging mechanism translates it into a physical page address. The page offset specifies a physical offset within the physical page, that is, an offset from the physical page address.
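The split of a virtual address into a virtual page address and a page offset can be illustrated with a minimal Python sketch. It assumes the 4 KB page size mentioned above; the function names are hypothetical and chosen only for illustration.

```python
PAGE_SIZE = 4 * 1024                       # 4 KB page, as in the text
OFFSET_BITS = PAGE_SIZE.bit_length() - 1   # 12 offset bits for a 4 KB page


def split_virtual_address(vaddr):
    """Return (virtual page address, page offset) for a virtual address."""
    vpn = vaddr >> OFFSET_BITS             # upper bits select the virtual page
    offset = vaddr & (PAGE_SIZE - 1)       # lower bits are the offset in the page
    return vpn, offset


def physical_address(pfn, offset):
    """Combine a translated physical page address with the unchanged offset."""
    return (pfn << OFFSET_BITS) | offset
```

Only the page address is translated; the offset passes through unchanged, which is why the paging mechanism needs one translation per page rather than per address.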
The advantages of paging are well known to those skilled in the art. One advantage is that a program can execute within a virtual memory space larger than the physical memory actually present. Another is that paging makes it easy to relocate programs to different physical memory locations across different or multiple executions. A further advantage is that paging allows multiple programs to execute on the processor concurrently, each with its own allocation of physical memory pages to which it has exclusive access, without having to be swapped to disk and without having to dedicate the entire physical address space to a single program. Yet another advantage is that paging provides memory protection between programs on a per-page basis.
Page translation (translating a virtual page address into a physical page address) is performed by what is commonly called a page table walk. Typically, the operating system maintains page tables that contain the information needed to translate virtual page addresses into physical page addresses. The page tables usually reside in system memory, and because a translation requires multiple memory accesses, a page table walk is a relatively time-consuming operation.
To reduce the number of page table walks, many microprocessors provide a mechanism for caching page table information, namely the physical page addresses translated from frequently used virtual page addresses. This page table information cache is commonly called a translation lookaside buffer (TLB). When a virtual page address is presented to the TLB, the TLB performs a lookup of that address. If the virtual page address hits in the TLB, the TLB supplies the corresponding translated physical page address, thereby avoiding the time-consuming page table walk that would otherwise be needed to translate the virtual page address into a physical page address.
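The caching role of a TLB can be sketched as follows. This is a simplified illustration, not the circuit described in this document: a Python dictionary stands in for the page tables in system memory, the TLB fills itself on a miss, and the class and attribute names are hypothetical.

```python
class SimpleTLB:
    """Toy model: a small cache placed in front of a slow page table walk."""

    def __init__(self, page_table):
        self.page_table = page_table   # dict: virtual page -> physical page
        self.cache = {}                # cached translations
        self.walks = 0                 # number of page table walks performed

    def translate(self, vpn):
        if vpn in self.cache:          # TLB hit: no walk needed
            return self.cache[vpn]
        self.walks += 1                # TLB miss: perform the costly walk
        pfn = self.page_table[vpn]
        self.cache[vpn] = pfn          # remember the translation
        return pfn
```

Translating the same virtual page twice performs only one walk; the second request is served from the cache.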
Some microprocessors that use a TLB fill it automatically in hardware. Most, however, rely on the operating system to program the contents of the TLB. In such microprocessors, under certain conditions, the operating system may load into the TLB two different entries whose virtual page addresses both match a virtual page address presented for translation or lookup. This is clearly undesirable. First, it becomes unclear which translated physical page address the TLB will output; if the TLB outputs the wrong physical page address and the processor continues operating without any corrective action, data may be corrupted. Second, depending on the TLB circuitry, an attempt to output more than one physical page address may damage the integrated circuit of the microprocessor.
Therefore, what is needed is an apparatus and method for preventing duplicate matching entries in a TLB.
Summary of the invention
In one embodiment, the invention provides a TLB that includes a flag in each entry. When the TLB determines whether a virtual address matches any entry in the TLB, the flag indicates whether the entry is to be included in or excluded from the comparison. When an entry is successfully written to the TLB, the flag is set in the written entry. When software attempts to write an entry whose virtual address matches the virtual address of an entry already present in the TLB, the flag of the matching entry is cleared, so that the matching entry is excluded from match determinations until it is next successfully written. In a further aspect, the write to the TLB is aborted and an exception is generated to notify the operating system of the attempt to write a duplicate matching entry, allowing the operating system to remedy the situation. However, the write is aborted and the exception generated only when the matching entry is an entry other than the one being written, the matching entry is valid, and the entry value to be written is valid; otherwise, the TLB writes the specified entry. This reduces the number of exceptions generated.
For a further understanding of the present invention, a detailed description follows in conjunction with the accompanying drawings.
Description of drawings
Fig. 1 is a block diagram of a microprocessor core according to the present invention;
Fig. 2 is a block diagram of the TLB of Fig. 1 according to the present invention;
Fig. 3 is a block diagram of the data array of Fig. 2 according to the present invention;
Fig. 4 is a block diagram of the tag array block of Fig. 2 according to the present invention;
Fig. 5 is a block diagram of the exception generation logic of Fig. 2 according to the present invention;
Fig. 6 is a flowchart of the operation of the microprocessor of Fig. 1 during a write to the TLB according to the present invention;
Fig. 7 is a flowchart of the operation of the microprocessor of Fig. 1 during a lookup of the TLB according to the present invention; and
Fig. 8 is a block diagram of a processing system employing the microprocessor and TLB of Fig. 1 according to the present invention.
Embodiment
To more fully appreciate the advantages of the present invention, the following description first discusses situations in which software, such as an operating system, may attempt to write duplicate matching entries into the TLB.
The first scenario involves the transfer of control from firmware (for example, a ROM monitor on a circuit board carrying the microprocessor with the TLB) to the operating system. When the computer system is reset, the firmware may initialize the TLB to a known state, for example by writing each entry so that the translated physical page address equals the virtual page address. Each firmware write to the TLB must specify, in the written entry, a virtual page address within the processor's virtual address space. When control passes to the operating system, the operating system begins writing entries to the TLB (the operating system may itself initialize the TLB to a known state). The operating system does not know which virtual page addresses the firmware wrote to the TLB. The operating system may therefore attempt to write an entry whose virtual page address is the same as that of another TLB entry written by the firmware.
The second scenario involves the operating system flushing every entry of the TLB, for example in response to a task switch. The operating system flushes the TLB by invalidating each entry; that is, it writes to each entry a valid bit value indicating that the entry does not contain a valid virtual-to-physical page address translation. Because each invalidating write must specify a virtual page address within the processor's virtual address space, the operating system may attempt to write a duplicate matching entry, since the virtual page addresses are identical. In theory, the operating system could take the steps needed to examine the current contents of the TLB and ensure that no duplicate matching entry is written. In practice, however, the operating system usually just wants to flush the TLB as quickly as possible by invalidating the entries, regardless of the virtual page addresses already stored in the TLB. The flush routine therefore typically writes the same virtual page address, with an invalid value in the valid bit, to every entry of the TLB. The flush routine could of course be made more elaborate to avoid writing duplicate entries, but a flush that depends on the current contents of the TLB would make the overall flush operation slower and more laborious.
The third scenario arises when the operating system writes new entries to the TLB after flushing it. Because the operating system's TLB flush routine must choose some virtual page address to write into the entries it invalidates, a virtual page that the operating system assigns to a new task may have the same virtual page address as the one chosen by the flush routine. Consequently, when the operating system's TLB refill routine writes a new valid entry to the TLB, a duplicate matching write may well occur.
The fourth scenario arises when the operating system deallocates a virtual page (for example, in response to the termination of a task) and later reallocates it (for example, to a new task). When the operating system deallocates a virtual page, it marks the corresponding page table entry invalid and invalidates the TLB entry that caches that page table entry. When the operating system later reallocates the virtual page, it will not necessarily write, or may not easily be able to write, the new mapping of the virtual page into the same TLB entry that held the old mapping. Consequently, when the operating system attempts to write the new mapping to the TLB, a duplicate matching entry may well result.
The fifth scenario arises when the operating system has a bug or suffers some other catastrophic failure. In this case, a valid entry exists in the TLB and the operating system may write a duplicate matching entry to the TLB.
One solution for preventing duplicate matching entries in the TLB is to have the microprocessor generate an exception so that the operating system is made aware of the condition. As discussed above, however, the first through fourth scenarios are legitimate, expected situations of which the operating system need not be notified. Moreover, because the operating system is either running user programs or handling exceptions, the more exceptions it must handle, the more system performance suffers. Furthermore, the operating system's exception handler would need additional code to handle these expected situations in addition to genuine exception conditions.
To solve the foregoing problems, the present invention provides a flag, designated Include (Inc), in each TLB entry, which indicates whether the corresponding entry is to be included in or excluded from the virtual page address match comparison. If a write to the TLB would create a duplicate matching entry, the Inc bit of the matching entry is cleared. If the situation is an unexpected one (for example, a valid matching entry exists in the TLB and the value to be written is also valid), the write is aborted and an exception is generated. Conversely, if the situation is an expected one, no exception is generated, and the cleared Inc bit excludes the matching entry, thereby preventing duplicate matching entries from arising in subsequent TLB lookups.
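The write policy described above can be sketched in Python. This is a simplified software model under stated assumptions, not the hardware of the invention: entries are matched on the virtual page address alone (ASID, page mask, and G bit are omitted), only entries whose Inc bit is set take part in the match, and the Inc bit of the matching entry is left untouched when the exception is raised, a point the text above does not settle. All class and method names are hypothetical.

```python
from dataclasses import dataclass


class DuplicateEntryError(Exception):
    """Stands in for the exception raised on an unexpected duplicate write."""


@dataclass
class Entry:
    vpn: int = 0
    valid: bool = False
    inc: bool = False          # Include bit: entry takes part in matching


class IncTLB:
    def __init__(self, size):
        self.entries = [Entry() for _ in range(size)]

    def matches(self, vpn):
        """Indexes of entries that are included and match the page address."""
        return [i for i, e in enumerate(self.entries) if e.inc and e.vpn == vpn]

    def write(self, index, vpn, valid):
        others = [i for i in self.matches(vpn) if i != index]
        # Unexpected case: a different, valid entry already matches and the
        # value being written is also valid -> abort the write.
        if valid and any(self.entries[i].valid for i in others):
            raise DuplicateEntryError(f"duplicate match for VPN {vpn:#x}")
        # Expected case: exclude the old matching entries from later matches.
        for i in others:
            self.entries[i].inc = False
        # A successful write sets the Inc bit of the written entry.
        self.entries[index] = Entry(vpn, valid, True)
```

In this model a flush routine that writes the same virtual page address, marked invalid, to every entry raises no exception and leaves only the most recently written entry included, so a later lookup can never match more than one entry.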
The scenarios above are merely examples, given to aid understanding of the invention and its advantages, of how software may attempt to write duplicate matching entries to the TLB. They are not an exhaustive list of such situations, nor an enumeration of every problem the invention solves. Other situations in which software attempts to write duplicate matching entries to a TLB may exist, and the invention may also solve similar problems not listed here.
Referring to Fig. 1, a block diagram of a microprocessor core 100 according to the present invention is shown. The microprocessor core 100 is similar to processor cores produced by MIPS Technologies, Inc.; however, the microprocessor core 100 is not limited to MIPS products and may be any other microprocessor having a software-programmable TLB.
The microprocessor 100 includes a fetch unit 106 for fetching the program instructions that the microprocessor 100 executes. The fetch unit 106 is coupled to an instruction cache 104, which caches instructions recently fetched by the microprocessor 100. In one embodiment, the instruction cache 104 comprises a 4-way set-associative 64 KB cache. The fetch unit 106 is also coupled to a bus interface unit 116, which interfaces the microprocessor 100, via a processor bus, to other parts of the computer (for example, system memory). The fetch unit 106 determines whether the next instruction to be fetched is present in the instruction cache 104. If so, the fetch unit 106 fetches the instruction from the instruction cache 104; otherwise, the fetch unit 106 requests the bus interface unit 116 to fetch the instruction from system memory or from another level of the memory hierarchy, such as a cache between the instruction cache 104 and system memory. In one embodiment, the fetch unit 106 includes control logic for the instruction cache 104, an instruction decoder, and a branch instruction predictor.
The microprocessor 100 also includes an execution unit 102 coupled to the fetch unit 106. The execution unit 102 executes the program instructions fetched by the fetch unit 106. In one embodiment, the execution unit 102 includes an address generation unit, a branch resolution unit, an ALU for performing logic operations, a shifter and aligner, an integer multiply/divide unit, and a floating-point unit. The fetch unit 106 issues instructions to the execution unit 102.
The microprocessor 100 also includes a load/store unit 112 coupled to the bus interface unit 116 and the execution unit 102. Via the bus interface unit 116, the load/store unit 112 loads data from system memory into registers of the microprocessor 100 in the execution unit 102 and stores data from the registers to system memory. The load/store unit 112 is also coupled to a data cache 114, which caches data recently used by the microprocessor 100. In one embodiment, the data cache 114 comprises a 4-way set-associative 64 KB cache. If the data to be loaded or stored is cacheable, the load/store unit 112 determines whether the cache line containing that data is present in the data cache 114. If so, the load/store unit 112 loads the data from, or stores the data to, the data cache 114; otherwise, the load/store unit 112 requests the bus interface unit 116 to read the data from, or write the data to, system memory. In one embodiment, the load/store unit 112 also performs write-allocation for stores that miss in the data cache 114.
The microprocessor 100 also includes a TLB 108 coupled to the fetch unit 106 and the load/store unit 112. The TLB 108 comprises a cache of page translation entries for translating virtual memory addresses into physical memory addresses. The virtual memory addresses translated by the TLB 108 are generated by the fetch unit 106 and the load/store unit 112. Specifically, the TLB 108 translates a virtual address into a physical address in order to determine whether an instruction request hits in the instruction cache 104 or a data request hits in the data cache 114. In one embodiment, the TLB 108 is programmed by system software, such as an operating system, running on the microprocessor 100.
In one embodiment, the TLB 108 comprises an instruction micro-TLB that services the instruction cache 104, a data micro-TLB, and a joint TLB that backs the two micro-TLBs; the micro-TLBs contain a subset of the joint TLB. In one embodiment, the joint TLB is a configurable 16/32/64 dual-entry fully associative joint TLB, the instruction micro-TLB is a 4-entry fully associative TLB, and the data micro-TLB is an 8-entry fully associative TLB. In one embodiment, when software programs the TLB 108, the information is written to the joint TLB, and the micro-TLBs are not visible to software. When a TLB lookup is performed to translate a virtual address into a physical address, the micro-TLB is accessed first. If the micro-TLB has no matching entry, the joint TLB is used to translate the virtual address into a physical address and the micro-TLB is refilled. If the joint TLB has no matching entry either, a TLB exception is generated. The TLB 108 is described in detail in the following description and figures.
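The two-level lookup order (micro-TLB first, then joint TLB, then exception) can be sketched as follows. This is an illustrative model only: the oldest-first replacement of micro-TLB entries, the dictionary representation, and all names are assumptions not found in the text.

```python
from collections import OrderedDict


class TLBRefillException(Exception):
    """Stands in for the TLB exception raised when no entry matches."""


class TwoLevelTLB:
    def __init__(self, micro_size=4):
        self.micro = OrderedDict()     # small, software-invisible micro-TLB
        self.micro_size = micro_size
        self.joint = {}                # software-programmed joint TLB

    def program(self, vpn, pfn):
        """Software writes go to the joint TLB only."""
        self.joint[vpn] = pfn
        self.micro.pop(vpn, None)      # drop any stale micro-TLB copy

    def translate(self, vpn):
        if vpn in self.micro:                      # 1. micro-TLB hit
            return self.micro[vpn], "micro"
        if vpn in self.joint:                      # 2. joint TLB hit
            if len(self.micro) >= self.micro_size:
                self.micro.popitem(last=False)     # evict the oldest entry
            self.micro[vpn] = self.joint[vpn]      # refill the micro-TLB
            return self.joint[vpn], "joint"
        raise TLBRefillException(vpn)              # 3. miss in both levels
```

The first translation of a newly programmed page is served by the joint TLB and refills the micro-TLB; repeated translations of that page are then served by the micro-TLB.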
Referring to Fig. 2, a block diagram of the TLB 108 of Fig. 1 is shown. The TLB 108 receives requests to perform a lookup, which determines whether a virtual address tag matches the tag of an entry present in the TLB 108. In one embodiment, the lookup virtual address tag is specified by a VPN_in input 228 and an ASID_in input 226.
The VPN_in input 228 specifies the virtual page address of a memory page, also called the virtual page number (VPN). If the tag comparison matches, the TLB 108 outputs the translated physical page address on a PFN_out output 252 and the page attributes of the page on a PgAttr_out output 254. The translated physical page address, represented as a physical frame number (PFN), is the translation of the virtual page address 228. In one embodiment, the VPN_in input 228 comprises 20 bits.
The ASID_in input 226 specifies an address space identifier (ASID), which identifies an address space. In one embodiment, the operating system executing on the microprocessor 100 assigns an address space to each program or task running on the microprocessor 100 and specifies the ASID of that address space. The ASID_in 226 thus extends the VPN_in 228 to produce a unique virtual address tag for the TLB 108 lookup. In one embodiment, the ASID_in 226 comprises 8 bits, identifying 256 unique address spaces. As shown in the equation of Fig. 4, the ASID_in 226 is selectively included with the VPN_in 228 in the tag match determination according to the value of the G bit 436 stored in each TLB 108 entry, as described in detail below.
If the lookup produces no tag match, the TLB 108 generates a TLB_refill_exception 216 so that the operating system can refill (that is, write) the TLB 108 with an entry specifying the translation for the lookup virtual address tag that was missing from the TLB 108. The lookup operation of the TLB 108 is described in detail below, particularly with respect to Fig. 7.
In addition, the TLB 108 receives requests to write an entry into the TLB 108 (the entry values are supplied on inputs 222 through 236). The TLB 108 also receives a write_idx input 238, which specifies which entry of the TLB 108 is to be written, and a TLB_write input 242, which indicates whether the request is a request to write an entry into the TLB 108. An entry supplied in a TLB 108 write request comprises a tag portion, a valid bit, and a data portion. The tag values are stored in a tag array block 202, and the data values are stored in a data array 204.
The data of a TLB 108 write request comprises a PFN_in input 222 and a PgAttr_in input 224. The PFN_in input 222 specifies the physical frame number (PFN), or physical page address, to be written to the data array 204. The PgAttr_in input 224 specifies the attributes, to be written to the data array 204, of the memory page specified by the PFN_in input 222. In one embodiment, the page attributes specified by the PgAttr_in input 224 include a valid bit, a write-enable bit, a dirty bit indicating whether the page has been written, and cache coherency attributes.
The tag of a TLB 108 write request comprises the virtual page address supplied on the VPN_in input 228 and the ASID supplied on the ASID_in input 226.
The tag of a TLB 108 write request also comprises a PgMask_in input 232. In one embodiment, the microprocessor 100 supports multiple page sizes, and the PgMask_in input 232 specifies a page mask value that determines the size of the page represented by the virtual address 228. When a TLB 108 write request is received, the PgMask_in input 232 is used as a qualifier in evaluating whether a tag match occurs (as shown in the equation of Fig. 4).
The tag of a TLB 108 write request also comprises a global (G) bit input G_in 236, which indicates whether the ASID_in 226 is to be included in the tag match determination. In one embodiment, if G_in 236 is set, the ASID_in 226 is excluded from the tag match comparison (as shown in the equation of Fig. 4). In one embodiment, G_in 236 enables the operating system to use a portion of the virtual address space that is shared by all programs.
A TLB 108 write request also comprises a valid bit input 234 (Valid_in), which indicates whether the entry being written to the TLB 108 is valid. Valid_in 234 allows an entry in the TLB 108 to be programmed with a valid virtual-to-physical address translation or to be invalidated; in particular, Valid_in 234 enables the operating system to invalidate entries in the TLB 108.
The TLB 108 also includes the data array 204. As shown in Fig. 3, the data array 204 comprises an array of storage elements, each of which stores a portion of a TLB 108 entry.
Referring to Fig. 3, a block diagram of the data array 204 of Fig. 2 is shown. In the embodiment of Fig. 3, the data array 204 contains 64 entries; however, the present invention does not limit the number of entries in the TLB and accommodates TLBs of various sizes. As shown, each entry in the data array 204 comprises a physical frame number (PFN) 302 (also called a physical page address 302) and the attributes PgAttr 304 of the memory page specified by the corresponding PFN 302. The data array 204 receives the PFN_in input 222 and the PgAttr_in input 224, and also receives a select input 258 and a data_write input 244. If the data_write input 244 indicates a write to the data array 204, the PFN 302 of the entry specified by the select input 258 is written with the value on the PFN_in input 222, and its PgAttr 304 is written with the value on the PgAttr_in input 224. Conversely, for a read of the data array 204, the data array 204 outputs the PFN 302 of the entry specified by the select input 258 on PFN_out 252 and its PgAttr 304 on PgAttr_out 254.
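The read and write behavior of the data array can be modeled in a few lines of Python. The signal names follow the text (select, data_write, PFN_in, PgAttr_in, PFN_out, PgAttr_out); the class itself, the plain-integer representation of attributes, and the single `access` method are illustrative assumptions.

```python
class DataArray:
    """Toy model of the 64-entry data array of Fig. 3."""

    def __init__(self, num_entries=64):
        self.pfn = [0] * num_entries       # PFN 302 of each entry
        self.pgattr = [0] * num_entries    # PgAttr 304 of each entry

    def access(self, select, data_write, pfn_in=0, pgattr_in=0):
        if data_write:
            # Write: store PFN_in and PgAttr_in into the selected entry.
            self.pfn[select] = pfn_in
            self.pgattr[select] = pgattr_in
            return None
        # Read: drive the selected entry onto (PFN_out, PgAttr_out).
        return self.pfn[select], self.pgattr[select]
```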
Seeing also Fig. 4, is the calcspar of referencenumerals group block 202 among Fig. 2 of the present invention; Markarray block 202 is themark arrays 412 that comprise a storage assembly, and each storage assembly is in order to store the part of TLB108 input.Single representational input content in Fig. 4 show tags array 412 (with the i representative) and other and the relevant assembly of eachmark array block 202 input.Though Fig. 4 only shows the logic of storage assembly and single marking array input usefulness, it is noted thatmark array 412 of the present invention is still and comprise a plurality of inputs.In one embodiment,mark array 412 comprises 64 inputs, corresponding to 64 inputs on thedata array 204 among Fig. 3 embodiment.Though TLB 108 herein narrates with a specific input number, should be not limit the number of its TLB input up to enforcement of the present invention.
Mark array 412 comprises virtual page number number (VPN) 428 (or claiming virtual page address 428),VPN 428 is the virtual addresses that store a memory pages, andconcrete page address 302 is to be stored in the corresponding input ofdata array 204 among Fig. 3 after the conversion of this memory pages.In one embodiment,VPN 428 comprises 28, and formula as shown in Figure 4, the numerical value ofVPN 428 are the situations in order to confirm whether occurrence flag is mated; Under the situation that writes TLB 108, the numerical value that is about toVPN_in input 228 writes among theVPN 428.
Mark array 412 also comprises theASID field 426 that the address space pointer bymark array 412 input is indicated, formula as shown in Figure 4, ASIDfield 426 is optionally in order to confirming that whether the situation of underlined coupling takes place, and this is confirmed to be based onG position 436 numerical value (if a TLB108 viewer program) andG position 436 numerical value (if aTLB 108 write-in programs) with G_in input 236.In one embodiment, ASID426 comprises 8, in order to indicating 256 peculiar address spaces, and inTLB 108 write-in programs, is that the numerical value withASID_in input 226 writes in theASID field 426.
Mark array 412 input also comprises page mask (PgMask)field 432, in order to store mask value, this mask value is the page size that is indicated byTLB 108 inputs in order to confirm, shown in the formula on Fig. 4, thisPgMask field 432 is as in order to whether to confirm the assessment pointer of occurrence flag matching state.In the program that writesTLB 108, be PgMask_in to be gone into 232 numerical value write in the PgMaskfield 432.
Each tag array 412 entry also comprises a global (G) bit 436, which indicates whether the ASID 426 should be included in or excluded from the tag match comparison, as shown by the formulas in Fig. 4. In one embodiment, if the G bit 436 is set, the ASID 426 is excluded from the tag match comparison. In another embodiment, the G bit 436 allows the operating system to use a portion of the virtual address space that is shared by all processes. On a write to the TLB 108, the value of the G_in input 236 is written into the G bit 436.
Each tag array 412 entry also comprises a valid bit 434, which indicates whether the TLB 108 entry is valid. That is, the Valid bit 434 indicates whether the physical page address 302 and page attributes 304 stored in the corresponding entry of the data array 204 are a valid translation of the virtual address tag, where the virtual address tag of the corresponding TLB 108 entry is specified by the ASID 426, VPN 428, PgMask 432, and G 436 fields.
Each tag array 412 entry also comprises an Include (Inc) bit 438, which indicates whether the tag array 412 entry should be included in the tag match comparison. In one embodiment, if the Inc bit 438 is set, the TLB 108 entry is included in the tag match comparison; conversely, if the Inc bit 438 is clear, the TLB 108 entry is excluded from the tag match comparison. Although the Inc bit 438 is described here such that a set value means the TLB 108 entry is included in the tag match comparison and a clear value means it is excluded, the opposite polarity may also be used, and the present invention is not limited to either convention. Furthermore, the Inc bit 438 may be combined with or encoded in other control fields of the TLB 108, and is not limited to a single or separate bit.
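The tag fields described above can be summarized in a small software model. The following Python sketch is illustrative only: the Fig. 4 formulas are not reproduced in this text, so the comparison in `lookup_tag_match` (in particular, treating 1-bits of `pg_mask` as VPN bits to ignore) is an assumption, and the names `TagEntry`, `lookup_tag_match`, and `lookup_match` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TagEntry:
    vpn: int = 0        # VPN 428 (28 bits in one embodiment)
    asid: int = 0       # ASID 426 (8 bits in one embodiment)
    pg_mask: int = 0    # PgMask 432; assumed: 1-bits are ignored in the compare
    g: bool = False     # G 436: when set, the ASID is left out of the compare
    valid: bool = False # Valid 434
    inc: bool = False   # Inc 438: cleared on reset


def lookup_tag_match(e: TagEntry, vpn_in: int, asid_in: int) -> bool:
    """Assumed tag comparison for a lookup (not the literal Fig. 4 formula)."""
    vpn_equal = (e.vpn & ~e.pg_mask) == (vpn_in & ~e.pg_mask)
    asid_ok = e.g or e.asid == asid_in
    return vpn_equal and asid_ok


def lookup_match(e: TagEntry, vpn_in: int, asid_in: int) -> bool:
    """An entry whose Inc bit is clear never takes part in the match."""
    return e.inc and lookup_tag_match(e, vpn_in, asid_in)
```

In this model the Valid bit is deliberately left out of `lookup_match`, because the text treats a tag match on an invalid entry as a separate exception condition.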
In response to a reset of the TLB 108, the Inc bit 438 is cleared. The Inc bit 438 may also be cleared by a true value on the clearIncMatch signal 442, as described below. In one embodiment, the Inc bit 438 is not user-visible; rather, it is hidden and is set or cleared by the tag array block 202, as described below. Advantageously, the Inc bit 438 facilitates avoiding duplicate matching entries in the TLB 108, and does so in a manner that reduces the number of exceptions generated by the microprocessor 100.
For each tag array 412 entry, the tag array block 202 also comprises logic 402 coupled to the entry. The logic 402 receives the ASID_in 226, VPN_in 228, PgMask_in 232, Valid_in 234, G_in 236, and TLB_write 242 inputs. The logic 402 also receives as inputs the outputs of the tag array 412 storage element fields 426, 428, 432, 434, 436, and 438 (denoted in Fig. 4 as ASID 456, VPN 458, PgMask 462, Valid 464, G 466, and Inc 468, respectively). In response to these inputs, the logic 402 generates a lookupMatch output 444, a writeDataMatch output 446, and a clearIncMatch output 442 (see the formulas shown in Fig. 4).
For each tag array 412 entry, the tag array block 202 also comprises a multiplexer 404 coupled to the logic 402. The multiplexer 404 receives the lookupMatch output 444 of the logic 402 on one data input and the writeDataMatch output 446 of the logic 402 on its other data input, and receives the TLB_write input 242 as its select input. If the TLB_write input 242 is true, the multiplexer 404 provides the value of the writeDataMatch input 446 on its output (denoted match 246); otherwise, the multiplexer 404 provides the value of the lookupMatch input 444 on match 246. The writeDataMatch 446 is used to determine whether a machine check exception must be generated, as described below with respect to Fig. 5 and Fig. 6 in particular. In a TLB 108 lookup operation, the lookupMatch 444 ultimately becomes select 258 and is used to select the appropriate data array 204 entry for output on PFN_out 252 and PgAttr_out 254.
In one embodiment, the multiplexer 404 may be collapsed into the logic 402 by using TLB_write 242 to force PgMask_in 232 to a value of 1 and G_in 236 to a value of 0 during a TLB 108 lookup operation, and by using Valid 434 and Valid_in 234 to qualify the match 246 produced during a TLB 108 write operation.
Referring again to Fig. 2, the TLB 108 also comprises a multiplexer 262 coupled to the tag array block 202. The multiplexer 262 receives the match output 246 of the tag array block 202 on one data input, receives write_idx 238 on its other data input, and receives the TLB_write input 242 on its select input. If the TLB_write input 242 is true, the multiplexer 262 provides the value of write_idx 238 on its output, denoted select 258; otherwise, the multiplexer 262 provides the value of match 246 on select 258.
The TLB 108 also comprises a two-input AND gate 208 coupled to the data array 204, and an inverter 212. The inverter 212 receives the machine_check_exception output 214 on its input and provides its output to one input of the AND gate 208. The AND gate 208 receives the TLB_write signal 242 on its other input and generates the data_write 244 output, which is provided as an input to the data array 204. Consequently, when TLB_write 242 indicates a TLB write request, and the write request does not cause a machine check exception, the input data is written into the data array 204 entry specified by select 258.
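The selection and gating described for multiplexer 404, multiplexer 262, and AND gate 208 amounts to three small combinational functions. The Python sketch below restates them for a single entry; the function names are hypothetical and chosen only to mirror the signal names in the text.

```python
def match_246(tlb_write: bool, lookup_match: bool, write_data_match: bool) -> bool:
    """Multiplexer 404: writeDataMatch on a write, lookupMatch on a lookup."""
    return write_data_match if tlb_write else lookup_match


def select_258(tlb_write: bool, write_idx, match):
    """Multiplexer 262: write_idx selects the entry on a write, match on a lookup."""
    return write_idx if tlb_write else match


def data_write_244(tlb_write: bool, machine_check_exception: bool) -> bool:
    """Inverter 212 feeding AND gate 208: write only if no machine check."""
    return tlb_write and not machine_check_exception
```

A write request therefore reaches the data array only when `data_write_244` is true, that is, when the request did not raise a machine check exception.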
The TLB 108 also comprises exception generation logic 206 coupled to the tag array block 202. The exception generation logic 206 receives TLB_write 242, write_idx 238, match 246, and PgAttr_out 254. In response to its inputs, as shown in Fig. 5 and Fig. 6, the exception generation logic 206 generates machine_check_exception 214; in response to its inputs, as shown in Fig. 5 and Fig. 7, it generates TLB_refill_exception 216; and in response to its inputs, as shown in Fig. 7, it generates the other exception 218 outputs.
In one embodiment, the TLB 108 uses dual-page entries. That is, the data array 204 comprises two entries for each tag array 412 entry, and each tag array 412 entry stores a virtual address tag that specifies two virtually adjacent memory pages, which the operating system may map to physically non-adjacent memory pages.
Referring to Fig. 5, a block diagram of the exception generation logic 206 of Fig. 2 according to the present invention is shown. The embodiment of the exception generation logic 206 shown is for use with a TLB 108 having 64 entries. For each TLB 108 entry, the exception generation logic 206 comprises an inverter 502 and a two-input AND gate 504. Each inverter 502 receives the write_idx 238 of Fig. 2 for the corresponding TLB 108 entry and provides its output to one input of the corresponding AND gate 504, and the AND gate 504 receives on its other input the match 246 of Fig. 2 for the corresponding TLB 108 entry. The inverter 502 and AND gate 504 serve to exclude the entry specified by the TLB 108 write request from the generation of the machine_check_exception 214 output of the exception generation logic 206.
The exception generation logic 206 also comprises a 64-input OR gate 508, which receives the output of each of the 64 AND gates 504, and a two-input AND gate 506. The AND gate 506 receives the output of the OR gate 508 on one input and TLB_write 242 on its other input, and generates the machine_check_exception 214 output.
The exception generation logic 206 also comprises a 65-input OR gate 512, which receives all 64 match signals 246 and TLB_write 242, and an inverter 514, which receives the output of the OR gate 512 and generates TLB_refill_exception 216 as its output.
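The gate network of Fig. 5, as described above, reduces to two Boolean expressions over the per-entry `match` and `write_idx` vectors. The following Python sketch restates them; it assumes `write_idx` is a decoded (one-hot) vector with one element per entry, and the function names are hypothetical.

```python
def machine_check_exception(tlb_write, write_idx, match):
    """Inverters 502, AND gates 504, OR gate 508, AND gate 506.

    True when a write is in progress and some entry other than the one
    being written reports a match.
    """
    others_match = any(m and not w for m, w in zip(match, write_idx))
    return tlb_write and others_match


def tlb_refill_exception(tlb_write, match):
    """OR gate 512 and inverter 514.

    True only when no entry matches and no write is in progress.
    """
    return not (any(match) or tlb_write)
```

Masking each `match` bit with the inverted `write_idx` bit is what lets software rewrite an entry in place without a duplicate being reported against the entry itself.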
Referring to Fig. 6, a flowchart of the operation of the microprocessor 100 of Fig. 1 during a write to the TLB 108 according to the present invention is shown. Flow begins at block 602.
In block 602, the TLB 108 receives a write request. The write request specifies the index of the TLB 108 entry to be written; in Fig. 6, this index is denoted "j". The write request also specifies the values to be written into the entry at index j, which are supplied to the TLB 108 on the input signals 222 to 236 of Fig. 2. In one embodiment, the microprocessor 100 generates the write request to the TLB 108 in response to executing a TLBWI or TLBWR instruction, which instructs the microprocessor 100 to write a TLB 108 entry with values specified in software-visible registers of the microprocessor 100. For the TLBWI instruction, the index of the entry to be written is specified explicitly by the instruction; for the TLBWR instruction, the index is specified by a random register of the microprocessor 100. In one embodiment, the random register decrements on substantially every clock cycle of the microprocessor 100 and, once it reaches the value held in a wired register, returns to its maximum value. In one embodiment, the microprocessor 100 provides no information about which TLB 108 entry should be replaced (for example, which entry was least recently used); the TLBWR instruction therefore provides a means of updating a TLB 108 entry when a lookup tag misses in the TLB 108. The index of the entry to be written is decoded and provided on write_idx 238. Flow proceeds to block 604.
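The behavior of the random register that supplies the TLBWR index can be illustrated with a one-line update rule. The Python sketch below is an assumption-laden model, not the patented circuit: it takes the maximum value to be the highest entry index (63 for a 64-entry TLB), and the function name `next_random` is hypothetical.

```python
def next_random(random: int, wired: int, num_entries: int = 64) -> int:
    """Decrement once per clock; wrap to the top index on reaching `wired`."""
    if random <= wired:
        return num_entries - 1
    return random - 1
```

Under this model the indices below the wired value are never produced once the register has wrapped, so entries at those indices are not chosen by TLBWR.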
In block 604, the logic 402 of Fig. 4 compares the tag of the write request with each tag in the tag array 412, according to the writeTagMatch formula of Fig. 4. Flow proceeds to block 606.
In block 606, the logic 402 excludes the TLB 108 entries whose Inc bit 438 is clear, according to the clearIncMatch and writeDataMatch formulas of Fig. 4 for each TLB 108 entry. Flow proceeds to decision block 608.
In decision block 608, the TLB 108 determines whether a tag match occurred by examining whether clearIncMatch 442 has a true value. If so, flow proceeds to block 612; otherwise, flow proceeds to decision block 614.
In block 612, the Inc bit 438 is cleared in each TLB 108 entry for which clearIncMatch 442 has a true value. Flow proceeds to decision block 614.
In decision block 614, the TLB 108 determines, according to the writeDataMatch formula of Fig. 4, whether Valid_in 234 is true. If not, flow proceeds to block 616; otherwise, flow proceeds to decision block 618.
In block 616, the TLB 108 writes the values specified on inputs 222 to 236 into the TLB 108 entry at index j and sets the Inc bit 438 of entry j. Flow ends at block 616.
In decision block 618, the TLB 108 determines, according to the writeDataMatch formula of Fig. 4 and the machine_check_exception 214 generation logic 502-508 of Fig. 5, whether Valid 434 is true for any entry of the TLB 108, other than entry j, that has a matching tag and a set Inc bit 438. If not, flow proceeds to block 616; otherwise, flow proceeds to block 622.
In block 622, the TLB 108 aborts the write operation, shuts down the TLB 108, and generates the machine_check_exception 214 output. The machine_check_exception 214 output is generated by the logic 502-508 of Fig. 5, and the write operation is aborted by the logic 208 and 212, which generates a false value on data_write 244. Flow ends at block 622.
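The write flow of blocks 602 to 622 can be traced in a short behavioral model. The Python sketch below is a minimal illustration under stated assumptions: the Fig. 4 formulas are not reproduced in this text, so `write_tag_match` and the condition used for clearing Inc (a tag match where the old or the new entry is invalid) are assumed, and `Entry`, `MachineCheck`, and `tlb_write` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass
class Entry:
    vpn: int = 0
    asid: int = 0
    pg_mask: int = 0
    g: bool = False
    valid: bool = False
    inc: bool = False  # Include bit, clear after reset


class MachineCheck(Exception):
    """Stands in for the machine_check_exception 214 output."""


def write_tag_match(e: Entry, new: Entry) -> bool:
    # Assumption: VPN bits masked by either page mask are ignored, and the
    # ASID compare is skipped if either G bit is set.
    mask = e.pg_mask | new.pg_mask
    return ((e.vpn & ~mask) == (new.vpn & ~mask)
            and (e.g or new.g or e.asid == new.asid))


def tlb_write(tlb: list, j: int, new: Entry) -> None:
    # Blocks 604/606: tag-matching entries, excluding those with Inc clear.
    hits = [i for i, e in enumerate(tlb) if e.inc and write_tag_match(e, new)]
    # Blocks 608/612 (assumed clearIncMatch): retire a hit by clearing its
    # Inc bit when the old entry or the new entry is invalid.
    for i in hits:
        if not (tlb[i].valid and new.valid):
            tlb[i].inc = False
    # Blocks 614/618/622: a valid write that collides with a valid,
    # still-included entry other than j is abandoned.
    if new.valid and any(i != j and tlb[i].inc and tlb[i].valid for i in hits):
        raise MachineCheck("duplicate matching TLB entry")
    # Block 616: write entry j and set its Inc bit.
    tlb[j] = replace(new, inc=True)
```

In this model only a collision between two valid entries raises an exception; every other duplicate is resolved silently by clearing the older entry's Inc bit, which is the reduction in exceptions the description attributes to the Inc bit.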
Referring to Fig. 7, a flowchart of the operation of the microprocessor 100 of Fig. 1 during a lookup in the TLB 108 according to the present invention is shown. Flow begins at block 702.
In block 702, the TLB 108 receives a lookup request. Typically, the fetch unit 106 or the load/store unit 112 of Fig. 1 issues a lookup request to the TLB 108 to obtain the physical page address of an instruction or data item in order to read or write the instruction cache 104, the data cache 114, or system memory via the bus interface unit 116. The lookup request specifies the VPN and ASID of the lookup tag on VPN_in 228 and ASID_in 226. The lookup request asks the TLB 108 to determine whether the specified lookup tag matches the tag of any TLB 108 entry and, if so, to output the PFN 302 and PgAttr 304 of the matching entry; that is, the lookup request asks the TLB 108 to translate the virtual address specified by VPN_in 228 and ASID_in 226. Flow proceeds to block 704.
In block 704, the logic 402 of Fig. 4 compares the tag of the lookup request with each tag in the tag array 412, according to the lookupTagMatch formula of Fig. 4. Flow proceeds to block 706.
In block 706, the logic 402 excludes the TLB 108 entries whose Inc bit 438 is clear, according to the lookupMatch 444 formula of Fig. 4. Flow proceeds to decision block 708.
In decision block 708, the TLB 108 determines whether a tag match occurred by examining whether lookupMatch 444 has a true value. If not, flow proceeds to block 712; otherwise, flow proceeds to block 714.
In block 712, because lookupMatch 444 indicates to the exception generation logic 206, via match 246, that the lookup tag does not match any TLB entry, the exception generation logic 206 generates a TLB_refill_exception 216. In one embodiment, if the microprocessor 100 is already executing an exception, the exception generation logic 206 generates a TLB Invalid Exception on output 218 rather than a TLB_refill_exception 216 output. Flow ends at block 712.
In block 714, select 258 is provided to the data array 204 to read the matching entry of the data array 204. Flow proceeds to decision block 716.
In decision block 716, the TLB 108 determines whether any other exception condition exists. In one embodiment, the exception generation logic 206 examines the page attributes on PgAttr_in 224 to determine whether other exception conditions exist. In one embodiment, if the tag of a TLB 108 entry matches the lookup tag but the matching entry is invalid, the exception generation logic 206 determines that a TLB Invalid Exception condition exists. In one embodiment, if the tag of a TLB 108 entry matches the lookup tag and the entry is valid but not dirty, the exception generation logic 206 determines that a TLB Modified Exception condition exists. If another exception condition exists, flow proceeds to block 718; otherwise, flow proceeds to block 722.
In block 718, the exception generation logic 206 generates a true value on output 218. Flow ends at block 718.
In block 722, the data array 204 outputs, on PFN_out 252 and PgAttr_out 254, the PFN 302 and PgAttr 304 of the data array 204 entry selected by select 258. Flow ends at block 722.
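The lookup flow of blocks 702 to 722 can likewise be traced in a short behavioral model. The Python sketch below is illustrative only: `tag_match` is an assumed comparison (the Fig. 4 formula is not reproduced in this text), the `is_store` parameter is an assumption introduced to decide when the "valid but not dirty" condition applies, and all names are hypothetical.

```python
class TLBRefill(Exception):
    """Stands in for TLB_refill_exception 216."""


class TLBInvalid(Exception):
    """Stands in for the TLB Invalid Exception on output 218."""


class TLBModified(Exception):
    """Stands in for the TLB Modified Exception on output 218."""


def tag_match(t, vpn_in, asid_in):
    # Assumed comparison: masked VPN compare, ASID ignored when G is set.
    mask = t["pg_mask"]
    return ((t["vpn"] & ~mask) == (vpn_in & ~mask)
            and (t["g"] or t["asid"] == asid_in))


def tlb_lookup(tags, data, vpn_in, asid_in, is_store=False):
    # Blocks 704/706: compare tags, ignoring entries whose Inc bit is clear.
    match = [t["inc"] and tag_match(t, vpn_in, asid_in) for t in tags]
    # Blocks 708/712: no match raises the refill exception.
    if not any(match):
        raise TLBRefill()
    # Block 714: the match vector selects the data array entry.
    i = match.index(True)
    pfn, attr = data[i]
    # Blocks 716/718: other exception conditions.
    if not tags[i]["valid"]:
        raise TLBInvalid()
    if is_store and not attr["dirty"]:
        raise TLBModified()
    # Block 722: output the translation.
    return pfn, attr
```

Because the write flow leaves at most one included entry per tag, taking the first true element of `match` is sufficient in this model.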
In one embodiment, the microprocessor 100 also includes a TLBP instruction, which instructs the microprocessor 100 to probe the TLB 108 for an entry that matches a lookup tag. Unlike a normal lookup operation, however, the TLBP instruction returns only the index of the TLB 108 entry containing the matching tag. The operation of the TLBP instruction is similar to the lookup operation shown in Fig. 7, except that the translated physical address is not output in block 722; instead, the index of the matching entry is stored in a software-visible register. In addition, if no match occurs in block 708, a TLB_refill_exception 216 is not generated; instead, a status bit is set in a software-visible status register to indicate that no match occurred.
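The difference between a TLBP probe and a normal lookup can be shown with a simplified model that returns an index and a "no match" flag instead of a translation. The Python sketch below is an assumption-based illustration: it ignores the page mask for brevity, and the function name `tlb_probe` and the tuple it returns are hypothetical stand-ins for the software-visible registers mentioned in the text.

```python
def tlb_probe(tags, vpn_in, asid_in):
    """Return (index, no_match).

    index is the position of the matching included entry, or None;
    no_match models the status bit set when nothing matches.
    """
    for i, t in enumerate(tags):
        asid_ok = t["g"] or t["asid"] == asid_in
        if t["inc"] and t["vpn"] == vpn_in and asid_ok:
            return i, False
    # No exception is raised on a miss; only the status flag is set.
    return None, True
```

A handler might use such a probe to find an entry's index before rewriting it, without risking a refill exception on a miss.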
In one embodiment, the microprocessor 100 also includes a TLBR instruction, which instructs the microprocessor 100 to read the TLB 108 entry at the index specified by the TLBR instruction. Thus, unlike a normal lookup operation, the TLBR instruction does not specify a tag; it supplies the index of the TLB 108 entry to be read.
As described above, the apparatus and method provided by the present invention avoid duplicate matching entries in the TLB and effectively reduce the number of exceptions generated in response to attempts to create duplicate matching entries in the TLB. Reducing the number of exceptions generated offers at least two potential advantages. First, the exception handler code can be simplified, because the situations described above no longer arise. Second, because the operating system must handle fewer exceptions when performing TLB write operations, software executing on a processor that has the TLB can achieve significantly better performance.
Referring to Fig. 8, a block diagram of a system 800 for processing stored programs according to the present invention is shown. The system 800 comprises the microprocessor 100 of Fig. 1, coupled to a memory 802 and at least one input/output device 804; the microprocessor 100 in turn comprises the TLB 108 of Fig. 1. The system 800 may comprise a computer system, including but not limited to a personal computer, workstation, server, notebook computer, personal digital assistant, file server, print server, enterprise server, or other similar computer. The system 800 may also comprise an embedded system, including but not limited to a set-top box, intelligent peripheral, automotive embedded system, appliance embedded system, mass storage controller, or other similar device.
The memory 802 comprises a memory for storing program instructions and data used by the microprocessor 100. It may comprise any memory suitable for storing program instructions and data, including but not limited to DRAM, SRAM, SDRAM, DDR-SDRAM, RDRAM, ROM, PROM, EPROM, EEPROM, flash memory, other similar memories, or any combination of these.
The input/output device 804 comprises a device for receiving data for use by the microprocessor 100, including but not limited to data entered by a user. The input/output device 804 may also comprise a device for receiving and outputting results from the microprocessor 100, including but not limited to output to a user. The input/output device 804 may include, without limitation, a direct memory access controller, timer, clock, block controller, serial port controller, parallel port controller, USB port controller, IEEE 1394 controller, SCSI controller, ATA controller, Fibre Channel controller, floppy disk controller, hard disk controller, graphics controller, display device, keyboard, mouse, scanner, plotter, printer, floppy disk drive, hard disk drive, optical storage device, tape drive, digital camera, and other similar devices or combinations of such devices.
Although the present invention and its objects, features, and advantages have been described in detail, the scope of the invention also encompasses other embodiments not mentioned above. For example, although the Include bit is described as having a set value (meaning the entry is included in the match result) and a clear value (meaning the entry is not included), the present invention may be modified to use the opposite convention. Similarly, although the Include bit is described as a single bit, all that is required is an indicator of whether an entry should be included in the match result; the indicator may therefore have more than one bit and may be encoded in other indicator fields. Furthermore, although the present invention is described using the example of an operating system running on the microprocessor, it may also be used with other software executing on the microprocessor, such as embedded system software or firmware.
In addition to implementation in hardware, the present invention may be embodied in software (for example, computer-readable code, program code, instructions, and data) disposed in a computer-usable (for example, readable) medium. Such software enables the function, fabrication, modeling, simulation, description, and/or testing of the apparatus and methods described here. For example, this can be accomplished using general programming languages (such as C, C++, Java, and so on), GDSII databases, hardware description languages (HDL) including Verilog HDL and VHDL, or other available programs, databases, and/or circuit (that is, schematic) capture tools. Such software may be disposed in any computer-usable (for example, readable) medium, including semiconductor memory, magnetic disk, and optical disc (such as CD-ROM and DVD-ROM), and may be embodied as a computer data signal in a computer-usable (for example, readable) transmission medium, such as a carrier wave or any other medium, including digital, optical, or analog media. The software can thus be transmitted over communication networks, including external and internal networks. It should be understood that the present invention may be embodied in software and transformed into hardware as part of the production of integrated circuits; such software may be HDL software included as part of a semiconductor intellectual property core (such as a microcontroller core) or of a system-level design (such as an SOC). The present invention may also be implemented as a combination of hardware and software.
The foregoing describes the present invention in detail by way of preferred embodiments and is not intended to limit the scope of the invention. Those skilled in the art will appreciate that suitable changes and adjustments can be made without departing from the gist, spirit, and scope of the present invention.
In summary, the implementation of the present invention satisfies the requirements for an invention patent prescribed by the Patent Law; the examiner is respectfully requested to examine the application and grant a patent.

Claims (55)

CNA2005100649997A | 2005-04-11 | 2005-04-11 | Improved apparatus and method for avoiding repeated matched inputting in switching side-looking buffer | Pending | CN1848096A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CNA2005100649997A, CN1848096A (en) | 2005-04-11 | 2005-04-11 | Improved apparatus and method for avoiding repeated matched inputting in switching side-looking buffer

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CNA2005100649997A, CN1848096A (en) | 2005-04-11 | 2005-04-11 | Improved apparatus and method for avoiding repeated matched inputting in switching side-looking buffer

Publications (1)

Publication Number | Publication Date
CN1848096A (en) | 2006-10-18

Family

ID=37077664

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CNA2005100649997A (Pending, CN1848096A (en)) | Improved apparatus and method for avoiding repeated matched inputting in switching side-looking buffer | 2005-04-11 | 2005-04-11

Country Status (1)

Country | Link
CN (1) | CN1848096A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN108132894A (en) * | 2017-12-23 | 2018-06-08 | 天津国芯科技有限公司 | The positioning device and method of the more hit exceptions of TLB in a kind of CPU

Similar Documents

Publication | Publication Date | Title
CN1315060C (en) | Tranfer translation sideviewing buffer for storing memory type data
US10552339B2 (en) | Dynamically adapting mechanism for translation lookaside buffer shootdowns
US8669992B2 (en) | Shared virtual memory between a host and discrete graphics device in a computing system
TWI471730B (en) | Method and apparatus to facilitate shared pointers in a heterogeneous platform
KR101563659B1 (en) | Extended page size using aggregated small pages
US10423354B2 (en) | Selective data copying between memory modules
US8055805B2 (en) | Opportunistic improvement of MMIO request handling based on target reporting of space requirements
US20130275657A1 (en) | Data storage device and operating method thereof
DE112011106013T5 (en) | System and method for intelligent data transfer from a processor to a storage subsystem
KR101787851B1 (en) | Apparatus and method for a multiple page size translation lookaside buffer (tlb)
CN87107293A (en) | Bus interface circuit for digital data processor
US20180095892A1 (en) | Processors, methods, systems, and instructions to determine page group identifiers, and optionally page group metadata, associated with logical memory addresses
CN112631961A (en) | Memory management unit, address translation method and processor
CN117083599A (en) | Hardware assisted memory access tracking
CN117120990A (en) | Method and apparatus for transferring hierarchical memory management
CN101008922A (en) | Segmentation and paging data storage space management method facing heterogeneous polynuclear system
US20140244932A1 (en) | Method and apparatus for caching and indexing victim pre-decode information
CN1704912A (en) | Address translator and address translation method
CN108027726B (en) | A hardware mechanism for implementing atomic actions on remote processors
US20070266199A1 (en) | Virtual Address Cache and Method for Sharing Data Stored in a Virtual Address Cache
CN113490921A (en) | Apparatus, method and system for collecting cold pages
TWI407306B (en) | Mcache memory system and accessing method thereof and computer program product
US12271327B2 (en) | Device, system, and method for inspecting direct memory access requests
US20070038797A1 (en) | Methods and apparatus for invalidating multiple address cache entries
CN1848096A (en) | Improved apparatus and method for avoiding repeated matched inputting in switching side-looking buffer

Legal Events

Date | Code | Title | Description
C06 | Publication
PB01 | Publication
C10 | Entry into substantive examination
SE01 | Entry into force of request for substantive examination
C02 | Deemed withdrawal of patent application after publication (patent law 2001)
WD01 | Invention patent application deemed withdrawn after publication
