本發明係與處理器的暫存器有關,特別是指一種擴充RISC處理器暫存器之方法。The present invention relates to a scratchpad of a processor, and more particularly to a method of extending a RISC processor register.
在RISC(精簡指令集)處理器暫存器中,一般的指令格式中的運算元欄(register fields)是採用直接編碼(direct encoding)的編碼方式,故指令格式中的運算元欄編碼空間直接的影響了架構暫存器(architectural register)的數量多寡。然而增加運算元欄的編碼空間雖然可以改善架構暫存器的數量,但是整體的指令碼大小(code size)也會因此而增加,除此之外,增加指令格式中的運算元欄編碼空間也意味著指令長度也會因此擴增,這樣的指令在管線式指令解碼階段(pipeline decode stage)時也會變得更為複雜,進而增加處理器的電源功率消耗。如此結果,對於視功率消耗與儲存空間為重要考量的嵌入式處理器(embedded processors)而言,是一種與期望相反的結果。In the RISC (Reduced Instruction Set) processor scratchpad, the register fields in the general instruction format are encoded by direct encoding, so the operation element field in the instruction format is directly encoded. The impact of the number of architectural registers (architural registers). However, increasing the coding space of the operation element column can improve the number of architecture registers, but the overall code size will increase accordingly. In addition, the operation element column coding space in the instruction format is also increased. This means that the instruction length will also be augmented. Such instructions will become more complicated during the pipeline decode stage, which in turn increases the power consumption of the processor. As a result, for embedded processors where power consumption and storage space are important considerations, it is a result contrary to expectations.
市面上某些主流的處理器,如MIPS、ARM、Alpha、THUMB、X86、UltraSPARC以及Power等,其架構暫存器數量大多為8到32個,亦即其運算元欄的編碼空間為3到5個位元,此嚴重限制了可使用的架構暫存器的數量,也自然的成為提昇程式執行效率的一個瓶頸。Some mainstream processors in the market, such as MIPS, ARM, Alpha, THUMB, X86, UltraSPARC, and Power, have a total of 8 to 32 architecture registers, that is, the coding space of their operation element column is 3 to 5 bits, which severely limits the number of architecture registers available, and naturally becomes a bottleneck for improving program execution efficiency.
目前相關的研究與技術大多是以增加硬體的方式來改善可使用的架構暫存器數量,然而以硬體的方式改善可能會存在一些副作用,例如硬體成本的增加、硬體的設計更為複雜、實用性受限於特定平台、以及使用的彈性較低等問題。At present, most of the related research and technology are to improve the number of available architecture registers in a hardware-increasing manner. However, hardware-based improvements may have some side effects, such as increased hardware cost and hardware design. For complexity, practicality is limited by specific platforms, and the flexibility of use is low.
經由上述,綜合歸納出目前的習知技術所存在的主要問題如下:Through the above, the main problems existing in the current conventional technology are summarized as follows:
1.傳統的指令格式將會嚴重限制架構暫存器的使用。1. The traditional instruction format will severely limit the use of the architecture register.
2.有些研究與技術只適用於特定平台,實用性以及使用彈性均受到限制。2. Some research and technology are only applicable to specific platforms, and the practicality and flexibility of use are limited.
3.程式執行的效率與指令碼大小難以同時兼顧。3. The efficiency of program execution and the size of the script are difficult to balance.
本發明之主要目的在於提供一種擴充RISC處理器暫存器之方法,其可突破指令格式的限制,有效提昇架構暫存器的使用,並可應用在不同的平台且可有效的提昇程式執行的效率。The main purpose of the present invention is to provide a method for expanding a RISC processor register, which can break the limitation of the instruction format, effectively improve the use of the architecture register, and can be applied to different platforms and can effectively improve program execution. effectiveness.
為了達成前述目的,依據本發明所提供之一種擴充RISC處理器暫存器之方法,包含有下列步驟:a)對該RISC處理器的指令格式進行規劃,其中該指令格式具有對應於複數運算元的二個以上的運算元欄(register field),將該等運算元欄總共所佔的位元規劃為二個位元組合,該指令格式對應於二暫存器庫(bank),其中第一位元組合具有8個位元,其前7位元的數值用來指定第一運算元在某一該暫存器庫中的位置(0-127),第8個位元的數值則指定該第一運算元在該二暫存器庫中的哪一個暫存器庫,而第二位元組合則具有至少2個位元;b)針對不具有交換性質的運算指令定義逆向運算指令,該逆向運算指令是先將該二暫存器庫中相同位置的運算元對調後再進行運算,用以解決在該二暫存器庫中的運算元因為所儲存的暫存器庫的順序不同而會影響運算結果的問題;c)規劃一暫存器分配演算法,將該二暫存器庫內分別取出相對應的一運算元結合定義為一個節點(Node),其演算法步驟為:c1)檢查目前的節點內的二運算元與其他節點的二運算元之間是否有完全相同或部份不同的關係,若部份不同則進行c2)步驟,若完全相同則進行c3)步驟,若無上述關係則自行找出在該二暫存器庫中可存放的位置;c2)以目前的節點內的二運算元為基礎,尋找與目前的節點內的二運算元有部份相同的其他節點,並且檢查該找到的節點不同於目前節點內的運算元是否為空或存在其他的關係,若是,則使用該找到的節點並且將該目前的節點的不同運算元搬移至該找到的節點,再將該目前的節點刪除;c3)以目前的節點內的二運算元為基礎,尋找與目前的節點內的二運算元完全相同的其他節點,並且將該目前的節點予以刪除;c4)依據運算指令來判斷是否有必須修改成逆向運算的指令。藉此,可以突破指令格式的限制,提高位元數,有效提昇架構暫存器的使用。除此之外,本發明可適用於不同的平台,並且不增加指令碼的大小,有效的提升程式的執行效率。In order to achieve the foregoing object, a method for extending a RISC processor register according to the present invention comprises the steps of: a) planning an instruction format of the RISC processor, wherein the instruction format has a complex operation element; Two or more operand fields (register fields), the bits occupied by the operands are planned as a combination of two bits, and the instruction format corresponds to a second bank, wherein the first The bit combination has 8 bits, and the value of the first 7 bits is used to specify the position of the first operand in a certain register library (0-127), and the value of the 8th bit specifies the The first operand is in the register library of the second register, and the second bit combination has at least 2 bits; b) the reverse operation instruction is defined for the operation instruction having no exchange property, The reverse operation instruction first performs the operation of the operands at the same position in the two register libraries, and then solves the operation elements in the second register library because the order of the stored scratchpad banks is different. Will affect the results of the operation; c) planning a temporary The processor assigns an algorithm, and the corresponding one of the two operands is respectively defined as a node (Node), and the algorithm steps are: c1) checking the two operands and other nodes in the current node. Whether there are identical or partially different relationships between the two operands. If they are different, perform the c2) step. If they are identical, perform the c3) step. If there is no such relationship, find the second register in the second register. a location that can be stored in the library; c2) based on the two operands in the current node, looking for other nodes that are partially identical to the two operands in the current node, and checking that the found node is different from the current node Whether the operand is empty or has other relationships, and if so, the found node is used and the different operands of the current node are moved to the found node, and the current node is deleted; c3) to the current Based on the two operands in the node, look for other nodes that are identical to the two operands in the current node, and delete the current node; c4) determine whether there is a necessity according to the operation instruction Must be modified to reverse the operation of the instruction. In this way, the limitation of the instruction format can be broken, the number of bits can be increased, and the use of the architecture register can be effectively improved. In addition, the present invention can be applied to different platforms without increasing the size of the instruction code, thereby effectively improving the execution efficiency of the program.
為了詳細說明本發明之技術特點所在,茲舉以下之較佳實施例並配合圖式說明如後,其中:如第一圖所示,本發明第一較佳實施例所提供之一種擴充RISC處理器暫存器之方法,包含有下列步驟:In order to explain the technical features of the present invention in detail, the following preferred embodiments will be described with reference to the accompanying drawings, wherein, as shown in the first figure, an extended RISC processing provided by the first preferred embodiment of the present invention is provided. The method of the scratchpad includes the following steps:
a)如第一圖所示,對該RISC處理器的指令格式(10)進行規劃,其中指令格式(10)係為R-TYPE指令格式,具有對應於複數運算元Rd、Rs、Rt的三個運算元欄(register field)(11),該等運算元欄(11)分別對應於實體的暫存器,將該等運算元欄(11)總共所佔的位元數(共有15位元)規劃為二個位元組合(12)(13),該指令格式(10)對應於二暫存器庫(bank)(15)(16),其中第一位元組合(12)具有8個位元,其前7位元的數值用來指定第一運算元欄Rd在某一該暫存器庫中的位置(0-127),第8個位元的數值則指定該第一運算元欄Rd在該二暫存器庫(15)(16)中的哪一個暫存器庫,而第二位元組合(13)則具有7個位元,用來指定運算元欄Rs及運算元欄Rt在暫存器庫中的位置(例如0-127共128個位置),其中運算元欄Rs位於第一暫存器庫(15)中,運算元欄Rt位於第二暫存器庫(16)中。各該運算元欄(11)用以存放運算元變數。a) As shown in the first figure, the instruction format (10) of the RISC processor is planned, wherein the instruction format (10) is an R-TYPE instruction format having three corresponding to the complex operation elements Rd, Rs, and Rt. a register field (11), the operand fields (11) respectively corresponding to the entity's register, the total number of bits occupied by the operation element column (11) (a total of 15 bits) The plan is a combination of two bits (12) (13) corresponding to a second bank (15) (16), wherein the first bit combination (12) has eight The value of the first 7 bits of the bit is used to specify the position of the first operation element column Rd in a certain register library (0-127), and the value of the 8th bit specifies the first operation element. The column Rd is in the register library of the two register libraries (15) (16), and the second bit combination (13) has 7 bits for specifying the operation element column Rs and the operation element. The position of the column Rt in the scratchpad library (for example, 128 positions of 0-127), wherein the operation element column Rs is located in the first register library (15), and the operation element column Rt is located in the second register library ( 16) Medium. Each of the operand fields (11) is used to store operand variables.
b)針對不具有交換性質的運算指令定義逆向運算指令。由於針對具有交換性質的運算指令(例如加法指令add)而言,其運算元變數所在的暫存器庫(15)(16)的順序並不會影響最後的執行結果(如第二圖所示),而針對不具有交換性質的運算指令(例如減法指令sub)而言,其運算元變數所在的暫存器庫(15)(16)的順序則會對最後的執行結果造成影響(如第三圖所示),因此該逆向運算指令是先將該二暫存器庫(15)(16)中相同位置的運算元變數對調後再進行運算,用以解決在該二暫存器庫(15)(16)中的運算元變數因為所儲存的暫存器庫的順序不同而會影響運算結果的問題。b) Define a reverse operation instruction for an operation instruction that does not have an exchange property. Since the order of the scratchpad library (15)(16) in which the operand variable is located is not affected by the operation result of the swapping property (for example, the add instruction add) (as shown in the second figure) ), but for no exchangeFor a logical operation instruction (such as the subtraction instruction sub), the order of the register library (15) (16) in which the operand variable is located will affect the final execution result (as shown in the third figure). The reverse operation instruction first performs the operation of the operand variables of the same position in the two register stores (15) and (16), and then performs operations on the second register library (15) (16). The operand variable affects the result of the operation because of the different order of the stored scratchpad libraries.
c)如第四圖至第五圖所示,規劃一暫存器分配演算法,將該二暫存器庫(15)(16)內分別取出相對應的一運算元變數結合定義為一個節點(Node),並由不同位置定義出對應的節點,並且由該等節點構成一節點關聯圖(Node-Dependence Graph,NDG)。其中,針對R-TYPE的指令格式(10),若存在某運算元欄Rd的運算元變數原先並不存在於該節點關聯圖中,則新增一個只存在該運算元變數的節點給它,而除了此種狀況之外,其餘的節點主要都是由運算元欄Rs與Rt所存的運算元變數所構成,因此一般的節點中都存在二個運算元變數。而各該節點內的二運算元變數的左右位置,視為存放在該二暫存器庫(15)(16)的順序,位於左邊的視為存放在第一暫存器庫(15),位於右邊的則視為存放在第二暫存器庫(16)。接著將該等節點進行分類,依節點之間的關係,分為如第四圖所示的部份不同關係的節點(D-Node),以及如第五圖所示的完全相同關係的節點(G-Node),其中部份不同關係的節點(D-Node)表示此節點的二運算元變數與其他節點的運算元變數有部份不同,而完全相同關係的節點(G-Node)則表示此節點與其他節點的二運算元變數是相同的。c) As shown in the fourth to fifth figures, a temporary register allocation algorithm is planned, and the corresponding one of the operand variables is respectively defined in the two registers (15) and (16) as one node. (Node), and corresponding nodes are defined by different locations, and a Node-Dependence Graph (NDG) is formed by the nodes. Wherein, for the instruction format (10) of the R-TYPE, if there is an operation element variable of an operation element column Rd that does not originally exist in the node association diagram, a node having only the operation element variable is added to it. Except for this situation, the other nodes are mainly composed of the operand variables stored in the operation element columns Rs and Rt. Therefore, there are two operand variables in the general node. The left and right positions of the two operand variables in each node are regarded as being stored in the order of the two register libraries (15) (16), and the left side is regarded as being stored in the first register library (15). On the right is considered to be stored in the second register library (16). Then, the nodes are classified, and according to the relationship between the nodes, the nodes (D-Node) which are partially different in relationship as shown in the fourth figure, and the nodes of the identical relationship as shown in the fifth figure ( G-Node), in which some nodes of different relationships (D-Node) indicate that the two operand variables of this node are partially different from the operand variables of other nodes, and the nodes of the identical relationship (G-Node) represent This node andThe two operand variables of the other nodes are the same.
除此之外,對於每一個部份不同關係的節點(D-Node),還要找出使用到其運算元變數的下一個節點,並以實線及箭頭來表示此種相依關係;又,對於每一個完全相同關係的節點(G-Node),則在其彼此間以虛線來表示此關係,並且計算出此完全相同關係的加權值,並標記該加權值於第一個完全相同關係的節點(G-Node),並且找出第一個完全相同關係的節點(G-Node)與最後一個完全相同關係的節點(G-Node)是否與其他的部份不同關係的節點(D-Node)有部份相依的關係,若存在部份相依的關係,則以實線及箭頭來表示,上述的實線、虛線以及箭頭的關係也顯示於第四圖至第五圖中。In addition, for each node with a different relationship (D-Node), it is necessary to find the next node using its operand variable, and express the dependency relationship by solid line and arrow; For each node (G-Node) with the exact same relationship, the relationship is represented by a broken line between them, and the weighted value of the identical relationship is calculated, and the weighted value is marked in the first identical relationship. A node (G-Node), and find out whether the node (G-Node) of the first identical relationship and the node (G-Node) of the last identical relationship are different from the other parts (D-Node). There is a partial dependence relationship. If there is a partial dependent relationship, it is indicated by a solid line and an arrow. The relationship between the above solid line, the dotted line and the arrow is also shown in the fourth to fifth figures.
該暫存器分配演算法步驟為:The register allocation algorithm steps are:
c1)檢查目前的節點內的二運算元變數與其他節點的二運算元變數之間是否有完全相同(G-Node)或部份不同(D-Node)的關係,若部份不同則進行c2)步驟,若完全相同則進行c3)步驟,若無上述關係(例如完全不同)則自行找出在該二暫存器庫(15)(16)中可存放的位置。C1) Check whether there is an identical (G-Node) or partial (D-Node) relationship between the two operand variables in the current node and the two operand variables of other nodes. If the difference is partially, c2 is performed. Steps, if they are identical, proceed to step c3). If there is no such relationship (for example, completely different), find the location that can be stored in the two registers (15) (16).
c2)以目前的節點內的二運算元為基礎,尋找與目前的節點內的二運算元變數有部份相同的其他節點(D-Node),並且檢查所找到的節點不同於目前節點的運算元變數是否為空或存在其他的關係,若是,則使用該找到的節點並且將該目前的節點的不同運算元變數搬移至該找到的節點,再將該目前的節點刪除。C2) based on the two operands in the current node, find other nodes (D-Nodes) that are partially identical to the two operand variables in the current node, and check that the found node is different from the current node. Whether the metavariant is empty or has other relationships, and if so, the found node is used and the different operand variables of the current node are moved to the found node, and the current node is deleted.
c3)以前目的節點內的二運算元變數為基礎,尋找與目前的節點內的二運算元變數完全相同的其他節點(G-Node),並且將該目前的節點予以刪除。C3) Based on the two operand variables in the previous destination node, look for other nodes (G-Nodes) that are identical to the two operand variables in the current node, and delete the current node.
c4)依據運算指令來判斷是否有必須修改成逆向運算的指令。C4) According to the operation instruction, it is judged whether there is an instruction that must be modified into a reverse operation.
接下來以第六圖說明如何以NDG(節點關聯圖)來作為分配暫存器的依據,並也說明何時應該使用逆向運算指令。Next, the sixth figure shows how to use NDG (node association graph) as the basis for allocating the scratchpad, and also explains when the reverse operation instruction should be used.
在NDG建構完畢後,執行步驟i1時,可直接分配運算元變數至指定的暫存器庫(15)(16)的位置。After the NDG is constructed, when step i1 is executed, the operand variable can be directly allocated to the location of the specified register library (15) (16).
在執行步驟i2時,由於部份不同關係的節點(D-Node)中的A存在部份相依的關係,故先檢查前一個用A的節點,其右邊的運算變數位置是否為空或者不存在部份相依的關係,若符合上述條件,則將X搬移至此節點的右邊運算元變數位置,並且將該NDG中的相依實線箭頭與原節點一併刪除。除此之外,還須檢查是否有必要將此運算轉換成逆向運算的指令。When step i2 is executed, since some of the nodes in different relationships (D-Node) have a partial dependency relationship, the node with the previous A is checked first, and the operation variable position on the right side is empty or does not exist. Partially dependent relationship, if the above conditions are met, X is moved to the right operand variable position of the node, and the dependent solid arrow in the NDG is deleted along with the original node. In addition to this, it is necessary to check whether it is necessary to convert this operation into an instruction of reverse operation.
在執行步驟i3時,由於其屬於完全相同關係的節點(G-Node),故無需作搬移的動作,只需檢查此運算是否具有交換性以及完全相同關係的節點(G-Node)中的運算元變數所在的暫存器庫(15)(16)位置是否相同,根據上述的條件,判斷是否有必要將此指令改為逆向運算的指令。由於本例中步驟i3為加法運算具有交換性,故無需改變成逆向運算指令,只需將完全相同關係的節點(G-Node)的加權值減1,並將此虛線關係與步驟i3所產生出來的節點刪除即可,此時,由於完全相同關係的節點(G-Node)的加權值因前一次的相減已變成1,故此完全相同關係的節點(G-Node)可轉換成部份不同關係的節點(D-Node)以方便之後的運算。When step i3 is executed, since it belongs to a node (G-Node) having the exact same relationship, there is no need to perform the movement, and it is only necessary to check whether the operation is exchangeable and the operation in the node (G-Node) of the identical relationship. Whether the location of the scratchpad library (15) (16) where the meta variable is located is the same, and according to the above conditions, it is judged whether it is necessary to change the instruction to the instruction of the reverse operation. Since step i3 in this example is interchangeable for the addition operation, there is no need to change to a reverse operation instruction, and only the node of the identical relationship (G-Node) is required.The weighted value is decremented by 1, and the dotted relationship is deleted from the node generated in step i3. At this time, since the weight of the node (G-Node) of the identical relationship has become 1 due to the previous subtraction, Therefore, the nodes of the identical relationship (G-Node) can be converted into nodes of different relationships (D-Node) to facilitate subsequent operations.
在執行步驟i4時,依據部份不同關係的節點(D-Node)的部份相依特性找到了前一個使用到運算元變數C的節點,此時,檢查此節點的左邊運算元變數位置是否為空,或者不存在部份相依的關係,若符合上述條件,則將運算元變數E搬移至原本運算元變數B的位置,並且將該NDG中的相依實線頭與原節點一併刪除,除此之外,還需檢查是否有必要將此運算轉換成逆向運算的指令。When step i4 is executed, the node that uses the operand variable C is found according to the partial dependency characteristic of the node (D-Node) of the different relationship. At this time, it is checked whether the position of the left operand variable of the node is Empty, or there is no partial dependent relationship. If the above conditions are met, the operand variable E is moved to the position of the original operand variable B, and the dependent solid line header in the NDG is deleted together with the original node, except In addition to this, it is also necessary to check whether it is necessary to convert this operation into an instruction of reverse operation.
在執行步驟i5時,依據部份不同關係的節點(D-Node)的部份相依特性找到了前一個使用到運算元變數A的節點,此時,檢查此節點右邊運算元變數位置是否為空,或者不存在部份相依的關係,若符合上述條件,則將運算元變數D搬移至原本運算元變數X的位置,並且將該NDG中的相依實線箭頭與原節點一併刪除,除此之外,還需檢查是否有必要將此運算轉換成逆向運算的指令。在此我們發現步驟i5的運算為不具交換性的減法運算,且節點中的運算元變數所在的不相同在的暫存器庫(15)(16)位置與指令中,故步驟i5的運算指令比須改成逆向運算的指令Rsub。When step i5 is executed, the node that uses the operand variable A is found according to the partial dependency characteristic of the node (D-Node) of some different relationships. At this time, it is checked whether the position of the operand variable on the right side of the node is empty. Or there is no partial dependent relationship. If the above conditions are met, the operand variable D is moved to the position of the original operand variable X, and the dependent solid arrow in the NDG is deleted together with the original node. In addition, you need to check if it is necessary to convert this operation into an instruction for reverse operation. Here, we find that the operation of step i5 is a non-commutative subtraction operation, and the operand variables in the node are different in the location of the register library (15) (16) and the instruction, so the operation instruction of step i5 It is more than the instruction Rsub that must be changed to the reverse operation.
藉由上述步驟中所揭露的對於指令格式(10)的規劃,以及定義逆向運算指令,以及暫存器分配演算法,即可此在不改變實體暫存器的結構之下,對原有的暫存器的分配進行重新規劃,而把可使用的暫存器數量從原來的3到5個位元(即8-32個)大幅度的增加到7個位元(128個),進而達到了擴充暫存器的效果。而藉由第六圖的說明也可以了解整個NDG透過節點的方式來執行的過程,進而完整的了解本發明的實施方式。With the planning of the instruction format (10) disclosed in the above steps,And defining a reverse operation instruction, and a scratchpad allocation algorithm, which can re-plan the allocation of the original register without changing the structure of the physical register, and use the temporary register The number has increased from 3 to 5 bits (ie, 8-32) to 7 bits (128), which has the effect of expanding the scratchpad. The process performed by the entire NDG through the node can also be understood by the description of the sixth figure, and the embodiment of the present invention is completely understood.
請再參閱第七圖,本發明第二較佳實施例所提供之一種擴充RISC處理器暫存器之方法,在步驟上均概同於前揭實施例,不同之處在於本第二實施例是針對I-TYPE指令格式(20),接下來針對與第一實施例不同處加以說明:如第七圖所示,在步驟a)中的指令格式係為I-TYPE,具有對應於複數運算元欄Rs、Rt的二個運算元欄(21)(register field),該等運算元欄(21)分別對應於實體的暫存器,將該等運算元欄(21)總共所佔的位元數(共有10位元)規劃為二個位元組合(22)(23),其中第一位元組合(22)具有8個位元,其前7位元的數值用來指定第一運算元欄Rs在某一該暫存器庫中的位置(0-127),第8個位元的數值則指定該第一運算元欄Rs在該二暫存器庫(25)(26)中的哪一個暫存器庫,而第二位元組合(23)則具有2個位元,其中第1個位元(Rt O)用來指定運算元欄Rt相對於運算元欄Rs的位移方向,第二個位元(Rt)則是用來指定運算元欄Rt的位移量。Please refer to the seventh figure, a method for extending the RISC processor register provided by the second preferred embodiment of the present invention, which is similar to the foregoing embodiment in the steps, except that the second embodiment For the I-TYPE instruction format (20), the following is explained for the difference from the first embodiment: as shown in the seventh figure, the instruction format in the step a) is I-TYPE, which has a function corresponding to the complex operation. The two operand fields (21) of the meta-fields Rs and Rt correspond to the register of the entity, and the bits occupied by the operands (21) in total are occupied. The number of elements (a total of 10 bits) is planned as a combination of two bits (22) (23), wherein the first bit combination (22) has 8 bits, and the first 7 bits of the value are used to specify the first operation. The position of the meta column Rs in a certain register library (0-127), the value of the eighth bit specifies that the first operation element column Rs is in the second register library (25) (26) Which of the scratchpad libraries, and the second bit combination (23) has 2 bits, wherein the first bit (Rt O) is used to specify the direction of displacement of the operation element column Rt relative to the operation element column Rs ,the second Element (Rt) is used to specify the amount of displacement of the operand field of Rt.
接下來同樣以第七圖來說明運算元欄Rs與運算元欄Rt之間的關係,在此圖中運算元欄Rs位於第一暫存器庫(25)(26)的位置6(0000110),則運算元欄Rt的可能所在位置有下列三種狀態:Next, the seventh diagram is used to illustrate the operation element column Rs and the operation element column.The relationship between Rt, in this figure, the operation element column Rs is located at the position 6 (0000110) of the first register library (25) (26), and the possible positions of the operation element column Rt have the following three states:
(1)第二位元組合的第二位元(Rt)=0,則運算元欄Rt的存放位置與運算元欄Rs在暫存器庫(25)(26)中所存放的位置相同,但是所存放的暫存器庫(25)(26)是不同的,如圖中的Rt(2)。(1) The second bit (Rt) of the second bit combination is 0, and the storage position of the operation element column Rt is the same as the position of the operation element column Rs stored in the register library (25) (26). However, the stored scratchpad library (25) (26) is different, as shown in the figure Rt (2).
(2)第二位元組合的第二位元(Rt)=1,且第一位元(Rt O)=0,則運算元欄Rt的存放位置為相對於運算元欄Rs的存放位置減1,如圖中的Rt(1)。(2) The second bit of the second bit combination (Rt)=1, and the first bit (Rt O)=0, the storage position of the operation element column Rt is reduced relative to the storage position of the operation element column Rs. 1, as shown in the figure Rt (1).
(3)第二位元組合的第二位元(Rt)=1,且第一位元(Rt O)=1,則運算元欄Rt的存放位置為相對於運算元欄Rs的存放位置加1,如圖中的Rt(3)。(3) The second bit of the second bit combination (Rt)=1, and the first bit (Rt O)=1, the storage position of the operation element column Rt is relative to the storage position of the operation element column Rs. 1, as shown in the figure Rt (3).
本第二實施例的其餘步驟均概同於前揭第一實施例,容不贅述。The remaining steps of the second embodiment are the same as those of the first embodiment, and are not described here.
接下來以第八圖說明本第二實施例在I-TYPE指令格式中,如何以NDG(節點關聯圖)來作為分配暫存器的依據,並且也說明何時應該使用逆向運算指令。在此圖例中,我們假設了分配暫存器時,各步驟所分配的節點位置如暫存器庫(25)(26)(register bank)所示。Next, the eighth embodiment will explain how the NDG (Node Association Diagram) is used as the basis for allocating the scratchpad in the I-TYPE instruction format, and also explains when the reverse operation instruction should be used. In this illustration, we assume that when allocating a scratchpad, the node locations assigned to each step are shown in the register bank (25) (26).
在NDG建構完畢後,執行步驟i1與i2時,依據NDG分配每個節點的位置如下方所對應的暫存器庫(25)(26)。After the NDG is constructed, when steps i1 and i2 are executed, the location of each node according to the NDG is allocated to the temporary register library (25) (26) corresponding to the following.
在執行步驟i3時,依據部份不同關係的節點(D-Node)的部份相依特性找到了前一個使用到運算元變數A的節點,此時,發現此節點的右邊運算元變數B,其存在部份相依的關係,即之後有其他指令要用到此運算元變數B,故不可取代之,因此退而求其次,檢查暫存器庫(25)(26)中相對於運算元變數B的前一個位置與後一個位置是否為空,或者存在其他部份相依的關係,若前一個位置或後一個位置為空,或不存在其他部份相依的關係,即可利用其中一個位置來當作步驟i3運算元變數C所需要的搬移位置。在這個例子當中,我們發現在第二暫存器庫(25)(26)的位置0與第二暫存器庫(25)(26)的位置2為可使用的位置,故在此選擇第二暫存器庫(25)(26)的位置0來當作運算元變數C的搬移位置,之後,再適當的針對運算元變數C來修改NDG中的節點與其相依關係。When step i3 is executed, the previous section using the operand variable A is found according to the partial dependency characteristics of the nodes (D-Node) of some different relationships.Point, at this time, find the right side of this node variable B, which has a partial dependent relationship, that is, there are other instructions to use this operand variable B, so it can not be replaced, so the next best, check temporarily Whether the previous position and the subsequent position of the operand variable B in the bank (25) (26) are empty or there is another partial dependency relationship, if the previous position or the next position is empty, or not There are other partially dependent relationships, and one of the positions can be used as the moving position required for the step variable i3 in step i3. In this example, we found that the position 0 of the second register library (25) (26) and the position 2 of the second register library (25) (26) are usable positions, so here is the selection. The position 0 of the second register library (25) (26) is taken as the shift position of the operand variable C, and then the node in the NDG is modified accordingly for the operand variable C.
由上述二實施例可知,本發明所可達成之功效在於:可以突破指令格式的限制,提高位元數,有效提昇架構暫存器的使用。除此之外,本發明可適用於不同的平台,並且不增加指令碼的大小,有效的提升程式的執行效率。It can be seen from the above two embodiments that the achievable effect of the present invention is that the limitation of the instruction format can be exceeded, the number of bits can be increased, and the use of the architecture register can be effectively improved. In addition, the present invention can be applied to different platforms without increasing the size of the instruction code, thereby effectively improving the execution efficiency of the program.
(10)‧‧‧指令格式(10) ‧ ‧ directive format
(11)‧‧‧運算元欄(11)‧‧‧Operational element
(12)‧‧‧第一位元組合(12) ‧‧‧ first bit combination
(13)‧‧‧第二位元組合(13) ‧‧‧ second bit combination
(15)‧‧‧第一暫存器庫(15) ‧‧‧First Scratchpad Library
(16)‧‧‧第二暫存器庫(16) ‧‧‧Second register library
(20)‧‧‧指令格式(20) ‧ ‧ directive format
(21)‧‧‧運算元欄(21)‧‧‧Architectum
(22)‧‧‧第一位元組合(22) ‧‧‧ first bit combination
(23)‧‧‧第二位元組合(23) ‧‧‧ second bit combination
(25)‧‧‧第一暫存器庫(25) ‧‧‧First Scratchpad Library
(26)‧‧‧第二暫存器庫(26)‧‧‧Second register library
第一圖係本發明第一較佳實施例之指令格式示意圖。The first figure is a schematic diagram of an instruction format of a first preferred embodiment of the present invention.
第二圖係本發明第一較佳實施例之運算示意圖,顯示針對具有交換性質運算指令的運算狀態。The second drawing is a schematic diagram of the operation of the first preferred embodiment of the present invention, showing an operational state for an instruction having an exchange property.
第三圖係本發明第一較佳實施例之運算示意圖,顯示針對不具有交換性質運算指令的運算狀態。The third figure is a schematic diagram of the operation of the first preferred embodiment of the present invention, showing an operation state for an operation instruction that does not have an exchange property.
第四圖係本發明第一較佳實施例之示意圖,顯示節點(D-NODE)的狀態。The fourth figure is a schematic view of a first preferred embodiment of the present invention showing the state of the node (D-NODE).
第五圖係本發明第一較佳實施例之示意圖,顯示節點(G-NODE)的狀態。Figure 5 is a schematic view of a first preferred embodiment of the present invention showing the state of a node (G-NODE).
第六圖係本發明第一較佳實施例之示意圖,顯示以NDG來分配暫存器以及運算的狀態。Figure 6 is a schematic diagram of a first preferred embodiment of the present invention showing the state in which the scratchpad is allocated by the NDG and the operation.
第七圖係本發明第二較佳實施例之指令格式示意圖。以及第八圖係本發明第二較佳實施例之示意圖,顯示以NDG來分配暫存器以及運算的狀態。Figure 7 is a diagram showing the format of an instruction of the second preferred embodiment of the present invention. And an eighth diagram showing a second preferred embodiment of the present invention, showing the state in which the scratchpad is allocated by the NDG and the operation.
(10)...指令格式(10). . . Instruction format
(11)...運算元欄(11). . . Operational element bar
(12)...第一位元組合(12). . . First meta combination
(13)...第二位元組合(13). . . Second bit combination
(15)...第一暫存器庫(15). . . First register library
(16)...第二暫存器庫(16). . . Second register library
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW98137487ATWI420390B (en) | 2009-11-04 | 2009-11-04 | Methods for extending the RISC processor scratchpad |
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW98137487ATWI420390B (en) | 2009-11-04 | 2009-11-04 | Methods for extending the RISC processor scratchpad |
| Publication Number | Publication Date |
|---|---|
| TW201117093A TW201117093A (en) | 2011-05-16 |
| TWI420390Btrue TWI420390B (en) | 2013-12-21 |
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW98137487ATWI420390B (en) | 2009-11-04 | 2009-11-04 | Methods for extending the RISC processor scratchpad |
| Country | Link |
|---|---|
| TW (1) | TWI420390B (en) |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW200428288A (en)* | 2003-03-06 | 2004-12-16 | Northrop Grumman Corp | Direct instructions rendering emulation computer technique |
| US20070038984A1 (en)* | 2005-08-12 | 2007-02-15 | Gschwind Michael K | Methods for generating code for an architecture encoding an extended register specification |
| TW200821919A (en)* | 2006-03-02 | 2008-05-16 | Ibm | Method, system and program product for SIMD-oriented management of register maps for map-based indirect register-file access |
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| TW200428288A (en)* | 2003-03-06 | 2004-12-16 | Northrop Grumman Corp | Direct instructions rendering emulation computer technique |
| US20070038984A1 (en)* | 2005-08-12 | 2007-02-15 | Gschwind Michael K | Methods for generating code for an architecture encoding an extended register specification |
| TW200821919A (en)* | 2006-03-02 | 2008-05-16 | Ibm | Method, system and program product for SIMD-oriented management of register maps for map-based indirect register-file access |
| Publication number | Publication date |
|---|---|
| TW201117093A (en) | 2011-05-16 |
| Publication | Publication Date | Title |
|---|---|---|
| TWI740851B (en) | Data processing apparatus, method and computer program for vector load instruction | |
| JP6971220B2 (en) | Equipment and methods for performing splice operations | |
| TWI507980B (en) | Optimizing register initialization operations | |
| JP5471082B2 (en) | Arithmetic processing device and control method of arithmetic processing device | |
| TWI713594B (en) | Vector data transfer instruction | |
| TWI713589B (en) | Handling exceptional conditions for vector arithmetic instruction | |
| JP5612148B2 (en) | Suppressing branch misprediction behavior in zero predicate branch misprediction | |
| JP2023051994A (en) | Systems and methods for implementing chained tile operations | |
| CN107851013B (en) | Data processing apparatus and method | |
| JP2012529096A (en) | Data processing apparatus and method for handling vector instructions | |
| JP6616608B2 (en) | Semiconductor device | |
| KR100817056B1 (en) | Branch history length indicator, branch prediction system and branch prediction method | |
| JP2017538213A (en) | Method and apparatus for implementing and maintaining a stack of predicate values using stack synchronization instructions in an out-of-order hardware software co-design processor | |
| RU2583744C2 (en) | Device and method for binding operations in memory | |
| Thomas et al. | Improving branch prediction by dynamic dataflow-based identification of correlated branches from a large global history | |
| US20210019149A1 (en) | Detecting a dynamic control flow re-convergence point for conditional branches in hardware | |
| TW201737075A (en) | Complex multiply instruction | |
| JP2013080497A (en) | Sliding-window, block-based branch target address cache | |
| JP6807073B2 (en) | Dynamic memory contention detection with fast vector | |
| CN105993000A (en) | Processor and method for floating point register aliasing | |
| JP2006506735A (en) | Pipeline processor method and circuit | |
| CN101226506A (en) | Lockdown control of a multi-way set associative cache memory | |
| US20190138308A1 (en) | Unaligned memory accesses | |
| TWI420390B (en) | Methods for extending the RISC processor scratchpad | |
| JP4444305B2 (en) | Semiconductor device |
| Date | Code | Title | Description |
|---|---|---|---|
| MM4A | Annulment or lapse of patent due to non-payment of fees |