CN102999441B


Info

Publication number: CN102999441B
Application number: CN201210460512.7A
Authority: CN
Inventors: 汪东升; 高鹏; 王海霞
Original assignee: Tsinghua University
Current assignee: Tsinghua University
Priority date: 2012-11-15
Filing date: 2012-11-15
Publication date: 2015-06-17
Anticipated expiration: 2032-11-15
Also published as: CN102999441A

Abstract

Translated from Chinese

The invention relates to the field of computer architecture and discloses a method for fine-grained memory access. The invention avoids useless transfers by marking modified data and zero-value data at byte granularity, thereby reducing the bandwidth consumed between the cache data area and memory and the overhead of unnecessary writes. In addition, for storage devices subject to write wear, the method reduces the average number of writes, extends device lifetime, and lowers power consumption.

Description


A method for fine-grained memory access

Technical field

The invention relates to the field of computer architecture, and in particular to a method for fine-grained memory access.

Background

Computer memory performance has improved far more slowly than processor performance. Relative to the processor, memory access latency grows by a factor of five every ten years. This architectural imbalance creates a "memory wall" that holds back processor performance and makes the memory system one of the performance bottlenecks of the whole computer system. Many new memory technologies have been proposed to address this problem, and fine-grained memory access is one of them. Fine-grained memory access can control each memory chip individually, avoid unnecessary reads and writes, and save bandwidth.

Current fine-grained memory access mechanisms concentrate on DRAM (dynamic random-access memory) implementations. Their aim is to better exploit spatial locality in multi-core processor environments and so improve memory access efficiency, but the results have been unsatisfactory. For devices subject to write wear, such as NAND flash and phase-change memory, none of the existing memory access methods reduces that wear.

Summary of the invention

(1) Technical problem to be solved

The primary technical problem to be solved by the invention is how to avoid useless transfers during memory access.

(2) Technical solution

To solve the above technical problem, the invention provides a method for fine-grained memory access, comprising the following steps:

S1. Define a fine-grained cache dirty bitmap as follows: the fine-grained cache dirty bitmap uses one or more bits to indicate whether the content of one or more 8-bit storage units in a line of the cache data area differs from the initial value at the time the line was read in; the cache data area is the data area of a storage device with or without write wear;

S2. Define a zero-value bitmap as follows: the zero-value bitmap uses one or more bits to indicate whether the data in one or more 8-bit storage units in the memory is zero;

S3. Define a memory row as follows: in the memory, multiple volatile or non-volatile memory chips share a read/write address to increase the amount of data that can be stored at each address; the storage space corresponding to each read/write address is one memory row, and the memory row is made up of eight or more memory chips, each one byte wide; the memory is a memory built from the said storage devices with or without write wear;

S4. Use the fine-grained cache dirty bitmap to read and write the cache data area;

S5. Use the zero-value bitmap and the memory row to read and write the memory.

Preferably, step S4 is specifically:

When a line of data is read into the cache data area, the bits of the initial fine-grained cache dirty bitmap are treated as all 0 or all 1;

When a line of data in the cache data area is updated, the new data is compared with the existing data byte by byte, and the fine-grained cache dirty bitmap is modified according to whether the two are the same;

When a line of data is evicted from the cache data area: if the bits in the fine-grained cache dirty bitmap indicate that the data has not changed, the evicted data is discarded; otherwise, the contents of the modified bytes in the cache data area are written to memory according to the bits in the fine-grained cache dirty bitmap.
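The three rules of step S4 can be illustrated with a minimal Python sketch. It is a software model only, not the hardware described by the invention. The class name `CacheLine`, its method names, the default 64-byte line size, and the choice of "all 0" as the initial bitmap state are assumptions made for illustration. The sketch also never clears a bit once it is set, which is one conservative reading of the update rule.

```python
class CacheLine:
    """One cache line plus a per-byte dirty bitmap (bit i covers byte i).

    Hypothetical model: names and the 64-byte default are illustrative.
    """

    def __init__(self, size=64):
        self.data = bytearray(size)
        self.dirty = 0  # packed bitmap; 0 means "unchanged since fill"

    def fill(self, line_bytes):
        """Read a line in from memory: the bitmap starts out all clear."""
        self.data[:] = line_bytes
        self.dirty = 0

    def update(self, offset, new_bytes):
        """Store new bytes, marking only the bytes whose value changes."""
        for i, value in enumerate(new_bytes, start=offset):
            if self.data[i] != value:
                self.dirty |= 1 << i
                self.data[i] = value

    def evict(self):
        """Return {byte_offset: value} to write back; empty means discard."""
        return {i: self.data[i]
                for i in range(len(self.data)) if (self.dirty >> i) & 1}
```

For example, after filling an 8-byte line with the values 1 to 8 and storing the bytes 3 and 9 at offset 2, only byte 3 actually changes, so `evict()` returns a single entry and the other seven bytes cause no write traffic.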

Preferably, step S5 is specifically:

According to the bits in the fine-grained cache dirty bitmap and the address, the line evicted from the cache data area is sent to the memory chips of the corresponding memory row in the memory;

When the memory is read, only non-zero data is sent, according to the corresponding bits in the zero-value bitmap; for data that is not sent, zeros are filled in at the destination;

When data is written to the memory, the corresponding zero-value bitmap bits are generated from the data being written, and the zero-value bitmap is updated with the generated bits.
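The read and write rules of step S5 can be sketched in Python as follows. This is an illustrative software model under stated assumptions, not the memory controller itself: the names `ZeroAwareMemory` and `rebuild`, the list representation of the bitmap, and the convention that a bit value of 1 marks a zero byte are all hypothetical choices.

```python
class ZeroAwareMemory:
    """Byte-addressed memory with one zero-value bit per byte.

    Assumed convention: bit value 1 means "this byte is zero".
    """

    def __init__(self, size):
        self.cells = bytearray(size)
        self.zero_bits = [1] * size  # every byte starts out as zero

    def write(self, addr, data):
        """Store data and regenerate the zero-value bits for those bytes."""
        for i, value in enumerate(data, start=addr):
            self.cells[i] = value
            self.zero_bits[i] = 1 if value == 0 else 0

    def read(self, addr, length):
        """Return the bitmap bits and only the non-zero bytes."""
        bits = self.zero_bits[addr:addr + length]
        payload = [(i, self.cells[addr + i])
                   for i, z in enumerate(bits) if not z]
        return bits, payload


def rebuild(length, payload):
    """Destination side: start from zeros, then place the bytes sent."""
    out = bytearray(length)
    for offset, value in payload:
        out[offset] = value
    return bytes(out)
```

In this model, reading eight bytes of which only two are non-zero transfers two data bytes plus the bitmap bits, and `rebuild` restores the full eight bytes at the destination.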

Preferably, the size of the fine-grained cache dirty bitmap is 1/8 of the size of the cache data area.

Preferably, the size of the zero-value bitmap is 1/8 of the size of the memory.

Preferably, the memory row is 64 bits wide or wider.
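The 1/8 ratio follows when exactly one bitmap bit is used per 8-bit storage unit, which is the case illustrated in Figs. 3 and 4. As a worked check, assuming a 64-byte cache line (a line size the source does not state) and the 64-bit memory row named above:

```latex
\[
\frac{\text{bitmap size}}{\text{data size}}
  = \frac{N \times 1\ \text{bit}}{N \times 8\ \text{bits}} = \frac{1}{8},
\qquad
\underbrace{64\ \text{bytes}}_{\text{cache line (assumed)}}
  \rightarrow 64\ \text{bits} = 8\ \text{bytes},
\qquad
\underbrace{64\ \text{bits}}_{\text{memory row}}
  \rightarrow 8\ \text{bits} = 1\ \text{byte}.
\]
```

Under these assumptions, the zero-value bits for one 64-bit memory row occupy a single byte.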

(3) Beneficial effects

The above technical solution has the following advantages: useless transfers are avoided by marking modified data and zero-value data at byte granularity, which reduces the bandwidth consumed between the cache data area and memory and the overhead of unnecessary writes. In addition, for storage devices subject to write wear, the method reduces the average number of writes, extends device lifetime, and lowers power consumption.

Brief description of the drawings

Fig. 1 is a flowchart of the method of the invention;

Fig. 2 is a schematic diagram of the fine-grained memory access architecture defined in the method of the invention;

Fig. 3 is a schematic diagram of the fine-grained cache dirty bitmap as defined;

Fig. 4 is a schematic diagram of the zero-value bitmap as defined;

Fig. 5 shows the write process of the fine-grained memory access architecture;

Fig. 6 shows the read process of the fine-grained memory access architecture of Fig. 5.

Detailed description

Specific embodiments of the invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the invention and are not intended to limit its scope.

As shown in Fig. 1 and Fig. 2, the invention provides a method for fine-grained memory access, comprising the following steps:

S1. Define a fine-grained cache dirty bitmap as follows: the fine-grained cache dirty bitmap uses one or more bits to indicate whether the content of one or more 8-bit storage units in a line of the cache data area differs from the initial value at the time the line was read in, that is, whether a value different from the original has been written to it; the cache data area is the data area of a storage device with write wear (it may also be a storage device without write wear), as shown in Fig. 3. In Fig. 3, each bit of the fine-grained cache dirty bitmap corresponds to 8 bits of data in the cache data area.

S2. Define a zero-value bitmap as follows: the zero-value bitmap uses one or more bits to indicate whether the data in one or more 8-bit storage units in the memory is zero, as shown in Fig. 4. In Fig. 4, each bit corresponds to one 8-bit storage unit in a memory row.

S3. Define a memory row as follows: in the memory, multiple volatile or non-volatile memory chips share a read/write address to increase the amount of data that can be stored at each address; the storage space corresponding to each read/write address is one memory row, which is 64 bits wide; the memory row is made up of eight or more 8-bit-wide memory chips, and the zero-value bitmap is stored in an additional memory chip; the memory is a memory built from the said storage devices with write wear (they may also be storage devices without write wear);

S4. Use the fine-grained cache dirty bitmap to read and write the cache data area. Step S4 gives the rules for reading and writing the cache data area.

S5. Use the zero-value bitmap and the memory row to read and write the memory. Step S5 gives the rules for reading and writing the memory. Data evicted from the cache data area is one source of memory data; in turn, memory data is the source of cache data.

As shown in Fig. 5 and Fig. 6, step S4 is specifically:

As shown in Fig. 5 and Fig. 6, step S5 is specifically:

In the memory controller, according to the bits in the fine-grained cache dirty bitmap and the address, the line evicted from the cache data area is sent to the memory chips of the corresponding memory row in the memory;
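As a rough illustration of this write-back step, the following Python sketch turns a dirty bitmap into a per-chip write plan. The byte-to-chip mapping used here (byte address divided by 8 selects the memory row, the remainder selects the chip) is an assumption made for illustration; the source states only that a memory row is built from eight or more byte-wide chips. The function name `plan_writeback` and the constant `CHIPS_PER_ROW` are hypothetical.

```python
CHIPS_PER_ROW = 8  # assumed: eight byte-wide chips form one 64-bit row


def plan_writeback(line_addr, data, dirty_bits):
    """Group dirty bytes by memory row and list the chips to be written.

    Returns {row_address: {chip_index: byte_value}}. Rows without any
    dirty byte are absent, so they generate no write at all.
    """
    plan = {}
    for i, value in enumerate(data):
        if (dirty_bits >> i) & 1:
            byte_addr = line_addr + i
            row = byte_addr // CHIPS_PER_ROW   # which memory row
            chip = byte_addr % CHIPS_PER_ROW   # which chip within the row
            plan.setdefault(row, {})[chip] = value
    return plan
```

With a 16-byte line at byte address 64 in which only bytes 1 and 10 are dirty, the plan touches one chip in row 8 and one chip in row 9, and the remaining fourteen chip positions are left unwritten.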

When data is read from disk into the memory, the zero-value bitmap is set according to the value of each byte. When a byte is 0, the corresponding bit in the zero-value bitmap is set to a fixed value, either 0 or 1 depending on the chosen convention. When a byte is not 0, the bit is set to the opposite value.
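A minimal Python sketch of this bitmap generation follows. The function name `build_zero_bitmap`, the `zero_means` polarity parameter, and the packing order (least significant bit first) are assumptions made for illustration; the source fixes only that zero and non-zero bytes receive opposite bit values.

```python
def build_zero_bitmap(block, zero_means=1):
    """Return one bit per byte of block, packed LSB-first into bytes.

    zero_means is the bit value stored for a zero byte (0 or 1);
    non-zero bytes get the opposite value. Unused padding bits in the
    last bitmap byte are left as 0.
    """
    if zero_means not in (0, 1):
        raise ValueError("zero_means must be 0 or 1")
    bitmap = bytearray((len(block) + 7) // 8)
    for i, value in enumerate(block):
        bit = zero_means if value == 0 else 1 - zero_means
        bitmap[i // 8] |= bit << (i % 8)
    return bytes(bitmap)
```

For the eight bytes `00 05 00 00 00 00 FF 00`, the sketch yields the bitmap byte `0xBD` when a zero byte is marked with 1, and `0x42` under the opposite convention.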

When the cache needs to read a line of data from the memory, the memory controller sends only the non-zero data, according to the corresponding bits in the zero-value bitmap. After the cache receives the data, it automatically fills in zeros at the destination for the data that was not sent.

As the above embodiments show, the invention avoids useless transfers by marking modified data and zero-value data at byte granularity, which reduces the bandwidth consumed between the cache data area and memory and the overhead of unnecessary writes. In addition, for storage devices subject to write wear, the method reduces the average number of writes, extends device lifetime, and lowers power consumption.

The above is only a preferred embodiment of the invention. It should be noted that a person of ordinary skill in the art can make improvements and substitutions without departing from the technical principle of the invention, and such improvements and substitutions shall also be regarded as falling within the scope of protection of the invention.

Claims


1. A method for fine-grained memory access, characterized by comprising the following steps:

2. The method according to claim 1, characterized in that step S4 is specifically:

When the data that has been read is written into a line of the cache data area, the bits of the initial fine-grained cache dirty bitmap are treated as all 0 or all 1;

3. The method according to claim 1, characterized in that step S5 is specifically:

4. The method according to claim 1, characterized in that the size of the fine-grained cache dirty bitmap is 1/8 of the size of the cache data area.

5. The method according to claim 1, characterized in that the size of the zero-value bitmap is 1/8 of the size of the memory.

6. The method according to claim 3, characterized in that the corresponding memory row in the memory is 64 bits wide or wider.