BACKGROUND
FIG. 1 (prior art) shows an example of a conventional memory system 100. In this example, memory system 100 resides on a computer motherboard 105 and is actually a subsystem of the motherboard. System 100 includes a plurality of female electrical connectors 110, each of which accepts a memory module 115 (only one of which is shown here). Each memory module 115 contains a plurality of memory devices 120, typically packaged as discrete integrated circuits (ICs). Memory devices 120 are usually some type of read/write memory, such as Dynamic Random Access Memories (DRAMs), Static Random Access Memories (SRAMs), Flash RAM, or other types. Read-Only Memory (ROM) devices might also be used.
Motherboard 105 includes a memory controller 125 connected via conductive traces 130 to connectors 110. Memory controller 125 communicates with memory modules 115 through conductive traces 130. Memory controller 125 also has an interface (not shown) that communicates with other components on the motherboard, allowing those components to read from and write to memory.
Each memory module 115 typically contains a fixed-width data path interface. The fixed-width nature of the interface is generally a result of a desire to create an industry standard interface that can accommodate interoperable modules from a large number of suppliers.
System 100 works with different numbers of memory modules 115 installed, and with modules having different memory capacities and/or organizations. However, a system such as this is normally designed for a specific system data path width, i.e., for a specified number of data bit lines from controller 125 to memory modules 115.
Memory devices can be targeted to a wide variety of markets with very different sets of cost and performance constraints; consequently, the optimal device width can vary significantly from one application to the next. Unfortunately, these variations make it difficult for memory suppliers and distributors to accurately predict the customer demand mix for memory devices of various widths. Inaccuracies in demand-mix prediction can cause supply/demand imbalances and inventory management difficulties, which in turn can lead to pricing instability and highly variable profit margins. Furthermore, a memory device manufacturer may find that optimizing the cost for each target device width means a different design at the die level and potentially at the package level. This can increase the time-to-market and level of financial and engineering resources required to deliver each of these products to market.
Fixed-width devices have other drawbacks related to inflexible data path configuration. Because the system memory interface width and memory device interface widths are fixed, the addition of more memory devices or modules to the system typically requires multiple ranks, which generally necessitates the use of a multi-drop datapath topology. Adding more drops to the system tends to degrade signaling performance.
One way to reduce time-to-market and resource requirements is to create a common die design and package pinout that can support a variety of device data path widths. Some memory manufacturers support this capability through memory designs that allow configurations to be postponed until relatively late in the manufacturing process. A configuration is typically selected through one of several possible schemes, such as fuse or anti-fuse programmability, wire-bonding options, or upper level metal mask changes. This flexibility allows the device to be tested at the target width and sold as a fixed-width device.
Another way to reduce time-to-market and resource requirements associated with fixed-width memories is to use a memory design in which the width (e.g., the number of data pins) can be dynamically changed to suit the needs of a particular system. One such memory design is depicted in U.S. Pat. No. 5,893,927 to William P. Hovis, which is incorporated herein by reference. FIG. 2, taken from the Hovis patent, illustrates a conventional synchronous dynamic random access memory (SDRAM) 200 having a programmable device width. SDRAM 200 includes a clock generator 205 that provides clock signals to various components of SDRAM 200. A command decoder 210 receives chip select /CS, row enable /RAS, column enable /CAS and write command /W inputs. (The “/” preceding the signal names identifies the signals as active low. Overbars are used in the figures for the same purpose.) Command decoder 210 recognizes, for example, a write command when /CS, /CAS, and /W are simultaneously asserted (i.e., logic low). Command decoder 210 then outputs the command to some control logic 215, which controls the operation of the other components of SDRAM 200 based on the received command.
Besides the commands of /CS, /RAS, /CAS and /W, command decoder 210 also recognizes commands based on a combination of /CS, /RAS, /CAS, and /W. For instance, command decoder 210 decodes the simultaneous receipt of /CS, /RAS, /CAS, and /W as a mode register set command. When a mode register set command is received, control logic 215 causes a mode register 220 to latch the address data on address inputs A0-A10 and BA0-BA1.
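The decoding just described can be illustrated with a minimal sketch. It covers only the two commands named above; the function name, the string labels, and the assumption that /RAS is deasserted (high) for a write are illustrative choices, not details taken from the Hovis patent.

```python
from typing import Optional

def decode_command(cs_n: int, ras_n: int, cas_n: int, w_n: int) -> Optional[str]:
    """Decode active-low command inputs (0 = asserted, 1 = deasserted).

    Hypothetical helper: only the write and mode-register-set commands
    mentioned in the text are modeled.
    """
    if cs_n != 0:
        return None  # chip not selected, so no command is recognized
    asserted = (ras_n == 0, cas_n == 0, w_n == 0)
    if asserted == (True, True, True):
        return "MODE_REGISTER_SET"  # /CS, /RAS, /CAS, and /W all low
    if asserted == (False, True, True):
        # Assumption: /RAS is high for a write; the text names only /CS, /CAS, /W.
        return "WRITE"
    return "OTHER"  # remaining combinations are not modeled here
```

For example, `decode_command(0, 0, 0, 0)` yields the mode register set command, after which the address inputs would be latched as configuration data rather than as an address.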
The data on address inputs A0-A10, generally, represent either a row or column address, whereas the data on address inputs BA0-BA1, generally, represent a bank address. The bank address inputs BA0-BA1 specify one of the memory banks A-D discussed in detail below. During the mode register set operation, however, the data on address inputs A0-A10 and BA0-BA1 represent commands. Hereinafter, the address inputs and the data thereon will generically be referred to as address inputs.
SDRAM 200 includes a row address buffer and refresh counter 225 and a column address buffer and burst counter 230, both of which connect to address inputs A0-A10 and BA0-BA1. The row address buffer portion latches the address inputs at row-access-strobe (RAS) time and provides the row address to the appropriate row decoder 235. The refresh counter portion refreshes the memory. The column address buffer portion latches the address inputs at column-access-strobe (CAS) time and provides the column address to the appropriate column decoder 240. The burst counter portion controls the reading/writing of more than one column based on a pre-set burst length.
The memory of SDRAM 200 is divided into four memory banks A-D that can be independently and simultaneously selected. Each memory bank A-D has associated therewith a row decoder 235, a sense amplifier 255, and a column decoder 240. Based on the address latched by the row address buffer and refresh counter 225, one of row decoders 235 enables a row of bits in the corresponding bank. An associated sense amplifier 255 latches the columns of this row via sense amplification, and the associated column decoder 240 outputs one or more bits depending on the device width and burst length. Sense amplifier 255 typically represents a combination of column I/O amplifiers arranged along an edge of the array of banks and lower-level sense amplifiers interleaved between memory cells.
SDRAM 200 includes configuration logic 260 for setting the device width. Configuration logic 260 connects to mode register 220, and from there receives a memory-width configuration value stored in register 220 during device configuration. Based on this information, configuration logic 260 configures a data control circuit 265, a latch circuit 270, and an input/output (I/O) buffer 275 to obtain the device width associated with the memory-width configuration value. Specifically, configuration logic 260 controls switches and multiplexers in data control circuit 265 such that the number of active I/O drivers corresponds to the programmed device width.
Data control circuit 265 is connected to each column decoder 240, and to data I/O pin(s) DQ(s) via latch circuit 270 and input/output buffer 275. During a read operation, sense amplifiers 255 and column decoders 240 output data to data control circuit 265 based on the row enabled by decoder 235, the column enabled by decoder 240, and the burst length. Data control circuit 265 then routes the data to the number of I/O drivers set based on the device width. The data from the I/O drivers is then latched by the latch circuit 270, buffered by I/O buffer 275, and output on the data I/O pin(s) DQ(s). The number of I/O pin(s) DQ(s) corresponds to the device width.
During a write operation, SDRAM 200 receives data over the I/O pin(s) DQ(s). This data is buffered by I/O buffer 275, latched by latch circuit 270, and received by data control circuit 265. Data control circuit 265 sends the data to the appropriate column decoders 240 for storage in the memory banks A-D according to the enabled row and column.
SDRAM 200 also includes an input DQM to latch circuit 270 for every 8 bits of input/output. For instance, a x16 SDRAM will have two inputs DQM0 and DQM1. When enabled, the input DQM prevents reading or writing the remainder of a burst. In this manner, the burst length can be controlled.
Each read operation presents an entire row of data to sense amplifiers 255. Each write operation similarly involves an entire row. In SDRAM 200, changing the memory width merely changes the number of bits selected from the accessed row: the narrower the memory configuration, the fewer bits are selected from the accessed row. Since the power required to perform a row access does not change with changes in device width, the relative power efficiency of row accesses decreases as the memory width is reduced.
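The efficiency argument is simple arithmetic, sketched here under the assumption that a row access costs a fixed amount of energy regardless of width. The function name and the unit energy value are hypothetical and chosen only for illustration.

```python
def row_energy_per_bit(row_access_energy: float, device_width: int) -> float:
    """Energy spent on the row access per data bit actually transferred.

    Assumes, as the text states for SDRAM 200, that the row-access energy
    is independent of the configured device width.
    """
    if device_width <= 0:
        raise ValueError("device_width must be positive")
    return row_access_energy / device_width

# With an arbitrary unit of 1.0 per row access, narrower widths cost more per bit.
for width in (16, 8, 4):
    print(width, row_energy_per_bit(1.0, width))
```

Halving the width doubles the row-access energy attributed to each transferred bit, which is the inefficiency the configurable core described later is meant to address.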
BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 (prior art) shows an example of a conventional memory system 100.
FIG. 2 (prior art) illustrates a conventional synchronous dynamic random access memory (SDRAM) 200 having a programmable device width.
FIG. 3 depicts a variable-width memory 300 in accordance with an embodiment of the invention.
FIG. 4A details a portion of an embodiment of memory 300 of FIG. 3.
FIG. 4B details a portion of another embodiment of memory 300 of FIG. 3.
FIGS. 5A-5C depict various width configurations of a memory module 500 that includes four variable-width memories 300 of the type described above in connection with FIGS. 3 and 4.
FIGS. 6A and 6B depict a computer motherboard 600 adapted to use a variable-width memory in accordance with an embodiment of the invention.
FIG. 7 depicts a portion 700 of motherboard 600 detailing the signal-line configuration.
FIG. 8 depicts portion 700 of FIG. 7 with a memory module 800 and shorting module 810 installed.
FIGS. 9A-9D depict memory configurations that respectively accommodate one, two, three, or four memory modules.
FIG. 10 (prior art) shows a floor plan of a conventional 1 Gb DRAM 1000 having a 16-bit wide data path D0-D15.
FIGS. 11A and 11B depict a high-level floor plan of a DRAM 1100 featuring a configurable core.
FIG. 12 depicts a specific implementation of a configurable core 1200 and associated circuitry.
FIGS. 13A-F are simplified block diagrams of core 1200 of FIG. 12 illustrating access timing in a number of memory-access configurations.
DETAILED DESCRIPTION
FIG. 3 depicts a variable-width memory 300 in accordance with an embodiment of the invention. Memory 300 is similar to SDRAM 200 of FIG. 2, like-numbered elements being the same. Memory 300 differs from SDRAM 200, however, in that the memory core organization changes with device width, resulting in reduced power usage for relatively narrow memory configurations. Also advantageous, reorganizing the core for relatively narrow memory widths increases the number of logical memory banks, and consequently reduces the likelihood of bank conflicts. Fewer conflicts mean improved speed performance. These and other benefits of the invention are detailed below.
Much of the operation of memory 300 is similar to SDRAM 200 of FIG. 2. A discussion of those portions of memory 300 in common with SDRAM 200 is omitted here for brevity. The elements of FIG. 3 described above in connection with FIG. 2 are numbered in the two-hundreds (e.g., 2XX) for convenience. In general, the first digit of numerical designations indicates the figure in which the identified element is introduced.
Memory 300 includes a configurable memory core 305. In the example, memory core 305 includes eight physical memory banks PB0-PB7, though the number of physical banks may vary according to need. Physical banks PB0-PB7 are interconnected such that they can be combined to form different numbers of logical banks. In the example, pairs of physical banks (e.g., PB0 and PB1) can be combined to form four logical banks LB0-LB3, collections of four physical banks (e.g., PB0-PB3) can be combined to form two logical banks LB4 and LB5, and all eight physical banks can be combined to form a single logical bank LB0-7. Assuming, for simplicity, that each physical bank PB0-PB7 includes a single data I/O terminal, memory core 305 can be configured as a one-bit-wide memory with eight logical banks, a two-bit-wide memory with four logical banks, a four-bit-wide memory with two logical banks, or an eight-bit-wide memory with one logical bank.
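The trade-off between width and logical bank count can be enumerated with a short sketch. It assumes power-of-two groupings and one data terminal per physical bank, as in the example; the function and parameter names are hypothetical.

```python
from typing import List, Tuple

def core_configurations(physical_banks: int = 8,
                        bits_per_bank: int = 1) -> List[Tuple[int, int]]:
    """List (data width, logical bank count) pairs for a configurable core.

    Physical banks are grouped in powers of two; each group forms one
    logical bank whose width is the sum of its members' widths.
    """
    configs = []
    group_size = 1
    while group_size <= physical_banks:
        width = group_size * bits_per_bank
        logical_banks = physical_banks // group_size
        configs.append((width, logical_banks))
        group_size *= 2
    return configs

print(core_configurations())  # [(1, 8), (2, 4), (4, 2), (8, 1)]
```

The product of width and logical bank count stays constant, which is why narrowing the device raises the number of logical banks available to reduce bank conflicts.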
Some configuration logic 310 controls the configuration of memory core 305 via a data control circuit 315. Configuration logic 310 also controls the data width through a collection of latches 320 and a collection of I/O buffers 325. As detailed below, data control circuit 315 includes some data routing logic, such as a crossbar switch, to provide flexible routing between the memory banks and data terminals DQs. The purpose and operation of these blocks are described below in more detail. As noted in FIG. 3, the data terminals (DQs) can be configured to have widths of x1, x2, x4, and x8.
FIG. 4A shows a specific implementation of a configurable core 400 and associated circuitry. In one embodiment, core 400 is a portion of memory 300 of FIG. 3. The number of physical banks is reduced to four physical banks PB0-PB3 in FIG. 4A for brevity. Memory 300 might include two memory “slices,” each of which comprises a memory core 400. The manner of extending the memory core of FIG. 4A to eight or more banks will be readily apparent to those of skill in the art.
The components of core 400 are similar to like-numbered elements in FIG. 3. For this embodiment, the serialization ratio is 1:1. Serialization ratios greater than 1:1 are possible with the addition of serial-to-parallel (write) and parallel-to-serial (read) conversion circuits. In this example, there are four physical banks PB0-PB3 supporting four read data bits and four write data bits. Generally, data control circuit 315 contains multiplexing logic for read operations and demultiplexing logic for write operations. The multiplexing logic and demultiplexing logic are designed to allow one, two, or four device data lines D0-D3 to be routed to the four physical banks PB0-PB3.
In the one-bit wide configuration (“x1”), device data line D0 can be routed to/from any of the four physical banks PB0-PB3. In the 2-bit wide configuration (“x2”), device data lines D0 and D1 can be routed to/from physical banks PB0 and PB1 (collectively, logical bank LB0,1) or physical banks PB2 and PB3 (collectively, logical bank LB2,3). Finally, in the 4-bit wide configuration (“x4”), device data lines D0, D1, D2, and D3 can be routed to/from respective physical banks PB0, PB1, PB2, and PB3 (collectively, logical bank LB0-3). Core 400 can thus be configured as a one-, two-, or four-bank memory with respective widths of four (x4), two (x2), and one (x1) data bits.
Core 400 is a synchronous memory; consequently, each physical bank PB0-PB3 includes an input latch 405 and an output latch 410. Physical banks PB0-PB3 additionally include respective memory arrays MA0-MA3, sense amplifiers SA0-SA3, and bank-select terminals BS0-BS3. Asserting a bank select signal on one of terminals BS0-BS3 loads the data in the addressed location within the selected memory array into the respective one of sense amplifiers SA0-SA3.
Latch 320 includes a pair of latches 415 and 420 for each physical bank PB0-PB3. Data control circuit 315 includes five multiplexers 425, 430, 435, 440, and 445 that communicate data between latch 320 and physical banks PB0-PB3. Multiplexers 425 and 430 are controlled by a write control signal WB; multiplexer 435 is controlled by a read control signal RA; multiplexer 440 is controlled by a write control signal WA; and multiplexer 445 is controlled by two read control signals RA and RB. Write control signals WA and WB and read control signals RA and RB are based on the selected data path width and bits of the requested memory address or transfer phase. Configuration logic 310 (FIG. 3) produces these signals in response to the programmed data width, whether the operation is a read or write operation, and appropriate addressing information.
Table 1 shows the control values used for data path slice widths of one, two, and four. Table 1 also indicates which of data terminals D0-D3 are used for each data width.

TABLE 1

| WIDTH | WA | WB | RA | RB | DATA TERMINALS |
|---|---|---|---|---|---|
| 1 | 1 | 1 | A0 | A1 | D0 |
| 2 | 0 | 1 | 0 | A0 | D0 & D1 |
| 4 | 0 | 0 | 0 | 0 | D0, D1, D2, & D3 |

When a width of one is selected during a read operation, configuration logic 310 allows data from any one of the four physical banks PB0-PB3 to be presented at data terminal D0. Control signals RA and RB determine which data-bit signals will be presented at any given time. Control signals RA and RB are set (at this data width) to equal the two least-significant bits (A1, A0) of the memory address corresponding to the current read operation.
When a width of one is selected during a write operation, the circuit accepts the data bit signal from data terminal DQ0 and routes it to all four physical banks PB0-PB3 simultaneously. Control signals WA and WB are both set to a logical value of one to produce this routing. Other logic circuits (not shown) within configuration logic 310 control which of input latches 405 and 410 are active during any single write operation, so that each data bit signal is latched into the appropriate physical bank. For a given physical bank, only one of latches 405 and 410 is active during any given memory cycle.
When a width of two is selected during a read operation, configuration logic 310 allows two of the four data bit signals associated with physical banks PB0-PB3 to be present at data terminals DQ0 and DQ1. To obtain this result, control signal RA is set to 0, and control signal RB is equal to the lower bit (A0) of the memory address corresponding to the current read operation. Control signal RB determines which of two pairs of data bit signals (0 and 1 or 2 and 3) are presented at data terminals DQ0 and DQ1 during a given read operation.
When a width of two is selected during a write operation, configuration logic 310 accepts the data bit signals from data terminals DQ0 and DQ1 and routes them either to physical banks PB0 and PB1 or to physical banks PB2 and PB3. In this configuration, physical banks PB0 and PB1 collectively form one logical bank LB0,1 and physical banks PB2 and PB3 collectively form a second logical bank LB2,3. Control signals WA and WB are set to 0 and 1, respectively, to obtain this result.
A width of four is selected by setting all of the control signals (RA, RB, WA, and WB) to 0. Read and write data signals are then passed directly between physical banks PB0-PB3 and corresponding data terminals DQ0-DQ3.
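The control values of Table 1 can be written as a small lookup, shown here as a sketch rather than as the actual logic of configuration logic 310. The function name and the dictionary return format are hypothetical; the signal values follow Table 1 as printed.

```python
from typing import Dict, List

def control_signals(width: int, a0: int = 0, a1: int = 0) -> Dict[str, int]:
    """Return the WA, WB, RA, and RB values of Table 1 for a slice width."""
    if width == 1:
        return {"WA": 1, "WB": 1, "RA": a0, "RB": a1}
    if width == 2:
        return {"WA": 0, "WB": 1, "RA": 0, "RB": a0}
    if width == 4:
        return {"WA": 0, "WB": 0, "RA": 0, "RB": 0}
    raise ValueError("width must be 1, 2, or 4")

def active_terminals(width: int) -> List[str]:
    """Data terminals used at each width, per the last column of Table 1."""
    if width not in (1, 2, 4):
        raise ValueError("width must be 1, 2, or 4")
    return [f"D{i}" for i in range(width)]
```

For instance, `control_signals(2, a0=1)` gives RA = 0 and RB = 1, which corresponds to selecting the second pair of data bit signals for terminals D0 and D1.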
For each row access, data moves from memory arrays MA0-MA3 to their respective sense amplifiers SA0-SA3. Core 400 minimizes the power required to perform a row access by limiting each row access to the selected physical bank(s). To this end, bank-select signals on lines BS0-BS3 are only asserted to selected banks.
Configuration logic 310 determines which of physical banks PB0-PB3 are selected, and consequently which bank-select signals are asserted, based upon the selected device width and memory address. The following Table 2 summarizes the logic within configuration logic 310 that generates the appropriate bank-select signals.

TABLE 2

| WIDTH | A1:A0 = 00 | A1:A0 = 01 | A1:A0 = 10 | A1:A0 = 11 |
|---|---|---|---|---|
| 1 | BS0 | BS1 | BS2 | BS3 |
| 2 | BS0 & BS1 | BS2 & BS3 | BS0 & BS1 | BS2 & BS3 |
| 4 | BS0-BS3 | BS0-BS3 | BS0-BS3 | BS0-BS3 |

When core 400 is configured to have a width of one, the two least-significant address bits A0 and A1 are decoded to select one of physical banks PB0-PB3; when core 400 is configured to have a width of two, address bit A0 enables the physical banks within either of logical banks LB0,1 or LB2,3; and when core 400 is configured to have a width of four, address bits A0 and A1 are ignored and all physical banks PB0-PB3 are selected (i.e., enabled).
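The bank-select decoding of Table 2 can be expressed as a short function. This is a sketch of the table's behavior only; the function name and the use of string labels for the select lines are illustrative assumptions.

```python
from typing import List

def bank_selects(width: int, a1: int, a0: int) -> List[str]:
    """Return the bank-select lines asserted for a given width and address."""
    if width == 1:
        # A1:A0 is decoded to pick exactly one physical bank.
        return [f"BS{(a1 << 1) | a0}"]
    if width == 2:
        # A0 alone picks logical bank LB0,1 or LB2,3; A1 is ignored.
        return ["BS2", "BS3"] if a0 else ["BS0", "BS1"]
    if width == 4:
        # Both address bits are ignored and every bank is enabled.
        return ["BS0", "BS1", "BS2", "BS3"]
    raise ValueError("width must be 1, 2, or 4")
```

The number of asserted lines equals the configured width, so a narrower configuration activates fewer physical banks per row access.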
The circuit of FIG. 4A is just one example of many possible designs. Other embodiments will benefit from other configurations. For example, it is possible to use more or less elaborate data routing schemes to account for the different connection needs for memory systems with more or fewer modules. Moreover, multiple memory cores 400 may be used to construct devices with greater than four device data connections. For example, a device having sixteen device data connections could use four memory cores while supporting three programmable widths; namely, 16-, 8-, or 4-bit widths. There are many possible alternatives for the number and width of physical and logical banks, the number of device data connections per device, serialization ratios, and data-path widths.
All data to and from memory core 400 passes through data terminal DQ0 in the x1 mode, terminals DQ0 and DQ1 in the x2 mode, and terminals DQ0-DQ3 in the x4 mode. FIG. 4B depicts an embodiment 450 that benefits from a more flexible routing scheme in which the data terminals DQ0-DQ3 can be routed to different input/output pins of the memory module upon which core 305 is mounted. Embodiment 450 replaces data control circuit 315 of FIG. 4A with a more flexible crossbar switch 460. In the depicted embodiment, the data terminals to and from physical bank PB0 can be routed to any of data connections DQ0-DQ3 in the x1 mode; the data terminals to and from physical banks PB0 and PB1 can be routed to either data connections DQ0 and DQ1 or data connections DQ2 and DQ3, respectively, in the x2 mode; and the data terminals to and from physical banks PB0-PB3 can be routed to data connections DQ0-DQ3, respectively, in the x4 mode. U.S. Pat. Nos. 5,530,814 and 5,717,871 describe various types of crossbar switches, and are incorporated herein by reference.
FIG. 5A depicts a memory module 500 that includes four variable-width memories 502 of the type described above in connection with FIGS. 3, 4A, and 4B. Module 500, typically a printed circuit board, also includes a number of conductive traces 505 that convey data between the data pins (3, 2, 1, 0) of memories 502 and corresponding module pins 510. In FIG. 5A, each memory 502 is configured to be one-bit wide, and the resulting four data bits are connected to four consecutive ones of pins 510. The selected traces are identified as bold lines; the selected module pins are crosshatched.
FIG. 5B depicts the same memory module 500 of FIG. 5A; unlike in FIG. 5A, however, each memory 502 is configured to be two bits wide, and the resulting eight data bits are connected to eight consecutive ones of pins 510. The memory module 500 of FIG. 5B is thus configured to be twice as wide (and half as deep) as the same module 500 of FIG. 5A. As in FIG. 5A, the selected traces are identified as bold lines; the selected pins are crosshatched.
FIG. 5C depicts the same memory module 500 of FIGS. 5A and 5B; unlike in FIGS. 5A and 5B, however, each memory 502 is configured to be four bits wide, and the resulting sixteen data bits are connected to sixteen consecutive ones of pins 510. The memory module 500 of FIG. 5C is thus configured to be twice as wide (and half as deep) as the same module 500 of FIG. 5B and four times as wide (and one fourth as deep) as the same memory module 500 of FIG. 5A. Once again, the selected traces are identified as bold lines; the selected pins are crosshatched.
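The width and depth relationships among the three module configurations follow from keeping total capacity fixed, as this sketch shows. The per-device capacity of 1024 bits is an arbitrary placeholder, and the function and parameter names are hypothetical.

```python
from typing import Tuple

def module_organization(device_width: int,
                        devices: int = 4,
                        device_capacity_bits: int = 1024) -> Tuple[int, int]:
    """Return (module width in bits, module depth in addressable locations).

    Capacity is constant, so depth falls in proportion as width rises.
    """
    module_width = devices * device_width
    module_depth = (devices * device_capacity_bits) // module_width
    return module_width, module_depth

for w in (1, 2, 4):  # the configurations of FIGS. 5A, 5B, and 5C
    print(w, module_organization(w))
```

With four devices, the x1, x2, and x4 device settings give module widths of 4, 8, and 16 bits, with the depth halving at each step.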
FIGS. 6A and 6B depict a computer motherboard (or system backplane) 600 adapted to use a variable-width memory in accordance with an embodiment of the invention. Motherboard 600 includes a memory controller 605 and a plurality of electrical receptacles or connectors 610 and 615. The connectors are memory module sockets, and are configured to receive installable/removable memory modules 620 and 625.
Each of memory modules 620 and 625 comprises a module backplane 630 and a plurality of integrated memory circuits 635. Each memory module also includes first and second opposed rows of electrical contacts (module pins) 640 along opposite surfaces of its backplane. Only one row of contacts 640 is visible in FIG. 6A. There are corresponding rows of connector contacts (not visible in FIG. 6A) in each of connectors 610 and 615.
A plurality of signal lines, or “traces,” extend between memory controller 605 and electrical connectors 610 and 615 for electrical communication with memory modules 620 and 625. More specifically, there are a plurality of sets of signal lines, each set extending to a corresponding, different one of connectors 610 and 615. A first set of signal lines 645 extends to first electrical connector 610, and a second set of signal lines 650 extends to second electrical connector 615. Motherboard 600 also has a third set of signal lines 655 that extends between the two connectors.
In the embodiment shown, the signal lines comprise system data lines: they carry data that has been read from or that is to be written to memory modules 620 and 625. It is also possible that other signal lines, such as address and control lines, would couple to the memory modules through the connectors. These additional signal lines could have a different interconnection topology than what is shown for signal lines 645, 650, and 655.
The routing of the signal lines is more clearly visible in FIG. 6B, in which memory modules 620 and 625 have been omitted for clarity. The illustrated physical routing is shown only as a conceptual aid; actual routing is likely to be more direct, through multiple layers of a printed-circuit board.
FIG. 7A depicts a portion 700 of motherboard 600 detailing the signal-line configuration. This view shows cross-sections of connectors 610 and 615. Electrical conductors, traces, and/or contacts are indicated symbolically in FIG. 7A by relatively thick solid or dashed lines. Each of the three previously described sets of signal lines is represented by a single one of its conductors, which has been labeled with the reference numeral of the signal line set to which it belongs. The respective lines of a particular set of signal lines are routed individually in the manner shown.
As discussed above, each connector 610 and 615 has first and second opposed rows of contacts. FIG. 7A shows individual contacts 705 and 710 corresponding respectively to the two contact rows of each connector. It is to be understood that these, again, are representative of the remaining contacts of the respective contact rows.
As is apparent in FIG. 7A, the first set of signal lines 645 extends to first contact row 705 of first connector 610. The second set of signal lines 650 extends to the first contact row 705 of second connector 615. In addition, a third set of signal lines 655 extends between the second contact row 710 of first connector 610 and second contact row 710 of second connector 615. The third set of signal lines 655 is represented by a dashed line, indicating that these lines are used only in certain configurations; specifically, signal lines 655 are used only when a shorting module is inserted into connector 610 or 615. Such a shorting module, the use of which will be explained in more detail below, results in both sets of signal lines 645 and 650 being configured for communications with a single memory module.
The system of FIG. 7A can be configured to include either one or two memory modules. FIG. 8 illustrates the first configuration, which includes a memory module 800 in the first connector 610 and a shorting module 810 in the second connector 615. The shorting module has shorting conductors 815, corresponding to opposing pairs of connector contacts, between the first and second rows of the second connector. Inserting shorting module 810 into connector 615 connects or couples the second set 650 of signal lines to the second contact row 710 of first connector 610 through the third set of signal lines 655. In this configuration, the two sets of signal lines 645 and 650 are used collectively to communicate between memory controller 605 and memory module 800.
In a two-module configuration, shorting module 810 is replaced with a second memory module 800. If modules 800 are adapted in accordance with the invention to support two width configurations and to include one half of the module pins 640 on either side, then there is no need for a switch matrix like data control circuit 315 of FIG. 4A or crossbar switch 460 of FIG. 4B. Instead, merely including shorting module 810 provides the memory controller access to the module pins 640 on both sides of the one module 800. Alternatively, including two memory modules 800 will provide the memory controller access to the same half of the module pins 640 (those on the left-hand side of connector 610) on both memory modules; the other half of the module pins 640 are not used. More complex routing schemes can likewise be employed to support additional modules and width configurations. The two-module configuration thus provides the same data width as the single-module configuration, with each module providing half the width.
For a more detailed discussion of motherboard 600, see copending U.S. patent application Ser. No. 09/797,099 filed Feb. 28, 2001 entitled “Upgradeable Memory System with Reconfigurable Interconnect,” by Richard E. Perego et al., publication no. 2004/0221106, which is incorporated herein by reference.
In some embodiments, the access configurations of the memory modules are controllable and programmable by memory controller 605 in the manner described above in connection with FIGS. 3, 4A, 4B, 5A, and 5B. In such embodiments, the memory controller may be adapted to detect which connectors have installed memory modules, and to set the configuration of each module accordingly. This allows either one or two memory modules to be used in a system without requiring manual configuration steps. If one module is used, it may be configured to use two signal-line sets for the best possible performance. If two memory modules are present, they may each be configured to use one signal-line set. This idea can be extended to support memory systems that can accommodate more than two memory modules, though the routing scheme becomes more complex with support for additional modules.
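The detect-and-configure behavior for the two-connector system can be sketched as follows. The function name, the list-of-booleans presence input, and the returned mapping are hypothetical; the sketch models only the allocation policy stated above, not any actual controller interface.

```python
from typing import Dict, List

def assign_signal_line_sets(installed: List[bool]) -> Dict[int, int]:
    """Map each populated connector index to the number of signal-line sets it uses.

    Assumes a two-connector system in which an unpopulated connector
    holds a shorting module (or is a shorting connector).
    """
    if len(installed) != 2:
        raise ValueError("this sketch models exactly two connectors")
    count = sum(installed)
    if count == 0:
        return {}  # nothing to configure
    if count == 1:
        # A lone module receives both signal-line sets for full width.
        return {installed.index(True): 2}
    # Two modules: each is configured to use one signal-line set.
    return {0: 1, 1: 1}
```

In both populated cases the total number of signal-line sets in use is two, so the controller sees the same system data width whether one or two modules are installed.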
The integrated memory circuit can be configured for the appropriate access mode using control pins. These control pins might be part of the signal line sets 645, 650, and 655, or they might be part of a different set of signal lines. These control pins might be dedicated to this configuration function, or they might be shared with other functions. Also, the integrated memory circuit might utilize programmable fuses to specify the configuration mode. Integrated memory circuit configurability might also be implemented, for example, by the use of jumpers on the memory modules. Note that the memory capacity of a module remains the same regardless of how it is configured. However, when it is accessed through one signal line set it requires a greater memory addressing range than when it is accessed through two signal line sets. Also note that the two configurations shown in FIGS. 6-8 could also be implemented with a shorting connector instead of a shorting module. A shorting connector shorts its opposing contacts when no module is inserted (the same result as when the connector 615 of FIG. 7B has a shorting module inserted). A shorting connector with a memory module inserted is functionally identical to the connector 610 in FIG. 7.
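The note on addressing range can be checked with a small calculation. The capacity and widths used below are arbitrary placeholders, and the function name is hypothetical; the only assumption carried over from the text is that capacity stays fixed while the access width changes.

```python
def address_bits(capacity_bits: int, data_width: int) -> int:
    """Number of address bits needed to reach every location at a given width."""
    if data_width <= 0 or capacity_bits % data_width != 0:
        raise ValueError("capacity must be a positive multiple of data_width")
    locations = capacity_bits // data_width
    return (locations - 1).bit_length()

# Same module capacity accessed through two signal-line sets (wide)
# versus one signal-line set (half as wide).
wide = address_bits(1024, 16)
narrow = address_bits(1024, 8)
print(wide, narrow)
```

Halving the access width doubles the number of addressable locations, so the narrower configuration needs one additional address bit.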
As noted above, the signal line scheme can be generalized for use with n connectors and memory modules. Generally stated, a system such as this uses a plurality of signal-line sets, each extending to a respective module connector. At least one of these sets is configurable or bypassable to extend to a connector other than its own respective connector. Stated alternatively, there are 1 through n sets of signal lines that extend respectively to corresponding connectors 1 through n. Sets 1 through n-1 of the signal lines are configurable to extend respectively to additional ones of the connectors other than their corresponding connectors. FIGS. 9A-9D illustrate this generalization, in a memory system 900 in which n=4.
Referring first to FIG. 9A, this configuration includes a memory controller 905; four memory slots or connectors 910, 915, 920, and 925; and four signal line sets 930, 935, 940, and 945. Each signal line set is shown as a single line, and is shown as a dashed line when it extends beneath one of the connectors without connection. Physical connections of the signal line sets to the connectors are shown as solid dots. Inserted memory modules are shown as diagonally hatched rectangles, with solid dots indicating signal connections. Note that each inserted memory module can connect to up to four signal line sets. The number of signal line sets to which it actually connects depends upon the connector into which it is inserted. The connectors are identical components, but appear different to the memory modules because of the routing pattern of the four signal line sets on the motherboard.
Each signal line set extends to a corresponding connector. Furthermore, signal line sets 935, 940, and 945 are extendable to connectors other than their corresponding connectors: signal line set 935 is extendable to connector 925; signal line set 940 is extendable to both connectors 920 and 925; signal line set 945 is extendable to connector 925. More specifically, a first signal line set 930 extends directly to a first memory connector 925 without connection to any of the other connectors. It connects to corresponding contacts of the first contact row of connector 925. A second signal line set 935 extends directly to a second memory connector 920, where it connects to corresponding contacts of the first contact row. The corresponding contacts of the second contact row are connected to corresponding contacts of the first contact row of first connector 925, allowing the second signal line set to bypass second connector 920 when a shorting module is placed in connector 920.
A third signal line set 940 extends directly to a third memory connector 915, where it connects to corresponding contacts of the first contact row. The corresponding contacts of the second contact row are connected to corresponding contacts of the first contact row of connector 920. The corresponding second contact row contacts of connector 920 are connected to the corresponding contacts of the first contact row of connector 925.
A fourth signal line set 945 extends directly to a fourth memory connector 910, where it connects to corresponding contacts of the first contact row of connector 910. The corresponding contacts of the second contact row are connected to corresponding contacts of the first contact row of first connector 925.
This configuration, with appropriate use of shorting or bypass modules, accommodates one, two, three, or four physically identical memory modules. Each memory module permits simultaneous access through one, two, or four of its four available signal line sets. In the configuration of FIG. 9A, a single memory module is inserted in first connector 925. This memory module is configured to permit simultaneous accesses on all four of its signal line sets, which correspond to the four signal line sets 930, 935, 940, and 945. Connectors 910, 915, and 920 are shorted by inserted shorting modules as shown so that signal line sets 935, 940, and 945 extend to connector 925.
FIG. 9B illustrates a second configuration in which connectors 910 and 915 are shorted by inserting shorting modules. Thus, signal line sets 930 and 945 extend to connector 925 and the inserted memory module is configured to permit simultaneous accesses on these two signal line sets. Signal line sets 935 and 940 extend to connector 920 and the inserted memory module is configured to permit simultaneous accesses on these two signal line sets.
FIG. 9C illustrates a third configuration in which connector 910 is shorted by inserting a shorting module, and memory modules are positioned in connectors 915, 920, and 925. Signal line sets 930 and 945 extend to connector 925 and the inserted memory module is configured to permit simultaneous accesses on these two signal line sets. Signal line set 935 extends to connector 920 and the inserted memory module is configured to permit accesses on this signal line set. Signal line set 940 extends to connector 915 and the inserted memory module is configured to permit accesses on this signal line set.
FIG. 9D illustrates a fourth configuration, with a memory module in each of the four available memory connectors. Each module is connected to use a respective one of the four signal line sets, with no shorting modules in use.
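The four configurations of FIGS. 9A-9D follow from one rule: each signal line set terminates at the first connector along its route that holds a memory module, and passes through any connector that holds a shorting module. The following Python sketch models that rule. The table `ROUTES` transcribes the routing described above for n=4; the function name `resolve_signal_sets` is a hypothetical label introduced here for illustration, and the sketch assumes that every connector without a memory module holds a shorting module.

```python
# Route of each signal line set, listed from its own connector onward.
# Transcribed from the description of FIG. 9A (n = 4).
ROUTES = {
    930: [925],
    935: [920, 925],
    940: [915, 920, 925],
    945: [910, 925],
}


def resolve_signal_sets(memory_connectors):
    """Map each connector holding a memory module to the signal line sets it receives.

    Connectors absent from `memory_connectors` are assumed to hold shorting
    modules, so a signal line set passes through them to the next connector.
    """
    result = {connector: [] for connector in memory_connectors}
    for signal_set, route in ROUTES.items():
        for connector in route:
            if connector in memory_connectors:
                result[connector].append(signal_set)
                break
        else:
            # No memory module anywhere on this route: the set is unused.
            raise ValueError(f"signal line set {signal_set} reaches no memory module")
    return result


if __name__ == "__main__":
    print(resolve_signal_sets({925}))                 # FIG. 9A
    print(resolve_signal_sets({920, 925}))            # FIG. 9B
    print(resolve_signal_sets({915, 920, 925}))       # FIG. 9C
    print(resolve_signal_sets({910, 915, 920, 925}))  # FIG. 9D
```

Because every route ends at connector 925, any arrangement that leaves connector 925 without a memory module is rejected by this model.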
An interesting aspect of a memory device with programmable data access width relates to the characteristic of the device that its bandwidth may generally be reduced as its data width is narrowed. As device bandwidth is reduced, opportunities increase for altering the device's memory array configuration to provide greater independence between array partitions.
FIG. 10 shows an example of a conventional 1Gb density DRAM 1000 with a 16-bit wide data path D0-D15. FIG. 10 shows a high-level floor plan of the DRAM die, including left (“L”) and right (“R”) bank subdivisions, row decoders, column decoders, I/O sense amps (I/O), and data pin locations D0-D7 and D8-D15. A pair of regions 1005 and 1012 within memory banks B0-L and B0-R (i.e., the left and right halves of bank 0) indicates a sample page location for an 8 KB page within bank zero. 4 KB worth of sense amp circuitry for the left and right halves of DRAM 1000 are accessed in parallel via a pair of multiplexers 1010 and 1015 to form an 8 KB page. In this design, data from left and right halves of the die are accessed in parallel to meet the device peak bandwidth requirement. This also allows the data paths for the left and right halves of the die to be largely independent. (This aspect of some embodiments is discussed in more detail below in connection with FIG. 12.)
FIGS. 11A and 11B depict a high-level floor plan of a DRAM 1100 featuring a configurable core in accordance with one embodiment. DRAM 1100 can operate as DRAM 1000 of FIG. 10, but can also be configured to reduce peak device bandwidth by a factor of two. Such a bandwidth reduction allows the full amount of device bandwidth to be serviced by either the left half (FIG. 11A) or right half (FIG. 11B) of the device. In this embodiment, the eight active device data connections D0-D7 (shown in bold) are located on the left side of the die, requiring that a data path 1105 be provided from the right side memory array to the left side data connections D0-D7. With the memory array divided into left and right halves, it becomes feasible to manage banks on each side independently. In this case, the 16-bit wide device that supported eight independent banks accessed via data terminals D0-D15 (like DRAM 1000 of FIG. 10) can be reconfigured as an 8-bit wide device supporting 16 independent banks, with data access provided via either data terminals D0-D7 or D8-D15.
There is typically some incremental circuit overhead associated with increasing the bank count of the device, setting a practical limit to the number of independent banks that could potentially be supported. However, a performance improvement related to the increased number of banks may justify some increase in device cost.
In the embodiment of FIGS. 11A and 11B, device page size is reduced for the 8-bit wide configuration (4 KB) relative to the 16-bit wide configuration (8 KB). Reducing the page size is attractive from a power consumption perspective because fewer sense amps are activated during a RAS operation. In addition to activating fewer sense amps, it is also possible to subdivide word lines using a technique known as “sub-page activation.” In this scheme, word lines are divided into multiple sections, one or more of which are activated for a particular RAS operation. This technique typically adds some incremental die area overhead in exchange for reduced power consumption and potentially improved array access or cycle times.
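The trade-off among data width, bank count, and page size in FIGS. 10, 11A, and 11B can be checked with a short Python sketch. The class `ArrayConfig` and the function `halve_width` are hypothetical names used only for illustration; the figures (16-bit, eight banks, 8 KB pages versus 8-bit, 16 banks, 4 KB pages) are those given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayConfig:
    """Hypothetical summary of one array organization."""
    data_width_bits: int
    banks: int
    page_size_kb: int


def halve_width(cfg: ArrayConfig) -> ArrayConfig:
    """Return the organization obtained by halving the data width.

    Following the example of FIGS. 11A and 11B, halving the width doubles
    the number of independent banks and halves the page size.
    """
    if cfg.data_width_bits % 2 or cfg.page_size_kb % 2:
        raise ValueError("width and page size must be even to be halved")
    return ArrayConfig(cfg.data_width_bits // 2, cfg.banks * 2, cfg.page_size_kb // 2)


if __name__ == "__main__":
    wide = ArrayConfig(data_width_bits=16, banks=8, page_size_kb=8)
    narrow = halve_width(wide)
    print(wide)
    print(narrow)
    # Total sense-amp page storage across banks is unchanged by the reconfiguration.
    print(wide.banks * wide.page_size_kb == narrow.banks * narrow.page_size_kb)
```

The product of bank count and page size stays constant in this model, while each row activation opens half as many sense amps, which is the source of the power saving described above.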
The examples highlighted in FIGS. 11A and 11B are intended to illustrate the concept of how a configurable array organization can be used to reduce power consumption and increase the number of logical memory banks. Write transactions are not described for this embodiment, although the same principles of power reduction and memory bank count apply to writes as well. The basic principles of configurable array organization can be exploited regardless of the type or capacity of memory device.
FIG. 12 depicts a specific implementation of a configurable core 1200 and associated circuitry, the combination of which may be integrated to form a memory component. Core 1200 is similar to core 450 of FIG. 4B, like-named elements being the same. Core 1200 provides the same functionality as core 450, but the configuration and switching logic is modified to afford users the ability to partition the four physical banks PB0-PB3 into two separately addressable memories, each of which can be either one or two bits wide. Some elements are omitted from the depiction of FIG. 12 for brevity. For example, core 1200 may also include registers 405 and 410.
Physical bank PB0 includes a row decoder RD0, a memory array MA0, a sense amp SA0 (actually a collection of sense amplifiers), and a column decoder CD0. Each of the remaining physical banks PB1-PB3 includes identical structures. The row decoders, memory arrays, sense amps, and column decoders are omitted from FIG. 4B for brevity, but are included in FIG. 12 to illustrate an addressing scheme that enables core 1200 to independently address logical blocks LB0,1 and LB2,3.
Address buffers 225 and 230, introduced in FIG. 3, connect directly to the row and column decoders of physical banks PB2 and PB3. Configuration logic 310, also introduced in FIG. 3, connects to the bank-select terminals BS3-0 and to a crossbar switch 1207. Address buffers 225 and 230 are also selectively connected to the row and column decoders in physical banks PB0 and PB1 via a multiplexer 1205.
The configuration and switching logic of core 1200 is extended to include a second set of address buffers (row and column) 1209 and a second set of configuration logic 1210. Address buffers 1209 connect to the row and column decoders in physical banks PB0 and PB1 via multiplexer 1205. Configuration logic 1210 connects to crossbar switch 1207 (the data control circuit in this embodiment) and to bank-select terminals BS0 and BS1 via multiplexer 1205. A configuration-select bus CONF from configuration logic 310 includes three control lines C0-C2 that connect to crossbar switch 1207. Line C2 additionally connects to the select terminal of multiplexer 1205. In this embodiment, mode register 220 (FIG. 3) is adapted to store configuration data establishing the levels provided on lines C0-C2.
Core 1200 supports four operational modes, or “configurations,” in addition to those described above in connection with FIGS. 3, 4A, and 4B. These modes are summarized below in Table 3.
TABLE 3
| CONF# | CORE CONFIGURATION | C2 | C1 | C0 |
| --- | --- | --- | --- | --- |
| 1 | Single Address, Variable Width | 1 | X | X |
| 2 | Separate Addresses, separate 2-bit busses (DQ3/DQ2 and DQ1/DQ0) | 0 | 0 | 0 |
| 3 | Separate Addresses, memories share lines DQ1 and DQ0 | 0 | 0 | 1 |
| 4 | Separate Addresses, memories share lines DQ3 and DQ2 | 0 | 1 | 0 |
| 5 | Separate Addresses, banks configured to be 1-bit wide, data on lines DQ0 and DQ1 | 0 | 1 | 1 |
Core 1200 is operationally identical to core 450 of FIG. 4B if each of lines C0-C2 is set to logic one. In that case, the logic one on line C2 causes multiplexer 1205 to pass the address from address buffers 225 and 230 to physical banks PB0 and PB1. The logic levels on lines C0 and C1 are irrelevant in this configuration.
Driving line C2 to a voltage level representative of a logic zero causes multiplexer 1205 to convey the contents of the second set of address buffers 1209 to physical banks PB0 and PB1, and additionally causes crossbar switch 1207 to respond to the control signals on lines C0 and C1. Logical banks LB0,1 and LB2,3 are thereby separated to provide independent memory access. Logical banks LB0,1 and LB2,3 are separately addressable in each of configurations two through five of Table 3. Though not shown, logical banks LB0,1 and LB2,3 can be adapted to receive either the same clock signal or separate clock signals.
In configuration number two, crossbar switch 1207 accesses logical bank LB0,1 on lines DQ0 and DQ1 and logical bank LB2,3 on lines DQ2 and DQ3. Core 1200 is therefore divided into a pair of two-bit memories accessed via separate two-bit data busses.
In configuration number three, crossbar switch 1207 alternatively accesses either logical bank LB0,1 or logical bank LB2,3 via lines DQ0 and DQ1. Core 1200 is therefore divided into two separately addressable two-bit memories that share a two-bit data bus. Configuration number four is similar, but access is provided via lines DQ2 and DQ3.
Configuration number five divides core 1200 into two separately addressable, one-bit-wide memories. In effect, each pair of physical blocks within logical blocks LB0,1 and LB2,3 is combined to form a single-bit memory with twice the address locations of a parallel configuration. Each of the resulting one-bit-wide memories is then separately accessible via one bus line.
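The mapping from control lines C2, C1, and C0 to the configurations of Table 3 can be expressed as a small decoder. The Python sketch below is illustrative only: the function names `decode_configuration` and `pb01_address_source` are hypothetical, and the sketch models only the selection behavior described above, not the circuit itself.

```python
def decode_configuration(c2: int, c1: int, c0: int) -> int:
    """Return the Table 3 configuration number for the given control-line levels.

    When C2 is logic one, C1 and C0 are don't-cares (configuration 1).
    """
    for level in (c2, c1, c0):
        if level not in (0, 1):
            raise ValueError("control lines must be 0 or 1")
    if c2 == 1:
        return 1
    return {(0, 0): 2, (0, 1): 3, (1, 0): 4, (1, 1): 5}[(c1, c0)]


def pb01_address_source(c2: int) -> str:
    """Return which address buffers multiplexer 1205 passes to banks PB0 and PB1."""
    # C2 drives the select terminal of multiplexer 1205.
    return "address buffers 225 and 230" if c2 == 1 else "address buffers 1209"


if __name__ == "__main__":
    for bits in [(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]:
        print(bits, "-> configuration", decode_configuration(*bits),
              "| PB0/PB1 addressed from", pb01_address_source(bits[0]))
```

Treating C1 and C0 as don't-cares when C2 is high mirrors the “X” entries in the first row of Table 3.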
The modes of Table 3 are not exhaustive. More control signals and/or additional control logic can be included to increase the available memory configurations. For example, configuration number five might be extended to include the ability to select the bus line upon which data is made available, or the two-bit modes could be extended to provide data on additional pairs of bus lines.
The mode-select aspect allows core 1200 to efficiently support data of different word lengths. Processors, which receive instructions and data from memory like core 1200, are sometimes asked to alternatively perform complex sets of instructions on relatively small data structures or perform simple instructions on relatively large data structures. In graphics programs, for example, the computationally simple task of refreshing an image employs large data structures, while more complex image processing tasks (e.g., texture mapping and removing hidden features) often employ relatively small data structures. Core 1200 can dynamically switch between configurations to best support the task at hand by altering the contents of mode register 220 (FIG. 3). In the graphics-program example, instructions that contend with relatively large data structures might simultaneously access both logical blocks LB0,1 and LB2,3 in parallel, and instructions that contend with relatively small data structures might access logical blocks LB0,1 and LB2,3 separately using separate addresses. Core 1200 may therefore provide more efficient memory usage. As with cores 400 and 450, core 1200 minimizes the power required to perform a row access by limiting each row access to the selected physical bank(s).
FIG. 13A is a simplified block diagram 1300 of core 1200 of FIG. 12 illustrating memory access timing in one memory-access mode. In this example, core 1200 is configured to deliver full-width data from combined logical blocks LB2,3 and LB0,1. The pairs of memory blocks within each logical block LB2,3 and LB0,1 are combined for simplicity of illustration. At time T1, the data stored in row address location ADD0 in each of logical blocks LB2,3 and LB0,1 are loaded simultaneously into respective sense amplifiers SA2/3 and SA0/1. The row address ADD0 used for each logical block is the same. Then, at time T2, the contents at the same column address of the two sense amplifiers are accessed simultaneously with data lines DQ3/2 and DQ1/0 via switch 1207. Time T1 precedes time T2.
FIG. 13B is a block diagram 1310 of core 1200 of FIG. 12 illustrating access timing in a second memory-access mode. In this example, core 1200 is configured to alternatively deliver half-width data by separately accessing logical blocks LB2,3 and LB0,1. At time T1, the contents of row address ADD0 in logical block LB2,3 load into sense amplifiers SA2/3. At another time T2 (where T2 may be earlier or later than T1), the contents of row address ADD0 in logical block LB0,1 load into sense amplifiers SA0/1. Of interest, at each of times T1 and T2 only the accessed physical blocks are enabled using the appropriate bank-select signals BS3-0 (see FIG. 12). The content at a column address of sense amplifiers SA2/3 is accessed at time T3 via the data lines DQ0/1. The content at the same column address of sense amplifiers SA0/1 is accessed at another time T4 via the data lines DQ0/1 (where T4 may be earlier or later than T3). Time T1 precedes time T3, and time T2 precedes time T4.
FIG. 13C is a simplified block diagram 1315 of core 1200 of FIG. 12 illustrating access timing in a third memory-access configuration. As in the example of FIG. 13A, core 1200 is configured to deliver full-width data from combined logical blocks LB2,3 and LB0,1; unlike the example of FIG. 13A, however, diagram 1315 illustrates the case in which logical blocks LB2,3 and LB0,1 are addressed separately. At time T1, the contents of row address ADD0 in logical block LB2,3 and row address ADD1 in logical block LB0,1 are loaded substantially simultaneously into respective sense amplifiers SA2/3 and SA0/1. The term “substantially simultaneous” is used here to indicate the possibility that these two operations are not precisely simultaneous (coincident), but nevertheless overlap. The content at a first column address of sense amplifiers SA2/3 is accessed at time T2 via the data lines DQ0/1. The content at a second column address of sense amplifiers SA0/1 is accessed substantially simultaneously at time T2 via the data lines DQ0/1. Time T1 precedes time T2.
FIG. 13D is a block diagram 1320 of core 1200 of FIG. 12 illustrating access timing in a fourth memory-access mode. With respect to timing, diagram 1320 is similar to diagram 1310 of FIG. 13B. Diagram 1320 differs from diagram 1310, however, in that each of logical blocks LB2,3 and LB0,1 is independently addressed. Core 1200 can therefore interleave data from different addresses in logical banks LB2,3 and LB0,1 and provide the resulting data on data lines DQ1 and DQ0. Specifically, at time T1, the contents of row address ADD0 in logical block LB2,3 load into sense amplifiers SA2/3. At another time T2 (where T2 may be earlier than, later than, or the same as T1), the contents of another row address ADD1 in logical block LB0,1 load into sense amplifiers SA0/1 (ADD0 and ADD1 may be the same or different). The content at a first column address of sense amplifiers SA2/3 is accessed at time T3 via the data lines DQ0/1. The content at a second column address of sense amplifiers SA0/1 is accessed at another time T4 via the data lines DQ0/1 (where T4 may be earlier or later than T3). Time T1 precedes time T3, and time T2 precedes time T4.
FIG. 13E is a simplified block diagram 1325 of core 1200 illustrating access timing in a mode that delivers full-width data from combined logical blocks LB2,3 and LB0,1. With respect to timing, diagram 1325 is similar to diagram 1300 of FIG. 13A. Diagram 1325 differs from diagram 1300, however, in that each of logical blocks LB2,3 and LB0,1 is independently addressed.
FIG. 13F is a simplified block diagram 1330 of core 1200 illustrating access timing in a mode that delivers half-width data from independently addressed logical blocks LB2,3 and LB0,1. The flow of data in diagram 1330 is similar to that of diagram 1320 of FIG. 13D. However, diagram 1330 differs from diagram 1320 with respect to timing because the contents of address locations ADD0 of logical block LB2,3 and ADD1 of logical block LB0,1 are delivered to respective sense amplifiers SA2/3 and SA0/1 substantially simultaneously.
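The ordering constraints stated for the half-width modes of FIGS. 13B, 13D, and 13F can be captured in a short check. The Python sketch below is a simplified model under stated assumptions: the function name `half_width_schedule_ok` is hypothetical, times are treated as plain numbers, and the requirement that T3 and T4 differ is an inference from the statement that T4 is “another time” on the shared data lines DQ0/1, not an explicit rule of the specification.

```python
def half_width_schedule_ok(t1: float, t2: float, t3: float, t4: float) -> bool:
    """Check a half-width access schedule against the stated ordering constraints.

    t1, t2: row-activation times for logical blocks LB2,3 and LB0,1.
    t3, t4: column-access times for sense amplifiers SA2/3 and SA0/1.
    """
    row_before_column = t1 < t3 and t2 < t4  # T1 precedes T3, T2 precedes T4
    shared_bus_free = t3 != t4               # assumption: shared lines DQ0/1
    return row_before_column and shared_bus_free


if __name__ == "__main__":
    print(half_width_schedule_ok(0, 1, 2, 3))  # staggered rows, as in FIG. 13B or 13D
    print(half_width_schedule_ok(0, 0, 2, 3))  # simultaneous rows, as in FIG. 13F
    print(half_width_schedule_ok(0, 1, 2, 2))  # both column accesses at once
    print(half_width_schedule_ok(3, 1, 2, 4))  # column access before its row activation
```

The relative order of T1 and T2 is deliberately unconstrained, since the row activations may be staggered in either order or substantially simultaneous.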
Although details of specific implementations and embodiments are described above, such details are intended to satisfy statutory disclosure obligations rather than to limit the scope of the following claims. Thus, the invention as defined by the claims is not limited to the specific features described above. Rather, the invention is claimed in any of its forms or modifications that fall within the proper scope of the appended claims, appropriately interpreted in accordance with the doctrine of equivalents.