TECHNICAL FIELD
Embodiments of the present invention relate to computer systems, and, in particular, to optical interconnects.
BACKGROUND
Newly developed software and improvements to existing software continue to place ever increasing demands on processing power and memory capacity of computer systems. Typical high performance rack mounted computer systems, such as a blade system, comprise a number of processor boards and memory boards that are in electronic communication over an electronic interconnect fabric. An ideal interconnect fabric allows processors and memory to scale independently in order to reconfigure computer systems with enough memory or processing speed to meet the computational demands of the software.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a schematic representation of an example one-dimensional multi-bus fabric configured in accordance with embodiments of the present invention.
FIG. 2 shows a schematic representation of an example transceiver configured in accordance with embodiments of the present invention.
FIG. 3 shows a schematic representation of an example optical bus optically coupled to four nodes in accordance with embodiments of the present invention.
FIG. 4 shows an example of n nodes in optical communication with a waveguide of an optical bus in accordance with embodiments of the present invention.
FIG. 5 shows a schematic representation of an example two-dimensional multi-bus fabric configured in accordance with embodiments of the present invention.
FIG. 6 shows a schematic representation of an example one-dimensional memory fabric configured in accordance with embodiments of the present invention.
FIG. 7 shows a schematic representation of an example transceiver for an expansion memory device configured in accordance with embodiments of the present invention.
FIG. 8 shows a schematic representation of an example two-dimensional memory fabric configured in accordance with embodiments of the present invention.
FIG. 9 shows an example of a single-socket processor board configured in accordance with embodiments of the present invention.
FIG. 10 shows an example of a dual-socket processor board configured in accordance with embodiments of the present invention.
FIG. 11 shows an isometric view and schematic representation of an example eight socket computer system configured in accordance with embodiments of the present invention.
FIG. 12 shows an isometric view and schematic representation of an example thirty-two socket computer system configured in accordance with embodiments of the present invention.
FIG. 13 shows an example of a memory expansion board configured in accordance with embodiments of the present invention.
FIG. 14 shows an end-on view of an example computer system with memory expansion configured in accordance with embodiments of the present invention.
FIG. 15 shows an example of a ring-based optical interconnect for a one-dimensional arrangement of four nodes of a computer system.
FIG. 16 shows an example of a switch-based optical interconnect of four nodes of a computer system.
FIG. 17 shows an example of a torus-based optical interconnect for sixteen nodes of a computer system.
FIG. 18 shows an example of a two-layer switch-based optical interconnect of a sixteen node computer system.
DETAILED DESCRIPTION
In practice, an interconnect fabric of a computer system provides limited scaling capacity and may not be configured to scale with the computational demands of software that may be run on the same system in the future. The speed of processors has steadily increased due to technological improvements, which has helped to address some data processing demands. In particular, developments in integrated circuit technology have shown remarkable progress in reducing the size of computer components, which has led to increases in component densities, decreases in the cross-sectional dimensions of signal lines, and increased data rates. Electronic signal line buses have traditionally been used to connect boards in interconnect fabrics, but as data rates have increased, signal integrity has diminished. As a result, the number of circuit boards in a computer system connected to a bus has decreased and in many cases buses have been replaced by point-to-point electronic interconnects. Point-to-point electronic interconnects can be used to create scalable electronic interconnect fabrics.
However, these electronic interconnect fabrics have a number of disadvantages. They can be labor intensive to set up, and sending electronic signals over conventional electronic interconnects consumes large amounts of power. In addition, it is becoming increasingly difficult to scale the bandwidth of electronic interconnects, and the relative amount of time needed to send electronic signals over a conventional electronic interconnect fabric is becoming too long to take full advantage of the high-speed performance offered by smaller and faster components.
Manufacturers, designers, and users of computer systems have recognized a market for high-speed, bus-based interconnect fabrics that can be scaled to meet the ever increasing demands on computer systems without the constraints inherent in currently employed electronic interconnect fabrics.
Various embodiments of the present invention are directed to arrangements of multiple optical buses to create scalable optical interconnect fabrics for computer systems. Optical interconnect fabrics configured in accordance with embodiments of the present invention allow processors and memory of a computer system to scale independently so that the computer system can be reconfigured to meet the changing computational demands of software. In particular, optical interconnect fabric embodiments allow large numbers of processors to be interconnected at low latency, and memory can be scaled efficiently in terms of power consumption and size and without adding significant latency. The optical interconnect fabrics include multiple optical buses that provide relatively lower optical signal hop counts and lower power consumption, and can accommodate relatively higher data rates, than conventional electronic and optical interconnect fabrics.
The detailed description of the present invention is organized as follows. A general description of optical interconnect fabric embodiments is provided in a first subsection. Implementations of optical interconnect fabrics are provided in a second subsection. Finally, a description of advantages of optical interconnect fabric embodiments is provided in a third subsection.
Optical Interconnect Fabrics
FIG. 1 shows a schematic representation of an example multi-bus optical fabric 102 configured in accordance with embodiments of the present invention. The multi-bus fabric 102 is a one-dimensional optical interconnect comprising four optical buses 104-107 that provide optical communication between four nodes 108-111. The nodes can be any combination of processors, memory controllers, servers, clusters of multi-core processing units, circuit boards, external network connections, or any other data processing, storing, or transmitting device. For example, the node 108 includes a router 112 in electronic communication with a processing element (“PE”) 114. The PE 114 can be a single processor or a multi-core processor and includes local memory. In FIG. 1, directional arrows, such as arrows 124-127, represent optical signals sent between the nodes 108-111 and the optical buses 104-107. The optical buses 104-107 include four separate fabric ports for transmitting the optical signals between the nodes 108-111 and the optical buses 104-107. Each node is in optical communication with one of the optical buses for broadcasting optical signals and is in optical communication with the remaining three optical buses for receiving optical signals that are broadcast from the other three nodes. In other words, each node broadcasts the optical signals to the other nodes over an associated optical bus. A broadcasting node can be referred to as a master node. For example, PE 114 generates data encoded in electronic signals 116 that are sent to router 112. The router 112, in this embodiment, includes a transceiver described below with reference to FIG. 2 that converts the electronic signals into optical signals encoding the same information and sends the optical signals 124 to the optical bus 104. The optical bus 104 broadcasts the optical signals received from the node 108 to each of the nodes 109-111, as indicated by directional arrows 130-132. As shown in the example of FIG. 1, nodes 109-111 broadcast optical signals in the same manner on corresponding optical buses 105-107, respectively. Directional arrows 125-127 represent optical signals node 108 receives from nodes 109-111, respectively. Node 108 converts the data encoded in the optical signals into electronic signals for processing.
Each of the nodes 108-111 includes a transceiver (not shown in FIG. 1) for sending and receiving optical signals. In certain embodiments, transceivers can be implemented as part of the router. In other embodiments, transceivers can be implemented as a separate device that is in electronic communication with the router, as described below in the Implementations subsection. FIG. 2 shows a schematic representation of an example transceiver 202 configured in accordance with embodiments of the present invention. The transceiver 202 comprises a transmitter 204, three receivers 205-207, and interface electronics 208, which are in electronic communication with the transmitter 204 and the receivers 205-207. The transmitter 204 comprises an array of light-emitting sources, such as light-emitting diodes, semiconductor lasers, or vertical cavity surface emitting lasers (“VCSELs”). In certain embodiments, the sources can be configured to emit electromagnetic radiation with approximately the same wavelength. In other embodiments, each source can be configured to emit a different wavelength providing for dense wavelength division multiplexing channel spacing. In still other embodiments, the sources can be configured to emit wavelengths in wavelength ranges providing for coarse wavelength division multiplexing channel spacing. The use of wavelength division multiplexing reduces the number of waveguides needed for the same number of channels. In the example shown in FIG. 2, the transmitter 204 comprises 12 sources, each of which is separately controlled by the interface electronics 208 to emit an optical signal. Directional arrows 210 each represent a separate optical signal generated by one of the 12 sources. In certain embodiments, the optical signals 210 can be sent in separate waveguides to one of the optical buses in the multi-bus fabric 102, or in other embodiments, the separate optical signals 210 can be optically coupled directly from the transmitter 204 sources into an associated optical bus as described below with reference to FIGS. 3 and 4. For example, the transceiver 202 can represent the transceiver of node 108, and the 12 optical signals 210 generated by the transmitter 204 are represented in FIG. 1 by directional arrow 124.
Each of the receivers 205-207 comprises an array of photodetectors. In the example shown in FIG. 2, the receivers 205-207 each comprise an array of 12 photodetectors. The photodetectors can be p-n junction or p-i-n junction photodetectors. Sets of arrows 211-213 each represent 12 optical signals generated by different routers in the same manner the optical signals are generated by the transmitter 204. For example, the transceiver 202 can represent the transceiver of node 108 in FIG. 1, and the sets of optical signals 211-213 sent to receivers 205-207 are represented in FIG. 1 by directional arrows 125-127, respectively. In certain embodiments, each optical signal can be carried to a photodetector of a receiver via a separate waveguide. In other embodiments, each optical signal can be optically coupled directly from the associated optical bus to a photodetector of a receiver.
Interface electronics 208 electronically couple the transmitter 204 and receivers 205-207 to the electronic components of a corresponding node. The interface electronics 208 may include drivers for operating the light-emitting sources of the transmitter 204 and may include amplifiers for amplifying the electronic signals generated by the photodetectors of the receivers 205-207. The interface electronics 208 receive electronic signals from the node and send the electronic signals to the transmitter 204 to generate optical signals. The waveguides 211-213 direct separate optical signals to the photodetectors of the receivers 205-207. The separate optical signals are converted into separate corresponding electronic signals that are sent to the interface electronics 208, which send the electronic signals to the node.
Returning to FIG. 1, the optical buses 104-107 are each configured with a number of separate waveguides and optical taps for broadcasting the optical signals generated by one node to the other three nodes. In particular, the multi-bus fabric 102 can be configured so that the optical signals broadcast by any one node and received by the other three nodes arrive with approximately the same optical power.
FIG. 3 shows a schematic representation of the optical bus 104 optically coupled to the nodes 108-111, shown in FIG. 1, in accordance with embodiments of the present invention. The optical bus 104 comprises 12 separate waveguides 302. FIG. 3 reveals that each of the optical signals 124, shown in FIG. 1, is optically coupled via a reflective device 304, such as a mirror, to one of the waveguides 302 in the optical bus 104. Wavelength division multiplexing can be used to inject each of the optical signals output from the node 108 into each of the waveguides 302. FIG. 3 also reveals that the optical signals 126-128, shown in FIG. 1, are composed of 12 separate optical signals. Each of the optical signals 126 is directed from a waveguide 302 via an optical tap 306 to node 109, each of the optical signals 127 is directed from a waveguide 302 via an optical tap 308 to node 110, and each of the optical signals 128 is directed from a waveguide 302 via a reflective device 310 to the node 111.
The optical bus 104 is a fan-out bus configured to receive 12 optical signals generated by node 108 and broadcast the 12 optical signals to each of the nodes 109-111 with approximately the same optical power. In certain embodiments, the optical taps 306 can be configured to operate as ⅓:⅔ beamsplitters and optical taps 308 can be configured to operate as 50:50 beamsplitters. For example, consider the path of a single optical signal generated by the node 108. The optical signal is reflected by a mirror 304 into an optically coupled waveguide 302. An optical tap 306 reflects approximately ⅓ of the optical power of the optical signal to node 109 and approximately ⅔ of the optical power is transmitted along the same waveguide 302 to an optical tap 308. At the optical tap 308, approximately ½ of the remaining ⅔ optical power is reflected to node 110 and the remaining ⅓ of the optical power is transmitted along the same waveguide 302 through the optical tap 308 to a mirror 310 where the remaining optical signal is reflected to node 111. Thus, for each of the 12 optical signals generated by the node 108, the nodes 109-111 each receive approximately ⅓ of the optical power associated with each optical signal.
Multi-bus fabric 102 is not limited to optically interconnecting four nodes. In other embodiments, a single one-dimensional multi-bus fabric can be configured to accommodate as few as 2 nodes and as many as 5, 6, 7, or 10 or more nodes. The maximum number of nodes is determined by the transmit power, the overall system loss, and the minimum receiver sensitivity. In general, the optical taps of an optical bus are configured so that when an optical signal is broadcast by a node over the optical bus, each of the receiving nodes receives approximately 1/(n−1) of the total optical power P of the optical signal. FIG. 4 shows an example of n nodes in optical communication with a waveguide 402 in accordance with embodiments of the present invention. The waveguide 402 can be a waveguide of an optical bus of a multi-bus fabric described above with five of n−2 optical taps 404-408 and two mirrors 410 and 411 represented. Node 412 outputs an optical signal 414 with optical power P to the waveguide 402. The optical signal can be generated by driving a source of a transmitter of the node 412, as described above with reference to FIG. 2. The optical taps are configured so that each node receives a reflected portion of the optical signal with an approximate optical power P/(n−1).
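The dependence of the maximum node count on transmit power, system loss, and receiver sensitivity can be illustrated with a simple link-budget sketch. It assumes an idealized bus in which the only node-count-dependent loss is the even 1/(n−1) power split, expressed as 10·log10(n−1) dB; the function name `max_nodes` and all numeric values are hypothetical and are not taken from the embodiments.

```python
import math


def max_nodes(tx_power_dbm, system_loss_db, rx_sensitivity_dbm):
    """Largest n for which each of the n-1 receivers still sees enough power.

    Assumption (not from the source): the broadcast power is split evenly,
    so the splitting loss is 10*log10(n-1) dB, and every other loss is
    lumped into system_loss_db.
    """
    margin_db = tx_power_dbm - system_loss_db - rx_sensitivity_dbm
    if margin_db < 0:
        return 0  # not even a two-node bus closes the link
    # Need 10*log10(n-1) <= margin_db, i.e. n-1 <= 10**(margin_db/10).
    max_receivers = math.floor(10 ** (margin_db / 10) + 1e-9)
    return max_receivers + 1


if __name__ == "__main__":
    # Hypothetical values: 0 dBm source, 5 dB fixed loss, -10 dBm sensitivity.
    print(max_nodes(0.0, 5.0, -10.0))
```

With these hypothetical values the 5 dB margin supports three receivers, which corresponds to a four-node bus.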
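The optical taps denoted by $OT_m$ in FIG. 4 reflect a fraction of the optical signal power to an optically coupled node in accordance with:
$$R_m = \frac{1}{n - m}$$
and transmit a fraction of the optical signal power in accordance with:
$$T_m = \frac{n - m - 1}{n - m}$$
where m is an integer with 1 ≤ m ≤ n−2 indexing the optical taps in order from the broadcasting node. For n = 4, these expressions give the ⅓:⅔ and 50:50 beamsplitters described above with reference to FIG. 3.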
Thus, an optical tap $OT_m$ receives an optical signal with optical power P from a broadcasting node and outputs a reflected portion with optical power $PR_m$ toward an optically coupled node and outputs a transmitted portion with optical power $PT_m$, where $P = PR_m + PT_m + L_m$ with $L_m$ representing the optical power loss at the optical tap $OT_m$ due to absorption, scattering, or misalignment. For a general description of broadcasting an optical signal with substantially the same optical power to a number of nodes over a single waveguide.
In other embodiments, the optical buses of a one-dimensional multi-bus fabric can be implemented using star couplers. For example, each optical bus of the multi-bus fabric 102 can be implemented using 12 star couplers. In FIG. 3, a star coupler comprises one input port that carries one of the optical signals 124 and three output ports, each of which carries one of the optical signals 126-128. Each star coupler can be configured so that an optical signal received in the input port is split into three output optical signals, each output optical signal carrying approximately ⅓ of the optical power of the input optical signal.
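The equal-power broadcast can be checked numerically. The following is a minimal sketch, assuming ideal lossless taps (every $L_m$ equal to zero) indexed from the broadcasting node, with tap m reflecting 1/(n−m) of the power that reaches it; this is the ratio that reproduces the ⅓:⅔ and 50:50 taps of the four-node example. The function names are illustrative only.

```python
from fractions import Fraction


def tap_fractions(n):
    """(reflect, transmit) fractions for taps OT_1..OT_(n-2) on an n-node bus.

    Assumes ideal lossless taps indexed from the broadcasting node.
    """
    if n < 2:
        raise ValueError("a bus needs at least 2 nodes")
    return [
        (Fraction(1, n - m), Fraction(n - m - 1, n - m))
        for m in range(1, n - 1)
    ]


def received_powers(n, power=Fraction(1)):
    """Power delivered to each of the n-1 receiving nodes, in bus order."""
    delivered = []
    remaining = power
    for reflect, transmit in tap_fractions(n):
        delivered.append(remaining * reflect)  # portion tapped off to a node
        remaining = remaining * transmit       # portion continuing down the bus
    delivered.append(remaining)  # the final mirror sends the rest to the last node
    return delivered


if __name__ == "__main__":
    print(tap_fractions(4))
    print(received_powers(4))
```

For n = 4 the sketch yields taps of 1/3:2/3 and 1/2:1/2, and each of the three receiving nodes gets 1/3 of the input power. Real taps with nonzero loss would deliver somewhat less.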
The waveguides of the optical buses 104-107 can be optical fibers, optical waveguides, or hollow waveguides. A hollow waveguide is composed of a tube with an air core. The structural tube forming the hollow waveguide can have inner core materials with refractive indices greater than or less than one. The tubing can be composed of a suitable metal, glass, or plastic, and metallic and dielectric films can be deposited on the inner surface of the tubing. The hollow waveguides can be hollow metal waveguides with highly reflective metal coatings lining the interior surface of the core. The air core can have a cross-sectional shape that is circular, elliptical, square, rectangular, or any other shape that is suitable for guiding light. Because the waveguide is hollow, optical signals can travel along the core of a hollow waveguide with an effective index of about 1. In other words, light propagates along the core of a hollow waveguide at approximately the speed of light in air or vacuum.
The example shown in FIG. 1 represents a one-dimensional multi-bus fabric for interconnecting nodes. In general, a one-dimensional multi-bus fabric for interconnecting n nodes comprises n optical buses. Each optical bus is driven by one node, and each node can receive broadcast optical signals on the other n−1 optical buses. In addition, a one-dimensional multi-bus fabric corresponds to n transceivers, or in other words, n transmitters and n² − n receivers. However, it is desirable to limit n due to power constraints in fan-out optical buses and because the total number of receivers grows quadratically with n.
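The component counts for a one-dimensional fabric follow directly from these relationships. A small sketch, with a hypothetical function name and dictionary keys chosen only for illustration:

```python
def one_dimensional_fabric_counts(n):
    """Component counts for a one-dimensional multi-bus fabric of n nodes."""
    if n < 2:
        raise ValueError("a multi-bus fabric needs at least 2 nodes")
    return {
        "optical_buses": n,            # one bus driven by each node
        "transmitters": n,             # one transmitter per transceiver
        "receivers_per_node": n - 1,   # one receiver per remote bus
        "total_receivers": n * n - n,  # n nodes times (n - 1) receivers
    }


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, one_dimensional_fabric_counts(n))
```

For the four-node fabric of FIG. 1 this gives 4 buses, 4 transmitters, and 12 receivers, while 16 nodes on a single one-dimensional fabric would already require 240 receivers.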
Two-dimensional arrangements of nodes, on the other hand, can limit optical bus fan-out and reduce the number of receivers per transceiver. FIG. 5 shows a schematic representation and an example of a two-dimensional multi-bus fabric 500 for interconnecting 16 nodes in accordance with embodiments of the present invention. As shown in FIG. 5, the multi-bus fabric 500 comprises eight separate one-dimensional multi-bus fabrics configured as described above with reference to FIGS. 1-3, and each node is in optical communication with two one-dimensional multi-bus fabrics. For example, node 502 is in optical communication with a one-dimensional multi-bus fabric identified by dashed-line enclosure 504 and a one-dimensional multi-bus fabric identified by dashed-line enclosure 506. In order to implement the two-dimensional multi-bus fabric 500, each node is configured with two transceivers that are configured and operated as described above with reference to FIG. 2.
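The saving can be made concrete by comparing a two-dimensional arrangement with a single one-dimensional fabric over the same nodes. The sketch below assumes the 16 nodes form a 4×4 grid in which every row and every column is its own one-dimensional fabric; the grid layout is an assumption about FIG. 5, and the function name and keys are hypothetical.

```python
def two_dimensional_fabric_counts(rows, cols):
    """Counts for a rows x cols grid where each row and each column
    is a separate one-dimensional multi-bus fabric (assumed layout)."""
    if rows < 2 or cols < 2:
        raise ValueError("each dimension needs at least 2 nodes")
    return {
        "nodes": rows * cols,
        "one_dimensional_fabrics": rows + cols,
        "transceivers_per_node": 2,  # one for the row, one for the column
        "receivers_per_node": (cols - 1) + (rows - 1),
        "max_bus_fan_out": max(rows, cols) - 1,
    }


if __name__ == "__main__":
    grid = two_dimensional_fabric_counts(4, 4)
    flat_receivers_per_node = grid["nodes"] - 1  # single 1-D fabric over 16 nodes
    print(grid)
    print("flat 1-D receivers per node:", flat_receivers_per_node)
```

Under these assumptions the 4×4 grid uses eight one-dimensional fabrics, a fan-out of 3 per bus, and 6 receivers per node, compared with a fan-out of 15 and 15 receivers per node for one flat fabric of 16 nodes.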
Optical fabric embodiments of the present invention also include optical memory fabrics for interconnecting expansion memory to nodes. FIG. 6 shows a schematic representation of an example one-dimensional memory multi-bus fabric 602 configured in accordance with embodiments of the present invention. The memory fabric 602 comprises four optical buses 604-607 that provide optical communication between a node 608 and three expansion memory devices 610-612. Directional arrows 614-617 between the node and the optical buses 604-607 represent separate optical signals, and directional arrows between the memory devices 610-612, such as arrows 618 and 619, also represent separate optical signals. The optical bus 604 is configured to broadcast the optical signals to each of the expansion memory devices 610-612 in the same manner the node 108 broadcasts optical signals to nodes 109-111 described above with reference to FIGS. 1-3. Expansion memory devices 610-612 send optical signals to node 608 over optical buses 605-607, respectively. The node 608 converts the optical signals into electronic signals for processing.
The node 608 includes the same transceiver described above with reference to FIG. 2. FIG. 7 shows a schematic representation of an example transceiver 702 for the expansion memory devices 610-612 configured in accordance with embodiments of the present invention. The transceiver 702 comprises a transmitter 704, a receiver 706, and interface electronics 708. In the example shown in FIG. 7, the transmitter 704 comprises an array of 12 light-emitting sources, such as VCSELs, each of which is separately controlled by the interface electronics 708 to emit an optical signal into a separate waveguide. The receiver 706 comprises an array of 12 p-n junction or p-i-n junction photodetectors and receiver integrated circuits which perform the optical to electrical conversion. Interface electronics 708 electronically couple the transmitter 704 and the receiver 706 to the electronic components of the memory. The transmitter 704 and receiver 706 are operated in the same manner as the transmitter 204 and receivers 205-207 described above with reference to FIGS. 2 and 3.
The example shown in FIG. 6 represents a memory multi-bus fabric for interconnecting a node with three expansion memory devices. Like the two-dimensional multi-bus fabric expansion described above with reference to FIG. 5, memory expansion can also be treated as an additional dimension. FIG. 8 shows a schematic representation and an example of a two-dimensional memory fabric 800 for interconnecting 4 nodes and 12 expansion memory devices in accordance with embodiments of the present invention. The two-dimensional memory fabric 800 comprises four one-dimensional memory multi-bus fabrics and one one-dimensional multi-bus fabric. The four nodes are in optical communication via the multi-bus fabric identified by a dashed-line enclosure 802, as described above with reference to FIGS. 1-3. In addition, each node is in optical communication with three expansion memory devices via one of the four memory multi-bus fabrics. For example, node 804 is optically coupled to expansion memory devices 805-807 via a memory multi-bus fabric identified by a dashed-line enclosure 808. In order to implement the two-dimensional multi-bus fabric 800, each node is configured with two transceivers that are configured and operated as described above with reference to FIG. 2.
Optical signals generated by the nodes and the expansion memory devices can be in the form of packets that include headers. Each header identifies a particular node or expansion memory device as the destination for the data encoded in the optical signals. For example, when the node 108 sends optical signals directed to one of the nodes 109-111, shown in FIG. 1, the optical signals are broadcast to the nodes 109-111 over the same optical bus 104. However, because the header of each packet identifies the particular node as the destination of the data, only the node identified by the header actually processes the optical signals. The other nodes also receive the optical signals, but because they are not identified by the header they can discard the data encoded in the optical signals.
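The header-based filtering described above can be sketched in software terms. The `Packet` and `Node` classes and the `broadcast` helper below are hypothetical illustrations of the behavior, not structures defined by the embodiments, and the header is reduced to a single destination identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    destination: int  # header field naming the intended node (assumed format)
    payload: bytes


class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.processed = []

    def receive(self, packet):
        """Process the packet only if the header names this node."""
        if packet.destination == self.node_id:
            self.processed.append(packet.payload)
            return True
        return False  # every other receiver discards the data


def broadcast(packet, receivers):
    """Deliver the packet to every receiver on the bus; return who kept it."""
    return [node.node_id for node in receivers if node.receive(packet)]


if __name__ == "__main__":
    receivers = [Node(109), Node(110), Node(111)]
    kept_by = broadcast(Packet(destination=110, payload=b"data"), receivers)
    print(kept_by)
```

Every receiver on the bus sees the packet, but only the node whose identifier matches the header stores the payload.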
Implementations
FIG. 9 shows an example single-socket processor board 900 configured in accordance with embodiments of the present invention. The processor board 900 includes a router 902 and a processing element 904 in electronic communication via bi-directional communication links 906 imprinted on the board 900. The processing element 904 can be a single processor or a multi-core processor and includes local memory. In this embodiment, the processor board 900 includes four transceivers 908-911 that are in electronic communication with router 902 and in optical communication with fabric ports A-D, respectively. The four fabric ports A-D form a single socket for optical connections with other devices. The transceivers 908-911 are configured as described above with reference to FIG. 2. For example, in fabric port A, directional arrow 914 represents a number of waveguides emanating from corresponding lasers in a transmitter, and directional arrows 916 each represent a number of waveguides that carry optical signals to photodetectors of three corresponding receivers. In other embodiments, the number of transceivers can range from as few as one to as many as 5, 6, 7 or more. The number of fabric ports per socket determines the potential dimensionality and versatility of a multi-bus fabric for a scalable computer system comprising numerous processor boards as described below. In other embodiments, the transceivers can be integrated within the electronic components of the router 902, as described above with reference to FIGS. 1-8.
FIG. 10 shows an example dual-socket processor board 1000 configured in accordance with embodiments of the present invention. The processor board 1000 includes two routers 1002 and 1004 and two processing elements 1006 and 1008. The processing elements 1006 and 1008 can be multi-core processors with local memory. The processing element 1006 is in electronic communication with the router 1002 via bi-directional communication links 1010, and the processing element 1008 is in electronic communication with the router 1004 via bi-directional communication links 1012. The processing board 1000 is also configured with two additional bi-directional communication links 1014 and 1016 providing electronic communication between the processing element 1006 and the router 1004 and between the processing element 1008 and the router 1002, respectively. As shown in FIG. 10, the processing board 1000 includes two sets of four transceivers, each set with fabric ports labeled A-D, as described above with reference to FIG. 9. Each set of fabric ports forms a socket. Thus, board 1000 is referred to as a dual-socket processor board.
In accordance with embodiments of the present invention, the multi-bus fabrics described above with reference to FIGS. 1-8 can be used to form one-, two-, and higher dimensional computer systems comprising any number of single- or dual-socket processor boards. FIG. 11 shows an isometric view and schematic representation of an example eight socket computer system 1100 configured in accordance with embodiments of the present invention. The eight socket system 1100 comprises four processor boards 1101-1104 in optical communication via four separate multi-bus fabrics 1106-1109. In the example of FIG. 11, the processor boards 1101-1104 are configured in the same manner as the processor board 1000. The multi-bus fabrics are configured and operated as described above with reference to FIGS. 1-3 and optically coupled to the A and B fabric ports. Each multi-bus fabric provides the same optical communication described above with reference to FIGS. 1-3. The processor boards 1101-1104 can be mounted in a cabinet or chassis. Multi-bus fabrics 1106-1109 can be implemented in the cabinet backplane. Each processor board is inserted into the cabinet with the A and B fabric ports inserted into corresponding receptacles in the multi-bus fabrics 1106-1109.
Computer system embodiments are not limited to four dual-socket processor boards. In other embodiments, the one-dimensional multi-bus fabrics can be used to form two-dimensional multi-bus fabrics for large scale computer systems. The single-socket and dual-socket boards described above include additional fabric ports C and D that can be used to increase the number of processor boards in the computer systems. For example, the multi-bus fabrics can be configured to combine multiple eight socket systems to form larger computer systems.
FIG. 12 shows an isometric view and schematic representation of an example 32 socket computer system 1200 configured in accordance with embodiments of the present invention. The socket system 1200 comprises four eight socket computer systems 1201-1204, each of which is configured and optically interconnected using four multi-bus fabrics as described above with reference to FIG. 11. The fabric ports C of the dual socket boards are used to optically couple the eight socket systems to one another. For example, a 4-way optical star coupler 1205 can be used to optically couple all four dual socket boards 1206-1209. The 4-way optical coupler 1205 enables the boards 1206-1209 to exchange optical signals. For example, optical signals output from fabric port C of board 1207 are split into four substantially identical sets of the same optical signals that are input to corresponding fabric ports C of the boards 1206, 1208, and 1209. The 4-way optical coupler 1205 enables each of the boards 1206, 1208, and 1209 to send optical signals to the other three boards in the same manner. The optical coupler 1205 can be replicated at most 15 more times for all C and D fabric ports to give full bandwidth communication. The processor boards 1101-1104 can be mounted in a cabinet or chassis. The four multi-bus fabrics optically connecting the boards in each eight socket system and the 4-way optical couplers form a two-dimensional multi-bus fabric that can be implemented in the backplane of the cabinet.
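The socket and coupler counts in this example can be reproduced with simple arithmetic. The sketch below assumes one 4-way star coupler per C or D fabric port position, each joining the corresponding port of one board in every eight socket system; that reading of FIG. 12, and the function names, are assumptions made for illustration.

```python
def socket_count(systems=4, boards_per_system=4, sockets_per_board=2):
    """Total sockets when several eight socket systems are combined."""
    return systems * boards_per_system * sockets_per_board


def star_coupler_count(boards_per_system=4, sockets_per_board=2,
                       second_dimension_ports=("C", "D")):
    """Couplers needed if every listed port position gets its own coupler
    (assumed one coupler per port position across the four systems)."""
    return boards_per_system * sockets_per_board * len(second_dimension_ports)


if __name__ == "__main__":
    print(socket_count())             # sockets in the combined system
    print(star_coupler_count() - 1)   # couplers in addition to coupler 1205
```

Under these assumptions the combined system has 32 sockets and 16 couplers in total, which is consistent with replicating coupler 1205 at most 15 more times.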
In other embodiments, the C fabric ports may be used to provide the second dimension of optical communication between dual-socket boards of the eight socket systems, and the D fabric ports may be used to provide optical communication with other devices, such as expansion memory boards.FIG. 13 shows an examplememory expansion board1300 configured in accordance with embodiments of the present invention. Theboard1300 comprises twomemory controllers1302 and1304, memory1306-1309, and four transceivers1312-1315. Memory1306-1309 are each composed of eight Dual In-line Memory Modules (“DIMMs”), and transceivers1312-1315 are configured as described above with reference toFIG. 7, and are in optical communication with fabric ports labeled X and Y. As shown inFIG. 13, thememory controller1302 is in electronic communication withmemory1306 and1307 andtransceivers1312 and1313, and thememory controller1304 is in electronic communication withmemory1308 and1309 andtransceivers1314 and1315. Thememory controllers1302 and1304 manage the flow of data between the transceivers and the memory. In other embodiments, thememory controllers1302 and1304 can be in electronic communication with each other viabi-directional communication links1316 and1318 in order to coordinate storage of and access to stored data.
Expansion memory boards can be rack mounted and placed into optical communication with single and dual-socket systems described above using memory multi-bus fabrics.FIG. 14 shows an end-on view of anexample computer system1400 with memory expansion configured in accordance with embodiments of the present invention.Center shelves1402 and1404 house the 32socket system1200 shown inFIG. 12. In particular,shelf1402houses socket systems1201 and1202 andshelf1404houses socket systems1203 and1204, with dual socket boards represented by rectangles, such asrectangle1406. Dashed-line rectangles1408 identify the multi-bus fabrics used to optically couple the A and B fabric ports of the dual socket boards within the same socket system, as described above with reference toFIG. 11. C fabric ports of the dual socket boards are used to provide a second dimension of optical communication between the four socket systems1201-1204 using star couplers, such asstar coupler1205 described above with reference toFIG. 12. Note that in order to avoid clutteringFIG. 14 only onestar coupler1205 is shown. Shelves1410-1412 each house eight memory expansion boards, such asexpansion memory board1414. In the example show inFIG. 14, D fabric ports of the dual socket boards provide an additional dimension for memory expansion. The memory expansion boards can be optically coupled to D fabric ports of the dual socket boards via memory multi-bus fabrics. For example, as shown inFIG. 14,memory expansion boards1414 and1416 are optically coupled to dual-socket board1406 viaoptical bus1418, as described above with reference toFIG. 6.
Multi-bus fabrics of the present invention offer a number of advantages over other optical interconnect systems. The multi-bus fabrics allow large numbers of processing elements to be connected at low latency, and memory can be scaled efficiently in terms of power and without adding significant latency. In general, when multi-bus fabrics are compared to other optical interconnects, multi-bus fabrics have fewer transmitters than photodetectors per node, a lower average optical hop count, and a lower worst case optical hop count. An advantage of having fewer transmitters than photodetectors per node is that photodetectors typically have lower power consumption and higher reliability than transmitters. Thus, multi-bus fabrics of the present invention provide better reliability and lower power consumption when the connection count (i.e., total number of transmitters and receivers) is the same as in other optical interconnect systems.
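The transmitter and receiver trade-off can be illustrated with simple counting. The following is a minimal sketch under assumptions that are not stated in the description: each multi-bus node is taken to have one transmitter per fabric dimension and one receiver for every other node sharing a bus in that dimension, and each ring (or torus) node is taken to have one transmitter and one receiver per neighbor, with two neighbors per dimension. These rules were chosen because they agree with the transmitter and receiver rows of Tables I through III; the function name `ports_per_node` is hypothetical.

```python
def ports_per_node(topology, dims, nodes_per_bus=4):
    """Return (transmitters, receivers) per node under the assumed rules."""
    if topology == "multi-bus":
        # Assumed: one transmitter per dimension, and one receiver for each
        # of the other nodes on the bus in that dimension.
        return dims, dims * (nodes_per_bus - 1)
    if topology == "ring":
        # Assumed: a bidirectional ring (a torus when dims > 1) uses one
        # transmitter and one receiver per neighbor, two neighbors per dimension.
        return 2 * dims, 2 * dims
    raise ValueError(f"unknown topology: {topology}")


if __name__ == "__main__":
    for dims in (1, 2, 3):
        for topology in ("ring", "multi-bus"):
            tx, rx = ports_per_node(topology, dims)
            print(f"{4 ** dims:>2} nodes, {topology:<9}: "
                  f"{tx} transmitters, {rx} receivers, {tx + rx} connections")
```

With four nodes per bus, both topologies end up with the same connection count per node in this model, while the multi-bus node uses half as many transmitters.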
FIG. 15 shows an example ring-based optical interconnect for an arrangement of four nodes of a computer system, and FIG. 16 shows an example switch-based optical interconnect also of four nodes of a computer system. Table I compares performance parameters of the multi-bus fabric 102, shown in FIG. 1, with the ring and switch-based interconnects shown in FIGS. 15 and 16.
TABLE I
| 4 nodes | Ring | Multi-bus | Switch |
| --- | --- | --- | --- |
| Transmitters/node | 2 | 1 | 2 |
| Receivers/node | 2 | 3 | 2 |
| Optical connections/node | 4 | 4 | 4 |
| Ave. optical hops | 1 | 0.75 | 1.5 |
| Worst case optical hops | 2 | 1 | 2 |
Table I reveals that although the multi-bus fabric 102 has more receivers, it has fewer transmitters. Table I also reveals that the multi-bus fabric 102 has fewer average optical hops and a lower number of worst case optical hops.
FIG. 17 shows an example torus-based optical interconnect for an arrangement of sixteen nodes of a computer system, and FIG. 18 shows an example two-layer switch-based optical interconnect for sixteen nodes of a computer system. Table II compares performance parameters of the multi-bus fabric 500, shown in FIG. 5, with the torus (ring) and switch-based interconnects shown in FIGS. 17 and 18.
TABLE II
| 16 nodes | Ring | Multi-bus | Switch |
| --- | --- | --- | --- |
| Transmitters/node | 4 | 2 | 4 |
| Receivers/node | 4 | 6 | 4 |
| Ave. optical hops | 2 | 1.5 | 3.375 |
| Worst case optical hops | 4 | 2 | 4 |
Again, even though the fabrics have been scaled up to handle a larger number of nodes, Table II reveals that the multi-bus fabric 500 has more receivers, but fewer transmitters. Table II also reveals that the multi-bus fabric 500 maintains fewer average optical hops and a lower number of worst case optical hops.
Table III compares performance parameters of a multi-bus fabric for a 64 node three-dimensional computer system with analogous ring and switch-based systems.
TABLE III
| 64 nodes | Ring | Multi-bus | Switch |
| --- | --- | --- | --- |
| Transmitters/node | 6 | 3 | 6 |
| Receivers/node | 6 | 9 | 6 |
| Ave. optical hops | 3 | 2.25 | 5.34 |
| Worst case optical hops | 6 | 3 | 6 |
Table III reveals that the multi-bus fabric again has more receivers per node, but fewer transmitters, and that the multi-bus fabric maintains fewer average optical hops and a lower number of worst case optical hops.
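The hop counts in Tables I through III can be reproduced by enumerating every source and destination pair. The following is a minimal sketch under assumptions that the description does not state explicitly: four nodes per bus, ring, or lowest-layer switch; traffic spread uniformly over all pairs, including a node paired with itself at zero hops; one hop per dimension crossed on a multi-bus fabric; shortest-path routing on a bidirectional ring or torus; and two hops (up and down) per switch layer traversed. The names `hops` and `hop_stats` are hypothetical.

```python
from itertools import product

K = 4  # assumed nodes per bus, per ring, and per lowest-layer switch


def hops(topology, src, dst):
    """Optical hops between two nodes given as tuples of base-K coordinates."""
    if topology == "multi-bus":
        # Assumed: one broadcast hop for every dimension in which
        # the two coordinates differ.
        return sum(a != b for a, b in zip(src, dst))
    if topology == "ring":
        # Assumed: shortest way around a bidirectional ring in every dimension.
        return sum(min((a - b) % K, (b - a) % K) for a, b in zip(src, dst))
    if topology == "switch":
        # Assumed: a tree of switches, coordinates read most significant first.
        # Traffic climbs to the lowest shared switch layer and back down,
        # costing two hops per layer climbed.
        for level, (a, b) in enumerate(zip(src, dst)):
            if a != b:
                return 2 * (len(src) - level)
        return 0
    raise ValueError(f"unknown topology: {topology}")


def hop_stats(topology, dims):
    """Return (average hops, worst case hops) over all node pairs."""
    nodes = list(product(range(K), repeat=dims))
    counts = [hops(topology, s, d) for s in nodes for d in nodes]
    return sum(counts) / len(counts), max(counts)


if __name__ == "__main__":
    for dims in (1, 2, 3):
        for topology in ("ring", "multi-bus", "switch"):
            average, worst = hop_stats(topology, dims)
            print(f"{K ** dims:>2} nodes, {topology:<9}: "
                  f"average {average:.3f}, worst case {worst}")
```

Under these assumptions the enumeration agrees with the tabulated averages and worst cases for 4, 16, and 64 nodes, which suggests, but does not confirm, that the tables were computed in a similar way.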
The foregoing description, for purposes of explanation, used specific nomenclature to provide a thorough understanding of the invention. However, it will be apparent to one skilled in the art that the specific details are not required in order to practice the invention. The foregoing descriptions of specific embodiments of the present invention are presented for purposes of illustration and description. They are not intended to be exhaustive of or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in view of the above teachings. The embodiments are shown and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various embodiments with various modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the following claims and their equivalents: