IRELAND

PATENTS ACT, 1964

PROVISIONAL SPECIFICATION

"Improvements in and relating to Stable Memory Circuits"

Patent Application by TOLSYS LIMITED, an Irish company of Innovation Centre, Trinity College, College Green, Dublin 2, Ireland.

Introduction

The present invention relates to stable memory circuits such as that described in the co-pending Irish Patent Application No. 2223/89.
In this specification, the phrase "stable memory circuit" is intended to cover a circuit which includes not only banks of stable memory but also control and manager circuits associated with the stable memory banks.
The present invention is directed towards providing a stable memory circuit with improved protection, cacheing, disk and network support, and fully or partially instantiated N-dimensional configurations for performing various functions.
Statements of Invention

According to the invention, there is provided a stable memory circuit comprising means for storing capabilities indicating protection for and function of the portions of stable memory banks.
Ideally, there is a capability for each portion of a memory bank and a method for indicating the capabilities for those portions that each process or activity may access (i.e. which capabilities it "owns").
In one embodiment, there is an object capability table and a process capability table, the object capability table specifying a capability entry for each page of the memory bank, the process capability table storing object capabilities for those pages which may be accessed by each process.
The invention also provides a method of accessing a portion of a memory bank in a stable memory circuit comprising the sub-step of accessing the capability tables to determine whether or not access is allowed to that portion of the memory bank.
According to another aspect of the invention, there is provided a stable cache circuit incorporating a stable memory and a cache controller. In one embodiment, the memory banks may store data for a cache. In another embodiment, both data and tags for a cache may be stored in the memory banks. In a still further embodiment, separate memories are included in the stable memory circuit for storage of cache tags and data.
According to another aspect, the invention provides a stable memory circuit comprising an interface for connection of a data bus with a peripheral device. The peripheral device may be a network or a disk, either silicon or rotating. The interface may be connected to a manager address bus for memory banks of the stable memory circuit or connected to a stable data path between memory banks of the stable memory circuit.

Detailed Description of the Invention

The invention will be more clearly understood from the following description of some embodiments thereof, given by way of example only with reference to the accompanying drawings in which:-

Fig. 1 is a representation of a stable memory circuit such as that described in the co-pending Irish Patent Application No. 2223/89;

Fig. 2 is a representation of one embodiment of the capability mechanism of a stable memory circuit of the invention;

Figs. 3(a) and 3(b) are representations of stable cache circuits incorporating a stable memory;

Fig. 4 is a representation of a stable memory circuit including interfaces for peripheral devices;

Figs. 5 and 6 are views showing network arrays incorporating stable memory circuits.

Referring to the drawings, and initially to Fig. 1, there is illustrated a stable memory circuit such as that described in the co-pending Irish Patent Application No. 2223/89. The stable memory circuit 1 comprises a pair of stable memory banks 2 made up of dual-ported memory circuits such as VRAMs. The memory banks 2 are interconnected by serial data paths 3 in which is connected a stable data engine 4. The stable memory circuit 1 also includes a manager circuit 5 having serial links 6, and a parallel bus interface 7 connected to a parallel bus 8.
Whilst the protection of stable memory can be supported by simple access flags in the address path, here we describe a more elaborate mechanism, where the protection is supported by "capabilities".
Referring now to Fig. 2, one embodiment of the capability mechanism 10 forming part of a stable memory circuit of the invention is illustrated. This capability mechanism 10 uses two tables, namely, an object capability table 14 and a process capability table 15. The object capability table 14 stores a capability entry 16 for each page of the stable memory bank and the process capability table 15 stores a list of the object capabilities for those pages that a process may validly access. As shown, the object capability table is addressed using the memory address. Hash function 18 is applied to the memory address, process identifier, and node identifier in the addressing of the process capability table 15, for which re-hashing 19 may be required. An output line 20 is used for signalling that access may validly proceed and an output line 21 is used if an access may not validly proceed, in which case a "Capability Microtrap" may be generated. In this embodiment, the capability tables 14 and 15 are held within the stable memory banks themselves, each memory board containing entries relevant to its own memory. This allows for automatic expansion of the capabilities as the capacity of each board is increased or as the number of memory boards is increased. The definition and handling of these tables should be kept entirely distinct from those for virtual memory management schemes.

The capability tables may be cached at two levels. Firstly, since the tables may be extensive, the entries in memory at any particular time may only be a cached subset of larger tables that may exist on disk. Secondly, to improve performance the parallel bus interface may cache a subset of these memory-resident entries. Parallel bus interface capability cache misses can be automatically resolved by fetching the relevant entry from memory. If no memory-resident entry exists, a Capability Microtrap can be used to interrupt the manager circuit so that it can respond.
If the miss indicates a capability entry that is still on disk, the manager circuit may cause the entry to be fetched from disk; otherwise it may handle the miss as a fault. Faulting on write may be deliberately enforced after every checkpoint so that a list of those pages modified since the last checkpoint may be collected for each process. This is a very important feature.
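The access check described above can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the page size, the class and method names, and the use of a Python dictionary in place of hash function 18 and re-hashing 19 are all assumptions. For simplicity the sketch records modified pages directly on each permitted write instead of forcing a write fault after every checkpoint.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed; the specification gives no page size


class CapabilityMicrotrap(Exception):
    """Signals that an access may not validly proceed (output line 21)."""


@dataclass(frozen=True)
class ObjectCapability:
    """One entry of the object capability table: one entry per page."""
    page: int
    writable: bool = True


@dataclass
class CapabilityMechanism:
    # Object capability table, addressed by page number.
    object_table: dict = field(default_factory=dict)
    # Process capability table, keyed on (page, process id, node id).
    # The dict stands in for hash function 18 and re-hashing 19.
    process_table: dict = field(default_factory=dict)
    # Pages written since the last checkpoint, per (process id, node id).
    modified: dict = field(default_factory=dict)

    def grant(self, cap, pid, node):
        """Record that process `pid` on `node` owns capability `cap`."""
        self.object_table[cap.page] = cap
        self.process_table[(cap.page, pid, node)] = cap

    def check(self, address, pid, node, write=False):
        """Return True if the access may proceed, else raise a microtrap."""
        page = address // PAGE_SIZE
        cap = self.process_table.get((page, pid, node))
        if cap is None or cap != self.object_table.get(page):
            raise CapabilityMicrotrap(f"no capability for page {page}")
        if write and not cap.writable:
            raise CapabilityMicrotrap(f"page {page} is not writable")
        if write:
            self.modified.setdefault((pid, node), set()).add(page)
        return True

    def checkpoint(self, pid, node):
        """Return and clear the list of pages modified since the last checkpoint."""
        return sorted(self.modified.pop((pid, node), set()))
```

In this model a process that has been granted a capability for a page passes the check, any other process raises `CapabilityMicrotrap`, and `checkpoint` yields the per-process list of modified pages.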
The most important feature is that the control and accessing of the capability tables takes place within the stable memory circuit and accordingly the host processor is not aware of their values. In this way, it is impossible for a host processor to modify any capabilities either deliberately or by accident.
Capability tables define the protection for memory, and imply very little about the stability state of the memory, which is properly maintained within 'stability data structures' for allocated stable memory (stability state is only appropriate to stable memory). However, capabilities should distinguish between stable and non-stable memory, simply to protect the stable memory areas, and the stability data structures can be merged into the capability tables.
"Stable pages" may be allocated either individually or in clusters. Since a stable page is composed from two normal pages, each with its own capability entry, the capability tables for stable pages may be replicated (as stable pages), thereby enhancing fault tolerance. The replication also allows a memory read access of the capability entry to be returned from an alternate bank to that for which the access is intended, although this requires that the address for the alternate bank be derived by such logic that it points to the correct capability entry. These entries may be cached by capability caches and checked by appropriate capability logic at the parallel bus interface, and subjected to error detection and correction in the same way as for data.
The stability data structures and capability entries may initially be set to non-stable by the manager circuit, allowing host processors to consider the stable memory circuit as a normal bank of memory of twice or more the capacity of each memory bank. Stable memory can be allocated on a per-page basis, each stable page occupying the space normally occupied by two or more (not necessarily contiguous) memory pages, thus reducing the total capacity incrementally. When no more free pages exist, the host processor must surrender two or more normal pages for every stable page it asks the manager circuit to allocate, and remap the new stable page, implying that only host processors supporting paged virtual memory schemes may efficiently utilise this stable memory.
It will be appreciated that the use of capability tables in this way protects the memory banks from failed host processors so that existing non-fail-stop processors may be used for low-cost, high-availability applications.
Checkpointing is more difficult if caches are used externally to the stable memory circuit, since dependencies between processes will need to be tracked, and there may be dependencies between data in the various caches that will need to be flushed together. Capabilities can be used to resolve dependencies for write-back external caches. The cache circuit can 'call-in' the outstanding dirty data from write-back external caches for a checkpoint, either by requesting a flush of all dirty data from the external caches that hold related data (related by dependencies), or by selectively flushing by issuing parallel bus reads for this data (the external caches will then intervene on behalf of memory). The capability mechanism can assist in either of these procedures, since the Capability Microtrap service routines may gather a list of all the modified pages of a process since the last checkpoint, the Process Capability Tables may indicate all those external caches and processes that have shared access to those pages, and a centralised chain directory (for the external caches) within the stable memory circuit may indicate which external cache lines in a page will be dirty.
Referring now to Figs. 3(a) and 3(b), cache circuits are illustrated incorporating stable memory circuits of the invention. In Fig. 3(a), there is illustrated a cache circuit 22 incorporating a stable memory circuit 23 which stores both the data and tags for the cache. The cache circuit 22 includes a cache controller 24 connected to a tag cache 25, a data cache 26 and a chain directory cache 27.
In Fig. 3(b), there is illustrated a cache circuit 28 having a stable memory circuit 29 which stores cache tags and a separate stable memory circuit 30 which stores cache data. Each of the cache circuits 22 and 28 could be regarded as a "transparent stable cache" as the stable memory operations would not be monitored or visible to a host processor. Because tags are stored in the tag cache 25 for both cache circuits, a high probability exists that the tags may be found in the tag cache without having to recall them from the associated stable memory circuit. If the tag cache is a write-through cache, then the cache controller 24 may update the tags at the tag cache 25 and leave the tag cache 25 to update the stable memory circuit 23 or 29, as the case may be. Similarly, the stable memory circuit 23 or 30 is supplemented by the write-through or blocking data cache 26, which will not affect cache coherency but may affect speed.
Additional external caches may be used provided they have write-through or blocking protocols. When only a single parallel bus interface is used, additional external write-through or blocking caches will not affect cache coherency but may affect speed. Where multiple parallel bus interfaces are used, centralised "chain directories" (for the external caches) for each parallel bus interface can indicate which cache lines are held externally, and hence which accesses to the stable memory circuit must be propagated forward to allow invalidation of the relevant external caches attached to that parallel bus interface. Performance may be greatly improved by maintaining these chain directories in the chain directory cache 27.
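The role of a chain directory can be illustrated with a small software model. The data layout and the names `ChainDirectory`, `record_fill` and `on_write` are hypothetical; the specification does not define the directory format. The sketch assumes write-through external caches, so the writing cache keeps its own copy of the line.

```python
from collections import defaultdict


class ChainDirectory:
    """Tracks which external caches hold each cache line (assumed layout)."""

    def __init__(self):
        # line address -> set of identifiers of external caches holding it
        self._holders = defaultdict(set)

    def record_fill(self, line, cache_id):
        """Note that external cache `cache_id` now holds `line`."""
        self._holders[line].add(cache_id)

    def on_write(self, line, writer=None):
        """Return the external caches that must be invalidated for this write.

        An empty list means the access need not be propagated forward.
        """
        holders = self._holders.pop(line, set())
        targets = holders - {writer}
        if writer in holders:
            # The writing cache still holds an up-to-date copy.
            self._holders[line].add(writer)
        return sorted(targets)
```

A write to a line held by caches "A" and "B", issued through "A", yields `["B"]`; a write to a line that no external cache holds yields an empty list, so no invalidation traffic is generated.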
The cache circuits may be used as the only memory in the computing system, which allows all system memory, including caches, to be stable. This simplifies the allocation and management of stable memory, as well as simplifying recovery after a fault.
Referring now to Fig. 4, there is illustrated a stable memory circuit, in which parts similar to those described with reference to Fig. 1 are identified by the same reference numerals. The stable memory circuit in Fig. 4 includes a disk interface 31 connected to silicon disks 32 and to rotating disks 33. The parallel bus 8 is also connected to a network interface 34 for a network 35. The stable data path 3 is connected to a disk interface 36 for silicon disks 37 and rotating disks 38. The stable data path 3 is also connected to a network interface 39 for a network 40. Thus, the stable memory circuit described in Fig. 4 may be referred to as a "transparent stable disk".
It will be appreciated that the connection of the stable data paths to disks and to a network yields very high bandwidth transparent disk and network interfaces. Accordingly, operations such as network communications may be performed by the memory itself and may be exploited in any of the ways that a stable memory circuit may be. Multiple disk and network interfaces can provide redundancy for fault tolerance.
This is an important feature of the invention as it allows disks to be invisible to a host processor, resulting in lower processor overhead and more efficient utilisation of disks, especially rotating disks. Regarding the rotating disks, the disk drive head will generally be operated in a more efficient manner when controlled via the stable memory circuit as sectors of the disk can be written in sequence. This can be done by aggregating disk writes in a buffer in stable memory, and writing these to disk only when absolutely necessary, and even then writing in large blocks at the end of a log, so that the disk is written progressively from the first track to the last. This increases system performance. In previous memory-based logging file systems the memory was not fault-tolerant, but in this case, since the log is supported by the stable memory circuit, it is resilient to failures, and hence it is safe to keep some committed data only in the log.
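The aggregation of disk writes can be sketched as below. This is a simplified model under stated assumptions: the class name `StableWriteLog`, the `block_sectors` threshold and the in-memory list standing in for the disk log are all hypothetical, and the flush policy (flush when the buffer holds a full block) is only one possible reading of "when absolutely necessary".

```python
class StableWriteLog:
    """Aggregates sector writes in a buffer and appends them to a log in blocks."""

    def __init__(self, block_sectors=4):
        self.block_sectors = block_sectors  # assumed size of one large block
        self.buffer = {}    # sector -> latest data; models the stable memory buffer
        self.disk_log = []  # blocks appended in sequence; models the disk log

    def write(self, sector, data):
        """Buffer a write; a repeated write to a sector replaces the earlier one."""
        self.buffer[sector] = data
        if len(self.buffer) >= self.block_sectors:
            self.flush()

    def flush(self):
        """Append all buffered sectors to the end of the log as one block."""
        if self.buffer:
            self.disk_log.append(sorted(self.buffer.items()))
            self.buffer = {}

    def read(self, sector):
        """Return the newest data for a sector, checking the buffer first."""
        if sector in self.buffer:
            return self.buffer[sector]
        for block in reversed(self.disk_log):
            for logged_sector, data in block:
                if logged_sector == sector:
                    return data
        return None
```

Because repeated writes to the same sector overwrite each other in the buffer, only the latest version reaches the disk, and each flush is a single sequential append.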
With its strong support for cacheing, the stable memory circuit of the invention may form the core of the nodes of a fully or partially instantiated N-dimensional array network computing system, mapped into a global address space and supporting a system cacheing mechanism that both maintains global cache coherency and minimises the traffic on the system interfaces. An internal write-through or blocking cache, or additional external write-through caches, will not affect cache coherency, but may affect speed. The capabilities may be used to allow external write-back cacheing.
N-dimensional array networks may be constructed in two ways, both of which are most easily explained for the fully instantiated two-dimensional case where a two-dimensional array of busses will arise. Firstly, referring to Fig. 5 which illustrates a stable memory circuit bus network 45, the busses include internal stable memory circuit parallel busses 46 (also 8), with a parallel bus 47 bridging the array bus intersections. Host processors 48 are attached to the intersection busses or stable memory circuit processors 49 are attached to the array busses. Although stable memory is attached to the array busses, it is associated with the node at the intersection of these array busses.
Referring now to Fig. 6, a parallel bus network 50 is illustrated in which the roles of the stable memory circuit and parallel busses are swapped so that the array busses are the parallel busses 51 with each stable memory circuit bridging the array bus intersections. Either stable memory circuit processors 52 are attached to the intersection busses, or host processors 53 are attached to the intersection busses via additional manager circuits, or host processors are attached to the array busses. Since stable memory is attached to the intersection busses, it is naturally associated with the node at the intersection of the array busses.
This is a more expensive arrangement than that shown in Fig. 5 as every node requires at least two manager circuits instead of one. However, tolerance to faults increases, since each node has duplicated intersection busses, and so can survive a failure of any component relating to one of those busses. Moreover, the capability mechanism allows the stable memory to be better isolated against array bus failures, particularly those that might propagate random behaviour. For these reasons the parallel bus network might be better suited for highly reliable systems. Either form will give the same functionality. Because the stable memory attaches to the array busses, the stable memory circuit bus network will result in much greater array bus traffic. This need not be critical, since all memory does not need to be stable, and non-stable memory may be attached to the intersection bus to reduce the usage of array bus bandwidth. To extend the array from two to N dimensions, more manager circuits must be added, thereby increasing the number of array busses. Anyone of ordinary skill in the art will recognise that most other network topologies, such as hierarchical networks, can be formed from a partially instantiated N-dimensional array network.
In addition to fully and partially instantiated N-dimensional array networks constructed in the above fashion, the stable data paths may be connected in a fully or partially instantiated N-dimensional array network, with a plurality of stable data engines bridging a plurality of stable data paths, each stable data engine connecting to a plurality of memory banks.
In one embodiment that has been constructed, one dimension of a two-dimensional array of stable data paths interconnects stable data engines that connect to two memory banks in the other dimension. Again, anyone of ordinary skill in the art will recognise that most other network topologies can be formed from a partially instantiated N-dimensional array network.
Since transfer of information within a stable memory circuit occurs via the stable data paths, while normal random accesses to the memories can continue to occur concurrently (i.e. transparently), the stable data paths can be intercepted or observed by a stable data engine configured in specific ways, for example, configured as a 'vector engine' to allow trivial and non-trivial vector-related functions to be applied to the stable memory data. Trivial functions can include summing or scaling a vector in one bank, and non-trivial functions can include FIR and IIR filtering functions, or for example, the computation for the response of a neuron (as a vector multiply-accumulate). Various other configurations are also described below.
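The vector functions named above can be written out as a short software sketch. The function names are illustrative only, plain Python lists stand in for vectors held in a memory bank, and nothing here reflects how a stable data engine would actually be wired; the bias term in the neuron response is an added assumption.

```python
def vector_sum(v):
    """Trivial function: sum the elements of a vector."""
    return sum(v)


def vector_scale(v, k):
    """Trivial function: scale every element of a vector by k."""
    return [k * x for x in v]


def fir_filter(x, taps):
    """Non-trivial function: FIR filter, y[n] = sum over k of taps[k] * x[n-k].

    Samples before the start of x are taken as zero.
    """
    return [
        sum(t * x[n - k] for k, t in enumerate(taps) if n - k >= 0)
        for n in range(len(x))
    ]


def neuron_response(inputs, weights, bias=0.0):
    """Neuron response as a vector multiply-accumulate (bias is an assumption)."""
    return bias + sum(w * i for w, i in zip(weights, inputs))
```

For example, `fir_filter([1, 2, 3, 4], [0.5, 0.5])` gives `[0.5, 1.5, 2.5, 3.5]`, a two-tap moving average, and `neuron_response([1, 2, 3], [4, 5, 6])` gives `32.0`.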
The stable memory circuit may be configured to support both generation-based and copying 'transparent garbage collectors', and indeed for an asynchronous garbage collection of directed graphs of heap allocated storage for functional or object based language environments. In cases where the locations of references to allocated memory objects are discernible to the stable memory circuit, it can then trace these asynchronously, identifying and collecting garbage, and optionally compacting the heap region as a result.

For stable memory circuit systems with only two banks of memory any such garbage collection is only really efficient for large objects, which can be collected transparently using the stable memory circuit block copying mechanisms. If, however, more than two banks of memory are available, and if any one of these can be exclusively allocated to the Garbage Collector (either the manager or some specialised device) at any time, then the heap may be (transparently) block copied to this bank, where the Garbage Collector can (transparently) identify and collect garbage, and optionally compact the heap. When finished, either the bank can simply be reallocated for access by the host processors, or the heap can be (transparently) block copied back to whence it came; either way the result is a new heap. A post process will then need to gather any intervening heap changes and update the new heap accordingly.
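The identify, collect and compact steps can be illustrated by a minimal copying collector. The heap representation (a dictionary from address to a payload and a list of referenced addresses) and the function name `collect` are assumptions made for this sketch; the post process that merges intervening heap changes is not modelled.

```python
def collect(heap, roots):
    """Copy every object reachable from `roots` into a new, compacted heap.

    heap:  dict mapping address -> (payload, list of referenced addresses)
    roots: list of addresses known to be live
    Returns (new_heap, new_roots) with addresses renumbered from 0.
    """
    forwarding = {}  # old address -> new address
    order = []       # old addresses in the order they were reached
    stack = list(roots)
    while stack:
        addr = stack.pop()
        if addr in forwarding:
            continue  # already traced
        forwarding[addr] = len(order)
        order.append(addr)
        stack.extend(heap[addr][1])  # trace the object's references
    new_heap = {
        forwarding[addr]: (heap[addr][0], [forwarding[r] for r in heap[addr][1]])
        for addr in order
    }
    return new_heap, [forwarding[r] for r in roots]
```

Objects that are not reachable from the roots are simply never copied, so the new heap contains only live data at consecutive addresses.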
Note that if concurrent usage of the 'old' heap is not required then the bank holding the heap can simply be allocated to the Garbage Collector, and no post processing will be required.
Consistency (within a database) is often maintained by atomic commit operations using Stable Storage, where performance is limited by the disk characteristics. Ultimately this limits the number of Transactions-Per-Second (TPS), generally considered a vital figure of merit for database systems. A stable memory circuit can be configured to perform fast and fault tolerant transparent atomic commits by buffering updates in stable memory, giving dramatically increased TPS.
The stable memory circuit can be configured as a buffer pool for database segments and persistent memory. Such segments can be mapped on demand by the stable memory circuit if it intercepts accesses to segments, translating a segment name to an appropriate position in the buffer pool. Active segments may be pinned while in use, and unpinned segments may be flushed. The stable memory circuit can be used to provide efficient support for this mechanism by converting segment names to buffer pool addresses and managing the buffer pool. Since the buffer is supported by the stable memory circuit, it is resilient to failures, and hence it is safe to keep some persistent data only in the buffer.
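A buffer pool with on-demand mapping and pinning might be modelled as follows. The class name, the slot numbering, the `backing` dictionary standing in for disk, and the choice of which unpinned segment to flush are all assumptions; the specification does not prescribe a replacement policy.

```python
class BufferPool:
    """Maps segment names to buffer pool slots on demand, with pinning."""

    def __init__(self, slots, backing):
        self.backing = backing          # segment name -> data; stands in for disk
        self.frames = {}                # segment name -> [slot, data, pin count]
        self.free = list(range(slots))  # unused slot numbers

    def access(self, name):
        """Translate a segment name to a slot, mapping it if needed, and pin it."""
        if name not in self.frames:
            if not self.free:
                self._flush_one_unpinned()
            slot = self.free.pop()
            self.frames[name] = [slot, self.backing[name], 0]
        frame = self.frames[name]
        frame[2] += 1
        return frame[0]

    def unpin(self, name):
        """Release one pin on a segment so that it may later be flushed."""
        frame = self.frames[name]
        if frame[2] == 0:
            raise ValueError(f"segment {name!r} is not pinned")
        frame[2] -= 1

    def _flush_one_unpinned(self):
        """Write one unpinned segment back and free its slot."""
        for name, (slot, data, pins) in list(self.frames.items()):
            if pins == 0:
                self.backing[name] = data
                del self.frames[name]
                self.free.append(slot)
                return
        raise RuntimeError("all segments are pinned")
```

A pinned segment is never chosen for flushing, so an access that finds every slot occupied by pinned segments fails rather than evicting data that is in use.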
The stable memory circuit can be configured to provide efficient support for log based recovery for recoverable memory by reducing disk writes for frequently committed data. The stable memory circuit can be organised as a stable cyclic buffer log, in which multiple versions of the same data may be found, the most recent of which is near the front of the log.
Committed data items may be flushed from the log asynchronously, omitting disk writes for data for which there is a later committed version in the log.
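The saving in disk writes can be seen in a small model of such a log. The class name `StableCyclicLog`, the fixed capacity and the dictionary standing in for the disk are assumptions made for illustration; the newest entries sit at the right-hand end of the queue.

```python
from collections import deque


class StableCyclicLog:
    """A bounded log of committed (key, value) versions, flushed oldest first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()  # oldest at the left, most recent at the right
        self.disk = {}          # stands in for the disk
        self.disk_writes = 0

    def commit(self, key, value):
        """Append a committed version, flushing the oldest entry if the log is full."""
        if len(self.entries) >= self.capacity:
            self.flush(1)
        self.entries.append((key, value))

    def flush(self, count=None):
        """Flush up to `count` oldest entries (all entries if count is None)."""
        n = len(self.entries) if count is None else min(count, len(self.entries))
        for _ in range(n):
            key, value = self.entries.popleft()
            if any(k == key for k, _value in self.entries):
                continue  # a later committed version is still in the log
            self.disk[key] = value
            self.disk_writes += 1

    def read(self, key):
        """Return the most recent version, preferring the log over the disk."""
        for k, v in reversed(self.entries):
            if k == key:
                return v
        return self.disk.get(key)
```

Committing five versions of one item and one version of another, then flushing the whole log, results in two disk writes instead of six.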
The invention is not limited to the embodiments hereinbefore described but may be varied in construction and detail.
Dated this 19th day of February 1991

CRUICKSHANK & CO.,
BY: EXECUTIVE
Agents for the Applicant,
1, Holles Street,
Dublin 2.