Our CDC 6500 was basically two CDC 6400 CPUs in one cabinet, sharing memory and I/O. The 6000 CPU was RISC long before it was popular to have a reduced instruction set. The CPU was usually said to have around 74 instructions (the exact number depends on how you count 'em), but by modern standards the number was less than that. The rough number 74 counts each of 8 addressing modes three times, whereas you could reasonably say that an addressing mode shouldn't be counted as a separate instruction at all. Despite the lean instruction set, there were few complaints about the instruction set missing instructions.
Arithmetic was 1's complement. This was sometimes inconvenient, because a word of all 1's tested as zero, just as did a word of all 0's. Even then, 2's complement (in which a word of all 1's had a value of -1) was more common - and now 2's complement is nearly universal. One of the few advantages of 1's complement arithmetic was that taking the negative of a number involved simply inverting all the bits, whereas in 2's complement, you need to invert all bits and then add 1. A computer science professor once told us that Control Data chose the inferior 1's complement approach because 2's complement was patented, and this web page seems to confirm that.
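The two properties described above (negation by bit inversion, and two representations of zero) can be sketched in a few lines of Python. This is only an illustration of 60-bit 1's complement arithmetic in general; the function names are made up here and nothing below models the actual CDC adder.

```python
WORD_BITS = 60
MASK = (1 << WORD_BITS) - 1   # a word of all 1's


def negate(word):
    """1's complement negation: invert every bit of the 60-bit word."""
    return ~word & MASK


def to_int(word):
    """Interpret a 60-bit 1's complement word as an ordinary Python int."""
    if word >> (WORD_BITS - 1):          # sign bit set, so the value is negative
        return -(~word & MASK)
    return word


def is_zero(word):
    """Both all 0's (+0) and all 1's (-0) count as zero."""
    return word in (0, MASK)


def negate_twos_complement(word):
    """For contrast: 2's complement needs the inversion plus an add of 1."""
    return (~word + 1) & MASK
```

Here `negate(0)` yields the all-1's word, which `is_zero` still accepts, whereas in 2's complement the same bit pattern would mean -1.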
Central memory (CM) was organized as 60-bit words. There was no byte addressability. If you wanted to store multiple characters in a 60-bit word, you had to shift and mask. Typically, a six-bit character set was used, which meant no lower-case letters. Our site invented a 12-bit character set, which was basically 7-bit ASCII with 5 wasted bits. Other sites used special shift/unshift characters in a 6-bit character set to achieve upper/lower case. The short-lived Cyber 70 Series which followed the 6000 Series added a Compare and Move Unit (CMU) which did complex character handling in hardware. The CMU was not used much, probably due to compatibility concerns. The CMU was such a departure from the 6000's lean and mean instruction set that the CDC engineers must have been relieved to be able to omit it from the next line of computers, the Cyber 170 Series.
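To give a feel for the shift-and-mask work this implied, here is a small Python sketch that packs ten 6-bit character codes into one 60-bit word and pulls one back out. The helper names and the choice to put the first character in the leftmost (most significant) position are assumptions made for the example.

```python
CHAR_BITS = 6
CHARS_PER_WORD = 10                    # 10 * 6 = 60 bits
CHAR_MASK = (1 << CHAR_BITS) - 1       # 0o77


def pack(codes):
    """Pack up to 10 six-bit codes into a 60-bit word, first code leftmost."""
    if len(codes) > CHARS_PER_WORD:
        raise ValueError("at most 10 six-bit characters fit in a 60-bit word")
    word = 0
    for i, code in enumerate(codes):
        if not 0 <= code <= CHAR_MASK:
            raise ValueError("character code does not fit in 6 bits")
        shift = (CHARS_PER_WORD - 1 - i) * CHAR_BITS
        word |= code << shift
    return word


def extract(word, index):
    """Shift the wanted character down to the bottom and mask off the rest."""
    shift = (CHARS_PER_WORD - 1 - index) * CHAR_BITS
    return (word >> shift) & CHAR_MASK
```

Unused character positions are simply left as zero in this sketch.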
CM addresses were 18 bits wide, though I believe that in the original 6000 line, the sign bit had to be zero, limiting addresses to 17 bits. Even without the sign bit problem, though, the amount of addressable central memory was extremely limited by modern standards. A maxed-out 170 Series from around 1980 was limited to 256K words, which in total bits is slightly less than 2 megabytes (using 8-bit bytes purely as a means to compare with modern machines). In the early days, 256K words was more than anyone could afford, but eventually this addressing limit became a real problem. CDC never found a way around it.
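The "slightly less than 2 megabytes" figure is easy to check. The short Python calculation below uses only the numbers given above (18-bit addresses, 60-bit words); the function name is invented for the example.

```python
ADDRESS_BITS = 18
BITS_PER_WORD = 60


def cm_size_in_bytes(words):
    """Size of `words` 60-bit words, expressed in 8-bit bytes for comparison."""
    return words * BITS_PER_WORD // 8


max_words = 1 << ADDRESS_BITS            # 262,144 words, i.e. 256K
max_bytes = cm_size_in_bytes(max_words)  # 1,966,080 bytes
two_megabytes = 2 * 1024 * 1024          # 2,097,152 bytes

print(max_words, max_bytes, max_bytes / two_megabytes)
```

So a full 256K words comes to 1,966,080 bytes, about 94% of 2 megabytes.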
The closest there was to a workaround was the Extended Core Storage (ECS) unit. This was auxiliary memory made from the same magnetic cores of which CM was fabricated. (More recent versions of ECS were named ESM, Extended Semiconductor Memory.) ECS was accessible only by block moves to or from CM. I can't remember the address width of ECS, but it was much larger than 18 bits. But not being able to run programs or directly access data from ECS meant it was used mostly to store operating system tables or to swap programs.
I say "swap" programs because there was no virtual memory on the machine. Memory management was primitive. Each user program had to be allocated a single region of contiguous memory. This region started at the address in the RA (Reference Address) register and went for a certain number of words, as dictated by the contents of the FL (Field Length) register. The CPU hardware always added the contents of the RA register to all address references before the memory access was made; as far as the program was concerned, its first address was always 0. Any attempt to access memory >= FL resulted in a fatal error.
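The RA/FL scheme amounts to one bounds check and one addition per memory reference. A minimal Python sketch follows, with an invented exception class and function name standing in for what the hardware did.

```python
class FieldLengthError(Exception):
    """Stand-in for the fatal error raised on an out-of-range reference."""


def translate(address, ra, fl):
    """Map a program-relative address to an absolute CM address.

    The program sees addresses 0 .. fl-1; the hardware adds RA to each one.
    """
    if not 0 <= address < fl:
        raise FieldLengthError(f"address {address:o} outside field length {fl:o}")
    return ra + address


# A program loaded at absolute 40000 (octal) with a field length of 1000 (octal).
print(oct(translate(0, ra=0o40000, fl=0o1000)))      # 0o40000
print(oct(translate(0o777, ra=0o40000, fl=0o1000)))  # 0o40777
```

Because the program only ever uses relative addresses, the operating system can move it and change RA without the program noticing.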
As programs came and went from CM, holes opened up between regions of memory. To place programs optimally in memory, an operating system had to suspend the execution of a program, copy its field length to close up a gap, adjust the RA register to point to the program's new location, and resume execution. On the 6500, it was actually faster to do a block move to ECS and then a block move from ECS than it was to move memory in a tight loop coded with the obvious load and store instructions. This changed with the Cyber 170/750--at least at our site, which retained its old core-based ECS even when it upgraded to the 750.
Incidentally, the CPU enforced access to ECS in much the same way as it did to CM. There were two registers specifying the beginning address and number of words of the single region of ECS to which the CPU had access at any time. At our site, user programs always had an ECS field length of zero. Users weren't allowed access to ECS at all because it was felt that the OS could make better use of that resource.
The 6000 CPU had a load/store architecture: data in memory could be referenced only by load and store instructions. To increment a memory location, then, you had to execute at least three instructions: load from memory, do an add, and store to memory.
Memory access was interleaved. I believe that the 6500's memory was divided into 16 independent banks, so usually the CPU did not have to wait for a memory cycle to complete before starting a new one. I think that the 750 only had 4-way interleave. This sounds like a step down from the 6500. However, it may have been unnecessary to interleave to such a high degree on the more recent 750, since it had semiconductor memory as opposed to the 6500's slower core memory.
In addition to the obvious program counter (P register), the 6000 Series had 24 user-accessible CPU registers. There were 3 types of registers, 8 of each type: A, B, and X. Registers of each type were numbered 0-7.
X registers were 60 bits wide and were general-purpose data registers. Most instructions operated only on X registers.
"A" registers were 18-bit address registers with a strange relationship to X registers: loading a value (let's call it m) into any register A1 - A5 would cause the CPU to load the correspondingly-numbered X register from memory location m. Loading A6 or A7 with m would cause the correspondingly-numbered X register to be stored at that location. This was the only way that data could be moved between any register and memory.
A0 was a pretty worthless register. I believe that by convention, code generated by FORTRAN kept a pointer to the beginning of the current subroutine in A0, to aid in subroutine traceback in case an error occurred. Similarly, X0 was not too useful, as it could neither be loaded from nor stored to memory directly. However, it was moderately useful for holding intermediate results.
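The coupling between A and X registers is easier to see in a toy model. The Python sketch below is only an illustration: the `Cpu` class and its method names are invented, RA/FL checking is left out, and word width and 1's complement behavior are ignored. It also shows the three-step load, add, store sequence needed to increment a memory location.

```python
class Cpu:
    """Toy model of the A/X register side effects; not a real emulator."""

    def __init__(self, memory_words):
        self.mem = [0] * memory_words
        self.A = [0] * 8
        self.X = [0] * 8

    def set_a(self, n, m):
        """Set register An to address m, with the memory side effect."""
        self.A[n] = m
        if 1 <= n <= 5:
            self.X[n] = self.mem[m]      # A1-A5: load Xn from location m
        elif n in (6, 7):
            self.mem[m] = self.X[n]      # A6-A7: store Xn to location m
        # A0: no memory reference at all, so X0 never touches memory directly


def increment(cpu, m):
    """Increment memory location m the load/store way: three steps."""
    cpu.set_a(1, m)                  # load:  X1 <- mem[m]
    cpu.X[6] = cpu.X[1] + 1          # add:   X6 <- X1 + 1
    cpu.set_a(6, m)                  # store: mem[m] <- X6
```

For example, after `cpu = Cpu(16); cpu.mem[5] = 41; increment(cpu, 5)`, location 5 holds 42.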
The B registers were index registers that could also be used for light-duty arithmetic.B registers tended to not get used a whole lot because
B0 was hardwired to 0. Any attempt to set B0 was ignored by the CPU. In fact, on some CPUs, it was faster to execute a 30-bit instruction to load B0 with a constant than it was to execute two consecutive no-ops (which were 15-bit instructions). Therefore, if you had to "force upper" by 30 or more bits, it made sense to use a 30-bit load into B0. Fortunately, the assembler did force uppers automatically when necessary, so programmers were generally isolated from those details.
Many programmers felt that CDC should also have hardwired B1 to 1, since there was no increment or decrement instruction. Since there was no register hardwired to 1, many assembly language programs started with "SB1 1", the instruction to load a 1 into B1.
Instructions in the CPU were 15 or 30 bits. The 30-bit instructions contained an 18-bit constant. Usually this was an address, but it could also be used as an arbitrary 18-bit integer. From the point of view of the instruction decoder, each 60-bit word was divided into four 15-bit instruction parcels. While up to four instructions could be packed into a 60-bit word, instructions could not be broken across word boundaries. If you needed to execute a 30-bit instruction and the current position was 45 bits into a word, you had to fill out the word with a no-op and start the 30-bit instruction at the beginning of the next word. I suspect that the 6000 Series made heavier use of its no-op instruction (46000 octal) than nearly any other machine. No-ops were also necessary to pad out a word if the next instruction was to be the target of a branch. Branches could be done only to whole-word boundaries. The act of inserting no-ops to word-align the next instruction was called doing a "force upper".
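The packing rules above can be expressed as a small layout routine. This Python sketch is a simplification under stated assumptions: it always pads with 15-bit no-ops (labeled "NO" here), it leaves the final word unpadded, and the instruction labels are placeholders rather than real mnemonics. It is not how the actual assembler was written.

```python
WORD_BITS = 60
PARCEL_BITS = 15


def layout(instructions):
    """Assign instructions to 60-bit words, inserting no-ops where required.

    `instructions` is a list of (label, bits, is_branch_target) tuples,
    where bits is 15 or 30. Returns a list of words, each a list of labels.
    """
    words = [[]]
    used = 0                                  # bits filled in the current word
    for label, bits, is_branch_target in instructions:
        if bits not in (15, 30):
            raise ValueError("instructions are 15 or 30 bits")
        must_align = is_branch_target and used > 0
        will_not_fit = used + bits > WORD_BITS
        if must_align or will_not_fit:
            # "Force upper": fill out the word with no-ops, start a new word.
            words[-1].extend(["NO"] * ((WORD_BITS - used) // PARCEL_BITS))
            words.append([])
            used = 0
        words[-1].append(label)
        used += bits
    return words


program = [
    ("short1", 15, False),
    ("short2", 15, False),
    ("short3", 15, False),
    ("long1", 30, False),     # 45 bits used, so this cannot fit
    ("target", 15, True),     # branch target, so it must start a word
]
print(layout(program))
```

The example program comes out as three words: the first padded with one no-op before `long1`, and the second padded with two no-ops so that `target` starts on a word boundary.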
There was no condition code register in the 6000 Series. Instructions that did conditional branches actually did the test and then branched on the result. This, of course, is in contrast to many architectures such as the Intel x86, which uses a condition code register that stores the result of the last arithmetic operation. When I learned about condition code registers years after first learning the 6000 architecture, I was shocked. Having a single condition code register seemed to me to be a significant potential bottleneck. It would make execution of multiple instructions simultaneously very difficult. I still think that having a single condition code register is stupid, but I must admit that the Intel Pentium Pro and successors, for instance, are pretty darned fast anyway.
The instruction set included integer (I), logical (B), and floating-point (F) instructions. The assembler syntax was different from that of most assemblers. There were very few different mnemonics; differentiation amongst instructions was done largely by operators. Arithmetic instructions were mostly three-address; that is, an operation was performed on two registers, with the result going to a third register. (Remember that the 6000's load/store architecture precluded working with memory-based operands.) For instance, to add two integers in X1 and X5 and place the result in X6, you did:
IX6 X1+X5
A floating-point multiplication of X3 and X7, with the result going to X0, would be:
FX0 X3*X7
An Exclusive Or of X6 and X1, with the result going to X6, would be:
BX6 X6-X1
Initially, there was no integer multiply instruction. Integer multiply was added to the instruction set pretty early in the game, though, when CDC engineers figured out a way of using existing floating-point hardware to implement the integer multiply. The downside of this clever move was that the integer multiply could multiply only numbers that could fit into the 48-bit mantissa field of a 60-bit register. If your integers were bigger than 48 bits, you'd get unexpected results.
You'd think that 60-bit floating-point numbers (1 sign bit, 11-bit exponent including bias, 48-bit bit-normalized mantissa) would be large enough to satisfy anyone. Nope: the 6000 instruction set, lean as it was, did include double precision instructions for addition, subtraction, and multiplication. They operated on 60-bit quantities, just as single precision numbers; the only difference is that the double precision instructions returned a floating point number with the 48 least-significant bits, rather than the 48 most-significant bits. So, double precision operations--especially multiplication and division--required several instructions to produce the final 120-bit result. Double precision numbers were just two single precision numbers back-to-back, with the second exponent being essentially redundant. It was a waste of 12 bits, but you still got 96 bits of precision.
You can tell that floating point was important to CDC when you consider that there were separate rounding versions of the single precision operations. These were rarely used, for some reason. The non-rounding versions needed to be in the instruction set because they were required for double-precision work. The mnemonic for double precision operations was D (as in DX7 X2*X3) and for rounded operations was R.
Another instruction that is surprising to find in such a lean instruction set was Population Count. This instruction counted the number of 1 bits in a word. CX6 X2, for instance, would count the number of 1 bits in X2 and place the result in X6. This was the slowest instruction on most 6000 machines. Rumor had it that the instruction was implemented at the request of the National Security Agency for use in cryptanalysis.
For more details, see the CDC 6000 Instruction Set.
In addition to one or two CPUs, all CDC 6000-style machines had at least 10 peripheral processors (PPs). As the name implies, these were simple built-in computers with architectures oriented toward doing I/O. However, in practice, much of the operating system was also implemented in PPs.
Each PP had 4096 words of 12 bits each. These 12-bit units were often referred to as bytes, to distinguish them from the 60-bit CPU words. However, I found the terminology a bit misleading, as some people referred to the 6-bit characters used on the CDC systems as bytes. Each PP had its own 4096-byte memory. There was no way to directly access another PP's memory, though you could have two PPs talk to each other over an I/O channel.
The original 6000 machines had one bank of 10 PPs. Later machines typically had two banks of 12 for a total of 24 PPs. The fact that on some machines, PPs were divided into multiple banks was a hardware implementation issue and was not programmer-visible.
PPs had instruction sets that were reminiscent of the PDP-8 or the later Motorola 6800. There was only one data register, an 18-bit A register, and there was a 12-bit instruction pointer P. Instead of index registers and the like, the architecture provided easy access to the first 100 octal (that's 100B in CDC talk) memory locations. These were called "direct cells". Many PP instructions had a 6-bit field that referred to a direct cell, and used direct cells in much the way you'd use real registers. There was also a Q register, which was used internally and was not generally programmer-visible. I think it was used to manage multiple-word transfers to/from central memory.
One little-mentioned fact regarding peripheral processors is that they were virtual processors. A single physical arithmetic and logic unit did the actual work for 10 (or on later machines, 12) PPs. A set of 10 PPs would consist of 10 4096-byte memories and 10 sets of registers. A single ALU would service each PP in a round-robin fashion in an arrangement referred to as a "barrel". Early 6000 machines had one PP ALU implementing 10 PPs; later machines had two implementing 12 each for a total of 24 PPs.
PPs could read and write any central memory location at will; hence, you didn't want users to be able to run their own PP programs. The simple Reference Address + Field Length memory protection enforced on the CPU did not apply to PPs. The need to read/write CM was the reason for the PP's A register being 18 bits long. As you can imagine, computing 18-bit CM addresses on a 12-bit machine was tedious. Systems programmers had powerful assembly language macros to ease the task.
PPs did I/O by attaching to channels and performing input or output 12 bits at a time. A 6000 CPU did not have I/O instructions (though the later 7600 did). Thus, PPs were utterly crucial for running a system.
For a CPU program to do input, it would have to get a PP program to read data from a peripheral device into PP memory, and then turn around and write the data into central memory. Central memory could be written only in units of 60-bit CM words, so PPs transferred data to/from CM 5 bytes (5*12=60) at a time.
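The 5-bytes-per-word relationship can be illustrated with a pair of conversion helpers. This is a Python sketch with invented function names; it assumes the first PP byte lands in the most significant 12 bits of the CM word, which is an assumption made for the example rather than something stated above.

```python
BYTE_BITS = 12
BYTES_PER_WORD = 5                     # 5 * 12 = 60
BYTE_MASK = (1 << BYTE_BITS) - 1       # 0o7777


def bytes_to_word(pp_bytes):
    """Assemble five 12-bit PP bytes into one 60-bit CM word."""
    if len(pp_bytes) != BYTES_PER_WORD:
        raise ValueError("a CM word is exactly five 12-bit bytes")
    word = 0
    for b in pp_bytes:
        if not 0 <= b <= BYTE_MASK:
            raise ValueError("PP byte does not fit in 12 bits")
        word = (word << BYTE_BITS) | b
    return word


def word_to_bytes(word):
    """Split a 60-bit CM word back into five 12-bit PP bytes."""
    return [
        (word >> (BYTE_BITS * (BYTES_PER_WORD - 1 - i))) & BYTE_MASK
        for i in range(BYTES_PER_WORD)
    ]
```

A round trip such as `word_to_bytes(bytes_to_word([1, 2, 3, 4, 5]))` gives back the original five bytes.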
CDC mainframes were operated from a proprietary console that included a keyboard, a deadstart (reboot) button, and two displays. On the 6000 Series, the two displays were identical round CRTs adjacent to each other; on the Cyber 700 Series, there was one larger CRT that normally displayed the two logically distinct screens. There was a rocker switch on the 700 console to select the left screen, the right screen, or both. Displaying just the left or right screen resulted in larger characters. But the CRT was so big that we always left the display set to show both screens on the tube.
Just what was displayed on the screens was completely under program control; see DSD: Operator console and Console Commands. The console was the only device in the world whose native character set was Display Code. The number of displayable characters was about 48: the 26 upper-case letters, the 10 digits, and a few punctuation symbols. Since Display Code was a 6-bit character set, this left about 16 characters over. These characters were used to implement a very simple graphics mode. In graphics mode, the only operation you could perform was to place a dot on the screen.
The console display had two characteristics that would surprise most modern-day developers. For one thing, characters were drawn "calligraphically". That is, unlike televisions and most modern CRTs, the screen was not scanned left-to-right and top-to-bottom. Instead, the beam was moved around to and fro in response to commands received on the I/O channel. When the console was told to draw a character, it moved the electron beam around the same way a human would move a pencil. It drew fully-formed characters, not characters made of dots.
Secondly, the console had no memory of what it had just drawn. Characters and graphics stayed on the screen only as long as the persistence of the phosphor. For a screen to stay displayed, the controlling peripheral processor had to send the same display commands again and again, constantly. This meant that in practice, you needed to have a PP completely dedicated to driving the console. Even with a dedicated PP, character-only screens typically flickered somewhat and screens containing graphics flickered extensively.
The need to constantly refresh the screen had some advantages. Areas of the screen could be made brighter, or could be made to blink, simply by refreshing them more or less often. The usual PP program that drove the console used this feature in an innovative way: When the operator had typed enough characters to uniquely identify a command, a rippling effect was created by varying the intensity of successive characters over time.
Drawing characters and dots on the screen was done like this.
First, you would select dot mode or character mode; if character mode, you would also select the size (small, medium, large -- 8, 16, or 32 dots high).
Next you'd send a sequence of 12 bit data words. There are three possibilities:
The fact that the only graphics-mode operation was to draw a dot meant that graphics performance was very poor, and graphics were barely used at all.
The console had a limited keyboard containing the 26 alphabetic characters, 10 digits, and a very few punctuation characters, not much more than . , + - ( ). The + and - keys were used like PageUp and PageDown keys. There were also two special keys named "Left Blank" and "Right Blank". By convention, Left Blank was used to clear the current keyboard entry and any error messages. Right Blank was used to advance the left screen display sequence established by the DSD SET command.
The hardware interface to the keyboard was quite simplistic and required polling by the controlling PP. Each key generated a 6-bit code in the range 1 to 62 octal. The PP would look for keystrokes by selecting input mode, then reading from the display channel. This would produce a 12-bit data word. If it was zero, no key was depressed. Otherwise, the PP would see the OR of the keycodes for the currently depressed keys. (So if you pressed A and B together, you got 0003 -- the keycode for C.) Usually this was a nuisance, but in some games it was used to good advantage.
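The OR behavior is simple to mimic. In the Python sketch below, the function name is invented and the only keycodes used are the ones implied above (A = 1, B = 2, C = 3); it models just the value a PP would read, not the channel protocol.

```python
MIN_KEYCODE = 0o01
MAX_KEYCODE = 0o62


def read_keyboard(pressed_keycodes):
    """Return the 12-bit word a PP would read: the OR of all held keycodes.

    Zero means no key is depressed.
    """
    value = 0
    for code in pressed_keycodes:
        if not MIN_KEYCODE <= code <= MAX_KEYCODE:
            raise ValueError("keycodes run from 1 to 62 octal")
        value |= code
    return value


KEY_A, KEY_B, KEY_C = 1, 2, 3

print(read_keyboard([]))                 # 0: nothing pressed
print(read_keyboard([KEY_A, KEY_B]))     # 3: indistinguishable from C alone
```

This is why two simultaneous keys could look like a third key: the PP had no way to tell `A` plus `B` apart from `C`.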
The 6000 machines did not have a PC-style ROM BIOS, much less a separate mainframe control processor as do modern big machines. When the machine was booted, all it had was whatever very short program could be entered on the deadstart panel. The deadstart panel contained rows of 12 on/off switches, each row corresponding to one PP word. There were 12 rows on the 6500 panel, and 16 on the 750 panel. When a machine was deadstarted (booted), the contents of the deadstart panel were read into the memory of peripheral processor 0, starting at location 1. Control was then given to PP 0. There was a switch to control which of two PPs was to be number zero. This allowed you to boot even if one PP went bad.
The deadstart panel was hidden away inside the machine. It wasn't often necessary to change the deadstart program that had been toggled in. CDC's design of the panel, with enough switches for an entire small program, made deadstarting much easier than on, say, some DEC PDP models. On those machines, you had to toggle in a bootup program one word at a time and press "Enter" to enter the word. With CDC, you just set it and forgot it. (Though some diagnostic programs run by the CDC customer engineers may have required a different deadstart panel configuration.) It was bad news if the switches themselves went bad. The 6500 switches were large toggle switches that snapped into place with a satisfying click, but the 750 switches were small flimsy ones, and once a switch went bad on us. Ouch!