JPH02143638A


Info

Publication number: JPH02143638A
Application number: JP63296031A
Authority: JP
Inventors: Shigeki Yamada; 茂樹山田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1988-11-25
Filing date: 1988-11-25
Publication date: 1990-06-01

Abstract

PURPOSE: To decrease the average delay time by choosing the numbers of inputs and outputs of a switching network to satisfy a specific relation, organizing the processors into plural groups, and making the transfer delay between processors in the same group, where traffic is frequent, shorter than the transfer delay between different groups. CONSTITUTION: A switching network (SNW) of N inputs × N outputs (N = 2^k) consists of a 1st-level SNW having N/G groups, each group being a unit of G inputs × G outputs (G = 2^m), and a 2nd-level SNW having (N/S)Y inputs × (N/S)Y outputs (S = 2^p). The 1st-level SNW consists of log_S G stages of switches, with G/S switches arranged in each stage. The 2nd-level SNW consists of log_R(N/S) stages of switches (R = 2^q), with (N/SR)Y switches arranged in each stage. When the prescribed routing conditions are applied to this configuration, the average delay time is kept low even when the network scale is increased.

Description

Translated from Japanese

[Detailed Description of the Invention]

[Field of Industrial Application]

The present invention relates to a highly efficient and economical means of inter-processor communication in a multiprocessor system that combines a large number of von Neumann or non-von Neumann processors, memory systems, and the like.

[Conventional Technology]

Conventionally, bus schemes, crossbar network schemes, and interconnection network schemes in which multiple switches are combined in multiple stages have been proposed for connecting processors to processors, processors to memory, or subunits within a multiprocessor system.

Of these, the bus scheme requires little hardware, but the physical transfer rate of the bus has an upper limit, which makes a large-scale multiprocessor system extremely difficult to realize. The crossbar network scheme can provide large transfer capacity and high transfer speed, but has the drawback of requiring a very large amount of hardware.

The conventional interconnection network scheme, for its part, has two drawbacks in a large-scale system: the transfer delay is larger than with a crossbar network because data passes through many switch stages, and the amount of hardware, while less than that of a crossbar network, is still considerable. This is explained in detail with reference to FIG. 3.

FIG. 3 shows an example of a conventional interconnection network (a delta network), in which 2-input × 2-output switches are combined in three stages to form an 8-input × 8-output network.

In the figure, (1-1) to (1-8) are processors, (2-1) to (2-4) are the switches forming the first stage, (3-1) to (3-4) are the switches forming the second stage, and (4-1) to (4-4) are the switches forming the third stage.

When transfer data carrying the header shown in FIG. 4 is sent into this network, each switch in stage i (i = 1, 2, 3) refers to bit a_{3-i} of the header and routes the data to its own first output if the bit is 0 and to its second output if the bit is 1. For switch (2-1), the first output is (6-1) and the second output is (6-2); for switch (3-3), the first output is (7-5) and the second output is (7-6).

For example, to transfer data from processor 1 (1-2) to processor 5 (1-6), the header is set to a2a1a0 = 101 (the binary representation of the destination processor number 5). Under the routing rule above, the transfer data is routed along the bold line in FIG. 3 and arrives at processor 5 (1-6).
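The destination-bit routing rule just described can be sketched in a few lines. This is only an illustration: the function name `delta_route` is hypothetical, and it assumes that stage i inspects header bit a_{3-i}, with 0 selecting the first output and 1 the second.

```python
def delta_route(dest: int, stages: int) -> list[int]:
    """Return the output (1 or 2) chosen at each stage of a 2 x 2 delta network.

    Stage i (1-based) inspects header bit a_(stages - i): 0 selects the
    first output, 1 selects the second output.
    """
    ports = []
    for i in range(1, stages + 1):
        bit = (dest >> (stages - i)) & 1
        ports.append(1 if bit == 0 else 2)
    return ports


# Destination processor 5 = 0b101 in the 8 x 8 network of FIG. 3.
print(delta_route(5, 3))  # [2, 1, 2]
```

Because the choice depends only on the destination bits, every source reaches processor 5 with the same sequence of output choices.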

[Problem to Be Solved by the Invention]

In this configuration example, if the network scale is increased from 8 inputs × 8 outputs to 16 inputs × 16 outputs, the number of switch stages increases from log_2 8 = 3 to log_2 16 = 4, which increases the transfer time, and the number of switches (the amount of hardware) increases from 8/2 × log_2 8 = 12 to 16/2 × log_2 16 = 32. Moreover, in this scheme a transfer from any input terminal to any output terminal takes the same transfer time. The scheme therefore cannot meet the demand to shorten the transfer delay between a specific input and output that communicate frequently and thereby raise overall processing speed.
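The stage and switch counts quoted above follow from two formulas, stages = log_2 N and switches = (N/2) × log_2 N. A short sketch, with the hypothetical helper names `ilog` and `conventional_cost`, reproduces the figures:

```python
def ilog(base: int, value: int) -> int:
    """Exact integer logarithm; raises if value is not a power of base."""
    exponent, power = 0, 1
    while power < value:
        power *= base
        exponent += 1
    if power != value:
        raise ValueError(f"{value} is not a power of {base}")
    return exponent


def conventional_cost(n: int) -> tuple[int, int]:
    """Stages and switch count of an n x n delta network of 2 x 2 switches."""
    stages = ilog(2, n)
    switches = (n // 2) * stages
    return stages, switches


for n in (8, 16):
    print(n, conventional_cost(n))  # 8 (3, 12) then 16 (4, 32)
```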

An object of the present invention is to provide an interconnection network that keeps the average transfer delay of the network as a whole low even when the network scale is increased, and that can be realized with a small number of switches.

[Means for Solving the Problem]

The present invention focuses on the locality of processing often seen in multiprocessor systems and adopts a two-level connection structure, so that even when the network scale is increased, the transfer time between inputs and outputs with high traffic frequency is kept short and such traffic is routed efficiently.

[Operation]

That is, the basis of the present invention is an N-input × N-output switching network (N = 2^k; k is a natural number) built by interconnecting switches in multiple stages. The switching network consists of a first-level switching network, which has N/G groups with G inputs × G outputs (G = 2^m; m is a natural number) as one group unit, and a second-level switching network with (N/S)Y inputs × (N/S)Y outputs (S = 2^p; p is a natural number).

The first-level switching network consists of log_S G stages of switches, with G/S switches placed in each stage i (i = 1, 2, ..., log_S G). Each first-stage switch is an S-input × (S+Y)-output switch, each switch in stage log_S G is an (S+Y)-input × S-output switch (where Y is one or more), and the switches in the remaining stages are S-input × S-output switches. Outputs S+1 through S+Y of each first-stage switch, Y lines per switch and (N/S)Y lines in total, are fed into the second-level switching network. Inputs S+1 through S+Y of each switch in stage log_S G, Y lines per switch and (N/S)Y lines in total, are drawn from the outputs of the second-level switching network. The remaining outputs 1 through S of each first-stage switch are interconnected to the switches of the next stage.

The second-level switching network consists of log_R(N/S) stages of switches (R = 2^q; q is a natural number). Each stage j (j = 1, 2, ..., log_R(N/S)) contains (N/SR)Y switches of R inputs × R outputs, and outputs 1 through R of each switch in stage j are interconnected to the switches of the next stage.

To transfer data carrying a k-bit header (a_{k-1} a_{k-2} ... a_0 in binary) from an input terminal of the switching network to an output terminal, each first-stage switch of the first-level switching network compares the upper (k-m) bits of the header with its own group identification code. If they match, the remaining m bits of the header are used for self-routing within the first-level switching network, and the data is output to an output terminal within the switch's own group. If they do not match, the data is routed to one of outputs S+1 through S+Y of the first-stage switch, and the second-level switching network performs self-routing using the upper (k-p) bits of the header and delivers the data to a second-level output terminal.
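To make the parameter relations above concrete, the following sketch derives stage and switch counts from N, G, S, R and Y. The class and property names are hypothetical. The inter-group stage count (one first-level stage, all second-level stages, one final first-level stage) is a reading of the structure described above rather than a formula stated in the text.

```python
from dataclasses import dataclass


def ilog(base: int, value: int) -> int:
    """Exact integer logarithm; raises if value is not a power of base."""
    exponent, power = 0, 1
    while power < value:
        power *= base
        exponent += 1
    if power != value:
        raise ValueError(f"{value} is not a power of {base}")
    return exponent


@dataclass(frozen=True)
class TwoLevelNetwork:
    n: int  # N: total inputs/outputs
    g: int  # G: inputs/outputs per group
    s: int  # S: base size of first-level switches
    r: int  # R: size of second-level switches
    y: int  # Y: extra lines per switch toward the second level

    @property
    def first_level_stages(self) -> int:
        return ilog(self.s, self.g)

    @property
    def second_level_stages(self) -> int:
        return ilog(self.r, self.n // self.s)

    @property
    def first_level_switches(self) -> int:
        # N/G groups, G/S switches per stage in each group.
        return (self.n // self.g) * (self.g // self.s) * self.first_level_stages

    @property
    def second_level_switches(self) -> int:
        # (N / (S * R)) * Y switches in every second-level stage.
        return (self.n // (self.s * self.r)) * self.y * self.second_level_stages

    @property
    def intra_group_stages(self) -> int:
        return self.first_level_stages

    @property
    def inter_group_stages(self) -> int:
        # Assumption: first stage, whole second level, final first-level stage.
        return 1 + self.second_level_stages + 1


net = TwoLevelNetwork(n=16, g=4, s=2, r=2, y=1)
print(net.first_level_switches, net.second_level_switches)  # 16 12
print(net.intra_group_stages, net.inter_group_stages)       # 2 5
```

With N = 16, G = 4, S = 2, R = 2 and Y = 1 this gives 16 + 12 = 28 switches, two stages inside a group and five stages between groups.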

FIG. 8 is a block diagram summarizing the configuration described above.

[Embodiments]

FIG. 1 is a block diagram showing an embodiment of the present invention.

In this embodiment, 16 processors (20-1, ..., 20-16) are connected to the interconnection network of the present invention. The interconnection network has a hierarchical structure with two levels. The first level consists of the first-stage switch group (21-1, ..., 21-8) and the second-stage switch group (22-1, ..., 22-8). The second level consists of the first-stage switch group (23-1, ..., 23-4), the second-stage switch group (24-1, ..., 24-4), and the third-stage switch group (25-1, ..., 25-4). Interface lines (30-1 through 35-8) connect these switch groups.

Of these, each first-stage switch of the first level (21-1, ..., 21-8) has 2 inputs × 3 outputs, each switch in the final (second) stage of the first level has 3 inputs × 2 outputs, and every second-level switch has 2 inputs × 2 outputs.

In the embodiment of FIG. 1, the processors (20-1) to (20-16) are arranged, in ascending order of number, into four groups of four processors each. As described later, the communication time between processors in the same group is shorter than the communication time between processors in different groups.

FIG. 2 shows these switches in general form, drawn in the maximum configuration of 3 inputs × 3 outputs. In the figure, (50-1) to (50-3) are the input lines of the switch, (52-1) to (52-3) are input buffers that temporarily store transfer data with its header, and (53-1) to (53-3) are routing information identification circuits that refer to the header information and determine which of the output lines (51-1) to (51-3) the data is destined for. 54 is a contention arbitration circuit that ensures that only one of the outputs from the three input buffers (52-1, 52-2, 52-3) is sent to the same output line (for example, 51-1) at any one time; since it is not directly relevant to the explanation of the present invention, further description is omitted. (55-1) to (55-3), (56-1) to (56-3), and (57-1) to (57-3) are AND gates, and (58-1) to (58-3) are OR gates; they are used for selection control of the output lines (51-1) to (51-3).

When this switch is used as a 2-input × 3-output switch, the third input line (50-3) is left unused. Similarly, when it is used as a 3-input × 2-output switch, the third output line (51-3) is left unused, and when it is used as a 2-input × 2-output switch, both the third input line (50-3) and the third output line (51-3) are left unused.

When transfer data with a 4-bit header (a3a2a1a0) as shown in FIG. 5 is input, each switch performs self-routing based on the header information (the routing identification bit) and the group identification code. The group identification code and routing identification bit of each switch in FIG. 1 are assigned regularly, as shown in Table 1.

Table 1 is read as follows. For example, in switch 21-2 in the first stage of the first level, the routing information identification circuit ((53-1) to (53-3)) for each input line ((50-1) to (50-3)) compares the group identification code 00 with bits a3a2 of the header. If they match, bit a1 of the header is used as the routing information: if a1 = 0, the input data is output to the first output line (51-1) of the switch, and if a1 = 1, to the second output line (51-2). If the group identification code and the header bits (a3a2) do not match, the input data is output to the third output line (51-3) of the switch.
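The decision made by a first-level, first-stage switch in this embodiment can be written as a small function. This is a minimal sketch: the name `first_stage_output` is hypothetical, and the header is passed as a 4-bit integer a3a2a1a0.

```python
def first_stage_output(header: int, group_code: int) -> int:
    """Output line (1, 2 or 3) chosen by a first-level, first-stage switch.

    header is the 4-bit value a3 a2 a1 a0; group_code is the switch's
    2-bit group identification code.
    """
    upper = (header >> 2) & 0b11  # bits a3 a2
    if upper != group_code:
        return 3  # addressed to another group: hand over to the second level
    a1 = (header >> 1) & 1
    return 1 if a1 == 0 else 2


print(first_stage_output(0b0010, 0b00))  # 2: same group, a1 = 1
print(first_stage_output(0b1101, 0b00))  # 3: group mismatch
```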

Switch (22-6) in the second stage of the first level has no group identification code in Table 1, so its routing information identification circuit unconditionally refers to routing identification bit a0 and, depending on its value, outputs to the first output line (51-1) or the second output line (51-2). Each second-level switch likewise selects its output using the routing identification bit alone.

In terms of the general symbols used in the "Operation" section above and in the claims, this embodiment has N = 16, G = 4, Y = 1, S = 2, R = 2, k = 4, m = 2, p = 1, and q = 1. In FIG. 1, the first and second inputs of each switch are connected to outputs of the preceding stage. For each switch in the second (final) stage of the first level, the third input is connected to an output of a switch in the third (final) stage of the second level.

Further, the third output of each first-stage switch of the first level is connected to an input of the first stage of the second level.

First, communication from processor 0 (20-1) to processor 2 (20-3), which belongs to the same group, passes through only two switch stages, (21-1) and (22-2), as follows.

Since the destination processor number is 2, the header of FIG. 5 is set to a3a2a1a0 = 0010. Processor 0 (20-1) sends the transfer data with this header over its output line (30-1) into the first input of switch (21-1). In terms of FIG. 2, this input corresponds to input line (50-1), and the transfer data with its header is temporarily stored in input buffer (52-1).

The routing information identification circuit (53-1) examines the upper two bits of the header, a3a2 = 00. Because they match the group identification code (00 for switch 21-1, from Table 1), it determines that the data should be routed to either the first or the second output (51-1, 51-2) of its own switch. From Table 1, the routing identification bit here is a1, and since a1 = 1, the data should be routed to the second output (51-2), so control line 59-2 is set to 1. As a result, the data with its header in buffer (52-1) is output to output line (51-2) via AND gate (55-2) and OR gate (58-2).

This output line corresponds to the second output line (31-2) of switch (21-1) in FIG. 1, so the transfer data with its header enters the first input of switch (22-2) in the second stage of the first level.

Describing the operation of switch (22-2) again with reference to FIG. 2, the transfer data with its header enters input buffer (52-1) via input line (50-1). Following the routing rules of Table 1, the routing information identification circuit (53-1) of switch (22-2) examines the lowest bit of the header, a0 = 0, determines that the data should be routed to the first output, and sets control line (59-1) to 1.

As a result, the data with its header is output to the first output line (51-1) via AND gate (55-1) and OR gate (58-1). This output corresponds to the first output line (32-3) of switch (22-2) in FIG. 1, so the transfer data is delivered to processor 2 (20-3). In other words, communication from processor 0 (20-1) to processor 2 (20-3) in the same group is carried out at high speed through only two switch stages.

Next, the method of communication from processor 1 (20-2) to processor 13 (20-14), which belongs to a different group, is described.

Since the destination processor number is 13, the header is set to a3a2a1a0 = 1101, and processor 1 (20-2) sends the transfer data with its header over interface line 30-2 into the second input of switch 21-1. Describing the operation of switch 21-1 with reference to FIG. 2, the transfer data with its header enters input buffer 52-2 via input line 50-2. Following the routing rules of Table 1, the routing information identification circuit 53-2 first examines the upper two bits of the header, a3a2 = 11, and compares them with the group identification code (00, from Table 1). Since they do not match, the circuit judges the data to be addressed to another group and sets control line 60-3 to 1 in order to route it to the third output of the switch. As a result, the data with its header is output to the third output line 51-3 via AND gate 56-3 and OR gate 58-3. This output corresponds to interface line (31-3) in FIG. 1, so the transfer data with its header is sent into the first input of switch 23-1 in the first stage of the second level.

Describing the operation of switch (23-1) again with reference to FIG. 2, the transfer data with its header enters input buffer 52-1 via input line 50-1. Following the routing rules of Table 1, the routing information identification circuit 53-1 of switch 23-1 examines bit a3 of the header, a3 = 1, determines that the data should be routed to the second output, and sets control line 59-2 to 1. As a result, the data with its header is output to the second output line 51-2 via AND gate 55-2 and OR gate 58-2. This output becomes the first input (33-2) of switch (24-3) in the second stage of the second level in FIG. 1.

Switch (24-3) works in the same way as switch (23-1) above except that, as shown in Table 1, its routing identification bit is a2, so a detailed description is omitted. Since a2 = 1, the data with its header is output to the second output line (34-6) of switch 24-3 and enters switch (25-4) in the final (third) stage of the second level.

In switch (25-4), as shown in Table 1, the routing identification bit is a1. Since a1 = 0, the data with its header is output to the first output line (35-7) of switch (25-4) and enters the third input line of switch (22-7) in the final (second) stage of the first level. In switch (22-7), as shown in Table 1, the routing identification bit is a0. Since a0 = 1, the data with its header is output to the second output line (32-14) of switch (22-7), over which the transfer data is delivered to the final destination, processor 13 (20-14).

Thus, communication between processors in different groups passes through five switch stages (21-1, 23-1, 24-3, 25-4, 22-7).
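The two walkthroughs can be replayed with a short sketch that lists, stage by stage, which output each switch selects in the FIG. 1 embodiment (N = 16, G = 4). The function name and stage labels are hypothetical. The sketch reports only the chosen output number at each stage, because the wiring that determines the individual switch numbers is given only in the figure.

```python
def route_ports(src: int, dst: int) -> list[tuple[str, int]]:
    """Return (stage label, chosen output 1..3) for each switch traversed.

    Labels: "L1-S1" is first level, first stage; "L2-S3" is second level,
    third stage, and so on. Processor numbers run from 0 to 15.
    """

    def pick(bit_index: int) -> int:
        # Header bit 0 selects output 1, header bit 1 selects output 2.
        return 1 if (dst >> bit_index) & 1 == 0 else 2

    same_group = (src >> 2) == (dst >> 2)  # compare a3 a2 with own group
    if same_group:
        return [("L1-S1", pick(1)), ("L1-S2", pick(0))]
    return [
        ("L1-S1", 3),        # mismatch: third output, toward second level
        ("L2-S1", pick(3)),  # second level routes on a3, a2, a1
        ("L2-S2", pick(2)),
        ("L2-S3", pick(1)),
        ("L1-S2", pick(0)),  # final first-level stage routes on a0
    ]


print(route_ports(0, 2))   # two stages: outputs 2, then 1
print(route_ports(1, 13))  # five stages: outputs 3, 2, 2, 1, 2
```

The output choices agree with the text: second output then first output for processor 0 to processor 2, and third, second, second, first, second for processor 1 to processor 13.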

To summarize, with a total of N = 16 processors, G = 4 processors per group, and switch size S = 2, the scheme of the present invention carries communication between processors in the same group through two switch stages and communication between processors in different groups through five switch stages. The number of switches is 16 in the first level and 12 in the second level, 28 in total.

Compare this with a conventional interconnection network such as that of FIG. 3. Under the same conditions (16 processors, 2-input × 2-output switches), inter-processor communication in the conventional network always passes through log_2 16 = 4 switch stages regardless of destination, and 16/2 × log_2 16 = 32 switches are required. Compared with the conventional scheme, therefore, the scheme of the present invention has a shorter communication time within a group, a slightly longer communication time between groups, and fewer switches. As for communication time in particular, multiprocessor systems commonly exhibit locality of processing (traffic is heavy among a small, limited set of processors and light among the rest), so by placing closely related processors in the same group, the average communication time of the system as a whole can be greatly reduced.
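The effect of locality on the average can be illustrated numerically. The sketch below assumes, purely for illustration, that delay is proportional to the number of switch stages and that a fraction `local_fraction` of all transfers stays inside a group; neither the traffic fraction nor the function names come from the patent.

```python
from fractions import Fraction

INTRA_STAGES = 2         # same group, this scheme
INTER_STAGES = 5         # different groups, this scheme
CONVENTIONAL_STAGES = 4  # any pair, conventional 16 x 16 network


def average_stages(local_fraction: Fraction) -> Fraction:
    """Mean number of stages when local_fraction of traffic is intra-group."""
    return local_fraction * INTRA_STAGES + (1 - local_fraction) * INTER_STAGES


def break_even_fraction() -> Fraction:
    """Intra-group share at which the mean equals the conventional 4 stages."""
    return Fraction(INTER_STAGES - CONVENTIONAL_STAGES, INTER_STAGES - INTRA_STAGES)


print(average_stages(Fraction(4, 5)))  # 13/5, i.e. 2.6 stages on average
print(break_even_fraction())           # 1/3
```

Under these assumptions the two-level network has the lower average as soon as more than one third of the traffic is intra-group.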

Other embodiments of the present invention are shown in FIG. 6 and FIG. 7.

FIG. 6 consists of 16 processors ((80-1) to (80-16)), the first-level first-stage switch group ((81-1) to (81-8)), the first-level second-stage switch group ((82-1) to (82-8)), the second-level first-stage switch group ((83-1) to (83-8)), the second-level second-stage switch group ((84-1) to (84-8)), and the second-level third-stage switch group ((85-1) to (85-5)).

In terms of the general symbols used in the "Operation" section above and in the claims, this embodiment has N = 16, G = 4, Y = 2, S = 2, and R = 2, and is basically the same as FIG. 1. It differs from FIG. 1 in that the first-level first-stage switches are changed from 2 inputs × 3 outputs to 2 inputs × 4 outputs, and the first-level second-stage switches are changed from 3 inputs × 2 outputs to 4 inputs × 2 outputs.
4, Y = 2, S = 2, R = 2, basically the same as Figure 1, but the difference from Figure 1 is that the size of the 1st level 1st stage switch is 2 manual x 3 The output has been changed to 2 manpower x 4 outputs, and the size of the first level second stage switch has been changed from 3 manpower x 2 outputs to 4 manpower x 2 outputs.

As a result, the transfer width between the first level and the second level is doubled relative to FIG. 1 (Y is changed from 1 to 2). Routing in the first level is exactly the same as in FIG. 1; in the second level, the first stage uses a3 as the routing identification bit, and the second and third stages use a2 and a1, respectively. As in FIG. 1, communication among the four processors of a group passes through two switch stages, and communication between groups passes through five. A configuration like FIG. 6 is effective when communication between groups is relatively frequent and a large transfer width between groups is desired.

FIG. 7 consists of 16 processors ((90-1) to (90-16)), the first-level first-stage switch group ((91-1) to (91-8)), the first-level second-stage switch group ((92-1) to (92-8)), the second-level first-stage switch group ((93-1) to (93-4)), the second-level second-stage switch group ((94-1) to (94-4)), the third-level first-stage switch group ((95-1) to (95-2)), and the third-level second-stage switch group ((96-1) to (96-2)).

In this embodiment, the second-level network of FIG. 1 is further divided into two levels, giving the whole system three hierarchical levels. The second-level first-stage switches ((93-1) to (93-4)) are 2-input × 3-output switches, and their third outputs are connected to the third-level network.
3-1) to (93-4)) are composed of two-manpower x three-output switches, and the third output is connected to the third-level network.

The second-level second-stage switches ((94-1) to (94-4)) are 3-input × 2-output switches, and their third inputs are drawn from the outputs of the third-level network. The third level is a network of two stages of 2-input × 2-output switches. At the first level, as in FIG. 1, four processors form one group; at the second level, two adjacent first-level groups (eight processors, for example (90-1) to (90-8)) form one second-level group. Routing in each stage of the first level is the same as in FIG. 1.

In the first stage of the second level, the most significant bit (a3) of the transfer-data header (FIG. 5) is compared with a preassigned 1-bit group identification code. If the data is addressed to the switch's own second-level group, it is routed through the second level, by the first and second stages of that level, using the next two upper bits of the header (a2 a1). If the header is addressed to another second-level group, the data is diverted to the third level via the third output of the second-level first-stage switch. The third level routes using the upper two bits of the header (a3 a2). With this routing, communication within a first-level group of four processors passes through two switch stages, as in FIG. 1. Communication between processors in the same second-level group of eight processors, but in different first-level groups, passes through four switch stages, and communication between different second-level groups passes through six switch stages.
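The stage counts stated above can be illustrated with a minimal sketch. It assumes 16 processors numbered 0 to 15 whose 4-bit number serves as the header, with groups occupying contiguous number ranges; the function names and the numbering scheme are hypothetical and are not taken from the specification.

```python
# Illustrative sketch only. Assumptions (not from the specification):
# 16 processors numbered 0..15, the 4-bit number is the header a3 a2 a1 a0,
# first-level groups are 4 consecutive numbers, second-level groups are 8.

FIRST_LEVEL_SHIFT = 2   # drop a1 a0 to get the first-level group number
SECOND_LEVEL_SHIFT = 3  # drop a2 a1 a0 to get the second-level group number


def switch_stages(src: int, dst: int) -> int:
    """Return the number of switch stages between two processors."""
    if src >> FIRST_LEVEL_SHIFT == dst >> FIRST_LEVEL_SHIFT:
        return 2  # same first-level group: first-level stages only
    if src >> SECOND_LEVEL_SHIFT == dst >> SECOND_LEVEL_SHIFT:
        return 4  # same second-level group: first and second levels
    return 6      # different second-level groups: detour via third level


def detours_to_third_level(header: int, own_group_code: int) -> bool:
    """Decision at a second-level first-stage (2-input x 3-output) switch.

    The most significant header bit is compared with the switch's
    preassigned 1-bit group identification code.
    """
    top_bit = (header >> 3) & 1
    return top_bit != own_group_code
```

For example, `switch_stages(0, 3)` gives 2, `switch_stages(0, 5)` gives 4, and `switch_stages(0, 12)` gives 6, matching the two-, four-, and six-stage paths described in the text.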

In this way, even in a system whose groups form a hierarchy, the transfer delay time can be set appropriately for each level of the hierarchy.

[Effect of the Invention]

As described above, processors in the same group communicate with a short transfer delay, while processors in different groups communicate with a somewhat longer transfer delay. For inter-processor communication with locality, which is the common case, the communication time of the system as a whole can therefore be greatly reduced.
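The effect of locality on the average delay can be sketched numerically. The example below weights the stage counts of the three-level embodiment (2, 4, and 6 stages) by traffic shares; the shares themselves are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
# Illustrative sketch only. The traffic shares used in the example are
# hypothetical; only the stage counts 2, 4 and 6 come from the embodiment.


def average_stages(traffic_share: dict[int, float]) -> float:
    """Return the traffic-weighted mean number of switch stages.

    traffic_share maps a stage count to the fraction of traffic that
    travels over a path of that length; the fractions must sum to 1.
    """
    total = sum(traffic_share.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1")
    return sum(stages * share for stages, share in traffic_share.items())


# Hypothetical local traffic pattern: 80% stays inside a first-level group,
# 15% stays inside a second-level group, 5% crosses second-level groups.
local_pattern = {2: 0.80, 4: 0.15, 6: 0.05}
mean_stages = average_stages(local_pattern)
```

With these assumed shares the mean is about 2.5 stages. For comparison, a multistage network of 2 x 2 switches that treats all 16 processors uniformly would need log2(16) = 4 stages for every transfer.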

Furthermore, as long as the number of processors in a group is fixed, increasing the total number of processors lengthens only the communication time between different groups; the communication time (number of switch stages) within a group stays constant. This has the advantage of making large-scale systems easy to build.
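The scaling property can be checked with a short sketch. It uses the relation given in the abstract, that the first-level network has log_S G stages for group size G and switch size S, and compares it with the log_S N stages of a uniform multistage network of S x S switches. The helper names are hypothetical.

```python
# Illustrative sketch only; helper names are hypothetical.


def integer_log(value: int, base: int) -> int:
    """Return log_base(value), requiring value to be an exact power of base."""
    result = 0
    while value > 1:
        if value % base:
            raise ValueError("value is not a power of base")
        value //= base
        result += 1
    return result


def intra_group_stages(group_size: int, switch_size: int) -> int:
    """First-level stage count, log_S G; independent of the system size N."""
    return integer_log(group_size, switch_size)


def uniform_network_stages(total_processors: int, switch_size: int) -> int:
    """Stage count of a uniform multistage network, log_S N."""
    return integer_log(total_processors, switch_size)


# With G = 4 and S = 2, intra-group paths stay at 2 stages while a uniform
# network grows from 4 stages at N = 16 to 10 stages at N = 1024.
comparison = [
    (n, intra_group_stages(4, 2), uniform_network_stages(n, 2))
    for n in (16, 64, 256, 1024)
]
```

Each tuple in `comparison` holds the system size, the intra-group stage count, and the uniform-network stage count, so the middle value stays fixed as N grows.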

In addition, the network can be built economically, with fewer switches than a conventional interconnection network.

The network is also flexible: the combination of switch size, group size, number of levels, and so on can be changed to suit the communication traffic requirements of the system, so a network best suited to those requirements can be provided.

[Brief Description of the Drawings]

FIG. 1 is a configuration diagram showing an embodiment of the present invention; FIG. 2 is an internal configuration diagram of a switch constituting the network according to the present invention; FIG. 3 is a configuration diagram of a conventional interconnection network; FIG. 4 is a diagram explaining the format of the "transfer data with header" transferred within the network of FIG. 3; FIG. 5 is a diagram explaining the format of the "transfer data with header" transferred within the networks of FIGS. 1, 6, and 7; FIGS. 6 and 7 are configuration diagrams each showing another embodiment of the present invention; and FIG. 8 is a configuration diagram illustrating the means for solving the problem and its operation.

Explanation of symbols: 1 ... processor; 2, 3, 4 ... switches; 5, 6, 7, 8 ... interface lines; 20 ... processor; 21, 22, 23, 24, 25 ... switches; 30, 31, 32, 33, 34, 35 ... interface lines; 50 ... input interface line; 51 ... output interface line; 52 ... input buffer; 53 ... routing information identification circuit; 54 ... contention arbitration circuit; 55, 56, 57 ... AND gates; 58 ... OR gate; 80 ... processor; 81, 82, 83, 84, 85 ... switches; 90 ... processor; 91, 92, 93, 94, 95, 96 ... switches.

Agents: Patent Attorney Akio Namiki; Patent Attorney Kiyoshi Matsuzaki

Claims

Translated fromJapanese