
Pipelining and Vector Processing
Parallel Processing, Flynn's Classification of Computers
Pipelining
Instruction Pipeline
Pipeline Hazards and Their Solutions
Array and Vector Processing

Parallel Processing It refers to techniques that are used to providesimultaneous data processing. The system may have two or more ALUs to be able toexecute two or more instruction at the same time. The system may have two or more processorsoperating concurrently. It can be achieved by having multiple functionalunits that perform same or different operationsimultaneously.

Classification There are variety of ways in which the parallelprocessing can be classified Internal Organization of Processor Interconnection structure between processors Flow of information through system

M. J. Flynn classifies computers by the number of instruction and data items processed simultaneously:
Single Instruction Stream, Single Data Stream (SISD)
Single Instruction Stream, Multiple Data Stream (SIMD)
Multiple Instruction Stream, Single Data Stream (MISD)
Multiple Instruction Stream, Multiple Data Stream (MIMD)

SISD represents an organization containing a single control unit, a processor unit, and a memory unit. Instructions are executed sequentially, and the system may or may not have internal parallel processing capabilities.
SIMD represents an organization that includes many processing units under the supervision of a common control unit.

The MISD structure is of only theoretical interest, since no practical system has been constructed using this organization.
The MIMD organization refers to a computer system capable of processing several programs at the same time.
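As a small illustration of the four classes, the sketch below maps a count of instruction streams and data streams to Flynn's label. The function name `flynn_class` and its parameters are hypothetical and exist only for this example.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return Flynn's label for a machine with the given stream counts.

    Hypothetical helper: "one stream" maps to S (single),
    "more than one" maps to M (multiple).
    """
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("a machine needs at least one stream of each kind")
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"


print(flynn_class(1, 1))  # SISD
print(flynn_class(1, 8))  # SIMD
print(flynn_class(4, 4))  # MIMD
```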

Flynn's classification emphasizes the behavioral characteristics of the computer system rather than its operational and structural interconnections. One type of parallel processing that does not fit Flynn's classification is pipelining.
Parallel processing can be discussed under the following topics:
Pipeline Processing
Vector Processing
Array Processors

Pipelining It is a technique of decomposing a sequential processinto sub operations, with each sub process beingexecuted in a special dedicated segments thatoperates concurrently with all other segments. Each segment performs partial processing dictatedby the way task is partitioned. The result obtained from each segment is transferredto next segment. The final result is obtained when data have passedthrough all segments.

Example Suppose we have to perform the following task: Each sub operation is to be performed in a segmentwithin a pipeline. Each segment has one or tworegisters and a combinational circuit.

The sub-operations in each segment of the pipeline are as follows:

General Consideration
Consider a k-segment pipeline with clock cycle time tp that is used to execute n tasks.
The first task T1 requires time k tp to complete, since there are k segments.
The remaining (n-1) tasks emerge from the pipe at the rate of one task per clock cycle, so they complete after a further time (n-1) tp.
The total time required is therefore k + (n-1) clock cycles, that is, (k + n - 1) tp.
Calculate the total cycles in the previous example.
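The cycle count k + (n-1) can be checked with a short sketch. The closed-form function and the cycle-by-cycle simulation below are illustrative only; the names `pipeline_cycles` and `simulate_pipeline` are assumptions, and the simulation assumes an ideal pipeline with no stalls.

```python
def pipeline_cycles(k: int, n: int) -> int:
    """Closed form: clock cycles for n tasks on a k-segment pipeline."""
    if k < 1 or n < 1:
        raise ValueError("k and n must be positive")
    return k + (n - 1)


def simulate_pipeline(k: int, n: int) -> int:
    """Count cycles by stepping tasks through k segments, one segment per cycle."""
    segments = [None] * k          # segments[i] holds the task currently in segment i
    next_task, finished, cycles = 1, 0, 0
    while finished < n:
        # Every task advances one segment; the task in the last segment leaves.
        for i in range(k - 1, 0, -1):
            segments[i] = segments[i - 1]
        # A new task enters the first segment while tasks remain.
        if next_task <= n:
            segments[0] = next_task
            next_task += 1
        else:
            segments[0] = None
        cycles += 1
        # A task occupying the last segment completes at the end of this cycle.
        if segments[k - 1] is not None:
            finished += 1
    return cycles


print(pipeline_cycles(4, 6))    # 9
print(simulate_pipeline(4, 6))  # 9
```

Multiplying the cycle count by tp gives the total time (k + n - 1) tp.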

Now consider a non-pipelined unit that performs the same operation and takes time tn to complete each task.
The total time required for n tasks is n tn.
The speedup ratio of the pipeline over the non-pipelined unit is given as:
S = (n tn) / ((k + n - 1) tp)
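The speedup follows from dividing the non-pipelined time n tn by the pipelined time (k + n - 1) tp. A minimal sketch is shown below; the function name `speedup` and the sample numbers (k = 4, tp = 20, tn = 80) are assumptions chosen for illustration, with tn set equal to k tp so that both units do the same total work per task.

```python
def speedup(n: int, k: int, tp: float, tn: float) -> float:
    """Speedup of a k-segment pipeline (cycle time tp) over a
    non-pipelined unit that needs tn per task, for n tasks."""
    return (n * tn) / ((k + n - 1) * tp)


# Assumed sample values: 4 segments, tp = 20 time units, tn = 4 * 20 = 80.
for n in (1, 10, 100, 10_000):
    print(n, round(speedup(n, k=4, tp=20, tn=80), 2))
```

As n grows, k + n - 1 approaches n and the speedup approaches tn / tp. Under the assumption tn = k tp, that limit is k, the number of segments.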

Arithmetic Pipeline
Pipelined arithmetic units are usually found in very high-speed computers.
They are used to implement floating-point operations.
We will now discuss the pipeline unit for floating-point addition and subtraction.

The inputs to the floating-point adder pipeline are two normalized floating-point numbers.
A and B are the mantissas, and a and b are the exponents.
Floating-point addition and subtraction can be performed in four segments.

The sub-operations performed in the segments are:
Compare the exponents
Align the mantissas
Add or subtract the mantissas
Normalize the result
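The four sub-operations can be sketched in software as follows. This is only an illustration under stated assumptions: numbers are written as mantissa × 10^exponent with a fractional mantissa normalized to 0.1 ≤ |mantissa| < 1, exact `Fraction` arithmetic stands in for the hardware registers, and the function name `fp_add` is hypothetical.

```python
from fractions import Fraction


def fp_add(A, a, B, b, subtract=False, base=10):
    """Add (or subtract) A * base**a and B * base**b in four steps."""
    # Segment 1: compare the exponents.
    diff = a - b
    exponent = max(a, b)

    # Segment 2: align the mantissas by shifting the one with the smaller exponent right.
    if diff > 0:
        B = B / base**diff
    elif diff < 0:
        A = A / base**(-diff)

    # Segment 3: add or subtract the mantissas.
    M = A - B if subtract else A + B

    # Segment 4: normalize the result so that 1/base <= |M| < 1.
    if M == 0:
        return Fraction(0), 0
    while abs(M) >= 1:
        M /= base
        exponent += 1
    while abs(M) < Fraction(1, base):
        M *= base
        exponent -= 1
    return M, exponent


# 0.9504 * 10**3 + 0.8200 * 10**2 = 1032.4 = 0.10324 * 10**4
mantissa, exponent = fp_add(Fraction("0.9504"), 3, Fraction("0.8200"), 2)
print(float(mantissa), exponent)  # 0.10324 4
```

In a hardware pipeline each of the four steps would sit in its own segment, so that four different additions can be in progress at once.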

Instruction Pipeline
Pipeline processing can occur not only in the data stream but in the instruction stream as well.
An instruction pipeline reads consecutive instructions from memory while previous instructions are being executed in other segments.
This causes the instruction fetch and execute segments to overlap and operate simultaneously.

Four Segment CPU Pipeline
FI segment fetches the instruction.
DA segment decodes the instruction and calculates the effective address.
FO segment fetches the operand.
EX segment executes the instruction.
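The overlap between the four segments can be visualized with a space-time table. The sketch below assumes an ideal pipeline with no hazards or branches, so each instruction simply starts one cycle after the previous one; the function name `space_time` is hypothetical.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def space_time(n_instructions, segments=SEGMENTS):
    """Return one row per instruction; column c shows the segment occupied in cycle c + 1."""
    k = len(segments)
    total_cycles = k + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(segments):
            row[i + s] = name  # instruction i enters segment s in cycle i + s + 1
        rows.append(row)
    return rows


for number, row in enumerate(space_time(3), start=1):
    print(f"I{number}: " + " ".join(row))
# I1: FI DA FO EX -- --
# I2: -- FI DA FO EX --
# I3: -- -- FI DA FO EX
```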

Handling Data Dependency
This problem can be solved in the following ways:
Hardware interlocks: An interlock is a circuit that detects the conflict and delays the instruction by enough clock cycles to resolve it.
Operand forwarding: Special hardware detects the conflict and avoids it by routing the data through a special path between pipeline segments.
Delayed loads: The compiler detects the data conflict and reorders the instructions as necessary, inserting no-operation instructions to delay the loading of the conflicting data.
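The delayed-load idea can be sketched as a small compiler pass. This is a simplified illustration, not a real compiler: the instruction format `Instr`, the opcode names, and the assumption that a loaded value is unavailable for exactly one following instruction are all hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str] = None      # register written, if any
    srcs: Tuple[str, ...] = ()      # registers read


NOP = Instr("NOP")


def insert_load_delays(program: List[Instr]) -> List[Instr]:
    """Insert a NOP after a LOAD when the next instruction reads the loaded register.

    Assumes a one-instruction load delay (an illustrative choice).
    """
    out: List[Instr] = []
    for instr in program:
        prev = out[-1] if out else None
        if prev is not None and prev.op == "LOAD" and prev.dest in instr.srcs:
            out.append(NOP)
        out.append(instr)
    return out


program = [
    Instr("LOAD", "R1"),
    Instr("LOAD", "R2"),
    Instr("ADD", "R3", ("R1", "R2")),   # reads R2 right after it is loaded
    Instr("STORE", None, ("R3",)),
]
print([i.op for i in insert_load_delays(program)])
# ['LOAD', 'LOAD', 'NOP', 'ADD', 'STORE']
```

A hardware interlock would produce the same one-cycle delay at run time instead of at compile time, and operand forwarding would aim to remove the delay by passing the value directly between segments.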

Handling of Branch Instruction
Prefetch the target instruction
Branch target buffer (BTB) included in the fetch segment of the pipeline
Branch prediction
Delayed branch

RISC Pipeline
The simplicity of the instruction set is used to implement an instruction pipeline with a small number of sub-operations, each executed in a single clock cycle.
Since all operations are performed on registers, there is no need for effective address calculation.

Three Segment Instruction Pipeline
I: Instruction Fetch
A: ALU Operation
E: Execute Instruction

Delayed Branch Let us consider the program having the following 5instructions

Vector Processing There is a class of computational problems that arebeyond the capabilities of the conventionalcomputer. These are characterized by the fact that they requirevast number of computation and it take aconventional computer days or even weeks tocomplete. Computers with vector processing are able to handlesuch instruction and they have application infollowing fields:

Long-range weather forecasting
Petroleum exploration
Seismic data analysis
Medical diagnosis
Aerodynamics and space simulation
Artificial intelligence and expert systems
Mapping the human genome
Image processing

Vector Operation A vector V of length n is represented as row vector by The element Vi of vector V is written as V(I) and theindex I refers to a memory address or register wherethe number is stored.

Let us consider a program in assembly language that combines two vectors A and B of length 100 and puts the result in vector C.

A computer capable of vector processing eliminates the overhead associated with the time it takes to fetch and execute the instructions in the program loop.
It allows the operation to be specified with a single vector instruction of the form:
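The contrast between an explicit program loop and a single whole-vector operation can be sketched as follows. Element-wise addition is used purely as an assumed example operation, and Python is only a notational stand-in: the single expression in `add_vector` mirrors the form of a vector instruction, not its hardware execution. Both function names are hypothetical.

```python
def add_scalar_loop(A, B):
    """Scalar style: index, add, store, increment, and test on every element."""
    C = [0] * len(A)
    i = 0
    while i < len(A):       # loop control is executed once per element
        C[i] = A[i] + B[i]
        i += 1
    return C


def add_vector(A, B):
    """Vector style: one operation stated over the whole operands."""
    return [a + b for a, b in zip(A, B)]


A = list(range(100))
B = list(range(100, 200))
print(add_scalar_loop(A, B) == add_vector(A, B))  # True
```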

Matrix Multiplication
Let us consider the multiplication of two 3 × 3 matrices A and B.

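Each element of the product is an inner product of a row of A and a column of B, for example c11 = a11 b11 + a12 b21 + a13 b31.
This requires three multiplications and (after initializing c11 to 0) three additions.
Since the product has 9 elements, the total number of multiplications (and likewise of additions) required is 3 × 9 = 27.
In general, an inner product consists of the sum of k product terms of the form:
C = A1 B1 + A2 B2 + A3 B3 + ... + Ak Bk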
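The operation count can be confirmed with a plain triple loop that tallies each multiply and each add. This is a counting sketch only; the function name `matmul_count` is hypothetical, and every element of the result starts at 0 as described above.

```python
def matmul_count(A, B):
    """Multiply matrices given as lists of rows; also count multiplications and additions."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0] * m for _ in range(n)]     # each c_ij initialized to 0
    mults = adds = 0
    for i in range(n):
        for j in range(m):
            for p in range(k):          # inner product of row i of A and column j of B
                C[i][j] += A[i][p] * B[p][j]
                mults += 1
                adds += 1
    return C, mults, adds


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
C, mults, adds = matmul_count(A, I3)
print(mults, adds)  # 27 27
```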

In a typical application, the value of k may be 100 or even 1000.
The inner product calculation on a pipelined vector processor is shown below.
The floating-point adder and multiplier are assumed to have four segments each.

The four partial sums are then added to form the final sum.
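One way to picture where the four partial sums come from: with a four-segment adder, each new product is added to the value that entered the adder four cycles earlier, so products 1, 5, 9, ... accumulate in one sum, products 2, 6, 10, ... in another, and so on. The sketch below models only that interleaving, not the cycle-level timing; the function name `pipelined_inner_product` is hypothetical.

```python
def pipelined_inner_product(A, B, segments=4):
    """Accumulate products into `segments` interleaved partial sums, then add them."""
    partial = [0] * segments
    for i, (a, b) in enumerate(zip(A, B)):
        partial[i % segments] += a * b   # product i joins the sum started `segments` steps earlier
    return partial, sum(partial)


A = [1, 2, 3, 4, 5, 6, 7, 8]
B = [1, 1, 1, 1, 1, 1, 1, 1]
partial, total = pipelined_inner_product(A, B)
print(partial)  # [6, 8, 10, 12]
print(total)    # 36
```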

Array Processor An array processor is a processor that performs thecomputations on large arrays of data. There are two different types of array processor: Attached Array Processor SIMD Array Processor

Attached Array Processor It is designed as a peripheral for a conventional hostcomputer. Its purpose is to enhance the performance of thecomputer by providing vector processing. It achieves high performance by means of parallelprocessing with multiple functional units.

SIMD Array Processor It is processor which consists of multiple processingunit operating in parallel. The processing units are synchronized to performthe same task under control of common control unit. Each processor elements(PE) includes an ALU , afloating point arithmetic unit and working register.


More from Kamal Acharya
