Uploaded by Kamal Acharya

Pipelining and vector processing

The document discusses parallel processing techniques, categorizing computers based on Flynn's classification, which includes SISD, SIMD, MISD, and MIMD structures. It explains pipelining as a method to improve instruction and data processing efficiency through simultaneous operations, and delves into instruction pipelines and vector processing for complex computations. Additionally, the document highlights array processors and their function in enhancing computer performance through parallelism.

Topics covered:
• Parallel processing and Flynn's classification of computers
• Pipelining
• Instruction pipeline
• Pipeline hazards and their solutions
• Array and vector processing
Parallel Processing It refers to techniques that are used to providesimultaneous data processing. The system may have two or more ALUs to be able toexecute two or more instruction at the same time. The system may have two or more processorsoperating concurrently. It can be achieved by having multiple functionalunits that perform same or different operationsimultaneously.
Classification There are variety of ways in which the parallelprocessing can be classified Internal Organization of Processor Interconnection structure between processors Flow of information through system
M. J. Flynn classifies computers by the number of instructions and data items processed simultaneously:
• Single Instruction Stream, Single Data Stream (SISD)
• Single Instruction Stream, Multiple Data Stream (SIMD)
• Multiple Instruction Stream, Single Data Stream (MISD)
• Multiple Instruction Stream, Multiple Data Stream (MIMD)
• SISD represents an organization containing a single control unit, a processor unit, and a memory unit. Instructions are executed sequentially, and the system may or may not have internal parallel processing capabilities.
• SIMD represents an organization that includes many processing units under the supervision of a common control unit.
• MISD is of only theoretical interest, since no practical system has been constructed using this organization.
• MIMD refers to a computer system capable of processing several programs at the same time.
• Flynn's classification emphasizes the behavioral characteristics of the computer system rather than its operational and structural interconnections. One type of parallel processing that does not fit Flynn's classification is pipelining.
• Parallel processing can be discussed under the following topics:
  • Pipeline processing
  • Vector processing
  • Array processors
Pipelining It is a technique of decomposing a sequential processinto sub operations, with each sub process beingexecuted in a special dedicated segments thatoperates concurrently with all other segments. Each segment performs partial processing dictatedby the way task is partitioned. The result obtained from each segment is transferredto next segment. The final result is obtained when data have passedthrough all segments.
Example Suppose we have to perform the following task: Each sub operation is to be performed in a segmentwithin a pipeline. Each segment has one or tworegisters and a combinational circuit.
• The suboperations in each segment of the pipeline are as follows: [figure not reproduced]
General Consideration
• Consider a k-segment pipeline with clock cycle time tp that is used to execute n tasks.
• The first task T1 requires time k·tp to complete, since there are k segments.
• The remaining (n - 1) tasks emerge from the pipe at the rate of one task per clock cycle, so they complete after a further time (n - 1)·tp.
• The total time required is therefore k + (n - 1) clock cycles, that is, (k + n - 1)·tp.
• Exercise: calculate the total number of cycles for the previous example.
• Now consider a non-pipelined unit that performs the same operation and takes time tn to complete each task. The total time required is n·tn.
• The speedup ratio is the non-pipelined time divided by the pipelined time: S = n·tn / ((k + n - 1)·tp)
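The timing argument above can be checked numerically. The sketch below assumes the speedup is the non-pipelined time n·tn divided by the pipelined time (k + n - 1)·tp, which follows from the two totals stated above. The function names and the sample numbers are illustrative and do not come from the slides.

```python
def pipeline_cycles(k: int, n: int) -> int:
    """Clock cycles needed for n tasks on a k-segment pipeline: k + (n - 1)."""
    if k < 1 or n < 1:
        raise ValueError("k and n must be positive")
    return k + (n - 1)


def speedup(k: int, n: int, tp: float, tn: float) -> float:
    """Ratio of non-pipelined time n*tn to pipelined time (k + n - 1)*tp."""
    return (n * tn) / (pipeline_cycles(k, n) * tp)


# Assumed sample values: 4 segments, 100 tasks, tp = 20 ns, tn = k * tp = 80 ns.
cycles = pipeline_cycles(4, 100)            # 4 + 99 = 103 cycles
ratio = speedup(4, 100, tp=20.0, tn=80.0)   # 8000 / 2060
print(cycles, round(ratio, 3))
```

With tn = k·tp, the ratio approaches k as n grows, because the k - 1 cycles spent filling the pipeline become negligible.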
Arithmetic Pipeline
• Pipelined arithmetic units are usually found in very high-speed computers.
• They are used to implement floating-point operations.
• We now discuss the pipeline unit for floating-point addition and subtraction.
• The inputs to the floating-point adder pipeline are two normalized floating-point numbers.
• A and B are the mantissas, and a and b are the exponents.
• Floating-point addition and subtraction can be performed in four segments.
• The suboperations performed in the four segments are:
  • Segment 1: compare the exponents
  • Segment 2: align the mantissas
  • Segment 3: add or subtract the mantissas
  • Segment 4: normalize the result
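The four suboperations can be traced in software. The following is a minimal sketch, assuming decimal mantissas that are normalized to the range 0.1 <= |M| < 1 and using exact fractions to avoid rounding. The function name, the base-10 choice, and the sample operands are illustrative assumptions, and the steps run one after another here rather than as overlapped hardware segments.

```python
from fractions import Fraction


def fp_add(A, a, B, b, base=10):
    """Add A*base**a and B*base**b; return (mantissa, exponent), normalized."""
    A, B = Fraction(A), Fraction(B)

    # Segment 1: compare the exponents.
    diff = a - b
    exp = max(a, b)

    # Segment 2: align the mantissa that belongs to the smaller exponent.
    if diff > 0:
        B = B / base**diff
    elif diff < 0:
        A = A / base**(-diff)

    # Segment 3: add the mantissas (subtraction is adding a negative mantissa).
    M = A + B

    # Segment 4: normalize so that 1/base <= |M| < 1.
    if M == 0:
        return Fraction(0), 0
    while abs(M) >= 1:
        M /= base
        exp += 1
    while abs(M) < Fraction(1, base):
        M *= base
        exp -= 1
    return M, exp


# Assumed operands: 0.9504 x 10^3 + 0.8200 x 10^2
mantissa, exponent = fp_add("0.9504", 3, "0.8200", 2)
print(float(mantissa), exponent)
```

In this example the aligned mantissas sum to 1.0324, so normalization shifts the result right by one digit and increments the exponent.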
Instruction Pipeline Pipeline processing can occur not only in the datastream but in the instruction stream as well. An instruction pipeline reads consecutive instructionfrom memory while previous instruction are beingexecuted in other segments. This caused the instruction fetch and executesegments to overlap and perform simultaneousoperation.
Four-Segment CPU Pipeline
• FI segment: fetches the instruction.
• DA segment: decodes the instruction and calculates the effective address.
• FO segment: fetches the operand.
• EX segment: executes the instruction.
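The overlap between the four segments is easiest to see in a space-time diagram. The sketch below builds one for an ideal pipeline, assuming no hazards or stalls, so each instruction enters FI one cycle after its predecessor. The function name and the output layout are illustrative.

```python
SEGMENTS = ["FI", "DA", "FO", "EX"]


def space_time(n_instructions, segments=SEGMENTS):
    """Return one row per instruction; each row has one entry per clock cycle.

    An entry names the segment the instruction occupies in that cycle,
    or "--" when the instruction is not in the pipeline.
    """
    k = len(segments)
    total_cycles = k + (n_instructions - 1)
    rows = []
    for i in range(n_instructions):
        row = ["--"] * total_cycles
        for s, name in enumerate(segments):
            row[i + s] = name  # instruction i reaches segment s in cycle i + s
        rows.append(row)
    return rows


for number, row in enumerate(space_time(3), start=1):
    print(f"I{number}: " + " ".join(row))
```

Reading any column top to bottom shows which instructions are active in the same clock cycle.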
Handling Data Dependency
A data dependency conflict can be resolved in the following ways:
• Hardware interlocks: a circuit detects the conflict and delays the dependent instruction by enough clock cycles to resolve it.
• Operand forwarding: special hardware detects the conflict and avoids it by routing the data through a dedicated path between pipeline segments.
• Delayed load: the compiler detects the data conflict and reorders the instructions as necessary, inserting no-operation instructions to delay the loading of the conflicting data.
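The delayed-load approach can be sketched as a small compiler pass. This is a simplified illustration that only inserts no-operation instructions and does not attempt reordering. It assumes a load result becomes usable one instruction later than the load itself, and the tuple format (opcode, destination, sources) and the register names are hypothetical.

```python
def insert_delayed_load_nops(program):
    """Insert a NOP after each LOAD whose destination is read by the next instruction.

    Each instruction is a tuple (opcode, dest, sources); this format is hypothetical.
    """
    scheduled = []
    for index, instruction in enumerate(program):
        scheduled.append(instruction)
        opcode, dest, _sources = instruction
        following = program[index + 1] if index + 1 < len(program) else None
        # Conflict: the very next instruction reads the register being loaded.
        if opcode == "LOAD" and following is not None and dest in following[2]:
            scheduled.append(("NOP", None, ()))
    return scheduled


program = [
    ("LOAD", "R1", ("A",)),
    ("LOAD", "R2", ("B",)),
    ("ADD", "R3", ("R1", "R2")),
    ("STORE", "C", ("R3",)),
]
scheduled = insert_delayed_load_nops(program)
print([instruction[0] for instruction in scheduled])
```

Only the second load is followed by a NOP here, because the instruction after the first load does not read R1.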
Handling of Branch Instructions
• Prefetch the target instruction.
• Include a branch target buffer (BTB) in the fetch segment of the pipeline.
• Branch prediction
• Delayed branch
RISC Pipeline
• The simplicity of the instruction set is used to implement an instruction pipeline with a small number of suboperations, each executed in a single clock cycle.
• Since data manipulation operations are performed on registers, these instructions need no effective address calculation.
Three-Segment Instruction Pipeline
• I: Instruction fetch
• A: ALU operation
• E: Execute instruction
Delayed Load
Delayed Branch
• Consider a program with the following five instructions: [figure not reproduced]
Vector Processing There is a class of computational problems that arebeyond the capabilities of the conventionalcomputer. These are characterized by the fact that they requirevast number of computation and it take aconventional computer days or even weeks tocomplete. Computers with vector processing are able to handlesuch instruction and they have application infollowing fields:
• Long-range weather forecasting
• Petroleum exploration
• Seismic data analysis
• Medical diagnosis
• Aerodynamics and space simulation
• Artificial intelligence and expert systems
• Mapping the human genome
• Image processing
Vector Operation
• A vector V of length n is represented as a row vector by V = [V1 V2 V3 ... Vn].
• The element Vi of vector V is written as V(I), and the index I refers to a memory address or register where the number is stored.
• Consider a program in assembly language that adds two vectors A and B of length 100 and puts the result in vector C.
• A computer capable of vector processing eliminates the overhead associated with the time it takes to fetch and execute the instructions in the program loop.
• It allows operations to be specified with a single vector instruction of the form: [figure not reproduced]
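The loop overhead that a vector instruction removes can be made visible with a small comparison. The sketch below assumes the operation is element-wise addition of two length-100 vectors and uses an assumed cost model of two loop-control steps per iteration (one index increment and one compare-and-branch). Both functions are illustrative stand-ins, not a model of any particular machine.

```python
def scalar_loop_add(a, b):
    """Add element by element and count the loop-control steps that repeat."""
    c = [0] * len(a)
    overhead = 0
    i = 0
    while i < len(a):
        c[i] = a[i] + b[i]
        i += 1          # index update, repeated on every iteration
        overhead += 2   # assumed: one increment plus one compare-and-branch
    return c, overhead


def vector_add(a, b):
    """Whole-vector operation, standing in for a single vector instruction."""
    return [x + y for x, y in zip(a, b)]


A = list(range(100))
B = list(range(100, 200))
c_loop, overhead = scalar_loop_add(A, B)
c_vec = vector_add(A, B)
print(c_loop == c_vec, overhead)
```

Both versions compute the same result; the difference is that the scalar version repeats its loop-control steps for every element, while the vector form specifies the whole operation once.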
Matrix Multiplication
• Consider the multiplication of two 3 × 3 matrices A and B.
• Computing one element of the product, such as c11, requires three multiplications and (after initializing c11 to 0) three additions.
• The total number of multiply-add operations required is 3 × 9 = 27.
• In general, an inner product consists of the sum of k product terms of the form: C = A1B1 + A2B2 + A3B3 + ... + AkBk
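The operation count can be confirmed with a direct implementation. The sketch below multiplies two square matrices with the usual triple loop and counts one multiply-add per innermost step; the function name and the sample matrices are illustrative.

```python
def matmul_count(A, B):
    """Multiply two n x n matrices; return (product, number of multiply-adds)."""
    n = len(A)
    C = [[0] * n for _ in range(n)]   # every c_ij starts at 0
    ops = 0
    for i in range(n):
        for j in range(n):
            # c_ij is the inner product of row i of A and column j of B.
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
                ops += 1
    return C, ops


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
C, ops = matmul_count(A, IDENTITY)
print(ops)
```

For 3 × 3 matrices there are nine inner products of three terms each, which gives the 27 multiply-add operations.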
• In a typical application, the value of k may be 100 or even 1000.
• The inner product calculation on a pipelined vector processor is shown below. [figure not reproduced]
• The floating-point adder and multiplier are assumed to have four segments each.
• The four partial sums are added to form the final sum.
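One way to picture the four partial sums is to assume the products are distributed round-robin over the four adder segments, so that product i accumulates into partial sum i mod 4. That distribution is an assumption about the figure that is not reproduced here, based on the four-segment adder stated above; the function name is illustrative.

```python
def inner_product_partial_sums(a, b, segments=4):
    """Accumulate products into `segments` interleaved partial sums, then combine."""
    partial = [0] * segments
    for i, (x, y) in enumerate(zip(a, b)):
        partial[i % segments] += x * y   # product i joins partial sum i mod 4
    return partial, sum(partial)


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [1, 1, 1, 1, 1, 1, 1, 1]
partial, total = inner_product_partial_sums(a, b)
print(partial, total)
```

Interleaving keeps the adder pipeline busy: a new product can enter on every cycle without waiting for the previous addition to leave the last segment.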
Memory Interleaving
Array Processor An array processor is a processor that performs thecomputations on large arrays of data. There are two different types of array processor: Attached Array Processor SIMD Array Processor
Attached Array Processor
• It is designed as a peripheral for a conventional host computer.
• Its purpose is to enhance the performance of the computer by providing vector processing.
• It achieves high performance by means of parallel processing with multiple functional units.
SIMD Array Processor
• It is a processor that consists of multiple processing units operating in parallel.
• The processing units are synchronized to perform the same task under the control of a common control unit.
• Each processing element (PE) includes an ALU, a floating-point arithmetic unit, and working registers.
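The relationship between the common control unit and the processing elements can be sketched in a few lines. This is a toy model under stated assumptions: each PE holds its own local data, the control unit is represented by a `broadcast` function, and the sequential loop stands in for what real hardware does in lockstep. The class, method, and operation names are hypothetical.

```python
class ProcessingElement:
    """A PE with its own local memory; executes whatever it is told to."""

    def __init__(self, local_memory):
        self.memory = dict(local_memory)

    def execute(self, op, dest, src1, src2):
        x, y = self.memory[src1], self.memory[src2]
        if op == "ADD":
            self.memory[dest] = x + y
        elif op == "MUL":
            self.memory[dest] = x * y
        else:
            raise ValueError(f"unsupported operation: {op}")


def broadcast(pes, instruction):
    """Stand-in for the control unit: send the same instruction to every PE."""
    for pe in pes:
        pe.execute(*instruction)


# Four PEs, each holding one element of vectors a and b.
pes = [ProcessingElement({"a": i, "b": 10 * i}) for i in range(4)]
broadcast(pes, ("ADD", "c", "a", "b"))
results = [pe.memory["c"] for pe in pes]
print(results)
```

A single instruction produces one result per PE, which is the single-instruction, multiple-data behavior described above.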