Movatterモバイル変換

[0]ホーム

Jump to content

Parallel algorithm

Edit links

From Wikipedia, the free encyclopedia

Algorithm which can do multiple operations in a given time

This articleneeds additional citations forverification. Please helpimprove this article byadding citations to reliable sources. Unsourced material may be challenged and removed.
Find sources: "Parallel algorithm" – news ·newspapers ·books ·scholar ·JSTOR(November 2012) (Learn how and when to remove this message)

Incomputer science, aparallel algorithm, as opposed to a traditionalserial algorithm, is analgorithm which can do multiple operations in a given time. It has been a tradition ofcomputer science to describe serial algorithms inabstract machine models, often the one known asrandom-access machine. Similarly, many computer science researchers have used a so-calledparallel random-access machine (PRAM) as a parallel abstract machine (shared-memory).^[1]^[2]

Many parallel algorithms are executedconcurrently – though in generalconcurrent algorithms are a distinct concept – and thus these concepts are often conflated, with which aspect of an algorithm is parallel and which is concurrent not being clearly distinguished. Further, non-parallel, non-concurrent algorithms are often referred to as "sequential algorithms", by contrast with concurrent algorithms.

Parallelizability

[edit]

Algorithms vary significantly in how parallelizable they are, ranging from easily parallelizable to completely unparallelizable. Further, a given problem may accommodate different algorithms, which may be more or less parallelizable.

Some problems are easy to divide up into pieces in this way – these are calledembarrassingly parallel problems. Examples include many algorithms to solveRubik's Cubes and find values which result in a givenhash.^{[citation needed]}

Some problems cannot be split up into parallel portions, as they require the results from a preceding step to effectively carry on with the next step – these are calledinherently serial problems. Examples include iterativenumerical methods, such asNewton's method, iterative solutions to thethree-body problem, and most of the available algorithms to computepi (π).^{[citation needed]} Some sequential algorithms can be converted into parallel algorithms usingautomatic parallelization.^[3]

In many cases developing an effective parallel algorithm for solution of some task requires attraction of new ideas and methods comparing to creating a sequential algorithm version. These are, for instance, practically important problems of searching a target element in data structures, evaluation of an algebraic expression, etc.^[4]

Motivation

[edit]

Parallel algorithms on individual devices have become more common since the early 2000s because of substantial improvements inmultiprocessing systems and the rise ofmulti-core processors. Up until the end of 2004, single-core processor performance rapidly increased viafrequency scaling, and thus it was easier to construct a computer with a single fast core than one with many slower cores with the samethroughput, so multicore systems were of more limited use. Since 2004 however, frequency scaling hit a wall, and thus multicore systems have become more widespread, making parallel algorithms of more general use.

Issues

[edit]

Communication

[edit]

The cost or complexity of serial algorithms is estimated in terms of the space (memory) and time (processor cycles) that they take. Parallel algorithms need to optimize one more resource, the communication between different processors. There are two ways parallel processors communicate, shared memory or message passing.

Shared memory processing needs additionallocking for the data, imposes the overhead of additional processor and bus cycles, and also serializes some portion of the algorithm.

Message passing processing uses channels and message boxes but this communication adds transfer overhead on the bus, additional memory need for queues and message boxes and latency in the messages. Designs of parallel processors use specialbuses likecrossbar so that the communication overhead will be small but it is the parallel algorithm that decides the volume of the traffic.

If the communication overhead of additional processors outweighs the benefit of adding another processor, one encountersparallel slowdown.

Load balancing

[edit]

Main article:Load balancing (computing)

Another problem with parallel algorithms is ensuring that they are suitablyload balanced, by ensuring thatload (overall work) is balanced, rather than input size being balanced. For example, checking all numbers from one to a hundred thousand for primality is easy to split among processors; however, if the numbers are simply divided out evenly (1–1,000, 1,001–2,000, etc.), the amount of work will be unbalanced, as smaller numbers are easier to process by this algorithm (easier to test for primality), and thus some processors will get more work to do than the others, which will sit idle until the loaded processors complete.

Distributed algorithms

[edit]

Main article:Distributed algorithm

This sectionneeds expansion. You can help byadding to it.(February 2014)

A subtype of parallel algorithms,distributed algorithms, are algorithms designed to work incluster computing anddistributed computing environments, where additional concerns beyond the scope of "classical" parallel algorithms need to be addressed.

References

[edit]

^Blelloch, Guy E.; Maggs, Bruce M."Parallel Algorithms"(PDF). USA: School of Computer Science,Carnegie Mellon University. Retrieved2015-07-27.
^Vishkin, Uzi (2009)."Thinking in Parallel: Some Basic Data-Parallel Algorithms and Techniques, 104 pages"(PDF). Class notes of courses on parallel algorithms taught since 1992 at the University of Maryland, College Park, Tel Aviv University and the Technion.
^Megson G M; Chen Xian (4 January 1997).Automatic Parallelization For A Class Of Regular Computations. World Scientific.ISBN 978-981-4498-41-8.
^Kurgalin, Sergei; Borzunov, Sergei (2020).The discrete math workbook: a companion manual using Python. Texts in Computer Science (2nd ed.). Cham, Switzerland: Springer Naturel.ISBN 978-3-030-42220-2.

External links

[edit]

Designing and Building Parallel Programs, US Argonne National Laboratory

v t e Parallel computing
General	Distributed computing Parallel computing Parallel algorithm Massively parallel Cloud computing High-performance computing Multiprocessing Manycore processor GPGPU Computer network Systolic array
Levels	Bit Instruction Thread Task Data Memory Loop Pipeline
Multithreading	Temporal Simultaneous (SMT) Simultaneous and heterogenous Speculative (SpMT) Preemptive Cooperative Clustered multi-thread (CMT) Hardware scout
Theory	PRAM model PEM model Analysis of parallel algorithms Amdahl's law Gustafson's law Cost efficiency Karp–Flatt metric Slowdown Speedup
Elements	Process Thread Fiber Instruction window Array
Coordination	Multiprocessing Memory coherence Cache coherence Cache invalidation Barrier Synchronization Application checkpointing
Programming	Stream processing Dataflow programming Models Implicit parallelism Explicit parallelism Concurrency Non-blocking algorithm
Hardware	Flynn's taxonomy SISD SIMD Array processing (SIMT) Pipelined processing Associative processing MISD MIMD Dataflow architecture Pipelined processor Superscalar processor Vector processor Multiprocessor symmetric asymmetric Memory shared distributed distributed shared UMA NUMA COMA Massively parallel computer Computer cluster Beowulf cluster Grid computer Hardware acceleration
APIs	Ateji PX Boost Chapel HPX Charm++ Cilk Coarray Fortran CUDA Dryad C++ AMP Global Arrays GPUOpen MPI OpenMP OpenCL OpenHMPP OpenACC Parallel Extensions PVM pthreads RaftLib ROCm UPC TBB ZPL
Problems	Automatic parallelization Deadlock Deterministic algorithm Embarrassingly parallel Parallel slowdown Race condition Software lockout Scalability Starvation
Category: Parallel computing