Message Passing Interface

From Wikipedia, the free encyclopedia

Message-passing system for parallel computers

The Message Passing Interface (MPI) is a portable message-passing standard designed to function on parallel computing architectures.^[1] The MPI standard defines the syntax and semantics of library routines that are useful to a wide range of users writing portable message-passing programs in C, C++, and Fortran. There are several open-source MPI implementations, which fostered the development of a parallel software industry, and encouraged development of portable and scalable large-scale parallel applications.

History

The message passing interface effort began in the summer of 1991 when a small group of researchers started discussions at a mountain retreat in Austria. Out of that discussion came a Workshop on Standards for Message Passing in a Distributed Memory Environment, held on April 29–30, 1992 in Williamsburg, Virginia.^[2] Attendees at Williamsburg discussed the basic features essential to a standard message-passing interface and established a working group to continue the standardization process. Jack Dongarra, Tony Hey, and David W. Walker put forward a preliminary draft proposal, "MPI1", in November 1992. In November 1992 a meeting of the MPI working group took place in Minneapolis and decided to place the standardization process on a more formal footing. The MPI working group met every 6 weeks throughout the first 9 months of 1993. The draft MPI standard was presented at the Supercomputing '93 conference in November 1993.^[3] After a period of public comments, which resulted in some changes in MPI, version 1.0 of MPI was released in June 1994. These meetings and the email discussion together constituted the MPI Forum, membership of which has been open to all members of the high-performance-computing community.

The MPI effort involved about 80 people from 40 organizations, mainly in the United States and Europe. Most of the major vendors of concurrent computers were involved in the MPI effort, collaborating with researchers from universities, government laboratories, and industry.

MPI provides parallel hardware vendors with a clearly defined base set of routines that can be efficiently implemented. As a result, hardware vendors can build upon this collection of standard low-level routines to create higher-level routines for the distributed-memory communication environment supplied with their parallel machines. MPI provides a simple-to-use portable interface for the basic user, yet one powerful enough to allow programmers to use the high-performance message passing operations available on advanced machines.

In an effort to create a universal standard for message passing, researchers did not base it on a single system; instead, they incorporated the most useful features of several systems, including those designed by IBM, Intel, nCUBE, PVM, Express, P4 and PARMACS. The message-passing paradigm is attractive because of its wide portability: it can be used for communication on distributed-memory and shared-memory multiprocessors, networks of workstations, and combinations of these elements. The paradigm can apply in multiple settings, independent of network speed or memory architecture.

Support for MPI meetings came in part from DARPA and from the U.S. National Science Foundation (NSF) under grant ASC-9310330, NSF Science and Technology Center Cooperative agreement number CCR-8809615, and from the European Commission through Esprit Project P6643. The University of Tennessee also made financial contributions to the MPI Forum.

In subsequent years the MPI standard has evolved through multiple major and minor revisions, each introducing new capabilities and improvements. For example, MPI-3.0 introduced nonblocking collective operations, enhancements to one-sided communication, and updated language bindings; MPI-4.0 added large-count routines, persistent collectives, and refined initialization methods; and MPI-5.0 introduced a standardized application binary interface (ABI) to improve interoperability between implementations.^[4]

Overview

MPI is a communication protocol for programming^[5] parallel computers. Both point-to-point and collective communication are supported. MPI "is a message-passing application programmer interface, together with protocol and semantic specifications for how its features must behave in any implementation".^[6] MPI's goals are high performance, scalability, and portability. MPI remained the dominant model used in high-performance computing as of 2006.^[7]

The standard’s goals include high performance, scalability, and portability across parallel computing architectures. Although not governed by a formal international standards body, MPI is widely regarded as a de facto standard for message passing in high-performance computing applications.^[8]

MPI is not sanctioned by any major standards body; nevertheless, it has become a de facto standard for communication among processes that model a parallel program running on a distributed memory system. Actual distributed memory supercomputers such as computer clusters often run such programs.

The principal MPI-1 model has no shared memory concept, and MPI-2 has only a limited distributed shared memory concept. Nonetheless, MPI programs are regularly run on shared-memory computers, and both MPICH and Open MPI can use shared memory for message transfer if it is available.^[9]^[10] Designing programs around the MPI model (contrary to explicit shared memory models) has advantages when running on NUMA architectures, since MPI encourages memory locality. Explicit shared-memory programming was introduced in MPI-3.^[11]^[12]^[13]

Although MPI belongs in layers 5 and higher of the OSI Reference Model, implementations may cover most layers, with sockets and Transmission Control Protocol (TCP) used in the transport layer.

Most MPI implementations consist of a specific set of routines directly callable from C, C++, Fortran (i.e., an API) and any language able to interface with such libraries, including C#, Java or Python. The advantages of MPI over older message-passing libraries are portability (because MPI has been implemented for almost every distributed-memory architecture) and speed (because each implementation is in principle optimized for the hardware on which it runs).

MPI uses Language Independent Specifications (LIS) for calls and language bindings. The first MPI standard specified ANSI C and Fortran 77 bindings together with the LIS. The draft was presented at Supercomputing 1994 (November 1994)^[14] and finalized soon thereafter. About 128 functions constitute the MPI-1.3 standard, which was released in 2008 as the final version of the MPI-1 series.^[15]

At present, the standard has several versions: version 1.3 (commonly abbreviated MPI-1), which emphasizes message passing and has a static runtime environment; MPI-2.2 (MPI-2), which includes new features such as parallel I/O, dynamic process management and remote memory operations;^[16] and MPI-3.1 (MPI-3), which includes extensions to the collective operations with non-blocking versions and extensions to the one-sided operations.^[17] MPI-2's LIS specifies over 500 functions and provides language bindings for ISO C, ISO C++, and Fortran 90. Object interoperability was also added to allow easier mixed-language message-passing programming. A side effect of standardizing MPI-2, completed in 1996, was a clarification of the MPI-1 standard, which produced MPI-1.2.

MPI-2 is mostly a superset of MPI-1, although some functions have been deprecated. MPI-1.3 programs still work under MPI implementations compliant with the MPI-2 standard.

MPI-3.0 introduces significant updates to the MPI standard, including nonblocking versions of collective operations, enhancements to one-sided operations, and a Fortran 2008 binding. It removes deprecated C++ bindings and various obsolete routines and objects. Importantly, any valid MPI-2.2 program that avoids the removed elements is also valid in MPI-3.0.

MPI-3.1 is a minor update focused on corrections and clarifications, particularly for Fortran bindings. It introduces new functions for manipulating MPI_Aint values, nonblocking collective I/O routines, and methods for retrieving index values by name for MPI_T performance variables. Additionally, a general index was added. All valid MPI-3.0 programs are also valid in MPI-3.1.

MPI-4.0 is a major update that introduces large-count versions of many routines, persistent collective operations, partitioned communications, and a new MPI initialization method. It also adds application info assertions and improves error handling definitions, along with various smaller enhancements. Any valid MPI-3.1 program is compatible with MPI-4.0.

MPI-4.1 is a minor update focused on corrections and clarifications to the MPI-4.0 standard. It deprecates several routines, the MPI_HOST attribute key, and the mpif.h Fortran include file. A new routine has been added to inquire about the hardware running the MPI program. Any valid MPI-4.0 program remains valid in MPI-4.1.

MPI-5.0 is a major update that introduces an application binary interface. This allows for increased interoperability of MPI libraries from different MPI vendors, as well as increased performance in containerized environments.^[18]

MPI is often compared with Parallel Virtual Machine (PVM), which was a popular distributed environment and message-passing system developed in 1989, and which was one of the systems that motivated the need for standard parallel message passing. Threaded shared-memory programming models (such as Pthreads and OpenMP) and message-passing programming (MPI/PVM) can be considered complementary and have been used together on occasion in, for example, servers with multiple large shared-memory nodes.

Functionality

The MPI interface is meant to provide essential virtual topology, synchronization, and communication functionality between a set of processes (that have been mapped to nodes/servers/computer instances) in a language-independent way, with language-specific syntax (bindings), plus a few language-specific features. MPI programs always work with processes, but programmers commonly refer to the processes as processors. Typically, for maximum performance, each CPU (or core in a multi-core machine) will be assigned just a single process. This assignment happens at runtime through the agent that starts the MPI program, normally called mpirun or mpiexec.

MPI supports both point-to-point communication, where messages are exchanged directly between pairs of processes, and collective communication, where groups of processes cooperate on operations such as broadcast and reduction. Language bindings allow MPI routines to be used from multiple programming languages, with the C and Fortran bindings being the most common.^[19]

MPI library functions include, but are not limited to, point-to-point rendezvous-type send/receive operations, choosing between a Cartesian or graph-like logical process topology, exchanging data between process pairs (send/receive operations), combining partial results of computations (gather and reduce operations), synchronizing nodes (barrier operation) as well as obtaining network-related information such as the number of processes in the computing session, current processor identity that a process is mapped to, neighboring processes accessible in a logical topology, and so on. Point-to-point operations come in synchronous, asynchronous, buffered, and ready forms, to allow both relatively stronger and weaker semantics for the synchronization aspects of a rendezvous-send. Many pending operations are possible in asynchronous mode, in most implementations.

MPI-1 and MPI-2 both enable implementations that overlap communication and computation, but practice and theory differ. MPI also specifies thread safe interfaces, which have cohesion and coupling strategies that help avoid hidden state within the interface. It is relatively easy to write multithreaded point-to-point MPI code, and some implementations support such code. Multithreaded collective communication is best accomplished with multiple copies of Communicators, as described below.

Concepts

MPI provides several features. The following concepts provide context for all of those abilities and help the programmer to decide what functionality to use in their application programs. Four of MPI's eight basic concepts are unique to MPI-2.

Communicator

Communicator objects connect groups of processes in the MPI session. Each communicator gives each contained process an independent identifier and arranges its contained processes in an ordered topology. MPI also has explicit groups, but these are mainly good for organizing and reorganizing groups of processes before another communicator is made. MPI understands single group intracommunicator operations, and bilateral intercommunicator communication. In MPI-1, single group operations are most prevalent. Bilateral operations mostly appear in MPI-2 where they include collective communication and dynamic in-process management.

Communicators can be partitioned using several MPI commands. These commands include MPI_COMM_SPLIT, where each process joins one of several colored sub-communicators by declaring itself to have that color.

Point-to-point basics

A number of important MPI functions involve communication between two specific processes. A popular example is MPI_Send, which allows one specified process to send a message to a second specified process. Point-to-point operations, as these are called, are particularly useful in patterned or irregular communication, for example, a data-parallel architecture in which each processor routinely swaps regions of data with specific other processors between calculation steps, or a master–slave architecture in which the master sends new task data to a slave whenever the prior task is completed.

MPI-1 specifies both blocking and non-blocking point-to-point communication mechanisms, as well as the so-called 'ready-send' mechanism, whereby a send request can be made only when the matching receive request has already been made.

Collective basics

Collective functions involve communication among all processes in a process group (which can mean the entire process pool or a program-defined subset). A typical function is the MPI_Bcast call (short for "broadcast"). This function takes data from one node and sends it to all processes in the process group. A reverse operation is the MPI_Reduce call, which takes data from all processes in a group, performs an operation (such as summing), and stores the results on one node. MPI_Reduce is often useful at the start or end of a large distributed calculation, where each processor operates on a part of the data and then combines it into a result.

Other operations perform more sophisticated tasks, such as MPI_Alltoall, which rearranges n items of data such that the nth node gets the nth item of data from each.

Derived data types

Many MPI functions require specifying the type of data that is sent between processes. This is because MPI aims to support heterogeneous environments where types might be represented differently on the different nodes^[20] (for example they might be running different CPU architectures that have different endianness), in which case MPI implementations can perform data conversion.^[20] Since the C language does not allow a type itself to be passed as a parameter, MPI predefines the constants MPI_INT, MPI_CHAR, MPI_DOUBLE to correspond with int, char, double, etc.

Here is an example in C that passes arrays of ints from all processes to one. The one receiving process is called the "root" process, and it can be any designated process but normally it will be process 0. All the processes ask to send their arrays to the root with MPI_Gather, which is equivalent to having each process (including the root itself) call MPI_Send and the root make the corresponding number of ordered MPI_Recv calls to assemble all of these arrays into a larger one:^[21]

    int send_array[100];
    int root = 0; /* or whatever */
    int num_procs, *recv_array;
    MPI_Comm_size(comm, &num_procs);
    recv_array = malloc(num_procs * sizeof(send_array));
    MPI_Gather(send_array, sizeof(send_array) / sizeof(*send_array), MPI_INT,
               recv_array, sizeof(send_array) / sizeof(*send_array), MPI_INT,
               root, comm);

However, it may instead be desirable to send data as one block as opposed to 100 ints. To do this, define a "contiguous block" derived data type:

    MPI_Datatype newtype;
    MPI_Type_contiguous(100, MPI_INT, &newtype);
    MPI_Type_commit(&newtype);
    MPI_Gather(array, 1, newtype, receive_array, 1, newtype, root, comm);

For passing a class or a data structure, MPI_Type_create_struct creates an MPI derived data type from MPI's predefined data types, as follows:

    int MPI_Type_create_struct(int count, int *blocklen, MPI_Aint *disp,
                               MPI_Datatype *type, MPI_Datatype *newtype)

where:

count is the number of blocks, and specifies the length (in elements) of the arrays blocklen, disp, and type.
blocklen contains the number of elements in each block.
disp contains the byte displacement of each block.
type contains the type of the elements in each block.
newtype (an output) contains the new derived type created by this function.

The disp (displacements) array is needed for data structure alignment, since the compiler may pad the variables in a class or data structure. The safest way to find the distance between different fields is by obtaining their addresses in memory. This is done with MPI_Get_address, which normally gives the same result as C's & operator, although that might not be true when dealing with memory segmentation.^[22]

Passing a data structure as one block is significantly faster than passing one item at a time, especially if the operation is to be repeated. This is because fixed-size blocks do not require serialization during transfer.^[23]

Given the following data structures:

    struct A {
        int f;
        short p;
    };

    struct B {
        struct A a;
        int pp, vp;
    };

Here's the C code for building an MPI-derived data type:

    static const int blocklen[] = {1, 1, 1, 1};
    static const MPI_Aint disp[] = {
        offsetof(struct B, a) + offsetof(struct A, f),
        offsetof(struct B, a) + offsetof(struct A, p),
        offsetof(struct B, pp),
        offsetof(struct B, vp)
    };
    static MPI_Datatype type[] = {MPI_INT, MPI_SHORT, MPI_INT, MPI_INT};
    MPI_Datatype newtype;
    MPI_Type_create_struct(sizeof(type) / sizeof(*type), blocklen, disp, type, &newtype);
    MPI_Type_commit(&newtype);

MPI-2 concepts

One-sided communication

MPI-2 defines three one-sided communication operations, MPI_Put, MPI_Get, and MPI_Accumulate, which are a write to remote memory, a read from remote memory, and a reduction operation on the same memory across a number of tasks, respectively. Also defined are three different methods to synchronize this communication (global, pairwise, and remote locks), as the specification does not guarantee that these operations have taken place until a synchronization point.

These types of call can often be useful for algorithms in which synchronization would be inconvenient (e.g. distributed matrix multiplication), or where it is desirable for tasks to be able to balance their load while other processors are operating on data.

Dynamic process management

The key aspect is "the ability of an MPI process to participate in the creation of new MPI processes or to establish communication with MPI processes that have been started separately." The MPI-2 specification describes three main interfaces by which MPI processes can dynamically establish communications: MPI_Comm_spawn, MPI_Comm_accept/MPI_Comm_connect and MPI_Comm_join. The MPI_Comm_spawn interface allows an MPI process to spawn a number of instances of the named MPI process. The newly spawned set of MPI processes form a new MPI_COMM_WORLD intracommunicator but can communicate with the parent through the intercommunicator the function returns. MPI_Comm_spawn_multiple is an alternate interface that allows the different instances spawned to be different binaries with different arguments.^[24]

I/O

The parallel I/O feature is sometimes called MPI-IO,^[25] and refers to a set of functions designed to abstract I/O management on distributed systems to MPI, and allow files to be easily accessed in a patterned way using the existing derived datatype functionality.

The little research that has been done on this feature indicates that it may not be trivial to get high performance gains by using MPI-IO. For example, an implementation of sparse matrix-vector multiplications using the MPI I/O library shows a general behavior of minor performance gain, but these results are inconclusive.^[26] It was not until the idea of collective I/O^[27] was implemented in MPI-IO that MPI-IO started to reach widespread adoption. Collective I/O substantially boosts applications' I/O bandwidth by having processes collectively transform the small and noncontiguous I/O operations into large and contiguous ones, thereby reducing the locking and disk seek overhead. Due to its vast performance benefits, MPI-IO also became the underlying I/O layer for many state-of-the-art I/O libraries, such as HDF5 and Parallel NetCDF. Its popularity also triggered research on collective I/O optimizations, such as layout-aware I/O^[28] and cross-file aggregation.^[29]^[30]

Official implementations

The initial implementation of the MPI 1.x standard was MPICH, from Argonne National Laboratory (ANL) and Mississippi State University. IBM also was an early implementor, and most early 90s supercomputer companies either commercialized MPICH, or built their own implementation. LAM/MPI from Ohio Supercomputer Center was another early open implementation. ANL has continued developing MPICH for over a decade, and now offers MPICH-4.3.0, implementing the MPI-4.1 standard.
Open MPI (not to be confused with OpenMP) was formed by merging FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI, and is found in many TOP-500 supercomputers.

Many other efforts are derivatives of MPICH, LAM, and other works, including, but not limited to, commercial implementations from HPE, Intel, Microsoft, and NEC.

While the specifications mandate a C and Fortran interface, the language used to implement MPI is not constrained to match the language or languages it seeks to support at runtime. Most implementations combine C, C++ and assembly language, and target C, C++, and Fortran programmers. Bindings are available for many other languages, including Perl, Python, R, Ruby, Java, and CL (see Language bindings).

The ABIs of MPI implementations are roughly split between MPICH and Open MPI derivatives, so that a library from one family works as a drop-in replacement for another from the same family, but direct replacement across families is impossible. The French CEA maintains a wrapper interface to facilitate such switches.^[31]

Hardware

MPI hardware research focuses on implementing MPI directly in hardware, for example via processor-in-memory, building MPI operations into the microcircuitry of the RAM chips in each node. By implication, this approach is independent of language, operating system, and CPU, but cannot be readily updated or removed.

Another approach has been to add hardware acceleration to one or more parts of the operation, including hardware processing of MPI queues and using RDMA to directly transfer data between memory and the network interface controller without CPU or OS kernel intervention.

Compiler wrappers

mpicc (and similarly mpic++, mpif90, etc.) is a program that wraps an existing compiler to set the necessary command-line flags when compiling code that uses MPI. Typically, it adds a few flags that enable the code to be compiled and linked against the MPI library.^[32]

Language bindings

Bindings are libraries that extend MPI support to other languages by wrapping an existing MPI implementation such as MPICH or Open MPI.

Common Language Infrastructure

The two managed Common Language Infrastructure .NET implementations are Pure Mpi.NET^[33] and MPI.NET,^[34] a research effort at Indiana University licensed under a BSD-style license. MPI.NET is compatible with Mono, and can make full use of underlying low-latency MPI network fabrics.

Java

Although Java does not have an official MPI binding, several groups attempt to bridge the two, with different degrees of success and compatibility. One of the first attempts was Bryan Carpenter's mpiJava,^[35] essentially a set of Java Native Interface (JNI) wrappers to a local C MPI library, resulting in a hybrid implementation with limited portability, which also has to be compiled against the specific MPI library being used.

However, this original project also defined the mpiJava API^[36] (a de facto MPI API for Java that closely followed the equivalent C++ bindings) which other subsequent Java MPI projects adopted. One less-used API is MPJ API, which was designed to be more object-oriented and closer to Sun Microsystems' coding conventions.^[37] Beyond the API, Java MPI libraries can be either dependent on a local MPI library, or implement the message passing functions in Java, while some like P2P-MPI also provide peer-to-peer functionality and allow mixed-platform operation.

Some of the most challenging parts of Java/MPI arise from Java characteristics such as the lack of explicit pointers and the linear memory address space for its objects, which make transferring multidimensional arrays and complex objects inefficient. Workarounds usually involve transferring one line at a time and/or performing explicit de-serialization and casting at both the sending and receiving ends, simulating C or Fortran-like arrays by the use of a one-dimensional array, and pointers to primitive types by the use of single-element arrays, thus resulting in programming styles quite far from Java conventions.

Another Java message passing system is MPJ Express.^[38] Recent versions can be executed in cluster and multicore configurations. In the cluster configuration, it can execute parallel Java applications on clusters and clouds. Here Java sockets or specialized I/O interconnects like Myrinet can support messaging between MPJ Express processes. It can also utilize a native C implementation of MPI using its native device. In the multicore configuration, a parallel Java application is executed on multicore processors. In this mode, MPJ Express processes are represented by Java threads.

Julia

There is a Julia language wrapper for MPI.^[39]

MATLAB

There are a few academic implementations of MPI using MATLAB. MATLAB has its own parallel extension library implemented using MPI and PVM.

OCaml

The OCamlMPI Module^[40] implements a large subset of MPI functions and is in active use in scientific computing. An 11,000-line OCaml program was "MPI-ified" using the module, with an additional 500 lines of code and slight restructuring, and ran with excellent results on up to 170 nodes in a supercomputer.^[41]

PARI/GP

PARI/GP can be built^[42] to use MPI as its multi-thread engine, allowing parallel PARI and GP programs to run unmodified on MPI clusters.

Python

Actively maintained MPI wrappers for Python include: mpi4py,^[43] numba-mpi^[44] and numba-jax.^[45]

Discontinued developments include: pyMPI, pypar,^[46] MYMPI^[47] and the MPI submodule in ScientificPython.

R

R bindings of MPI include Rmpi^[48] and pbdMPI,^[49] where Rmpi focuses on manager-workers parallelism while pbdMPI focuses on SPMD parallelism. Both implementations fully support Open MPI or MPICH2.

Example program

Here is a "Hello, World!" program in MPI written in C. In this example, we send a "hello" message to each processor, manipulate it trivially, return the results to the main process, and print the messages.

    /* "Hello World" MPI Test Program */
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        char buf[256];
        int my_rank, num_procs;

        /* Initialize the infrastructure necessary for communication */
        MPI_Init(&argc, &argv);

        /* Identify this process */
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

        /* Find out how many total processes are active */
        MPI_Comm_size(MPI_COMM_WORLD, &num_procs);

        /* Until this point, all programs have been doing exactly the same.
           Here, we check the rank to distinguish the roles of the programs */
        if (my_rank == 0) {
            int other_rank;
            printf("We have %i processes.\n", num_procs);

            /* Send messages to all other processes */
            for (other_rank = 1; other_rank < num_procs; other_rank++) {
                sprintf(buf, "Hello %i!", other_rank);
                MPI_Send(buf, 256, MPI_CHAR, other_rank, 0, MPI_COMM_WORLD);
            }

            /* Receive messages from all other processes */
            for (other_rank = 1; other_rank < num_procs; other_rank++) {
                MPI_Recv(buf, 256, MPI_CHAR, other_rank, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("%s\n", buf);
            }
        } else {
            /* Receive message from process #0 */
            MPI_Recv(buf, 256, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            assert(memcmp(buf, "Hello ", 6) == 0);

            /* Send message to process #0 */
            sprintf(buf, "Process %i reporting for duty.", my_rank);
            MPI_Send(buf, 256, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }

        /* Tear down the communication infrastructure */
        MPI_Finalize();
        return 0;
    }

When run with 4 processes, it should produce the following output:^[50]

    $ mpicc example.c && mpiexec -n 4 ./a.out
    We have 4 processes.
    Process 1 reporting for duty.
    Process 2 reporting for duty.
    Process 3 reporting for duty.

Here, mpiexec is a command used to execute the example program with 4 processes, each of which is an independent instance of the program at run time and assigned ranks (i.e. numeric IDs) 0, 1, 2, and 3. The name mpiexec is recommended by the MPI standard, although some implementations provide a similar command under the name mpirun. MPI_COMM_WORLD is the communicator that consists of all the processes.

A single program, multiple data (SPMD) programming model is thereby facilitated, but not required; many MPI implementations allow multiple, different, executables to be started in the same MPI job. Each process has its own rank, knows the total number of processes in the world, and can communicate with the others either through point-to-point (send/receive) communication or through collective communication among the group. It is enough for MPI to provide an SPMD-style program with MPI_COMM_WORLD, its own rank, and the size of the world to allow algorithms to decide what to do. In more realistic situations, I/O is more carefully managed than in this example. MPI does not stipulate how standard I/O (stdin, stdout, stderr) should work on a given system. It generally works as expected on the rank-0 process, and some implementations also capture and funnel the output from other processes.

MPI uses the notion of process rather than processor. Program copies are mapped to processors by the MPI runtime. In that sense, the parallel machine can map to one physical processor, or to N processors, where N is the number of available processors, or even something in between. For maximum parallel speedup, more physical processors are used. This example adjusts its behavior to the size of the world N, so it also seeks to scale to the runtime configuration without compilation for each size variation, although runtime decisions might vary depending on that absolute amount of concurrency available.
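One common way for a program to adapt to the world size at run time is a block decomposition, in which each rank derives its share of the work from its rank and the number of processes. The sketch below is plain C and makes no MPI calls; the helper name block_range is hypothetical, and the choice to give the first n mod size ranks one extra item is an assumption, not something MPI prescribes.

```c
#include <stddef.h>

/* Hypothetical helper (not part of MPI): compute the half-open index
 * range [*lo, *hi) of the n items owned by process `rank` out of
 * `size` processes. The first n % size ranks get one extra item. */
void block_range(size_t n, int rank, int size, size_t *lo, size_t *hi)
{
    size_t base = n / (size_t)size;   /* items every rank receives */
    size_t extra = n % (size_t)size;  /* leftover items to distribute */
    size_t r = (size_t)rank;

    *lo = r * base + (r < extra ? r : extra);
    *hi = *lo + base + (r < extra ? 1 : 0);
}
```

In an MPI program, rank and size would come from MPI_Comm_rank and MPI_Comm_size, so the same executable divides the work correctly for any number of processes chosen at launch.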

MPI-2 adoption

Adoption of MPI-1.2 has been universal, particularly in cluster computing, but acceptance of MPI-2.1 has been more limited. Issues include:

MPI-2 implementations include I/O and dynamic process management, and the size of the middleware is substantially larger. Most sites that use batch scheduling systems cannot support dynamic process management. MPI-2's parallel I/O is well accepted.^{[citation needed]}
Many MPI-1.2 programs were developed before MPI-2. Portability concerns initially slowed adoption, although wider support has lessened this.
Many MPI-1.2 applications use only a subset of that standard (16–25 functions) with no real need for MPI-2 functionality.

Future

Some aspects of MPI's future appear solid; others less so. The MPI Forum reconvened in 2007 to clarify some MPI-2 issues and explore developments for a possible MPI-3, which resulted in versions MPI-3.0 (September 2012)^[51] and MPI-3.1 (June 2015).^[52] Development continued with the approval of MPI-4.0 on June 9, 2021,^[53] and of MPI-4.1 on November 2, 2023.^[54] MPI-5.0 was approved on June 5, 2025, bringing significant new functionality, notably the addition of a standard Application Binary Interface (ABI).^[55]

Architectures are changing, with greater internal concurrency (multi-core), better fine-grained concurrency control (threading, affinity), and more levels of memory hierarchy. Multithreaded programs can take advantage of these developments more easily than single-threaded applications. This has already yielded separate, complementary standards for symmetric multiprocessing, namely OpenMP. MPI-2 defines how standard-conforming implementations should deal with multithreaded issues, but does not require that implementations be multithreaded, or even thread-safe. MPI-3 adds the ability to use shared-memory parallelism within a node. Implementations of MPI such as Adaptive MPI, Hybrid MPI, Fine-Grained MPI, MPC and others offer extensions to the MPI standard that address different challenges in MPI.

Astrophysicist Jonathan Dursi wrote an opinion piece calling MPI obsolescent, pointing to newer technologies like the Chapel language, Unified Parallel C, Hadoop, Spark and Flink.^[56] At the same time, nearly all of the projects in the Exascale Computing Project build explicitly on MPI; MPI has been shown to scale to the largest machines as of the early 2020s and is widely expected to remain relevant for a long time to come.

References

^"Message Passing Interface :: High Performance Computing". hpc.nmsu.edu. Retrieved 2022-08-06.
^Walker DW (August 1992). Standards for message-passing in a distributed memory environment (PDF) (Report). Oak Ridge National Lab., TN (United States), Center for Research on Parallel Computing (CRPC). p. 25. OSTI 10170156. ORNL/TM-12147. Archived from the original (PDF) on 2023-11-15. Retrieved 2019-08-18.
^The MPI Forum, CORPORATE (November 15–19, 1993). "MPI: A Message Passing Interface". Proceedings of the 1993 ACM/IEEE conference on Supercomputing. Supercomputing '93. Portland, Oregon, USA: ACM. pp. 878–883. doi:10.1145/169627.169855. ISBN 0-8186-4340-4.
^"Message Passing Interface", Wikipedia, 2025-12-23, retrieved 2026-01-03
^Nielsen, Frank (2016). "2. Introduction to MPI: The Message Passing Interface". Introduction to HPC with MPI for Data Science. Springer. pp. 195–211. ISBN 978-3-319-21903-5.
^Gropp, Lusk & Skjellum 1996, p. 3.
^Sur, Sayantan; Koop, Matthew J.; Panda, Dhabaleswar K. (11 November 2006). "High-performance and scalable MPI over InfiniBand with reduced memory usage: An in-depth performance analysis". Proceedings of the 2006 ACM/IEEE conference on Supercomputing – SC '06. ACM. p. 105. doi:10.1145/1188455.1188565. ISBN 978-0769527000. S2CID 818662.
^"Message Passing Interface", Wikipedia, 2025-12-23, retrieved 2026-01-03
^KNEM: High-Performance Intra-Node MPI Communication: "MPICH2 (since release 1.1.1) uses KNEM in the DMA LMT to improve large message performance within a single node. Open MPI also includes KNEM support in its SM BTL component since release 1.5. Additionally, NetPIPE includes a KNEM backend since version 3.7.2."
^"FAQ: Tuning the run-time characteristics of MPI sm communications". www.open-mpi.org.
^The MPI-3 standard introduces another approach to hybrid programming that uses the new MPI Shared Memory (SHM) model.
^Shared Memory and MPI 3.0: "Various benchmarks can be run to determine which method is best for a particular application, whether using MPI + OpenMP or the MPI SHM extensions. On a fairly simple test case, speedups over a base version that used point to point communication were up to 5X, depending on the message."
^Using MPI-3 Shared Memory As a Multicore Programming System (PDF presentation slides).
^Table of Contents — September 1994, 8 (3-4). Hpc.sagepub.com. Retrieved on 2014-03-24.
^MPI Documents. Mpi-forum.org. Retrieved on 2014-03-24.
^Gropp, Lusk & Skjellum 1999b, pp. 4–5.
^MPI: A Message-Passing Interface Standard. Version 3.1, Message Passing Interface Forum, June 4, 2015 on www.mpi-forum.org.
^MPI Forum Meets at HLRS on Path to MPI 5.0. Retrieved 2025-09-30.
^"Message Passing Interface", Wikipedia, 2025-12-23, retrieved 2026-01-03
^"Type matching rules". mpi-forum.org.
^"MPI_Gather(3) man page (version 1.8.8)". www.open-mpi.org.
^"MPI_Get_address". www.mpich.org.
^Boost.MPI Skeleton/Content Mechanism rationale (performance comparison graphs were produced using NetPIPE)
^Gropp, Lusk & Skjellum 1999b, p. 7.
^Gropp, Lusk & Skjellum 1999b, pp. 5–6.
^"Sparse matrix-vector multiplications using the MPI I/O library" (PDF).
^"Data Sieving and Collective I/O in ROMIO" (PDF). IEEE. Feb 1999.
^Chen, Yong; Sun, Xian-He; Thakur, Rajeev; Roth, Philip C.; Gropp, William D. (Sep 2011). "LACIO: A New Collective I/O Strategy for Parallel I/O Systems". 2011 IEEE International Parallel & Distributed Processing Symposium. IEEE. pp. 794–804. CiteSeerX 10.1.1.699.8972. doi:10.1109/IPDPS.2011.79. ISBN 978-1-61284-372-8. S2CID 7110094.
^Teng Wang; Kevin Vasko; Zhuo Liu; Hui Chen; Weikuan Yu (2016). "Enhance parallel input/output with cross-bundle aggregation". The International Journal of High Performance Computing Applications. 30 (2): 241–256. doi:10.1177/1094342015618017. S2CID 12067366.
^Wang, Teng; Vasko, Kevin; Liu, Zhuo; Chen, Hui; Yu, Weikuan (Nov 2014). "BPAR: A Bundle-Based Parallel Aggregation Framework for Decoupled I/O Execution". 2014 International Workshop on Data Intensive Scalable Computing Systems. IEEE. pp. 25–32. doi:10.1109/DISCS.2014.6. ISBN 978-1-4673-6750-9. S2CID 2402391.
^cea-hpc. "cea-hpc/wi4mpi: Wrapper interface for MPI". GitHub.
^mpicc. Mpich.org. Retrieved on 2014-03-24.
^"移住の際は空き家バンクと自治体の支援制度を利用しよう - あいち移住ナビ". June 30, 2024.
^"MPI.NET: High-Performance C# Library for Message Passing". www.osl.iu.edu.
^"mpiJava Home Page". www.hpjava.org.
^"Introduction to the mpiJava API". www.hpjava.org.
^"The MPJ API Specification". www.hpjava.org.
^"MPJ Express Project". mpj-express.org.
^JuliaParallel/MPI.jl, Parallel Julia, 2019-10-03, retrieved 2019-10-08
^"Xavier Leroy - Software". cristal.inria.fr.
^Archives of the Caml mailing list > Message from Yaron M. Minsky. Caml.inria.fr (2003-07-15). Retrieved on 2014-03-24.
^"Introduction to parallel GP" (PDF). pari.math.u-bordeaux.fr.
^"MPI for Python — MPI for Python 4.1.0 documentation". mpi4py.readthedocs.io.
^"Client Challenge". pypi.org.
^"mpi4jax — mpi4jax documentation". mpi4jax.readthedocs.io.
^"Google Code Archive - Long-term storage for Google Code Project Hosting". code.google.com.
^Now part of Pydusa
^Yu, Hao (2002). "Rmpi: Parallel Statistical Computing in R". R News.
^Chen, Wei-Chen; Ostrouchov, George; Schmidt, Drew; Patel, Pragneshkumar; Yu, Hao (2012). "pbdMPI: Programming with Big Data -- Interface to MPI".
^The output snippet was produced on an ordinary Linux desktop system with Open MPI installed. Distros usually place the mpicc command into an openmpi-devel or libopenmpi-dev package, and sometimes make it necessary to run "module add mpi/openmpi-x86_64" or similar before mpicc and mpiexec are available.
^"MPI: A Message-Passing Interface Standard Version 3.0" (PDF). Archived from the original (PDF) on 2013-03-19.
^"MPI: A Message-Passing Interface Standard Version 3.1" (PDF). Archived from the original (PDF) on 2015-07-06.
^"MPI: A Message-Passing Interface Standard Version 4.0" (PDF). 2021-06-09. Archived from the original (PDF) on 2021-06-28.
^"MPI: A Message-Passing Interface Standard Version 4.1" (PDF). 2023-11-02. Archived from the original (PDF) on 2023-11-15.
^"MPI: A Message-Passing Interface Standard" (PDF). www.mpi-forum.org.
^"HPC is dying, and MPI is killing it". www.dursi.ca.

Further reading

This article is based on material taken from Message Passing Interface at the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
Aoyama, Yukiya; Nakano, Jun (1999) RS/6000 SP: Practical MPI Programming, ITSO
Foster, Ian (1995) Designing and Building Parallel Programs (Online) Addison-Wesley ISBN 0-201-57594-9, chapter 8 Message Passing Interface
Wijesuriya, Viraj Brian (2010-12-29) Daniweb: Sample Code for Matrix Multiplication using MPI Parallel Programming Approach
Using MPI series:
- Gropp, William; Lusk, Ewing; Skjellum, Anthony (1994). Using MPI: portable parallel programming with the message-passing interface. Cambridge, MA, USA: MIT Press Scientific And Engineering Computation Series. ISBN 978-0-262-57104-3.
- Gropp, William; Lusk, Ewing; Skjellum, Anthony (1999a). Using MPI, 2nd Edition: Portable Parallel Programming with the Message Passing Interface. Cambridge, MA, USA: MIT Press Scientific And Engineering Computation Series. ISBN 978-0-262-57132-6.
- Gropp, William; Lusk, Ewing; Skjellum, Anthony (1999b). Using MPI-2: Advanced Features of the Message Passing Interface. MIT Press. ISBN 978-0-262-57133-3.
- Gropp, William; Lusk, Ewing; Skjellum, Anthony (2014). Using MPI, 3rd edition: Portable Parallel Programming with the Message-Passing Interface. Cambridge, MA, USA: MIT Press Scientific And Engineering Computation Series. ISBN 978-0-262-52739-2.
Gropp, William; Lusk, Ewing; Skjellum, Anthony (1996). "A High-Performance, Portable Implementation of the MPI Message Passing Interface". Parallel Computing. 22 (6): 789–828. CiteSeerX 10.1.1.102.9485. doi:10.1016/0167-8191(96)00024-5.
Pacheco, Peter S. (1997) Parallel Programming with MPI. 500 pp. Morgan Kaufmann ISBN 1-55860-339-5.
MPI—The Complete Reference series:
- Snir, Marc; Otto, Steve W.; Huss-Lederman, Steven; Walker, David W.; Dongarra, Jack J. (1995) MPI: The Complete Reference. MIT Press, Cambridge, MA, USA. ISBN 0-262-69215-5
- Snir, Marc; Otto, Steve W.; Huss-Lederman, Steven; Walker, David W.; Dongarra, Jack J. (1998) MPI—The Complete Reference: Volume 1, The MPI Core. MIT Press, Cambridge, MA. ISBN 0-262-69215-5
- Gropp, William; Huss-Lederman, Steven; Lumsdaine, Andrew; Lusk, Ewing; Nitzberg, Bill; Saphir, William; and Snir, Marc (1998) MPI—The Complete Reference: Volume 2, The MPI-2 Extensions. MIT Press, Cambridge, MA. ISBN 978-0-262-57123-4
Firuziaan, Mohammad; Nommensen, O. (2002) Parallel Processing via MPI & OpenMP, Linux Enterprise, 10/2002
Vanneschi, Marco (1999) Parallel paradigms for scientific computing. In Proceedings of the European School on Computational Chemistry (1999, Perugia, Italy), number 75 in Lecture Notes in Chemistry, pages 170–183. Springer, 2000
Bala, Bruck, Cypher, Elustondo, A Ho, CT Ho, Kipnis, Snir (1995) "A portable and tunable collective communication library for scalable parallel computers" in IEEE Transactions on Parallel and Distributed Systems, vol. 6, no. 2, pp. 154–164, Feb 1995.

External links

Wikibooks has a book on the topic of: Message-Passing Interface
