% Version as of October 8, 1993
\chapter{Introduction to MPI}
\label{sec:intro}
\label{chap:intro}
%\footnotetext[1]{Version of October 8, 1993}

\section{Overview and Goals}

Message passing is a paradigm used widely on certain classes of parallel
machines, especially those with distributed memory. Although there are many
variations, the basic concept of processes communicating through messages is
well understood. Over the last ten years, substantial progress has been made
in casting significant applications in this paradigm. Each vendor has
implemented its own variant. More recently, several systems have demonstrated
that a message passing system can be efficiently and portably implemented. It
is thus an appropriate time to try to define both the syntax and semantics of
a core of library routines that will be useful to a wide range of users and
efficiently implementable on a wide range of computers.

In designing \MPI/ we have sought to make use of the most attractive features
of a number of existing message passing systems, rather than selecting one of
them and adopting it as the standard. Thus, \MPI/ has been strongly influenced
by work at the IBM T. J. Watson Research Center \cite{IBM-report1,IBM-report2},
Intel's NX/2 \cite{NX-2}, Express \cite{express}, nCUBE's Vertex \cite{Vertex},
p4 \cite{p4paper,p4manual}, and PARMACS \cite{parmacs1,parmacs2}. Other
important contributions have come from Zipcode \cite{zipcode1,zipcode2},
Chimp \cite{chimp1,chimp2}, PVM \cite{PVM2,PVM1}, Chameleon
\cite{chameleon-user-ref}, and PICL \cite{picl}.

The \MPI/ standardization effort involved about 60 people from 40
organizations, mainly from the United States and Europe. Most of the major
vendors of concurrent computers were involved in \MPI/, along with researchers
from universities, government laboratories, and industry. The standardization
process began with the Workshop on Standards for Message Passing in a
Distributed Memory Environment, sponsored by the Center for Research on
Parallel Computation, held April 29--30, 1992, in Williamsburg, Virginia
\cite{walker92b}. At this workshop the basic features essential to a standard
message passing interface were discussed, and a working group was established
to continue the standardization process.

A preliminary draft proposal, known as MPI1, was put forward by Dongarra,
Hempel, Hey, and Walker in November 1992, and a revised version was completed
in February 1993 \cite{mpi1}. MPI1 embodied the main features that were
identified at the Williamsburg workshop as being necessary in a message
passing standard. Since MPI1 was primarily intended to promote discussion and
``get the ball rolling,'' it focused mainly on point-to-point communications.
MPI1 brought to the forefront a number of important standardization issues,
but did not include any collective communication routines and was not
thread-safe.

In November 1992, a meeting of the \MPI/ working group was held in
Minneapolis, at which it was decided to place the standardization process on
a more formal footing, and to generally adopt the procedures and organization
of the High Performance Fortran Forum. Subcommittees were formed for the
major component areas of the standard, and an email discussion service was
established for each. In addition, the goal of producing a draft \MPI/
standard by the Fall of 1993 was set.
To achieve this goal the \MPI/ working group met every six weeks for two days
throughout the first nine months of 1993, and presented the draft \MPI/
standard at the Supercomputing 93 conference in November 1993. These meetings
and the email discussion together constituted the \MPI/ Forum, membership of
which has been open to all members of the high performance computing
community.

The main advantages of establishing a message-passing standard are
portability and ease of use. In a distributed memory communication
environment in which the higher level routines and/or abstractions are built
upon lower level message passing routines, the benefits of standardization
are particularly apparent. Furthermore, the definition of a message passing
standard, such as that proposed here, provides vendors with a clearly defined
base set of routines that they can implement efficiently, or in some cases
provide hardware support for, thereby enhancing scalability.

The goal of the Message Passing Interface, simply stated, is to develop a
widely used standard for writing message-passing programs. As such, the
interface should establish a practical, portable, efficient, and flexible
standard for message passing. A complete list of goals follows.

\begin{itemize}
\item Design an application programming interface (not necessarily for
compilers or a system implementation library).
\item Allow efficient communication: avoid memory-to-memory copying, allow
overlap of computation and communication, and allow offload to a
communication co-processor, where available.
\item Allow for implementations that can be used in a heterogeneous
environment.
\item Allow convenient C and Fortran 77 bindings for the interface.
\item Assume a reliable communication interface: the user need not cope with
communication failures. Such failures are dealt with by the underlying
communication subsystem.
\item Define an interface that is not too different from current practice,
such as PVM, NX, Express, p4, etc., and that provides extensions allowing
greater flexibility.
\item Define an interface that can be implemented on many vendors' platforms,
with no significant changes in the underlying communication and system
software.
\item Semantics of the interface should be language independent.
\item The interface should be designed to allow for thread safety.
\end{itemize}

\section{Who Should Use This Standard?}

This standard is intended for use by all those who want to write portable
message-passing programs in Fortran 77 and C. This includes individual
application programmers, developers of software designed to run on parallel
machines, and creators of environments and tools. In order to be attractive
to this wide audience, the standard must provide a simple, easy-to-use
interface for the basic user while not semantically precluding the
high-performance message-passing operations available on advanced machines.

\section{What Platforms Are Targets For Implementation?}

The attractiveness of the message-passing paradigm at least partially stems
from its wide portability. Programs expressed this way may run on
distributed-memory multiprocessors, networks of workstations, and
combinations of these. In addition, shared-memory implementations are
possible. The paradigm will not be made obsolete by architectures combining
the shared- and distributed-memory views, or by increases in network speeds.
It thus should be both possible and useful to implement this standard on a
great variety of machines, including those ``machines'' consisting of
collections of other machines, parallel or not, connected by a communication
network.

The interface is suitable for use by fully general MIMD (multiple
instruction, multiple data) programs, as well as those written in the more
restricted SPMD (single program, multiple data) style. Although no explicit
support for threads is provided, the interface has been designed so as not to
prejudice their use. With this version of \MPI/, no support is provided for
dynamic spawning of tasks.

\MPI/ provides many features intended to improve performance on scalable
parallel computers with specialized interprocessor communication hardware.
Thus, we expect that native, high-performance implementations of \MPI/ will
be provided on such machines. At the same time, implementations of \MPI/ on
top of standard Unix interprocessor communication protocols will provide
portability to workstation clusters and heterogeneous networks of
workstations. Several proprietary, native implementations of \MPI/ and a
public domain, portable implementation of \MPI/ are in progress at the time
of this writing \cite{MPIF,ANLMSUMPI}.

\section{What Is Included In The Standard?}

The standard includes:
\begin{itemize}
\item Point-to-point communication
\item Collective operations
\item Process groups
\item Communication contexts
\item Process topologies
\item Bindings for Fortran 77 and C
\item Environmental management and inquiry
\item Profiling interface
\end{itemize}

\section{What Is Not Included In The Standard?}

The standard does not specify:
\begin{itemize}
\item Explicit shared-memory operations
\item Operations that require more operating system support than is
currently standard; for example, interrupt-driven receives, remote execution,
or active messages
\item Program construction tools
\item Debugging facilities
\item Explicit support for threads
\item Support for task management
\item I/O functions
\end{itemize}

There are many features that were considered but not included in this
standard. This happened for a number of reasons, one of which is the time
constraint that was self-imposed in finishing the standard. Features that are
not included can always be offered as extensions by specific implementations.
Perhaps future versions of \MPI/ will address some of these issues.

\section{Organization of this Document}

The following is a list of the remaining chapters in this document, along
with a brief description of each.

\begin{itemize}
\item
Chapter \ref{sec:terms}, {\sf MPI Terms and Conventions}, explains notational
terms and conventions used throughout the \MPI/ document.
\item
Chapter \ref{sec:pt2pt}, {\sf Point to Point Communication}, defines the
basic, pairwise communication subset of \MPI/. The {\em send} and {\em
receive} operations are found here, along with many associated functions
designed to make basic communication powerful and efficient. (A minimal
example using these operations appears at the end of this chapter.)
\item
Chapter \ref{sec:coll}, {\sf Collective Communications}, defines
process-group collective communication operations. Well-known examples of
this are barrier and broadcast over a group of processes (not necessarily all
the processes).
\item
Chapter \ref{sec:context}, {\sf Groups, Contexts, and Communicators}, shows
how groups of processes are formed and manipulated, how unique communication
contexts are obtained, and how the two are bound together into a {\em
communicator}.
\item
Chapter \ref{sec:topol}, {\sf Process Topologies}, explains a set of utility
functions meant to assist in the mapping of process groups (linearly ordered
sets) to richer topological structures such as multi-dimensional grids.
\item
Chapter \ref{sec:environment}, {\sf MPI Environmental Management}, explains
how the programmer can manage and make inquiries of the current \MPI/
environment. These functions are needed for the writing of correct, robust
programs, and are especially important for the construction of highly
portable message-passing programs.
\item
Chapter \ref{sec:prof}, {\sf Profiling Interface}, explains a simple
name-shifting convention that any \MPI/ implementation must support. One
motivation for this is the ability to put performance profiling calls into
\MPI/ without the need for access to the \MPI/ source code. The name shift is
merely an interface; it says nothing about how the actual profiling should be
done, and in fact the name shift can be useful for other purposes. (A minimal
sketch of such a wrapper appears after this list.)
%\item
%Chapter \ref{sec:iis}, {\sf Initial Implementation Subset},
%suggests to \MPI/ implementors a ``core'' subset of \MPI/ that is useful,
%internally consistent, and should appear first in the evolution
%of an \MPI/ implementation. The subset is defined so that consistent
%implementations can appear rapidly, and portable parallel programming
%with \MPI/ can start in a timely manner.
\item
Annex \ref{sec:lang}, {\sf Language Bindings}, gives specific syntax in
Fortran 77 and C for all \MPI/ functions, constants, and types.
\item
The {\sf MPI Function Index} is a simple index showing the location of the
precise definition of each \MPI/ function, together with both C and Fortran
bindings.
\end{itemize}
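As a sketch of how the profiling name shift can be used, a profiling library
might supply its own version of {\tt MPI\_Send} that records the call and
then invokes the name-shifted entry point {\tt PMPI\_Send}. The counter shown
is purely illustrative; any profiling action could be substituted.
\begin{verbatim}
#include "mpi.h"

static int send_calls = 0;   /* illustrative counter kept by the profiler */

/* The profiling library supplies its own MPI_Send ...            */
int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    send_calls++;            /* profiling action: count the calls  */
    /* ... and forwards to the name-shifted entry point.           */
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}
\end{verbatim}
Linked ahead of the \MPI/ library, this version intercepts every call to
{\tt MPI\_Send}, while all other \MPI/ functions are unaffected.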
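Finally, to give the flavor of the C binding and of point-to-point
communication, the following minimal program sketch passes a single integer
from process 0 to process 1. It assumes at least two processes; the functions
used are defined in Chapter \ref{sec:pt2pt}, and their precise C bindings
appear in Annex \ref{sec:lang}.
\begin{verbatim}
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, value;
    MPI_Status status;

    MPI_Init(&argc, &argv);                 /* start up MPI          */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which process am I?   */

    if (rank == 0) {
        value = 42;                         /* arbitrary payload     */
        /* send one integer to process 1, with message tag 0        */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* receive one integer from process 0, matching tag 0       */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
        printf("process 1 received %d\n", value);
    }

    MPI_Finalize();                         /* shut down MPI         */
    return 0;
}
\end{verbatim}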