MPI Implementation News

The initial public, portable implementation of MPI, scheduled for release at SuperComputing '93, is just going to make it, subject to lots of reservations. The purpose of this note is to fill you in on the current state of the implementation effort.

This has become a joint effort between Argonne (Bill Gropp and Rusty Lusk) and Mississippi State (Tony Skjellum and Nathan Doss). Hubertus Franke of IBM has also made major contributions. Of course the hardest work was getting the spec right, which was done by the MPI Forum as a whole.

We are very much in the middle of the implementation effort, and wish we had another couple of weeks before having to give this out, but now is the time, given the publicity that will be generated at SuperComputing '93 and the MPI workshop. We plan to tell everyone how to ftp the implementation at SC'93.

The implementation at this point contains the C library for much of the point-to-point chapter, most of the collective and context chapters, the topology chapter, and a small amount of the environment chapter. A small number of Fortran-callable versions of the library functions have been generated. It does not have the fancy datatypes yet. It contains several test programs, but has not been thoroughly tested. One major application at Argonne has been ported to this version of MPI, and other applications have been ported at IBM.

The implementation is not complete, robust, or particularly efficient. On the other hand, enough of it is there that applications can begin to be ported to it, and it is becoming more complete, robust, and efficient every day. Things are moving rapidly. Details follow below.

Our implementation is based on the abstract device interface that we talked about at the MPI meetings several times. MPI is implemented on top of the device interface, and we have implemented the device interface on top of Chameleon, Bill Gropp's lightweight portability package.
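The layering just described (MPI on an abstract device interface, the device interface on Chameleon, Chameleon on a vendor library or p4/PVM) can be sketched as in the toy C fragment below. All function names here are hypothetical, chosen only to illustrate the idea; they are not the actual internal interfaces of mpich or Chameleon.

```c
/* Toy sketch of the layered design: an MPI-style call sits on an
   abstract device interface, which in turn sits on a Chameleon-like
   transport.  All names are hypothetical -- the real interfaces are
   internal to mpich and Chameleon. */
#include <stdio.h>

/* Bottom layer: stands in for Chameleon, which would call a vendor
   library on an individual parallel machine, or p4/PVM on a network
   of workstations. */
static int transport_send( int dest, const void *buf, int len )
{
    printf( "transport: %d bytes to process %d\n", len, dest );
    return len;
}

/* Middle layer: the abstract device interface.  A vendor producing an
   MPI implementation need supply only this part, on top of native
   primitives. */
static int device_send( int dest, const void *buf, int len )
{
    return transport_send( dest, buf, len );
}

/* Top layer: an MPI-style send implemented on the device interface.
   Returns the number of bytes handed to the device. */
static int toy_mpi_send( const void *buf, int count, int elsize, int dest )
{
    return device_send( dest, buf, count * elsize );
}
```

A call such as toy_mpi_send( &value, 1, sizeof(int), 1 ) would then pass one integer down through the device layer to the transport. The point of the design is that only the bottom layer changes from machine to machine.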
It in turn uses vendor libraries on individual parallel machines, and either p4 or PVM on networks of workstations. Our testing so far has been with the p4 version; we hope to have the PVM version, and the Intel nX version, working by Saturday.

As of right now (Thursday night, November 11) the implementation has been tested only on networks of Suns, networks of RS/6000's, and the IBM SP-1. Maybe we (or others) can hack the CM-5, Meiko, and nCube versions during the conference. No obstacles are foreseen, given that Chameleon and p4 already run on those machines.

The purpose of this initial implementation is to advance the cause of MPI by:

1. demonstrating the implementability of MPI and contributing to shaking down the specification,
2. providing an avenue for applications to begin the porting process very early, and
3. providing a shortcut path for vendor implementations: vendors need provide only the device part of the implementation.

Longer-term goals are to provide a portable MPI implementation for heterogeneous networks of workstations and to motivate research into device abstractions for high-performance message passing.

The current implementation has the following limitations, as of November 11, 1993:

1. All the code written so far is written for clarity and simplicity rather than efficiency.
2. The code is not thoroughly or systematically tested.
3. Only elementary datatypes are supported.
4. Messages longer than 100K are not yet supported.
5. Synchronous and ready modes are not yet supported.
6. Assorted functions are still missing; new functions are added every few days.
7. Only a very few of the functions have Fortran-callable versions.

On the other hand, here are a few good aspects of the current state:

1. Enough functions have been implemented both to port existing codes and to experiment with the new ideas in MPI.
2. The C code has been prepared for automatic generation of Fortran wrappers, so the Fortran-callable versions should appear quickly.
3. The Chameleon-p4 (and soon, Chameleon-PVM) implementation is portable to virtually any parallel machine and to heterogeneous networks of workstations. The implementation is not yet highly optimized, but it is not naive.
4. A major application (a nuclear-structure code from Argonne's Physics Division) has been ported to MPI and runs on the IBM SP-1.

So that's where we stand. I hope to see many of you in Portland. Encourage people you see to come to the MPI Workshop on Friday morning. I have inside information that it will be great!

Regards,
Rusty (and Bill and Tony and Nathan)

What is here
============

This directory contains files related to the Test Implementation of the MPI (Message-Passing Interface) Draft Standard. They are:

   mpich.tar.Z     - the implementation itself
   chameleon.tar.Z - the Chameleon system, necessary to build and run
                     the implementation
   p4-1.3.tar.Z    - the p4 parallel programming system.  To run the MPI
                     test implementation on a network, you need either
                     this or PVM.

How to try out the test implementation
======================================

Create your MPI root directory (such as ~/mpitest). Ftp the files chameleon.tar.Z, mpich.tar.Z, and optionally p4-1.3.tar.Z to that directory (you may also use PVM 2.4.x with Chameleon). Uncompress the files (uncompress chameleon.tar.Z, etc.) and untar them (tar xf chameleon.tar, etc.).

You'll need to edit two files: mpich/makefile and mpich/examples/makefile. Change /home/gropp/tools.n to the root directory of Chameleon (usually .../tools.core); we'll make this nicer later. Change /home/lusk/mpich to the root directory for mpich.

To make p4:

   cd p4-1.3
   make MACHINE=SUN             (for example)
   cd ..

To make Chameleon:

   setenv P4DIR `pwd`/p4-1.3    (for example)
   cd tools.core                (the home of Chameleon)
   bin/install sun4 -libs g     (for example; for P4 or PVM)
   cd ..

(See the readme in the top-level Chameleon directory for more information, particularly for the "hosts" file.)

To make MPI and the examples, see the file README in this directory.
To run the C examples:

   cd mpich/examples
   first -np 2 -mem 4     (for example)
   second -np 2 -mem 4

To run the Fortran example:

   cd mpich/examples
   secondf -np 2 -mem 4   (for example)

You can add the option "-trace" to get a trace of the communication operations.

Some notes on using Chameleon
=============================

There is a "readme" in ./chameleon that you should look at. For networks, Chameleon uses a "hosts" file (in ./chameleon/comm/hosts) that contains the names of available machines as well as some limits on their use. The file ./chameleon/comm/hosts.sample contains a sample hosts file.

On parallel machines (like the i860 and CM-5), the hosts file is ignored. Instead, you must get your program loaded on the parallel machine. For example, to build and run the C examples for the i860, do:

   cd mpi/examples_c
   make BOPT=g ARCH=intelnx
   getcube -t d1
   load ring -np 2
   load twin -np 2
   relcube

(You will have to build Chameleon and MPI for the intelnx.)

The PVM version is built in much the same way as the p4 version; just do setenv PVMDIR before building Chameleon, and use "COMM=pvm" instead of "COMM=p4".
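For reference, a minimal program in the style of the examples above might look like the following. This is only a sketch: the function names and argument orders follow the MPI draft standard itself (MPI_Init, MPI_Send, MPI_Recv, etc.), not anything specific to this test implementation, and it deliberately sticks to a single elementary int and a short message, within the limitations listed earlier.

```c
/* Minimal MPI sketch based on the draft standard: process 0 sends one
   int to process 1.  Uses only an elementary datatype and a message
   far under the current 100K limit. */
#include <stdio.h>
#include "mpi.h"

int main( argc, argv )
int argc;
char **argv;
{
    int rank, value, tag = 0;
    MPI_Status status;

    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    if (rank == 0) {
        value = 42;
        MPI_Send( &value, 1, MPI_INT, 1, tag, MPI_COMM_WORLD );
    }
    else if (rank == 1) {
        MPI_Recv( &value, 1, MPI_INT, 0, tag, MPI_COMM_WORLD, &status );
        printf( "Process 1 received %d\n", value );
    }

    MPI_Finalize();
    return 0;
}
```

Built against mpich, a program like this would be run the same way as the examples above, e.g. with -np 2.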