#!/bin/sh # This is a shell archive, meaning: # 1. Remove everything above the #!/bin/sh line. # 2. Save the resulting text in a file. # 3. Execute the file with /bin/sh (not csh) to create the files: # README # make.archive # makefile # putq.c # maxparms.h # lockon.s # lockoff.s # second.f # nops.s # i1 # i2 # i3 # i4 # i5 # i6 # i7 # i8 # indx0.h # indxj.h # I1 # I2 # I3 # I4 # I5 # I6 # I7 # I8 # B1 # B2 # B3 # B4 # B5 # B6 # B7 # B8 # ftsubs.f # ts_dynamic.f # blkjac.f # pgm.big.f # speedup.f # This archive created: Wed Jul 19 14:18:15 1989 export PATH; PATH=/bin:$PATH if test -f 'README' then echo shar: over-writing existing file "'README'" fi cat << \SHAR_EOF > 'README' Directory alliant:/afs1/hanson/dirsched contains an enhanced versions of the SCHEDULE Parallel Programming Package. By way of reusable or recycling queues, applications with the new ftsubs.f are now limited to 1000 active job processes, rather than a 1000 cumulative job processes (see the new version of the demonstration program ts_dynamic.f that illustrates the use of the new SCHEDULE subroutine GETTAG and see the description below). This new version of ftsubs.f now also permits iterations of static dependency graphs (see the example blkjac.f that illustrates the use of the new SCHEDULE subroutines RESET and RSCHED). This revised version of SCHEDULE is currently only available on the Alliant. Caution: The first two arguments of NXTAG and SPAWN are now reversed from older versions to make them consistent with the static dependency subroutines DEP and PUTQ. Caution: Calls to SCHED, PUTQ and SPAWN, should include at least one parameter in the argument list. As with netlib: Caveat Emptor. FILES: ftsubs.f : FORTRAN subroutines for the recyling queues and static iteration version of SCHEDULE; Port Caution: the Alliant command is called "fortran" and not "f77". putq.c : C subroutines, including putq, spawn, sched and work. The new putq.c requires "include" files: maxparms.h, indx0.h and indxj.h (these may be easily changed from 20 to 60 parameters for sched, putq and spawn calls; CAUTION: there is usually more overhead in subroutine argument passing than common argument passing, so use rely more on common unless the job is big enough to underwrite the extra overheadi or subroutine arguments can not be avoided). make.archive : makefile to compile and archive the SCHEDULE Programming Package. For Example, make -f make.archive lib produces the archive library for schedule, containing the object modules ftsubs.o, putq.o, nops.o, second.o, lockoff.o and lockon.o, provided their source modules exist or the object module exist in the archive file sched.a. Port Caution: second.f calls Alliant timer ETIME. Additional files: second.f, lockon.s, lockoff.s, nops.s. makefile : makefile for compiling user source files. pgm.f : current user source application calling SCHEDULE; compile by [cp [user source].f pgm.f] make run >& mrun.l & using the makefile in the directory with run as the replacement name for the default execution module a.out with mrun.l containing the compiler FORTRAN listing with errors if any; execute by run < i[j] > o[j] & in the background with input file i[j] and outfile o[j] where [j] = 1 to max[physical processors], for example; files beginning with the character o can be used for speedup calculations. for benchmarking use the form: execute -c[j] run < i[j] > o[j] & on the Alliant only. ts_dynamic.f : example program illustrating use of the new SCHEDULE with the triangular stuffer demonstration program; note especially the new subroutine GETTAG that gets a SCHEDULE generated job for each of the user's processes; also the subroutine NAME passes a 6 character string as the name of the current subroutine being passed to PUTQ (or SPAWN, in the dynamic case), but is only used in the SUN workstation sched.trace facility; note that the arguments of NXTAG and SPAWN have been reordered to be more like that of DEP and PUTQ. Copy in current source program pgm.f and run as instructed above. When pgm.f = ts_dynamic.f, the input file i[j] has the form: [j] = n_processors] [n_array_size] [n_work_iterations] for each [j]; blkjac.f : sample FORTRAN static iteration driver for ftsubs.f with block Jacobi iteration of a variable coefficient EPDE; run by the steps: cp blkjac.f pgm.f make run >& mrun.l & run < I[j] > o[j] & where [j] = 1 to 8 on the FX/8, for example; the input data file I[j] should have the form: [n_processors] [n_x_size] [n_y_size] [n_x_blocks] [n_y_blocks] [max_iterations] [n_result_precision] but a single line is suitable; up to 10 X 10 blocks are permitted. The new SCHEDULE subroutine RSET marks processes that will take part in an iteration. Another new SCHEDULE subroutine RSCHED restores only those parameters, such as ICANGO, that have changed; NSLOTS have been increased to 105 to permit at most 10 X 10 block iterations. ftsubs.graph.f : FORTRAN subroutines as in ftsubs.f, except they produce output in a file trace.graph usable in the SUN trace facility; construct from ftsubs.f using ":%s/cgraph//g" in vi or ex; compile source program pgm.f and execute by make rg <& mrg.l & rg < i[j] > o[j] & for example; to execute on an ACRF SUN use /usr/alcaid1/brewer/SCHED.TRACE/sched.trace and follow the menu. ftsubs.graph.f outputs the file: trace.graph. trace.graph is the primary input to sched.trace. ftsubs.term.f : FORTRAN subroutines as in ftsubs.graph.f, but outputs detailed terminal readable trace information about SCHEDULE execution in the file term.trace; SUN trace facility; construct from ftsubs.f using ":%s/cterm//g" in vi or ex; compile source program pgm.f and execute by make rt>&mrt.l& rto[j]& for example. ftsubs.big.f : similar to ftsubs.f, except that 3000 active processes can be run at any one time; the user creates ftsubs.big.f from ftsubs.f using the UNIX vi or ex editor using the commands on ftsubs.f: :%s/mxprcs = 1000/mxprcs = 3000/g :%s/iprcs = 200/iprcs = 300/g :%s/nslots = 105/nslots = 30/g :w ftsubs.big.f :q ftsubs.big.f must be used with putq.big.c and with object modules archived in big.a; depending where these files are ported, some or all of the following steps are needed: make -f make.archive big >& mbig.l & make rbig >& mrbig.l & rbig < B[j] > O[j] & assuming the files: make.archive, ftsubs.big.f, putq.big.c, makefile, lockon.s, lockoff.s, nops.s, maxparms.h, indx0.h, indxj.h, pgm.big.f (generic user code consisting here of sched stuffer demo code), B[j] (sample data files for stuffer demo code on [j] processors here). putq.big.c : C subroutines going with ftsubs.big.f; the user creates putq.big.c by using the vi or ex editor on putq.c with the commands: :%s/1001/3001/g :w putq.big.c :q or other changes in size that are compatible with the local system and the corresponding changes in ftsubs.big.f are made. speedup : execution file that calculates speedup and efficiency, from the output file o* or o1 to o8 from # markers placed in o* by the user source program; use the cmd make speedup >& speedup.l& to make it; see the example program ts-dynamic.f; execute by cat o* | grep # | speedup > sp.out; more sp.out where speedup if the excutable for speedup.f and the output is stored in the file sp.out; do not use with graphics and terminal trace versions to avoid extraneous overhead. SHAR_EOF if test -f 'make.archive' then echo shar: over-writing existing file "'make.archive'" fi cat << \SHAR_EOF > 'make.archive' FLAGS = -O -AS -recursive -c -g -l FILES = ftsubs.o putq.o second.o lockon.o lockoff.o nops.o lib : $(FILES) ar r sched.a $(FILES);ranlib sched.a FILEG = ftsubs.graph.o putq.o second.o lockon.o lockoff.o nops.o graph : $(FILEG) ar r graph.a $(FILEG);ranlib graph.a FILET = ftsubs.term.o putq.o second.o lockon.o lockoff.o nops.o term : $(FILET) ar r term.a $(FILET);ranlib term.a FILEI = ftsubs.iter.o putq.o second.o lockon.o lockoff.o nops.o iterate : $(FILEI) ar r riter.a $(FILEI);ranlib riter.a FILEB = ftsubs.big.o putq.big.o second.o lockon.o lockoff.o nops.o big : $(FILEB) ar r big.a $(FILEB);ranlib big.a FILEBG = ftsubs.biggrf.o putq.big.o second.o lockon.o lockoff.o nops.o biggrf : $(FILEBG) ar r biggrf.a $(FILEBG);ranlib biggrf.a .f.o : ; fortran $(FLAGS) $*.f .c.o : ; cc -c $*.c .s.o : ; as $*.s SHAR_EOF if test -f 'makefile' then echo shar: over-writing existing file "'makefile'" fi cat << \SHAR_EOF > 'makefile' # Test makefile for use with sched on the ACRF alliant # Change the stuff in < ... > to your specifics # # FILES = # : $(FILES) # fortran $(FLAGS) $(FILES) sched.a -o # # To use the tracing facility, change the reference to # sched.a to graph.a as shown below. After program has # executed a file named trace.graph will be produced. # SUN run: /usr/alcaid1/brewer/SCHED.TRACE/sched.trace # # FLAGS = -g -Ogv -AS -recursive -l # FILES = # LIB = /afs3/local/ftntools/schedule/lib/graph.a # : $(FILES) # fortran $(FLAGS) $(FILES) $(LIB)-o FLAGS = -g -Ogv -AS -recursive -l FILES = pgm.o run : $(FILES) fortran $(FLAGS) $(FILES) sched.a -o run FILEG = pgm.o rg : $(FILEG) fortran $(FLAGS) $(FILEG) graph.a -o rg FILET = pgm.o rt : $(FILET) fortran $(FLAGS) $(FILET) term.a -o rt FILEB = pgm.big.o rbig : $(FILEB) fortran $(FLAGS) $(FILEB) big.a -o rbig FILEBG = pgm.big.o rbiggrf : $(FILEBG) fortran $(FLAGS) $(FILEBG) biggrf.a -o rbiggrf FILESP = speedup.f speedup : $(FILESP) fortran -c $(FILESP) -l -o speedup .f.o : ; fortran $(FLAGS) -c $*.f SHAR_EOF if test -f 'putq.c' then echo shar: over-writing existing file "'putq.c'" fi cat << \SHAR_EOF > 'putq.c' #include #include "maxparms.h" /* Code: putq.c for number of active jobs up to 1000 (indx[1001]). Caution: spawn (& nxtag) are reordered to be consistent with putq (& dep) arguments. */ struct parms { int (*subname)(); long *parms[MAXPARMS]; }; struct parms indx[1001]; sched_(nprocs,parms) int *nprocs; struct parms parms; /* this procedure obtains nprocs physical processors devoted to the the execution of the parallel program indicated through parms which is a structure whose first entry is a subroutine name and whose remaining entries are parameters appearing in the calling sequence of that subroutine. */ { int libopn_(); bcopy(&parms, &indx[0], sizeof(struct parms)); /* the subroutine name and prameter list have been copied and placed in a special slot on the parmq then libopn is invoked to initialize pointers, grab physical processors and begin the computation */ libopn_(nprocs); return(0); } putq_(jobtag,parms) int *jobtag; struct parms parms; /* this procedure puts the descriptor of a schedulable process onto the problem queue. this process will be scheduled for execution when its data dependencies have been satisfied (indicated by icango==0). the argument parms is a structure whose first entry is a subroutine name and whose remaining entries are parameters appearing in the calling sequence of that subroutine. */ { register int j; int place_(); j = *jobtag; bcopy(&parms, &indx[j], sizeof(struct parms)); /* first the parms block is copied into the slot pointed to by by jobtag and then this descriptor is placed on the problem queue */ place_(jobtag); return(0); } spawn_(jobtag,parent,parms) int *jobtag,*parent; struct parms parms; /* this procedure puts the descriptor of a schedulable process onto the problem queue. this process will be scheduled for execution when its data dependencies have been satisfied (indicated by icango==0). the argument parms is a structure whose first entry is a subroutine name and whose remaining entries are parameters appearing in the calling sequence of that subroutine. the action of this procedure differs from putq in that the user does not assign jobtags or data dependencies. a parent may spawn any number of children but these child processes only report to the parent. Caution: First two arguments of NXTAG and SPAWN are reversed from older versions. */ { register int j,i; int place_(),clone_(); j = *jobtag; i = *parent; bcopy(&parms, &indx[j], sizeof(struct parms)); /* first the parms block is copied into the slot pointed to by by jobtag and then this descriptor is placed on the problem queue */ if (indx[j].subname == clone_) indx[j].subname = indx[i].subname; /* here we ask if this is a recursive spawning. if so the name clone has been called instead of subname so we replace the name clone by subname. */ place_(jobtag); return(0); } clone_() { /* this is a dummy routine to satisfy unresolved external */ return(0); } work_(id,jobtag) int *id,*jobtag; { int start2_(),gtprb_(); register int j,myjob; j = *id; if (j == 1) /* the worker whose id is 1 will execute the subroutine passed to sched. this subroutine executes the static data dependency graph. this graph must have at least one node. */ { #include "indx0.h" start2_(); } myjob = gtprb_(id,jobtag); while (myjob != 0) { j = *jobtag; if (myjob <= -1 ) { /* reenter... simple spawning was done all kids completed and no reentry is required. this indicates jobtag is all done and checkin can proceed. */ chekin_(jobtag); myjob = gtprb_(id,jobtag); } else { /* call subname().......... */ #include "indxj.h" chekin_(jobtag); myjob = gtprb_(id,jobtag); } } return(0); } SHAR_EOF if test -f 'maxparms.h' then echo shar: over-writing existing file "'maxparms.h'" fi cat << \SHAR_EOF > 'maxparms.h' #define MAXPARMS 20 SHAR_EOF if test -f 'lockon.s' then echo shar: over-writing existing file "'lockon.s'" fi cat << \SHAR_EOF > 'lockon.s' | Machine Code Listing of lockon.f | modified code to impliment a spin wait on the lock variable | sets lock on if value is zero | .data .bss _BBSS: .text _BTEXT: .globl _lockon_ _lockon_: movl a0@,a1 lp: tas a1@ bne lp rts SHAR_EOF if test -f 'lockoff.s' then echo shar: over-writing existing file "'lockoff.s'" fi cat << \SHAR_EOF > 'lockoff.s' | Machine Code Listing of lockoff.f | turns a locked variable off | sets lock variable to zero if value is not zero | .data .bss _BBSS: .text _BTEXT: .globl _lockoff_ _lockoff_: movl a0@,a1 clrl a1@ rts SHAR_EOF if test -f 'second.f' then echo shar: over-writing existing file "'second.f'" fi cat << \SHAR_EOF > 'second.f' real function second(t) real t real t1(2) t = etime(t1) t = t1(1) second = t return end SHAR_EOF if test -f 'nops.s' then echo shar: over-writing existing file "'nops.s'" fi cat << \SHAR_EOF > 'nops.s' | | .data .bss _BBSS: .text _BTEXT: .globl _nops_ _nops_: lp: | Delay to avoid hitting the lock so often nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop nop done: rts SHAR_EOF if test -f 'i1' then echo shar: over-writing existing file "'i1'" fi cat << \SHAR_EOF > 'i1' 1 44 1000 SHAR_EOF if test -f 'i2' then echo shar: over-writing existing file "'i2'" fi cat << \SHAR_EOF > 'i2' 2 44 1000 SHAR_EOF if test -f 'i3' then echo shar: over-writing existing file "'i3'" fi cat << \SHAR_EOF > 'i3' 3 44 1000 SHAR_EOF if test -f 'i4' then echo shar: over-writing existing file "'i4'" fi cat << \SHAR_EOF > 'i4' 4 44 1000 SHAR_EOF if test -f 'i5' then echo shar: over-writing existing file "'i5'" fi cat << \SHAR_EOF > 'i5' 5 44 1000 SHAR_EOF if test -f 'i6' then echo shar: over-writing existing file "'i6'" fi cat << \SHAR_EOF > 'i6' 6 44 1000 SHAR_EOF if test -f 'i7' then echo shar: over-writing existing file "'i7'" fi cat << \SHAR_EOF > 'i7' 7 44 1000 SHAR_EOF if test -f 'i8' then echo shar: over-writing existing file "'i8'" fi cat << \SHAR_EOF > 'i8' 8 44 1000 SHAR_EOF if test -f 'indx0.h' then echo shar: over-writing existing file "'indx0.h'" fi cat << \SHAR_EOF > 'indx0.h' indx[0].subname(indx[0].parms[0], indx[0].parms[1], indx[0].parms[2], indx[0].parms[3], indx[0].parms[4], indx[0].parms[5], indx[0].parms[6], indx[0].parms[7], indx[0].parms[8], indx[0].parms[9], indx[0].parms[10], indx[0].parms[11], indx[0].parms[12], indx[0].parms[13], indx[0].parms[14], indx[0].parms[15], indx[0].parms[16], indx[0].parms[17], indx[0].parms[18], indx[0].parms[19]); /* For more parms, remove comments and move paren/semicolon. indx[0].parms[20], indx[0].parms[21], indx[0].parms[22], indx[0].parms[23], indx[0].parms[24], indx[0].parms[25], indx[0].parms[26], indx[0].parms[27], indx[0].parms[28], indx[0].parms[29], indx[0].parms[30], indx[0].parms[31], indx[0].parms[32], indx[0].parms[33], indx[0].parms[34], indx[0].parms[35], indx[0].parms[36], indx[0].parms[37], indx[0].parms[38], indx[0].parms[39], indx[0].parms[40], indx[0].parms[41], indx[0].parms[42], indx[0].parms[43], indx[0].parms[44], indx[0].parms[45], indx[0].parms[46], indx[0].parms[47], indx[0].parms[48], indx[0].parms[49], indx[0].parms[50], indx[0].parms[51], indx[0].parms[52], indx[0].parms[53], indx[0].parms[54], indx[0].parms[55], indx[0].parms[56], indx[0].parms[57], indx[0].parms[58], indx[0].parms[59], indx[0].parms[60], indx[0].parms[61], indx[0].parms[62], indx[0].parms[63], indx[0].parms[64], indx[0].parms[65], indx[0].parms[66], indx[0].parms[67], indx[0].parms[68], indx[0].parms[69], indx[0].parms[70], indx[0].parms[71], indx[0].parms[72], indx[0].parms[73], indx[0].parms[74], indx[0].parms[75], indx[0].parms[76], indx[0].parms[77], indx[0].parms[78], indx[0].parms[79], indx[0].parms[80], indx[0].parms[81], indx[0].parms[82], indx[0].parms[83], indx[0].parms[84], indx[0].parms[85], indx[0].parms[86], indx[0].parms[87], indx[0].parms[88], indx[0].parms[89], indx[0].parms[90], indx[0].parms[91], indx[0].parms[92], indx[0].parms[93], indx[0].parms[94], indx[0].parms[95], indx[0].parms[96], indx[0].parms[97], indx[0].parms[98], indx[0].parms[99]); */ SHAR_EOF if test -f 'indxj.h' then echo shar: over-writing existing file "'indxj.h'" fi cat << \SHAR_EOF > 'indxj.h' indx[j].subname(indx[j].parms[0], indx[j].parms[1], indx[j].parms[2], indx[j].parms[3], indx[j].parms[4], indx[j].parms[5], indx[j].parms[6], indx[j].parms[7], indx[j].parms[8], indx[j].parms[9], indx[j].parms[10], indx[j].parms[11], indx[j].parms[12], indx[j].parms[13], indx[j].parms[14], indx[j].parms[15], indx[j].parms[16], indx[j].parms[17], indx[j].parms[18], indx[j].parms[19]); /* For more parms, remove comments and move paren/semicolon. indx[j].parms[20], indx[j].parms[21], indx[j].parms[22], indx[j].parms[23], indx[j].parms[24], indx[j].parms[25], indx[j].parms[26], indx[j].parms[27], indx[j].parms[28], indx[j].parms[29], indx[j].parms[30], indx[j].parms[31], indx[j].parms[32], indx[j].parms[33], indx[j].parms[34], indx[j].parms[35], indx[j].parms[36], indx[j].parms[37], indx[j].parms[38], indx[j].parms[39], indx[j].parms[40], indx[j].parms[41], indx[j].parms[42], indx[j].parms[43], indx[j].parms[44], indx[j].parms[45], indx[j].parms[46], indx[j].parms[47], indx[j].parms[48], indx[j].parms[49], indx[j].parms[50], indx[j].parms[51], indx[j].parms[52], indx[j].parms[53], indx[j].parms[54], indx[j].parms[55], indx[j].parms[56], indx[j].parms[57], indx[j].parms[58], indx[j].parms[59], indx[j].parms[60], indx[j].parms[61], indx[j].parms[62], indx[j].parms[63], indx[j].parms[64], indx[j].parms[65], indx[j].parms[66], indx[j].parms[67], indx[j].parms[68], indx[j].parms[69], indx[j].parms[70], indx[j].parms[71], indx[j].parms[72], indx[j].parms[73], indx[j].parms[74], indx[j].parms[75], indx[j].parms[76], indx[j].parms[77], indx[j].parms[78], indx[j].parms[79], indx[j].parms[80], indx[j].parms[81], indx[j].parms[82], indx[j].parms[83], indx[j].parms[84], indx[j].parms[85], indx[j].parms[86], indx[j].parms[87], indx[j].parms[88], indx[j].parms[89], indx[j].parms[90], indx[j].parms[91], indx[j].parms[92], indx[j].parms[93], indx[j].parms[94], indx[j].parms[95], indx[j].parms[96], indx[j].parms[97], indx[j].parms[98], indx[j].parms[99]); */ SHAR_EOF if test -f 'I1' then echo shar: over-writing existing file "'I1'" fi cat << \SHAR_EOF > 'I1' 1 100 100 10 10 100 2 SHAR_EOF if test -f 'I2' then echo shar: over-writing existing file "'I2'" fi cat << \SHAR_EOF > 'I2' 2 100 100 10 10 100 2 SHAR_EOF if test -f 'I3' then echo shar: over-writing existing file "'I3'" fi cat << \SHAR_EOF > 'I3' 3 100 100 10 10 100 2 SHAR_EOF if test -f 'I4' then echo shar: over-writing existing file "'I4'" fi cat << \SHAR_EOF > 'I4' 4 100 100 10 10 100 2 SHAR_EOF if test -f 'I5' then echo shar: over-writing existing file "'I5'" fi cat << \SHAR_EOF > 'I5' 5 100 100 10 10 100 2 SHAR_EOF if test -f 'I6' then echo shar: over-writing existing file "'I6'" fi cat << \SHAR_EOF > 'I6' 6 100 100 10 10 100 2 SHAR_EOF if test -f 'I7' then echo shar: over-writing existing file "'I7'" fi cat << \SHAR_EOF > 'I7' 7 100 100 10 10 100 2 SHAR_EOF if test -f 'I8' then echo shar: over-writing existing file "'I8'" fi cat << \SHAR_EOF > 'I8' 8 100 100 10 10 100 2 SHAR_EOF if test -f 'B1' then echo shar: over-writing existing file "'B1'" fi cat << \SHAR_EOF > 'B1' 1 1414 1000 SHAR_EOF if test -f 'B2' then echo shar: over-writing existing file "'B2'" fi cat << \SHAR_EOF > 'B2' 2 1414 1000 SHAR_EOF if test -f 'B3' then echo shar: over-writing existing file "'B3'" fi cat << \SHAR_EOF > 'B3' 3 1414 1000 SHAR_EOF if test -f 'B4' then echo shar: over-writing existing file "'B4'" fi cat << \SHAR_EOF > 'B4' 4 1414 1000 SHAR_EOF if test -f 'B5' then echo shar: over-writing existing file "'B5'" fi cat << \SHAR_EOF > 'B5' 5 1414 1000 SHAR_EOF if test -f 'B6' then echo shar: over-writing existing file "'B6'" fi cat << \SHAR_EOF > 'B6' 6 1414 1000 SHAR_EOF if test -f 'B7' then echo shar: over-writing existing file "'B7'" fi cat << \SHAR_EOF > 'B7' 7 1414 1000 SHAR_EOF if test -f 'B8' then echo shar: over-writing existing file "'B8'" fi cat << \SHAR_EOF > 'B8' 8 1414 1000 SHAR_EOF if test -f 'ftsubs.f' then echo shar: over-writing existing file "'ftsubs.f'" fi cat << \SHAR_EOF > 'ftsubs.f' CVD$G NOINLINE (DUMP,DUMP2,LOCKON,LOCKOFF,NOPS,SECOND,WORK) subroutine chekin(jobtag) Code path: alliant:/afs1/hanson/dirsched/ftsubs.f Comment: integrated iteration version of ftsubs.f and ftsubs.iter.f cont: with option to iterate a set of nodes with reset dependencies. Comment: combined graphics and terminal trace version of ftsubt.f Code parent: ftsubs.f from ~/ftntools/schedule/lib change(1): iprcs = 200 <- 120; change(2): automatic return stmt removed out of loop do 20 in chekin; change(3): installed vector-circular ready queue, cont: vector <= nproc sub-qs, elastic with nproc processors; cont: circular <= readyq free space wraps around from rtail to rhead, cont: with the top end of readyq connected to the bottom end; cont: ready(rhead(id)+ndmrsq*(id-1)) <- readyq(rhead(id),id); cont: most mxces replaced by nprocc = nproc = no. sub-qs; cont: ldimrq = leading dim of readyq = iprcs*mxces cont: ndmrsq = dim of a ready-sub-q = ldimrq/nproc cont: idrsq = id of ready-sub-q <- iwrkr; dummy iw used in do's; cont: installed SCHED ERROR flags for readyq over-runs (mtail cond.); cont: round robin test in getprb reduced to single statement. Change: corrected next in nxtag & intspn in start2 to recover lost tag. CAUTION: nxtag and spawn arguments are consistent with dep and putq cont: now, but order of arguments may not be consistent with older cont: versions of ftsubs.f. Change(4): installed circular parm queue, jobtag is the circular cont: (reusable) job tag with 1.le.jobtag.le.mxprcs, cont: snext is the schedule or sum or cumulative jobtag. Change(5): install super next tag, whereby user gets schedule job tags cont: from new schedule sub gettag; hence schedule has no knowledge cont: of user tags and consequently the principal restriction on user cont: is that there be less than "mxprcs" undone jobs at any time. cont: integer array "unitag" keeps a unique job tag for undone jobs. Change(6): install rest and save arrays for jobtags that will be cont: iterated more than once with original dependencies: ireset, cont: icnsav. install sub rsched to reset icangoes cont: and call sub place on iteration. Change(6a): nslots = 105 <- 30 to handle multiple dependencies. Change(7): installed common block CONWRT with key WRLOCK for concurrent cont: writes for use in both ftsubs.f and the user's driver code. Change(8): installed c-include indx*.h files to enable the passing of cont: up to 60 parameters with sched, putq and spawn calls (via m. cont: johnson, ssi). Change(9): installed lock initializations in libopn to make porting cont: to other machines without automatic variable initialization. CAUTION: subroutine second uses machine dependent timer, which must be cont: changed when porting to other machines. cgraphChange: install write nproc in sub libopn. cgraphChange: installed extra traces in chekin & place. cgraphChange: replaced qlock(mxprcs) by glock as igraph's own lock. cgraphChange: installed process names for Dongarra/Brewer's sched.trace. cgraphChange(8): cgraph lines made compatible for SCHED.TRACE/sched.trace. cgraphcdirectory: /usr/alcaid/brewer/SCHED.TRACE/sched.trace cgraphcomment: for graphics trace, change 'cgraph' to null '' and run. ctermComment: for terminal trace, change 'cterm' to null '' and run. CVD$R NOCONCUR integer jobtag c*********************************************************************** c c this subroutine reports unit of computation labeled by c jobtag has completed to all dependent nodes. these nodes are c recorded in parmq(i,jobtag) where 6 .le. i .le. nchks+5 c checkin consists of decrementing the value in each of these c locations by 1. each of these is done in a critical section c protected by qlock(jobtag) c c if the value in parmq(2,jobtag) is 0 where jobtag is a process c dependent upon this one then jobtag is placed on the readyq c by entering the critical section protected by trlock. the c pointer rtail to the tail of the readyq is incremented c unless task done is to be recorded. task done is placed on c the ready q and the pointer rtail left in place if nchks .eq. 0 c is found. c c see the common block description in libopn for more detail. c c*********************************************************************** parameter (mxprcs = 1000,iprcs = 200,mxces = 8,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,qlock,hrlock,trlock,intspn,rhead,rtail, & done,free,fhead,ftail,hflock,tflock,snext,unitag & ,ireset,icnsav common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) Caution: common block CONWRT is used for concurrent writes, cont: with WRLOCK as the key to the LOCK. INTEGER WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf,glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,glock,igraph(nslots,nbuffr) cterm integer endgrf,glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,glock,igraph(nslots,nbuffr) c c c common block description: c c a complete common block description is given in the routine libopn c c**************************************************************************** c c check to see if this process has completed. if not do not checkin c mtail = 0 idrsq = 0 c c first ask if any kids spawned by jobtag c if (parmq(4,jobtag) .ne. 0 .or. parmq(5,jobtag) .ne. 0 ) then c c either kids have been spawned or ientry has been referenced c indicating reentry is required c c c find out how many are waiting to complete c if (parmq(4,jobtag) .ne. 0) then call lockon(qlock(jobtag)) parmq(2,jobtag) = parmq(2,jobtag) - parmq(4,jobtag) call lockoff(qlock(jobtag)) endif c c reset number of kids c parmq(4,jobtag) = 0 c c update the number of times this procedure has been c entered c parmq(1,jobtag) = parmq(1,jobtag) + 1 c c return without checkin if all the kids have not c checked in to jobtag yet or if a reentry is required. c process jobtag is not done in either case. c comment: extra trace data. if (parmq(2,jobtag) .ne. 0) then cgraph call lockon(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call lockoff(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 7 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph else cgraphc trace for chekin/parent (entry_flag.ne.0.or.nkids.ne.0 & icango.ne.0) cgraph igraph(1,insrt) = 6 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph endif return endif c c if ientry has been called but jobtag is not waiting c on any kids then jobtag is placed back on the readyq c if ( parmq(5,jobtag) .ne. 0) then idrsq = mod((jobtag-1),nprocc) + 1 call lockon(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = jobtag rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call lockoff(trlock(idrsq)) cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 10 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(8,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.ne.0 & icango=0 & nkids=0) cterm igraph(1,insrt) = 9 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(8,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif return endif endif c c the process has completed so chekin proceeds c cgraph call lockon(glock) cgraph if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cgraph insrt = endgrf cgraph endgrf = endgrf + 1 cgraph call lockoff(glock) cgraph inext = unitag(jobtag) cgraph if (inext .ge. intspn) then cgraphc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 5 cgraph igraph(2,insrt) = parmq(6,jobtag) cgraph igraph(3,insrt) = inext cgraph igraph(4,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph else cgraphc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cgraph igraph(1,insrt) = 2 cgraph igraph(2,insrt) = inext cgraph igraph(3,insrt) = second(foo) cgraph gnames(insrt) = names(jobtag) cgraph endif c cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 5 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = jobtag cterm igraph(6,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (entry_flag.eq.0 & nkids = 0) cterm igraph(1,insrt) = 2 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = jobtag cterm igraph(5,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c c if (mtail .lt. 0) then write(6,*) '*************SCHED USER ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c nchks = parmq(3,jobtag) c c if this is the final process (indicated by nchks .eq. 0) then c record task done. do not advance the tail so task done sequence c is set. all subsequent gtprb queries will leave rhead .eq. rtail c with readyq(rhead+ndmrsq*(i-1)) .eq. done. c if (nchks .eq. 0) then do 20 iw = 1,nprocc call lockon(trlock(iw)) readyq(rtail(iw)+ndmrsq*(iw-1)) = done call lockoff(trlock(iw)) 20 continue cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 12 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = rhead(idrsq) cterm igraph(6,insrt) = rtail(idrsq) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.eq.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 11 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = rhead(idrsq) cterm igraph(5,insrt) = rtail(idrsq) cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif Change(2): removed following return from end of above loop do 20. return endif do 50 j = 6,nchks+5 mychek = parmq(j,jobtag) c c get unique access to the checkin node mychek c and checkin by decrementing the appropriate counter c mchkgo = 1 call lockon(qlock(mychek)) parmq(2,mychek) = parmq(2,mychek) - 1 mchkgo = parmq(2,mychek) call lockoff(qlock(mychek)) c c place mychek on readyq if parmq(2,mychek) is 0 c if (mchkgo .eq. 0 ) then idrsq = mod((mychek-1),nprocc) + 1 call lockon(trlock(idrsq)) if(mod(rtail(idrsq),ndmrsq) + 1 .ne. rhead(idrsq)) then readyq(rtail(idrsq)+ndmrsq*(idrsq-1)) = mychek rtail(idrsq) = mod(rtail(idrsq),ndmrsq) + 1 else mtail = -1 endif call lockoff(trlock(idrsq)) endif 50 continue c c place finished process at the end of the free list freeq c provided it will not be reset for another iteration. c if(ireset(jobtag).eq.0) then call lockon(tflock) ftail = mod(ftail,mxprcs) + 1 if(fhead.eq. ftail) free = 0 freeq(ftail) = jobtag call lockoff(tflock) endif c cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) cterm if (inext .ge. intspn) then ctermc trace for chekin/child (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 8 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = parmq(6,jobtag) cterm igraph(4,insrt) = idrsq cterm igraph(5,insrt) = fhead cterm igraph(6,insrt) = ftail cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm else ctermc trace for chekin/parent (nchks.ne.0 & nkids=0 & entry_flag=0) cterm igraph(1,insrt) = 7 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = idrsq cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail cterm igraph(6,insrt) = parmq(6,jobtag) cterm igraph(7,insrt) = jobtag cterm igraph(8,insrt) = second(foo) cterm gnames(insrt) = names(jobtag) cterm endif c if (mtail .lt. 0) then write(6,*) '*************SCHED LIMIT ERROR********************' write(6,*) ' user attempt to create too many processes' write(6,*) ' exceeding the space in a single sub-queue' write(6,*) ' the maximum allowed is ',ndmrsq,' per sub-q' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop endif c if ( free .eq. 0 ) then call lockon(WRLOCK) inext = unitag(jobtag) write(6,*) '*************SCHED ERROR*************************' write(6,*) ' more processes have checked into sub chekin,' write(6,*) ' than should be active for free slots in the' write(6,*) ' parmq parameter queue. jobs are too many.' write(6,*) ' total number of jobtags were:',inext write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine CHEKIN' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call lockoff(WRLOCK) c endif c return c c last card of chekin c end subroutine gettag(jobtag) CVD$R NOCONCUR integer jobtag c************************************************************************* c c this subroutine gets a schedule jobtag for problem on the queue, c provided a free column is available in parmq. c c************************************************************************* c parameter (mxprcs = 1000,iprcs = 200,mxces = 8,nslots = 105) parameter (nbuffr = 500,ldimrq = 8*iprcs) integer parmq,freeq,readyq,qlock,hrlock,trlock,intspn,rhead,rtail, & done,free,fhead,ftail,hflock,tflock,snext,unitag & ,ireset,icnsav common /qdata/ parmq(nslots,mxprcs),freeq(mxprcs),intspn, & readyq(ldimrq),rhead(mxces),rtail(mxces), & ndmrsq,nprocc,fhead,ftail,snext,unitag(mxprcs) common /qsync/ qlock(mxprcs),hrlock(mxces),trlock(mxces), & done,free,hflock,tflock common /qreset/ ireset(mxprcs),icnsav(mxprcs) INTEGER WRLOCK COMMON /CONWRT/ WRLOCK cgraph integer endgrf,glock cgraph real igraph cgraph character*6 names,gnames cgraph common /calls/ names(mxprcs) cgraph common /gphnam/ gnames(nbuffr) cgraph common /gphout/ endgrf,glock,igraph(nslots,nbuffr) cterm integer endgrf,glock cterm real igraph cterm character*6 names,gnames cterm common /calls/ names(mxprcs) cterm common /gphnam/ gnames(nbuffr) cterm common /gphout/ endgrf,glock,igraph(nslots,nbuffr) c c c common block description: c c for a complete common block description see the subroutine libopn c c if ( free .eq. 0 ) then call lockon(WRLOCK) write(6,*) '*************SCHED LIMIT ERROR*******************' write(6,*) ' user attempt to create to many active ' write(6,*) ' processes ; total number of jobs =',snext write(6,*) ' too many unfinished jobs while in gettag ' write(6,*) ' and no free slots on the parameter queue ' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop call lockoff(WRLOCK) c endif c c get tag for process on the next free column in the problem queue c call lockon(hflock) jobtag = freeq(fhead) snext = snext + 1 if(fhead.eq. ftail) free = 0 fhead = mod(fhead,mxprcs) + 1 if(jobtag.ge.1.and.jobtag.le.mxprcs) unitag(jobtag) = snext call lockoff(hflock) cterm call lockon(glock) cterm if (endgrf .gt. nbuffr) call dump(endgrf,igraph) cterm insrt = endgrf cterm endgrf = endgrf + 1 cterm call lockoff(glock) cterm inext = unitag(jobtag) ctermc trace for gettag cterm igraph(1,insrt) = 15 cterm igraph(2,insrt) = inext cterm igraph(3,insrt) = jobtag cterm igraph(4,insrt) = fhead cterm igraph(5,insrt) = ftail c if ( jobtag .le. 0 .or. jobtag .gt. mxprcs ) then write(6,*) '*************SCHED ERROR***********************' write(6,*) ' illegal jobtag for parmq column;' write(6,*) ' need 1 .le. jobtag .le. ',mxprcs,';' write(6,*) ' current jobtag =',jobtag,' in gettag' write(6,*) ' ' write(6,*) 'EXECUTION TERMINATED BY SCHED in subroutine GETTAG' cgraph call dump(endgrf,igraph) cterm call dump(endgrf,igraph) stop c endif c return c c last card of gettag c end subroutine rsched(jobtag,settag,kreset) CVD$R NOCONCUR integer jobtag,settag,kreset c************************************************************************* c comment: usage c subroutine paralg() c integer strtag,stptag,itag(*) c external start,test c . c . c call gettag(strtag) c itag(strtag) = strtag c . c . c call gettag(stptag) c itag(stptag) = stptag c . c . comment: start iteration or time stepping c jobtag = strtag c icango = 1 c nchks = ... c nreset = c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,start,itag(strtag)) c . c . comment: test and continue iteration at start if undone c jobtag = testag c icango = ... c nreset = c . c . c call dep(jobtag,icango,nchks,mychkn) c call reset(jobtag,nreset) c call putq(jobtag,test,itag(strtag),itag(stptag)) c . c . c subroutine test(jobtag,strtag,stptag) c common /