This work is partially supported by the grants ITDC-203 and RFBR 96-01-01557
Development of ASSY (ASsembly SYstems) language and system is based first on the analysis of results of big size problems solution including their parallelization, design of parallel algorithm, data transferring and communications, programming and debugging. An another source of design information is the analysis of current high level parallel programming languages and systems such as Linda, Strand-88, Occam, PVM etc. The third source is the theory of parallel program synthesis [Val88].
Certainly, simple analysis shows that the use of MCS can be organized on the basis of programming language of high nonprocedurality which must provide the generation of high performance and portable parallel programs. But high nonprocedurality of the language means its high specialization for effective solution of problems from specific area. This is a result of usage in program of specific large-scale objects of application area each of which defines ready made fragment of computations. Usually, these fragments can not be used in the other application areas for algorithms definition. In such a way, we can not hope to develop an universal programming language of high nonprocedurality eligable to be used in different application areas.
This being the case, the decision was made to develop and to support a general unified technology of parallel program creation suitable for a wide range of applications and architectures. This is the technology of a computation assembling out of preliminary made fragments.
One of the key element of assembly technology is a program synthesis. We are not going here to solve the problems instead of users, but we are going to support automatic program creation for effective solution of problems from the good studied areas. It means that knowledge of good studied areas (the algorithms of problem solution) can be "canonised" and used in the mode of calculator.
We proceed from the assumption that we need and we can assemble specific MCS installation oriented to a very effective solution of a specific problems from a chosen area. We believe, that user should use the same or similar facilities to create his programs. Our favourite architecture is now linear hierarchical MCSs (like cluster architecture) for which procedural version of ASSY approach was developed first.
In this paper the main principal features of ASSY are presented.
This work is being done in the framework of the academic research project Assembly Systems, which is funded by Russian Academy of Sciences. It is intended for studying assembling phenomenon in parallel processing.
In ASSY a pragmatic definition of function is used, which can be really implemented in a program. In our reasoning we abided by the source proposition that in a system of functional programming a function definition must be supported from the program synthesis method. It should guarantee the successful choice/derivation of suitable algorithm for realisation and support by doing so the portability of ASSY program, the diversity in architectures of multiprocessors and processor elements (PE) on which ASSY program can be executed.
Any ASSY function F is roughly defined as a finite set of different algorithms AiF , i=1,2,... n, computing F. Every elementary algorithm AiF is presented by a ready-made procedure (module) PiA . Thus, elementary algorithm is presented in ASSY as procedure, and function F is presented as a set of algorithms, computing F. We proceed from the assumption, that this set contains all the necessary algorithms. In particular, there are many algorithms for matrices multiplication, but in practice this is enough, usually, to have about 20 of them in the library to cover a wide range of pragmatic applications. Practically it means, that suitable algorithm to compute F, can be chosen among the finite set of ready-made algorithms. Certainly, this choice is done as derivation. If in some specific case there is no such suitable algorithm in F definition, it must be added in the definition of the function F . Usually the set of algorithms is represented as a net (a bi-partite graph, fig.3), which defines in compact form a big number of algorithms and their compositions/ superpositions. As variables the arrays can be used, it provides the use of infinite (but recursively countable) set of algorithms in function F definition. Knowledge is accumulated as a library of functions, this is a first stage (stage of analysis) of ASSY application. On this stage the basic functions of application area are analyzed and designed.
Drawing analogy to the geometry we can compare the ASSY and logic programming approaches in the following way. Logic approach uses Euclid system of axioms as a compact description of geometry (all the algorithms for solution of any geometric problem are presented here), but there is a problem to derive suitable algorithm. These axioms constitute the knowledge base of geometry. In our approach knowledge base accumulates the best algorithms for solution of the concrete problems. Therefore, if a requested algorithm is absent in the knowledge base, it can not be derived and should be appended into the knowledge base.
More exactly, but nevertheless roughly, a function F is defined as a tuple F=(X, G, V, W), where X={x,y,...,z} is a finite set of variables (simple variables, arrays, data structures), G ={a,b,...,c} is a finite set of functional symbols (operations). To each operation aïG the tuples of input in(a)=(x1,x2 ,...,xn) and output out(a)=(y1,y2,...,ym ) variables from X are associated. V is the set of arguments and W is a set output variables of function F . Function definition is depicted as a bi-partite graph (fig.3).
The terms from T(V,G) may contain superfluous computations, Thus, the set T1 is formed which includes only such terms of T(V,G) in whose graphical representation each operation on any of term's path from the leaves to the root occurs not more than once to compute each of its output variables. This condition is enough to delete insufficient computations.
The set T(V,W)={tET1
|in(t)CV&out(t)CW=O} can be specified now. The terms from T(V,W) define
the set of all the algorithms computing the function F. In the case the arrays are used the sets
T(V,G), T1 and T(V,W) are infinite sets, they are
built by specific derivation algorithm.
The interpretation I in the domain D bind each operation
aEG to a function fa =I(a), each
variable xEX to an element (value of x)
dx=I(x), dxED,
dx=Q , were Q ,QED, is a symbol of
indefiniteness. Each variable yEX, such that
yEX\V & $tïT(V,W) ( yEout(t)),
t=b(t1 ,...,tn ),
in(b)={x1,...,xn}, is bound to an element
dtyED,
dty=val(t,y)=
fb(val(t1,x1),...,val(tn,
xn)), val(x,x)=dx.
The function fa =I(a) is assumed to be finally computed by the program module moda. This module moda is either a ready made module from the library or, in the case the function fa =I(a) is presented as a function definition, moda should be synthesised from the definition. In ASSY moda can be developed with any suitable language. Some information on peculiarities of moda (language, estimation of consumed time and memory, properties of input data etc.) is kept on scheme level and used during algorithm derivation.
If yïX\V&û$tïTV (yïout(t)), then "tïTV (val(t,y)= ). The interpretation I is called correct interpretation if for any terms t1 and t2 from TWV such that x ï(out(t1)úout(t2)) the condition val(t1,x)=val(t2 ,x)=dx is satisfied. It means, that the function F definition is a single assignment scheme and the value of variable z (fig.3), computed by MML2 or MML1;TRANSF, will be the same (input data are the same too). Any way of variable z computation can be chosen depending on specific MCS or other significant factors, definition of F function is always excessive to provide a good choice.
ASSY approach to program synthesis is a very simple and very restricted approach, but with the growth of number of the algorithms in the knowledge base it provides, in numeric computations for example, a good choice and possibilities for derivation of suitable algorithms to be executed on specific installation. Moreover, the bigger flexibility in algorithm derivation can be harmful. Only expected algorithms must be derived in numeric computations! Additionally, it enable us to accumulate in libraries the standard schemes of computations, which are in common use together with the best ways of their execution. We hope there exist not many different schemes.
The second level is the level of computation semantic definition, the program is generated in chosen imperative language of parallel programming, resources are assigned, procedures are substituted instead of names of algorithms, program is compiled, data are input and a program is executed.
Utilization of assembly approach in development of MCS SIBERIA and solution of big size problems in seismic data and image processing, modelling natural phenomena, electrical physics, quantum chaos etc. shows this method can be used as a basis for creation and accumulation of knowledge on solution of big size problems on different architecture multicomputers.
We schedule to implement ASSY as a general technology of problems solution on multicomputer systems oriented, first of all, for solution of big size numeric problems.