
NumGRID Project

Preface

As mathematical modeling has evolved and high-performance computer systems have appeared, many scientific applications have emerged that demand more computational performance than any available supercomputer can provide. In particular, very large-scale numerical modeling requires integrating several supercomputers, i.e., creating a Grid. Not every application can be solved efficiently on a Grid, because communication between machines is slow. However, applications such as the search for extraterrestrial civilizations, the search for prime numbers, and climate prediction run successfully on Grids.


Another problem is that rapid progress in microprocessor development forces us to use heterogeneous computer systems to solve large-scale problems. In particular, in 2007 the Siberian Supercomputing Center (Novosibirsk, Russia) operates the following multicomputers: a 32-processor MVS-1000 based on the Alpha microprocessor (833 MHz), a 128-processor MVS-1000 based on the Alpha microprocessor (633 MHz), and a 60-processor HP cluster based on the Intel Itanium 2 microprocessor. There are also two clusters, based on Intel Pentium III and Opteron, at Novosibirsk State University that are used for scientific calculations. Therefore, software is needed that supports large-scale simulation in heterogeneous environments. Numerous Grid projects oriented to different applications are currently under development.


Communication speeds are growing steadily, and it is now possible to run numerical simulations on a Grid of multicomputers. In 2004, the NumGRID project, intended to create the necessary Grid system software, was started in Novosibirsk (ICM&MG).

 

Objectives

The main objective of the NumGRID project is to enable the use of remote multicomputers for large-scale numerical simulations.
Several large-scale numerical applications have already been run in the NumGRID environment:
1. The simulation of protoplanetary disc evolution and galaxy formation.
2. Digital Electromagnetic Model of the Power System: Parallel Implementation for Multicomputers.
3. A numerical model for shallow-water flows: dynamics of the eddy shedding, WSEAS Transactions on Environment and Development.

 

NumGRID middleware

The NumGRID middleware is a collection of tools for joining several computational clusters, each of which can be administered independently, into a more powerful computational resource. NumGRID provides users with a partial implementation of MPI that can spread MPI applications over the nodes of independent clusters, a computational resource and job management system with a convenient user interface, and a library that eases the development of parallel programs with dynamic properties.

The NumGRID middleware consists of five components:
1. NumGRID_MPI library (partial MPI support for the Grid, including support for MPICH, LAM/MPI, ...)
2. NumGRID_jobmanager (application management, MPI message routing, support for queue systems: SGE, PBS, ...)
3. Debugging and monitoring subsystem
4. Cross-platform user interface (client, server, security subsystem based on SSL)
5. libAPT (support for development of numerical applications: a growing collection of data structures specific to numerical simulations, supporting the computation and communication patterns specific to this application area)

 

NumGRID features

1. Multicomputers and clusters can be included in NumGRID
2. Each node of a multicomputer can be an SMP system (2 processors or more)
3. MPI programs can be executed on NumGRID without any changes. Global addressing of all the NumGRID resources is provided.
4. Dynamic properties of application programs are provided automatically (tunability, dynamic load balancing, program execution monitoring, reliability)
5. Security and safety of calculations on the Grid are based on SSL
6. A unique set of multicomputers can be linked in for each application. The NumGRID environment also allows all multicomputers to keep their administrative policies unchanged
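
Global addressing (feature 3) means that a process is identified by one number across all linked clusters. The following is a minimal sketch of how such a mapping could look, assuming global ranks are assigned by concatenating the clusters in a fixed order. The function names `build_offsets` and `locate` are hypothetical and are not part of the NumGRID API; the cluster sizes are only illustrative.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(cluster_sizes):
    """Return the first global rank of each cluster, followed by the total count."""
    return [0] + list(accumulate(cluster_sizes))


def locate(global_rank, offsets):
    """Map a global rank to (cluster index, local rank inside that cluster)."""
    if not 0 <= global_rank < offsets[-1]:
        raise ValueError("global rank out of range")
    # The owning cluster is the last one whose first global rank <= global_rank.
    cluster = bisect_right(offsets, global_rank) - 1
    return cluster, global_rank - offsets[cluster]


# Illustrative sizes only (assumption): three clusters with 32, 128 and 60 processes.
offsets = build_offsets([32, 128, 60])
for rank in (0, 32, 159, 219):
    print(rank, "->", locate(rank, offsets))
```

With a mapping of this kind, an unchanged MPI program keeps addressing peers by a single rank, while the middleware decides which cluster a message has to be routed to.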


Support for development of numerical applications

  1. The idea:
    In numerical simulations, the parallel solution to a problem usually comes from data decomposition. Thus, the idea behind the libAPT library is to collect distributed implementations of data structures typical of numerical simulations.
  2. The principles:
    - Data structures must be distributable and provide users (application programmers) with different levels of control ranging from automatic execution to detailed planning of distribution by the user.
    - The user should express control of distribution in high-level terms appropriate for the data structure. For a multidimensional array, for example, the appropriate terms are an array, a layer, an element, a column, or a block, rather than bytes, doubles, memory locations, pointers, etc.
    - Dynamic redistribution of data structures must be supported with different levels of control as well.
    - Data structures that are used together in numerical simulations should be implemented with their combined use in mind.
  3. Currently implemented:
    - A distributed multidimensional data array that can be specialized for any element type. Several data arrays in a program can be bound to each other and distributed accordingly. Dynamic redistribution of the arrays is supported. The user can control distribution of the data structures at different levels, up to providing their own algorithms for redistribution planning. Reduction operations, input/output, and exchange of boundary values at cross-sections are supported.
    - A particle list that can be bound to a data array and partitioned automatically according to the partitioning of that array. This data structure allows Particle-in-Cell simulation methods to be implemented.
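
The two data structures above can be illustrated with a small sketch. It is not libAPT code: all function names are hypothetical, the array is one-dimensional for brevity, and the "processes" are simulated as entries of a Python list rather than MPI ranks. It shows a block distribution, an exchange of boundary values between neighbouring blocks, and a particle list partitioned according to the array partitioning.

```python
def block_bounds(n, parts):
    """Split range(n) into `parts` contiguous blocks whose sizes differ by at most 1.

    Assumes n >= parts, so that no block is empty.
    """
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for p in range(parts):
        stop = start + base + (1 if p < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def distribute(array, parts):
    """Give each simulated process its block plus one ghost cell on each side."""
    return [[None] + array[start:stop] + [None]
            for start, stop in block_bounds(len(array), parts)]


def exchange_boundaries(blocks):
    """Copy the edge values of each block into the neighbours' ghost cells."""
    for left, right in zip(blocks, blocks[1:]):
        right[0] = left[-2]   # last value owned by the left block
        left[-1] = right[1]   # first value owned by the right block


def bind_particles(positions, bounds, cell_size=1.0):
    """Assign each particle to the process that owns the cell containing it."""
    owned = [[] for _ in bounds]
    for x in positions:
        cell = int(x // cell_size)
        for rank, (start, stop) in enumerate(bounds):
            if start <= cell < stop:
                owned[rank].append(x)
                break
    return owned


grid = list(range(10))
bounds = block_bounds(len(grid), 3)
blocks = distribute(grid, 3)
exchange_boundaries(blocks)
particles = bind_particles([0.5, 3.9, 4.0, 9.2], bounds)
print(bounds)
print(blocks)
print(particles)
```

Here the user only names the array and the number of blocks, in line with the principle that distribution is controlled in terms of the data structure rather than bytes or pointers. A real implementation would perform the boundary exchange with message passing between processes.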

 

Publications

[1] N.V. Malyshkin, B. Roux, D. Fougere, V.E. Malyshkin. The NumGRID metacomputing system. Bulletin of the Novosibirsk Computing Center, series: Computer Science, issue 21 (2004), pp. 57-68.
[2] D. Fougere, M. Gorodnichev, N. Malyshkin, V. Malyshkin, B. Roux. NumGRID software for MPI based applications. Bulletin of the Novosibirsk Computing Center, series: Computer Science, issue 22 (2005), NCC Publisher, Novosibirsk, 2005, pp. 41-51.
[3] D. Fougere, M. Gorodnichev, N. Malyshkin, V. Malyshkin, A. Merkulov, B. Roux. NumGrid Middleware: MPI Support for Computational Grids. In Proceedings of PaCT-2005, Springer Verlag, LNCS series, Vol. 3606, Krasnoyarsk, 2005, pp. 313-320.

 

Download

(Currently in Russian only, without libAPT)

NumGRID package (v 1.0), 2007

NumGRID userguide (v 1.0), 2007