The following applications achieved sustained performance above 1 PFLOP/s:
- VPIC (particle-in-cell method). The VPIC science problem used a 3,072×3,072×2,464-cell domain with 7.44103×10¹² (about 7 trillion) particles. It was run on 22,528 nodes with 180,224 MPI ranks and 4 OpenMP threads per rank, and sustained 1.25 PFLOP/s over 2.5 hours.
- PPM (adaptive meshes). The test case used a 10,560³-zone mesh, more than 1 trillion cells. It ran across 702,784 cores of Blue Waters, with 681,472 worker threads organized as eight threads per MPI task. In total, 87,846 MPI ranks ran on 21,962 nodes, organized into 1,331 “teams,” each with its own object storage target (OST) for I/O control. The simulation completed in just under 41 hours of wall time and sustained 1.5 PFLOP/s. More than 587 TB of data was saved at an aggregate I/O rate of over 17 GB/s, with communication and I/O essentially 100% overlapped with computation.
- QMCPACK (Monte Carlo method). QMCPACK was run on a 432-atom high-pressure hydrogen problem on 22,500 XE nodes with 4 MPI ranks per node and 8 OpenMP threads per rank. The run sustained 1.037 PFLOP/s for just under an hour of execution.
- SPECFEM3D_GLOBE (finite element method on an unstructured mesh). The run used 21,675 XE nodes with 693,600 MPI ranks and sustained over 1 PFLOP/s.
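The runs above all use a hybrid MPI+OpenMP layout, and the quoted figures are internally consistent: dividing MPI ranks by node count recovers the ranks-per-node placement, and multiplying ranks by threads per rank gives the total worker-thread count. A minimal sketch of that arithmetic (the per-run tuples are taken from the text; the derived quantities are computed here, and the single-thread assumption for SPECFEM3D_GLOBE is mine, since its thread count is not quoted):

```python
# Sanity-check the rank/thread arithmetic quoted for the Blue Waters runs.
# (nodes, mpi_ranks, omp_threads_per_rank) as given in the text above;
# SPECFEM3D_GLOBE's threads-per-rank is not stated, so 1 is an assumption.
runs = {
    "VPIC":            (22_528, 180_224, 4),
    "QMCPACK":         (22_500,  90_000, 8),  # 4 ranks/node * 22,500 nodes
    "SPECFEM3D_GLOBE": (21_675, 693_600, 1),
}

for name, (nodes, ranks, threads) in runs.items():
    assert ranks % nodes == 0, f"{name}: ranks not evenly divisible by nodes"
    ranks_per_node = ranks // nodes
    total_threads = ranks * threads
    print(f"{name}: {ranks_per_node} ranks/node, {total_threads:,} total threads")
```

Running this prints 8 ranks/node and 720,896 threads for VPIC, 4 ranks/node and 720,000 threads for QMCPACK, and 32 ranks/node for SPECFEM3D_GLOBE, matching the configurations described above.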