SOLUTION OF LARGE LINEAR SYSTEMS ON PIPELINED SIMD MACHINES
Scribed by NIKOLAUS GEERS; ROLAND KLEES
- Publisher
- John Wiley and Sons
- Year
- 1997
- Language
- English
- File size
- 575 KB
- Volume
- 40
- Category
- Article
- ISSN
- 0029-5981
Synopsis
We developed a direct out-of-core solver for dense non-symmetric linear systems of 'arbitrary' size N × N. The algorithm relies entirely on the Basic Linear Algebra Subprograms (BLAS) and can therefore easily be adapted to different computer architectures by using the corresponding optimized routines. We used blocked versions of the left-looking and right-looking variants of LU decomposition so that most of the operations are performed in Level 3 BLAS, which reduces the number of I/O operations and minimizes CPU time. The algorithm requires main storage for only 2N × NB data elements, where NB ≪ N. We derived formulas, in terms of the sustained floating-point performance and the sustained I/O rate of the given hardware, for choosing values of NB that balance CPU time against I/O time. We tested the algorithm on linear systems derived from 3D-BEM for strongly and weakly singular integral equations and from interpolation problems for scattered data on closed surfaces in ℝ³. Solving a linear system of size 10 000 took only about 2.5 CPU minutes on a 5 GFLOPS SNI S600/20 vector computer, which corresponds to a performance of 4.3 GFLOPS; a value of NB = 650 gives a reasonable I/O time, and the necessary main storage is about 13 Mwords. In addition, we compared the algorithm with (1) an out-of-core version of GMRES and (2) a wavelet transform followed by in-core GMRES after thresholding. At least for boundary integral equations of classical boundary value problems of potential theory, the out-of-core version of GMRES is superior to both the direct out-of-core solver and the wavelet transform, since it converged after at most 5 iteration steps. It took about 17 s to solve a system with 8192 unknowns, compared with 146 s for the direct out-of-core solver and 402 s for the wavelet transform followed by in-core GMRES.
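The figures quoted in the synopsis can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the article: the function names are hypothetical, and the operation count assumes the standard leading-order estimate of (2/3)N³ floating-point operations for a dense LU decomposition, ignoring lower-order terms and the triangular solves.

```python
def workspace_words(n: int, nb: int) -> int:
    """Main-storage requirement quoted in the synopsis: 2N x NB data elements."""
    return 2 * n * nb


def lu_flops(n: int) -> float:
    """Leading-order operation count of dense LU decomposition (assumption: 2/3 N^3)."""
    return 2.0 * n**3 / 3.0


def cpu_seconds(n: int, flops_per_second: float) -> float:
    """Estimated CPU time at a given sustained floating-point rate."""
    return lu_flops(n) / flops_per_second


if __name__ == "__main__":
    n, nb = 10_000, 650          # values quoted in the synopsis
    sustained_rate = 4.3e9       # 4.3 GFLOPS, as quoted in the synopsis

    print(f"workspace: {workspace_words(n, nb) / 1e6:.1f} Mwords")
    print(f"CPU time:  {cpu_seconds(n, sustained_rate) / 60:.2f} minutes")
```

Under these assumptions, N = 10 000 and NB = 650 give 13 Mwords of workspace, and the estimated CPU time at 4.3 GFLOPS is close to the roughly 2.5 minutes reported.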
SIMILAR VOLUMES
Initial value problems for linear hyperbolic systems with smooth 2π-periodic coefficients are solved numerically by a modified Fou-
discontinuous initial values associated with (1), we have to consider generalized solutions in the form of generalized functions (or distributions) [5; 6; 4, Appendix A]
In this paper we present a general method for implementing certain types of arbitrary large neighbourhood non-linear control (B) templates on the cellular neural network universal machine (CNNUM). First, we show how a 3 × 3 non-linear B template can be replaced by a set of linear ones. We thereby obta
Connective stabilizability of large-scale systems, which are composed of interconnected subsystems, is considered using decentralized feedback. Both analytical and graph-theoretic conditions are derived directly in terms of the interconnection structure, which ensure that stability of the overall cl
Let K be a field of characteristic zero and M(Y) = N a system of linear differential equations with coefficients in K(x). We propose a new algorithm to compute the set of rational solutions of such a system. This algorithm does not require the use of cyclic vectors. It has been implemented in Maple