Computational Mechanics Australia Pty Ltd


CMA-LS-SPARSE 2.0
CMA-LS-SPARSE 2.0: a solver for Least Squares problems based on the Conjugate Gradients Method with subsequent iterative refinement.
Least squares solutions of overdetermined systems of linear algebraic equations are usually obtained by transforming the original system, with an m-by-n matrix A, into a system with the square n-by-n matrix AᵀA, by left-multiplying the original system by the transposed matrix:

AᵀA · x = Aᵀ · b


That approach is very common, and it is used in our CMA-LS-SPARSE solver. From the numerical point of view, however, it has a major shortcoming: the AᵀA matrix can be very ill-conditioned. We will demonstrate this using an example of an m-by-m square matrix (and a system of linear equations with a particular right-hand side vector) provided in the famous classic book by Wilkinson and Reinsch, "Handbook for Automatic Computation".
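
The reason for that shortcoming is a standard estimate from numerical linear algebra (our addition, not a result specific to this solver): the condition ratio of the transformed matrix is roughly the square of that of the original one,

cond(AᵀA) ≈ (cond(A))²,

so even a moderately ill-conditioned A produces a severely ill-conditioned AᵀA.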

The ill-conditioned test matrix and its system of equations


The reason for selecting that particular matrix is that its condition ratio increases rapidly with the matrix size. That gives us a good opportunity to investigate how the above-mentioned approach to the solution of Least Squares problems behaves with matrices that are close to singular. Below we provide a table of condition ratio values for that matrix at different dimensions, estimated by the DECOMP procedure (which implements LU-factorization) from the classic book by Forsythe, Malcolm and Moler, "Computer Methods for Mathematical Computations".

Table of condition ratios
Judging by the values in the table, we can draw some important conclusions.

Since CMA-LS-SPARSE 2.0 works with double precision variables, we can expect that the solution in the case of a "perfect" matrix (i.e. one with a condition ratio close to 1) would have up to 15 (or so) significant decimal digits. The condition ratio of the original matrix grows quickly with the matrix size, so the solution rapidly loses significant digits; for m=100 we are unlikely to obtain more than 7 or 8 at most, since the condition ratio estimate is about 10⁷. Still, for the purposes of testing the solution of the AᵀA system, we can use the solutions obtained by DECOMP/SOLVE as a benchmark.
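
As a rough rule of thumb (a standard back-of-the-envelope estimate, not a guarantee of the solver), the number of correct decimal digits one can expect in double precision is approximately

correct digits ≈ 16 − log₁₀(cond),

so a condition ratio of about 10⁷ leaves roughly 8 digits at best, and a condition ratio of 10¹⁴ or more leaves essentially none.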

That would allow us to demonstrate the fundamental results of our recent research project.

1) Even if the solution error estimate upon completion of the CGM iterations is satisfactory, the solution itself can be far from correct, and it can even contain no significant digits at all.

2) In order to obtain a solution containing at least some significant digits, another iterative process, namely iterative refinement, must be implemented "over" the complete cycle of CGM iterations.

Our first example is the SLAE shown above with m=10. From the table above it can be seen that both A and AᵀA are reasonably well-conditioned, so the numerical results are very good in both cases: the LU solution of the AᵀA system has 14 significant digits, and the CGM solution has at least 12 significant digits. Still, the difference is already becoming apparent: since the condition ratio of the AᵀA matrix is roughly the square of that of A, the solution of the AᵀA system has one or two significant digits less. That would not be a problem, provided the CGM algorithm delivered a correct estimate of the numerical error upon completion of the iterations. However, that turned out not to be the case: the CGM algorithm reports a relative error estimate of 4.25·10⁻¹⁶, which is already slightly more optimistic than it should be. And the situation quickly becomes troublesome with the growth of m: the larger the matrix, the less accurate are both the solution vector and the relative error estimate.

We should emphasize the most important point in this investigation, which is relevant to all numerical simulations.

For any numerical algorithm, it is not a problem to report that a quality solution could not be obtained and that the numerical error is high. If one particular method does not deliver a quality result, we may consider searching for a different method, or develop and use software emulating a processor with more significant digits. What is a really serious problem is when an algorithm delivers a solution and an error estimate that both look "reasonable", whereas, in fact, the solution may be totally wrong. Below we demonstrate how that can happen, and what can be done to remedy the situation.
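
One standard way to see how this can happen (our addition, under the assumption that the reported estimate is residual-based) is the classical error bound: for the system AᵀA·x = Aᵀb with computed solution x and residual r = Aᵀb − AᵀA·x,

‖x − x_exact‖ / ‖x_exact‖ ≤ cond(AᵀA) · ‖r‖ / ‖Aᵀb‖.

With cond(AᵀA) of order 10¹⁴, even a residual of order 10⁻⁹ permits a relative error far larger than 1, so a "reasonable-looking" residual-based estimate says almost nothing about the accuracy of the solution.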

Now we take a matrix of the same structure but of size m=30, and things become different. Namely, the CGM iterations complete with a relative error estimate of 1.05·10⁻¹², but the solution has only about six significant digits, as shown in the table below. An additional test of the quality of the obtained solution is made by multiplying A by it. From the table below it is obvious that the LU solution is "good", but the CGM solution is far from perfect: the reproduced right-hand side, in which all but the first element must be zeros, has only about half of its significant digits correct (about eight zeros out of 16).
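
As a rough illustration of that check, here is a small helper in plain C (dense storage and illustrative names only; the product itself works with sparse structures): it multiplies A by the computed solution and returns how far the reproduced right-hand side deviates from the original one.

/* Illustrative quality check: multiply the original matrix A by the computed
 * solution x and measure how well the original right-hand side b is reproduced. */
#include <math.h>

/* Returns the largest component-wise deviation |(A x)_i - b_i|. */
double rhs_reproduction_error(int m, const double *A, const double *x,
                              const double *b)
{
    double worst = 0.0;
    for (int i = 0; i < m; ++i) {
        double yi = 0.0;
        for (int j = 0; j < m; ++j)
            yi += A[i * m + j] * x[j];     /* (A x)_i */
        double dev = fabs(yi - b[i]);
        if (dev > worst)
            worst = dev;
    }
    return worst;
}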

And this is where iterative refinement comes in handy. We apply the following procedure (see the sketch after this list):
  • calculate the residuals vector r = Aᵀb − AᵀA·x;
  • solve the SLAE with the same AᵀA matrix but with the residuals as the right-hand side: AᵀA·d = r;
  • calculate the new approximation to the solution vector: x ← x + d.
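
The following is a minimal dense-storage sketch of that procedure in plain C, written for illustration only: it is not the CMA-LS-SPARSE source or its API, the matrix N = AᵀA and the vector f = Aᵀb are assumed to be formed by the caller, and all names are ours. One complete CGM cycle solves the normal-equations system, and the refinement loop repeats "over" it until the scaled norm of the correction becomes small (the same test discussed for the m=60 case below).

/* Sketch only: plain Conjugate Gradients on the normal equations N x = f
 * (N = A^T A, f = A^T b, assembled by the caller), wrapped in iterative
 * refinement. Dense row-major storage is used to keep the example short. */
#include <math.h>
#include <stdlib.h>

/* y = M v for a dense n-by-n matrix M stored row-major. */
static void matvec(int n, const double *M, const double *v, double *y)
{
    for (int i = 0; i < n; ++i) {
        double s = 0.0;
        for (int j = 0; j < n; ++j)
            s += M[i * n + j] * v[j];
        y[i] = s;
    }
}

static double dot(int n, const double *a, const double *b)
{
    double s = 0.0;
    for (int i = 0; i < n; ++i)
        s += a[i] * b[i];
    return s;
}

/* One complete CGM cycle for the symmetric positive definite system N x = f;
 * stops when the residual norm falls below tol or maxit is reached. */
static void cg_solve(int n, const double *N, const double *f, double *x,
                     int maxit, double tol)
{
    double *r = malloc(n * sizeof *r);
    double *p = malloc(n * sizeof *p);
    double *q = malloc(n * sizeof *q);
    matvec(n, N, x, q);
    for (int i = 0; i < n; ++i) { r[i] = f[i] - q[i]; p[i] = r[i]; }
    double rr = dot(n, r, r);
    for (int it = 0; it < maxit && sqrt(rr) > tol; ++it) {
        matvec(n, N, p, q);
        double alpha = rr / dot(n, p, q);
        for (int i = 0; i < n; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * q[i]; }
        double rr_new = dot(n, r, r);
        double beta = rr_new / rr;
        for (int i = 0; i < n; ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    free(r); free(p); free(q);
}

/* Iterative refinement "over" complete CGM cycles:
 *   r = f - N x;  solve N d = r by CGM;  x = x + d;
 * repeated until ||d|| / ||x|| drops below refine_tol (e.g. 1e-7). */
static void cg_with_refinement(int n, const double *N, const double *f,
                               double *x, int max_refine, double refine_tol)
{
    double *r = malloc(n * sizeof *r);
    double *d = malloc(n * sizeof *d);
    double *q = malloc(n * sizeof *q);
    for (int i = 0; i < n; ++i) x[i] = 0.0;
    cg_solve(n, N, f, x, 10 * n, 1e-14);               /* initial CGM solution */
    for (int k = 0; k < max_refine; ++k) {
        matvec(n, N, x, q);                            /* residual r = f - N x */
        for (int i = 0; i < n; ++i) { r[i] = f[i] - q[i]; d[i] = 0.0; }
        cg_solve(n, N, r, d, 10 * n, 1e-14);           /* correction: N d = r  */
        for (int i = 0; i < n; ++i) x[i] += d[i];      /* refined solution     */
        if (sqrt(dot(n, d, d)) <= refine_tol * sqrt(dot(n, x, x)))
            break;                                     /* scaled-norm test     */
    }
    free(r); free(d); free(q);
}

In this dense sketch the residual is formed from the explicitly assembled AᵀA; a sparse implementation would typically apply A and Aᵀ as separate operators rather than forming AᵀA explicitly.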

In this case, i.e. for m=30, that approach works perfectly well: multiplying A by the refined solution reproduces the right-hand side with the same quality as the solution obtained by the LU-factorization of matrix A.

We summarize the obtained results in the table below.

Table for m=30


Our next example has the size m=50, and this is where things go very wrong unless the results are carefully analysed and refined. The following table contains some numerical results, which we discuss below.
Table for m=50


We can readily see that the CGM solution without iterative refinement has no significant digits at all; it is simply wrong. That would not be a problem had the solver informed us that the matrix is very ill-conditioned and therefore that the solution is unacceptable. But that was not the case: the standard CGM solver delivers an error estimate of 2.6·10⁻⁹. That confirms our most important hypothesis: for Least Squares problems, iterative refinement is a must, regardless of how "trustworthy" the original results look.

Now, this is what we see after the 1st iterative refinement step: comparing the solution delivered by the LU-factorization with the CGM solution, we discover that the latter has only one or two significant digits. Again, the CGM reports a very decent error estimate of 1.9·10⁻⁹. What is even more troublesome is that by multiplying A by x we obtain a vector reasonably close to the original right-hand side: it has four significant digits, not just one.

Continuing the iterative refinement, one step for each complete cycle of CGM iterations, we could see that the solution gradually gains significant digits, and the right-hand side obtained by the multiplication A·x more and more resembles the original right-hand side (the vector whose first component equals 1 and all other components are zero). We decided to stop at the 10th iterative refinement step; the results can be seen in the table: the solution has about 5 to 6 significant digits.

Judging by the table of condition ratios above, we should expect that for larger values of m the convergence would cease, because the AᵀA matrix becomes very ill-conditioned. Our numerical experiments have confirmed that. Take, for example, the case of m=60. In order to avoid pre-setting the number of refinement iterations to be performed, the algorithm checks the Euclidean norm of the correction vector divided by the norm of the current approximation to the solution. The refinement iterations repeat until that ratio falls below 10⁻⁷. As shown in the table below, the process stops at iteration 71, which means that the matrix is quite ill-conditioned indeed. As mentioned before, the most "dangerous" (in the numerical sense) circumstance is that on every refinement iteration the CGM error estimate fluctuates in the range 10⁻¹⁰ … 10⁻⁶, creating the impression that the solution is acceptable, whereas it is not until the scaled Euclidean norm of the correction vector falls below the pre-defined value of 10⁻⁷.
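
In symbols, with x_k denoting the current approximation and d_k the correction vector produced at the k-th refinement step, the stopping test reads

‖d_k‖ / ‖x_k‖ < 10⁻⁷,

where ‖·‖ is the Euclidean norm.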

    Iterations table for m=60


Analysing the obtained solution, we can readily see that neither the original CGM solution nor the solution after the 1st refinement iteration contains any significant digits. After 71 iterations, the algorithm obtained a reasonably good (5 to 6 significant digits) solution.
    Table for m=60


For m=70, however, no significant digits were obtained at all. That was expected, as already mentioned, due to the very large condition ratio of the matrix AᵀA.

Once again we should emphasize that we deliberately selected a "bad" matrix, in the sense that its condition ratio makes AᵀA effectively singular as the matrix dimension grows. As long as the matrix is reasonably well-conditioned, there are no restrictions on using CMA-LS-SPARSE 2.0 for solving sparse systems of up to several million equations, provided a sufficient amount of RAM is installed on the computer. For the particular matrix investigated above, there are various possible ways to remedy the situation (for example, using sophisticated pre-conditioning techniques for the CGM procedure), but that issue was beyond the scope of this project.


CMA-LS-SPARSE 2.0 is capable of solving very large (up to several million equations) sparse linear systems with general-case matrices, including over-determined systems.
We invite you to consider using CMA-LS-SPARSE 2.0 for your research, development and commercial purposes. You are welcome to contact us at any time, preferably by e-mail. Our full contact details are provided here.
CMA-LS-SPARSE 2.0 is available with complete source codes and with a full commercial license allowing easy incorporation into your own applications.

CMA-LS-SPARSE 2.0 codes are written in standard C and will compile and run on all UNIX and Windows platforms without changes. They are also fully commented, allowing easy modification at the user's discretion.

CMA-LS-SPARSE 2.0 is available on a no-royalties and no-annual-renewals basis: there is a one-off price for an unlimited commercial license.

    The product distribution kit includes:

    - fully-commented source codes in C;

- and the product's User Instructions containing descriptions of the functions and examples.

    To learn more about the full range of our products please follow this link.

    We also invite you to contact us with any questions, preferably by e-mail on comecau@ozemail.com.au or comecauinfo@gmail.com.