On the solution of high order stable time integration methods

  • Owe Axelsson
  • Radim Blaheta
  • Stanislav Sysala
  • Bashir Ahmad
Boundary Value Problems 2013, 2013:108

          DOI: 10.1186/1687-2770-2013-108

          Received: 15 February 2013

          Accepted: 9 April 2013

          Published: 26 April 2013

          Abstract

Evolution equations arise in many important practical problems. They are frequently stiff, i.e. they involve fast, mostly exponentially decreasing, and/or oscillating components. To handle such problems, one must use proper forms of implicit numerical time-integration methods. In this paper, we consider two methods of high order of accuracy, one for parabolic problems and the other for hyperbolic types of problems. For parabolic problems, it is shown how the solution rapidly approaches the stationary solution. It is also shown how the arising quadratic polynomial algebraic systems can be solved efficiently by iteration and the use of a proper preconditioner.

          1 Introduction

Evolution equations arise in many important practical problems, such as parabolic and hyperbolic partial differential equations. After application of a semi-discrete Galerkin finite element or a finite difference approximation method, a system of ordinary differential equations,
$$M\frac{du}{dt} + Au(t) = f(t),\quad t > 0,\qquad u(0) = u_0,$$

arises. Here $u, f \in \mathbb{R}^n$, $M$ is a mass matrix and $M$, $A$ are $n \times n$ matrices. For a finite difference approximation, $M = I$, the identity matrix.

In the above applications, the order $n$ of the system can be very large. Under reasonable assumptions on the given source function $f$, the system is stable, i.e. its solution is bounded for all $t > 0$ and converges to a fixed stationary solution as $t \to \infty$, independently of the initial value $u_0$. This holds if $A$ is a normal matrix, that is, has a complete eigenvector space, and has eigenvalues with positive real parts. This condition holds for parabolic problems, where the eigenvalues of $A$ are real and positive. In more involved problems, the matrix $A$ may have complex eigenvalues with arbitrarily large imaginary parts.

Clearly, not all numerical time-integration methods preserve the above stability properties. Unless the time-step is sufficiently small, explicit time-integration methods do not converge and/or give unphysical oscillations in the numerical solution. Even with sufficiently small time-steps, algebraic errors may increase unboundedly due to the large number of time-steps. The simplest example where the stability holds is the implicit Euler method,
$$\tilde u(t+\tau) + \tau A \tilde u(t+\tau) = \tilde u(t) + \tau f(t+\tau),\quad t = \tau, 2\tau, \ldots,\qquad \tilde u(0) = \tilde u_0,$$
where $\tau > 0$ is the time-step. Here, the eigenvalues of the inverse of the resulting matrix in the corresponding system,
$$(I + \tau A)\,\tilde u(t+\tau) = \tilde u(t) + \tau f(t+\tau)$$
equal $(1 + \tau\lambda)^{-1}$ and satisfy the stability condition,
$$|\mu(\lambda)| = |(1 + \tau\lambda)^{-1}| < 1,\quad \lambda \in \sigma(A).$$
Here, $\sigma(A)$ denotes the set of eigenvalues of $A$. To damp out more quickly the initial transients in the solution, which arise for instance because the initial value may not satisfy the boundary conditions given in the parabolic problem, one should preferably have eigenvalues of the inverse of the discrete matrix $B$ that satisfy $|\mu(\lambda)| \to 0$ for large eigenvalues $\lambda$. This holds for the implicit Euler method, where
$$B = I + \tau A \quad\text{and}\quad \mu(\lambda) = (1 + \tau\lambda)^{-1}.$$
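As a concrete illustration (not from the paper), the following Python sketch performs one implicit Euler step for $M = I$ and a standard 1D finite-difference Laplacian, and checks the stability bound $|\mu(\lambda)| = |(1+\tau\lambda)^{-1}| < 1$; the grid size, time-step and source term are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative sketch: one implicit Euler step for M = I and a standard
# 1D finite-difference Laplacian A; n, tau and f are arbitrary choices.
n, tau = 50, 0.01
h = 1.0 / (n + 1)
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

u = np.zeros(n)                 # initial value u_0
f = np.ones(n)                  # constant source term
# One step: (I + tau*A) u_new = u_old + tau*f
u = np.linalg.solve(np.eye(n) + tau * A, u + tau * f)

# Stability: the eigenvalues of (I + tau*A)^{-1} are 1/(1 + tau*lambda),
# all of modulus < 1 since the eigenvalues lambda of A are positive.
lam = np.linalg.eigvalsh(A)
mu = 1.0 / (1.0 + tau * lam)
assert lam.min() > 0 and np.all(np.abs(mu) < 1.0)
```

Note that the damping factor $1/(1+\tau\lambda)$ tends to zero for the stiffest components (large $\lambda$), which is exactly the desirable property discussed above.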

This method is only first-order accurate, i.e. its global time discretization error is $O(\tau)$. Therefore, to get a sufficiently small discretization error, one must choose very small time-steps, which makes the method computationally expensive and also causes a stronger increase of round-off errors. However, there exist stable time-integration methods of arbitrarily high order. They are of implicit Runge-Kutta quadrature type (see e.g. [1, 5]) and belong to the class of A-stable methods, i.e. the eigenvalues $\mu(B^{-1})$ of the corresponding matrix $B$, where $B\tilde u(t+\tau) = \tilde u(t) + \tau \tilde f(t)$ and $\tilde f(t)$ is a linear function of $f(t)$ at the quadrature points in the interval $[t, t+\tau]$, satisfy $|\mu(B^{-1})| < 1$ for all normal matrices $M^{-1}A$ with $\operatorname{Re}(\lambda) > 0$. The highest order achieved, $O(\tau^{2m})$, occurs for Gauss quadrature, where $m$ equals the number of quadrature points within each time interval.

To satisfy the second, desirable condition,
$$\lim_{\lambda \to \infty} |\mu(\lambda)| = 0,$$

one can use a special subclass of such methods, based on Radau quadrature; see, e.g. [1, 5]. The discretization error is here only one order less, $O(\tau^{2m-1})$. For linear problems, all such stable methods lead to rational polynomial approximation matrices $B$, and hence to the need to solve quadratic polynomial equations. For stable methods, it turns out that the roots of these polynomials are complex.

In Section 2, a preconditioning method is described that is very efficient when solving such systems, without the need to factorize the quadratic polynomials into first-order factors, thereby avoiding the use of complex arithmetic. Section 3 discusses the special case where $m = 2$. It also shows how the general case, where $m > 2$, can be handled.

          Section 4 deals with the use of implicit Runge-Kutta methods of Gauss quadrature type for solving hyperbolic systems of Hamiltonian type.

          Section 5 presents a method to derive time discretization errors.

In Section 6, some illustrative numerical tests are shown. The paper ends with concluding remarks.

          2 Preconditioners for quadratic matrix polynomials

From the introduction, it follows that it is important to use an efficient solution method for quadratic matrix polynomials and not to factorize them into first-order factors when this results in complex-valued factors. For a method to solve complex-valued systems in real arithmetic, see, e.g. [6]. Here, we use a particular method that is suitable for the arising quadratic matrix polynomials.

Consider then the matrix polynomial,
$$B = M + aA + b^2 A M^{-1} A.$$
(1)
We assume that $M$ is spd and that $|a| < 2b$; the latter implies that the first-order factors of $B$ are complex. Systems with $B$ will be solved by iteration. As a preconditioner, we use the matrix
$$C_\alpha = (M + \alpha A)\, M^{-1} (M + \alpha A),$$
where $\alpha > 0$ is a parameter. We assume that $A$ is a normal matrix, that is, has a full eigenvector space, and further that the symmetric part $A + A^T$ of $A$ is spd. To estimate the eigenvalues of $C_\alpha^{-1} B$, we write
$$(C_\alpha x, x) - (Bx, x) = (2\alpha - a)(Ax, x) + (\alpha^2 - b^2)(A M^{-1} A x, x).$$
After a two-sided multiplication with $M^{-1/2}$, we get
$$(\tilde C_\alpha \tilde x, \tilde x) - (\tilde B \tilde x, \tilde x) = (2\alpha - a)(\tilde A \tilde x, \tilde x) + (\alpha^2 - b^2)(\tilde A^2 \tilde x, \tilde x),$$
(2)

where $\tilde C_\alpha = M^{-1/2} C_\alpha M^{-1/2} = (I + \alpha \tilde A)^2$, etc., and $\tilde x = M^{1/2} x$. Note that, by similarity, $C_\alpha^{-1} B$ and $\tilde C_\alpha^{-1} \tilde B$ have the same eigenvalues.

We are interested in cases where $\tilde A$ may have large eigenvalues. (In our application, $\tilde A$ involves a time-step factor $\tau$, but since we use higher-order time-discretization methods, $\tau$ will not be very small and cannot damp out the inverse of some power of the space-discretization parameter $h$ that also occurs in $\tilde A$.) Therefore, we choose $\alpha = b$. Note that this implies $2\alpha - a > 0$.

The resulting relation (2) can now be written
$$(\tilde x, \tilde x) - (\tilde C_\alpha^{-1} \tilde B \tilde x, \tilde x) = (2\alpha - a)(\tilde C_\alpha^{-1} \tilde A \tilde x, \tilde x),$$
(3)
where
$$(\tilde C_\alpha^{-1} \tilde A \tilde x, \tilde x) = \bigl((I + \alpha \tilde A)^{-2} \tilde A \tilde x, \tilde x\bigr).$$
Since $2\alpha - a > 0$, the real parts of the eigenvalues of $\tilde C_\alpha^{-1} \tilde B$ are bounded above by 1. To find estimates of the eigenvalues $\lambda(\mu)$ of $\tilde C_\alpha^{-1} \tilde B$, let $(\mu, z)$ be an eigensolution of $\tilde A$, i.e. let
$$\tilde A z = \mu z,\quad \|z\| = 1.$$
It follows from (3) that for $\tilde x = z$,
$$\lambda(\mu) = (\tilde C_\alpha^{-1} \tilde B z, z) = 1 - \Bigl(1 - \frac{a}{2\alpha}\Bigr)\frac{2\alpha\mu}{1 + 2\alpha\mu + (\alpha\mu)^2} = 1 - \Bigl(1 - \frac{a}{2\alpha}\Bigr)\frac{1}{1 + \frac{1}{2}\bigl(\alpha\mu + \frac{1}{\alpha\mu}\bigr)}.$$

We write $\alpha\mu = \mu_0 e^{i\varphi}$, so that $\frac{1}{2}\bigl(\alpha\mu + \frac{1}{\alpha\mu}\bigr) = \frac{1}{2}\bigl(\mu_0 + \frac{1}{\mu_0}\bigr)\cos\varphi + \frac{i}{2}\bigl(\mu_0 - \frac{1}{\mu_0}\bigr)\sin\varphi$, where $i$ is the imaginary unit. Note that $\mu_0 > 0$, so $\frac{1}{2}\bigl(\mu_0 + \frac{1}{\mu_0}\bigr) \ge 1$. Since, by assumption, the real part of $\mu$ is positive, it holds that $|\varphi| \le \varphi_0 < \pi/2$. A computation shows that the values of the factor $\frac{1}{1 + \frac{1}{2}(\alpha\mu + \frac{1}{\alpha\mu})}$ are located in a disc in the complex plane with center at $\delta/2$ and radius $\delta/2$, where $\delta = 1/(1 + \cos\varphi_0)$.

Hence, $\lambda(\mu)$ is located in a disc with center at $1 - \frac{1}{2}\bigl(1 - \frac{a}{2\alpha}\bigr)\delta$ and radius $\frac{1}{2}\bigl(1 - \frac{a}{2\alpha}\bigr)\delta$.

For $\varphi_0 = 0$, i.e. for real eigenvalues of $\tilde A$, we get $\delta = 1/2$ and the disc reduces to an interval with center $\frac{3}{4} + \frac{1}{8}\frac{a}{\alpha}$, i.e. $\frac{1}{2} + \frac{a}{4\alpha} \le \lambda(\mu) < 1$.
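These bounds are easy to check numerically. The sketch below builds a small spd pair $(M, A)$, forms $B$ from (1) and the preconditioner $C_\alpha$ with $\alpha = b$, and verifies that the spectrum of $C_\alpha^{-1}B$ lies in the interval $[\frac{1}{2} + \frac{a}{4\alpha}, 1]$ derived above for the real-spectrum case; the matrix sizes and the coefficients $a$, $b$ are illustrative choices, not taken from the paper.

```python
import numpy as np

# Illustrative check of the eigenvalue bounds: build a small spd pair (M, A),
# form B from (1) and the preconditioner C_alpha with alpha = b.
rng = np.random.default_rng(0)
n = 40
Q = rng.standard_normal((n, n))
M = Q @ Q.T + n * np.eye(n)                                     # spd "mass" matrix
h = 1.0 / (n + 1)
A = (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2   # spd "stiffness" matrix

a, b = 0.5, 0.4                     # |a| < 2b, so B has complex first-order factors
B = M + a * A + b**2 * A @ np.linalg.solve(M, A)
alpha = b
C = (M + alpha * A) @ np.linalg.solve(M, M + alpha * A)

# Real spectrum (phi_0 = 0): 1/2 + a/(4 alpha) <= lambda(mu) < 1
lam = np.linalg.eigvals(np.linalg.solve(C, B)).real
lower = 0.5 + a / (4 * alpha)
assert lower - 1e-8 <= lam.min() and lam.max() <= 1.0 + 1e-8
```

With $a = 0.5$, $\alpha = b = 0.4$, the lower bound is $0.8125$, so the preconditioned spectrum is tightly clustered near 1, which explains the fast convergence of the preconditioned iteration.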

          3 A stiffly stable time integration method

Consider a system of ordinary differential equations,
$$M\frac{dx}{dt} + \sigma(t)\bigl(Ax(t) - f(t)\bigr) = 0,\quad t > 0,\qquad x(0) = x_0,$$
(4)

where $x, f \in \mathbb{R}^n$, $\sigma(t) \ge \sigma_0 > 0$, and $M$, $A$ are $n \times n$ matrices, with $M$ assumed to be spd and the symmetric part of $A$ positive semidefinite. In the practical applications that we consider, $M$ corresponds to a mass matrix and $A$ to a second-order diffusion or diffusion-convection matrix. Hence, $n$ is large. Under reasonable assumptions on the source function $f$, such a system is stable for all $t$ and its solution approaches a finite function, independent of the initial value $x_0$, as $t \to \infty$.

Such stability results hold for more general problems, such as the nonlinear parabolic problem,
$$\frac{\partial u}{\partial t} + F(t, u) = 0,\quad \text{where } F(t, u) = -\nabla\cdot\bigl(a(t, u, \nabla u)\nabla u\bigr) - f(t, u),\quad x \in \Omega,\ t > 0,$$
(5)

where $f : (0, \infty) \times V \to V$ and $V$ is a reflexive Banach space.

For proper functions $a(\cdot)$ and $f(\cdot)$, $F$ is monotone, i.e.
$$\bigl(F(t, u) - F(t, v), u - v\bigr) \ge \rho(t)\|u - v\|^2,\quad \forall u, v \in V,\ t > 0.$$
(6)
Here, $\rho : (0, \infty) \to \mathbb{R}$, $\rho(t) \ge 0$, and $(\cdot,\cdot)$, $\|\cdot\|$ denote the scalar product and the corresponding norm in $L^2(\Omega)$, respectively. In this case, one can easily derive the bound
$$\frac{1}{2}\frac{d}{dt}\bigl(\|u - v\|^2\bigr) = -\bigl(F(t, u) - F(t, v), u - v\bigr) \le -\rho(t)\|u - v\|^2,$$
where $u$, $v$ are solutions of (5) corresponding to different initial values. Consequently, making use of the Gronwall lemma, we obtain
$$\|u(t) - v(t)\| \le \exp\Bigl(-\int_0^t \rho(s)\,ds\Bigr)\|u(0) - v(0)\| \le \|u(0) - v(0)\|,\quad t > 0.$$

          Hence, (5) is stable in this case.

If $F$ is strongly monotone (or dissipative), i.e. (6) is valid with $\rho(t) \ge \rho_0 > 0$, then
$$\|u(t) - v(t)\| \le \exp(-t\rho_0)\|u(0) - v(0)\| \to 0,\quad t \to \infty,$$

i.e. (5) is asymptotically stable. In particular, the above holds for the test problem considered in Section 6.

For large eigenvalues of $M^{-1}A$, such a system is stiff and can have fast-decreasing and possibly oscillating components. This means that the eigenvalues have large real parts and possibly also large imaginary parts. To handle this, one needs stable numerical time-integration methods that do not contain corresponding increasing components. For $\sigma(t) = 1$ in (4), this amounts to proper approximations of the matrix exponential function $\exp(-tE)$, $E = M^{-1}A$, by a rational function,
$$R_m(tE) = Q_m(tE)^{-1} P_m(tE),$$
where
$$\|R_m(tE)\| \le 1,\quad t > 0,\ \text{for } \operatorname{Re}\{\lambda_E\} > 0,$$

and $\lambda_E$ denotes the eigenvalues of $E$. Furthermore, to cope with problems where $|\arg(\lambda_E)| \le \alpha < \frac{\pi}{2}$, but arbitrarily close to $\pi/2$, one needs A-stable methods; see e.g. [3, 7, 8]. To get stability for all times and time-steps, one requires $\lim_{|\lambda| \to \infty} |R_m(\lambda)| \le c < 1$, where preferably $c = 0$. Such methods are called L-stable (Lambert) and stiffly A-stable [3], respectively.
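To make the distinction concrete, here is a small self-contained check (illustrative, not from the paper) of the two properties for two classical one-step methods, written in the convention $x' + \lambda x = 0$, $z = \tau\lambda$, $\operatorname{Re}(z) > 0$: the implicit Euler method is L-stable ($c = 0$), while the A-stable trapezoidal rule has $|R(z)| \to 1$ and therefore barely damps the stiffest components.

```python
# Illustrative comparison of A-stability vs L-stability in the convention
# x' + lambda*x = 0, z = tau*lambda, Re(z) > 0.

def R_euler(z):
    """Implicit Euler stability function R(z) = 1/(1 + z)."""
    return 1.0 / (1.0 + z)

def R_trap(z):
    """Trapezoidal-rule stability function R(z) = (1 - z/2)/(1 + z/2)."""
    return (1.0 - z / 2.0) / (1.0 + z / 2.0)

# Both are A-stable: |R(z)| < 1 for z > 0 (spot check on a few values).
for z in (0.1, 1.0, 10.0):
    assert abs(R_euler(z)) < 1.0 and abs(R_trap(z)) < 1.0

# Only implicit Euler is L-stable: |R(z)| -> 0 as z -> infinity, whereas
# the trapezoidal rule has |R(z)| -> 1: stiff components are barely damped.
big = 1e8
assert abs(R_euler(big)) < 1e-7
assert abs(R_trap(big)) > 0.999
```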

An important class of stiffly A-stable methods is a particular subclass of the implicit Runge-Kutta methods; see [1, 3, 5]. Such methods correspond to rational polynomial approximations of the matrix exponential function with a denominator of higher degree than the numerator. Examples of such methods are based on Radau quadrature, where the quadrature points are the zeros of $\tilde P_m(\xi) - \tilde P_{m-1}(\xi)$, where $\{\tilde P_k\}$ are the Legendre polynomials, orthogonal on the interval $(0, 1)$; see e.g. [1] and references therein. Note that $\xi = 1$ is a root for all $m \ge 1$. The case $m = 1$ is identical to the implicit Euler method.

Following [5], we consider here the next simplest case, $m = 2$, for the numerical solution of (4) over a time interval $[t, t+\tau]$.

In this case, the quadrature points (for a unit interval) are $\xi_1 = 1/3$, $\xi_2 = 1$, and the numerical solutions $x_1$, $x_2$ at $t + \tau/3$ and $t + \tau$ satisfy
$$\begin{bmatrix} M + 5\sigma_1 \tilde A & -\sigma_2 \tilde A \\ 9\sigma_1 \tilde A & M + 3\sigma_2 \tilde A \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} M x_0 + \frac{\tau}{12}(5 f_1 - f_2) \\ M x_0 + \frac{\tau}{4}(3 f_1 + f_2) \end{bmatrix},$$
(7)

where $x_0$ is the solution at time $t$, $\sigma_1 = \sigma(t + \tau/3)$, $\sigma_2 = \sigma(t + \tau)$, $f_1 = f(t + \tau/3)$, $f_2 = f(t + \tau)$, and $\tilde A = \frac{\tau}{12} A$. The global discretization error of the $x_2$-component of this method is $O(\tau^3)$, i.e. it is a third-order method, and it is stiffly A-stable even for arbitrarily strong variations of the coefficient $\sigma(t)$. This can be compared with the trapezoidal and implicit midpoint methods, which are only second-order accurate and not stiffly stable.
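A minimal sketch of scheme (7), assuming the scalar test problem $x' + \lambda x = 0$ (so $M = 1$, $A = \lambda$, $\sigma \equiv 1$, $f \equiv 0$, exact solution $e^{-\lambda t}$); the helper names and step counts are illustrative choices. Halving $\tau$ should reduce the error by roughly $2^3 = 8$, confirming third-order accuracy.

```python
import numpy as np

# Illustrative sketch of scheme (7) for the scalar test problem
# x' + lam*x = 0 (M = 1, A = lam, sigma = 1, f = 0), exact solution exp(-lam*t).
def radau2_step(x0, lam, tau):
    At = tau * lam / 12.0                    # scalar counterpart of A~ = (tau/12) A
    K = np.array([[1.0 + 5.0 * At, -At],
                  [9.0 * At, 1.0 + 3.0 * At]])
    x1, x2 = np.linalg.solve(K, np.array([x0, x0]))
    return x2                                # approximation at t + tau

def integrate(lam, T, nsteps):
    x, tau = 1.0, T / nsteps
    for _ in range(nsteps):
        x = radau2_step(x, lam, tau)
    return x

lam, T = 2.0, 1.0
errs = [abs(integrate(lam, T, n) - np.exp(-lam * T)) for n in (10, 20, 40)]
rates = [np.log2(errs[i] / errs[i + 1]) for i in range(2)]
# Third-order method: observed convergence rates should approach 3
assert all(r > 2.5 for r in rates)
```

In the scalar case, eliminating $x_1$ gives the stability function $R(\tau\lambda) = \bigl(1 - \frac{\tau\lambda}{3}\bigr)/\bigl(1 + \frac{2}{3}\tau\lambda + \frac{(\tau\lambda)^2}{6}\bigr)$, which agrees with $e^{-\tau\lambda}$ through the $(\tau\lambda)^3$ term and tends to 0 as $\tau\lambda \to \infty$, i.e. the method is stiffly A-stable.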

          The system in (7) can be solved via its Schur complement. Thereby, to avoid an inner system with matrix M + 5 σ 1 A ˜ http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq103_HTML.gif, we derive a modified form of the Schur complement system, that involves only an inner system with matrix M 1 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq104_HTML.gif. To this end, but only for the derivation of the method, we scale first the system with the block diagonal matrix [ M 1 0 0 M 1 ] http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq105_HTML.gif to get
          [ I + 5 σ 1 G σ 2 G 9 σ 1 G I + 3 σ 2 G ] [ x 1 x 2 ] = [ x 0 + τ 12 ( 5 f ˜ 1 f ˜ 2 ) x 0 + τ 4 ( 3 f ˜ 1 + f ˜ 2 ) ] , http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_Equq_HTML.gif
          where G = τ 12 M 1 A http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq106_HTML.gif and f ˜ i = M 1 f i http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq107_HTML.gif, i = 1 , 2 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq108_HTML.gif. The Schur complement system for x 2 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq91_HTML.gif is multiplied with ( I + 5 σ 1 G ) http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq109_HTML.gif. Using commutativity, we get then
$$\bigl[(I + 5\sigma_1 G)(I + 3\sigma_2 G) + 9\sigma_1\sigma_2 G^2\bigr]x_2 = (I + 5\sigma_1 G)\Bigl[x_0 + \frac{\tau}{4}(3\tilde f_1 + \tilde f_2)\Bigr] - 9\sigma_1 G\Bigl[x_0 + \frac{\tau}{12}(5\tilde f_1 - \tilde f_2)\Bigr],$$
or
$$\bigl[I + (5\sigma_1 + 3\sigma_2)G + 24\sigma_1\sigma_2 G^2\bigr]x_2 = (I - 4\sigma_1 G)x_0 + \frac{\tau}{4}(3\tilde f_1 + \tilde f_2) + 2\tau\sigma_1 G \tilde f_2.$$
          Hence,
$$B x_2 = \Bigl(M - \frac{\tau}{3}\sigma_1 A\Bigr)x_0 + \frac{\tau}{4}M(3\tilde f_1 + \tilde f_2) + \frac{1}{6}\tau^2\sigma_1 A \tilde f_2,$$
where
$$B = M + \frac{\tau}{12}(5\sigma_1 + 3\sigma_2)A + \frac{\tau^2}{6}\sigma_1\sigma_2 A M^{-1} A.$$
          (8)
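As a sanity check, the reduction to (8) can be verified on a scalar model problem ($M = 1$, $A = a$), where both the two-by-two block system and the reduced equation can be solved explicitly. The following sketch (parameter values are arbitrary illustrative choices, not from the text) solves the block system by Cramer's rule and confirms that the resulting $x_2$ satisfies (8):

```python
# Scalar sanity check (M = 1, A = a) of the Schur-complement reduction:
# the x2 obtained from the 2x2 block system must satisfy equation (8),
#   B x2 = (M - (tau/3) s1 A) x0 + (tau/4) M (3 f1 + f2) + (tau^2/6) s1 A f2.
tau, a, x0, f1, f2 = 0.1, 3.0, 1.0, 0.7, -0.4   # illustrative values
s1 = s2 = 1.0                                    # sigma_1, sigma_2
G = tau * a / 12.0                               # G = (tau/12) M^{-1} A, scalar here

# Block system [[1+5*s1*G, -s2*G], [9*s1*G, 1+3*s2*G]] [x1, x2] = [b1, b2]
b1 = x0 + tau / 12.0 * (5 * f1 - f2)
b2 = x0 + tau / 4.0 * (3 * f1 + f2)
det = (1 + 5 * s1 * G) * (1 + 3 * s2 * G) + 9 * s1 * s2 * G * G
x2 = ((1 + 5 * s1 * G) * b2 - 9 * s1 * G * b1) / det   # Cramer's rule

# Reduced equation (8): B x2 = rhs
B = 1 + tau / 12.0 * (5 * s1 + 3 * s2) * a + tau**2 / 6.0 * s1 * s2 * a * a
rhs = (1 - tau / 3.0 * s1 * a) * x0 + tau / 4.0 * (3 * f1 + f2) \
      + tau**2 / 6.0 * s1 * a * f2
assert abs(B * x2 - rhs) < 1e-12
```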

For higher-order Radau quadrature methods, the corresponding matrix polynomial $M^{-1}B$ is a polynomial of order m. By the fundamental theorem of algebra, it can be factorized into factors of at most second degree, and the corresponding systems can be solved in sequential order. Alternatively, using the method described in Remark 3.2, the solution components can be computed concurrently.

Each second-order factor can be preconditioned by the method in Section 2. The ability to factorize $Q_m(\tau E)$ into second-order factors and to solve the arising systems as two-by-two block matrix systems means that one only has to solve first-order systems. This is important if, for instance, M and A are large sparse band matrices, since one then avoids the growth of bandwidth that occurs in matrix products, and systems with linear combinations of M and A can be solved more efficiently than systems with higher-order polynomial combinations. Furthermore, this enables one to keep the matrices in element-by-element form (see, e.g. [9]), and it is in general not necessary to store the matrices M and A explicitly. The arising inner systems can be solved by some inner iteration method.

The problem with a direct factorization into first-order factors is that complex matrix factors appear. For the matrix in (8), this occurs when the ratio $\sigma_1/\sigma_2$ lies in the interval
$$\frac{3(11 - \sqrt{96})}{25} < \frac{\sigma_1}{\sigma_2} < \frac{3(11 + \sqrt{96})}{25}.$$
          (9)
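The interval (9) can be checked numerically: for a scalar variable $g$, the quadratic factor in (8) reads $24\sigma_1\sigma_2 g^2 + (5\sigma_1 + 3\sigma_2)g + 1$, and its roots are complex exactly when the discriminant $(5\sigma_1 + 3\sigma_2)^2 - 96\sigma_1\sigma_2$ is negative. A minimal sketch:

```python
import math

# The quadratic factor from (8) in a scalar variable g:
#   q(g) = 24*s1*s2*g^2 + (5*s1 + 3*s2)*g + 1.
# Complex roots occur exactly when (5*s1 + 3*s2)^2 < 96*s1*s2, which in the
# ratio xi = s1/s2 reads 25*xi^2 - 66*xi + 9 < 0, i.e. xi inside interval (9).
lo = 3 * (11 - math.sqrt(96)) / 25     # ~ 0.144
hi = 3 * (11 + math.sqrt(96)) / 25     # ~ 2.496

def complex_roots(s1, s2):
    return (5 * s1 + 3 * s2) ** 2 < 96 * s1 * s2

for xi in [0.1, 0.144, 0.5, 1.0, 2.0, 2.496, 3.0]:
    inside = lo < xi < hi
    assert complex_roots(xi, 1.0) == inside
print(round(lo, 3), round(hi, 3))   # 0.144 2.496
```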
Therefore, it is more efficient to keep the second-order factors and instead solve the corresponding systems by preconditioned iterations. Thereby, the preconditioner involves only first-order factors. As shown in Section 2, a very efficient preconditioner for the matrix B in (8) is
$$C = C_\alpha = (M + \alpha\tau A)M^{-1}(M + \alpha\tau A),$$
          (10)

where $\alpha > 0$ is a parameter. As already shown in [5], the following holds for the above particular application.

Proposition 3.1 Let B, C be as defined in (8) and (10), and assume that M is spd and A is spsd. Then, letting
$$\alpha = \max\bigl\{\sqrt{\sigma_1\sigma_2/6},\ (5\sigma_1 + 3\sigma_2)/24\bigr\},$$
it holds that
$$\kappa(C^{-1}B) \le \max_{i=1,2}\delta_i^{-1},$$
where
$$1 \ge \delta_1 = \frac{(5\sigma_1 + 3\sigma_2)/24}{\alpha} \ge \frac{\sqrt{10}}{4}, \qquad 1 \ge \delta_2 = \frac{\sigma_1\sigma_2/6}{\alpha^2}.$$

If $0.144 \le \sigma_1/\sigma_2 \le 2.496$, then $\delta_2 = 1$ and $\delta_1 \ge \sqrt{5/8}$.

The spectral condition number is then bounded by
$$\kappa(C^{-1}B) \le \sqrt{8/5} \approx 1.265.$$
If $\sigma_1 = \sigma_2$, then
$$\kappa(C^{-1}B) \le \sqrt{3/2} \approx 1.225.$$
Proof Let $(u,v)$ denote the $\ell_2$ inner product of $u, v \in \mathbb{R}^n$. We have
$$(Cx,x) - (Bx,x) = 2\alpha\tau(1 - \delta_1)(Ax,x) + \alpha^2\tau^2(1 - \delta_2)(AM^{-1}Ax,x) \quad \forall x \in \mathbb{R}^n.$$
It follows that
$$(Bx,x) \le (Cx,x).$$
By the arithmetic-geometric means inequality, we have
$$\delta_1 = \frac{5\sigma_1 + 3\sigma_2}{24\alpha} \ge \frac{2\sqrt{15\sigma_1\sigma_2}}{24\alpha} \ge \frac{1}{12}\sqrt{90} = \frac{\sqrt{10}}{4}.$$
          (11)
A computation shows that
$$\sigma_1\sigma_2/6 \ge \Bigl(\frac{5\sigma_1 + 3\sigma_2}{24}\Bigr)^2$$
for $0.144 \le \xi \le 2.496$, where $\xi = \sigma_1/\sigma_2$. Further, a computation shows that $\delta_1 \ge \sqrt{5/8}$, which is in accordance with the lower bound in (11). Since
$$(Cx,x) \ge 2\alpha\tau(Ax,x) + \alpha^2\tau^2(AM^{-1}Ax,x),$$
it follows that
$$1 - \frac{(Bx,x)}{(Cx,x)} \le 1 - \delta_1,$$
or
$$\frac{(Bx,x)}{(Cx,x)} \ge \delta_1 \ge \sqrt{5/8}.$$
For $\sigma_1 = \sigma_2$, a computation shows that
$$\delta_1 = \frac{1}{3}\sqrt{6} = \sqrt{2/3}.$$

           □

We conclude that the condition number is very close to its ideal unit value 1, leading to very few iterations. For instance, at most five conjugate gradient iterations suffice for a relative accuracy of $10^{-6}$.
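For $M = I$ and a diagonal spsd $A$, both $B$ and $C_\alpha$ are diagonal, so the spectral condition number $\kappa(C^{-1}B)$ is simply the spread of the eigenvalue ratios $B(\lambda)/C(\lambda)$. A small sketch (the sample eigenvalues and step size are arbitrary illustrative choices) confirming the bound $\sqrt{3/2}$ for $\sigma_1 = \sigma_2$:

```python
import math

# Spectral check of Proposition 3.1 for M = I, A = diag(lam): B and C are then
# diagonal, and kappa(C^{-1}B) is the spread of the scalar ratios B(lam)/C(lam).
tau, s1, s2 = 0.05, 1.0, 1.0
alpha = max(math.sqrt(s1 * s2 / 6.0), (5 * s1 + 3 * s2) / 24.0)

def ratio(lam):
    Bl = 1 + tau / 12.0 * (5 * s1 + 3 * s2) * lam + tau**2 / 6.0 * s1 * s2 * lam**2
    Cl = (1 + alpha * tau * lam) ** 2
    return Bl / Cl

rats = [ratio(10.0 ** k) for k in range(-2, 6)]   # eigenvalues 0.01 .. 1e5
kappa = max(rats) / min(rats)
assert 1.0 < kappa <= math.sqrt(1.5) + 1e-12      # bound sqrt(3/2) ~ 1.2247
```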

          Remark 3.1 High order implicit Runge-Kutta methods and their discretization error estimates can be derived using order tree methods as described in [1] and [10].

For an early presentation of implicit Runge-Kutta methods, see [2] and also [4], where the method was called a global integration method to indicate its capability, for large values of m, to use few, or even just one, time-discretization steps. It was also shown that the coefficient matrix formed by the quadrature coefficients has a dominating lower triangular part, enabling the use of a matrix splitting and a Richardson iteration method. It can be of interest to point out that the Radau method for $m = 2$ can be described in an alternative way, using Radau quadrature for the whole time-step interval combined with a trapezoidal method for the shorter interval.

Namely, let $\frac{du}{dt} + f(t,u) = 0$, $t_{k-1} < t < t_k$. Then Radau quadrature on the interval $(t_{k-1}, t_k)$ has quadrature points $t_{k-1} + \tau/3$, $t_k$ and coefficients $b_1 = 3/4$, $b_2 = 1/4$, which results in the relation
$$\tilde u_1 - \tilde u_0 + \frac{3\tau}{4}f(\tilde t_{1/3}, \tilde u_{1/3}) + \frac{\tau}{4}f(\tilde t_1, \tilde u_1) = 0,$$

where $\tilde u_1$, $\tilde u_{1/3}$, $\tilde u_0$ denote the corresponding approximations of u at $\tilde t_1 = t_{k-1} + \tau$, $\tilde t_{1/3} = t_{k-1} + \tau/3$ and $t_{k-1}$, respectively.

This equation is coupled with an equation based on the quadrature
$$u(t_{k-1} + \tau/3) - u(t_{k-1}) + \int_{t_{k-1}}^{t_k} f(t,u)\,dt - \int_{t_{k-1}+\tau/3}^{t_k} f(t,u)\,dt = 0,$$
which, using the stated quadrature rules, results in
$$\tilde u_{1/3} - \tilde u_0 + \frac{3\tau}{4}f(\tilde t_{1/3}, \tilde u_{1/3}) + \frac{\tau}{4}f(\tilde t_1, \tilde u_1) - \frac{1}{2}\cdot\frac{2\tau}{3}\bigl[f(\tilde t_{1/3}, \tilde u_{1/3}) + f(\tilde t_1, \tilde u_1)\bigr] = 0,$$
that is,
$$\tilde u_{1/3} - \tilde u_0 + \frac{5\tau}{12}f(\tilde t_{1/3}, \tilde u_{1/3}) - \frac{\tau}{12}f(\tilde t_1, \tilde u_1) = 0.$$
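For the linear test equation $du/dt + u = 0$ (i.e. $f(t,u) = u$), the two coupled relations above form a $2\times 2$ linear system per step that can be solved exactly, and the scheme should then exhibit the third-order convergence of the two-stage Radau method. A minimal sketch (test problem and step sizes are illustrative choices):

```python
import math

# One step of the two-stage Radau scheme from Remark 3.1 for du/dt + u = 0
# (f(t,u) = u), solving the linear 2x2 system for (u_{1/3}, u_1) exactly:
#   (1 + 5*tau/12) u13 - (tau/12) u1     = u0
#   (3*tau/4)      u13 + (1 + tau/4) u1  = u0
def radau_step(u0, tau):
    a11, a12 = 1 + 5 * tau / 12, -tau / 12
    a21, a22 = 3 * tau / 4, 1 + tau / 4
    det = a11 * a22 - a12 * a21
    return (a11 * u0 - a21 * u0) / det    # Cramer's rule for u1

def integrate(tau, T=1.0):
    u, n = 1.0, round(T / tau)
    for _ in range(n):
        u = radau_step(u, tau)
    return u

exact = math.exp(-1.0)
e1 = abs(integrate(0.1) - exact)
e2 = abs(integrate(0.05) - exact)
order = math.log(e1 / e2, 2)
assert 2.7 < order < 3.3        # third-order convergence
```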

Remark 3.2 The system arising in a high-order method involving $q \ge 2$ quadratic polynomial factors can be solved sequentially, in the order the factors appear. Alternatively (see, e.g. [11], Exercise 2.31), one can use a method based on solving a matrix polynomial equation $P_{2q}(A)x = b$ as $x = \sum_{k=1}^{2q} \frac{1}{P'_{2q}(r_k)} x_k$, $x_k = (A - r_k I)^{-1} b$, where $\{r_k\}_1^{2q}$ is the set of zeros of the polynomial and it is assumed that A has no eigenvalues in this set. (This holds in our applications.) Then, combining pairs of terms corresponding to complex conjugate roots $r_k$, quadratic polynomials arise for the computation of the corresponding solution components. It is seen that in this method the solution components can be computed concurrently.
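The partial-fraction formula can be illustrated in the scalar case, where $A$ is a number and every factor is explicit; for simple roots $r_k$ one has $1/P_{2q}(\lambda) = \sum_k 1/\bigl(P'_{2q}(r_k)(\lambda - r_k)\bigr)$. A sketch with arbitrarily chosen roots (a complex conjugate pair included):

```python
# Partial-fraction solve of P(A) x = b (Remark 3.2), illustrated with a scalar
# "matrix" A = a.  For simple roots r_k,
#   1/P(lam) = sum_k 1 / (P'(r_k) * (lam - r_k)),
# hence x = sum_k (1/P'(r_k)) * x_k with x_k = (A - r_k I)^{-1} b, and the x_k
# are independent of each other, i.e. computable concurrently.
roots = [2.0, -1.5, 1 + 2j, 1 - 2j]      # illustrative zeros of P, none equal to a
a, b = 0.5, 3.0

def P(lam):
    out = 1.0
    for r in roots:
        out *= lam - r
    return out

def dP(r):                                # P'(r) for a simple root r
    out = 1.0
    for s in roots:
        if s != r:
            out *= r - s
    return out

x = sum((1.0 / dP(r)) * (b / (a - r)) for r in roots)
assert abs(x - b / P(a)) < 1e-12          # agrees with the direct solve
assert abs(x.imag) < 1e-12                # real, since roots come in conjugate pairs
```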

Remark 3.3 Differential algebraic equations (DAEs) arise in many important problems; see, for instance, [10, 12]. The simplest example of a DAE takes the form
$$\begin{cases} \dfrac{du}{dt} = f(t,u,v), \\ g(t,u,v) = 0, \end{cases} \quad t > 0,$$
with $u(0) = u_0$, $v(0) = v_0$, and it is normally assumed that the initial values satisfy the constraint equation, i.e.
$$g(0, u_0, v_0) = 0.$$
If $\det\bigl(\frac{\partial g}{\partial v}\bigr) \ne 0$ in a sufficiently large set around the solution, one can formally eliminate the second part of the solution to form a differential equation in standard form,
$$\frac{du}{dt} = f(t, u, v(u)), \quad t > 0,\ u(0) = u_0.$$
Such a DAE is said to have index one; see, e.g. [13]. It can be seen as a limit case of the system
$$\begin{cases} \dfrac{du}{dt} = f(t,u,v), \\ \dfrac{dv}{dt} = \dfrac{1}{\varepsilon}\, g(t,u,v), \end{cases}$$

where $\varepsilon > 0$ and $\varepsilon \to 0$.

Hence, such a DAE can be considered as an infinitely stiff differential equation problem. For strongly or infinitely stiff problems, an order reduction phenomenon can occur. This follows since some high-order terms in the error expansion (cf. Section 5) are multiplied by (infinitely) large factors, leading to an order reduction for some methods. Heuristically, this can be understood to occur for the Gauss integration form of IRK methods, but it does not occur for the stiffly stable variants, such as those based on Radau quadrature. For further discussions of this, see, e.g. [10, 13].
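The index-one elimination can be illustrated on a small artificial DAE (the functions below are illustrative choices, not from the text): with $g(t,u,v) = v - u^2$ one has $\partial g/\partial v = 1 \ne 0$, so $v$ can be eliminated and the resulting standard-form ODE integrated, e.g. by the implicit Euler method:

```python
# Index-1 DAE sketch for Remark 3.3:
#   du/dt = f(t,u,v) = -v,   g(t,u,v) = v - u^2 = 0,
# so v = u^2 can be eliminated, giving du/dt = -u^2 with exact solution
# u(t) = u0 / (1 + u0*t) for u(0) = u0.
def step_eliminated(u, tau):
    # implicit Euler on du/dt = -u^2: u_new = u - tau * u_new^2;
    # solve the quadratic tau*x^2 + x - u = 0 for the positive root
    return (-1 + (1 + 4 * tau * u) ** 0.5) / (2 * tau)

u, tau, T = 1.0, 1e-3, 1.0
for _ in range(round(T / tau)):
    u = step_eliminated(u, tau)
exact = 1.0 / (1.0 + T)
assert abs(u - exact) < 5e-3        # first-order accuracy of implicit Euler
```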

          4 High order integration methods for Hamiltonian systems

Another important application of high-order time integration methods occurs for Hamiltonian systems. Such systems occur in mechanics and particle physics, for instance. As an introduction, consider the conservation of energy principle. To this end, consider a mechanical system of k point masses and its associated Lagrangian functional,
$$L = K - V = \sum_{i=1}^k \frac{1}{2} m_i |\dot x_i|^2 - V(x_1, \ldots, x_k),$$

where K is the kinetic energy and V the potential energy. Here, $x_i = (x_i, y_i, z_i)$ denotes the Cartesian coordinates of the i-th point mass $m_i$.

The configuration strives to minimize the total energy. The corresponding Euler-Lagrange equations then become $\frac{\delta L}{\delta x_i} = 0$, that is,
$$m_i \ddot x_i = -\frac{\partial V}{\partial x_i}, \quad i = 1, 2, \ldots, k.$$
          (12)
We consider conservative systems, i.e. mechanical systems for which the total forces on the elements of the system are related to the potential $V: \mathbb{R}^{3k} \to \mathbb{R}$ according to
$$F_i = -\frac{\partial V}{\partial x_i}.$$
This means that the Euler-Lagrange equation (12) is identical to the classical Newton's law
$$m_i \ddot x_i = F_i, \quad i = 1, 2, \ldots, k.$$
Let $p_i = m_i v_i$ be the momentum. Then
$$K = \sum_{i=1}^k \frac{1}{2}\frac{|p_i|^2}{m_i}.$$
A mechanical system can be described by generalized coordinates
$$q = (q_1, \ldots, q_d),$$
i.e. not necessarily Cartesian coordinates, but angles, lengths along a curve, etc. The Lagrangian then takes the form $L(q, \dot q, t)$. If q is determined to satisfy
$$\min_q \int_a^b L(q, \dot q, t)\,dt,$$
then the motion of the system is described by the Lagrange equation,
$$\frac{d}{dt}\frac{\partial L}{\partial \dot q}(q, \dot q, t) = \frac{\partial L}{\partial q}(q, \dot q, t).$$
          (13)
Letting here
$$p_k = \frac{\partial L}{\partial \dot q_k}(q, \dot q), \quad k = 1, 2, \ldots, d,$$
be the momentum variables, and using the transformation $(q, \dot q) \mapsto (q, p)$, we can write (13) in terms of the Hamiltonian,
$$H(p,q,t) = \sum_{j=1}^d p_j \dot q_j - L(q, \dot q(q,p,t), t).$$
For a mechanical system with potential energy a function of the configuration only and kinetic energy K given by a quadratic form
$$K = \frac{1}{2}\dot q^T G(q)\dot q,$$
where G is an spd matrix, possibly depending on q, we get
$$p = G(q)\dot q, \qquad \dot q = G^{-1}(q) p$$
          (14)
and
$$H(p,q,t) = p^T G^{-1}(q) p - \frac{1}{2} p^T G^{-1}(q) p + V(q) = \frac{1}{2} p^T G^{-1}(q) p + V(q) = K(p,q) + V(q),$$

which equals the total energy of the system.

The corresponding Euler-Lagrange equations now become
$$\begin{cases} \dot p = -\dfrac{\partial H}{\partial q}, \\ \dot q = \dfrac{\partial H}{\partial p}, \end{cases}$$
          (15)
and are referred to as the Hamiltonian system. This follows from
$$\frac{\partial H}{\partial p} = \dot q^T + p^T\frac{\partial \dot q}{\partial p} - \frac{\partial L}{\partial \dot q}\frac{\partial \dot q}{\partial p} = \dot q^T, \qquad \frac{\partial H}{\partial q} = p^T\frac{\partial \dot q}{\partial q} - \frac{\partial L}{\partial q} - \frac{\partial L}{\partial \dot q}\frac{\partial \dot q}{\partial q} = -\frac{\partial L}{\partial q},$$

which, since $\frac{d}{dt}\bigl(\frac{\partial L}{\partial \dot q}\bigr) = \frac{\partial L}{\partial q}$ implies $\dot p = \frac{\partial L}{\partial q}$, are hence equivalent to the Lagrange equations.

By (15), it holds that
$$\frac{d}{dt}H(p,q) = \frac{\partial H}{\partial p}\dot p + \frac{\partial H}{\partial q}\dot q = 0,$$
          (16)

that is, the Hamiltonian function $H(p,q)$ is a first integral of the system (15).

The flow $\varphi_t: U \to \mathbb{R}^{2d}$ of a Hamiltonian system is the mapping that describes the evolution of the solution in time, i.e. $\varphi_t(p_0, q_0) = (p(t, p_0, q_0), q(t, p_0, q_0))$, where $p(t, p_0, q_0)$, $q(t, p_0, q_0)$ is the solution of the system for the initial values $p(0) = p_0$, $q(0) = q_0$.

We now consider a Hamiltonian with a quadratic first integral of the form
$$H(y) = y^T C y, \quad y = (p, q),$$
          (17)

          where C is a symmetric matrix. For the solution of the Hamiltonian system (15), we shall use an implicit Runge-Kutta method based on Gauss quadrature.

The s-stage Runge-Kutta method applied to an initial value problem $\dot y = f(t,y)$, $y(t_0) = y_0$, is defined by
$$\begin{cases} k_i = f\bigl(t_0 + c_i\tau,\ y_0 + \tau\sum_{j=1}^s a_{ij} k_j\bigr), \quad i = 1, 2, \ldots, s, \\ y_1 = y_0 + \tau\sum_{i=1}^s b_i k_i, \end{cases}$$
          (18)

where $c_i = \sum_{j=1}^s a_{ij}$; see, e.g. [1, 4]. The familiar implicit midpoint rule is the special case $s = 1$. Here, $c_1, \ldots, c_s$ are the zeros of the shifted Legendre polynomial $\frac{d^s}{dx^s}\bigl(x^s(1-x)^s\bigr)$. For a linear problem, this results in a system which can be solved by the quadratic polynomial decomposition and the preconditioned iterative solution method presented in Section 2.
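For $s = 2$, the nodes can be computed directly from the stated characterization: $x^2(1-x)^2 = x^2 - 2x^3 + x^4$ has second derivative $2 - 12x + 12x^2$, whose zeros are the familiar Gauss points $\tfrac12 \mp \tfrac{\sqrt{3}}{6}$. A short sketch:

```python
import math

# For s = 2 the Gauss nodes c_i are the zeros of d^2/dx^2 [x^2 (1-x)^2].
# Since x^2(1-x)^2 = x^2 - 2x^3 + x^4, the second derivative is 2 - 12x + 12x^2.
c = sorted([(12 - math.sqrt(48)) / 24, (12 + math.sqrt(48)) / 24])
assert abs(c[0] - (0.5 - math.sqrt(3) / 6)) < 1e-12
assert abs(c[1] - (0.5 + math.sqrt(3) / 6)) < 1e-12
assert abs(c[0] + c[1] - 1.0) < 1e-12     # nodes symmetric about 1/2
```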

If $u(t)$ is a polynomial of degree s, then (18) takes the collocation form
$$u(t_0) = y_0, \qquad \dot u(t_0 + c_i\tau) = f\bigl(t_0 + c_i\tau,\ u(t_0 + c_i\tau)\bigr), \quad i = 1, \ldots, s,$$
          (19)

and $y_1 = u(t_0 + \tau)$.

For the Hamiltonian (17), it holds that
$$\frac{d}{dt}H(y(t)) = 2 y(t)^T C \dot y(t),$$
and it follows from (16) that
$$y_1^T C y_1 - y_0^T C y_0 = 2\int_{t_0}^{t_0+\tau} u(t)^T C \dot u(t)\,dt.$$
Since the integrand is a polynomial of degree $2s - 1$, it is evaluated exactly by the s-stage Gaussian quadrature formula. Therefore, since
$$u(t_0 + c_i\tau)^T C \dot u(t_0 + c_i\tau) = u(t_0 + c_i\tau)^T C f\bigl(u(t_0 + c_i\tau)\bigr) = 0,$$

it follows that the energy quadratic forms $y_i^T C y_i$ are conserved.

This is an important property of Hamiltonian systems, and integrators that preserve it are referred to as symplectic. For further references on symplectic integrators, see [10].
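The conservation of quadratic invariants can be observed for the $s = 1$ Gauss method (the implicit midpoint rule) on the harmonic oscillator $\dot p = -q$, $\dot q = p$, for which $H = p^2 + q^2$ is of the form (17) with $C = I$. A minimal sketch (step size and initial data are arbitrary illustrative choices):

```python
# Implicit midpoint rule (the s = 1 Gauss method) for p' = -q, q' = p:
# it conserves the quadratic invariant H = p^2 + q^2 exactly (up to rounding),
# illustrating the symplectic property discussed in the text.
def midpoint_step(p, q, tau):
    h = tau / 2.0
    # p1 = p - tau*(q + q1)/2 and q1 = q + tau*(p + p1)/2 form a linear
    # 2x2 system [[1, h], [-h, 1]] [p1, q1] = [p - h*q, q + h*p]
    det = 1 + h * h
    p1 = ((p - h * q) - h * (q + h * p)) / det
    q1 = ((q + h * p) + h * (p - h * q)) / det
    return p1, q1

p, q, tau = 1.0, 0.3, 0.25
H0 = p * p + q * q
for _ in range(200):
    p, q = midpoint_step(p, q, tau)
assert abs(p * p + q * q - H0) < 1e-12    # energy conserved to rounding error
```

By contrast, the explicit Euler method applied to the same system makes $p^2 + q^2$ grow at every step, which is why symplectic integrators matter for long-time integration.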

          5 Discretization error estimates

Error estimation methods for parabolic and hyperbolic problems can differ greatly. Parabolic problems are characterized by the monotonicity property (6), while for hyperbolic problems a corresponding conservation property,
$$(F(t,u) - F(t,v),\ u - v) = 0, \quad t > 0,\ u, v \in V,$$
holds, implying
$$\|u(t) - v(t)\| = \|u(0) - v(0)\|, \quad t \ge 0.$$
          (20)

Hence, there is no decrease of errors that occurred at earlier time steps. On the other hand, the strong monotonicity property of parabolic problems implies that errors from earlier time steps decrease exponentially as time evolves.

For the discretization error of such parabolic-type problems for a convex combination of the implicit Euler method and the midpoint method, referred to as the θ-method, the following holds (see [14]). Similar estimates can also be derived for the Radau quadrature method; see, e.g. [10].

          The major result in [14] is the following.

Let $u_t^{(s)} = \frac{\partial^s u(t)}{\partial t^s}$. Consider the problem $u_t = F(t, u(t))$, where u belongs to some function space V, and the corresponding truncation error,
$$R_\theta(t,u) = F(\bar t, \bar u(t)) - \tau^{-1}\int_t^{t+\tau} u_t(s)\,ds = u_t(\bar t) - \tau^{-1}\bigl[u(t+\tau) - u(t)\bigr] + F(\bar t, \bar u(t)) - F(\bar t, u(\bar t)),$$

where $\bar t = \theta t + (1-\theta)(t+\tau)$, $\bar u(t) = \theta u(t) + (1-\theta)u(t+\tau)$, $0 \le \theta \le 1$.

If $u \in C^3(V)$, then a Taylor expansion shows that
$$R_\theta(t,u) = \frac{1}{24}\tau^2 u_t^{(3)}(t_3) + \Bigl(\frac{1}{2} - \theta\Bigr)\tau\, u_t^{(2)}(t_2) + \frac{1}{2}\theta(1-\theta)\tau^2 F_u(\bar t, \tilde u(\bar t))\, u_t^{(2)}(t_1), \quad t < t_i < t + \tau,\ i = 1, 2, 3,$$
          (21)

where $\tilde u(\bar t)$ takes values in a tube of radius $\|\bar u(t) - u(\bar t)\|$ about the solution $u(t)$.

          It follows that if
$$\|F_u(\bar t, \tilde u(\bar t))\, u_t^{(2)}(t_1(t))\| \le C_1$$
          (22)
and $\theta = \frac{1}{2} - O(\tau)$, then
$$R_\theta(t,u) = O(\tau^2).$$
Under the above conditions, the discretization error $e(t) = u(t) - v(t)$, where the approximate solution v is defined by
$$v(t+\tau) - v(t) - \tau F(\bar t, \bar v(t)) = 0, \quad t = 0, \tau, 2\tau, \ldots,$$
with $v(0) = u(0)$, satisfies:
          1. (i)

            if F is strongly monotone and 1 2 | O ( τ ) | θ θ 0 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq186_HTML.gif, then e ( t ) ϱ 0 1 C τ 2 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq187_HTML.gif, t > 0 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq188_HTML.gif;

             
          2. (ii)

            if F is monotone (or conservative) and 1 2 | O ( τ ) | θ 1 2 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq189_HTML.gif, then e ( t ) t C τ 2 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq190_HTML.gif, t > 0 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq191_HTML.gif.

             

          Here, C http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq192_HTML.gif depends on u t ( 2 ) http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq193_HTML.gif and u t ( 3 ) http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_IEq194_HTML.gif, but is independent of the stiffness of the problem under the appropriate conditions stated above.

If the solution u is smooth, so that $F_u' u_t^{(2)}$ also has only smooth components, then $\|F_u' u_t^{(2)}\|$ may be much smaller than $\|F_u'\|\,\|u_t^{(2)}\|$, showing that the stiffness, i.e. factors $\|F_u'\| \gg 1$, does not enter the error estimate.

In many problems, we can expect that $F_u' u_t^{(2)}$ is of the same order as $u_t^{(3)}$, i.e. the first and last terms in (21) have the same order. In particular, this holds for a linear problem $u_t + Au = 0$, where $u_t^{(3)} = -A^3 u = -F_u' u_t^{(2)}$.

It is seen from (20) that for hyperbolic (conservative) problems, like the Hamiltonian problem in Section 4, the discretization error grows at least linearly with t, and likely faster if the solution is not sufficiently smooth. It may then be necessary to control the error by coupling the numerical time-integration method with an adaptive time-step control. We present here such a method based on backward integration at each time step using the adjoint operator. The use of adjoint operators in error estimates goes back to the classical Aubin-Nitsche $L_2$-lifting method used in boundary value problems to derive discretization error estimates in the $L_2$ norm. It has also been used for error estimates in initial value problems; see e.g. [7].

          Assume that the monotonicity assumption (20) holds. We show first a nonlinear (monotone) stability property, called B-stability, that holds for the numerical solution of implicit Runge-Kutta methods based on Gauss quadrature points. It goes back to a scientific note in [15]; see also [16].

Let $\tilde u$, $\tilde v$ be two approximate solutions of $u' = f(u,t)$, $t > 0$, extended to polynomials of degree m from their pointwise values at $t_{k,i}$ in the interval $[t_{k-1},t_k]$. Let

$$\Psi(t) = \frac{1}{2}\|\tilde u(t) - \tilde v(t)\|^2.$$
Then, since by (18) $\tilde u(t)$ and $\tilde v(t)$ satisfy the differential equation at the quadrature points, it holds by (19) that

$$\Psi'(t_{k,i}) = \bigl(\tilde u'(t_{k,i}) - \tilde v'(t_{k,i}),\,\tilde u(t_{k,i}) - \tilde v(t_{k,i})\bigr) = \bigl(f(\tilde u(t_{k,i})) - f(\tilde v(t_{k,i})),\,\tilde u(t_{k,i}) - \tilde v(t_{k,i})\bigr) \le 0,$$

$i = 1,2,\dots,m$, where $\{t_{k,i}\}_{i=1}^m$ is the set of quadrature points. Since $\Psi'(t)$ is a polynomial of degree $2m-1$, Gauss quadrature is exact, so

$$\Psi(t_k) - \Psi(t_{k-1}) = \int_{t_{k-1}}^{t_k}\Psi'(s)\,ds = \sum_{i=1}^m b_i\,\Psi'(t_{k,i}) \le 0.$$

Here, $b_i > 0$ are the quadrature coefficients.

Hence,

$$\|\tilde u(t_k) - \tilde v(t_k)\| \le \|\tilde u(t_{k-1}) - \tilde v(t_{k-1})\| \le \cdots \le \|\tilde u(0) - \tilde v(0)\|,\quad k = 1,2,\dots.$$

Since $\Psi^{(2m)}(t) \ge 0$, this monotonicity property can be seen to hold also for the Radau quadrature method.
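The contraction property above can be checked numerically. The following sketch is our own illustration (not the paper's Matlab code): it applies the standard two-stage Gauss method to the scalar monotone problem $u' = -u^3$, solving the stage equations by fixed-point iteration, and verifies that the distance between two numerical trajectories never increases.

```python
import numpy as np

# Two-stage Gauss method (order 4), standard Butcher tableau
s3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - s3 / 6], [0.25 + s3 / 6, 0.25]])
b = np.array([0.5, 0.5])

def f(u):
    # Monotone example: (f(u) - f(v)) * (u - v) <= 0 for all u, v
    return -u**3

def gauss_step(u0, tau, iters=100):
    # Solve the stage equations K_i = f(u0 + tau * sum_j a_ij K_j)
    # by simple fixed-point iteration (adequate for small tau)
    k = np.array([f(u0), f(u0)])
    for _ in range(iters):
        k = np.array([f(u0 + tau * (A[i, 0] * k[0] + A[i, 1] * k[1]))
                      for i in range(2)])
    return u0 + tau * (b @ k)

u, v, tau = 2.0, -1.0, 0.1
d = [abs(u - v)]                      # distances between the two trajectories
for _ in range(100):
    u, v = gauss_step(u, tau), gauss_step(v, tau)
    d.append(abs(u - v))
# B-stability: the distance sequence is non-increasing
assert all(d[i + 1] <= d[i] + 1e-9 for i in range(len(d) - 1))
```

The assertion mirrors the estimate $\|\tilde u(t_k)-\tilde v(t_k)\| \le \|\tilde u(t_{k-1})-\tilde v(t_{k-1})\|$; no step-size restriction is needed for it to hold.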

We present now a method for adaptive a posteriori error control for the initial value problem

$$u'(t) = \sigma(t)f(u(t)),\quad t > 0,\qquad u(0) = u_0,$$
(23)

where $u(t) \in \mathbb{R}^n$ and $f(u(t)) = -Au(t) + \tilde f(t)$.

For the implicit Runge-Kutta method with approximate solution $\tilde u(t)$, it holds

$$\tilde u(t_k) = \tilde u(t_{k-1}) + \int_{t_{k-1}}^{t_k}\sigma(t)f(\tilde u(t))\,dt,\quad k = 1,2,\dots,$$

where $\tilde u(t)$ is a piecewise polynomial of degree m.

The corresponding residual equals

$$R(\tilde u(t)) = \tilde u'(t) - \sigma(t)f(\tilde u(t)).$$

By the property of implicit Runge-Kutta methods, it is orthogonal,

$$\int_{t_{k-1}}^{t_k}\bigl(\tilde u'(t) - \sigma(t)f(\tilde u(t))\bigr)\cdot v\,dt = 0,\quad k = 0,1,\dots,$$
(24)

to all polynomials v of degree m. Here, the dot indicates the vector product in $\mathbb{R}^n$. The discretization error equals $e(t) = u(t) - \tilde u(t)$, $t > 0$. The error estimation will be based on backward integration of the adjoint operator problem,
$$\begin{cases} \varphi'(t) = \sigma(t)A^T\varphi(t), & t_{k-1} < t < t_k,\\ \varphi(t_k) = e(t_k). \end{cases}$$
(25)
Note that $-\sigma(t)Ae(t) = \sigma(t)\bigl(f(u(t)) - f(\tilde u(t))\bigr)$. It holds

$$|e(t_k)|^2 = |e(t_k)|^2 + \int_{t_{k-1}}^{t_k} e\cdot\bigl(\varphi' - \sigma(t)A^T\varphi\bigr)\,dt,$$

so by integration by parts, we get

$$|e(t_k)|^2 = \int_{t_{k-1}}^{t_k}\bigl(e' + \sigma(t)Ae\bigr)\cdot\varphi\,dt + e(t_{k-1})\cdot\varphi(t_{k-1}).$$

Here,

$$e' + \sigma(t)Ae = u' + \sigma(t)\bigl(Au - \tilde f(t)\bigr) - \bigl(\tilde u' + \sigma(t)(A\tilde u - \tilde f(t))\bigr) = -\tilde u' + \sigma(t)f(\tilde u) = -R(\tilde u).$$

Hence,

$$|e(t_k)|^2 = -\int_{t_{k-1}}^{t_k} R(\tilde u)\cdot\varphi\,dt + e(t_{k-1})\cdot\varphi(t_{k-1}).$$
Here, we can use the Galerkin orthogonality property (24) to get

$$|e(t_k)|^2 \le \bigl|e(t_{k-1})\cdot\varphi(t_{k-1})\bigr| + \min_{\tilde\varphi}\Bigl|\int_{t_{k-1}}^{t_k} R(\tilde u)\cdot(\varphi - \tilde\varphi)\,dt\Bigr|,$$

where $\tilde\varphi$ is a polynomial of degree m.

Since $\varphi(t_k) = e(t_k)$, it follows that

$$|e(t_k)| \le \frac{|\varphi(t_{k-1})|}{|\varphi(t_k)|}\,|e(t_{k-1})| + \min_{\tilde\varphi}\Bigl|\int_{t_{k-1}}^{t_k} R(\tilde u)\cdot\frac{\varphi - \tilde\varphi}{|\varphi(t_k)|}\,dt\Bigr|,$$

and from $\varphi'(t) = \sigma(t)A^T\varphi(t)$ and $\mu(A^T) = \mu(A) = \max_i \operatorname{Re}\lambda_i(A) = 0$, it follows that

$$|\varphi(t)| = e^{\int_t^{t_k}\mu(A)\sigma(s)\,ds}\,|\varphi(t_k)| = |\varphi(t_k)|.$$

Hence,

$$|e(t_k)| \le |e(t_{k-1})| + \min_{\tilde\varphi}\Bigl|\int_{t_{k-1}}^{t_k} R(\tilde u)\cdot\frac{\varphi - \tilde\varphi}{|\varphi(t_k)|}\,dt\Bigr|.$$
Under sufficient regularity assumptions, the last term can be bounded by $C\tau^{2m+1}$. Hence, the discretization error grows linearly with time,

$$|e(t_k)| \le C\,t_k\,\tau^{2m},\quad k = 0,1,\dots,$$

i.e. the implicit Runge-Kutta method based on Gaussian quadrature, applied to hyperbolic (conservative) problems, has order 2m.
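For a linear conservative problem, the order 2m can be observed directly. The sketch below is our own illustration (assuming the standard two-stage Gauss tableau, m = 2, so order 4): it integrates the skew-symmetric system $u' = Ju$, for which $\mu(J) = 0$, and estimates the observed order under step halving.

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])      # skew-symmetric: mu(J) = 0, conservative
s3 = np.sqrt(3.0)
A = np.array([[0.25, 0.25 - s3 / 6], [0.25 + s3 / 6, 0.25]])   # two-stage Gauss
b = np.array([0.5, 0.5])

def gauss_irk(u0, tau, nsteps):
    n = len(u0)
    # For linear u' = J u the stage system is linear:
    # (I - tau * (A kron J)) K = (J u; J u)
    M = np.eye(2 * n) - tau * np.kron(A, J)
    u = u0.copy()
    for _ in range(nsteps):
        K = np.linalg.solve(M, np.concatenate([J @ u, J @ u]))
        u = u + tau * (b[0] * K[:n] + b[1] * K[n:])
    return u

u0 = np.array([1.0, 0.0])
T = 10.0
exact = np.array([np.cos(T), -np.sin(T)])    # solution of u' = Ju, u(0) = (1, 0)
errs = [np.linalg.norm(gauss_irk(u0, T / n, n) - exact) for n in (50, 100, 200)]
orders = [np.log2(errs[i] / errs[i + 1]) for i in range(2)]   # observed order, approx. 4
```

Halving the step reduces the error by a factor of about $2^4 = 16$, consistent with order $2m = 4$; the norm of the numerical solution stays constant, reflecting the B-stability discussed above.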

          6 A numerical test example

We consider the linear parabolic problem

$$u_t + \sigma(t)\bigl(-\Delta u + b\cdot\nabla u - f\bigr) = 0,\quad t > 0,$$
(26)

in the unit square domain $\Omega = [0,1]^2$ with boundary conditions

$$\begin{cases} u = 0 & \text{on the parts } y = 0,\ y = 1,\\ \dfrac{\partial u}{\partial\nu} + u = g & \text{on the parts } x = 0,\ x = 1. \end{cases}$$
(27)
As initial function $u_0$, we choose a tent-like function with $u_0 = 1$ at the center of Ω and $u_0 = 0$ on ∂Ω; see Figure 1.

Figure 1. Initial function.

Here, $\sigma(t) = 1 + \frac{2}{5}\sin k\pi t$, where $k = 1,2,\dots$, $k \le 1/\tau$, is a parameter used to test the stability of the method with respect to oscillating coefficients, and τ is the time step used in the numerical solution of (26). Note that this function $\sigma(t)$ satisfies the conditions on the ratio $\sigma_1/\sigma_2$ from (9). We let $f(x,y) \equiv 2e^{-x}$.

Further, b is a vector satisfying $\nabla\cdot b = 0$. We choose $b = [\ell, 0]$, where ℓ is a parameter, possibly $\ell \gg 1$.

After a finite element or finite difference approximation, a system of the form (4) arises; for a finite difference approximation, $M = I$, the identity matrix. The Laplace operator is approximated by a nine-point difference scheme, and we use an upwind discretization of the convection term. In the corner points of the domain, we use the boundary conditions $-u_x + u = 0$ for $x = 0$ and $u_x + u = 0$ for $x = 1$.

The time discretization is given by the implicit Runge-Kutta method with the Radau quadrature for $m = 2$; see Section 3. For comparison, we also consider $m = 1$, i.e. the implicit Euler method, in some experiments. For solving the time-discretized problems, we use the GMRES method with preconditioners from Section 2 and with the tolerance $10^{-10}$; GMRES needs 5-6 iterations to reach this tolerance. The problem is implemented in Matlab.
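To indicate the structure of one such time step, the following sketch solves a simplified one-dimensional analogue in Python with SciPy (the paper's computations are in Matlab, and the exact LU factorization used here as preconditioner merely stands in for the cheaper preconditioners of Section 2): one step of the two-stage Radau IIA method for $u' = -Lu$, with the coupled stage system solved by preconditioned GMRES.

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.linalg import gmres, splu, LinearOperator

# Two-stage Radau IIA (order 2m - 1 = 3) Butcher tableau
Ar = np.array([[5 / 12, -1 / 12], [3 / 4, 1 / 4]])
br = np.array([3 / 4, 1 / 4])

n, tau = 50, 0.01
h = 1.0 / (n + 1)
# 1D Dirichlet Laplacian (three-point stencil), semi-discrete system u' = -L u
L = diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n)) / h**2

# Coupled stage system: (I + tau * (Ar kron L)) K = -(L u; L u)
Msys = (identity(2 * n) + tau * kron(Ar, L)).tocsc()
lu = splu(Msys)                                   # factorization as preconditioner
P = LinearOperator(Msys.shape, matvec=lu.solve)

u = np.sin(np.pi * h * np.arange(1, n + 1))       # smooth initial data
rhs = np.concatenate([-(L @ u), -(L @ u)])
K, info = gmres(Msys, rhs, M=P)                   # preconditioned GMRES
u_new = u + tau * (br[0] * K[:n] + br[1] * K[n:])
```

With this preconditioner, GMRES converges immediately; the point of the sketch is the block structure $I + \tau(A\otimes L)$ of the stage system, which is what the preconditioners of Section 2 exploit.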

The primary aim is to show how the time-discretization errors decrease and how fast the numerical approximation of (26)-(27) approaches its stationary value, i.e. the corresponding numerical solution of the stationary problem

$$\begin{cases} -\Delta\hat u + b\cdot\nabla\hat u = 2e^{-x} & \text{in } \Omega,\\ \hat u = 0 & \text{on the parts } y = 0,\ y = 1,\\ \dfrac{\partial\hat u}{\partial\nu} + \hat u = g & \text{on the parts } x = 0,\ x = 1. \end{cases}$$
(28)

          6.1 Experiments with a known and smooth stationary solution

If we let

$$g(y) = \begin{cases} 2y(1-y) & \text{for } x = 0,\\ 0 & \text{for } x = 1, \end{cases}$$

then the solution of (28) satisfies

$$\hat u(x,y) = e^{-x}y(1-y).$$
First, we investigate the influence of the space-discretization error on the stationary problem (28). To this end, we use the relative error in the Euclidean norm,

$$e_h = \frac{\|\hat u_h - \hat u\|_2}{\|\hat u\|_2}.$$

Here, $\hat u$, $\hat u_h$ denote the vectors representing the exact and numerical solutions of (28) at the nodal points, respectively. The errors in dependence on ℓ and h are given in Table 1. It is seen that the error decays as $O(h)$, which is caused by the first-order upwind approximation of the convection term.
Table 1. The error estimates in dependence on ℓ and h

ℓ \ h    1/10      1/20      1/50      1/100     1/150
1        1.2e−2    5.9e−3    2.3e−3    1.2e−3    7.7e−4
20       6.1e−1    4.5e−1    2.5e−1    1.4e−1    9.4e−2
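The $O(h)$ decay can be read off Table 1 by computing an observed convergence order between consecutive grids. A small sketch (using the Table 1 entries for ℓ = 1 on the grids h = 1/50 and h = 1/100):

```python
import math

# For e(h) ~ C h^p on two grids, the observed order is
# p = log(e1 / e2) / log(h1 / h2)
def observed_order(e1, e2, h1, h2):
    return math.log(e1 / e2) / math.log(h1 / h2)

# Table 1, row l = 1, grids h = 1/50 and h = 1/100
p = observed_order(2.3e-3, 1.2e-3, 1 / 50, 1 / 100)
# p is close to 1, consistent with the first-order upwind discretization
assert 0.8 < p < 1.1
```

The same computation on the row ℓ = 20 gives orders slightly below one, reflecting that the asymptotic regime sets in later for the stronger convection.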

Figures 2 and 3 depict the numerical stationary solutions for ℓ = 1 and ℓ = 20, respectively; the discretization parameter is h = 1/50.

Figure 2. Numerical stationary solution for ℓ = 1.

Figure 3. Numerical stationary solution for ℓ = 20.

Now we investigate how fast the numerical solution of (26)-(27) approaches the numerical solution of (28) in dependence on τ. We fix $k = 10$ and search for the smallest time T for which

$$\frac{\|u_h(T) - \hat u_h\|_2}{\|\hat u_h\|_2} < 10^{-6},$$

where the vectors $\hat u_h$ and $u_h(T)$ represent the numerical solution of (28) and the numerical solution of (26)-(27) at time T, respectively. The results for various ℓ, h, and τ are given in Table 2. We can observe that the dependence of the results on h is small. For smaller ℓ, the final time does not depend on τ, while for larger ℓ, the dependence on τ is more significant.

Table 2. Values of time T in dependence on h and τ.

Finally, we investigate how the time-discretization error decreases in dependence on τ at a fixed, relatively small time $T = 1/8$. We consider five time-discretization parameters: $\tau_1 = T$, $\tau_2 = T/2$, $\tau_3 = T/4$, $\tau_4 = T/8$, and $\tau_5 = T/16$. We compare the maximal differences between the vectors $u_i(T)$ and $u_{i+1}(T)$, $i = 1,\dots,4$, where $u_i(T)$ represents the numerical solution of (26)-(27) at time T for the time-discretization parameter $\tau_i$, $i = 1,\dots,5$. Thus, we investigate the error

$$e_i = \|u_{i+1}(T) - u_i(T)\|_\infty,\quad i = 1,\dots,4,$$

whose values are given in Tables 3-5. If we let $k = 10$, $\ell = 20$ and use various h, we obtain the results in Table 3. It is seen that the influence of h on the time-discretization error is small for the larger time steps, but more noticeable for the smaller time steps, when the time and space discretization errors are of the same order.
Table 3. Time discretization error at time T = 1/8 in dependence on h and τ

h \ i    1         2         3         4
1/20     1.7e−1    1.6e−2    3.0e−4    9.5e−6
1/50     1.8e−1    2.0e−2    5.6e−4    4.5e−6
1/100    1.8e−1    2.1e−2    6.8e−4    3.0e−6

If we let $k = 10$, $h = 1/50$, and $\ell = 1$ or $\ell = 20$, we obtain the results in Table 4. We can see that the investigated time-discretization error decreases faster for $\ell = 20$ than for $\ell = 1$.
Table 4. Time discretization error at time T = 1/8 in dependence on ℓ and τ

ℓ \ i    1         2         3         4
1        7.1e−2    3.9e−3    2.1e−4    2.8e−5
20       1.8e−1    2.0e−2    5.6e−4    4.5e−6

If we let $k = 0$ or $k = 10$, with $h = 1/50$ and $\ell = 20$, we obtain the results in Table 5.
Table 5. Time discretization error at time T = 1/8 in dependence on k and τ

k \ i    1         2         3         4
0        1.1e−1    2.4e−2    4.3e−4    2.4e−5
10       1.8e−1    2.0e−2    5.6e−4    4.5e−6

The errors in Tables 3-5 indicate that the expected error estimate $O(\tau^3)$ holds.
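The $O(\tau^3)$ behaviour can also be reproduced on a scalar model with the same kind of oscillating coefficient. The sketch below is our own illustration (assuming $\sigma(t) = 1 + \frac{2}{5}\sin k\pi t$ with k = 2, for which the exact solution of $u' = -\sigma(t)u$, $u(0) = 1$, is $u(T) = e^{-\int_0^T\sigma}$): it estimates the observed order of the two-stage Radau IIA method under step halving.

```python
import numpy as np

# Two-stage Radau IIA (order 2m - 1 = 3)
Ar = np.array([[5 / 12, -1 / 12], [3 / 4, 1 / 4]])
br = np.array([3 / 4, 1 / 4])
c = np.array([1 / 3, 1.0])

k = 2
sigma = lambda t: 1.0 + 0.4 * np.sin(k * np.pi * t)

def radau_solve(u0, T, nsteps):
    # u' = -sigma(t) u; the stage system is
    # (I + tau * S * Ar) K = -S (u, u)^T, with S = diag(sigma at stage times)
    tau, u, t = T / nsteps, u0, 0.0
    for _ in range(nsteps):
        S = np.diag(sigma(t + c * tau))
        K = np.linalg.solve(np.eye(2) + tau * S @ Ar, -S @ np.array([u, u]))
        u, t = u + tau * (br @ K), t + tau
    return u

T = 1.0
# exact value of exp(-int_0^T sigma); equals e^{-1} for k = 2, T = 1
exact = np.exp(-(T + 0.4 * (1 - np.cos(k * np.pi * T)) / (k * np.pi)))
errs = [abs(radau_solve(1.0, T, n) - exact) for n in (20, 40, 80)]
orders = [np.log2(errs[i] / errs[i + 1]) for i in range(2)]   # approx. 3
```

Each halving of τ reduces the error by a factor of about $2^3 = 8$, in agreement with the third-order estimate, while the implicit Euler comparison below only yields first order.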

          For comparison, we perform the same experiment as in Table 5 for the implicit Euler time discretization. The results are in Table 6.
Table 6. Time discretization error at time T = 1/8 in dependence on k and τ for the implicit Euler method

k \ i    1         2         3         4
0        7.8e−2    4.2e−2    1.7e−2    4.6e−3
10       2.5e−2    2.5e−2    4.8e−2    2.2e−2

The errors are here significantly influenced by the oscillation parameter k. For the larger value $k = 10$, we do not observe convergence. In the case $k = 0$, the convergence is of first order, $O(\tau)$, that is, much slower than for the Runge-Kutta method with the two-point Radau quadrature.

          6.2 Experiments with an unknown and less smooth stationary solution

          Here, we replace the above defined function g with the following one:
          g ( y ) = { y ( 1 y ) , y < 1 / 4  or  y > 3 / 4 , e 2 | y 1 / 2 | , 1 / 4 y 3 / 4 } for  x = 0 , 0 for  x = 1 http://static-content.springer.com/image/art%3A10.1186%2F1687-2770-2013-108/MediaObjects/13661_2013_Article_359_Equca_HTML.gif
          and prepare Tables 7 and 8 analogously to Tables 2 and 4, respectively. The results in Tables 7 and 8 are very similar to those in Tables 2 and 4. This means that the lower smoothness in space of the solution to (26)-(27) does not significantly influence the time discretization error.
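          For completeness, the modified boundary function can be coded directly; a small sketch follows (the negative sign in the exponent is a reconstruction of the garbled display above, chosen so that the exponential branch decays away from y = 1/2):

          ```python
          import math

          def g(y):
              """Boundary value at x = 0 from Section 6.2 (piecewise definition).

              Note: the function is intentionally less smooth; the two branches
              do not match at y = 1/4 and y = 3/4."""
              if y < 0.25 or y > 0.75:
                  return y * (1.0 - y)
              return math.exp(-2.0 * abs(y - 0.5))

          # At x = 1 the boundary value is simply zero.
          ```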
          Table 7

          Values of stabilized time T in dependence on k and τ

            k \ τ     1/5      1/10     1/20     1/40
            1         1.40     1.30     1.25     1.23
            20        1.60     0.80     0.45     0.20

          We let h = 1/50.

          Table 8

          Time discretization error at time T = 1/8 in dependence on k and τ

            k \ i     1         2         3         4
            1         7.4e−2    4.0e−2    2.0e−4    2.6e−5
            20        1.8e−1    1.9e−2    5.6e−4    4.5e−6

          In Figures 4 and 5, the numerical stationary solutions for k = 1 and k = 20, respectively, are depicted. The discretization parameter is h = 1/50.

          Figure 4. Numerical stationary solution for k = 1.

          Figure 5. Numerical stationary solution for k = 20.

          7 Concluding remarks

          There are several advantages in using high order time integration methods. Clearly, the major advantage is that the high order of the discretization error enables the use of larger, and hence fewer, timesteps to achieve a desired level of accuracy. Some of the methods, like Radau integration, are highly stable, i.e. they damp unwanted solution components exponentially fast, and do not suffer from the order reduction that is otherwise common for many other methods. The disadvantage of such high order methods is that one must solve a number of quadratic matrix polynomial equations. For this reason, much work has been devoted to the development of simpler methods, like diagonally implicit Runge-Kutta methods; see e.g. [10]. Such methods are, however, of lower order and may suffer from order reduction.

          In the present paper, it has been shown that the arising quadratic matrix polynomial factors can be handled in parallel, and that each of them can be solved efficiently with a preconditioning method, resulting in very few iterations. Each iteration involves just two real-valued first order matrix factors, similar to what arises in diagonally implicit Runge-Kutta methods. An alternative, stabilized explicit Runge-Kutta methods, i.e. methods where the stability domain has been extended by the use of certain forms of Chebyshev polynomials (see, e.g., [17]), can be competitive only for modestly stiff problems.
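          To illustrate the reduction to real-valued first order factors: a quadratic factor with a complex conjugate pair of roots λ, λ̄ leads to a linear system (A − λI)z = f with complex shift, which can be solved entirely in real arithmetic via an equivalent 2×2 real block form (in the spirit of the real-valued methods of [6]). A minimal numpy sketch, where the matrix A, the shift λ, and the right-hand side are illustrative and not taken from the paper:

          ```python
          import numpy as np

          def solve_complex_shift_real(A, lam, f):
              """Solve (A - lam*I) z = f for complex lam = a + i*b using only
              real arithmetic. Writing z = x + i*y and splitting real and
              imaginary parts gives the 2x2 real block system
                  [A - a*I    b*I ] [x]   [f]
                  [ -b*I   A - a*I] [y] = [0]."""
              n = A.shape[0]
              a, b = lam.real, lam.imag
              K = A - a * np.eye(n)
              block = np.block([[K, b * np.eye(n)],
                                [-b * np.eye(n), K]])
              rhs = np.concatenate([f, np.zeros(n)])
              xy = np.linalg.solve(block, rhs)
              return xy[:n] + 1j * xy[n:]

          # Verify against a direct complex solve on a small SPD matrix.
          A = np.array([[2.0, -1.0], [-1.0, 2.0]])
          lam = -0.5 + 0.8j                      # illustrative complex shift
          f = np.array([1.0, 0.0])
          z_real = solve_complex_shift_real(A, lam, f)
          z_direct = np.linalg.solve(A - lam * np.eye(2), f)
          assert np.allclose(z_real, z_direct)
          ```

          In practice one would of course not form and solve the doubled system directly, but precondition it; the sketch only shows that complex factors require no complex arithmetic.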

          It has also been shown that the methods are robust with respect to oscillations in the coefficients of the differential operator. Hence, in practice, high order methods perform robustly and do not suffer from any real disadvantage.

          Declarations

          Acknowledgements

          This paper was funded by King Abdulaziz University, under grant No. (35-3-1432/HiCi). The authors, therefore, acknowledge technical and financial support of KAU.

          Authors’ Affiliations

          (1)
          King Abdulaziz University
          (2)
          IT4 Innovations Department, Institute of Geonics AS CR

          References

          1. Butcher JC: Numerical Methods for Ordinary Differential Equations. 2nd edition. Wiley, Chichester; 2008.
          2. Butcher JC: Implicit Runge-Kutta processes. Math. Comput. 1964, 18: 50-64. DOI: 10.1090/S0025-5718-1964-0159424-9
          3. Axelsson O: A class of A-stable methods. BIT 1969, 9: 185-199. DOI: 10.1007/BF01946812
          4. Axelsson O: Global integration of differential equations through Lobatto quadrature. BIT 1964, 4: 69-86. DOI: 10.1007/BF01939850
          5. Axelsson O: On the efficiency of a class of A-stable methods. BIT 1974, 14: 279-287. DOI: 10.1007/BF01933227
          6. Axelsson O, Kucherov A: Real valued iterative methods for solving complex symmetric linear systems. Numer. Linear Algebra Appl. 2000, 7: 197-218. DOI: 10.1002/1099-1506(200005)7:4<197::AID-NLA194>3.0.CO;2-S
          7. Varga RS: Functional Analysis and Approximation Theory in Numerical Analysis. SIAM, Philadelphia; 1971.
          8. Gear CW: Numerical Initial Value Problems in Ordinary Differential Equations. Prentice Hall, New York; 1971.
          9. Fried I: Optimal gradient minimization scheme for finite element eigenproblems. J. Sound Vib. 1972, 20: 333-342. DOI: 10.1016/0022-460X(72)90614-1
          10. Hairer E, Wanner G: Solving Ordinary Differential Equations II. Stiff and Differential-Algebraic Problems. 2nd edition. Springer, Berlin; 1996.
          11. Axelsson O: Iterative Solution Methods. Cambridge University Press, Cambridge; 1994.
          12. Hairer E, Lubich Ch, Roche M: The Numerical Solution of Differential-Algebraic Systems by Runge-Kutta Methods. Lecture Notes in Mathematics 1409. Springer, Berlin; 1989.
          13. Petzold LR: Order results for implicit Runge-Kutta methods applied to differential/algebraic systems. SIAM J. Numer. Anal. 1986, 23(4): 837-852. DOI: 10.1137/0723054
          14. Axelsson O: Error estimates over infinite intervals of some discretizations of evolution equations. BIT 1984, 24: 413-424. DOI: 10.1007/BF01934901
          15. Wanner G: A short proof on nonlinear A-stability. BIT 1976, 16: 226-227. DOI: 10.1007/BF01931374
          16. Frank R, Schneid J, Ueberhuber CW: The concept of B-convergence. SIAM J. Numer. Anal. 1981, 18: 753-780. DOI: 10.1137/0718051
          17. Hundsdorfer W, Verwer JG: Numerical Solution of Time-Dependent Advection-Diffusion-Reaction Equations. Springer, Berlin; 2003.

          Copyright

          © Axelsson et al.; licensee Springer. 2013

          This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.