Derivation of a Kalman-type filter for linear systems with pointwise delay in signal noise

Abstract

Recently, it was demonstrated that a modification of the Kalman-filtering model with a pointwise delay of the signal noise could improve communication with considerably distanced spacecraft. However, a complete and correct derivation of the equations of the Kalman-type filter for this case has not yet been provided. In this paper, we close this gap. The method of derivation is based on a passage from the distributed delay to pointwise by means of a delta function. The derived equations constitute a system of first-order partial differential equations with the initial and boundary conditions.

1 Introduction

Kalman filtering [1, 2] has great applications in engineering. The areas of its application include guidance, navigation, and control of aircraft and spacecraft [3], Global Navigation Satellite Systems (GNSS) including GPS, GLONASS, Galileo, and other systems [4], robotic motions [5], forecasting [6], but are not restricted to these.

Originally, the Kalman filter was derived for a finite-dimensional linear signal-observation system corrupted by independent or correlated white noises. Further developments pushed scientists to create its different modifications. Thus, the extended Kalman filter [3] was developed to cover nonlinear systems, the infinite-dimensional Kalman filter [7] was created for application to systems governed by partial differential equations, and Kalman filters for colored [8] and wide-band [9] noises were established for estimation of the systems corrupted by noises of a nonwhite nature. A series of novel Kalman filters are presented in [10–13].

Recently, it was demonstrated that the classic Kalman-filtering model does not take into account a detail that is present in communication with significantly distanced spacecraft [14, 15]. This detail consists in the presence of a time delay in the signal noise. More specifically, consider a scenario of a spacecraft in a position at a fixed instant t so that the radio signals reach it from the Earth and come back after a time $$\varepsilon >0$$. The radio signals propagate safely in vacuum, but acquire noise at the beginning and at the end of the travel when they pass through the higher layers of the Earth's atmosphere. For this reason, a ground radar detects at t the signal

$$z_{t}=x_{t-\varepsilon /2}+w_{t}$$

about the state $$x_{t-\varepsilon /2}$$ of the spacecraft at $$t-\varepsilon /2$$ corrupted by the noise $$w_{t}$$. Next, assuming that the control action u changes the position x of the spacecraft in accordance with the equation $$x'=Ax+Bu$$ without taking into consideration noise effects and the distance to the spacecraft, the position of the spacecraft at $$t-\varepsilon /2$$ is changed by the control action $$u_{t-\varepsilon }$$ that is sent by the ground radar at $$t-\varepsilon$$. This control propagating through the atmosphere accounts for the noise $$w_{t-\varepsilon }$$. Hence, the equation for the position of the spacecraft must be updated as

$$x'_{t-\varepsilon /2}=Ax_{t-\varepsilon /2}+B(u_{t-\varepsilon }+w_{t- \varepsilon }).$$

The substitution $$\tilde{x}_{t}=x_{t-\varepsilon /2}$$ and $$\tilde{u}_{t}=u_{t-\varepsilon }$$ leads to the signal-observation system

$$\textstyle\begin{cases} \tilde{x}'_{t}=A\tilde{x}_{t}+B\tilde{u}_{t}+Bw_{t-\varepsilon }, \\ z_{t}=\tilde{x}_{t}+w_{t} \end{cases}$$

in which the noise of the signal system is a delay of the observation noise.

Currently, this time delay is ignored in communication with spacecraft. Most probably, this is because the altitude of satellites is negligible in comparison to the speed of the signals. Thus, the most distant satellites are in Geostationary Earth Orbit at an altitude of approximately 36,000 km, for which $$\varepsilon =0.24$$ s, which is not regarded as significant. However, for spacecraft on interplanetary missions, ε is significant and, moreover, time dependent. An example is the spacecraft Voyager 2, launched by NASA in 1977, which entered the Jovian, Saturnian, Uranian, and Neptunian systems before leaving the Solar system. Taking into account that the distances to these systems are approximately 0.8, 1.5, 2.8, and 4.4 billion kilometers, it can be calculated that the delay ε becomes 1.5, 2.8, 5.2, and 8.1 hours, respectively. This is a scenario with increasing delay.

Another scenario, with decreasing delay, takes place in NASA's Mars Exploration Program (MEP), which aimed to prepare for a human landing on Mars. Unlike Voyager 2 type spacecraft, MEP plans to bring back the spacecraft exploring the nearby planet Mars. The shortest distance to Mars is approximately 54,600,000 km. Radio waves cover this distance both ways in approximately 6 min. This time delay is sufficiently large to be taken into account.
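The delay values quoted in these two scenarios follow from dividing the two-way distance by the speed of light. A minimal Python sketch of this arithmetic is given below. The names `round_trip_delay_s` and `C_KM_PER_S` are illustrative choices rather than notation from the paper, and the distances are the approximate figures quoted in the text.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s


def round_trip_delay_s(distance_km):
    """Two-way light-travel time in seconds for a one-way distance in km."""
    return 2.0 * distance_km / C_KM_PER_S


# Approximate one-way distances quoted in the text, in km.
distances_km = {
    "Geostationary orbit": 36_000,
    "Jovian system": 0.8e9,
    "Saturnian system": 1.5e9,
    "Uranian system": 2.8e9,
    "Neptunian system": 4.4e9,
    "Mars (shortest distance)": 54.6e6,
}

for name, d in distances_km.items():
    delay = round_trip_delay_s(d)
    print(f"{name}: {delay:.2f} s = {delay / 60:.2f} min = {delay / 3600:.2f} h")
```

Since the distances are rounded, the computed delays agree with the quoted values only up to rounding in the last digit.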

These two scenarios demonstrate that it is important to handle filtering problems with time-dependent delays in the signal noise. In this paper, we are going to derive equations for a Kalman-type filter for the filtering model in which the signal noise is a pointwise and time-dependent delay of the observation noise. Our method is based on the Kalman-type filter for the same kind of filtering model in which signal noise is a distributed delay of the observation noise [16, 17]. We apply a delta-function passage from the distributed delay to pointwise and deduce the equations of the required filter.

2 Notation

Two major notations, which are employed to make visible the derivation process and the final formulae, are as follows. First, we prefer to write the arguments of functions in subscripts instead of between parentheses. For example, instead of $$x(t)$$, we write $$x_{t}$$. Secondly, instead of the differential form of stochastic differential equations we prefer to write them in the derivative form. For example, the stochastic differential equation

$$dx_{t}=Ax_{t} \,dt+\Phi _{t}\,dW_{t},$$

driven by the Wiener process W, will be written as

$$x'_{t}=Ax_{t}+\Phi _{t}w_{t},$$

where $$w_{t}=W'_{t}$$ is a white noise being a derivative (generalized) of the Wiener process W. The differential form is popular in the mathematical literature, while the derivative form appears in engineering. Both become meaningful in the integral form

$$x_{t}=x_{0}+ \int _{0}^{t}Ax_{s}\,ds+ \int _{0}^{t}\Phi _{s} \,dW_{s}.$$
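The integral form is also what a numerical scheme discretizes. The following Python sketch applies the Euler–Maruyama scheme to the scalar case, replacing $$w_{s}\,ds$$ by Wiener increments with variance equal to the step size. The function name `euler_maruyama` and its parameters are hypothetical, and the scalar setting is an assumption made to keep the example short.

```python
import math
import random


def euler_maruyama(a, phi, x0, T, n_steps, rng):
    """Approximate x_t = x_0 + int_0^t a*x_s ds + int_0^t phi(s) dW_s on [0, T].

    a   : scalar drift coefficient (stands in for the matrix A)
    phi : function s -> scalar noise intensity (stands in for Phi_s)
    rng : random.Random instance used to draw the Wiener increments
    """
    dt = T / n_steps
    x = x0
    path = [x]
    for k in range(n_steps):
        s = k * dt
        dW = rng.gauss(0.0, math.sqrt(dt))  # increment W_{s+dt} - W_s
        x = x + a * x * dt + phi(s) * dW
        path.append(x)
    return path


# One sample path with constant noise intensity.
path = euler_maruyama(a=-1.0, phi=lambda s: 0.5, x0=1.0, T=1.0,
                      n_steps=1000, rng=random.Random(0))
print(len(path), path[-1])
```

With `phi` identically zero the scheme reduces to the forward Euler method for $$x'=ax$$, which gives a simple sanity check.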

By δ we denote Dirac's delta function. It has the property

$$\int _{0}^{T}f_{s}\delta _{s-t} \,ds=f_{t},$$

provided that t belongs to the interval of integration. In some sense, this property is valid for white-noise processes as well. Indeed, for the white noise w, we can evaluate

\begin{aligned} \int _{0}^{T}\Phi _{t}w_{t}\,dt &= \int _{0}^{T} \int _{0}^{T}\Phi _{t}w_{t} \delta _{s-t}\,ds\,dt \\ &= \int _{0}^{T} \int _{0}^{T}\Phi _{t}w_{t} \delta _{s-t}\,dt\,ds= \int _{0}^{T} \biggl( \int _{0}^{T}\Phi _{s}w_{s} \delta _{s-t}\,ds \biggr)\,dt, \end{aligned}

demonstrating that the integrals of $$\Phi _{t}w_{t}$$ and $$\int _{0}^{T}\Phi _{s}w_{s}\delta _{s-t}\,ds$$ are equal. Since a white noise becomes meaningful under the integral, we can identify them by writing

$$\Phi _{t}w_{t}= \int _{0}^{T}\Phi _{s}w_{s} \delta _{s-t}\,ds.$$

${\mathbb{R}}^{n}$ is an n-dimensional Euclidean space. As always, $\mathbb{R}={\mathbb{R}}^{1}$. ${\mathbb{R}}^{m\times n}$ is the space of $$(m\times n)$$ matrices. The identity and zero matrices are denoted by I and 0. $$A^{*}$$ is the transpose of the matrix A. ${L}_{2}(\Omega ,{\mathbb{R}}^{n})$ denotes the space of square integrable random variables on the probability space $$(\Omega ,\mathcal{F},\mathbf{P})$$. Expectation and conditional expectation are denoted by E and $$\mathbf{E}(\cdot |\cdot )$$, respectively. $$\operatorname{cov} (\xi , \eta )$$ is the covariance of the random variables ξ and η. As always, $$\operatorname{cov} \xi = \operatorname{cov} (\xi , \xi )$$. Some other notations will be introduced in the text.

3 Setting of the problem

Motivated by the scenario in the introduction, we consider the following partially observed linear system

$$\textstyle\begin{cases} x'_{t}=Ax_{t}+Fw_{\max (0,\lambda _{t})},\qquad x_{0}=\xi ,\quad t>0, \\ z_{t}=Cx_{t}+w_{t},\qquad z_{0}=0, \quad t>0, \end{cases}$$
(1)

where the signal and observation systems are disturbed by white noises, with the signal noise being a pointwise delay of the observation noise by the time-dependent value $$t-\lambda _{t}\ge 0$$. In this paper, we assume the following conditions:

(A) $A\in {\mathbb{R}}^{n\times n}$, $C\in {\mathbb{R}}^{m\times n}$.

(B) w is an m-dimensional white-noise process with the properties $$w_{0}=0$$, $$\mathbf{E}w_{t}=0$$, and $$\operatorname{cov} (w_{t},w_{s})=I\delta _{t-s}$$; $\xi \in {L}_{2}(\Omega ,{\mathbb{R}}^{n})$ is a Gaussian random variable that is independent of w, with $$\mathbf{E} \xi =0$$ and $$\operatorname{cov} \xi =P_{0}$$.

(C) $F\in {\mathbb{R}}^{n\times m}$ and λ is a continuous strictly increasing function satisfying $$t-\varepsilon <\lambda _{t}\le t$$ for all $$t\ge 0$$, where $$\varepsilon >0$$ is fixed.

The problem of finding the best estimate $$\hat{x}_{t}$$ of the corrupted signal $$x_{t}$$ on the basis of the observations $$z_{s}$$, $$0\le s\le t$$, for the system in (1) will be called a filtering problem (1). Theoretically, the best estimate is the conditional expectation $$\hat{x}_{t}=\mathbf{E}(x_{t}|z_{s}, 0\le s\le t)$$. The equations describing $$\hat{x}_{t}$$ form an optimal filter. Later, we will consider filtering problems for other systems as well. The filtering problem for any system will refer to the reference number of the system.

At first glance, this problem can be successfully solved by a stepwise method. To explain, consider the particular case $$\lambda _{t}=t-\sigma$$. Then, on the intervals $$[0,\sigma ]$$, $$[\sigma ,2\sigma ]$$, etc., the signal and observation systems in (1) are driven by the independent pieces of the white-noise process w and on each of these intervals the equations of the best estimate can be written in accordance with the Kalman filter for independent white noises by arranging the initial value $$\hat{x}_{k\sigma }$$ on the interval $$[k\sigma ,(k+1)\sigma ]$$ as the terminal value on the interval $$[(k-1)\sigma ,k\sigma ]$$. Therefore, there exists a closed system of equations describing the best estimate x̂. This stepwise method can somehow be extended to general λ. However, it does not allow us to obtain the overall picture; it just displays fragments. Also, it does not cover the case $$\lambda _{0}=0$$, which means that the delays are accounted for at nonzero instants.

To display the overall picture, we will use another method. Roughly speaking, this method is based on a relation between pointwise and distributed delays.

4 Preliminaries

Replacing the pointwise delay of the signal noise in (1) by a distributed delay, we obtain the system

$$\textstyle\begin{cases} x'_{t}=Ax_{t}+\varphi _{t},\qquad x_{0}=\xi ,\quad t>0, \\ z_{t}=Cx_{t}+w_{t},\qquad z_{0}=0, \quad t>0, \end{cases}$$
(2)

where

$$\varphi _{t}= \int _{\max (0,t-\varepsilon )}^{t}\Phi _{t,s-t}w_{s} \,ds,\quad t\ge 0.$$
(3)

Here, in addition to (A) and (B), we will assume that

$$\mathrm{(C')}$$:

Φ is a deterministic function of $$(t,\theta )\in [0,\infty )\times [-\varepsilon ,0]$$ with values in ${\mathbb{R}}^{n\times m}$ and belongs to $C(0,\infty ;{L}_{2}(-\varepsilon ,0;{\mathbb{R}}^{n\times m}))$ (the space of continuous functions on $$[0,\infty )$$ with values in the space of square integrable ${\mathbb{R}}^{n\times m}$-valued functions on $$[-\varepsilon ,0]$$), where $$\varepsilon >0$$.
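To make the distributed delay in (3) concrete, the integral can be approximated on a uniform time grid by a weighted sum of Wiener increments over the last ε units of time. The Python sketch below does this in the scalar case. The function name `wide_band_noise`, the argument `relax` (standing in for Φ), and the left-endpoint quadrature are assumptions made for illustration only.

```python
import math
import random


def wide_band_noise(relax, dW, dt, eps):
    """Approximate phi_t = int_{max(0,t-eps)}^t relax(t, s-t) w_s ds on a grid.

    relax : function (t, theta) -> scalar, theta in [-eps, 0]
    dW    : list of Wiener increments; dW[k] covers [k*dt, (k+1)*dt]
            and stands in for w_s ds on that cell
    Returns the list of values phi at the grid times 0, dt, 2*dt, ...
    """
    n_lag = int(round(eps / dt))      # number of grid cells inside the delay window
    phi = []
    for i in range(len(dW) + 1):
        t = i * dt
        lo = max(0, i - n_lag)        # grid index of max(0, t - eps)
        total = 0.0
        for k in range(lo, i):
            s = k * dt
            total += relax(t, s - t) * dW[k]
        phi.append(total)
    return phi


# Sample path with an exponentially decaying relaxing function.
rng = random.Random(0)
dt, eps = 0.01, 0.1
dW = [rng.gauss(0.0, math.sqrt(dt)) for _ in range(200)]
phi = wide_band_noise(lambda t, theta: math.exp(theta), dW, dt, eps)
print(len(phi), phi[-1])
```

Each value of φ depends only on the increments from the preceding window of length ε, which is what makes φ correlated over time lags shorter than ε.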

The noise process φ defined by (3) is called a wide-band or bandwidth noise. The function Φ in (3) is called a relaxing (damping) function of φ. To the best of our knowledge, the first report about wide-band noises appears in [18]. Later, this kind of noise was investigated by use of approximations [19–21] and by integral representation [22, 23]. In [24] it was shown that wide-band noises can be represented as a distributed delay of white noises and [25, 26] demonstrated that the systems disturbed by wide-band noises can be reduced to abstract systems disturbed by white noises. Thus, the general guidelines of working with wide-band noises were established. On this basis, different results, including filtering results as well, for wide-band noise-driven systems were obtained [16, 17, 27]. In particular, the following filtering result for the system (2) is proved in [16].

Theorem 1

Under the conditions $$\mathrm{(A)}$$, $$\mathrm{(B)}$$, and $$\mathrm{(C')}$$, the best estimate x̂ in the filtering problem (2) is uniquely determined as a solution of the equation

$$\hat{x}'_{t}=A\hat{x}_{t}+\psi _{t,0}+P_{t}C^{*}(z_{t}-C \hat{x}_{t}),\qquad \hat{x}_{0}=0,\quad t>0,$$
(4)

where ψ is a unique solution of

$$\textstyle\begin{cases} (\frac{\partial }{\partial t}+\frac{\partial }{\partial \theta } ) \psi _{t,\theta }=(Q_{t,\theta }C^{*}+\Phi _{t-\theta ,\theta })(z_{t}-C \hat{x}_{t}), \\ \psi _{0,\theta }=\psi _{t,-\varepsilon }=0,\quad -\varepsilon \le \theta \le 0, t>0, \end{cases}$$
(5)

P is a unique solution of the Riccati equation

$$P'_{t} =AP_{t}+P_{t}A^{*}+Q_{t,0}+Q^{*}_{t,0}-P_{t}C^{*}CP_{t}, \qquad P_{0}= \operatorname{cov} \xi , \quad t>0,$$
(6)

Q and R are unique solutions of

$$\textstyle\begin{cases} ( \frac{\partial }{\partial t}+\frac{\partial }{\partial \theta} )Q_{t,\theta }=Q_{t,\theta }A^{*}+R_{t,\theta ,0}-(Q_{t,\theta }C^{*}+ \Phi _{t-\theta ,\theta })CP_{t}, \\ Q_{0,\theta }=Q_{t,-\varepsilon }=0, \quad -\varepsilon \le \theta \le 0, t>0, \end{cases}$$
(7)

and

$$\textstyle\begin{cases} ( \frac{\partial }{\partial t}+\frac{\partial }{\partial \theta}+ \frac{\partial }{\partial \tau } ) R_{t,\theta ,\tau }=\Phi _{t- \theta ,\theta }\Phi ^{*}_{t-\tau ,\tau } \\ \hphantom{( \frac{\partial }{\partial t}+\frac{\partial }{\partial \theta}+ \frac{\partial }{\partial \tau } ) R_{t,\theta ,\tau }={}}{} -(Q_{t,\theta }C^{*}+ \Phi _{t-\theta ,\theta })(CQ^{*}_{t,\tau }+\Phi ^{*}_{t-\tau ,\tau }), \\ R_{0,\theta ,\tau }=R_{t,-\varepsilon ,\tau }=0,\quad -\varepsilon \le \theta \le \tau \le 0, t>0. \end{cases}$$
(8)

Moreover, the mean square error of estimation in the filtering problem (2) is equal to

$$e_{t}=\mathbf{E} \Vert \hat{x}_{t}-x_{t} \Vert ^{2}=\operatorname{tr} P_{t}.$$

The classic Kalman filter consists of two equations, while the number of equations in Theorem 1 is equal to five. Before going on, the origin of the additional equations should be clarified. The issue is that the process φ in (3) can be represented as $$\varphi _{t}=\phi _{t,0}$$, where ϕ is a solution of the stochastic partial differential equation

$$\biggl( \frac{\partial }{\partial t}+ \frac{\partial }{\partial \theta } \biggr) \phi _{t,\theta }=\Phi _{t- \theta ,\theta }w_{t},\qquad \phi _{0,\theta }=\phi _{t,-\varepsilon }=0.$$

Combining this with the equation of the signal process in (2), we obtain a linear system for the enlarged signal process

$$\tilde{x}_{t}= \begin{bmatrix} x_{t} \\ \phi _{t,\cdot } \end{bmatrix},$$

where the second component is an infinite-dimensional process of $$t\in [0,\infty )$$ with values in ${L}_{2}(-\varepsilon ,0;{\mathbb{R}}^{n})$. The infinite-dimensional Kalman filtering result [7] can be written for the pair $$(\tilde{x}_{t},z_{t})$$, where z is the observation process from (2), which produces equations (4) for x̂ and (5) for $$\psi =\hat{\phi }$$. Respectively, the infinite-dimensional Riccati equation splits into four equations, two of which are adjoint to each other. This implies three informative equations. One of them is (6). The values of the solutions of the other two equations are integral operators on ${L}_{2}(-\varepsilon ,0;{\mathbb{R}}^{n})$. In fact, (7) and (8) describe the behavior of their kernels.

Note that (5), (7), and (8) are partial differential equations. The first of them is a stochastic linear equation. Its solution is understood in the mild sense, that is, it is

$$\psi _{t,\theta }= \int _{\max (0,t-\theta -\varepsilon )}^{t} \bigl(Q_{s,s-t+ \theta }C^{*}+ \Phi _{t-\theta ,s-t+\theta }\bigr) (z_{s}-C\hat{x}_{s})\,ds.$$
(9)

The other two equations are the components of an infinite-dimensional Riccati equation, the solution of which is understood in the scalar product sense. Applying the scalar product sense to these components produces the following integral equations for the kernels Q and R of them:

\begin{aligned} Q_{t,\theta } ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }A^{*} \\ &{} +R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s}-\Phi _{t- \theta ,s-t+\theta }CP_{s}\bigr)\,ds \end{aligned}
(10)

and

\begin{aligned} R_{t,\theta ,\tau } ={}&- \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }C^{*}CQ^{*}_{s,s-t+\tau } \\ & {} +Q_{s,s-t+\theta }C^{*}\Phi ^{*}_{t-\tau ,s-t+\tau }+\Phi _{t- \theta ,s-t+\theta }CQ^{*}_{s,s-t+\tau }\bigr)\,ds. \end{aligned}
(11)

Therefore, (5), (7), and (8) should be understood as (9), (10), and (11), respectively.

The domain of ψ and Q is the infinite band

$$D=\bigl\{ (t,\theta ): -\varepsilon \le \theta \le 0, t\ge 0\bigr\} ,$$

depicted in Fig. 1. On its two boundary lines $$t=0$$ and $$\theta =-\varepsilon$$, ψ and Q satisfy zero boundary conditions that, together with equations (5) and (7), determine their values in the interior of D and, more importantly, on its third boundary $$\theta =0$$, where they are used in (4) and (6), respectively.

The domain of R is the triangular cylinder

$$E=\bigl\{ (t,\theta ,\tau ): -\varepsilon \le \theta \le \tau \le 0, t \ge 0 \bigr\} .$$

Its bottom view is depicted in Fig. 2. On its two boundary faces $$t=0$$ and $$\theta =-\varepsilon$$, R satisfies a zero boundary condition that, together with equation (8), determines the values of R in the interior of E and, more importantly, on its third boundary face $$\tau =0$$, where they are used in (7). The values of R can be extended to the case $$-\varepsilon \le \tau \le \theta \le 0$$ by use of the symmetry $$R_{t,\theta ,\tau }=R^{*}_{t,\tau , \theta }$$.

It could be verified that if $$\Phi _{t,\theta }$$ is continuously differentiable in both its variables, then the Leibniz rule of differentiation implies that (9), (10), and (11) satisfy (5), (7), and (8), respectively, in the ordinary sense. However, for general Φ, the sense of these equations should be changed accordingly.

5 Motivation

Writing $$\Phi _{t,\theta }=F\delta _{\theta }$$ in (3) produces $$\varphi _{t}=Fw_{t}$$. Consequently, (2) reduces to the Kalman-filtering model

$$\textstyle\begin{cases} x_{t}'=Ax_{t}+Fw_{t}, \qquad x_{0}=\xi ,& t>0, \\ z_{t}=Cx_{t}+w_{t}, \qquad z_{0}=0,& t>0, \end{cases}$$
(12)

where the signal and observations are disturbed by the correlated white noises. According to the Kalman-filtering result, the best estimate x̂ in the filtering problem (12) is a unique solution of the equation

$$\hat{x}_{t}'=A\hat{x}_{t}+ \bigl(P_{t}C^{*}+ F\bigr) (z_{t}-C \hat{x}_{t}), \qquad \hat{x}_{0}=0,\quad t>0,$$
(13)

where P is a unique solution of the matrix Riccati equation

$$P'_{t} =AP_{t}+P_{t}A^{*}+FF^{*}- \bigl(P_{t}C^{*}+F\bigr) \bigl(CP_{t}+F^{*} \bigr), \qquad P_{0}= \operatorname{cov} \xi ,\quad t>0.$$
(14)
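As a numerical illustration of (14), the following Python sketch integrates the Riccati equation in the scalar case by the forward Euler method. The function names `riccati_rhs` and `integrate_riccati`, the scalar parameters a, c, f (standing in for A, C, F), and the choice of the Euler scheme are all assumptions made for this example.

```python
def riccati_rhs(P, a, c, f):
    """Right-hand side of (14) in the scalar case A = a, C = c, F = f."""
    gain = P * c + f                       # scalar version of P C* + F
    return 2.0 * a * P + f * f - gain * gain


def integrate_riccati(P0, a, c, f, T, dt=1e-3):
    """Forward-Euler integration of P' = riccati_rhs(P), P(0) = P0, up to time T."""
    P = P0
    for _ in range(int(round(T / dt))):
        P += dt * riccati_rhs(P, a, c, f)
    return P


# Example: a = 1, c = 1, f = 0.5, starting from P0 = 0.2.
print(integrate_riccati(P0=0.2, a=1.0, c=1.0, f=0.5, T=20.0))
```

For the parameters in the example, the right-hand side simplifies to $$P(1-P)$$, so the computed trajectory approaches the stationary value 1.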

In this section, we demonstrate that Theorem 1 accepts the case $$\Phi _{t,\theta }=F\delta _{\theta }$$ as well, that is, (4)–(8) reduce to (13)–(14) in this case.

First, we demonstrate that $$\Phi _{t,\theta }=F\delta _{\theta }$$ yields the following explicit form of the solution of (7) (or (10)):

$$Q_{t,\theta }= \textstyle\begin{cases} -FCP_{t} & \text{if } \theta =0 \text{ and } t>0, \\ 0 & \text{if } -\varepsilon \le \theta < 0 \text{ or } t=0. \end{cases}$$
(15)

Indeed, letting $$t=0$$ in (10) immediately implies $$Q_{0,\theta }=0$$. Assume $$-\varepsilon \le \theta <0$$. To show that $$Q_{t,\theta }=0$$ in this case, we let $$\Phi _{t,\theta }=F\delta _{\theta }$$ in (10) and obtain

\begin{aligned} Q_{t,\theta } ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }A^{*} \\ &{} +R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s}-FCP_{s} \delta _{s-t+\theta }\bigr)\,ds. \end{aligned}
(16)

Since $$s=t-\theta$$ belongs to the interval $$(\max (0,t-\theta -\varepsilon ),t]$$ of integration just for $$\theta =0$$, under the assumption $$-\varepsilon \le \theta <0$$ the integral of the last term in (16) vanishes. Therefore,

$$Q_{t,\theta } = \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }A^{*}+R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s} \bigr)\,ds.$$
(17)

Next, we deduce $$R_{s,s-t+\theta ,0}$$ from (8) or (11). For this, we let $$\Phi _{t,\theta }=F\delta _{\theta }$$ in (11) and obtain

\begin{aligned} R_{t,\theta ,\tau } ={}&- \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }C^{*}CQ^{*}_{s,s-t+\tau } \\ & {}+Q_{s,s-t+\theta }C^{*}F^{*}\delta _{s-t+\tau }+FCQ^{*}_{s,s-t+ \tau } \delta _{s-t+\theta }\bigr)\,ds, \end{aligned}

which implies

\begin{aligned} R_{t,\theta ,0} ={}&- \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }C^{*}CQ^{*}_{s,s-t} \\ &{} +Q_{s,s-t+\theta }C^{*}F^{*}\delta _{s-t}+FCQ^{*}_{s,s-t} \delta _{s-t+\theta }\bigr)\,ds. \end{aligned}
(18)

For the same reason as above, the integral of the last term vanishes if $$-\varepsilon \le \theta <0$$. Therefore, we obtain

$$R_{t,\theta ,0}=-Q_{t,\theta }C^{*}F^{*}- \int _{\max (0,t-\theta - \varepsilon )}^{t}Q_{s,s-t+\theta }C^{*}CQ^{*}_{s,s-t} \,ds,$$

implying

$$R_{s,s-t+\theta ,0}=-Q_{s,s-t+\theta }C^{*}F^{*}- \int _{\max (0,t- \theta -\varepsilon )}^{s}Q_{r,r-t+\theta }C^{*}CQ^{*}_{r,r-s} \,dr.$$

Using this in (17), we obtain

\begin{aligned} Q_{t,\theta } ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}Q_{s,s-t+ \theta }\bigl(A^{*}-C^{*}CP_{s}-C^{*}F^{*} \bigr)\,ds \\ &{} - \int _{\max (0,t-\theta -\varepsilon )}^{t} \int _{\max (0,t- \theta -\varepsilon )}^{s}Q_{r,r-t+\theta }C^{*}CQ^{*}_{r,r-s} \,dr\,ds \\ ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}Q_{s,s-t+\theta } \biggl( A^{*} - C^{*}CP_{s} - C^{*}F^{*} - C^{*}C \int _{s}^{t}Q^{*}_{s,s-r}\,dr \biggr)\,ds. \end{aligned}

This demonstrates that $$Q_{t,\theta }=0$$ if $$-\varepsilon \le \theta <0$$. Using this in (18), we obtain

\begin{aligned} R_{t,\theta ,0} & =- \int _{\max (0,t-\theta -\varepsilon )}^{t}FCQ^{*}_{s,s-t} \delta _{s-t+\theta }\,ds \\ & = \textstyle\begin{cases} -FCQ^{*}_{t,0} & \text{if } \theta =0 \text{ and } t>0, \\ 0 & \text{if } -\varepsilon \le \theta < 0 \text{ or } t=0. \end{cases}\displaystyle \end{aligned}

Therefore, from (16),

\begin{aligned} Q_{t,\theta }=- \int _{\max (0,t-\theta -\varepsilon )}^{t}FCP_{s} \delta _{s-t+\theta }\,ds, \end{aligned}

which implies (15). This reduces (6) to the Riccati equation in (14).
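The last reduction can be spelled out as a short worked step, using only (15), the symmetry $$P_{t}=P^{*}_{t}$$ of the covariance, and (6). By (15), $$Q_{t,0}=-FCP_{t}$$ for $$t>0$$, so that

$$Q_{t,0}+Q^{*}_{t,0}=-FCP_{t}-P_{t}C^{*}F^{*}.$$

On the other hand, expanding the last two terms of (14) gives

$$FF^{*}-\bigl(P_{t}C^{*}+F\bigr) \bigl(CP_{t}+F^{*}\bigr)=-P_{t}C^{*}CP_{t}-P_{t}C^{*}F^{*}-FCP_{t}.$$

Hence, the right-hand side $$AP_{t}+P_{t}A^{*}+Q_{t,0}+Q^{*}_{t,0}-P_{t}C^{*}CP_{t}$$ of (6) coincides with the right-hand side of (14).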

Next, letting $$\Phi _{t,\theta }=F\delta _{\theta }$$ and using (15), (9) can be evaluated as

\begin{aligned} \psi _{t,\theta } & = \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }C^{*}+F \delta _{s-t+\theta }\bigr) (z_{s}-C\hat{x}_{s})\,ds \\ & = \textstyle\begin{cases} F(z_{t}-C\hat{x}_{t}) & \text{if } \theta =0 \text{ and } t>0, \\ 0 & \text{if } -\varepsilon \le \theta < 0 \text{ or } t=0. \end{cases}\displaystyle \end{aligned}

This reduces (4) to (13).

Thus, the filtering problem (2) and its equations (4)–(8) reduce to the Kalman-filtering problem (12) and its equations (13) and (14). In fact, the following happens with ψ and Q from (5) and (7). Their domain D from Fig. 1 shrinks to the half-line on which $$\theta =0$$ and $$t>0$$. Accordingly, equations (5) and (7) force the values of ψ and Q on this half-line to be $$\psi _{t,0}= F(z_{t}-C\hat{x}_{t})$$ and $$Q_{t,0}=-FCP_{t}$$, substitution of which in (4) and (6) produces the Kalman filter equations (13) and (14).

Summarizing, we see that although Theorem 1 does not cover functions Φ that are composed by use of the delta function, its statement remains valid for them as well. This can be stated as follows.

Theorem 2

Let the conditions $$\mathrm{(A)}$$ and $$\mathrm{(B)}$$ hold. Then, equations (4)–(8) of the best estimate in the filtering problem (2) are valid for $$\Phi _{t,\theta }=F\delta _{\theta }$$, $$-\varepsilon \le \theta \le 0$$. In this case, (4)–(8) reduce to the equations (13) and (14) of the classic Kalman filter.

This theorem motivates a consideration of relaxing functions Φ that are composed by use of the delta function so that they lead to pointwise delays of white noises. In other words, we will not squeeze the region D down to the half-line, but stop at some intermediate position. This is the general idea of our derivation of the equations of the best estimate in the filtering problem (1).

6 Main result

The following theorem is the main result of this paper.

Theorem 3

Assume that the conditions $$\mathrm{(A)}$$, $$\mathrm{(B)}$$, and $$\mathrm{(C)}$$ hold. Then, the best estimate x̂ in the filtering problem (1) is uniquely determined as a solution of the equation

$$\hat{x}'_{t}=A\hat{x}_{t}+\psi _{t,0}+P_{t}C^{*}(z_{t}-C \hat{x}_{t}), \qquad \hat{x}_{0}=0,\quad t>0,$$
(19)

where ψ is a unique solution of

$$\textstyle\begin{cases} (\frac{\partial }{\partial t}+\frac{\partial }{\partial \theta } ) \psi _{t,\theta }=Q_{t,\theta }C^{*}(z_{t}-C\hat{x}_{t}), \\ \psi _{t,\theta }=0, \quad t-\lambda ^{-1}_{0}\le \theta \le 0, 0\le t \le \lambda ^{-1}_{0}, \\ \psi _{t,t-\lambda ^{-1}_{t}}=F(z_{t}-C\hat{x}_{t}), \quad t>0, \end{cases}$$
(20)

P is a unique solution of the Riccati equation

$$P'_{t}=AP_{t}+P_{t}A^{*}+Q_{t,0}+Q^{*}_{t,0}-P_{t}C^{*}CP_{t},\qquad P_{0}= \operatorname{cov} \xi ,\quad t>0,$$
(21)

Q and R are unique solutions of

$$\textstyle\begin{cases} ( \frac{\partial }{\partial t}+\frac{\partial }{\partial \theta} )Q_{t,\theta }=Q_{t,\theta }A^{*}+R_{t,\theta ,0}-Q_{t,\theta }C^{*}CP_{t}, \\ Q_{t,\theta }=0, \quad t-\lambda ^{-1}_{0}\le \theta \le 0, 0\le t\le \lambda ^{-1}_{0}, \\ Q_{t,t-\lambda ^{-1}_{t}}=-FCP_{t},\quad t>0, \end{cases}$$
(22)

and

$$\textstyle\begin{cases} ( \frac{\partial }{\partial t}+\frac{\partial }{\partial \theta}+ \frac{\partial }{\partial \tau } ) R_{t,\theta ,\tau }=-Q_{t, \theta }C^{*}CQ^{*}_{t,\tau }, \\ R_{t,\theta ,\tau }=0,\quad t-\lambda ^{-1}_{0}\le \theta \le \tau \le 0, 0\le t\le \lambda ^{-1}_{0}, \\ R_{t,t-\lambda ^{-1}_{t},\tau }=-FCQ^{*}_{t,\tau }, \quad t-\lambda ^{-1}_{t} \le \tau < \min (0,t-\lambda ^{-1}_{0}), t>0. \end{cases}$$
(23)

Moreover, the mean square error of estimation in the filtering problem (1) equals

$$e_{t}=\mathbf{E} \Vert \hat{x}_{t}-x_{t} \Vert ^{2}=\operatorname{tr} P_{t}.$$

Before proving the main result the meaning of the equations in Theorem 3 should be clarified. Two of them, namely, (19) and (21) are ordinary stochastic and deterministic, respectively, differential equations. The solutions of them are understood in the ordinary sense. However, (20), (22), and (23) are partial differential equations and to them we apply solution concepts similar to the partial differential equations from Theorem 1.

The boundary conditions in (20) and (22) are similar. They set ψ and Q to zero on the triangle

$$G'=\bigl\{ (t,\theta ):t-\lambda ^{-1}_{0}\le \theta \le 0, 0\le t\le \lambda ^{-1}_{0}\bigr\}$$

and state the boundary value of them along the curve

$$C=\bigl\{ (t,\theta ): \theta =t-\lambda ^{-1}_{t}, t>0\bigr\} .$$

Equations (20) and (22) describe the behavior of ψ and Q, respectively, in the region G between the curve C, the line segment

$$L=\bigl\{ (t,\theta ): \theta =t-\lambda ^{-1}_{0}, 0\le t \le \lambda ^{-1}_{0} \bigr\} ,$$

and the interval $$(\lambda ^{-1}_{0},\infty )$$ on the t-axis (see Fig. 3). For technical reasons, besides the region $$G'$$, we add to G the region

$$G''=\bigl\{ (t,\theta ):-\varepsilon \le \theta < t- \lambda ^{-1}_{t}, t \ge 0\bigr\}$$

as well, noting that ε comes from $$\mathrm{(C)}$$. Thus, G, $$G'$$ and $$G''$$ from Fig. 3 form a partition of D from Fig. 1. We let $$\psi _{t,\theta }=0$$ and $$Q_{t,\theta }=0$$ for $$(t,\theta )\in G''$$. Now, the solution of (20) is understood in the mild sense. More precisely, it is zero on $$G'\cup G''$$ and equals

$$\psi _{t,\theta }=F(z_{\lambda _{t-\theta }}-C\hat{x}_{\lambda _{t- \theta }})+ \int _{\lambda _{t-\theta }}^{t}Q_{s,s-t+\theta }C^{*}(z_{s}-C \hat{x}_{s})\,ds$$
(24)

on G. Furthermore, the solution of (22) is zero on $$G'\cup G''$$ and is understood as a solution of the integral equation

$$Q_{t,\theta }=-FCP_{\lambda _{t-\theta }}+ \int _{\lambda _{t-\theta }}^{t}\bigl(Q_{s,s-t+ \theta }A^{*}+R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s} \bigr)\,ds$$
(25)

on G.
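The representations (24) and (25) can be checked against the boundary values along the curve C in a short worked step, assuming only that λ is invertible, as guaranteed by $$\mathrm{(C)}$$. For $$\theta =t-\lambda ^{-1}_{t}$$ we have $$t-\theta =\lambda ^{-1}_{t}$$ and hence $$\lambda _{t-\theta }=t$$, so the integrals in (24) and (25) run from t to t and vanish:

$$\psi _{t,t-\lambda ^{-1}_{t}}=F(z_{t}-C\hat{x}_{t}),\qquad Q_{t,t-\lambda ^{-1}_{t}}=-FCP_{t}.$$

For instance, in the constant-delay case $$\lambda _{t}=t-\sigma$$ with $$0<\sigma <\varepsilon$$, the curve C is the horizontal line $$\theta =-\sigma$$ and the lower limit of integration in (24) and (25) is $$\lambda _{t-\theta }=t-\theta -\sigma$$.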

Note that ψ and Q from (20) and (22) vanish on $$G'$$ because on the interval $$[0,\lambda ^{-1}_{0}]$$ the signal system in (1) is noise free. Therefore, letting $$\psi _{t,0}=0$$ and $$Q_{t,0}=0$$ for $$0\le t\le \lambda ^{-1}_{0}$$ in (19) and (21) produces the respective equations of a Kalman filter. For this reason, it could be sufficient to restrict the boundary condition in (20) and (22) from the triangle $$G'$$ to its upper and lower edges.
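The partition of D into G, $$G'$$, and $$G''$$ can be made tangible with a small Python sketch that classifies a point $$(t,\theta )$$ from the defining inequalities. The function name `classify_point`, the argument `lam_inv` (standing in for the inverse function of λ), and the way boundary points are assigned are assumptions made for illustration.

```python
def classify_point(t, theta, lam_inv, eps):
    """Return "G'", "G''" or "G" for a point (t, theta) of the band D.

    lam_inv : function t -> inverse of lambda evaluated at t
    eps     : the bound epsilon from condition (C)
    """
    if t < 0 or not (-eps <= theta <= 0):
        raise ValueError("point lies outside the band D")
    t0 = lam_inv(0.0)
    # G': triangle t - lam_inv(0) <= theta <= 0, 0 <= t <= lam_inv(0)
    if t <= t0 and theta >= t - t0:
        return "G'"
    # G'': strip strictly below the curve theta = t - lam_inv(t)
    if theta < t - lam_inv(t):
        return "G''"
    # Everything else in D is taken to be G (the curve C is included here).
    return "G"


# Constant delay lambda_t = t - sigma, so lam_inv(t) = t + sigma.
sigma, eps = 0.5, 1.0
lam_inv = lambda t: t + sigma
for point in [(0.2, -0.1), (0.2, -0.4), (2.0, -0.8), (2.0, -0.2)]:
    print(point, classify_point(*point, lam_inv, eps))
```

In this constant-delay example, $$G''$$ is the strip $$-\varepsilon \le \theta <-\sigma$$, which makes the role of the curve C as the lower boundary of G easy to see.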

For discussion of (23), we move the above considerations to three dimensions. The boundary conditions in (23) define R to be zero on the tetrahedron

$$H'=\bigl\{ (t,\theta ,\tau ):t-\lambda ^{-1}_{0} \le \theta \le \tau \le 0, 0\le t\le \lambda ^{-1}_{0}\bigr\}$$

for the same reason that ψ and Q are zero on $$G'$$. Again, it could be sufficient to restrict the boundary condition in (23) from the tetrahedron $$H'$$ to its lower and front triangular edges. Additionally, they state its value on the surface

$$S=\bigl\{ (t,\theta ,\tau ): t-\lambda ^{-1}_{t}=\theta \le \tau \le 0 , t>0 \bigr\} .$$

Equation (23) describes the behavior of R in the region H between the surface S, the triangle

$$T=\bigl\{ (t,\theta ,\tau ):t-\lambda ^{-1}_{0}=\theta \le \tau \le 0, 0 \le t\le \lambda ^{-1}_{0}\bigr\}$$

and the planes $$\theta =\tau$$ and $$\tau =0$$ (see Fig. 4). For technical reasons, besides the region $$H'$$ we add to H the region

$$H''=\bigl\{ (t,\theta ,\tau ):-\varepsilon \le \theta < t-\lambda ^{-1}_{t}, \theta \le \tau \le 0, t\ge 0\bigr\}$$

as well (see Fig. 4). Thus, H, $$H'$$, and $$H''$$ from Fig. 4 form a partition of E from Fig. 2. We let $$R_{t,\theta ,\tau }=0$$ for $$(t,\theta ,\tau )\in H''$$. Now, the solution of (23) is zero on $$H'\cup H''$$ and understood as a solution of the integral equation

$$R_{t,\theta ,\tau }=-FCQ^{*}_{\lambda _{t-\theta },\lambda _{t- \theta }-t+\tau } - \int _{\lambda _{t-\theta}}^{t}Q_{s,s-t+\theta }C^{*}CQ^{*}_{s,s-t+ \tau } \,ds$$
(26)

on H.

In general, the function λ, being continuous and strictly increasing, is almost everywhere differentiable. Hence, by the Leibniz rule of differentiation, the function R from (26) satisfies (23) for all $$t>0$$ and for almost every $$\theta \in [-\varepsilon ,0]$$ (as well as for almost every $$\tau \in [-\varepsilon ,0]$$). The same is valid for ψ and Q from (24) and (25).

7 Proof of the main result

It suffices to derive (20), (22), and (23) from (5), (7), and (8) by letting $$\Phi _{t,\theta }=F\delta _{\theta +t-\lambda _{t}}$$ because (4) and (19) as well as (6) and (21) are the same, while having the entries ψ, Q, and R coming from different equations. We start from the derivation of (22) from (7).
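The reason for this choice of the relaxing function can be seen from a short worked step, which relies on the identification of white noise with its delta-function integral from Section 2. Substituting $$\Phi _{t,s-t}=F\delta _{s-\lambda _{t}}$$ into (3) gives

$$\varphi _{t}= \int _{\max (0,t-\varepsilon )}^{t}F\delta _{s-\lambda _{t}}w_{s}\,ds= \textstyle\begin{cases} Fw_{\lambda _{t}} & \text{if } \lambda _{t}>0, \\ 0 & \text{if } \lambda _{t}\le 0, \end{cases}$$

because $$t-\varepsilon <\lambda _{t}\le t$$ by $$\mathrm{(C)}$$, so that the point $$s=\lambda _{t}$$ falls into the interval of integration exactly when it is positive. Since $$w_{0}=0$$ by $$\mathrm{(B)}$$, both cases can be written as $$\varphi _{t}=Fw_{\max (0,\lambda _{t})}$$, and the system (2) turns into (1).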

We look to the integral form (10) of (7) and substitute $$\Phi _{t,\theta }=F\delta _{\theta +t-\lambda _{t}}$$. Then, $$\Phi _{t-\theta ,s-t+\theta }=F\delta _{s-\lambda _{t-\theta }}$$ and (10) yields

\begin{aligned} Q_{t,\theta } ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }A^{*} \\ & {}+R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s}-FCP_{s} \delta _{s-\lambda _{t-\theta }}\bigr)\,ds. \end{aligned}
(27)

Taking into account that the inequality $$t-\theta -\varepsilon <\lambda _{t-\theta }$$ holds by $$\mathrm{(C)}$$, we conclude that the integral of the term with the delta function is zero if either $$t<\lambda _{t-\theta }$$ or $$t-\theta -\varepsilon <\lambda _{t-\theta }\le 0$$. Solving these inequalities for θ and taking into account the definitions of the sets G, $$G'$$, and $$G''$$, we conclude that (27) can be written as

\begin{aligned} Q_{t,\theta } ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }A^{*}+R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s} \bigr)\,ds \\ & {}- \textstyle\begin{cases} FCP_{\lambda _{t-\theta }} & \text{if } (t,\theta )\in G, \\ 0 & \text{if } (t,\theta )\in G'\cup G''. \end{cases}\displaystyle \end{aligned}
(28)
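
The passage from (27) to (28) rests only on the sifting property of the delta function. As a sketch, write $$a=\max (0,t-\theta -\varepsilon )$$ and $$c=\lambda _{t-\theta }$$ (shorthand used only here), and assume the convention, suggested by the half-open intervals of the form $$(a,t]$$ used in this section, that a point mass at the upper limit is counted and one at the lower limit is not. Then

$$\int _{a}^{t}FCP_{s}\delta _{s-c}\,ds= \textstyle\begin{cases} FCP_{c} & \text{if } a< c\le t, \\ 0 & \text{if } c\le a \text{ or } c>t. \end{cases}\displaystyle$$

Since $$t-\theta -\varepsilon < c$$ by $$\mathrm{(C)}$$, the condition $$c\le a$$ can only occur as $$c\le 0$$, which gives the two vanishing cases listed above.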

We claim that $$Q_{t,\theta }=0$$ for $$(t,\theta )\in G'\cup G''$$.

First, let $$(t,\theta )\in G''$$. In this case, $$\max (0,t-\theta -\varepsilon )< s\le t$$ implies $$-\varepsilon < s-t+\theta \le \theta < t-\lambda ^{-1}_{t}$$ and, therefore,

$$(t,\theta )\in G'' \quad \text{and}\quad \max (0,t-\theta -\varepsilon )< s\le t \quad \Rightarrow \quad (s,s-t+\theta )\in G''.$$
(29)

This will be used below. Letting $$\Phi _{t,\theta }=F\delta _{\theta +t-\lambda _{t}}$$ in (10), we obtain

$$Q_{t,\theta }= \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }A^{*}+R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s}-FCP_{s} \delta _{s-\lambda _{t-\theta }}\bigr)\,ds.$$

For $$(t,\theta )\in G''$$, we have $$t<\lambda _{t-\theta }$$. Therefore, the integral of the term with the delta function vanishes and we obtain

$$Q_{t,\theta }= \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }A^{*}+R_{s,s-t+\theta ,0}-Q_{s,s-t+\theta }C^{*}CP_{s} \bigr)\,ds.$$
(30)

To substitute R in (30) in terms of Q, we let $$\Phi _{t,\theta }=F\delta _{\theta +t-\lambda _{t}}$$ in (11) and obtain

\begin{aligned} R_{t,\theta ,\tau } ={}&- \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }C^{*}CQ^{*}_{s,s-t+\tau } \\ &{} +Q_{s,s-t+\theta }C^{*}F^{*}\delta _{s-\lambda _{t-\tau }}+FCQ^{*}_{s,s-t+ \tau } \delta _{s-\lambda _{t-\theta }}\bigr)\,ds. \end{aligned}
(31)

The case $$(t,\theta )\in G''$$ assumes $$t<\lambda _{t-\theta }$$. Therefore,

\begin{aligned} R_{t,\theta ,0} ={}&- \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }C^{*}CQ^{*}_{s,s-t} \\ &{} +Q_{s,s-t+\theta }C^{*}F^{*}\delta _{s-\lambda _{t}}+FCQ^{*}_{s,s-t} \delta _{s-\lambda _{t-\theta }}\bigr)\,ds \\ ={}&- \int _{\max (0,t-\theta -\varepsilon )}^{t}Q_{s,s-t+\theta }\bigl(C^{*}CQ^{*}_{s,s-t}+C^{*}F^{*} \delta _{s-\lambda _{t}}\bigr)\,ds. \end{aligned}
(32)

Then,

$$R_{s,s-t+\theta ,0}=- \int _{\max (0,t-\theta -\varepsilon )}^{s}Q_{r,r-t+ \theta }\bigl(C^{*}CQ^{*}_{r,r-s}+C^{*}F^{*} \delta _{r-\lambda _{s}}\bigr)\,dr.$$

Using this in (30), we obtain

\begin{aligned} Q_{t,\theta } ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}Q_{s,s-t+ \theta }\bigl(A^{*}-C^{*}CP_{s} \bigr)\,ds \\ &{} - \int _{\max (0,t-\theta -\varepsilon )}^{t} \int _{\max (0,t- \theta -\varepsilon )}^{s}Q_{r,r-t+\theta }\bigl(C^{*}CQ^{*}_{r,r-s}+C^{*}F^{*} \delta _{r-\lambda _{s}}\bigr)\,dr\,ds \\ ={}& \int _{\max (0,t-\theta -\varepsilon )}^{t}Q_{s,s-t+\theta }K_{t,s}\,ds, \end{aligned}

where, after interchanging the order of integration in the double integral and renaming the variables,

$$K_{t,s}=A^{*}-C^{*}CP_{s}- \int _{s}^{t}\bigl(C^{*}CQ^{*}_{s,s-r}+C^{*}F^{*} \delta _{s-\lambda _{r}}\bigr)\,dr.$$

By (29), Q on $$G''$$ is expressed linearly in terms of the values of Q on $$G''$$. This implies that

$$Q_{t,\theta }=0 \quad \text{if } (t,\theta )\in G''.$$
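
The last implication is the usual uniqueness property of homogeneous linear Volterra-type equations. As a numerical illustration only, the following Python sketch iterates the scalar map $$x\mapsto \int _{0}^{t}\kappa x_{s}\,ds$$ on a grid, starting from $$x\equiv 1$$, and records the sup-norm of each iterate. The constant kernel `kappa`, the left Riemann sum, and the function name are assumptions made for this sketch; the kernel $$K_{t,s}$$ above is matrix-valued and not constant.

```python
def picard_sup_norms(kappa, t_max, n_grid, n_iter):
    """Sup-norms of successive iterates of x -> int_0^t kappa * x(s) ds."""
    h = t_max / n_grid
    x = [1.0] * (n_grid + 1)          # arbitrary starting guess x = 1
    norms = []
    for _ in range(n_iter):
        new_x = [0.0]                 # the integral over [0, 0] is zero
        acc = 0.0
        for i in range(n_grid):
            acc += h * kappa * x[i]   # left Riemann sum of the integral
            new_x.append(acc)
        x = new_x
        norms.append(max(abs(v) for v in x))
    return norms


if __name__ == "__main__":
    for j, value in enumerate(picard_sup_norms(1.0, 1.0, 100, 10), start=1):
        print(j, value)
```

With `kappa = 1` on $$[0,1]$$ the norms decay roughly like $$1/j!$$, so a function that coincides with its own image under the map can only be zero.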

Now, we assume $$(t,\theta )\in G'$$. In this case, $$0\le s\le t$$ implies $$0\le s\le \lambda ^{-1}_{0}$$ and $$s-\lambda ^{-1}_{0}\le s-t+\theta \le 0$$. Therefore,

$$0\le s\le t \quad \Rightarrow\quad (s,s-t+\theta )\in G'.$$
(33)

This will be used below. For $$(t,\theta )\in G'$$, we have $$t-\theta -\varepsilon \le \lambda ^{-1}_{0}-\varepsilon <0$$. Therefore, letting $$\Phi _{t,\theta }=F\delta _{\theta +t-\lambda _{t}}$$ in (10), we obtain

$$Q_{t,\theta }= \int _{0}^{t}\bigl(Q_{s,s-t+\theta }A^{*}+R_{s,s-t+\theta ,0}-Q_{s,s-t+ \theta }C^{*}CP_{s}-FCP_{s} \delta _{s-\lambda _{t-\theta }}\bigr)\,ds.$$

Additionally, in this case, $$\lambda _{t-\theta }\le 0$$ implying

$$Q_{t,\theta }= \int _{0}^{t}\bigl(Q_{s,s-t+\theta }A^{*}+R_{s,s-t+\theta ,0}-Q_{s,s-t+ \theta }C^{*}CP_{s} \bigr)\,ds.$$
(34)

Performing the same operations with (11) yields

\begin{aligned} R_{t,\theta ,0} ={}&- \int _{0}^{t}\bigl(Q_{s,s-t+\theta }C^{*}CQ^{*}_{s,s-t} \\ &{} +Q_{s,s-t+\theta }C^{*}F^{*}\delta _{s-\lambda _{t}}+FCQ^{*}_{s,s-t} \delta _{s-\lambda _{t-\theta }}\bigr)\,ds \\ ={}&- \int _{0}^{t}Q_{s,s-t+\theta }\bigl(C^{*}CQ^{*}_{s,s-t}+C^{*}F^{*} \delta _{s-\lambda _{t}}\bigr)\,ds. \end{aligned}
(35)

Then,

$$R_{s,s-t+\theta ,0}=- \int _{0}^{s}Q_{r,r-t+\theta }\bigl(C^{*}CQ^{*}_{r,r-s}+C^{*}F^{*} \delta _{r-\lambda _{s}}\bigr)\,dr.$$

Substituting this into (34) yields

\begin{aligned} Q_{t,\theta } ={}& \int _{0}^{t}Q_{s,s-t+\theta }\bigl(A^{*}-C^{*}CP_{s} \bigr)\,ds \\ & {}- \int _{0}^{t} \int _{0}^{s}Q_{r,r-t+\theta }\bigl(C^{*}CQ^{*}_{r,r-s}+C^{*}F^{*} \delta _{r-\lambda _{s}}\bigr)\,dr\,ds \\ ={}& \int _{0}^{t}Q_{s,s-t+\theta } \biggl( A^{*}-C^{*}CP_{s}- \int _{s}^{t}\bigl(C^{*}CQ^{*}_{s,s-r}+C^{*}F^{*} \delta _{s-\lambda _{r}}\bigr)\,dr \biggr)\,ds. \end{aligned}

By (33), Q on $$G'$$ is expressed linearly in terms of the values of Q on $$G'$$. This implies that

$$Q_{t,\theta }=0 \quad \text{if } (t,\theta )\in G'.$$

Summing up, we can update equation (28) for $$(t,\theta )\in G$$ by removing from the interval $$(\max (0,t-\theta -\varepsilon ),t]$$ of integration those values of s for which $$(s,s-t+\theta )\notin G$$. According to the definitions of $$G'$$ and $$G''$$, these values of s are specified by the inequalities

$$s-t+\theta < s-\lambda ^{-1}_{s} \quad \text{and}\quad s-\lambda ^{-1}_{0}\le s-t+ \theta .$$

Solving these inequalities, we obtain $$s<\lambda _{t-\theta }\le 0$$. Therefore, the interval of integration in (28) must be $$(\lambda _{t-\theta },t]$$. Hence, (28) in the updated form becomes the same as (25) if $$(t,\theta )\in G$$ and $$Q_{t,\theta }=0$$ if $$(t,\theta )\in G'\cup G''$$.

Next, we derive (23) from (8) or (11). Letting $$\Phi _{t,\theta }=F\delta _{\theta +t-\lambda _{t}}$$ in (11), we obtain

\begin{aligned} R_{t,\theta ,\tau } ={}&- \int _{\max (0,t-\theta -\varepsilon )}^{t}\bigl(Q_{s,s-t+ \theta }C^{*}CQ^{*}_{s,s-t+\tau } \\ & {}+Q_{s,s-t+\theta }C^{*}F^{*}\delta _{s-\lambda _{t-\tau }}+FCQ^{*}_{s,s-t+ \tau } \delta _{s-\lambda _{t-\theta }}\bigr)\,ds. \end{aligned}
(36)

Using the zeros of Q, one can see that $$R_{t,\theta ,\tau }=0$$ on the sets $$H'$$ and $$H''$$. Therefore, the integral of the first two terms in (36) can be restricted to the interval $$(\lambda _{t-\theta },t]$$. Then, the integral of the second term vanishes since $$\theta \le \tau$$ implies $$\lambda _{t-\tau }\le \lambda _{t-\theta }$$ and, therefore, $$\lambda _{t-\tau }$$ lies outside the interval $$(\lambda _{t-\theta },t]$$ of integration. Finally, the integral of the last term produces $$-FCQ^{*}_{\lambda _{t-\theta },\lambda _{t-\theta }-t+\tau }$$. Thus, we obtain that (36) in the updated form becomes the same as (26) if $$(t,\theta ,\tau )\in H$$ and $$R_{t,\theta ,\tau }=0$$ if $$(t,\theta ,\tau )\in H'\cup H''$$.

Finally, we turn to (5). Its solution can be written as (9). Letting $$\Phi _{t,\theta }=F\delta _{\theta +t-\lambda _{t}}$$ and using the zeros of Q, we find that (9) in the updated form becomes the same as (24) if $$(t,\theta )\in G$$ and $$\psi _{t,\theta }=0$$ if $$(t,\theta )\in G'\cup G''$$. This completes the proof.

8 Numerical aspects

Application of the proposed filter requires a clear determination of the delay as a function of time. In the example of a spacecraft, discussed in the introduction of this paper, this means that the trajectory of a spacecraft should be known beforehand and its distance from the Earth calculated at different instants of the voyage. The filter is seen to be stable to minor changes in the trajectory.

Another challenge is related to the numerical calculations needed to realize the filter. Equations (19)–(23) of the optimal filter from Theorem 3 could be seen as computationally complex. However, they are quite suitable for numerical calculations. First, they should be separated into (19)–(20) for the best estimate $$\hat{x}$$ and (21)–(23) for P. Equations (21)–(23) are deterministic. Therefore, they can be solved numerically in advance and stored in a computer. Then, (19)–(20) can be solved on the basis of the stored data and the observation input z as it becomes available. In fact, the same discretization idea can be applied to both (19)–(20) and (21)–(23). The distinction is just in the number of equations and arguments. Therefore, we will just demonstrate this for the relatively simple set of equations (19)–(20) in the case of constant delay, assuming that $$\lambda _{t}=t-\varepsilon$$ with $$\varepsilon >0$$. Then, $$0\le t<\infty$$ and $$-\varepsilon \le \theta \le 0$$.

Discretize the continuous time argument t by considering

$$0=t_{0} < t_{1} < \cdots < t_{n} < \cdots .$$

Do the same for θ by considering

$$-\varepsilon =\theta _{k} < \theta _{k-1}< \cdots < \theta _{m} < \cdots < \theta _{0} =0.$$

For simplicity, choose the steps of discretization in both t and θ equal, that is, assume that $$t_{n+1}-t_{n} = \theta _{m}-\theta _{m+1}=h$$ for all $$n=0,1,\ldots$$ and $$m=0,1,\ldots ,k-1$$. Let

$$\hat{x}_{n}=\hat{x}_{t_{n}}, \qquad z_{n}=z_{t_{n}},\qquad \psi _{n,m}=\psi _{t_{n}, \theta _{m}},\qquad P_{n}=P_{t_{n}}.$$
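
A minimal Python sketch of the two grids, assuming a constant delay ε split into k equal steps so that $$h=\varepsilon /k$$. The function name `make_grids` and the argument `n_steps` (the number of time steps to generate) are illustrative choices, not notation from the text.

```python
def make_grids(eps, k, n_steps):
    """Return the common step h and the grids t_0..t_{n_steps}, theta_0..theta_k."""
    h = eps / k                                # theta_k = -eps forces h = eps / k
    t = [n * h for n in range(n_steps + 1)]    # 0 = t_0 < t_1 < ...
    theta = [-m * h for m in range(k + 1)]     # 0 = theta_0 > ... > theta_k = -eps
    return h, t, theta


if __name__ == "__main__":
    h, t, theta = make_grids(eps=1.0, k=4, n_steps=8)
    print(h, t[-1], theta[-1])
```

Indexing θ downwards from $$\theta _{0}=0$$ means that one step forward in time and one step towards $$\theta =0$$ corresponds to the index change $$(n-1,m+1)\to (n,m)$$.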

Then, using the substitution

$$\hat{x}'_{n+1}\approx \frac{\hat{x}_{n+1}-\hat{x}_{n}}{h},$$

we transform (19) to the discrete form

$$\hat{x}_{n+1}=\hat{x}_{n}+h\bigl(\bigl(A-P_{n}C^{*}C \bigr)\hat {x}_{n}+\psi _{n,0}+P_{n}C^{*}z_{n} \bigr),$$

where $$z_{n}$$ is the input of the filter. Therefore, we need to determine only $$\psi _{n,0}$$ to be able to calculate $$\hat{x}_{n+1}$$ on the basis of $$\hat{x}_{n}$$. This can be done in k steps by discretization of (20). Note that the number of such steps reduces to n if $$0\le n\le k$$. At this point, observe that the left side of (20) is a directional derivative of ψ in the main diagonal direction on the tθ-plane so that we can use

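$$\biggl( \frac{\partial}{\partial t}+\frac{\partial }{\partial \theta } \biggr) \psi _{n,m} \approx \frac{\psi _{n,m}-\psi _{n-1,m+1}}{h}.$$

Here the increment is h, since moving from $$(t_{n-1},\theta _{m+1})$$ to $$(t_{n},\theta _{m})$$ advances both arguments by h and $$\partial /\partial t+\partial /\partial \theta$$ is the derivative along the non-normalized vector $$(1,1)$$. Based on this, (20) can be discretized as

$$\psi _{n,m}=\psi _{n-1,m+1}+hQ_{n-1,m+1}C^{*}(z_{n-1}-C \hat{x}_{n-1}).$$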
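
The two discrete recursions can be combined into a short loop. The Python sketch below is only an illustration under several assumptions that are not taken from the text: the state is scalar, the tables `P[n]` and `Q[n][m]` are hypothetical precomputed inputs standing for a numerical solution of (21)–(23), ψ is taken to be zero on the edges $$t=0$$ and $$\theta =-\varepsilon$$, the state update is written in innovation form (the gain multiplies $$z_{n}-C\hat{x}_{n}$$), and each diagonal step uses the increment h.

```python
def psi_at_zero(n, k, h, Q, z, xhat, C):
    """Return psi_{n,0} by marching along the diagonal towards theta = 0."""
    steps = min(n, k)          # k steps in general, only n steps while n <= k
    psi = 0.0                  # ASSUMPTION: psi vanishes on t = 0 and theta = -eps
    for j in range(steps, 0, -1):
        i = n - j              # step from grid point (i, j) to (i + 1, j - 1)
        psi += h * Q[i][j] * C * (z[i] - C * xhat[i])
    return psi


def run_filter(z, P, Q, A, C, h, k, x0=0.0):
    """Scalar Euler recursion for xhat driven by the observations z."""
    xhat = [x0]
    for n in range(len(z)):
        psi0 = psi_at_zero(n, k, h, Q, z, xhat, C)
        innovation = z[n] - C * xhat[n]
        # ASSUMPTION: innovation form of the state update
        xhat.append(xhat[n] + h * (A * xhat[n] + psi0 + P[n] * C * innovation))
    return xhat


if __name__ == "__main__":
    n_obs, k = 3, 2
    Q = [[0.0] * (k + 1) for _ in range(n_obs)]   # placeholder table, not real data
    print(run_filter([0.0] * n_obs, [0.0] * n_obs, Q, A=1.0, C=1.0, h=0.5, k=k, x0=1.0))
```

Only the past innovations along one diagonal are needed for each $$\psi _{n,0}$$, so the loop never stores the full array $$\psi _{n,m}$$.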

Thus, for passing from the calculation of $$\hat{x}_{n}$$ to $$\hat{x}_{n+1}$$, there are $$\min (k, n)$$ steps for the calculation of $$\psi _{n,0}$$. The total number of steps for the calculation of $$\hat{x}_{n}$$ becomes $$n+\sum_{i=1}^{n}\min (i,k)\le (k+1)n$$. In the case of equations (21)–(23), assuming that the discretization of τ is the same as for θ, the number of steps increases and is bounded by $$k(k+1)(n+1)/2$$ because $$-\varepsilon \le \theta \le \tau \le 0$$. These estimates are valid for the case when the state space is one-dimensional. For a multidimensional state space the complexity increases. However, as was mentioned before, equations (21)–(23) are deterministic, so they can be solved beforehand and presented as a table of values of P and Q. In contrast, (19)–(20) must be solved in real time, as the observation measurements become available.
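
The step count for $$\hat{x}_{n}$$ is easy to tabulate. The following Python sketch, with the hypothetical helper name `total_steps`, evaluates the sum $$n+\sum_{i=1}^{n}\min (i,k)$$ directly; since every term of the sum is at most k, the result never exceeds $$n(k+1)$$.

```python
def total_steps(n, k):
    """Number of steps n + sum_{i=1}^{n} min(i, k) needed to reach xhat_n."""
    return n + sum(min(i, k) for i in range(1, n + 1))


if __name__ == "__main__":
    for n, k in [(5, 1), (3, 10), (1000, 10)]:
        print(n, k, total_steps(n, k), n * (k + 1))
```

For example, with $$k=10$$ and $$n=1000$$ the count is $$1000+55+990\cdot 10=10955$$, so the cost grows linearly in n once $$n>k$$.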

9 Conclusion

A delay is an important element in systems theory. Previously, delays in state and control were investigated and important results in this direction were obtained. In the recent papers [14, 16, 17, 27], it was justified that a delay in noises, in either distributed or pointwise form, is important too. As a continuation of these papers, the present paper proves an important Kalman-type filtering result in the case of a pointwise delay in the signal noise.

Availability of data and materials

Data sharing is not applicable to this paper as no datasets were generated or analyzed during the current study.

References

1. Kalman, R.E.: A new approach to linear filtering and prediction problems. J. Basic Eng. 82, 35–45 (1960)

2. Kalman, R.E., Bucy, R.S.: New results in linear filtering and prediction theory. J. Basic Eng. 83, 95–108 (1961)

3. Crassidis, J.L., Junkins, J.L.: Optimal Estimation of Dynamic Systems. Chapman & Hall, Boca Raton (2004)

4. Hofmann-Wellenhof, B., Lichtenegger, H., Wasle, E.: GNSS–Global Navigation Satellite Systems. Springer, Wien (2008)

5. Lefebvre, T., Bruyninckx, H., De Schutter, J.: Nonlinear Kalman Filtering for Force-Controlled Robot Tasks. Springer, Berlin (2005)

6. Harvey, A.C.: Forecasting, Structural Time Series Models and Kalman Filter. Cambridge University Press, Cambridge (1989)

7. Curtain, R.F., Pritchard, A.J.: Infinite Dimensional Linear Systems Theory. Lecture Notes in Control and Information Sciences, vol. 8. Springer, Berlin (1978)

8. Bucy, R.S., Joseph, P.D.: Filtering for Stochastic Processes with Application to Guidance. Interscience, New York (1968)

9. Bashirov, A.E.: Partially Observable Linear Systems Under Dependent Noises. Birkhäuser, Basel (2003)

10. Zhang, Y., Jia, G., Li, N., Bai, M.: A novel adaptive Kalman filter with colored measurements. IEEE Access 6, 74569–74578 (2018). https://doi.org/10.1109/ACCESS.2018.2883040

11. Jia, G., Zhang, Y., Bai, M., Li, N., Qian, J.: A novel robust student t-based Gaussian approximate filter with one-step randomly delayed measurements. Signal Process. 171, 107496 (2020). https://doi.org/10.1016/j.sigpro.2020.107496

12. Jia, G., Huang, Y., Zhang, Y., Chambers, J.: A novel adaptive Kalman filter with unknown probability of measurement loss. IEEE Signal Process. Lett. 26(12), 1862–1866 (2019). https://doi.org/10.1109/LSP.2019.2951464

13. Jia, G., Huang, Y., Bai, M., Zhang, Y.: A novel robust Kalman filter with non-stationary heavy-tailed measurement noise. IFAC-PapersOnLine 53(2), 368–373 (2020). https://doi.org/10.1016/j.ifacol.2020.12.188

14. Bashirov, A.E.: Kalman-type filter for communication with considerably distanced spacecraft. J. Br. Interplanet. Soc. 74(10), 381–385 (2021)

15. Bashirov, A.E.: Filtering for linear systems with shifted noises. Int. J. Control 78(7), 521–529 (2005)

16. Bashirov, A.E., Abuassba, K.: Invariant filtering results for wide band noise driven signal systems. TWMS J. Appl. Eng. Math. 8(1), 71–82 (2018)

17. Bashirov, A.E., Abuassba, K.: Invariant Kalman filter for correlated wide band noises. Asian J. Control 22(2), 648–656 (2020)

18. Fleming, W.H., Rishel, R.W.: Deterministic and Stochastic Optimal Control. Springer, Berlin (1975)

19. Kushner, H.J., Runggaldier, W.J.: Nearly optimal state feedback controls for stochastic systems with wideband noise disturbances. SIAM J. Control Optim. 25, 298–315 (1987)

20. Kushner, H.J., Runggaldier, W.J.: Filtering and control for wide bandwidth noise driven systems. IEEE Trans. Autom. Control AC-32, 123–133 (1987)

21. Kushner, H.J., Ramachandran, K.M.: Nearly optimal singular controls for wideband noise driven systems. SIAM J. Control Optim. 26, 569–591 (1988)

22. Bashirov, A.E., Etikan, H., Şemi, N.: Filtering, smoothing and prediction for wide band noise driven systems. J. Franklin Inst. 334(4), 667–683 (1997)

23. Bashirov, A.E., Eppelbaum, L.V., Mishne, L.R.: Improving Eötvös corrections by wide band noise Kalman filtering. Geophys. J. Int. 108(1), 193–197 (1992)

24. Bashirov, A.E., Mazhar, Z., Etikan, H., Ertürk, S.: Delay structure of wide band noises with application to filtering problems. Optim. Control Appl. Methods 34(1), 69–79 (2013)

25. Bashirov, A.E., Uğural, S.: Representation of systems disturbed by wide band noises. Appl. Math. Lett. 15, 607–613 (2002)

26. Bashirov, A.E., Uğural, S.: Analyzing wide band noise processes with application to control and filtering. IEEE Trans. Autom. Control 47(2), 323–327 (2002)

27. Bashirov, A.E.: Linear filtering for wide band noise driven observation systems. Circuits Syst. Signal Process. 36(3), 1247–1263 (2017)

Acknowledgements

This work was supported by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Project No. GRANT 395].

Author information

Contributions

Conceptualization, A.B.; Methodology, A.B.; Formal Analysis, A.B. and K.A.; Investigation, A.B. and K.A.; Writing, A.B.; Supervision, A.B.; Project Administration, K.A.; Funding Acquisition, K.A. Both authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Kinda Abuasbeh.

Ethics declarations

Not applicable.

Competing interests

The authors declare no competing interests.

Abuasbeh, K., Bashirov, A.E. Derivation of a Kalman-type filter for linear systems with pointwise delay in signal noise. Bound Value Probl 2022, 64 (2022). https://doi.org/10.1186/s13661-022-01646-6