Computing the error function defined over a transformed dataset

by Gerardo Durán Martín   Last Updated August 14, 2019 17:19 PM

I am currently reading chapter 5 from Christopher Bishop's Pattern Recognition and Machine Learning book.

Having an infinite dataset $$\{({\bf x}, t)\}$$ and an estimated value $$y({\bf x})$$, I am attempting to compute the squared error function over an expanded dataset generated by transforming the input vector $${\bf x}$$ with $${\bf s}({\bf x}, \xi)$$, defined in such a way that $${\bf s}({\bf x}, 0) = {\bf x}$$. The form this error function takes is

$$\tilde E = \frac{1}{2}\int\int\int \left[y({\bf s}({\bf x}, \xi)) - t\right]^2 p({\bf x})p(t|{\bf x})p(\xi) d{\bf x} \ dt \ d\xi$$

where we assume that $$\mathbb{E}[\xi] = 0$$ and $$\mathbb{V}[\xi]$$ is small.

According to Bishop, the transformed error function can be expanded as

\begin{align} \tilde E &= \frac{1}{2}\int\int[y({\bf x}) - t]^2 p({\bf x})p(t|{\bf x}) d{\bf x} \ dt \\ &+ \frac{1}{2}\mathbb{E}[\xi^2]\int\int \Big[[y({\bf x}) - t] \left[({\tau'})^T\nabla_{\bf x} y + \tau^T \nabla^2_{\bf x}y \tau\right] \\ &\qquad + (\tau^T\nabla_{\bf x} y)^2 \Big] p({\bf x})p(t|{\bf x}) d{\bf x} \ dt \end{align}

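where we use the second-order Taylor expansion of $$y({\bf s}({\bf x}, \xi))$$ around $$\xi = 0$$, given by $$y({\bf s}({\bf x}, \xi)) = y({\bf x}) + \xi \tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_xy + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)$$, with $$\tau = \partial{\bf s}/\partial\xi$$ and $$\tau' = \partial^2{\bf s}/\partial\xi^2$$, both evaluated at $$\xi = 0$$.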
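As a sanity check on this Taylor expansion, here is a minimal one-dimensional sketch. The choices `y(u) = sin(u)` and `s(x, xi) = x + xi*sin(x) + xi**2*cos(x)/2` are hypothetical, picked only so that the derivatives are easy to write by hand; they are not from Bishop. If the second-order expansion is right, the residual should shrink like $$\xi^3$$, so halving $$\xi$$ should divide it by roughly 8.

```python
import math

# Hypothetical 1-D stand-ins (assumptions for illustration only).
def y(u):
    return math.sin(u)

def s(x, xi):
    # Transformation with s(x, 0) = x.
    return x + xi * math.sin(x) + 0.5 * xi**2 * math.cos(x)

def taylor2(x, xi):
    """Second-order expansion of y(s(x, xi)) in xi, following the formula above."""
    tau = math.sin(x)         # ds/dxi at xi = 0
    tau_prime = math.cos(x)   # d^2 s/dxi^2 at xi = 0
    dy = math.cos(x)          # y'(x)
    d2y = -math.sin(x)        # y''(x)
    return y(x) + xi * tau * dy + 0.5 * xi**2 * (tau_prime * dy + tau**2 * d2y)

def residual(x, xi):
    # Gap between the exact composition and its second-order expansion.
    return abs(y(s(x, xi)) - taylor2(x, xi))

x0 = 0.7
# Halving xi should shrink an O(xi^3) residual by a factor close to 8.
ratio = residual(x0, 1e-2) / residual(x0, 5e-3)
print(ratio)
```

A ratio near 8 only supports the stated order of the remainder for this particular example; it is not a proof of the general multivariate formula.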

My attempt

With the second-order Taylor expansion of $$y({\bf s}({\bf x}, \xi))$$, we can rewrite the integrand as

\begin{align} & \left([y({\bf x}) - t] + \xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3) \right)^2 \\ & = [y({\bf x}) - t]^2 \\ &+ 2(y({\bf x}) - t)\left( \xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right) \\ &+ \left(\xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right)^2 \end{align}

Here the first term in the sum, integrated with respect to $${\bf x}$$, $$t$$ and $$\xi$$, is the first term in $$\tilde E$$. The second term splits into a part proportional to $$\mathbb{E}[\xi]$$, which is 0 because of the constraint $$\mathbb{E}[\xi] = 0$$, plus the first part of the second term of $$\tilde E$$, plus $$O(\xi^3)$$. Finally, we are left with the third term, which is where I am having some trouble.

Expanding the third term of the expansion, we get \begin{align} &\left(\xi\tau^T\nabla_xy + \frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right)^2\\ &= \xi^2\left(\tau^T\nabla_xy\right)^2 \\ &+ 2 \left(\xi\tau^T\nabla_xy\right) \left(\frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right) \\ &+ \left(\frac{\xi^2}{2}\left((\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau\right) + O(\xi^3)\right)^2 \end{align}

It is clear that the first term in this last expansion corresponds to the final term of $$\tilde E$$, but we are left with the second and third terms, which contain $$\mathbb{E}[\xi^3]$$ and $$\mathbb{E}[\xi^4]$$ respectively. If the expansion of $$\tilde E$$ is correct, I would expect these two terms to be zero, but it is not clear to me how to show this.
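To make the bookkeeping of powers of $$\xi$$ in this last expansion explicit, here is a small sketch that squares the third term as a polynomial in $$\xi$$. It is only a check of the algebra, under two assumptions: the $$O(\xi^3)$$ remainder is dropped, and `A` and `B` are hypothetical numeric stand-ins for $$\tau^T\nabla_xy$$ and $$(\tau')^T\nabla_x y + \tau^T\nabla_x^2y\tau$$.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, ...] in xi."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Hypothetical numeric stand-ins (any values would do).
A = Fraction(3)   # plays the role of tau^T grad y
B = Fraction(5)   # plays the role of (tau')^T grad y + tau^T (Hessian of y) tau

# xi * A + (xi^2 / 2) * B, with the O(xi^3) remainder dropped.
third_term = [Fraction(0), A, B / 2]
squared = poly_mul(third_term, third_term)

# squared[n] is the coefficient that multiplies xi^n, and hence E[xi^n].
for power, coeff in enumerate(squared):
    print(f"xi^{power}: {coeff}")
```

The coefficient at $$\xi^2$$ is `A**2`, while `A*B` and `B**2/4` sit at $$\xi^3$$ and $$\xi^4$$, which are the two leftover terms described above.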
