
Speech is a continuous signal, which means that consecutive samples of the signal are correlated (see figure on the right). In particular, if we know the previous sample \( x_{n-1} \), we can make a prediction of the current sample, \( \hat x_n = x_{n-1}, \) such that \( \hat x_n \approx x_n. \) By using more previous samples we have more information, which should help us make a better prediction. Specifically, we can define a predictor which uses the \( M \) previous samples to predict the current sample \( x_n \) as

\[ \hat x_n = - \sum_{k=1}^M a_k x_{n-k}. \]

This is a linear predictor because it predicts the current sample as a linearly weighted sum of past samples.

The error of the prediction, also known as the prediction residual, is

\[ e_n = x_n - \hat x_n = x_n + \sum_{k=1}^M a_k x_{n-k} = \sum_{k=0}^M a_k x_{n-k}, \]

where \( a_0 = 1 \). This explains why the definition of \( \hat x_n \) includes a minus sign: when we calculate the residual, the double negative disappears and we can collect everything into a single summation.

A short segment of speech. Notice how consecutive samples are mostly near each other, which means that consecutive samples are correlated.
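To make the formulas concrete, here is a minimal NumPy sketch (not part of the original derivation) that evaluates the residual \( e_n = \sum_{k=0}^M a_k x_{n-k} \) sample by sample. The signal values and coefficients below are made up for illustration, and samples before the start of the signal are treated as zero.

import numpy as np

def prediction_residual(x, a):
    # e_n = sum_{k=0}^{M} a_k * x_{n-k}, with a[0] == 1
    M = len(a) - 1
    e = np.zeros(len(x))
    for n in range(len(x)):
        for k in range(M + 1):
            if n - k >= 0:          # treat samples before the start as zero
                e[n] += a[k] * x[n - k]
    return e

# Toy signal and the M = 1 "previous sample" predictor hat{x}_n = x_{n-1},
# i.e. a = [1, -1] so that e_n = x_n - x_{n-1}
x = np.array([0.0, 0.4, 0.7, 0.9, 1.0, 0.9, 0.7, 0.4])
a = np.array([1.0, -1.0])
print(prediction_residual(x, a))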

Using vector notation, we can make these expressions more compact:

\[ e = Xa \]

where

\[ e = \begin{bmatrix}e_0\\e_1\\\vdots\\e_{N-1}\end{bmatrix},\qquad X = \begin{bmatrix}x_0 & x_{-1} & \dots & x_{-M} \\x_1 & x_0 & \dots & x_{1-M} \\ \vdots & \vdots & & \vdots \\ x_{N-1} & x_{N-2} & \dots & x_{N-1-M} \end{bmatrix}, \qquad a = \begin{bmatrix}a_0\\a_1\\\vdots\\a_{M}\end{bmatrix} \]
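As a sanity check of the vector notation, the following sketch builds the matrix \( X \) for the same toy signal and verifies that \( e = Xa \) reproduces the sample-by-sample residual above. Treating samples with negative index (the entries \( x_{-1}, \dots, x_{-M} \)) as zero is an assumption of this sketch, not a requirement of the notation.

import numpy as np

def build_X(x, M):
    # Row n of X is [x_n, x_{n-1}, ..., x_{n-M}]; samples with negative
    # index are taken as zero in this sketch (other conventions exist).
    N = len(x)
    X = np.zeros((N, M + 1))
    for n in range(N):
        for k in range(M + 1):
            if n - k >= 0:
                X[n, k] = x[n - k]
    return X

x = np.array([0.0, 0.4, 0.7, 0.9, 1.0, 0.9, 0.7, 0.4])
a = np.array([1.0, -1.0])        # a_0 = 1, illustrative a_1
e = build_X(x, M=1) @ a          # same residual as the sample-by-sample loop above
print(e)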

