Linear regression

Problem definition

In speech processing and elsewhere, a frequently appearing task is to make a prediction of an unknown vector y from available observation vectors x. Specifically, we want to have an estimate $$\hat y = f(x)$$ such that $$\hat y \approx y.$$ In particular, we will focus on linear estimates where $$\hat y=f(x):=A^T x,$$ and where A is a matrix of parameters.
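As a minimal sketch of the linear estimate $$\hat y = A^T x$$, the following plain-Python snippet computes it for a small example. The function name `linear_estimate`, the storage of A as a list of rows, and the numeric values are illustrative assumptions, not part of the original text.

```python
def linear_estimate(A, x):
    """Return y_hat = A^T x.

    A is assumed to be a list of n rows with m entries each (an n x m matrix),
    and x a list of n numbers. The result has m entries.
    """
    n, m = len(A), len(A[0])
    if len(x) != n:
        raise ValueError("A must have as many rows as x has entries")
    # Entry j of A^T x is the inner product of column j of A with x.
    return [sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]


# Illustrative values only: a 3-dimensional observation mapped to a
# 2-dimensional estimate.
A = [[1.0, 0.0],
     [2.0, 1.0],
     [0.0, 3.0]]
x = [1.0, 2.0, 3.0]

y_hat = linear_estimate(A, x)
print(y_hat)
```

Note that with this convention the number of rows of A matches the dimension of x, and the number of columns of A matches the dimension of y.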

The minimum mean square estimate (MMSE)

Suppose we want to minimise the squared error of our estimate on average. The estimation error is $$e=y-\hat y$$ and the squared error is the squared L2-norm of the error, that is, $$\left\|e\right\|^2 = e^T e.$$ Its mean can be written as the expectation $$E\left[\left\|e\right\|^2\right] = E\left[\left\|y-\hat y\right\|^2\right] = E\left[\left\|y-A^T x\right\|^2\right].$$ Formally, the minimum mean square problem can then be written as

$\min_A\, E\left[\left\|y-A^T x\right\|^2\right].$

In general, this cannot be implemented directly because the expectation is an abstract operation. To get a computational model, we can approximate the expectation with the sample mean over N pairs of desired outputs $$y_k$$ and observations $$x_k$$ as

$E\left[\left\|y-A^T x\right\|^2\right] \approx \frac1N \sum_{k=1}^N \left\|y_k-A^T x_k\right\|^2 = \frac1N \sum_{k=1}^N \left\|e_k\right\|^2,$

where $$e_k = y_k - A^T x_k$$ is the estimation error of the k-th pair.
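The sample-mean approximation of the expected squared error can be sketched in plain Python as follows. The function names, the storage of A as a list of rows, and the toy data are assumptions made for illustration. The snippet only evaluates the objective for a given A; it does not solve the minimisation problem.

```python
def linear_estimate(A, x):
    """Return y_hat = A^T x for A stored as a list of rows (n x m)."""
    n, m = len(A), len(A[0])
    return [sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]


def mean_squared_error(A, xs, ys):
    """Return (1/N) * sum_k ||y_k - A^T x_k||^2 over the N pairs (x_k, y_k)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need the same non-zero number of observations and outputs")
    total = 0.0
    for x_k, y_k in zip(xs, ys):
        y_hat_k = linear_estimate(A, x_k)
        # Squared L2-norm of the error e_k = y_k - A^T x_k.
        total += sum((y - y_hat) ** 2 for y, y_hat in zip(y_k, y_hat_k))
    return total / len(xs)


# Toy data (illustrative only): N = 3 pairs, x_k in R^2, y_k in R^1.
A = [[2.0],
     [1.0]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys = [[2.0], [1.0], [4.0]]

mse = mean_squared_error(A, xs, ys)
print(mse)
```

In this toy data set the first two pairs are predicted exactly and the third has an error of 1, so the sample mean of the squared errors is 1/3.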
