
# Non-negative Matrix and Tensor Factorization


## Introduction

Many of the most descriptive features of speech are characterized by energy; for example, formants appear as peaks and the fundamental frequency is visible as a comb structure in the power spectrum. A basic property of such features is that they are positive-valued: negative values of energy are not physically realizable. However, most signal processing methods are applicable only to real-valued variables, and including a non-negativity constraint is cumbersome.

Non-negative matrix factorization (NMF or NNMF) and its tensor-valued counterparts form a family of methods which explicitly assume that the input variables are non-negative, that is, they are by definition applicable to energy signals. In some sense, NMF methods are an extension of principal component analysis (PCA) and other subspace methods to positive-valued signals.


## Model definition

Specifically, suppose that the power (or magnitude) spectrum of one window of a speech signal is represented as an $N\times 1$ vector $v_k$, and furthermore we arrange the $K$ windows into an $N\times K$ matrix $V$. The signal model we use is then

$V \approx WH,$

where $W$ is the $N\times M$ model matrix, $H$ is the $M\times K$ weight matrix, and the scalar $M$ is the model order.

The idea is that $W$ is a fixed matrix corresponding to our model of the signal, viz. the source model. Its columns describe typical features of the data. With the weights in $H$, we interpolate between the columns of $W$; each window is approximated as $v_k \approx W h_k$, a non-negative combination of the columns of $W$. In some sense, this is then a generalization of a codebook (see vector quantization), but such that we interpolate between codevectors. In addition, we require that all elements of $W$ and $H$ are non-negative, which ensures that the approximation of $V$ is also non-negative.
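As a minimal NumPy sketch of the model (with made-up dimensions, and assuming the conventional reading of $V \approx WH$ in which one factor holds spectral templates and the other the per-window weights), each window is a non-negative combination of the template columns:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 8, 3, 5        # spectrum length, model order, number of windows (made up)

W = rng.random((N, M))   # non-negative matrix whose columns act as spectral templates
H = rng.random((M, K))   # non-negative weights, one column h_k per window

V = W @ H                # the model: each column v_k = W @ h_k
print(V.shape)           # -> (8, 5)
print(bool(np.all(V >= 0)))  # -> True: products and sums of non-negatives stay non-negative
```

Because every entry of `W` and `H` is non-negative, every entry of `V` is a sum of non-negative products, which is exactly why the factorization is guaranteed to produce a valid energy-like signal.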

Since the model order $M$ is chosen to be smaller than either $N$ or $K$, this mapping is generally an approximation. The model thus tries to capture the relevant features of the input signal with a low number of parameters.

The model is generally optimized by

$\min_{W,H} \| V - WH \|_F \qquad\text{such that}\qquad W,H\geq 0.$

Here the norm refers to the Frobenius norm, which is defined as the square root of the sum of squared elements. The above optimization problem has no analytic solution, but it can be solved by numerical methods, which are included in typical software libraries.
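For instance, scikit-learn ships such a numerical solver. A short sketch on toy data (the dimensions and parameter choices here are illustrative, not prescriptive):

```python
import numpy as np
from sklearn.decomposition import NMF  # assumes scikit-learn is installed

rng = np.random.default_rng(0)
V = rng.random((8, 5))                     # toy non-negative "spectrogram": N=8, K=5

# Solve min ||V - WH||_F subject to W, H >= 0 with model order M=3
model = NMF(n_components=3, init="random", random_state=0, max_iter=500)
W = model.fit_transform(V)                 # N x M factor, non-negative
H = model.components_                      # M x K factor, non-negative

err = np.linalg.norm(V - W @ H, "fro")     # Frobenius reconstruction error
print(W.shape, H.shape, float(err))
```

Since $M < \min(N, K)$, the product $WH$ is only an approximation of $V$, and the reported Frobenius error quantifies how much of the signal the low-order model fails to capture.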
