Hard-copy of list of authors (does not require authorization): Tom Bäckström


Table of contents

  1. Introduction
    1. Why speech processing?
    2. Speech production and acoustic properties
    3. Phonetics (Wikipedia)
    4. Linguistics (Wikipedia)
    5. Speech perception (Wikipedia)
    6. Speech-language pathology (Wikipedia)
  2. Basic representations and models
    1. Waveform
    2. Windowing
    3. Spectrogram and the STFT
    4. Autocorrelation and autocovariance
    5. Cepstrum and MFCC
    6. Linear prediction
    7. Fundamental frequency (F0)
    8. Zero-crossing rate
    9. Deltas and delta-deltas
    10. PSOLA (Wikipedia stub)
  3. Pre-processing
    1. Pre-emphasis
    2. Voice activity detection (VAD)
    3. Vocal tract length normalization
    4. Speech enhancement
  4. Modelling tools in speech processing
    1. Linear regression
    2. Sub-space models
    3. Vector quantization (VQ)
    4. Gaussian mixture model (GMM)
    5. Neural networks
    6. Non-negative matrix and tensor factorization
    7. Hidden Markov models
  5. Evaluation of speech processing methods
    1. Subjective evaluation
    2. Objective evaluation
  6. Speech analysis
    1. Fundamental frequency estimation
    2. Formant estimation and tracking
    3. Inverse filtering for glottal activity estimation
  7. Recognition tasks in speech processing
    1. Voice activity detection (VAD)
    2. Keyword or wake-word spotting
    3. Speech recognition
    4. Speaker recognition and verification
  8. Natural language processing
  9. Speech synthesis
    1. Concatenative synthesis
    2. Parametric synthesis
  10. Transmission, storage and telecommunication
    1. Short history of speech coding
    2. Design goals
    3. Basic tools
      1. Modified discrete cosine transform (MDCT)
      2. Entropy coding
      3. Perceptual modelling and coding
      4. Vector quantization (VQ)
      5. Linear prediction
    4. Code-excited linear prediction (CELP)
    5. Frequency-domain coding
  11. Speech enhancement
    1. Noise attenuation
    2. Echo cancellation
    3. Dereverberation
    4. Source separation
  12. Speech analysis and imaging for medical applications
    1. Electroglottography (Wikipedia)
    2. Stroboscopy and videokymography (Wikipedia)
    3. High-speed camera
    4. MRI
    5. Rothenberg mask
    6. Glottal inverse filtering
  13. References


