One area of research that makes use of speech technology is the study of human language learning and processing. Language is a highly complex phenomenon with physical, biological, psychological, social, and cultural dimensions. It is therefore studied across several disciplines, such as linguistics, neuroscience, psychology, and anthropology. While many of these fields focus primarily on empirical and theoretical work on language, computational models and simulations contribute another important element to the research: the capability to test theoretical models in practice. Implementing models that can process real speech data requires techniques from speech processing and machine learning. For instance, techniques for speech signal representation and pre-processing are needed to interface the models with acoustic speech recordings. Different types of classifiers and machine learning algorithms are needed to implement learning mechanisms in the models or to analyze the behavior of the developed models. In addition, model training data may be generated with speech synthesizers (e.g., Havard, Besacier & Rosec, 2017), whereas linguistic reference data for model evaluation may be extracted from speech recordings using automatic speech recognition.


Dupoux, E. (2018). Cognitive science in the era of artificial intelligence: A roadmap for reverse-engineering the infant language-learner. Cognition, 173, 43–59.

Havard, W., Besacier, L., & Rosec, O. (2017). SPEECH-COCO: 600k visually grounded spoken captions aligned to MSCOCO data set. Proc. International Workshop on Grounding Language Understanding (GLU-2017), pp. 42-46, DOI: 10.21437/GLU.2017-9.

Howard, I., & Messum, P. (2014). Learning to pronounce first words in three languages: An investigation of caregiver and infant behavior using a computational model of an infant. PLoS ONE, 9, e110334.


Marr, D. (1982). Vision: A computational approach. San Francisco, Freeman & Co.