26/06/2024 COLT seminar, given by Oli Liu (University of Edinburgh)

"Analyzing self-supervised representations of speech: encoding structures of speaker information and phonetic context", given by Oli Liu (University of Edinburgh)




Date: 26 June 2024
Time: 12:00 to 13:00
Venue: room 52.701, UPF Poblenou Campus (Roc Boronat building)


Mapping speech to meaning is a non-trivial computational problem. The acoustic realization of a phoneme varies significantly with the speaker and the phonetic context. Nevertheless, humans easily overcome these challenges and comprehend speech robustly. Recent advances in speech technology have produced automatic speech recognition systems that approach human-level performance. Many of these systems rely on pre-trained speech models, which are trained in a self-supervised manner without any text labels and yet have been shown to encode significant linguistic information.

In my research, I use self-supervised learning models as a tool for understanding humans’ mental representations of speech. By analyzing the structures of the representation space and comparing them to properties found in the neural encoding in human listeners, I aim to identify potential computational mechanisms employed in speech perception. I will talk about two ongoing projects focusing on the representational geometry of speaker information and phonetic context, respectively. I will discuss the implications of our findings for both understanding human speech perception and analyzing self-supervised representations.



