Asli Ozyurek. What does a multimodal language framework reveal about relations between language, cognition, and communication in humans and machines?

22/5/2025
- BCBL Auditorium (and BCBL Auditorium Zoom room)

What: What does a multimodal language framework reveal about relations between language, cognition, and communication in humans and machines?

Where: BCBL Auditorium and Auditorium Zoom room (if you would like to attend this meeting, please reserve a place at info@bcbl.eu)

Who: Professor Asli Ozyurek, PhD, Director, Multimodal Language Department, Max Planck Institute for Psycholinguistics, Nijmegen; Donders Institute for Brain, Cognition and Behaviour, Radboud University; The Netherlands

When: Thursday, May 22nd at 12:00 noon.

One of the fundamental challenges of cognitive science has been to formalize the relations of language 1) to cognition and 2) to communication (e.g., Fedorenko, Piantadosi, and Gibson 2024), a problem that has recently been extended to language use by machines. These debates, however, have viewed language mostly in a text- or speech-centric way. In this talk I will sketch how a multimodal language framework challenges fundamental assumptions about language structures (mostly based on text and speech) and how it can offer a new window onto how human language relates to cognition and communication. Substantial evidence has shown the need to integrate properties of the visible bodily modality as an integral design feature of language (e.g., Holler and Levinson 2019; Hagoort and Özyürek 2024), such as the omnipresent facial and hand gestures used by speakers accompanying spoken language, and the inherently visual nature of sign languages. The bodily modality, due to the affordances of visual-bodily signals, can express meaning in holistic, iconic, and indexical ways, and relies on the integration of simultaneous expressions produced by multiple articulators in both spoken and sign languages. As such, these expressions depart from the sequential and arbitrary nature stereotypical of spoken/textual expressions.
In this talk I will outline how producing and perceiving linguistic expressions with this added complexity has implications for the language-(neuro)cognition interface and for production and comprehension processes, and how we think about models of communicative efficiency shaping language structures. Furthermore, ignoring a multimodal view of language is also a problem for interfacing current text/speech-based large language models (LLMs) with human language, cognition, and communication, especially in an era where human communication is becoming increasingly integrated with virtual avatars and robots. Thus, I will end the talk by discussing how the proposed multimodal language framework can offer ways to extend LLMs to multimodal language use.