Humans intuitively combine language with spontaneous gesture to form multimodal utterances. In such utterances, words and gestures are highly coordinated and closely intertwined; in other words, the speaker aligns them with each other. These alignments concern the meaning that the verbal and non-verbal behaviors convey, the form they take in doing so, the manner in which they are performed, their relative temporal arrangement, and their coordinated organization in the phrasal structure of the utterance. Their effects are essential for how meaning is communicated by both modalities in concert. The resulting confluence of language and gesture has led many researchers (e.g. McNeill, 2005) to believe that speech and gesture are produced by one and the same generative process. Still, exactly how language and gesture interact in producing a coherent multimodal utterance is an open question. The goal of subproject B1 is to investigate the intra-personal mechanisms that underlie the composition of a multimodal utterance in dialogue. Concretely, we investigate the following research questions:
B1 investigates these topics through empirical studies of human multimodal behavior and through the conception and simulation of computational models in virtual humans. Empirical studies elicit sets of dialogue games. Video and VR tracking data are annotated in order to extract statistically significant patterns. Based on the behavioral units found in the data, the generation processes that turn content representations and communicative intentions into verbal and gestural behavior are modeled both theoretically and computationally, informing the implementation of a prototype simulation system with our virtual human Max.
Based on an empirical study of spatial descriptions of landmarks in direction-giving, our model allows virtual humans to automatically generate coordinated language and iconic gestures. The model is characterized by a close interplay between these two modes of expression: we use two different kinds of content representation, visuo-spatial imagery and propositional-linguistic knowledge. Further, dedicated planners carry out the formulation of concrete verbal and gestural behavior. Both content planning and formulation processes run in parallel and interact via a multimodal working memory. In gesture formulation we apply a novel probabilistic approach which not only incorporates systematic factors constraining the mapping of visuo-spatial referent properties onto gesture morphology, but also accounts for the role of idiosyncratic patterns in multimodal behavior.
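To make the idea of combining systematic and idiosyncratic factors more concrete, the following is a minimal sketch and not the project's actual model. It assumes that annotated data can be reduced to triples of speaker, referent shape, and gesture technique, and it simply interpolates between a speaker-independent distribution and a speaker-specific one. The observation data, the category names, the function names, and the interpolation weight are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical annotated observations (not real study data):
# (speaker, referent_shape, gesture_technique)
OBSERVATIONS = [
    ("speaker_a", "round", "drawing"),
    ("speaker_a", "round", "drawing"),
    ("speaker_a", "round", "shaping"),
    ("speaker_a", "longish", "drawing"),
    ("speaker_b", "round", "shaping"),
    ("speaker_b", "round", "shaping"),
    ("speaker_b", "longish", "indexing"),
    ("speaker_b", "longish", "drawing"),
]


def technique_distribution(observations, shape, speaker=None, weight=0.5):
    """Estimate P(technique | shape), optionally biased toward one speaker.

    The speaker-independent estimate stands in for systematic factors;
    the speaker-specific estimate stands in for idiosyncratic patterns.
    `weight` (an assumed parameter) controls how much the individual
    speaker's habits count.
    """
    general = Counter(t for _, s, t in observations if s == shape)
    total = sum(general.values())
    if total == 0:
        raise ValueError(f"no observations for shape {shape!r}")
    probs = {t: c / total for t, c in general.items()}
    if speaker is None:
        return probs

    personal = Counter(
        t for sp, s, t in observations if sp == speaker and s == shape
    )
    personal_total = sum(personal.values())
    if personal_total == 0:
        # No data for this speaker: fall back to the general pattern.
        return probs

    # Linear interpolation between general and speaker-specific estimates.
    return {
        t: (1 - weight) * probs[t] + weight * personal[t] / personal_total
        for t in probs
    }


def most_likely_technique(observations, shape, speaker=None, weight=0.5):
    """Return the technique with the highest estimated probability."""
    dist = technique_distribution(observations, shape, speaker, weight)
    return max(dist, key=dist.get)
```

With this toy data, `most_likely_technique(OBSERVATIONS, "round")` and the same call with `speaker="speaker_a"` can yield different techniques, which illustrates how an individual speaker's habits can override the population-level tendency.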
Click here for a video demonstrating the simulated gesturing behavior of a particular speaker from our empirical data.
