Audio-Visual prosody of social attitudes in Vietnamese: building and evaluating a tones balanced corpus
INTERSPEECH 2009, pages 2263--2266 - September 2009
This paper presents the construction and a first evaluation of a tone-balanced audio-visual corpus of social affect in Vietnamese. This under-resourced tonal language has specific glottalization and co-articulation phenomena, whose interactions with the prosody of attitudes are of particular interest. A well-controlled recording methodology was designed to build a large, representative audio-visual corpus covering 16 attitudes produced by one speaker. A perception experiment was carried out to evaluate the speaker's perceived performance and to study the role and integration of audio, visual, and audio-visual information in listeners' perception of the speaker's attitudes. The results reveal characteristics of Vietnamese prosodic attitudes and allow us to investigate such social affect in Vietnamese.
BibTeX reference
@InternationalConference{MARC09,
author = {MAC, D. and AUBERGE, V. and RILLIARD, A. and CASTELLI, E.},
title = {Audio-Visual prosody of social attitudes in Vietnamese: building and evaluating a tones balanced corpus},
booktitle = {INTERSPEECH 2009},
pages = {2263--2266},
month = {September},
year = {2009},
organization = {International Speech Communication Association (ISCA)},
address = {Brighton, U.K.},
keywords = {Audio-visual corpus, expressive speech, attitudes, perception, Vietnamese},
url = {/2009/MARC09},
}