Web Survey Bibliography

Title Modeling Survey Respondents' Speech to Improve Speech Survey Interface
Year 2004
Access date 14.06.2004
Abstract Computer-based telephone interviewing systems of the future, with speech interfaces that recognize respondents' unconstrained speech, could exploit aspects of the respondents' speech to provide clarification only when it is needed. Respondents have been shown to subtly indicate their need for clarification with pauses and various speech disfluencies (Bloom & Schober, 2000); more generally, speakers of different ages have been shown to have different rates of speech and disfluency (Bortfeld et al., 2001). The current study tests whether models of different types of speakers ("stereotypes") could effectively identify and correct conceptual misalignment. The study uses a Wizard-of-Oz paradigm to simulate an advanced speech interviewing system; telephone respondents answer survey questions about facts and behaviors on the basis of fictional scenarios, so that response accuracy can be measured. The study contrasts interfaces that do not model respondents, and interfaces that rely on respondents to solicit clarification, with two kinds that model respondents to provide unsolicited clarification. One interface uses a generic, non-stereotyped respondent model to identify and correct cases of possible conceptual misalignment, and another uses a stereotyped respondent model to identify and correct respondents according to age. 120 respondents (60 aged 18-35, and 60 aged 65-80) answer survey questions presented in a synthesized voice by telephone. In the two modeling conditions, responses are monitored for indications of uncertainty, such as disfluent speech and inactivity, which serve as the basis of the respondent models. Scripted definitions are offered when disfluent responses appear to warrant further clarification. In the stereotyped condition, disfluent responses of the older group are treated differently from those produced by the younger group, allowing, for example, a longer average latency before deciding to help the respondent. 
Findings show when respondent modeling helps improve response accuracy in speech interfaces, and how modeling respondents according to group membership can aid detection of conceptual misalignment relative to modeling generic respondents.
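The monitoring rule described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the threshold values, the filler-word list, and the names `Response`, `is_disfluent`, and `should_offer_clarification` are all hypothetical, and the rule for combining disfluency with latency is an assumption, since the abstract reports no numeric parameters.

```python
from dataclasses import dataclass

# Hypothetical latency thresholds in seconds; the abstract gives no values.
# The stereotyped model tolerates a longer silence from older respondents
# before treating it as a sign that clarification is needed.
LATENCY_THRESHOLD_S = {"generic": 2.0, "younger": 2.0, "older": 3.5}

# Hypothetical list of filler words treated as disfluency markers.
FILLERS = {"um", "uh", "er", "hmm"}


@dataclass
class Response:
    transcript: str   # recognized text of the respondent's answer
    latency_s: float  # silence before the respondent began answering


def is_disfluent(transcript: str) -> bool:
    """Return True if the transcript contains any filler word."""
    words = [w.strip(".,?!").lower() for w in transcript.split()]
    return any(w in FILLERS for w in words)


def should_offer_clarification(response: Response, group: str = "generic") -> bool:
    """Assumed rule: offer a scripted definition when the answer is
    disfluent or the latency exceeds the threshold for the group."""
    threshold = LATENCY_THRESHOLD_S[group]
    return is_disfluent(response.transcript) or response.latency_s > threshold
```

With these assumed values, a fluent answer given after 3.0 seconds of silence would trigger clarification under the generic model but not under the stereotyped model for an older respondent.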
Year of publication 2004
Bibliographic type Conferences, workshops, tutorials, presentations
