T10: Practical Speech User Interface Design for Conversational Systems

Half Day Tutorial

James R. Lewis
IBM Software Group, USA


This course is taught by the author of "Practical Speech User Interface Design". Its objective is to provide a basic foundation in current leading practices (many of which are not intuitive) in speech user interface design for interactive voice response (IVR) applications. Although speech is the most natural form of communication between humans, most people find using speech to communicate with machines anything but natural. Drawing from psychology, human-computer interaction, linguistics, and communication theory, this course will provide a comprehensive yet concise survey of practical speech user interface (SUI) design, including practice-based and research-based guidance on how to design effective, efficient, and pleasant speech applications that people can really use. The techniques for designing usable SUIs are not obvious and, to be effective, must be informed by a combination of critically interpreted scientific research and leading design practices.

Students will learn about the foundations of SUI design (technologies and key concepts in linguistics and communication), important overall aspects of SUI design, how to get started (high-level design decisions: barge-in, speech output methods, speech recognition methods, prompting styles, help styles, role of call center agents), and specific aspects of design (low-level design decisions: creating introductions, avoiding poor practices, timing issues, dialog design, confirming input). They will also participate in after-lecture exercises to try out new skills.

Content and Benefits:

The course will begin with an introduction to speech user interface design fundamentals, including speech technologies and key issues from psycholinguistics and conversational pragmatics. The next goal is to provide background in self-service technologies and the associated market and psychological research, which provides an additional foundation for IVR design decisions. Design decisions for speech-enabled IVR fall into two groups: high-level and low-level. Important high-level decisions concern barge-in methods, use of recorded prompts versus synthesized speech (and when to combine them), simple versus complex speech recognition, concise versus verbose prompting styles, use or non-use of global navigation commands, how to provide help, and the role of call center agents. Low-level (detailed) design topics will include creating introductions (and avoiding poor practices in introductions), timing issues, dialog design, designing effective menus and prompts, and confirming user input. After this material has been covered, attendees will participate in class exercises in crafting introductions, designing menus, and conducting informal Wizard-of-Oz (WOZ) evaluations.

  1. Introduction
    1. Speech technologies
    2. Key concepts in human language and communication
  2. Self-service technologies
    1. Satisfaction with and adoption of self-service technologies
    2. Waiting for service
    3. Service recovery
    4. Consequences of forced use of self-service technologies
  3. Getting started: High-level design decisions
    1. Choosing the barge-in style
    2. Selecting recorded prompts or synthesized speech
    3. Simple versus complex speech recognition
    4. Concise versus verbose prompting styles
    5. Speech versus speech plus touchtone
    6. Global navigation commands
    7. Help mode versus self-revealing help
    8. Role of human agents in a deployed system
  4. Getting specific: Low-level design decisions
    1. Creating introductions
    2. Avoiding poor practices in introductions
    3. Getting the right timing
    4. Designing dialogs
    5. Constructing appropriate menus and prompts
    6. Confirming user input
  5. Classroom exercises
    1. Design an introduction
    2. Design a menu
    3. Conduct a WOZ evaluation
  6. Wrapping up

Target Audience:

Attendees are not expected to have a background in speech recognition or the design of IVR applications. That said, there has been enough new research over the past 10 years that people with extensive experience in IVR design will likely find new information to inform their design practices.

Bio Sketch of Presenter:

Jim Lewis

James R. (Jim) Lewis is currently a senior human factors engineer at IBM, where he has worked since 1981, with a primary focus on the design and evaluation of user interfaces (graphical, spoken, mobile). He is a Certified Human Factors Professional with a Ph.D. in Experimental Psychology (Psycholinguistics). Jim is an internationally recognized expert in usability testing and measurement. He was the technical team lead for the human factors/usability group working in IBM speech product development from 1999 through 2005, and has experience in all areas of speech system usability (including desktop systems, embedded systems, text-to-speech systems, speech interactive voice response applications, and natural language understanding technologies). Before that, he was the lead user experience designer for a number of mobile products, including the Simon, the product now widely regarded as the first smartphone. He is the author of the books "Practical Speech User Interface Design" (2011) and "Quantifying the User Experience: Practical Statistics for User Research" (2012).

