Voice Application Design (SAP Library - Developing Voice-Enabled Applications)

Voice Application Design

This document provides an overview of typical roles in the voice-development environment, voice-application design concerns, strategies for fulfilling business requirements, design-effort trade-offs, and design basics for voice applications.

Roles

You can define roles as part of your voice-project management in development. Some typical voice-application roles are:

● VUI Designer

● VUI Implementer

● Voice Talent

● Quality Assurance Responsible

● R/3 or Web Service Expert

● Business Process Expert (BPX)

Design Concerns

These are the major voice-application design concerns:

● Objective

What is the voice application supposed to accomplish?

● Audience analysis

What are your target caller groups? Which callers do you expect?

● Content management

What content, structures, and hierarchies do you need to organize?

● Navigation architecture

How do you want the caller to move through the application? What choices will you offer the caller?

● Dialog flow and design

What voice personality, prompts, grammars, error messages, or help will you use in the application?

Note

Grammars define the spoken words or touch-tone input (from the caller) that the system can recognize. The NetWeaver Voice resources include a grammar that you can download during installation. You can also integrate third-party grammars into your voice-application model.
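As a rough illustration of what a grammar does, the following Python sketch maps caller input (a spoken phrase or a touch-tone digit) to an application action. It does not show the NetWeaver Voice grammar format; the names MENU_GRAMMAR and interpret, and the phrases and keys, are assumptions made up for this example.

```python
from typing import Optional

# Hypothetical grammar: each action lists the spoken phrases and the
# touch-tone (DTMF) key that the system accepts for it.
MENU_GRAMMAR = {
    "create_order": {"phrases": {"create an order", "new order"}, "dtmf": "1"},
    "order_status": {"phrases": {"check the order status", "order status"}, "dtmf": "2"},
    "my_account": {"phrases": {"check my account", "my account"}, "dtmf": "3"},
}


def interpret(caller_input: str) -> Optional[str]:
    """Return the action matched by the input, or None if it is out of grammar."""
    normalized = caller_input.strip().lower()
    for action, rule in MENU_GRAMMAR.items():
        if normalized in rule["phrases"] or normalized == rule["dtmf"]:
            return action
    return None
```

Input that matches no rule returns None, which is the point at which an application would typically play an error message or a help prompt.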

Business Requirements

To fulfill the design objective, you need to gather the business requirements. Here are some suggestions for various situations:

● If you are replacing an existing phone service, call this service and document all the features you find in the application.

If you have access to design specifications of the old service, use them, but note that design specifications often lose accuracy over time.

● Get a phone tap.

This will help you to record any existing system for easy analysis. It will also help you to document bugs later in the process.

● During requirements-gathering sessions in a voice project, be prepared to discuss prompts and system responses.

Customers often concentrate on what the system will say.

● In bank applications, you need to know whether the bank will allow users to transfer funds.

If transfers are allowed, it is important to know the exact conditions under which the bank will permit funds to be transferred. Consider possible causes of failure early.

Design Effort

The design effort for voice applications depends in part on the technology you want to implement. Here are comparisons and suggestions, considering cost and quality:

● Speech recognition versus DTMF (touch-tone):

○ DTMF is cheaper.

○ Speech recognition is more powerful (for example, Amtrak or United Airlines).

○ Speech recognition systems usually have better customer satisfaction.

● Text To Speech (TTS) versus recorded prompts:

○ TTS is cheaper.

○ TTS is more flexible and more easily handles dynamic data.

○ TTS is the only practical alternative for handling extremely unpredictable data, such as is encountered when reading e-mails.

○ Recorded prompts sound better.

○ Recorded prompts are easier to understand and less tiring to listen to.

● Computer Telephony Integration (CTI):

CTI is necessary if you have call-center agents. Properly incorporating CTI in the voice application is key to a good customer experience.

Example

Avoid situations in which callers must provide customer data first to the system and then again to the operator.
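One way to picture this handoff is the sketch below, in which the data that the voice application has already collected travels with the call to the agent. This is only an illustration under assumptions: CallContext, build_agent_handoff, and the field names are hypothetical and do not correspond to any CTI product interface.

```python
from dataclasses import dataclass, field


@dataclass
class CallContext:
    """Data the voice application has already collected from the caller."""
    caller_id: str
    collected: dict = field(default_factory=dict)


def build_agent_handoff(context: CallContext, required_fields: list) -> dict:
    """Build the payload shown to the agent when the call is transferred.

    Fields the caller already provided are passed along; only the missing
    ones are listed for the agent to ask about.
    """
    known = {k: v for k, v in context.collected.items() if k in required_fields}
    missing = [name for name in required_fields if name not in context.collected]
    return {"caller_id": context.caller_id, "known": known, "ask_caller_for": missing}
```

With a payload like this, the agent asks only for the items under ask_caller_for instead of repeating questions that the system has already asked.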

Design Basics

Here are some Voice User Interface (VUI) design basics to consider:

● Serial nature of communication modality

○ More information means a longer time to convey information.

○ A VUI cannot be scanned quickly like a GUI.

● VUIs tax memory.

● Natural language increases usability, but also the cost of development.

Here are some VUI-design dos and don'ts:

● Limit the length of lists to no more than five items.

● All elements in a list should be logically related to each other.

● Group elements within a list logically.

● Be consistent with parts of speech inside a list.

● Prompting:

○ Do not prompt the caller as follows:

You can say "create an order", "order status", or "my account".

○ Do prompt the caller as follows:

You can "create an order", "check the order status", or "check my account".

● Fit the register to the brand of the company and the service that you are providing.

Formality is one prominent aspect of register.

● Language usage:

○ Use contractions and conversational phrases, such as don’t (instead of do not), I’m gonna, go ahead and <…>, if you wanna <…>.

○ Use natural-sounding dialog to put the caller at ease.

○ Do not use overly formal or bureaucratic language.

○ Adapt the language to fit the company image and service to be provided.

○ Do not say Your call is very important to us.

○ Do not say Please listen carefully as our menu options have changed.

● Importance of first and last elements:

○ The elements should be presented with the most frequently chosen option first.

○ The first element in the list sets the user’s expectations for the rest of the list (priming).

○ The most recent element in a list is the one remembered best (recency).

○ Users may not listen to the last items once they have made a choice.

● Designing for specific dialog strategies with speech recognition systems:

○ When designing menus, choose items that are not too similar acoustically.

This may be difficult because what sounds the same to a human may not be what sounds the same to a computer.

○ In general, longer words are recognized better.

They are more acoustically distinct.

○ Certain information is not conveyed or not conveyed well over a phone line.

○ Good menu styles (pick one and be consistent):

■ You can say "check my account balance", "report a problem", or "talk to an operator".

■ You can say "fund transfer", "account balance", or "operator".

● Combining designs (using both DTMF and speech):

○ Use DTMF as a fallback for speech when possible.

○ Either prompt for DTMF right away or offer it only on the reprompt.

○ Provide expert users with DTMF shortcuts.

○ Be consistent with mapping DTMF commands to menus.

For example, use: You can say "fund transfer" or press 1, "account balance" or press 2, or "operator" or press 3. This mapping remains predictable for an expert user even if you do not play the DTMF choices, because the keys are assigned sequentially.

● VUI Design book recommendations:

○ Voice User Interface Design by Michael H. Cohen, James P. Giangola, Jennifer Balogh.

○ Voice Interaction Design: Crafting the New Conversational Speech Systems by Randy Allen Harris.

○ The Art and Business of Speech Recognition: Creating the Noble Voice by Blade Kotelly.
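Several of the guidelines above (at most five list items, the most frequently chosen option first, and sequential DTMF keys) can be combined in a small prompt builder. The following Python sketch is one possible illustration; build_menu, build_prompt, and the usage counts are assumptions invented for the example.

```python
MAX_MENU_ITEMS = 5  # list-length limit suggested in the guidelines above


def build_menu(options: dict) -> list:
    """Order options by how often callers choose them and assign DTMF keys.

    `options` maps the phrase a caller can say to a usage count
    (hypothetical data). Returns (phrase, dtmf_key) pairs.
    """
    if len(options) > MAX_MENU_ITEMS:
        raise ValueError(f"menu has {len(options)} items; limit is {MAX_MENU_ITEMS}")
    # Most frequently chosen option first, so keys follow the spoken order.
    ordered = sorted(options, key=options.get, reverse=True)
    return [(phrase, str(key)) for key, phrase in enumerate(ordered, start=1)]


def build_prompt(menu: list, include_dtmf: bool = True) -> str:
    """Render the menu as a spoken prompt, optionally announcing the keys."""
    parts = []
    for phrase, key in menu:
        parts.append(f'"{phrase}" or press {key}' if include_dtmf else f'"{phrase}"')
    if len(parts) > 1:
        parts[-1] = "or " + parts[-1]
    return "You can say " + ", ".join(parts) + "."
```

Calling build_menu with the counts {"account balance": 300, "fund transfer": 500, "operator": 100} places "fund transfer" on key 1, "account balance" on key 2, and "operator" on key 3. Passing include_dtmf=False to build_prompt yields the speech-only prompt, which can be played first when DTMF is offered only on the reprompt.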

Error-Condition Considerations

Unlike in most software development, adding new prompts for error conditions is not trivial if your voice application uses recorded audio. At the same time, business owners often cannot identify in advance all the error conditions that you might need to report to the user.

Note

You may want to budget at least two separate recording sessions: one after the sign-off of the design document and one at the end of final bug testing. Any modifications to the system between those times can be handled by TTS or by recording temporary prompts in your own voice.
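The fallback described in this note could look roughly like the following Python sketch: a prompt is played from a recording when one exists and is otherwise rendered by TTS. The file layout (one .wav file per prompt ID) and the names resolve_prompt and prompts_to_record are assumptions for illustration only.

```python
from pathlib import Path


def resolve_prompt(prompt_id: str, text: str, recordings_dir: Path) -> dict:
    """Prefer a recorded prompt; fall back to TTS when no recording exists."""
    recording = recordings_dir / f"{prompt_id}.wav"  # assumed naming scheme
    if recording.is_file():
        return {"type": "audio", "source": str(recording)}
    return {"type": "tts", "text": text}


def prompts_to_record(prompts: dict, recordings_dir: Path) -> list:
    """List the prompt IDs that still rely on TTS, sorted alphabetically."""
    return sorted(
        prompt_id
        for prompt_id, text in prompts.items()
        if resolve_prompt(prompt_id, text, recordings_dir)["type"] == "tts"
    )
```

The list returned by prompts_to_record can serve as the script for the second recording session, since it contains exactly the prompts that were added or changed after the first one.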