Communications Project

Document Type: Master's Thesis
Name: Hope L. Doe
Title: Evaluating the Effects of Automatic Speech Recognition Word Accuracy
Degree: Master of Science
Department: Industrial and Systems Engineering
Committee Chair: Dr. Brian M. Kleiner
Committee Members: Dr. Andrew W. Gellatly, Dr. Robert C. Williges
Keywords: Automatic speech recognition, word accuracy, user satisfaction
Date of defense: July 10, 1998
Availability: Release the entire work for Virginia Tech access only. After one year, release worldwide only with written permission of the student and the advisory committee chair.


Automatic speech recognition (ASR) research has focused primarily on large-scale systems and industry, while other areas that require attention are often overlooked by researchers. For this reason, this research examined automatic speech recognition at the consumer level. Many individual consumers will purchase and use automatic speech recognition software for a different purpose than military or commercial users, such as the telecommunications industry. Consumers who purchase the software for personal use will mainly use ASR to dictate correspondence and documents. Two ASR dictation software packages were used to conduct the study. The research examined the relationships between (1) speech recognition software training and word accuracy, (2) error-correction time by the user and word accuracy, and (3) correspondence type and word accuracy. The correspondence types evaluated resembled Personal, Business, and Technical Correspondence. Word accuracy was assessed after initial system training, after five minutes of error-correction time, and after ten minutes of error-correction time.

Results indicated that the word recognition accuracy achieved affects user satisfaction. Word accuracy also improved with increased error-correction time. Additionally, Personal Correspondence achieved the highest mean word accuracy for both systems, and Dragon Systems achieved the highest mean word accuracy for the correspondence types explored in this research. Results are discussed in terms of subjective and objective measures and the advantages and disadvantages of speech input, and design recommendations are provided.

List of Attached Files


At the author's request, all materials (PDF files, images, etc.) associated with this ETD are accessible from the Virginia Tech network only.

The author grants to Virginia Tech or its agents the right to archive and display their thesis or dissertation in whole or in part in the University Libraries in all forms of media, now or hereafter known. The author retains all proprietary rights, such as patent rights. The author also retains the right to use in future works (such as articles or books) all or part of this thesis or dissertation.