Disney Research



This paper examines the extent to which computer speech recognition errors for children’s speech can be attributed to common phonological effects associated with language acquisition. Recognition results are presented for three corpora of children’s speech, two comprising recordings of American English spoken by five- to nine-year-olds and one comprising recordings of British English speech from children aged five and six. The results are compared with adult reference confusion matrices based on TIMIT for the first two experiments, and with confusion matrices for British adults and for children with good speech for the third. The results appear to be influenced by three factors: (i) confusions that are predictable from phonological factors associated with language acquisition also arise from acoustic confusability (e.g. /k/ → /t/), (ii) the frequency of the phonological errors is expected to decrease with increasing age, and (iii) an accurate recogniser is more likely than a less accurate one to detect a phonological error when it occurs. Overall, the percentage of errors attributable to phonological processes remains approximately constant in each experiment. However, the proportion of these errors that differ significantly from reference patterns increases with recognition accuracy and is greater for children who are judged to have poor speech.

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.