Current Research: a perception study of speech produced by a person with a speech disorder

Participants are currently sought to complete speech perception tasks that involve listening to and transcribing disordered speech. Participants will be paid $10/hour for 3 sessions lasting 1-2 hours each. All three experimental sessions will take place at the Speech and Hearing Science Building. Session 2 will take place 1 week after Session 1, and Session 3 will take place 1 month after Session 2.


Potential participants must:
1) be native speakers of American English,
2) be between 18 and 40 years old,
3) have no identified language, learning, or cognitive disability,
4) have no more than incidental experience with persons having speech disorders,
5) have grown up in one of the following regions: central or southern Ohio, central Indiana, central or northern Illinois, Iowa, northern Missouri, Nebraska, or northern Kansas,
6) pass a hearing screening to verify normal hearing ability.


If interested, contact nanney2@illinois.edu.




Previous Research

Since completing my PhD in 2006, I have worked on the Universal Access Automatic Speech Recognition Project for Talkers with Dysarthria. ASR development has provided a useful human-computer interface, especially for people who have physical difficulty typing on a keyboard. However, individuals with a neuromotor disorder (e.g., cerebral palsy, traumatic brain injury) have not been able to benefit from these advances because their symptoms include a motor speech disorder, i.e., dysarthria. Dysarthria is characterized by distorted speech due to imprecise articulation of phonemes and monotonic or excessive variation in loudness and pitch. Although dysarthric speech can differ notably from normal speech, the articulation errors are generally not random. We therefore see an advantage in using ASR, even for speech that is highly unintelligible to human listeners. Whereas previous ASR systems for dysarthria have mostly targeted small vocabularies, we aim to develop large-vocabulary dysarthric ASR systems that will ultimately allow users to enter unlimited text into a computer.

Constructing a database with a variety of word categories is necessary for this research. So far we have recorded 16 speakers with cerebral palsy-associated dysarthria in both audio and video (Kim et al. 2008), and we have recently added recordings of the muscle and kinematic movements of dysarthric speech in addition to acoustic recordings (Kim et al. 2010). We expect our database to be a vital resource for improving ASR systems for people with neuromotor disabilities. Our research will continue to assess the effectiveness of various ASR algorithms for speakers of varying severity.
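
ASR effectiveness for a given speaker is conventionally summarized with word error rate (WER). Below is a minimal illustrative sketch, not code from the project, assuming reference and hypothesis transcripts are available as plain strings; the example sentences are hypothetical.

```python
# Minimal WER sketch: standard word-level edit distance between a reference
# transcript and an ASR hypothesis, divided by the reference length.

def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words and first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)

# Hypothetical example; more severe dysarthria would generally yield a higher WER.
print(word_error_rate("the boy went to the store", "the boy when to store"))
```

In this made-up example, one substitution and one deletion over six reference words give a WER of about 0.33.
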
I have also investigated the acoustic, articulatory, and perceptual characteristics of dysarthric speech. These studies include a frequency analysis of consonant articulation errors with respect to phonological categories and speaker intelligibility, a kinematic analysis of tongue movement control, and acoustic and perceptual analyses of vowels, lexical stress, and fricatives in dysarthria. The outcome of these studies will be to reveal the characteristics of articulation errors and to help clinicians base their decisions on objective data, maximizing the efficiency of clinical treatment.
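
As a rough illustration of the kind of tallying behind such a frequency analysis, the sketch below groups consonant errors by an assumed manner-of-articulation coding; the category labels, data format, and example pairs are hypothetical and do not reflect the actual study protocol.

```python
# Toy sketch: compare target consonants with transcribed productions and
# compute an error rate per (assumed) manner-of-articulation class.
from collections import Counter

# Illustrative coding only; a real analysis would cover the full consonant inventory.
MANNER = {"p": "stop", "t": "stop", "k": "stop",
          "s": "fricative", "f": "fricative",
          "m": "nasal", "n": "nasal"}

def error_rates_by_manner(pairs):
    """pairs: iterable of (target consonant, transcribed consonant)."""
    errors, totals = Counter(), Counter()
    for target, produced in pairs:
        manner = MANNER.get(target, "other")
        totals[manner] += 1
        if produced != target:
            errors[manner] += 1
    return {m: errors[m] / totals[m] for m in totals}

# Hypothetical transcription data.
print(error_rates_by_manner([("p", "b"), ("t", "t"), ("s", "t"), ("s", "s")]))
# -> {'stop': 0.5, 'fricative': 0.5}
```
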



I have also been participating in a project on generating articulatory feature-level transcriptions of speech. Goals for this project include 1) training pronunciation models separately from acoustic models, and 2) studying asynchrony among articulators' movements and reduction effects in spontaneous speech. Articulatory feature-based transcriptions will serve to increase word recognition accuracy in automatic speech recognition systems, compared to systems based on acoustic features only. In addition, our transcriptions will contribute to a better understanding of coarticulation and reduction phenomena in speech production. To achieve these goals, I have been focusing in particular on 1) training transcribers, 2) transcribing additional data (Switchboard), 3) updating the transcription protocol and interface, and 4) making a version of the interface that can be used online across sites.
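
To give a concrete sense of what a feature-level transcription is, here is a toy sketch: each articulator gets its own tier of timed intervals, so tiers can change value asynchronously rather than all switching at phone boundaries. The tier names, feature values, and timings are illustrative assumptions, not the project's actual transcription protocol.

```python
# Toy articulatory feature-level transcription: one tier per articulator,
# each tier a list of (start, end, value) intervals in seconds.
from dataclasses import dataclass

@dataclass
class Interval:
    start: float
    end: float
    value: str

# Hypothetical word "seem": the velum may lower (nasalization) before the lips
# close for the final /m/, which a single phone string cannot represent.
transcription = {
    "lip-opening": [Interval(0.00, 0.32, "open"), Interval(0.32, 0.45, "closed")],
    "tongue-tip":  [Interval(0.00, 0.12, "critical"), Interval(0.12, 0.45, "mid")],
    "velum":       [Interval(0.00, 0.25, "closed"), Interval(0.25, 0.45, "open")],
    "voicing":     [Interval(0.00, 0.12, "voiceless"), Interval(0.12, 0.45, "voiced")],
}

def value_at(tier, t):
    """Return a tier's feature value at time t, or None if t is out of range."""
    for iv in tier:
        if iv.start <= t < iv.end:
            return iv.value
    return None

# At 0.28 s the velum has already opened while the lips are still open.
print(value_at(transcription["velum"], 0.28), value_at(transcription["lip-opening"], 0.28))
```

In this toy example the velum tier changes before the lip tier does, which is the kind of inter-articulator asynchrony such transcriptions are meant to capture.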