Welcome to the home page for

Modifying Speech to Improve Intelligibility in Noise

Departments of Electrical and Computer Engineering and Communication Sciences and Disorders

University of Pittsburgh

 

Stimulus changes in a speech signal represent transitions around consonants, between vowels and consonants, and within vowels. We believe that these transitions may be important acoustic cues for speech intelligibility, but they have relatively low energy and are easily obscured by noise. This research project examines how amplifying the transition component of speech affects the intelligibility of speech in noise.

 

We assume that speech is the sum of two components: S(t) = S_QSS(t) + S_tran(t),

where S_QSS(t) is a quasi-steady-state component that includes the energy in vowels and hubs of consonants, and S_tran(t) is a transient component that captures the energy of transitions between vowels and consonants and within vowels. We assume that S_tran(t) represents edges in the time-frequency domain of the speech signal, and we have used time-frequency methods of analysis to develop algorithms to identify it.
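
As a rough illustration of the additive model only, the sketch below splits a synthetic signal into a smooth part and a residual, so that the two parts sum back to the input. The moving-average smoother is a stand-in chosen for brevity; it is not the time-frequency method used in this project. The function name toy_decompose, the window length, and the test signal are all assumptions made for the example.

```python
import numpy as np

def toy_decompose(s, win=9):
    """Split s into a smooth part and a residual with s == s_qss + s_tran.

    Hypothetical helper: a moving average stands in for the
    quasi-steady-state estimate; it is NOT the project's algorithm.
    """
    if win % 2 == 0:
        raise ValueError("win must be odd so the output matches the input length")
    s = np.asarray(s, dtype=float)
    half = win // 2
    # Repeat the edge samples so the average is well defined at both ends.
    padded = np.pad(s, half, mode="edge")
    kernel = np.ones(win) / win
    s_qss = np.convolve(padded, kernel, mode="valid")  # smooth part
    s_tran = s - s_qss                                 # residual part
    return s_qss, s_tran

# Synthetic signal: a slow oscillation with an abrupt step at sample 100.
t = np.arange(200)
s = np.where(t < 100, 0.0, 1.0) + 0.1 * np.sin(2 * np.pi * t / 50)

s_qss, s_tran = toy_decompose(s)
peak = int(np.argmax(np.abs(s_tran)))  # residual is largest near the step
```

In this toy case the residual is concentrated around the abrupt change and carries far less energy than the smooth part, which mirrors the low-energy character of transitions described above.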

 

Modified speech is created by identifying the transient component, selectively amplifying it, and combining it with the original speech.
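
The amplify-and-combine step can be sketched as follows, assuming the transient component has already been identified by some other means. The function name modify_speech, the default gain, and the optional rescaling to the original signal energy are assumptions made for this example, not details taken from the project.

```python
import numpy as np

def modify_speech(speech, transient, gain=4.0, match_energy=True):
    """Add an amplified transient component to the original speech.

    Hypothetical helper: `gain` and `match_energy` are illustrative
    parameters, not values reported by the project.
    """
    speech = np.asarray(speech, dtype=float)
    transient = np.asarray(transient, dtype=float)
    if speech.shape != transient.shape:
        raise ValueError("speech and transient must have the same shape")

    # Combine the original speech with the amplified transient.
    modified = speech + gain * transient

    if match_energy:
        # Assumption: rescale so the result has the same total energy as
        # the input, which keeps loudness comparable between the two.
        e_in = np.sum(speech ** 2)
        e_out = np.sum(modified ** 2)
        if e_out > 0:
            modified = modified * np.sqrt(e_in / e_out)
    return modified

# Tiny made-up arrays standing in for a speech frame and its transient.
speech = np.array([0.0, 1.0, 0.5, -0.5, -1.0, 0.0])
transient = np.array([0.0, 0.2, -0.1, 0.1, -0.2, 0.0])

modified = modify_speech(speech, transient, gain=4.0)
raw = modify_speech(speech, transient, gain=4.0, match_energy=False)
```

Rescaling to equal energy is one way to make sure that any change in intelligibility comes from the emphasis on transitions rather than from an overall increase in level.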

 

            Transient component obtained using time-varying adaptive filters  (Yoo et al., Proc. ICASSP-2005, pp. I69-I72, Philadelphia, PA)

 

            Example of modified speech in noise (Yoo’s transient)

 

            Transient component obtained using HMMs of speech transients (Tantibundhit et al., Proc. ICASSP-2006, pp. I-833-836, Toulouse, France)

           

           Transient component obtained using a wavelet-based approach (Rasetshwane et al., 2006 IEEE EMBS Ann Conf, pp. 1727-1730, New York)

 

            Participants

 

Supported by Grant N000140310277 from the Office of Naval Research