Modifying Speech to Improve Intelligibility in Noise
Departments of Electrical and Computer Engineering and Communication Sciences and Disorders
Stimulus changes in a speech signal represent transitions around consonants,
between vowels and consonants, and within vowels. We believe that these
transitions may be important acoustic cues for speech intelligibility, but they
have relatively low energy and are easily obscured by noise. This research
project examines how amplifying the transition component of speech affects the
intelligibility of speech in noise.
We assume that speech is the sum of two components:
S(t) = S_QSS(t) + S_tran(t),
where S_QSS(t) is a quasi-steady-state component that includes the energy in
vowels and hubs of consonants, and S_tran(t) is a transient component that
captures energy of transitions between vowels and consonants and within vowels.
We assume that S_tran(t) represents edges in the time-frequency domain of the
speech signal, and we have used time-frequency methods of analysis to develop
algorithms to identify it.
Modified speech is created by identifying the transient component, selectively
amplifying it, and combining it with the original speech.
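The amplify-and-combine step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the transient component has already been extracted by some separate method and is supplied as an array aligned sample by sample with the speech. The function names `rms` and `modify_speech`, the default `gain` value, and the rescaling of the result to the RMS level of the original speech are all assumptions made for the example.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def modify_speech(speech, transient, gain=4.0):
    """Amplify the transient component and add it back to the original speech.

    speech    : 1-D array, original speech S(t)
    transient : 1-D array, transient component S_tran(t), same length as speech
    gain      : amplification applied to the transient (hypothetical default)
    """
    speech = np.asarray(speech, dtype=float)
    transient = np.asarray(transient, dtype=float)
    if speech.shape != transient.shape:
        raise ValueError("speech and transient must have the same shape")

    # Combine the original speech with the amplified transient component.
    modified = speech + gain * transient

    # Assumption: rescale so the modified speech has the same RMS level as
    # the original, so that any comparison is not just a loudness change.
    level = rms(modified)
    if level > 0.0:
        modified *= rms(speech) / level
    return modified


if __name__ == "__main__":
    # Synthetic stand-in signals: a steady tone plus a short low-energy burst.
    n = 1000
    steady = np.sin(2 * np.pi * 5 * np.arange(n) / n)
    burst = np.zeros(n)
    burst[500:510] = 0.1
    enhanced = modify_speech(steady + burst, burst, gain=4.0)
    print(enhanced.shape)
```

Because the original speech already contains the transient, adding `gain * transient` raises the transient's weight to `1 + gain` relative to the quasi-steady-state component before the level is rescaled.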
Transient component obtained using time-varying adaptive filters (Yoo et al., Proc. ICASSP-2005, pp. I69-I72,
Example of modified speech in noise (Yoo’s transient)
Transient component obtained using HMMs of speech transients (Tantibundhit et al., ICASSP-2006, pp. I-833-836,
Transient component obtained using a wavelet-based approach (Rasetshwane et al., 2006 IEEE EMBS Ann Conf, pp. 1727-1730, New York)
Supported by Grant N000140310277 from the Office of Naval Research