insensitive to phase distortion, the phase of the noisy signal is used when the phase is restored [5][6]. In the frequency domain, equation (5) is expressed as equation (6), where $Y_k$, $S_k$ and $N_k$ are the discrete Fourier transforms (DFT) of $y(i)$, $s(i)$ and $d(i)$, respectively.
Because $s(i)$ and $d(i)$ are independent and $N_k$ follows a Gaussian distribution, equation (6) is expressed as equation (7) in the frequency domain. For a frame of the speech signal, we obtain equation (8), where $n(k)$ is the statistical mean of unvoiced speech and $|S_k|^2$ is the squared amplitude of the enhanced speech.
However, basic SS can generate considerable musical noise in the residual noise. Several modified SS methods have been proposed to reduce this effect. A weighting factor and a power coefficient are introduced into SS, so equation (8) is modified as equation (9).
Modified SS reduces to basic SS when the power coefficient is 2 and the weighting factor is 1 [5]. Other modified SS methods are given in the related references [6][7]. Better enhancement performance can be gained by adjusting the two parameters suitably, but voice distortion becomes more severe as the degree of noise reduction increases.
5. Proposed VAD based on improved spectral subtraction
5.1. Improved spectral subtraction
In this paper, we propose iterative spectral subtraction to first reduce noise and enhance speech. This method applies basic SS or modified SS an appropriate number of times. The enhanced speech from one pass becomes the input signal of the next pass, so the musical noise is treated as input noise and reduced again.
5.2. Proposed VAD based on improved spectral subtraction
Building on the section above, we first apply spectral subtraction to the noisy CVR sound to reduce noise and enhance speech. The enhanced signal is then filtered by a pre-filter, and finally the cockpit voice is extracted by means of double-threshold VAD. Figure 2 shows the flow chart of the proposed VAD. The pre-filter is a high-pass filter, such as $1 - 0.9375 z^{-1}$, which can filter out low-frequency interference, especially interference at 50 Hz or 60 Hz, and boost the high-frequency spectrum that is useful for cockpit voice.
For a good VAD, two requirements must be considered together: to detect more speech sections and to detect more unvoiced speech sections. However, when a VAD tries to detect more speech frames, it may misjudge silence as speech; the reverse error, misjudging speech as silence, can also occur. The latter is even worse than the former for accident investigation. Therefore, two evaluation measures are used to quantify the performance of VAD: the probability of correctly detecting speech frames, $P_{cs}$, and the probability of correctly detecting noise frames, $P_{cn}$, which are expressed as
$P_{cs} = N_1 / N_1^{\mathrm{hand}}$, $P_{cn} = N_0 / N_0^{\mathrm{hand}}$
where $N_1^{\mathrm{hand}}$ and $N_0^{\mathrm{hand}}$ are respectively the overall numbers of hand-labeled speech frames and noise frames, and $N_1$ and $N_0$ are respectively the numbers of those frames detected correctly by VAD.
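The two measures defined above can be computed directly from per-frame labels. The following is a minimal sketch, assuming the hand labels and the VAD decisions are available as equal-length lists in which 1 marks a speech frame and 0 marks a noise frame. The function name `detection_rates` and this label encoding are illustrative assumptions, not part of the original method.

```python
def detection_rates(hand_labels, vad_labels):
    """Return (Pcs, Pcn) from per-frame labels (1 = speech, 0 = noise).

    hand_labels: reference labels produced by hand-labeling.
    vad_labels:  decisions produced by the VAD under test.
    """
    if len(hand_labels) != len(vad_labels):
        raise ValueError("label sequences must have the same length")

    # Overall numbers of hand-labeled speech and noise frames.
    n1_hand = sum(1 for h in hand_labels if h == 1)
    n0_hand = sum(1 for h in hand_labels if h == 0)
    if n1_hand == 0 or n0_hand == 0:
        raise ValueError("need at least one speech frame and one noise frame")

    # Frames of each class that the VAD labeled correctly.
    pairs = list(zip(hand_labels, vad_labels))
    n1 = sum(1 for h, v in pairs if h == 1 and v == 1)
    n0 = sum(1 for h, v in pairs if h == 0 and v == 0)

    return n1 / n1_hand, n0 / n0_hand


# Hypothetical labels for eight frames, for illustration only.
hand = [0, 0, 1, 1, 1, 1, 0, 0]
vad = [0, 1, 1, 1, 1, 0, 0, 0]
pcs, pcn = detection_rates(hand, vad)
print(f"Pcs = {pcs:.2f}, Pcn = {pcn:.2f}")
```

Reporting the two rates separately keeps the two error types visible: a detector that labels every frame as speech would reach a speech rate of 1 while its noise rate drops to 0.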
6.2. Experimental results
In this paper, a section of in-car speech (SNR = 8) from the standard Aurora2 speech database and a section of real cockpit sound are used. Simulation experiments are carried out with the traditional double-threshold VAD alone and with the proposed VAD. Figure 3 and Table 2 compare the performance of the methods in the different environments. Because spectral subtraction is applied first, the SNR increases, the STE and ZCR curves become smoother, and the probability of correct detection increases.
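The front end used in these experiments (iterative spectral subtraction, then the high-pass pre-filter, then the STE and ZCR features that feed the double-threshold decision) can be sketched as follows. This is a rough sketch under stated assumptions, not the authors' implementation. It assumes a subtraction rule of the form "magnitude to the power coefficient, minus the weighting factor times the mean noise spectrum, clamped at zero", a noise estimate recomputed on every pass from leading frames taken to be noise-only, and a naive DFT so that the block needs no external library. All function and parameter names are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of the time signal."""
    n = len(spectrum)
    return [
        (sum(c * cmath.exp(2j * math.pi * k * i / n)
             for k, c in enumerate(spectrum)) / n).real
        for i in range(n)
    ]


def iterative_ss(frames, noise_frames, passes=2, power=2.0, weight=1.0):
    """Apply spectral subtraction `passes` times (assumed rule, see text).

    The first `noise_frames` frames are assumed to contain noise only and
    are used to re-estimate the noise spectrum before every pass, so the
    residual noise of one pass is the input noise of the next.
    """
    for _ in range(passes):
        spectra = [dft(f) for f in frames]
        n_bins = len(spectra[0])
        # Mean of |N_k|^power over the assumed noise-only frames.
        noise = [
            sum(abs(s[k]) ** power for s in spectra[:noise_frames]) / noise_frames
            for k in range(n_bins)
        ]
        enhanced = []
        for spec in spectra:
            out = []
            for y, lam in zip(spec, noise):
                # Subtract the weighted noise estimate; clamp negatives to zero.
                mag = max(abs(y) ** power - weight * lam, 0.0) ** (1.0 / power)
                # Restore the signal with the phase of the noisy input.
                out.append(cmath.rect(mag, cmath.phase(y)))
            enhanced.append(idft(out))
        frames = enhanced
    return frames


def pre_emphasis(signal, coeff=0.9375):
    """High-pass pre-filter H(z) = 1 - coeff * z^-1."""
    return [signal[0]] + [
        signal[i] - coeff * signal[i - 1] for i in range(1, len(signal))
    ]


def short_time_energy(frame):
    """Sum of squared samples in one frame (STE)."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (ZCR)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Synthetic example: two noise-only frames followed by one "speech" frame.
N = 16
noise = [0.5 * math.cos(2 * math.pi * 2 * i / N) for i in range(N)]
speech = [math.cos(2 * math.pi * 5 * i / N) + d for i, d in enumerate(noise)]
enhanced = iterative_ss([noise, noise, speech], noise_frames=2)
for frame in enhanced:
    filtered = pre_emphasis(frame)
    print(short_time_energy(filtered), zero_crossing_rate(filtered))
```

In this synthetic case the noise is a stationary tone, so the subtraction removes it almost entirely. Real cockpit noise is not stationary, which is why residual musical noise remains after one pass and a further pass can help. The double-threshold decision itself is omitted here; it would compare the STE and ZCR of each frame against a high and a low threshold.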