Noise Subspace Fuzzy C-means Clustering for Robust Speech Re(7)
Date: 2025-07-11
Abstract. In this paper a fuzzy C-means (FCM) based approach for speech/non-speech discrimination is developed to build an effective voice activity detection (VAD) algorithm. The proposed VAD method is based on a soft-decision clustering approach built ove
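As a rough illustration of the soft-decision clustering idea mentioned in the abstract, the following is a minimal sketch of the textbook fuzzy C-means iteration in NumPy. It is not the authors' FCM-VAD: the function name fuzzy_c_means, the parameter names (fuzziness, n_iter, tol, seed) and the synthetic 4-dimensional "frames" are assumptions made for this example only, and they are not mapped to the paper's symbols m, K and C.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters=2, fuzziness=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means (illustrative sketch, not the paper's VAD).

    X has shape (n_samples, n_features). Returns (centers, U), where
    U[i, k] is the soft membership of sample i in cluster k.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** fuzziness
        # Each centroid is the membership-weighted mean of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared sample-to-centroid distances, floored to avoid division by zero.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)
        # Membership update: u_ik proportional to d_ik^(-2 / (fuzziness - 1)).
        inv = d2 ** (-1.0 / (fuzziness - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

# Hypothetical toy data: two well-separated groups standing in for
# noise-like and speech-like feature vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (50, 4)), rng.normal(5.0, 0.5, (50, 4))])
centers, U = fuzzy_c_means(X)
labels = U.argmax(axis=1)  # hardened only for inspection; U itself stays soft
```

The memberships in U are graded rather than binary, which is what makes a soft decision rule possible; how the paper turns such memberships into a speech/non-speech decision is not recoverable from this excerpt.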
[Figure 3: plot of pause hit rate (HR0) against false alarm rate (FAR0). Curves: G.729, AMR1, AMR2, AFE (Noise Est.), AFE (frame-dropping), Li, Marzinzik, Sohn, Woo, FCM-VAD.]

Fig. 3. ROC curves of the proposed FCM-VAD in highly noisy conditions for m = 8, K = 32 and C = 2, and comparison to standard and recently reported VADs

Conclusions

A new VAD for improving speech detection robustness in noisy environments is proposed. The proposed FCM-VAD is based on noise modeling using FCM clustering and benefits from long-term information for the formulation of a soft decision rule. The proposed FCM-VAD outperformed Sohn's VAD, which defines the LRT on a single observation, and other methods including the standardized G.729, AMR and AFE VADs, in addition to recently reported VADs. The VAD detects word beginnings in advance and word endings with a delay, which in part avoids having to include additional hangover schemes or noise reduction blocks.

Discrimination analysis and ROC curves are effective for evaluating a given algorithm, and the influence of the VAD in a speech recognition system depends on its discrimination accuracy [12]. Thus the proposed VAD improves the recognition rate when it is used as part of an Automated Speech Recognition (ASR) system.

Acknowledgements

This work has received research funding from the EU 6th Framework Programme, under contract number IST-2002-507943 (HIWIRE, Human Input that Works in Real Environments) and SESIBONN and SR3-VoIP projects (TEC200406096-C03-00, TEC2004-03829/TCM) from the Spanish government. The views expressed here are those of the authors only. The Community is not liable for any use that may be made of the information contained therein.

References

1. ETSI, Voice Activity Detector (VAD) for Adaptive Multi-Rate (AMR) Speech Traffic Channels, 1999, ETSI EN 301 708 Recommendation.