Noise Subspace Fuzzy C-means Clustering for Robust Speech Re(7)

Date: 2025-07-11

Abstract. In this paper a fuzzy C-means (FCM) based approach for speech/non-speech discrimination is developed to build an effective voice activity detection (VAD) algorithm. The proposed VAD method is based on a soft-decision clustering approach built ove
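For readers unfamiliar with the clustering step named in the abstract, the following is a minimal sketch of generic, textbook fuzzy C-means in plain Python. It is not the authors' VAD: the function name, its parameters (n_clusters, fuzzifier, max_iter, tol, seed) and the toy data are illustrative assumptions, and the paper's feature extraction and decision rule are not reproduced here.

```python
import random


def fuzzy_c_means(points, n_clusters=2, fuzzifier=2.0, max_iter=100,
                  tol=1e-6, seed=0):
    """Textbook fuzzy C-means on a list of equal-length feature vectors.

    Returns (centers, memberships); memberships[i][j] is the degree to
    which points[i] belongs to cluster j, and each row sums to 1.
    All names here are hypothetical, chosen for illustration only.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalised so that each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(max_iter):
        # Centre update: mean of the points weighted by membership ** fuzzifier.
        centers = []
        for j in range(n_clusters):
            w = [u[i][j] ** fuzzifier for i in range(n)]
            w_sum = sum(w)
            centers.append([
                sum(w[i] * points[i][d] for i in range(n)) / w_sum
                for d in range(dim)
            ])

        # Membership update from squared Euclidean distances to each centre.
        largest_change = 0.0
        for i in range(n):
            d2 = [
                max(sum((points[i][d] - centers[j][d]) ** 2
                        for d in range(dim)), 1e-12)
                for j in range(n_clusters)
            ]
            inv = [dist ** (-1.0 / (fuzzifier - 1.0)) for dist in d2]
            inv_sum = sum(inv)
            new_row = [v / inv_sum for v in inv]
            largest_change = max(
                largest_change,
                max(abs(a - b) for a, b in zip(new_row, u[i])),
            )
            u[i] = new_row

        # Stop once no membership moved by more than the tolerance.
        if largest_change < tol:
            break

    return centers, u


# Toy one-dimensional data with two well-separated groups (made-up values).
toy_points = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
toy_centers, toy_memberships = fuzzy_c_means(toy_points, n_clusters=2)
```

Unlike hard clustering, each point keeps a graded membership in every cluster, which is the kind of soft assignment a soft-decision rule can build on.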

[Figure 3: ROC plot of pause hit rate (HR0) against false alarm rate (FAR0) for G.729, AMR1, AMR2, AFE (Noise Est.), AFE (frame-dropping), Li, Marzinzik, Sohn, Woo and FCM-VAD.]

Fig. 3. ROC curves of the proposed FCM-VAD in highly noisy conditions for m = 8, K = 32 and C = 2, compared with standard and recently reported VADs

Conclusions

A new VAD for improving speech detection robustness in noisy environments is proposed. The proposed FCM-VAD is based on noise modeling using FCM clustering and benefits from long-term information in the formulation of a soft decision rule. It outperformed Sohn’s VAD, which defines the LRT on a single observation, and other methods including the standardized G.729, AMR and AFE VADs, as well as recently reported VADs. The VAD detects word beginnings early and word endings late, which in part avoids the need for additional hangover schemes or noise reduction blocks. Discrimination analysis and ROC curves are effective tools for evaluating a given algorithm, and the influence of a VAD on a speech recognition system depends on its discrimination accuracy [12]. The proposed VAD is therefore expected to improve the recognition rate when it is used as part of an Automated Speech Recognition (ASR) system.

Acknowledgements

This work has received research funding from the EU 6th Framework Programme, under contract number IST-2002-507943 (HIWIRE, Human Input that Works in Real Environments), and from the SESIBONN and SR3-VoIP projects (TEC200406096-C03-00, TEC2004-03829/TCM) of the Spanish government. The views expressed here are those of the authors only. The Community is not liable for any use that may be made of the information contained therein.

References

1. ETSI, Voice Activity Detector (VAD) for Adaptive Multi-Rate (AMR) Speech Traffic Channels, 1999, ETSI EN 301 708 Recommendation.
