Noise Subspace Fuzzy C-means Clustering for Robust Speech Re(4)

Date: 2025-07-11

Abstract. In this paper, a fuzzy C-means (FCM) based approach for speech/non-speech discrimination is developed to build an effective voice activity detection (VAD) algorithm. The proposed VAD method is based on a soft-decision clustering approach built ove

[Figure 1: two panels plotting log(E) against subbands and frames]

Fig. 1. a) 20 log-energies of noise frames, computed using NFFT = 256, averaged over 50 subbands. b) Clustering approach applied to a set of log-energies using hard-decision CM (C = 4 prototypes).

Thus, the loss function is minimized by assigning the N observations to the C prototypes with a certain degree of membership, in such a way that within each prototype the average dissimilarity of the observations Dij is minimized. Once convergence is reached, the N K-dimensional pause frames are efficiently modeled by C K-dimensional noise prototype vectors denoted by mc, c = 1, ..., C. Figure 1 shows how the complex nature of noise can be simplified (smoothed) using a clustering approach (hard CM). The clustering approach speeds up the decision function significantly, since the number of feature vectors to compare against is reduced substantially (N → C).

3.2 Soft Decision function for VAD

In order to classify the second set of labeled data (energies of speech frames), we use a sequential algorithm scheme with an MO window centered at frame l, as shown in Section 2. For this purpose, let us consider the same dissimilarity measure, a dissimilarity threshold γ, and a maximum number of allowed clusters K = 2. Let E(l) be the decision feature vector that is based on the MO window as follows:

E(l) = max{E(i)}, i = l - m, ..., l + m    (6)

This selection of the feature vector describing the current frame is useful because it detects the presence of voice beforehand (pause-speech transition) and holds the detection flag, smoothing the VAD decision (as a hangover-based algorithm [7, 6] does in the speech-pause transition).
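The windowed maximum of Eq. (6) can be illustrated with a minimal Python sketch. It rests on two assumptions that the text does not state: the maximum is taken element-wise per subband, and the window is truncated at the first and last frames. The function name `mo_window_max` and the toy data are hypothetical.

```python
from typing import List, Sequence


def mo_window_max(energies: Sequence[Sequence[float]], m: int) -> List[List[float]]:
    """Sketch of Eq. (6): for each frame l, take the maximum of the
    log-energy vectors E(i) over i = l - m, ..., l + m.

    Assumptions (not stated in the source): the max is element-wise per
    subband, and the window is truncated at the signal boundaries.
    """
    if m < 0:
        raise ValueError("m must be non-negative")
    n_frames = len(energies)
    features = []
    for l in range(n_frames):
        lo = max(0, l - m)             # first frame inside the window
        hi = min(n_frames, l + m + 1)  # one past the last frame inside the window
        window = energies[lo:hi]
        # zip(*window) groups the values of each subband across the window
        features.append([max(band) for band in zip(*window)])
    return features


if __name__ == "__main__":
    # Toy data: 5 frames, 2 subbands, with a single loud frame at index 2.
    E = [[8.0, 7.5], [8.1, 7.4], [11.0, 10.2], [8.2, 7.6], [8.0, 7.5]]
    for l, feat in enumerate(mo_window_max(E, m=1)):
        print(l, feat)
```

With m = 1, the loud frame at index 2 already raises the feature at frame 1 and keeps it raised at frame 3, which is the early detection and hangover-like hold described above.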
