A TWO-STAGE ALGORITHM FOR ENHANCEMENT OF REVERBERANT SPEECH(2)

Date: 2025-03-09

Room reverberation causes two perceptual distortions on clean speech: coloration and long-term reverberation. These two effects correspond to two physical variables: signal-to-reverberant energy ratio (SRR) and reverberation time, respectively. Based on thi

determined by signal-to-reverberant energy ratio (SRR), which is the ratio between the energy traveling directly from a source to a listener and the energy of all acoustic reflections reaching the listener, and in turn, it is determined by talker-to-microphone distance. Shorter talker-to-microphone distance results in higher SRR and less spectral deviation, hence, less coloration.

Consequently, we propose a two-stage model to deal with two types of degradations, coloration and long-term reverberation, in a reverberant environment. In the first stage, our model estimates an inverse filter to reduce coloration effects in order to increase SRR. The second stage employs spectral subtraction to minimize the influence of long-term reverberation.

3. INVERSE FILTERING

In the first stage of our algorithm, we derive an inverse filter to reduce the reverberation effects. This stage is adapted from a multi-microphone inverse filtering algorithm proposed by Gillespie et al. [8]. An FIR inverse filter of the room impulse response is estimated by maximizing the kurtosis of the linear prediction (LP) residual of speech, utilizing a block frequency-domain adaptive filter. Then, inverse-filtered speech is obtained by convolving the inverse filter with reverberant speech.
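The objective used in this stage, the kurtosis of the LP residual, can be illustrated with a minimal sketch. It does not implement the block frequency-domain adaptive filter of [8]; it only computes the quantity that such a filter would maximize. The function names, the LP order of 10, the synthetic sparse excitation, and the synthetic decaying-noise room response are all assumptions made for illustration.

```python
import numpy as np

def lp_residual(x, order=10):
    """LP residual via the autocorrelation method (order is an assumed value)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularized.
    idx = np.abs(np.subtract.outer(np.arange(order), np.arange(order)))
    R = r[idx] + 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Prediction x_hat[t] = sum_k a[k] * x[t - 1 - k].
    pred = np.convolve(x, np.concatenate(([0.0], a)))[:n]
    return x - pred

def kurtosis(e):
    """Normalized fourth moment (equals 3 for a Gaussian signal)."""
    e = e - np.mean(e)
    return np.mean(e ** 4) / np.mean(e ** 2) ** 2

# Synthetic stand-in for speech: sparse excitation through a stable AR(2) filter.
rng = np.random.default_rng(0)
n = 8000
excitation = rng.standard_normal(n) * (rng.random(n) < 0.05)
clean = np.zeros(n)
for t in range(n):
    clean[t] = excitation[t]
    if t >= 1:
        clean[t] += 1.3 * clean[t - 1]
    if t >= 2:
        clean[t] -= 0.6 * clean[t - 2]

# Hypothetical room response: direct path plus an exponentially decaying noise tail.
tail = 0.5 * rng.standard_normal(400) * np.exp(-np.arange(400) / 100.0)
room = np.concatenate(([1.0], tail))
reverberant = np.convolve(clean, room)[:n]

k_clean = kurtosis(lp_residual(clean))
k_rev = kurtosis(lp_residual(reverberant))
print(f"LP-residual kurtosis: clean {k_clean:.1f}, reverberant {k_rev:.1f}")
```

Reverberation spreads each excitation pulse over many samples, so the residual of the reverberant signal is closer to Gaussian and its kurtosis drops. An inverse filter that raises the kurtosis again therefore tends to undo part of that spreading.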

A typical result from the first stage of our algorithm is shown in Fig. 1. Fig. 1(a) illustrates a room impulse response function (T60 = 0.3 s) generated by the image model of Allen and Berkley [1]. The equalized impulse response, which is the room impulse response in Fig. 1(a) convolved with the obtained inverse filter, is shown in Fig. 1(b). As can be seen, the equalized impulse response is far more impulse-like than the room impulse response. In fact, the SRR value of the room impulse response is -9.8 dB, in comparison with 2.4 dB for that of the equalized impulse response.
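As a rough illustration of how an SRR value can be read off an impulse response, the sketch below takes the direct-path energy in a short window around the largest peak and treats everything else as reflections. How the direct path is delimited is not stated in this excerpt, so the peak-picking rule, the `direct_ms` window, and the function name `srr_db` are assumptions.

```python
import numpy as np

def srr_db(h, fs, direct_ms=1.0):
    """Signal-to-reverberant energy ratio of an impulse response, in dB.

    Assumption: the direct path is the largest-magnitude tap, and samples
    within +/- direct_ms of it count as direct energy.
    """
    h = np.asarray(h, dtype=float)
    peak = int(np.argmax(np.abs(h)))
    half = int(round(direct_ms * 1e-3 * fs))
    lo, hi = max(0, peak - half), min(len(h), peak + half + 1)
    direct = np.sum(h[lo:hi] ** 2)
    reverb = np.sum(h ** 2) - direct
    if reverb <= 0.0:
        return float("inf")  # no reflections at all
    return 10.0 * np.log10(direct / reverb)

# Toy response: unit direct path followed by two reflections of amplitude 0.5.
fs = 1000
h = np.zeros(100)
h[0], h[50], h[80] = 1.0, 0.5, 0.5
srr = srr_db(h, fs)
print(f"SRR = {srr:.2f} dB")
```

In the toy response the direct energy is 1 and the reflected energy is 0.5, so the ratio is 2, or about 3.01 dB. Applying the same function to a room impulse response and to its equalized version would give a comparison like the one quoted above, subject to the window assumption.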

However, the above inverse filtering method does not improve on the tail part of reverberation. Fig. 1(c) and (d) show the energy decay curves of the room impulse response and the equalized impulse response, respectively. As can be seen, except for the first 50 ms, the energy decay patterns are almost identical, and thus the estimated reverberation times are almost the same, around 0.3 s. While the coloration distortion is reduced due to the increase of SRR, the degradation due to reverberation tails is not alleviated. In other words, the effect of inverse filtering is similar to that of moving the sound source closer to the receiver. In the next section, we introduce the second stage of our algorithm to reduce the effects of long-term reverberation.
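The energy decay curves discussed here come from Schroeder integration, which can be sketched as a backward cumulative sum of the squared impulse response. The example below uses a synthetic response with a known exponential decay rather than an image-model response, and it assumes the response starts at sample 0. The function names are hypothetical.

```python
import numpy as np

def energy_decay_curve_db(h):
    """Schroeder backward integration of h**2, normalized to 0 dB at the start."""
    energy = np.asarray(h, dtype=float) ** 2
    edc = np.cumsum(energy[::-1])[::-1]
    edc = np.maximum(edc, np.finfo(float).tiny)  # avoid log10(0) at the very end
    return 10.0 * np.log10(edc / edc[0])

def decay_time(edc_db, fs, level_db=-60.0):
    """Time in seconds at which the decay curve first reaches level_db."""
    below = np.nonzero(edc_db <= level_db)[0]
    return below[0] / fs if below.size else None

# Synthetic response: random-sign taps whose energy falls by 60 dB after 0.3 s.
fs = 16000
t60_true = 0.3
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
h = rng.choice([-1.0, 1.0], size=fs) * np.exp(-3.0 * np.log(10.0) * t / t60_true)

edc_db = energy_decay_curve_db(h)
t60_est = decay_time(edc_db, fs)
print(f"estimated reverberation time: {t60_est:.3f} s")
```

Because the curve integrates all remaining energy, a stronger direct path changes it mainly near the start. That matches the observation that the two curves differ only over the first 50 ms and yield nearly the same reverberation time.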

4. SPECTRAL SUBTRACTION

Late reflections in a room impulse response function smear the speech spectrum and degrade speech intelligibility and quality. Likewise, an equalized impulse response can be decomposed into two parts: early and late impulses. Resembling the effects of the late reflections in a room impulse response, the late impulses have deleterious effects on the quality of inverse-filtered speech; by estimating the effects of the late impulses and subtracting them, we can expect to enhance the speech quality.

In a previous version of this algorithm, Wu and Wang [15] proposed a one-stage method to enhance reverberant speech by estimating and subtracting the effects of late reflections.

[Figure 1: panels (a), (b), (c), (d); axis label: Time (ms)]

Fig. 1. (a) A room impulse response function generated by the image model in an office-size room. (b) The equalized impulse response derived from the reverberant speech generated by the room impulse response in (a) as the result of the first stage of our algorithm. (c) Energy decay curve computed from the room impulse response function in (a). (d) Energy decay curve computed from the equalized impulse response in (b). Each curve is calculated using the Schroeder integration method. The horizontal dotted line represents the -60 dB energy decay level. The left dashed lines indicate the starting times of the impulse responses and the right dashed lines the times at which the decay curves cross -60 dB.

The smearing effects of late impulses lead to the smoothing of the signal spectrum in the time domain. Therefore, we assume that the power spectrum of late-impulse components is a smoothed and shifted version of the power spectrum of the inverse-filtered speech z(t):

$$|S_l(k;i)|^2 = \gamma\, w(i-\rho) * |S_z(k;i)|^2, \tag{1}$$

where $|S_z(k;i)|^2$ and $|S_l(k;i)|^2$ are, respectively, the short-term power spectra of the inverse-filtered speech and the late-impulse components. Indexes k and i refer to frequency bin and time frame, respectively. The symbol $*$ denotes convolution in the time domain and w(i) is a smoothing function. The short-term speech spectrum is obtained by using Hamming windows of length 16 ms with 8 ms overlap for short-term Fourier analysis.
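Eq. (1) can be sketched numerically as a per-bin convolution along the frame axis. The short-term analysis below follows the stated 16 ms Hamming window with 8 ms overlap. The shape of the smoothing function w(i) and the values of the scaling factor and shift are not given in this excerpt, so the normalized Hann-shaped `w`, `gamma = 0.3`, and `rho = 4` frames are placeholders chosen only to make the example run, and the function names are hypothetical.

```python
import numpy as np

def power_spectrogram(z, fs, win_ms=16.0, hop_ms=8.0):
    """Short-term power spectrum |S_z(k; i)|^2 with shape (bins, frames)."""
    n_win = int(round(win_ms * 1e-3 * fs))
    n_hop = int(round(hop_ms * 1e-3 * fs))
    window = np.hamming(n_win)
    n_frames = 1 + (len(z) - n_win) // n_hop
    frames = np.stack([z[i * n_hop : i * n_hop + n_win] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T ** 2

def late_impulse_power(p_z, w, gamma, rho):
    """Eq. (1): convolve each bin's frame sequence with gamma * w(i - rho)."""
    # Prepending rho zeros delays the smoothing function by rho frames.
    kernel = gamma * np.concatenate((np.zeros(rho), w))
    n_frames = p_z.shape[1]
    return np.stack([np.convolve(row, kernel)[:n_frames] for row in p_z])

fs = 16000
rng = np.random.default_rng(0)
z = rng.standard_normal(fs)      # stand-in for one second of inverse-filtered speech
p_z = power_spectrogram(z, fs)

w = np.hanning(9)[1:-1]          # placeholder smoothing function, 7 frames long
w = w / w.sum()
gamma, rho = 0.3, 4              # placeholder scaling factor and shift (in frames)
p_l = late_impulse_power(p_z, w, gamma, rho)
print(p_z.shape, p_l.shape)
```

With these placeholder settings the estimate in frame i depends only on frames i - 4 and earlier, which is the sense in which the late-impulse spectrum is a smoothed and shifted copy of the inverse-filtered one.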
