A TWO-STAGE ALGORITHM FOR ENHANCEMENT OF REVERBERANT SPEECH(3)

Date: 2025-03-09

Room reverberation causes two perceptual distortions of clean speech: coloration and long-term reverberation. These two effects correspond to two physical variables: signal-to-reverberant energy ratio (SRR) and reverberation time, respectively. Based on thi


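Table I. The systematic results of reverberant speech enhancement for speech utterances of four female and four male speakers randomly selected from the TIMIT database. All signals are sampled at 8 kHz. Columns give the speaker/gender (Female #1 to Female #4, Male #1 to Male #4, and the average); rows give $\mathrm{SNR}_{fw}^{rev}$, $\mathrm{SNR}_{fw}^{YM}$, $\mathrm{SNR}_{fw}^{proc}$, $\mathrm{SNR}_{fw}^{YM} - \mathrm{SNR}_{fw}^{rev}$, and $\mathrm{SNR}_{fw}^{proc} - \mathrm{SNR}_{fw}^{rev}$, all in dB.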

short-term phase spectrum of enhanced speech is set to that of inverse-filtered speech and the processed speech is reconstructed from the short-term magnitude and phase spectrum.
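The recombination of magnitude and phase described here can be sketched as follows. This is a minimal illustration, assuming the enhanced power spectrum and the complex short-term spectrum of the inverse-filtered speech are available as NumPy arrays of equal shape. The function name `combine_magnitude_and_phase` is hypothetical, and the resynthesis of the time-domain signal from the frames is omitted.

```python
import numpy as np

def combine_magnitude_and_phase(enhanced_power, inverse_filtered_spectrum):
    """Attach the phase of the inverse-filtered speech to the enhanced magnitude.

    enhanced_power: real array holding the enhanced power spectrum.
    inverse_filtered_spectrum: complex short-term spectrum of the same shape.
    """
    magnitude = np.sqrt(np.asarray(enhanced_power, dtype=float))
    phase = np.angle(inverse_filtered_spectrum)
    return magnitude * np.exp(1j * phase)

# Illustrative data only: random spectra with 4 bins and 6 frames.
rng = np.random.default_rng(0)
spectrum_z = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
power_x = rng.uniform(0.1, 1.0, size=(4, 6))
spectrum_x = combine_magnitude_and_phase(power_x, spectrum_z)
```

The result keeps the phase of `spectrum_z` and takes its magnitude from `power_x`.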

3. RESULTS AND DISCUSSIONS

A corpus of speech utterances from eight speakers, four females and four males, [...]rmal listening tests show that the proposed algorithm achieves substantial reduction of reverberation and introduces few audible artifacts. To illustrate typical performance, we show the enhancement results in Fig. 2. Fig. 2(a) and (c) show the clean and the reverberant signals, and Fig. 2(b) and (d) show the corresponding spectrograms. The reverberant signal is produced by convolving the clean signal with the room impulse response function in Fig. 1(a), with $T_{60}$ = 0.3 s. As can be seen, while the clean signal has fine harmonic structure and silence gaps between the words, the reverberant speech is smeared and its harmonic structure is elongated.
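How a reverberant signal is produced by convolution can be sketched as follows. The room impulse response of Fig. 1(a) is not available here, so this sketch substitutes a synthetic one: exponentially decaying noise whose envelope falls by 60 dB at $T_{60}$. That substitution, and the names `synthetic_rir` and `reverberate`, are assumptions for illustration only.

```python
import numpy as np

def synthetic_rir(t60=0.3, fs=8000, seed=0):
    """Hypothetical stand-in for a measured room impulse response."""
    rng = np.random.default_rng(seed)
    n = int(round(t60 * fs))
    t = np.arange(n) / fs
    envelope = 10.0 ** (-3.0 * t / t60)  # amplitude drops 60 dB at t = t60
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # direct path
    return rir

def reverberate(clean, rir):
    """Convolve the clean signal with the room impulse response."""
    return np.convolve(clean, rir)

# A unit impulse as the "clean" input simply reproduces the impulse response.
rir = synthetic_rir()
impulse = np.zeros(10)
impulse[0] = 1.0
reverberant = reverberate(impulse, rir)
```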

To put our performance in perspective, we compare with a recent one-microphone reverberant speech enhancement algorithm proposed by Yegnanarayana and Murthy [16]. We refer to this algorithm as the YM algorithm. The YM algorithm applies weights to the LP residual so that it more closely resembles the damped sinusoidal patterns of the LP residual from clean speech. Fig. 2(e) and (f) show the processed speech using the YM algorithm and its spectrogram, respectively. As can be seen, spectral structure is clearer and some silence gaps are attenuated. The processed speech using our algorithm and its spectrogram are shown in Fig. 2(g) and (h). As can be seen, the effects of reverberation have been significantly reduced in the processed speech. The smearing is lessened and many silence gaps are clearer. The figure clearly shows that our algorithm enhances the reverberant speech more than does the YM algorithm. An audio demonstration can also be found at http://www.cse.ohio-state.edu/~dwang/demo/WuReverb.html.

Quantitative comparisons are obtained from the speech utterances of the eight speakers separately utilizing frequency-weighted segmental SNR [14] and presented in Table I. $\mathrm{SNR}_{fw}^{rev}$,

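| Speaker/Gender | Female #1 | Female #2 | Female #3 | Female #4 | Male #1 | Male #2 | Male #3 | Male #4 | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\mathrm{SNR}_{fw}^{rev}$ (dB) | -3.64 | -3.51 | -3.86 | -4.12 | -3.86 | -3.33 | -3.30 | -3.50 | -3.64 |
| $\mathrm{SNR}_{fw}^{YM}$ (dB) | -3.06 | -3.05 | -3.19 | -3.29 | -2.65 | -2.68 | -2.53 | -2.76 | -2.90 |
| $\mathrm{SNR}_{fw}^{proc}$ (dB) | 0.92 | 0.74 | -0.20 | 0.73 | -0.92 | 1.77 | 1.20 | -0.13 | 0.51 |
| $\mathrm{SNR}_{fw}^{YM} - \mathrm{SNR}_{fw}^{rev}$ (dB) | 0.58 | 0.46 | 0.68 | 0.83 | 1.21 | 0.65 | 0.76 | 0.75 | 0.74 |
| $\mathrm{SNR}_{fw}^{proc} - \mathrm{SNR}_{fw}^{rev}$ (dB) | 4.56 | 4.25 | 3.66 | 4.84 | 2.94 | 5.10 | 4.49 | 3.38 | 4.15 |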

The shift delay ρ indicates the relative delay of the late-impulse components. The distinction between early and late reflections for speech is commonly set at a delay of 50 ms in a room impulse response function [11]. This delay reflects the properties of speech and is independent of reverberation characteristics. Consequently, it translates to approximately 7 frames for a shift interval of 8 ms, and we choose ρ = 7 as a result. Finally, the scaling factor γ specifies the relative strength of the late-impulse components after inverse filtering, and we set it to 0.32.
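The conversion from the 50 ms boundary to a frame count can be worked through as below. Rounding up to the next whole frame is an assumption made here to arrive at the stated value of 7; the text only says "approximately".

```python
import math

early_late_boundary_ms = 50.0  # early/late reflection boundary from the text
frame_shift_ms = 8.0           # shift interval from the text

frames_exact = early_late_boundary_ms / frame_shift_ms  # 6.25 frames
rho = math.ceil(frames_exact)  # assumption: round up to a whole frame
```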

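Considering the shape of the equalized impulse response, we choose an asymmetrical smoothing function in the form of the Rayleigh distribution:

$$
w(i) =
\begin{cases}
\dfrac{i+a}{a^{2}} \exp\!\left(-\dfrac{(i+a)^{2}}{2a^{2}}\right) & \text{if } i > -a \\
0 & \text{otherwise}
\end{cases}
, \tag{2}
$$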

where a controls the span of the smoothing function and we choose a = 5. This smoothing function goes down to zero quickly on the left side but tails off slowly on the right side; the right side of the smoothing function resembles the shape of reverberation tails in equalized impulse responses.
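The shape of this smoothing function can be checked numerically. The sketch below evaluates the Rayleigh form of Eq. (2) with a = 5 on integer frame indices; the function name `rayleigh_smoothing` and the index range are chosen here for illustration.

```python
import numpy as np

def rayleigh_smoothing(i, a=5.0):
    """Rayleigh-shaped smoothing weights of Eq. (2); zero for i <= -a."""
    x = np.asarray(i, dtype=float) + a
    return np.where(x > 0, (x / a**2) * np.exp(-(x**2) / (2 * a**2)), 0.0)

frames = np.arange(-10, 31)
w = rayleigh_smoothing(frames)
peak_frame = frames[np.argmax(w)]  # frame index with the largest weight
```

With a = 5, the weights vanish for i ≤ -5, peak at i = 0, and remain nonzero well to the right of the peak, which is the asymmetry the text describes.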

Assuming that the early- and late-impulse components are approximately uncorrelated, the power spectrum of the early-impulse components can be estimated by subtracting the power spectrum of the late-impulse components from that of the inverse-filtered speech. The result is further used as an estimate of the power spectrum of the original speech. Specifically, spectral subtraction [7] is employed to estimate the power spectrum of the original speech, $|\tilde{S}_x(k;i)|^2$:

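$\mathrm{SNR}_{fw}^{YM}$, and $\mathrm{SNR}_{fw}^{proc}$ represent the frequency-weighted segmental SNR values of reverberant speech, the processed speech using the YM algorithm, and the processed speech using our algorithm, respectively. The SNR gains obtained by employing the YM algorithm and our algorithm are denoted by $\mathrm{SNR}_{fw}^{YM} - \mathrm{SNR}_{fw}^{rev}$ and $\mathrm{SNR}_{fw}^{proc} - \mathrm{SNR}_{fw}^{rev}$, respectively. As can be seen, the YM algorithm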

$$
|\tilde{S}_x(k;i)|^2 = |S_z(k;i)|^2 \max\!\left[ \frac{|S_z(k;i)|^2 - \gamma\, w(i-\rho) * |S_z(k;i)|^2}{|S_z(k;i)|^2},\; \varepsilon \right], \tag{3}
$$

where ε = 0.001 is the floor and corresponds to the maximum attenuation of 30 dB.
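Eq. (3) can be sketched in code as follows, using the parameter values given in the text (γ = 0.32, ρ = 7, a = 5, ε = 0.001). This is an illustration under stated assumptions, not the authors' implementation: the convolution is taken along the frame index for each frequency bin, the kernel is truncated at a hypothetical `max_lag`, and the input is a power spectrogram of shape (bins, frames).

```python
import numpy as np

def rayleigh_smoothing(i, a=5.0):
    """Eq. (2): Rayleigh-shaped weights, zero for i <= -a."""
    x = np.asarray(i, dtype=float) + a
    return np.where(x > 0, (x / a**2) * np.exp(-(x**2) / (2 * a**2)), 0.0)

def spectral_subtraction(power_z, gamma=0.32, rho=7, a=5.0, eps=1e-3, max_lag=40):
    """Estimate |S_x(k;i)|^2 from |S_z(k;i)|^2 following Eq. (3)."""
    power_z = np.asarray(power_z, dtype=float)
    num_frames = power_z.shape[1]
    # gamma * w(j - rho) for lags j = 0..max_lag (truncation is an assumption).
    kernel = gamma * rayleigh_smoothing(np.arange(max_lag + 1) - rho, a)
    late = np.empty_like(power_z)
    for k in range(power_z.shape[0]):
        # Causal convolution over frames: estimated late-impulse power.
        late[k] = np.convolve(power_z[k], kernel)[:num_frames]
    safe = np.maximum(power_z, 1e-12)  # guard against division by zero
    gain = np.maximum((power_z - late) / safe, eps)
    return power_z * gain

# Illustrative data only: a random power spectrogram, 4 bins by 60 frames.
rng = np.random.default_rng(0)
power_z = rng.uniform(0.1, 1.0, size=(4, 60))
power_x = spectral_subtraction(power_z)
```

Because the gain is floored at ε = 0.001, no bin is attenuated by more than 10 log10(0.001) = -30 dB, matching the stated maximum attenuation.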

Natural speech utterances contain silent gaps, and reverberation fills some of the gaps right after high-intensity speech sections. We identify these silent gaps by examining the energy of inverse-filtered speech and the energy reduction ratio after spectral subtraction in a time frame. For identified silent frames, all frequency bins are attenuated by 30 dB. Finally, the

obtains an average SNR gain of 0.74 dB compared to that of 4.15 dB by our algorithm.
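The average gains quoted here can be cross-checked from the per-speaker values in Table I. The short sketch below recomputes the gains as differences of the tabulated SNR values; because the table entries are rounded to two decimals, recomputed per-speaker gains may differ from the tabulated ones by about 0.01 dB.

```python
import numpy as np

# Per-speaker values in dB from Table I: Female #1 to #4, then Male #1 to #4.
snr_rev = np.array([-3.64, -3.51, -3.86, -4.12, -3.86, -3.33, -3.30, -3.50])
snr_ym = np.array([-3.06, -3.05, -3.19, -3.29, -2.65, -2.68, -2.53, -2.76])
snr_proc = np.array([0.92, 0.74, -0.20, 0.73, -0.92, 1.77, 1.20, -0.13])

gain_ym = snr_ym - snr_rev      # SNR gain of the YM algorithm
gain_proc = snr_proc - snr_rev  # SNR gain of the proposed algorithm

print(f"average YM gain: {gain_ym.mean():.2f} dB")
print(f"average gain of the proposed algorithm: {gain_proc.mean():.2f} dB")
```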

Although our algorithm is designed for enhancing reverberant speech using one microphone, it is straightforward to extend it to multi-microphone scenarios. Many inverse filtering algorithms, such as the algorithm by Gillespie et al. [8], were originally proposed using multiple microphones. After inverse filtering using multiple microphones, the second stage of our algorithm, the spectral subtraction method, can be utilized for reducing long-term reverberation effects.
