A TWO-STAGE ALGORITHM FOR ENHANCEMENT OF REVERBERANT SPEECH
Date: 2025-03-09
Mingyang Wu and DeLiang Wang
Department of Computer Science and Engineering
and Center for Cognitive Science
The Ohio State University
Columbus, OH 43210-1277, USA
Email: mwu@, dwang@cse.ohio-state.edu
ABSTRACT
Room reverberation causes two perceptual distortions on clean speech: coloration and long-term reverberation. These two effects correspond to two physical variables: signal-to-reverberant energy ratio (SRR) and reverberation time, respectively. Based on this observation, we propose a two-stage algorithm that enhances reverberant speech from one-microphone recordings. In the first stage, an inverse filter is estimated to reduce coloration effects or increase SRR. The second stage employs spectral subtraction to minimize the influence of long-term reverberation. The proposed algorithm significantly improves the quality of reverberant speech. A comparison with a recent one-microphone enhancement algorithm shows that our system produces significantly better results.
1. INTRODUCTION
A main cause of speech degradation in practically all listening situations is room reverberation. Although a listener with normal hearing can tolerate a considerable degree of room reverberation, hearing-impaired listeners suffer disproportionately from its effects [12]. Reverberation also causes a significant performance decrement for current automatic speech recognition (ASR) and speaker recognition systems. Consequently, an effective reverberant speech enhancement system can be used to improve the design of intelligent hearing aids and is essential for many speech technology applications.
In this article we study one-microphone reverberant speech enhancement. This is motivated by the following two considerations. First, a one-microphone solution is highly desirable for many real-world applications such as hands-free audio communication and audio information retrieval. Second, moderately reverberant speech is highly intelligible in monaural listening conditions. Hence, how to achieve this monaural capability remains a fundamental scientific question.
A number of reverberant speech enhancement algorithms have been designed utilizing more than one microphone. For example, microphone-array based methods [6], such as beamforming techniques, attempt to suppress the sound energy coming from directions other than that of the direct source and therefore enhance target speech. As pointed out by Koenig et al. [10], the reverberation tails of the impulse responses, characterizing the reverberation process in a room with multiple microphones and one speaker, are uncorrelated. Several algorithms have been proposed to reduce the reverberation effects by removing the incoherent parts of received signals. Blind deconvolution algorithms aim to reconstruct the inverse filters without prior knowledge of the room impulse responses (for example, see [8]). Brandstein and Griebel [5] utilize the extrema of wavelet coefficients to reconstruct the linear prediction (LP) residual of the original speech.
Reverberant speech enhancement using one microphone is significantly more challenging than that using multiple microphones. Nonetheless, a number of one-microphone algorithms have been proposed. Bees et al. [3] employ a cepstrum-based method to estimate the cepstrum of the reverberation impulse response, and its inverse is then used to dereverberate the signal. Several dereverberation algorithms (for example, see [2]) are motivated by the effects of reverberation on the modulation transfer function (MTF). Yegnanarayana and Murthy [16] observed that the LP residual of voiced clean speech has damped sinusoidal patterns within each glottal cycle, while that of reverberant speech is smeared and resembles Gaussian noise. With this observation, the LP residual of clean speech is estimated and the enhanced speech is then resynthesized. Nakatani and Miyoshi [13] proposed a system capable of blind dereverberation by employing the harmonic structure of speech. Good results are obtained, but this algorithm requires a large amount of reverberant speech produced using the same room impulse response function. Despite these studies, existing reverberant speech enhancement algorithms do not reach the performance level demanded by many practical applications.
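As a rough illustration of the LP-residual observation attributed to Yegnanarayana and Murthy [16], the following Python sketch compares the kurtosis of the LP residual for a toy voiced signal before and after synthetic reverberation. It is not their algorithm, nor the method proposed in this paper. The impulse-train excitation, the two-pole resonator, the decaying-noise impulse response, and every function name and parameter value are assumptions made for illustration. Only the standard library is used so that the sketch stays self-contained; an array library would be the usual choice in practice.

```python
import math
import random


def lp_coefficients(x, order):
    """Autocorrelation-method LP coefficients via the Levinson-Durbin recursion.

    Returns [a_1, ..., a_order] so that x[n] is predicted by sum_k a_k * x[n-k].
    """
    r = [sum(x[n] * x[n - k] for n in range(k, len(x))) for k in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, order=10):
    """Prediction error e[n] = x[n] - sum_k a_k * x[n-k]."""
    a = lp_coefficients(x, order)
    return [x[n] - sum(a[k] * x[n - 1 - k] for k in range(min(order, n)))
            for n in range(len(x))]


def kurtosis(x):
    """Fourth standardized moment; roughly 3 for Gaussian-like data."""
    m = sum(x) / len(x)
    var = sum((v - m) ** 2 for v in x) / len(x)
    return sum((v - m) ** 4 for v in x) / len(x) / var ** 2


def toy_voiced(n_samples=2400, period=80, fs=8000):
    """Hypothetical voiced sound: impulse train through one two-pole resonator."""
    radius, theta = 0.9, 2.0 * math.pi * 700.0 / fs
    b1, b2 = 2.0 * radius * math.cos(theta), -radius * radius
    y = []
    for n in range(n_samples):
        excitation = 1.0 if n % period == 0 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append(excitation + b1 * y1 + b2 * y2)
    return y


def reverberate(x, t60=0.15, fs=8000, taps=600, seed=0):
    """Convolve with a toy impulse response: direct path plus decaying noise."""
    rng = random.Random(seed)
    decay = 3.0 * math.log(10.0) / (t60 * fs)  # energy falls 60 dB after t60
    h = [0.3 * rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(taps)]
    h[0] = 1.0
    return [sum(h[k] * x[n - k] for k in range(min(taps, n + 1)))
            for n in range(len(x))]


if __name__ == "__main__":
    clean = toy_voiced()
    print("clean residual kurtosis:      ", kurtosis(lp_residual(clean)))
    print("reverberant residual kurtosis:", kurtosis(lp_residual(reverberate(clean))))
```

Under these assumptions, the clean residual stays sharply peaked at each excitation pulse and so has high kurtosis, while the reverberant residual is smeared and its kurtosis moves toward the Gaussian value of 3.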
2. BACKGROUND
Reverberation causes a noticeable change in speech quality. Berkley and Allen [4] identified two physical variables, reverberation time T60 and spectral deviation, as important for reverberant speech quality. Consider the impulse response as a combination of three parts: the direct sound, the early reflections, and the late reflections. While late reflections smear the speech spectra and reduce the intelligibility and quality of speech signals, early reflections cause another distortion of the speech signal called coloration: the non-flat frequency response of the early reflections distorts the speech spectrum. Coloration can be characterized by the spectral deviation, defined as the standard deviation of the room frequency response. Increasing either spectral deviation or reverberation time results in decreased reverberant speech quality. Moreover, Jetzt [9] shows that spectral deviation is
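The physical variables named in this section can be made concrete with a small Python sketch that computes them from an impulse response. This is an illustration under stated assumptions, not the procedure used in the paper. The impulse response is synthetic (a unit direct path plus exponentially decaying noise). SRR is taken here as the direct-path energy over the energy of the rest of the response. Spectral deviation is taken as the standard deviation of the log-magnitude response in dB. T60 is estimated from the -5 to -25 dB span of the Schroeder backward-integrated energy curve, which is a common convention that the text does not specify. All function names and parameter values are hypothetical.

```python
import cmath
import math
import random


def synthetic_rir(t60, fs=8000, length=2048, seed=0):
    """Toy room impulse response: unit direct path plus decaying Gaussian noise."""
    rng = random.Random(seed)
    decay = 3.0 * math.log(10.0) / (t60 * fs)  # energy falls 60 dB after t60
    h = [0.3 * rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    h[0] = 1.0
    return h


def srr_db(h, direct_len=1):
    """Assumed SRR: direct-path energy over the energy of all later samples, in dB."""
    direct = sum(v * v for v in h[:direct_len])
    reverberant = sum(v * v for v in h[direct_len:])
    return 10.0 * math.log10(direct / reverberant)


def spectral_deviation_db(h, n_bins=128):
    """Standard deviation of the log-magnitude frequency response (dB scale assumed)."""
    mags = []
    for k in range(1, n_bins):
        w = math.pi * k / n_bins  # frequencies strictly between 0 and Nyquist
        response = sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(h))
        mags.append(20.0 * math.log10(abs(response)))
    mean = sum(mags) / len(mags)
    return math.sqrt(sum((m - mean) ** 2 for m in mags) / len(mags))


def t60_schroeder(h, fs):
    """Estimate T60 as three times the -5 dB to -25 dB decay time."""
    edc, total = [], 0.0
    for v in reversed(h):  # backward integration of the squared response
        total += v * v
        edc.append(total)
    edc.reverse()
    edc_db = [10.0 * math.log10(e / edc[0]) for e in edc if e > 0.0]
    n5 = next(n for n, v in enumerate(edc_db) if v <= -5.0)
    n25 = next(n for n, v in enumerate(edc_db) if v <= -25.0)
    return 3.0 * (n25 - n5) / fs


if __name__ == "__main__":
    fs = 8000
    h = synthetic_rir(t60=0.3, fs=fs)
    print("SRR (dB):               ", srr_db(h))
    print("spectral deviation (dB):", spectral_deviation_db(h))
    print("estimated T60 (s):      ", t60_schroeder(h, fs))
```

A response consisting of the direct path alone has a flat frequency response and therefore zero spectral deviation, so any nonzero value in this sketch comes from the reflections added to it.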