Topic segmentation with an aspect hidden Markov model
Date: 2025-05-11
Topic Segmentation with an Aspect Hidden Markov Model
David M. Blei
University of California, Berkeley
Dept. of Computer Science
Berkeley, CA, 94720
Pedro J. Moreno
Cambridge Research Laboratory
Compaq Computer Corporation
Cambridge MA 02142-1612
July 2001
Abstract
We present a novel probabilistic method for partially unsupervised topic segmentation on unstructured text. Previous approaches to this problem utilize the hidden Markov model framework (HMM). The HMM treats a document as mutually independent sets of words generated by a latent topic variable in a time series. We extend this idea by embedding the aspect model for text into the segmenting HMM. In doing so, we provide an intuitive topical dependency between words and a cohesive segmentation model. We apply this method to segment unbroken streams of New York Times articles as well as noisy transcripts of radio programs on SPEECHBOT, an online audio archive indexed by an automatic speech recognition engine. We provide experimental comparisons between our technique and the HMM approach. Our results suggest that this technique can perform as well as the HMM method and in some cases even better.
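The baseline HMM view described in the abstract can be illustrated with a small sketch. It is not the aspect HMM proposed here, only a minimal example of the segmenting HMM it extends. It assumes that the text has already been cut into fixed-size windows of words, that each topic is a unigram distribution over words, and that a single "stay" probability governs topic transitions. The function names, the `stay_prob` and `floor` parameters, and the toy vocabulary are all hypothetical.

```python
import math

def viterbi_segment(windows, topic_word_probs, stay_prob=0.9, floor=1e-6):
    """Most likely topic index per window under a simple segmenting HMM.

    Assumptions (illustrative only): at least two topics, uniform initial
    topic distribution, one shared probability of staying in the same topic,
    and words within a window drawn independently given the topic.
    """
    if not windows:
        return []
    n_topics = len(topic_word_probs)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_topics - 1))

    def emission(window, probs):
        # Log-probability of the window's words; unseen words get `floor`.
        return sum(math.log(probs.get(word, floor)) for word in window)

    scores = [math.log(1.0 / n_topics) + emission(windows[0], probs)
              for probs in topic_word_probs]
    backpointers = []
    for window in windows[1:]:
        new_scores, pointers = [], []
        for k in range(n_topics):
            # Best previous topic for ending in topic k at this window.
            best_prev = max(
                range(n_topics),
                key=lambda j: scores[j] + (log_stay if j == k else log_switch),
            )
            transition = log_stay if best_prev == k else log_switch
            new_scores.append(scores[best_prev] + transition
                              + emission(window, topic_word_probs[k]))
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)

    # Trace the best path backwards from the best final topic.
    state = max(range(n_topics), key=lambda k: scores[k])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def boundaries(path):
    """Window indices at which the decoded topic changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]

# Toy data: two hand-made topics and four two-word windows.
toy_topics = [
    {"game": 0.5, "team": 0.5},
    {"vote": 0.5, "senate": 0.5},
]
toy_windows = [["game", "team"], ["team", "game"],
               ["vote", "senate"], ["senate", "vote"]]
toy_path = viterbi_segment(toy_windows, toy_topics)
toy_boundaries = boundaries(toy_path)
```

On the toy data the decoded path changes topic once, between the second and third windows, which is the segment boundary. In this baseline the words of a window depend only on the window's topic; the aspect HMM replaces that emission model with one based on the aspect model for text.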