Abstract The MediaMill TRECVID 2005 Semantic Video Search En(9)
Date: 2025-05-05
UvA-MediaMill team participated in four tasks. For the detection of camera work (runid: A CAM) we investigate the benefit of using a tessellation of detectors in combination with supervised learning over a standard approach using global image information.
Table 2: Validation set average precision performance for 3 types of camera work using several versions of our camera work detector.
Detector version                  Pan     Tilt    Zoom    MAP
Late Fusion                       0.862   0.786   0.862   0.837
Late Fusion + Selected Context    0.859   0.752   0.866   0.826
Late Fusion + Context             0.856   0.656   0.856   0.789
Early Fusion                      0.703   0.558   0.783   0.681
Global                            0.569   0.613   0.813   0.665
Global + Context                  0.591   0.562   0.792   0.648
Early Fusion + Context            0.616   0.461   0.765   0.614
The results of our visual-run reflect the importance of visual analysis. For four concepts (explosion, US flag, building, car) we outperform the pathfinder system. This improvement might be attributed to the use of improved visual features and to the fact that we use the entire training set in SVM-training. However, since the visual analysis step is embedded in the pathfinder system, the visual analysis should never perform better. Therefore we believe that results of the pathfinder system will improve when the new features are included.
4 Camera Work
For the detection of camera work we start with an existing implementation based on spatiotemporal image analysis [34, 12]. Given a set of global intensity images from shot i, the algorithm first extracts spatiotemporal images. On these images a direction analysis is applied to estimate direction parameters. These parameters form the input for a supervised learning module that learns three types of camera work. We modified the algorithm in various ways. We superimposed a tessellation of 8 regions on each input frame to decrease the effect of local disturbances. The parameters thus obtained are exploited using an early fusion and a late fusion approach. In addition, we explored whether the 101 concept scores obtained from the semantic pathfinder aid in the detection of camera work.
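The difference between the two fusion schemes can be illustrated with a minimal sketch. It is not the implementation used in our system: the 2x4 grid layout, the two-valued per-region parameter vector, the threshold classifier, and the use of score averaging as the late fusion combiner are all assumptions made for illustration, and the function names are hypothetical.

```python
def tessellate(frame, rows=2, cols=4):
    """Split a frame (list of pixel rows) into rows * cols rectangular regions."""
    height, width = len(frame), len(frame[0])
    regions = []
    for r in range(rows):
        top, bottom = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            left, right = c * width // cols, (c + 1) * width // cols
            regions.append([row[left:right] for row in frame[top:bottom]])
    return regions


def early_fusion(region_params):
    """Concatenate per-region parameter vectors into one vector for a single learner."""
    return [value for params in region_params for value in params]


def late_fusion(region_params, region_classifiers):
    """Score every region with its own classifier, then average the scores (assumed combiner)."""
    scores = [clf(params) for clf, params in zip(region_classifiers, region_params)]
    return sum(scores) / len(scores)


# Toy data: a 4x8 frame and one hypothetical (horizontal, vertical) parameter pair per region.
frame = [[x + 8 * y for x in range(8)] for y in range(4)]
regions = tessellate(frame)
region_params = [(0.9, 0.1)] * 6 + [(0.1, 0.1)] * 2  # two regions disturbed locally
pan_classifiers = [lambda p: 1.0 if p[0] > 0.5 else 0.0] * 8  # placeholder for a trained model

fused_vector = early_fusion(region_params)               # 16 values for one classifier
pan_score = late_fusion(region_params, pan_classifiers)  # mean of 8 region decisions
```

In this toy setting the two disturbed regions lower the late fusion score but do not flip the decision, which is the intuition behind using a tessellation to dampen local disturbances.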
4.1 Experiments
Experiments on validation set D indicate that average precision results increase drastically, especially for pan (+51%) and tilt (+28%), see Table 2. The best approach is a late fusion scheme without the usage of context. Relative to other participants we performed quite well in terms of precision, but quite poorly in terms of recall. These results indicate that the base detector is too conservative. However, they also show that any global image based camera work detector has the potential to profit from a tessellation of region-based detectors.
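The reported gains and the MAP column can be checked with a few lines of arithmetic. The sketch below assumes that the percentages compare the Late Fusion row against the Global row of Table 2, which reproduces the reported figures, and that MAP is the unweighted mean over the three camera work types; the helper names are hypothetical.

```python
# Average precision per camera work type, copied from Table 2.
TABLE_2 = {
    "Late Fusion": {"pan": 0.862, "tilt": 0.786, "zoom": 0.862},
    "Global": {"pan": 0.569, "tilt": 0.613, "zoom": 0.813},
}


def mean_average_precision(scores):
    """Unweighted mean of the per-type average precision values."""
    return sum(scores.values()) / len(scores)


def relative_gain_percent(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


late, base = TABLE_2["Late Fusion"], TABLE_2["Global"]
pan_gain = relative_gain_percent(late["pan"], base["pan"])
tilt_gain = relative_gain_percent(late["tilt"], base["tilt"])
late_map = mean_average_precision(late)
base_map = mean_average_precision(base)

print(f"pan {pan_gain:+.0f}%, tilt {tilt_gain:+.0f}%")
print(f"MAP late fusion {late_map:.3f}, global {base_map:.3f}")
```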
5 Lexicon-driven Retrieval
We propose a lexicon-driven retrieval paradigm to equip users with semantic access to multimedia archives. The aim is to retrieve from a multimedia archive S, which is composed of n unique shots {s1, s2, ..., sn}, the best possible answer set in response to a user information need. To that end, we use the 101 concepts in the lexicon as well as the 3 types of camera work for our automatic, manual, and interactive search systems.
5.1 Automatic Search
Our automatic search engine uses only topic text as input [10], as we postulate that it is unreasonable to expect a user to provide a video search system with example videos in a real world scenario. We rely purely on text and the lexicon of 101 semantic concept detectors that we have developed using the semantic pathfinder, see Section 3, to search through the video collection. We developed our search system using the video data, topics, and ground truths from the 2003 and 2004 TRECVID evaluations as a training set.
5.1.1 Indexing Components
Our automatic search system incorporates regular TFIDF-based indices for standard retrieval using the bfx-bfx [24] formula, Latent Semantic Indexing [5] for text retrieval with implicit query expansion, and the 101 different semantic concept indices for query-by-concept. Each index was matched to one or more concepts, or synsets, in the WordNet [13] lexical database on an individual basis, according to whether the concept directly matches the content of the detectors. For example, the detector for the concept baseball finds shots of baseball games, and these shots invariably include baseball players, baseball equipment, and a baseball diamond, so these concepts are also matched. Additional synsets are added to WordNet for semantic concepts that do not have a direct WordNet equivalent.
5.1.2 Automatic Query Interface Selection
We perform the standard stopping and stemming procedures on the topic text (using the SMART stop list [23] with the addition of the words find and shots; and the Porter stemming algorithm [20] respectively). In addition, we perform part-of-speech tagging and chunking using the TreeTagger [26]. This grammatical information is used to identify two different query categorizations: complex vs. simple queries and general vs. specific queries. Any topic containing more than one noun chunk is classified as complex, as it refers to more than one object, while requests containing only a single noun chunk are classified as simple. If a request contains a name (a proper noun) it refers to a specific object, rather than a general category, so we categorize all requests containing proper nouns as specific requests, and all others as general requests.
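The two categorization rules can be written down compactly. The following is a minimal sketch that assumes tagging and chunking have already been done by an external tool; the input representation, the function name, the set of proper-noun tag labels, and the example topics are assumptions made for illustration and may differ from the tagger output used in practice.

```python
# Assumption: proper nouns carry one of these part-of-speech labels (tagger dependent).
PROPER_NOUN_TAGS = {"NP", "NPS", "NNP", "NNPS"}


def categorize_query(noun_chunks):
    """Categorize a topic from its noun chunks.

    noun_chunks: list of chunks, each a list of (token, pos_tag) pairs.
    Returns (complexity, specificity).
    """
    # More than one noun chunk means the topic refers to more than one object.
    # A topic without any noun chunk is not covered by the rule and falls back to "simple".
    complexity = "complex" if len(noun_chunks) > 1 else "simple"

    # Any proper noun makes the request specific rather than general.
    has_proper_noun = any(
        tag in PROPER_NOUN_TAGS for chunk in noun_chunks for _, tag in chunk
    )
    specificity = "specific" if has_proper_noun else "general"
    return complexity, specificity


# Hypothetical, pre-chunked example topics.
named_person = [[("Condoleezza", "NP"), ("Rice", "NP")]]
two_objects = [[("tanks", "NNS")], [("military", "JJ"), ("vehicles", "NNS")]]

print(categorize_query(named_person))
print(categorize_query(two_objects))
```

Keeping the two decisions independent yields four possible categories, one for each combination of simple or complex with general or specific.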
Subsequently, we extract the WordNet words in the topic text through dictionary lookup of noun chunks and nouns. We identify the correct synset for WordNet words with multiple meanings through disambiguation. We evaluated