Abstract The MediaMill TRECVID 2005 Semantic Video Search En(5)

Date: 2026-01-23

The UvA-MediaMill team participated in four tasks. For the detection of camera work (runid: A CAM) we investigate the benefit of using a tessellation of detectors in combination with supervised learning over a standard approach using global image information.

Figure 5: (a) General scheme for early fusion. Output of unimodal analysis is fused before a concept is learned. (b) General scheme for late fusion. Output of unimodal analysis is used to learn separate scores for a concept. After fusion a final score is learned for the concept. We use the conventions of Fig. 1.

semantic representation rather than a feature representation. A big disadvantage of late fusion schemes is their cost in terms of learning effort, as every modality requires a separate supervised learning stage. Moreover, the combined representation requires an additional learning stage. Another disadvantage of the late fusion approach is the potential loss of correlation in mixed feature space. A general scheme for late fusion is illustrated in Fig. 5b.

For the late fusion scheme, we concatenate the probabilistic output score after visual analysis, i.e. $p_i(\omega \mid \vec{v}_i, \vec{q})$, with the probabilistic score resulting from textual analysis, i.e. $p_i(\omega \mid \vec{t}_i, \vec{q})$, into late fusion vector $\vec{l}_i$.
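The late fusion step described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy scores, and the labels are all hypothetical, and a small logistic regression stands in for the second supervised learning stage, which the paper performs with an SVM.

```python
import math


def late_fusion_vector(p_visual, p_textual):
    """Concatenate the per-modality concept probabilities into one vector."""
    for p in (p_visual, p_textual):
        if not 0.0 <= p <= 1.0:
            raise ValueError("scores must be probabilities in [0, 1]")
    return [p_visual, p_textual]


def train_fusion_model(vectors, labels, epochs=2000, lr=0.5):
    """Fit a tiny logistic regression on late fusion vectors.

    Assumption: this replaces the SVM used in the paper, only to keep
    the sketch free of external dependencies.
    """
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            pred = 1.0 / (1.0 + math.exp(-z))
            err = pred - y
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def fused_score(model, vector):
    """Final concept score learned on top of the unimodal scores."""
    weights, bias = model
    z = bias + sum(w * xi for w, xi in zip(weights, vector))
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical (visual score, textual score, concept present?) triples.
toy_shots = [(0.9, 0.8, 1), (0.8, 0.3, 1), (0.2, 0.4, 0), (0.1, 0.1, 0)]
fusion_vectors = [late_fusion_vector(pv, pt) for pv, pt, _ in toy_shots]
fusion_labels = [label for _, _, label in toy_shots]
fusion_model = train_fusion_model(fusion_vectors, fusion_labels)
```

Note that the extra training call at the end is exactly the additional learning stage that makes late fusion more expensive than early fusion.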

3.2.1 Style Analysis

We develop detectors for all four production roles as feature extraction in the style analysis step. We refer to our previous work for specific implementation details of the detectors [31, Electronic Appendix]. We have chosen to convert the output of all style detectors to an ordinal scale, as this allows for elegant fusion.
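As an illustration of what converting detector output to an ordinal scale might look like, the sketch below bins a raw measurement into ordered levels. The function name and the cut points are hypothetical; the text does not state which thresholds or how many levels the detectors use.

```python
from bisect import bisect_right


def to_ordinal(value, thresholds):
    """Map a raw detector output to an ordinal level 0..len(thresholds)."""
    if list(thresholds) != sorted(thresholds):
        raise ValueError("thresholds must be sorted in ascending order")
    # The level is the number of thresholds that are <= value.
    return bisect_right(thresholds, value)


# Hypothetical cut points, in seconds, for the length of a camera shot.
SHOT_LENGTH_THRESHOLDS = [2.0, 5.0, 15.0]

levels = [to_ordinal(s, SHOT_LENGTH_THRESHOLDS) for s in (1.2, 4.0, 5.0, 40.0)]
# levels == [0, 1, 2, 3]
```

Because every detector then emits small integers on a comparable scale, the outputs can be placed side by side in one feature vector without further normalization.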

For the layout L, the length of a camera shot is used as a feature, as this is known to be an informative descriptor for genre [31]. Overlaid text is another informative descriptor. Its presence is detected by a text localization

3.1.6 Content Pathfinder

We learn 101 semantic concepts based on the four vectors resulting from analysis in the content analysis step. Thus $\vec{v}_i$, $\vec{t}_i$, $\vec{e}_i$, and $\vec{l}_i$ serve as the input for our supervised learning module, which learns an optimized SVM model for each semantic concept $\omega$ using 3-fold cross validation with 3 repetitions on training set A. These models are then validated on set D, yielding a best performing model $p_i(\omega \mid \vec{m}_i)$ for all $\omega$ in $\Lambda_S$, where $\vec{m}_i \in \{\vec{v}_i, \vec{t}_i, \vec{e}_i, \vec{l}_i\}$.
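The selection procedure above has two parts: repeated cross validation on the training set, then choosing the best input vector per concept on a held-out set. The sketch below shows both under stated assumptions. The function names, the validation scores, and the "higher is better" convention are hypothetical, and SVM training itself is left out.

```python
import random


def repeated_kfold(n_samples, n_folds=3, n_repeats=3, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold cross validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for every repetition
        folds = [indices[f::n_folds] for f in range(n_folds)]
        for f in range(n_folds):
            test_idx = sorted(folds[f])
            train_idx = sorted(
                i for g in range(n_folds) if g != f for i in folds[g]
            )
            yield train_idx, test_idx


def select_best_input(validation_scores):
    """Return the name of the input vector with the highest validation score."""
    return max(validation_scores, key=validation_scores.get)


# 3 folds x 3 repetitions give 9 train/test splits of the training set.
splits = list(repeated_kfold(n_samples=9))

# Made-up validation scores for one concept, one entry per input vector.
toy_scores = {"visual": 0.31, "textual": 0.12, "early": 0.35, "late": 0.33}
best_input = select_best_input(toy_scores)
```

Running the selection once per concept means different concepts may end up with different input vectors, which matches the idea of a best performing model per concept.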

3.2 Style Analysis Step

In the style analysis step we conceive of a video from the production perspective. Based on the four roles involved in the video production process [31], layout detectors analyze the role of the editor. Content detectors analyze the role of production design. Capture detectors analyze the role of the production recording unit. Finally, context detectors analyze the role of the preproduction team, see Fig. 6.

Figure 6: Feature extraction and classification in the style analysis step, special case of Fig. 2.
