Abstract The MediaMill TRECVID 2005 Semantic Video Search En(7)

Date: 2026-01-23

The UvA-MediaMill team participated in four tasks. For the detection of camera work (runid: A CAM), we investigate the benefit of using a tessellation of detectors in combination with supervised learning over a standard approach using global image information.

Table 1: UvA-MediaMill TRECVID 2005 run comparison for all 10 benchmark concepts. The best path of the semantic path[...]st column indicates results of our visual-only run.

Concept         SP-1    SP-2    SP-3    SP-4    SP-5    SP-6    Visual-only
People walking  0.199   0.172   0.154   0.179   0.101   0.103   0.031
Explosion       0.041   0.027   0.032   0.035   0.036   0.034   0.073
Map             0.142   0.16    0.135   0.123   0.099   0.127   0.138
US flag         0.1     0.063   0.11    0.095   0.072   0.114   0.129
Building        0.235   0.229   0.226   0.225   0.21    0.157   0.269
Waterscape      0.201   0.198   0.137   0.164   0.124   0.136   0.166
Mountain        0.22    0.193   0.182   0.195   0.17    0.128   0.207
Prisoner        0.005   0.001   0       0.001   0.001   0.001   0.003
Sports          0.342   0.225   0.289   0.202   0.137   0.153   0.272
Car             0.213   0.192   0.182   0.201   0.196   0.199   0.233
MAP             0.1698  0.146   0.1447  0.142   0.1146  0.1152  0.1521
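As a quick consistency check on Table 1, the MAP entry of each run should equal the unweighted mean of its ten per-concept average precision values. The sketch below recomputes it and counts the concepts on which SP-1 matches or beats the other semantic pathfinder runs. The function names `mean_average_precision` and `concepts_where_best` are illustrative choices, not part of the MediaMill system; the numbers are copied from the table.

```python
# Per-concept average precision from Table 1, in the order:
# People walking, Explosion, Map, US flag, Building,
# Waterscape, Mountain, Prisoner, Sports, Car.
AP = {
    "SP-1": [0.199, 0.041, 0.142, 0.1, 0.235, 0.201, 0.22, 0.005, 0.342, 0.213],
    "SP-2": [0.172, 0.027, 0.16, 0.063, 0.229, 0.198, 0.193, 0.001, 0.225, 0.192],
    "SP-3": [0.154, 0.032, 0.135, 0.11, 0.226, 0.137, 0.182, 0.0, 0.289, 0.182],
    "SP-4": [0.179, 0.035, 0.123, 0.095, 0.225, 0.164, 0.195, 0.001, 0.202, 0.201],
    "SP-5": [0.101, 0.036, 0.099, 0.072, 0.21, 0.124, 0.17, 0.001, 0.137, 0.196],
    "SP-6": [0.103, 0.034, 0.127, 0.114, 0.157, 0.136, 0.128, 0.001, 0.153, 0.199],
    "Visual-only": [0.031, 0.073, 0.138, 0.129, 0.269, 0.166, 0.207, 0.003, 0.272, 0.233],
}


def mean_average_precision(scores):
    """Unweighted mean of per-concept average precision values."""
    return sum(scores) / len(scores)


def concepts_where_best(run, rivals):
    """Count concepts on which `run` scores at least as high as every rival run."""
    count = 0
    for i, score in enumerate(AP[run]):
        if all(score >= AP[rival][i] for rival in rivals):
            count += 1
    return count


sp_rivals = ["SP-2", "SP-3", "SP-4", "SP-5", "SP-6"]
for name, scores in AP.items():
    print(f"{name:12s} MAP = {mean_average_precision(scores):.4f}")
print("SP-1 best among SP runs on", concepts_where_best("SP-1", sp_rivals), "concepts")
```

Restricting the comparison to the six SP runs reproduces the "8 out of 10" figure quoted in the text; the visual-only run is left out of that count because it is not one of the prioritized paths.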

[...] fusion, and late fusion. Results confirm the importance of visual analysis for generic concept detection. Text analysis yields the best approach for only 8 concepts, whereas visual analysis yields the best performance for as many as 45 concepts. Fusion is optimal for the remaining 48 concepts, with a clear advantage for early fusion (33 concepts) over late fusion (15 concepts).
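The per-concept comparison above amounts to picking, for every concept, the analysis variant with the highest validation score and then tallying how often each variant wins. A minimal sketch of that bookkeeping follows. The concept names and validation scores are invented placeholders, and `select_best_strategy` is a hypothetical helper rather than the authors' implementation.

```python
from collections import Counter


def select_best_strategy(validation_ap):
    """Map each concept to the strategy with the highest validation score."""
    return {
        concept: max(scores, key=scores.get)
        for concept, scores in validation_ap.items()
    }


# Hypothetical validation-set average precision per concept and strategy.
validation_ap = {
    "concept_a": {"text": 0.30, "visual": 0.55, "early_fusion": 0.61, "late_fusion": 0.58},
    "concept_b": {"text": 0.12, "visual": 0.47, "early_fusion": 0.41, "late_fusion": 0.44},
    "concept_c": {"text": 0.35, "visual": 0.20, "early_fusion": 0.29, "late_fusion": 0.31},
}

best = select_best_strategy(validation_ap)
tally = Counter(best.values())  # how many concepts each strategy wins

for concept, strategy in best.items():
    print(concept, "->", strategy)
print(dict(tally))
```

Applied to a full concept lexicon, the tally would yield counts of the kind reported in the text (text, visual, early fusion, late fusion).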

The style analysis step again confirms the importance of including professional television production facets for semantic video indexing, especially for concepts that share many similarities in their production process, such as anchors, monologues, and entertainment. For other concepts, such as tennis and baseball, content is more decisive. Thus some concepts are just content, whereas others are pure production style.

We boost concept detection performance further by the usage of context. The pathfinder again exploits variation in performance for the various paths to select an optimal pathway. The results demonstrate the virtue of the semantic pathfinder. Concepts are divided by the analysis step after which they achieve best performance. Based on these results we conclude that an optimal strategy for generic multimedia analysis is one that learns from the training set on a per-concept basis which tactic to follow.

3.4.1 Pathfinder Runs

We submitted six paths for each benchmark concept, prioritized according to validation set performance. For the concept explosion, for example, the optimal path (SP-1) indicates that visual-only analysis is the best performer. However, in most cases the best path is a consecutive path of content, style, and context. We report the official TRECVID benchmark results in Table 1.

The results show that the pathfinder mechanism is a good way to estimate the best performing analysis path. The SP-1 run containing the optimal path is indeed the best performer in 8 out of 10 cases. Overall, this is also our best performing run. However, what strikes us most is that average precision results are much lower than can be expected based on validation set performance reported in Fig. 8. This may indicate that despite the use of separate training and validation sets we are still overfitting the data. A point of concern here is the random assignment of shots to the separate training and validation sets. This may bias the classifiers, as it is possible that similar news items from several channels are distributed to separate sets.

For two concepts (map and explosion) performance suffered from misinterpretation of correct concepts. Had we included examples of news anchors with maps in the background of the studio setting (for the map concept) and smoke (for explosion) in our training sets, results would be higher. When looking at the judged results, we also found that three concepts (waterscape, mountain, and car) are dominated by commercials. We do not perform well on commercial detection. This can be explained by the fact that we take 1 frame per second out of the video in the visual analysis. Sampling in this manner will select different frames for the same commercials that reappear at different time stamps in a video. We anticipate that improvement in frame sampling yields increased robustness for the entire pathfinder.

3.4.2 Visual-only Run

Validation set performance in Fig. 8 indicates that our visual analysis step performs quite well. To determine the contribution of the visual analysis step, we therefore submitted a visual-only run. This involved training a Support Vector Machine on the vector of contextures as introduced in section 3.1.1. We trained an SVM for each of the 10 concepts of the concept detection task. An experiment for recognizing proto-concepts was submitted by another group [37]. The visual features in the submitted visual-only run are slightly different from the visual features in the semantic pathfinder system. This difference is caused by ongoing development on the visual analysis. Specifically, we improved the Weibull fit to be more robust and we added the proto-concept car. The newer version of the visual analysis was not incorporated in the semantic pathfinder. It was not integrated because visual analysis is the first step in the semantic path. Thus, a change in the visual analysis means that all further paths would have to be recomputed. However, for a visual-only run, the improvements were feasible to compute.
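The frame-sampling argument made in Section 3.4.1 (sampling 1 frame per second picks different frames from the same commercial when it reappears at another time stamp) can be illustrated with a small calculation. The sketch below assumes frames are sampled at whole multiples of the sampling period on the video's global clock; the start times and clip duration are made-up values, and `sampled_offsets` is a hypothetical helper.

```python
import math


def sampled_offsets(start_time, duration, sample_period=1.0):
    """Offsets (seconds from the clip's own start) of the frames that a
    fixed-rate sampler picks inside a clip beginning at `start_time`."""
    k = math.ceil(start_time / sample_period)  # first sample at or after the clip start
    offsets = []
    while k * sample_period < start_time + duration:
        offsets.append(round(k * sample_period - start_time, 3))
        k += 1
    return offsets


# The same 3-second commercial, aired twice at hypothetical time stamps.
first_airing = sampled_offsets(12.3, 3.0)
second_airing = sampled_offsets(47.8, 3.0)

print("first airing :", first_airing)
print("second airing:", second_airing)
```

Because the two airings start at different fractions of a second, the sampler lands on disjoint positions within the clip, so the extracted frames (and the features computed from them) need not match even though the underlying footage is identical.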
