Abstract The MediaMill TRECVID 2005 Semantic Video Search En(13)
Date: 2025-05-05
UvA-MediaMill team participated in four tasks. For the detection of camera work (runid: A CAM) we investigate the benefit of using a tessellation of detectors in combination with supervised learning over a standard approach using global image information.
These are conflicting requirements. For example, to satisfy the overview requirement, the number of representative keyframes should be increased. Because of the fixed size of the display space, the more keyframes are shown, the higher the chance of overlap, so the visibility requirement will be violated. Moreover, when images are spread out from each other to preserve visibility, the original relations between them change, i.e. structure is not preserved. Therefore, cost functions for each requirement and balancing functions between them are proposed.
Active learning algorithms mostly use support vector machines (SVM) as a feedback learning base [38, 33]. In interactive search, using this approach, the system first shows some images and asks the user to label those as positive and/or negative. The learning is either based on both positive and negative examples (known as two-class SVM) or on positive or negative examples only (known as one-class SVM). These examples are used to train the SVM to learn classifiers separating positive and negative examples. The process is repeated until the performance satisfies given constraints. We have compared the two approaches; the results show that one-class SVM generally performs better than two-class SVM and is also faster in returning the result. We therefore concentrate on the use of one-class SVM for learning from relevance feedback.
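The feedback cycle described above (show items, collect labels, retrain on positives, repeat until a stopping constraint holds) can be sketched as follows. This is only a minimal illustration under stated assumptions: the learner is a simple centroid-and-radius model standing in for a one-class SVM, not an SVM itself, and `CentroidOneClass`, `feedback_loop`, and the `ask_user` callback are hypothetical names. Showing the items whose score is nearest zero is a common active-learning heuristic chosen here for illustration.

```python
import math


class CentroidOneClass:
    """Stand-in one-class learner (NOT an SVM): a sphere around the positives."""

    def fit(self, positives):
        n = len(positives)
        self.center = tuple(sum(axis) / n for axis in zip(*positives))
        self.radius = max(math.dist(p, self.center) for p in positives)
        return self

    def score(self, x):
        # > 0 inside the sphere, < 0 outside, 0 on the boundary.
        return self.radius - math.dist(x, self.center)


def feedback_loop(features, seed_positives, ask_user, batch_size=2, max_rounds=3):
    """Alternate between training on positives only and asking for labels.

    features: dict mapping keyframe id -> feature tuple (assumed layout).
    ask_user: hypothetical callback, id -> True if the user marks it relevant.
    Returns the set of ids labeled positive when the round limit is reached.
    """
    positives = set(seed_positives)
    labeled = set(seed_positives)
    for _ in range(max_rounds):
        model = CentroidOneClass().fit([features[i] for i in positives])
        unlabeled = [i for i in features if i not in labeled]
        if not unlabeled:
            break
        # Show the items nearest the model boundary (smallest |score|).
        batch = sorted(unlabeled, key=lambda i: abs(model.score(features[i])))
        for i in batch[:batch_size]:
            labeled.add(i)
            if ask_user(i):
                positives.add(i)
    return positives


# Toy data: four keyframes near the origin are relevant, "d" is not.
features = {"a": (0, 0), "b": (0.2, 0), "c": (0, 0.3), "d": (5, 5), "e": (0.1, 0.1)}
relevant = {"a", "b", "c", "e"}
found = feedback_loop(features, ["a"], ask_user=lambda i: i in relevant)
```

The stopping rule here is a fixed number of rounds; a time limit or an explicit user stop would slot into the same loop.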
The combination of the two techniques is drawn into one scheme (see Fig. 11). The offline stage contains feature extraction and similarity function selection. The ISOSNE from [15] is applied to project the collection from the high-dimensional space to the visualization space. The next step decides which set of keyframes will be used as a representative one. To do so, we employ the k-means algorithm to cluster keyframes into a fixed number of groups. A set of keyframes selected from different [...] The information of each keyframe belonging to a certain group, and its position in the visualization space, is stored as offline data.
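The clustering step of the offline stage might look like the sketch below. It runs plain Lloyd's k-means on 2D positions and then picks one representative per cluster. The rule used to pick representatives (the member nearest its centroid) is an assumption made for illustration, as are the function names and the toy coordinates; the source does not state how representatives are chosen.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


def representatives(points, centroids, labels):
    """Assumed rule: per cluster, the index of the member nearest the centroid."""
    best = {}
    for index, (p, label) in enumerate(zip(points, labels)):
        d = squared_distance(p, centroids[label])
        if label not in best or d < best[label][0]:
            best[label] = (d, index)
    return {label: index for label, (d, index) in best.items()}


# Toy visualization-space positions: two well-separated groups of keyframes.
points = [
    (0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.3, 0.3),
    (10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (10.3, 10.3),
]
centroids, labels = kmeans(points, k=2)
reps = representatives(points, centroids, labels)
```

The cluster label of every keyframe and its position are exactly the pieces of data the text says are stored for the interactive stage.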
In the interactive stage, query results are the input for starting up the search. First, the set of top k keyframes from the query results is displayed. The user then uses the system to explore the collection and find relevant keyframes. In particular, if the currently displayed set contains any positive one, the user selects that keyframe and goes into the corresponding cluster with the expectation of finding more similar ones. With the advantage of similarity-based visualization, instead of clicking on an individual keyframe for labeling, the system supports the user with mouse dragging to draw the area of keyframes in the same category. This means that when the user finds a group of relevant keyframes, he/she draws a rectangle around those and marks them all as positive examples. Therefore, our system can reduce the number of actions from the user with the same amount of information for relevance feedback. In case there is no positive keyframe in the current set, the user then asks the system to display another set, which contains the next k keyframes from the query results. Keyframes which are selected as training examples or displayed before will not be shown again.

Figure 11: Scheme of an interactive search in the GalaxyBrowser with the combination of active learning and similarity-based visualization.
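Two mechanics of the interactive stage lend themselves to a short sketch: marking every keyframe inside a dragged rectangle as positive, and paging through the query results k at a time without repeating keyframes. The function names, the id-to-position dictionary, and the coordinates below are hypothetical and only illustrate the idea; they are not taken from the GalaxyBrowser.

```python
def select_in_rectangle(positions, corner_a, corner_b):
    """Ids of keyframes whose 2D position lies inside the dragged rectangle.

    The two corners may be given in any order, as with a mouse drag.
    """
    x_min, x_max = sorted((corner_a[0], corner_b[0]))
    y_min, y_max = sorted((corner_a[1], corner_b[1]))
    return {
        key for key, (x, y) in positions.items()
        if x_min <= x <= x_max and y_min <= y <= y_max
    }


def next_page(ranked_ids, seen, k):
    """Next k keyframes from the ranked query results, skipping seen ones.

    `seen` is updated in place so that nothing is displayed twice.
    """
    page = [key for key in ranked_ids if key not in seen][:k]
    seen.update(page)
    return page


# Toy positions in the visualization space.
positions = {"kf1": (0.10, 0.20), "kf2": (0.15, 0.25), "kf3": (0.80, 0.90)}

# One drag from lower right to upper left labels two keyframes at once.
positives = select_in_rectangle(positions, (0.30, 0.30), (0.00, 0.00))

# Training examples count as seen, so paging skips them.
ranked = ["kf1", "kf2", "kf3", "kf4", "kf5"]
seen = set(positives)
first_page = next_page(ranked, seen, k=2)
second_page = next_page(ranked, seen, k=2)
```

A single rectangle yields as many labels as there are keyframes inside it, which is the reduction in user actions the text refers to.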
In the learning step, when a certain number of training examples are provided, the SVM trains the support vectors. We use the well-known SVM library developed by Chang and Lin [4], which provides a one-class implementation. After the learning, a set of images closest to the border is returned. The process is repeated until a certain constraint is satisfied, such as the number of iterations, a time limitation, or simply that the user does not want to give any more feedback. At that point, the system returns the final result containing keyframes with maximum distances to the border, as they are assumed to have high probabilities of being relevant to the search topic.

5.4.3 SphereBrowser
To visualize the thread structure a so-called SphereBrowser [22] was developed. The browser displays two orthogonal dimensions. The horizontal one is the time-thread, using the original TRECVID shot sequence. The vertical dimension contains for each shot cluster-threads of semantically similar footage. The GUI gives the user a spherical layout of nearby shots on the screen [...] Using the mouse and arrow keys the user can then navigate either through time or through related shots, selecting relevant shots when found. Also selecting (parts