Abstract The MediaMill TRECVID 2005 Semantic Video Search En(13)

Date: 2025-05-05

UvA-MediaMill team participated in four tasks. For the detection of camera work (runid: A CAM) we investigate the benefit of using a tessellation of detectors in combination with supervised learning over a standard approach using global image information.

These are conflicting requirements. For example, to satisfy the overview requirement, the number of representative keyframes should be increased. Because of the fixed size of the display space, the more keyframes there are, the higher the chance of overlap, and hence the visibility requirement will be violated. Moreover, when images are spread out from each other to preserve visibility, the original relations between them change, i.e. structure is not preserved. Therefore, cost functions for each requirement and balancing functions between them are proposed.
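The trade-off between visibility and structure can be illustrated with a small sketch. The text does not give the actual cost functions, so everything below is an assumption made for illustration: visibility is measured as total pairwise thumbnail overlap, structure as the squared change in pairwise distances, and `weight` is a hypothetical balancing parameter.

```python
from itertools import combinations
from math import dist


def overlap_area(p, q, size):
    """Overlap area of two axis-aligned square thumbnails centred at p and q."""
    w = max(0.0, size - abs(p[0] - q[0]))
    h = max(0.0, size - abs(p[1] - q[1]))
    return w * h


def visibility_cost(positions, size):
    """Total pairwise overlap; 0 means every keyframe is fully visible."""
    return sum(overlap_area(p, q, size) for p, q in combinations(positions, 2))


def structure_cost(original, displayed):
    """Sum of squared changes in pairwise distance after moving keyframes."""
    pairs = combinations(range(len(original)), 2)
    return sum(
        (dist(original[i], original[j]) - dist(displayed[i], displayed[j])) ** 2
        for i, j in pairs
    )


def balanced_cost(original, displayed, size, weight=0.5):
    """Hypothetical balance: weight in [0, 1] trades visibility against structure."""
    return (weight * visibility_cost(displayed, size)
            + (1.0 - weight) * structure_cost(original, displayed))
```

In this sketch, moving two overlapping thumbnails apart lowers the visibility term but raises the structure term, which is the conflict the paragraph describes.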

Active learning algorithms mostly use support vector machines (SVM) as a feedback learning base [38, 33]. In interactive search using this approach, the system first shows some images and asks the user to label them as positive and/or negative. The learning is based either on both positive and negative examples (known as two-class SVM) or on positive or negative ones only (known as one-class SVM). These examples are used to train the SVM to learn classifiers separating positive and negative examples. The process is repeated until the performance satisfies given constraints. We compared the two approaches; the results show that one-class SVM generally performs better than two-class SVM and is also faster in returning the result. We therefore concentrate on the use of one-class SVM for learning the relevance feedback.
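One round of one-class relevance feedback might look like the sketch below. It uses scikit-learn's `OneClassSVM` as a convenient stand-in, not the authors' own code; the feature vectors, the function name `rank_by_relevance`, and the `gamma` and `nu` values are all assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def rank_by_relevance(positives, candidates, gamma=0.5, nu=0.1):
    """Fit a one-class SVM on positive examples and rank unlabeled candidates.

    positives:  array of shape (n_pos, n_features), user-labeled relevant items
    candidates: array of shape (n_cand, n_features), items not yet labeled
    Returns (order, scores), where order lists candidate indices from most to
    least relevant according to the signed distance to the learned border.
    """
    model = OneClassSVM(kernel="rbf", gamma=gamma, nu=nu)
    model.fit(positives)
    # Positive scores lie inside the learned region, negative scores outside.
    scores = model.decision_function(candidates)
    order = np.argsort(-scores)
    return order, scores
```

In an interactive loop, the newly labeled positives would be appended to `positives` and the function called again until the user stops giving feedback.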

The combination of the two techniques is drawn into one scheme (see Fig. 11). The offline stage contains feature extraction and similarity function selection. The ISOSNE from [15] is applied to project the collection from the high-dimensional space to the visualization space. The next step decides which set of keyframes will be used as a representative one. To do so, we employ the k-means algorithm to cluster keyframes into a fixed number of groups. A set of keyframes selected from different [...] The information about each keyframe belonging to a certain group, and its position in the visualization space, is stored as offline data.
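The clustering step of the offline stage can be sketched in plain Python as follows. This is a minimal k-means over already-projected 2D positions. Picking the keyframe nearest to each centroid as the representative is an assumption for illustration, since the text does not say how representatives are chosen, and the function names are hypothetical.

```python
import random
from math import dist


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points]


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points (tuples of coordinates) into k groups."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(points, centroids), centroids


def representatives(points, labels, centroids):
    """Assumed rule: per cluster, pick the index of the point nearest the centroid."""
    reps = {}
    for c, centre in enumerate(centroids):
        members = [i for i, label in enumerate(labels) if label == c]
        if members:
            reps[c] = min(members, key=lambda i: dist(points[i], centre))
    return reps
```

The returned labels and the display positions correspond to what the paragraph calls offline data: each keyframe's group and its location in the visualization space.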

In the interactive stage, query results are the input for starting up the search. First, the set of top k keyframes from the query results is displayed. The user then uses the system to explore the collection and find relevant keyframes. In particular, if the currently displayed set contains a positive keyframe, the user selects it and goes into the corresponding cluster, expecting to find more similar ones. With the advantage of similarity-based visualization, instead of clicking on an individual keyframe for labeling, the system lets the user drag the mouse to draw an area around keyframes in the same category. This means that when the user finds a group of relevant keyframes, he/she draws a rectangle around them and marks them all as positive examples. Our system can therefore reduce the number of user actions while collecting the same amount of information for relevance feedback. If there is no positive keyframe in the current set, the user asks the system to display another set, which contains the next k keyframes from the query results. Keyframes that were selected as training examples or displayed before will not be shown again.

Figure 11: Scheme of an interactive search in the GalaxyBrowser with the combination of active learning and similarity-based visualization.
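The two user-facing operations of the interactive stage, rectangle labeling and paging through the query results, can be sketched as below. The data structures (a dict of display positions keyed by keyframe id, a set of already seen ids) and the function names are assumptions for illustration only.

```python
def select_in_rectangle(positions, corner_a, corner_b):
    """Return ids of keyframes whose display position lies inside the rectangle.

    positions: dict mapping keyframe id -> (x, y) in the visualization space
    corner_a, corner_b: opposite corners of the dragged rectangle, in any order
    """
    x_lo, x_hi = sorted((corner_a[0], corner_b[0]))
    y_lo, y_hi = sorted((corner_a[1], corner_b[1]))
    return [kid for kid, (x, y) in positions.items()
            if x_lo <= x <= x_hi and y_lo <= y <= y_hi]


def next_page(ranked_ids, seen, k):
    """Return the next k ranked keyframes not yet shown or labeled.

    The returned ids are added to `seen` so they are never displayed again.
    """
    page = [kid for kid in ranked_ids if kid not in seen][:k]
    seen.update(page)
    return page
```

A single drag thus labels every keyframe inside the rectangle at once, and adding labeled ids to `seen` keeps both training examples and earlier pages out of later displays.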

In the learning step, when a certain number of training examples are provided, the SVM trains the support vectors. We use the well-known SVM library developed by Chang and Lin [4], which provides a one-class implementation. After the learning, a set of images closest to the border is returned. The process is repeated until a certain constraint is satisfied, such as the number of iterations, a time limit, or simply that the user does not want to give any more feedback. At that point, the system returns the final result containing keyframes with maximum distances to the border, as they are assumed to have high probabilities of being relevant to the search topic.

5.4.3 SphereBrowser

To visualize the thread structure, the so-called SphereBrowser [22] was developed. The browser displays two orthogonal dimensions. The horizontal one is the time-thread, using the original TRECVID shot sequence. The vertical dimension contains, for each shot, cluster-threads of semantically similar footage. The GUI gives the user a spherical layout of nearby shots on the screen [...]. Using the mouse and arrow keys, the user can then navigate either through time or through related shots, selecting relevant shots when found. Also selecting (parts …
