Abstract The MediaMill TRECVID 2005 Semantic Video Search En(3)
Date: 2026-01-23
UvA-MediaMill team participated in four tasks. For the detection of camera work (runid: A CAM) we investigate the benefit of using a tessellation of detectors in combination with supervised learning over a standard approach using global image information.
Figure 3: The semantic pathfinder for one concept, using the conventions of Fig. 1.
concepts like graphics) do not add much. In contrast, more complex events, like people walking, profit from incremental adaptation of the analysis to the intention of the author. The virtue of the semantic pathfinder is its ability to find the best path of analysis steps on a per-concept basis. An overview of the semantic pathfinder is given in Fig. 3.
3.1 Content Analysis Step
We view video in the content analysis step from the data perspective. In general, three data streams or modalities exist in video, namely the auditory modality, the textual modality, and the visual one. As speech is often the most informative part of the auditory source, we focus on visual features, and on textual features obtained from transcribed speech. After modality-specific data processing, we combine features in a multimodal representation using early fusion and late fusion [32].

3.1.1 Visual Analysis
Modeling visual data heavily relies on qualitative features. Good features describe the relevant information in an image while reducing the amount of data representing the image. To achieve this goal, we use Wiccest features as introduced in [6]. Wiccest features combine color invariance with natural image statistics. Color invariance aims to remove accidental lighting conditions, while natural image statistics efficiently represent image data.
Color invariance aims at keeping the measurements constant under varying intensity, viewpoint and shading. In [7] several color invariants are described. We use the W invariant that normalizes the spectral information with the energy. This normalization makes the measurements independent of illumination changes under uniform lighting conditions.
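The effect of normalizing by the energy can be illustrated with a small numeric sketch. This is not the W invariant as defined in [7]: it assumes that a single-channel intensity image stands in for the energy and that a plain finite-difference derivative stands in for the spectral measurement. The function name `normalized_derivative` and the `eps` guard are hypothetical.

```python
import numpy as np

def normalized_derivative(intensity, eps=1e-8):
    """Horizontal derivative divided by the local intensity.

    Assumption: intensity is used as a stand-in for the energy term;
    eps only guards against division by zero.
    """
    dx = np.gradient(intensity, axis=1)
    return dx / (intensity + eps)

rng = np.random.default_rng(0)
image = rng.uniform(0.1, 1.0, size=(8, 8))
brighter = 3.0 * image  # same scene under a uniformly stronger light source

w_a = normalized_derivative(image)
w_b = normalized_derivative(brighter)

# Raw derivatives change with the illumination intensity,
# the normalized ones stay (numerically) the same.
raw_equal = np.allclose(np.gradient(image, axis=1), np.gradient(brighter, axis=1))
norm_equal = np.allclose(w_a, w_b, atol=1e-6)
print(raw_equal, norm_equal)
```

Under these assumptions a uniform scaling of the illumination cancels in the ratio, which is the property the text attributes to the normalization.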
When modeling scenes, edges are highly informative. Edges reveal where one region ends and another begins. Thus, an edge has at least twice the information content of a uniformly colored patch, since an edge contains information about all regions it divides. Besides serving as region boundaries, an ensemble of edges describes texture information. Texture characterizes the material an object is made of. Moreover, a compilation of cluttered objects can be described as texture information. Therefore, a scene can be modeled with textured regions.

Figure 4: An example of dividing an image up in overlapping regions. In this particular example, the region size is 1/2 of the image size for both the x-dimension and y-dimension. The regions are uniformly sampled across the image with a step size of half a region. Sampling in this manner identifies nine overlapping regions.
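The region sampling described in the caption of Figure 4 can be sketched as follows. This is a minimal illustration, assuming pixel coordinates and integer rounding; the function name `overlapping_regions` and the example image size are hypothetical and not taken from the paper.

```python
def overlapping_regions(width, height, fraction=0.5):
    """Return (left, top, region_width, region_height) tuples.

    Regions cover `fraction` of the image in each dimension and are
    sampled with a step of half a region, so neighbours overlap by half.
    """
    region_w = int(width * fraction)
    region_h = int(height * fraction)
    step_x = max(1, region_w // 2)
    step_y = max(1, region_h // 2)
    regions = []
    for top in range(0, height - region_h + 1, step_y):
        for left in range(0, width - region_w + 1, step_x):
            regions.append((left, top, region_w, region_h))
    return regions

regions = overlapping_regions(320, 240)
print(len(regions))  # 3 positions per dimension, 9 regions in total
```

With a region size of half the image and a step of half a region, each dimension admits three positions, which gives the nine regions mentioned in the caption.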
Texture is described by the distribution of edges at a certain region in an image. Hence, a histogram of the responses of a Gaussian derivative filter represents the edge statistics. Since there are more non-edge pixels than there are edge pixels, the distribution of edge responses for natural images always has a peak around zero, i.e., many pixels have no edge responses. Additionally, the shape of the tails of the distribution is often in-between a power-law and a Gaussian distribution. This specific distribution can be well modeled with an integrated Weibull distribution [8]. This distribution is given by

$$f(r) = \frac{\gamma}{2\,\gamma^{1/\gamma}\,\beta\,\Gamma(1/\gamma)} \exp\left\{ -\frac{1}{\gamma} \left| \frac{r-\mu}{\beta} \right|^{\gamma} \right\}, \quad (1)$$

where r is the edge response to the Gaussian derivative filter and Γ(·) is the complete Gamma function, $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$. The parameter β denotes the width of the distribution, the parameter γ represents the 'peakness' of the distribution, and the parameter µ denotes the origin of the distribution.
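As a sanity check on Eq. (1), the density can be evaluated numerically. The sketch below assumes NumPy is available; the function name `integrated_weibull_pdf` and the chosen parameter values are hypothetical. For γ = 2 the expression reduces to a Gaussian with standard deviation β, and for γ = 1 to a Laplace distribution, which matches the statement that the tails lie between a power-law and a Gaussian.

```python
import math
import numpy as np

def integrated_weibull_pdf(r, beta, gamma, mu=0.0):
    """Evaluate Eq. (1) for edge responses r (beta > 0, gamma > 0)."""
    r = np.asarray(r, dtype=float)
    norm = gamma / (2.0 * gamma ** (1.0 / gamma) * beta * math.gamma(1.0 / gamma))
    return norm * np.exp(-(1.0 / gamma) * np.abs((r - mu) / beta) ** gamma)

r = np.linspace(-50.0, 50.0, 200001)
dr = r[1] - r[0]

# The density should integrate to one for any valid beta and gamma.
areas = [float(np.sum(integrated_weibull_pdf(r, beta=1.5, gamma=g)) * dr)
         for g in (1.0, 2.0)]

# For gamma = 2 the density coincides with a Gaussian of standard deviation beta.
beta = 1.5
gaussian = np.exp(-0.5 * (r / beta) ** 2) / (beta * math.sqrt(2.0 * math.pi))
matches_gaussian = np.allclose(integrated_weibull_pdf(r, beta, 2.0), gaussian)
print(areas, matches_gaussian)
```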
To assess the similarity between Wiccest features, a goodness-of-fit test is utilized. The measure is based on the integrated squared error between the two cumulative distributions, which is obtained by a Cramér-von Mises measure. For two Weibull distributions with parameters F_β, F_γ and G_β, G_γ a first order Taylor approximation of the Cramér-von Mises statistic yields the log difference between the parameters. Therefore, a measure of similarity between two Weibull distributions F and G is given by the ratio of the parameters,

$$W^2(F,G) = \frac{\min(F_\beta, G_\beta)\,\min(F_\gamma, G_\gamma)}{\max(F_\beta, G_\beta)\,\max(F_\gamma, G_\gamma)}. \quad (2)$$

The µ parameter represents the mode of the distribution. The position of the mode is influenced by uneven illumination and colored illumination. Hence, to achieve color constancy the values for µ may be ignored.
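Eq. (2) translates directly into a few lines of code. The sketch below assumes strictly positive β and γ values; the function name `weibull_similarity` and the example parameters are hypothetical. In line with the text, µ is left out of the comparison.

```python
def weibull_similarity(f_beta, f_gamma, g_beta, g_gamma):
    """Similarity of two Weibull fits following Eq. (2).

    Returns a value in (0, 1]; 1 means identical beta and gamma.
    Assumes all parameters are strictly positive.
    """
    beta_ratio = min(f_beta, g_beta) / max(f_beta, g_beta)
    gamma_ratio = min(f_gamma, g_gamma) / max(f_gamma, g_gamma)
    return beta_ratio * gamma_ratio

same = weibull_similarity(1.5, 0.8, 1.5, 0.8)
different = weibull_similarity(1.0, 2.0, 2.0, 1.0)
print(same, different)  # 1.0 0.25
```

Because only ratios of the parameters enter, the measure is symmetric in F and G and does not change when both distributions are rescaled by the same factor.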
In summary, Wiccest features provide a color invariant texture descriptor. Moreover, the features rely heavily on natural image statistics to compactly represent the visual information.