Abstract The MediaMill TRECVID 2005 Semantic Video Search En(3)

Date: 2026-01-23

The UvA-MediaMill team participated in four tasks. For the detection of camera work (runid: A CAM) we investigate the benefit of using a tessellation of detectors in combination with supervised learning over a standard approach using global image information.

Figure 3: The semantic pathfinder for one concept, using the conventions of Fig. 1.

concepts like graphics) do not add much. In contrast, more complex events, like people walking, profit from incremental adaptation of the analysis to the intention of the author. The virtue of the semantic pathfinder is its ability to find the best path of analysis steps on a per-concept basis. An overview of the semantic pathfinder is given in Fig. 3.

3.1 Content Analysis Step

We view video in the content analysis step from the data perspective. In general, three data streams or modalities exist in video, namely the auditory modality, the textual modality, and the visual one. As speech is often the most informative part of the auditory source, we focus on visual features, and on textual features obtained from transcribed speech. After modality-specific data processing, we combine features in a multimodal representation using early fusion and late fusion [32].

3.1.1 Visual Analysis

Modeling visual data heavily relies on qualitative features. Good features describe the relevant information in an image while reducing the amount of data representing the image. To achieve this goal, we use Wiccest features as introduced in [6]. Wiccest features combine color invariance with natural image statistics. Color invariance aims to remove accidental lighting conditions, while natural image statistics efficiently represent image data.

Color invariance aims at keeping the measurements constant under varying intensity, viewpoint and shading. In [7] several color invariants are described. We use the W invariant that normalizes the spectral information with the energy. This normalization makes the measurements independent of illumination changes under uniform lighting conditions.
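The effect of normalizing by the energy can be illustrated with a small one-dimensional sketch. It assumes the invariant takes the form of a spatial derivative divided by the local intensity; the text above only states that the spectral information is normalized by the energy, so this form, the function name `w_invariant`, and the sample values are illustrative assumptions rather than the exact definition from [7].

```python
def w_invariant(signal, eps=1e-9):
    """Central-difference derivative divided by the local intensity.

    Assumption: the invariant is modeled as E_x / E. The source only says
    that the spectral information is normalized by the energy.
    """
    out = []
    for i in range(1, len(signal) - 1):
        derivative = (signal[i + 1] - signal[i - 1]) / 2.0
        out.append(derivative / (signal[i] + eps))  # eps guards against division by zero
    return out


row = [10.0, 12.0, 20.0, 40.0, 42.0]  # hypothetical intensities along one image row
brighter = [3.0 * v for v in row]     # same scene under three times stronger uniform light

plain = w_invariant(row)
scaled = w_invariant(brighter)
```

Because both the derivative and the intensity scale with the same factor, `plain` and `scaled` agree up to the small `eps` term, which is the sense in which the normalized measurement does not depend on a uniform change in illumination intensity.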

When modeling scenes, edges are highly informative. Edges reveal where one region ends and another begins. Thus, an edge has at least twice the information content than a uniformly colored patch, since an edge contains information about all regions it divides. Besides serving as region boundaries, an ensemble of edges describes texture information. Texture characterizes the material an object is made of. Moreover, a compilation of cluttered objects can be described as texture information. Therefore, a scene can be modeled with textured regions.

Figure 4: An example of dividing an image up in overlapping regions. In this particular example, the region size is 1/2 of the image size for both the x-dimension and y-dimension. The regions are uniformly sampled across the image with a step size of half a region. Sampling in this manner identifies nine overlapping regions.
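The region layout of the Figure 4 example can be reproduced with a short sketch. It assumes a region spans half of each image dimension and that the step is half a region, which is consistent with the nine regions named in the caption; the function name `overlapping_regions` and the frame size are hypothetical.

```python
def overlapping_regions(width, height, fraction=0.5):
    """Return (left, top, right, bottom) boxes that tile the image with overlap.

    Each region spans `fraction` of a dimension and regions are sampled with
    a step of half a region (assumed from the Figure 4 example).
    """
    region_w, region_h = width * fraction, height * fraction
    step_x, step_y = region_w / 2.0, region_h / 2.0
    boxes = []
    top = 0.0
    while top + region_h <= height + 1e-9:       # small tolerance for float rounding
        left = 0.0
        while left + region_w <= width + 1e-9:
            boxes.append((left, top, left + region_w, top + region_h))
            left += step_x
        top += step_y
    return boxes


boxes = overlapping_regions(352, 240)  # hypothetical frame size in pixels
```

With these settings there are three region positions per dimension, so `boxes` holds nine overlapping regions. A texture descriptor would then be computed per region rather than over the whole frame.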

Texture is described by the distribution of edges at a certain region in an image. Hence, a histogram of a Gaussian derivative filter represents the edge statistics. Since there are more non-edge pixels than there are edge pixels, the distribution of edge responses for natural images always has a peak around zero, i.e., many pixels have no edge responses. Additionally, the shape of the tails of the distribution is often in-between a power-law and a Gaussian distribution. This specific distribution can be well modeled with an integrated Weibull distribution [8]. This distribution is given by

$$f(r) = \frac{\gamma}{2\gamma^{\frac{1}{\gamma}}\beta\,\Gamma\!\left(\frac{1}{\gamma}\right)} \exp\left\{-\frac{1}{\gamma}\left|\frac{r-\mu}{\beta}\right|^{\gamma}\right\}, \tag{1}$$

where $r$ is the edge response to the Gaussian derivative filter and $\Gamma(\cdot)$ is the complete Gamma function, $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\,dt$. The parameter $\beta$ denotes the width of the distribution, the parameter $\gamma$ represents the 'peakness' of the distribution, and the parameter $\mu$ denotes the origin of the distribution.
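As a sanity check on Eq. (1), the density can be evaluated numerically. The sketch below assumes the standard integrated Weibull form with width, peakness and origin parameters; the function names, the integration range, and the step count are illustrative choices, not part of the original system.

```python
import math


def integrated_weibull(r, beta, gamma, mu=0.0):
    """Density of the integrated Weibull distribution at edge response r."""
    norm = gamma / (2.0 * gamma ** (1.0 / gamma) * beta * math.gamma(1.0 / gamma))
    return norm * math.exp(-abs((r - mu) / beta) ** gamma / gamma)


def trapezoid_area(beta, gamma, lo=-60.0, hi=60.0, steps=60000):
    """Integrate the density with the trapezoid rule to check normalization."""
    h = (hi - lo) / steps
    total = 0.5 * (integrated_weibull(lo, beta, gamma) + integrated_weibull(hi, beta, gamma))
    for i in range(1, steps):
        total += integrated_weibull(lo + i * h, beta, gamma)
    return total * h


area_peaked = trapezoid_area(beta=1.0, gamma=0.7)    # heavier tails than a Gaussian
area_gaussian = trapezoid_area(beta=2.0, gamma=2.0)  # gamma = 2 gives a Gaussian shape
```

Both areas should come out close to one. With gamma equal to 2 the expression reduces to a Gaussian whose standard deviation is beta, and smaller gamma values give the sharper peak and heavier tails described above.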

To assess the similarity between Wiccest features, a goodness-of-fit test is utilized. The measure is based on the integrated squared error between the two cumulative distributions, which is obtained by a Cramér-von Mises measure. For two Weibull distributions with parameters $F_\beta, F_\gamma$ and $G_\beta, G_\gamma$ a first order Taylor approximation of the Cramér-von Mises statistic yields the log difference between the parameters. Therefore, a measure of similarity between two Weibull distributions $F$ and $G$ is given by the ratio of the parameters,

$$W^2(F,G) = \frac{\min(F_\beta, G_\beta)}{\max(F_\beta, G_\beta)} \cdot \frac{\min(F_\gamma, G_\gamma)}{\max(F_\gamma, G_\gamma)}. \tag{2}$$

The $\mu$ parameter represents the mode of the distribution. The position of the mode is influenced by uneven illumination and colored illumination. Hence, to achieve color constancy the values for $\mu$ may be ignored.
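The ratio measure of Eq. (2) is simple enough to state directly in code. This is a minimal sketch that assumes strictly positive width and peakness parameters and leaves out the origin parameter, as the text suggests; the function name `weibull_similarity` and the example values are hypothetical.

```python
def weibull_similarity(f_beta, f_gamma, g_beta, g_gamma):
    """Similarity of two Weibull fits as the product of min/max parameter ratios.

    Assumes all parameters are strictly positive. The origin parameter is
    deliberately not used.
    """
    ratio_beta = min(f_beta, g_beta) / max(f_beta, g_beta)
    ratio_gamma = min(f_gamma, g_gamma) / max(f_gamma, g_gamma)
    return ratio_beta * ratio_gamma


same = weibull_similarity(1.5, 0.8, 1.5, 0.8)       # identical fits
different = weibull_similarity(1.0, 2.0, 2.0, 1.0)  # both parameters differ by a factor of 2
```

The measure is symmetric in its two arguments, equals one for identical parameters, and decreases toward zero as either parameter pair diverges.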

In summary, Wiccest features provide a color invariant texture descriptor. Moreover, the features rely heavily on natural image statistics to compactly represent the visual information.
