A Two-stage Multi-view Analysis Framework for Human Activity
Date: 2025-07-09
A Two-stage Multi-view Analysis Framework for Human Activity and Interactions
Sangho Park
Computer Vision and Robotics Research Lab.
University of California, San Diego
La Jolla, CA 92037
parks@ucsd.edu
Mohan M. Trivedi
Computer Vision and Robotics Research Lab.
University of California, San Diego
La Jolla, CA 92037
mtrivedi@ucsd.edu
Abstract
This paper presents a new framework for a multi-stage multi-view approach for human interactions and activity analysis. The analysis is performed in a distributed vision system that synergistically integrates track- and body-level representations across multiple cameras. We aim at a versatile and easily deployable system that does not require careful camera calibration. The main contributions of the paper are: (1) context-dependent camera handover for occlusion handling, (2) switching the multi-stage analysis between track- and body-level representations, and (3) a hypothesis-verification paradigm for top-down feedback exploiting spatio-temporal constraints inherent in human interaction. Experimental evaluation shows the efficacy of the proposed system for analyzing multi-person interactions. The current implementation uses two views, but extension to more views is straightforward.
1. Introduction and Motivation
Analysis of multi-person interactions involving objects is an important research problem in computer vision for a wide range of potential applications: video surveillance, security enforcement, event annotation, motion analysis in sports, etc. Multi-person interaction raises particularly difficult issues in computer vision: occlusion between objects and body deformation during interaction.
Fig. 1 illustrates multi-person interaction situations where the two-stage multi-view analysis would be beneficial. A single-camera system (Fig. 1(a)) with viewing direction V1 may be sufficient for monitoring the two-person interaction A between persons P1 and P2, given an appropriate imaging condition (i.e., with the viewing direction V1 orthogonal to the interaction plane that spans P1, A, and P2). If the interaction plane is not perpendicular to the viewing direction, however, single-camera monitoring gets more difficult due to occlusion and changes of appearance. With more than two persons involved (Fig. 1(b)), a multi-view system may be unavoidable even in the optimal viewing conditions; i.e., the viewing directions V1 and V2 are optimal for monitoring the interactions A and B between the persons P1 and P2, and P2 and P3, respectively. As the persons move around (Fig. 1(c)), dynamic selection and coordination of multiple views becomes important, which is a challenging problem in computer vision. The incorporation of multiple cameras requires data fusion from each camera. The main difficulty in fusing data from multiple cameras is deciding when, and which, camera inputs to fuse for 2D and 3D image analysis.
Figure 1. Top-down view diagrams for multi-view analysis of human interactions.
An integrated understanding of human activity involving body deformation would require multiple levels of analysis; we consider two stages of detail: track-level and body-level analyses. At the track level, human activity is analyzed in terms of the tracks of moving Gaussian ellipses that encompass individual persons. At the body level, human activity is analyzed in more detail in terms of the coordinated posture and gesture patterns of body parts such as the upper body and lower body. A major challenge in the two-stage analysis is maintaining coherence between the two analysis stages: how can a vision system switch between analysis levels depending on the imaging quality under occlusion? The above two questions (i.e., multi-view fusion and two-stage fusion) may not be answered by a simple unidirectional bottom-up or top-down vision process. Instead, a bidirectional process with a feedback mechanism is desirable, which would combine top-down hypotheses about human interactions with bottom-up vision processes.
Figure 2. The overall system architecture.
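As a rough illustration of the track-level representation described above, the sketch below fits a 2D Gaussian to the pixel coordinates of one person's foreground blob and reports the corresponding ellipse. The function name `gaussian_ellipse`, the `n_sigma` scale factor, and the use of raw foreground pixel coordinates as input are assumptions made for this example; the paper does not specify how its ellipses are computed.

```python
import math

def gaussian_ellipse(points, n_sigma=2.0):
    """Fit a 2D Gaussian to (x, y) pixel coordinates and return ellipse parameters.

    Hypothetical helper for illustration only; n_sigma is an assumed scale
    that controls how much of the blob the ellipse encloses.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Mean of the blob gives the ellipse center.
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n

    # Entries of the 2x2 covariance matrix.
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)

    # Orientation of the major axis, measured from the x axis (radians).
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)

    return {
        "center": (mx, my),
        "semi_major": n_sigma * math.sqrt(lam_major),
        "semi_minor": n_sigma * math.sqrt(lam_minor),
        "angle": angle,
    }
```

Tracking the `center` of such an ellipse from frame to frame would give one track per person; for an upright person the major axis is expected to be close to vertical.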
The majority of previous studies on human activity analysis have focused on track-level, single-perspective analysis. Reviews of general research on human motion understanding can be found in [1]. Most approaches to behavior analysis are based on either body features from a single-view modality or composite features from multiple views, such as a histogram of 3D voxels [4], with calibrated cameras [8] or uncalibrated cameras [5]. A review of distributed surveillance systems can be found in [13]. Most gesture recognition studies have aimed at learning isolated gestures of a single person under certain assumptions about camera configuration. Multi-view tracking and camera handover have not been actively linked to activity recognition.
In this paper, we propose a new framework for the analysis of multi-person activity in a distributed vision system by a synergistic integration of the track- and body-level representations across multiple views. The main contributions of the paper are: (1) context-dependent camera handover for occlusion handling, (2) switching the multi-level analysis between track- and body-level representations, and (3) integration of a data-driven bottom-up process and a knowledge-driven top-down process for human activity understanding.
2. System Overview
Fig. 2 shows the overall system architecture. Light gray modules compose the basic single-view system, while the bright (yellow) modules compose the multi-view functionality. The dark gray module can work in either single- or multi-view mode, but more cameras can increase the overall accuracy. Currently, two cameras are used for synchronized views, which are foreground-segmented and combined to form a planar-homography map for the 3D footage locations of the persons. The homography map is used for the track-level analysis. The camera handover searches for unoccluded person views for the body-level analysis. Both the track- and body-level analysis can be used for the activity analysis depending on analysis ……
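The planar-homography mapping mentioned in this section can be sketched as follows: a 3x3 matrix H maps an image point on the ground plane, in homogeneous coordinates, to a common ground-plane map. This is only a minimal sketch under stated assumptions. The helper names `foot_point` and `apply_homography`, the use of the bottom-center of a bounding box as the foot location, and the example matrix are all hypothetical; the text does not say how H is estimated or how foot locations are extracted.

```python
def foot_point(bbox):
    """Bottom-center of a bounding box (x_min, y_min, x_max, y_max).

    Assumes image coordinates with y growing downward, so y_max is the
    lowest visible point of the person (an assumed proxy for the feet).
    """
    x_min, y_min, x_max, y_max = bbox
    return ((x_min + x_max) / 2.0, y_max)

def apply_homography(H, point):
    """Map an image point to the ground-plane map with a 3x3 homography H."""
    x, y = point
    # Multiply H by the homogeneous point (x, y, 1).
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to return to inhomogeneous coordinates.
    return (u / w, v / w)

# Hypothetical homography for one camera (scale by 2, then shift).
H_cam = [[2.0, 0.0, 1.0],
         [0.0, 2.0, 3.0],
         [0.0, 0.0, 1.0]]

ground_xy = apply_homography(H_cam, foot_point((10, 20, 30, 80)))
```

With one such matrix per camera, foot locations seen in different views land in the same ground-plane coordinates, which is what allows tracks from multiple views to be combined.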