Storage device performance prediction with CART models(6)
Date: 2025-07-10
Storage device performance prediction is a key element of self-managed storage systems and application planning tasks, such as data assignment. This work explores the application of a machine learning tool, CART models, to storage device modeling. Our appr
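[Table 1: the body was flattened during extraction and the row/column assignment of the cells cannot be recovered. Surviving column labels: Feature, CART, Neural networks, k-nearest neighbors. Surviving row labels: Interpretability, Ability to handle irrelevant input, Prediction time, Ease of use. Surviving cell values, in extraction order: high (505%), fair (66%), Poor, Fair, Poor, Poor, fast (seconds), slow (hours), fast (milliseconds), low (60B), low (2MB), Fair, Good, Good, Poor, Poor, fast (milliseconds), Good, slow (minutes), Fair.]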
Table 1: Comparison of regression tools in predicting per-request response time. (The same dataset is used in Figure 5.) The comparison on rows 2, 3, 4 and the last one is taken from [16]. We rank the features in the order of their importance. Interpretability is the model's ability to infer the importance of input variables. Robustness is the ability to function well on noisy data sets. Irrelevant input refers to features that have little predictive power.
build the request-level device model as described in Section 4.2. The models were constructed on the first day of cello99a and tested on the second day of the same trace. Information on the traces we used may be found in Section 5.
The linear model [29] uses a linear function of X to approximate f(X). Because storage device behavior is non-linear, linear models have poor accuracy.
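To illustrate why a linear fit struggles with non-linear behavior, here is a minimal sketch of single-feature ordinary least squares. It is not the paper's model or data: the function names `fit_line` and `mean_relative_error` are hypothetical, and the step-shaped response is synthetic, chosen only to mimic an abrupt change in response time.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ a + b*x with a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b


def mean_relative_error(xs, ys, a, b):
    """Average of |prediction - actual| / |actual| over all points."""
    total = sum(abs((a + b * x) - y) / abs(y) for x, y in zip(xs, ys))
    return total / len(xs)


# Synthetic inputs (an assumption for illustration, not trace data).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys_linear = [2.0 * x + 1.0 for x in xs]          # truly linear target
ys_step = [1.0 if x <= 4 else 10.0 for x in xs]  # abrupt, non-linear target

err_linear = mean_relative_error(xs, ys_linear, *fit_line(xs, ys_linear))
err_step = mean_relative_error(xs, ys_step, *fit_line(xs, ys_step))
```

On the linear target the fit is essentially exact, while on the step-shaped target the relative error is large, which is the qualitative effect the paragraph above describes.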
The neural network model [26] consists of a set of highly interconnected processing elements working in unison to approximate the target function. We use a single hidden layer of 20 nodes (best among 20 and 40) and a learning rate of 0.05. Half of the training set is used in building the model and the other half for validation. Such a model takes a long time to converge.
The support vector machine [6] maps the input data into a high-dimensional space and performs a linear regression there. Our model uses the radial basis function $K(x_i, x) = \exp(-\gamma \lVert x - x_i \rVert^2)$ as the kernel function, and $\gamma$ is set to be 2 (best among 1, 3, 4, 6). We use an efficient implementation, SVMlight [18], in our experiment. Selecting the parameter values requires expertise and multiple rounds of trials.
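The radial basis function kernel mentioned above can be sketched in a few lines. This is a standalone illustration of the standard formula exp(-γ‖x − x_i‖²), not SVMlight code; the function name `rbf_kernel` is hypothetical, and the default `gamma=2.0` simply mirrors the value reported in the text.

```python
import math


def rbf_kernel(x_i, x, gamma=2.0):
    """Radial basis function kernel: exp(-gamma * ||x - x_i||^2)."""
    if len(x_i) != len(x):
        raise ValueError("vectors must have the same dimension")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_i))
    return math.exp(-gamma * squared_distance)
```

The kernel equals 1 for identical points and decays toward 0 as the points move apart; a larger γ makes that decay sharper, which is one reason the parameter has to be tuned by trial.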
The k-nearest neighbors model [9] is memory-based: it remembers all the training data points and predicts by averaging the outputs of the k nearest neighbors of the data point being predicted. We use the Euclidean distance function and a k value of 5 (best among 5, 10, 15, and 20). The model is accurate, but is inefficient in storage and computation.
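The memory-based prediction described above can be sketched as a brute-force search. This is an assumed, minimal version (the name `knn_predict` is hypothetical, and the paper's actual implementation is not shown in this excerpt); it uses the Euclidean distance and defaults to k = 5 as in the text.

```python
import math


def knn_predict(train_x, train_y, query, k=5):
    """Predict by averaging the outputs of the k nearest training points.

    train_x: list of feature vectors; train_y: list of outputs;
    query: feature vector to predict for.
    """
    # Every training point is kept and scanned at prediction time.
    by_distance = sorted(
        ((math.dist(x, query), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)
```

Each prediction scans and sorts the whole training set, so both memory use and prediction time grow with the number of stored requests, which matches the inefficiency noted above.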
The last three tools require that all the features and the output be normalized to the unit length. For features with a large value range, we take logarithms before normalization. Overall, CART is the best at predicting per-request response times, with the only downside being slightly lower accuracy compared to the much more space- and time-consuming k-nearest neighbors approach.
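The preprocessing step for the last three tools (logarithm for wide-ranging features, then normalization) might look like the sketch below. The text does not define "unit length" precisely; this sketch assumes it means rescaling each feature to the interval [0, 1], and the function name `normalize_feature` is hypothetical.

```python
import math


def normalize_feature(values, use_log=False):
    """Rescale one feature column to [0, 1], optionally after a log transform.

    Assumption: "unit length" is read as min-max scaling to [0, 1].
    With use_log=True every value must be strictly positive.
    """
    if use_log:
        values = [math.log(v) for v in values]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
```

Taking logarithms first keeps a few very large values from compressing the rest of the range toward zero after scaling.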