Storage device performance prediction with CART models(5)
Date: 2025-07-10
Storage device performance prediction is a key element of self-managed storage systems and application planning tasks, such as data assignment. This work explores the application of a machine learning tool, CART models, to storage device modeling. Our appr
[Figure 1 panels: (a) Fitted tree; (b) Data points and regression line. Panel (b) plots y against x and shows the curve f(x) = x * x together with the CART fit.]
Figure 1: CART model for a simple one-dimensional dataset. The dataset contains 100 data points generated using $f(x) = x^2 + \varepsilon$, where $\varepsilon$ follows a Gaussian distribution with mean 0 and standard deviation 10.
The piece-wise constant function $f(X)$ can be visualized as a binary tree. Figure 1(a) shows a CART model constructed on the sample one-dimensional dataset in (b). The sample dataset is generated using

$$y_i = x_i^2 + \varepsilon_i, \quad i = 1, 2, \ldots, 100,$$

where $x_i$ is uniformly distributed within $(0, 10)$, and $\varepsilon_i$ follows a Gaussian distribution $N(0, 10)$. The leaf nodes correspond to disjoint hyper-rectangles in the feature vector space. The hyper-rectangles degenerate into intervals for one-dimensional datasets. Each leaf is associated with a value, $f(X)$, which is the prediction for all $X$s within the corresponding hyper-rectangle. The internal nodes contain split points, and a path from the root to a leaf defines the hyper-rectangle of the leaf node. The tree, therefore, represents a piece-wise constant function on the feature vector space. Figure 1(b) shows the regression line of the sample CART model.
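The tree-as-function view can be made concrete with a minimal Python sketch. The class names `Leaf` and `Split`, the helper names, and the split points and leaf values in `tree` are all illustrative assumptions; they are not the values of the fitted tree in Figure 1(a). The sketch generates a dataset in the same way as the sample dataset and evaluates a hand-built tree by walking from the root to a leaf.

```python
import random
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: float  # constant prediction for every x in this leaf's interval


@dataclass
class Split:
    threshold: float  # internal node: go left when x < threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def predict(node: Node, x: float) -> float:
    """Follow split points from the root until a leaf is reached."""
    while isinstance(node, Split):
        node = node.left if x < node.threshold else node.right
    return node.value


def make_dataset(n: int = 100, seed: int = 0):
    """y_i = x_i^2 + eps_i, x_i uniform on (0, 10), eps_i Gaussian with std 10."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    return [(x, x * x + rng.gauss(0.0, 10.0)) for x in xs]


# Hypothetical four-leaf tree: each leaf covers one interval of (0, 10) and
# holds roughly the average of x^2 over that interval.
tree = Split(
    5.0,
    Split(2.5, Leaf(2.1), Leaf(14.6)),
    Split(7.5, Leaf(39.6), Leaf(77.1)),
)

if __name__ == "__main__":
    for x, y in make_dataset()[:5]:
        print(f"x={x:.2f}  y={y:.2f}  prediction={predict(tree, x):.1f}")
```

Every x that falls in the same interval receives the same prediction, which is what makes the fitted function piece-wise constant.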
3.2 CART Model Properties
CART models are computationally efficient in both construction and prediction. The construction algorithm starts with a tree with a single root node corresponding to the entire input vector space and grows the tree by greedily selecting the split point that yields the maximum reduction in mean squared error. A more detailed discussion of the split point selection is presented in Appendix A. Each prediction involves a tree traversal and, therefore, is fast.
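A minimal one-dimensional sketch of the greedy construction follows. It is an illustration under stated assumptions, not the procedure from Appendix A: the function names, the dictionary representation of the tree, and the `max_depth` stopping rule are all hypothetical. The sketch scores a split by the reduction in the sum of squared errors, which ranks candidate splits of a given node in the same order as the reduction in mean squared error.

```python
def sse(ys):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(points):
    """Return (threshold, error_reduction) for the best split, or None.

    points is a list of (x, y) pairs. Candidate thresholds are midpoints
    between consecutive distinct x values.
    """
    pts = sorted(points)
    total = sse([y for _, y in pts])
    best = None
    for i in range(1, len(pts)):
        if pts[i - 1][0] == pts[i][0]:
            continue  # no threshold can separate equal x values
        threshold = (pts[i - 1][0] + pts[i][0]) / 2
        left = [y for _, y in pts[:i]]
        right = [y for _, y in pts[i:]]
        reduction = total - sse(left) - sse(right)
        if best is None or reduction > best[1]:
            best = (threshold, reduction)
    return best


def grow(points, max_depth=3, depth=0):
    """Greedily grow a tree; max_depth is an assumed stopping rule."""
    split = best_split(points) if depth < max_depth else None
    if split is None:
        ys = [y for _, y in points]
        return {"value": sum(ys) / len(ys)}  # leaf: predict the mean
    threshold, _ = split
    left = [p for p in points if p[0] < threshold]
    right = [p for p in points if p[0] >= threshold]
    return {
        "threshold": threshold,
        "left": grow(left, max_depth, depth + 1),
        "right": grow(right, max_depth, depth + 1),
    }


if __name__ == "__main__":
    print(grow([(0, 0), (1, 0), (2, 10), (3, 10)], max_depth=1))
```

This version recomputes the error for every candidate split, which keeps the code short; a practical implementation would update running sums over the sorted points instead.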
CART offers good interpretability and allows us to evaluate the importance of various workload characteristics in predicting workload performance. A CART model is a binary tree, making it easy to plot on paper as in Figure 1(a). More importantly, one can evaluate a feature's importance by its contribution in error reduction. Intuitively, a more important feature should contribute more to the error reduction; thus, leaving it out of the feature vector would significantly raise the prediction error. In a CART model, we use the sum of the error reduction related to all the appearances of a feature as its importance.
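The importance measure can be sketched as follows, assuming a simple greedy tree builder with a hypothetical `max_depth` stopping rule; the function names and the use of feature indices as keys are illustrative only. Each time a feature is chosen for a split, the error reduction of that split is added to the feature's running total.

```python
from collections import defaultdict


def sse(ys):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_split(rows, ys):
    """Return (feature_index, threshold, error_reduction), or None."""
    total = sse(ys)
    best = None
    for f in range(len(rows[0])):
        order = sorted(range(len(rows)), key=lambda i: rows[i][f])
        for k in range(1, len(order)):
            lo, hi = rows[order[k - 1]][f], rows[order[k]][f]
            if lo == hi:
                continue  # equal values cannot be separated
            left = [ys[i] for i in order[:k]]
            right = [ys[i] for i in order[k:]]
            reduction = total - sse(left) - sse(right)
            if best is None or reduction > best[2]:
                best = (f, (lo + hi) / 2, reduction)
    return best


def feature_importance(rows, ys, max_depth=3, depth=0, totals=None):
    """Sum the error reduction of every split, grouped by split feature."""
    totals = defaultdict(float) if totals is None else totals
    split = best_split(rows, ys) if depth < max_depth else None
    if split is None or split[2] <= 0:
        return totals  # leaf: nothing more to attribute
    f, threshold, reduction = split
    totals[f] += reduction
    left = [i for i in range(len(rows)) if rows[i][f] < threshold]
    right = [i for i in range(len(rows)) if rows[i][f] >= threshold]
    for idx in (left, right):
        feature_importance(
            [rows[i] for i in idx], [ys[i] for i in idx],
            max_depth, depth + 1, totals,
        )
    return totals


if __name__ == "__main__":
    # Toy data: feature 0 determines y, feature 1 carries no information.
    rows = [(0, 5), (1, 3), (2, 5), (3, 3)]
    ys = [0, 0, 10, 10]
    print(dict(feature_importance(rows, ys)))
```

In the toy data only feature 0 is ever used for a split, so it receives all of the accumulated error reduction, while feature 1 receives none.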
3.3 Comparison With Other Regression Tools
Other regression tools can achieve the same functionality as CART. We choose to use CART because of its accuracy, efficiency, robustness, and ease of use. Table 1 compares CART with four other popular tools to