Storage device performance prediction with CART models(9)
Date: 2025-07-10
Storage device performance prediction is a key element of self-managed storage systems and application planning tasks, such as data assignment. This work explores the application of a machine learning tool, CART models, to storage device modeling. Our appr
The workload description uses the entropy plot [32] to quantify temporal and spatial burstiness and correlations between attributes. Entropy values are plotted on one or two attributes against the entropy calculation granularity. The increment of the entropy values characterizes how the burstiness and correlations change from one granularity to the next. Because of the self-similarity of I/O workloads [13], the increment is usually constant, allowing us to use the entropy plot slope to characterize the burstiness and correlations. Appendix B describes the entropy plot in detail.
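To make the idea of an entropy plot slope concrete, the following is a minimal sketch for the temporal case only. It is not the definition from Appendix B, which is not reproduced here; it assumes one common formulation in which the window is split into 2^k equal intervals at granularity k and the Shannon entropy of the request counts is computed at each granularity. The function names, the parameters, and the least-squares fit for the slope are all illustrative assumptions.

```python
import math
from collections import Counter


def entropy_at_scale(timestamps, start, length, k):
    """Shannon entropy (in bits) of request counts over 2**k equal intervals.

    Assumes every timestamp lies in [start, start + length].
    """
    n_bins = 2 ** k
    counts = Counter()
    for t in timestamps:
        # Map the arrival time to an interval index; clamp the right edge.
        idx = min(int((t - start) / length * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(timestamps)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def entropy_plot_slope(timestamps, start, length, max_k):
    """Least-squares slope of entropy versus granularity k = 1..max_k."""
    ks = list(range(1, max_k + 1))
    hs = [entropy_at_scale(timestamps, start, length, k) for k in ks]
    mean_k = sum(ks) / len(ks)
    mean_h = sum(hs) / len(hs)
    num = sum((k - mean_k) * (h - mean_h) for k, h in zip(ks, hs))
    den = sum((k - mean_k) ** 2 for k in ks)
    return num / den


# Hypothetical 64-second windows: evenly spaced arrivals versus one burst.
uniform = [i + 0.5 for i in range(64)]
burst = [10.0] * 64
uniform_slope = entropy_plot_slope(uniform, 0.0, 64.0, 6)
burst_slope = entropy_plot_slope(burst, 0.0, 64.0, 6)
```

Under this formulation, evenly spread arrivals gain one bit of entropy per granularity step (slope close to 1), while a single burst gains none (slope close to 0), so the slope serves as a one-number summary of burstiness.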
The workload-level device model offers fast predictions. The model compresses a workload into a workload description and feeds the description into a CART model to produce the desired performance measure. Feature extraction is also fast. To predict both the average and 90th percentile response time, the model must have two separate trees, one for each performance metric.
Workload modeling introduces a parameter called “window size.” The window size is the unit of performance prediction and, thus, the workload length for workload description generation. For example, we can divide a long trace into one-minute fragments and use the workload-level model to predict the average response time over one-minute intervals. Fragmenting workloads has several advantages. First, performance problems are usually transient. A “problem” […]ing the workload in its entirety, on the other hand, fails to identify such transient problems. Second, fragmenting the training trace produces more samples for training and reduces the required training time. Windows that are too small, however, contain too few requests for the entropy plot to be effective. We use one-minute windows in all of our experiments.
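Fragmenting a trace by window size can be sketched as follows. The record layout (arrival time in seconds paired with an opaque payload) and the function name are assumptions made for illustration, not the trace format used in the experiments. Windows that contain no requests are simply omitted in this sketch.

```python
from collections import defaultdict


def fragment_trace(requests, window_seconds=60.0):
    """Group (arrival_seconds, payload) records into fixed-length windows.

    Returns the non-empty windows in time order; each window is a list of
    the records whose arrival time falls inside it.
    """
    windows = defaultdict(list)
    for arrival, payload in requests:
        windows[int(arrival // window_seconds)].append((arrival, payload))
    return [windows[i] for i in sorted(windows)]


# Hypothetical trace: two requests in minute 0, one in minute 1, one in minute 3.
trace = [(0.0, "a"), (59.9, "b"), (60.0, "c"), (185.0, "d")]
fragments = fragment_trace(trace)
```

Each fragment would then be turned into one workload description and yield one predicted value, such as the average response time for that minute.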
4.4 Comparison of Two Types of Models
There is a clear tradeoff between the request-level and workload-level device models. The former is fast in training and slow in prediction, and the latter is the opposite.
The model training time is dominated by trace replay, which, when taking place on actual devices, requires exactly the same amount of time as the trace length. Building a CART model needs only seconds of computation, but trace replay can require hundreds of hours to acquire enough data points for model construction. When operating at the request level, the device model gets one data point per request as opposed to one data point per one-minute workload fragment as in the workload-level device model. In order to get the same number of data points, the workload-level device model needs a training time 100 times longer than the request-level model when the arrival rate is 100 requests per minute.
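The factor of 100 follows from a short calculation. The symbols here are introduced only for illustration: N is the number of training data points wanted, λ the arrival rate in requests per minute, and w the window size in minutes. The calculation assumes that replay time equals trace length, that every request yields one data point at the request level, and that windows do not overlap.

```latex
\begin{align*}
T_{\text{request}}  &= \frac{N}{\lambda}, \qquad
T_{\text{workload}} = N\,w, \\[4pt]
\frac{T_{\text{workload}}}{T_{\text{request}}}
  &= \lambda\,w = 100 \times 1 = 100 .
\end{align*}
```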
The number of tree traversals determines the prediction time, since each predicted value requires a tree traversal. Therefore, the total number of tree traversals is the number of requests in the workload for the request-level device model and the number of workload fragments for the workload-level model. With an average arrival rate of 100 requests per minute, the request-level model is 100 times slower in prediction.
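The traversal counts can be checked with a few lines of arithmetic. This is a sketch that counts traversals for a single performance metric; the function name and the example trace length are hypothetical.

```python
import math


def tree_traversals(num_requests, trace_minutes, window_minutes=1.0):
    """Return (request_level, workload_level) traversal counts for one metric.

    The request-level model traverses a tree once per request; the
    workload-level model traverses a tree once per workload fragment.
    """
    request_level = num_requests
    workload_level = math.ceil(trace_minutes / window_minutes)
    return request_level, workload_level


# Hypothetical one-hour trace at 100 requests per minute.
per_request, per_fragment = tree_traversals(num_requests=6000, trace_minutes=60)
slowdown = per_request / per_fragment
```

For this example the request-level model needs 6000 traversals against 60 for the workload-level model, which matches the factor of 100 stated above.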
An item for future research is the exploration of the possibility of combining the two models to deliver ones that are efficient in both training and prediction.
5 Experimental Results
This section evaluates the CART-based device models presented in the previous section using a range of workload traces.
Devices. We model two devices: a single disk and a disk array. The single disk is a 9GB Atlas 10K disk with an average rotational latency of 3 milliseconds. The disk array is a RAID 5 disk array consisting of 8 Atlas 10K disks with a 32KB stripe size. We replay all the traces on the two devices except the SAP trace, which is beyond the capacity of the Atlas 10K disk.