Storage device performance prediction with CART models(9)

Date: 2026-01-20

Storage device performance prediction is a key element of self-managed storage systems and application planning tasks, such as data assignment. This work explores the application of a machine learning tool, CART models, to storage device modeling. Our appr

The workload description uses the entropy plot [32] to quantify temporal and spatial burstiness and correlations between attributes. Entropy values are plotted on one or two attributes against the entropy calculation granularity. The increment of the entropy values characterizes how the burstiness and correlations change from one granularity to the next. Because of the self-similarity of I/O workloads [13], the increment is usually constant, allowing us to use the entropy plot slope to characterize the burstiness and correlations. Appendix B describes the entropy plot in detail.
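As an illustration of the idea, the following is a minimal sketch of an entropy plot over arrival times only. It assumes one common construction (split the window into 2^k equal bins at scale k and take the entropy of the fraction of requests per bin); the exact definition used in this work is the one in Appendix B, which may differ. The names entropy_plot and slope and the two synthetic arrival patterns are hypothetical.

```python
import math

def entropy_plot(timestamps, duration, max_scale=8):
    """Return [(scale, entropy_in_bits)] for scales 0..max_scale.

    Assumed construction: at scale k the interval [0, duration) is split
    into 2**k equal bins and the entropy of the per-bin request fractions
    is computed.
    """
    n = len(timestamps)
    points = []
    for k in range(max_scale + 1):
        bins = 2 ** k
        counts = [0] * bins
        for t in timestamps:
            idx = min(int(t / duration * bins), bins - 1)
            counts[idx] += 1
        h = -sum((c / n) * math.log2(c / n) for c in counts if c)
        points.append((k, h))
    return points

def slope(points):
    """Least-squares slope of entropy versus scale."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    num = sum((x - mx) * (y - my) for x, y in points)
    den = sum((x - mx) ** 2 for x, _ in points)
    return num / den

# Synthetic one-minute windows (made-up data, not from the paper's traces).
uniform = [i * 60.0 / 1024 for i in range(1024)]          # evenly spaced arrivals
bursty = [i * (60.0 / 16) / 1024 for i in range(1024)]    # all arrivals in the first 1/16

uniform_slope = slope(entropy_plot(uniform, 60.0))
bursty_slope = slope(entropy_plot(bursty, 60.0))
print(uniform_slope, bursty_slope)
```

Under this construction, evenly spread arrivals gain about one bit of entropy per scale, while the concentrated arrivals gain less, so a smaller slope indicates a burstier window.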

The workload-level device model offers fast predictions. The model compresses a workload into a workload description and feeds the description into a CART model to produce the desired performance measure. Feature extraction is also fast. To predict both the average and 90th percentile response time, the model must have two separate trees, one for each performance metric.
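The "one tree per metric" arrangement can be sketched with a small regression tree written from scratch. This is only an illustration under stated assumptions: the tree grows by greedy binary splits that minimize the sum of squared errors (the usual CART regression criterion) and omits pruning; the three-attribute workload description (arrival rate, read fraction, entropy slope) and all numbers are made up; build_tree and predict are hypothetical names, not the authors' implementation.

```python
def sse(values):
    """Sum of squared errors around the mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def build_tree(rows, targets, min_leaf=2, depth=0, max_depth=4):
    """Grow a regression tree by greedy binary splits that minimize SSE."""
    best = None
    if depth < max_depth and len(rows) >= 2 * min_leaf:
        for f in range(len(rows[0])):
            for threshold in sorted({r[f] for r in rows}):
                left = [i for i, r in enumerate(rows) if r[f] <= threshold]
                right = [i for i, r in enumerate(rows) if r[f] > threshold]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                cost = (sse([targets[i] for i in left])
                        + sse([targets[i] for i in right]))
                if best is None or cost < best[0]:
                    best = (cost, f, threshold, left, right)
    # Stop when no admissible split reduces the error.
    if best is None or best[0] >= sse(targets):
        return {"leaf": sum(targets) / len(targets)}
    _, f, threshold, left, right = best
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [targets[i] for i in left],
                           min_leaf, depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right], [targets[i] for i in right],
                            min_leaf, depth + 1, max_depth),
    }

def predict(tree, row):
    """One root-to-leaf traversal per predicted value."""
    while "leaf" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["leaf"]

# Made-up workload descriptions: (requests/min, read fraction, entropy slope).
descriptions = [(50, 0.90, 0.95), (60, 0.80, 0.90), (55, 0.85, 0.92),
                (200, 0.50, 0.60), (220, 0.40, 0.55), (210, 0.45, 0.58)]
avg_ms = [4.0, 4.5, 4.2, 12.0, 13.0, 12.5]   # made-up average response times
p90_ms = [8.0, 9.0, 8.5, 30.0, 34.0, 32.0]   # made-up 90th percentiles

# Two separate trees, one for each performance metric.
avg_tree = build_tree(descriptions, avg_ms)
p90_tree = build_tree(descriptions, p90_ms)

new_window = (58, 0.82, 0.91)
print(predict(avg_tree, new_window), predict(p90_tree, new_window))
```

Each tree is trained on the same descriptions but a different target, so predicting both metrics for one window costs two traversals.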

Workload modeling introduces a parameter called “window size.” The window size is the unit of performance prediction and, thus, the workload length for workload description generation. For example, we can divide a long trace into one-minute fragments and use the workload-level model to predict the average response time over one-minute intervals. Fragmenting workloads has several advantages. First, performance problems are usually transient. A “problem” […]ing the workload in its entirety, on the other hand, fails to identify such transient problems. Second, fragmenting the training trace produces more samples for training and reduces the required training time. Windows that are too small, however, contain too few requests for the entropy plot to be effective. We use one-minute windows in all of our experiments.
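The effect of the window size on transient problems can be shown with a small sketch. It assumes a trace given as (arrival time in seconds, response time in milliseconds) pairs; the function name fragment and the five-request trace are hypothetical, made-up data.

```python
from collections import defaultdict

def fragment(trace, window_seconds=60.0):
    """Group response times by fixed-length window, in time order.

    Windows that contain no requests are skipped.
    """
    windows = defaultdict(list)
    for arrival, response in trace:
        windows[int(arrival // window_seconds)].append(response)
    return [windows[k] for k in sorted(windows)]

def mean(values):
    return sum(values) / len(values)

# Made-up trace: (arrival time in s, response time in ms).
trace = [(5.0, 4.0), (20.0, 6.0), (70.0, 40.0), (75.0, 60.0), (130.0, 5.0)]

per_window = [mean(w) for w in fragment(trace)]   # one value per one-minute window
whole_trace = mean([r for _, r in trace])         # a single value for the entire trace
print(per_window, whole_trace)
```

Here the second window stands out in the per-window averages, whereas the whole-trace average blends it with the two quiet minutes around it.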

4.4 Comparison of Two Types of Models

There is a clear tradeoff between the request-level and workload-level device models. The former is fast in training and slow in prediction, and the latter is the opposite.

The model training time is dominated by trace replay, which, when taking place on actual devices, requires exactly the same amount of time as the trace length. Building a CART model needs only seconds of computation, but trace replay can require hundreds of hours to acquire enough data points for model construction. When operating at the request level, the device model gets one data point per request as opposed to one data point per one-minute workload fragment as in the workload-level device model. In order to get the same number of data points, the workload-level device model needs a training time 100 times longer than the request-level model when the arrival rate is 100 requests per minute.

The number of tree traversals determines the prediction time, since each predicted value requires a tree traversal. Therefore, the total number of tree traversals is the number of requests in the workload for the request-level device model and the number of workload fragments for the workload-level model. With an average arrival rate of 100 requests per minute, the request-level model is 100 times slower in prediction.
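The two factors of 100 in this subsection follow from simple counting, sketched below. The function names and the example sizes (6000 training points, a 60-minute trace) are hypothetical; only the arrival rate of 100 requests per minute and the one-minute window come from the text.

```python
def replay_minutes_needed(data_points, requests_per_minute, window_minutes=1.0):
    """Trace replay time needed to collect a given number of training points."""
    request_level = data_points / requests_per_minute   # one point per request
    workload_level = data_points * window_minutes       # one point per window
    return request_level, workload_level

def tree_traversals(trace_minutes, requests_per_minute, window_minutes=1.0):
    """Number of tree traversals needed to predict over a trace."""
    request_level = trace_minutes * requests_per_minute  # one traversal per request
    workload_level = trace_minutes / window_minutes      # one traversal per window
    return request_level, workload_level

# Training: 6000 data points at 100 requests per minute.
train_req, train_wl = replay_minutes_needed(6000, 100)
# Prediction: a 60-minute trace at 100 requests per minute.
pred_req, pred_wl = tree_traversals(60, 100)

print(train_wl / train_req)   # how much longer the workload-level model must replay
print(pred_req / pred_wl)     # how many more traversals the request-level model needs
```

Both ratios equal the number of requests per window, so the gap widens at higher arrival rates or with longer windows.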

An item for future research is the exploration of the possibility of combining the two models to deliver ones that are efficient in both training and prediction.

5 Experimental Results

This section evaluates the CART-based device models presented in the previous section using a range of workload traces.

Devices. We model two devices: a single disk and a disk array. The single disk is a 9GB Atlas 10K disk with an average rotational latency of 3 milliseconds. The disk array is a RAID 5 disk array consisting of 8 Atlas 10K disks with a 32KB stripe size. We replay all the traces on the two devices except the SAP trace, which is beyond the capacity of the Atlas 10K disk.
