Storage device performance prediction with CART models
Date: 2025-07-10
Storage device performance prediction is a key element of self-managed storage systems and application planning tasks, such as data assignment. This work explores the application of a machine learning tool, CART models, to storage device modeling. Our appr
[Figure 10 panels: (a) Entropy plot on time; (b) Entropy plot with operation type; (c) Entropy plot with size. Axes: scale (x) against entropy value (y). Curves: Time and RW, Location and RW, Time and Size, Location and Size.]
Figure 10: Entropy plots to quantify other workload characteristics.
A similar calculation on the trace's "margin" on LBN gives the entropy plot on LBN. Figure 9(b) shows the entropy plot on both arrival time and LBN of the sample trace.
We make two observations. First, the entropy plot shows strong linearity, suggesting the skew in arrival time and LBN stays constant at all granularities. The constant increment of the entropy value from one scale to the next suggests that the degree of skew stays the same at all the scales. That is, the sample trace has the same burstiness at all scales, which confirms the self-similarity of I/O workloads observed in previous studies [13]. Second, the linear entropy plot allows us to use the entropy plot slopes to characterize the burstiness. Smooth traffic has an entropy plot of slope close to 1. Real-world traces, however, have strong burstiness. In summary, the entropy plot defined on the trace margins allows us to use two scalars to characterize both the temporal and spatial burstiness of I/O workloads.
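The entropy plot on a single margin can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the base-2 logarithm, the least-squares fit, and both synthetic traces are assumptions made for the example.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon entropy (base 2) of a collection of non-negative counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def entropy_plot(values, lo, hi, max_scale):
    """Entropy of `values` binned into 2**k equal intervals of [lo, hi), k = 0..max_scale."""
    plot = []
    for k in range(max_scale + 1):
        bins = 2 ** k
        width = (hi - lo) / bins
        # Map each value to its interval index; clamp guards the upper edge.
        counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
        plot.append(entropy(counts.values()))
    return plot

def fitted_slope(plot):
    """Least-squares slope of entropy value against scale."""
    n = len(plot)
    mean_k = (n - 1) / 2
    mean_h = sum(plot) / n
    num = sum((k - mean_k) * (h - mean_h) for k, h in enumerate(plot))
    den = sum((k - mean_k) ** 2 for k in range(n))
    return num / den

# Hypothetical traces: evenly spaced arrivals versus most arrivals packed into a short burst.
smooth = [i + 0.5 for i in range(1024)]
bursty = [i / 64 for i in range(896)] + [16 + j * 7.875 for j in range(128)]

smooth_slope = fitted_slope(entropy_plot(smooth, 0, 1024, 6))
bursty_slope = fitted_slope(entropy_plot(bursty, 0, 1024, 6))
print(smooth_slope, bursty_slope)
```

In this toy setup the evenly spaced trace fills every interval equally, so its entropy grows by one bit per scale, while the bursty trace grows more slowly.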
Entropy plot on two-dimensional datasets. We extend the entropy plot to handle two-dimensional datasets to measure the correlations between two attributes. As before, the entropy plot calculates the entropy value at different scales, only this time on two-dimensional datasets. Given a two-dimensional projection of a trace, $C_{ij}$, $i, j = 1, 2, \ldots, 2^n$, we divide the projection into $2^k \times 2^k$ grids, which aggregates both dimensions with scale $k$. This gives a series $C_k$ of $2^k \times 2^k$ elements. Applying the entropy function to $C_k$ gives the joint entropy value at scale $k$ on the two dimensions.
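As a rough sketch of the grid aggregation described above, the joint entropy at scale k can be computed by mapping each point to its grid cell and applying the entropy function to the cell counts. The function name `joint_entropy`, the 0-based indices (the text uses 1-based), and the two toy point sets are assumptions for illustration only.

```python
import math
from collections import Counter

def joint_entropy(points, n, k):
    """Joint entropy (base 2) at scale k of (i, j) points with 0 <= i, j < 2**n.

    The 2**n x 2**n projection is divided into a 2**k x 2**k grid, so each
    cell covers 2**(n - k) consecutive indices in each dimension.
    """
    shift = n - k
    cells = Counter((i >> shift, j >> shift) for i, j in points)
    total = sum(cells.values())
    return -sum((c / total) * math.log2(c / total) for c in cells.values())

n = 4
# Hypothetical data: points on the diagonal, and points covering the whole plane.
diagonal = [(i, i) for i in range(2 ** n)]
full = [(i, j) for i in range(2 ** n) for j in range(2 ** n)]

for k in range(n + 1):
    print(k, joint_entropy(diagonal, n, k), joint_entropy(full, n, k))
```

The diagonal set occupies only 2**k cells at scale k, whereas the full set occupies all 2**k x 2**k cells, so their joint entropies grow at different rates.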
The joint entropy allows us to calculate the correlation between the two attributes. The correlation is the difference between the sum of the entropy values on the two attributes and the joint entropy. Figure 9(b) shows both the joint entropy and the correlation on arrival time and LBN for the sample disk trace. We observe that a strong correlation exists between arrival time and LBN, and also that the correlation stays constant at all scales. Thus, we are able to use a scalar value, the correlation slope, to quantify the correlation between arrival time and LBN.
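The correlation defined above (sum of the two marginal entropies minus the joint entropy) can be sketched per scale as below. The helper names, the 0-based indices, and the use of the average per-scale increment as a stand-in for the correlation slope are assumptions of this example, not details given in the text.

```python
import math
from collections import Counter

def entropy_of(keys):
    """Shannon entropy (base 2) of the empirical distribution of `keys`."""
    counts = Counter(keys)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def correlation_plot(points, n):
    """Correlation at scales 0..n for (i, j) points with 0 <= i, j < 2**n."""
    plot = []
    for k in range(n + 1):
        shift = n - k  # aggregate both dimensions to 2**k cells
        h_x = entropy_of(i >> shift for i, _ in points)
        h_y = entropy_of(j >> shift for _, j in points)
        h_xy = entropy_of((i >> shift, j >> shift) for i, j in points)
        plot.append(h_x + h_y - h_xy)
    return plot

def correlation_slope(plot):
    """Average increment per scale, used here as a simple slope estimate."""
    return (plot[-1] - plot[0]) / (len(plot) - 1)

n = 4
# Hypothetical data: fully dependent attributes versus independent attributes.
dependent = [(i, i) for i in range(2 ** n)]
independent = [(i, j) for i in range(2 ** n) for j in range(2 ** n)]

dep_slope = correlation_slope(correlation_plot(dependent, n))
ind_slope = correlation_slope(correlation_plot(independent, n))
print(dep_slope, ind_slope)
```

When one attribute determines the other, the correlation grows with scale; when the attributes are independent, it stays at zero at every scale.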
Entropy plot involving request size and operation type. It is possible to extend the entropy plot to handle operation type and request size. The only difference is the limited value ranges of the two attributes, which limit the number of data points in the entropy plot. As a result, the workload description does not include entropy plot slopes on these two attributes.
Quantifying the correlations involving either of the two attributes faces the same problem. Our solution is to always use the finest granularity on the request size or operation type, but to change the scale on the other attribute. For example, to calculate the joint entropy plot on arrival time and request type, the aggregation happens only on the arrival time. Figure 10 shows the entropy plots that involve operation type and request size. These entropy plots are not as linear as previous ones. Therefore, it is not straightforward to compress each line into a scalar. Currently, our workload description uses the average increment between two adjacent scales.
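A small sketch of this one-sided aggregation: the operation type is kept at its finest granularity while only the arrival time is aggregated, and the resulting plot is summarized by the average increment between adjacent scales. The function names, the integer time indices, and the alternating read/write trace are hypothetical choices for the example.

```python
import math
from collections import Counter

def entropy_of(keys):
    """Shannon entropy (base 2) of the empirical distribution of `keys`."""
    counts = Counter(keys)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def joint_plot_time_only(requests, n):
    """Joint entropy of (time, op) at scales 0..n, aggregating only the time axis.

    `requests` holds (time_index, op) pairs with 0 <= time_index < 2**n;
    `op` is never aggregated, i.e. it stays at its finest granularity.
    """
    return [
        entropy_of((t >> (n - k), op) for t, op in requests)
        for k in range(n + 1)
    ]

def average_increment(plot):
    """Mean difference between entropy values at adjacent scales."""
    steps = [b - a for a, b in zip(plot, plot[1:])]
    return sum(steps) / len(steps)

n = 4
# Hypothetical trace: one request per time slot, alternating reads and writes.
requests = [(t, "R" if t % 2 == 0 else "W") for t in range(2 ** n)]

plot = joint_plot_time_only(requests, n)
avg = average_increment(plot)
print(plot, avg)
```

In this toy trace the plot flattens at the last scale, so it is not a straight line, and the average increment serves as a single summary value.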