Storage device performance prediction with CART models

Date: 2025-07-10

Storage device performance prediction is a key element of self-managed storage systems and application planning tasks, such as data assignment. This work explores the application of a machine learning tool, CART models, to storage device modeling. Our appr

1 Introduction

The costs and complexity of system administration in storage systems [17, 35, 11] and database systems [12, 1, 15, 21] make automation of administration tasks a critical research challenge. One important aspect of administering self-managed storage systems, particularly large storage infrastructures, is deciding which data sets to store on which devices. To find an optimal or near-optimal solution requires the ability to predict how well each device will serve each workload, so that loads can be balanced and particularly good matches can be exploited.

Researchers have long utilized performance models for such prediction to compare alternative storage device designs. Given sufficient effort and expertise, accurate simulations (e.g., [5, 28]) or analytic models (e.g., [22, 30, 31]) can be generated to explore design questions for a particular device. Unfortunately, in practice, such time and expertise is not available for deployed infrastructures, which are often comprised of numerous and distinct device types, and their administrators have neither the time nor the expertise needed to configure device models.

This paper attacks this obstacle by providing a black-box model generation algorithm. By “black box,” we mean that the model (and model generation system) has no information about the internal components or algorithms of the storage device. Given access to a device for some “training period,” the model generation system learns a device’s behavior as a function of input workloads. The resulting device model approximates this function using existing machine learning tools. Our approach employs the Classification And Regression Trees (CART) tool because of its efficiency and accuracy. CART models, in a nutshell, approximate functions on a multi-dimensional Cartesian space using piece-wise constant functions.
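To make the notion of a piece-wise constant approximation concrete, the following is a minimal sketch of a regression tree in plain Python. It is not the CART tool used in this work: the splitting rule (minimize the summed squared error of the two children), the depth limit, the function names, and the toy training data (request size in KB, a sequential flag, and a response time in ms) are all assumptions chosen for illustration.

```python
from statistics import mean

def sse(ys):
    """Sum of squared errors of ys around their mean."""
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(xs, ys):
    """Return the (feature, threshold) pair that most reduces SSE, or None."""
    best, best_cost = None, sse(ys)
    for f in range(len(xs[0])):
        # Candidate thresholds: every distinct value except the smallest,
        # so both children are guaranteed to be non-empty.
        for t in sorted({x[f] for x in xs})[1:]:
            left = [y for x, y in zip(xs, ys) if x[f] < t]
            right = [y for x, y in zip(xs, ys) if x[f] >= t]
            cost = sse(left) + sse(right)
            if cost < best_cost - 1e-12:
                best, best_cost = (f, t), cost
    return best

def build_tree(xs, ys, depth=0, max_depth=4):
    """Grow a tree; a leaf is a float, an inner node is (f, t, left, right)."""
    split = best_split(xs, ys) if len(ys) >= 2 and depth < max_depth else None
    if split is None:
        return float(mean(ys))  # leaf: one constant for this region
    f, t = split
    lo = [(x, y) for x, y in zip(xs, ys) if x[f] < t]
    hi = [(x, y) for x, y in zip(xs, ys) if x[f] >= t]
    return (
        f,
        t,
        build_tree([x for x, _ in lo], [y for _, y in lo], depth + 1, max_depth),
        build_tree([x for x, _ in hi], [y for _, y in hi], depth + 1, max_depth),
    )

def predict(tree, x):
    """Walk from the root to a leaf and return that leaf's constant."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] < t else right
    return tree

# Hypothetical toy data: (request size in KB, 1 if sequential else 0) -> ms.
xs = [(4, 1), (8, 1), (16, 1), (4, 0), (8, 0), (16, 0)]
ys = [1.0, 1.0, 1.0, 9.0, 9.0, 9.0]
tree = build_tree(xs, ys)
print(predict(tree, (32, 1)), predict(tree, (2, 0)))
```

On this toy data the tree splits once, on the sequential flag, and returns one constant per region. That is the piece-wise constant behavior described above, and it also shows that predictions outside the training range simply reuse the nearest region's constant.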

Such learning-based black box modeling is difficult for two reasons. First, all the machine learning tools we have examined use vectors of scalars as input. Existing workload characterization models, however, describe workloads in terms of distributions, and compressing these distributions into a set of scalars is not straightforward. Second, the quality of the generated models depends highly on the quality of the training workloads. The training workloads should be diverse enough to provide high coverage of the input space.

This work develops two ways of encoding workloads as vectors: a vector per request or a vector per workload. The two encoding schemes lead to two types of device models, operating at the per-request and per-workload granularities, respectively. The request-level device models predict each request’s response time based on its per-request vector, or “request description.” The workload-level device models, on the other hand, predict aggregate performance directly from per-workload vectors, or “workload descriptions.” Our experiments on a variety of real-world workloads have shown that these descriptions are reasonably good at capturing workload performance from both single disks and disk arrays. The two CART-based models have a median relative error of 17% and 38%, respectively, for average response time prediction, and 18% and 43%, respectively, for the 90th percentile, when the training and testing traces come from the same workload. The CART-based models also interpolate well across workloads.
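The difference between the two encodings can be sketched as follows. The specific features are assumptions made for illustration and are not the request and workload descriptions defined in this paper. The per-request vector here holds the inter-arrival gap, the block distance from the end of the previous request, the size, and a read flag; the per-workload vector holds a handful of aggregates over the whole trace. The relative error helper assumes the common definition |predicted - measured| / measured.

```python
from dataclasses import dataclass
from statistics import mean, median

@dataclass
class Request:
    arrival_ms: float  # arrival time in milliseconds
    lbn: int           # starting logical block number
    size: int          # request size in blocks
    is_read: bool

def request_vectors(trace):
    """One vector per request: (gap_ms, lbn_distance, size, is_read)."""
    vectors, prev = [], None
    for r in trace:
        gap = r.arrival_ms - prev.arrival_ms if prev is not None else 0.0
        # Distance from where the previous request ended; 0 means sequential.
        dist = abs(r.lbn - (prev.lbn + prev.size)) if prev is not None else 0
        vectors.append((gap, dist, r.size, int(r.is_read)))
        prev = r
    return vectors

def workload_vector(trace):
    """One vector per workload: (rate, mean size, read frac, sequential frac)."""
    vecs = request_vectors(trace)
    duration = trace[-1].arrival_ms - trace[0].arrival_ms
    rate = len(trace) / duration if duration > 0 else 0.0  # requests per ms
    seq = mean(1 if v[1] == 0 else 0 for v in vecs[1:]) if len(vecs) > 1 else 0.0
    return (rate, mean(v[2] for v in vecs), mean(v[3] for v in vecs), seq)

def median_relative_error(predicted, measured):
    """Median of |predicted - measured| / measured over paired values."""
    return median(abs(p - m) / m for p, m in zip(predicted, measured))

# Hypothetical four-request trace, for illustration only.
trace = [
    Request(0.0, 100, 8, True),
    Request(2.0, 108, 8, True),
    Request(4.0, 5000, 16, False),
    Request(10.0, 5016, 8, True),
]
per_request = request_vectors(trace)
per_workload = workload_vector(trace)
print(per_request)
print(per_workload)
```

A request-level model would be trained on rows of `per_request` paired with measured response times, while a workload-level model would see a single `per_workload` row per trace interval paired with an aggregate such as the average or 90th-percentile response time.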

The remainder of this paper is organized as follows. Section 2 discusses previous work in the area of storage device modeling and workload characterization. Section 3 describes CART and its properties. Section 4 describes two CART-based device models. Section 5 evaluates the models using several real-world workload traces. Section 6 concludes the paper.

2 Related Work

Performance modeling has a long and successful history. Almost always, however, thorough knowledge of the system being modeled is assumed. Disk simulators, such as Pantheon [33] and DiskSim [5], use software to simulate storage device behavior and produce accurate per-request response times. Developing such simulators is challenging, especially when disk parameters are not publicly available. Predicting performance
