L1 Cache and TLB Enhancements to the RAMpage Memory Hierarchy

Published: 2021-06-06

Abstract. The RAMpage hierarchy moves main memory up a level to replace the lowest-level cache by an equivalent-sized SRAM main memory, with a TLB caching page translations for that main memory. This paper illustrates how more aggressive components higher

Since the mid-1980s, CPU speeds have improved at a rate of 50-100% per year, while DRAM latency has only improved at around 7% per year [12]. If predictions of the memory wall [30] are correct, DRAM latency will become a serious limiting factor in performance improvement. Attempts at working around the memory wall are becoming increasingly common [9], but the fundamental underlying DRAM and CPU latency trends continue [27].
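The compounding effect of these two rates can be checked with a short calculation. The sketch below is illustrative only: it assumes both rates stay constant and compound annually, and the function names are hypothetical rather than taken from the paper.

```python
import math


def gap_factor(years, cpu_rate, dram_rate=0.07):
    """Return how much the CPU-DRAM speed gap widens over `years`.

    Assumes constant annual improvement rates (fractions, e.g. 0.5 for 50%).
    """
    return ((1.0 + cpu_rate) / (1.0 + dram_rate)) ** years


def gap_doubling_time(cpu_rate, dram_rate=0.07):
    """Return the number of years it takes the speed gap to double."""
    return math.log(2.0) / math.log((1.0 + cpu_rate) / (1.0 + dram_rate))


if __name__ == "__main__":
    # The two ends of the 50-100% per year range quoted in the text.
    for cpu_rate in (0.5, 1.0):
        print(f"CPU +{cpu_rate:.0%}/year: gap doubles every "
              f"{gap_doubling_time(cpu_rate):.2f} years, "
              f"x{gap_factor(10, cpu_rate):.0f} after 10 years")
```

Under these assumptions the gap doubles roughly every one to two years, which is why a fixed DRAM latency costs progressively more CPU cycles.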

2.2 The RAMpage Approach

RAMpage is based on the notion that DRAM, while still orders of magnitude faster than disk, is increasingly starting to display one attribute of a peripheral: there is time to do other work while waiting for it [24], particularly if relatively large units are moved between DRAM and SRAM level. In RAMpage, the lowest-level cache is managed as the main memory (i.e., as a paged virtually-addressed memory), with disk a secondary paging device. The RAMpage main memory page table is inverted, to minimize its size. An inverted page table has another benefit: no TLB miss can result in a DRAM reference, unless the reference causing the TLB lookup is not in any of the SRAM layers [22].
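To illustrate why an inverted page table stays small, the following minimal sketch keeps one entry per SRAM page frame, so its size depends on the number of frames rather than on the size of the virtual address space. The class, its methods, and the use of an address-space identifier (`asid`) are assumptions made for illustration; the paper's actual table layout and hashing scheme are not described in this section.

```python
class InvertedPageTable:
    """Toy inverted page table: one entry per physical (SRAM) page frame."""

    def __init__(self, num_frames, page_size):
        self.page_size = page_size
        # frames[f] holds the (asid, vpn) resident in frame f, or None.
        self.frames = [None] * num_frames
        # Hash index from (asid, vpn) to frame number; a dict stands in
        # for whatever hashing a real implementation would use.
        self.index = {}

    def map(self, asid, vpn, frame):
        """Place virtual page (asid, vpn) in `frame`, evicting any occupant."""
        key = (asid, vpn)
        previous = self.index.pop(key, None)
        if previous is not None:
            self.frames[previous] = None   # page moved out of its old frame
        evicted = self.frames[frame]
        if evicted is not None:
            del self.index[evicted]        # old occupant is no longer resident
        self.frames[frame] = key
        self.index[key] = frame

    def translate(self, asid, vaddr):
        """Return the SRAM physical address, or None on a page fault."""
        vpn, offset = divmod(vaddr, self.page_size)
        frame = self.index.get((asid, vpn))
        if frame is None:
            return None                    # page is not in the SRAM main memory
        return frame * self.page_size + offset
```

A lookup that returns `None` corresponds to the case in the text where the referenced page is not in the SRAM main memory and must be fetched from DRAM.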

RAMpage is intended to have the following advantages:

– fast hits – a hit physically addresses an SRAM memory

– full associativity – full associativity through paging avoids the slower hits of hardware full associativity

– software-managed paging – replacement can be as sophisticated as needed

– TLB misses to DRAM minimized – as explained above

– pinning in SRAM – critical OS data and code can be pinned in SRAM

– hardware simplicity – the complexity of a cache controller is removed from the lowest level of SRAM

– context switches on misses to DRAM – the CPU can be kept busy

These advantages come at the cost of slower misses because of software miss-handling, and the need to make operating system changes. However, the latter problem could be avoided by adding hardware support for the model.

The RAMpage approach has in the past been shown to scale well in the face of the growing CPU-DRAM speed gap, particularly with context switches on misses. The effect of context switches on misses is that, provided there is work available for the CPU, waiting for DRAM can effectively be eliminated [21]. Context switches on misses have the most significant effect.
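The condition "provided there is work available for the CPU" can be made concrete with a back-of-envelope model. The sketch below is not the simulation used in the RAMpage work. It assumes identical processes scheduled round-robin, each computing for `run` cycles between misses, a fixed DRAM miss latency of `miss` cycles, and a context switch cost of `switch` cycles. All names and numbers are hypothetical.

```python
def cpu_utilization(n_procs, run, miss, switch):
    """Fraction of CPU time spent on useful work in steady state.

    Idealized model: each of `n_procs` processes runs for `run` cycles,
    then misses to DRAM for `miss` cycles; the CPU pays `switch` cycles
    to move to the next ready process. One round of the schedule takes
    either the CPU-bound time or one process's run-plus-miss time,
    whichever is longer.
    """
    round_time = max(n_procs * (run + switch), run + miss)
    return n_procs * run / round_time


if __name__ == "__main__":
    # Hypothetical figures: 100 cycles of work per miss, 400-cycle miss,
    # 10-cycle context switch.
    for n in (1, 2, 5):
        print(n, "process(es):", round(cpu_utilization(n, 100, 400, 10), 3))
```

In this model a single process leaves the CPU idle for most of each miss, while enough ready processes hide the DRAM latency entirely, leaving only the context switch overhead.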

2.3 Alternatives

Approaches to addressing the memory wall can loosely (with some overlaps) be grouped into latency tolerance and miss reduction. Some approaches to latency tolerance include prefetch, critical word first, memory compression, write buffering, non-blocking caches, and simultaneous multithreading (SMT).
