Graduation-project source text for English-to-Chinese translation: Reliable Storage and … for Collaborative Data Sharing Systems (11)


[Figure: Network traffic vs. data size, TPC-H, 8 nodes. Axes: Database Scale Factor vs. Network Traffic (MB).]

Fig. 17. Running time vs. per-node bandwidth, 8 nodes, TPC-H scale factor 4. [Axes: Per-Node Bandwidth (KB/sec) vs. Execution Time (sec); series: Q10, Q3, Q5, Q1, Q6.]

Fig. 18. Larger-scale performance on EC2, TPC-H scale factor 10. [Axes: Number of Nodes vs. Execution Time (sec); series: Q10, Q5, Q3, Q1, Q6.]

Fig. 19. Total traffic on EC2, TPC-H scale factor 10. [Axes: Number of Nodes vs. Network Traffic (MB); series: Q10, Q3, Q5, Q6, Q1.]

Fig. 20. Per-node traffic on EC2, TPC-H scale factor 10. [Axes: Number of Nodes vs. Per-node Network Traffic (MB); series: Q10, Q3, Q5, Q6, Q1.]

Fig. 21. Running times for Q1 and Q10 with a failure, with and without incremental recovery, 8 nodes, TPC-H scale factor 2. [Two panels (Q1 and Q10); axes: Failure Time (sec) vs. Time (sec); series: Restart, Recovery.]

Running times are degraded but reasonable for the bandwidths likely to be available between academic, institutional, or corporate users (>400 kB/sec). Queries 1 and 6, which perform no rehash operations and therefore send much less data over the network, are less affected than queries 3, 5, and 10, which join multiple relations and rehash data while doing so.
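The sensitivity of the join queries to bandwidth comes down to rehashing: to evaluate a distributed join, each node must route tuples to whichever node owns the hash bucket of their join key. The following Python sketch is our own illustration, not ORCHESTRA code; the node count, tuple layout, and function names are hypothetical. It shows the kind of repartitioning step that generates this traffic, and why selection/aggregation-only queries such as Q1 and Q6 avoid it.

```python
# Minimal sketch of "rehashing" for a distributed join (not ORCHESTRA's code).
# Every tuple must be sent to the node that owns its join key's hash bucket,
# so join queries (Q3, Q5, Q10) ship data over the network, while purely
# local selection/aggregation queries (Q1, Q6) do not.
from collections import defaultdict
from typing import Iterable

NUM_NODES = 8  # hypothetical cluster size, matching the 8-node experiments

def owner(join_key) -> int:
    """Pick the node responsible for this join key's hash bucket."""
    return hash(join_key) % NUM_NODES

def rehash(tuples: Iterable[tuple], key_pos: int):
    """Group tuples by destination node; any tuple whose owner is not the
    node it currently resides on must cross the network."""
    outboxes = defaultdict(list)
    for t in tuples:
        outboxes[owner(t[key_pos])].append(t)
    return outboxes

# Example: orders tuples keyed on custkey must be co-located with the
# matching customer tuples before each node can join them locally.
orders = [(1, 101, 99.0), (2, 205, 15.5), (3, 101, 42.0)]
by_node = rehash(orders, key_pos=1)
for node, batch in sorted(by_node.items()):
    print(f"node {node} receives {len(batch)} tuple(s)")
```

Because nearly every tuple of the joined relations crosses the network in this step, the running times of the join queries track the available per-node bandwidth much more closely than those of Q1 and Q6.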

Higher Latency Settings. We omit a full presentation of our latency experiments due to space constraints. Realistic latencies (up to 200 ms) had little impact on query performance.

D. Scalability to Larger Numbers of Nodes

Since we have a limited number of local machines in our cluster, we next tried several alternatives to scale to higher numbers of nodes. Our initial efforts were with the PlanetLab network testbed, but disappointingly we found that most nodes there were severely underpowered and overloaded; disk- and memory-intensive tasks like ours were constantly thrashing, resulting in inconsistent and uninformative results.

Instead, we leased virtual nodes from Amazon's EC2 service, something we envision ORCHESTRA's user base doing as needed. Amazon has data centers geographically distributed across the world, so round-trip times are short and bandwidth is high. We used EC2's "large" instances, each with 7.5 GB RAM and a virtualized dual-core 2 GHz Opteron CPU. We show settings with only EC2 nodes to make the execution time results simpler to understand, although we performed additional experiments showing similar results using a mixture of local and EC2 nodes. We experimented with the TPC-H scenario, as performance on STBenchmark at the data sizes we could generate was either too fast to be measured reliably or dominated by the cost of collecting the results.

We varied the total number of participants in the setting from 10 to 100, using TPC-H scale factor 10 (10 GB of data). Network traffic results, shown in Figures 19 and 20, are similar to the results shown in Figures 11 and 12 for smaller numbers of nodes. Execution times are shown in Figure 18. As before, increasing the number of nodes leads to a dramatic decrease in execution time. This experiment validates the scalability of our system to large numbers of nodes.

E. Failure and Recomputation

Finally, we study recovery when a node fails or becomes unreachable. One option is to abort the query and restart it over the remaining nodes. The other is to use the remaining nodes to recompute the "lost" results. Our experiments used 8 nodes and TPC-H scale factor 2.

Incremental Recomputation vs. Total Restart. To explore the trade-offs between incremental recomputation and full restart, we first ran a series of experiments using Q1 (a selection and aggregation query) and Q10 (which performs three joins followed by an aggregation), chosen to represent the two classes of TPC queries we studied. We started each query and, at varying points after the start of the query (before it finished), we caused one of the nodes to fail. To avoid giving incremental recomputation an unfair advantage, we recompute using the same routing tables (which spread the range of the failed node evenly over the nodes holding its replicated data). Figure 21 shows performance results for Q1 and Q10. In both cases, incremental recovery outperforms aborting and restarting by approximately 20%, validating the approach. Execution is slow for both techniques (compared to no failure) due to the cache misses inherent when a new node takes over a portion of the substrate key space.
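The routing-table policy used during recovery can be pictured with a small sketch. The Python code below is a simplification with hypothetical names, not the actual substrate: it splits the key range owned by the failed node evenly across the nodes that hold replicas of its data, which is the behavior the experiment relies on to keep the recomputation load balanced.

```python
# Minimal sketch (hypothetical names) of the recovery policy described above:
# the key range owned by the failed node is split evenly across the nodes
# holding replicas of its data, and the routing table is patched so that
# subsequent lookups for that range go to those nodes.

def redistribute(routing: dict, failed: str, replica_holders: list):
    """routing maps node -> (lo, hi), the half-open key range it owns."""
    lo, hi = routing.pop(failed)
    width = (hi - lo) // len(replica_holders)
    takeover = {}
    for i, node in enumerate(replica_holders):
        start = lo + i * width
        end = hi if i == len(replica_holders) - 1 else start + width
        takeover[node] = (start, end)   # node now also serves this slice
    return takeover                     # extra ranges, alongside routing

routing = {"n0": (0, 64), "n1": (64, 128), "n2": (128, 192), "n3": (192, 256)}
extra = redistribute(routing, failed="n1", replica_holders=["n0", "n2", "n3"])
print(extra)  # {'n0': (64, 85), 'n2': (85, 106), 'n3': (106, 128)}
```

Each surviving replica holder thus takes over a cold slice of the key space, which is the source of the cache misses noted above.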

Overhead of Incremental Recomputation. Incremental recomputation requires more data to be stored and sent over the network (to track the provenance of intermediate results), and requires that all intermediate results be kept until the end of the query. Clearly, if this adds significant overhead to an average query, it may actually be preferable to restart after nodes fail. We measured the overhead of incremental recovery support on the TPC-H queries, which we briefly summarize due to space constraints. As expected, recovery support slightly increased execution time: queries ran from 2%-7% slower, and network traffic increased by negligible amounts, at …
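The bookkeeping behind this overhead can be sketched as follows; the names and data structures are hypothetical, a minimal illustration of provenance tracking rather than the system's actual implementation. Each intermediate result records the set of source nodes it was derived from, so after a failure only the results that depend on the failed node need to be recomputed.

```python
# Minimal sketch (hypothetical names) of provenance tracking for incremental
# recomputation: each intermediate result carries the set of nodes whose data
# it was derived from, so when a node fails only the dependent results are
# recomputed. The cost is the extra provenance stored and shipped, plus
# retaining intermediates until the query finishes.
from dataclasses import dataclass, field

@dataclass
class Intermediate:
    row: tuple
    provenance: frozenset = field(default_factory=frozenset)  # source node ids

def join(left: Intermediate, right: Intermediate) -> Intermediate:
    # The joined row depends on every node either input depended on.
    return Intermediate(left.row + right.row, left.provenance | right.provenance)

def lost_results(intermediates, failed_node):
    # Only these need to be recomputed; everything else remains valid.
    return [r for r in intermediates if failed_node in r.provenance]

a = Intermediate(("cust", 101), frozenset({"n1"}))
b = Intermediate(("order", 101, 99.0), frozenset({"n3"}))
results = [join(a, b)]
print(lost_results(results, "n3"))  # the joined row must be recomputed
```

The cost is exactly what the measurements reflect: a few extra bytes of provenance per intermediate tuple, plus the obligation to retain intermediates until the query completes.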

