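Graduation project, English-to-Chinese translation source text: Reliable Storage and ... for Collaborative Data Sharing Systems (10)

Date: 2025-07-06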

Fig. 7. Running time: STBenchmark, 800K tuples/relation, 1-16 nodes. [Plot: execution time (sec) vs. number of nodes; series: Join, Corresp., Concatenate, Copy, Select.]

Fig. 8. Network traffic: STBenchmark, 800K tuples/relation, 1-16 nodes. [Plot: network traffic (MB) vs. number of nodes; series: Join, Corresp., Copy, Concatenate, Select.]

Fig. 9. Per-node network traffic: STBenchmark, 800K tuples/relation, 1-16 nodes. [Plot: network traffic per node (MB) vs. number of nodes; series: Join, Corresp., Copy, Concatenate, Select.]

Fig. 10. Running time: TPC-H scale factor 0.5, 1-16 nodes. [Plot: execution time (sec) vs. number of nodes; series: Q1, Q3, Q5, Q6, Q10.]

Fig. 11. Network traffic: TPC-H scale factor 0.5, 1-16 nodes. [Plot: network traffic (MB) vs. number of nodes; series: Q1, Q3, Q5, Q6, Q10.]

Fig. 12. Per-node network traffic: TPC-H scale factor 0.5, 1-16 nodes. [Plot: per-node network traffic (MB) vs. number of nodes; series: Q1, Q3, Q5, Q6, Q10.]

Fig. 13. Running time vs. data size, STBenchmark, 8 nodes. [Plot: execution time (sec) vs. # tuples/relation; series: Join, Corresp., Copy, Concatenate, Select.]

Fig. 14. Running time vs. data size, TPC-H, 8 nodes. [Plot: execution time (sec) vs. database scale factor; series: Q1, Q3, Q5, Q6, Q10.]

Fig. 15. Network traffic vs. data size, STBenchmark, 8 nodes. [Plot: network traffic (MB) vs. # tuples/relation; series: Join, Corresp., Copy, Concatenate, Select.]

Scaling Nodes. Figure 7 shows execution times for STBenchmark (at 800,000 tuples/relation) for 1 to 16 physical nodes, while Figure 10 shows times for TPC-H queries over the 500MB data set (scale factor 0.5). Note that results for STBenchmark are directly above the corresponding results for TPC-H to emphasize that the trends are very similar. Ideally, the running times would be halved each time we double the number of nodes. Our results come very close to matching this expectation for all of the TPC-H queries and about half of the STBenchmark queries. In the other STBenchmark queries (in particular Copy), so much data is returned (because the tuples consist of many long strings) that collecting the results at the query initiator becomes a bottleneck. With 16 nodes, all but 0.1 sec of the Copy query is spent transmitting and receiving the results. We conducted separate experiments to verify that performance is mostly limited by network bandwidth, with some additional performance degradation due to the unmarshaling and storage at the query initiator. All queries continue to show some performance improvement as the number of processing nodes increases.
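The ideal-scaling expectation and the result-collection bottleneck can be illustrated with a simple cost model. The sketch below is an assumption made for illustration, not the authors' analysis: it splits running time into a part that divides evenly across nodes and a fixed part (such as collecting results at the query initiator) that does not shrink. The function names and all input numbers are hypothetical and are not measurements from the experiments.

```python
def modeled_time(parallel_sec, fixed_sec, nodes):
    """Running time under the assumed model.

    parallel_sec: single-node time for work that divides evenly across nodes.
    fixed_sec:    time that does not shrink with more nodes, e.g. receiving
                  all results at the query initiator.
    """
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return parallel_sec / nodes + fixed_sec


def speedup(parallel_sec, fixed_sec, nodes):
    """Speedup over a single node under the same assumed model."""
    single = modeled_time(parallel_sec, fixed_sec, 1)
    return single / modeled_time(parallel_sec, fixed_sec, nodes)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to contrast the two regimes.
    for n in (1, 2, 4, 8, 16):
        ideal = modeled_time(16.0, 0.0, n)         # halves with each doubling
        bottlenecked = modeled_time(16.0, 4.0, n)  # flattens toward the fixed cost
        print(n, ideal, bottlenecked)
```

With no fixed cost the modeled time halves at every doubling. With a fixed cost the curve still improves as nodes are added but flattens, which is the qualitative shape the text reports for queries that return large results.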

Figures 8 and 11 show the total network traffic while executing these queries, and Figures 9 and 12 show the per-node traffic. As expected, the network traffic increases as we scale up the number of nodes, but not dramatically so, and the per-node traffic (after rising significantly when we move from single-node computation to distributed operation) continues to decrease as nodes are added to the system.
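One way to see why total traffic can grow slowly while per-node traffic falls is a uniform repartitioning model. The sketch below is an assumption, not the system's actual cost model: it supposes that a fixed volume of data is hash-partitioned evenly over n nodes, so each tuple stays on its current node with probability 1/n. It ignores result collection and protocol overhead, and the function names and the 100 MB input are hypothetical.

```python
def total_shipped(data_mb, nodes):
    """Data crossing the network when data_mb is repartitioned uniformly
    over `nodes` nodes (assumed model): a fraction (nodes - 1) / nodes of
    the tuples must move to a different node."""
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    return data_mb * (nodes - 1) / nodes


def per_node_shipped(data_mb, nodes):
    """Average traffic per node under the same assumed model."""
    return total_shipped(data_mb, nodes) / nodes


if __name__ == "__main__":
    # 100 MB is a placeholder volume, not a figure from the experiments.
    for n in (1, 2, 4, 8, 16):
        print(n, total_shipped(100.0, n), per_node_shipped(100.0, n))
```

Under this model the total rises from zero on one node toward the full data volume and never exceeds it, while the per-node share jumps at two nodes and then shrinks, matching the qualitative trend described above.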

Scaling Data Set Size. We next consider the effects of scaling the data. Figure 13 shows execution times for STBenchmark on the 16-node cluster for 100K to 1.6M tuples/relation, and Figure 14 shows the same for the TPC-H queries over the 8-node cluster while varying the data size from 250MB to 4GB (scale factors 0.25 to 4). Figures 15 and 16 show total network traffic for the same scenarios. Execution times and network traffic for all queries scale approximately linearly in the size of the data, as one would expect since there are only foreign-key joins and the data is fairly evenly distributed. We conclude that our system scales well on a LAN, and move on to consider other network settings.
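A claim of approximately linear scaling can be checked against measured points with an ordinary least-squares fit. The sketch below is a generic check, not the authors' procedure: the helper names are hypothetical, and the sample values are synthetic points on an exact line, not data from Figures 13 to 16.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Coefficient of determination of the fitted line (1.0 = exactly linear)."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Synthetic placeholder data: scale factors and made-up times on a line.
    scale_factors = [0.25, 0.5, 1.0, 2.0, 4.0]
    times = [2.0 * s + 1.0 for s in scale_factors]
    print(fit_line(scale_factors, times), r_squared(scale_factors, times))
```

An R-squared value close to 1 on real measurements would support the linear-scaling reading; a clearly lower value would suggest a superlinear or sublinear component.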

C. Performance over a Simulated Wide Area Network

We next consider possible variations on Internet connectivity among compute nodes. We made use of the traffic shaping and network emulation features built into recent versions of Linux to simulate various parameter changes. Specifically, we used NetEm to delay outgoing packets, simulating a higher latency network, and we used the HTB queue discipline to simulate a lower bandwidth network. Here we focus on the TPC-H benchmark, since STBenchmark, due to its large strings, becomes increasingly bandwidth-constrained at the query initiator, and since we feel its data is actually less representative than TPC-H’s.
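The text does not give the exact commands used, so the following is only a sketch of how NetEm and HTB are commonly configured through the Linux `tc` utility. The interface name, rate, delay, handle numbers, and the choice to stack NetEm beneath an HTB class are all assumptions; the experiments may have applied the two settings separately. The helper builds the command lines without running them, since applying them requires root privileges.

```python
def shaping_commands(dev, rate, delay):
    """Return tc invocations (as argument lists) that cap outgoing
    bandwidth with HTB and add latency with NetEm on interface `dev`.

    dev, rate, and delay are caller-supplied, e.g. "eth0", "1mbit", "50ms".
    """
    return [
        # Root HTB qdisc; unclassified traffic falls into class 1:10.
        ["tc", "qdisc", "add", "dev", dev, "root",
         "handle", "1:", "htb", "default", "10"],
        # HTB class that limits the outgoing rate.
        ["tc", "class", "add", "dev", dev, "parent", "1:",
         "classid", "1:10", "htb", "rate", rate],
        # NetEm attached under the HTB class to delay outgoing packets.
        ["tc", "qdisc", "add", "dev", dev, "parent", "1:10",
         "handle", "10:", "netem", "delay", delay],
    ]


def clear_command(dev):
    """Return the tc invocation that removes the root qdisc and its children."""
    return ["tc", "qdisc", "del", "dev", dev, "root"]


if __name__ == "__main__":
    # Placeholder values; print the commands instead of executing them.
    for cmd in shaping_commands("eth0", "1mbit", "50ms"):
        print(" ".join(cmd))
    print(" ".join(clear_command("eth0")))
```

Printing rather than executing keeps the sketch safe to run anywhere; on a test machine the printed lines could be run as root, with the clear command restoring the default queueing discipline afterwards.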

Limited Bandwidth Settings. Our experimental results, shown in Figure 17, demonstrate that while performance suffers in very low-bandwidth connections, execution times
