

Performance and Energy Efficiency of Hadoop Deployment Models (Literature Translation)

[Keywords: Hadoop, deployment models, energy efficiency]

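Communication engineering literature translation: The exponential growth of scientific and business data has driven the evolution of cloud computing and of the MapReduce parallel programming model. Cloud computing emphasizes higher utilization and power savings through consolidation, while MapReduce enables large-scale data analysis. The Hadoop framework has recently become the standard framework implementing the MapReduce model. In this paper, we evaluate Hadoop performance under the traditional model, in which data and compute services are collocated, and we also consider the impact of separating the two services. Separating data from compute services offers more flexibility in environments where data locality may not have a considerable impact, such as virtualized environments and clusters with advanced networks. We also conduct an energy efficiency evaluation of Hadoop on physical and virtual clusters in different configurations. Our extensive evaluation shows that: (1) performance on physical clusters is significantly better than on virtual clusters; (2) the performance degradation caused by separating the services depends on the data-to-compute ratio; (3) application completion progress correlates with power consumption, and power consumption is heavily application specific.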

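I. Introduction

Over the past years, the amount of data generated by scientific and business applications has grown exponentially. For instance, the Large Hadron Collider (LHC) is expected to generate dozens of petabytes of data per year [1]. Similarly, Facebook is already processing over 500 terabytes of new data daily [2].

Cloud computing environments and MapReduce [3] have evolved separately over the last few years to address the need to process large data sets. Cloud computing environments leverage virtualization to increase utilization and to decrease power consumption through virtual machine (VM) consolidation. The key idea of MapReduce is to divide the data into fixed-size chunks that are processed in parallel. Several open-source MapReduce frameworks have been developed in recent years, the most popular being Hadoop [4]. Although Hadoop was initially designed to operate on physical clusters, with the advent of cloud computing it is now also deployed on virtual clusters (e.g., Amazon Elastic MapReduce [5]). However, the performance and power implications of such environments are still not well investigated. In this paper, we explore two Hadoop deployment models on physical and virtual clusters to understand their performance and power implications. First, we use the traditional Hadoop model, in which data and compute services are collocated. Second, we consider an alternative Hadoop deployment model in which the data and compute services are separated. Separating data and compute services is especially interesting in environments where data locality may not have a considerable impact, such as virtualized environments and clusters with advanced networks [6].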

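We consider the impact of the deployment models on application performance (i.e., execution time). We also consider power consumption, because the data centers that enable scalable data analysis now require large amounts of energy. Some works (e.g., [7], [8]) have investigated the design of energy-saving mechanisms for Hadoop. However, only one work [9] has studied the power consumption of Hadoop applications, and it focuses on physical clusters, the traditional Hadoop deployment model, and compute-intensive applications. Understanding application performance profiles and power consumption is a fundamental step toward designing......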
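The two metrics named above, execution time and power consumption, combine into energy, which is the quantity an energy efficiency comparison ultimately rests on. The sketch below shows one common way to obtain it, assuming power is sampled at known timestamps and integrated with the trapezoidal rule. The function names and the sample readings are hypothetical and are not taken from the paper, whose measurement method is not described in this excerpt.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate sampled power (watts) over time (seconds), trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def average_power_watts(timestamps_s, power_w):
    """Average power over the run: total energy divided by execution time."""
    duration = timestamps_s[-1] - timestamps_s[0]
    if duration <= 0:
        raise ValueError("the trace must span a positive duration")
    return energy_joules(timestamps_s, power_w) / duration


# Hypothetical trace: three readings taken 10 seconds apart.
timestamps = [0.0, 10.0, 20.0]
power = [100.0, 200.0, 100.0]
print(energy_joules(timestamps, power), "J")
print(average_power_watts(timestamps, power), "W")
```

Under this view, a configuration that finishes sooner but draws more power is not automatically more energy efficient; the comparison depends on the product of the two.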

Abstract: The exponential growth of scientific and business data has resulted in the evolution of cloud computing and the MapReduce parallel programming model. Cloud computing emphasizes increased utilization and power savings through consolidation, while MapReduce enables large-scale data analysis. The Hadoop framework has recently evolved into the standard framework implementing the MapReduce model. In this paper, we evaluate Hadoop performance in the traditional model of collocated data and compute services and also consider the impact of separating out the services. The separation of data and compute services provides more flexibility in environments where data locality might not have a considerable impact, such as virtualized environments and clusters with advanced networks.

In this paper, we also conduct an energy efficiency evaluation of Hadoop on physical and virtual clusters in different configurations. Our extensive evaluation shows that: (1) performance on physical clusters is significantly better than on virtual clusters; (2) performance degradation due to separation of the services depends on the data to compute ratio; (3) application completion progress correlates with the power consumption and power consumption is heavily application specific.

Keywords: Cloud Computing, Hadoop MapReduce, Performance, Energy Efficiency, Virtualization

I. INTRODUCTION

Over the past years, the amount of data generated by scientific as well as business applications has grown exponentially. For instance, the Large Hadron Collider (LHC) is expected to generate dozens of petabytes of data per year [1]. Similarly, Facebook is already processing over 500 terabytes of new data daily [2].

Cloud computing environments and MapReduce [3] have evolved separately in the last few years to address the need to process large data sets. Cloud computing environments leverage virtualization to increase utilization and decrease power consumption through virtual machine (VM) consolidation. The key idea of MapReduce is to divide the data into fixed-size chunks which are processed in parallel. Several open-source MapReduce frameworks have been developed in recent years, the most popular one being Hadoop [4]. While Hadoop was initially designed to operate on physical clusters, with the advent of cloud computing it is now also deployed......
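The key idea stated above, dividing the data into fixed-size chunks that are processed in parallel, can be illustrated with a minimal word-count sketch. It does not use the Hadoop API: the function names, the chunk size, and the thread pool standing in for cluster nodes are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(items, chunk_size):
    """Divide the input into fixed-size chunks (the last one may be shorter)."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def map_chunk(chunk):
    """Map phase: count the words inside a single chunk."""
    return Counter(chunk)


def reduce_counts(left, right):
    """Reduce phase: merge two partial counts into one."""
    return left + right


def word_count(text, chunk_size=4, workers=2):
    """Count words by mapping over chunks concurrently, then reducing."""
    chunks = split_into_chunks(text.split(), chunk_size)
    # The thread pool is a stand-in for worker nodes (an assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partial_counts, Counter())


counts = word_count("a b a c b a d a e")
print(counts["a"], counts["b"])
```

In a real deployment the chunks would be blocks of a distributed file system and the map tasks would run on separate machines. That is where the question of collocating or separating data and compute services arises.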

 




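This literature translation was collected and compiled by 毕业论文设计参考 [http://www.qflunwen.com]: Performance and Energy Efficiency of Hadoop Deployment Models (Literature Translation).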