208
PUBLIC
© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ
Apache™ Hadoop® is an open-source framework
for distributed storage and processing of very large
data sets on computer clusters from commodity
hardware
Used at Google, Yahoo, Facebook, eBay,
LinkedIn, startups, and Fortune 500 enterprises to
store and process petabytes of data on thousands
of servers
Hadoop components
Cluster of commodity servers
Distributed storage layer (Hadoop Distributed File
System, or HDFS)
Distributed processing infrastructure (MapReduce
programming model)
Hadoop Overview
What is Hadoop?




