Hadoop Basics
At a high level, Hadoop has two key components: the Hadoop Distributed File System (HDFS) and MapReduce.
October 16, 2014
In late 2011, Dr. David DeWitt presented a Big Data keynote session, focused primarily on Hadoop, at the Professional Association for SQL Server (PASS) Summit. Dr. DeWitt's keynote is a great primer for learning more about Hadoop. At a high level, Hadoop starts with two key components:
Hadoop Distributed File System (HDFS) – a distributed, fault-tolerant file system.
MapReduce – a framework for writing and executing distributed, fault-tolerant algorithms. Note that MapReduce has recently undergone an overhaul and is now referred to as either MapReduce 2.0 (MRv2) or YARN; a minimal word-count sketch follows this list.
Other components, such as Hive and Pig, build on top of these two.
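To make the MapReduce model concrete, here is a minimal word-count job written against the Hadoop Java MapReduce API. It is a sketch rather than production code: it assumes the Hadoop client libraries are on the classpath, the input and output HDFS paths are supplied as command-line arguments, and the class names (WordCount, TokenizerMapper, IntSumReducer) are illustrative.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every word in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each distinct word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);  // optional local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Packaged into a jar, a job like this would typically be submitted with something along the lines of "hadoop jar wordcount.jar WordCount /input /output", where the paths refer to HDFS directories; the framework handles splitting the input, scheduling map and reduce tasks across the cluster, and retrying failed tasks.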
Main article: Integrating Hadoop with SQL Server