Apache Hadoop: Single-node or Pseudo-distributed cluster on macOS

This article walks through setting up and configuring a single-node (pseudo-distributed) Hadoop cluster on macOS. A single-node cluster is useful for development because you can run quick tests without access to a full cluster.
At the end of this tutorial, you’ll have a single-node Hadoop cluster with all the essential Hadoop daemons such as NameNode, DataNode, NodeManager, ResourceManager, and SecondaryNameNode.
Prerequisites
The two prerequisites for setting up a single-node Hadoop cluster are Java and SSH.
Java
Java must be installed and the $JAVA_HOME environment variable must be set. Hadoop 3.1.x runs on Java 8.
Install Java
Install Java from the official website – https://java.com/en/download/
Verify Java is installed
Check the Java version in the terminal.
$ java -version
If Java is installed, the version is printed, similar to the output below.
$ java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
Set $JAVA_HOME
Add the $JAVA_HOME environment variable to the ~/.bash_profile file. If your shell is zsh (the default since macOS Catalina), use ~/.zshrc instead.
$ echo 'export JAVA_HOME="$(/usr/libexec/java_home)"' >> ~/.bash_profile
Source the .bash_profile file
$ source ~/.bash_profile
Verify that $JAVA_HOME is set up properly
$ echo $JAVA_HOME
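A slightly stricter check than echoing the variable is to confirm that $JAVA_HOME points at a directory that contains an executable bin/java. The function below is a minimal sketch of that idea; the name check_java_home is hypothetical and is not part of Java or Hadoop.

```shell
# Hypothetical helper: succeeds only if JAVA_HOME is set and
# contains an executable bin/java.
check_java_home() {
  if [ -z "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi
  if [ ! -x "$JAVA_HOME/bin/java" ]; then
    echo "No executable found at $JAVA_HOME/bin/java" >&2
    return 1
  fi
  echo "JAVA_HOME looks valid: $JAVA_HOME"
}
```

After sourcing the profile, running check_java_home should print the resolved path, or an error message on stderr if something is off.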
SSH
SSH (Remote Login) is disabled by default on macOS. Enable it and set up SSH keys so that the Hadoop scripts can start and manage the daemons over SSH.
Enable SSH
Open System Preferences and go to Sharing
Select the Remote Login checkbox to enable SSH

Set up SSH Key
Generate SSH Key
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Add the newly created public key to authorized SSH keys
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
Verify SSH
Verify you can SSH to localhost without a passphrase:
$ ssh localhost
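Because ssh localhost opens an interactive session, a scripted check can be handy. The sketch below relies on the standard OpenSSH options BatchMode and ConnectTimeout; with BatchMode=yes, ssh fails instead of prompting for a password or passphrase. The function name can_ssh_without_prompt and the SSH_CMD override (useful only for substituting a different ssh binary) are hypothetical.

```shell
# Hypothetical helper: returns 0 if key-based login to the given host
# works without any prompt. BatchMode=yes makes ssh fail rather than
# ask for a password or passphrase.
can_ssh_without_prompt() {
  local host="${1:-localhost}"
  "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true
}
```

For example, can_ssh_without_prompt localhost && echo "passwordless SSH works". Note that in batch mode ssh also refuses unknown host keys, so connect once interactively with ssh localhost to accept the host key first.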
Install and configure Hadoop
Download Hadoop Distribution
Download the latest Hadoop distribution from the official website – https://hadoop.apache.org/releases.html
hadoop-3.1.2 was the latest distribution at the time of writing.
Unpack and move
Unpack the tar file into ~/Downloads. Update the paths in the command if the archive is in a different directory (if your browser already decompressed it to hadoop-3.1.2.tar, use that file name).
$ tar xzvf ~/Downloads/hadoop-3.1.2.tar.gz -C ~/Downloads
Move the Hadoop distribution directory to a preferred directory. We are using the /Users/ash/bin/ directory (~/bin for the user ash) to store the Hadoop distribution. You can use any directory you prefer.
$ mkdir -p ~/bin
$ mv -f ~/Downloads/hadoop-3.1.2 ~/bin/
Set variables
hadoop-env.sh
Edit the ~/bin/hadoop-3.1.2/etc/hadoop/hadoop-env.sh file to define the following parameters. Set HADOOP_HOME to the Hadoop distribution location on your machine.
export JAVA_HOME="$(/usr/libexec/java_home)"
export HADOOP_HOME="$HOME/bin/hadoop-3.1.2"
.bash_profile
Add the following variables to the ~/.bash_profile file.
export HADOOP_VERSION=3.1.2
export HADOOP_HOME=$HOME/bin/hadoop-$HADOOP_VERSION
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export YARN_HOME=$HADOOP_HOME
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_LIBEXEC_DIR=$HADOOP_HOME/libexec
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
Source the .bash_profile file.
$ source ~/.bash_profile
Verify the variables are set
Verify $HADOOP_HOME is set.
$ echo $HADOOP_HOME
The output should look similar to the following.
$ echo $HADOOP_HOME
/Users/ash/bin/hadoop-3.1.2
Verify Hadoop executable binaries are added to $PATH.
$ hadoop version
The output should look similar to the following.
$ hadoop version
Hadoop 3.1.2
Source code repository https://github.com/apache/hadoop.git -r 1019dde65bcf12e05ef48ac71e84550d589e5d9a
Compiled by sunilg on 2019-01-29T01:39Z
Compiled with protoc 2.5.0
From source with checksum 64b8bdd4ca6e77cce75a93eb09ab2a9
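The individual echo checks above can be folded into one loop that reports every variable that is still unset. This is a minimal sketch; the function name check_hadoop_env is hypothetical, and the list covers only the three variables the later steps depend on most directly.

```shell
# Hypothetical helper: prints each variable name that is unset or empty
# and returns non-zero if any were missing.
check_hadoop_env() {
  local missing=0 name value
  for name in JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_hadoop_env in a fresh terminal prints nothing when the profile was picked up correctly, which also confirms that the variables survive a new login shell.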
Configure site.xml files
Modify the following site.xml files with the properties shown below.
mapred-site.xml
$HADOOP_HOME/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>localhost:8021</value>
</property>
</configuration>
yarn-site.xml
$HADOOP_HOME/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>98.5</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME,HDFS_HOME</value>
</property>
</configuration>
hdfs-site.xml
$HADOOP_HOME/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
core-site.xml
$HADOOP_HOME/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
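After editing the four files, it can help to read a property back and confirm the value landed where expected. The function below is a rough sketch that uses awk to pull the value following a given name; get_site_property is a hypothetical name, and the approach assumes each value element comes after its name element, as in the files above. It is not a real XML parser.

```shell
# Hypothetical helper: prints the <value> of the first property whose
# <name> matches. $1 = path to a *-site.xml file, $2 = property name.
get_site_property() {
  awk -v want="$2" '
    /<name>/ {
      # Remember the most recent property name.
      name = $0
      sub(/.*<name>/, "", name)
      sub(/<\/name>.*/, "", name)
    }
    /<value>/ && name == want {
      value = $0
      sub(/.*<value>/, "", value)
      sub(/<\/value>.*/, "", value)
      print value
      exit
    }
  ' "$1"
}
```

For example, get_site_property "$HADOOP_HOME/etc/hadoop/core-site.xml" fs.defaultFS should print hdfs://localhost:9000. Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS reports the value Hadoop itself resolves.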
Start up the Hadoop cluster
Start the cluster
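Format the HDFS filesystem. This is needed only once, before the first start; formatting again erases the existing HDFS metadata.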
$ hdfs namenode -format
Start the Hadoop daemons.
$ start-all.sh
Verify the cluster is up
Verify NameNode, DataNode, NodeManager, ResourceManager, and SecondaryNameNode are running.
$ jps
The output should look similar to the following.
33703 SecondaryNameNode
34376 ResourceManager
34537 Jps
34473 NodeManager
33466 NameNode
33567 DataNode
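Scanning the jps listing by eye works, but the same check can be scripted. The sketch below reads jps output on standard input and reports any of the five daemons that is missing; check_daemons is a hypothetical name, and it assumes the two-column "pid name" layout shown above.

```shell
# Hypothetical helper: reads jps output on stdin and reports any of the
# five expected daemons that are not listed.
check_daemons() {
  local listing daemon missing=0
  listing=$(cat)
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # Compare against the whole second field so that "NameNode"
    # does not match "SecondaryNameNode".
    if ! printf '%s\n' "$listing" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "not running: $daemon"
      missing=1
    fi
  done
  return "$missing"
}
```

Typical use is jps | check_daemons, which prints nothing and returns 0 when all five daemons are up.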
Information about the cluster
Browse the following web pages to find information about the Hadoop cluster.
Hadoop Health
Browse the Hadoop Health web page (the NameNode web UI, which reports HDFS status) at http://localhost:9870.

YARN Resource Manager
Browse the YARN ResourceManager UI at http://localhost:8088/cluster.

Voila! You have a single-node Hadoop cluster up and running on your Mac.
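With the daemons running, a quick end-to-end check is to write a small file into HDFS and read it back. The sketch below uses the standard hdfs dfs subcommands (-mkdir, -put, -cat, -rm); the function name hdfs_smoke_test and the scratch path under /tmp in HDFS are assumptions made for illustration.

```shell
# Hypothetical smoke test: writes a small file to HDFS, reads it back,
# and removes it again. Returns non-zero if any step fails.
hdfs_smoke_test() {
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs is not on PATH" >&2
    return 1
  fi
  local local_file remote_dir rc
  local_file=$(mktemp) || return 1
  remote_dir="/tmp/smoke-test-$$"   # hypothetical scratch path in HDFS
  echo "hello hadoop" > "$local_file"

  hdfs dfs -mkdir -p "$remote_dir" &&
    hdfs dfs -put -f "$local_file" "$remote_dir/hello.txt" &&
    hdfs dfs -cat "$remote_dir/hello.txt"
  rc=$?

  # Clean up regardless of the outcome.
  hdfs dfs -rm -r -f "$remote_dir" >/dev/null 2>&1
  rm -f "$local_file"
  return "$rc"
}
```

If the cluster is healthy, hdfs_smoke_test prints the file contents and returns 0; a failure at any step usually points back to the NameNode or DataNode not running.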