A step-by-step guide to get you running with Hadoop today! In Hadoop on Mac part 2 we walk through the creation and compilation of the Java Hadoop Wordcount example from beginning to end, and automate it with .pom files.
- Additional Resources
- Github Wordcount example.
Download Homebrew from the website at http://brew.sh/ or simply paste the install script into the terminal
$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
$ brew install hadoop
Hadoop will be installed under /usr/local/Cellar/hadoop
The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hadoop-env.sh
where 2.6.0 is the hadoop version.
Find the line that exports HADOOP_OPTS
and change it to
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
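The same edit can be scripted. This is a minimal sketch, assuming hadoop-env.sh contains a line that starts with `export HADOOP_OPTS=`; the helper name `set_hadoop_opts` is made up for this example. It writes through a temporary file instead of using `sed -i`, whose syntax differs between macOS and Linux.

```shell
# set_hadoop_opts is a hypothetical helper, not part of Hadoop.
# It replaces the HADOOP_OPTS line of the hadoop-env.sh file given as $1.
set_hadoop_opts() {
    env_file="$1"
    new_line='export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="'
    tmp_file="$(mktemp)" || return 1
    # Swap every line starting with "export HADOOP_OPTS=" for the new one
    # and copy all other lines through unchanged.
    awk -v repl="$new_line" '
        /^export HADOOP_OPTS=/ { print repl; next }
        { print }
    ' "$env_file" > "$tmp_file" && cat "$tmp_file" > "$env_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

Usage would look like `set_hadoop_opts /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hadoop-env.sh`. Keep a backup copy of the file before trying it.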
The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/core-site.xml.
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/mapred-site.xml and by default will be blank.
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9010</value>
  </property>
</configuration>
The file can be located at /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/hdfs-site.xml.
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
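After editing the three site files it is easy to lose track of what ended up where. The sketch below reads one property back out of a site file. It is a rough text scan and not a real XML parser: it assumes each `<value>` follows its `<name>` inside the same `<property>`, as in the files above, and the helper name `site_property` is invented for this example.

```shell
# site_property FILE NAME
# Prints the <value> that follows <name>NAME</name> in a Hadoop site file.
# Hypothetical helper: a plain text scan, not an XML parser.
site_property() {
    awk -v want="$2" '
        BEGIN { RS = "<" }                  # one record per opening "<"
        /^name>/ {
            sub(/^name>/, "")
            gsub(/[ \t\n]+$/, "")           # trim trailing whitespace
            found = ($0 == want)
        }
        /^value>/ {
            if (found) {
                sub(/^value>/, "")
                gsub(/[ \t\n]+$/, "")
                print
                exit
            }
        }
    ' "$1"
}
```

For example, `site_property /usr/local/Cellar/hadoop/2.6.0/libexec/etc/hadoop/core-site.xml fs.default.name` should print hdfs://localhost:9000 for the file shown above. With Hadoop on the PATH, `hdfs getconf -confKey fs.default.name` should report the value Hadoop actually picked up.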
To simplify life, edit your ~/.profile using vim or your favorite editor and add the following two aliases. By default ~/.profile might not exist.
alias hstart="/usr/local/Cellar/hadoop/2.6.0/sbin/start-dfs.sh;/usr/local/Cellar/hadoop/2.6.0/sbin/start-yarn.sh"
alias hstop="/usr/local/Cellar/hadoop/2.6.0/sbin/stop-yarn.sh;/usr/local/Cellar/hadoop/2.6.0/sbin/stop-dfs.sh"
Then run
$ source ~/.profile
in the terminal to load them.
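Because the version number is baked into both aliases, they break on the next Hadoop upgrade. One possible alternative, sketched here, is a pair of shell functions that read the install directory from a single variable. `HADOOP_INSTALL_DIR` is a name made up for this example, not something Hadoop or Homebrew defines. Use the functions instead of the aliases, not alongside them.

```shell
# HADOOP_INSTALL_DIR is an assumed variable for this sketch;
# point it at your own install directory after an upgrade.
HADOOP_INSTALL_DIR="${HADOOP_INSTALL_DIR:-/usr/local/Cellar/hadoop/2.6.0}"

# Start HDFS first, then YARN; skip YARN if HDFS failed to start.
hstart() {
    "$HADOOP_INSTALL_DIR/sbin/start-dfs.sh" && "$HADOOP_INSTALL_DIR/sbin/start-yarn.sh"
}

# Stop in reverse order: YARN first, then HDFS.
hstop() {
    "$HADOOP_INSTALL_DIR/sbin/stop-yarn.sh"
    "$HADOOP_INSTALL_DIR/sbin/stop-dfs.sh"
}
```

Unlike the `;` in the aliases, the `&&` in hstart stops early when start-dfs.sh reports a failure.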
Before we can run Hadoop we first need to format the HDFS using
$ hdfs namenode -format
Nothing needs to be done here if you have already generated ssh keys. To verify, just check for the existence of the ~/.ssh/id_rsa and ~/.ssh/id_rsa.pub files. If they are missing, the keys can be generated using
$ ssh-keygen -t rsa
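The check and the key generation can be combined so that existing keys are never overwritten by accident. A small sketch, with `have_ssh_keypair` as a made-up helper name:

```shell
# have_ssh_keypair [DIR]
# Succeeds only when both halves of the RSA key pair exist in DIR
# (default: ~/.ssh). Hypothetical helper for this example.
have_ssh_keypair() {
    ssh_dir="${1:-$HOME/.ssh}"
    [ -f "$ssh_dir/id_rsa" ] && [ -f "$ssh_dir/id_rsa.pub" ]
}
```

Typical use would be `have_ssh_keypair || ssh-keygen -t rsa`, which only runs ssh-keygen when a key file is missing.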
Enable Remote Login
“System Preferences” -> “Sharing”. Check “Remote Login”
Authorize SSH Keys
To allow your system to accept login, we have to make it aware of the keys that will be used
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
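Running the command above twice appends the same key twice. That is harmless, but it clutters the file. The following sketch only appends the key when it is not already listed; `authorize_key` is a hypothetical helper name, and it assumes the public key file holds a single line.

```shell
# authorize_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
# Appends the public key only if that exact line is not present yet.
authorize_key() {
    pub_key="$1"
    auth_file="$2"
    touch "$auth_file" || return 1
    # -q: no output, -x: match whole lines, -F: fixed string, not a regex.
    if ! grep -qxF -- "$(cat "$pub_key")" "$auth_file"; then
        cat "$pub_key" >> "$auth_file"
    fi
}
```

For the setup in this post the call would be `authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys`.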
Let’s try to login.
$ ssh localhost
Last login: Fri Mar 6 20:30:53 2015
$ exit
Now we can run Hadoop just by typing
$ hstart
and stop it using
$ hstop
To run examples, Hadoop needs to be started.
Test them out using:
$ hadoop jar <path to the hadoop-mapreduce-examples jar> pi 10 100
Good to know
We can access the Hadoop web interfaces by connecting to
NameNode (HDFS): http://localhost:50070
Resource Manager (jobs): http://localhost:8088
Specific Node Information: http://localhost:8042
Commands
$ jps
7379 DataNode
7459 SecondaryNameNode
7316 NameNode
7636 NodeManager
7562 ResourceManager
7676 Jps
$ yarn // Resource management; gives more information than the web interface.
$ mapred // Detailed information about jobs
The hdfs command gives us access to the HDFS filesystem, for example to fetch any resulting output files.
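The jps listing above shows the five daemons of a working single-node setup. A quick way to spot a missing one is to pipe jps into a small filter. This is only a sketch; `missing_daemons` is an invented name and the daemon list is taken from the listing in this post.

```shell
# missing_daemons
# Reads `jps` output on stdin and prints the name of every expected
# daemon that does not appear in it. Hypothetical helper.
missing_daemons() {
    jps_output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare against the whole second field so that NameNode
        # is not satisfied by SecondaryNameNode.
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' ||
            echo "$daemon"
    done
}
```

Run it as `jps | missing_daemons`. No output means all five daemons are up.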
Connection Refused after installing Hadoop
$ hdfs dfs -ls
15/03/06 20:13:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ls: Call From spaceship.local/192.168.1.65 to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
The start-up scripts such as start-all.sh do not tell you why a startup failed. Some of the time they won’t even notify you that a startup failed… To troubleshoot the service that isn’t functioning, start it manually in the foreground.
$ hdfs namenode
15/03/06 20:18:31 WARN namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/local/Cellar/hadoop/hdfs/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
15/03/06 20:18:31 FATAL namenode.NameNode: Failed to start namenode.
and the problem is… the NameNode storage directory does not exist, because HDFS was never formatted. Format it with
$ hdfs namenode -format
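Formatting wipes the NameNode metadata, so it is worth checking for the storage directory named in the error before reaching for the format command. A minimal sketch follows; `name_dir_ready` is a made-up helper, and its default path is the hadoop.tmp.dir value used in this post.

```shell
# name_dir_ready [TMP_DIR]
# Succeeds when TMP_DIR/dfs/name exists and is accessible.
# TMP_DIR should match hadoop.tmp.dir in core-site.xml.
name_dir_ready() {
    tmp_dir="${1:-/usr/local/Cellar/hadoop/hdfs/tmp}"
    [ -d "$tmp_dir/dfs/name" ] && [ -r "$tmp_dir/dfs/name" ] && [ -x "$tmp_dir/dfs/name" ]
}
```

With this in place, `name_dir_ready || hdfs namenode -format` only formats when the directory is missing or inaccessible.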
To verify the problem is fixed run
$ hstart
$ hdfs dfs -ls /
If ‘hdfs dfs -ls’ gives you an error
ls: `.': No such file or directory
then we need to create the default directory structure Hadoop expects (i.e. /user/<output of whoami>/)
$ whoami
spaceship
$ hdfs dfs -mkdir -p /user/spaceship
15/03/06 20:31:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
$ hdfs dfs -ls
15/03/06 20:31:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
$ hdfs dfs -put book.txt
15/03/06 20:32:29 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
$ hdfs dfs -ls
15/03/06 20:32:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r-- 1 marekbejda supergroup 29578 2015-03-06 20:32 book.txt
JPS and Nothing Works…
It seems that certain builds of Java 1.8 (e.g. 1.8.0_40) are missing a critical package, which breaks Yarn. Check your logs:
$ jps
5935 Jps
$ vim /usr/local/Cellar/hadoop/2.6.0/libexec/logs/yarn-*
2015-03-07 16:21:32,934 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.lang.NoClassDefFoundError: sun/management/ExtendedPlatformComponent
..
2015-03-07 16:21:32,937 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-03-07 16:21:32,939 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
Either downgrade to Java 1.7 or use a different 1.8 build; I’m currently running 1.8.0_20
$ java -version
java version "1.8.0_20"
Java(TM) SE Runtime Environment (build 1.8.0_20-b26)
Java HotSpot(TM) 64-Bit Server VM (build 25.20-b23, mixed mode)
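To compare builds in a script, the version string can be pulled out of the `java -version` output. The sketch below assumes the first line has the `java version "…"` shape shown above; `java_version_string` is a hypothetical helper name.

```shell
# java_version_string
# Reads `java -version` output on stdin and prints the quoted
# version string from the first line that contains one.
java_version_string() {
    sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}
```

Since java prints its version to standard error, the call needs a redirect: `java -version 2>&1 | java_version_string`.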
I’ve Done Everything!! SSH still asks for a password!!!! OMGG!!!!
I ran across this problem today: all of a sudden ssh localhost started requesting a password. I generated new keys and searched all day for an answer, and finally found the fix thanks to this Apple thread.
$ chmod go-w ~/
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys
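The likely reason these three commands help is that sshd, with its StrictModes check enabled, ignores keys when the home directory or ~/.ssh is writable by other users. The sketch below wraps the fix in a function and adds a way to look at the result. Both helper names are invented for this example.

```shell
# fix_ssh_perms [HOME_DIR]
# Applies the three permission fixes to HOME_DIR (default: $HOME).
fix_ssh_perms() {
    home_dir="${1:-$HOME}"
    chmod go-w "$home_dir" &&
        chmod 700 "$home_dir/.ssh" &&
        chmod 600 "$home_dir/.ssh/authorized_keys"
}

# perm_string PATH
# Prints the ten-character permission string of PATH, e.g. drwx------.
perm_string() {
    ls -ld "$1" | cut -c1-10
}
```

After `fix_ssh_perms`, `perm_string ~/.ssh` should show drwx------ and `perm_string ~/.ssh/authorized_keys` should show -rw-------.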