Step 1 of the setup is available here, and step 2 is available here. In those steps we looked at installing a Linux-based OS (Ubuntu) for Hadoop, since we opted for Linux instead of Windows, and we covered the reasons for that preference. We also installed and configured our chosen Java, Oracle Java.

This step is pretty straightforward. We create a user and a user group. All Hadoop cluster nodes will use the same user name and will be part of the same group. Let's call the group hadoop and the user hduser. Then we will generate an RSA SSH key pair for that user with an empty passphrase, so that Hadoop can connect over SSH without being prompted.

Open an Ubuntu Terminal window and run the commands below.

Create group
sudo addgroup hadoop

Create User and add it to the group
sudo adduser --ingroup hadoop hduser
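
Before moving on, it can help to confirm that the user and group were actually created. The following is a minimal sketch, assuming a standard Linux system where the `id` command is available; the helper names `user_exists` and `user_in_group` are illustrative and are not part of Ubuntu or Hadoop.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative) to verify the account setup.

# Succeeds if the named user account exists.
user_exists() {
    id -u "$1" >/dev/null 2>&1
}

# Succeeds if the user belongs to the group (primary or supplementary).
user_in_group() {
    # `id -Gn` prints all group names of the user, separated by spaces.
    id -Gn "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF "$2"
}

if user_exists hduser && user_in_group hduser hadoop; then
    echo "hduser exists and is in group hadoop"
else
    echo "hduser is missing or not in group hadoop"
fi
```

If the second message appears, re-check the addgroup and adduser commands above before generating the SSH key.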

Log in as hduser and generate an SSH key
su - hduser

ssh-keygen -t rsa -P ""

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
7b:62:<<more hex codes>> hduser@ubuntu
The key's randomart image is:
<<some image>>

Add the generated public key to the authorized keys
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
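
Running the append command more than once adds the same key again, and sshd in its default configuration may ignore an authorized_keys file whose permissions are too open. As a sketch, assuming a POSIX shell, the hypothetical helper `authorize_key` below (the name is an assumption, not an OpenSSH tool) appends a public key only if it is not already listed and then restricts the file to its owner.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): add a public key to an
# authorized_keys file without creating duplicate entries.
authorize_key() {
    pub_file=$1
    auth_file=$2
    ssh_dir=$(dirname "$auth_file")

    # Create the .ssh directory if needed, accessible to the owner only.
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    touch "$auth_file" || return 1

    # Append the key only when an identical line is not already present.
    if ! grep -qxF -f "$pub_file" "$auth_file"; then
        cat "$pub_file" >> "$auth_file"
    fi

    # Readable and writable by the owner only.
    chmod 600 "$auth_file"
}

# Usage for the setup in this step (run as hduser):
# authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```

Because the helper checks for an existing identical line first, it is safe to run repeatedly, for example when the same setup script is reused on every node.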

Test SSH
ssh localhost

The authenticity of host 'localhost (::1)' can't be established.
RSA key fingerprint is c7:47:55:<<more hex code>>.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Linux ubuntu 2.6.32-22-generic #33-Ubuntu SMP Wed Apr 28 13:27:30 UTC 2010 i686 GNU/Linux
Ubuntu 10.04 LTS
<<info>>
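
Once the host key has been accepted as above, a non-interactive check can confirm that key-based login works without any prompt, which is what this step is meant to enable for Hadoop. This is a minimal sketch, assuming the OpenSSH `ssh` client; the function name `check_passwordless_ssh` is illustrative.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): succeeds only if SSH login to
# the given host works without any password or passphrase prompt.
check_passwordless_ssh() {
    # BatchMode=yes makes ssh fail instead of prompting for input.
    # ConnectTimeout=5 gives up after 5 seconds if the host is unreachable.
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Usage (run as hduser):
# if check_passwordless_ssh localhost; then
#     echo "passwordless SSH to localhost works"
# else
#     echo "passwordless SSH to localhost failed"
# fi
```

With BatchMode enabled, ssh also fails rather than asking to confirm an unknown host key, so run the interactive `ssh localhost` test once before relying on this check.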

This completes step 3. You can also learn about the usage of Hadoop and about Hadoop architecture on BabaGyan.com.

