Hadoop Cluster on UBUNTU 12.04 LTS

Configuring a Hadoop Single-Node Cluster on Ubuntu 12.04 64-bit

Step:1

Download, Install and Configure Java 1.6

Step:2

Download and Configure Hadoop 0.20.2

Step:1

I. On Ubuntu, installing JDK 1.6 is straightforward. Follow the steps below.

i. Open a terminal and type

$ sudo apt-get install openjdk-6-jdk (enter)

ii. Setting Environment Variables in Terminal

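$ export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64

(This is the usual path; check it on your system. There must be no space after the = sign.)

To make the setting permanent, open your own ~/.bashrc (sudo is not needed for a file in your home directory):

$ vim ~/.bashrc (enter)

In the editor, press i and add this line at the end of the file:

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64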
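
Because the install path can differ between systems, it may help to derive it instead of typing it by hand. The following is a minimal sketch, assuming javac is on the PATH after the install; the helper name java_home_from_javac is only illustrative and not part of any tool.

```shell
# Print the JDK home for a given javac path by stripping the trailing /bin/javac.
# java_home_from_javac is a hypothetical helper name used for illustration.
java_home_from_javac() {
    dirname "$(dirname "$1")"
}

# Resolve symlinks first (on Ubuntu, /usr/bin/javac usually points
# through /etc/alternatives to the real JDK directory).
if command -v javac >/dev/null 2>&1; then
    JAVA_HOME="$(java_home_from_javac "$(readlink -f "$(command -v javac)")")"
    export JAVA_HOME
    echo "JAVA_HOME=$JAVA_HOME"
else
    echo "javac not found; install the JDK first" >&2
fi
```

If the printed path differs from the one used in this tutorial, use the printed one wherever JAVA_HOME is set.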

Step:2

I. Create a system user account for the Hadoop installation.

Type the following in a terminal:

$ sudo addgroup hgroup (enter)

$ sudo adduser --ingroup hgroup huser (enter)

Adding user `huser' ...

Adding new user `huser' (1002) with group `hgroup' ...

Creating home directory `/home/huser' ...

Copying files from `/etc/skel' ...

Enter new UNIX password:

Retype new UNIX password:

passwd: password updated successfully

Changing the user information for huser

Enter the new value, or press ENTER for the default

Full Name []:

Room Number []:

Work Phone []:

Home Phone []:

Other []:

Is the information correct? [Y/n] Y (enter)

II. Configuring SSH Key Based Login

The Hadoop user must be able to SSH to its own machine without a password. The following steps enable key-based login for the Hadoop user.

$ su - huser

The prompt changes to "huser@localhost:~$".

$ ssh-keygen -t rsa -P ""

$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

$ ssh localhost

(On success you will see a welcome message.)

(On failure you will see:

ssh: connect to host localhost port 22: Connection refused)

Only if the login fails, install the SSH server from an account with sudo rights:

$ sudo apt-get install openssh-server

Then try again:

$ ssh localhost
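
Running the cat command above more than once adds the same key repeatedly, and overly open permissions on the key files can make sshd reject the key under its default StrictModes setting. As a rough sketch, the append can be made repeatable like this; authorize_key is a hypothetical helper name, not an SSH tool.

```shell
# Append a public key to an authorized_keys file only if it is not already
# present, then restrict permissions on the directory and the file.
# authorize_key is a hypothetical helper name used for illustration.
authorize_key() {
    pub_file="$1"
    auth_file="$2"
    ssh_dir="$(dirname "$auth_file")"
    mkdir -p "$ssh_dir"
    touch "$auth_file"
    # -q quiet, -x match the whole line, -F treat the key as a fixed string
    grep -qxF -e "$(cat "$pub_file")" "$auth_file" || cat "$pub_file" >> "$auth_file"
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
}

# Usage as huser, after ssh-keygen has created the key pair:
# authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```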

III. Disabling IPV6

i. Edit /etc/sysctl.conf as super user

$ sudo vim /etc/sysctl.conf

(Press i and paste the lines below at the end of the file.)

# disable ipv6

net.ipv6.conf.all.disable_ipv6 = 1

net.ipv6.conf.default.disable_ipv6 = 1

net.ipv6.conf.lo.disable_ipv6 = 1

ii. Reboot the machine so the change takes effect, then check whether IPv6 is disabled:

$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6

A return value of 0 means IPv6 is enabled; a value of 1 means it is disabled (that’s what we want).
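
The 0/1 check can be wrapped in a small function that prints a readable answer. This is only a sketch; ipv6_status is a hypothetical name, and the optional argument exists just so the function can be tried against a test file.

```shell
# Report the IPv6 state from the kernel flag file.
# ipv6_status is a hypothetical helper; with no argument it reads the
# same file as the cat command in the step above.
ipv6_status() {
    flag_file="${1:-/proc/sys/net/ipv6/conf/all/disable_ipv6}"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
    elif [ "$(cat "$flag_file")" = "1" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

ipv6_status
```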

IV. Download the stable version 0.20.2 of Hadoop

i. Download the archive from the link below

http://www.mediafire.com/download/dtv7k2bhvkfoq64/hadoop-0.20.2.tar.gz

ii. Extract Hadoop tar file (as super user)

$ sudo cp /home/huser/Downloads/hadoop-0.20.2.tar.gz /usr/local

$ cd /usr/local/

$ sudo tar -xzf hadoop-0.20.2.tar.gz

$ sudo mv hadoop-0.20.2 hadoop

$ sudo chown -R huser:hgroup hadoop

$ su - huser

$ cd /usr/local/hadoop/ (now we are in the Hadoop home directory)
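
Since the archive is extracted with sudo, it can be worth listing its contents before unpacking. A small sketch, assuming the tarball unpacks into a single hadoop-0.20.2 directory as the mv command above expects; tarball_top_dirs is a hypothetical helper name.

```shell
# Print the top-level directory names inside a gzip-compressed tarball
# without extracting it. tarball_top_dirs is a hypothetical helper name.
tarball_top_dirs() {
    tar -tzf "$1" | cut -d/ -f1 | sort -u
}

# Example (path taken from the step above):
# tarball_top_dirs /usr/local/hadoop-0.20.2.tar.gz
```

If the function prints anything other than the one expected directory name, inspect the archive before extracting it.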

iii. Configure Hadoop

Make the following additions to the corresponding files in hadoop/conf.

Run as huser:

$ vim conf/core-site.xml

* core-site.xml (inside the configuration tags)

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:8020</value>

</property>

$ vim conf/mapred-site.xml

* mapred-site.xml (inside the configuration tags)

<property>

<name>mapred.job.tracker</name>

<value>localhost:8021</value>

</property>

$ vim conf/hdfs-site.xml

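* hdfs-site.xml (inside the configuration tags)

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

(A replication factor of 1 is the usual choice for a single-node cluster, since there is only one DataNode.)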
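
The snippets above are fragments that go inside the configuration tags. To show where they sit in a whole file, here is a rough sketch that writes a complete one-property site file. write_site_file is a hypothetical helper, and the dfs.replication value of 1 in the usage comment is an assumption for a single-node setup.

```shell
# Write a minimal Hadoop site file containing a single property.
# write_site_file is a hypothetical helper name.
# Arguments: target file, property name, property value.
write_site_file() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Usage from /usr/local/hadoop as huser (this overwrites the existing files):
# write_site_file conf/core-site.xml   fs.default.name    hdfs://localhost:8020
# write_site_file conf/mapred-site.xml mapred.job.tracker localhost:8021
# write_site_file conf/hdfs-site.xml   dfs.replication    1   # assumed value
```

Editing the files by hand as described above gives the same result; the function only makes the full file layout explicit.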

$ vim conf/hadoop-env.sh

* hadoop-env.sh

Uncomment the JAVA_HOME export line and set it to your Java home:

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64

Adjust the path to match the Java installation on your system.
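
A quick way to confirm that the line really is uncommented and points at a working JDK is sketched below. configured_java_home is a hypothetical helper name, and the file path assumes the /usr/local/hadoop layout used in this tutorial.

```shell
# Print the JAVA_HOME value from the first uncommented export line
# in a hadoop-env.sh style file. configured_java_home is a hypothetical name.
configured_java_home() {
    sed -n 's/^[[:space:]]*export[[:space:]]\{1,\}JAVA_HOME=//p' "$1" | head -n 1
}

env_file=/usr/local/hadoop/conf/hadoop-env.sh
if [ -r "$env_file" ]; then
    jh="$(configured_java_home "$env_file")"
    if [ -x "$jh/bin/java" ]; then
        echo "JAVA_HOME looks valid: $jh"
    else
        echo "no java binary under '$jh'" >&2
    fi
fi
```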

iv. Format Name Node

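(run as huser only)

$ cd /usr/local/hadoop

$ bin/hadoop namenode -format

You will get output like the following (a sample only; the host name, version, and paths will differ on your system):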

29/06/13 07:24:20 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = srv1.tecadmin.net/192.168.1.90
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.2.0
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1479473; compiled by 'hortonfo' on Mon May 6 06:59:37 UTC 2013
STARTUP_MSG: java = 1.7.0_17
************************************************************/
29/06/13 07:24:20 INFO util.GSet: Computing capacity for map BlocksMap
29/06/13 07:24:20 INFO util.GSet: VM type = 32-bit
29/06/13 07:24:20 INFO util.GSet: 2.0% max memory = 1013645312
13/06/02 22:53:48 INFO util.GSet: capacity = 2^22 = 4194304 entries
29/06/13 07:24:20 INFO util.GSet: recommended=4194304, actual=4194304
29/06/13 07:24:20 INFO namenode.FSNamesystem: fsOwner=hadoop
13/06/02 22:53:49 INFO namenode.FSNamesystem: supergroup=supergroup
29/06/13 07:24:20 INFO namenode.FSNamesystem: isPermissionEnabled=true
29/06/13 07:24:20 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
29/06/13 07:24:20 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
29/06/13 07:24:20 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
29/06/13 07:24:20 INFO namenode.NameNode: Caching file names occuring more than 10 times
29/06/13 07:24:20 INFO common.Storage: Image file of size 112 saved in 0 seconds.
29/06/13 07:24:20 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/opt/hadoop/hadoop/dfs/name/current/edits
29/06/13 07:24:20 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/opt/hadoop/hadoop/dfs/name/current/edits
29/06/13 07:24:20 INFO common.Storage: Storage directory /opt/hadoop/hadoop/dfs/name has been successfully formatted.
29/06/13 07:24:20 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at srv1.tecadmin.net/192.168.1.90
************************************************************/

v. Start Hadoop Services

$ bin/start-all.sh

The output will look like this:

huser@ubuntu:/usr/local/hadoop$ bin/start-all.sh

starting namenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-namenode-ubuntu.out

localhost: starting datanode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-datanode-ubuntu.out

localhost: starting secondarynamenode, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-secondarynamenode-ubuntu.out

starting jobtracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-jobtracker-ubuntu.out

localhost: starting tasktracker, logging to /usr/local/hadoop/bin/../logs/hadoop-hduser-tasktracker-ubuntu.out

huser@ubuntu:/usr/local/hadoop$
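
The start script only reports where each daemon logs, not whether it stayed up. One way to look for early failures is sketched below, assuming the daemons write their detailed logs as *.log files in the same logs directory as the .out files listed above; logs_with_errors is a hypothetical helper name.

```shell
# List the log files in a directory that contain ERROR or FATAL lines.
# logs_with_errors is a hypothetical helper name.
logs_with_errors() {
    grep -lE 'ERROR|FATAL' "$1"/*.log 2>/dev/null
}

logs_with_errors /usr/local/hadoop/logs \
    || echo "no ERROR/FATAL lines found (or no logs yet)"
```

Any file name printed is worth opening to read the full error message.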

vi. Test and Access Hadoop Services

$ jps (you will see output like this)

26049 SecondaryNameNode

25929 DataNode

26399 Jps

26129 JobTracker

26249 TaskTracker

25807 NameNode
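
Comparing the jps list by eye is easy to get wrong. The sketch below prints the name of each expected daemon that is absent; jps reports the Java class names (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker), and missing_daemons is a hypothetical helper name.

```shell
# Read jps output on stdin and print every expected daemon that is missing.
# missing_daemons is a hypothetical helper name.
missing_daemons() {
    jps_output="$(cat)"
    for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
        # Compare against the whole second field so that "NameNode"
        # does not also match "SecondaryNameNode".
        echo "$jps_output" \
            | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$daemon"
    done
}

if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

No output means all five daemons are running; any printed name points to the daemon whose log should be checked.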

Web Access URLs for Services

http://localhost:50030/ for the JobTracker

http://localhost:50070/ for the NameNode

http://localhost:50060/ for the TaskTracker

Setting the Hadoop and Java home paths in the .bashrc file

Run as huser:

$ vim $HOME/.bashrc

Press i and add the following lines at the end of the file:

# Set Hadoop-related environment variables

export HADOOP_HOME=/usr/local/hadoop

# Set JAVA_HOME (JAVA_HOME is also configured directly for Hadoop in conf/hadoop-env.sh)

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64

# Some convenient aliases and functions for running Hadoop-related commands

unalias fs &> /dev/null

alias fs="hadoop fs"

unalias hls &> /dev/null

alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and

# compress job outputs with LZOP (not covered in this tutorial):

# Conveniently inspect an LZOP compressed file from the command

# line; run via:

# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo

# Requires installed 'lzop' command.

lzohead () {

hadoop fs -cat "$1" | lzop -dc | head -1000 | less

}

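# Add the Hadoop bin/ directory to PATH so the hadoop command used above is found

export PATH=$PATH:$HADOOP_HOME/bin

That's it. After opening a new terminal (or running source ~/.bashrc), you can run Hadoop commands from any directory.

Congratulations, you have successfully configured Hadoop on Ubuntu.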

MY TECH LEARNERS

Monday, 26 August 2013
