Monday, 26 August 2013

Hadoop Single-Node Cluster on CentOS

Configuring a Hadoop Single-Node Cluster on CentOS 6.3 (64-bit)

Step 1:
      Download, install, and configure Java 1.6

Step 2:
      Download and configure Hadoop 0.20.2

Step 3:
      Download and configure Eclipse

Step 1:

             I.     In CentOS, installing JDK 1.6 is easy; follow the steps below:

 i.     Go to (at the top panel) System --> Administration --> Add/Remove Software (if it asks for authentication, click through), type "JDK" in the search box, and click Find.
ii.     Search for OpenJDK Development Environment (version 1.6), select it, and click Apply.
iii.     Once the installation is done, open a terminal and type      java -version      (if you get a version number, Java was installed successfully).

           II.     Setting Environment Variables in the Terminal

In a terminal, type --->    $ vi ~/.bash_profile   (Enter)
A visual editor opens --> press   i   and add the following line (note: no space after the =):

export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64

Save with Esc then :wq, and reload the profile with    $ source ~/.bash_profile
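For reference, a minimal sketch of the lines worth keeping in ~/.bash_profile (the JAVA_HOME path is the OpenJDK 1.6 default used in this guide; adjust it to match your machine):

```shell
# Append these to ~/.bash_profile, then run: source ~/.bash_profile
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
export PATH=$PATH:$JAVA_HOME/bin
# Confirm the variable is set in the current shell:
echo "JAVA_HOME is $JAVA_HOME"
```

Putting $JAVA_HOME/bin on PATH is optional but lets you call java by name even when the RPM did not install a symlink into /usr/bin.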



Step 2:

             I.     Create a system user account to use for the Hadoop installation.


# useradd huser
# passwd huser

Changing password for user huser.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.


           II.     Configuring Key-Based Login

Hadoop requires the huser account to be able to ssh to itself without a password. The following commands enable key-based login for that user.


# su - huser
$ ssh-keygen -t rsa
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
$ exit
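The chmod 0600 step matters because sshd refuses an authorized_keys file that other users can read or write. A quick stand-alone illustration of that permission mode, using a scratch file rather than the real key file:

```shell
# Create a scratch file and give it the same 0600 mode used above
f=$(mktemp)
chmod 0600 "$f"
stat -c '%a' "$f"    # prints the octal mode: 600
rm -f "$f"
```

After the key setup, a good sanity check is `ssh localhost` as huser; it should log you in without prompting for a password.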


         III.     Download the stable version 0.20.2 of Hadoop

 i.     Download the tarball from the link below:

http://www.mediafire.com/download/dtv7k2bhvkfoq64/hadoop-0.20.2.tar.gz





ii.     Extract the Hadoop tar file

# mkdir /opt/huser
# cp /home/huser/Downloads/hadoop-0.20.2.tar.gz /opt/huser/
# cd /opt/huser/
# tar -xzf hadoop-0.20.2.tar.gz
# mv hadoop-0.20.2 hadoop
# chown -R huser /opt/huser
# cd /opt/huser/hadoop/
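The extract-and-rename pattern above (unpack the versioned tarball, then rename the directory to a stable name) can be tried on a scratch tarball first; everything below is a stand-in, not the real Hadoop archive:

```shell
# Build a dummy hadoop-0.20.2.tar.gz in a temp dir (stand-in only)
tmp=$(mktemp -d)
cd "$tmp"
mkdir hadoop-0.20.2
echo "demo" > hadoop-0.20.2/README
tar -czf hadoop-0.20.2.tar.gz hadoop-0.20.2
rm -rf hadoop-0.20.2
# The same two steps used in the guide:
tar -xzf hadoop-0.20.2.tar.gz
mv hadoop-0.20.2 hadoop
cat hadoop/README    # prints: demo
```

Renaming to a plain `hadoop` directory keeps configs and PATH entries stable across future version upgrades.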

iii.     Configure Hadoop

Make the following additions to the corresponding files in /opt/huser/hadoop/conf:
         * core-site.xml (inside the configuration tags)
                 <property>
                          <name>fs.default.name</name>
                          <value>hdfs://localhost:8020</value>
                 </property>
         * mapred-site.xml (inside the configuration tags)
                 <property>
                          <name>mapred.job.tracker</name>
                          <value>localhost:8021</value>
                 </property>
         * hdfs-site.xml (inside the configuration tags)
                 <property>
                          <name>dfs.replication</name>
                          <value>1</value>
                 </property>
         * hadoop-env.sh
                 uncomment the JAVA_HOME export line and set it to your Java home (no space after the =):

export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64

Set the JAVA_HOME path according to your system's Java installation.
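The three XML edits can also be scripted. A sketch that writes complete files with here-documents (it writes to a scratch directory here; on a real install the target would be /opt/huser/hadoop/conf, and you would merge with any existing properties rather than overwrite):

```shell
CONF=$(mktemp -d)   # stand-in for /opt/huser/hadoop/conf

cat > "$CONF/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
EOF

cat > "$CONF/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:8021</value>
  </property>
</configuration>
EOF

cat > "$CONF/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF

# Quick check: list the configured values
grep -h '<value>' "$CONF"/*.xml
```

The `dfs.replication` value of 1 is what makes this a single-node setup: with only one DataNode there is nowhere to place a second block replica.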

iv.     Format the NameNode

# su - huser
$ cd /opt/huser/hadoop
$ bin/hadoop namenode -format

You will see output similar to the sample below (captured on a different host and a newer Hadoop build, so the version, hostname, paths, and timestamps will differ on your machine):

29/06/13 07:24:20 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = srv1.tecadmin.net/192.168.1.90
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.2.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1479473; compiled by 'hortonfo' on Mon May 6 06:59:37 UTC 2013
STARTUP_MSG:   java = 1.7.0_17
************************************************************/
29/06/13 07:24:20 INFO util.GSet: Computing capacity for map BlocksMap
29/06/13 07:24:20 INFO util.GSet: VM type = 32-bit
29/06/13 07:24:20 INFO util.GSet: 2.0% max memory = 1013645312
29/06/13 07:24:20 INFO util.GSet: capacity = 2^22 = 4194304 entries
29/06/13 07:24:20 INFO util.GSet: recommended=4194304, actual=4194304
29/06/13 07:24:20 INFO namenode.FSNamesystem: fsOwner=hadoop
29/06/13 07:24:20 INFO namenode.FSNamesystem: supergroup=supergroup
29/06/13 07:24:20 INFO namenode.FSNamesystem: isPermissionEnabled=true
29/06/13 07:24:20 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
29/06/13 07:24:20 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
29/06/13 07:24:20 INFO namenode.FSEditLog: dfs.namenode.edits.toleration.length = 0
29/06/13 07:24:20 INFO namenode.NameNode: Caching file names occuring more than 10 times
29/06/13 07:24:20 INFO common.Storage: Image file of size 112 saved in 0 seconds.
29/06/13 07:24:20 INFO namenode.FSEditLog: closing edit log: position=4, editlog=/opt/hadoop/hadoop/dfs/name/current/edits
29/06/13 07:24:20 INFO namenode.FSEditLog: close success: truncate to 4, editlog=/opt/hadoop/hadoop/dfs/name/current/edits
29/06/13 07:24:20 INFO common.Storage: Storage directory /opt/hadoop/hadoop/dfs/name has been successfully formatted.
29/06/13 07:24:20 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at srv1.tecadmin.net/192.168.1.90
************************************************************/

 v.     Start Hadoop Services

$ bin/start-all.sh


vi.     Test and Access Hadoop Services

$ jps

You should see output similar to this:

26049 SecondaryNameNode
25929 DataNode
26399 Jps
26129 JobTracker
26249 TaskTracker
25807 NameNode

Web Access URLs for Services
http://localhost:50030/   for the JobTracker
http://localhost:50070/   for the NameNode
http://localhost:50060/   for the TaskTracker

Tip: add the Hadoop bin directory to your PATH:
export PATH=$PATH:/opt/huser/hadoop/bin
That's it; you can now run Hadoop commands from any directory in the terminal.
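The PATH tip works because the shell searches each listed directory for an executable matching the command name. A self-contained demonstration of the mechanism with a stand-in script (hadoop-demo is hypothetical, used only so the example does not need a running cluster):

```shell
# Put a tiny executable in a temp directory (stand-in for hadoop's bin/)
bindir=$(mktemp -d)
printf '#!/bin/sh\necho "wrapper reached"\n' > "$bindir/hadoop-demo"
chmod +x "$bindir/hadoop-demo"
# Append the directory to PATH, as in the tip above
export PATH=$PATH:$bindir
# Now callable by bare name from any working directory:
hadoop-demo    # prints: wrapper reached
```

With the real /opt/huser/hadoop/bin on PATH, a quick smoke test once the daemons are up is `hadoop fs -ls /`, which should list the (initially empty) HDFS root.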


Congratulations, you have successfully configured Hadoop on CentOS!

