Configuring a High Availability Hadoop Cluster (QJM / Quorum Journal Method)
Steps:
Step 1: Download and configure ZooKeeper
Step 2: Hadoop configuration and high availability settings
Step 3: Creating folders for the Hadoop cluster and setting file permissions
Step 4: HDFS service and file system format
Steps in Detail:
Step 1: Download and configure ZooKeeper
1.1 Download the ZooKeeper software package from https://www.apache.org/dist/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz
[cluster@n3:~]$wget https://www.apache.org/dist/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz
Extract the archive:
[cluster@n3:~]$tar -zxvf zookeeper-3.4.5.tar.gz
1.2 ZooKeeper-related files are located as follows:
Configuration files: /home/cluster/zookeeper-3.4.5/conf
Binary executables: /home/cluster/zookeeper-3.4.5/bin
The main configuration file is:
/home/cluster/zookeeper-3.4.5/conf/zoo.cfg
Create it from the sample file (run from the conf directory):
[cluster@n3:~]$cp zoo_sample.cfg zoo.cfg
Modify zoo.cfg as follows:
[cluster@n3:~]$nano /home/cluster/zookeeper-3.4.5/conf/zoo.cfg
tickTime=2000
clientPort=3000
initLimit=5
syncLimit=2
dataDir=/home/cluster/zookeeper/data/
dataLogDir=/home/cluster/zookeeper/log/
server.1=n3:2888:3888
server.2=n4:2889:3889
Save & Exit!
Note: The two server entries use different port pairs (n3:2888:3888 and n4:2889:3889) because, if the ZooKeeper instances are hosted on the same physical machine, each server must listen on its own ports. On separate machines, both entries could use the same port pair (2888:3888).
1.3 Create the folder structure for ZooKeeper data and logs as defined in zoo.cfg; repeat the following step on all nodes in the cluster (n3 & n4):
[cluster@n3:~]$mkdir -p /home/cluster/zookeeper/data/
[cluster@n3:~]$mkdir -p /home/cluster/zookeeper/log/
1.4 Create the myid file in /home/cluster/zookeeper/data/ and assign each node its server ID (n3=1, n4=2).
[cluster@n3:~]$nano /home/cluster/zookeeper/data/myid
1
Save and Exit!
[cluster@n4~]$nano /home/cluster/zookeeper/data/myid
2
Save & Exit!
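The myid files can also be written non-interactively, one command per node; a minimal sketch:
[cluster@n3:~]$echo 1 > /home/cluster/zookeeper/data/myid   # n3 gets server ID 1
[cluster@n4~]$echo 2 > /home/cluster/zookeeper/data/myid   # n4 gets server ID 2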
Step 2: Hadoop configuration and high availability settings
Download a Hadoop 2.x release; this guide uses hadoop-2.2.0-cdh5.0.0-beta-2, extracted to /home/cluster/.
2.1 Add or modify the following lines in the hadoop-env.sh file to set the required environment variables.
[cluster@n3:~]$ nano /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45/
export HADOOP_COMMON_LIB_NATIVE_DIR=/home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2/lib/native/
export HADOOP_OPTS="-Djava.library.path=/home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2/lib/native/"
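Before moving on, it is worth confirming that the JAVA_HOME path actually exists on each node; a quick check, assuming the JDK path above:
[cluster@n3:~]$/usr/lib/jvm/jdk1.7.0_45/bin/java -version   # should print the 1.7.0_45 version string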
2.2 Add the following properties to the core-site.xml file, inside the <configuration> element, to configure the temp directory, default FS, and JournalNode edits directory for the HDFS cluster:
[cluster@n3:~]$nano /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2/etc/hadoop/core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/hdfs/dfs/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hdfs/dfs/journal/data</value>
</property>
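For reference, a minimal sketch of the finished file layout (all three properties above sit inside the single <configuration> element of core-site.xml):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- hadoop.tmp.dir, fs.defaultFS and dfs.journalnode.edits.dir properties from above go here -->
</configuration>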
2.3 Add the following properties to the hdfs-site.xml file, inside the <configuration> element, to configure the DFS nameservice, high availability, ZooKeeper quorum, and failover:
[cluster@n3:~]$nano /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.name.dir</name>
<value>/hdfs/dfs/nn</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/hdfs/dfs/dn</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
<final>true</final>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>n3,n4</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.n3</name>
<value>n3:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.n3</name>
<value>n3:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.mycluster.n3</name>
<value>n3:50090</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.n4</name>
<value>n4:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.n4</name>
<value>n4:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address.mycluster.n4</name>
<value>n4:50090</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://n3:8485;n4:8485/mycluster</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>n3:3000,n4:3000</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/cluster/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
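Before continuing, two quick sanity checks are possible; a sketch, run from the Hadoop home path (getconf ships with Hadoop 2.x, and the sshfence method configured above requires passwordless SSH between the NameNodes using the key listed in dfs.ha.fencing.ssh.private-key-files):
[cluster@n3:~]$bin/hdfs getconf -confKey dfs.nameservices   # should print: mycluster
[cluster@n3:~]$bin/hdfs getconf -confKey dfs.ha.namenodes.mycluster   # should print: n3,n4
[cluster@n3:~]$ssh -i /home/cluster/.ssh/id_rsa cluster@n4 hostname   # must succeed without a password prompt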
2.4 Add the datanodes to the slaves configuration file as shown below.
[cluster@n3:~]$nano /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2/etc/hadoop/slaves
n3
n4
Save & Exit!
Step 3: Creating folders for the Hadoop cluster and setting file permissions
3.1 Create the folder structure for the JournalNode as defined in core-site.xml; repeat the following step on all cluster nodes (n3 & n4):
[cluster@n3:~]$mkdir -p /hdfs/dfs/journal/data
3.2 Create the temp folder for the Hadoop cluster as defined in core-site.xml; repeat the following step on all cluster nodes (n3 & n4):
[cluster@n3:~]$mkdir -p /hdfs/dfs/tmp
3.3 Create the datanode and namenode folders for the Hadoop cluster as defined in hdfs-site.xml; repeat the following step on all cluster nodes (n3 & n4):
[cluster@n3:~]$mkdir -p /hdfs/dfs/dn
[cluster@n3:~]$mkdir -p /hdfs/dfs/nn
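The step title mentions file permissions, but no explicit commands are listed; a minimal sketch, assuming the HDFS daemons run as the cluster user, is:
[cluster@n3:~]$sudo chown -R cluster:cluster /hdfs   # give the hadoop user ownership of the data tree
[cluster@n3:~]$chmod -R 755 /hdfs   # daemons need read/write/execute on these folders
Repeat on n4.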
3.4 Copy the Hadoop and ZooKeeper directories configured on n3 to n4.
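One way to do this is with scp; a sketch, assuming passwordless SSH from n3 to n4 is already configured:
[cluster@n3:~]$scp -r /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2 cluster@n4:/home/cluster/
[cluster@n3:~]$scp -r /home/cluster/zookeeper-3.4.5 cluster@n4:/home/cluster/
Note that /home/cluster/zookeeper/data/ is per-node state (each node keeps its own myid), so only the software directories are copied here.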
Step 4: HDFS service and file system format
4.1 Start the ZooKeeper service once on every node running ZooKeeper (n3 & n4).
Go to the zookeeper-3.4.5 binary path, i.e. /home/cluster/zookeeper-3.4.5/bin, then execute the command below on each node:
[cluster@n3:~]$./zkServer.sh start
[cluster@n4~]$./zkServer.sh start
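To confirm the ensemble formed correctly, zkServer.sh can report each node's role (one node should report leader, the other follower):
[cluster@n3:~]$./zkServer.sh status
[cluster@n4~]$./zkServer.sh status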
4.2 Format the ZooKeeper HA state from n3.
Go to the Hadoop home path, i.e. /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2, then execute the command below:
[cluster@n3:~]$bin/hdfs zkfc -formatZK
Before the NameNode format in step 4.3, start the journalnode daemon on all cluster nodes (n3 & n4):
[cluster@n3:~]$sbin/hadoop-daemon.sh start journalnode
4.3 Format the NameNode on n3.
Go to the Hadoop home path, i.e. /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2, then execute the command below:
[cluster@n3:~]$bin/hdfs namenode -format
4.4 Copy the metadata to the standby NameNode (n4 in this guide).
Make sure the NameNode service is running on the master node (n3) first.
Go to the Hadoop home path, i.e. /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2, then start the NameNode on n3:
[cluster@n3:~]$sbin/hadoop-daemon.sh start namenode
Then on n4, from the same Hadoop home path, bootstrap the standby:
[cluster@n4~]$bin/hdfs namenode -bootstrapStandby
Start the Hadoop services. Stop everything first, then start HDFS again:
[cluster@n3:~]$cd /home/cluster/hadoop-2.2.0-cdh5.0.0-beta-2/sbin
./stop-all.sh
./start-dfs.sh
Run jps to check the services running on n3 & n4:
[cluster@n3 sbin]$ jps
17588 DFSZKFailoverController
16966 DataNode
24142 Jps
4268 QuorumPeerMain
16745 NameNode
17276 JournalNode
[cluster@n4 bin]$ jps
14357 DFSZKFailoverController
2369 QuorumPeerMain
13906 DataNode
23689 Jps
15458 NameNode
14112 JournalNode
Verifying Automatic Failover
After the initial deployment of a cluster with automatic failover enabled, you should test its operation. To do so, first locate the active NameNode. As mentioned above, you can tell which node is active by visiting the NameNode web interfaces.
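Alternatively, the HA admin CLI reports each NameNode's state directly; a sketch run from the Hadoop home path, using the NameNode IDs configured in hdfs-site.xml:
[cluster@n3:~]$bin/hdfs haadmin -getServiceState n3   # prints active or standby
[cluster@n3:~]$bin/hdfs haadmin -getServiceState n4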
Once you have located your active NameNode, you can cause a failure on that node. For example, you can use kill -9 <pid of NN> to simulate a JVM crash. Or you can power-cycle the machine or its network interface to simulate different kinds of outages. After you trigger the outage you want to test, the other NameNode should automatically become active within several seconds. The amount of time required to detect a failure and trigger a failover depends on the configuration of ha.zookeeper.session-timeout.ms, but defaults to 5 seconds.
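As a concrete sketch of the kill test, assuming jps is on the PATH of the node running the active NameNode:
[cluster@n3:~]$kill -9 $(jps | awk '$2=="NameNode"{print $1}')   # simulate a NameNode JVM crash
Then re-run haadmin -getServiceState to confirm the other NameNode has become active.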
If the test does not succeed, you may have a misconfiguration. Check the logs for the zkfc daemons as well as the NameNode daemons in order to further diagnose the issue.
Thank You.