Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Arrow up icon
GO TO TOP
Hadoop Real-World Solutions Cookbook- Second Edition

You're reading from   Hadoop Real-World Solutions Cookbook- Second Edition Over 90 hands-on recipes to help you learn and master the intricacies of Apache Hadoop 2.X, YARN, Hive, Pig, Oozie, Flume, Sqoop, Apache Spark, and Mahout

Arrow left icon
Product type Paperback
Published in Mar 2016
Publisher
ISBN-13 9781784395506
Length 290 pages
Edition 2nd Edition
Tools
Arrow right icon
Author (1):
Arrow left icon
Tanmay Deshpande Tanmay Deshpande
Author Profile Icon Tanmay Deshpande
Tanmay Deshpande
Arrow right icon
View More author details
Toc

Table of Contents (12) Chapters Close

Preface 1. Getting Started with Hadoop 2.X FREE CHAPTER 2. Exploring HDFS 3. Mastering Map Reduce Programs 4. Data Analysis Using Hive, Pig, and Hbase 5. Advanced Data Analysis Using Hive 6. Data Import/Export Using Sqoop and Flume 7. Automation of Hadoop Tasks Using Oozie 8. Machine Learning and Predictive Analytics Using Mahout and R 9. Integration with Apache Spark 10. Hadoop Use Cases Index

Decommissioning DataNodes

The Hadoop framework provides us with the option to remove certain nodes from the cluster if they are not needed any more. Here, we cannot simply shutdown the nodes that need to be removed as we might lose some of our data. They need to be decommissioned properly. In this recipe, we are going to learn how to decommission nodes from the Hadoop cluster.

Getting ready

To perform this recipe, you should have a Hadoop cluster, and you should have decided which node to decommission.

How to do it...

To decommission a node from the HDFS cluster, we need to perform the following steps:

  1. Create a dfs.exclude file in a folder, say /usr/local/hadoop/etc/hadoop, and add the hostname of the node you wish to decommission.
  2. Edit hdfs-site.xml on NameNode to append the following property:
        <property>
            <name>dfs.hosts.exclude</name>
            <value>/usr/local/hadoop/etc/hadoop/dfs.exclude</value>
        </property>
  3. Next, we need to execute the refreshNodes command so that it rereads the HDFS configuration in order to start the decommissioning:
    hdfs dfsadmin –refreshNodes
    

This will start the decommissioning, and once successful execution of the dfsadmin report command, you will see that the node's status is changed to Decommissioned from Normal:

hdfs dfsadmin –report
Name: 172.31.18.55:50010 (ip-172-31-18-55.us-west-2.compute.internal)
Hostname: ip-172-31-18-55.us-west-2.compute.internal
Decommission Status : Decommissioned
Configured Capacity: 8309932032 (7.74 GB)
DFS Used: 1179648 (1.13 MB)
Non DFS Used: 2371989504 (2.21 GB)
DFS Remaining: 5936762880 (5.53 GB)
DFS Used%: 0.01%
DFS Remaining%: 71.44%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu Oct 08 10:56:49 UTC 2015

Generally, the decommissioning takes time as it requires block replications on other nodes. Once the decommissioning is complete, the node will be added to the decommissioned nodes list.

How it works...

HDFS/Namenode reads the configurations from hdfs-site.xml. You can configure a file with the list of nodes to decommission and execute the refreshNodes command; it then rereads the configuration file. While doing this, it reads the configuration about the decommissioned nodes and will start rereplicating blocks to other available datanode. Depending on the size of datanode getting decommissioned, the time varies. Unless the completed decommissioning is not completed, it advisable for you to touch datanode.

You have been reading a chapter from
Hadoop Real-World Solutions Cookbook- Second Edition - Second Edition
Published in: Mar 2016
Publisher:
ISBN-13: 9781784395506
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $19.99/month. Cancel anytime
Banner background image