Hadoop Cluster Setup with Ansible

Skabhi · Jun 30, 2021

🔰 Task Description📄

Configure Hadoop and start the cluster services using an Ansible playbook.

🔰 STEPS:

Step 1: Copy the Java and Hadoop software to the managed nodes

Step 2: Install the Java and Hadoop software

Step 3: Create directories for the NameNode and the DataNode

Step 4: Configure the hdfs-site.xml and core-site.xml files

Step 5: Format the NameNode and start the services on both nodes

Step 6: Check the cluster report

🔰 Practical:

1. Set up the controller node with Ansible installed and configured.

Now, create an inventory file and add the managed nodes' IP addresses, usernames, and passwords to it.

Then, create the Ansible configuration file.
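A minimal sketch of what these two files might look like (the inventory path, the DataNode IP, and the credentials are placeholders; the NameNode IP matches the nn_ip used later in this article):

# /root/ip.txt — sample inventory (placeholder path and credentials)
[namenode]
192.168.43.245 ansible_user=root ansible_ssh_pass=redhat

[datanode]
192.168.43.27 ansible_user=root ansible_ssh_pass=redhat

# /etc/ansible/ansible.cfg — minimal configuration
[defaults]
inventory = /root/ip.txt
host_key_checking = False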

2. List all the hosts

Then check that the managed nodes respond to a ping.
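Both checks can be run from the controller node:

# List every host that Ansible picked up from the inventory
ansible all --list-hosts

# Confirm that all managed nodes respond to the ping module
ansible all -m ping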

3. Now, create a playbook for the NameNode.

- hosts: namenode
  vars_files:
    - namenode_vars.yml
  tasks:
    - name: Copy the JDK installer
      copy:
        src: "/root/jdk-8u281-linux-x64.rpm"
        dest: "/root/jdk-8u281-linux-x64.rpm"

    - name: Copy the Hadoop installer
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root/hadoop-1.2.1-1.x86_64.rpm"

    - name: Install Java
      command:
        cmd: "rpm -ivh jdk-8u281-linux-x64.rpm"

    - name: Install Hadoop
      command:
        cmd: "rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force"

    - name: Add the dfs.name.dir property to hdfs-site.xml
      lineinfile:
        path: "/etc/hadoop/hdfs-site.xml"
        regexp: '</configuration>'
        insertafter: '<configuration>'
        line: "<property>\n<name>dfs.name.dir</name>\n<value>{{ nn_dir }}</value>\n</property>\n</configuration>"

    - name: Point fs.default.name at this node in core-site.xml
      lineinfile:
        path: "/etc/hadoop/core-site.xml"
        regexp: '</configuration>'
        insertafter: '<configuration>'
        line: "<property>\n<name>fs.default.name</name>\n<value>hdfs://{{ ansible_facts['default_ipv4']['address'] }}:9001</value>\n</property>\n</configuration>"

    - name: Create the NameNode directory
      file:
        state: directory
        path: "{{ nn_dir }}"

    - name: Stopping firewall
      shell: "systemctl stop firewalld"

    - name: Formatting namenode
      shell: "echo Y | hadoop namenode -format"

    - name: Starting namenode services
      shell: "hadoop-daemon.sh start namenode"

Create the variables file for the NameNode (namenode_vars.yml):

nn_dir: "/nn"
nn_ip: "192.168.43.245"

Now, run the playbook.
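Assuming the playbook was saved as namenode.yml (the filename here is hypothetical):

ansible-playbook namenode.yml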

4. Now, create a playbook for the DataNode.

- hosts: datanode
  vars_files:
    - datanode_vars.yml
  tasks:
    - name: Copy the JDK installer
      copy:
        src: "/root/jdk-8u281-linux-x64.rpm"
        dest: "/root/jdk-8u281-linux-x64.rpm"

    - name: Copy the Hadoop installer
      copy:
        src: "/root/hadoop-1.2.1-1.x86_64.rpm"
        dest: "/root/hadoop-1.2.1-1.x86_64.rpm"

    - name: Install Java
      command:
        cmd: "rpm -ivh jdk-8u281-linux-x64.rpm"

    - name: Install Hadoop
      command:
        cmd: "rpm -ivh hadoop-1.2.1-1.x86_64.rpm --force"

    - name: Add the dfs.data.dir property to hdfs-site.xml
      lineinfile:
        path: "/etc/hadoop/hdfs-site.xml"
        regexp: '</configuration>'
        insertafter: '<configuration>'
        line: "<property>\n<name>dfs.data.dir</name>\n<value>{{ dn_dir }}</value>\n</property>\n</configuration>"

    - name: Point fs.default.name at the NameNode in core-site.xml
      lineinfile:
        path: "/etc/hadoop/core-site.xml"
        regexp: '</configuration>'
        insertafter: '<configuration>'
        line: "<property>\n<name>fs.default.name</name>\n<value>hdfs://{{ nn_ip }}:9001</value>\n</property>\n</configuration>"

    - name: Create the DataNode directory
      file:
        state: directory
        path: "{{ dn_dir }}"

    - name: Stopping firewall
      shell: "systemctl stop firewalld"

    - name: Starting datanode services
      shell: "hadoop-daemon.sh start datanode"

Create the variables file for the DataNode (datanode_vars.yml):

dn_dir: "/dn"
nn_ip: "192.168.43.245"

Now, run the playbook.
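Again, assuming the playbook was saved as datanode.yml:

ansible-playbook datanode.yml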

5. Now, verify the setup.

At this point, the software is installed and the services are started on both the NameNode and the DataNode.
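A quick way to confirm this on each node is jps, which ships with the JDK and lists the running Java processes:

# Should show a NameNode process on the master and a DataNode process on the worker
jps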

6. Now, to check the report of the Hadoop setup, use the dfsadmin report command.
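In Hadoop 1.x, the report is generated like this:

# Lists the live DataNodes along with the cluster's configured and used capacity
hadoop dfsadmin -report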

That’s all for this article. Thanks for reading! 😊
