Tutorial - Apache Hadoop Installation on Ubuntu Linux [ Step by Step ]

Would you like to learn how to install Apache Hadoop on Ubuntu Linux? This tutorial shows you how to download and install Apache Hadoop on a computer running Ubuntu Linux.

Software List:

• Ubuntu 18.04
• Ubuntu 19.04
• Ubuntu 19.10
• Apache Hadoop 3.1.3
• Openjdk version 11.0.4

Tutorial - Apache Hadoop Installation on Ubuntu Linux

Install the Java JDK package.

Use the following command to find the Java JDK installation directory.

The command output should show the Java installation directory.

In our example, our Java JDK is installed under the folder: /usr/lib/jvm/java-11-openjdk-amd64
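The original command listing was not preserved here. As a minimal sketch of one way to find this directory, the snippet below resolves the `java` binary through its symlinks and strips the trailing `/bin/java`. The helper name `java_home_from_binary` is hypothetical, and this is not necessarily the command the tutorial originally used.

```shell
# Hypothetical helper: derive the JDK directory from the resolved java binary path.
java_home_from_binary() {
  # Remove the trailing "/bin/java" component from the path.
  printf '%s\n' "${1%/bin/java}"
}

# Only attempt the lookup when java is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  java_home_from_binary "$(readlink -f "$(command -v java)")"
fi
```

On a system matching this tutorial, the printed path should correspond to the directory shown above.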

Now, you need to create an environment variable named JAVA_HOME.

Let’s create a file to automate the required environment variables configuration.

Here is the java.sh file content.
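The file listing itself was lost, so the following is only a sketch of what such a file might contain. It assumes the file lives at /etc/profile.d/java.sh (the location is an assumption) and uses the JDK directory from our example.

```shell
# Sketch of java.sh; the /etc/profile.d/java.sh location is an assumption.
# Point JAVA_HOME at the JDK directory found in the previous step.
export JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"
# Make the JDK tools available on the PATH.
export PATH="$PATH:$JAVA_HOME/bin"
```

Files in /etc/profile.d are read by login shells, which is why the variables become available after the next login or reboot.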

Reboot the computer.

Use the following command to verify if the JAVA_HOME variable was created.

Here is the command output:

Use the following command to test the Java installation.

Here is the command output:

Create a local user account named hadoop.

Here is the command output.

Take note of the Hadoop user password.

Use the su command to become the Hadoop user.

Generate an SSH key for the Hadoop user account.

Here is the command output.

As the Hadoop user, add the Hadoop user's public key to the list of authorized SSH keys.

You will need to enter the Hadoop user password.

Here is the command output.

As the Hadoop user, try to log in to localhost over SSH.

Log out of the Hadoop user account and return to the root account.

Download the Hadoop package from the official website.

Install the Hadoop software on your Linux server.
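The download and install commands were not preserved, so the sketch below shows one plausible way to perform both steps. The helper names `download_hadoop` and `install_hadoop`, the use of the Apache archive URL, and the /usr/local/hadoop target directory are all assumptions rather than the tutorial's original choices.

```shell
# Version used in this tutorial.
HADOOP_VERSION="3.1.3"
HADOOP_TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
# Assumption: fetch the release from the Apache archive.
HADOOP_URL="https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/${HADOOP_TARBALL}"

# Hypothetical helper: download the release tarball into the given directory.
download_hadoop() {
  wget -O "$1/${HADOOP_TARBALL}" "$HADOOP_URL"
}

# Hypothetical helper: unpack a tarball under a prefix and rename the
# versioned directory to plain "hadoop".
install_hadoop() {
  tarball="$1"
  prefix="$2"
  tar -xzf "$tarball" -C "$prefix" &&
    mv "$prefix/hadoop-${HADOOP_VERSION}" "$prefix/hadoop"
}
```

As root, a call such as `download_hadoop /tmp && install_hadoop "/tmp/$HADOOP_TARBALL" /usr/local` would leave the software under /usr/local/hadoop.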

Now, you need to create the environment variables required by Apache Hadoop.

Let’s create a file to automate the required environment variables configuration.

Here is the hadoop.sh file content.
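The listing was lost in this copy, so what follows is a sketch of what hadoop.sh might look like. It assumes Hadoop was installed under /usr/local/hadoop and that the file is placed in /etc/profile.d; both are assumptions. The variable names themselves are the standard ones read by the Hadoop scripts.

```shell
# Sketch of hadoop.sh; the install directory below is an assumption.
export HADOOP_HOME="/usr/local/hadoop"
export HADOOP_COMMON_HOME="$HADOOP_HOME"
export HADOOP_HDFS_HOME="$HADOOP_HOME"
export HADOOP_MAPRED_HOME="$HADOOP_HOME"
export HADOOP_YARN_HOME="$HADOOP_HOME"
# Directory that holds core-site.xml, hdfs-site.xml and the other config files.
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
# Expose both the client commands (bin) and the start/stop scripts (sbin).
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```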

You also need to set the JAVA_HOME environment variable in the hadoop-env.sh file.

Edit the hadoop-env.sh file.

Add the following line at the end of this file.

Reboot the computer.

Use the following command to verify if the Apache Hadoop environment variables were created.

Here is the command output:

Verify the Apache Hadoop version installed.

Here is the command output.

The Apache Hadoop software installation was completed.

Tutorial - Apache Hadoop Configuration Example

In our example, we are going to configure Apache Hadoop as a single-node cluster.

Edit the core-site.xml file.

Here is the original file, before our configuration.

Here is the new file with our configuration.
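The configured file was not preserved. As a sketch, a single-node core-site.xml usually only needs the `fs.defaultFS` property. The helper name `core_site_xml` is hypothetical, and the `hdfs://localhost:9000` value is the conventional single-node setting, assumed here rather than taken from the original tutorial.

```shell
# Hypothetical helper: print a minimal single-node core-site.xml to stdout.
core_site_xml() {
  cat <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Default file system URI; host and port are assumed values. -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}
```

The output can be redirected into place, for example `core_site_xml > "$HADOOP_CONF_DIR/core-site.xml"`, provided HADOOP_CONF_DIR is set.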

Edit the hdfs-site.xml file.

Here is the original file, before our configuration.

Here is the new file with our configuration.
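The configured file is missing from this copy. A sketch of a typical single-node hdfs-site.xml follows: one replica per block, plus the storage locations for the namenode and datanode. The helper name `hdfs_site_xml` and the default base directory /usr/local/hadoop/hdfs are assumptions.

```shell
# Hypothetical helper: print a single-node hdfs-site.xml to stdout.
# $1 is the base directory holding the namenode and datanode folders
# (the default below is an assumed location).
hdfs_site_xml() {
  base_dir="${1:-/usr/local/hadoop/hdfs}"
  cat <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- A single node can only hold one copy of each block. -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file://${base_dir}/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file://${base_dir}/datanode</value>
  </property>
</configuration>
EOF
}
```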

Edit the mapred-site.xml file.

Here is the original file, before our configuration.

Here is the new file with our configuration.
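The configured file was lost. As a sketch, the minimal change for a single-node setup is to tell MapReduce to run on YARN. The helper name `mapred_site_xml` is hypothetical, and the original tutorial may have set additional properties.

```shell
# Hypothetical helper: print a minimal mapred-site.xml to stdout.
mapred_site_xml() {
  cat <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Run MapReduce jobs on YARN instead of the local runner. -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF
}
```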

Edit the yarn-site.xml file.

Here is the original file, before our configuration.

Here is the new file with our configuration.
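The configured file is not available in this copy. A sketch of a minimal yarn-site.xml enables the shuffle service that MapReduce needs on each NodeManager. The helper name `yarn_site_xml` is hypothetical, and the original file may have contained more settings.

```shell
# Hypothetical helper: print a minimal yarn-site.xml to stdout.
yarn_site_xml() {
  cat <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Auxiliary service required by MapReduce for shuffling map output. -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF
}
```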

Create the required directories named namenode and datanode.
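Since the original command is missing, here is a small sketch of this step. The helper name `make_hdfs_dirs` is hypothetical and the parent directory is left as a parameter, because the location used by the tutorial is not known. It should match the paths configured in hdfs-site.xml.

```shell
# Hypothetical helper: create the namenode and datanode directories
# under the base directory passed as $1.
make_hdfs_dirs() {
  base_dir="$1"
  mkdir -p "$base_dir/namenode" "$base_dir/datanode"
}
```

For example, `make_hdfs_dirs /usr/local/hadoop/hdfs` run as root, followed by giving the Hadoop user ownership of that directory so the daemons can write to it.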

Use the following command to format the namenode.

Use the following command to start your Apache Hadoop cluster.

Use the following command to start your Apache Hadoop cluster.
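The two start commands were not preserved. Hadoop ships the scripts start-dfs.sh and start-yarn.sh in its sbin directory, and it is an assumption that these are the two commands meant above. The sketch below wraps them in a hypothetical `start_hadoop` helper that stops at the first failure.

```shell
# Hypothetical helper: start HDFS first, then YARN.
# Assumes Hadoop's sbin directory is on the PATH.
start_hadoop() {
  for script in start-dfs.sh start-yarn.sh; do
    if ! command -v "$script" >/dev/null 2>&1; then
      echo "error: $script not found on PATH" >&2
      return 1
    fi
    # Abort if a start script reports an error.
    "$script" || return 1
  done
}
```

Run `start_hadoop` as the Hadoop user, after the namenode has been formatted.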

Open a web browser and enter the IP address of your Apache Hadoop server followed by :9870

In our example, the following URL was entered in the browser:

• http://192.168.15.10:9870

The Apache Hadoop web interface should be displayed.
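If no browser is at hand, the same check can be sketched from the command line. The helper names `namenode_ui_url` and `check_namenode_ui` are hypothetical, and the sketch assumes curl is installed. Port 9870 is the one used in this tutorial.

```shell
# Hypothetical helper: build the web interface URL for a host.
# $1 is the host or IP address, $2 an optional port (defaults to 9870).
namenode_ui_url() {
  printf 'http://%s:%s\n' "$1" "${2:-9870}"
}

# Hypothetical helper: succeed only if the web interface answers within 5 seconds.
check_namenode_ui() {
  curl --silent --fail --output /dev/null --max-time 5 "$(namenode_ui_url "$1")"
}
```

For the server in our example, `check_namenode_ui 192.168.15.10 && echo "web interface is reachable"` would print the message once the cluster is up.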

Congratulations! You have finished the Apache Hadoop installation on Ubuntu Linux.