Sunday 25 August 2013

Hadoop Installation on Ubuntu 12.04 (Single-Node Cluster)

Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.



In this tutorial we will install a single-node Hadoop cluster on Ubuntu 12.04.

Before starting the Hadoop installation, make sure Java is installed on your system. In this tutorial I install Java 7 on my machine; Java 6 works as well.

Install Oracle Java 7 from the PPA repository using the following commands:

$sudo add-apt-repository ppa:webupd8team/java
$sudo apt-get update
$sudo apt-get install oracle-java7-installer
$sudo update-java-alternatives -s java-7-oracle

To check whether Java is installed correctly and which version is installed, type the following command:
$java -version

For the installation you can either create a new hadoop user or use your current user. I am going with the second approach, so in this article I will be using the user "shakeel", which is my default and only user in Ubuntu.

Install the SSH server if it is not already present. It is needed because Hadoop connects to localhost over SSH to run its daemons.
$sudo apt-get install openssh-server
$ssh-keygen -t rsa -P ""
$cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

The final step is to test the SSH setup by connecting to your local machine with the shakeel user. The step is also needed to save your local machine’s host key fingerprint to the shakeel user’s known_hosts file.
$ssh localhost

Disable IPv6

$sudo gedit /etc/sysctl.conf
Paste the lines below at the end of the file:
#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

Normally you need to reboot the system for these settings to take effect, but you can also reload them without rebooting by executing the command below:
$sudo sysctl -p

To make sure that IPv6 is disabled, you can run the following command:
$cat /proc/sys/net/ipv6/conf/all/disable_ipv6
The printed value should be 1, which means that IPv6 is disabled.

HADOOP Installation

Now that the basic prerequisites are in place, we can proceed with the Hadoop installation itself. You can download the Hadoop package from the Apache downloads page http://www.apache.org/dyn/closer.cgi/hadoop/core.
I downloaded hadoop-1.0.4.tar.gz.

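Copy the tar file to your home directory and change into it.
$cd /home/shakeel

Extract the contents of the tar file. sudo is not needed inside your own home directory, and extracting as root can leave files that your user cannot write to.
$tar xzf hadoop-1.0.4.tar.gz

To keep things simple, rename hadoop-1.0.4 to hadoop.
$mv hadoop-1.0.4 hadoop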

Open the .bashrc file
$sudo gedit /home/shakeel/.bashrc

Add the HADOOP_HOME environment variable to your .bashrc. It points to the directory into which you extracted the contents of hadoop-1.0.4.tar.gz, i.e. hadoop.
export HADOOP_HOME=/home/shakeel/hadoop

Also add the JAVA_HOME environment variable at the end of the .bashrc file
export JAVA_HOME=/usr/lib/jvm/java-7-oracle

Add $HADOOP_HOME/bin to $PATH. This lets you start and stop the Hadoop cluster (start-all.sh or stop-all.sh) from any directory without navigating to Hadoop's bin directory.
export PATH=$PATH:$HADOOP_HOME/bin

Open a new terminal window and check that HADOOP_HOME, JAVA_HOME and PATH are set properly and contain the changes you made.
$echo $HADOOP_HOME
$echo $JAVA_HOME
$echo $PATH
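
The same check can be scripted. The following is a minimal sketch, not part of the original setup: the class name HadoopEnvCheck and its helper method are made up for illustration, and it assumes the Linux ":" separator for PATH entries.

```java
import java.util.Arrays;

/** Hypothetical helper: prints the variables added to .bashrc in this tutorial. */
final class HadoopEnvCheck {

    /** Returns true if the PATH-style string contains exactly the given directory entry. */
    static boolean pathContains(String path, String dir) {
        if (path == null || dir == null) {
            return false;
        }
        // PATH entries are separated by ':' on Linux.
        return Arrays.asList(path.split(":")).contains(dir);
    }

    public static void main(String[] args) {
        String hadoopHome = System.getenv("HADOOP_HOME");
        String javaHome = System.getenv("JAVA_HOME");
        String path = System.getenv("PATH");

        System.out.println("HADOOP_HOME = " + hadoopHome);
        System.out.println("JAVA_HOME   = " + javaHome);

        // The tutorial appends $HADOOP_HOME/bin to PATH, so look for that entry.
        boolean onPath = hadoopHome != null && pathContains(path, hadoopHome + "/bin");
        System.out.println("HADOOP_HOME/bin on PATH: " + onPath);
    }
}
```

Run it from a new terminal so that the updated .bashrc has been read; a null value means the variable is not exported.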

Update JAVA_HOME in hadoop-env.sh
$sudo gedit /home/shakeel/hadoop/conf/hadoop-env.sh
Replace the line
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
with
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
Make sure to remove the "#" at the beginning of the line.

Create a temporary directory for Hadoop. Create it as your own user (without sudo) so that Hadoop can write to it.
$mkdir /home/shakeel/tmp

Open the core-site.xml and add the following between <configuration> .. </configuration> tags.
$sudo gedit /home/shakeel/hadoop/conf/core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/home/shakeel/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri’s scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri’s authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>

Open the mapred-site.xml and add the following between <configuration> .. </configuration> tags.
$sudo gedit /home/shakeel/hadoop/conf/mapred-site.xml
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

Open the hdfs-site.xml and add the following between <configuration> .. </configuration> tags.
$sudo gedit /home/shakeel/hadoop/conf/hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>1</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>
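
A typo in one of the three site files is easy to miss, so it can help to read the values back. The sketch below is only an illustration: the class HadoopConfReader is hypothetical, it uses the XML parser from the JDK rather than any Hadoop API, and the default directory is the one used in this tutorial.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: reads <property> values from Hadoop *-site.xml files. */
final class HadoopConfReader {

    /** Parses an XML stream; parser errors are rethrown as unchecked exceptions. */
    static Document parse(InputStream in) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(in);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration XML", e);
        }
    }

    /** Convenience wrapper for XML held in a string. */
    static Document parseString(String xml) {
        return parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    /** Returns the <value> of the <property> whose <name> matches, or null if absent. */
    static String readProperty(Document doc, String name) {
        NodeList props = doc.getElementsByTagName("property");
        for (int i = 0; i < props.getLength(); i++) {
            Element prop = (Element) props.item(i);
            if (name.equals(childText(prop, "name"))) {
                return childText(prop, "value");
            }
        }
        return null;
    }

    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the conf directory used in this tutorial, unless one is passed in.
        String confDir = args.length > 0 ? args[0] : "/home/shakeel/hadoop/conf";
        String[][] checks = {
            {"core-site.xml", "hadoop.tmp.dir"},
            {"core-site.xml", "fs.default.name"},
            {"mapred-site.xml", "mapred.job.tracker"},
            {"hdfs-site.xml", "dfs.replication"}
        };
        for (String[] check : checks) {
            try (InputStream in = new FileInputStream(new File(confDir, check[0]))) {
                System.out.println(check[0] + ": " + check[1] + " = "
                        + readProperty(parse(in), check[1]));
            }
        }
    }
}
```

A printed value of null means the property was not found, which usually points to a misspelled name or a block pasted outside the <configuration> tags.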

The next step is to format the HDFS filesystem via the NameNode. This simply initializes the directory specified by the dfs.name.dir variable, which corresponds to ${hadoop.tmp.dir}/dfs/name on the local filesystem.
Do not run this command on a running cluster. It is needed only once, during installation, and reformatting erases all data in HDFS.
$/home/shakeel/hadoop/bin/hadoop namenode -format

To start the Hadoop server, navigate to Hadoop's bin directory
$cd /home/shakeel/hadoop/bin/

Type in the command
$./start-all.sh
or, if you have appended HADOOP_HOME/bin to the PATH variable, you can run the command below from anywhere:
$start-all.sh
Once all the processes are started, go to the logs and check that they do not contain any exceptions.

To check which processes are running, type:
$jps

Output should be something like:
3435 NameNode
5645 DataNode
6766 SecondaryNameNode
6788 JobTracker
6567 TaskTracker
3445 Jps
If any of the processes listed above is missing, there was an error while starting the Hadoop cluster. Check the logs to find the cause.
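
Comparing the jps output against the expected list can also be automated. This is a rough sketch under stated assumptions: the class HadoopDaemonCheck is hypothetical, jps must be on the PATH, and the five daemon names are the ones shown in the sample output above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: reports which Hadoop 1.x daemons are missing from jps output. */
final class HadoopDaemonCheck {

    static final List<String> EXPECTED = Arrays.asList(
            "NameNode", "DataNode", "SecondaryNameNode", "JobTracker", "TaskTracker");

    /** Returns the expected daemons that do not appear in the given jps output lines. */
    static List<String> missingDaemons(List<String> jpsLines) {
        Set<String> running = new HashSet<String>();
        for (String line : jpsLines) {
            // Each jps line has the form "<pid> <main class name>".
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2) {
                running.add(parts[1]);
            }
        }
        List<String> missing = new ArrayList<String>(EXPECTED);
        missing.removeAll(running);
        return missing;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Run jps and collect its output line by line.
        Process jps = new ProcessBuilder("jps").redirectErrorStream(true).start();
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new InputStreamReader(jps.getInputStream()));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            reader.close();
        }
        jps.waitFor();

        List<String> missing = missingDaemons(lines);
        System.out.println(missing.isEmpty()
                ? "All Hadoop daemons are running."
                : "Missing daemons: " + missing);
    }
}
```

If a daemon is reported missing, its log file under the Hadoop logs directory is the place to look next.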

Hadoop comes with several web interfaces which are by default (see conf/hadoop-default.xml) available at these locations:
http://localhost:50070/  --> web UI of the NameNode daemon
http://localhost:50030/  --> web UI of the JobTracker daemon
http://localhost:50060/  -->  web UI of the TaskTracker daemon
Make sure all of these links work. If they do, your single-node Hadoop cluster was installed successfully on your machine.
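
Instead of opening each link by hand, the three pages can be probed from code. The sketch below is illustrative only: HadoopWebUiCheck is a hypothetical class, it relies on HttpURLConnection from the JDK, and it treats any HTTP status below 400 as "up", which is an assumption rather than something Hadoop specifies.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical helper: checks whether the Hadoop web interfaces answer over HTTP. */
final class HadoopWebUiCheck {

    /** Returns true if a GET request to the URL completes with a status below 400. */
    static boolean isUp(String url, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            int status = conn.getResponseCode();
            conn.disconnect();
            return status < 400;
        } catch (IOException e) {
            // Malformed URL, connection refused, timeout: all count as "not up".
            return false;
        }
    }

    public static void main(String[] args) {
        // Default ports listed in this tutorial.
        String[][] uis = {
            {"NameNode", "http://localhost:50070/"},
            {"JobTracker", "http://localhost:50030/"},
            {"TaskTracker", "http://localhost:50060/"}
        };
        for (String[] ui : uis) {
            System.out.println(ui[0] + " (" + ui[1] + "): "
                    + (isUp(ui[1], 3000) ? "up" : "not reachable"));
        }
    }
}
```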

To stop the Hadoop server, use the command below:
$./stop-all.sh
or
$stop-all.sh

I hope this has helped you install a single-node Hadoop cluster on Ubuntu.
My next post will be on installing HBase over HDFS.

References:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
http://mysolvedproblem.blogspot.com/2012/05/installing-hadoop-on-ubuntu-linux-on.html

Monday 19 August 2013

Tomcat server setup on Windows

Tomcat is an open-source web server and servlet container developed by the Apache Software Foundation. Tomcat implements the Java Servlet and JavaServer Pages specifications and provides a "pure Java" HTTP web server environment for Java code to run in.

This post will help you set up a Tomcat server on your Windows machine.

To set up Tomcat, first download the binary distribution zip file from the Apache site. Download the latest version, or the version you require, from http://tomcat.apache.org/download-60.cgi

Unzip it and place it on any of the drives.

Now add a new environment variable JAVA_HOME if it is not already present on your machine. Make sure it points to the JDK installed on your machine.

For example, in my case: JAVA_HOME=C:\Program Files\Java\jdk1.6.0

Now go to the bin folder under the unzipped apache-tomcat-6.0.36 folder and run startup.bat to start the Tomcat server. A console will open, and once you see “INFO: Server startup in XX ms” the server has started.

In your browser, go to http://localhost:8080/. If you see the Apache Tomcat server page, you have set up Tomcat successfully.
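
Tomcat can take a little while to start, so a script that depends on it may want to wait until the page answers. The following is a minimal polling sketch, assuming the default address http://localhost:8080/ used above; the class name TomcatStartupWait, the retry count and the delay are arbitrary choices made for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical helper: polls a URL until the server answers or the attempts run out. */
final class TomcatStartupWait {

    /** Returns true if a GET request to the URL completes with a status below 400. */
    static boolean responds(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            int status = conn.getResponseCode();
            conn.disconnect();
            return status < 400;
        } catch (IOException e) {
            return false;
        }
    }

    /** Tries the URL up to `attempts` times, sleeping `delayMillis` between failed tries. */
    static boolean waitUntilUp(String url, int attempts, long delayMillis) {
        for (int i = 0; i < attempts; i++) {
            if (responds(url)) {
                return true;
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        boolean up = waitUntilUp("http://localhost:8080/", 30, 1000L);
        System.out.println(up ? "Tomcat is up." : "Tomcat did not respond in time.");
    }
}
```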

Sunday 18 August 2013

Installation of Apache Directory for creating LDAP server

This post shows how to install Apache Directory Studio, create an LDAP server and insert a few user records into it. These records will be used for LDAP authentication in our application.

First download Apache Directory Studio. Visit the Apache site http://directory.apache.org/studio/download/download-windows.html and download the required version.

Once it is downloaded, double-click the exe and follow the steps in the installer.

The installation of Apache Directory Studio is now complete. We will now create an LDAP server in it.

To create an LDAP server, open Apache Directory Studio, go to the LDAP Servers tab and right click --> New --> New Server

Give your server a name, choose one of the listed Apache Software Foundation servers and click Finish.

You can view the configuration properties of your server by right clicking on the server --> Open Configuration. These properties will be used when connecting applications to the server.

Now you need to start the server. Right click on the server --> Run.

Once the server is started, right click on the server and create a new LDAP connection by clicking --> Create a Connection.

You will get a message confirming the creation of the new connection.

Once the connection is created, we are going to add a few user credentials. Go to the LDAP Browser and expand DIT --> ou=system --> ou=users. Right click on ou=users --> New --> New Entry.

Create entry from scratch --> Next

Select inetOrgPerson from the available object classes --> Add --> Next.

For RDN select uid from the list and enter a username against it. It should be a unique value. This will act as the user for your application.

Update the sn and cn values, where sn = surname and cn = common name.

The user also needs a password to authenticate against, so we have to add a new attribute for it. Right click --> New Attribute


Select userPassword from the list --> Next --> finish.

Enter the password in the Password Editor and press OK.

Now the user has been added to the directory and you can use it for the LDAP authentication. You can now access your LDAP server at ldap://localhost:10389 (see server configurations). Hope this will help you.
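
To confirm from code that the new entry exists, the users under ou=users,ou=system can be listed over JNDI. This is a sketch under assumptions, not a definitive recipe: the class ListLdapUsers is hypothetical, and the bind account uid=admin,ou=system with password "secret" is assumed to be the ApacheDS default administrator. Check your server configuration and pass your own values as arguments if it differs.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

/** Hypothetical helper: lists the inetOrgPerson entries created under ou=users,ou=system. */
final class ListLdapUsers {

    /** Builds the JNDI environment for a simple (DN + password) bind. */
    static Hashtable<String, String> buildEnv(String url, String bindDn, String password) {
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, url);
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, bindDn);
        env.put(Context.SECURITY_CREDENTIALS, password);
        return env;
    }

    /** Returns the full DNs of all inetOrgPerson entries directly below ou=users,ou=system. */
    static List<String> listUserDns(Hashtable<String, String> env) throws NamingException {
        DirContext ctx = new InitialDirContext(env);
        try {
            SearchControls controls = new SearchControls();
            controls.setSearchScope(SearchControls.ONELEVEL_SCOPE);
            NamingEnumeration<SearchResult> results =
                    ctx.search("ou=users,ou=system", "(objectClass=inetOrgPerson)", controls);
            List<String> dns = new ArrayList<String>();
            while (results.hasMore()) {
                dns.add(results.next().getNameInNamespace());
            }
            return dns;
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) throws NamingException {
        // Assumption: default ApacheDS administrator credentials unless overridden.
        String bindDn = args.length > 0 ? args[0] : "uid=admin,ou=system";
        String password = args.length > 1 ? args[1] : "secret";
        Hashtable<String, String> env = buildEnv("ldap://localhost:10389", bindDn, password);
        for (String dn : listUserDns(env)) {
            System.out.println(dn);
        }
    }
}
```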

LDAP implementation

LDAP stands for Lightweight Directory Access Protocol and it is used for user authentication, user provisioning, authorization, feeds, and views.

This post will help you implement LDAP authentication in your Java application.

Here we will first prepare a login page that accepts user credentials and authenticates the user against the LDAP server. Only when the user is authenticated against the directory will he/she be allowed to access the application. For now I have just put in success and failure pages, to which the user is forwarded based on the authentication result; you can customize them according to your needs.
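
The authentication step itself boils down to an LDAP bind with the user's DN and password, and it can be tried from the command line before any web application is involved. The sketch below is a hedged illustration: the class LdapBindCheck is hypothetical, the URL and the ou=users,ou=system base are assumptions matching a default local Apache Directory server, and escaping the user name with Rdn.escapeValue and rejecting empty passwords are precautions added here.

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.InitialDirContext;
import javax.naming.ldap.Rdn;

/** Hypothetical helper: verifies a user's credentials with an LDAP simple bind. */
final class LdapBindCheck {

    /** Builds the user's DN, escaping characters that have a special meaning in a DN. */
    static String buildUserDn(String username) {
        return "uid=" + Rdn.escapeValue(username) + ",ou=users,ou=system";
    }

    /** Returns true only if the directory accepts a bind with the given credentials. */
    static boolean authenticate(String url, String username, String password) {
        // An empty password would be treated as an anonymous bind, so reject it up front.
        if (username == null || username.isEmpty() || password == null || password.isEmpty()) {
            return false;
        }
        Hashtable<String, String> env = new Hashtable<String, String>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, url);
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, buildUserDn(username));
        env.put(Context.SECURITY_CREDENTIALS, password);
        try {
            // Creating the context performs the bind; it throws if the bind is rejected.
            new InitialDirContext(env).close();
            return true;
        } catch (NamingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("Usage: java LdapBindCheck <username> <password>");
            return;
        }
        boolean ok = authenticate("ldap://localhost:10389", args[0], args[1]);
        System.out.println(ok ? "Success" : "Failure");
    }
}
```

Escaping matters because the user name is concatenated into the DN: without it, a name containing a comma could change which entry the bind targets.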

If you don't have a directory (LDAP URL) against which you can test, the best way is to install Apache Directory Studio, insert a few user credentials into it and test against that. Detailed instructions on how to install Apache Directory Studio and insert user credentials are available in my other post http://technsolution.blogspot.in/2013/08/installation-of-apache-directory-for.html

Here is the code:


login.html

<html>
<head>
<title>
Login page
</title>
</head>
<body>
<h1 style="font-family:Comic Sans MS;text-align:center;font-size:20pt;color:#00FF00;">
Simple Login Page
</h1>
<form name="login" action="Login" method="post">
Username : <input type="text" name="username"/>
Password : <input type="password" name="password"/>
<input type="submit" name="submit" value="Enter" style="background-color: #FFA500;width: 100px">

</form>


</body>


</html>


Login.java

import java.io.IOException;
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingException;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;
import javax.naming.ldap.Rdn;
import javax.servlet.RequestDispatcher;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class Login extends HttpServlet {

    private static final long serialVersionUID = 1L;

    private static final String SUCCESS = "/success.html";
    private static final String FAILURE = "/failure.html";

    protected void processRequest(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {

        String username = request.getParameter("username");
        String password = request.getParameter("password");

        boolean authenticated = false;

        // Reject missing or empty credentials up front: a simple bind with an
        // empty password is treated as an anonymous bind and may succeed.
        if (username != null && !username.isEmpty()
                && password != null && !password.isEmpty()) {

            // Escape the user name so that special characters cannot alter the DN.
            String userDn = "uid=" + Rdn.escapeValue(username) + ",ou=users,ou=system";

            Hashtable<String, String> env = new Hashtable<String, String>(11);
            env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
            env.put(Context.PROVIDER_URL, "ldap://localhost:10389");
            env.put(Context.SECURITY_AUTHENTICATION, "simple");
            env.put(Context.SECURITY_PRINCIPAL, userDn);
            env.put(Context.SECURITY_CREDENTIALS, password);

            // Log the DN only; never write the password to the log.
            System.out.println("User str :: " + userDn);

            try {
                // Creating the initial context performs the LDAP bind and
                // throws a NamingException if the credentials are rejected.
                DirContext ctx = new InitialDirContext(env);
                authenticated = true;
                // Close the context when we're done
                ctx.close();
            } catch (NamingException e) {
                authenticated = false;
            }
        }

        System.out.println(authenticated ? "Success" : "Failure");
        String strUrl = authenticated ? SUCCESS : FAILURE;

        RequestDispatcher rd = getServletContext().getRequestDispatcher(strUrl);
        rd.forward(request, response);
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        processRequest(request, response);
    }
}


success.html

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Success Page</title>
</head>
<body>
 <h1>Success</h1>
</body>

</html>


failure.html

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>Failure Page</title>
</head>
<body>
 <h1>Failure</h1>
</body>

</html>

web.xml

<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://java.sun.com/xml/ns/javaee" xmlns:web="http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd" id="WebApp_ID" version="2.5">
  <display-name>Test</display-name>
  <welcome-file-list>
    <welcome-file>login.html</welcome-file>
  </welcome-file-list>
  
  <servlet>
    <servlet-name>login</servlet-name>
    <servlet-class>Login</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>login</servlet-name>
    <url-pattern>/Login</url-pattern>
  </servlet-mapping>

</web-app>

That's it. You now have a basic application with LDAP authentication implemented. I hope this helps you.


Saturday 17 August 2013

Installing JBoss Application Server as a Windows service



This post covers the basic steps involved in setting up a JBoss server as a Windows service.

First, download the binary distribution of JBoss 5.1.0 from the JBoss site, i.e. http://www.jboss.org/jbossas/downloads/

On the downloads page, download jboss-5.1.0.GA.zip

Once it is downloaded, extract the contents of the zip file to any location. Check for the service.bat file under jboss-5.1.0.GA\bin in the extracted folder.

Now open the command prompt, navigate to the location of the service.bat file (in my case D:\jboss-5.1.0.GA\bin) and enter the “service install” command to install the JBoss 5.1 service on Windows.

Go to the Services console and you will see a new JBoss Application Server 5.1 service installed there.

Right click on the JBoss Application Server 5.1 service and click Start. The service will start and you will be able to use the JBoss server at localhost.

Type http://localhost:8080/ in the browser. If you see the JBoss home page, the server started successfully.

Click on the Administration Console and enter admin as both the default username and password.




Now your installation is complete and you are ready to use the JBoss 5.1.0 Application Server.