Programming Hadoop in Netbeans

Note: It seems that Karmasphere Studio no longer supports Netbeans. For programming Hadoop in Eclipse, you can read about it here.

Hadoop MapReduce is an open-source implementation of the MapReduce programming model for processing large-scale data in a distributed environment. Hadoop is implemented in Java as a class library. Several distributions of Hadoop are available, including those from Apache, Yahoo!, and Cloudera.

Meanwhile, Netbeans is an integrated development environment (IDE) for programming in Java and many other languages. Like any other IDE, Netbeans helps programmers develop applications more easily and with as little pain as possible. In this case, it helps us develop Hadoop MapReduce jobs.

In this post, I’ll show you step by step how to use Netbeans to develop a Hadoop MapReduce job. I’m using Netbeans 6.8 on Ubuntu Karmic Koala. The MapReduce program we are going to create is a simple one called wordcount. It reads the text in a set of files and lists every word along with the number of times it appears across all the files. The source code of this program is available in the MapReduce tutorial that ships with the Apache Hadoop distribution.
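To make the idea behind wordcount concrete before we touch Hadoop itself, here is a minimal sketch in plain Java. It does not use the Hadoop API and is not the code from the Hadoop tutorial; the class name WordCountSketch and its methods are made up for illustration. It only imitates the two phases on a single machine: a "map" step that emits a (word, 1) pair for every word, and a "reduce" step that adds up the pairs for each word.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop classes involved.
public class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // "Reduce" phase: sum the counts of all pairs that share the same word.
    // A TreeMap keeps the words sorted, so the printed result is predictable.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Run both phases over all input lines (standing in for the input files).
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello hadoop", "hello netbeans");
        System.out.println(countWords(lines)); // prints {hadoop=1, hello=2, netbeans=1}
    }
}
```

In a real Hadoop job, the framework runs the map step on many machines in parallel and groups the pairs by word before handing them to the reduce step; the sketch above just does both in one loop.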

We have divided this tutorial into three steps. First, we will install Karmasphere Studio for Hadoop, a Netbeans extension. Then, we will type in some code. Finally, we will run the MapReduce job in Netbeans. Okay, fasten your seat belt. Here we go.