Hadoop MapReduce is an open-source implementation of the MapReduce programming model for processing large-scale data in a distributed environment. Hadoop is implemented in Java as a class library. There are several distributions of Hadoop, from Apache, Cloudera, and Yahoo!
Meanwhile, Netbeans is an integrated development environment (or IDE) for programming in Java and many other programming languages. Netbeans (like any other IDE) helps programmers develop applications more easily and as painlessly as possible with its features. In this case, it helps us develop Hadoop MapReduce jobs.
In this post, I’ll show you step by step how to use Netbeans to develop a Hadoop MapReduce job. I’m using Netbeans 6.8 on the Ubuntu Karmic Koala distribution. The MapReduce program we are going to create here is a simple program called wordcount. This program reads the text in some files and lists every word together with how many times it appears across all the files. The source code of this program is available in the MapReduce tutorial packaged with the Apache Hadoop distribution.
We divide this tutorial into three steps. First, we will install Karmasphere Studio for Hadoop, a Netbeans extension. Then, we will type some code. And finally, we will run the MapReduce job in Netbeans. Okay, fasten your seat belt. Here we go.
Install Karmasphere Studio for Hadoop
In order to do this, you must have already installed JDK 1.6 and Netbeans (of course). There is a nifty tutorial with pictures about how to install Karmasphere Studio for Hadoop on their site, but I’ll write it again here.
- Open your Netbeans, go to Update Center using Tools > Plugins.
- In the Update Center, go to Settings tab and click the Add button. Enter the following Name and URL in the Update Center Customizer window:
Name: Karmasphere Studio for Hadoop
URL: http://hadoopstudio.org/updates/updates.xml
- Now, select the Available Plugins tab. Find the “Karmasphere Studio for Hadoop” in the list and check it. Then click the Install button.
- Click Next and accept the license agreement. Click Install on the list of plugins that will be installed. Then, click Continue to download and install the plugins. The download is about 20-something MB (I forgot the exact size). Wait for it, and when it’s finished, restart your IDE.
- Done, we are good to go.
Typing some code
Now, we are going to type the code for the wordcount program. To do this, you must have restarted your IDE after the plugin installation. If you haven’t done it, then do it now, I’ll wait. Done? Okay, let’s continue.
- We need to create a new Java application. To do that, go to File > New Project. Pick Java Application project and click Next.
- In the next window, give WordCount as the name of the project. Then type WordCount as the Main Class. When you’re done, click Finish.
- Okay, the editor for WordCount.java is now open. But first, we must add the Hadoop library to the project. To do this, right-click on Libraries in the WordCount project folder on the left side of the IDE, then pick Add Library.

- In the Add Library window, select Hadoop 0.20.0 as the version of Hadoop that we are going to use. Then click the Add Library button.
- The appropriate library now has been added to the project. Next we are going to the WordCount.java editor. Edit this file with this code below:
    import java.io.IOException;
    import java.util.*;

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapred.*;

    public class WordCount {

        // Mapper: emits (word, 1) for every whitespace-separated token in a line.
        public static class Map extends MapReduceBase
                implements Mapper<LongWritable, Text, Text, IntWritable> {
            private final static IntWritable one = new IntWritable(1);
            private Text word = new Text();

            public void map(LongWritable key, Text value,
                            OutputCollector<Text, IntWritable> output,
                            Reporter reporter) throws IOException {
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    output.collect(word, one);
                }
            }
        }

        // Reducer: sums all the counts collected for one word.
        public static class Reduce extends MapReduceBase
                implements Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterator<IntWritable> values,
                               OutputCollector<Text, IntWritable> output,
                               Reporter reporter) throws IOException {
                int sum = 0;
                while (values.hasNext()) {
                    sum += values.next().get();
                }
                output.collect(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws IOException {
            JobConf conf = new JobConf(WordCount.class);
            conf.setJobName("wordcount");

            // Types of the (key, value) pairs written by the job.
            conf.setOutputKeyClass(Text.class);
            conf.setOutputValueClass(IntWritable.class);

            conf.setMapperClass(Map.class);
            conf.setReducerClass(Reduce.class);

            conf.setInputFormat(TextInputFormat.class);
            conf.setOutputFormat(TextOutputFormat.class);

            // args[0] is the input directory, args[1] is the output directory.
            FileInputFormat.setInputPaths(conf, new Path(args[0]));
            FileOutputFormat.setOutputPath(conf, new Path(args[1]));

            try {
                JobClient.runJob(conf);
            } catch (IOException e) {
                System.err.println(e.getMessage());
            }
        }
    }

This program takes two arguments: the input directory path and the output directory path. I won’t explain the details of the code above in this post. Please refer to the Apache Hadoop MapReduce tutorial if you want to know more about it.
- After we are sure that there is no error or typo, let’s build the program. To do this, right-click the WordCount project on the left side and pick Build. This step will create the JAR file of the program.
- Next, we will prepare the input for this program. We will create a folder and two text files inside the folder.
For example, if you create the input folder in your home directory, the path will be /home/username/input. Inside it, create two text files; let’s name them file01 and file02.
In the first file, type the sentence (without the quotes): “Hello world Bye world”
In the second file, type (without the quotes): “Hello Hadoop Bye Hadoop”
Actually, you can type anything you want. The two sentences are just examples. Save the files when you’re done.
- We are done with this step. Let’s go to the final step.
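If you prefer, the input folder and the two files can also be created from a small Java program instead of by hand. This is only a sketch: the class name PrepareInput and its helper methods are my own invention for illustration, and it writes to an input folder under your home directory unless you pass another path as the first argument. It sticks to java.io, so it should compile on JDK 1.6 as well.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;

// Hypothetical helper (not part of Hadoop or Karmasphere): creates the
// sample input used in this tutorial.
class PrepareInput {

    // Writes the given text to the file, replacing any previous content.
    static void writeFile(File file, String text) throws IOException {
        Writer out = new FileWriter(file);
        try {
            out.write(text);
        } finally {
            out.close();
        }
    }

    // Creates the folder if it is missing, then writes file01 and file02.
    static File prepare(File inputDir) throws IOException {
        if (!inputDir.isDirectory() && !inputDir.mkdirs()) {
            throw new IOException("Cannot create folder " + inputDir);
        }
        writeFile(new File(inputDir, "file01"), "Hello world Bye world\n");
        writeFile(new File(inputDir, "file02"), "Hello Hadoop Bye Hadoop\n");
        return inputDir;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: default to ~/input when no folder is given.
        File dir = args.length > 0
                ? new File(args[0])
                : new File(System.getProperty("user.home"), "input");
        System.out.println("Input files written to " + prepare(dir).getAbsolutePath());
    }
}
```

Run it once before the job and use the printed folder as the input path.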
Running the MapReduce job
Okay. Now we are going to run the MapReduce job locally in Netbeans. This is how it’s done.
- On the left side of the IDE, click the Services tab. Right-click on the Hadoop Jobs and pick New Job.

- Give WordCount as Job Name and select the Hadoop Job from pre-existing JAR file type. Click Next when you’re done.
- Then, browse to the JAR file we created in the previous step. Click Browse and go to your Netbeans WordCount project folder. The JAR file is located in the dist folder. If you’re using the Netbeans default settings, the JAR file will be located in /home/username/NetBeansProjects/WordCount/dist. Click Next when you’re done.
- In the step Set Job Defaults (Step 5 of 5), choose In-Process Thread (0.20.0) as the default cluster. Then, in Default Arguments, type the arguments needed by the program: in this case, the input and output directory paths. Type the input folder that we created earlier and the output folder:
/home/username/input /home/username/output
For your information, you shouldn’t create the output folder first. The program will create the folder for you. Click Finish when you’re done.
- Now, we will finally run the MapReduce job. To do this, right-click WordCount under the Hadoop Jobs list and pick Run Job…
- In the Execute Hadoop Job window, give WordCount as the Job Name and click Run.
- If your job executes successfully, there will be an output directory and inside it you’ll find a file. Inside the file you’ll find something like this:
Bye 2
Hadoop 2
Hello 2
world 2
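If you want to double-check what the job should produce before blaming your setup, the same counting logic can be tried in plain Java, without Hadoop at all. The sketch below is not the Hadoop API: LocalWordCount and count are names I made up, the tokenizing mirrors the StringTokenizer loop in the mapper, and a TreeMap stands in for the sorting of keys that Hadoop does between the map and reduce phases (an assumption that holds for plain ASCII words like the ones here).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-alone sketch of the wordcount logic (no Hadoop needed).
class LocalWordCount {

    // Splits every line on whitespace (the "map" part) and sums the
    // occurrences of each word (the "reduce" part).
    static Map<String, Integer> count(List<String> lines) {
        // TreeMap keeps the words sorted, similar to the job's sorted output.
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                String word = tokenizer.nextToken();
                Integer current = counts.get(word);
                counts.put(word, current == null ? 1 : current + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The contents of file01 and file02 from the previous step.
        List<String> lines = Arrays.asList(
                "Hello world Bye world",
                "Hello Hadoop Bye Hadoop");
        for (Map.Entry<String, Integer> entry : count(lines).entrySet()) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

With the two sample sentences it prints Bye, Hadoop, Hello and world, each with a count of 2. Note that words keep the capitalization they have in the input, so "world" and "World" would be counted separately.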
Now we’re done. If you have a question, feel free to ask me. But for your information, I’m still learning about this too. Let’s study about it together. Have a nice try and see you on the next post.
Next post: Hadoop on single-node cluster
Comments
ordinarydot at 24/01/2010 said...
is english mandatory here? >_>
well, emm, i followed the step and succeed. while i’m still a noob at these, i think i can still understand a lil’ bit, maybe because i took paralel programming subject back then :D. after skimmed the wiki article, i assume the concept basically similar with the mpi in paralel programming
btw, this is a great tutorial and very well written. keep up the good work!
oktaf at 15/04/2010 said...
hello..
i want ask you about Hadoop..
Can i develop Hadoop MapReduce job using Netbeans 6.8 in Windows Xp?
Thanks before..I hope you ask my question..
sidudun at 19/04/2010 said...
hello there..
Sorry, I just checked my site again.. ^^
I never tried it in Windows XP, but some said that it worked there..
but if u want to deploy a Hadoop MapReduce jobs, I think u only can do it on Linux..
CMIIW..
Fikri at 16/04/2010 said...
I’ve already followed the steps of your article about writing a mapreduce program with hadoop. Unfortunately mine is unsuccessful and the output folder doesn’t appear. I’m so confused. Would you help me?
sidudun at 19/04/2010 said...
Really?
Did the output console produce error message?
One thing to remember, you shouldn’t create the output folder first.. the code will do it for you..
Fikri at 19/05/2010 said...
I’m done with my quick start on the hadoop tutorial and successfully deployed my cluster with 3 PCs. By the way, can hadoop access a database file like sqlite and write to it? I want to index documents with hadoop.
sidudun at 25/05/2010 said...
@Fikri:
for storing indexed documents you can use HBase, a non relational database provided with Hadoop.
For further exploration about indexing documents for searching purposes, you can check Lucene.
Reza at 15/05/2010 said...
Hi,
I was trying this program on Windows XP and I got this problem:
Cannot run program “chmod”: CreateProcess error=2
It seems it has a problem in “chmod” command. Does anybody know how to solve this problem?
Thanks,
Reza
sidudun at 15/05/2010 said...
Hmm…
AFAIK, chmod is a command for changing file/directory access permissions in Linux/UNIX. Maybe the error exists because there is no chmod command in Windows..
It makes me wonder, where the command came from?
Fikri at 26/05/2010 said...
I’ve tried your tutorial above and I did it exactly the same like yours but there is still error. The error:
java.util.ConcurrentModificationException
at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
at java.util.AbstractList$Itr.next(AbstractList.java:343)
at com.karmasphere.studio.hadoop.executor.HadoopExecutor.getClassLoaderRoots(HadoopExecutor.java:412)
at com.karmasphere.studio.hadoop.executor.HadoopExecutor.getClassLoader(HadoopExecutor.java:490)
at com.karmasphere.studio.hadoop.executor.HadoopExecutor.getMainClass(HadoopExecutor.java:520)
at com.karmasphere.studio.hadoop.executor.HadoopExecutor.getMainMethod(HadoopExecutor.java:642)
at com.karmasphere.studio.hadoop.executor.HadoopExecutor.isInvokable(HadoopExecutor.java:771)
at com.karmasphere.studio.hadoop.executor.HadoopExecutorConfigPanel.initJobPropertyEditor(HadoopExecutorConfigPanel.java:180)
at com.karmasphere.studio.hadoop.executor.HadoopExecutorConfigPanel.<init>(HadoopExecutorConfigPanel.java:129)
at com.karmasphere.studio.hadoop.job.RunJobAction.runJob(RunJobAction.java:37)
at com.karmasphere.studio.hadoop.job.RunJobAction.runJob(RunJobAction.java:46)
at com.karmasphere.studio.hadoop.job.RunJobAction.performAction(RunJobAction.java:67)
at org.openide.util.actions.NodeAction$DelegateAction$1.run(NodeAction.java:589)
at org.netbeans.modules.openide.util.ActionsBridge.implPerformAction(ActionsBridge.java:83)
at org.netbeans.modules.openide.util.ActionsBridge.doPerformAction(ActionsBridge.java:64)
at org.openide.util.actions.NodeAction$DelegateAction.actionPerformed(NodeAction.java:585)
at javax.swing.AbstractButton.fireActionPerformed(AbstractButton.java:1995)
at javax.swing.AbstractButton$Handler.actionPerformed(AbstractButton.java:2318)
at javax.swing.DefaultButtonModel.fireActionPerformed(DefaultButtonModel.java:387)
at javax.swing.DefaultButtonModel.setPressed(DefaultButtonModel.java:242)
at javax.swing.AbstractButton.doClick(AbstractButton.java:357)
at javax.swing.plaf.basic.BasicMenuItemUI.doClick(BasicMenuItemUI.java:1223)
at javax.swing.plaf.basic.BasicMenuItemUI$Handler.mouseReleased(BasicMenuItemUI.java:1264)
at java.awt.Component.processMouseEvent(Component.java:6263)
at javax.swing.JComponent.processMouseEvent(JComponent.java:3267)
at java.awt.Component.processEvent(Component.java:6028)
at java.awt.Container.processEvent(Container.java:2041)
at java.awt.Component.dispatchEventImpl(Component.java:4630)
at java.awt.Container.dispatchEventImpl(Container.java:2099)
at java.awt.Component.dispatchEvent(Component.java:4460)
at java.awt.LightweightDispatcher.retargetMouseEvent(Container.java:4574)
at java.awt.LightweightDispatcher.processMouseEvent(Container.java:4238)
at java.awt.LightweightDispatcher.dispatchEvent(Container.java:4168)
at java.awt.Container.dispatchEventImpl(Container.java:2085)
at java.awt.Window.dispatchEventImpl(Window.java:2478)
at java.awt.Component.dispatchEvent(Component.java:4460)
[catch] at java.awt.EventQueue.dispatchEvent(EventQueue.java:599)
at org.netbeans.core.TimableEventQueue.dispatchEvent(TimableEventQueue.java:125)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:269)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:184)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:174)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:169)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:161)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:122)
sidudun at 28/05/2010 said...
From the exception type (ConcurrentModificationException), it looks like the Hadoop executor tries to modify an AbstractList while iterating over it.
What is your development environment? Did you do something else?
Fikri at 31/05/2010 said...
I used Netbeans IDE and the newest update Karmasphere that is used for hadoop 0.20.2. I don’t know what is wrong with my app. I’m sure that I’ve followed your instruction above line per line.
Can I have your YM or GTalk id?
sidudun at 02/06/2010 said...
I haven’t tried the latest Karmasphere..I’ll try it and report it here asap..
I’ve sent you my YM id, check your email. Thank you.. :D
shubham agrawal at 07/06/2010 said...
hiiiiiiiiiii…
hey during the running mapreduce job ….after step 5 which is as follows: “we will finally run the MapReduce job. To do this right-click the WordCount under the Hadoop Jobs list and pick Run Job…”
as i excuted this step a window appeared named “Hadoop deployment wordcount”
and d content is as given below…i ws nt able to go to step 6..nd no output folder create:::
Using cluster In-Process Thread (0.20.2)
Using filesystem Local Filesystem /
Preparing to execute job 2010-06-07/wordcoun-152434_J5
Creating class loader…Building configuration resources…OK.
Creating composite jar file…
Writing to /tmp/hadoop-job-2357655027006475012.jar…
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/commons-cli-2.0-SNAPSHOT.jar
Skipping /home/shubham/.netbeans/6.8/modules/ext/commons-codec-1.3.jar: it looks like a stock Hadoop jar.
Skipping /home/shubham/.netbeans/6.8/modules/ext/commons-httpclient-3.1.jar: it looks like a stock Hadoop jar.
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/commons-logging-1.1.1.jar
Skipping /home/shubham/.netbeans/6.8/modules/ext/commons-net-1.4.1.jar: it looks like a stock Hadoop jar.
Skipping /home/shubham/.netbeans/6.8/modules/ext/oro-2.0.8.jar: it looks like a stock Hadoop jar.
Skipping /home/shubham/.netbeans/6.8/modules/ext/log4j-1.2.15.jar: it looks like a stock Hadoop jar.
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/jets3t-0.7.1.jar
Skipping /home/shubham/.netbeans/6.8/modules/ext/xmlenc-0.52.jar: it looks like a stock Hadoop jar.
Skipping /home/shubham/.netbeans/6.8/modules/ext/hadoop-0.20.2-core.jar: it looks like a stock Hadoop jar.
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/hadoop-0.20.2-streaming.jar
Adding standard JAR library lib/hadoop-0.20-karmasphere-extras.jar
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/karmasphere-client.jar
Aggregating configuration file META-INF/vfs-providers.xml
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/karmasphere-client-amazon.jar
Aggregating configuration file META-INF/vfs-providers.xml
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/jsch-0.1.42-patched.jar
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/commons-beanutils-core-1.8.2.jar
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/commons-lang-2.4.jar
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/commons-io-1.4.jar
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/commons-vfs-1.0.jar
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/collections-generic-4.01.jar
Merging JAR library /home/shubham/.netbeans/6.8/modules/ext/jsr305.jar
Merging Hadoop JAR library /home/shubham/NetBeansProjects/WordCoun/dist/WordCoun.jar
Adding aggregated configuration file META-INF/vfs-providers.xml
Writing done. File size is 3M
Adding CompositeJar file /tmp/hadoop-job-2357655027006475012.jar to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-cli-2.0-SNAPSHOT.jar!/[MERGE] to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-codec-1.3.jar!/ to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-httpclient-3.1.jar!/ to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-logging-1.1.1.jar!/[MERGE] to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-net-1.4.1.jar!/ to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/oro-2.0.8.jar!/ to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/log4j-1.2.15.jar!/ to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/jets3t-0.7.1.jar!/[MERGE] to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/xmlenc-0.52.jar!/ to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/hadoop-0.20.2-core.jar!/ to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/hadoop-0.20.2-streaming.jar!/[MERGE] to roots.
Adding ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/hadoop-0.20-karmasphere-extras.jar!/ to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/karmasphere-client.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/karmasphere-client-amazon.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/jsch-0.1.42-patched.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-beanutils-core-1.8.2.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-lang-2.4.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-io-1.4.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/commons-vfs-1.0.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/collections-generic-4.01.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/.netbeans/6.8/modules/ext/jsr305.jar!/[MERGE] to roots.
Not adding MERGE ClassPath entry jar:file:/home/shubham/NetBeansProjects/WordCoun/dist/WordCoun.jar!/[INSPECT_JAVA, INSPECT_HADOOP, MERGE] to roots.
plz reply me comment either on my email id or on same website….
its very important 4 me..plz do reply as soon as possible…..
thnx a lot..this tutorial hrlped me a lot for installing and connecting d haddop nd netbeans….thnx
do reply
sidudun at 12/06/2010 said...
Hmm..
I can suggest you some quick solution, I dunno if it will work or not..
first, check your output folder permission, is it permissible to create a new directory?
second, from the log of your program, I see that you named the project WordCoun, not WordCount.. I dunno if this matters, but try changing it.
If the problem still persist, tell me more about it..
Gary at 11/06/2010 said...
Hello.. I’m testing this codes but I’m facing some error also.
You got any idea whats going on? Or we can discuss this using IM?
My specification is:
I’m running it on ubuntu using vmware (I’m using xp)
I’m using hadoop 0.20.2
Thanks for your help in advance first.
10/06/10 21:03:26 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
10/06/10 21:03:28 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/06/10 21:03:29 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/06/10 21:03:30 INFO mapred.FileInputFormat: Total input paths to process : 2
10/06/10 21:03:31 INFO mapred.FileInputFormat: Total input paths to process : 2
10/06/10 21:03:31 INFO mapred.JobClient: Running job: job_local_0001
10/06/10 21:03:31 INFO mapred.MapTask: numReduceTasks: 1
10/06/10 21:03:31 INFO mapred.MapTask: io.sort.mb = 100
10/06/10 21:03:32 WARN mapred.LocalJobRunner: job_local_0001
java.lang.OutOfMemoryError: Java heap space
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:781)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:350)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
10/06/10 21:03:32 INFO mapred.JobClient: map 0% reduce 0%
10/06/10 21:03:32 INFO mapred.JobClient: Job complete: job_local_0001
10/06/10 21:03:32 INFO mapred.JobClient: Counters: 0
Job failed!
BUILD SUCCESSFUL (total time: 9 seconds)
sidudun at 12/06/2010 said...
I’m sorry..
Like I said earlier, I haven’t tried it on the latest version of Karmasphere and Hadoop. Maybe there are some changes in the latest version that I don’t know about..
I’ll try this on the latest version, and I’ll report you again as soon as possible.
Cheers..
papu at 24/08/2010 said...
Hi,
I am also getting the same issue and poking around to get a solution.I changed all memory configuration for netbean without any result.
Could anybody please help.
Thanks in advance.
Joseph at 18/07/2010 said...
thank you for this tutorial. it worked perfectly. however, i copied the generated WordCount.jar to my hadoop cluster but i could not run it over there. please can you give an insight on how to run a mapreduce program (ie a jar file ) in hadoop. i have only been running the examples from hadoop banchmark but i do not know how to import my program and run in my cluster. sorry i’m still a newbie in this technology. please i will appreciate it more if you can send a response to my email. thank you.
Joseph
Joseph at 18/07/2010 said...
here is an extract of the error i was getting.
As you can see from the jps command, my hadoop is running ok.
hadoop@ubuntu:/usr/local/hadoop/hadoop$ jps
4032 TaskTracker
4195 Jps
3509 NameNode
3865 JobTracker
3654 DataNode
3807 SecondaryNameNode
then i tried to run the wordcount jar file, after copying it to another location in my local system. ie /home/hadoop/the jar file
however, when i try to run it, this is the error i got
hadoop@ubuntu:/usr/local/hadoop/hadoop$ bin/hadoop jar /home/hadoop/WordCount.jar WordCount /home/hadoop/input /home/hadoop/output
10/07/18 11:43:28 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
Input path does not exist: hdfs://localhost:54310/user/hadoop/WordCount
please can anyone assist me.
sidudun at 19/07/2010 said...
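From the error message, I assume that the program can’t find your input path..
It is looking for the input in HDFS (hdfs://localhost:54310/user/hadoop/WordCount in your error message), and that path doesn’t exist yet. Maybe you should make sure that the input directory is available in HDFS. Also, the path ends in WordCount, so it looks like the class name was taken as the first argument; if the JAR already declares its main class, try leaving WordCount out of the command.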
CMIIW
VB NikAM at 14/08/2010 said...
it lokks to be very helpful, i was searching for this only. very valuable tutorial. I will try & at the eariest i will share my experience.
I have a query……
can I simulate a Hadoop cluster on 1 machine(laptop/desktop) or have to have a cluster of machines?
if cluster is required anyway, can u direct for the tutorial to build the cluster to run hadoop mapreduce projects on it.
I will be thankfull a lot.
sidudun at 19/08/2010 said...
Dear NikAM,
you can simulate Hadoop on a single machine. I’ve written a tutorial about it based on Michael Noll’s post.
For tutorial how to build hadoop cluster, you can read Michael Noll’s post.
vijayalakshmi at 26/08/2010 said...
Good Morning sir
i am doing final year mtech. want to do project in mapreduce using hadoop tool. want to clarify some doubts with you sir.
i tried what you said in this tutorial its excellent working. i was tried upto running of wordcount program it was working. but i want to know how to do a small application using hadoop tool that means using jsp or using swing concepts to develop user interface for an application to developing project.
if you have idea pls give me some guidelines sir i am in need sir pls. i hope will get reply from you sir.
sidudun at 27/08/2010 said...
I never tried it before, but one thing that came to mind is that you can execute shell command on your Java class.
I googled and found some readings that I hope can help you:
Stackoverflow
DZone Snippets
Oracle Sun Forum
and many more…
If you succeed, it would be nice for you to write an article about it on your blog and tell me about it..