The DZone Effects

Two months ago, I wrote a simple tutorial on how to create a Hadoop MapReduce program using Netbeans. I didn’t have the slightest clue that this post would change my life. Okay, I’m exaggerating. I mean the post changed the history of this blog.

The first day after I wrote the post, nothing special happened. But the day after, I was shocked when I checked this blog’s stats. This blog usually has about 15-25 visitors a day, so I was amazed when I saw 90 visitors that day. I started to investigate which post contributed the most traffic, and I found out that it was the Hadoop in Netbeans post. I noticed that all of the traffic came from one site called DZone. DZone? I had never heard of that site before. When I looked into it, I found that it’s a cool bookmarking site for developers around the globe. And someone, whom I later came to know as mitchp, had shared my post on the site.

The magic continued the next day. My traffic kept increasing. Later that day, I got an email from my hosting provider:

The domain has reached 80% of its bandwidth limit (807.50/1000.00 Megs).

Well, my bandwidth limit was just 1 GB a month. So I double-checked the hosting package and found that my bandwidth limit should be 2 GB. I contacted my hosting provider to make sure that they hadn’t made a mistake. They said that my bandwidth should be 2 GB and that they would resolve it as soon as possible. In the meantime, I installed WP Super Cache to keep my site from going down because of the high traffic and low bandwidth limit.

The magic ended. The peak came on the third day, at nearly 200 visitors. After that, the traffic declined and found its equilibrium state. But this state is higher than my average traffic before: my average is now about 20-35 visitors a day. Not bad, huh?

And now, I’m a fan of DZone. Some days ago, I found out that I’m not the only one who has had this kind of experience. Yeah, of course. Jordi Cabot wrote on his blog about his DZone traffic experience:

Unfortunately, a completely different story is the mid and long-term impact. By this I mean the number of people that discovered my portal thanks to the link and that has become a frequent visitor of the site since then. This is very difficult to assess (there is no way to know if a new subscriber originally discover your site thanks to the DZone link or it is just a temporal coincidence that he/she joined the site around those dates) but if we look at the increase in the number of subscribers to the RSS portal feeds , my twitter account or the daily visits to the site, my estimation is that only a 2-3% of the original DZone visitors has converted into new portal followers.

I second that. Maybe it’s just a sweet temporal coincidence that my traffic grew above the average. But one thing I can learn from this experience is that if you want a high amount of traffic, you should write good posts regularly. And I hope I can do that.

Have you had the same experience?

Yet Another Introduction to MapReduce (part 2)

I’m sorry for the long delay since the first part. I’ve been pretty busy lately. In this part, I write about the idea of MapReduce, how it works, and how it distributes data and processing. This article draws heavily on the MapReduce paper by Google. I’m writing it up again to deepen my knowledge of the concept. Enjoy!

What is MapReduce?

According to Wikipedia, MapReduce is a software framework patented by Google to support distributed computing on large data sets on clusters of computers. The framework was presented by Jeffrey Dean and Sanjay Ghemawat at OSDI’04: Sixth Symposium on Operating Systems Design and Implementation, in December 2004. The main idea is to use functional programming techniques to simplify processing in a distributed environment.

MapReduce processes data using the list concept commonly used in functional programming. The process consists of two functions, map and reduce. Each function takes a list of input elements and produces a list of output elements. The map function takes its inputs and produces intermediate key-value pairs. These pairs are then sent to the reduce function, which takes them as its input. Then, for each intermediate key, the reduce function merges the associated values together to produce the output. According to the paper, each reduce invocation typically produces zero or one output value.
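To make the map and reduce roles concrete, here is a minimal single-machine sketch in Python. It is not Google's or Hadoop's implementation: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are made up for this illustration, and the grouping step that a real framework performs across many machines is reduced here to an in-memory dictionary. Word counting is used as the example task.

```python
from collections import defaultdict


def map_fn(_key, line):
    """Map: emit an intermediate (word, 1) pair for every word in the line."""
    for word in line.split():
        yield word.lower(), 1


def reduce_fn(word, counts):
    """Reduce: merge all values that share one intermediate key."""
    yield word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Hypothetical driver: run map, group by intermediate key, then reduce."""
    # Map phase: collect every intermediate value under its key.
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)

    # Reduce phase: one reducer call per distinct intermediate key.
    results = []
    for out_key in sorted(groups):
        results.extend(reducer(out_key, groups[out_key]))
    return results


# Input records are (line number, line text) pairs, chosen for illustration.
records = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "the fox")]
result = run_mapreduce(records, map_fn, reduce_fn)
print(result)
```

Each call to `reduce_fn` here yields exactly one pair, which matches the "zero or one output value" remark above. Swapping in a different mapper and reducer changes the job without touching the driver.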

Yet Another Introduction to MapReduce (part 1)

There are so many articles out there about what MapReduce is, the basic concepts behind it, how it works, and many other things. Even so, I still want to write a little introduction to MapReduce. It’s mandatory, at least for me, to write about “something” in order to understand the “something”. I’m challenging my understanding of MapReduce in this post. I’ll use some resources available in the cloud, as I mentioned earlier. This is just another introduction to MapReduce.

Data, Data, Data

We are living in the cloud era. The Internet provides us with great resources to help our lives. In the process, we have created a lot of data. Consider a search engine like Google or Bing. They index sites all across the network. If we are talking about the number of sites these days, that’s a big number. Netcraft reported that there are more than 200 million sites in the world. It means a search engine must process and analyze a lot of data.

Programming Hadoop in Netbeans

Note: It seems that Netbeans is no longer supported by Karmasphere Studio. For programming Hadoop in Eclipse, you can read about it here.

Hadoop MapReduce is an open source implementation of the MapReduce programming model for processing large-scale data in a distributed environment. Hadoop is implemented in Java as a class library. There are several distributions of Hadoop, from Apache, Cloudera, and Yahoo!

Meanwhile, Netbeans is an integrated development environment (or IDE) for programming in Java and many other programming languages. Netbeans (like any other IDE) helps programmers develop applications more easily and as painlessly as possible with its features. In this case, it helps us develop Hadoop MapReduce jobs.

In this post, I’ll show you step by step how to use Netbeans to develop a Hadoop MapReduce job. I’m using Netbeans 6.8 on the Ubuntu Karmic Koala distribution. The MapReduce program we are going to create here is a simple program called wordcount. This program reads the text in some files and lists every word along with how many times it appears across all the files. The source code of this program is available in the MapReduce tutorials packaged with the Apache Hadoop distribution.
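Before touching Hadoop, it can help to see what output wordcount should produce. The following plain-Python sketch is not the Hadoop job itself, only a small reference for checking results. The function name `word_count`, the file names, and the sample sentences are all made up for this illustration, and it assumes words are split on whitespace only, with no case folding or punctuation handling.

```python
import tempfile
from collections import Counter
from pathlib import Path


def word_count(paths):
    """Return a Counter mapping each word to its total count across all files."""
    totals = Counter()
    for path in paths:
        # Assumption: split on whitespace only, keep case and punctuation as-is.
        totals.update(Path(path).read_text(encoding="utf-8").split())
    return totals


def demo():
    """Write two small sample files to a temporary directory and count them."""
    with tempfile.TemporaryDirectory() as tmp:
        first = Path(tmp) / "file01.txt"
        second = Path(tmp) / "file02.txt"
        first.write_text("Hello World Bye World", encoding="utf-8")
        second.write_text("Hello Hadoop Goodbye Hadoop", encoding="utf-8")
        return word_count([first, second])


counts = demo()
for word in sorted(counts):
    print(f"{word}\t{counts[word]}")
```

Running the MapReduce job over the same input files and comparing its output against this reference is a cheap sanity check, as long as the job tokenizes words the same way.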

We have divided this tutorial into three steps. First, we will install Karmasphere Studio for Hadoop, a Netbeans extension. Then, we will type some code. And finally, we will run the MapReduce job in Netbeans. Okay, fasten your seat belt. Here we go.

College Students, Your Job Opportunities are in Danger

On my college department’s mailing list, there is an interesting discussion about the quality of IT bachelor’s graduates in the workplace. There are some reasons behind it:

  • Bachelor’s graduates lack practical skills. They cannot answer fundamental questions that every IT or computer science graduate should know.
  • Bachelor’s graduates also lack soft skills, like how to speak with higher-ups and communicate with other workers.

As a result, companies prefer to hire vocational IT graduates. Why?

  • A vocational graduate sometimes has practical skills that a bachelor’s graduate doesn’t have. Computer science and IT knowledge is widespread, which means you don’t have to go to college just to learn how to program. It’s all over the cloud, so the learning materials are within everyone’s reach.
  • Vocational graduates are easier to manage. Some of them show more respect to higher-ups than bachelor’s graduates do.
  • The standard salary for vocational graduates is lower than for bachelor’s graduates. Combine this with better skills and more respect, and bachelor’s graduates’ job opportunities are in grave danger.
