Hadoop WordCount example stuck at map 100% reduce 0%

First of all, open up your job tracker and look at the number of free reducer slots and at the other running jobs. Is another job consuming all the free reducer slots as they become available?
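As a rough illustration of that first check, here is a minimal Python sketch of the arithmetic. It does not talk to Hadoop at all: the function name `free_reduce_slots`, the job names, and the numbers are hypothetical, standing in for values you would read by hand off the job tracker page.

```python
def free_reduce_slots(reduce_capacity, running_reducers_per_job):
    """Return how many reducer slots are left once running jobs are counted.

    reduce_capacity: total reducer slots in the cluster (read off the job tracker).
    running_reducers_per_job: mapping of job name -> reducers it is currently running.
    """
    occupied = sum(running_reducers_per_job.values())
    # Clamp at zero so a stale or inconsistent reading never goes negative.
    return max(reduce_capacity - occupied, 0)


# Hypothetical reading: 8 slots in total, two other jobs holding all of them.
other_jobs = {"job_A": 6, "job_B": 2}
print(free_reduce_slots(8, other_jobs))  # prints 0
```

If the result is 0 and none of the occupied slots belong to your job, your reducers are simply waiting for another job to release a slot.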

Once you’ve proved to yourself that there are free reducer slots available to run a reducer for your job, locate your job in the job tracker web UI and click on it to open it up. You should now be able to see the number of completed mappers; make sure it shows no running mappers. The % complete in the console sometimes lies: a mapper that is still committing can report 100% while having a problem finalizing.

Once you’re satisfied that all your mappers have finished, look at the number of running reducers. Does it show 0? If it shows that some are running, click on the number of running reducers to bring up the running reducers page, then click through on an instance until you get an option to view the logs for the reducer. You’ll want to view all the logs for this reducer (not just the first / last 100k). This should tell you what your reducer is actually doing, most probably trying to copy the results from the mappers to the reducer node. I imagine this is where your problem is, either network or disk space, but either way, Hadoop should eventually fail the reducer instance and reschedule it to run on another node.
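The three checks above can be summed up as a small decision procedure. The Python sketch below is only an illustration of that reasoning; the function `triage` and its return labels are made up for this example, and the three counts are assumed to be read manually from the job tracker pages rather than fetched through any Hadoop API.

```python
def triage(free_reduce_slots, running_mappers, running_reducers):
    """Map the counts seen in the job tracker to the next thing to look at."""
    # No reducer of ours is running and no slot is free: another job holds them.
    if running_reducers == 0 and free_reduce_slots == 0:
        return "wait-for-slots"
    # The console may say map 100% while a mapper is still committing.
    if running_mappers > 0:
        return "check-mappers"
    # Reducers are running: read a reducer's full log to see what it is doing
    # (most likely copying map output, so suspect network or disk space).
    if running_reducers > 0:
        return "read-reducer-logs"
    # Slots are free, mappers are done, yet nothing is running:
    # this case is not covered by the checks above.
    return "not-covered"


# Hypothetical reading: mappers finished, one reducer running, slots still free.
print(triage(free_reduce_slots=3, running_mappers=0, running_reducers=1))
# prints read-reducer-logs
```

The slot check is tied to `running_reducers == 0` on purpose: if your own reducers are the ones occupying the slots, zero free slots is not the problem.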
