Using map reduce to find max and min temperature for a data set – lab 1

Hi Readers,

In this post I show how to use MapReduce to find the maximum and minimum temperature in a data set.

I have a data set file (say temperaturedata.txt) that I would like to analyze using MapReduce.

I have written a MapReduce program to find the max and min temperature of the year.
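To make the idea concrete, here is a minimal plain-Java sketch of the map and reduce logic such a program might use. It is not the program from this post and it does not use the Hadoop API, so it runs without a cluster. The class name MaxMinTempSketch, the assumed line format "<year> <temperature>" and the sample values are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: simulates the map and reduce phases in plain Java.
class MaxMinTempSketch {

    // "Map" step: parse one line and group the temperature under its year.
    // Assumed (not from the original data set) line format: "<year> <temperature>".
    static void map(String line, Map<String, List<Integer>> grouped) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 2) {
            return; // skip blank or malformed lines
        }
        String year = fields[0];
        int temperature = Integer.parseInt(fields[1]);
        grouped.computeIfAbsent(year, k -> new ArrayList<>()).add(temperature);
    }

    // "Reduce" step: for one year, return {max, min} over all its temperatures.
    static int[] reduce(List<Integer> temperatures) {
        int max = Integer.MIN_VALUE;
        int min = Integer.MAX_VALUE;
        for (int t : temperatures) {
            max = Math.max(max, t);
            min = Math.min(min, t);
        }
        return new int[] {max, min};
    }

    // Runs map over every line, then reduce over every year.
    static Map<String, int[]> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            map(line, grouped);
        }
        Map<String, int[]> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            result.put(entry.getKey(), reduce(entry.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample records, only for demonstration.
        List<String> sample = Arrays.asList("2014 31", "2014 -4", "2015 27", "2015 12");
        run(sample).forEach((year, maxMin) ->
            System.out.println(year + "\tmax=" + maxMin[0] + "\tmin=" + maxMin[1]));
    }
}
```

In a real Hadoop job the grouping by year is done by the framework between the map and reduce phases; the TreeMap here only stands in for that shuffle step.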


The data set is on my Linux file system, so I first need to copy it to the Hadoop file system (HDFS).

hdfs dfs -copyFromLocal /home/cloudera/Downloads/temperaturedata.txt /jpraveen/temperaturedata.txt

The file is now in the HDFS directory /jpraveen/ (you can check with hdfs dfs -ls /jpraveen/).

Next, package your MapReduce program into a jar file and run it using the command below.

hadoop jar MaxMinTemp.jar /jpraveen/temperaturedata.txt ~/output1


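The two arguments on that command line are picked up by the following calls in the main method of your program:

 FileInputFormat.setInputPaths(job, new Path(args[0])) receives the input path, /jpraveen/temperaturedata.txt.
 FileOutputFormat.setOutputPath(job, new Path(args[1])) receives the output path, ~/output1. Note that the shell expands ~ to the local user's home directory before Hadoop sees it, so the job actually writes to that path in HDFS (for example /root/output1 when the command is run as root).

JobClient.runJob(job) submits the MapReduce job and waits for it to finish.

You can also track the job status, which is reported on the console while the MapReduce job is running.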


The output of the above job can be seen using the command below or through the file browser.

hdfs dfs -cat /root/output1/part-00000


