1. Install the plugin
Eclipse Standard 4.3.1
hadoop-eclipse-plugin-1.2.1.jar
2. Open the Map/Reduce perspective
Window -> Open Perspective -> Other... -> Map/Reduce
3. Add a Map/Reduce environment
At the bottom of Eclipse, next to the Console, a new tab named "Map/Reduce Locations" appears. Right-click in the blank area under it and choose "New Hadoop location...", as shown in the figure:
Fill in the following fields in the dialog that pops up:
Location name (any name you like)
Map/Reduce Master (the JobTracker's IP and port, taken from the mapred.job.tracker property in mapred-site.xml)
DFS Master (the NameNode's IP and port, taken from the fs.default.name property in core-site.xml)
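These two fields must match the cluster's own configuration files. A minimal sketch of the relevant entries, assuming the NameNode address used later in step 6.2 (hdfs://192.168.76.2:9000) and the commonly used JobTracker port 9001 (an assumption; check your own mapred-site.xml):

<!-- core-site.xml: source of the DFS Master field -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://192.168.76.2:9000</value>
</property>

<!-- mapred-site.xml: source of the Map/Reduce Master field -->
<property>
  <name>mapred.job.tracker</name>
  <value>192.168.76.2:9001</value>
</property>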
4. Modify HDFS contents from Eclipse
After the previous step, the configured HDFS should appear in the "Project Explorer" on the left. Right-clicking it lets you create folders, delete folders, upload files, download files, delete files, and so on.
Note: changes are not shown in Eclipse immediately after each operation; you have to refresh the view.
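The same operations can also be done in code through the HDFS Java API. A minimal sketch, assuming the NameNode address used elsewhere in this post and hypothetical file paths:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsOps {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Assumed NameNode address; must match the DFS Master configured above
        conf.set("fs.default.name", "hdfs://192.168.76.2:9000");
        FileSystem fs = FileSystem.get(conf);

        fs.mkdirs(new Path("/user/grid/in"));                        // create a folder
        fs.copyFromLocalFile(new Path("/tmp/words.txt"),             // upload a local file
                new Path("/user/grid/in/words.txt"));
        fs.copyToLocalFile(new Path("/user/grid/in/words.txt"),      // download a file
                new Path("/tmp/words-copy.txt"));
        fs.delete(new Path("/user/grid/in/words.txt"), false);       // delete a file
        fs.close();
    }
}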
5. Create a MapReduce project
5.1 Configure the Hadoop path
Here /User/Libo/Downloaders/eclipse/hadoop-1.2.1 is a copy of the Hadoop installation package; set it as the Hadoop installation directory in Window -> Preferences -> Hadoop Map/Reduce.
5.2 Create the project
File -> New -> Project, select "Map/Reduce Project", enter a project name, and create the project. The plugin automatically imports all of the JARs from the Hadoop root directory and its lib directory.
5.3 Create a Mapper or Reducer
File -> New -> Mapper creates a Mapper that automatically extends MapReduceBase from the mapred package and implements the Mapper interface.
Note: the classes and interfaces the plugin generates come from the old mapred package; a Mapper based on the new API has to be written by hand (the generated old-API skeleton looks roughly like the sketch below).
The same applies to Reducer.
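For reference, a skeleton along the lines of what the wizard produces, using the old mapred API (the class name and type parameters here are just placeholders):

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Old-API Mapper: extends MapReduceBase and implements the mapred Mapper interface
public class MyMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        // TODO: emit key/value pairs with output.collect(outKey, outValue)
    }
}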
6. Run the WordCount program in Eclipse
6.1 Import the code
package WordCount;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: splits each input line into tokens and emits (word, 1)
    public static class TokenizerMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    // Reducer: sums the counts emitted for each word
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: wordcount <in> <out>");
            System.exit(2);
        }
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setReducerClass(IntSumReducer.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
6.2 Configure the run arguments
In Run -> Run Configurations..., on the Arguments tab, set the program arguments to hdfs://192.168.76.2:9000/user/grid/in hdfs://192.168.76.2:9000/user/grid/out1, where 192.168.76.2 is the host where Hadoop is deployed; in is the HDFS input directory and out1 is the output directory.
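The input directory must already exist and contain the files to be counted (the job log below shows "Total input paths to process : 4"). A sketch of how it could be populated, assuming a hypothetical local file words.txt:

[grid@h1 hadoop-1.2.1]$ hadoop fs -mkdir /user/grid/in
[grid@h1 hadoop-1.2.1]$ hadoop fs -put words.txt /user/grid/in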
6.3 Errors encountered during execution
2013-09-30 15:34:10.219 java[2477:1203] Unable to load realm info from SCDynamicStore
13/09/30 15:34:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/09/30 15:34:10 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/09/30 15:34:10 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/09/30 15:34:10 INFO input.FileInputFormat: Total input paths to process : 4
13/09/30 15:34:10 WARN snappy.LoadSnappy: Snappy native library not loaded
13/09/30 15:34:11 INFO mapred.JobClient: Running job: job_local1825526799_0001
13/09/30 15:34:11 WARN mapred.LocalJobRunner: job_local1825526799_0001
org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=LiBo, access=WRITE, inode="grid":grid:supergroup:rwxr-xr-x
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1459)
at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:362)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1161)
at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:52)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:319)
The "Permission denied" message confirms this is an HDFS read/write permission problem: the job runs as the local user LiBo, who has no write access to /user/grid, which is owned by the grid user.
Change the HDFS permissions:
[grid@h1 hadoop-1.2.1]$ hadoop fs -chmod -R 777 .
After that, the program runs normally.
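As a side note (not the approach taken here), on a pure development cluster HDFS permission checking can also be switched off entirely in hdfs-site.xml, which avoids the wide-open chmod 777:

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>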