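Hadoop is written in Java, so every HDFS operation can be invoked through the Java API. The most commonly used entry point is the FileSystem class, which is provided by the org.apache.hadoop.fs package.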
I. Reading File Contents
1. Reading HDFS file contents with java.net.URL
import java.io.InputStream;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class main {
    static {
        // Let the JVM recognize hdfs:// URLs. The factory can be set only once per JVM.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void main(String[] args) throws Exception {
        InputStream in = null;
        try {
            // Open a data stream on the HDFS file through a java.net.URL object
            in = new URL("hdfs://192.168.2.50:8020/user/hadoop/outp89/part-r-00000").openStream();
            // Copy to stdout with a 4096-byte buffer; false leaves closing the streams to the caller
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
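The static block above relies on a general java.net.URL mechanism rather than anything HDFS-specific. The following is a minimal sketch of that mechanism using only the JDK: the "demo" scheme, the DemoHandlerFactory class, and the in-memory response are all hypothetical stand-ins for what FsUrlStreamHandlerFactory does for hdfs:// URLs, not Hadoop's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.nio.charset.StandardCharsets;

class UrlSchemeDemo {

    // Hypothetical factory: answers the made-up "demo" scheme from memory.
    static final class DemoHandlerFactory implements URLStreamHandlerFactory {
        @Override
        public URLStreamHandler createURLStreamHandler(String protocol) {
            if (!"demo".equals(protocol)) {
                return null; // let the JDK fall back to its built-in handlers
            }
            return new URLStreamHandler() {
                @Override
                protected URLConnection openConnection(URL url) {
                    return new URLConnection(url) {
                        @Override
                        public void connect() {
                            connected = true; // nothing to connect to in this sketch
                        }

                        @Override
                        public InputStream getInputStream() {
                            String body = "contents of " + getURL().getPath();
                            return new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8));
                        }
                    };
                }
            };
        }
    }

    static {
        // Same call as in the HDFS example; a second call in the same JVM throws an Error.
        URL.setURLStreamHandlerFactory(new DemoHandlerFactory());
    }

    // Opens the URL and returns everything the stream yields as a string.
    static String read(String spec) {
        try (InputStream in = new URL(spec).openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(read("demo://host/user/hadoop/example.txt"));
    }
}
```

Because the factory can be installed only once per JVM, the URL approach is awkward inside applications that already register their own factory; that is one reason the FileSystem class is usually the more practical choice.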
2. Writing a SequenceFile
SequenceFile is a binary file format supported by the HDFS API. It serializes <Key,Value> pairs directly into the file.
package hadooptest2;

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class SequenceFileWriter {
    private static final String[] text = {
        "不忘初心",
        "砥砺前行",
        "只是测试",
    };

    public static void main(String[] args) throws Exception {
        String uri = "hdfs://192.168.2.50:8020/user/hadoop/testseq";
        Configuration conf = new Configuration();
        SequenceFile.Writer writer = null;
        try {
            FileSystem fs = FileSystem.get(URI.create(uri), conf);
            Path path = new Path(uri);
            // IntWritable is Hadoop's Writable wrapper for int
            IntWritable key = new IntWritable();
            Text value = new Text();
            // The writer must be told the key and value types up front.
            // Note: this overload is deprecated in Hadoop 2.x and later in favor of
            // createWriter(conf, SequenceFile.Writer.Option...).
            writer = SequenceFile.createWriter(fs, conf, path, key.getClass(), value.getClass());
            for (int i = 0; i < 100; i++) {
                // In this demo the keys run from 100 down to 1
                key.set(100 - i);
                // In this demo the values cycle through text[i % text.length]
                value.set(text[i % text.length]);
                writer.append(key, value);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeStream(writer);
        }
    }
}
Inspect the written file:
[root@TJ1-000 ~]# hdfs dfs -text /user/hadoop/testseq
17/03/28 09:31:05 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
17/03/28 09:31:05 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
100 不忘初心
99 砥砺前行
98 只是测试
97 不忘初心
96 砥砺前行
95 只是测试
94 不忘初心...
4 不忘初心
3 砥砺前行
2 只是测试
1 不忘初心
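To make the idea of serializing <Key,Value> pairs into a binary file more concrete, here is a simplified sketch that uses only the JDK. It writes the same 100 records as the demo (int key, string value) into an in-memory byte array and reads them back. The KeyValueRecordsDemo class and its record layout are assumptions made for illustration; a real SequenceFile additionally carries a header, sync markers, and optional compression, so this is not its on-disk format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class KeyValueRecordsDemo {
    // Same sample values as the SequenceFileWriter demo.
    static final String[] TEXT = {"不忘初心", "砥砺前行", "只是测试"};

    // Serializes `count` records: a 4-byte int key followed by a length-prefixed string value.
    static byte[] write(int count) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (int i = 0; i < count; i++) {
                out.writeInt(count - i);             // key: count down to 1
                out.writeUTF(TEXT[i % TEXT.length]); // value: cycles through TEXT
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads records back in the order they were written, as "key<TAB>value" lines.
    static List<String> read(byte[] data) {
        List<String> records = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (in.available() > 0) {
                int key = in.readInt();
                String value = in.readUTF();
                records.add(key + "\t" + value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        read(write(100)).forEach(System.out::println);
    }
}
```

The reader has to know the key and value types in advance to decode the bytes, which is the same reason SequenceFile.createWriter asks for the key and value classes.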