Serializing keys and values in Hadoop involves two important interfaces: Writable and WritableComparable.
Writable declares two methods:

void write(DataOutput out) throws IOException;
void readFields(DataInput in) throws IOException;

That is, how to write the object's fields out and how to read them back in.
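As a sketch of that contract, the hypothetical PointWritable below implements write and readFields using only java.io, so it runs standalone; in a real job the class would implement org.apache.hadoop.io.Writable instead.

```java
import java.io.*;

// Hypothetical value type following the Writable contract:
// serialize with write(), deserialize with readFields().
class PointWritable {
    private int x;
    private int y;

    PointWritable() {}  // a no-arg constructor is required so the framework can instantiate it
    PointWritable(int x, int y) { this.x = x; this.y = y; }

    public void write(DataOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    public void readFields(DataInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }

    public int getX() { return x; }
    public int getY() { return y; }
}

public class WritableDemo {
    public static void main(String[] args) throws IOException {
        // Round-trip: write the object to a byte buffer, then read it back.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        new PointWritable(3, 7).write(new DataOutputStream(buf));

        PointWritable copy = new PointWritable();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(copy.getX() + "," + copy.getY());  // 3,7
    }
}
```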
WritableComparable combines Writable with Java's Comparable:

public interface WritableComparable<T> extends Writable, Comparable<T> { }

where Comparable is:

public interface Comparable<T> { public int compareTo(T o); }
So it adds one method, compareTo, beyond Writable. The framework uses compareTo to sort keys (and to group equal keys together) during the shuffle, which leads to the following rule: a Hadoop key type must implement WritableComparable, while a value type only needs to implement Writable. Anything that can serve as a key can also serve as a value, but not every value type can serve as a key.
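A key type adds compareTo on top of the two Writable methods. The WordKey class below is a hypothetical, self-contained sketch mirroring the WritableComparable contract with plain java.io and java.util (the real interface is org.apache.hadoop.io.WritableComparable); the demo sorts keys the way the shuffle's sort phase would, via compareTo.

```java
import java.io.*;
import java.util.*;

// Hypothetical key type: the Writable methods plus compareTo,
// mirroring the WritableComparable contract.
class WordKey implements Comparable<WordKey> {
    private String word = "";

    WordKey() {}  // no-arg constructor for framework instantiation
    WordKey(String word) { this.word = word; }

    public void write(DataOutput out) throws IOException {
        out.writeUTF(word);
    }

    public void readFields(DataInput in) throws IOException {
        word = in.readUTF();
    }

    @Override
    public int compareTo(WordKey other) {
        return word.compareTo(other.word);  // defines the sort order of keys
    }

    @Override
    public String toString() { return word; }
}

public class KeyDemo {
    public static void main(String[] args) {
        // The shuffle sorts keys by compareTo before they reach the reducer.
        List<WordKey> keys = new ArrayList<>(
            Arrays.asList(new WordKey("pear"), new WordKey("apple"), new WordKey("fig")));
        Collections.sort(keys);
        System.out.println(keys);  // [apple, fig, pear]
    }
}
```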
Common WritableComparable implementations:

org.apache.hadoop.io.NullWritable
org.apache.hadoop.io.BooleanWritable
org.apache.hadoop.io.BytesWritable
org.apache.hadoop.io.DoubleWritable
org.apache.hadoop.io.FloatWritable
org.apache.hadoop.io.IntWritable
org.apache.hadoop.io.LongWritable
org.apache.hadoop.io.MD5Hash
org.apache.hadoop.io.Text
org.apache.hadoop.io.UTF8 (deprecated; use Text instead)
org.apache.hadoop.io.VIntWritable
org.apache.hadoop.io.VLongWritable
Common Writable implementations (besides the above):

org.apache.hadoop.io.TwoDArrayWritable
org.apache.hadoop.io.SortedMapWritable
org.apache.hadoop.io.ObjectWritable
org.apache.hadoop.io.MapWritable
org.apache.hadoop.io.ArrayWritable