Hadoop MapReduce: MapTask in Detail

1. Background:

After the Driver class we wrote submits the job, the default FileInputFormat.getSplits() method computes the input splits, and the split information, the job configuration, and the job jar are uploaded to a staging directory. YARN then launches one MapTask per split to carry out the map work. A minimal Driver sketch follows.
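For context, here is a minimal Driver sketch of that submission step. The class names WordCountDriver, WordCountMapper and WordCountReducer are hypothetical stand-ins for whatever job you actually submit; the Job calls themselves are the standard new-API setup.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCountDriver.class);               // the jar that gets uploaded alongside the splits
    job.setMapperClass(WordCountMapper.class);              // fetched later via reflection in runNewMapper()
    job.setReducerClass(WordCountReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.setInputPaths(job, new Path(args[0]));  // FileInputFormat.getSplits() runs against this path
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);       // uploads splits + conf + jar, then waits for completion
  }
}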

2. A walkthrough of MapTask through its source code:

Below is the run() method of the MapTask class:

 @Override
  public void run(final JobConf job, final TaskUmbilicalProtocol umbilical)
    throws IOException, ClassNotFoundException, InterruptedException {
    this.umbilical = umbilical;

    if (isMapTask()) {
      // If there are no reducers then there won't be any sort. Hence the map 
      // phase will govern the entire attempt's progress.
      if (conf.getNumReduceTasks() == 0) {
        mapPhase = getProgress().addPhase("map", 1.0f);
      } else {
        // If there are reducers then the entire attempt's progress will be 
        // split between the map phase (67%) and the sort phase (33%).
        mapPhase = getProgress().addPhase("map", 0.667f);
        sortPhase  = getProgress().addPhase("sort", 0.333f);
      }
    }
    TaskReporter reporter = startReporter(umbilical);
 
    boolean useNewApi = job.getUseNewMapper();
    initialize(job, getJobID(), reporter, useNewApi);

    // check if it is a cleanupJobTask
    if (jobCleanup) {
      runJobCleanupTask(umbilical, reporter);
      return;
    }
    if (jobSetup) {
      runJobSetupTask(umbilical, reporter);
      return;
    }
    if (taskCleanup) {
      runTaskCleanupTask(umbilical, reporter);
      return;
    }

    if (useNewApi) {
      runNewMapper(job, splitMetaInfo, umbilical, reporter);
    } else {
      runOldMapper(job, splitMetaInfo, umbilical, reporter);
    }
    done(umbilical, reporter);
  }

From the code block above:

If there are no reduce tasks, there is no need to merge-sort the map output, so the map phase accounts for 100% of the attempt's progress. If there are reduce tasks, the attempt's progress is split in two: the map phase takes 66.7% (the value returned by getProgress() in our RecordReader is relative to this portion only), and the remaining 33.3% goes to the sort (merge) phase. A map-only job, i.e. one with job.setNumReduceTasks(0) in the Driver, therefore skips the sort machinery entirely.
 

The new API is used (useNewApi), and initialization is performed.

Step into the initialize() method, shown in the following code:

 public void initialize(JobConf job, JobID id, 
                         Reporter reporter,
                         boolean useNewApi) throws IOException, 
                                                   ClassNotFoundException,
                                                   InterruptedException {
    jobContext = new JobContextImpl(job, id, reporter);
    taskContext = new TaskAttemptContextImpl(job, taskId, reporter);
    if (getState() == TaskStatus.State.UNASSIGNED) {
      setState(TaskStatus.State.RUNNING);
    }
    if (useNewApi) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("using new api for output committer");
      }
      outputFormat =
        ReflectionUtils.newInstance(taskContext.getOutputFormatClass(), job);
      committer = outputFormat.getOutputCommitter(taskContext);
    } else {
      committer = conf.getOutputCommitter();
    }
    Path outputPath = FileOutputFormat.getOutputPath(conf);
    if (outputPath != null) {
      if ((committer instanceof FileOutputCommitter)) {
        FileOutputFormat.setWorkOutputPath(conf, 
          ((FileOutputCommitter)committer).getTaskAttemptPath(taskContext));
      } else {
        FileOutputFormat.setWorkOutputPath(conf, outputPath);
      }
    }
    committer.setupTask(taskContext);
    Class<? extends ResourceCalculatorProcessTree> clazz =
        conf.getClass(MRConfig.RESOURCE_CALCULATOR_PROCESS_TREE,
            null, ResourceCalculatorProcessTree.class);
    pTree = ResourceCalculatorProcessTree
            .getResourceCalculatorProcessTree(System.getenv().get("JVM_PID"), clazz, conf);
    LOG.info(" Using ResourceCalculatorProcessTree : " + pTree);
    if (pTree != null) {
      pTree.updateProcessTree();
      initCpuCumulativeTime = pTree.getCumulativeCpuTime();
    }
  }

It instantiates jobContext and taskContext, then instantiates the committer: the concrete FileOutputFormat class is obtained via reflection and its output path is read. The task's work output path is then set accordingly; in this local run the output path is E:/output.

---------------------------------------------------- MapTask initialization is now complete ----------------------------------------------------

Now back to code segment one (MapTask.run()):

It first checks whether this attempt is a job-cleanup, job-setup, or task-cleanup task; if so, it runs that task and returns.

Then step into the runNewMapper() method, shown in the following code segment:

private <INKEY,INVALUE,OUTKEY,OUTVALUE>
  void runNewMapper(final JobConf job,
                    final TaskSplitIndex splitIndex,
                    final TaskUmbilicalProtocol umbilical,
                    TaskReporter reporter
                    ) throws IOException, ClassNotFoundException,
                             InterruptedException {
    // make a task context so we can get the classes
    org.apache.hadoop.mapreduce.TaskAttemptContext taskContext =
      new org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl(job, 
                                                                  getTaskID(),
                                                                  reporter);
    // make a mapper
    org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE> mapper =
      (org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE>)
        ReflectionUtils.newInstance(taskContext.getMapperClass(), job);
    // make the input format
    org.apache.hadoop.mapreduce.InputFormat<INKEY,INVALUE> inputFormat =
      (org.apache.hadoop.mapreduce.InputFormat<INKEY,INVALUE>)
        ReflectionUtils.newInstance(taskContext.getInputFormatClass(), job);
    // rebuild the input split
    org.apache.hadoop.mapreduce.InputSplit split = null;
    split = getSplitDetails(new Path(splitIndex.getSplitLocation()),
        splitIndex.getStartOffset());
    LOG.info("Processing split: " + split);

    org.apache.hadoop.mapreduce.RecordReader<INKEY,INVALUE> input =
      new NewTrackingRecordReader<INKEY,INVALUE>
        (split, inputFormat, reporter, taskContext);
    
    job.setBoolean(JobContext.SKIP_RECORDS, isSkipping());
    org.apache.hadoop.mapreduce.RecordWriter output = null;
    
    // get an output object
    if (job.getNumReduceTasks() == 0) {
      output = 
        new NewDirectOutputCollector(taskContext, job, umbilical, reporter);
    } else {
      output = new NewOutputCollector(taskContext, job, umbilical, reporter);
    }

    org.apache.hadoop.mapreduce.MapContext<INKEY, INVALUE, OUTKEY, OUTVALUE> 
    mapContext = 
      new MapContextImpl<INKEY, INVALUE, OUTKEY, OUTVALUE>(job, getTaskID(), 
          input, output, 
          committer, 
          reporter, split);

    org.apache.hadoop.mapreduce.Mapper<INKEY,INVALUE,OUTKEY,OUTVALUE>.Context 
        mapperContext = 
          new WrappedMapper<INKEY, INVALUE, OUTKEY, OUTVALUE>().getMapContext(
              mapContext);

    try {
      input.initialize(split, mapperContext);
      mapper.run(mapperContext);
      mapPhase.complete();
      setPhase(TaskStatus.Phase.SORT);
      statusUpdate(umbilical);
      input.close();
      input = null;
      output.close(mapperContext);
      output = null;
    } finally {
      closeQuietly(input);
      closeQuietly(output, mapperContext);
    }
  }

Analysing the code segment above:

Essentially it uses reflection to obtain our own Mapper class and a TextInputFormat instance, rebuilds the input split, and constructs the RecordReader input as an instance of MapTask's inner class NewTrackingRecordReader.

It then assigns the RecordWriter output: NewDirectOutputCollector when the job has no reducers, otherwise NewOutputCollector.

Next, the mapContext and mapperContext contexts are constructed.

The overall map execution flow is roughly as follows.

First, the RecordReader (input) is initialized.

initialize() is an abstract method on RecordReader; the concrete reader used here is LineRecordReader, which TextInputFormat creates (see the sketch below).
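From memory of the Hadoop 2.x source (details may differ slightly between versions), TextInputFormat.createRecordReader() hands back the LineRecordReader roughly like this:

  @Override
  public RecordReader<LongWritable, Text>
      createRecordReader(InputSplit split, TaskAttemptContext context) {
    // honour a custom record delimiter if one is configured
    String delimiter =
        context.getConfiguration().get("textinputformat.record.delimiter");
    byte[] recordDelimiterBytes =
        (delimiter == null) ? null : delimiter.getBytes(StandardCharsets.UTF_8);
    return new LineRecordReader(recordDelimiterBytes);
  }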

Step into LineRecordReader.initialize(), shown in the following code segment:

public void initialize(InputSplit genericSplit,
                         TaskAttemptContext context) throws IOException {
    FileSplit split = (FileSplit) genericSplit;
    Configuration job = context.getConfiguration();
    this.maxLineLength = job.getInt(MAX_LINE_LENGTH, Integer.MAX_VALUE);
    start = split.getStart();
    end = start + split.getLength();
    final Path file = split.getPath();

    // open the file and seek to the start of the split
    final FileSystem fs = file.getFileSystem(job);
    fileIn = fs.open(file);
    
    CompressionCodec codec = new CompressionCodecFactory(job).getCodec(file);
    if (null!=codec) {
      isCompressedInput = true;
      decompressor = CodecPool.getDecompressor(codec);
      if (codec instanceof SplittableCompressionCodec) {
        final SplitCompressionInputStream cIn =
          ((SplittableCompressionCodec)codec).createInputStream(
            fileIn, decompressor, start, end,
            SplittableCompressionCodec.READ_MODE.BYBLOCK);
        in = new CompressedSplitLineReader(cIn, job,
            this.recordDelimiterBytes);
        start = cIn.getAdjustedStart();
        end = cIn.getAdjustedEnd();
        filePosition = cIn;
      } else {
        if (start != 0) {
          // So we have a split that is only part of a file stored using
          // a Compression codec that cannot be split.
          throw new IOException("Cannot seek in " +
              codec.getClass().getSimpleName() + " compressed stream");
        }

        in = new SplitLineReader(codec.createInputStream(fileIn,
            decompressor), job, this.recordDelimiterBytes);
        filePosition = fileIn;
      }
    } else {
      fileIn.seek(start);
      in = new UncompressedSplitLineReader(
          fileIn, job, this.recordDelimiterBytes, split.getLength());
      filePosition = fileIn;
    }
    // If this is not the first split, we always throw away first record
    // because we always (except the last split) read one extra line in
    // next() method.
    if (start != 0) {
      start += in.readLine(new Text(), 0, maxBytesToConsume(start));
    }
    this.pos = start;
  }

If this is not the first split, the reader skips the first (possibly partial) line and advances start past it, then records the resulting position in pos.

Step into the mapper.run() method. The context here is the mapperContext. Its while loop keeps fetching the next key (the line's byte offset) and the next value (the line's content) and passes them to the map() method of our custom Mapper class; at this point the logic we wrote in map() starts executing.
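For reference, the new-API Mapper.run() is essentially the following loop (quoted from memory of the Hadoop source, so treat it as an approximation):

  public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    try {
      while (context.nextKeyValue()) {
        // nextKeyValue() ultimately delegates to LineRecordReader.nextKeyValue()
        map(context.getCurrentKey(), context.getCurrentValue(), context);
      }
    } finally {
      cleanup(context);
    }
  }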

Now let's look at the map() method itself; this is where our own logic runs, and it ends in calls to context.write().
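As an illustration (this is not the exact code from this post), a typical word-count style map() looks roughly like the sketch below; WordCountMapper and its fields are hypothetical names:

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
  private final Text outKey = new Text();
  private final IntWritable one = new IntWritable(1);

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // key is the byte offset of the line, value is the line content
    for (String word : value.toString().split("\\s+")) {
      outKey.set(word);
      context.write(outKey, one);  // this is the call we are about to step into
    }
  }
}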

Set a breakpoint and step into context.write(). The MapTask inner class NewOutputCollector is wrapped into mapperContext, and since the logic in map() calls context.write(), that call ends up invoking NewOutputCollector.write().

The NewOutputCollector.write() method:
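Quoted from memory of the MapTask source (so an approximation): write() first asks the partitioner for the partition number, then forwards everything to the in-memory collector:

    @Override
    public void write(K key, V value) throws IOException, InterruptedException {
      // compute the partition, then hand the record to MapOutputBuffer.collect()
      collector.collect(key, value,
                        partitioner.getPartition(key, value, partitions));
    }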

Step into the collect() method; the following code segment is MapOutputBuffer.collect():


  public synchronized void collect(K key, V value, int partition) throws IOException {
            this.reporter.progress(); // report progress
            if (key.getClass() != this.keyClass) {
                throw new IOException("Type mismatch in key from map: expected " + this.keyClass.getName() + ", received " + key.getClass().getName());
            } else if (value.getClass() != this.valClass) {
                throw new IOException("Type mismatch in value from map: expected " + this.valClass.getName() + ", received " + value.getClass().getName());
            } else if (partition >= 0 && partition < this.partitions) {
                this.checkSpillException();
                this.bufferRemaining -= 16;
                int kvbidx;
                int kvbend;
                int bUsed;
                if (this.bufferRemaining <= 0) {
                    this.spillLock.lock();

                    try {
                        if (!this.spillInProgress) {
                            kvbidx = 4 * this.kvindex;
                            kvbend = 4 * this.kvend;
                            bUsed = this.distanceTo(kvbidx, this.bufindex);
                            boolean bufsoftlimit = bUsed >= this.softLimit;
                            if ((kvbend + 16) % this.kvbuffer.length != this.equator - this.equator % 16) {
                                this.resetSpill(); // core method: reset the spill
                                this.bufferRemaining = Math.min(this.distanceTo(this.bufindex, kvbidx) - 32, this.softLimit - bUsed) - 16;
                            } else if (bufsoftlimit && this.kvindex != this.kvend) {
                                this.startSpill(); // core method: start the spill (spilling has trigger conditions, so it is hard to hit this while single-stepping in the debugger; we will trigger it separately later)
                                int avgRec = (int)(this.mapOutputByteCounter.getCounter() / this.mapOutputRecordCounter.getCounter());
                                int distkvi = this.distanceTo(this.bufindex, kvbidx);
                                int newPos = (this.bufindex + Math.max(31, Math.min(distkvi / 2, distkvi / (16 + avgRec) * 16))) % this.kvbuffer.length;
                                this.setEquator(newPos);
                                this.bufmark = this.bufindex = newPos;
                                int serBound = 4 * this.kvend;
                                this.bufferRemaining = Math.min(this.distanceTo(this.bufend, newPos), Math.min(this.distanceTo(newPos, serBound), this.softLimit)) - 32;
                            }
                        }
                    } finally {
                        this.spillLock.unlock();
                    }
                }

                try {
                    kvbidx = this.bufindex;
                    this.keySerializer.serialize(key); // serialize the key into the buffer
                    if (this.bufindex < kvbidx) {
                        this.bb.shiftBufferedKey(); // the key wrapped around the end of the buffer; make it contiguous
                        kvbidx = 0;
                    }

                    kvbend = this.bufindex;
                    this.valSerializer.serialize(value); // serialize the value into the buffer
                    this.bb.write(this.b0, 0, 0); // zero-length write so boundary checks run even for empty values
                    bUsed = this.bb.markRecord();
                    this.mapOutputRecordCounter.increment(1L); // record counter +1
                    this.mapOutputByteCounter.increment((long)this.distanceTo(kvbidx, bUsed, this.bufvoid));
                    this.kvmeta.put(this.kvindex + 2, partition); // K-V metadata: partition
                    this.kvmeta.put(this.kvindex + 1, kvbidx); // K-V metadata: key start
                    this.kvmeta.put(this.kvindex + 0, kvbend); // K-V metadata: value start
                    this.kvmeta.put(this.kvindex + 3, this.distanceTo(kvbend, bUsed)); // K-V metadata: value length
                    this.kvindex = (this.kvindex - 4 + this.kvmeta.capacity()) % this.kvmeta.capacity();
                } catch (MapTask.MapBufferTooSmallException var15) {
                    MapTask.LOG.info("Record too large for in-memory buffer: " + var15.getMessage());
                    this.spillSingleRecord(key, value, partition);
                    this.mapOutputRecordCounter.increment(1L);
                }
            } else {
                throw new IOException("Illegal partition for " + key + " (" + partition + ")");
            }
        }

From the source above we can see that our custom Mapper's map() method is called in a loop, and inside map() we call context.write(), which keeps pushing K-V pairs into the in-memory buffer through MapOutputBuffer.collect(). The metadata recorded for each pair includes its partition. Once the buffer fills past a certain threshold, it starts spilling data to disk, which is the this.startSpill() call in collect(). So the next question is what the spill actually does. We stop debugging and set a fresh breakpoint, this time only on this.startSpill() inside collect().
Well, it never triggered... the input file is apparently too small to reach the spill threshold. Fine, let's leave spill for now and set a breakpoint at flush() instead.
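As an aside, whether startSpill() ever fires for a small test input is governed by two settings. A hedged sketch follows; the property names and defaults are the usual Hadoop 2.x ones, so verify them against your version:

import org.apache.hadoop.conf.Configuration;

public class SpillTuning {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // size of the in-memory circular buffer (kvbuffer), in MB; 100 is the common default
    conf.setInt("mapreduce.task.io.sort.mb", 100);
    // fraction of the buffer that may fill before a spill starts (the softLimit); 0.80 is the common default
    conf.setFloat("mapreduce.map.sort.spill.percent", 0.80f);
    // with these values, startSpill() is triggered once about 80 MB of serialized output has accumulated
    System.out.println("spill threshold (MB) = " + 100 * 0.80f);
  }
}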

Why break at flush()? Even though our input file is too small to trigger a spill, the shuffle still has to get the buffered data out eventually, and we can see that this happens during the flush stage.

The input stream is then closed. When the output stream is closed, we step into the output.close() method and find that it executes flush(), shown below:

public void flush() throws IOException, ClassNotFoundException, InterruptedException {
            MapTask.LOG.info("Starting flush of map output");
            if (this.kvbuffer == null) {
                MapTask.LOG.info("kvbuffer is null. Skipping flush.");
            } else {
                this.spillLock.lock(); // take the spill lock

                try {
                    while(this.spillInProgress) { // if a spill is already in progress
                        this.reporter.progress(); // report progress
                        this.spillDone.await(); // wait for that spill to finish
                    }

                    this.checkSpillException();
                    int kvbend = 4 * this.kvend;
                    if ((kvbend + 16) % this.kvbuffer.length != this.equator - this.equator % 16) {
                        this.resetSpill(); // reset the spill state
                    }

                    if (this.kvindex != this.kvend) { // there are still records in the buffer that have not been spilled
                        this.kvend = (this.kvindex + 4) % this.kvmeta.capacity();
                        this.bufend = this.bufmark;
                        MapTask.LOG.info("Spilling map output");
                        MapTask.LOG.info("bufstart = " + this.bufstart + "; bufend = " + this.bufmark + "; bufvoid = " + this.bufvoid);
                        MapTask.LOG.info("kvstart = " + this.kvstart + "(" + this.kvstart * 4 + "); kvend = " + this.kvend + "(" + this.kvend * 4 + "); length = " + (this.distanceTo(this.kvend, this.kvstart, this.kvmeta.capacity()) + 1) + "/" + this.maxRec);
                        this.sortAndSpill(); // core call: sort and spill; we will look at this method in detail shortly
                    }
                } catch (InterruptedException var7) {
                    throw new IOException("Interrupted while waiting for the writer", var7);
                } finally {
                    this.spillLock.unlock();
                }

                assert !this.spillLock.isHeldByCurrentThread();

                try {
                    this.spillThread.interrupt();
                    this.spillThread.join();
                } catch (InterruptedException var6) {
                    throw new IOException("Spill failed", var6);
                }

                this.kvbuffer = null;
                this.mergeParts(); // core call: merge the spill files
                Path outputPath = this.mapOutputFile.getOutputFile();
                this.fileOutputByteCounter.increment(this.rfs.getFileStatus(outputPath).getLen());
            }
        }

Step into sortAndSpill():

       private void sortAndSpill() throws IOException, ClassNotFoundException, InterruptedException {
            long size = (long)(this.distanceTo(this.bufstart, this.bufend, this.bufvoid) + this.partitions * 150); // estimate of the spill size
            FSDataOutputStream out = null; // the spill output stream

            try {
                SpillRecord spillRec = new SpillRecord(this.partitions);
                Path filename = this.mapOutputFile.getSpillFileForWrite(this.numSpills, size); // pick the local file this spill is written to
                out = this.rfs.create(filename);
                int mstart = this.kvend / 4;
                int mend = 1 + (this.kvstart >= this.kvend ? this.kvstart : this.kvmeta.capacity() + this.kvstart) / 4;
                this.sorter.sort(this, mstart, mend, this.reporter); // core method: sort the MapOutputBuffer contents, QuickSort by default (this is the sort by key)
                int spindex = mstart;
                IndexRecord rec = new IndexRecord();
                MapTask.MapOutputBuffer<K, V>.InMemValBytes value = new MapTask.MapOutputBuffer.InMemValBytes();

                for(int i = 0; i < this.partitions; ++i) { // loop over the partitions
                    Writer writer = null;

                    try {
                        long segmentStart = out.getPos();
                        FSDataOutputStream partitionOut = CryptoUtils.wrapIfNecessary(this.job, out);
                        writer = new Writer(this.job, partitionOut, this.keyClass, this.valClass, this.codec, this.spilledRecordsCounter);
                        if (this.combinerRunner == null) {
                            for(DataInputBuffer key = new DataInputBuffer(); spindex < mend && this.kvmeta.get(this.offsetFor(spindex % this.maxRec) + 2) == i; ++spindex) { // write each K-V of this partition
                                int kvoff = this.offsetFor(spindex % this.maxRec);
                                int keystart = this.kvmeta.get(kvoff + 1);
                                int valstart = this.kvmeta.get(kvoff + 0);
                                key.reset(this.kvbuffer, keystart, valstart - keystart);
                                this.getVBytesForOffset(kvoff, value);
                                writer.append(key, value);
                            }
                        } else {
                            int spstart;
                            for(spstart = spindex; spindex < mend && this.kvmeta.get(this.offsetFor(spindex % this.maxRec) + 2) == i; ++spindex) {
                                ;
                            }

                            if (spstart != spindex) {
                                this.combineCollector.setWriter(writer);
                                RawKeyValueIterator kvIter = new MapTask.MapOutputBuffer.MRResultIterator(spstart, spindex);
                                this.combinerRunner.combine(kvIter, this.combineCollector); // if a combiner is configured, it runs here during the spill
                            }
                        }

                        writer.close();
                        rec.startOffset = segmentStart;
                        rec.rawLength = writer.getRawLength() + (long)CryptoUtils.cryptoPadding(this.job);
                        rec.partLength = writer.getCompressedLength() + (long)CryptoUtils.cryptoPadding(this.job);
                        spillRec.putIndex(rec, i);
                        writer = null;
                    } finally {
                        if (null != writer) {
                            writer.close();
                        }

                    }
                }

                if (this.totalIndexCacheMemory >= this.indexCacheMemoryLimit) {
                    Path indexFilename = this.mapOutputFile.getSpillIndexFileForWrite(this.numSpills, (long)(this.partitions * 24));
                    spillRec.writeToFile(indexFilename, this.job); // write the spill index to a file; the sort itself happened in memory
                } else {
                    this.indexCacheList.add(spillRec);
                    this.totalIndexCacheMemory += spillRec.size() * 24;
                }

                MapTask.LOG.info("Finished spill " + this.numSpills);
                ++this.numSpills;
            } finally {
                if (out != null) {
                    out.close();
                }

            }
        }

The key call is this.sorter.sort(this, mstart, mend, this.reporter): the sort, QuickSort by default, applied to the keys, and it clearly runs entirely in memory. The sorting code is shown below.

Step into the sortInternal() method:

  private static void sortInternal(final IndexedSortable s, int p, int r,
      final Progressable rep, int depth) {
    if (null != rep) {
      rep.progress();
    }
    while (true) {
    if (r-p < 13) {
      for (int i = p; i < r; ++i) {
        for (int j = i; j > p && s.compare(j-1, j) > 0; --j) {
          s.swap(j, j-1);
        }
      }
      return;
    }
    if (--depth < 0) {
      // give up
      alt.sort(s, p, r, rep);
      return;
    }

    // select, move pivot into first position
    fix(s, (p+r) >>> 1, p);
    fix(s, (p+r) >>> 1, r - 1);
    fix(s, p, r-1);

    // Divide
    int i = p;
    int j = r;
    int ll = p;
    int rr = r;
    int cr;
    while(true) {
      while (++i < j) {
        if ((cr = s.compare(i, p)) > 0) break;
        if (0 == cr && ++ll != i) {
          s.swap(ll, i);
        }
      }
      while (--j > i) {
        if ((cr = s.compare(p, j)) > 0) break;
        if (0 == cr && --rr != j) {
          s.swap(rr, j);
        }
      }
      if (i < j) s.swap(i, j);
      else break;
    }
    j = i;
    // swap pivot- and all eq values- into position
    while (ll >= p) {
      s.swap(ll--, --i);
    }
    while (rr < r) {
      s.swap(rr++, j++);
    }

    // Conquer
    // Recurse on smaller interval first to keep stack shallow
    assert i != j;
    if (i - p < r - j) {
      sortInternal(s, p, i, rep, depth);
      p = j;
    } else {
      sortInternal(s, j, r, rep, depth);
      r = i;
    }
    }
  }

That completes the sort method.
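One practical note before going back to sortAndSpill(): the key order used by sorter.sort() comes from the job's sort comparator, which can be overridden in the Driver. A hedged sketch (ReverseTextComparator is a hypothetical name) might look like this:

import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

public class ReverseTextComparator extends WritableComparator {
  public ReverseTextComparator() {
    super(Text.class, true);   // true => instantiate keys for deserialization
  }

  @SuppressWarnings({"rawtypes", "unchecked"})
  @Override
  public int compare(WritableComparable a, WritableComparable b) {
    return b.compareTo(a);     // reverse of the natural Text ordering
  }
}

// In the Driver: job.setSortComparatorClass(ReverseTextComparator.class);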

Continuing with the sortAndSpill() method:

It then creates the spill file; in my local run it lives under this directory:

E:\tmp\hadoop-user01\mapred\local\localRunner\user01\jobcache\job_local1765558676_0001\attempt_local1765558676_0001_m_000000_0\output

The sort is performed over the keys held in the MapOutputBuffer cache.

I only have a single partition here; with two partitions, each partition's records would be sorted and spilled as a separate, per-partition section of the spill output (tracked by the spill index).
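For reference, the partition number that ends up in each record's metadata comes from the partitioner. The default HashPartitioner computes it roughly as below, and a custom one can be plugged in via job.setPartitionerClass(...):

import org.apache.hadoop.mapreduce.Partitioner;

public class HashPartitioner<K, V> extends Partitioner<K, V> {
  @Override
  public int getPartition(K key, V value, int numReduceTasks) {
    // mask off the sign bit so the result is non-negative, then bucket by reducer count
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}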

If combinerRunner is null, the records are written straight to the spill file; otherwise the combiner runs first.

Once the spill has finished we can compare the spill output against my original input file, and then return to the flush() method.

After spilling, mergeParts() merges the spill files into a single sorted, partitioned output file; we won't walk through that code here. When the merge completes, the map side is finished.

A rough summary:

Key -> Partition -> [spill trigger] -> SortAndSpill -> Merge -> Reduce

If you want to understand the MapReduce shuffle mechanism, see my next post:

Hadoop MapReduce shuffle mechanism: https://blog.csdn.net/Hao_JunJie/article/details/115375468

Originally published at blog.csdn.net/Hao_JunJie/article/details/115327603