2021 MIT 6.824 Lab 1: Setting Up the Environment, with Debugging and Running Tips

Clone the project from the Git repository

$ git clone git://g.csail.mit.edu/6.824-golabs-2021 6.824
$ cd 6.824
$ ls
Makefile src
$

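Run the sequential word-count program. The user supplies the actual Map and Reduce functions, so wc.go must first be compiled into wc.so, a dynamically loaded plugin.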

$ cd ~/6.824
$ cd src/main
$ go build -race -buildmode=plugin ../mrapps/wc.go
$ rm mr-out*
$ go run -race mrsequential.go wc.so pg*.txt
$ more mr-out-0
A 509
ABOUT 2
ACT 8
...
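
For orientation, here is a minimal sketch of the kind of Map and Reduce functions a word-count plugin such as wc.go provides. It is a standalone approximation: the KeyValue type is declared locally here (in the lab it comes from the mr package), and the input string and the file name example.txt are made up for the demonstration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// KeyValue is declared locally for this sketch; the lab defines it in package mr.
type KeyValue struct {
	Key   string
	Value string
}

// Map emits one {word, "1"} pair for every word in contents.
// The filename argument is unused by word count.
func Map(filename string, contents string) []KeyValue {
	// A word is a maximal run of letters.
	notLetter := func(r rune) bool { return !unicode.IsLetter(r) }
	words := strings.FieldsFunc(contents, notLetter)

	kva := []KeyValue{}
	for _, w := range words {
		kva = append(kva, KeyValue{Key: w, Value: "1"})
	}
	return kva
}

// Reduce receives every value emitted for one key and returns the count.
func Reduce(key string, values []string) string {
	return strconv.Itoa(len(values))
}

func main() {
	// Hypothetical input, used only to exercise the two functions.
	kva := Map("example.txt", "the cat and the hat")

	grouped := map[string][]string{}
	for _, kv := range kva {
		grouped[kv.Key] = append(grouped[kv.Key], kv.Value)
	}
	fmt.Println("the", Reduce("the", grouped["the"])) // prints "the 2"
}
```

The sequential runner calls Map once per input file and Reduce once per distinct key, which is why the output above lists one count per word.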

Your job is to implement a distributed MapReduce, consisting of two programs, the coordinator and the worker. There will be just one coordinator process, and one or more worker processes executing in parallel. In a real system the workers would run on a bunch of different machines, but for this lab you'll run them all on a single machine. The workers will talk to the coordinator via RPC. Each worker process will ask the coordinator for a task, read the task's input from one or more files, execute the task, and write the task's output to one or more files. The coordinator should notice if a worker hasn't completed its task in a reasonable amount of time (for this lab, use ten seconds), and give the same task to a different worker.

We have given you a little code to start you off. The "main" routines for the coordinator and worker are in main/mrcoordinator.go and main/mrworker.go; don't change these files. You should put your implementation in mr/coordinator.go, mr/worker.go, and mr/rpc.go.

Here's how to run your code on the word-count MapReduce application. First, make sure the word-count plugin is freshly built:

$ go build -race -buildmode=plugin ../mrapps/wc.go

In the main directory, run the coordinator.

$ rm mr-out*
$ go run -race mrcoordinator.go pg-*.txt

The pg-*.txt arguments to mrcoordinator.go are the input files; each file corresponds to one "split", and is the input to one Map task. The -race flag runs Go with its race detector.

In one or more other windows, run some workers:

$ go run -race mrworker.go wc.so
  • One way to get started is to modify mr/worker.go's Worker() to send an RPC to the coordinator asking for a task. Then modify the coordinator to respond with the file name of an as-yet-unstarted map task. Then modify the worker to read that file and call the application Map function, as in mrsequential.go.
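
The first step of that hint, a worker asking the coordinator for a task over RPC, might look roughly like the sketch below. Everything named here is an assumption made for illustration: the TaskArgs and TaskReply types, the RequestTask handler, the helpers, and the file names. To stay self-contained it connects client and server through an in-memory net.Pipe rather than the socket setup in the lab skeleton.

```go
package main

import (
	"fmt"
	"net"
	"net/rpc"
	"sync"
)

// TaskArgs and TaskReply are hypothetical RPC message types.
// Fields must be exported so that encoding/gob can transmit them.
type TaskArgs struct {
	WorkerID int
}

type TaskReply struct {
	Filename string // empty when no unstarted map task is left
	MapTask  int
}

// Coordinator hands out each input file at most once in this sketch.
type Coordinator struct {
	mu    sync.Mutex
	files []string
	next  int
}

// RequestTask is the RPC handler a worker calls to ask for work.
func (c *Coordinator) RequestTask(args *TaskArgs, reply *TaskReply) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.next < len(c.files) {
		reply.Filename = c.files[c.next]
		reply.MapTask = c.next
		c.next++
	}
	return nil
}

// newLocalClient wires an RPC client to c over an in-memory pipe.
func newLocalClient(c *Coordinator) (*rpc.Client, error) {
	server := rpc.NewServer()
	if err := server.Register(c); err != nil {
		return nil, err
	}
	clientConn, serverConn := net.Pipe()
	go server.ServeConn(serverConn)
	return rpc.NewClient(clientConn), nil
}

// askForTask performs one RequestTask call, as a worker would.
func askForTask(client *rpc.Client, workerID int) (TaskReply, error) {
	var reply TaskReply
	err := client.Call("Coordinator.RequestTask", &TaskArgs{WorkerID: workerID}, &reply)
	return reply, err
}

func main() {
	c := &Coordinator{files: []string{"pg-a.txt", "pg-b.txt"}} // hypothetical file names
	client, err := newLocalClient(c)
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer client.Close()

	for i := 0; i < 3; i++ {
		reply, err := askForTask(client, 1)
		if err != nil {
			fmt.Println("call failed:", err)
			return
		}
		fmt.Printf("file=%q mapTask=%d\n", reply.Filename, reply.MapTask)
	}
}
```

Once a reply carries a file name, the worker can read that file and call the application's Map function on its contents.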

First implement DoMap and DoReduce. These two functions cover the basic algorithm and the file reads and writes.

// DoMap runs mapf on inFile and partitions the result into NReduce
// intermediate files named mr-<mapTask>-<reduceTask>.
// Needs imports: encoding/json, fmt, io/ioutil, log, os.
func DoMap(inFile string, NReduce int, mapTask int, mapf func(string, string) []KeyValue) {
	file, err := os.Open(inFile)
	if err != nil {
		log.Fatalf("cannot open %v", inFile)
	}
	content, err := ioutil.ReadAll(file)
	if err != nil {
		log.Fatalf("cannot read %v", inFile)
	}
	file.Close()
	kva := mapf(inFile, string(content))

	// Partition the pairs by reduce task, using the hash of the key.
	buckets := make([][]KeyValue, NReduce)
	for _, kv := range kva {
		index := ihash(kv.Key) % NReduce
		buckets[index] = append(buckets[index], kv)
	}

	for i := 0; i < NReduce; i++ {
		outFile := fmt.Sprintf("mr-%v-%v", mapTask, i)
		f, openErr := os.OpenFile(outFile, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0664)
		if openErr != nil {
			log.Fatal("OpenFile: ", openErr)
		}
		// Encode to the output file f, not to the already-closed input file.
		enc := json.NewEncoder(f)
		for _, kv := range buckets[i] {
			if err := enc.Encode(&kv); err != nil {
				log.Fatalf("cannot write %v: %v", outFile, err)
			}
		}
		// Close here; a defer inside the loop would keep every file open until return.
		f.Close()
	}
}
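
DoMap relies on two names it does not define itself, KeyValue and ihash. The sketch below shows the partitioning step in isolation. The ihash body is written on the assumption that it matches the FNV-1a helper shipped in the lab's worker skeleton; the partition function and the sample keys are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// KeyValue is declared locally for this sketch.
type KeyValue struct {
	Key   string
	Value string
}

// ihash maps a key to a non-negative int using 32-bit FNV-1a.
// Assumed to match the helper in the lab skeleton.
func ihash(key string) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() & 0x7fffffff)
}

// partition (hypothetical helper) splits kvs into nReduce buckets so that
// every pair with the same key lands in the same bucket.
func partition(kvs []KeyValue, nReduce int) [][]KeyValue {
	buckets := make([][]KeyValue, nReduce)
	for _, kv := range kvs {
		i := ihash(kv.Key) % nReduce
		buckets[i] = append(buckets[i], kv)
	}
	return buckets
}

func main() {
	kvs := []KeyValue{{"apple", "1"}, {"pear", "1"}, {"apple", "1"}}
	for i, b := range partition(kvs, 3) {
		// Bucket i corresponds to the intermediate file mr-0-i for map task 0.
		fmt.Printf("mr-0-%d: %v\n", i, b)
	}
}
```

Because the bucket depends only on the key, every occurrence of a word from every map task ends up with the same reduce task, which is what lets Reduce see all values for a key.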

DoReduce

// DoReduce reads the intermediate files mr-<i>-<reduceTask> from all NMap
// map tasks, groups the values by key, and writes one line per key to outFile.
func DoReduce(reduceTask int, NMap int, outFile string, reducef func(string, []string) string) {
	interMaps := make(map[string][]string)
	for i := 0; i < NMap; i++ {
		filename := fmt.Sprintf("mr-%v-%v", i, reduceTask)
		file, err := os.Open(filename)
		if err != nil {
			log.Fatalf("cannot open %v", filename)
		}
		dec := json.NewDecoder(file)
		for {
			var kv KeyValue
			if err := dec.Decode(&kv); err != nil {
				break
			}
			// Collect the value, not the key. append also works when the
			// key has not been seen yet, because the slice is then nil.
			interMaps[kv.Key] = append(interMaps[kv.Key], kv.Value)
		}
		file.Close()
	}

	// Write the result once, after every intermediate file has been read.
	f, openErr := os.OpenFile(outFile, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0664)
	if openErr != nil {
		log.Fatal("Open result file: ", openErr)
	}
	for key, values := range interMaps {
		// The lab expects one "%v %v" line per key in mr-out-X, as in the
		// sample output above, rather than JSON.
		fmt.Fprintf(f, "%v %v\n", key, reducef(key, values))
	}
	f.Close()
}

After that, debug the RPC logic.

Reposted from blog.csdn.net/wwxy1995/article/details/113885853