A quick note on an error I hit while running Spark on Windows:
# sc is the SparkContext that the pyspark shell creates automatically
words = sc.parallelize(['scala', 'java', 'hadoop', 'spark', 'scala', 'hadoop', 'spark', 'scala'])
words.distinct().count()  # counts the unique words, returns 4
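If you run this as a standalone script instead of inside the pyspark shell, sc must be created explicitly. A minimal sketch, assuming a local installation of PySpark (the app name here is arbitrary):

from pyspark import SparkContext

# Create a local SparkContext; "local[*]" uses all available cores.
sc = SparkContext("local[*]", "distinct-count-demo")

words = sc.parallelize(['scala', 'java', 'hadoop', 'spark',
                        'scala', 'hadoop', 'spark', 'scala'])
print(words.distinct().count())  # prints 4

sc.stop()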
Although the job still produces the correct result, Spark prints this warning:
Please install psutil to have better support with spilling
The warning comes from PySpark's shuffle code, which uses psutil to measure process memory when deciding whether to spill to disk; without it, memory tracking on Windows is less accurate.
Fix:
Just run pip install psutil in cmd and the warning goes away.
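To confirm the install worked, a quick sanity check (not part of the original fix) is to import psutil from Python:

# Raises ImportError if psutil is still missing
import psutil
print(psutil.__version__)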