hadoop streaming 使用自己的 python 版本

1 #!/usr/bin/env python3

15 hadoop jar hadoop-streaming.jar \

 16 -Dmapred.fairscheduler.pool=build \
 17 -Dmapred.reduce.tasks=500 \
 18 -Dmapred.job.priority=VERY_HIGH \
 19 -Dmapred.job.name="" \
 21 -cacheArchive "hdfs:///home/python-3.1.2.tgz#python3" \
 22 -file "cut_rank_fields.py" \
 23 24 -input ${rank_data_dir} \
 25 -output ${output_dir} \
 26 -mapper "export LD_LIBRARY_PATH=python3/lib:${LD_LIBRARY_PATH}; python3/bin/python3 cut_rank_fields.py 
 27 -reducer "cut -f 2"

猜你喜欢

转载自blog.csdn.net/mzg12345678/article/details/78984421