PySpark on YARN fails with "Cannot run program python3"

Start a pyspark shell on YARN:

$ pyspark --master yarn
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 3.2.0
      /_/

Using Python version 3.8.12 (default, Nov 12 2021 08:41:47)
Spark context Web UI available at http://master:4041
Spark context available as 'sc' (master = yarn, app id = application_1652774608535_0001).
SparkSession available as 'spark'.
>>>

Submit a job:

>>> sc.parallelize([1,2]).map(lambda x:x*10).collect()

It fails with:

 java.io.IOException: Cannot run program "python3": error=2, No such file or directory  
 ......

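Fix: set PYSPARK_PYTHON to the full path of the Python interpreter, so it is no longer looked up by the bare name "python3" (this assumes $PYTHON_HOME points at the Python installation):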

export PYSPARK_PYTHON=$PYTHON_HOME/bin/python3

After restarting, the job runs successfully.
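
To see why the export helps, here is a rough sketch of how the worker interpreter gets chosen. It is a simplified model, not Spark's actual code: the helper names worker_python and can_launch are made up for illustration, and the precedence (the spark.pyspark.python property first, then PYSPARK_PYTHON, then the default python3) follows what the Spark configuration docs describe. A bare name such as python3 has to be found on the PATH of the process that launches it; if it is not there, the launch fails with error=2.

```python
import os
import shutil

# Name Spark 3.x falls back to when nothing else is configured.
DEFAULT_PYTHON = "python3"


def worker_python(conf, env):
    """Hypothetical helper: pick the interpreter name the workers would launch.

    Simplified precedence: spark.pyspark.python, then PYSPARK_PYTHON, then default.
    """
    return conf.get("spark.pyspark.python") or env.get("PYSPARK_PYTHON") or DEFAULT_PYTHON


def can_launch(executable, search_path):
    """Hypothetical helper: True if the executable can be resolved.

    A bare name is searched in search_path; a name containing a directory
    is checked directly, without consulting search_path.
    """
    return shutil.which(executable, path=search_path) is not None


if __name__ == "__main__":
    # Check the current environment; to be meaningful this has to run on a
    # worker node, under the same user and PATH as the YARN containers.
    exe = worker_python({}, dict(os.environ))
    print(exe, "resolvable:", can_launch(exe, os.environ.get("PATH", "")))
```

If this prints "python3 resolvable: False" on a worker node, that matches the error above. Pointing PYSPARK_PYTHON at an absolute path avoids the PATH lookup altogether, provided the same path exists on every node.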

Reposted from blog.csdn.net/qq_41129489/article/details/124823829