1. The submitted job reports an error
(1) When a crontab-scheduled Python script submits the Spark job, it fails with "UnknownHostException: logSave". The full error:
Exception in thread "main" java.lang.IllegalArgumentException: java.net.UnknownHostException: logSave
at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:378)
at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:310)
at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:176)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:678)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.spark.util.Utils$.getHadoopFileSystem(Utils.scala:1822)
at org.apache.spark.scheduler.EventLoggingListener.<init>(EventLoggingListener.scala:67)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:514)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
at com.rong360.featureAnalyse.FeatherAnalyseOnline.main(FeatherAnalyseOnline.java:52)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.UnknownHostException: logSave
... 29 more
(2) The crontab entry:
22 11 * * 2 python /data1/rong360/baseFeatures/submit.py >> /data1/rong360/baseFeatures/out.log 2>&1
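cron runs jobs with a nearly empty environment (it does not source /etc/profile), so a quick way to diagnose this kind of failure is to log which variables the script actually sees before it calls spark-submit. A minimal sketch, assuming the usual SPARK_HOME / HADOOP_CONF_DIR / JAVA_HOME variables are what spark-submit needs on this machine (adjust the list to the actual installation):

```python
import os

# Variables spark-submit typically relies on but that cron's minimal
# environment usually does not provide; adjust to your installation.
REQUIRED_VARS = ["SPARK_HOME", "HADOOP_CONF_DIR", "JAVA_HOME"]

def missing_env_vars(env=os.environ):
    """Return the required variables absent from the given environment."""
    return [v for v in REQUIRED_VARS if v not in env]

if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        # Under cron this typically prints all three variables as missing.
        print("Missing environment variables: %s" % ", ".join(missing))
```

Running this at the top of submit.py (output goes to out.log via the crontab redirection) shows immediately whether the cron environment differs from an interactive shell.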
The submit.py script:
import os
import time

for i in [1, 9]:
    ml = "spark-submit --class com.rong360.featureAnalyse.FeatherAnalyseOnline " \
         "--executor-memory 30g --total-executor-cores 72 " \
         "--driver-memory 40g get_feature_analyse.jar " + str(i)
    print(ml)
    os.system(ml)
    time.sleep(30)
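One way to make the Python version behave like the working shell script below is to build the same command line and run it through bash after sourcing /etc/profile, with the full path to spark-submit and an explicit --master. The path, master URL, and jar location here are taken from the shell script; the rest is a sketch of one possible fix, not the verified cause:

```python
import subprocess
import time

# Taken from the working shell script; adjust to your installation.
SPARK_SUBMIT = "/home/dmp/spark-2.0.2-bin-hadoop2.7/bin/spark-submit"
MASTER = "spark://192.168.200.175:7077"
JAR = "/data1/rong360/baseFeatures/get_feature_analyse.jar"

def build_command(i):
    """Build the spark-submit command line for feature index i."""
    return ("%s --master %s "
            "--class com.rong360.featureAnalyse.FeatherAnalyseOnline "
            "--executor-memory 20g --total-executor-cores 72 "
            "--driver-memory 30g %s %d" % (SPARK_SUBMIT, MASTER, JAR, i))

def submit(i):
    # Source /etc/profile first so the job sees the same environment
    # that the working shell script restores for itself.
    cmd = "source /etc/profile && " + build_command(i)
    return subprocess.call(["bash", "-c", cmd])

if __name__ == "__main__":
    for i in [1, 9]:
        print(build_command(i))
        submit(i)
        time.sleep(30)
```

Using subprocess with an explicit bash -c also makes the shell doing the sourcing unambiguous, whereas os.system falls back to /bin/sh, where `source` may not be available.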
2. Submitting the job from the Python script consistently failed, but the same submission worked once it was moved into a shell script. A likely explanation: cron starts jobs with a minimal environment, while the shell script explicitly sources /etc/profile, restoring variables such as HADOOP_CONF_DIR; without the Hadoop configuration on the classpath, the HDFS HA nameservice name "logSave" is treated as a literal hostname, which would produce exactly the UnknownHostException above.
#!/bin/bash
source /etc/profile
for i in '0' '1' '2' '6' '9'
do
echo "/home/dmp/spark-2.0.2-bin-hadoop2.7/bin/spark-submit --master spark://192.168.200.175:7077 --class com.rong360.featureAnalyse.FeatherAnalyseOnline --executor-memory 20g --total-executor-cores 72 --driver-memory 30g /data1/rong360/baseFeatures/get_feature_analyse.jar $i"
/home/dmp/spark-2.0.2-bin-hadoop2.7/bin/spark-submit --master spark://192.168.200.175:7077 --class com.rong360.featureAnalyse.FeatherAnalyseOnline --executor-memory 20g --total-executor-cores 72 --driver-memory 30g /data1/rong360/baseFeatures/get_feature_analyse.jar $i
done