Prediction(5)Cluster Trouble Shooting

I ran into some issues on my local Zeppelin setup with Spark 1.5.0, Zeppelin 0.6.0, and Hadoop 2.7.1.
It may be a memory issue, but it does not help even after I configure Zeppelin (in conf/zeppelin-env.sh) as follows:
export MASTER="yarn-client"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop/"

export SPARK_HOME="/opt/spark"
. ${SPARK_HOME}/conf/spark-env.sh
export ZEPPELIN_CLASSPATH="${SPARK_CLASSPATH}"
export ZEPPELIN_JAVA_OPTS="-Dspark.yarn.driver.memoryOverhead=512 -Dspark.yarn.executor.memoryOverhead=512 -Dspark.akka.frameSize=100 -Dspark.executor.instances=2 -Dspark.driver.memory=3g -Dspark.storage.memoryFraction=0.7 -Dspark.core.connection.ack.wait.timeout=800 -Dspark.rdd.compress=true -Dspark.default.parallelism=18 -Dspark.executor.memory=3g"
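A change to zeppelin-env.sh only takes effect after the Zeppelin daemon is restarted. Assuming Zeppelin is installed under /opt/zeppelin (that path is an assumption, adjust to your install):
> /opt/zeppelin/bin/zeppelin-daemon.sh restart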

Spark (conf/spark-env.sh) is configured as follows:
export SPARK_DAEMON_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70"
export HADOOP_CONF_DIR="/opt/hadoop/etc/hadoop"
#export SPARK_WORKER_MEMORY=1024m
#export SPARK_JAVA_OPTS="-Dbuild.env=lmm.sparkvm"
export USER=carl
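If executors keep getting killed by YARN, it is worth checking that spark.executor.memory plus spark.yarn.executor.memoryOverhead fits inside the YARN container limits. The property names below are standard Hadoop settings; the grep is just one quick way to inspect them on this cluster:
> grep -A 1 'yarn.scheduler.maximum-allocation-mb' /opt/hadoop/etc/hadoop/yarn-site.xml
> grep -A 1 'yarn.nodemanager.resource.memory-mb' /opt/hadoop/etc/hadoop/yarn-site.xml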

Install phantomjs
> sudo apt-get install phantomjs
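Verify the installation (the exact version printed depends on the distribution's package):
> phantomjs --version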

Build Zeppelin Again
> mvn clean package -Pspark-1.5 -Dspark.version=1.5.0 -Dhadoop.version=2.7.1 -Phadoop-2.6 -Pyarn -DskipTests
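After the build finishes, restart the daemon from the Zeppelin directory and watch the Spark interpreter log for YARN or memory errors. Exact log file names vary by user and host, so the wildcard here is only illustrative:
> bin/zeppelin-daemon.sh restart
> tail -f logs/zeppelin-interpreter-spark-*.log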

I set up ubuntu-pilot to run Zeppelin and Spark, while ubuntu-master, ubuntu-dev1, and ubuntu-dev2 form the YARN cluster. Everything works fine now.
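To confirm that the notebook really runs against the YARN cluster rather than a local master, the running application can be listed on the ResourceManager host; this check is a suggestion, not part of the original setup:
> yarn application -list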



Reposted from sillycat.iteye.com/blog/2247152