Could not initialize derby.jdbc.AutoloadedDriver40

Scenario

Spark 2.4.5 with a Hive 2.3.5 client.
A Spark job throws a NoClassDefFoundError when querying a Hive table.

Exception log

javax.jdo.JDOFatalInternalException: Unexpected exception caught.
at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1193)
...
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6891)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:164)
...
at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332)
at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
...
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at com.xxx.xxx.xxx.xxx(xxx.java:43)
at com.xxx.xxx.xxx.main(xxx.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
NestedThrowablesStackTrace:
java.lang.reflect.InvocationTargetException
...
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:659)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:431)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6891)
...
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:248)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:388)
at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:332)
at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:312)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
...
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3369)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at xxx.xxx.xxx.xxx.xxx(xxx.java:43)
at xxx.xx.xxx.xxx.main(xxx.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.AutoloadedDriver40
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at java.sql.DriverManager.isDriverAllowed(DriverManager.java:556)
at java.sql.DriverManager.getConnection(DriverManager.java:661)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
... 101 more

Exception analysis

1. Examine the exception stack trace

NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.AutoloadedDriver40

This is a class initialization failure (note: not a ClassNotFoundException). The class ships in derby*.jar.
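For context, here is a minimal Scala sketch (with hypothetical class names, not from the original post) of how this exact message arises. When a class's static initializer throws, the first reference fails with ExceptionInInitializerError; every later reference fails with "NoClassDefFoundError: Could not initialize class ...", because the class file is on the classpath but the JVM has marked it as failed to initialize.

    object BadInit {
      // Simulates a static initializer that fails, e.g. because an
      // incompatible dependency version sits on the classpath.
      val driver: String = sys.error("simulated init failure")
    }

    object Demo extends App {
      // First touch: java.lang.ExceptionInInitializerError
      try BadInit.driver catch { case e: Throwable => println(e) }
      // Second touch: java.lang.NoClassDefFoundError:
      //   Could not initialize class BadInit$
      try BadInit.driver catch { case e: Throwable => println(e) }
    }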

2. Search for the jar

   find / -name "derby*.jar" shows the jar both under the JDK and under the Hive installation, so no jar is missing. A ClassNotFoundException would mean a missing jar; a NoClassDefFoundError of this form more likely points to a version mismatch.

3. Check whether Hive itself is healthy

   Running the hive CLI works fine, which rules out the Hive installation and metastore and narrows the suspect to Spark's own classpath.

4. Re-read the stack trace to find which line of business code triggers the exception

  com.xxx.xxx.xxx.xxx(xxx.java:43) executes sparkSession.sql("use " + dbname); a minimal sketch of this call path follows.
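The failing path reduces to the sketch below (names are placeholders): any Hive-backed statement makes Spark instantiate a Hive metastore client on first use, and that is where the classpath problem surfaces.

    import org.apache.spark.sql.SparkSession

    object Repro {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("hive-metastore-repro")
          .enableHiveSupport() // loads the Hive client jars under SPARK_HOME/jars
          .getOrCreate()
        spark.sql("use some_db") // first metastore access; fails as logged above
      }
    }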

5. Reproduce the behavior manually in spark-shell

    import spark.sql
    sql("use dbname")

  This throws a similar exception, so the jars the Spark job depends on are clearly at fault.
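Before digging through SPARK_HOME by hand, a quick diagnostic (a sketch using plain JDK reflection, pasted into spark-shell) shows which jar each suspect class is actually loaded from:

    // "initialize = false" avoids running the failing static initializer;
    // note that getCodeSource is null for bootstrap-loaded classes.
    def whereIs(name: String): String =
      Class.forName(name, false, Thread.currentThread.getContextClassLoader)
        .getProtectionDomain.getCodeSource.getLocation.toString

    whereIs("org.apache.hadoop.hive.conf.HiveConf")     // which Hive client jar?
    whereIs("org.apache.derby.jdbc.AutoloadedDriver40") // which derby jar?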

6. Inspect the jars under SPARK_HOME

The bundled Hive jars are version 1.2.2, while the environment actually runs Hive 2.3.5.
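The same conclusion can be read off a running session: Spark 2.x records the Hive metastore client version it was built against in documented configuration keys (a sketch to run in spark-shell; the commented values are the Spark 2.4 defaults):

    spark.conf.get("spark.sql.hive.metastore.version") // "1.2.1" by default in Spark 2.4
    spark.conf.get("spark.sql.hive.metastore.jars")    // "builtin": the jars under SPARK_HOME/jars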

Resolution

1. Replace the Hive jars under SPARK_HOME with the 2.3.5 versions.
2. Verify with spark-sql: the original exception is gone, but a new one appears: java.lang.NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT
3. Research shows that Spark 2.4.5 is built against Hive 1.2.x. Hive 2.x removed the HIVE_STATS_JDBC_TIMEOUT parameter, but the spark-hive code still references it, hence the NoSuchFieldError.
4. How, then, to keep the Spark and Hive versions consistent? Two options (a configuration-only variant is sketched after this list):
(1) Patch the Spark source to drop the HIVE_STATS_JDBC_TIMEOUT reference and rebuild the spark-hive jar.
(2) Keep Spark on the old Hive 1.2.x client and let it talk to the newer Hive deployment.
5. Option (2) was chosen: install a Hive 1.2.x client on the server running Spark, and sync its hive-site.xml into the Spark and Hadoop configuration directories.
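For completeness, the configuration-only variant mentioned above: Spark's documented spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars settings tell Spark to load a metastore client that matches the deployed Hive, with no jar swapping under SPARK_HOME. A hedged sketch (the paths and the exact version are assumptions; the Spark 2.4 documentation lists supported metastore versions 0.12.0 through 2.3.3):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("newer-metastore-client")
      // Static configs: set them before the first Hive access, and pick a
      // version your Spark build supports.
      .config("spark.sql.hive.metastore.version", "2.3.3")
      // A classpath holding the matching Hive (and Hadoop) jars; the other
      // accepted values are "builtin" and "maven". Paths are hypothetical.
      .config("spark.sql.hive.metastore.jars",
              "/opt/hive-2.3.3/lib/*:/opt/hadoop/share/hadoop/common/*")
      .enableHiveSupport()
      .getOrCreate()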

Done; verified successfully.

Reposted from blog.51cto.com/zhsusn/2590503