Inserting jsonb into PostgreSQL: cast ? as jsonb

df.write.mode(SaveMode.Append).jdbc(url, table, connectionProperties)
throws the following exception:

Hint: You will need to rewrite or cast the expression.
  Position: 388  Call getNextException to see other errors in the batch.
    at org.postgresql.jdbc.BatchResultHandler.handleError(BatchResultHandler.java:148)
    at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:777)
    at org.postgresql.jdbc.PgPreparedStatement.executeBatch(PgPreparedStatement.java:1563)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:215)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:277)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$saveTable$1.apply(JdbcUtils.scala:276)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920)
    at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$33.apply(RDD.scala:920)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1869)
    at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:1869)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.postgresql.util.PSQLException: ERROR: column "extdata" is of type jsonb but expression is of type character
  Hint: You will need to rewrite or cast the expression.
  Position: 388
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2477)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2190)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:300)
    at org.postgresql.jdbc.PgStatement.executeBatch(PgStatement.java:774)
    ... 14 more

References:
https://github.com/apache/spark/pull/8948
https://github.com/apache/spark/commit/1be25157449335dca70ba37720a172efa1f90714

Testing in spark-shell:
spark-shell --jars $(echo /home/chenf/ExhaustEmission/DataMiningAnalysis-0.1-SNAPSHOT/target/jars/*.jar | tr ' ' ',')
import org.apache.spark.sql.{DataFrame, SQLContext, SaveMode}
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
case class Test(id:String,extdata:String)
val testData = sc.parallelize(List(Test("1", """{"id":"1","data":"data1"}""")))
val df = testData.toDF
val connectionProperties = new java.util.Properties()
connectionProperties.put("driver", "org.postgresql.Driver")
connectionProperties.put("user", "test")
connectionProperties.put("password", "test")
df.write.mode(SaveMode.Append).jdbc("jdbc:postgresql://localhost:5432/dev_test", "tb_test", connectionProperties)

Not supported: it throws the exception above, because Spark's JDBC writer binds the string column as plain text rather than casting it to jsonb.
The workaround is to write the batch-insert SQL yourself.
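A minimal sketch of that workaround, using plain JDBC with an explicit `?::jsonb` cast in the INSERT statement. The table name (tb_test), column names (id, extdata), and connection settings are taken from the example above; `buildInsertSql` is a hypothetical helper introduced here for illustration.

```scala
import java.sql.DriverManager

object JsonbBatchInsert {
  // Build an INSERT that casts the second parameter explicitly to jsonb,
  // so a plain string bind is accepted by PostgreSQL.
  def buildInsertSql(table: String): String =
    s"INSERT INTO $table (id, extdata) VALUES (?, ?::jsonb)"

  def main(args: Array[String]): Unit = {
    val rows = Seq(("1", """{"id":"1","data":"data1"}"""))
    val conn = DriverManager.getConnection(
      "jdbc:postgresql://localhost:5432/dev_test", "test", "test")
    try {
      conn.setAutoCommit(false)
      val ps = conn.prepareStatement(buildInsertSql("tb_test"))
      rows.foreach { case (id, json) =>
        ps.setString(1, id)
        ps.setString(2, json) // accepted because the SQL itself casts ?::jsonb
        ps.addBatch()
      }
      ps.executeBatch()
      conn.commit()
      ps.close()
    } finally conn.close()
  }
}
```

In a Spark job, the same loop can run inside df.foreachPartition, opening one connection per partition and batching each partition's rows.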

Reprinted from blog.csdn.net/weixin_34268843/article/details/87171921