Hive 0.11: count(distinct) bug with multi-table joins

     While testing Hive 0.11, I ran into the following bug.

 

select count(distinct t2.user_id),t1.app_id,t2.from_id
 from t1 
 join t2 on t1.app_id=t2.app_id
 join t3 on t2.from_id=t3.flag
 group by t1.app_id,t2.from_id

 The query fails at compile time with the following error: FAILED: NullPointerException null

2013-09-16 20:20:59,611 ERROR ql.Driver (SessionState.java:printError(386)) - FAILED: NullPointerException null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.optimizer.physical.MetadataOnlyOptimizer$MetadataOnlyTaskDispatcher.dispatch(MetadataOnlyOptimizer.java:308)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:87)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:124)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:101)

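 The root cause is still being worked on.

In the meantime, the workaround is to stage the join result in a temporary table or a subquery and run count(distinct) on top of it. Setting the parameter "set hive.map.aggr=false;" also avoids the error temporarily. The subquery form: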

select count(distinct tmp.user_id), tmp.app_id, tmp.from_id
from (select t2.user_id, t1.app_id, t2.from_id
	 from t1
	 join t2 on t1.app_id=t2.app_id
	 join t3 on t2.from_id=t3.flag
 ) tmp
group by tmp.app_id, tmp.from_id
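
The other two workarounds mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: tmp_join is a hypothetical table name that does not appear in the original report, and a regular table is used for staging because Hive 0.11 predates "create temporary table".

```sql
-- Option 1: turn off map-side aggregation for the current session,
-- then run the original query unchanged.
set hive.map.aggr=false;

select count(distinct t2.user_id), t1.app_id, t2.from_id
from t1
join t2 on t1.app_id = t2.app_id
join t3 on t2.from_id = t3.flag
group by t1.app_id, t2.from_id;

-- Option 2: materialize the join first, then aggregate over the staged rows.
-- tmp_join is a hypothetical name chosen for this sketch.
create table tmp_join as
select t2.user_id, t1.app_id, t2.from_id
from t1
join t2 on t1.app_id = t2.app_id
join t3 on t2.from_id = t3.flag;

select count(distinct user_id), app_id, from_id
from tmp_join
group by app_id, from_id;

-- Clean up the staging table once the result has been read.
drop table tmp_join;
```

Option 1 changes only a session setting, so it is the least intrusive, but disabling map-side aggregation may slow down other aggregations in the same session. Option 2 costs an extra write of the joined rows.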

 Hive mailing-list thread describing this problem: http://mail-archives.apache.org/mod_mbox/hive-user/201309.mbox/%3CCA+FBdFQYHm9WvpWYSwaFGs8Vo=crNuSD=zv-Wf7tE8S4=X7AJg@mail.gmail.com%3E

Official Hive issue, HIVE-5129: https://issues.apache.org/jira/browse/HIVE-5129

Official Hive Review Board entry: https://reviews.apache.org/r/13697/diff/#index_header

Official Hive JIRA issue history: https://issues.apache.org/jira/issues/?jql=project%20%3D%20HIVE


Reposted from moon-yang85-gmail-com.iteye.com/blog/1943009