Hive non-equi joins: setting Hive to nonstrict mode

1 Data preparation

create table stocks(id int, ymd string, price string, company string);

insert into table stocks values 
(1,'2010-01-04','214.01','aapl'),
(2,'2010-01-05','214.38','aapl'),
(3,'2010-01-06','210.97','aapl'),
(4,'2010-01-07','210.58','aapl'),
(5,'2010-01-08','211.58','aapl'),
(6,'2010-01-11','210.11','aapl'),
(7,'2010-01-04','132.45','ibm'),
(8,'2010-01-05','138.85','ibm'),
(9,'2010-01-06','129.55','ibm'),
(10,'2010-01-07','130.0','ibm'),
(11,'2010-01-08','130.85','ibm'),
(12,'2006-01-11','121.48','ibm'),
(13,'2007-01-11','120.48','ibm'),
(14,'2008-01-11','123.48','ibm');

2 Testing an equi-join, using a self-join of the table

select a.ymd, a.price, b.price
from
	stocks a 
inner join
	stocks b
on a.ymd = b.ymd
where 
	a.company = 'aapl' and b.company = 'ibm';

Result:

2010-01-04	214.01	132.45
2010-01-05	214.38	138.85
2010-01-06	210.97	129.55
2010-01-07	210.58	130.0
2010-01-08	211.58	130.85

3 Testing a non-equi join, using a self-join of the table

select a.ymd,b.ymd, a.price, b.price
from
	stocks a
inner join 
	stocks b
on a.ymd <= b.ymd
where a.company = 'aapl' and b.company = 'ibm'
order by a.ymd asc;

This fails with the following error:

FAILED: SemanticException Cartesian products are disabled for safety reasons. 
If you know what you are doing, please sethive.strict.checks.cartesian.product to false and that hive.mapred.mode is not set to 'strict' to proceed. 
Note that if you may get errors or incorrect results if you make a mistake while using some of the unsafe features.
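
Hive is currently running in strict mode, and a join with no equality condition is treated as a Cartesian product. In strict mode:
- Cartesian-product joins between tables are not allowed
- An order by clause must be accompanied by limit: order by runs in a single reducer and easily becomes a performance bottleneck
- Queries on a partitioned table must filter on a partition column in the where clause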
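
As an illustration of the second restriction, an order by that would satisfy the strict-mode check might look like the sketch below. The limit value of 10 is an arbitrary choice for this example and is not part of the original test.

```sql
-- Sketch: order by paired with limit, as strict mode requires.
-- The limit of 10 is an arbitrary value chosen for illustration.
select ymd, price
from stocks
where company = 'aapl'
order by ymd asc
limit 10;
```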

Solution:

set hive.mapred.mode=nonstrict;
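
It can help to check what the session is using before or after changing the mode. In the Hive CLI and Beeline, set followed by a property name and no value prints the current setting. The error message also names hive.strict.checks.cartesian.product, so the last statement below is a narrower alternative, assuming a Hive version that supports that property. Whether it is enough on its own depends on the version and on whether hive.mapred.mode is explicitly set to strict, so treat it as a sketch rather than a guaranteed fix.

```sql
-- Print the current values (a property name with no "=value" shows the setting).
set hive.mapred.mode;
set hive.strict.checks.cartesian.product;

-- Narrower alternative named in the error message:
-- relax only the Cartesian-product check instead of all strict checks.
set hive.strict.checks.cartesian.product=false;
```

Both settings apply only to the current session.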

After that, running the non-equi join again returns the result:

aapl date	ibm date	aapl price	ibm price

2010-01-04	2010-01-04	214.01	132.45
2010-01-04	2010-01-05	214.01	138.85
2010-01-05	2010-01-05	214.38	138.85
2010-01-04	2010-01-06	214.01	129.55
2010-01-05	2010-01-06	214.38	129.55
2010-01-06	2010-01-06	210.97	129.55
2010-01-04	2010-01-07	214.01	130.0
2010-01-05	2010-01-07	214.38	130.0
2010-01-06	2010-01-07	210.97	130.0
2010-01-07	2010-01-07	210.58	130.0
2010-01-04	2010-01-08	214.01	130.85
2010-01-05	2010-01-08	214.38	130.85
2010-01-06	2010-01-08	210.97	130.85
2010-01-07	2010-01-08	210.58	130.85
2010-01-08	2010-01-08	211.58	130.85
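
For an inner join, the same rows can also be requested by moving the inequality into the where clause of an explicit cross join. This is an alternative formulation offered as a sketch, not the query the original test ran. It is still a Cartesian product, so it would presumably need the same nonstrict setting.

```sql
-- Sketch: the non-equi condition expressed as a filter over a cross join.
-- Logically equivalent to the inner join with "on a.ymd <= b.ymd" above.
select a.ymd, b.ymd, a.price, b.price
from stocks a
cross join stocks b
where a.company = 'aapl'
  and b.company = 'ibm'
  and a.ymd <= b.ymd
order by a.ymd asc;
```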

Reposted from www.cnblogs.com/wooluwalker/p/9196863.html