Analytic Functions and Timestamps in Hive

The sample data looks like this; only the createTime and memberId fields are shown.

createTime                    memberId
2017/11/13 2017-11-13 12:00:01 8a9e7bf05d7ec61b015d89e060901ef8
2017/11/13 2017-11-13 12:01:01 8a9f156c5d409b7d015d4566b0f06b06
2017/11/13 2017-11-13 12:02:01 8a9f17655e078b0e015e14a79de02059


Requirement: for rows with the same memberId, how do we show the time interval between consecutive records?


--- Create the test table

create table member_create(memberId String,creatime TIMESTAMP);


-- Insert data

insert into member_create values('8a9e7bf05d7ec61b015d89e060901ef8','2017-11-13 12:00:01');
insert into member_create values('8a9e7bf05d7ec61b015d89e060901ef8','2017-11-13 12:01:01');
insert into member_create values('8a9e7bf05d7ec61b015d89e060901ef8','2017-11-13 12:02:01');


--- Test query

select memberid, creatime, last_creatime,
       case when last_creatime is null then 0
            else unix_timestamp(creatime) - unix_timestamp(last_creatime)
       end as diff_time  -- seconds since the previous row of the same memberid
from (select memberid, creatime,
             lag(creatime) over (partition by memberid order by creatime) as last_creatime
      from member_create) t;

-- Result

8a9e7bf05d7ec61b015d89e060901ef8 2017-11-13 12:00:01 NULL 0
8a9e7bf05d7ec61b015d89e060901ef8 2017-11-13 12:01:01 2017-11-13 12:00:01 60
8a9e7bf05d7ec61b015d89e060901ef8 2017-11-13 12:02:01 2017-11-13 12:01:01 60
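
The CASE expression can probably be avoided by giving lag a default value. The following is a minimal sketch, assuming the same member_create table as above. The third argument of lag is returned when there is no previous row, so passing creatime itself should make the first row of each memberid come out with a difference of 0.

```sql
-- lag(col, offset, default): default is used when no previous row exists in the partition
select memberid,
       creatime,
       unix_timestamp(creatime)
         - unix_timestamp(lag(creatime, 1, creatime)
                            over (partition by memberid order by creatime)) as diff_time
from member_create;
```

This variant drops the last_creatime column from the output. Keep the subquery form if the previous timestamp itself needs to be displayed.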


Note: I have only shown the usage of an offset function (lag) here. In my opinion, the analytic functions in Hive behave much like those in Oracle.
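
The other common offset function is lead, which looks forward instead of backward. As a small sketch against the same member_create table (the alias next_creatime is only an illustrative name), it can be used like this:

```sql
-- lead() returns the value from the following row within the same memberid;
-- the last row of each partition gets NULL because no later row exists
select memberid,
       creatime,
       lead(creatime) over (partition by memberid order by creatime) as next_creatime
from member_create;
```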

Reposted from blog.csdn.net/zhaoxiangchong/article/details/78550231