Hive Study Notes, Part 6: Using the row_number() Ranking Function

Other 2018-06-03 17:26:30

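Sorting comes up often in Hive. Hive provides several ranking functions, and their usage is covered in the relevant documentation; in this project I used row_number(). I will skip the basic usage here and go straight to the project's requirement: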

The table tbl_custinfo is defined as follows:

create table tbl_custinfo(
custno   string,  -- customer number
acctno   string,  -- account number
cardno   string,  -- card number
recdate  string,  -- card approval date
appid    string,  -- card application id
product  string,  -- card product
cardtype string   -- card product at application time
)
partitioned by(pt string)
row format delimited
fields terminated by ','
stored as rcfile;

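The requirement: for each customer, take the card with the earliest card approval date; if the approval dates are equal, take the card with the smallest card application id; if the application ids are also equal, take the card whose product is the same as its product at application time.

Raw data: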

1002 65898 622589785645238 20161101 V9875624201 106  1027------①
1002 65898 655812318977101 20161101 V9875624201 1027 1027------②
1003 12876 688951011942014 20050521 Z1021540014 301  301-------③
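
To reproduce the example, the three rows above can be loaded into the partition that the queries in this post read. This is only a minimal sketch: it assumes a Hive version that supports INSERT ... VALUES (0.14 or later), and the partition value '20161015' is taken from the queries in this post. The original data may well have been loaded some other way.

```sql
-- Sketch: load the three sample rows (in order: ①, ②, ③) into partition pt='20161015'.
-- Assumes Hive 0.14+ for INSERT ... VALUES; not necessarily how the data was originally loaded.
insert into table tbl_custinfo partition (pt = '20161015')
values
  ('1002', '65898', '622589785645238', '20161101', 'V9875624201', '106',  '1027'),
  ('1002', '65898', '655812318977101', '20161101', 'V9875624201', '1027', '1027'),
  ('1003', '12876', '688951011942014', '20050521', 'Z1021540014', '301',  '301');
```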

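At first I did not know how to implement this. Sorting by card approval date and card application id is easy, but preferring the card whose product equals its product at application time is harder to express. My first idea was: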

select * from
  (select *,row_number() over(distribute by custno sort by recdate asc,appid asc,product=cardtype) as rank  from tbl_custinfo where pt='20161015') a
where a.rank=1;

The MapReduce job ran normally without errors:

1002 65898 622589785645238 20161101 V9875624201 106  1027 1------①
1002 65898 655812318977101 20161101 V9875624201 1027 1027 2------②
1003 12876 688951011942014 20050521 Z1021540014 301  301  1------③

However, the ranking for customer 1002 is wrong: ② should be ranked 1. Following a colleague's suggestion, I solved it a different way:

select * from
  (select *,row_number() over(distribute by custno sort by recdate asc,appid asc,case when product=cardtype then '1' else '2' end asc) as rank  from tbl_custinfo where pt='20161015') a
where a.rank=1;

The result is now correct:

1002 65898 655812318977101 20161101 V9875624201 1027 1027 1------②
1002 65898 622589785645238 20161101 V9875624201 106  1027 2------①
1003 12876 688951011942014 20050521 Z1021540014 301  301  1------③
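
A likely explanation for the first attempt's result is that product=cardtype is a boolean sort key and, in ascending order, false sorts before true, so the card whose product does not match came first. That matches the output of the first query, but treat it as an assumption rather than a documented guarantee. The sketch below puts both sort keys next to the row numbers they produce, written with the partition by / order by form that Hive also accepts inside over(). The column aliases same_type, type_key, rn_bool_desc and rn_case are made up for illustration.

```sql
-- Sketch: compare the boolean sort key with the CASE-based key used in the fix.
select custno,
       cardno,
       product,
       cardtype,
       -- key from the first attempt: a boolean
       product = cardtype as same_type,
       -- key from the fix: '1' for a match, '2' otherwise
       case when product = cardtype then '1' else '2' end as type_key,
       -- alternative (assumption): keep the boolean key but sort it descending
       row_number() over (
         partition by custno
         order by recdate asc, appid asc, product = cardtype desc
       ) as rn_bool_desc,
       -- the fix from this post, written with partition by / order by
       row_number() over (
         partition by custno
         order by recdate asc, appid asc,
                  case when product = cardtype then '1' else '2' end asc
       ) as rn_case
from tbl_custinfo
where pt = '20161015';
```

If the assumption about boolean ordering holds, both row-number columns should rank ② ahead of ① for customer 1002. The CASE form has the advantage that the intended order is spelled out in the query and does not depend on how booleans are sorted.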


Reposted from blog.csdn.net/javajxz008/article/details/53493509
