Hive数据仓库之金融业务-业务实战一

业务名称:某金融公司所属营业部月交易量前10的客户的客户

技术描述:

1.数据接入

数据来源于柜台系统中的业务表和凌志营销一体化系统中的表

系统数据库1:

hive_s5_szdb_sz_asset_client
hive_s5_szdb_sz_his_his_deliver
hive_s5_szdb_sz_his_his_fundjour
hive_s5_szdb_sz_his_his_assetdebit

系统数据库2:
hive_s8__t_gxgl_gxmx_query
hive_s8__t_ehr_jjr_jbxx

2.hive数据仓库中业务sql编写

基于星型模型开发

事实表关联维度表

业务准备:

1.月交易量=收入=佣金+发生金额+融资应付利息

--当期佣金fare0 
with a as (select branch_no,client_id from hive.hive_s5_szdb_sz_asset_client where hive_p_date=20190531 and branch_no in (1,2))
with b as (select branch_no,client_id,fund_account,client_name,sum(fare0)value from hive.hive_s5_szdb_sz_his_his_deliver where hive_p_date between 20190501 and 20190531 and branch_no in(1,2) and asset_prop = 7 group by branch_no,client_id,fund_account,client_name order by sum(fare0)desc)
--当期发生金额occur_balance(默认为负值)
,c as (select branch_no,client_id,fund_account,max(client_name)client_name,sum(-occur_balance)value from hive.hive_s5_szdb_sz_his_his_fundjour where hive_p_date between 20190501 and 20190531 and branch_no in(1,2) and business_flag in(2701,2712,2715,2718,2721,2724,2732,2735,2738,2741,2744,2769,2776,2778,2779,2794,2795) and asset_prop = 7 group by branch_no,client_id,fund_account order by sum(-occur_balance) desc)
--当期末应收利费收入-上期末应收利费收入=融资应付利息 fin_pre_interest
,d as (select a.branch_no,a.client_id,a.client_name,a.fund_account,(value1-value2)value from
 (select branch_no,client_id,fund_account,max(client_name)client_name,sum(fin_pre_interest) value1 from hive.hive_s5_szdb_sz_his_his_assetdebit where hive_p_date = 20190531 and branch_no in (1,2) group by branch_no,client_id,fund_account order by branch_no,client_id)a
left join(select branch_no,client_id,fund_account,max(client_name)client_name,sum(fin_pre_interest) value2 from hive.hive_s5_szdb_sz_his_his_assetdebit where hive_p_date = 20181231 and branch_no in (1,2) group by branch_no,client_id,fund_account order by branch_no,client_id)b on a.branch_no=b.branch_no and a.client_id=b.client_id order by branch_no,(value1-value2)desc)
--,c as (select branch_no,client_id,fund_account,max(client_name)client_name,sum(fin_pre_interest)value1 from hive.hive_s5_szdb_sz_his_his_assetdebit where hive_p_date between 20190501 and 20190531 and branch_no in (1,2) group by branch_no,client_id,fund_account order by branch_no,client_id,fund_account)
--客户服务关系
,e as (select a.ryid,a.khid,a.gxlxbh,b.xm from (
select * from (select ryid,khid,case when gxlxbh=2 then '服务关系' else '@' end gxlxbh from hive.hive_s8__t_gxgl_gxmx_query where hive_p_date=20190507)where gxlxbh !='@')a 
join (select ryid,xm from hive.hive_s8__t_ehr_jjr_jbxx where hive_p_date=20190507)b on a.ryid=b.ryid)
/* 查看中间状态结果select a.branch_no,a.client_id,
nvl(nvl(b.client_name,c.client_name),d.client_name)client_name,
nvl(nvl(b.fund_account,c.fund_account),d.fund_account)fund_account,
(nvl(b.value,0)+nvl(c.value,0)+nvl(d.value,0))value1,xm,gxlxbh
from a left join b on a.branch_no = b.branch_no and a.client_id=b.client_id
left join c on a.branch_no=c.branch_no and a.client_id=c.client_id
left join d on a.branch_no=d.branch_no and a.client_id=d.client_id  
left join e on a.client_id=e.khid 
order by value1 desc;*/
--目前里面都是信用账户的业务(后续若有变动请调整)
,f as (select a.branch_no,a.client_id,
nvl(nvl(b.client_name,c.client_name),d.client_name)client_name,
nvl(nvl(b.fund_account,c.fund_account),d.fund_account)fund_account,
(nvl(b.value,0)+nvl(c.value,0)+nvl(d.value,0))value1,xm,gxlxbh
from a left join b on a.branch_no = b.branch_no and a.client_id=b.client_id
left join c on a.branch_no=c.branch_no and a.client_id=c.client_id
left join d on a.branch_no=d.branch_no and a.client_id=d.client_id  
left join e on a.client_id=e.khid
order by value1 desc)
--去重,每个营业部输出10条交易量前10的客户数据
,g as (select *,row_number() over(partition by branch_no order by value1 desc)rid from f where value1>0)
select branch_no,client_id,client_name,fund_account,value1,rid,201904,xm,gxlxbh from g where rid<=10;

3.将sql封装成 .py脚本

1)脚本运行 python xxx.py 日期

2)配置到大数据管理平台上之后,平台默认调度设置当天跑前一天的任务

扫描二维码关注公众号,回复: 11884214 查看本文章

4.数据出库到报表平台后台数据库

1)将业务sql的结果数据在报表平台进行个性化(表格,键值对,环装图,雷达图...)展示,大屏展示

2)报表平台的相关操作

3)该报表的权限配置,指定用户可见,相应报表权限开放给对应的需求部门

5.任务管理

1)配置脚本的的依赖视图,注意:每一个步骤都是任务,有相应的任务id

入库->指标计算->统计指标->指标出库到某报表平台后台数据库xxxreport中相关表

2)设置任务的日期运行参数,是否只是交易日跑数据

3)第二天如果运行有错误日志则根据日志定向查找并分析具体原因

6.调度配置

1)任务时间设置

在每天3:30调起任务,运行屏输出运行日志

7.数据核查

1)大数据管理平台仪表盘显示每天所有接入,计算,统计,出库等队列的失败任务

2)根据平台监控仪表盘可查看个人失败任务,如此任务失败则根据日志定向查找原因,否则就没有啦,该任务运行的很good

猜你喜欢

转载自blog.csdn.net/ALIVEE/article/details/89978177