hive (wzj)> desc function extended concat_ws;
OK
tab_name
concat_ws(separator, [string | array(string)]+) - returns the concatenation of the strings separated by the separator.
Example:
> SELECT concat_ws('.', 'www', array('facebook', 'com')) FROM src LIMIT 1;
'www.facebook.com'
Time taken: 0.029 seconds, Fetched: 4 row(s)
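A small sketch of two behaviours that the description above only hints at: string and array arguments can be mixed freely, and, as far as I know, NULL string arguments are skipped instead of turning the whole result into NULL. The literals are made up for illustration, and the FROM-less SELECT assumes Hive 0.13 or later.

```sql
-- Mixing plain strings and an array in one call.
SELECT concat_ws('-', 'a', array('b', 'c'));

-- A NULL string argument is expected to be skipped, not propagated.
SELECT concat_ws('-', 'a', CAST(NULL AS STRING), 'b');
```

This differs from concat, which returns NULL as soon as any argument is NULL.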
concat:
hive (wzj)> desc function extended concat;
OK
tab_name
concat(str1, str2, ... strN) - returns the concatenation of str1, str2, ... strN or concat(bin1, bin2, ... binN) - returns the concatenation of bytes in binary data bin1, bin2, ... binN
Returns NULL if any argument is NULL.
Example:
> SELECT concat('abc', 'def') FROM src LIMIT 1;
'abcdef'
Time taken: 0.021 seconds, Fetched: 5 row(s)
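The "Returns NULL if any argument is NULL" rule is easy to trip over with nullable columns. A minimal sketch, using made-up literals and assuming a Hive version that allows SELECT without FROM (0.13 or later):

```sql
-- One NULL argument makes the whole result NULL.
SELECT concat('abc', CAST(NULL AS STRING), 'def');

-- Wrapping the nullable argument in coalesce keeps the rest of the string.
SELECT concat('abc', coalesce(CAST(NULL AS STRING), ''), 'def');
```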
collect_set:
hive (wzj)> desc function extended collect_set;
OK
tab_name
collect_set(x) - Returns a set of objects with duplicate elements eliminated
Time taken: 0.042 seconds, Fetched: 1 row(s)
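To see what "duplicate elements eliminated" means in practice, collect_set can be compared with collect_list, which keeps duplicates. This sketch assumes the hzl_concat_ws table created in the implementation section further down and a Hive version that ships collect_list (0.13 or later); the column aliases are only illustrative.

```sql
SELECT grade,
       collect_set(dept)  AS distinct_depts,  -- each dept appears once
       collect_list(dept) AS all_depts        -- one entry per input row
FROM hzl_concat_ws
GROUP BY grade;
```

Neither function guarantees the order of the elements in the resulting array.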
Requirement
ck,dept01,A
jerry,dept02,A
wzj,dept01,A
qwe,dept02,A
asd,dept03,A
==>
dept01 A ck|wzj
dept02 A jerry|qwe
dept03 A asd
Implementation
hive (wzj)> create table hzl_concat_ws(name string, dept string, grade string) row format delimited fields terminated by ',';
OK
Time taken: 0.166 seconds
hive (wzj)> load data local inpath '/home/wzj/data/dept.txt' into table hzl_concat_ws;
Loading data to table wzj.hzl_concat_ws
Table wzj.hzl_concat_ws stats: [numFiles=1, numRows=0, totalSize=66, rawDataSize=0]
OK
Time taken: 0.621 seconds
hive (wzj)> select * from hzl_concat_ws;
OK
hzl_concat_ws.name hzl_concat_ws.dept hzl_concat_ws.grade
ck dept01 A
jerry dept02 A
wzj dept01 A
qwe dept02 A
asd dept03 A
Time taken: 0.09 seconds, Fetched: 5 row(s)
hive (wzj)> select t.grade_concat,concat_ws('|',t.name) from(
> select name,concat(dept,',',grade) as grade_concat from hzl_concat_ws) t
> group by t.grade_concat;
FAILED: SemanticException [Error 10002]: Line 1:38 Invalid column reference 'name'
The query fails because name is neither a GROUP BY column nor wrapped in an aggregate function. Wrapping it in collect_set gathers the names of each group into an array, which concat_ws can then join:
hive (wzj)> select t.grade_concat,concat_ws('|',collect_set(t.name)) from(
> select name,concat(dept,',',grade) as grade_concat from hzl_concat_ws) t
> group by t.grade_concat
> ;
Ended Job = job_1585756506667_0001
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 5.49 sec HDFS Read: 8655 HDFS Write: 48 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 490 msec
OK
t.grade_concat _c1
dept01,A ck|wzj
dept02,A jerry|qwe
dept03,A asd
Time taken: 39.233 seconds, Fetched: 3 row(s)
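The same result can probably be obtained without the subquery by grouping on dept and grade directly. The sketch below is an alternative, not the query that was run above: it adds sort_array so the order of names inside each group is deterministic, and it names the output column (the alias names is an assumption) instead of leaving it as _c1.

```sql
SELECT dept,
       grade,
       -- collect_set gives an array<string>; sort_array fixes its order
       concat_ws('|', sort_array(collect_set(name))) AS names
FROM hzl_concat_ws
GROUP BY dept, grade;
```

Here dept and grade come back as two separate columns rather than one comma-joined string.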
Columns to rows
explode:
hive (wzj)> desc function extended explode;
OK
tab_name
explode(a) - separates the elements of array a into multiple rows, or the elements of a map into multiple rows and columns
Time taken: 0.019 seconds, Fetched: 1 row(s)
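In other words, explode turns each element of an array into its own row, and each entry of a map into a row with separate key and value columns.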
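The map case is not exercised anywhere in these notes, so here is a minimal sketch of it. The map literal and the aliases map_key and map_value are made up for illustration, and the FROM-less SELECT assumes Hive 0.13 or later.

```sql
-- Each map entry becomes one row with two columns.
SELECT explode(map('a', 1, 'b', 2)) AS (map_key, map_value);
```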
Requirement (the subjects are 语文 = Chinese, 数学 = math, 英语 = English, 物理 = physics):
jack 语文,数学,英语
jerry 语文,数学
wzj 英语,物理
==>
jack 语文
jack 数学
jack 英语
...
hive (wzj)> create table lzh_array(name string,loaction array<string>) row format delimited fields terminated by '\t' collection items terminated by ',';
OK
Time taken: 0.086 seconds
hive (wzj)> load data local inpath '/home/wzj/data/lzh_arrat.txt' into table lzh_array;
Loading data to table wzj.lzh_array
Table wzj.lzh_array stats: [numFiles=1, numRows=0, totalSize=64, rawDataSize=0]
OK
Time taken: 0.354 seconds
hive (wzj)> select * from lzh_array;
OK
lzh_array.name lzh_array.loaction
jack ["语文","数学","英语"]
jerry ["语文","数学"]
wzj ["英语","物理"]
Time taken: 0.111 seconds, Fetched: 3 row(s)
hive (wzj)> select explode(loaction) from lzh_array;
OK
col
语文
数学
英语
语文
数学
英语
物理
Time taken: 0.104 seconds, Fetched: 7 row(s)
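On its own, explode loses the link between a subject and the person it belongs to, and a UDTF has to be the only expression in the SELECT list. A short sketch of both points, where the alias subject is just an illustrative name:

```sql
-- Naming the generated column instead of relying on the default "col".
SELECT explode(loaction) AS subject FROM lzh_array;

-- Adding another column next to the UDTF is rejected by Hive:
-- SELECT name, explode(loaction) FROM lzh_array;
```

This restriction is the reason a lateral view is needed to keep name next to each exploded value.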
hive (wzj)> select name,subject from lzh_array lateral view explode(loaction) tmp as subject;
OK
name subject
jack 语文
jack 数学
jack 英语
jerry 语文
jerry 数学
wzj 英语
wzj 物理
Time taken: 0.084 seconds, Fetched: 7 row(s)
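Two variations on the lateral view that may be useful, sketched against the same lzh_array table. They were not run as part of these notes; LATERAL VIEW OUTER assumes Hive 0.12 or later and posexplode assumes Hive 0.13 or later.

```sql
-- OUTER keeps rows whose array is empty or NULL, with subject set to NULL;
-- a plain lateral view would drop those rows.
SELECT name, subject
FROM lzh_array
LATERAL VIEW OUTER explode(loaction) tmp AS subject;

-- posexplode also returns the 0-based position of each element.
SELECT name, pos, subject
FROM lzh_array
LATERAL VIEW posexplode(loaction) tmp AS pos, subject;
```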