Ways to load data in Hive:
1. Load local data into a Hive table:
load data local inpath '/opt/hive-0.13.1/emp.txt' into table emp ;
hive> load data local inpath '/opt/hive-0.13.1/emp.txt' into table emp ;
Copying data from file:/opt/hive-0.13.1/emp.txt
Copying file: file:/opt/hive-0.13.1/emp.txt
Loading data to table db_hive_0927.emp
Table db_hive_0927.emp stats: [numFiles=1, numRows=0, totalSize=656, rawDataSize=0]
OK
Time taken: 4.471 seconds
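A minimal sketch of this step from scratch, assuming the emp table does not exist yet. The column list follows the desc output for emp_ins (created like emp) at the end of these notes. The tab delimiter is an assumption about emp.txt, not something the notes state.

```sql
use db_hive_0927;

-- Assumed table definition; columns taken from the desc output of emp_ins.
create table if not exists emp (
  empno    int,
  ename    string,
  job      string,
  mgr      int,
  hiredate string,
  sal      double,
  comm     double,
  deptno   int
)
row format delimited fields terminated by '\t';  -- assumption: emp.txt is tab-separated

-- "local" means the path is on the machine running the Hive client.
-- The file is copied into the table directory; the local original stays in place.
load data local inpath '/opt/hive-0.13.1/emp.txt' into table emp;
```

The "Copying data from file:..." lines in the transcript above correspond to this copy.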
2. Load data from HDFS into a table:
load data inpath '/emp.txt' into table emp ;
hive> load data inpath '/emp.txt' into table emp ;
Loading data to table db_hive_0927.emp
Table db_hive_0927.emp stats: [numFiles=2, numRows=0, totalSize=1312, rawDataSize=0]
OK
Time taken: 11.055 seconds
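Unlike a local load, loading from an HDFS path moves the file into the table directory instead of copying it. A sketch that makes this visible with the Hive CLI's dfs command follows. The table directory is an assumption: it presumes emp is a managed table under the same warehouse root that appears in the create table ... as select output.

```sql
-- Upload the file to the HDFS root (dfs commands run inside the Hive CLI).
dfs -put /opt/hive-0.13.1/emp.txt /emp.txt;

-- No "local" keyword: the path is resolved on HDFS.
load data inpath '/emp.txt' into table emp;

-- After the load, emp.txt should no longer be listed under / ...
dfs -ls /;

-- ... because it now sits in the table directory (assumed path).
dfs -ls /user/hive/warehouse/db_hive_0927.db/emp;
```

Since into table without overwrite appends, the table now holds two files, which matches numFiles=2 in the stats line above.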
3. Load data when creating a table:
as select and like
create table db_hive_0927.emp_load as select empno, ename, deptno from db_hive_0927.emp ;
hive> create table db_hive_0927.emp_load as select empno, ename, deptno from db_hive_0927.emp ;
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1445142802171_0006, Tracking URL = http://cloud1:8088/proxy/application_1445142802171_0006/
Kill Command = /opt/hadoop/bin/hadoop job -kill job_1445142802171_0006
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
2015-10-19 08:49:57,599 Stage-1 map = 0%, reduce = 0%
2015-10-19 08:50:58,584 Stage-1 map = 0%, reduce = 0%
2015-10-19 08:51:28,175 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 5.75 sec
2015-10-19 08:52:28,757 Stage-1 map = 50%, reduce = 0%, Cumulative CPU 6.66 sec
2015-10-19 08:52:59,494 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 12.5 sec
MapReduce Total cumulative CPU time: 12 seconds 500 msec
Ended Job = job_1445142802171_0006
Stage-4 is filtered out by condition resolver.
Stage-3 is selected by condition resolver.
Stage-5 is filtered out by condition resolver.
Launching Job 3 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1445142802171_0007, Tracking URL = http://cloud1:8088/proxy/application_1445142802171_0007/
Kill Command = /opt/hadoop/bin/hadoop job -kill job_1445142802171_0007
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2015-10-19 08:54:11,395 Stage-3 map = 0%, reduce = 0%
2015-10-19 08:54:26,512 Stage-3 map = 100%, reduce = 0%, Cumulative CPU 2.43 sec
MapReduce Total cumulative CPU time: 2 seconds 430 msec
Ended Job = job_1445142802171_0007
Moving data to: hdfs://cluster/user/hive/warehouse/db_hive_0927.db/emp_load
Table db_hive_0927.emp_load stats: [numFiles=1, numRows=0, totalSize=392, rawDataSize=0]
MapReduce Jobs Launched:
Job 0: Map: 2 Cumulative CPU: 13.36 sec HDFS Read: 1751 HDFS Write: 574 SUCCESS
Job 1: Map: 1 Cumulative CPU: 2.43 sec HDFS Read: 761 HDFS Write: 392 SUCCESS
Total MapReduce CPU Time Spent: 15 seconds 790 msec
OK
Time taken: 378.773 seconds
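With create table ... as select, the new table takes its column names and types from the select list, and the query result becomes its data. A hedged variation that also sets the storage format follows. The table name emp_load_tab and the where filter are hypothetical, chosen only for illustration.

```sql
-- Hypothetical target table; schema (empno, ename, deptno) is derived from the query.
create table db_hive_0927.emp_load_tab
row format delimited fields terminated by '\t'   -- optional: explicit field delimiter
as
select empno, ename, deptno
from db_hive_0927.emp
where deptno is not null;                        -- illustrative filter, not from the notes
```

As in the transcript above, this runs as MapReduce jobs, so expect it to take far longer than a plain load.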
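Load data via like and insert when creating a table (like copies only the table structure; the rows are added afterwards with insert):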
hive> create table db_hive_0927.emp_ins like db_hive_0927.emp ;
OK
Time taken: 0.505 seconds
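The transcript above only creates the empty table. A minimal sketch of the follow-up insert, using the same table names as the notes:

```sql
-- like copies only the schema, so emp_ins starts out empty.
create table if not exists db_hive_0927.emp_ins like db_hive_0927.emp;

-- Append the rows of emp to emp_ins.
insert into table db_hive_0927.emp_ins
select * from db_hive_0927.emp;

-- Or replace whatever emp_ins currently holds.
insert overwrite table db_hive_0927.emp_ins
select * from db_hive_0927.emp;
```

Run only one of the two insert statements, depending on whether existing rows should be kept.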
4. Overwrite the data in an existing table:
load data inpath '/emp.txt' overwrite into table emp ;
hive> load data inpath '/emp.txt' overwrite into table emp ;
FAILED: SemanticException Line 1:17 Invalid path ''/emp.txt'': No files matching path hdfs://cluster/emp.txt
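The failure above is most likely a side effect of step 2: load data inpath moves the source file into the table directory, so /emp.txt no longer exists on HDFS. A sketch of two ways to retry, assuming the local copy is still at /opt/hive-0.13.1/emp.txt:

```sql
-- Option A: put the file back on HDFS, then overwrite from there.
dfs -put /opt/hive-0.13.1/emp.txt /emp.txt;
load data inpath '/emp.txt' overwrite into table emp;

-- Option B: overwrite straight from the local file system.
load data local inpath '/opt/hive-0.13.1/emp.txt' overwrite into table emp;
```

With overwrite, the files already in the table are removed before the new file is added, so the table ends up with only the newly loaded data.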
View the table structure (desc lists the columns, not the rows):
hive> desc db_hive_0927.emp_ins;
OK
empno int
ename string
job string
mgr int
hiredate string
sal double
comm double
deptno int
Time taken: 0.587 seconds, Fetched: 8 row(s)
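To look at the rows themselves rather than the schema, a select can be used. A small sketch follows; note that emp_ins stays empty until an insert has been run against it.

```sql
-- Peek at the first few rows.
select * from db_hive_0927.emp_ins limit 5;

-- Count the rows; unlike the query above, this launches a MapReduce job.
select count(*) from db_hive_0927.emp_ins;
```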