Ways to load data in Hive


1. Load local data into a Hive table:

load data local inpath '/opt/hive-0.13.1/emp.txt' into table emp ;

hive> load data local inpath '/opt/hive-0.13.1/emp.txt' into table emp ;
Copying data from file:/opt/hive-0.13.1/emp.txt
Copying file: file:/opt/hive-0.13.1/emp.txt
Loading data to table db_hive_0927.emp
Table db_hive_0927.emp stats: [numFiles=1, numRows=0, totalSize=656, rawDataSize=0]
OK
Time taken: 4.471 seconds
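
LOAD DATA is a pure copy/move operation: Hive does not transform the rows while loading, so the table's declared row format has to match the file. A minimal sketch of a matching table definition follows. The column list is taken from the desc output near the end of this post; the tab delimiter is an assumption about emp.txt, not something the original states.

```sql
-- Sketch only: the field delimiter is an assumed value; adjust it to the real file.
create table if not exists db_hive_0927.emp (
  empno    int,
  ename    string,
  job      string,
  mgr      int,
  hiredate string,
  sal      double,
  comm     double,
  deptno   int
)
row format delimited fields terminated by '\t'
stored as textfile;

-- LOCAL reads from the local file system (for the Hive CLI, the machine
-- where the CLI runs) and copies the file into the table's directory.
load data local inpath '/opt/hive-0.13.1/emp.txt' into table db_hive_0927.emp;
```

If the delimiter does not match the file, the load still succeeds, but queries will typically return NULL for columns that cannot be parsed.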

2. Load data from HDFS into a table (the source file is moved into the table's directory, not copied):

load data inpath '/emp.txt' into table emp ;

hive> load data inpath '/emp.txt' into table emp ;
Loading data to table db_hive_0927.emp
Table db_hive_0927.emp stats: [numFiles=2, numRows=0, totalSize=1312, rawDataSize=0]
OK
Time taken: 11.055 seconds
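
Without LOCAL, the path refers to HDFS, and Hive moves the file into the table's directory rather than copying it. One way to observe this from the Hive CLI is the built-in dfs command. The warehouse path below is inferred from the emp_load location printed in the section 3 transcript and assumes emp is a managed table in the default location.

```sql
-- Before the load: the source file is at the HDFS root.
dfs -ls /emp.txt;

load data inpath '/emp.txt' into table emp;

-- After the load: the source path should no longer exist ...
dfs -ls /emp.txt;
-- ... and the file should now sit under the table directory (assumed path).
dfs -ls /user/hive/warehouse/db_hive_0927.db/emp;
```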

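3. Load data when creating a table:

There are two variants: create table ... as select, which copies the selected data into the new table, and create table ... like, which copies only the table definition.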

create table db_hive_0927.emp_load as select empno, ename, deptno from db_hive_0927.emp ;

hive> create table db_hive_0927.emp_load as  select empno, ename, deptno from db_hive_0927.emp ;
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1445142802171_0006, Tracking URL = http://cloud1:8088/proxy/application_1445142802171_0006/
Kill Command = /opt/hadoop/bin/hadoop job  -kill job_1445142802171_0006
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 0
2015-10-19 08:49:57,599 Stage-1 map = 0%,  reduce = 0%
2015-10-19 08:50:58,584 Stage-1 map = 0%,  reduce = 0%
2015-10-19 08:51:28,175 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 5.75 sec
2015-10-19 08:52:28,757 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 6.66 sec
2015-10-19 08:52:59,494 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 12.5 sec
MapReduce Total cumulative CPU time: 12 seconds 500 msec
Ended Job = job_1445142802171_0006
Stage-4 is filtered out by condition resolver.
Stage-3 is selected by condition resolver.
Stage-5 is filtered out by condition resolver.
Launching Job 3 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1445142802171_0007, Tracking URL = http://cloud1:8088/proxy/application_1445142802171_0007/
Kill Command = /opt/hadoop/bin/hadoop job  -kill job_1445142802171_0007
Hadoop job information for Stage-3: number of mappers: 1; number of reducers: 0
2015-10-19 08:54:11,395 Stage-3 map = 0%,  reduce = 0%
2015-10-19 08:54:26,512 Stage-3 map = 100%,  reduce = 0%, Cumulative CPU 2.43 sec
MapReduce Total cumulative CPU time: 2 seconds 430 msec
Ended Job = job_1445142802171_0007
Moving data to: hdfs://cluster/user/hive/warehouse/db_hive_0927.db/emp_load
Table db_hive_0927.emp_load stats: [numFiles=1, numRows=0, totalSize=392, rawDataSize=0]
MapReduce Jobs Launched: 
Job 0: Map: 2   Cumulative CPU: 13.36 sec   HDFS Read: 1751 HDFS Write: 574 SUCCESS
Job 1: Map: 1   Cumulative CPU: 2.43 sec   HDFS Read: 761 HDFS Write: 392 SUCCESS
Total MapReduce CPU Time Spent: 15 seconds 790 msec
OK
Time taken: 378.773 seconds

Creating a table with like, then loading data into it with insert:

hive> create table db_hive_0927.emp_ins like db_hive_0927.emp ;
OK
Time taken: 0.505 seconds
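
create table ... like copies only the table definition, so emp_ins starts out empty. The insert step is not shown in the original transcript; a minimal sketch of what it might look like:

```sql
-- Append all rows of emp to the empty emp_ins table.
insert into table db_hive_0927.emp_ins
select * from db_hive_0927.emp;

-- Alternatively, replace whatever emp_ins currently holds.
insert overwrite table db_hive_0927.emp_ins
select * from db_hive_0927.emp;
```

Only one of the two statements is needed; running both would leave a single copy of the data, because the second one overwrites the result of the first.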

4. Overwrite the existing data in a table:

load data inpath '/emp.txt' overwrite into table emp ;

hive> load data inpath '/emp.txt' overwrite into table emp ;
FAILED: SemanticException Line 1:17 Invalid path ''/emp.txt'': No files matching path hdfs://cluster/emp.txt
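
The failure above is most likely a consequence of step 2: a non-local load moves its source file, so /emp.txt no longer exists in HDFS. One overwrite that should work is to reload from the local copy, which the LOCAL load in step 1 left in place (assuming the file has not been deleted since):

```sql
-- OVERWRITE deletes the table's current contents and replaces them with the loaded file.
load data local inpath '/opt/hive-0.13.1/emp.txt' overwrite into table emp;
```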

View the table structure:

hive> desc db_hive_0927.emp_ins;
OK
empno               	int                 	                    
ename               	string              	                    
job                 	string              	                    
mgr                 	int                 	                    
hiredate            	string              	                    
sal                 	double              	                    
comm                	double              	                    
deptno              	int                 	                    
Time taken: 0.587 seconds, Fetched: 8 row(s)
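
desc shows only the schema. To look at the rows themselves, queries such as the following could be used (the limit value is arbitrary). Note that emp_ins returns no rows until data has been inserted into it, since create table ... like does not copy data.

```sql
-- Show a few rows of the table.
select * from db_hive_0927.emp_ins limit 5;

-- Count the rows in the table.
select count(*) from db_hive_0927.emp_ins;
```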
