After installing Hadoop, we need to become familiar with operating the HDFS distributed file storage component from the command line. HDFS is the distributed file system of the Hadoop big-data platform; it provides data storage for applications and for other big-data components such as Hive, MapReduce, Spark, and HBase.
1 System, software, and prerequisites
- CentOS-7 64-bit
- To avoid permission problems for linux beginners, all commands are run as the linux root user.
- Hadoop-2.5.2 is installed: https://www.jianshu.com/p/5707c5ccd85b
2 Basic Operations
Note: in a command such as "# ./hdfs dfs -ls /", the leading "#" indicates that the currently logged-in user is root. The commands are run from the bin directory inside the Hadoop installation folder, and "./" refers to that current directory. The info file used in the commands below must be created in advance. Examples are shown below:
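If running ./hdfs from the bin directory every time is inconvenient, the Hadoop bin directory can be added to PATH. This is a minimal sketch; the installation path below is an assumption and should be adjusted to the actual install location:

```shell
# Assumed install location of Hadoop-2.5.2; adjust to your environment
export HADOOP_HOME=/usr/local/hadoop-2.5.2
# Prepend the bin directory so "hdfs dfs ..." works from any directory
export PATH="$HADOOP_HOME/bin:$PATH"
```

Placing these two lines in /etc/profile or ~/.bashrc makes the change permanent for the root user.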
2.1 -ls Function: display directory information. Each line of the listing shows the permissions, the replication factor ("-" for a directory), the owner, the group, the size in bytes, the modification time, and the path.
# ./hdfs dfs -ls /
drwxr-xr-x - root supergroup 0 2018-07-30 00:09 /hbase
drwxr-xr-x - root supergroup 0 2018-06-23 15:22 /output
drwx------ - root supergroup 0 2018-07-31 00:32 /tmp
drwxr-xr-x - root supergroup 0 2018-07-31 00:41 /user
-rw-r--r-- 2 root supergroup 77 2018-04-22 02:34 /wordcount
2.2 -mkdir Function: create a directory on the HDFS file system.
# ./hdfs dfs -mkdir /wanhe
# ./hdfs dfs -ls /
drwxr-xr-x - root supergroup 0 2018-07-30 00:09 /hbase
drwxr-xr-x - root supergroup 0 2018-06-23 15:22 /output
drwx------ - root supergroup 0 2018-07-31 00:32 /tmp
drwxr-xr-x - root supergroup 0 2018-07-31 00:41 /user
drwxr-xr-x - root supergroup 0 2018-09-12 18:00 /wanhe
-rw-r--r-- 2 root supergroup 77 2018-04-22 02:34 /wordcount
2.3 -put Function: upload a local file to a specified HDFS directory.
# ./hdfs dfs -put info /wanhe
# ./hdfs dfs -ls /wanhe
-rw-r--r-- 2 root supergroup 38 2018-09-12 18:10 /wanhe/info
2.4 -get Function: download a file from HDFS to the local file system.
# rm -rf info
# ls
container-executor hadoop hadoop.cmd hdfs hdfs.cmd mapred mapred.cmd rcc test-container-executor yarn yarn.cmd
# ./hdfs dfs -get /wanhe/info ./
# ls
container-executor hadoop hadoop.cmd hdfs hdfs.cmd info mapred mapred.cmd rcc test-container-executor yarn yarn.cmd
2.5 -rm Function: delete files from HDFS.
# ./hdfs dfs -rm /wanhe/info
# ./hdfs dfs -ls /wanhe
(empty)
2.6 -moveFromLocal Function: move (cut) a local file to HDFS.
# ./hdfs dfs -moveFromLocal info /wanhe
# ./hdfs dfs -ls /wanhe
-rw-r--r-- 2 root supergroup 38 2018-09-12 22:04 /wanhe/info
# ls
container-executor hadoop hadoop.cmd hdfs hdfs.cmd mapred mapred.cmd rcc test-container-executor yarn yarn.cmd
2.7 -cat Function: display file contents.
# ./hdfs dfs -cat /wanhe/info
jiangsuwanhe
2.8 -appendToFile Function: append data to the end of a file.
# ./hdfs dfs -appendToFile info /wanhe/info
# ./hdfs dfs -cat /wanhe/info
jiangsuwanhe
jiangsuwanhe
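-appendToFile appends the contents of a local file to an existing HDFS file, mirroring ordinary append behaviour on a local file system. The local sketch below illustrates the same effect without a cluster; the file names are illustrative:

```shell
printf 'jiangsuwanhe\n' > info_local   # local source file, like info above
cp info_local info_copy                # stands in for the HDFS target file
cat info_local >> info_copy            # local analogue of -appendToFile
cat info_copy                          # the line now appears twice
```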
2.9 -chmod Function: change file permissions.
# ./hdfs dfs -ls /wanhe
-rw-r--r-- 2 root supergroup 51 2018-09-12 22:13 /wanhe/info
# ./hdfs dfs -chmod 777 /wanhe/info
# ./hdfs dfs -ls /wanhe
-rwxrwxrwx 2 root supergroup 51 2018-09-12 22:13 /wanhe/info
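HDFS permissions use the same octal notation as linux: 777 grants read, write, and execute to the owner, the group, and everyone else, which is why the mode string changes from -rw-r--r-- to -rwxrwxrwx above. The mapping can be checked on a local file (the file name is illustrative):

```shell
touch demo_file            # create an empty local file
chmod 777 demo_file        # same octal notation as hdfs dfs -chmod
ls -l demo_file            # mode column shows -rwxrwxrwx
stat -c %a demo_file       # prints the mode back in octal: 777
```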
2.10 -cp Function: copy files.
Copy /wanhe/info to /tmp:
# ./hdfs dfs -cp /wanhe/info /tmp/
# ./hdfs dfs -ls /tmp
-rw-r--r-- 2 root supergroup 51 2018-09-12 22:20 /tmp/info
2.11 -mv Function: move files.
Move /wanhe/info to /user:
# ./hdfs dfs -mv /wanhe/info /user/
# ./hdfs dfs -ls /wanhe
(empty)
# ./hdfs dfs -ls /user
-rwxrwxrwx 2 root supergroup 51 2018-09-12 22:13 /user/info
2.12 -df Function: display free-space statistics for the file system.
# ./hdfs dfs -df -h /
Filesystem Size Used Available Use%
hdfs://master:9000 17.5 G 352 K 11.4 G 0%
2.13 -du Function: display the sizes of files and folders.
# ./hdfs dfs -du /user
51 /user/info
2.14 -count Function: count the directories, files, and bytes under a specified directory.
# ./hdfs dfs -count /user
2 1 51 /user
The first column, 2, is the number of directories under /user/; the second column, 1, is the number of files under /user/; 51 is the total disk space used by all files under /user/ (not counting replicas).
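The three columns of -count (directory count, file count, total bytes) can be reproduced on a local directory tree with find and wc; note that, like -count, find includes the top-level directory itself in the directory count. The directory and file names below are illustrative:

```shell
mkdir -p countdemo/sub                      # one subdirectory
printf 'jiangsuwanhe\n' > countdemo/info    # one file, 13 bytes
find countdemo -type d | wc -l              # directory count: prints 2
find countdemo -type f | wc -l              # file count: prints 1
find countdemo -type f -exec cat {} + | wc -c   # total bytes: prints 13
```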
3 Summary
The HDFS command line works much like the linux command line; anyone proficient with linux commands can quickly become proficient with HDFS commands.