1 Three styles of HDFS shell commands:
• hadoop fs: works with any supported file system, e.g. the local file system as well as HDFS
• hadoop dfs: applies only to HDFS; in Hadoop 2.x it is deprecated in favor of hdfs dfs
• hdfs dfs: has the same effect as hadoop dfs and likewise applies only to HDFS; it is the recommended form
2 Viewing help
[hadoop@hadoop102 hadoop-2.7.2]$ hadoop fs
Usage: hadoop fs [generic options]
[-appendToFile <localsrc> ... <dst>]
[-cat [-ignoreCrc] <src> ...]
[-checksum <src> ...]
[-chgrp [-R] GROUP PATH...]
[-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
[-chown [-R] [OWNER][:[GROUP]] PATH...]
[-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
[-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-count [-q] [-h] <path> ...]
[-cp [-f] [-p | -p[topax]] <src> ... <dst>]
[-createSnapshot <snapshotDir> [<snapshotName>]]
[-deleteSnapshot <snapshotDir> <snapshotName>]
[-df [-h] [<path> ...]]
[-du [-s] [-h] <path> ...]
[-expunge]
[-find <path> ... <expression> ...]
[-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
[-getfacl [-R] <path>]
[-getfattr [-R] {-n name | -d} [-e en] <path>]
[-getmerge [-nl] <src> <localdst>]
[-help [cmd ...]]
[-ls [-d] [-h] [-R] [<path> ...]]
[-mkdir [-p] <path> ...]
[-moveFromLocal <localsrc> ... <dst>]
[-moveToLocal <src> <localdst>]
[-mv <src> ... <dst>]
[-put [-f] [-p] [-l] <localsrc> ... <dst>]
[-renameSnapshot <snapshotDir> <oldName> <newName>]
[-rm [-f] [-r|-R] [-skipTrash] <src> ...]
[-rmdir [--ignore-fail-on-non-empty] <dir> ...]
[-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
[-setfattr {-n name [-v value] | -x name} <path>]
[-setrep [-R] [-w] <rep> <path> ...]
[-stat [format] <path> ...]
[-tail [-f] <file>]
[-test -[defsz] <path>]
[-text [-ignoreCrc] <src> ...]
[-touchz <path> ...]
[-truncate [-w] <length> <path> ...]
[-usage [cmd ...]]
Generic options supported are
-conf <configuration file> specify an application configuration file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|resourcemanager:port> specify a ResourceManager
-files <comma separated list of files> specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives> specify comma separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
3 Viewing the details of a specific command
# Format: hadoop fs -help <command name>
[hadoop@hadoop102 hadoop-2.7.2]$ hadoop fs -help cp
-cp [-f] [-p | -p[topax]] <src> ... <dst> :
Copy files that match the file pattern <src> to a destination. When copying
multiple files, the destination must be a directory. Passing -p preserves status
[topax] (timestamps, ownership, permission, ACLs, XAttr). If -p is specified
with no <arg>, then preserves timestamps, ownership, permission. If -pa is
specified, then preserves permission also because ACL is a super-set of
permission. Passing -f overwrites the destination if it already exists. raw
namespace extended attributes are preserved if (1) they are supported (HDFS
only) and, (2) all of the source and target pathnames are in the /.reserved/raw
hierarchy. raw namespace xattr preservation is determined solely by the presence
(or absence) of the /.reserved/raw prefix and not by the -p option.
[hadoop@hadoop102 hadoop-2.7.2]$
4 Practical operations on the HDFS file system
4.1 Create an input directory
[hadoop@hadoop102 hadoop-2.7.2]$ hadoop fs -mkdir -p /usr/abes/input
4.2 Check the input directory
[hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -ls -R /
drwx------ - hadoop supergroup 0 2021-04-01 15:43 /tmp
drwx------ - hadoop supergroup 0 2021-04-01 15:43 /tmp/hadoop-yarn
drwx------ - hadoop supergroup 0 2021-04-01 15:43 /tmp/hadoop-yarn/staging
drwx------ - hadoop supergroup 0 2021-04-01 15:43 /tmp/hadoop-yarn/staging/hadoop
drwx------ - hadoop supergroup 0 2021-04-01 15:48 /tmp/hadoop-yarn/staging/hadoop/.staging
drwxr-xr-x - hadoop supergroup 0 2021-04-01 16:18 /usr
drwxr-xr-x - hadoop supergroup 0 2021-04-01 16:18 /usr/abes
drwxr-xr-x - hadoop supergroup 0 2021-04-01 16:18 /usr/abes/input
4.3 Create a test file locally
1. Create a wcinput directory under the hadoop-2.7.2 directory
2. Create a wc.input file inside the wcinput directory
3. Edit wc.input with the following content:
hadoop hdfs
hadoop mapreduce
abes
abes
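The three steps above can be scripted as a single shell snippet (assuming, as the prompts suggest, that you run it from the hadoop-2.7.2 directory):

```shell
# Step 1: create the local wcinput directory
mkdir -p wcinput

# Steps 2-3: create wc.input and fill it with the test content
cat > wcinput/wc.input <<'EOF'
hadoop hdfs
hadoop mapreduce
abes
abes
EOF

# Sanity check: the file is 39 bytes, which matches the size
# shown by the HDFS listing after the upload
wc -c < wcinput/wc.input
```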
4.4 Upload the test file to input
1. Upload wc.input to /usr/abes/input on the HDFS file system
[hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -put wcinput/wc.input /usr/abes/input
2. Check that the upload succeeded
[hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -ls /usr/abes/input
Found 1 items
-rw-r--r-- 1 hadoop supergroup 39 2021-04-01 16:43 /usr/abes/input/wc.input
[hadoop@hadoop102 hadoop-2.7.2]$
4.5 Run the test file through a MapReduce program
# Run MapReduce on the input file uploaded above
# wordcount is the word-frequency counting example
[hadoop@hadoop102 hadoop-2.7.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /usr/abes/input /usr/abes/output
# The job takes a while to run; be patient
4.6 View the output
# View the job results
[hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -cat /usr/abes/output/*
abes 2
hadoop 2
hdfs 1
mapreduce 1
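As a cross-check, the same counts can be reproduced from the input content with standard Unix tools. This is only a local analogy for what the wordcount job computes (split into words, group, count), not how MapReduce actually executes it:

```shell
# Local emulation of wordcount on the test content:
# split lines into words (map), sort them (shuffle), count runs (reduce)
printf 'hadoop hdfs\nhadoop mapreduce\nabes\nabes\n' \
  | tr -s ' ' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2 "\t" $1}'
```

The awk step reorders uniq's "count word" output into the tab-separated "word count" layout that the HDFS example job emits.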
4.7 Download the output
[hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -get /usr/abes/output/part-r-00000 ./wcinput/
[hadoop@hadoop102 hadoop-2.7.2]$ ls wcinput/
part-r-00000 wc.input
4.8 Delete the output
[hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -rm -r /usr/abes/output
21/04/01 17:24:47 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
Deleted /usr/abes/output