1 Three HDFS shell command forms:

• hadoop fs: works with any supported file system, e.g. the local file system as well as HDFS
• hadoop dfs: works only with HDFS; in Hadoop 2.x it is deprecated and simply delegates to hdfs dfs
• hdfs dfs: has the same effect as hadoop dfs and likewise works only with HDFS (see the example after this list)
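A quick way to see the difference on a running Hadoop 2.x installation: all three forms below list the HDFS root, but only the second prints a deprecation warning before delegating.

  hadoop fs -ls /
  hadoop dfs -ls /    # warns that the script is deprecated; use hdfs dfs instead
  hdfs dfs -ls /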

2 Viewing help

  [hadoop@hadoop102 hadoop-2.7.2]$ hadoop fs
  Usage: hadoop fs [generic options]
        [-appendToFile <localsrc> ... <dst>]
        [-cat [-ignoreCrc] <src> ...]
        [-checksum <src> ...]
        [-chgrp [-R] GROUP PATH...]
        [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
        [-chown [-R] [OWNER][:[GROUP]] PATH...]
        [-copyFromLocal [-f] [-p] [-l] <localsrc> ... <dst>]
        [-copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-count [-q] [-h] <path> ...]
        [-cp [-f] [-p | -p[topax]] <src> ... <dst>]
        [-createSnapshot <snapshotDir> [<snapshotName>]]
        [-deleteSnapshot <snapshotDir> <snapshotName>]
        [-df [-h] [<path> ...]]
        [-du [-s] [-h] <path> ...]
        [-expunge]
        [-find <path> ... <expression> ...]
        [-get [-p] [-ignoreCrc] [-crc] <src> ... <localdst>]
        [-getfacl [-R] <path>]
        [-getfattr [-R] {-n name | -d} [-e en] <path>]
        [-getmerge [-nl] <src> <localdst>]
        [-help [cmd ...]]
        [-ls [-d] [-h] [-R] [<path> ...]]
        [-mkdir [-p] <path> ...]
        [-moveFromLocal <localsrc> ... <dst>]
        [-moveToLocal <src> <localdst>]
        [-mv <src> ... <dst>]
        [-put [-f] [-p] [-l] <localsrc> ... <dst>]
        [-renameSnapshot <snapshotDir> <oldName> <newName>]
        [-rm [-f] [-r|-R] [-skipTrash] <src> ...]
        [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
        [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
        [-setfattr {-n name [-v value] | -x name} <path>]
        [-setrep [-R] [-w] <rep> <path> ...]
        [-stat [format] <path> ...]
        [-tail [-f] <file>]
        [-test -[defsz] <path>]
        [-text [-ignoreCrc] <src> ...]
        [-touchz <path> ...]
        [-truncate [-w] <length> <path> ...]
        [-usage [cmd ...]]

  Generic options supported are
  -conf <configuration file>                      specify an application configuration file
  -D <property=value>                             use value for given property
  -fs <local|namenode:port>                       specify a namenode
  -jt <local|resourcemanager:port>                specify a ResourceManager
  -files <comma separated list of files>          specify comma separated files to be copied to the map reduce cluster
  -libjars <comma separated list of jars>         specify comma separated jar files to include in the classpath.
  -archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

  The general command line syntax is
  bin/hadoop command [genericOptions] [commandOptions]
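The generic options work with any fs subcommand. As one illustrative (not prescriptive) example, -D overrides a client-side configuration property for a single invocation; dfs.replication is a real HDFS property, while the paths below are placeholders:

  # Upload a file with replication factor 1, regardless of the cluster default:
  hadoop fs -D dfs.replication=1 -put localfile /some/hdfs/dir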

3 Viewing what a specific command does

  # Format: hadoop fs -help <command-name>
  [hadoop@hadoop102 hadoop-2.7.2]$ hadoop fs -help cp
  -cp [-f] [-p | -p[topax]] <src> ... <dst> :
    Copy files that match the file pattern <src> to a destination. When copying
    multiple files, the destination must be a directory. Passing -p preserves status
    [topax] (timestamps, ownership, permission, ACLs, XAttr). If -p is specified
    with no <arg>, then preserves timestamps, ownership, permission. If -pa is
    specified, then preserves permission also because ACL is a super-set of
    permission. Passing -f overwrites the destination if it already exists. raw
    namespace extended attributes are preserved if (1) they are supported (HDFS
    only) and, (2) all of the source and target pathnames are in the /.reserved/raw
    hierarchy. raw namespace xattr preservation is determined solely by the presence
    (or absence) of the /.reserved/raw prefix and not by the -p option.
  [hadoop@hadoop102 hadoop-2.7.2]$
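A small, hypothetical illustration of those flags (the paths are placeholders, not part of this walkthrough):

  # -f overwrites an existing destination; bare -p preserves timestamps,
  # ownership, and permissions:
  hadoop fs -cp -f -p /path/to/src /path/to/dstdir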

4 Concrete operations on the HDFS file system

4.1 Create an input directory

  [hadoop@hadoop102 hadoop-2.7.2]$ hadoop fs -mkdir -p /usr/abes/input

4.2 List the input directory

  [hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -ls -R /
  drwx------   - hadoop supergroup          0 2021-04-01 15:43 /tmp
  drwx------   - hadoop supergroup          0 2021-04-01 15:43 /tmp/hadoop-yarn
  drwx------   - hadoop supergroup          0 2021-04-01 15:43 /tmp/hadoop-yarn/staging
  drwx------   - hadoop supergroup          0 2021-04-01 15:43 /tmp/hadoop-yarn/staging/hadoop
  drwx------   - hadoop supergroup          0 2021-04-01 15:48 /tmp/hadoop-yarn/staging/hadoop/.staging
  drwxr-xr-x   - hadoop supergroup          0 2021-04-01 16:18 /usr
  drwxr-xr-x   - hadoop supergroup          0 2021-04-01 16:18 /usr/abes
  drwxr-xr-x   - hadoop supergroup          0 2021-04-01 16:18 /usr/abes/input

4.3 Create a test file locally

  1. Under the hadoop-2.7.2 directory, create a wcinput folder.
  2. Under the wcinput directory, create a wc.input file.
  3. Edit wc.input so that it contains the following (the commands after this list sketch all three steps):
     hadoop hdfs
     hadoop mapreduce
     abes
     abes
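A minimal sketch of those three steps as shell commands, run from the hadoop-2.7.2 directory:

  mkdir -p wcinput
  # Write the four test lines into wcinput/wc.input:
  printf 'hadoop hdfs\nhadoop mapreduce\nabes\nabes\n' > wcinput/wc.input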

4.4 Upload the test file to input

  # 1. Upload wc.input to /usr/abes/input on the HDFS file system
  [hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -put wcinput/wc.input /usr/abes/input
  # 2. Check whether the upload succeeded
  [hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -ls /usr/abes/input
  Found 1 items
  -rw-r--r--   1 hadoop supergroup         39 2021-04-01 16:43 /usr/abes/input/wc.input
  [hadoop@hadoop102 hadoop-2.7.2]$
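You can also print the uploaded file straight from HDFS to confirm its contents match the local copy:

  hdfs dfs -cat /usr/abes/input/wc.input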

4.5 Run a MapReduce program on the test file

  # Run wordcount (a word-frequency count) on the file uploaded in 4.4
  [hadoop@hadoop102 hadoop-2.7.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar wordcount /usr/abes/input /usr/abes/output
  # The job takes a while to finish; be patient
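Note that MapReduce refuses to write to an output directory that already exists: if /usr/abes/output is left over from an earlier run, the job fails at submission, so remove it first (see also 4.8):

  hdfs dfs -rm -r /usr/abes/output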

4.6 View the output

  # View the job's results
  [hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -cat /usr/abes/output/*
  abes    2
  hadoop    2
  hdfs    1
  mapreduce    1
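A successful job writes one part-r-NNNNN file per reducer plus an empty _SUCCESS marker; you can confirm this with:

  hdfs dfs -ls /usr/abes/output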

The output can also be viewed in the browser through the NameNode web UI.

4.7 Download the output

  [hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -get /usr/abes/output/part-r-00000 ./wcinput/
  [hadoop@hadoop102 hadoop-2.7.2]$ ls wcinput/
  part-r-00000  wc.input
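If a job produces several part files, -getmerge concatenates them into a single local file (the output filename here is just an example):

  hdfs dfs -getmerge /usr/abes/output ./wcinput/wordcount.txt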

4.8 Delete the output

  [hadoop@hadoop102 hadoop-2.7.2]$ hdfs dfs -rm -r /usr/abes/output
  21/04/01 17:24:47 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 0 minutes, Emptier interval = 0 minutes.
  Deleted /usr/abes/output
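The INFO line shows a deletion interval of 0 minutes, meaning the HDFS trash (fs.trash.interval) is disabled on this cluster, so -rm deletes immediately. On clusters where trash is enabled, -skipTrash forces the same permanent deletion:

  hdfs dfs -rm -r -skipTrash /usr/abes/output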