1. Man page

NAME
find - search for files in a directory hierarchy
SYNOPSIS
find [-H] [-L] [-P] [-D debugopts] [-Olevel] [starting-point…] [expression]
DESCRIPTION
This manual page documents the GNU version of find. GNU find searches the directory tree rooted at
each given starting-point by evaluating the given expression from left to right, according to the rules
of precedence (see section OPERATORS), until the outcome is known (the left hand side is false for and
operations, true for or), at which point find moves on to the next file name. If no starting-point is
specified, `.' is assumed.

If you are using find in an environment where security is important (for example if you are using it to
search directories that are writable by other users), you should read the "Security Considerations" chapter
of the findutils documentation, which is called Finding Files and comes with findutils. That document
also includes a lot more detail and discussion than this manual page, so you may find it a more useful
source of information.

OPTIONS
If no paths are given, the current directory is used. If no expression is given, the expression -print is used (but you should probably consider using -print0 instead, anyway).

**-D debugoptions**
help : Explain the debugging options.
tree : Show the expression tree in its original and optimised form.
stat : Print messages as files are examined with the stat and lstat system calls. The find program tries to minimise such calls.
opt : Prints diagnostic information relating to the optimisation of the expression tree; see the -O option.
rates : Prints a summary indicating how often each predicate succeeded or failed.

TESTS
-name pattern
Base of file name (the path with the leading directories removed) matches shell pattern pattern.
Because the leading directories are removed, the file names considered for a match with -name will
never include a slash, so `-name a/b' will never match anything (you probably need to use **-path**
instead). A warning is issued if you try to do this, unless the environment variable POSIXLY_CORRECT
is set. The metacharacters (`*', `?', and `[]') match a `.' at the start of the base name (this is
a change in findutils-4.2.2; see section STANDARDS CONFORMANCE below). To ignore a directory
and the files under it, use **-prune**; see an example in the description of -path. Braces are not
recognised as being special, despite the fact that some shells including Bash imbue braces with a
special meaning in shell patterns. The filename matching is performed with the use of the
fnmatch(3) library function. Don’t forget to enclose the pattern in quotes in order to protect it
from expansion by the shell.
-iname pattern
Like -name, but the match is case insensitive.

-path pattern
File name matches shell pattern pattern. The metacharacters do not treat `/' or `.' specially; so,
for example,

    find . -path "./sr*sc"

will print an entry for a directory called `./src/misc' (if one exists). To ignore a whole directory tree,
use **-prune** rather than checking every file in the tree. For example, to skip the directory
`src/emacs' and all files and directories under it, and print the names of the other files found,
do something like this:

    find . -path ./src/emacs -prune -o -print

Note that the pattern match test applies to the whole file name, starting from one of the start
points named on the command line. It would only make sense to use an absolute path name here if
the relevant start point is also an absolute path. This means that this command will never match
anything:

    find bar -path /foo/bar/myfile -print

Find compares the -path argument with the concatenation of a directory name and the base name of
the file it's examining. Since the concatenation will never end with a slash, -path arguments ending
in a slash will match nothing (except perhaps a start point specified on the command line). The
predicate -path is also supported by HP-UX find and will be in a forthcoming version of the POSIX
standard.
-ipath pattern
Like -path, but the match is case insensitive.
-regex pattern
File name matches regular expression pattern. This is a match on the whole path, not a search. For
example, to match a file named `./fubar3', you can use the regular expression `.*bar.' or `.*b.*3',
but not `f.*r3'. The regular expressions understood by find are by default Emacs Regular Expressions,
but this can be changed with the -regextype option.
-iregex pattern
Like -regex, but the match is case insensitive.
-false Always false.
-true Always true.
-empty File is empty and is either a regular file or a directory.

ACTIONS
-delete
Delete files; true if removal succeeded. If the removal failed, an error message is issued. If
-delete fails, find’s exit status will be nonzero (when it eventually exits). Use of -delete
automatically turns on the `-depth' option.

-exec command ;
**Execute command**; true if 0 status is returned. All following arguments to find are taken to be
arguments to the command **until** an argument consisting of `;` is encountered. The string `{}` is
replaced by the current file name being processed everywhere it occurs in the arguments to the
command, not just in arguments where it is alone, as in some versions of find. Both of these
constructions might need to be escaped (with a `\`) or quoted to protect them from expansion by
the shell.
See the EXAMPLES section for examples of the use of the -exec option. **The specified command is
run once for each matched file.** The command is executed in the starting directory. There are
unavoidable security problems surrounding use of the -exec action; you should use the -execdir
option instead.

-ls True; list current file in ls -dils format on standard output.
-print True; print the full file name on the standard output, followed by a newline. If you are piping
the output of find into another program and there is the faintest possibility that the files which
you are searching for might contain a newline, then you should seriously consider using the -print0
option instead of -print.
-print0
True; print the full file name on the standard output, followed by a null character (instead of the
newline character that -print uses). This allows file names that contain newlines or other types of
white space to be correctly interpreted by programs that process the find output. This option
corresponds to the -0 option of xargs.

-prune True; **if the file is a directory, do not descend into it**. If -depth is given, false; no effect.
Because -delete implies -depth, you cannot usefully use -prune and -delete together.

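2. How find works

find [path…] [expression_list]

Expressions fall into three kinds: the OPTIONS, TESTS, and ACTIONS described in man find.

find processes the expression from left to right, so the order of the expressions can change search performance and even the search results.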
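A minimal sketch of how ordering can change the result, assuming a throwaway directory created with mktemp (the names d.txt and f.txt are hypothetical). Moving -print ahead of -type f means the action fires before the type test is ever consulted:

```shell
#!/bin/sh
# Hypothetical sandbox: one directory and one regular file, both ending in .txt.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir d.txt
touch f.txt

# Tests first, action last: only regular files reach -print.
tests_first=$(find . -name "*.txt" -type f -print | LC_ALL=C sort)

# Action in the middle: -print runs as soon as -name matches,
# so the directory d.txt is printed as well; -type f then has no visible effect.
action_first=$(find . -name "*.txt" -print -type f | LC_ALL=C sort)

printf 'tests first:\n%s\n' "$tests_first"
printf 'action first:\n%s\n' "$action_first"

cd / && rm -rf "$demo"
```

Putting cheap, selective tests such as -name before more expensive ones follows the same left-to-right logic.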

2.1 OPERATORS

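Multiple expressions are joined by OPERATORS. When no operator is given, -and (logical AND) is assumed; when they are joined by -or, the operation is OR.

expr1 expr2 : same as -and (implicit)
expr1 -a expr2 : same as -and
expr1 -and expr2 : if expr1 is false, evaluation stops there and expr2 is skipped; if expr1 is true, expr2 is evaluated and decides the result
expr1 -o expr2 : same as -or
expr1 -or expr2 : expr2 is evaluated only for files where expr1 is false
! expr : negates the true/false result of expr. The ! may need quoting or escaping to protect it from the shell
-not expr : equivalent to "! expr"
( expr ) : forces precedence; escape the parentheses with \ to protect them from the shell: \( ... \)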

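Note: do not treat -or as nothing more than logical OR. It is better understood as having two roles:

  1. If expr1 is true, the result is that of expr1; otherwise it is that of expr2. This is the ordinary OR.
  2. It prepares the input for expr2: only the files for which expr1 is false reach expr2.

An example of the second point:

    find . -name "1.txt" -prune -o ! -path . -and -exec cp '{}' '{}'.bak \;

The purpose of this command is to copy every file whose name is not 1.txt, adding the suffix .bak; after it runs, only 1.txt has no backup.
The first expression selects the files that should not be copied, and the complement of that set becomes the input of the expression that follows.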
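The filtering role of -o can be checked without modifying any files. The sketch below is an illustration only: it assumes a scratch directory with three hypothetical files and swaps the -exec cp action for -print, so it shows which files reach the right-hand side rather than copying them.

```shell
#!/bin/sh
# Hypothetical sandbox with three files.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch 1.txt 2.txt 3.txt

# Left side: true only for 1.txt, which therefore never reaches the right side.
# Right side: everything else except the start point "." itself.
reached=$(find . -name "1.txt" -prune -o ! -path . -print | LC_ALL=C sort)

printf 'files that reach the right-hand side:\n%s\n' "$reached"

cd / && rm -rf "$demo"
```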

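2.2 Execution logic

-and has higher precedence than -or. Each file is run through the expression list in turn, and then find moves on to the next file.
The whole process resembles an iterator: find does not collect the set of files in advance, but evaluates them one at a time.

With -name "*.log" -o -name "*.txt", when find reaches the file 1.txt, the .log test evaluates to false, so the .txt test is evaluated; that one is true, so the whole -o expression returns true.

A special example shows where **-o** can surprise:

    # 1.log 2.log 3.log 1.txt 2.txt 3.txt
    find . -type f -o -name "*.log" -print

This command was expected to print every file with the suffix .log.
The debug option shows that it is equivalent to find . -type f -o ( -name *.log -and -print ).
Every file in the current directory is a regular file, so expr1 returns true for each of them;
expr2 after the -o is therefore never evaluated for them. The directory . itself fails -type f, but its name does not match *.log either, so nothing is printed.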
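The behaviour described above can be reproduced in a scratch directory. This is a sketch under the assumption that the six files from the example are the only contents; the variable names are illustrative.

```shell
#!/bin/sh
# Hypothetical sandbox holding the six files from the example.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch 1.log 2.log 3.log 1.txt 2.txt 3.txt

# As written: parsed as  -type f -o ( -name "*.log" -a -print ).
# Regular files satisfy -type f, so the right-hand side never runs for them;
# "." fails -type f but does not match *.log either.
unparenthesized=$(find . -type f -o -name "*.log" -print)

# Escaped parentheses make -print apply to the whole OR.
parenthesized=$(find . \( -type f -o -name "*.log" \) -print | LC_ALL=C sort)

# Probably what was intended: regular files whose names end in .log.
only_logs=$(find . -type f -name "*.log" -print | LC_ALL=C sort)

printf 'unparenthesized: [%s]\n' "$unparenthesized"
printf 'parenthesized:\n%s\n' "$parenthesized"
printf 'only logs:\n%s\n' "$only_logs"

cd / && rm -rf "$demo"
```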

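2.3 Default versus explicit -print

-print is an ACTION. The man page makes two points that need particular care when OPERATORS are used in an expression:

  1. With no expression, -print is used by default.
  2. If the whole expression contains no action other than -prune or -print, -print is applied to every file for which the whole expression is true.

Given this directory layout:

    # drwxrwxr-x 2 xx xx 4096 Apr 24 10:06 dir/
    # -rw-rw-r-- 1 xx xx 0 Apr 23 19:42 dir/1.png
    # -rw-rw-r-- 1 xx xx 4 Apr 23 19:02 dir/1.txt
    # -rw-rw-r-- 1 xx xx 41 Apr 23 13:49 haha.txt
    find -D rates -path "./dir" -prune -o -type f -print
    # parsed as: ( -path ./dir -a -prune ) -o ( -type f -a -print )
    # the explicit -print is ANDed with -type f
    ./haha.txt
    find -D rates -path "./dir" -prune -o -type f
    # parsed as: ( ( -path ./dir -a -prune ) -o -type f ) -a -print
    # the implicit -print applies wherever the whole expression is true
    ./haha.txt
    ./dir

In the second run ./dir itself is printed, because -prune returns true for it. The files under ./dir are not printed in either run, since find never descends into the pruned directory.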
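The difference between the explicit and the implicit -print can be checked with a small sketch. It assumes a scratch directory that mirrors the layout listed above; -D rates is left out because its summary is not needed for the comparison.

```shell
#!/bin/sh
# Hypothetical sandbox mirroring the listing: dir/1.png, dir/1.txt, haha.txt.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir dir
touch dir/1.png dir/1.txt haha.txt

# Explicit -print binds only to the right-hand side of -o:
#   ( -path ./dir -a -prune ) -o ( -type f -a -print )
explicit=$(find . -path ./dir -prune -o -type f -print | LC_ALL=C sort)

# No action other than -prune, so find adds -print to the whole expression:
#   ( ( -path ./dir -a -prune ) -o -type f ) -a -print
# -prune returns true, so the pruned directory itself is printed too.
implicit=$(find . -path ./dir -prune -o -type f | LC_ALL=C sort)

printf 'explicit -print:\n%s\n' "$explicit"
printf 'implicit -print:\n%s\n' "$implicit"

cd / && rm -rf "$demo"
```

Writing the -print explicitly is the usual way to keep a pruned directory out of the output.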

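3. Common options and combinations

By default find searches the current directory, and the default action is -print.
-print separates the files it finds with \n;
-print0 replaces that \n with \0 (NUL).

3.1 Searching by file name

Options: -name, -path
-name matches against the basename only
-path matches against dirname + basename, that is, the whole path

Wildcards: * ? [xX2] [-] [^]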
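The contrast between -name and -path can be sketched as follows. The directory layout and file names are hypothetical, and the warning that GNU find prints for a -name pattern containing a slash is discarded here.

```shell
#!/bin/sh
# Hypothetical sandbox: two C files at different depths plus one text file.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p src/misc
touch src/misc/a.c src/main.c notes.txt

# -name sees only the basename, so a pattern containing a slash cannot match.
by_name_with_slash=$(find . -name "misc/a.c" 2>/dev/null || true)

# Basename match: both .c files, wherever they live.
by_name=$(find . -name "*.c" | LC_ALL=C sort)

# -path matches the whole path from the start point; * may span a slash.
by_path=$(find . -path "./sr*/a.c")

printf 'name with slash: [%s]\n' "$by_name_with_slash"
printf 'by name:\n%s\n' "$by_name"
printf 'by path: %s\n' "$by_path"

cd / && rm -rf "$demo"
```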

3.2 Searching by file type

Option: -type
f : regular file
d : directory
l : symbolic link

3.3 Getting the absolute path of a file

Use readlink -f on the regular files that find returns:

    # absolute paths of all .txt files under the current directory
    find . -type f -name "*.txt" -print0 | xargs -0 readlink -f

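3.4 Getting parts of the file name

-printf shapes its output according to format directives:
%f : basename
%p : the path as find reports it
%P : the path with the starting point it was found under removed

    find . -type f -name "*.txt" -printf "%f\n"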
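To see the three directives side by side, here is a small sketch. It assumes GNU find (-printf is a GNU extension) and a hypothetical scratch directory used as the start point.

```shell
#!/bin/sh
# Hypothetical sandbox: a single file one level below the start point.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/sub/a.txt"

# %f: basename only.
basename_only=$(find "$demo" -type f -name "*.txt" -printf '%f\n')
# %p: the path as find reports it, start point included.
full_path=$(find "$demo" -type f -name "*.txt" -printf '%p\n')
# %P: the same path with the start point stripped.
relative=$(find "$demo" -type f -name "*.txt" -printf '%P\n')

printf '%%f -> %s\n%%p -> %s\n%%P -> %s\n' "$basename_only" "$full_path" "$relative"

rm -rf "$demo"
```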

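3.5 Running a command with -exec

-exec runs a follow-up command on what find matches. Every argument between -exec cmd and ; is passed to cmd, and {} is the placeholder that find replaces with the name of the file it found, also as an argument to cmd.
Tip: escape the ; with \ so that the shell does not interpret it itself.

    # 1.txt : aaa
    # 2.txt : bbb
    # 3.txt : ccc
    find -type f -regex ".*[0-9]\.txt" -exec cat '{}' \;
    bbb
    ccc
    aaa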

3.6 Finding all non-empty files ending in .c under a directory, excluding anything under dir1 and dir2

    find . \( -path "*/dir1" -o -path "*/dir2" \) -prune -o ! -empty -type f -name "*.c" -print
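The two -path tests inside the parentheses have to be joined with -o: without an operator they are ANDed, and no path can end in both /dir1 and /dir2, so nothing would be pruned. A sketch of the difference, assuming a hypothetical scratch tree:

```shell
#!/bin/sh
# Hypothetical sandbox: one .c file in each directory, plus an empty one.
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir dir1 dir2 keep
echo 'int x;' > keep/a.c
echo 'int y;' > dir1/b.c
echo 'int z;' > dir2/c.c
: > keep/empty.c   # zero-length file, filtered out by ! -empty

# OR between the two -path tests: both directories are pruned.
with_or=$(find . \( -path "*/dir1" -o -path "*/dir2" \) -prune -o \
    ! -empty -type f -name "*.c" -print | LC_ALL=C sort)

# Implicit AND between them: the test never succeeds, so nothing is pruned.
with_and=$(find . \( -path "*/dir1" -path "*/dir2" \) -prune -o \
    ! -empty -type f -name "*.c" -print | LC_ALL=C sort)

printf 'with -o:\n%s\n' "$with_or"
printf 'with implicit -and:\n%s\n' "$with_and"

cd / && rm -rf "$demo"
```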

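3.7 Piping find results into the next command as arguments

-print separates output items with a newline, so a downstream command that splits on whitespace or newlines will break file names that contain those characters.
-print0 separates output items with NUL, which is exactly what xargs -0 expects, so xargs can hand the names to cmd intact, as many per invocation as fit.

    find xxx -print0 | xargs -0 cmd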
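The effect of the delimiter can be observed by counting the arguments that xargs passes on. This sketch assumes a scratch directory whose own path contains no whitespace (typical for mktemp) and two hypothetical files, one with a space in its name.

```shell
#!/bin/sh
# Hypothetical sandbox: one ordinary name and one name containing a space.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/with space.txt"

# Default xargs splits on blanks and newlines, so "with space.txt"
# arrives as two separate arguments.
split_count=$(find "$demo" -type f -print | xargs sh -c 'echo "$#"' sh)

# NUL-delimited: each file name arrives as exactly one argument.
intact_count=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)

printf 'arguments with -print: %s\n' "$split_count"
printf 'arguments with -print0 and xargs -0: %s\n' "$intact_count"

rm -rf "$demo"
```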