

Wednesday, June 24, 2015

Using grep to find a word in files

$ man grep

-r search recursively if the given path is a directory
-i case-insensitive search
-n print the line number

-H print the filename [implied when multiple files are searched, i.e. no need to add -H
when the given path stands for multiple files]

-I ignore binary files (complement: -a treat all files as text)
-F treat search term as a literal, not a regular expression
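
A small sketch of how -F, -i and -n change what matches. The temporary directory, the file name a.txt and its contents are made up for the illustration.

```shell
# Sandbox with one hypothetical file; name and contents are assumptions.
demo=$(mktemp -d)
printf 'load myutil.js\nload myutilXjs\nLOAD MYUTIL.JS\n' > "$demo/a.txt"

# As a regex, "." matches any character, so line 2 matches as well as line 1.
regex_hits=$(grep -n 'myutil.js' "$demo/a.txt")

# With -F the dot is literal, so only line 1 matches.
fixed_hits=$(grep -nF 'myutil.js' "$demo/a.txt")

# Adding -i also picks up the upper-case line 3.
nocase_hits=$(grep -niF 'myutil.js' "$demo/a.txt")

printf '%s\n---\n%s\n---\n%s\n' "$regex_hits" "$fixed_hits" "$nocase_hits"
rm -rf "$demo"
```

When the search term is a file name or any other literal text, -F is usually the safer choice.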

--exclude=GLOB
Skip files whose base name matches GLOB (using wildcard
matching). A file-name glob can use *, ?, and [...] as
wildcards, and \ to quote a wildcard or backslash character
literally.

--exclude-from=FILE
Skip files whose base name matches any of the file-name globs
read from FILE (using wildcard matching as described under
--exclude).

--exclude-dir=DIR
Exclude directories matching the pattern DIR from recursive searches.
[e.g., to exclude .svn and .git:
xxx $ grep -r "non-sense" --exclude-dir='\.svn|\.git' (does not work: --exclude-dir takes a glob, not a regex)
--> $ grep -r "non-sense" --exclude-dir=.svn --exclude-dir=.git ]


Note: some versions of grep may not support --exclude-dir.
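
A minimal sketch of excluding several directories by repeating --exclude-dir. It assumes a grep that supports the option; the directory layout is invented for the demo.

```shell
# Hypothetical tree: one real source folder plus two VCS folders.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/.git" "$demo/.svn"
echo 'non-sense' > "$demo/src/main.txt"
echo 'non-sense' > "$demo/.git/config"
echo 'non-sense' > "$demo/.svn/entries"

# One --exclude-dir per directory name; each value is a glob, not a regex.
found=$(cd "$demo" && grep -rl --exclude-dir=.svn --exclude-dir=.git 'non-sense' .)

echo "$found"
rm -rf "$demo"
```

Only the file under src is reported; the copies under .git and .svn are skipped.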


-E, --extended-regexp
-e PATTERN, --regexp=PATTERN
Use PATTERN as the pattern. This can be used to specify
multiple search patterns, or to protect a pattern beginning with
a hyphen (-). (-e is specified by POSIX.)
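
Both uses of -e can be tried like this. The file notes.txt and its lines are made up for the example.

```shell
# Hypothetical input file.
demo=$(mktemp -d)
printf 'alpha\n-v flag\nbeta\ngamma\n' > "$demo/notes.txt"

# Two -e options: a line is selected if it matches either pattern.
either=$(grep -n -e 'alpha' -e 'gamma' "$demo/notes.txt")

# -e keeps a pattern that starts with "-" from being parsed as an option.
dash=$(grep -n -e '-v' "$demo/notes.txt")

printf '%s\n---\n%s\n' "$either" "$dash"
rm -rf "$demo"
```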

-R, --dereference-recursive
Read all files under each directory, recursively. Follow all
symbolic links, unlike -r.
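
A rough sketch of the difference, assuming a reasonably recent GNU grep (other implementations may treat -r and -R differently). All paths are invented for the demo.

```shell
# Hypothetical layout: tree/ contains a real file and a symlink
# pointing at a directory that lives outside tree/.
demo=$(mktemp -d)
mkdir -p "$demo/tree" "$demo/outside"
echo 'needle' > "$demo/tree/real.txt"
echo 'needle' > "$demo/outside/linked.txt"
ln -s "$demo/outside" "$demo/tree/link"

# -r does not follow the symlink found while descending into tree/.
r_hits=$(cd "$demo" && grep -rl 'needle' tree)

# -R follows it, so the file behind the link is searched too.
R_hits=$(cd "$demo" && grep -Rl 'needle' tree | sort)

printf '%s\n---\n%s\n' "$r_hits" "$R_hits"
rm -rf "$demo"
```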

-w, --word-regexp [to find a string that stands as a separate word, i.e. bounded by non-word characters or the line edges, not only by spaces]
Select only those lines containing matches that form whole
words. The test is that the matching substring must either be
at the beginning of the line, or preceded by a non-word
constituent character. Similarly, it must be either at the end
of the line or followed by a non-word constituent character.
Word-constituent characters are letters, digits, and the underscore.
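
To see what counts as a whole word, here is a small sketch. The file words.txt and its lines are made up.

```shell
# Hypothetical word list.
demo=$(mktemp -d)
printf 'cat\nconcatenate\ncat_food\n(cat)\nmy cat.\n' > "$demo/words.txt"

# Without -w every line containing "cat" is counted.
plain=$(grep -c 'cat' "$demo/words.txt")

# With -w, "concatenate" and "cat_food" drop out, because letters and
# the underscore are word constituents; "(cat)" and "my cat." remain.
whole=$(grep -nw 'cat' "$demo/words.txt")

printf '%s\n---\n%s\n' "$plain" "$whole"
rm -rf "$demo"
```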

-L, --files-without-match [only print the names of files that contain no match]
Suppress normal output; instead print the name of each input
file from which no output would normally have been printed. The
scanning will stop on the first match.

-l, --files-with-matches [only print the names of files that contain a match]
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match. (-l is specified by POSIX.)
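
-l and -L side by side, as a quick sketch. The two script names and their contents are hypothetical.

```shell
# Two hypothetical scripts: one redirects into tmp1.txt, one does not.
demo=$(mktemp -d)
echo 'echo hi > tmp1.txt' > "$demo/writes.sh"
echo 'echo hi'            > "$demo/quiet.sh"

# -l lists the files that contain the string.
with=$(cd "$demo" && grep -l '> tmp1.txt' *.sh)

# -L lists the files that do not contain it.
without=$(cd "$demo" && grep -L '> tmp1.txt' *.sh)

printf 'with: %s\nwithout: %s\n' "$with" "$without"
rm -rf "$demo"
```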

$ grep -r word_to_find .
[to search "word_to_find" recursively under the current directory ]

$ grep -n "> tmp1.txt" *.sh
[to find the string "> tmp1.txt" in all files with the extension ".sh" in the current folder, not recursively]

xxx a common mistake (a directory given without -r):
$ grep myutil.js .
grep: .: Is a directory

--> search recursively, or name the files:
$ grep -r myutil.js .
$ grep myutil.js *.js
$ grep "myutil.js" *.js

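xxx $ grep *.js *.js
[unquoted, the first *.js is expanded by the shell into file names before grep runs, so the first file name ends up as the pattern]
--> $ grep ".*js" *.js
[here the quoted ".*js" reaches grep unchanged and is read as a regex, while the unquoted *.js is a glob that bash expands into the file names]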
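
The effect of quoting can be checked by letting echo show what the shell would hand to grep. The files app.js and lib.js are made up for this sketch.

```shell
# Two hypothetical .js files in a scratch directory.
demo=$(mktemp -d)
echo 'require("myutil.js")' > "$demo/app.js"
echo 'no match here'        > "$demo/lib.js"

# Unquoted: both globs are expanded by the shell, so grep would get
# "app.js" as its pattern and the remaining names as files to search.
unquoted=$(cd "$demo" && echo grep *.js *.js)

# Quoted: grep receives the regex .*js itself and searches both files.
quoted=$(cd "$demo" && grep -l '.*js' *.js)

printf '%s\n---\n%s\n' "$unquoted" "$quoted"
rm -rf "$demo"
```

Single quotes are used here; double quotes as in the command above also stop the glob expansion.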

$ grep -l "> tmp1.txt" *.sh
[list all the names of files which contain "> tmp1.txt" ]

$ grep -r "myutil.js" . --exclude-dir=.git
$ grep -r "myutil.js" . --exclude-dir=?git
[this works too, but ? matches any single character, so it excludes every subfolder named like agit, bgit, mgit, etc., not only .git]
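
The over-matching of ?git can be seen in a small sketch, assuming a grep with --exclude-dir. The folder names are invented, and agit stands in for any folder that happens to end in "git".

```shell
# Hypothetical tree with three folders, each holding one matching file.
demo=$(mktemp -d)
mkdir -p "$demo/.git" "$demo/agit" "$demo/src"
for d in .git agit src; do
  echo 'myutil.js' > "$demo/$d/f.txt"
done

# Exact name: only .git is skipped, agit is still searched.
exact=$(cd "$demo" && grep -rl --exclude-dir=.git 'myutil.js' . | sort)

# Glob ?git: both .git and agit are skipped.
loose=$(cd "$demo" && grep -rl --exclude-dir='?git' 'myutil.js' . | sort)

printf '%s\n---\n%s\n' "$exact" "$loose"
rm -rf "$demo"
```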


bash regex


