grep

output lines from files that match a pattern

grep [options] pattern  … [file...]  

       grep [options] [-e pattern | -f patternFile] [--] [file …]

bzgrep, zgrep handle compressed files.

files (or STDIN if no files are named, or if file is - )

The -- marks the end of the options (useful if the result of metacharacter expansion includes a filename that begins with -).

-G --basic-regexp pattern is a Basic Regular Expression (default).
? , + , { , | , ( and ) must be preceded by \ to ENABLE special meaning
-E --extended-regexp pattern is an Extended Regular Expression.
This adds ?, +, and |, and it removes the need to escape ( ) and { }.
Which permits, for example, matching EITHER patternA | patternB.
egrep is the same as grep -E.

-F --fixed-strings pattern is a list of fixed strings, separated by newlines, any of which is to be matched.
fgrep is the same as grep -F.
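
A minimal sketch of how the same pattern text behaves under -G, -E and -F. The sample lines and the scratch file name are made up for illustration.

```shell
# Scratch file with made-up sample lines.
tmp=$(mktemp -d)
printf '%s\n' 'a+b' 'aab' 'cat' 'dog' > "$tmp/sample.txt"

# BRE (default): + is an ordinary character, so only the literal text matches.
bre=$(grep 'a+b' "$tmp/sample.txt")
# ERE: + means "one or more", so this matches "aab" but not "a+b".
ere=$(grep -E 'a+b' "$tmp/sample.txt")
# ERE alternation: either pattern may match.
alt=$(grep -E 'cat|dog' "$tmp/sample.txt")
# Fixed strings: no character is special.
fixed=$(grep -F 'a+b' "$tmp/sample.txt")

printf 'BRE: %s\nERE: %s\nFIXED: %s\nALT:\n%s\n' "$bre" "$ere" "$fixed" "$alt"
rm -rf "$tmp"
```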

-P --perl-regexp Interpret pattern as a Perl regular expression.

As with all the documentation on this site, this document has been severely revised. see for the real deal.

controlling matching
`-v --invert-match`	Invert the sense of matching to select non-matching lines.
`-i --ignore-case`
`-e pattern --regexp=pattern`	Use `pattern` as the matching pattern. Useful to protect `pattern`s beginning with `-` from being interpreted as options.
`-f file --file=file`	Obtain patterns from `file`, one per line. An empty file contains zero `pattern`s and matches nothing.
`-w --word-regexp`	Matches must be whole words. The matching substring must either be at the beginning of the line or preceded by a non-word character (for example a `space`), and must either be at the end of the line or followed by a non-word character. Word characters are letters, digits, and the underscore. As if the pattern were surrounded by '[[:<:]]' and '[[:>:]]'.
`-x --line-regexp`	Matches must be the whole line, exactly.
`-y`	Obsolete synonym for `-i` (--ignore-case).
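
A short sketch of the matching-control options above, run against a made-up log file (the file name and its lines are assumptions for illustration only).

```shell
tmp=$(mktemp -d)
printf '%s\n' 'Error: disk full' 'error' 'errors logged' '-v is a flag' 'all good' > "$tmp/log.txt"

ci=$(grep -i -c 'error' "$tmp/log.txt")        # case-insensitive count of matching lines
word=$(grep -w 'error' "$tmp/log.txt")         # whole-word matches only
line=$(grep -x 'error' "$tmp/log.txt")         # whole-line matches only
inverted=$(grep -v -i 'error' "$tmp/log.txt")  # lines that do NOT match
dash=$(grep -e '-v' "$tmp/log.txt")            # -e protects the leading dash

printf '%s\n' "$ci" "$word" "$line" "$inverted" "$dash"
rm -rf "$tmp"
```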

controlling output

-q
 --quiet
 --silent

Do not write to STDOUT.
Useful for announcing failure :

%  if grep --quiet failure resultfile; then echo " Step $stepFailed "; exit 3; fi

Errors like "Permission denied" are written to STDERR and the return code is 2.
Exits with status 0 as soon as a match is found.
If there is no match, the return code is 1. (Useful in conditional statements.)
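
A minimal sketch of using the quiet option in a conditional. The helper name check_result and the file contents are hypothetical; the point is that the shell tests grep's exit status directly rather than its (empty) output.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'step 1 ok' 'step 2 failure' > "$tmp/resultfile"
printf '%s\n' 'step 1 ok' > "$tmp/cleanfile"

# Hypothetical helper: reports whether a result file mentions "failure".
check_result() {
    if grep -q 'failure' "$1"; then
        echo "failed"
    else
        echo "passed"
    fi
}

first=$(check_result "$tmp/resultfile")
second=$(check_result "$tmp/cleanfile")
echo "$first $second"
rm -rf "$tmp"
```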

-s --no-messages Suppress error messages about nonexistent or unreadable files, normally sent to STDERR.

-o --only-matching only the matching part of the line is output.
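
As a small illustration of -o, the sketch below pulls every run of digits out of some made-up input; each match is written on its own line.

```shell
# Each matching part is output on a separate line; non-matching lines produce nothing.
numbers=$(printf '%s\n' 'id=42 size=7' 'no digits here' 'id=108' | grep -o '[0-9][0-9]*')
printf '%s\n' "$numbers"
```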

--color[=when]

--colour[=when]

when may be never, always, or auto

Matched sections are displayed using $GREP_COLOR in the form ff[;bb]
where ff is the code used for the foreground and bb is used for the background.

Examples:
magenta on yellow 35;43, blue on cyan 34;46, red 31 (default)
Foreground: black 30, red 31 (default), green 32, yellow 33, blue 34, magenta 35, cyan 36, white 37
Background: black 40, red 41, green 42, yellow 43, blue 44, magenta 45, cyan 46, white 47
Attributes: underscore 4, blink 5, inverse 7
export GREP_COLOR="33;44"

Codes are not output when inappropriate (For example: to a file or pipe) unless when=always

grep --color=always birds logfile | more -r

-C num --context=num Output num lines of context before and after each match.

-B num --before-context=num Output num lines of context before lines containing the pattern
then the lines containing pattern.

-A num
 --after-context=num

Output num lines of context after lines containing the pattern.
Places a line containing -- between sets of matches.
Examples:

input	grep -C1 '^day'	grep -B2 '^day'	grep -A2 '^day'
week
mon		mon
yesterday	yesterday	yesterday
day	day	day	day
later	later		later
tuesday			tuesday
sunday
	--	--	--	(group separator)
week2
mon2		mon2
yr2	yr2	yr2
day2	day2	day2	day2
later2	later2		later2
f			f

The pattern is anchored with ^ because an unanchored day would also match yesterday, tuesday and sunday.
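
The context options can be checked with a short script. The pattern here is anchored as '^day' so that only the lines day and day2 match; an unanchored day would also match yesterday, tuesday and sunday. The scratch file name is an assumption.

```shell
tmp=$(mktemp -d)
printf '%s\n' week mon yesterday day later tuesday sunday \
              week2 mon2 yr2 day2 later2 f > "$tmp/input"

c1=$(grep -C1 '^day' "$tmp/input")   # one line before and after each match
b2=$(grep -B2 '^day' "$tmp/input")   # two lines before each match
a2=$(grep -A2 '^day' "$tmp/input")   # two lines after each match

# A line containing -- separates the two groups in each result.
printf '%s\n\n%s\n\n%s\n' "$c1" "$b2" "$a2"
rm -rf "$tmp"
```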

-c
 --count

Output only a count of matching lines (or, with --invert-match, non-matching lines) for each file.
With multiple files each count is prefixed by 'filename:', even if the count is 0!

 grep -c images *html |grep -v ':0$' # don't show files with a 0 count
BackupStrageties.html:1
Characters.html:2
clonezilla.1.html:6
colors.html:3
css-box.html:4
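
A sketch of why the zero-count filter should anchor on the colon: a bare 0$ would also discard counts such as 10 or 20. The three HTML files below are made up for the demonstration.

```shell
tmp=$(mktemp -d)
printf 'no pictures\n' > "$tmp/a.html"
printf 'images\nimages again\n' > "$tmp/b.html"
# Ten matching lines, so the count ends in 0 without being 0.
i=0
while [ "$i" -lt 10 ]; do echo 'images'; i=$((i + 1)); done > "$tmp/c.html"

all=$(cd "$tmp" && grep -c images *.html)
nonzero=$(cd "$tmp" && grep -c images *.html | grep -v ':0$')

printf '%s\n--\n%s\n' "$all" "$nonzero"
rm -rf "$tmp"
```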

-n
 --line-number

Prefix each output line with its line number within its input file. Not with --count.

-m num
 --max-count=num

Stop reading a file after num matching lines.
With 0, no lines are read.
If the input is standard input from a regular file and num matching lines are output, standard input is positioned just after the last matching line, which lets the calling process resume the search.

Example: show 2 lines before body, with ++ between each occurrence.

> while grep -B 2 -m 1 body ; do echo ++; done<index.html

When grep stops after num matching lines, it still outputs any trailing context lines.
With -c or --count, it does not output a count greater than num.

With -v or --invert-match, stops after outputting num non-matching lines.
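
A minimal sketch of -m on a named file (the file name and contents are invented): grep stops after the second matching line, and a count taken with -c is capped at the same number.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'body one' 'skip' 'body two' 'body three' > "$tmp/page.txt"

first_two=$(grep -m 2 body "$tmp/page.txt")   # stops after two matching lines
capped=$(grep -c -m 2 body "$tmp/page.txt")   # the count never exceeds num

printf '%s\n%s\n' "$first_two" "$capped"
rm -rf "$tmp"
```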

-b --byte-offset prefix each line with the byte offset within the input file

-u --unix-byte-offsets Use Unix-style byte offsets, with CR characters stripped off. Only for MS-DOS and MS-Windows.

-T --initial-tab Align the content to a tab stop; useful with --with-filename (-H), --line-number (-n) and --byte-offset (-b).
Also causes the line number and byte offset to be output in a minimum-size field width.

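-H --with-filename Prefix each match with its file name. This is the default when more than one file is searched.

-h --no-filename Suppress the file name prefix. This is the default when only one file (or only standard input) is searched.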
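
The prefix options can be compared side by side. This is a sketch using two made-up files named f and g; the byte offset shown by -b is that of the start of the matching line.

```shell
tmp=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmp/f"
printf 'beta\n' > "$tmp/g"

numbered=$(grep -n beta "$tmp/f")                  # line number prefix
offset=$(grep -b beta "$tmp/f")                    # byte offset prefix ("alpha\n" is 6 bytes)
named=$(cd "$tmp" && grep -H beta f)               # file name forced for a single file
multi=$(cd "$tmp" && grep beta f g)                # file name by default with several files
bare=$(cd "$tmp" && grep -h beta f g)              # file name suppressed

printf '%s\n' "$numbered" "$offset" "$named" "$multi" "$bare"
rm -rf "$tmp"
```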

-l --files-with-matches Outputs only the names of files containing a match.
Once a match is found, grep proceeds to the next file.

-L --files-without-match Outputs only the names of files containing no match.

-Z
 --null

Terminate each file name with a NUL byte. Useful when file names contain newlines or other unusual characters.

Used with commands find -print0, perl -0, sort -z, and xargs -0.
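
A sketch of passing file names safely to xargs -0. The long form --null is used here because, as an assumption worth checking on your system, some BSD-derived greps give -Z a different meaning. The file names are invented; one contains a space on purpose.

```shell
tmp=$(mktemp -d)
printf 'needle\n' > "$tmp/has space.txt"
printf 'needle\n' > "$tmp/plain.txt"
printf 'hay\n'    > "$tmp/other.txt"

# -l lists matching files; --null ends each name with a NUL byte so that
# xargs -0 splits the list correctly even when names contain whitespace.
with=$(grep -l --null needle "$tmp"/*.txt | xargs -0 -n 1 basename)
# -L lists the files that contain no match.
without=$(cd "$tmp" && grep -L needle *.txt)

printf '%s\n--\n%s\n' "$with" "$without"
rm -rf "$tmp"
```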

--label=label

Displays input actually coming from standard input as input coming from file label. Useful for tools like zgrep, e.g.

 gzip -cd foo.gz | grep --label=foo -H something
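
A small sketch of --label with piped input. Adding -H forces the name prefix, which is otherwise omitted when there is a single input; the label foo and the input text are arbitrary.

```shell
# Standard input is reported as if it came from a file called "foo".
labelled=$(printf 'something here\n' | grep --label=foo -H something)
echo "$labelled"
```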

Mac OS only (grep (BSD grep, GNU compatible) 2.6.0-FreeBSD)

Directory processing
`-d action --directories=action`	If an input is a directory, process it according to `action`: `read` (default) reads it as an ordinary file; `skip` skips it; `recurse` reads all files under each directory, recursively.
`-R -r --recursive`	Equivalent to `-d recurse`.
`--include=pattern`	When recursing, search only files matching `pattern`.
`--exclude=pattern`	When recursing, skip files matching `pattern`.
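
A sketch of recursive searching with file-name filters. The directory layout below is made up; the results are sorted only to make the listing order predictable.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub"
printf 'TODO: fix\n'    > "$tmp/src/a.txt"
printf 'TODO: later\n'  > "$tmp/src/sub/b.txt"
printf 'TODO: ignore\n' > "$tmp/src/sub/c.log"

all=$(cd "$tmp" && grep -r -l TODO src | LC_ALL=C sort)
txt_only=$(cd "$tmp" && grep -r -l --include='*.txt' TODO src | LC_ALL=C sort)
no_logs=$(cd "$tmp" && grep -r -l --exclude='*.log' TODO src | LC_ALL=C sort)

printf '%s\n--\n%s\n--\n%s\n' "$all" "$txt_only" "$no_logs"
rm -rf "$tmp"
```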

binary (not ASCII text) file handling
Under MS-DOS and MS-Windows, if the file appears to be a text file, `␍` characters are ignored, so that regular expressions with `^` and `$` work correctly.
`-U --binary`	Process `file`(s) as binary (not ASCII text). By default, under MS-DOS and MS-Windows, grep uses the first 32KB of the file to determine whether it is text. `-U` overrides this guess: all files are read and passed to the matching mechanism verbatim. If the file is actually a text file with `␍␊` line endings, regular expressions with `^` or `$` may then fail to match. Has no effect on other platforms.
`--binary-files=type`	If the first few bytes of a file indicate that it contains binary (non-ASCII) data, treat the file as `type`. `binary` (default): on a match, outputs only `Binary file name matches`. `without-match`: assume a binary file does not match; equivalent to `-I`. `text`: process all `file`s as text; equivalent to `-a`. Control characters sent to the terminal can set attributes that make it unreadable. If this happens try `stty sane`, and filter the stream to delete non-printable characters: `tr -cd "\n[:print:]"`.
`-I`	Binary files do not match. Equivalent to `--binary-files=without-match`.
`-a --text`	Process all files as text. Equivalent to `--binary-files=text`.
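
A sketch of the binary-file options, assuming (as is usual) that a file containing a NUL byte is classified as binary. The file name is arbitrary.

```shell
tmp=$(mktemp -d)
# A line with an embedded NUL byte makes grep treat the file as binary.
printf 'abc\000def\n' > "$tmp/data.bin"

# -a forces text processing, so the line is counted as an ordinary match.
as_text=$(grep -a -c abc "$tmp/data.bin")
# -I treats the binary file as non-matching, so the exit status is 1.
grep -I -q abc "$tmp/data.bin" && skipped=0 || skipped=$?

echo "as_text=$as_text skipped=$skipped"
rm -rf "$tmp"
```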

`--help`
`-V --version`	Display the version to standard error.
`--mmap`	Use the `mmap` system call to read input instead of `read`, which may yield better performance. May cause undefined behavior if an input file shrinks, or if an I/O error occurs.
`-D action --devices=action`	If an input file is a device, FIFO or socket, use `action` to process it. `read`, which means that devices are read as ordinary files (default). `skip`, devices are silently skipped.
`--line-buffered`	Flush output after each match. This permits a downstream process in a pipe to handle each match as it occurs rather than waiting for a full buffer. Has a performance penalty.

Basic Regular Expressions: ? , + , { , | , ( and ) alone have no special meaning
Use backslash to ENABLE special meaning (ex: \?)
* is a meta character.

Regular Expressions

A pattern describes a set of strings, using operators to combine smaller expressions.

grep processes both basic regular expressions and extended regular expressions; the latter add ?, +, and |, and remove the need to escape ( ) and { }. (With GNU grep, there is no difference in available functionality.)

The simplest regular expression matches a single character. For example: a matches an a , R matches an R.
Regular expressions can be combined. For example: AR matches AR.

Regular expressions joined by | match any string matching either expression.

A metacharacter can be treated as a normal character by preceding it with a \ (backslash) (the opposite of BRE)!
. (period) matches any single character except null.
^ (caret) matches the beginning of a line
$ (dollar sign) matches the end of a line.

Repetition Operators may follow a regular expression.
The preceding item is matched:
`?`	at most once (i.e. it is optional)
`*`	zero or more times. The match is greedy, attempting to match as much as possible. For example `/.*xx/` matches everything in the line `asdfbxx asdfxx asdfxx`. For stingy (aka lazy) matching, a question mark after any of the greedy quantifiers chooses the smallest quantity for the first try; this needs Perl-style patterns (`perlpcre`, `grep -P`). `/.*foo/` matches BOTH words in `barfoo stoolfoo`. `/.*?foo/` matches only the FIRST word in `barfoo stoolfoo`. `vim` does NOT use `?` for this but rather `.\{-}`. See `:help non-greedy` and `:ver` and the plugin `eregex.vim` (vimdoc/pattern).
`+`	one or more times
`\{n\}`	exactly `n` times. `\{` is a brace escaped to give it special meaning in a basic regular expression, as is `\}`.
`\{n,\}`	`n` or more times
`\{n,m\}`	at least `n` times, but not more than `m` times.
`\{0,\}`	0 or more times, i.e. the same as `*`.
In a basic regular expression the braces must be escaped: `z\{1,3\}` means `z`, `zz` or `zzz`.

/9\{2,3\} bytes/ matches 99 bytes and 999 bytes
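
A sketch of the repetition operators in both syntaxes, plus greedy matching made visible with -o. All input strings are made up.

```shell
sizes=$(printf '%s\n' '9 bytes' '99 bytes' '999 bytes')

# Interval in BRE (escaped braces) and in ERE (plain braces): same result.
bre=$(printf '%s\n' "$sizes" | grep -c '9\{2,3\} bytes')
ere=$(printf '%s\n' "$sizes" | grep -E -c '9{2,3} bytes')

# ? makes the preceding item optional (ERE).
optional=$(printf '%s\n' color colour colouur | grep -E -c 'colou?r')

# * is greedy: .*foo runs to the LAST foo on the line.
greedy=$(printf 'barfoo stoolfoo\n' | grep -o '.*foo')

echo "$bre $ere $optional [$greedy]"
```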

\< matches the empty string at the beginning of a word, \> … at the end of a word.
\b matches the empty string at the edge of a word,
\B matches the empty string provided it's not at the edge of a word.
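
A sketch of the word-boundary operators as implemented by GNU grep, where \< anchors at the start of a word and \> at its end. The word list is invented, and \B in particular should be treated as a GNU extension.

```shell
words=$(printf '%s\n' 'day' 'daylight' 'today' 'midday sun')

starts=$(printf '%s\n' "$words" | grep '\<day')     # day begins a word
ends=$(printf '%s\n' "$words" | grep 'day\>')       # day ends a word
whole=$(printf '%s\n' "$words" | grep '\<day\>')    # day is the whole word
inside=$(printf '%s\n' "$words" | grep '\Bday')     # day does NOT begin a word

printf '%s\n--\n%s\n--\n%s\n--\n%s\n' "$starts" "$ends" "$whole" "$inside"
```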

Selecting a single character

Use repetitions to select multiple characters.

bracket expressions

A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in the list.
Example: match from 1 to 9 spaces: [ ]\{1,9\} .
nonmatching list
If the first character of the list is ^ (caret), the expression matches any single character not in the list.
For example:
[^abcdefghijklmnopqrstuvwxyz] matches a single non-lowercase character.
range expression
Separating two characters by - (hyphen) matches any single character that sorts between the two characters, inclusive,
using the locale's collating sequence and character set.
For example:
[3-7] is equivalent to [34567].
Many locales sort characters in dictionary order; in these locales [a-d] is typically equivalent to [aBbCcDd], not [abcd].
To obtain the traditional interpretation of bracket expressions (where [a-d] is equivalent to [abcd]), use the C locale by setting LC_ALL=C.
named classes match a single character
[:alpha:] for [abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ] NOT including Underscore ( _ ).
[:lower:] for [abcdefghijklmnopqrstuvwxyz]
[:upper:] for [ABCDEFGHIJKLMNOPQRSTUVWXYZ]
[:digit:] for [0123456789]. Synonym (with -P only): \d
[:alnum:] for [a-zA-Z0-9]. Related: \w matches a word character, i.e. [_[:alnum:]]; \W matches a non-word character.
Equivalent to [0-9A-Za-z] in the C locale with the ASCII character encoding.
[:xdigit:] for [0123456789abcdefABCDEF]
[:print:]
[:punct:]
! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]
includes tab, newline, vertical tab, form feed, carriage return, and space.
[:cntrl:] In ASCII 0x00-0x1F and 0x7F.
[:graph:]
[:graph:]
Brackets are part of the class names and must be included in addition to brackets that delimit a bracket expression.
For example:
/z[[:digit:]]z/ matches z3z but not z:z, zdz,…
However:
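/z[:digit:]z/ is a bracket expression listing the characters : d i g t (the repeated i and : are redundant).
It nominally matches the strings z:z, zdz, ziz, zgz, ztz (but not z3z). GNU grep assumes this form is a mistake: it prints a diagnostic and exits with status 2, unless POSIXLY_CORRECT is set.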
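
A sketch of a named class used correctly, with the class brackets inside the bracket expression, and of its negated form. The C locale is set explicitly so the result does not depend on the environment; the sample strings are taken from the text above.

```shell
sample=$(printf '%s\n' 'z3z' 'zdz' 'z:z')

# Inner brackets name the class, outer brackets delimit the expression.
class=$(printf '%s\n' "$sample" | LC_ALL=C grep 'z[[:digit:]]z')
# A leading ^ inside the outer brackets negates the list.
negated=$(printf '%s\n' "$sample" | LC_ALL=C grep 'z[^[:digit:]]z')

printf '%s\n--\n%s\n' "$class" "$negated"
```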

\s is a synonym for any white space character.


A subexpression is defined using \(…\) (or (…) with -E).

A backreference \n, where n is a single digit, matches the substring previously matched by the nth subexpression.
Frequently on the right hand side of a substitute command
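
A sketch of a backreference in a search pattern: a two-character group that must be repeated immediately. In a basic regular expression the group is written with escaped parentheses, in an extended one without. The input strings are invented.

```shell
input=$(printf '%s\n' 'abab' 'abba' 'xyxy')

# \1 must match exactly what the first group matched.
bre=$(printf '%s\n' "$input" | grep '\(..\)\1')
ere=$(printf '%s\n' "$input" | grep -E '(..)\1')

printf '%s\n--\n%s\n' "$bre" "$ere"
```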

Special characters:
To include ] place it first in the list, as in [][)(].
To include ^ place it anywhere but first. Otherwise it negates the list.
To include - place it last, lest it be considered a range
For example: [12345689] matches any single digit except 0 or 7.
Other metacharacters have no special meaning inside lists.
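
The placement rules can be combined in one list. The sketch below matches any line containing ], ^ or -, with ] placed first, ^ away from the front, and - last; the input lines are made up.

```shell
# ] first, ^ not first, - last: all three are taken literally.
hits=$(printf '%s\n' 'a]b' 'a^b' 'a-b' 'abc' | grep '[]^-]')
printf '%s\n' "$hits"
```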

Precedence: Repetition, concatenation, alternation.
A subexpression overrides precedence rules.

Environment Variables

The options in $GREP_OPTIONS are placed before any explicit options specified on the command line.
Options are separated by whitespace.
A backslash escapes the next character, to specify an option containing whitespace or a backslash.
Example: GREP_OPTIONS='--color'

$GREP_COLORS Specifies the colors and attributes used for highlighting.
fb: ForegroundColor;BackgroundColor

`cx=fb`	context lines (with `-v`, the matching lines)
`sl=fb`	selected lines (with `-v`, the non-matching lines)
`rv`	reverses the meanings of `sl` and `cx` when `-v` is given
`mt=fb`	matching text in any matching line. Equivalent to setting `ms` and `mc` to the same value. Default: 01;31, bold red text foreground on the current background.
`ms=fb`	matching text in a selected line (only used when `-v` is omitted). The effect of the `sl` (or `cx` if `rv`) capability remains active when this kicks in. Default: bold red text.
`mc=fb`	matching text in a context line (only used with `-v`). The effect of the `cx` (or `sl` if `rv`) capability remains active when this kicks in. Default: bold red text.
`fn=fb`	file names prefixing any content line. Default: magenta text, 35.
`ln=fb`	line numbers prefixing any content line. Default: green text, 32.
`bn=fb`	byte offsets prefixing any content line. Default: green text, 32.
`se=fb`	separators inserted between selected line fields (:), between context line fields (-), and between groups of adjacent lines when nonzero context is specified (--). Default: cyan text, 36.

fb: ANSI Select Graphic Rendition (SGR) foreground and background integers, concatenated with semicolons.
Common values include 1 for bold, 4 for underline, 5 for blink, 7 for inverse, 39 for default foreground color, 30 to 37 for foreground colors, 90 to 97 for 16-color mode foreground colors, 38;5;0 to 38;5;255 for 88-color and 256-color modes foreground colors, 49 for default background color, 40 to 47 for background colors, 100 to 107 for 16-color mode background colors, and 48;5;0 to 48;5;255 for 88-color and 256-color modes background colors.
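
A sketch of overriding the match color, assuming a GNU grep that honors GREP_COLORS (other implementations may only read GREP_COLOR or ignore both). The escape character is translated to ~ so the SGR sequence is visible when printed.

```shell
# Bold green (01;32) for matching text instead of the default bold red.
colored=$(printf 'find foo here\n' | GREP_COLORS='mt=01;32' grep --color=always foo)
# With --color=never no escape sequences are emitted at all.
plain=$(printf 'find foo here\n' | grep --color=never foo)

# Show the escape sequences in readable form.
printf '%s\n' "$colored" | tr '\033' '~'
printf '%s\n' "$plain"
```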

The locale for category $LC_xxx is taken from: $LC_ALL, $LC_xxx, $LANG.
The first of these variables that is set specifies the locale.
For example, if $LC_ALL is not set, but $LC_MESSAGES is set to pt_BR, then Brazilian Portuguese is used for messages.
The C locale is used if none of these are set, or if the locale catalog is not installed, or if grep was not compiled with national language support (NLS).
$LC_ALL, $LC_COLLATE, $LANG: collating sequence used to interpret range expressions like [a-z].
$LC_ALL, $LC_CTYPE, $LANG: type of characters, e.g., which characters are whitespace.
$LC_ALL, $LC_MESSAGES, $LANG: language for messages. The default C locale uses "American English" messages.
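
A sketch of pinning the locale for a single command so that a range expression has its traditional meaning. Only the C-locale result is stated, since the result in other locales depends on their collating sequence.

```shell
# In the C locale [a-d] means exactly a, b, c, d; uppercase letters are excluded.
c_range=$(printf '%s\n' a B c D | LC_ALL=C grep '[a-d]')
printf '%s\n' "$c_range"
```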

POSIXLY_CORRECT If set, grep behaves as POSIX.2 requires; otherwise, grep behaves more like other GNU programs.

POSIX.2 requires that options that follow file names be treated as file names; by default, such options are permuted to the front of the operand list and are treated as options.
POSIX.2 also requires that unrecognized options be diagnosed as "illegal".

_N_GNU_nonoption_argv_flags_ (N is grep's process ID.)
If the i-th character of this environment variable's value is 1,
do not consider the i-th operand to be an option, even if it appears to be one.
A shell can put this variable in the environment for each command it runs, specifying which operands are the results of file name wildcard expansion and therefore should not be treated as options.
Only with the GNU C library, and only when POSIXLY_CORRECT is not set.

Returns

0: selected lines WERE found
1: no lines selected. if [ $? != 1 ];then …
2: an error occurred like: no permission or file not found.
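
A sketch that records each of the three return codes. The helper status_of is hypothetical; it runs a command with output discarded and prints its exit status. The file names are made up, and the third call deliberately names a file that does not exist.

```shell
tmp=$(mktemp -d)
printf 'alpha\n' > "$tmp/f"

# Hypothetical helper: run a command quietly and print its exit status.
status_of() {
    "$@" >/dev/null 2>&1 && echo 0 || echo $?
}

found=$(status_of grep alpha "$tmp/f")                  # a line was selected
missing=$(status_of grep omega "$tmp/f")                # no line was selected
failed=$(status_of grep alpha "$tmp/does-not-exist")    # an error occurred

echo "found=$found missing=$missing failed=$failed"
rm -rf "$tmp"
```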

Large repetition counts in the {n,m} construct may cause grep to use lots of memory.
Certain obscure regular expressions require exponential time and space.

Backreferences are very slow, and require exponential time.

Current "official" GNU grep

See also egrep, fgrep, sed, sh, attributes, environ, largefile, regex, regexp, XPG4

Examples

Find all uses of Posix ( -i ignoring case) in the file text.mm, and write lines with line numbers( -n ):
grep -i -n posix text.mm

Display line numbers ( -n ) of empty lines (i.e. where the beginning is immediately followed by the end of the line):
grep -n '^$' or grep -n -v .

Display all lines containing strings abc or def or both:
grep -E 'abc|def' -or- grep -F -e abc -e def

Both of the following commands display all lines matching exactly abc or def:
grep -E '^abc$|^def$' -or- grep -F -x -e abc -e def

To find an A surrounded by tabs, using ANSI-C quoting for bash use:
grep $'\tA\t'

Environment Variables

$LC_COLLATE, $LC_CTYPE, $LC_MESSAGES, and $NLSPATH.

Notes

A line with embedded nulls will only be compared up to the first null; if it matches the entire line is output.

The results are unspecified if input files contain non-ASCII data or
lines longer than LINE_MAX (2048) bytes (defined in /usr/include/sys/syslimits.h or /usr/include/limits.h).

Large File Behavior.
See largefile(5) for the description of the behavior of grep when encountering files greater than or equal to 2 Gbyte ( 2**31 bytes). Lines are limited only by the size of the available virtual memory.

International Components for Unicode

See GNU documentation
egrep, fgrep, sed, sh, attributes(5), environ(5), largefile(5), regex(5), regexp(5), XPG4(5)