Bash Pattern Matching
Part of the book: Bash: The Linux Command Line
Pattern matching in Bash is also called globbing. The name sounds all bloaty and gooey, but the feature itself is plain, not sticky at all, and quite useful. The term comes from glob, an early Unix program that expanded patterns on behalf of the shell; glob() is also the name of a C library function that performs this kind of matching, although Bash uses its own implementation.
File pattern matching is usually about selecting groups of files, but it can also save you from typing long file names. Rather than typing out a full file name, type a pattern that contains a unique part of it and you've matched the file.
How it Works
When you issue a command and the command or one of its arguments contains a pattern, the shell first expands the pattern into one or more file names and then runs the command. If the pattern matches, it is replaced with the matching file names, so the command never sees the pattern, only the matches. If the pattern fails to match, then by default the command is given the pattern itself, unchanged.
By default the shell doesn't match hidden file names, i.e. file and directory names that begin with a dot (.) won't get matched. This behaviour can be changed (see Configuring Pattern Matching below).
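The expansion step can be watched from inside a script. The following is a minimal sketch, assuming a throwaway directory and a hypothetical helper called show_args that prints each argument it receives in brackets; neither comes from the text above.

```bash
#!/usr/bin/env bash
# Work in a throwaway directory so the example does not touch real files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch a.txt b.txt

# Hypothetical helper: print every argument on its own line, in brackets.
show_args() { printf '[%s]\n' "$@"; }

# The shell replaces *.txt before show_args runs, so it receives two arguments.
matched=$(show_args *.txt)
echo "$matched"      # prints [a.txt] and [b.txt] on separate lines

# Nothing matches *.pdf, so by default the pattern is passed through as-is.
unmatched=$(show_args *.pdf)
echo "$unmatched"    # prints [*.pdf]

cd / && rm -rf "$demo_dir"
```

The second call shows why a command sometimes complains about a file literally named *.pdf: it was handed the unexpanded pattern.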
Case Sensitive
By default file patterns are also case sensitive, meaning that the files "upper" and "UPPER" are different.
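A small sketch of case sensitivity, assuming two made-up file names in a temporary directory:

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch notes.txt REPORT.TXT

# Only the file with the lowercase extension matches the lowercase pattern.
lower_matches=$(echo *.txt)
# The uppercase pattern picks up the other file instead.
upper_matches=$(echo *.TXT)

echo "lower: $lower_matches"   # lower: notes.txt
echo "upper: $upper_matches"   # upper: REPORT.TXT

cd / && rm -rf "$demo_dir"
```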
Asterisk
The most used pattern is the asterisk (*). It matches zero or more of any character. It can be used at the beginning, middle or end of a pattern.
Some examples are:
Pattern | Matches |
*.pdf | Anything.pdf (a file named just .pdf is hidden, so it only matches when dotglob is set) |
img*jpg | img0001.jpg or imgjpg |
index.* | index.html or index.php or index.html.bak |
* | anything or really or Anything |
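The rows above can be tried out directly. This is a sketch under the assumption that the listed files exist in an otherwise empty temporary directory; the extra file .hidden.pdf is added only to show that hidden names are skipped.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch report.pdf .hidden.pdf img0001.jpg imgjpg index.html index.php index.html.bak

# The asterisk at the start: any name ending in .pdf, hidden files excluded.
pdfs=$(echo *.pdf)
# The asterisk in the middle: it may match zero characters (imgjpg).
images=$(echo img*jpg)
# The asterisk at the end: everything after "index." is accepted.
indexes=$(echo index.*)

echo "$pdfs"      # report.pdf
echo "$images"    # img0001.jpg imgjpg
echo "$indexes"   # index.html index.html.bak index.php

cd / && rm -rf "$demo_dir"
```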
Question Mark
A less used but still useful pattern is the question mark (?). It matches exactly one character, whatever that character is.
Pattern | Matches |
?ndex.html | Index.html or index.html |
file.?? | file.01 or file.js |
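A quick sketch of the question mark, assuming a few invented files in a temporary directory. Each ? must be filled by exactly one character, so the number of question marks fixes the length of the match.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch index.html file.01 file.js file.c file.txt

# One ? stands for the single leading character.
one_lead=$(echo ?ndex.html)
# Two ?? require an extension of exactly two characters.
two_chars=$(echo file.??)
# A single ? requires an extension of exactly one character.
one_char=$(echo file.?)

echo "$one_lead"    # index.html
echo "$two_chars"   # file.01 file.js
echo "$one_char"    # file.c

cd / && rm -rf "$demo_dir"
```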
More complicated
Rather than matching any character, you can specify a list of characters, a range, or a class of characters by enclosing them in square brackets ([ and ]). Although the bracket expression takes up several characters in the pattern, it matches exactly one character in the file name.
Pattern | Matches |
messages.[123] | messages.1, messages.2 or messages.3 |
page[a-z].txt | pagea.txt, pageb.txt ... pagez.txt |
page[-a-z].txt | As above, but also matches page-.txt |
page[^m-z].txt or page[!m-z].txt | Doesn't match files pagem.txt through pagez.txt but matches all others (i.e. page?.txt) |
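The bracket patterns in the table can be checked with a short script. This is a sketch, assuming a handful of made-up files; it sets LC_ALL=C as an extra assumption so that the range a-z means plain lowercase ASCII and the output order is predictable.

```bash
#!/usr/bin/env bash
# Assumption: use the C locale so ranges and sort order are byte-based.
export LC_ALL=C
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch messages.1 messages.2 messages.3 messages.4
touch pagea.txt pagem.txt page-.txt page1.txt

list_match=$(echo messages.[123])     # a list of characters
range_match=$(echo page[a-z].txt)     # a range
dash_match=$(echo page[-a-z].txt)     # a leading - is taken literally
negated_match=$(echo page[!m-z].txt)  # anything except m through z

echo "$list_match"      # messages.1 messages.2 messages.3
echo "$range_match"     # pagea.txt pagem.txt
echo "$dash_match"      # page-.txt pagea.txt pagem.txt
echo "$negated_match"   # page-.txt page1.txt pagea.txt

cd / && rm -rf "$demo_dir"
```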
Classes can also be used; these are short forms for full ranges of characters. The following classes can be used: alnum, alpha, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, word, xdigit. A class is written as [:name:] and must itself sit inside a bracket expression, which gives double brackets, like this:
ls img[[:digit:]].jpg
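That matches img0.jpg through img9.jpg. Next to [0-9] it seems useless, but consider that classes are defined by the locale rather than by a fixed range of characters. The real use of classes such as alpha, upper and lower is that if the locale changes (i.e. the language) then the characters that match also change, so accented or non-Latin letters are matched without having to be listed. The digit class is the exception: POSIX restricts it to 0 through 9 in every locale.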
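An easy slip is to write the class with a single pair of brackets. The sketch below, which assumes a few invented image names, shows both forms side by side: with single brackets the shell sees an ordinary list made of the characters :, d, i, g and t.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch img0.jpg img5.jpg img10.jpg imgd.jpg

# Correct form: the class sits inside a bracket expression.
digit_matches=$(echo img[[:digit:]].jpg)
# Single brackets: a plain character list, so it matches the letter d.
list_matches=$(echo img[:digit:].jpg)

echo "$digit_matches"   # img0.jpg img5.jpg
echo "$list_matches"    # imgd.jpg

cd / && rm -rf "$demo_dir"
```

Note that img10.jpg is matched by neither pattern, because a bracket expression stands for exactly one character.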
Extended Patterns
Extended patterns can be handy in niche situations, but you'll probably find that they are disabled by default in your bash. To enable them you would have to run:
shopt -s extglob
As with any shell setting, if you want it turned on every time you start a shell you need to add the command to your .bashrc file or the system-wide /etc/bashrc file.
With extended patterns you can create more complex patterns that match more than a single character but less than all characters of any length. In each form, pattern-list is one or more patterns separated by |.
Pattern | Matches |
?(pattern-list) | Matches zero or one occurrence of the given patterns |
*(pattern-list) | Matches zero or more occurrences of the given patterns |
+(pattern-list) | Matches one or more occurrences of the given patterns |
@(pattern-list) | Matches one of the given patterns |
!(pattern-list) | Matches anything except one of the given patterns |
The patterns in the list can be any of the regular patterns, so +([[:digit:]]) matches one or more digits.
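Here is a sketch of three of the extended forms, assuming some made-up files in a temporary directory. The shopt line has to run before the shell reads any line that contains an extended pattern, which is why it comes first.

```bash
#!/usr/bin/env bash
# Enable extended patterns before any line that uses them is parsed.
shopt -s extglob

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch img1.jpg img22.jpg imgA.jpg photo.jpg photo.png notes.txt

# +(...) : one or more digits between "img" and ".jpg".
numbered=( img+([[:digit:]]).jpg )
# @(...) : exactly one of the alternatives, then any extension.
chosen=( @(photo|notes).* )
# !(...) : every name that does not end in .jpg.
not_jpg=( !(*.jpg) )

echo "${numbered[*]}"   # img1.jpg img22.jpg
echo "${chosen[*]}"     # notes.txt photo.jpg photo.png
echo "${not_jpg[*]}"    # notes.txt photo.png

cd / && rm -rf "$demo_dir"
```

The results are stored in arrays here so that each matching file name stays a separate element.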
Configuring Pattern Matching
There are shell options that allow control over how the shell matches patterns and how it reacts to failed patterns.
dotglob | If set, bash includes filenames beginning with a ‘.’ in the results of pathname expansion. |
extglob | If set, the extended pattern matching features described above under Extended Patterns are enabled. |
failglob | If set, patterns which fail to match filenames during pathname expansion result in an expansion error. |
globstar | If set, the pattern ** used in a filename expansion context will match all files and zero or more directories and subdirectories. If the pattern is followed by a /, only directories and subdirectories match. |
To see whether one of these options is set, run:
shopt dotglob
To set it:
shopt -s dotglob
To unset it:
shopt -u dotglob
As you have come to expect, to make these changes permanent you need to add the shopt commands to your .bashrc file or the system-wide /etc/bashrc file.
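The effect of two of these options can be sketched as follows, assuming a small invented directory tree. The globstar part assumes Bash 4.0 or later, since older versions do not have that option.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir -p sub/deeper
touch visible.txt .hidden.txt sub/one.txt sub/deeper/two.txt

# Default behaviour: names beginning with a dot are skipped.
default_matches=( *.txt )

# With dotglob set, hidden names are included as well.
shopt -s dotglob
dotglob_matches=( *.txt )
shopt -u dotglob

# With globstar set, ** descends through any number of directories.
shopt -s globstar
recursive_matches=( **/*.txt )
shopt -u globstar

echo "default:   ${default_matches[*]}"
echo "dotglob:   ${dotglob_matches[*]}"
echo "recursive: ${recursive_matches[*]}"

cd / && rm -rf "$demo_dir"
```

Unsetting each option straight after use keeps the rest of the script on the default behaviour.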
Seeing File Expansion Work
The easy way to see file expansion working is to use the echo command. Just give it a pattern and it will print the results.
echo *.jpg
Another way is to position your cursor at the end of the pattern and press Tab twice; the shell will list the matching files below. You can also expand the pattern in place by positioning your cursor at the end of the pattern and pressing Ctrl-x then *; the matching files then replace the pattern on the command line.
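When file names contain spaces, the output of echo can be hard to read because every name is joined by a single space. One possible alternative, shown here as a sketch with made-up file names, is printf, which can print each match on its own line.

```bash
#!/usr/bin/env bash
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch cat.jpg "holiday photo.jpg"

# echo joins the matches with spaces, so two files look like three words.
echo_view=$(echo *.jpg)
echo "$echo_view"      # cat.jpg holiday photo.jpg

# printf repeats its format for every argument: one file name per line.
printf_view=$(printf '%s\n' *.jpg)
echo "$printf_view"

# Count the lines to confirm that there are really only two matches.
line_count=$(printf '%s\n' *.jpg | wc -l)

cd / && rm -rf "$demo_dir"
```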