Regular Expressions
Regular expressions are a formal language used to find strings, or parts of strings, that match a specific pattern.
Examples of regular expressions that are useful in the context of Mp3tag are collected in the Regular Expressions topic of the Mp3tag Community Forums.
The following provides a short introduction to the syntax of regular expressions in Mp3tag.
Literals
All characters are literals except .|*?+(){}[]^$
These characters are literals when preceded by a \
Wildcard
The dot character . matches any single character.
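The literal and wildcard rules can be tried out with a small sketch. It uses Python's standard `re` module as a stand-in for Mp3tag's engine, which is an assumption: the two engines differ in details, although backslash escaping and the dot wildcard behave the same way in both. The sample title is made up for illustration.

```python
import re

# Hypothetical title containing characters that are special in a regex.
title = "Track (Live) [2004].mp3"

# Unescaped, "." is a wildcard, so ".mp3" also matches "Xmp3".
wildcard_hit = re.search(r".mp3", "Xmp3") is not None

# Escaped with a backslash, "\." only matches a literal dot.
literal_dot_hit = re.search(r"\.mp3", "Xmp3") is not None

# Parentheses and brackets are escaped the same way to match them literally.
matched_text = re.search(r"\(Live\) \[2004\]", title).group(0)

print(wildcard_hit, literal_dot_hit, matched_text)
```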
Repeats
| * | repeated any number of times, including zero |
| + | repeated any number of times, but at least once |
| ? | repeated zero or one times only |
| a{n} | represents the letter a repeated exactly n times |
| a{n,} | represents the letter a repeated at least n times with no upper limit |
| a{n,m} | represents the letter a repeated between n and m times |
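The repeat operators above can be checked against a few short strings. This is a minimal sketch in Python's `re` module, used here only as an approximation of Mp3tag's engine (an assumption); the helper `full` and the sample strings are hypothetical.

```python
import re

def full(pattern, text):
    """Return True when the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

samples = ("", "a", "aa", "aaa")

# One row per repeat operator: which of the samples match in full.
results = {
    "a*": [s for s in samples if full(r"a*", s)],          # zero or more
    "a+": [s for s in samples if full(r"a+", s)],          # one or more
    "a?": [s for s in samples if full(r"a?", s)],          # zero or one
    "a{2}": [s for s in samples if full(r"a{2}", s)],      # exactly two
    "a{2,}": [s for s in samples if full(r"a{2,}", s)],    # two or more
    "a{1,2}": [s for s in samples if full(r"a{1,2}", s)],  # one to two
}

for pattern, matches in results.items():
    print(pattern, matches)
```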
Non-greedy repeats
A non-greedy repeat is one which will match the shortest possible string. A repeat is made non-greedy by appending a ? after it.
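The difference between greedy and non-greedy repeats is easiest to see side by side. The sketch below uses Python's `re` module as a stand-in for Mp3tag's engine (an assumption) and a made-up title.

```python
import re

# Hypothetical title with two parenthesized parts.
text = "Artist (Live) (Remastered)"

# Greedy: ".*" grabs as much as possible, up to the last ")".
greedy = re.search(r"\(.*\)", text).group(0)

# Non-greedy: ".*?" stops at the first ")" that completes a match.
lazy = re.search(r"\(.*?\)", text).group(0)

print(greedy)  # (Live) (Remastered)
print(lazy)    # (Live)
```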
Parentheses
Parentheses serve two purposes:
- to group items together into a subexpression, and
- to mark what generated the match.
In Replace matches with, $N expands to the text that matched sub-expression N.
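As an illustration of capturing groups and their use in a replacement, here is a minimal Python sketch. Python's `re` module is only a stand-in for Mp3tag's engine (an assumption), and it writes group references in the replacement as `\1`, `\2` where Mp3tag's Replace matches with field uses `$1`, `$2`. The sample string is hypothetical.

```python
import re

# Hypothetical "Artist - Title" string.
text = "Some Artist - Some Title"

# Two sub-expressions: group 1 is the artist, group 2 is the title.
pattern = r"(.+) - (.+)"

match = re.fullmatch(pattern, text)
artist, title = match.group(1), match.group(2)

# Swap the two parts; "\2 - \1" here corresponds to "$2 - $1" in Mp3tag.
swapped = re.sub(pattern, r"\2 - \1", text)

print(artist, "|", title)
print(swapped)  # Some Title - Some Artist
```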
Alternatives
Alternatives occur when the expression can match either one sub-expression or another. Each alternative is separated by a |
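A short sketch of alternatives, again using Python's `re` module as an approximation of Mp3tag's engine (an assumption). The pattern and the sample titles are made up for illustration.

```python
import re

# Three alternative spellings grouped together, followed by the guest name.
pattern = r"(feat\.|ft\.|featuring) (.+)"

# Hypothetical titles, one per spelling.
titles = ["Song feat. Guest A", "Song ft. Guest B", "Song featuring Guest C"]

# Group 2 holds whatever follows the matched alternative.
guests = [re.search(pattern, t).group(2) for t in titles]

print(guests)  # ['Guest A', 'Guest B', 'Guest C']
```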
Line anchors
| ^ | matches the start of a line |
| $ | matches the end of a line |
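The line anchors can be sketched as follows. Note the assumption: Python's `re` module stands in for Mp3tag's engine, and in Python `^` and `$` only act as per-line anchors when the `re.MULTILINE` flag is set. Whether Mp3tag needs an equivalent switch is not stated here. The multi-line text is hypothetical.

```python
import re

# Hypothetical multi-line field value.
lyrics = "first line\nsecond verse\nthird chorus"

# "^" anchors each match to the start of a line.
first_words = re.findall(r"^\w+", lyrics, flags=re.MULTILINE)

# "$" anchors each match to the end of a line.
last_words = re.findall(r"\w+$", lyrics, flags=re.MULTILINE)

print(first_words)  # ['first', 'second', 'third']
print(last_words)   # ['line', 'verse', 'chorus']
```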
Sets
| [abc] | matches any one of a, b, or c |
| [^abc] | matches any character other than a, b, or c |
| [a-z] | matches any character in the range a to z |
| [^A-Z] | matches any character other than those in the range A to Z |
| \w | matches any word character (all alphanumeric characters plus the underscore) |
| \s | matches any whitespace character |
| \d | matches any digit (0-9) |
| \l | matches any lower case character |
| \u | matches any upper case character |
| \W | matches any non-word character |
| \S | matches any non-whitespace character |
| \D | matches any non-digit character |
| \L | matches any non-lower case character |
| \U | matches any non-upper case character |
| \t | matches the tab character |
| \n | matches the newline character |
| \r | matches the carriage return character |
| \r\n | matches a Windows style line break |
| \xnn | matches a character with Unicode hex value nn |
| \x{nnnn} | matches a character with Unicode hex value nnnn |
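A few of the sets and character classes above, sketched in Python. The `re` module is used as a stand-in for Mp3tag's engine (an assumption), and it does not cover the whole table: `\l`, `\u`, `\L`, `\U`, and `\x{nnnn}` are not available in Python's `re`, so they are left out here. The sample strings are hypothetical.

```python
import re

# Hypothetical file name stem.
text = "07 - Track_Name (2004)"

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of word characters (incl. "_")

# A set matches any one listed character; here every vowel is removed.
no_vowels = re.sub(r"[aeiou]", "", "regular")

# A negated range matches anything outside A-Z.
non_upper = re.findall(r"[^A-Z]+", "ABcdEF")

# \xnn matches the character with that hex code; 0x41 is "A".
hex_match = re.fullmatch(r"\x41", "A") is not None

print(digits, words, no_vowels, non_upper, hex_match)
```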
Word boundaries
| \b | matches a word boundary (the start or end of a word) |
| \B | matches only when not at a word boundary |
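Word boundaries are useful for matching whole words only. The sketch below uses Python's `re` module as an approximation of Mp3tag's engine (an assumption), with a made-up sample string.

```python
import re

text = "cat concat cats"

# \b on both sides: only the standalone word "cat" matches.
whole_words = re.findall(r"\bcat\b", text)
replaced = re.sub(r"\bcat\b", "dog", text)

# \B before "cat": only matches where "cat" does not start a word,
# which is the "cat" inside "concat" (starting at index 7).
inner_start = re.search(r"\Bcat", text).start()

print(whole_words)  # ['cat']
print(replaced)     # dog concat cats
print(inner_start)  # 7
```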
Mp3tag uses the Boost.Regex engine with Perl regular expression syntax. The Boost-Extended Format String Syntax is not enabled in Mp3tag.