Regular Expressions

Regular expressions are a formal language used to find strings, or parts of strings, that match a specific pattern.

Examples of regular expressions that are useful in the context of Mp3tag are collected at the Regular Expressions topic of the Mp3tag Community Forums.

The following provides a short introduction to the syntax of regular expressions in Mp3tag.

Literals

All characters are literals except .|*?+(){}[]^$ and \. These special characters are treated as literals when preceded by a \.
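The escaping rule can be sketched with Python's re module, used here only as a stand-in for Mp3tag's engine (an assumption: Mp3tag uses Boost.Regex, but backslash escaping works the same way in both). The sample file name is hypothetical.

```python
import re

title = "01. Intro (Live).mp3"  # hypothetical sample text

# Unescaped, the dot is a wildcard, so "1." also matches "1x".
unescaped = re.search(r"1.", "01x Intro") is not None

# Escaped with a backslash, the dot and the parentheses are plain literals.
escaped = re.search(r"01\. Intro \(Live\)", title) is not None

# The escaped dot no longer matches an arbitrary character.
no_match = re.search(r"01\. Intro", "01x Intro") is None

print(unescaped, escaped, no_match)  # True True True
```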

Wildcard

The dot character . matches any single character.

Repeats

* matches the preceding item any number of times, including zero
+ matches the preceding item any number of times, but at least once
? matches the preceding item zero or one times only
a{n} represents the letter a repeated exactly n times
a{n,} represents the letter a repeated at least n times with no upper limit
a{n,m} represents the letter a repeated between n and m times
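A minimal sketch of the repeat operators, using Python's re module as a stand-in (an assumption; these operators behave the same in Perl-style engines such as the one Mp3tag uses). The sample strings are made up for illustration.

```python
import re

# {n}: exactly four digits, such as a year.
year = re.search(r"\d{4}", "Released 1997 on CD").group()

# +: at least one b; *: zero or more b's.
plus = re.fullmatch(r"ab+", "abbb") is not None
star = re.fullmatch(r"ab*", "a") is not None

# ?: the u is optional, but may not appear twice.
optional = [bool(re.fullmatch(r"colou?r", s))
            for s in ("color", "colour", "colouur")]

# {n,m}: between two and three a's.
bounded = [bool(re.fullmatch(r"a{2,3}", s))
           for s in ("a", "aa", "aaa", "aaaa")]

print(year)      # 1997
print(optional)  # [True, True, False]
print(bounded)   # [False, True, True, False]
```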

Non-greedy repeats

A non-greedy repeat is one which will match the shortest possible string. To make a repeat non-greedy, append a ? to it.
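The difference between greedy and non-greedy repeats can be illustrated as follows. This is a sketch in Python's re module, assumed here to behave like Mp3tag's Perl-style engine for this feature. The sample title is hypothetical.

```python
import re

text = "Title (Remix) (Live)"

# Greedy: .* runs to the last closing parenthesis.
greedy = re.search(r"\(.*\)", text).group()

# Non-greedy: .*? stops at the first closing parenthesis.
lazy = re.search(r"\(.*?\)", text).group()

print(greedy)  # (Remix) (Live)
print(lazy)    # (Remix)
```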

Parentheses

Parentheses serve two purposes: to group items together into a sub-expression, and to mark what generated the match.
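In Perl-style syntax, parentheses both group items into a sub-expression and capture the text that the group matched. A rough sketch in Python's re module (a stand-in, not Mp3tag itself; the "Artist - Title" string is a hypothetical example):

```python
import re

name = "Artist - Title"

# Capturing: each pair of parentheses records the text it matched.
m = re.fullmatch(r"(.+) - (.+)", name)
artist, title = m.group(1), m.group(2)

# Grouping: the + applies to the whole sub-expression "la",
# not just to the final letter.
grouped = re.fullmatch(r"(la)+", "lalala") is not None
ungrouped = re.fullmatch(r"la+", "lalala") is not None

print(artist, "|", title)  # Artist | Title
print(grouped, ungrouped)  # True False
```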

Alternatives

Alternatives occur when the expression can match either one sub-expression or another. Each alternative is separated by a |.
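As an illustration of alternatives, the sketch below matches any of three spellings of a "featuring" marker. It uses Python's re module as a stand-in for Mp3tag's engine, and the sample strings are made up.

```python
import re

# Three alternatives separated by |; the dots are escaped literals.
pattern = re.compile(r"feat\.|ft\.|featuring")

samples = ["A feat. B", "A ft. B", "A featuring B", "A with B"]
matches = [bool(pattern.search(s)) for s in samples]

print(matches)  # [True, True, True, False]
```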

Line anchors

^ matches the start of a line
$ matches the end of a line
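A short sketch of the line anchors on a two-line value. Python's re module is used as a stand-in, and it needs the re.MULTILINE flag before ^ and $ match at each line rather than only at the ends of the whole string. That flag is a detail of Python and is only assumed to mirror the line-based behaviour described above.

```python
import re

text = "Track 1\nTrack 2"  # hypothetical two-line value

# ^ anchors the match to the start of each line.
starts = re.findall(r"^Track", text, flags=re.MULTILINE)

# $ anchors the match to the end of each line.
ends = re.findall(r"\d$", text, flags=re.MULTILINE)

# "2" never appears at the start of a line, so this finds nothing.
anchored_miss = re.search(r"^2", text, flags=re.MULTILINE) is None

print(starts)         # ['Track', 'Track']
print(ends)           # ['1', '2']
print(anchored_miss)  # True
```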

Sets

[abc] matches any one of a, b, or c
[^abc] matches any character other than a, b, or c
[a-z] matches any character in the range a to z
[^A-Z] matches any character other than those in the range A to Z
\w matches any word character - all alphanumeric characters plus the underscore
\s matches any whitespace character
\d matches any digit (0-9)
\l matches any lower case character
\u matches any upper case character
\W matches any non-word character
\S matches any non-whitespace character
\D matches any non-digit character
\L matches any non-lower case character
\U matches any non-upper case character
\t matches the tab character
\n matches the newline character
\r matches the carriage return character
\r\n matches a Windows-style line break
\xnn matches a character with Unicode hex value nn
\x{nnnn} matches a character with Unicode hex value nnnn
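A few of the sets above can be tried out in the following sketch. It uses Python's re module as a stand-in and is limited to classes that Python shares with Perl-style syntax. Python's re does not support \l, \u, \L, \U, or \x{nnnn}, so those are left out. The sample text is hypothetical.

```python
import re

text = "Disc 2, Track 07"

digits = re.findall(r"\d+", text)       # runs of digits
words = re.findall(r"\w+", text)        # runs of word characters
capitals = re.findall(r"[A-Z]", text)   # characters in the range A to Z

# Negated set: remove everything that is not in the range 0 to 9.
only_digits = re.sub(r"[^0-9]", "", text)

# Explicit set: remove the listed vowels.
no_vowels = re.sub(r"[aeiou]", "", "regular")

# \x20 is the space character given by its hex value.
underscored = re.sub(r"\x20", "_", "a b")

print(digits)       # ['2', '07']
print(words)        # ['Disc', '2', 'Track', '07']
print(capitals)     # ['D', 'T']
print(only_digits)  # 207
print(no_vowels)    # rglr
print(underscored)  # a_b
```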

Word boundaries

\b matches a word boundary (the start or end of a word)
\B matches only when not at a word boundary
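Word boundaries can be sketched as follows, again with Python's re module standing in for Mp3tag's engine (an assumption; \b and \B behave alike for this plain ASCII sample, which is made up for illustration).

```python
import re

text = "cat concatenate cat's"

# \b on both sides: "cat" only as a whole word. The apostrophe is a
# non-word character, so the "cat" in "cat's" also counts.
whole = re.findall(r"\bcat\b", text)

# \B on both sides: "cat" only when embedded inside a longer word.
inner = re.findall(r"\Bcat\B", text)

print(len(whole), len(inner))  # 2 1
```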

Mp3tag uses the Boost.Regex engine, which follows Perl regular expression syntax. The Boost-Extended Format String Syntax is not enabled in Mp3tag.