Elements of a regular expression
A regular expression is made up of a number of elements, each of which must match, in sequence, for the match as a whole to succeed. There
are many different elements that you can use; on this page, we'll show you examples of each main type of element, and we'll
use elements that are common to regular expressions in all languages.
Literals
Many characters (all letters, all digits, and many of the special symbols) match literally. So if I write
    if ($p =~ /cat/) ....
(example is in Perl), I will have a successful match if my variable $p contains the letters cat, in that order and next to each other, anywhere in the string. So each of the following will match
    The cat sat on the mat
    cat
    cats, dogs and hamsters are popular animals to keep as pets
    Concatenation is used to join strings together
but it will fail to match
    there is nothing in here that will match c-a-t
    The Cat sat on the mat
The last of these fails because the match is case sensitive: Cat is not cat.
Anchors or Assertions
If you want to check whether an incoming string starts with something (or ends with something), you can use an
anchor - ^ for start and $ for end, and there are also other assertions (such as \b for word boundary) in some
regular expression handlers. So if I write
    if ($p =~ /^cat/) ....
(example is in Perl), I will have a successful match if my variable $p starts with the letters cat. So each of the following will match
    cat
    cats, dogs and hamsters are popular animals to keep as pets
but it will fail to match
    Concatenation is used to join strings together
    The cat sat on the mat
    there is nothing in here that will match c-a-t
    The Cat sat on the mat
Character Groups
In a character group, you can specify a list of possibilities to match against one character in the incoming
string - for example [abcd] will look for either an a or a b or a c or a d. There are many shortcuts to
specify groups without having to list all the characters, but this is one of the areas in which regular expression
engines differ from each other.
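As a rough illustration of the [abcd] group described above, the following Perl sketch tests a few strings against it. The sample strings and variable names are made up for this example. The [a-d] range form is shown as one possible shortcut; it works in Perl, but other engines may differ.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @samples = ("apple", "xyz", "kiwi", "dog");

# [abcd] matches any ONE character that is an a, b, c or d.
my @listed = grep { /[abcd]/ } @samples;

# [a-d] is a range shortcut for the same four characters in Perl.
my @ranged = grep { /[a-d]/ } @samples;

print "listed: @listed\n";   # prints "listed: apple dog"
print "ranged: @ranged\n";   # prints "ranged: apple dog"
```

Both forms pick out the same strings here, because a string only needs to contain one of the listed characters somewhere.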
If I write
    if ($p =~ /[csm]at/) ....
(example is in Perl), I will have a successful match if my variable $p contains the string cat, sat or mat. So each of the following will match
    cat
    cats, dogs and hamsters are popular animals to keep as pets
    Concatenation is used to join strings together
    The cat sat on the mat
    The Cat sat on the mat
    there is nothing in here that will match c-a-t
The last of these matches, perhaps unexpectedly, because the word match contains mat. A string that contains none of cat, sat or mat, such as c-a-t on its own, will fail to match.
Counts
If you want to match an element of a regular expression a number of times, you can follow it with a count
character - the two most widely available are a ? for 0 or 1, and a * for 0 or more.
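To see the difference between the two counts side by side, here is a minimal Perl sketch. The sample strings and variable names are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @samples = ("color", "colour", "colouur", "colr");

# u? allows zero or one u between the second o and the r.
my @optional = grep { /colou?r/ } @samples;

# u* allows zero or more u characters in the same position.
my @repeated = grep { /colou*r/ } @samples;

print "with ?: @optional\n";   # prints "with ?: color colour"
print "with *: @repeated\n";   # prints "with *: color colour colouur"
```

The count applies only to the element immediately before it, here the single letter u, so colr fails both tests because the second o is still required.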
If I write
    if ($p =~ /c-?a-?t/) ....
(example is in Perl), I will have a successful match if my variable $p contains the string cat, perhaps with a - sign between the c and a and perhaps another between the a and t. So each of the following will match
    cat
    cats, dogs and hamsters are popular animals to keep as pets
    Concatenation is used to join strings together
    The cat sat on the mat
    there is nothing in here that will match c-a-t
but it will fail to match
    The Cat sat on the mat
Others
Regular expressions can commonly (but not always - it depends on the language) include alternation -
a | to say "or" - and round brackets ( ) to group together sections of a regular expression so that counts can be
applied. You'll also find that the brackets often "capture" the part of the incoming string matched by the section enclosed in
the brackets for later use in the program.
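A short Perl sketch of alternation, grouping and capture follows. It assumes Perl's behaviour, where the first pair of brackets captures into $1; the sample strings and variable names are made up for illustration, and other languages expose captures differently.

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @samples = ("The cat sat on the mat", "My dog barks", "A hamster wheel");

my @found;
for my $p (@samples) {
    # (cat|dog) groups the two alternatives and captures whichever one
    # matched into $1.
    if ($p =~ /(cat|dog)/) {
        push @found, $1;
    }
}
print "captured: @found\n";   # prints "captured: cat dog"

# Brackets also let a count apply to a whole section: (ha)* means
# zero or more repetitions of the two letters ha.
my $laugh = ("hahaha" =~ /^(ha)*$/) ? "yes" : "no";
my $odd   = ("hahah"  =~ /^(ha)*$/) ? "yes" : "no";
print "hahaha: $laugh, hahah: $odd\n";   # prints "hahaha: yes, hahah: no"
```

Without the brackets, ha* would apply the count to the a alone, which is why grouping matters when a count should cover more than one character.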
PH: 01144 1225 708225 • FAX: 01144 1225 793803 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho