Perl Regular Expressions
In Perl, you write regular expressions between / delimiters (or you can change the delimiter if you wish), and you add modifiers after the closing /. To match the contents of a variable against a regular expression, use the =~ operator. Regular expressions are also used by Perl's built-in functions such as grep and split, and by the s (substitute) operator.
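As a minimal sketch of the points above (the sample strings and variable names are made up for illustration), a match with =~, a modifier after the closing /, and a match against $_ might look like this:

```perl
use strict;
use warnings;

my $line = "Order 66 shipped to Melksham";

# =~ matches the variable on its left against the pattern on its right.
my $has_digits = ($line =~ /\d+/) ? 1 : 0;

# A modifier goes after the closing /: i ignores case.
my $has_town = ($line =~ /melksham/i) ? 1 : 0;

# In list context a match returns what the ( ) groups captured.
my ($number) = $line =~ /(\d+)/;

# With no =~, the pattern is matched against $_.
my $count = 0;
for ("Perl", "Tcl", "PHP") {
    $count++ if /^P/;
}

print "digits=$has_digits town=$has_town number=$number count=$count\n";
```

The ternary is only there to turn the true / false result of the match into 1 or 0 for printing.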

Perl uses a very full set of elements within its regular expressions. Most of them are terse, which makes them hard for a newcomer to follow when maintaining code. Perl's regular expression syntax predates the POSIX standard, and so does not follow it.

Perl 6, since renamed Raku, supports grammars and rules rather than traditional regular expressions. Grammars and rules take pattern matching to a whole new level, and the intention was that tools would be available to convert code. In other words, rules and grammars do everything that the old regular expressions did, and more.

Elements of a Perl regular expression, by operator type

Literal characters (match a character exactly)
  a A y 6 % @       Letters, digits and many special characters match exactly
  \$ \^ \+ \\ \?    Precede other special characters with a \ to cancel their special regex meaning
  \n \t \r          Literal new line, tab, return
  \cJ \cG           Control codes
  \xa3              Hex code for any character

Anchors and assertions
  ^                 Starts with
  $                 Ends with
  \b \B             On a word boundary, NOT on a word boundary

Character groups (any 1 character from the group)
  [aAeEiou]         Any character listed from [ to ]
  [^aAeEiou]        Any character except a, A, e, E, i, o or u
  [a-fA-F0-9]       Any hex digit (0 to 9, a to f or A to F)
  .                 Any character at all (not new line in some circumstances)
  \s                Any white space character (space, \n, \r, \t or form feed)
  \w                Any word character (letter, digit or _)
  \d                Any digit (0 through 9)
  \S \W \D          Any character that is NOT a space, word character or digit

Counts (apply to the previous element)
  +                 1 or more ("some")
  *                 0 or more ("perhaps some")
  ?                 0 or 1 ("perhaps a")
  {4}               Exactly 4
  {4,}              4 or more
  {4,8}             Between 4 and 8
  Add a ? after any count to turn it sparse (non-greedy: match as few as possible) rather than have it default to greedy

Alternation
  |                 Either, or

Grouping
  ( )               Group for count and save what was matched to a variable ($1, $2 etc)
  (?: )             Group for count but do not save

Variables
  $xyz              Insert contents of $xyz into the regular expression
  \1 \2             Back reference to 1st, 2nd etc matched groups
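The following sketch pulls several of these elements together: greedy and sparse counts, anchors with a character group, grouping with alternation, and a back reference. The sample strings and variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "<b>bold</b> and <i>italic</i>";

# Greedy: .+ takes as much as it can, so it runs on to the last >.
my ($greedy) = $text =~ /<(.+)>/;

# Sparse: .+? takes as little as it can, so it stops at the first >.
my ($sparse) = $text =~ /<(.+?)>/;

# Anchors, a character group and a count: exactly six hex digits.
my $colour = "ff00A3";
my $is_hex = ($colour =~ /^[a-fA-F0-9]{6}$/) ? 1 : 0;

# (?: ) groups the alternation without saving; ( ) saves the day name.
my $when = "due on Tuesday";
my ($day) = $when =~ /(?:on|by) (\w+day)/;

# \1 must match the same text as the first group: a doubled word.
my $typo = "this is is a test";
my ($doubled) = $typo =~ /\b(\w+) \1\b/;

print "greedy:  $greedy\n";   # b>bold</b> and <i>italic</i
print "sparse:  $sparse\n";   # b
print "is_hex:  $is_hex\n";   # 1
print "day:     $day\n";      # Tuesday
print "doubled: $doubled\n";  # is
```

The greedy and sparse patterns differ only by the added ?, yet they capture very different text from the same string.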

After the closing / of your regular expression, you can add one or more modifiers to change its behaviour.
Modifier    Description
  i         Ignore case in matching
  g         Global match. Return a list of all matches (list context) or return the next match (scalar context)
  x         White space in the pattern is ignored and # starts a comment (otherwise white space matches exactly)
  s         . matches everything including new line (otherwise it matches everything except new line)
  m         ^ and $ also match at embedded new lines
  o         Tell the compiler that the regular expression doesn't change even if it includes a variable reference
  e         s operator only. Evaluate the replacement as Perl code and substitute in the result
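A short sketch of most of these modifiers in use follows; the o modifier is left out, and the sample data and variable names are assumptions for illustration only.

```perl
use strict;
use warnings;

my $log = "Error: disk full\nerror: retry\nOK\n";

# g in list context returns every match; i ignores case.
my @errors = $log =~ /error/gi;

# m lets ^ match after each embedded new line, not just at the start.
my @first_words = $log =~ /^(\w+)/gm;

# Without s, . will not cross a new line, so this fails to match.
my ($one_line) = $log =~ /(disk.*retry)/;

# With s, . matches the new line too.
my ($spanning) = $log =~ /(disk.*retry)/s;

# x: white space in the pattern is ignored and # starts a comment.
my $date = "2020-03-31";
my ($year, $month, $day) = $date =~ /
    (\d{4})  -   # year
    (\d{2})  -   # month
    (\d{2})      # day
/x;

# e (s operator only): the replacement is run as Perl code.
my $sums = "cost 4 plus 8";
$sums =~ s/(\d+)/$1 * 2/ge;

print scalar(@errors), " errors, first words: @first_words\n";
print "date parts: $year $month $day\n";
print "sums: $sums\n";   # cost 8 plus 16
```

Modifiers can be combined, as in /gi and /ge above.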


The following Perl functions and operators use regular expressions
Function / Operator    Use
  (none)               If you write a regular expression without an operator, it is matched against the contents of the $_ variable
  =~                   Match the regular expression on the right against the variable on the left
  s                    Substitute the text matched by the regular expression with a replacement string
  grep                 Filter a list for all member scalars that match the regular expression
  split                Split a scalar into a list, dividing the elements at each match of the regular expression
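Here is a minimal sketch of grep, split, s and a bare match against $_ working together. The input lines and the %stock hash are hypothetical, invented for this example.

```perl
use strict;
use warnings;

my @lines = ("apple, 3", "# comment", "pear,12", "fig ,  7");

# A bare pattern is matched against $_, here each member of the list.
my $comments = 0;
for (@lines) {
    $comments++ if /^#/;
}

# grep keeps only the list members that match (those containing a digit).
my @data = grep { /\d/ } @lines;

# split divides $_ wherever the pattern matches: a comma plus any spaces.
my %stock;
for (@data) {
    my ($name, $count) = split /\s*,\s*/;
    $stock{$name} = $count;
}

# s replaces the matched text; with g it replaces every match.
my $title = "perl   regular    expressions";
$title =~ s/\s+/ /g;

print "comments: $comments, data lines: ", scalar(@data), "\n";
print "$_ => $stock{$_}\n" for sort keys %stock;
print "$title\n";   # perl regular expressions
```

Using a pattern rather than a fixed string in split means the irregular spacing around the commas does not end up in the results.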

The above lists show the most commonly used elements of Perl regular expressions, and are not exhaustive.

In Perl, you can change the / regular expression delimiter to almost any other special character if you precede it with the letter m (for match). If you change the opening delimiter to ( { or [, the closing delimiter becomes the balancing ) } or ].
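As a sketch of why you might change the delimiter, the three matches below extract the same host name. The URL path is an assumption made up for illustration, and the choice of ! and { } is just one option among many.

```perl
use strict;
use warnings;

my $url = "http://www.wellho.net/regex/perl.html";

# With / as the delimiter, every / in the pattern needs a backslash.
my ($host_a) = $url =~ /^http:\/\/([^\/]+)\//;

# m plus an opening bracket: the balancing } closes the pattern.
my ($host_b) = $url =~ m{^http://([^/]+)/};

# m plus a single character: the same character closes the pattern.
my ($host_c) = $url =~ m!^http://([^/]+)/!;

# The s operator accepts alternative delimiters in the same way.
(my $secure = $url) =~ s{^http:}{https:};

print "$host_a $host_b $host_c\n";
print "$secure\n";
```

A delimiter that does not appear in the pattern itself saves you from escaping it, which is usually the main reason to switch.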


Comment by Christian (published 2010-03-04)
it would be nice to give more good examples at the bottom, which special characters would cause the least problems (unwanted side effects), which you might want to use instead of /, I guess e.g. ! ~ = : < > ,
thanks [#3474]


© WELL HOUSE CONSULTANTS LTD., 2020: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01225 708225 • FAX: 01225 793803 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho