Guy walks up to me in the street and asks "Could you direct me to the Town Centre?". So I answer him "yes, I could", and walk on. Does he thank me? No - he probably thinks "what a useless half answer" or "how rude can you be", even though I completely and correctly answered his question.
It's sometimes a bit like that in programming too. If I, in my code, ask the question "Does what the user typed in look like it contains a postcode?", then a yes or no may be only half an answer. I may really want to know what the postcode is, and I may want to take significant parts of the postcode and do something more with them. For example, if my user enters
156 Broadway, Chadderton, Oldham, Lancashire, OL9 8AU, UK
I may want to know some or all of:
• Yes, it contains a postcode
• The postcode is OL9 8AU
• The postcode is in the OL area.
Regular Expressions allow me to check whether a string matches a pattern, and in most languages return a true / false (yes / no) type answer. But they're also capable of returning or storing ancillary results from the match so that the programmer isn't required to write loads of other follow-up code.
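As a minimal sketch of getting all three of those answers from a single match, here is how it might look in PHP, the language used for the examples in this article. The variable names ($address, $found, $parts) and the choice of capture group are my own assumptions for illustration, and the pattern only describes the general shape of a postcode rather than validating one.

```php
<?php
// Hypothetical input: the sample address quoted above.
$address = '156 Broadway, Chadderton, Oldham, Lancashire, OL9 8AU, UK';

// Round brackets capture the area letters into element 1 of $parts.
// Element 0 always holds the whole of the matched text.
$pattern = '/([A-Z]{1,2})[0-9][0-9A-Z]? +[0-9][A-Z]{2}/';

// preg_match returns 1 if the pattern matched, 0 if it did not.
$found = preg_match($pattern, $address, $parts);

if ($found === 1) {
    print "Yes, it contains a postcode\n";
    print "The postcode is $parts[0]\n";
    print "The postcode is in the $parts[1] area\n";
} else {
    print "No postcode found\n";
}
```

The yes / no answer comes from the return value, and the other two answers come out of the third parameter without any follow-up string handling.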
On Friday's Regular Expression Course we took a look at that, using examples in PHP as that was the most relevant language to the student group.
$result = preg_match('/[A-Z]{1,2}[0-9][0-9A-Z]{0,1} {1,}[0-9]{1}[A-Z]{2}/',$line);
print ("5. Result is $result\n");
will say "yes, that line contains a postcode" (setting $result to 1) if the line contains something in postcode format, or "no, that line does not contain a postcode" (setting $result to 0) if it does not. However, if I write:
$result = preg_match('/[A-Z]{1,2}[0-9][0-9A-Z]{0,1} {1,}[0-9]{1}[A-Z]{2}/',$line,$gotten);
print ("6. Result is $result. Side result $gotten[0]\n");
I'll be given the postcode back too, as the first member of a whole array of extra output data which I have chosen to call $gotten. I can take this a whole lot further - identifying multiple postcodes if that's what the incoming string contains, and also telling my program which are the interesting bits that I want to store in further elements of $gotten, by putting round brackets around those parts of the pattern. Thus:
$result = preg_match_all('/(([A-Z]{1,2})\d[0-9A-Z]?) +(\d[A-Z]{2})/',$line,$gotten);
print ("10. Result is $result. Side result "); spew2d($gotten) ;
With input string:
I live at SN12 6QL which is just up the road from here and in 2 weeks train near E3 4HC in London?
I got the following results from the code above:
5. Result is 1
6. Result is 1. Side result SN12 6QL
10. Result is 2. Side result
0/0: SN12 6QL 0/1: E3 4HC
1/0: SN12 1/1: E3
2/0: SN 2/1: E
3/0: 6QL 3/1: 4HC
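The numbers before each colon in that output are the two indexes into $gotten: the first is the capture group (0 being the whole match) and the second is which match it came from. The sketch below shows that layout using a hypothetical stand-in, spew2d_sketch, which I have written for illustration and which is not the spew2d function used above; it assumes only that the helper prints each element as "row/column: value".

```php
<?php
// Hypothetical stand-in for a 2D array dumper: formats each row of a
// two-dimensional array as "row/col: value" pairs, one row per line.
function spew2d_sketch(array $table): string
{
    $out = '';
    foreach ($table as $row => $cells) {
        $pairs = [];
        foreach ($cells as $col => $value) {
            $pairs[] = "$row/$col: $value";
        }
        $out .= implode(' ', $pairs) . "\n";
    }
    return $out;
}

$line = 'I live at SN12 6QL which is just up the road from here '
      . 'and in 2 weeks train near E3 4HC in London?';

// preg_match_all returns the number of complete matches found.
$count = preg_match_all(
    '/(([A-Z]{1,2})\d[0-9A-Z]?) +(\d[A-Z]{2})/',
    $line,
    $gotten
);

print "Result is $count. Side result\n";
print spew2d_sketch($gotten);
```

By default preg_match_all groups its results by capture group, which is why each row above holds the same part of both postcodes. Passing the PREG_SET_ORDER flag as a fourth parameter groups them by match instead, so that each row holds everything about one postcode.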
Full program - including the spew function -
(written 2012-06-30, updated 2012-07-14)