A guy walks up to me in the street and asks "Could you direct me to the Town Centre?" So I answer him "Yes, I could", and walk on. Does he thank me? No - he probably thinks "what a useless half answer" or "how rude can you be", even though I completely and correctly answered his question.
It's sometimes a bit like that in programming too. If, in my code, I ask the question "Does what the user typed in look like it contains a postcode?", then a yes or no may only be half an answer. I may really want to know what the postcode is, and I may want to take significant parts of the postcode and do something more with them. For example, if my user enters
156 Broadway, Chadderton, Oldham, Lancashire, OL9 8AU, UK
I may want to know some or all of:
• Yes, it contains a postcode
• The postcode is OL9 8AU
• The postcode is in the OL area.
Regular Expressions allow me to check whether a string matches a pattern, and in most languages return a true / false (yes / no) type answer. But they're also capable of returning or storing ancillary results from the match, so that the programmer isn't required to write loads of other follow-up code.
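As a minimal sketch of that idea in PHP, a single preg_match call with round brackets (capture groups) in the pattern can answer all three questions about the address above. The variable names here are my own illustration, and the pattern is a simplified postcode shape rather than a full validator.

```php
<?php
// Hypothetical variable names; simplified postcode pattern (not a full validator).
$line = '156 Broadway, Chadderton, Oldham, Lancashire, OL9 8AU, UK';

// Group 1: first half of the postcode (e.g. OL9)
// Group 2: area letters only (e.g. OL)
// Group 3: second half of the postcode (e.g. 8AU)
$pattern = '/(([A-Z]{1,2})[0-9][0-9A-Z]?) +([0-9][A-Z]{2})/';

$matches = [];
$found = preg_match($pattern, $line, $matches);

if ($found === 1) {
    // $matches[0] is the whole match; $matches[1..3] are the bracketed parts
    print "Contains a postcode: yes\n";
    print "The postcode is: {$matches[0]}\n";
    print "The area is: {$matches[2]}\n";
} else {
    print "Contains a postcode: no\n";
}
```

The yes / no answer is still there in the return value; the extra detail simply arrives in the array passed as the third parameter.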
On Friday's Regular Expression Course we took a look at that, using examples in PHP as that was the most relevant language to the student group.
So:
$result = preg_match('/[A-Z]{1,2}[0-9][0-9A-Z]{0,1} {1,}[0-9]{1}[A-Z]{2}/',$line);
print ("5. Result is $result\n");
will say "yes, that line contains a postcode" (setting $result to 1) if the line contains something in postcode format, or "no, that line does not contain a postcode" (setting $result to 0) if it does not. However, if I write:
$result = preg_match('/[A-Z]{1,2}[0-9][0-9A-Z]{0,1} {1,}[0-9]{1}[A-Z]{2}/',$line,$gotten);
print ("6. Result is $result. Side result $gotten[0]\n");
I'll be given the postcode back too, as the first member of a whole array of extra output data which I have chosen to call $gotten. I can take this a whole lot further - identifying multiple postcodes if that's what the incoming string contains (using preg_match_all), and also telling my program which are the interesting bits that I want to store in further elements of $gotten, by putting round brackets around them in the pattern. Thus:
$result = preg_match_all('/(([A-Z]{1,2})\d[0-9A-Z]?) +(\d[A-Z]{2})/',$line,$gotten);
print ("10. Result is $result. Side result "); spew2d($gotten) ;
With input string:
I live at SN12 6QL which is just up the road from here and in 2 weeks train near E3 4HC in London?
I got the following results from the code above:
5. Result is 1
6. Result is 1. Side result SN12 6QL
10. Result is 2. Side result
0/0: SN12 6QL 0/1: E3 4HC
1/0: SN12 1/1: E3
2/0: SN 2/1: E
3/0: 6QL 3/1: 4HC
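The row / column layout above is PHP's default for preg_match_all (PREG_PATTERN_ORDER): row 0 holds each whole match, and rows 1 to 3 hold what each pair of round brackets captured. Here is a small self-contained sketch that repeats the call on the same input. The show2d helper is my own hypothetical stand-in for the spew function (which lives in the full program and is not shown here), and the second call uses the PREG_SET_ORDER flag as an alternative layout that I'm adding for comparison, giving one row per postcode instead.

```php
<?php
$line = 'I live at SN12 6QL which is just up the road from here '
      . 'and in 2 weeks train near E3 4HC in London?';
$pattern = '/(([A-Z]{1,2})\d[0-9A-Z]?) +(\d[A-Z]{2})/';

// Hypothetical stand-in for the article's spew helper:
// prints each element of a two-dimensional array as "row/col: value".
function show2d(array $table): void
{
    foreach ($table as $r => $row) {
        foreach ($row as $c => $value) {
            print "$r/$c: $value ";
        }
        print "\n";
    }
}

// Default layout (PREG_PATTERN_ORDER): $byGroup[group number][match number]
$count = preg_match_all($pattern, $line, $byGroup);
print "Matches found: $count\n";
show2d($byGroup);

// Alternative layout (PREG_SET_ORDER): $byMatch[match number][group number]
preg_match_all($pattern, $line, $byMatch, PREG_SET_ORDER);
foreach ($byMatch as $m) {
    print "Postcode $m[0] is in area $m[2]\n";
}
```

Which layout is handier depends on what comes next: the default suits "give me all the areas" (that's just $byGroup[2]), while the set order suits looping through the postcodes one at a time.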
Full program - including the spew function - [here].
(written 2012-06-30, updated 2012-07-14)