Looking ahead and behind in a Regular Expression
Regular expressions in Perl and PHP include facilities called zero width assertions, among them zero width lookaheads and lookbehinds. Is this a case of jargon that looks almost calculated to confuse?
A zero width assertion is where a regular expression tests some sort of condition at the current position in the line, without actually consuming any characters from the incoming string. The three most common examples are ^ (must be at start of string), $ (must be at end of string) and \b (must be at word boundary).
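As a quick illustration of those three assertions, here is a minimal sketch. The sample line and the variable names are my own, made up for the demonstration; it simply counts how many times "cat" matches with and without each assertion, and then shows that an assertion on its own matches an empty string.

```php
<?php
// Hypothetical sample line: "cat" appears twice as a whole word
// and once buried inside "concatenate".
$line = "cat concatenate cat";

// No assertion: "cat" is found wherever it occurs.
$anywhere = preg_match_all('/cat/', $line);

// ^ only succeeds at the start of the string.
$atStart = preg_match_all('/^cat/', $line);

// $ only succeeds at the end of the string.
$atEnd = preg_match_all('/cat$/', $line);

// \b on each side only succeeds at word boundaries.
$wholeWords = preg_match_all('/\bcat\b/', $line);

// An assertion on its own matches the empty string: nothing is consumed.
preg_match('/\b/', $line, $m);
$consumed = strlen($m[0]);

printf("anywhere=%d start=%d end=%d words=%d consumed=%d\n",
    $anywhere, $atStart, $atEnd, $wholeWords, $consumed);
```

The counts differ only because of where each assertion allows a match to sit; in every case the text matched is still just "cat".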
There are times when you may wish to say "if followed by", "if not followed by", "if following" or "if not following" in a regular expression match, but without actually moving backward or forward over the incoming string. For example, in a spell checker I was writing yesterday, I was looking to split my incoming string at each word boundary, but only if NOT following or followed by a single quote. And, crucially, the single quote character was not to be included in the matched string itself; I was just saying "no break here" in the case of words like hasn't and I'll. This is a requirement for a zero width negative lookbehind, written (?<!'), and a zero width negative lookahead, written (?!').
Here's the complete regular expression of my example:
$elements = preg_split("/\b(?<!')(?!')/",$page);
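To see what the two negative assertions buy you, here is a small sketch that runs the same split with and without them. The sample text standing in for $page is made up, and the PREG_SPLIT_NO_EMPTY flag is my addition for the demonstration (it drops the empty strings that preg_split would otherwise return at the very start and end of the text); neither is part of the line above.

```php
<?php
// Hypothetical sample text standing in for $page.
$page = "I'll say hasn't";

// Split at every word boundary: the apostrophes break the words apart.
$plain = preg_split("/\b/", $page, -1, PREG_SPLIT_NO_EMPTY);

// Split at word boundaries that neither follow nor precede a single quote.
$guarded = preg_split("/\b(?<!')(?!')/", $page, -1, PREG_SPLIT_NO_EMPTY);

print_r($plain);    // I, ', ll, " ", say, " ", hasn, ', t
print_r($guarded);  // I'll, " ", say, " ", hasn't
```

The single quotes stay inside the words in the second result because the lookbehind and lookahead only veto a split position; they never take the quote out of the string.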
Footnote - Zero width positive lookaheads are written (?=xx) and zero width positive lookbehinds are written (?<=xx), where xx is the expression that you're looking forward or back to match. (written 2006-05-22 05:48:17)
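A minimal sketch of the positive forms, assuming a made-up sample string: the first pattern picks out digits only where they follow a dollar sign, the second only where they are followed by " items". In both cases the text being looked at is checked but is not part of the match.

```php
<?php
// Hypothetical sample text.
$text = 'cost $25 and 30 items';

// Positive lookbehind: digits only if they follow a dollar sign.
preg_match_all('/(?<=\$)\d+/', $text, $afterDollar);

// Positive lookahead: digits only if they are followed by " items".
preg_match_all('/\d+(?= items)/', $text, $beforeItems);

print_r($afterDollar[0]);  // 25 (the "$" is tested, not captured)
print_r($beforeItems[0]);  // 30 (the " items" is tested, not captured)
```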
This is a page archived from The Horse's Mouth at
http://www.wellho.net/horse/ -
the diary and writings of Graham Ellis.