If you're looking for part of a string that's repeated later in the same string, you can capture the first occurrence and then use a back reference (\1, \2 etc) to say "the same again". In Python, you can also name the group that you want to repeat - examples [here].
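As a quick illustration of both forms (this is a minimal sketch of my own, not the linked example, and the sample strings and variable names are made up), a numbered back reference can spot a doubled word, and a named group can insist that a closing tag matches its opening tag:

```python
import re

# Numbered back reference: \1 means "the same text that group 1 captured".
doubled = re.compile(r"\b(\w+) \1\b")
match = doubled.search("It was the the best of times")
repeated_word = match.group(1)

# Named group: (?P<tag>...) captures under a name, (?P=tag) refers back to it.
tagged = re.compile(r"<(?P<tag>\w+)>.*?</(?P=tag)>")
tag_match = tagged.search("some <b>bold</b> text")
tag_name = tag_match.group("tag")

print(repeated_word)   # the
print(tag_name)        # b
```

Note the raw strings (r"...") - without them, Python would treat \1 as an escape sequence in the string literal before the regular expression engine ever saw it.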
If you have a number of fields on a line, rather than trying to identify each field with a match, you'll often find it easier to match the separator - and you can do this in many languages with a function or method called split. In Python, there are two different split methods - one is a method on a string object and splits at an exact (literal) string, and the other is a method on a regular expression object, and that one splits at a pattern. Beware - the calling sequences differ between the two splits - they are not polymorphic. See example [here].
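To show the two calling sequences side by side, here's a small sketch (the sample line and variable names are my own, chosen only for illustration). With the string method, the separator is the parameter; with the regular expression method, the string to be split is the parameter:

```python
import re

line = "apple, banana;cherry ,  date"

# String method: called on the string, splits at a literal separator.
literal_parts = line.split(", ")

# Regular expression method: called on the compiled pattern,
# and the string to be split is passed in as the argument.
separator = re.compile(r"\s*[,;]\s*")
pattern_parts = separator.split(line)

print(literal_parts)   # ['apple', 'banana;cherry ', ' date']
print(pattern_parts)   # ['apple', 'banana', 'cherry', 'date']
```

The literal split only breaks where it finds exactly a comma followed by a space, so the semicolon and the stray spaces survive in the results; the pattern split soaks up the varying white space around either separator.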
Regular expressions can easily become very long and complex, with sections that repeat themselves ... so you should remember that if you find yourself repeating something, there has to be an easier way! In the case of regular expressions, you can often build up your regular expression as a string from a number of elements (which you can reuse), meaning that only the component elements actually appear in your source. If you want to see what I mean, there's a source code example [here].
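A minimal sketch of the idea, assuming a made-up timestamp format of my own (it isn't taken from the linked example): each small element is written once, and the full pattern is assembled from those elements.

```python
import re

# Small reusable elements - each one appears only once in the source.
two_digits = r"\d{2}"
four_digits = r"\d{4}"

# Larger pieces built from the elements.
date = four_digits + "-" + two_digits + "-" + two_digits
time = two_digits + ":" + two_digits

# The complete pattern, built from the pieces.
timestamp = re.compile("^(" + date + ")[ T](" + time + ")$")

found = timestamp.match("2010-12-17 09:30")
print(found.groups())   # ('2010-12-17', '09:30')
```

If the definition of a two-digit field ever needs to change, there's just one place to change it.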
In a regular expression, you match from left to right, and each time you specify an individual character or a character from a group, you move on through both the regular expression and the incoming string. Occasionally - VERY occasionally - you want to say "is this followed by" but NOT move on, giving you the opportunity to match the same part of the incoming string against two different patterns, and continue on only if it matches both of them. That is positive lookahead. You may also want to do the same thing but continue on only if the upcoming text fails to match a pattern - this is known as negative lookahead and turns out to be more useful than positive lookahead. I've added a source code example onto our site for negative lookahead - it's [here] - where we're looking for town names that end in "ing?on", but we're using negative lookahead to exclude specifically "ington".
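Here's a rough sketch along the same lines as the town name search. The list of towns, the patterns and the variable names are my own assumptions for illustration, not the code from the linked example. In Python, (?!...) is negative lookahead and (?=...) is positive lookahead, and neither of them uses up any of the incoming string:

```python
import re

# Hypothetical sample data.
towns = ["Abingdon", "Islington", "Faringdon", "Paddington", "Swindon", "Huntingdon"]

# "ing", then any lower case letter EXCEPT t, then "on" at the end.
# (?!t) checks the next character without moving on past it.
not_ington = re.compile(r"ing(?!t)[a-z]on$")
selected = [town for town in towns if not_ington.search(town)]
print(selected)   # ['Abingdon', 'Faringdon', 'Huntingdon']

# Positive lookahead: the same text is tested against two patterns -
# it must contain a digit AND be made up of 6 or more word characters.
has_digit_and_length = re.compile(r"^(?=.*\d)\w{6,}$")
accepted = [s for s in ["abc123", "abcdef", "a1"] if has_digit_and_length.match(s)]
print(accepted)   # ['abc123']
```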
As well as lookahead, many regular expression handlers offer lookbehind, and I've added a negative lookbehind example [here]. Again - you'll only find occasional good uses for lookbehind.
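As a small sketch of negative lookbehind (again with a made-up list of towns and my own variable names, so not the linked example), this picks out names ending in "ton" unless the "ton" comes straight after "ing". In Python's re module, (?<!...) is negative lookbehind and (?<=...) is positive lookbehind, and the pattern inside a lookbehind must match text of a fixed length:

```python
import re

# Hypothetical sample data.
towns = ["Brighton", "Islington", "Taunton", "Paddington", "Swindon"]

# "ton" at the end of the name, but only if the three characters
# immediately before it are NOT "ing".
ton_not_ington = re.compile(r"(?<!ing)ton$")
selected = [town for town in towns if ton_not_ington.search(town)]
print(selected)   # ['Brighton', 'Taunton']
```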
We provide some coverage of Python regular expressions on our regular public Python courses. More advanced / specialized topics such as lookahead are covered on our Regular Expressions day. Note that if you're on one of our main Python courses and would like an introduction to some of the more advanced features, I can easily be persuaded to take you through some of them after the course finishes one day, so that you don't need to come back for the "Regex Special" ...
(written 2010-12-17)
Associated topics are indexed as below, or enter http://melksh.am/nnnn for individual articles
Y115 - Additional Python Facilities
[183] The elegance of Python - (2005-01-19)
[208] Examples - Gadfly, NI Number, and Tcl to C interface - (2005-02-10)
[239] What and why for the epoch - (2005-03-08)
[463] Splitting the difference - (2005-10-13)
[663] Python to MySQL - (2006-03-31)
[672] Keeping your regular expressions simple - (2006-04-05)
[753] Python 3000 - the next generation - (2006-06-09)
[901] Python - listing out the contents of all variables - (2006-10-21)
[1043] Sending an email from Python - (2007-01-18)
[1136] Buffering output - why it is done and issues raised in Tcl, Perl, Python and PHP - (2007-04-06)
[1149] Turning objects into something you can store - Pickling (Python) - (2007-04-15)
[1305] Regular expressions made easy - building from components - (2007-08-16)
[1336] Ignore case in Regular Expression - (2007-09-08)
[1337] A series of tyre damages - (2007-09-08)
[1876] Python Regular Expressions - (2008-11-08)
[2407] Testing code in Python - doctest, unittest and others - (2009-09-16)
[2435] Serialization - storing and reloading objects - (2009-10-04)
[2462] Python - how it saves on compile time - (2009-10-20)
[2655] Python - what is going on around me? - (2010-02-28)
[2721] Regular Expressions in Python - (2010-04-14)
[2745] Connecting Python to sqlite and MySQL databases - (2010-04-28)
[2746] Model - View - Controller demo, Sqlite - Python 3 - Qt4 - (2010-04-29)
[2764] Python decorators - your own, staticmethod and classmethod - (2010-05-14)
[2765] Running operating system commands from your Python program - (2010-05-14)
[2786] Factory methods and SqLite in use in a Python teaching example - (2010-05-29)
[2790] Joining a MySQL table from within a Python program - (2010-06-02)
[3442] A demonstration of how many Python facilities work together - (2011-09-16)
[3469] Teaching dilemma - old tricks and techniques, or recent enhancements? - (2011-10-08)
[4085] JSON from Python - first principles, easy example - (2013-05-13)
[4211] Handling JSON in Python (and a csv, marshall and pickle comparison) - (2013-11-16)
[4298] Python - an interesting application - (2014-09-18)
[4439] Json is the new marshall, pickle and cPickle / Python - (2015-02-22)
[4451] Running an operating system command from your Python program - the new way with the subprocess module - (2015-03-06)
[4536] Json load from URL, recursive display, Python 3.4 - (2015-10-14)
[4593] Command line parameter handling in Python via the argparse module - (2015-12-08)
[4709] Some gems from Intermediate Python - (2016-10-30)
Q805 - Object Orientation and General technical topics - Advanced Regular Expression Components
[728] Looking ahead and behind in a Regular Expression - (2006-05-22)
[2909] Be gentle rather than macho ... regular expression techniques - (2010-08-08)
[3100] Looking ahead and behind in Regular Expressions - double matching - (2010-12-23)
[3790] Solution looking for a problem? Lookahead and Lookbehind - (2012-06-30)
Q803 - Object Orientation and General technical topics - Regular Expressions - Extra Elements
[943] Matching within multiline strings, and ignoring case in regular expressions - (2006-11-25)
[1372] A taster PHP expression ... - (2007-09-30)
[1601] Replacing the last comma with an and - (2008-04-04)
[1613] Regular expression for 6 digits OR 25 digits - (2008-04-16)
[1735] Finding words and work boundaries (MySQL, Perl, PHP) - (2008-08-03)
[1860] Seven new intermediate Perl examples - (2008-10-30)
[3516] Regular Expression modifiers in PHP - summary table - (2011-11-12)
[3650] Possessive Regular Expression Matching - Perl, Objective C and some other languages - (2012-03-12)