Exercises, examples and other material relating to training module P212. This topic is presented on public courses
a.k.a. "Regular Expressions, part 2". How to match a string to a pattern, how to extract information from a match, how to repeat matches, how to substitute one string of text for another, and how to translate on a character by character basis.
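As a minimal sketch of the operations named above (match, capture, repeated match, substitute and translate), the following short script works on a made-up sample line. The sample text and variable names are illustrative assumptions and are not taken from the course examples.

```perl
use strict;
use warnings;

# Hypothetical sample data, invented for this sketch
my $line = "Order 42 shipped on 2015-03-09 to Melksham";

# Match and capture: round brackets save parts of the match
my ($year, $month, $day);
if ($line =~ /(\d{4})-(\d\d)-(\d\d)/) {
    ($year, $month, $day) = ($1, $2, $3);
}

# Repeated match: the g modifier in list context returns every match
my @numbers = $line =~ /\d+/g;

# Substitute one string of text for another, working on a copy
(my $changed = $line) =~ s/shipped/sent/;

# Translate character by character, again on a copy
(my $upper = $line) =~ tr/a-z/A-Z/;

print "$year/$month/$day\n";
print "@numbers\n";
print "$changed\n";
print "$upper\n";
```

Copying the string before applying s or tr, as in `(my $changed = $line) =~ ...`, leaves the original untouched; both operators otherwise change their target in place.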
Articles and tips on this subject | updated |
4452 | Binary data handling - Python and Perl Handling binary data has become a somewhat rarer requirement over time, but that doesn't mean the need has gone away - and on last week's Python course, my delegates had a requirement to read data in this format.
First and foremost with binary data, you need to understand what you're looking to read ... | 2015-03-09 |
3927 | First match or all matches? Perl Regular Expressions If you match a string to a regular expression, there are often lots of ways it can match. And if you're just saying "does this match", that's fair enough ... but if you're wanting to extract the matched data, you need to give it more thought.
A Perl match to /-?\d+\.\d+/ in an if statement will give ... | 2012-11-24 |
3707 | Converting codons via Amino Acids to Proteins in Perl DNA is the code of life - a double helix, comprising just four different basic codon elements:
• Adenine (A)
• Thymine (T)
• Guanine (G)
• Cytosine (C)
and a huge amount of work has gone into analysing these for the genes right across ... | 2012-04-28 |
3650 | Possessive Regular Expression Matching - Perl, Objective C and some other languages "I'm looking to spend between £200,000 and £225,000 on a new home" you say to the salesman and - guess what - you're offered something much nearer £225,000 than £200,000.
With Regular Expression matching, you can ask the question "do we have a match", and that returns a Yes / ... | 2012-03-12 |
3630 | Serialising and unserialising data for storage and transfer in Perl If you want to save a series of strings to a file, or pass them over a network connection, you'll need to delimit them - add in a special character so that the receiving / reading program will know where one piece of data ends and the next starts. The problem comes if the chosen special character may ... | 2012-03-10 |
3546 | The difference between dot (a.k.a. full stop, period) and comma in Perl If (Perl) I write
$x = "12";
$y = "25";
print $x,$y;
print $x.$y;
print "\n";
Then I'll get output
12251225
In other words - the output is the same. So is there a difference?
Yes - there's a huge difference.
$x.$y - using the ... | 2011-12-17 |
3411 | Single and double quotes strings in Perl - what is the difference? In Perl, there's usually more than one way of doing it ...
If you're writing a string of text into your program, your first possibility is to use single quotes - in which case you're writing a literal string with everything between the single quote characters included exactly in the string. And your ... | 2011-08-30 |
3332 | DNA to Amino Acid - a sample Perl script A really rewarding course this week - Perl programming, for a dozen bright delegates in the bioinformatics field - the people who have defined the human code as billions of C A T and Gs and are then fuzzy matching against that human code to help in medical research. I hope I am forgiven for that simplistic ... | 2011-06-24 |
3322 | How much has Perl (and other languages) changed? How much has Perl code moved forward over the years? A lot, and not a lot. To some extent, programming languages are the eye of the storm of technology - and that's because people who invest in code want it to be good for many years, so the same tools are used for several generations of products, ... | 2011-06-10 |
3100 | Looking ahead and behind in Regular Expressions - double matching Look-ahead and look-behind are a way of "double matching" in a regular expression. If you're at a certain point in the match and you think "the next bit should conform to xxx and at the same time it should conform to yyy" then you can describe xxx via a look-ahead, and follow that with matching yyy ... | 2010-12-24 |
3059 | Object Orientation in an hour and other Perl Lectures I enjoy the occasional course that's different in its design and specification, and yesterday was one of those - more lectures than training, on intermediate and advanced Perl, for a group of eight delegates who were all well experienced at PHP, but Perl "dabblers" to this point. During the day, we ... | 2010-12-04 |
2993 | Arrays v Lists - what is the difference, why use one or the other If you want a program to run quickly through a data set (that's the sort of thing you'll be doing in heavy scientific work), you'll want the data loaded into successive memory locations - but that means that you have to know how much space to allocate before you set the data up. Otherwise, you'll find ... | 2010-10-10 |
2874 | Unpacking a Perl string into a list In Perl, you can extract data from a string in a lot of different ways. You can split the string if you want to use a uniform separator, you can use a regular expression if you want to grab out bits that match a pattern, and you can use substr to extract data based on specific character positions.
Which ... | 2010-07-31 |
2877 | Further more advanced Perl examples I've uploaded a further batch of new examples (that makes around 40 in total!) from the private Perl course that I ran from Wednesday through Saturday last week - many of them adding a new twist on to previous examples. If you read a comment below and think "that's what I'm looking for an example of", ... | 2010-07-30 (longer) |
2834 | Teaching examples in Perl - third and final part Three part article ... this is part 3. Jump back to part [1] [2]
Following on from two earlier posts, here is the final third of the new examples that I wrote during last week's Perl course, and to which I have added extra documentation over the last couple of days.
P212 More on Character Strings
"Does ... | 2010-06-27 (longer) |
2657 | Want to do a big batch edit? Nothing beats Perl! I still love Perl ...
Wanting to convert a file of lines like this:
<img src=rp_153_track.jpg><br><br>
into lines like this:
rp_153_track.jpg <img src=rp_153_track.jpg><br><br>
The code is as simple as:
/=(.*?)>/;
print "$1 $_";
And ... | 2010-06-23 |
2801 | Binary data handling with unpack in Perl During today's Perl course, I was asked to provide an example of the unpack function for extracting multiple values from a piece of data - typically binary data extracted from a file into a scalar variable.
Dorothy-2:de$ perl imgsize sd*.gif
sd1.gif is GIF, 89a, 450 by 250
sd2.gif is GIF, 89a, 250 by ... | 2010-06-10 |
586 | Perl Regular Expressions - finding the position and length of the match If you want to find the position of a match in an incoming string, simply check the length of $` (That's $PREMATCH if you've chosen to use English;) to check where it starts, and add the length of $& (that's $MATCH) to find where it ends.
Let's say I want to find all the URLs referred to in a web ... | 2009-11-29 |
2379 | Making variables persistent, pretending a database is a variable and other Perl tricks Have a look at this Perl program:
use fyle;
tie $counter,"fyle";
$counter = $counter + 1;
print ("This is access no. $counter\n");
Apart from the rather curious module loaded at the top, this seems to take an undefined variable, set it to one, and print it out. What a - err - pointless (!) ... | 2009-08-28 |
2230 | Running a piece of code is like drinking a pint of beer Q: What is the effect when I drink a pint of beer?
A: I get slightly tipsy.
But that's too simplistic!
A: The brewery has some more money
A: There's a glass to wash up
A: I need the loo!
Running a piece of code is like drinking a pint of beer - as well as a headline result, you get extra variables ... | 2009-06-12 |
928 | C++ and Perl - why did they do it THAT way? "Why did [they] do it THAT way?". It's a question often asked by the brighter and more perceptive delegates on courses concerning some features of a language that I'm teaching them. And the answer "because they did" is a poor one. It's like saying to a child "because I said so" rather than looking ... | 2009-01-01 |
1947 | Perl substitute - the e modifier Here's a graphic illustration of the use of the "e" for "execute" modifier used on the end of substitute operation in Perl.
The "s" for substitute allows you to replace a matched pattern with a STRING in which you can use special references like \1 or $1 for the first matched substring. If you want ... | 2008-12-16 |
737 | Coloured text in a terminal from Perl If you're looking to do something in Perl and the back of your mind tells you that, surely, someone's done this before then there are two things to note:
• Someone probably HAS and
• It's probably available on the CPAN or as a built in module.
Thus when I was asked the question "How do I get ... | 2008-12-09 |
1735 | Finding words and word boundaries (MySQL, Perl, PHP) If you're searching for the word "mile", you probably don't want the page that tells you that Sally Smiled at Harry. But you may want to find a Milestone, even if it is within quotes.
Regular Expressions are your friends!
In Perl style regular expressions (which also work in Python, and in PHP with ... | 2008-08-03 |
1727 | Equality and looks like tests - Perl Whenever you do an equality check in a Perl program, you must think whether you're checking if two numbers are equal, if two text strings are equal, or if a string looks like a pattern. And you write different code in each case:
Checking numbers: if ($stuff == 6) { ...
Tests whether $stuff contains ... | 2008-07-29 |
583 | Remember to process blank lines I've got a Perl program that processes a data file 200 lines long.
15 of the lines are comments that start with a # (I test for those using ($line =~ /^#/), and 181 of the lines contain real data - in other words they start with a character that's not a # - my regular expression match reads ($line =~ ... | 2008-06-11 |
1510 | Handling Binary data (.gif file example) in Perl Perl is very good for handling binary data - it can do things you can't do with other utilities and scripting languages, and things that are very much harder to do in C - that's because C's strings are null terminated and in the case of binary strings, there may be an embedded null anywhere.
Finding ... | 2008-01-19 |
1336 | Ignore case in Regular Expression Do you want to ignore case in a regular expression? There are a variety of ways of doing it ... depending on the language you're writing. Here are some hints:
/abcd/i Perl - an i after the regular expression
eregi PHP - use eregi rather than ereg
re.I or re.IGNORECASE Python - extra parameters ... | 2007-09-07 |
1305 | Regular expressions made easy - building from components There seems to be a certain macho desire in many programmers' minds to write a single complicated regular expression to match against an input line, ignoring the structured approach that everyone accepts quite cheerfully in almost every other case. Have a look at this Python line:
wholeline = r"\d\d-...-\d\d\d\d\s+(\d\d):(\d\d):(\d\d.\d\d),\s+(-?\d+\.\d+),\s+(-?\d+\.\d+),(-?\d+\.\d+),\s+(-?\d+\.\d+),(-?\d+\.\d+),\s+(-?\d+\.\d+)"
Impressive, ... | 2007-08-16 |
1251 | Substitute operator / modifiers in Perl Perl's substitute operator lets you replace a Regular Expression with another string within a target string. For example
$hello = "Grating";
$hello =~ s/a/ee/;
print "$hello\n";
Will turn Grating into Greeting within the $hello variable. You'll note that you can use almost any special character in ... | 2007-07-06 |
1230 | Commenting a Perl Regular Expression The x modifier on the end of a Perl regular expression causes all spaces in the regular expression to be treated as comments (rather than matching exactly). This means that you can lay out your regular expressions much more cleanly.
And wherever you're allowed white space, you can add comments from ... | 2007-06-13 |
1222 | Perl, the substitute operator s In Perl, the s (or substitute) operator allows you to match a regular expression and replace the part of your incoming string that matched with another string. Your incoming string should be specified to the left of an =~ operator and is changed in situ. For example:
$sample = "The cat sat on the ... | 2007-06-08 |
943 | Matching within multiline strings, and ignoring case in regular expressions Regular Expressions are powerful matching tools and you can specify almost anything within them. But there are certain facilities that are naturally applied to the regular expression as a whole rather than to parts of the match, and these are specified in a different way in each language / implementation.
For ... | 2006-11-26 |
453 | Commenting Perl regular expressions Do you sometimes find Perl regular expressions hard to follow? If you do, remember that you can use the "x" modifier which allows you to space them out; with the "x" modifier, white spaces in the regular expression are ignored.
You can go further; once you've specified the "x" modifier, you can ... | 2006-06-05 |
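The "x" modifier described in the two commenting entries above can be sketched as follows. This is a hedged illustration, not the code from either article: it lays out the decimal number pattern /-?\d+\.\d+/ quoted earlier in this listing, with brackets added so each part is captured, and the sample string is invented.

```perl
use strict;
use warnings;

# Hypothetical input, invented for this sketch
my $reading = "temp = -12.75 C";

# With the x modifier, white space in the pattern is ignored and
# text from a hash to the end of the line is treated as a comment
my $float = qr/
    ( -? )      # optional minus sign, first capture group
    ( \d+ )     # digits before the decimal point, second group
    \.          # a literal decimal point
    ( \d+ )     # digits after the decimal point, third group
/x;

my ($sign, $whole, $fraction);
if ($reading =~ $float) {
    ($sign, $whole, $fraction) = ($1, $2, $3);
    print "sign='$sign' whole=$whole fraction=$fraction\n";
}
```

One caution when writing such comments: keep the delimiter character and variable sigils out of them, because Perl looks for the closing delimiter and interpolates variables before it ever treats the text as a comment.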
3to3 | translate a DNA 3-character codon to an amino acid |
bincopi | Read and analyse binary .gif files |
bindemo | Printing out and reading in binary numbers |
cats2 | Sample answer 2 |
catshow | Sample answer 1 |
cstr | Defining Strings |
emma_hunter | Match and Capture - email address |
emre | KISS - keep it simple - regular expression |
favex.pl | Postcode, Zipcode, credit card no. etc - regular expression matches |
filler | Using Regular Expressions to "mailmerge" |
getlinks | Find all href links from a page |
glomatch | Use of "g" modifier |
greedyvglobal.pl | Greedy matches v Global matches |
holiday | Packing and unpacking binary data |
html1 | Matching HTML - a greedy match doesn't work |
html2 | Sparse matching, looking for an HTML tag |
html3 | Global matching in a scalar context |
html4 | Global matching in a list context |
itsperl | Serialise and unserialise strings |
letters | Look for word starting and ending with same letter |
murl | Regular expression with comments |
n2 | Capturing groups into $1 and $2 |
n3 | Capturing groups into a list |
n4 | Special variables $' $& and $` |
n5 | $PREMATCH, $MATCH and $POSTMATCH |
name | Match and substitute (long winded way!) |
name2 | Match and substitute - (example that fails) |
name3 | Match and substitute executed block |
names | Regular expression match - revision |
newsub | Examples of the =~ s for substitute operator |
ogado | Anagrams of First Great Western served stations |
packet | pack and unpack |
pcrd | Modifiers in matching |
pcv1 | Postcode extractor - mark 1 |
pcv2 | Postcode extractor Mk2 - save into named variables |
pcv3 | Postcode, Mk3 - extract multiple postcodes |
phone | Substitution using back reference |
pusher | Single v global match and alternatives |
pwline | Character by character translation with tr |
pwline2 | using tr to change multiple characters; also c and s switches |
reg | Stepping through regular expressions |
regextra | Splitting up a URL via a regex - sample exercise answer |
rogues | Using tr to find invalid characters in a string |
sedm | Substitute operator |
slurpex | matching lines - whole file at a time |
sting | different ways of defining a string |
stuff | storing a compiled regular expression - qr |
tophat | Hashes, Regular Expressions, Topicalisation ... end-of-course example |
totext | Converting < > and & to web standard sequences |
trandy | tr (or y) and its modifiers |
yem | Perl regular expression - information returned |
ystwyth.pl | binary data handling - examine a .gif file |
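Several of the examples listed above (pwline, pwline2, rogues and trandy) deal with tr and its modifiers. The sketch below is not one of those files; it is a small illustration, on an invented string, of counting with tr and of the c, s and d switches.

```perl
use strict;
use warnings;

# Hypothetical sample text, invented for this sketch
my $text = "Hello,   World!!";

# tr returns the number of characters it matched. With an empty
# replacement list and no d switch the string is left unchanged,
# so this simply counts the lower case letters.
my $lower = ($text =~ tr/a-z//);

# c complements the search list: count every character that is
# NOT a letter, a digit or a space
my $rogues = ($text =~ tr/a-zA-Z0-9 //c);

# s squeezes each run of the same translated character down to one
(my $squeezed = $text) =~ tr/ !//s;

# c and d together delete everything that is not a letter
(my $letters = $text) =~ tr/a-zA-Z//cd;

print "$lower lower case letters, $rogues rogue characters\n";
print "$squeezed\n";
print "$letters\n";
```

Because tr works on single characters rather than patterns, it is usually a better fit than a regular expression when all you need is a count or a one-for-one replacement.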
Summary to date.
Extracting information from a match.
$1, $2, etc.
Assign to a list.
$`, $& and $'.
More about regular expressions.
What else can I put in regular expressions?.
More brackets.
Match modifiers.
Global v Greedy.
Alternative delimiters.
Some favourite regular expressions.
To match an email address.
To match a UK Postcode.
To match an American Zip code.
To match a date (UK Style).
To match a time.
To match a complete URL for a web page.
To match a Visa number.
To match a Mastercard number.
To match a UK Phone number.
To match a UK car registration plate.
To match a UK national insurance number.
To match a book's ISBN number.
Substitutions.
Substitute and execute.
Regular expression efficiency.
tr.
Handling binary text.
Summary.
If you are looking for a complete course and not just information on a single subject, visit our Listing and schedule page.
Well House Consultants specialise in training courses in Ruby, Lua, Python, Perl, PHP, and MySQL. We run Private Courses throughout the UK (and beyond for longer courses), and Public Courses at our training centre in Melksham, Wiltshire, England.
It's surprisingly cost effective to come on our public courses - even if you live in a different country or continent to us.
We have a technical library of over 700 books on the subjects on which we teach.
These books are available for reference at our training centre.