For 2023 (and 2024 ...) - we are now fully retired from IT training.
We have made many, many friends over 25 years of teaching about Python, Tcl, Perl, PHP, Lua, Java, C and C++ - and MySQL, Linux and Solaris/SunOS too. Our training notes are now very much out of date, but due to upward compatibility most of our examples remain operational and even relevant, and you are welcome to make use of them "as seen" and at your own risk.

Lisa and I (Graham) now live in what was our training centre in Melksham - happy to meet with former delegates here - but do check ahead before coming round. We are far from inactive - rather, enjoying the times that we are retired but still healthy enough in mind and body to be active!

I am also active in many other areas and still look after a lot of web sites - you can find an index ((here))
Learning about Regular Expressions in C through examples

Although we more usually teach Regular Expressions on courses on Perl, Python, PHP, Ruby, etc ... there is also a library you can call from C, declared in regex.h, which uses the POSIX flavour of regular expressions, and I've put a short example together to "show you how".

Firstly - what is a regular expression?

It's a "pattern match" - you use it to ask "does this look like that?" - not checking for equality, but rather checking to see if something conforms to a pattern. The regular expression is how you define that pattern in full.

So - for example - you could write a regular expression like
  ^[0-9]{5}$
which means:
• starts with
• a digit
• (five of those)
• and ends
and would let you match the format for an American Zip code.

To make the POSIX regular expression functions available in your C program, you include regex.h:
  #include <regex.h>

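You then need to "compile" the regular expression (here reginald is the pattern held as a string, and emma is a regex_t variable that receives the compiled form):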
  regcomp(&emma,reginald,REG_EXTENDED|REG_NOSUB);

and you can see if another string (millie in this case) matches it:
  status = regexec(&emma,millie,(size_t)0,NULL,0);

The returned status is "0" for "yes that matched" and the constant REG_NOMATCH (which shows up as "1" in the sample output below) for "no, that did not match".

There's a complete sample program (showing the context of all the various variables in the lines above) - [here]. It reads a regular expression that the user types in (not usually a good idea, as most users don't understand regular expressions!) then it reads a whole series of further lines and tells you whether each one matches or not. Here's some sample output:

Please give test expression: ^[0-9]{5}$
Validity of regex (0 => OK): 0
Please give test string: 77663
Matched (0 => Yes): 0
Please give test string: 987662
Matched (0 => Yes): 1
Please give test string: 55332
Matched (0 => Yes): 0
Please give test string:
wizzard:c graham$


As you can see, it's very useful indeed - and rather clever.

In the case of a USA zip code, I could simply convert the string with the atof function once regexec had reported a match, but it would be rather trickier with a UK postcode ... or indeed if the zip code was embedded within a full line of text. So in many circumstances you want not only to ask "did it match?", but also to ask "which part of the incoming string matched which part of the regular expression?". If you drop REG_NOSUB from the regcomp call and pass regexec a count and an array of regmatch_t structures, it fills in that array so that you can get at this information.

There's a complete source code example of this "match and capture" - [here] - and it's got further comments in it to help you follow how it works. Running this program on a UK postcode, our output included the following:

wizzard:c graham$ ./reg2
Please give test expression: ([A-Z]{1,2})[0-9][0-9A-Z]? +[0-9][A-Z]{2}
Validity of regex (0 => OK): 0
Please give test string: We are at SN12 7NY for this course
Matched (0 => Yes): 0
From 10 to 18 (SN12 7NY)
From 10 to 12 (SN)
Please give test string:


If you're looking for further information about the elements within regular expressions, have a look at our regular expression pages - [here]. C uses the "POSIX Style" which is similar to what you'll find in Tcl and Expect ... and remember that we cover regular expressions on a special regular expression course that we run from time to time, as well as where appropriate on language courses. This week - thus the blog - it's during a C and C++ course.
(written 2010-06-30)

 
Associated topics are indexed as below, or enter http://melksh.am/nnnn for individual articles
Q801 - Object Orientation and General technical topics - What are Regular Expressions?
  [1195] Regular Express Primer - (2007-05-20)
  [2563] Efficient debugging of regular expressions - (2010-01-04)
  [4505] Regular Expressions for the petrified - in Ruby - (2015-06-03)
  [4763] Regex Reference sheet - (2017-10-10)

C206 - C and C based languages - Character Strings
  [1338] Handling Binary data in Tcl (with a note on C) - (2007-09-09)
  [2843] String functions in C - (2010-06-30)
  [3122] When is a program complete? - (2011-01-06)
  [3144] Setting up arrays in C - fixed size at compile time, or dynamic - (2011-01-24)
  [3146] Strings in C - (2011-01-25)
  [3593] Chars, char arrays and strings in C. Some early cautions and pitfalls. - (2012-01-26)
  [3718] Splitting a record into individual data values in C - (2012-05-04)
  [4340] Simple C structs - building up to full, dynamic example - (2014-12-03)
  [4556] Strings in C - strncmp strncpy and friends - (2015-10-27)
  [4633] String handling in C - new examples of extracting integers from a string - (2016-01-27)




This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2024: 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/2844_Lea ... mples.html • PAGE BUILT: Sun Oct 11 16:07:41 2020 • BUILD SYSTEM: JelliaJamb