regex matching with regexp
Posted by redbanditos1999 (redbanditos1999), 16 October 2002
Hi people!
I need help with the following problem:
How can I match \%\w in strings like
hallo %hallo = 12U;
12 %12 = 0;
but not in strings like
printf("Today is the %d.%d.1998\n", pstruToday->sDay, pstruToday->sMonth);
printf("%d", xxx_ts_ExternCharArrayWrongType );
printf ("%d >> 2 = %d\n", sSignedShift, sResult);
I need the \%\w to be relevant only outside of printf(...) sequences.
Many thanks for quick help !!!
Posted by admin (Graham Ellis), 16 October 2002
OK - you're looking for a percent sign followed by any word character, using the Tcl language, in what looks like a C++ source file, but wanting to omit any matches that are within the format string of a printf?
Think I would do it in three simple steps not one complicated one.
a) Find all %-wordchar sequences within printf statements and temporarily replace them with something else
b) Do whatever you want with the remaining %-wordchar sequences (you don't tell us in your question what you want to do once you've found them)
c) Replace the sequences you found at the beginning within the printf by the original sequence.
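The three steps above can be sketched in Tcl roughly as follows. This is only an illustration, not code from the thread: the proc name percentWordsOutsidePrintf and the placeholder character \u0001 are assumptions, and the sketch expects at most one printf call per line, with everything up to the last closing parenthesis on the line treated as part of that call.

```tcl
# Hypothetical helper: returns a two-element list holding the %-wordchar
# sequences found outside printf(...), and the line with everything restored.
proc percentWordsOutsidePrintf {line} {
    set mark "\u0001"   ;# assumed never to occur in the source being checked

    # a) Temporarily hide every % inside a printf(...) call.
    set masked $line
    if {[regexp -indices {printf\s*\(.*\)} $line span]} {
        set first [lindex $span 0]
        set last  [lindex $span 1]
        set call  [string range $line $first $last]
        set masked [string replace $line $first $last \
                        [string map [list % $mark] $call]]
    }

    # b) Work on the remaining %-wordchar sequences; here they are just collected.
    set found [regexp -all -inline {%\w+} $masked]

    # c) Put the original % characters back.
    set restored [string map [list $mark %] $masked]
    return [list $found $restored]
}

puts [lindex [percentWordsOutsidePrintf {12 %12 = 0;}] 0]       ;# %12
puts [lindex [percentWordsOutsidePrintf {printf("%d", xxx);}] 0] ;# (empty line)
```

Step b) is where the real work would go; collecting the matches is just a stand-in.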
Posted by redbanditos1999 (redbanditos1999), 17 October 2002
Thx for the answer Graham!
You're right - I never told you what I want to do with them.
I'm working on a syntax checker for C (Logiscope Tau Studio), where you can define your own rules for checking source files. The rules must be written in Tcl, mixed with special methods based on a C-language data model.
In the rule I'm writing now I need to match all binary operators that don't have blanks before and after them. So I now have 2 regexp calls which define regular expressions to find my operators. All I still need is the regular expression that finds "%blahblah" or "%1234" in my source files, except where "%" is used in a printf (like "%o" "%d" "%c" "%s" "%x").
I've tried various expressions but couldn't get any of them to work.
I hope you're better informed now.
Posted by admin (Graham Ellis), 18 October 2002
Let me express a worry before I get too far into an answer. I'm always concerned at using a regular expression to analyse a programming language; programming languages are structured to be analysed by a compiler or interpreter that tokenises the source - splitting it at each word boundary, and handling it token by token. Although regular expressions are a fabulous capability, when analysing a language (as in your request) they can get things a bit wrong. That may not be a concern, but you're going to have to be very careful if you want to correctly analyse code like:
I think that's a valid piece of code - note the hard-to-analyse embedded comments and protected double quotes, the statement spread over a number of lines, and the %dunits which is really a printf %d followed by the literal text units.
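To illustrate just the last of those points, here is a small Tcl check on a hypothetical one-line stand-in (not the snippet referred to above). A plain %\w+ pattern has no idea where the printf directive ends, so it reports %dunits as a single match:

```tcl
# Hypothetical C statement, held as a literal Tcl string.
set code {printf("%dunits \"%s\"\n", n, s);}

# A naive search for %-wordchar sequences.
set matches [regexp -all -inline {%\w+} $code]
puts $matches   ;# %dunits %s
```

The pattern cannot tell that only %d is the format directive and that "units" is literal text.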
OK - end of my caution. Perhaps your C++ isn't this nasty and a quick and rough tool using regular expressions will suffice ...
a) All printfs are at the start of a line
b) There are no embedded "s in the " string in the printf
c) No embedded comments at the start of the printf
d) The print format is a plain text string - you're not using an expression to make it up, or calling another method?
Umm .... I'm starting to worry myself now. Although each of these is an unlikely scenario, put together, chances are that one or other will crop up from time to time ... Redbanditos ... will a rough tool suffice (in which case I'll spend a few minutes writing a regular expression to demonstrate how I would do this job), or have I persuaded you that you'll need to do a rethink?
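If the assumptions a) to d) do hold, a rough line-by-line tool might look something like the sketch below. It is not the code from this thread; the proc name roughCheck and its return format (a list of line number and match pairs) are made up for illustration. It blanks out the format string of a printf that starts the line, then reports whatever %-wordchar sequences remain.

```tcl
# Hypothetical rough checker, valid only under assumptions a) to d) above.
proc roughCheck {source} {
    set hits {}
    set lineNo 0
    foreach line [split $source "\n"] {
        incr lineNo
        # a) printf starts the line (leading blanks allowed);
        # b) the format string holds no embedded double quotes.
        # Empty the format string so its % directives are not reported.
        regsub {^(\s*printf\s*\()"[^"]*"} $line {\1""} rest
        foreach m [regexp -all -inline {%\w+} $rest] {
            lappend hits [list $lineNo $m]
        }
    }
    return $hits
}

set sample "hallo %hallo = 12U;\nprintf(\"%d\", a %b);\n12 %12 = 0;"
puts [roughCheck $sample]   ;# {1 %hallo} {2 %b} {3 %12}
```

Any line that breaks one of the assumptions, such as a printf in mid-line or an escaped quote in the format string, will produce false reports.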
Posted by redbanditos1999 (redbanditos1999), 25 October 2002
Hi Graham!
You're quite right to worry, but the C code that I have to check isn't so nasty.
We develop software for aeroplane devices, and coding rules protect the code against cases like the ones you've described.
So I would be very grateful if you could demonstrate your approach to me.
Posted by admin (Graham Ellis), 26 October 2002
Here's a file to process:
Here's a program that does the conversions
and here's the output
a) See cautions on previous postings. Note that my sample code makes changes within comments which you might not want!
b) My code is screaming out for a proc to be written to do this sort of global substitution - I'll leave that to you once you've fine tuned the code.
c) I wondered about using -all on regsub but there are a number of problems with that, starting with the fact that my matches can overlap if there is more than one format directive in any printf.
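One possible shape for the proc mentioned in b), which also sidesteps the overlap problem in c), is sketched below. This is a guess at the idea rather than the program from this post: the proc names maskPrintfPercents and restorePercents and the \u0001 placeholder are assumptions, and format strings are assumed to contain no embedded double quotes. Each regsub call hides a single % and the loop repeats until nothing is left to hide, so several directives in one printf are handled one pass at a time.

```tcl
# Hypothetical helpers; the placeholder is assumed not to occur in the source.
proc maskPrintfPercents {text} {
    set mark "\u0001"
    # Matches from printf(" up to the first % still inside the format string.
    set pattern {(printf\s*\("[^"%]*)%}
    # regsub returns the number of substitutions made: 1 per pass, 0 when done.
    while {[regsub -- $pattern $text "\\1$mark" text]} {}
    return $text
}

proc restorePercents {text} {
    return [string map [list "\u0001" %] $text]
}

set code {printf("%d.%d\n", a, b); x = y %z;}
set masked [maskPrintfPercents $code]
puts [regexp -all -inline {%\w+} $masked]   ;# %z
```

With -all, every match would have to start at the same printf, and a single pass cannot return overlapping matches, so only the first directive in each call would be replaced.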