Regular Expression Help

Posted by jessjav (jessjav), 4 October 2005
Hi, I am very new to Perl and definitely not that good with regular expressions, but I was wondering if someone could help me with one. I would like to write a regular expression that replaces a single line return with a <br /> tag, but also checks whether there are two line returns together and, if so, replaces them with a <p> paragraph tag. The only problem is that it would have to put the opening <p> tag at the beginning of the line and then place the closing </p> tag at the end of the line. Hopefully I haven't confused everyone and this all makes sense. Any suggestions at all would be so helpful.

Posted by admin (Graham Ellis), 5 October 2005
The question comes - how are you going to recognise the end of the paragraph - perhaps by the next \n\n sequence?   If so, then

1 while $text =~ s/\n{2,}(.*?)\n{2,}/<p>$1<\/p>\n\n/s;

should do the trick.

Notes:

1. NO g (global) modifier on the substitution, as I want to restart at the beginning each time ... the matches overlap slightly.

2. Use of {2,} to collect any number of new lines (2 or more)

3. Use of s modifier on regular expression to ensure that the "." character does match any embedded new lines

4. Use of *? count (0 or more, but do a sparse match) to ensure that the algorithm recognises each paragraph and not the whole thing as one huge paragraph

5. Adding back in of \n\n at the paragraph end to ensure that the next paragraph will start where the previous one ended.

I've had to do something like this in the past ... it works well enough but you need to ensure that you do your paragraph substitutions (2 or more new lines) before your line breaks (single new line).   You may also need to add something extra / appropriate to deal with the very first / very last paragraphs.
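
As a rough illustration of the ordering described above (paragraphs first, then single line breaks), here is a minimal sketch. It swaps the looping substitution for a split on blank lines, which also takes care of the very first and very last paragraphs. The helper name text_to_html is made up for this example, and the sketch assumes the text contains no markup of its own.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn plain text into simple HTML.
sub text_to_html {
    my ($text) = @_;
    $text =~ s/\r\n?/\n/g;          # normalise line endings first
    $text =~ s/\A\n+|\n+\z//g;      # drop leading and trailing newlines
    # Paragraphs first: split on runs of two or more newlines.
    my @paragraphs = split /\n{2,}/, $text;
    for my $para (@paragraphs) {
        $para =~ s/\n/<br \/>\n/g;  # then each remaining newline is a line break
        $para = "<p>$para</p>";
    }
    return join "\n\n", @paragraphs;
}

print text_to_html("First line\nsecond line\n\nNew paragraph\n"), "\n";
```

Splitting avoids the overlap issue from note 1 entirely, at the cost of holding the list of paragraphs in memory.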



Posted by nano (nano), 13 October 2005
I have a question here relating to Perl matching.

Stripping a tag like so: <p>Unison, </p>

I want to put quotes around the full value because of the , comma.

Here's what I have:
my $ttag = $_;
         chomp($ttag);
         if( /<t>([^<>]+)<\/t>/){
             $ttag = $1;
             $ttag =~ s{,}{"$ttag"}xg;
             print " T tag value is $ttag \n";
         }

but what i get is  ->   T tag value is unison"unison,"

what i want to achieve is T tag value is "unison,"

Any ideas out there?

Posted by admin (Graham Ellis), 14 October 2005
You're replacing just the comma with the entire string - actually doing too much work!

Code:
if ($line =~ />(.*?)</) {
        print qq!value is "$1"\n!;
}


Would look like a shorter and simpler solution ...

Posted by nano (nano), 14 October 2005

Thanks for the above solution, but
this will put double quotes around the value whether it has a comma or not

like so:

<t> United nations </t>

output "$1"  =  "United nations"

whereas I just want double quotes if any word within the <t> tag has a comma

any ideas ..



Posted by admin (Graham Ellis), 14 October 2005
Perhaps

Code:
if ($line =~ />([^<]*?,[^<]*?)</) {
    print qq!value is "$1"\n!;
}
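
A slightly fuller sketch of the same idea, for anyone who wants the value back with or without quotes depending on the comma. The helper name t_value is made up for this example, and it assumes the value sits between <t> and </t> on one line with no nested tags.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside <t>...</t>,
# wrapped in double quotes only when it contains a comma.
sub t_value {
    my ($line) = @_;
    # Capture the tag content, leaving surrounding whitespace outside the capture.
    return unless $line =~ m{<t>\s*([^<>]*?)\s*</t>};
    my $value = $1;
    $value = qq{"$value"} if $value =~ /,/;
    return $value;
}

print "T tag value is ", t_value("<t>Unison, </t>"), "\n";
print "T tag value is ", t_value("<t> United nations </t>"), "\n";
```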



Posted by nano (nano), 17 October 2005

How can one pattern match a block of text for a carriage return and replace it with a new line feed '\n'?



Posted by admin (Graham Ellis), 17 October 2005
$block =~ s/\r/\n/g;
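
One thing to watch: if the block came from a Windows file, each line ends in \r\n, and replacing every \r with \n would double the line feeds. A minimal sketch that treats \r\n and a lone \r the same way (the helper name normalise_newlines is made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper: convert \r\n pairs and lone \r characters to \n.
sub normalise_newlines {
    my ($block) = @_;
    $block =~ s/\r\n?/\n/g;   # \r optionally followed by \n becomes one \n
    return $block;
}

my $clean = normalise_newlines("one\r\ntwo\rthree\n");
print $clean;
```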



Posted by nano (nano), 18 October 2005
Thanks Graham

Another one
want to pull out the value within a tag

<p LINK="hello" EDITION="CE5" CAT="world"> ....... <t> </t>

Want to retrieve the value for CAT ??

this is what i was using to achieve it .. but no luck
$cat =~ /CAT=(.*?)<\/>/) {
            $cat = qq!$1!;
any ideas ..

Posted by admin (Graham Ellis), 18 October 2005
($cat) = $cat =~ /CAT=([^>]*)/;
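
Note that the capture above keeps the surrounding double quotes and runs to the closing >, so it only suits an attribute that comes last in the tag. A sketch that captures just the text between the quotes, assuming attribute values are always double-quoted (the helper name attr_value is made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper: return the double-quoted value of a named attribute.
sub attr_value {
    my ($tag, $name) = @_;
    # \b stops CAT matching inside a longer name; \Q..\E quotes the name literally.
    my ($value) = $tag =~ /\b\Q$name\E="([^"]*)"/;
    return $value;
}

my $tag = '<p LINK="hello" EDITION="CE5" CAT="world">';
print "CAT is ", attr_value($tag, 'CAT'), "\n";
```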

Posted by nano (nano), 19 October 2005

Another one .. ??

Trying to add '\n\n' when I encounter the <P> tag,
however when I use the below it works in some parts of the block of text but not throughout

for example the block of text is :
</P><P>bla blah blah blah </P>hello  world<P>abcbabcbabdabb</P></div>Earlyistory</div><P>Little is known of the</P>

using the following syntax i would expect any matching <P> tag to be replaced by \n\n<P> but not the case -

any ideas ..


$entry =~s{<P>}{\n\n<P>};
 

Posted by admin (Graham Ellis), 19 October 2005
Hey ... try a whole day of learning these things!
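
For the <P> question above: the substitution as posted has no g modifier, so it stops after the first match in $entry. A minimal sketch with g added (the helper name add_breaks is made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper: put a blank line in front of every <P> tag.
sub add_breaks {
    my ($entry) = @_;
    $entry =~ s{<P>}{\n\n<P>}g;   # g = replace every match, not just the first
    return $entry;
}

my $entry = '</P><P>bla blah</P>hello world<P>abc</P>';
print add_breaks($entry), "\n";
```

If the tags can also appear in lower case, adding the i modifier (gi) would be the usual adjustment.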

Posted by nano (nano), 16 November 2005
Hi Guys ,

Have another issue running the following expression on a mac box using perl .

$my logfile = "temp"
$my date    = "10-12-05"

my $log = $logfile.$date

get the following error message


Use of uninitialized value in concatenation (.) or string at ./rename.pl line 20.
Use of uninitialized value in concatenation (.) or string at ./rename.pl line 29.
Use of uninitialized value in opendir at ./rename.pl line 32.
Use of uninitialized value in concatenation (.) or string at ./rename.pl line 32.

I ran the above on Mac kernel 7.3 and it was fine.

But with kernel version 7.9 it reports the messages above. I am using perl v5.8.1. It looks like it doesn't like the (.) for concatenation ..

What else can i use ..

Any ideas much appreciated .


Posted by admin (Graham Ellis), 16 November 2005
It looks like you have warnings on (on one box, but not the other) which is why you're getting varied behaviour.

Try rewriting

$my logifle = "temp";
$my date = "10-12-05";

as

my $logfile = "temp";
my $date = "10-12-05";

and it should get rid of the warnings.

Posted by nano (nano), 16 November 2005
Sorry, that was a typo when I was writing up the problem
$my logifle = "temp";
$my date = "10-12-05";

I had already this

my $logfile = "temp";
my $date = "10-12-05";

Still getting the error message. I don't have warnings on ..

Posted by nano (nano), 16 November 2005
OK, sorry about this - I got the solution. Some environment variables that the program was using weren't set, and that is what threw the above error messages. Nothing relating to the lines of code above ..

Thanks for your help.
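
A small sketch of one way to catch this sort of thing early: check each environment variable before using it, so that the script stops with a clear message instead of a string of "uninitialized value" warnings. The helper name require_env and the variable name RENAME_SOURCE_DIR are made up for this example; the thread does not say which variables were involved.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch an environment variable or die with a clear message.
sub require_env {
    my ($name) = @_;
    die "Environment variable $name is not set\n" unless defined $ENV{$name};
    return $ENV{$name};
}

# RENAME_SOURCE_DIR is an invented name used only for illustration.
my $dir = eval { require_env('RENAME_SOURCE_DIR') };
print defined $dir ? "Working in $dir\n" : "Problem: $@";
```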

Posted by nano (nano), 18 November 2005
Hi ,

Trying to do some manipulation within a block of text where I need to change a char.
what i have is  
$entry =~ s/javascript_fn(xxx-xx)/javascript_fn(xxx_xxx)/g

Now the string inside javascript_fn can be anything, but it will contain a dash that needs to be replaced with an underscore, within this function only.

any ideas how i can achieve this.
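
One possible approach, sketched under the assumption that the argument list contains no nested parentheses: capture whatever sits between the brackets, translate the dashes in that capture only, and put the call back together using the e modifier. The helper name fix_fn_args is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: change - to _ only inside javascript_fn(...).
sub fix_fn_args {
    my ($entry) = @_;
    $entry =~ s{javascript_fn\(([^)]*)\)}{
        my $arg = $1;          # copy the captured argument text
        $arg =~ tr/-/_/;       # dashes become underscores in the copy only
        "javascript_fn($arg)"; # value of the last expression is the replacement
    }ge;
    return $entry;
}

print fix_fn_args("a-b javascript_fn(xxx-xx) c-d"), "\n";
```

The brackets have to be escaped as \( and \) in the pattern; unescaped, they would form a capture group rather than match the literal characters.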



This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference. To jump to the archive index please follow this link.


© WELL HOUSE CONSULTANTS LTD., 2014: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • FAX: 01144 1225 899360 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho