Fastest way to replace chars

Posted by John_Moylan (jfp), 25 September 2002
Evening all.

Now this has puzzled me, in the sense that I never expected it to be faster... but.

I want to convert 2002-09-18 04:45:22 to 20020918044522
This has to be done on hundreds of thousands of lines, so I thought I'd better benchmark it on 6000 lines.
(Can't test on more for now: corrupt mysqldump, I was lucky to salvage 6000 lines, but anyway)

First I thought of using s/\D//g; in my method below.
This took 5 seconds (using Benchmark.pm)

But the three steps of
$date =~ tr/-//d;
$date =~ tr/ //d;
$date =~ tr/://d;
took only 4 seconds.

Is this to do with the regex engine overhead?
I was sure the 3 step process would be slower.

Code:
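sub DateToTimestamp {

   # No "()" prototype here: an empty prototype declares a sub that
   # takes no arguments, which contradicts reading @_ below.

   # The date is in the format of '2002-09-18 04:45:22'
   # but I want a timestamp of '20020918044522'

   my ($self, $date) = @_;

   $date =~ tr/-//d;   # delete dashes
   $date =~ tr/ //d;   # delete the space
   $date =~ tr/://d;   # delete colons

   print "$date\n";

   return $date;
}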


Or have I missed something?

jfp

Posted by admin (Graham Ellis), 25 September 2002
I'm not at all surprised at the result. Regular expressions are very clever, and that cleverness does add some slowing down. On the other hand, tr simply builds up a 256 character translate table and blats the data through it.
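
Since tr works from a single translate table, the three deletions can probably be folded into one pass. A minimal sketch, assuming the input only ever contains digits plus the dash, space and colon separators; the sub name strip_separators is made up for illustration and is not from the original post:

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original code.
sub strip_separators {
    my ($date) = @_;

    # c = complement the search list, d = delete what matches:
    # every character that is NOT 0-9 is removed in a single pass.
    $date =~ tr/0-9//cd;

    return $date;
}

print strip_separators('2002-09-18 04:45:22'), "\n";   # prints 20020918044522
```

Writing tr/- ://d instead would delete only those three separator characters and leave anything else untouched, which may be the safer choice if stray characters should stay visible.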

If you think about it, even your simple regex has to look at each character against a list of (10) digits and loop internally to check that each character isn't one of them...
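
One way to check this on your own machine is a small script around the Benchmark module. This is only a sketch: the sub names are invented for illustration, it times a single sample string rather than the 6000 salvaged lines, and the figures will vary with Perl version and hardware.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $sample = '2002-09-18 04:45:22';

# Candidate 1: one regex substitution removing every non-digit.
sub with_regex {
    my $d = shift;
    $d =~ s/\D//g;
    return $d;
}

# Candidate 2: the three separate tr deletions from the original post.
sub with_three_tr {
    my $d = shift;
    $d =~ tr/-//d;
    $d =~ tr/ //d;
    $d =~ tr/://d;
    return $d;
}

# A negative count means "run each candidate for at least that many
# CPU seconds"; cmpthese then prints a rate comparison table.
cmpthese(-1, {
    regex    => sub { with_regex($sample) },
    three_tr => sub { with_three_tr($sample) },
});
```

Note that the print inside a sub being timed would add I/O cost to every call, so the candidates here only return the converted string.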

Question for you. Are the dashes, colons and spaces always at exactly the same character position number in the string? If they are, I think you might find that it's even quicker to use unpack or a series of substrs, followed by a pack to reform the parts of the time and datestamp.
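
A rough sketch of both fixed-position approaches, assuming every input is exactly in the form 'YYYY-MM-DD HH:MM:SS'. The sub names are hypothetical, and join is used here in place of pack to glue the pieces back together (pack 'A4 A2 A2 A2 A2 A2' should give the same string):

```perl
use strict;
use warnings;

# unpack: take 4 chars, skip 1, then five times take 2 chars with a
# 1-char skip between them. 'x' skips one byte when unpacking.
sub via_unpack {
    my ($date) = @_;
    return join '', unpack 'A4 x A2 x A2 x A2 x A2 x A2', $date;
}

# substr: pull out each field by its fixed offset and length.
sub via_substr {
    my ($date) = @_;
    return substr($date, 0, 4)  . substr($date, 5, 2)
         . substr($date, 8, 2)  . substr($date, 11, 2)
         . substr($date, 14, 2) . substr($date, 17, 2);
}

print via_unpack('2002-09-18 04:45:22'), "\n";   # prints 20020918044522
print via_substr('2002-09-18 04:45:22'), "\n";   # prints 20020918044522
```

Neither version validates its input, so a line with a shifted or missing separator would silently give a wrong timestamp. Whether either actually beats tr is worth measuring rather than assuming.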

By the way - welcome to the rank of "Established Poster" - you're no longer a Newcomer! - G



This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference.

© WELL HOUSE CONSULTANTS LTD., 2024: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • FAX: 01144 1225 793803 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho