Simple script to strip out MD5 checksums

Posted by Stimpmeter (Stimpmeter), 25 February 2005
I need to parse text files and collect the MD5 checksums they contain so that they can be written to another file. Each line of the text files will have either 0 or 1 MD5 checksums, i.e. I want to turn:

dog fish 2345acdb1890ffff3897cdfa2276addc
goat rhino
6776acdb1890ffff3897cdfa2276bbbb whale pig
womble

into...

2345acdb1890ffff3897cdfa2276addc
6776acdb1890ffff3897cdfa2276bbbb

My preference is to do this in Perl. I think the regular expression I am after is something like [0-9a-fA-F]{32}, but I couldn't get this working in Awk. I wrote an Awk script that sort of works (inelegantly):

BEGIN { FS = " " }
{
  nfields = NF

  # Check each field on the line for a 32-character hex string.
  for (count = 1; count <= nfields; count++) {

    # Pattern built from string pieces joined across lines --- See edit note below ---
    if ($count ~ ("^[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
      "[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
      "[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
      "[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
      "[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
      "[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
      "[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]$")) {
      print $count
      break    # at most one checksum per line, so stop at the first match
    }

  }

}
END {
  # Debug output: field count of the last line read, and the separator.
  print ""
  print "Field count: ", nfields
  print FS
}

Thanks in advance

Edited by Graham - I have split your very long line so that it doesn't cause browsers to put left-right scroll bars up when people view this page - people really aren't keen on that type of scrolling for some reason!

Posted by admin (Graham Ellis), 25 February 2005
Here's the sort of thing ...

Code:
while ($line = <DATA>) {
 @parts = split(/\s+/,$line);
 foreach $field (@parts) {
   print "$field\n" if ($field =~ /^[[:xdigit:]]{32}$/);
 }    
}
__END__
dog fish 2345acdb1890ffff3897cdfa2276addc
goat rhino
6776acdb1890ffff3897cdfa2276bbbb whale pig
womble


Which gives ...

Code:
earth-wind-and-fire:~/feb05 grahamellis$ perl gmd5
2345acdb1890ffff3897cdfa2276addc
6776acdb1890ffff3897cdfa2276bbbb
earth-wind-and-fire:~/feb05 grahamellis$


I don't know what your earlier Perl problems were (since you've only supplied your awk example!)  but I'm suspecting you might not have been splitting the lines down.   Anyway - the code above works; it could be written shorter, but I've gone for clarity instead.
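
As a rough idea of what "shorter" might look like, here is a minimal sketch. It assumes the same whitespace-separated layout as the sample data; the helper name md5_fields is made up for illustration, and the sample lines are held in an array rather than under __END__. split ' ' breaks the line on whitespace, and grep keeps only the fields that are exactly 32 hex digits.

```perl
use strict;
use warnings;

# md5_fields is a hypothetical helper name, not part of the original script.
# It returns every whitespace-separated field that is exactly 32 hex digits.
sub md5_fields {
    my ($line) = @_;
    return grep { /^[[:xdigit:]]{32}$/ } split ' ', $line;
}

# The sample lines from the question.
my @lines = (
    "dog fish 2345acdb1890ffff3897cdfa2276addc\n",
    "goat rhino\n",
    "6776acdb1890ffff3897cdfa2276bbbb whale pig\n",
    "womble\n",
);

foreach my $line (@lines) {
    print "$_\n" for md5_fields($line);
}
```

On the sample lines this should print the same two checksums as the longer version. Whether the compact form is easier to maintain is a matter of taste.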

If you remove the DATA file handle so that you've just got <> rather than <DATA>, you'll be able to run your program on data contained in a file named on the command line rather than on the data that I've embedded at the end of the program - that's just a mechanism I often use to provide a short example with everything in the one file in case I come back to it in the future.
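
Since the original request was to write the checksums to another file, the <> approach could be combined with an output file handle along these lines. This is only a sketch: the sub name collect_md5 and the output name checksums.txt are assumptions made for the example, not anything from the posts above.

```perl
use strict;
use warnings;

# collect_md5 and the file name checksums.txt are hypothetical,
# chosen for this sketch only.
sub collect_md5 {
    my ($outfile, @infiles) = @_;
    local @ARGV = @infiles;    # <> reads each file named in @ARGV in turn
    open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
    while (my $line = <>) {
        foreach my $field (split ' ', $line) {
            # Keep only fields that are exactly 32 hex digits.
            print {$out} "$field\n" if $field =~ /^[[:xdigit:]]{32}$/;
        }
    }
    close($out) or die "Cannot close $outfile: $!";
    return;
}

# Only run when input files were named on the command line.
collect_md5('checksums.txt', @ARGV) if @ARGV;
```

Saved as, say, gmd5 (the script name used in the run above), it could be run as "perl gmd5 input.txt", after which checksums.txt should hold one checksum per line.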

Oh - and [0-9A-Fa-f] would have worked just as well as my "xdigit" stuff.  I just felt that xdigit was a bit clearer when you come back later to do code maintenance.
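
A quick way to check that the two spellings agree on plain ASCII text like the sample data is to run both over a few fields. The upper-case and "g" samples below are made-up variants added for the comparison; they are not from the original data.

```perl
use strict;
use warnings;

my $posix    = qr/^[[:xdigit:]]{32}$/;    # POSIX class, as in the script above
my $explicit = qr/^[0-9A-Fa-f]{32}$/;     # explicit character ranges

my @samples = (
    '2345acdb1890ffff3897cdfa2276addc',   # from the thread
    '6776ACDB1890FFFF3897CDFA2276BBBB',   # upper-case variant (assumed)
    'womble',                             # too short, not hex
    '2345acdb1890ffff3897cdfa2276addg',   # 32 characters, but 'g' is not hex
);

foreach my $field (@samples) {
    my $by_posix    = ($field =~ $posix)    ? 'match' : 'no match';
    my $by_explicit = ($field =~ $explicit) ? 'match' : 'no match';
    printf "%-34s xdigit: %-9s ranges: %s\n", $field, $by_posix, $by_explicit;
}
```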

Posted by Stimpmeter (Stimpmeter), 1 March 2005
Perfect. Thank you.



This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference.


© WELL HOUSE CONSULTANTS LTD., 2024: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • FAX: 01144 1225 793803 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho