
Simple script to strip out MD5 checksums

Posted by Stimpmeter (Stimpmeter), 25 February 2005
I need to parse text files and collect the MD5 checksums they contain so that they can be written to another file. Each line of a text file has either zero or one MD5 checksum. That is, I want to turn:

dog fish 2345acdb1890ffff3897cdfa2276addc
goat rhino
6776acdb1890ffff3897cdfa2276bbbb whale pig
womble

into...

2345acdb1890ffff3897cdfa2276addc
6776acdb1890ffff3897cdfa2276bbbb

My preference is to do this in Perl. I think the regular expression I am after is something like [0-9a-fA-F]{32}, but I couldn't get this working in Awk. I wrote an Awk script that sort of works (inelegantly):

BEGIN { FS = " "}
{
nfields = NF

for (count = 1; count <= nfields; count++) {

# Match a field made of exactly 32 hex digits. The long pattern is split
# across lines (see edit note below) and joined into one dynamic regex.
if ($count ~ ("^[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
"[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
"[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
"[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
"[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
"[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]" \
"[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]$")) {
  print $count
  break
}

}

}
END {
 print ""
 print "Field count: ", nfields
 print FS
}

Thanks in advance

Edited by Graham - I have split your very long line so that it doesn't cause browsers to put left-right scroll bars up when people view this page - people really aren't keen on that type of scrolling for some reason!

Posted by admin (Graham Ellis), 25 February 2005
Here's the sort of thing ...

Code:
while ($line = <DATA>) {
 @parts = split(/\s+/,$line);
 foreach $field (@parts) {
   print "$field\n" if ($field =~ /^[[:xdigit:]]{32}$/);
 }    
}
__END__
dog fish 2345acdb1890ffff3897cdfa2276addc
goat rhino
6776acdb1890ffff3897cdfa2276bbbb whale pig
womble


Which gives ...

Code:
earth-wind-and-fire:~/feb05 grahamellis$ perl gmd5
2345acdb1890ffff3897cdfa2276addc
6776acdb1890ffff3897cdfa2276bbbb
earth-wind-and-fire:~/feb05 grahamellis$


I don't know what your earlier Perl problems were (since you've only supplied your awk example!), but I suspect you might not have been splitting each line into fields. Anyway, the code above works; it could be written shorter, but I've gone for clarity instead.
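For comparison, a shorter form might look like the sketch below. It is only one possible way to compress the loop: the helper name md5_fields is made up for illustration, and the sample lines are copied from the question into an array so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): return every
# whitespace-separated field of $line that is exactly 32 hex digits.
sub md5_fields {
    my ($line) = @_;
    return grep { /^[[:xdigit:]]{32}$/ } split ' ', $line;
}

# Sample lines taken from the question.
my @lines = (
    "dog fish 2345acdb1890ffff3897cdfa2276addc",
    "goat rhino",
    "6776acdb1890ffff3897cdfa2276bbbb whale pig",
    "womble",
);

# Print one checksum per line.
print "$_\n" for map { md5_fields($_) } @lines;
```

On the sample lines this prints the same two checksums as the longer version. The trade-off is the one mentioned above: grep and map pack the logic into one line, which is harder to follow if you are new to Perl.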

If you remove the DATA file handle so that you've just got <> rather than <DATA>, you'll be able to run your program on data contained in a file named on the command line rather than on the data that I've embedded at the end of the program. Embedding the data is just a mechanism I often use to provide a short example with everything in one file, in case I come back to it in the future.
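As a rough sketch of the <> form: the example below writes the sample data to a temporary file and puts that file's name into @ARGV, which stands in for naming a file on the command line. The temporary file and the @found array are assumptions made purely so the example is self-contained; in a real script you would leave @ARGV alone.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write the sample data from the question to a temporary file, standing in
# for a real input file named on the command line.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "dog fish 2345acdb1890ffff3897cdfa2276addc\n",
                "goat rhino\n",
                "6776acdb1890ffff3897cdfa2276bbbb whale pig\n",
                "womble\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

# <> reads each file listed in @ARGV in turn. Setting @ARGV here simulates
# running the script with one file name as its argument.
@ARGV = ($tmp_name);

my @found;
while (my $line = <>) {
    for my $field (split ' ', $line) {
        push @found, $field if $field =~ /^[[:xdigit:]]{32}$/;
    }
}

print "$_\n" for @found;
```

With the temporary-file setup removed, something like `perl gmd5 input.txt > checksums.txt` should then write the checksums to another file (input.txt and checksums.txt are placeholder names).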

Oh - and [0-9A-Fa-f] would have worked just as well as my "xdigit" stuff.  I just felt that xdigit was a bit clearer when you come back later to do code maintenance.
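If you want to convince yourself that the two character classes agree on plain ASCII input like this, a quick check along these lines should do. The sample strings are invented for the test, apart from the first, which comes from the question.

```perl
use strict;
use warnings;

my $posix    = qr/^[[:xdigit:]]{32}$/;   # POSIX class, as in the code above
my $explicit = qr/^[0-9A-Fa-f]{32}$/;    # explicit ranges

my @samples = (
    "2345acdb1890ffff3897cdfa2276addc",   # 32 hex digits, lower case
    "2345ACDB1890FFFF3897CDFA2276ADDC",   # 32 hex digits, upper case
    "2345acdb1890ffff3897cdfa2276add",    # only 31 digits: too short
    "2345acdb1890ffff3897cdfa2276addg",   # contains 'g': not hex
);

# Report what each pattern makes of each sample.
for my $s (@samples) {
    my $p = $s =~ $posix    ? "match" : "no match";
    my $e = $s =~ $explicit ? "match" : "no match";
    print "$s  xdigit: $p  explicit: $e\n";
}
```

Both patterns should accept the first two samples and reject the last two.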

Posted by Stimpmeter (Stimpmeter), 1 March 2005
Perfect. Thank you.



This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference. To jump to the archive index please follow this link.


© WELL HOUSE CONSULTANTS LTD., 2014: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • FAX: 01144 1225 899360 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho