Simple script to strip out MD5 checksums

Posted by Stimpmeter (Stimpmeter), 25 February 2005

I have the need to parse text files and collect the MD5 checksums contained within the file so they can be written to another file. The text files will either have 0 or 1 MD5 checksums on each line.

i.e. I want to turn:

dog fish 2345acdb1890ffff3897cdfa2276addc goat rhino 6776acdb1890ffff3897cdfa2276bbbb whale pig womble

into...

2345acdb1890ffff3897cdfa2276addc
6776acdb1890ffff3897cdfa2276bbbb

My preference is to do this in Perl. I think the regular expression I am after is something like [0-9a-fA-F]{32}, but I couldn't get this working in Awk.

I wrote an Awk script that sort of works (inelegantly):

    BEGIN { FS = " "}
    {
        nfields = NF
        for (count = 1; count <= nfields; count++) {
            if ($count ~ /^[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F] --- See edit note below ---
    [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]
    [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]
    [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]
    [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]
    [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]
    [0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]$/) {
                print $count
                break
            }
        }
    }
    END {
        print ""
        print "Field count: ", nfields
        print FS
    }

Thanks in advance

Edited by Graham - I have split your very long line so that it doesn't cause browsers to put left-right scroll bars up when people view this page - people really aren't keen on that type of scrolling for some reason!

Posted by admin (Graham Ellis), 25 February 2005

Here's the sort of thing ... Code:
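The listing that belonged here did not survive in this archive. As a rough stand-in, and not the code originally posted, here is a minimal sketch of the approach the reply goes on to describe: split each line into fields and test each field against exactly 32 hex digits using the POSIX xdigit class. The subroutine name first_md5, the @sample array, and the way the question's data is broken into lines are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first whitespace-separated field that is exactly 32 hex
# digits, or undef when the line contains no such field.
# (first_md5 is a hypothetical name, not taken from the original reply.)
sub first_md5 {
    my ($line) = @_;
    for my $field (split ' ', $line) {
        return $field if $field =~ /^[[:xdigit:]]{32}$/;
    }
    return;
}

# Hypothetical sample lines; the line breaks are assumed, since the
# archive only preserves the question's data as one run of words.
my @sample = (
    "dog fish 2345acdb1890ffff3897cdfa2276addc goat",
    "rhino 6776acdb1890ffff3897cdfa2276bbbb whale",
    "pig womble",
);

# Print one checksum per line; lines without a checksum print nothing.
for my $line (@sample) {
    my $md5 = first_md5($line);
    print "$md5\n" if defined $md5;
}
```

With the sample lines above this should print the two checksums from the question, one per line. To process real files, the loop over @sample could be swapped for a loop such as while (my $line = <>), which reads from the files named on the command line.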
Which gives ... Code:
I don't know what your earlier Perl problems were (since you've only supplied your awk example!) but I'm suspecting you might not have been splitting the lines down. Anyway - the code above works; it could be written shorter, but I've gone for clarity instead.

If you remove the DATA file handle so that you've just got <> rather than <DATA>, you'll be able to run your program on data contained in a file named on the command line rather than on the data that I've embedded at the end of the program - that's just a mechanism I often use to provide a short example with everything in the one file in case I come back to it in the future.

Oh - and [0-9A-Fa-f] would have worked just as well as my "xdigit" stuff. I just felt that xdigit was a bit clearer when you come back later to do code maintenance.

Posted by Stimpmeter (Stimpmeter), 1 March 2005

Perfect. Thank you.

This page is a thread posted to the opentalk forum at www.opentalk.org.uk and archived here for reference.