Posted by Gelembjuk (Gelembjuk), 14 November 2006
Hello.
I have developed a Perl module, HTML::ListGrabber, for extracting data from HTML pages.
I would like to hear feedback on it from Perl developers.
The manual and a demo are on my site: http://gelembjuk.il.if.ua/
What do you think?
Posted by admin (Graham Ellis), 15 November 2006
Well - there are a lot of other modules already out there. If you compare it to others (such as LWP, which some of us already use; see http://www.perl.com/pub/a/2002/08/20/perlandlwp.html) then you'll help us learn under what circumstances it might be appropriate to use your module.
Posted by Gelembjuk (Gelembjuk), 15 November 2006
The LWP module only downloads web resources over the HTTP protocol.
My module extracts data from HTML text.
Example (extracting product info from amazon.com search results):
Code:
use HTML::ListGrabber;

$grabber = HTML::ListGrabber->new;

# Template describing one list item; each <datatag name="..."> names a field to extract
$template='<td class="imageColumn" ><table><tr><td> <a ><datatag name="img" -extractfrom="img" -attrforextract="src"></a> <datatag name="null" -pass="all"> <td class="dataColumn"><table ><tr><td> <datatag name="link" -extractfrom="a" -attrforextract="href"> <datatag name="title" -pass="span"></a> <datatag name="author"><span class="bindingBlock"> <datatag name="null" -pass="all"> <norequired> <span class="listprice"><datatag name="listprice"></span> </norequired> <datatag name="null" -pass="all"> <span class="otherprice"><datatag name="otherprice"></span>';
$grabber->setTemplate($template);

$url = "http://amazon.com/s/field-keywords=perl";

# Each returned element is a hash reference holding the fields of one item
@a = $grabber->grabListedData($url);

foreach $k (@a) {
    print "--------------------------------\n";
    foreach $p (keys %$k) {
        print "$p => $$k{$p}\n";
    }
}
Output (partial):
Code:
--------------------------------
link => http://www.amazon.com/Learning-Perl-Second-Randal-Schwartz/dp/B00005R09A/sr=8-1/qid=1163574960/ref=pd_bbs_1/103-2135971-0091843?ie=UTF8&s=books
listprice => $39.95
img => http://ec1.images-amazon.com/images/P/B00005R09A.01._SCTHUMBZZZ_.jpg
title => Learning Perl, Second Edition
author => by Randal L. Schwartz, Tom Christiansen, and Larry Wall
otherprice => $19.50
--------------------------------
link => http://www.amazon.com/Programming-Perl-2nd-Larry-Wall/dp/B00005R09P/sr=8-3/qid=1163574960/ref=pd_bbs_3/103-2135971-0091843?ie=UTF8&s=books
listprice => $49.95
img => http://ec1.images-amazon.com/images/P/B00005R09P.01._SCTHUMBZZZ_.jpg
title => Programming Perl (2nd Edition)
author => by Larry Wall, Tom Christiansen, Randal L. Schwartz, and Stephen Potter
otherprice => $23.20
--------------------------------
link => http://www.amazon.com/Perl-Cookbook-Second-Tom-Christiansen/dp/0596003137/sr=8-5/qid=1163574960/ref=pd_bbs_sr_5/103-2135971-0091843?ie=UTF8&s=books
listprice => $49.95
img => http://ec1.images-amazon.com/images/P/0596003137.01._SCTHUMBZZZ_.jpg
title => Perl Cookbook, Second Edition
author => by Tom Christiansen and Nathan Torkington
otherprice => $18.02
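Once the records are in this shape they are ordinary Perl hashes, so they can be processed further without the module. The following is only a rough sketch, not part of HTML::ListGrabber: the helper names price_to_number and savings are made up for illustration, and the two sample records are copied from the output shown in this post. It assumes that listprice may be missing from a record, since the template wraps it in <norequired>.

```perl
use strict;
use warnings;

# Sample records copied from the output in this post
# (only the fields needed here).
my @items = (
    {
        title      => 'Learning Perl, Second Edition',
        listprice  => '$39.95',
        otherprice => '$19.50',
    },
    {
        title      => 'Perl Cookbook, Second Edition',
        listprice  => '$49.95',
        otherprice => '$18.02',
    },
);

# Hypothetical helper: turn a price string such as '$39.95' into a number.
# Returns undef when the field is missing or contains no digits.
sub price_to_number {
    my ($s) = @_;
    return unless defined $s;
    (my $n = $s) =~ s/[^0-9.]//g;    # keep digits and the decimal point
    return length($n) ? $n + 0 : undef;
}

# Hypothetical helper: difference between list price and other price,
# or undef when either price is unavailable.
sub savings {
    my ($item) = @_;
    my $list  = price_to_number( $item->{listprice} );
    my $other = price_to_number( $item->{otherprice} );
    return unless defined $list && defined $other;
    return $list - $other;
}

for my $item (@items) {
    my $saved = savings($item);
    next unless defined $saved;      # skip records without both prices
    printf "%s: save \$%.2f\n", $item->{title}, $saved;
}
```

Records that lack one of the two prices are skipped rather than treated as zero, which seems the safer choice when a field is optional in the template.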
Posted by admin (Graham Ellis), 15 November 2006
Many thanks for clearing that up. There are so many modules out there that it's sometimes difficult to see where each of them fits in.
This page is a thread posted to the opentalk forum at www.opentalk.org.uk and
archived here for reference.