Using LWP to write Web Clients

As well as running server-side scripts (through CGI, mod_perl or ASP), and its many uses that have nothing to do with networks, Perl can be used to write server processes, and client processes too.

"Why do I want to write my own browser?" you will ask. The answer, of course, is that you don't ... but you might well want to write an application to mimic a browser as it collects information from a server using HTTP.

Let's say, for example, that I wish to write a Perl program to convert a price in one currency into another currency, using the current "spot" exchange rates. There's a suitable table available at:
 http://www.ecb.int/home/eurofxref.htm
and it's updated daily.

[Like all websites, the ECB's has since changed ;-( ... it now provides an XML file that's specifically intended for applications like this one :-) - it's at
 http://www.ecb.int/stats/eurofxref/eurofxref-daily.xml]

Although we could write such a program using our own connections and sockets, it's much easier to use Perl's LWP module - here is the sort of result we might get:

$ ecbgrab
 AUD 1.693 Australian dollar
 BGN 1.951 Bulgarian lev
 CAD 1.371 Canadian dollar
 CHF 1.478 Swiss franc
 CYP 0.576 Cyprus pound
 CZK 31.910 Czech koruna
 DKK 7.428 Danish krone
 EEK 15.647 Estonian kroon
 EUR 1.000 European euros
 GBP 0.610 Pound sterling
 HKD 6.732 Hong Kong dollar
 HUF 243.000 Hungarian forint
 ISK 88.650 Icelandic krona
 JPY 115.660 Japanese yen
 KRW 1133.040 South Korean won
 LTL 3.453 Lithuanian litas
 LVL 0.555 Latvian lat
 MTL 0.398 Maltese lira
 NOK 7.840 Norwegian krone
 NZD 2.068 New Zealand dollar
 PLN 3.585 Polish zloty
 ROL 27700.000 Romanian leu
 SEK 9.188 Swedish krona
 SGD 1.585 Singaporean dollar
 SIT 222.598 Slovenian tolar
 SKK 42.470 Slovakian koruna
 TRL 1133000.000 Turkish lira
 USD 0.863 US dollar
 ZAR 9.931 South African rand
Please enter an amount to convert (e.g. 290.00 GBP) ... 19.99 GBP
into what currency ... NOK
19.99 Pound sterling converts to 256.94 Norwegian krone (GBP to NOK at 12.8533)
$

Here's the program:

#!/usr/bin/perl

use strict;
use warnings;
use LWP::UserAgent;

# Create the user agent and tell the server who we are.
# The trailing space makes LWP append its own "libwww-perl/x.y" tag.
my $agent = LWP::UserAgent->new;
$agent->agent("Well House Consultants/$0 ");

# Make up a GET request, submit it, and give up if the fetch failed
my $req = HTTP::Request->new(GET => "http://www.ecb.int/home/eurofxref.htm");
my $res = $agent->request($req);
die ("fetch failed: " . $res->status_line . "\n") unless ($res->is_success);

my $page = $res->content;

# Pull (code, name) and (code, rate) pairs out of the table cells
my %currencies = ("EUR","European euros",($page =~ />([A-Z]{3})<.*?>\s*<.*?>(.*?)</gs));
my %rates = ("EUR",1,($page =~ />([A-Z]{3})<.*?>\s*<.*?>.*?<.*?>\s*<.*?>(.*?)</gs));
foreach my $c (sort keys %currencies) {
    printf ("%4s %12.3f %s\n",$c,$rates{$c},$currencies{$c});
}

print "Please enter an amount to convert (e.g. 290.00 GBP) ... ";
chomp (my $yousaid = <STDIN>);
if ($yousaid) {
    # Split the entry into an amount and a three-letter currency code
    my ($amount,$incurr) = ($yousaid =~ /(.*?)\s*([A-Z]{3})$/)
        or die ("invalid entry\n");
    die ("not a known currency\n") unless ($rates{$incurr});

    print "into what currency ... ";
    chomp ($yousaid = <STDIN>);
    my ($outcurr) = ($yousaid =~ /([A-Z]{3})$/)
        or die ("invalid entry\n");
    die ("not a known currency\n") unless ($rates{$outcurr});

    # Every rate is quoted against the euro, so convert via the euro
    my $becomes = $amount / $rates{$incurr} * $rates{$outcurr};
    my $erate = 1.0 / $rates{$incurr} * $rates{$outcurr};

    printf ("%.2f %s converts to %.2f %s (%s to %s at %.4f)\n",
        $amount, $currencies{$incurr},
        $becomes, $currencies{$outcurr},
        $incurr, $outcurr, $erate);
}

We've used the LWP::UserAgent module from CPAN - browsers are known as "User Agents", in case you were wondering. We've given the User Agent a program name so that the server knows what type of agent (i.e. what model of browser) we are, and we've then made up a GET request and submitted it.
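
The request and response steps just described can also be written with LWP's get shortcut, which builds and submits the GET request in one call. What follows is a minimal sketch, not a replacement for the program above: fetch_page is a hypothetical helper name, and the 10 second timeout is an arbitrary assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# fetch_page is a hypothetical helper for this sketch, not part of LWP.
# It returns the body of the page, or dies with the HTTP status line.
sub fetch_page {
    my ($url) = @_;

    require LWP::UserAgent;                           # loaded on first use
    my $agent = LWP::UserAgent->new(timeout => 10);   # timeout value is an assumption
    $agent->agent("Well House Consultants/$0 ");      # trailing space: LWP appends its own tag

    my $res = $agent->get($url);                      # build and submit the GET in one step
    die "GET $url failed: " . $res->status_line . "\n"
        unless $res->is_success;

    return $res->decoded_content;                     # body with content encoding and charset undone
}

# Fetch the URL named on the command line, if one was given
print fetch_page($ARGV[0]) if @ARGV;
```

Checking is_success before using the content matters: without it, an error page or an empty body would be handed to the regular expressions as though it were the rates table.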

There are other modules, such as HTML::Parser, available to help you parse the response, although in this particular case the response is in a straightforward enough format that we've just used regular expressions for the job.
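
The same regular expression approach can be tried on the XML file mentioned earlier. This sketch assumes that each rate appears as a Cube element carrying a currency attribute followed by a rate attribute; the sample data is invented, and parse_rates and convert are hypothetical names. If the layout of the feed might change, a proper XML parser is the safer choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample in the assumed layout of the feed (not real rates)
my $sample = <<'XML';
<Cube>
  <Cube time='2012-03-28'>
    <Cube currency='USD' rate='1.25'/>
    <Cube currency='GBP' rate='0.80'/>
    <Cube currency='NOK' rate='8.00'/>
  </Cube>
</Cube>
XML

# Return a hash of currency code => units of that currency per euro
sub parse_rates {
    my ($xml) = @_;
    my %found = (EUR => 1,
        $xml =~ /currency=['"]([A-Z]{3})['"]\s+rate=['"]([\d.]+)['"]/g);
    return %found;
}

# Convert via the euro: divide by the source rate, multiply by the target rate
sub convert {
    my ($amount, $from, $to, $rates) = @_;
    return $amount / $rates->{$from} * $rates->{$to};
}

my %rates = parse_rates($sample);
printf "%.2f GBP converts to %.2f NOK\n", 20, convert(20, "GBP", "NOK", \%rates);
```

With the invented figures above, 20 GBP is 25 euros, which is 200 NOK.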

A NOTE OF CAUTION

Web servers were designed to supply information to browsers at the request of human users. Such accesses are relatively sporadic in computing terms, even for an enthusiastic user, so that lots of users can all be accessing the same web site in the same period of time and the web server can cope.

If you write a client program that reaps a very large number of pages as fast as it can, it's quite possible that you'll overload the server or the internet connection to it, and your effort may be seen as unwelcome - it may even be classified as a "denial of service" attack.

There are two rules to note if you are going to be looking for many pages, or if you are going to be looking regularly.

First, you should look at a file called robots.txt, which should be in the top-level directory of the web site (i.e. at /robots.txt); this file is placed there by the web site administrator, and tells robots where they are not welcome.
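
One way to check a URL against robots.txt is the WWW::RobotRules module from CPAN. The sketch below parses a robots.txt held in a string so that it runs without touching the network; the rules, the example.com URLs and the robot name "ecbgrab/1.0" are all invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::RobotRules;

# Name our robot; the name is matched against User-agent lines in robots.txt
my $rules = WWW::RobotRules->new("ecbgrab/1.0");

# In real use this text would be fetched from the site with LWP.
# Both the rules and the URLs here are invented for the example.
my $robots_url = "http://www.example.com/robots.txt";
my $robots_txt = <<'ROBOTS';
User-agent: *
Disallow: /private/
ROBOTS

$rules->parse($robots_url, $robots_txt);

# Ask about each page before grabbing it
foreach my $url ("http://www.example.com/index.html",
                 "http://www.example.com/private/data.html") {
    printf "%-45s %s\n", $url, $rules->allowed($url) ? "allowed" : "keep out";
}
```

LWP::RobotUA, which ships with libwww-perl, wraps this up: it is a user agent that fetches and obeys robots.txt itself, and it also enforces a delay between requests to the same server.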

Second, if you require more than one page from a server, you should pause after each page that you grab to give other users a chance of a look-in. Chances are that if you have a major robotic program, you'll be looking at pages on many different sites, so you don't actually have to slow your program down - just grab pages from each site in turn.
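
Grabbing pages from each site in turn can be sketched as a round-robin over one queue per site. This is only an illustration under simple assumptions: interleave_by_host is a hypothetical helper, the hostnames are invented, and the host is picked out with a regular expression rather than a URL-parsing module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reorder URLs so that successive requests go to different sites where possible.
# interleave_by_host is a hypothetical helper written for this sketch.
sub interleave_by_host {
    my @urls = @_;
    my (%queue, @hosts);

    # Build one queue per site, remembering the order in which sites first appear
    foreach my $url (@urls) {
        my ($host) = $url =~ m{^https?://([^/:]+)}i
            or next;                          # skip anything that isn't http(s)
        $host = lc $host;
        push @hosts, $host unless $queue{$host};
        push @{ $queue{$host} }, $url;
    }

    # Take one URL from each site in turn until every queue is empty
    my @ordered;
    while (@hosts) {
        my @still_busy;
        foreach my $host (@hosts) {
            push @ordered, shift @{ $queue{$host} };
            push @still_busy, $host if @{ $queue{$host} };
        }
        @hosts = @still_busy;
    }
    return @ordered;
}

my @urls = qw(
    http://a.example/1 http://a.example/2 http://a.example/3
    http://b.example/1 http://b.example/2
);
print "$_\n" for interleave_by_host(@urls);
```

If one site has far more pages than the others, its remaining URLs still end up back to back at the end of the list, so the loop that does the fetching should sleep between requests in that case.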


© WELL HOUSE CONSULTANTS LTD., 2019: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01225 708225 • FAX: 01225 793803 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/solutions/perl-usi ... ients.html • PAGE BUILT: Wed Mar 28 07:47:11 2012 • BUILD SYSTEM: wizard