Do not re-invent the wheel - use a Perl module
"If you think 'surely someone has done this before', you're probably right ... and in Perl, you'll find the resource you need available as a module on your system or, if it's not quite so common, on the CPAN."
I was reminded of this advice today, when I got involved with web site checking ... and rather than writing my own robotic browser in Perl, I used the LWP module (libwww-perl, the "World-Wide Web library for Perl", in case you wondered!)
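To give a flavour of how little code LWP asks for, here is a minimal sketch that fetches one page and echoes its status and content. It assumes libwww-perl is installed; the helper name fetch and the 15 second timeout are made up for illustration and are not taken from any script mentioned here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Fetch one URL and return the numeric status code, the full status
# line (for example "200 OK") and the decoded body.
# "fetch" is a hypothetical helper name, not part of LWP.
sub fetch {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 15);   # timeout is an arbitrary choice
    my $response = $ua->get($url);
    return ($response->code, $response->status_line, $response->decoded_content);
}

# Only run when a URL is given on the command line.
if (@ARGV) {
    my ($code, $status, $body) = fetch($ARGV[0]);
    print "Status: $status\n";
    print $body if defined $body;
}
```

Run it as perl fetchpage http://www.wellho.net (the script name is whatever you saved it as). The user agent object does the connecting, redirect following and header handling that a hand-written browser would otherwise have to cover.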
What can I do with LWP? Well - I have several new examples to show you.
Reporting all the internal and external links from a page - this uses LWP::Simple, standard on my Perl and easy to use
A short example that grabs a page and echoes its content and status, using a minimal series of calls to the more complete LWP module
A script that grabs a web page, then checks all the links from it - a prototype example which needs some more work, but it has already found a broken link to an external site from one of our pages - and such things are very time-consuming to monitor by hand!
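The first item in that list, reporting internal and external links, might look something like the sketch below. It is not the example script itself; it assumes LWP::Simple, HTML::LinkExtor and URI are installed, and the name classify_links and the "same host means internal" rule are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw(get);
use HTML::LinkExtor;
use URI;

# Split the <a href="..."> links found in $html into internal links
# (same host as $base) and external links (any other host).
# "classify_links" is a hypothetical helper name.
sub classify_links {
    my ($html, $base) = @_;
    my $home = lc URI->new($base)->host;
    my (%internal, %external);

    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        return unless $tag eq 'a' && defined $attr{href};
        my $abs = URI->new_abs($attr{href}, $base);          # resolve relative links
        return unless ($abs->scheme // '') =~ /^https?$/;    # skip mailto: and the like
        if (lc($abs->host) eq $home) { $internal{$abs}++ }
        else                         { $external{$abs}++ }
    });
    $parser->parse($html);
    $parser->eof;

    return ([sort keys %internal], [sort keys %external]);
}

# Only run when a URL is given on the command line.
if (@ARGV) {
    my $url  = $ARGV[0];
    my $html = get($url);                 # LWP::Simple returns undef on failure
    die "Could not fetch $url\n" unless defined $html;
    my ($in, $out) = classify_links($html, $url);
    print "Internal: $_\n" for @$in;
    print "External: $_\n" for @$out;
}
```

Comparing host names exactly is the simplest rule but a crude one: www.wellho.net and wellho.net would count as different sites, so a real report may want something looser.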
(written 2009-06-11, updated 2009-06-12)
Here's an example of the sort of output you get from the link-checking script, the last of the three listed above:
Dorothy-2:perl grahamellis$ perl goodlinks http://www.wellhousemanor.co.uk/
Status from http://www.wellhousemanor.co.uk/whm.css is 200
Status from https://lightning.he.net/~wellho/hotel/reservation.php is 500
Status from http://www.wellhousemanor.co.uk/rooms.html is 200
Status from http://www.wellho.net/happens/rooms.php is 200
Status from http://www.wellhousemanor.co.uk/amenities.html is 200
Status from http://www.wellhousemanor.co.uk/events.html is 200
Status from http://www.wellhousemanor.co.uk/contact.html is 200
Status from http://www.westwiltshire.gov.uk/index/env/env-health-service/food-hygiene/scores-on-doors.htm is 404
Status from http://www.wellho.net is 200
Status from http://www.wiltshirebusinessoftheyear.co.uk/ is 200
Status from http://www.aguafabrics.com/default.asp is 200
Status from http://www.hoteldesigns.net/industrynews/news_2745.html is 200
Status from http://www.macformat.co.uk is 200
Status from http://www.wellhousemanor.co.uk/art.html is 200
Status from http://www.tripadvisor.co.uk/ is 200
Status from http://www.tripadvisor.co.uk/Hotel_Review-g528775-d645951-Reviews-Well_House_Manor-Melksham_Wiltshire_England.html is 200
Status from http://www.freeindex.co.uk/profile(Well-House-Consultants-Ltd)_44477.htm is 200
Status from http://validator.w3.org/check is 200
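A report in that format can be produced with surprisingly little code. What follows is not the goodlinks script, just a minimal sketch assuming LWP::UserAgent, HTML::LinkExtor and URI are installed; the names links_in and report_statuses, the timeout and the agent string are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI;

# Collect every link attribute (href, src, ...) from a chunk of HTML,
# resolved against $base, keeping http/https only and dropping duplicates.
# "links_in" is a hypothetical helper name.
sub links_in {
    my ($html, $base) = @_;
    my (@found, %seen);
    my $parser = HTML::LinkExtor->new(sub {
        my ($tag, %attr) = @_;
        for my $name (sort keys %attr) {
            my $abs = URI->new_abs($attr{$name}, $base);
            next unless ($abs->scheme // '') =~ /^https?$/;   # skip mailto: etc.
            $abs->fragment(undef);                            # page#top is the same page
            push @found, "$abs" unless $seen{"$abs"}++;
        }
    });
    $parser->parse($html);
    $parser->eof;
    return @found;
}

# Fetch $page, then request each link on it and print the status code.
sub report_statuses {
    my ($page) = @_;
    my $ua = LWP::UserAgent->new(timeout => 15, agent => 'linkcheck-sketch/0.1');
    my $response = $ua->get($page);
    die "Cannot fetch $page: ", $response->status_line, "\n"
        unless $response->is_success;
    for my $link (links_in($response->decoded_content, $response->base)) {
        print "Status from $link is ", $ua->get($link)->code, "\n";
    }
}

# Only run when a URL is given on the command line.
report_statuses($ARGV[0]) if @ARGV;
```

One thing to bear in mind when reading such a report: LWP generates a 500 response of its own when it cannot connect at all, so a 500 does not always mean the remote server answered with an error. Links using https also need the LWP::Protocol::https module installed.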
Associated topics are indexed as below, or enter http://melksh.am/nnnn for individual articles
P219 - Perl - Libraries and Resources
What do I mean when I add things in Perl? - (2011-08-02) 
The week before Christmas - (2010-12-23) 
Expect in Perl - a short explanation and a practical example - (2010-10-22) 
Syncronise - software, trains, and buses. Please! - (2010-08-22) 
Operator overloading - redefining addition and other Perl tricks - (2009-09-27) 
Loading external code into Perl from a nonstandard directory - (2009-06-12) 
Debugging and Data::Dumper in Perl - (2008-11-02) 
About dieing and exiting in Perl - (2008-11-01) 
Using English can slow you right down! - (2007-11-25) 
Ordnance Survey Grid Reference to Latitude / Longitude - (2007-10-14) 
Outputting numbers as words - MySQL with Perl or PHP - (2007-06-17) 
Judging the quality of contributed Perl code - (2007-06-06) 
Self help in Perl - (2006-06-14) 
Coloured text in a terminal from Perl - (2006-05-29) 
Why reinvent the wheel - (2006-05-06) 
Use standard Perl modules - (2005-06-25) 
Where do Perl modules load from - (2005-06-24) 
Avoid the wheel being re-invented by using Perl modules - (2004-11-08) 
Talk review - Idiomatic Perl, David Cross - (2004-10-12)
P405 - Perl - Web Service - Our Own Client
Automated Browsing in Perl - (2009-09-11)
P408 - Perl - Standard Web Modules
Perl Dancer - from installation to your first real application - (2013-05-24) 
Perl Dancer - a Perl Framework - Installation and first test - (2013-05-23) 
Perl - retrieving and caching web resources - (2011-10-18) 
Automating access to a page obscured behind a holding page - (2009-09-23) 
Answering ALL the delegate's Perl questions - (2006-12-09)
P608 - Perl - Robots, Crawlers and Spiders
Does robots.txt actually work? - (2009-02-16) 
robots.txt - a clue to hidden pages? - (2007-01-13)
This is a page archived from The Horse's Mouth, the diary and writings of Graham Ellis.
Every attempt was made to provide current information at the time the
page was written, but things do move forward in our business - new software
releases, price changes, new techniques. Please check back via
our main site for current courses,
prices, versions, etc - any mention of a price in "The Horse's Mouth"
cannot be taken as an offer to supply at that price.