If your PHP allows remote URLs to be read as if they were files (the allow_url_fopen setting, which is on by default), you have a useful tool which lets you include the content of one web page (or part of it) within another. For example, I can "scrape" the sections of a "coming on a course" page and insert them into another page.
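If you're not sure whether your own installation allows this, a quick check along the following lines should tell you. This is only a sketch: the function name remote_files_enabled is my own invention for illustration, while ini_get and filter_var are standard PHP.

```php
<?php
// Hypothetical helper (the name is made up for this example):
// true when PHP will open http:// URLs with its ordinary file functions.
function remote_files_enabled(): bool {
    // ini_get() gives the setting as a string such as "1", "On" or "",
    // or false if the setting is unknown; filter_var() maps that to a boolean.
    return filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN);
}

if (!remote_files_enabled()) {
    echo "allow_url_fopen is off, so file_get_contents() cannot read remote URLs\n";
}
```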
Here's an example of the mechanism in use ...
1. Grab the page to be scraped:
$lyne = file_get_contents("http://www.wellho.co.uk/net/join.html");
2. Extract the data you want from it:
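$includedtext = "";
// Capture each <dt>...</dt> / <dd>...</dd> pair; the "s" modifier lets "." match newlines too
preg_match_all("!<dt>(.+?)</dt>.*?<dd>(.+?)</dd>!s",$lyne,$here);
for ($k=0; $k<count($here[0]); $k++) {
    // Strip any nested tags, then escape what is left so it is safe to output
    $includedtext .= "<b>".htmlspecialchars(strip_tags($here[1][$k])).
        "</b><br />".htmlspecialchars(strip_tags($here[2][$k])).
        "<br /><br />";
}
3. Use the $includedtext within your code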
You can try this out [here] and see the source code [here].
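If you want to experiment without hitting a remote server each time, the extraction step can be wrapped in a function and fed a literal string. The sketch below does that; the function name definitions_to_html and the sample markup are made up for illustration, and only the regular expression and the strip_tags / htmlspecialchars handling come from the example above.

```php
<?php
// Hypothetical wrapper around steps 2 and 3; it works on any HTML string.
function definitions_to_html(string $html): string {
    $out = "";
    // PREG_SET_ORDER gives one array per match: [whole match, <dt> text, <dd> text]
    preg_match_all("!<dt>(.+?)</dt>.*?<dd>(.+?)</dd>!s", $html, $found, PREG_SET_ORDER);
    foreach ($found as $pair) {
        $out .= "<b>" . htmlspecialchars(strip_tags($pair[1])) . "</b><br />"
              . htmlspecialchars(strip_tags($pair[2])) . "<br /><br />";
    }
    return $out;
}

// Made-up sample markup standing in for the fetched page
$sample = "<dl><dt><a href='#'>Term one</a></dt>\n<dd>First definition</dd>"
        . "<dt>Term two</dt><dd>Second definition</dd></dl>";

echo definitions_to_html($sample), "\n";
```

With a real page you would pass in the string returned by file_get_contents in place of $sample.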
This example comes with a string of cautions ...
1. Do NOT allow just any old URL to be scraped, especially one that your users may enter. This leaves you open to having your content filled with their adverts!
2. If you are scraping the same page regularly and it doesn't change very much, you should cache the results and not make the inquiry every time.
3. Respect the robots exclusion standard (robots.txt) of the remote site that you're scraping, and ensure that you have copyright permission to reproduce the material on your site too.
4. Remember that if the remote site's format changes so that your regular expression no longer matches, you'll have a correction to make on your site PDQ!
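Cautions 1, 2 and 4 can be sketched in code along the following lines. Treat this as one possible approach rather than what runs on our own pages: the names ALLOWED_URLS, is_allowed_url, fetch_cached and pattern_still_matches, the cache file path and the one hour default lifetime are all assumptions of mine for the purpose of the example.

```php
<?php
// Caution 1: only URLs on a fixed list may be scraped.
// The list contents are illustrative; nothing a visitor types in is ever fetched.
const ALLOWED_URLS = ["http://www.wellho.co.uk/net/join.html"];

function is_allowed_url(string $url): bool {
    return in_array($url, ALLOWED_URLS, true);
}

// Caution 2: reuse a saved copy while it is younger than $maxAge seconds.
function fetch_cached(string $url, string $cacheFile, int $maxAge = 3600): string {
    if (!is_allowed_url($url)) {
        return "";
    }
    if (is_file($cacheFile) && time() - filemtime($cacheFile) < $maxAge) {
        return (string) file_get_contents($cacheFile);
    }
    $page = @file_get_contents($url);
    if ($page === false) {
        // Remote site unavailable: fall back to a stale copy if there is one
        return is_file($cacheFile) ? (string) file_get_contents($cacheFile) : "";
    }
    file_put_contents($cacheFile, $page, LOCK_EX);
    return $page;
}

// Caution 4: notice when the remote format has changed and the pattern
// from the example no longer matches anything.
function pattern_still_matches(string $page): bool {
    return preg_match("!<dt>(.+?)</dt>.*?<dd>(.+?)</dd>!s", $page) === 1;
}
```

In use, you would call fetch_cached with an allowed URL and a writable cache file of your choosing, then log or email yourself a warning whenever pattern_still_matches returns false on the result.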
We currently have examples of the use of scraped material on the Melksham Chamber of Commerce home page and also the First Great Western Coffee Shop. Take the power of this facility ... but be careful how you use it!
(written 2009-12-21, updated 2010-01-06)