Suggesting alternative search terms to web site users

How do you report search results to your users? Users are notoriously poor spellers: they'll enter text in the wrong case, use the wrong letters, and join words up where they shouldn't.

On our web site, we wanted to report accurate matches when a user searches, but also to recommend other possible search terms when the search finds little or nothing to report.

A COMPARISON OF PHP "LOOK ALIKE" FUNCTIONS

PHP supports a number of functions that allow the comparison of text strings to see "how similar" they are.

1. soundex returns a 4 character string (a letter followed by three digits) that represents what a word sounds like when pronounced.

2. metaphone creates a variable length character string that also represents what a word sounds like when it's pronounced. It is more accurate than soundex because it applies the basic rules of English pronunciation.

3. similar_text returns the number of matching characters in two strings, and can also report the similarity as a percentage through an optional third parameter.

4. levenshtein returns the number of changes (character deletes, inserts and replaces) that are needed to transform one string into another. It is cheaper to compute than similar_text.
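
As a rough illustration of how the four functions behave, the sketch below compares a word with a phonetic misspelling of it. The two sample words are chosen purely for the example and are not taken from our search logs.

```php
<?php
// Illustrative words only: a correct spelling and a phonetic misspelling.
$word  = "metaphone";
$typed = "metafone";

// soundex: a letter followed by three digits; equal keys suggest a similar sound
$sameSoundex = soundex($word) === soundex($typed);

// metaphone: variable length phonetic key
$sameMetaphone = metaphone($word) === metaphone($typed);

// similar_text: returns the count of matching characters; the percentage
// comes back through the optional third (by reference) parameter
$matching = similar_text($word, $typed, $percent);

// levenshtein: number of single character inserts, deletes and replaces
$distance = levenshtein($word, $typed);

printf("soundex match:   %s\n", $sameSoundex ? "yes" : "no");
printf("metaphone match: %s\n", $sameMetaphone ? "yes" : "no");
printf("similar_text:    %d matching characters, %.1f%%\n", $matching, $percent);
printf("levenshtein:     %d changes\n", $distance);
```

The two phonetic functions give a yes / no answer (same key or not), while similar_text and levenshtein give a score that can be used to rank candidates.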

You can also use low level string functions and regular expressions to compare text strings, although the four functions listed above are probably better suited to finding "words like" the one a user typed.

APPLICATION OF LOOK ALIKE FUNCTIONS TO A WEB SEARCH

There are nearly 40000 different words used on our web site, and our search pages examine over 5000 different pages. Clearly it's not practical to go through every page making "look alike" comparisons each time a user searches, so here's what we do:

a) When our website is updated, we generate a table of all the words used on the site and their metaphone values. A third column in this table records the number of times each word occurs.

b) When a search fails, we look up all the words used on the site that have the same metaphone as the search term, and we offer these with the most used word first.

c) If the user is searching for a single word and there's still very little we can offer, we attempt to make up a matching metaphone using two words. For example, the metaphone for "dresscode" is TRSKT, so we look to make up that string from two shorter metaphones: TRS (shared by words like "dress" and "trees") followed by KT (as in "code").

d) Stage (c) can generate a very large number of options, so we use levenshtein to select those which are nearest to the user's original search term, and to eliminate those which are spelt wildly differently.
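
Steps (a) to (d) can be sketched in a few lines of PHP. This is a minimal illustration under stated assumptions and not the code that runs on the site: the function names (build_word_table, suggest_single, suggest_pairs), the in-memory array standing in for the table, the sample text and the default distance limit of 3 are all hypothetical.

```php
<?php
// (a) Build the table: word => [metaphone key, occurrence count].
// $text stands in for the combined text of the site (an assumption).
function build_word_table(string $text): array
{
    $table = [];
    foreach (str_word_count(strtolower($text), 1) as $word) {
        if (!isset($table[$word])) {
            $table[$word] = ['key' => metaphone($word), 'count' => 0];
        }
        $table[$word]['count']++;
    }
    return $table;
}

// (b) Words with the same metaphone as the search term, most used first.
function suggest_single(array $table, string $term): array
{
    $key  = metaphone($term);
    $hits = [];
    foreach ($table as $word => $row) {
        if ($row['key'] === $key) {
            $hits[$word] = $row['count'];
        }
    }
    arsort($hits);               // highest occurrence count first
    return array_keys($hits);
}

// (c) Word pairs whose metaphone keys join up to the term's key,
// (d) kept only if levenshtein says the spelling is close enough.
function suggest_pairs(array $table, string $term, int $maxDistance = 3): array
{
    $key   = metaphone($term);
    $pairs = [];
    foreach ($table as $first => $rowA) {
        foreach ($table as $second => $rowB) {
            if ($rowA['key'] === '' || $rowB['key'] === '') {
                continue;        // ignore words with no phonetic key
            }
            if ($rowA['key'] . $rowB['key'] !== $key) {
                continue;        // the two keys don't make up the target
            }
            $distance = levenshtein(strtolower($term), $first . $second);
            if ($distance <= $maxDistance) {
                $pairs["$first $second"] = $distance;
            }
        }
    }
    asort($pairs);               // nearest spelling first
    return array_keys($pairs);
}

// Sample text invented for the example.
$table  = build_word_table("dress code dress trees code of conduct dress");
$single = suggest_single($table, "dres");
$pairs  = suggest_pairs($table, "dresscode");

print_r($single);
print_r($pairs);
```

The nested loop in suggest_pairs compares every word with every other word, which is fine for a demonstration but would be slow on a table of 40000 words. A table indexed by metaphone, looked up once for each way of splitting the target key in two, would be one way to keep stage (c) manageable.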


Please note that articles in this section of our web site were current and correct to the best of our ability when published, but by the nature of our business may go out of date quite quickly. The quoting of a price, contract term or any other information in this area of our website is NOT an offer to supply now on those terms - please check back via our main web site

At Well House Consultants, we provide training courses on subjects such as Ruby, Lua, Perl, Python, Linux, C, C++, Tcl/Tk, Tomcat, PHP and MySQL. We're asked (and answer) many questions, and answers to those which are of general interest are published in this area of our site.

© WELL HOUSE CONSULTANTS LTD., 2019: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01225 708225 • FAX: 01225 793803 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/solutions/php-sugg ... users.html • PAGE BUILT: Wed Mar 28 07:47:11 2012 • BUILD SYSTEM: wizard