Do you want to help your web site users find what they're looking for, even if they misspell a name or word in a search? PHP provides you with three facilities - soundex, metaphone and Levenshtein distance calculations - which let you compare two words and see how similar they are when written (levenshtein) or spoken (metaphone, soundex).
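As a quick taster, here's a minimal sketch that runs all three comparisons on a pair of words. The helper name compare_words() is my own invention for illustration, and I'm assuming PHP 7 or later for the type declarations; levenshtein(), soundex() and metaphone() are the PHP built-ins.

```php
<?php
// compare_words() is a hypothetical helper name, not part of PHP.
// It compares two words as written and as spoken.
function compare_words(string $first, string $second): array
{
    return [
        // Number of single-character inserts, deletes or replacements
        // needed to turn one word into the other
        'levenshtein'    => levenshtein($first, $second),
        // True when both words have the same soundex key
        'same_soundex'   => soundex($first) === soundex($second),
        // True when both words have the same metaphone key
        'same_metaphone' => metaphone($first) === metaphone($second),
    ];
}

print_r(compare_words('knight', 'night'));
```

The written comparison gives you a count of steps, while the two spoken comparisons simply tell you whether the sound keys match.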
I've put a demonstration up for you to try, using metaphones and levenshtein. Here's the "engine" at the heart of the code:
// Distance between the words as written, and their sound keys
$ident = levenshtein($first,$second);
$meta1 = metaphone($first);
$meta2 = metaphone($second);
if ($ident) {
    // Non-zero distance - the spellings differ
    print "Words are $ident levenshtein steps out<br>";
    if ($meta1 == $meta2) {
        print "But they sound the same (metaphone $meta1)\n";
    } else {
        // Measure how far apart the two sound keys are
        $id = levenshtein($meta1,$meta2);
        print "They sound different too - metaphones ";
        print "$meta1 and $meta2 are $id steps out\n";
    }
} else {
    print "Words are identical\n";
}
The complete source code is available too if you want to get in deep.
Having learnt how to see if two words are similar, you'll want to know how to make lots of comparisons against a single word when you're writing a search algorithm. That's another day's story perhaps, but it's something that we do as a matter of routine by keeping a database table of metaphones ....
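To give a flavour of the idea, here's a rough sketch that groups words by their metaphone so that a search only needs comparing against words that sound alike. It's an assumption-laden illustration rather than our production code: a plain PHP array stands in for the database table, the function names build_metaphone_index() and find_similar() are made up for this example, and it assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: map each metaphone key to the words sharing it.
// An in-memory array stands in for a database table here.
function build_metaphone_index(array $words): array
{
    $index = [];
    foreach ($words as $word) {
        $index[metaphone($word)][] = $word;
    }
    return $index;
}

// Hypothetical helper: return stored words that sound like $search,
// ordered so the closest spelling comes first.
function find_similar(array $index, string $search): array
{
    $matches = $index[metaphone($search)] ?? [];
    usort($matches, function ($a, $b) use ($search) {
        return levenshtein($search, $a) <=> levenshtein($search, $b);
    });
    return $matches;
}

$index = build_metaphone_index(['Smith', 'Smyth', 'Jones', 'Brown']);
print_r(find_similar($index, 'Smyth'));
```

The metaphone is worked out once per stored word, so each search is a single key lookup followed by a levenshtein sort of the handful of words that share the key.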
(written 2006-03-11 06:26:58)