Searching for numbers
If I'm searching for a 50kg bag of cement, and the online store only offers 48kg bags, will their search engine find this product and ask "is this what you want"? Our own site searches do clever things with alphabetic searches, but we've rarely had to do a "near number" hunt on our own behalf ... though we have for client sites.

Is 48 near to 50? Yes. Is 8 near to 10? Maybe, but not so near. Is 1 near to 3? No - almost certainly not. So you can't rely on the difference alone - indeed, 93 is nearer to 100 than 1 is to 3, even though the difference (7 rather than 2) is much greater.

Algorithm 1.

Let "$h" be the value you have and "$t" be the value you're testing. Then the nearness factor is defined as
abs( ($h + $t) / ($h - $t))
with the larger factor being the closer match. Under this algorithm, two numerically identical values would mean dividing by zero (in effect an infinite factor), so you had better extract that special case first. Higher factors indicate better matches. Let's see some example factors:
48 and 50 - factor is 49
8 and 10 - factor is 9
1 and 3 - factor is 2
93 and 100 - factor is 27.57

Here's Perl code for searching (yes, we have Perl search training) to work this out:

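#!/usr/bin/perl
use strict;
use warnings;

# Usage: pass the value you have, then the value you're testing
my ($h, $t) = @ARGV;
die "Usage: $0 have test\n" unless defined $t;

if ($h == $t) {
    # Identical values would divide by zero, so report them separately
    print "Parameters are numerically identical\n";
} else {
    # Higher factors indicate nearer matches
    printf("factor is %.2f for %s and %s\n",
        abs(($h + $t) / ($h - $t)), $h, $t);
}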
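
To use the factor in an actual search, you'd typically want to rank a list of candidates against the value asked for. Here's a minimal sketch of that, not a definitive implementation: the subroutine names "nearness" and "rank_by_nearness" and the list of bag sizes are made up for illustration, and it assumes positive numbers and a Perl build where 9**9**9 evaluates to infinity (the usual case).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Nearness factor from Algorithm 1. Identical values are treated
# as infinitely near (assumption: 9**9**9 overflows to Inf).
sub nearness {
    my ($h, $t) = @_;
    return 9**9**9 if $h == $t;
    return abs(($h + $t) / ($h - $t));
}

# Return the candidates sorted with the nearest match first.
sub rank_by_nearness {
    my ($wanted, @candidates) = @_;
    return sort { nearness($wanted, $b) <=> nearness($wanted, $a) } @candidates;
}

# Hypothetical bag sizes (kg) on offer; the customer asked for 50.
my @ranked = rank_by_nearness(50, 10, 20, 48, 100);
print "Best matches for 50: @ranked\n";   # prints: Best matches for 50: 48 100 20 10
```

Note that 100 scores higher than 20 here (factor 3 against 2.33) even though 20 is closer by plain difference - the factor is a relative measure, not an absolute one.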


Algorithm 2.

The algorithm above isn't always ideal. If you're searching for phone numbers, for example, it's not helpful: if you've transposed digits, you'll want to score hits on values that are numerically very different. For this, you'll want to use something like a Levenshtein distance algorithm. We talk further about this on our web site in the Solutions Centre.
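
As a rough illustration of the idea, here's a sketch of the standard dynamic-programming Levenshtein distance in Perl, treating the numbers as strings of digits. The subroutine name "levenshtein" and the phone numbers are made up for the example; this isn't necessarily the code used on any particular site. Lower distances mean closer matches, and plain Levenshtein counts a swap of two adjacent digits as two edits (the Damerau-Levenshtein variant would count it as one).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(min);

# Number of single-character insertions, deletions and
# substitutions needed to turn $s into $t.
sub levenshtein {
    my ($s, $t) = @_;
    my @s = split //, $s;
    my @t = split //, $t;

    # Distances from the empty prefix of $s to each prefix of $t
    my @prev = (0 .. scalar @t);

    for my $i (1 .. scalar @s) {
        my @cur = ($i);    # cost of deleting the first $i characters
        for my $j (1 .. scalar @t) {
            my $cost = $s[$i - 1] eq $t[$j - 1] ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution (or match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical numbers: the last two digits have been transposed.
print levenshtein("01632960123", "01632960132"), "\n";   # prints 2
```

A search could then accept any stored number within a small distance (say 2) of the one typed in, however far apart the two are numerically.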


(written 2005-02-04 06:06:02)

Commentator says ...
Graham: This is an offtopic comment. The software running "The Horse's Mouth" has just been upgraded to block bulk automated offtopic contributions. You may now find your comments are noted as being sent "for approval" even if you're a regular on Opentalk - especially if you comment on an old entry. Nothing personal, just trying to avoid providing free advertising for products I don't actually use or recommend.
(comment added 2005-02-04 07:01:33)
Associated topics are indexed under
G902 - Well House Consultants - Web site techniques, utility and visibility



This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2008: Well House Manor • 48 Spa Road • Melksham, Wiltshire • United Kingdom • SN12 7NY
PH: 01144 1225 708225 • FAX: 01144 1225 707126 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho