String Handling in PHP example from a Well House Consultants training course
This example is described in the following articles:
• Looking ahead and behind in a Regular Expression
• Hot answers in PHP
• Helping new arrivals find out about source code examples
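The look-ahead and look-behind technique named in the first article is what the listing uses to break text into words while keeping embedded apostrophes. The following is a minimal sketch of that one step in isolation, not the author's code: the function names split_keeping_apostrophes and words_only are made up for illustration, and the sample sentence is an assumption.

```php
<?php
// Split text at word boundaries, but not where an apostrophe sits
// immediately before or after the boundary, so "don't" stays whole.
// The split is zero-width, so no characters are lost.
function split_keeping_apostrophes(string $text): array {
    return preg_split("/\\b(?<!')(?!')/", $text);
}

// Keep only the elements that contain a letter, mirroring the test the
// listing applies before it looks anything up in its dictionary.
function words_only(array $elements): array {
    return array_values(array_filter(
        $elements,
        fn($cell) => preg_match('/[[:alpha:]]/', $cell) === 1
    ));
}

// Hypothetical sample input
print_r(words_only(split_keeping_apostrophes("Don't stop the dog")));
```

Because the pattern matches only positions and never characters, joining the split elements back together reproduces the input, which is why the listing can rebuild the page with the doubted words highlighted.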
Source code: spell.php Module: H107
<?php
/* This code demonstrates PHP regular expressions
in heavy use - the application is to spell check a
file and report on any spelling mistakes that are
detected. Please note that this is a proof of
concept and the algorithms aren't 100% accurate.
You MUST check the code for suitability before you
use it */
## Sort function for use later
function bycount($first,$second) {
    global $oops;
    $diff = $oops[$second] - $oops[$first];
    if (! $diff) $diff = strcmp($first,$second);
    return $diff;
}
######################################################
# Read in the data for checking and ensure that there's
# no attempt by the user to get to the wrong places when
# used on the live site!
######################################################
$dopageerr = "";
$page = "";
$local = 0;
# ereg() was removed in PHP 7, so preg_match() is used instead
$remote = $_SERVER['REMOTE_ADDR'] ?? '';
if (preg_match('/192\.168\.200/',$remote)) $local = 1;
if (preg_match('/100\.100\.100/',$remote)) $local = 1;
if (preg_match('/127\.0\.0/',$remote)) $local = 1;
if (!empty($_REQUEST['dopage'])) {
    $reqpage = $_REQUEST['page'] ?? '';
    if (preg_match('!(^/)|(\.\.)!',$reqpage) and $local==0) {
        $dopageerr = "<b>File path must be relative and not ascend.".
            " Full http:// URL also acceptable</b><br>";
    } else {
        $page = @file_get_contents($reqpage);
        if (! $page)
            $dopageerr =
                "<b>Sorry - unable to read this resource</b><br>";
    }
}
if (!empty($_REQUEST['dotext'])) {
    # Magic quotes were removed in PHP 5.4, so the submitted text
    # is used as-is rather than being passed through stripslashes()
    $page = $_REQUEST['text'] ?? '';
}
######################################################
# If we have valid data to check, read in the dictionary
######################################################
if ($page) {
    $bad = $good = 0;
    $oops = array();
    $wtab = array();
    $files = array("words","connectives","propernames");
    foreach ($files as $section) {
        foreach (file("$_SERVER[DOCUMENT_ROOT]/../words/$section") as $word) {
            $word = strtolower(trim($word));
            $wtab[$word] = 1;
        }
    }
    foreach (file("extrawords") as $word) {
        # Lines starting with # are comments in the extra word list
        if (preg_match('/^#/',$word)) continue;
        $word = strtolower(trim($word));
        $wtab[$word] = 1;
    }
    ######################################################
    # Strip out JavaScript and tags from source to check
    ######################################################
    $page = preg_replace('/<script(.*?)<\/script>/is'," ",$page);
    $page = preg_replace('!</td>!i'," ",$page); # strip_tags bug
    $page = preg_replace('!<(div|br)[^>]*>!is'," ",$page); # div bug
    $page = strip_tags($page);
    ######################################################
    # Split clean source into words. A ' is allowed when it
    # is embedded within a word
    ######################################################
    $elements = preg_split("/\b(?<!')(?!')/",$page);
    ######################################################
    # Loop through each element and check that, if it contains
    # a letter, it's a valid word.
    ######################################################
    $result = "";
    foreach ($elements as $cell) {
        if (preg_match('/[[:alpha:]]/',$cell)) {
            $ok = 0;
            if (isset($wtab[strtolower($cell)])) {
                $ok = 1;
            } else {
                # Not a valid word. Perhaps it's a derivative?
                # Allow words with leading / trailing (but not embedded) quotes
                # (the s modifier lets . match newlines, as it did with ereg)
                preg_match('/[[:alnum:]]+.*[[:alnum:]]/s',$cell,$c2);
                $celld = $c2[0] ?? "";
                if (isset($wtab[strtolower($celld)])) $ok = 1;
                # allow nails as well as nail, sleeved as well as sleeve
                if (preg_match('/(.*)(s|d)$/is',$celld,$ig)) {
                    if (isset($wtab[strtolower($ig[1])])) $ok = 1;
                }
                # allow miner as well as mine
                if (preg_match('/(.*e)r$/is',$celld,$ig)) {
                    if (isset($wtab[strtolower($ig[1])])) $ok = 1;
                }
                # allow ending, endings, ended, etc as well as end
                if (preg_match("/(.*)(ings?|es|er|ed|'s)\$/is",$celld,$ig)) {
                    if (isset($wtab[strtolower($ig[1])])) $ok = 1;
                }
                # allow tipping as well as tip, fanning as well as fan
                if (preg_match('/(.*)([tnpl])(ings?|ed)$/is',$celld,$ig)) {
                    if (isset($wtab[strtolower($ig[1])])) $ok = 1;
                }
                # allow placing as well as place
                if (preg_match('/(.*)ing$/is',$celld,$ig)) {
                    if (isset($wtab[strtolower($ig[1])."e"])) $ok = 1;
                }
                # allow copied as well as copy
                if (preg_match('/(.*)ied$/is',$celld,$ig)) {
                    if (isset($wtab[strtolower($ig[1])."y"])) $ok = 1;
                }
                # allow copies as well as copy
                if (preg_match('/(.*)ies$/is',$celld,$ig)) {
                    if (isset($wtab[strtolower($ig[1])."ey"])) $ok = 1;
                    if (isset($wtab[strtolower($ig[1])."y"])) $ok = 1;
                }
            }
            if ($ok == 0) {
                # Count the doubted word, starting from zero on first sight
                $key = strtolower($cell);
                $oops[$key] = ($oops[$key] ?? 0) + 1;
                $cell = "<b><font color=red>$cell</font></b>";
                $bad++;
            } else {
                $good++;
            }
        }
        $result .= $cell;
    }
    $page = nl2br($result);
    $page = preg_replace('!(<br />\s*){3,}!s',"<br /><br />\n",$page);
    # The callback name must be a quoted string; a bare word fails in PHP 8
    uksort($oops,'bycount');
    $ohdear = "";
    foreach (array_keys($oops) as $oh) {
        $wordflow = $oops[$oh] == 1 ? "once" : "$oops[$oh] times";
        $ohdear .= "$oh - $wordflow<br>";
    }
}
############## The web page generator
?>
<html><head><title> Web page spell check. <?= htmlspecialchars
($_REQUEST['page'] ?? '') ?></title></head>
<body><h1>Spellcheck a web page or block of html</h1>
--> Link - <a
href=/solutions/php-web-page-and-html-spell-checker.html
>description of this application</a><br>
--> Link - <a href=/resources/ex.php4?item=h107/spell.php>source
code</a><br><br>
<?php if ($page) { ?>
--> Link - <a href=#refill>form to resubmit current
text</a><br>
--> Link - <a href=?>test a new piece of text</a><br><br>
<b>On <?= htmlspecialchars($_REQUEST['page'] ?? '') ?> there were <?= $bad ?> words
that were doubted out of <?= $bad+$good ?>:</b><blockquote>
<?= $ohdear ?></blockquote><br>
<b>Here is the full text of the page with the doubted words
in red</b><br>
<?= $page ?>
<?php } ?><hr>
<b><a name=refill>Select the source you would like to test</a>
</b><br><br> You may enter the full URL of a page you would
like checked, or a block of HTML source. This script will check
the spelling of the words in the page against its English
dictionary and highlight words that do not match. <br><br>
<form method=POST>EITHER a page to test:<br>
<input name=page value="<?=
htmlspecialchars($_REQUEST['page'] ?? '')
?>" size=60><br><?= $dopageerr ?>
<input type=submit name=dopage value=go><br><br>
OR some HTML to check:<br>
<textarea name=text rows=20 cols=60><?=
htmlspecialchars($_REQUEST['text'] ?? '')
?></textarea><br>
<input type=submit name=dotext value=go><br><br>
</form><br><br><hr>
This page is provided by Well House Consultants, who provide
training courses in Perl, Python, PHP, MySQL, Linux and Tcl/Tk
from their base in Melksham, Wiltshire, England. We work
worldwide, though ... this script was written during a quiet
couple of hours in Saudi Arabia. Please consider our <a
href=/course/phfull.html>PHP course</a> if you want to learn
how to write web applications such as this spell checker.
<br><br>
Copyright <?= date("Y") ?>,
<a href=http://www.wellho.net/>Well House Consultants.</a>
</body>
</html>
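The chain of derivative checks in the listing can also be expressed as a table of rules, which makes each suffix pattern easier to test on its own. The sketch below is an illustration under stated assumptions rather than the course code: SUFFIX_RULES, is_known and doubted_counts are hypothetical names, the dictionary is a five-word stand-in for the word files, and only a subset of the listing's rules is included (quote trimming and the "miner" and "ey" cases are left out).

```php
<?php
// Each rule pairs a pattern, whose first group is the candidate stem,
// with the letters to put back on the stem before the dictionary lookup.
const SUFFIX_RULES = [
    ['/^(.*)(s|d)$/',               ''],   // nails -> nail, sleeved -> sleeve
    ["/^(.*)(ings?|es|er|ed|'s)$/", ''],   // endings, ended -> end
    ['/^(.*)[tnpl](ings?|ed)$/',    ''],   // tipping -> tip
    ['/^(.*)ing$/',                 'e'],  // placing -> place
    ['/^(.*)ie[ds]$/',              'y'],  // copied, copies -> copy
];

// True if the word, or a stem produced by one of the rules, is in $wtab.
function is_known(string $word, array $wtab): bool {
    $w = strtolower($word);
    if (isset($wtab[$w])) {
        return true;
    }
    foreach (SUFFIX_RULES as [$pattern, $putBack]) {
        if (preg_match($pattern, $w, $m) && isset($wtab[$m[1] . $putBack])) {
            return true;
        }
    }
    return false;
}

// Count the doubted words: most frequent first, ties in alphabetical
// order, which is the ordering the listing's bycount() produces.
function doubted_counts(array $words, array $wtab): array {
    $oops = [];
    foreach ($words as $word) {
        if (!is_known($word, $wtab)) {
            $key = strtolower($word);
            $oops[$key] = ($oops[$key] ?? 0) + 1;
        }
    }
    // The arrow function captures a copy of $oops, so no global is needed
    uksort($oops, fn($a, $b) => ($oops[$b] <=> $oops[$a]) ?: strcmp($a, $b));
    return $oops;
}

// Hypothetical stand-in for the dictionary files
$wtab = array_fill_keys(['tip', 'place', 'copy', 'end', 'nail'], 1);
print_r(doubted_counts(['Tipping', 'placing', 'copies', 'teh', 'Teh', 'recieve'], $wtab));
```

Keeping the rules separate rather than merging them into one pattern matters: a greedy combined pattern would accept "ending" plus "s" for "endings" and never try "end" plus "ings".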
Other Examples
This example comes from our "String Handling in PHP" training module. You'll find a description of the topic and some other closely related examples on the "String Handling in PHP" module index page.
Web site author
This web site is written and maintained by Well House Consultants.
Purpose of this website
This is a sample program, class demonstration or answer from a training course. Its main purpose is to provide an after-course service to customers who have attended our public, private or on-site courses, but the examples are made generally available under the conditions described below.
Conditions of use
Past attendees on our training courses are welcome to use individual examples in the course of their programming, but must check the examples they use to ensure that they are suitable for their job. Remember that some of our examples show you how not to do things - check in your notes. Well House Consultants take no responsibility for the suitability of these example programs to customers' needs.
This program is copyright Well House Consultants Ltd. You are forbidden from using it for running your own training courses without our prior written permission. See our page on courseware provision for more details.
Any of our images within this code may NOT be reused on a public URL without our prior permission. For bona fide personal use, we will often grant you permission provided that you provide a link back. Commercial use on a website will incur a license fee for each image used - details on request.