If you want to ask whether a string "looks like" another, without being able to give an explicit string for the 'another', then you're probably looking for a regular expression.
REGULAR EXPRESSION ELEMENTS
Regular expressions come in various flavours, and the flavour used in Ruby is "Perl Style" (i.e. it's similar to Perl, Python and the preg functions in PHP, and differs from Tcl and the ereg functions in PHP).
In summary, a regular expression comprises elements such as:
Literals - exact matches
- letters, digits and some special characters match exactly
- special characters can be matched when preceded by a \
Character groups - any one character from a selection
- [abcyh] any character from the list
- [^abcyh] any character not in the list
- [A-Z0-4] any capital letter or digit 0 through 4
- \s \d \w any space, digit or word character
- \S \D \W any non-space, non-digit or non-word character
- . any character at all (may not match \n)
Counts - apply to previous literal, character or group
- {2,6} 2 to 6 occurrences of previous item
- {4,} 4 or more occurrences of previous item
- {5} exactly 5 occurrences of previous item
- + 1 or more of previous item
- * 0 or more of previous item
- ? 0 or 1 of previous item
Anchors (a.k.a. Zero width assertions)
- ^ match here at start of string or line (\A for start of string only)
- $ match here at end of string or line (\z for end of string only)
- \b match here at word boundary
Miscellany
- ( .... ) grouping for capture and counting
- | alternation; "either / or"
- \1 \2 references back to previous groups.
There are more options - but those are the common ones.
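A few of the elements listed above can be tried out in a short script. This is only a minimal sketch: the constant names and the sample strings are made up for illustration and do not come from the examples later in this article.

```ruby
# Character group plus count: one capital letter followed by 2 to 4 digits.
ROOM = /[A-Z]\d{2,4}/
# Anchors: the word "cat" on its own, not as part of "concat".
WHOLE_WORD = /\bcat\b/
# Grouping, alternation and a back reference: "tick tick" or "tock tock".
DOUBLED = /\b(tick|tock) \1\b/

puts "Room B204 is free"[ROOM]        # the matched text, or nil
puts "concat the cat" =~ WHOLE_WORD   # index where the match starts, or nil
puts(("tick tick boom" =~ DOUBLED) ? "doubled" : "not doubled")
puts(("tick tock boom" =~ DOUBLED) ? "doubled" : "not doubled")
```

The back reference \1 must match the same text that the first group captured, which is why "tick tock" is not treated as a doubled word.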
USE OF REGULAR EXPRESSIONS IN RUBY
The =~ operator is the 'match' operator, so I can ask whether a string looks like (matches) a regular expression. The index method on a string will recognise a regular expression and use it to find the position at which the next match starts. The variable $& contains the matched string, and $1, $2 etc contain matched capture groups.
Example:
places = ["Training in Melksham and elsewhere",
          "We are at SN12 6QL (HQ) and SN12 7NY (training centre)",
          "And can train you at even if you're at HS7 5LZ"]

# Matching to see whether or not it fits the pattern
places.each do |place|
  if place =~ /\b[A-Z]{1,2}\d\w?\s+\d[A-Z]{2}\b/
    puts %Q!There's a postcode in "#{place}"!
  end
end

# Making use of the matched string
places.each do |place|
  if place =~ /\b([A-Z]{1,2}\d\w?)\s+\d[A-Z]{2}\b/
    puts %Q!We found #{$&} sorted via #{$1}!
  end
end

# More careful extraction - global matching to regular
# expressions is not brilliant except in very recent
# releases, but the index method on string does very well
places.each do |place|
  sf = 0
  while sfn = place.index(/\b(([A-Z]{1,2}\d\w?)\s+\d[A-Z]{2})\b/, sf)
    sf = sfn + 1
    puts %Q!We found #{$&}!
  end
end
Here is the result of running that program:
earth-wind-and-fire:~/ruby/r109 grahamellis$ ruby rex1.rb
There's a postcode in "We are at SN12 6QL (HQ) and SN12 7NY (tr ..."
There's a postcode in "And can train you at even if you're at HS7 5LZ"
We found SN12 6QL sorted via SN12
We found HS7 5LZ sorted via HS7
We found SN12 6QL
We found SN12 7NY
We found HS7 5LZ
earth-wind-and-fire:~/ruby/r109 grahamellis$
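In current Ruby versions, the scan method on a string is another way of collecting every match in one call, instead of stepping through with index. The following is a sketch rather than a replacement for the program above; the constant POSTCODE_ALL and the helper postcodes_in are names invented for this illustration.

```ruby
# The postcode pattern from the example, written without capture groups
# so that scan returns the whole matched strings.
POSTCODE_ALL = /\b[A-Z]{1,2}\d\w?\s+\d[A-Z]{2}\b/

# Hypothetical helper: returns an array of every postcode found in text.
def postcodes_in(text)
  text.scan(POSTCODE_ALL)
end

place = "We are at SN12 6QL (HQ) and SN12 7NY (training centre)"
postcodes_in(place).each do |code|
  puts %Q!We found #{code}!
end
```

If the pattern contains capture groups, scan returns an array of the groups for each match rather than the whole match, which is why the groups have been left out here.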
Ruby also allows you to produce compiled regular expression objects (class Regexp) and MatchData objects, and to use those for more efficient and more sophisticated matching. Methods such as split can use regular expressions too:
DATA.read.each_line do |host|
  print host
  stuff = host.split(/[\s,]+/)
  ip = stuff.shift
  stuff.each do |name|
    print "#{ip} may be called #{name}\n"
  end
end
__END__
192.168.200.66 earth
192.168.200.67 fire, sea pickle
192.168.200.68 wind blows
Results:
earth-wind-and-fire:~/ruby/r109 grahamellis$ ruby rex2.rb
192.168.200.66 earth
192.168.200.66 may be called earth
192.168.200.67 fire, sea pickle
192.168.200.67 may be called fire
192.168.200.67 may be called sea
192.168.200.67 may be called pickle
192.168.200.68 wind blows
192.168.200.68 may be called wind
192.168.200.68 may be called blows
earth-wind-and-fire:~/ruby/r109 grahamellis$
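As a rough illustration of the regular expression objects and MatchData objects mentioned earlier, here is a sketch that builds the postcode pattern once with Regexp.new and then asks the resulting MatchData for its details. The group names outward and inward, the constant POSTCODE_RE and the method describe are assumptions made for this example only.

```ruby
# Build the pattern once and reuse it. Single quotes keep the backslashes
# intact. The named groups are illustrative labels.
POSTCODE_RE = Regexp.new('\b(?<outward>[A-Z]{1,2}\d\w?)\s+(?<inward>\d[A-Z]{2})\b')

# Hypothetical helper: summarise the first postcode in text, or return nil.
def describe(text)
  m = POSTCODE_RE.match(text)   # a MatchData object, or nil if no match
  return nil unless m
  "#{m[0]} at offset #{m.begin(0)}, sorted via #{m[:outward]}"
end

puts describe("And can train you at even if you're at HS7 5LZ")
puts describe("Training in Melksham and elsewhere").inspect
```

Unlike $& and $1, which are overwritten by the next match, a MatchData object can be kept in a variable and passed around for as long as it is needed.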
Although we've been using / delimiters for regular expressions, you can if you prefer use %r! through ! (or replace the ! with any other special character) in much the same way as %Q and %q.
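The %r form is handy when the pattern itself contains slashes. A small sketch, with constant names chosen just for this illustration:

```ruby
# The same pattern written with each kind of delimiter.
# With %r!...! the slashes in the pattern need no backslash in front.
SLASHED = /\/ruby\/r\d+/
PERCENT = %r!/ruby/r\d+!

path = "~/ruby/r109"
puts path =~ SLASHED   # index where the match starts
puts path =~ PERCENT   # same index, same pattern
```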