Ruby - a teaching example showing many of the language features in a short but useful program

Although the main publicity and driver for the Ruby language has been the Rails web framework (see previous article here), it's an excellent data manipulation language too. It offers many of the short and efficient coding techniques that you would have available to you in Perl, and adds a neat object oriented design that has been built in from its conception, making it easy to code and - importantly - easy to debug and maintain.

On the course that's just finished, my delegates were going to be using Ruby primarily within Rails, but also for some substantial data manipulation work away from the web. That's a good approach, as it allows them to reuse code across the two different (overnight batch and web interactive) environments. And Ruby's also excellent for data munging - reformatting and filtering, very often short tasks and, in some environments, tasks that will change in what they need to do.

Here's a question that I set my delegates:

"Given a file of staff members' names and their skills - such as

  morris Perl Java PHP Tcl/Tk
  nigel PHP Python Java Perl
  orpheus MySQL Ruby,Tcl/Tk XML
  peter PHP Java Perl


produce a sorted list of skills, with a list of team members' names alongside each skill - here's a part of the report:

  Python: adam barry harry hazel ken leane nigel olivia rupert
  Ruby: barbara charles cherry ed florence hazel ivan jenny kerry
        len margaret nina orpheus petra que


If you want to try this question for yourself before I do a step by step answer, you may download the data from [here]

Solution

Open the file, creating a file handle object. Each time you run the gets method on this file handle object, you'll get the next line back and you'll get back a nil when you have run out of data:
  fh = File.new "requests.xyz"
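
If you'd like to watch gets at work without having the data file to hand, here's a minimal sketch. It assumes a StringIO object standing in for the file handle (that's my substitution, purely to keep the example self-contained); a real File object answers gets in the same way.

```ruby
require "stringio"

# StringIO stands in for File.new "requests.xyz" (an assumption, so that
# this sketch runs without the data file being present).
fh = StringIO.new("morris Perl Java\nnigel PHP Python\n")

first  = fh.gets   # "morris Perl Java\n" - the new line is still on the end
second = fh.gets   # "nigel PHP Python\n"
third  = fh.gets   # nil - we've run out of data

puts first.inspect, second.inspect, third.inspect
```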

We'll need a table to store our skills, and alongside each skill a list of people with that skill. We'll use a hash, as that's a key / value pair table in Ruby, where the key can be just about anything. Hash.new would create a new, empty hash. However, there's a shorthand for that which is {}, and that's what I've used in my example:
  langs = {}
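
As a quick illustration of those points (the keys and values here are made up for the sketch, they're not from the data file):

```ruby
langs = {}          # the shorthand
other = Hash.new    # the longhand - also a new, empty hash

langs["Ruby"] = ["orpheus"]   # a string key with an array as its value
langs[:count] = 1             # a symbol will do as a key ...
langs[42]     = "answer"      # ... and so will a number

p langs.keys       # ["Ruby", :count, 42]
p langs["Perl"]    # nil - asking for a key that isn't there is not an error
```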

We want to loop through all the data in the incoming file, so I've written a line of code which sets up a loop that keeps reading lines of data while they're available, using the file handle already opened. Once we get a nil back, the loop will exit.
  while fh.gets
You'll notice that I've not explicitly saved the line that was read into a variable. That's because I'm using a feature sometimes known as topicalisation. In many circumstances, where you don't give Ruby an explicit variable name, it will assume you mean a special global variable called $_. So in this case, each line in turn is read into that variable.
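
Here's a small sketch that makes $_ visible. Again I've assumed a StringIO object in place of the real file so that it runs on its own; the seen array is just there for the demonstration.

```ruby
require "stringio"

# Stand-in for the data file (an assumption for this sketch).
fh = StringIO.new("peter PHP Java Perl\nnigel PHP Python\n")

seen = []
while fh.gets
  seen << $_     # gets has quietly stored the line it read in $_
end

p seen
```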

All lines read will end with a new line, which you'll want to remove. You could use chop or chomp. These methods both run on a string object ... and if you don't specify that object, they'll run on $_ instead (that's Ruby 1.8 behaviour; from Ruby 1.9 onwards you need to name the object, as in $_.chomp!). So
  chomp!
removes any trailing new line from $_ (chop, by contrast, removes the last character whatever it is). chomp returns a new string - but you can use the alternative chomp! as I have done, which alters the incoming string object in situ - i.e. I'm altering the value in $_ by removing any new line on the end of it.
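
The differences between chop, chomp and chomp! are easier to see with a named string. This sketch calls each method on an explicit object, which is the form you'll need on Ruby 1.9 and later, where as far as I'm aware the receiver-less versions are no longer provided.

```ruby
line = "peter PHP Java Perl\n".dup   # dup gives us a string we're free to alter

copy      = line.chomp               # a new string without the new line
unchanged = line.end_with?("\n")     # true - line itself hasn't been touched

line.chomp!                          # alters line in situ
result = line.chomp!                 # nil - there was nothing left to remove

chopped = "Perl".chop                # "Per"  - chop takes the last character, whatever it is
chomped = "Perl".chomp               # "Perl" - chomp only takes a trailing new line

p copy, line, result, chopped, chomped
```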

My data file has a series of space delimited fields on each line, but if you look at the raw data carefully you'll find that there are sometimes multiple spaces, and sometimes a comma appears as well as, or instead of, a space. We'll deal with that by splitting (our string in $_ as we've not given an object on which the method is to run) at a "regular expression" - i.e. at a pattern:
  fields = split(/[, ]+/)
Regular expressions are usually written between slashes, and contain a number of elements each of which may be followed by a count of the number of times that element is to occur. In this example, [, ] specifies that we're looking for a comma or a space (it's a character group, often called a character class, when you come to learn about regular expressions), and + specifies that we're looking for one or more occurrences of that group. Finally, split returns an array of strings that we've saved into the variable called fields.
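
To show why the pattern matters, here's a sketch that splits the same line two ways. The line is based on the orpheus record, but I've added the doubled space and extra comma myself to make the point.

```ruby
# An awkward line: a double space, a comma, and a comma followed by a space
# (the exact spacing is made up for this sketch).
line = "orpheus MySQL  Ruby,Tcl/Tk, XML"

fields = line.split(/[, ]+/)
p fields    # ["orpheus", "MySQL", "Ruby", "Tcl/Tk", "XML"]

naive = line.split(" ")
p naive     # ["orpheus", "MySQL", "Ruby,Tcl/Tk,", "XML"] - the commas are left in
```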

Shifting an array returns the first element to us, and moves all the others up. So:
  name = fields.shift
will strip off the first element (that's the staff member's name) and save it into a new variable, reducing the length of the array by one. Perfect!
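
A short sketch of shift on the fields from one of the sample lines:

```ruby
fields = ["nigel", "PHP", "Python", "Java", "Perl"]

name = fields.shift

p name     # "nigel"
p fields   # ["PHP", "Python", "Java", "Perl"] - one element shorter
```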

We're left with the fields array containing purely the list of skills (languages) and so we can loop through that list of skills in turn. This is the for loop - taking each member of an array into a separate variable in turn (strictly, copying a reference to the object in each member of the array):
  for lang in fields
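
Many Ruby programmers would write this loop using each and a block instead. That's an alternative rather than what I've used in the example; this sketch shows the two side by side, with via_for and via_each as made-up names for the results.

```ruby
fields = ["PHP", "Java", "Perl"]

via_for = []
for lang in fields
  via_for << lang.upcase
end

via_each = []
fields.each { |skill| via_each << skill.upcase }

p via_for    # ["PHP", "JAVA", "PERL"]
p via_each   # ["PHP", "JAVA", "PERL"]
p lang       # "Perl" - the for loop's variable is still around afterwards
```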

We're going to build up our table of languages in the hash we created at the top of our program. If the current language is one we've just found for the first time in this run of the program, we need to create an empty array within the hash:
  langs[lang] ||= []
We're using the "lazy or" assignment operator ||=, which assigns an empty array object into the hash element in question only if that element doesn't already hold something (strictly, if it's nil or false), thus creating that element. Languages such as Perl will usually assume an empty array (ok - it's called a list in Perl!) in such a circumstance through what's known as autovivification, but in Ruby you have to specifically create an object before you can modify it. To some extent, that's a side effect of the underlying object oriented basis of Ruby, but it also provides very practical assistance during the development phase of programs where variable name mis-spellings and failures to initialise are quickly rooted out.
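
Here's a sketch of what happens with and without that initialisation. The last few lines show a hash with a default block, which is a different technique that gets close to Perl's behaviour; I've not used it in the example, and the names auto and error_seen are just for this demonstration.

```ruby
langs = {}

# Without initialisation there's nothing to push onto: langs["Perl"] is nil.
error_seen = nil
begin
  langs["Perl"].push "morris"
rescue NoMethodError => e
  error_seen = e.class
end

langs["Perl"] ||= []           # first sighting: creates the empty array
langs["Perl"].push "morris"
langs["Perl"] ||= []           # already there: left alone
langs["Perl"].push "nigel"

p langs                        # {"Perl"=>["morris", "nigel"]}

# Alternative (not used in the example): let the hash create missing arrays.
auto = Hash.new { |hash, key| hash[key] = [] }
auto["PHP"].push "peter"
p auto
```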

Now that we've ensured that there is a member of the hash for the language skill of the currently named person, we can simply add their name onto the end of that list.
  langs[lang].push name
No need to look at a count of how many names there are already, as the push method simply says "add onto the end".
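
A brief sketch of push, along with the << operator which does the same job for a single item (the names are taken from the sample data):

```ruby
team = ["morris"]

team.push "nigel"            # add onto the end
team << "peter"              # << also adds one item onto the end
team.push "adam", "barry"    # push will take several items at once

p team
```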

And that's about it for the code to set up the hash of lists. All that remains is for us to close the loop through the skills for each person:
  end
and to close the loop that reads in all the lines of the file.
  end

The purist will be asking "should we now close the input file?". Maybe, but Ruby will do it for us automatically when we exit the program after a few more lines anyway.
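
For the purist, here's a sketch of the two usual ways to make sure a file is closed. It assumes a temporary file standing in for the data file, so it can be run anywhere; the variable names are made up for the demonstration.

```ruby
require "tempfile"

closed_by_hand = closed_by_block = nil

# The temporary file stands in for requests.xyz (an assumption for this sketch).
Tempfile.create("skills") do |tmp|
  tmp.write("morris Perl Java\n")
  tmp.flush

  fh = File.new(tmp.path)
  fh.gets
  fh.close                        # closing explicitly
  closed_by_hand = fh.closed?

  kept = nil
  File.open(tmp.path) do |auto|   # the block form closes the file for us
    kept = auto
    auto.gets
  end
  closed_by_block = kept.closed?
end

p closed_by_hand, closed_by_block
```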

Having read in all the lines, let's output our new report, language by language.

We want to take the languages in some sort of order - and alphabetic seems appropriate. However, a hash isn't held in alphabetic order of its keys (in Ruby 1.8 it appears jumbled up to the human eye, and from Ruby 1.9 it keeps the order in which the keys were added), and the hash itself cannot be sorted in situ (I spend a few minutes showing you why on our Ruby courses). So we'll use the keys method to give us an array of the keys. This can be sorted, so we'll run the sort method on it, and we'll run the following loop with each member of the resulting array in turn:
  for lang in langs.keys.sort
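
As a sketch of keys followed by sort, using a small hand-built hash rather than the one read from the file:

```ruby
langs = { "Ruby" => ["orpheus"], "PHP" => ["peter", "nigel"], "MySQL" => ["orpheus"] }

sorted = langs.keys.sort
p sorted         # ["MySQL", "PHP", "Ruby"]

p langs.keys     # the hash itself is untouched by the sort
```

Note that this sort compares strings character by character, so capital letters come before lower case ones.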

Within each member of my hash, I have an array of people's names, and I want those names sorted in alphabetic order too. In this case, I can sort them in situ using sort! rather than sort (see the pattern here - like chomp and chomp! earlier):
  langs[lang].sort!
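
The same distinction, sketched on a short array of names:

```ruby
names = ["peter", "nigel", "morris"]

copy = names.sort            # a new, sorted array
before = names.dup           # names is still in its original order at this point

names.sort!                  # now names itself is sorted

p copy     # ["morris", "nigel", "peter"]
p before   # ["peter", "nigel", "morris"]
p names    # ["morris", "nigel", "peter"]
```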

And having sorted I can now output my results. And it turns out that the formatting I've decided to do makes this the longest line of the program:
  puts "#{lang}: #{langs[lang].join " "}".gsub(/(.{60,}?)\s/,"\\1\n ")
So - what's all that about? I'm outputting the language, followed by a list of the names of all the people who know the language, that list being constructed from an array of names using join. The result, though, can be a very long line indeed. So I've used a regular expression to find the first white space character after the 60th character, and replace it with a new line and some space to split the line up. Because I used gsub rather than sub, the substitution is then repeated after the next block of 60 characters, and so on until the whole string has been divided up in this way. If we just split every 60 characters, this would be messy with names being split between lines, but by looking forward for the next space we're generating really neat output.
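
Here's that substitution on its own, applied to the Ruby line from the sample report, so you can see where the break falls. The 60th character lands in the middle of "kerry", so the pattern carries on to the space after it.

```ruby
names = %w[barbara charles cherry ed florence hazel ivan jenny kerry
           len margaret nina orpheus petra que]
line = "Ruby: #{names.join(" ")}"

# At least 60 characters (as few as possible), then the next white space
# character, which is replaced by a new line and a space.
wrapped = line.gsub(/(.{60,}?)\s/, "\\1\n ")

puts wrapped
# Ruby: barbara charles cherry ed florence hazel ivan jenny kerry
#  len margaret nina orpheus petra que
```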

And, finally, we need to close our output loop:
  end

That's a very long explanation of a short piece of code.

"Could I write code like that?" you may ask. In time, and with practice, most people can. Once you're trained, skilled and experienced, code such as this only takes a few minutes to write in Ruby. Indeed, I'll go so far as to say that it's unlikely it could be quicker in another language. It would be significantly slower in Java and much slower in C, but then very large Java systems will probably be more maintainable in their future, and C programs can always run faster if you invest enough time into writing them.

Here's the final code:

  fh = File.new "requests.xyz"
  langs = {}
  
  while fh.gets
    chomp!
    fields = split(/[, ]+/)
    name = fields.shift
    for lang in fields
      langs[lang] ||= []
      langs[lang].push name
    end
  end
  
  for lang in langs.keys.sort
    langs[lang].sort!
    puts "#{lang}: #{langs[lang].join " "}".gsub(/(.{60,}?)\s/,"\\1\n ")
  end
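
The code above leans on $_ with receiver-less chomp! and split, which is Ruby 1.8 style. For current Ruby releases, here's a sketch of the same logic with every object named explicitly. It's one possible rewrite under my own assumptions, not the course answer: skills_report is a made-up helper name, and a StringIO with the four sample lines stands in for the data file.

```ruby
require "stringio"

# Returns the report lines, one string per skill, from any object that
# responds to each_line (a File, or the StringIO used below).
# skills_report is a hypothetical name chosen for this sketch.
def skills_report(source)
  langs = {}
  source.each_line do |line|
    fields = line.chomp.split(/[, ]+/)
    name = fields.shift
    next if name.nil?               # skip any blank lines
    fields.each do |lang|
      langs[lang] ||= []
      langs[lang].push name
    end
  end

  langs.keys.sort.map do |lang|
    "#{lang}: #{langs[lang].sort.join(" ")}".gsub(/(.{60,}?)\s/, "\\1\n ")
  end
end

data = StringIO.new(<<~DATA)
  morris Perl Java PHP Tcl/Tk
  nigel PHP Python Java Perl
  orpheus MySQL Ruby,Tcl/Tk XML
  peter PHP Java Perl
DATA

report = skills_report(data)
puts report
```

To run it against the real data, you could pass File.open("requests.xyz") in place of the StringIO object.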


Complete source code (without the comments!) and full sample output also available [here].
(written 2012-06-09, updated 2012-06-16)

 

This is a page archived from The Horse's Mouth at http://www.wellho.net/horse/ - the diary and writings of Graham Ellis. Every attempt was made to provide current information at the time the page was written, but things do move forward in our business - new software releases, price changes, new techniques. Please check back via our main site for current courses, prices, versions, etc - any mention of a price in "The Horse's Mouth" cannot be taken as an offer to supply at that price.


© WELL HOUSE CONSULTANTS LTD., 2019: 404 The Spa • Melksham, Wiltshire • United Kingdom • SN12 6QL
PH: 01225 708225 • EMAIL: info@wellho.net • WEB: http://www.wellho.net • SKYPE: wellho

PAGE: http://www.wellho.net/mouth/3757_Rub ... ogram.html • PAGE BUILT: Sat May 27 16:49:10 2017 • BUILD SYSTEM: WomanWithCat