Troubleshooters.Com, Code Corner and Ruby Revival Present

Ruby Basic Tutorial
Copyright (C) 2005 by Steve Litt
Note: All materials in Ruby Revival are provided AS IS. By reading the materials in Ruby Revival you are agreeing to assume all risks involved in the use of the materials, and you are agreeing to absolve the authors, owners, and anyone else involved with Ruby Revival of any responsibility for the outcome of any use of these materials, even in the case of errors and/or omissions in the materials. If you do not agree to this, you must not read these materials.
To the 99.9% of you honest readers who take responsibility for your own actions, I'm truly sorry it is necessary to subject all readers to the above disclaimer.



About this Tutorial

This is a Ruby tutorial for those who don't yet know Ruby. Therefore, we use many constructs and styles that, while familiar to programmers and intuitive to beginners, are not optimal for Ruby. A companion document, Ruby the Right Way, discusses how to use Ruby to full advantage and have your code compatible with the vast body of Ruby code out there.

Ruby can be used as a fully object oriented language, in which case you'd create classes and objects to accomplish everything. However, it can be used quite nicely with only the objects and classes that ship with Ruby, in which case it can be used as a procedural language, except that functions are typically methods of the program's variables.

If that doesn't make any sense to you, don't worry, it's just a way of saying that Ruby can be very easy to learn and use.

Even if you want to become a Ruby expert, you need to learn the basic functionality before you can become a Ruby OOP ninja. This tutorial gives you those basics.


Hello World

This is the simplest possible Ruby program, hello.rb. As you'd expect, it prints "Hello World" on the screen. Be sure to set it executable.

#!/usr/bin/ruby
print "Hello World\n"

Although this program works as expected, it goes against the philosophy of Ruby because it's not object oriented. But as a proof of concept that Ruby's working on your computer, it's just fine.

Besides print, there's also a puts method. The difference is that puts automatically inserts a newline at the end of the string being printed, whereas print does not. In other words, puts is more convenient, but print is necessary if separate statements print to the same line. Throughout this tutorial we'll use both print and puts.

Loops

Let's count to 10...

#!/usr/bin/ruby
for ss in 1...10
  print ss, " Hello\n"
end

The ellipsis (...) indicates the range through which to loop. The for is terminated by an end. You don't need braces for a loop. Whew!

The following is the output:

[slitt@mydesk slitt]$ ./loop.rb
1 Hello
2 Hello
3 Hello
4 Hello
5 Hello
6 Hello
7 Hello
8 Hello
9 Hello
[slitt@mydesk slitt]$

Notice that it stops on 9. The number following the ellipsis causes termination at the top of the loop. The 1...10 means 1 TO BUT NOT INCLUDING 10, it does NOT mean 1 through 10. Please remember this when using Ruby loops.

NOTE

There are actually two versions of the ellipsis operator, the three period version as shown previously, and the two period version. The two period version is inclusive. In other words, 1...3 means 1 up to but not including 3, while 1..3 means 1 through 3.

By using the appropriate version of the ellipsis operator you can save having to code convoluted end conditions.
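
A quick way to see the difference for yourself is to turn each range into an array with to_a. This is only a small sketch, and the variable names are made up for illustration:

```ruby
# Variable names here are illustrative, not part of the tutorial's programs.
inclusive = (1..3).to_a    # two periods: the end value is included
exclusive = (1...3).to_a   # three periods: the end value is left out

print "1..3  gives ", inclusive.inspect, "\n"
print "1...3 gives ", exclusive.inspect, "\n"
```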


Now let's iterate through an array.

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
for ss in 0...presidents.length
  print ss, ": ", presidents[ss], "\n"
end

We defined an array of presidents using a Perl like syntax (except we used brackets instead of parens), and we iterated from 0 (Ruby is 0 based, like most languages), through the final subscript in the presidents array. Remember, the triple dot stops before executing the final number, which is why it doesn't count to 6. If you had wanted it to count to 6 (which in this case would have walked off the end of the array), you would have used the double dot. The output of the preceding code follows:

[slitt@mydesk slitt]$ ./loop.rb
0: Ford
1: Carter
2: Reagan
3: Bush1
4: Clinton
5: Bush2
[slitt@mydesk slitt]$

Now let's iterate backwards through the array, using the fact that array[-1] is the last item, array[-2] is the second to last, etc:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
for ss in 0...presidents.length
  print ss, ": ", presidents[-ss - 1], "\n"
end

The preceding program produces the following output:

[slitt@mydesk slitt]$ ./hello.rb
0: Bush2
1: Clinton
2: Bush1
3: Reagan
4: Carter
5: Ford
[slitt@mydesk slitt]$

Of course, the preceding was a very contrived example just to demonstrate negative subscripts. Note that the subscripts no longer match the presidents, which is probably not what you want. You probably want to do it like you'd do it in any language -- tweak the subscript to the end value and decrease it:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
for ss in 0...presidents.length
ss_tweaked = presidents.length - ss - 1
print ss_tweaked, ": ", presidents[ss_tweaked], "\n"
end

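If you're familiar with C, Pascal or Perl, you're probably disappointed you couldn't just use presidents.length...0. A range whose start is greater than its end is empty, so such a loop would never execute; ranges only iterate up. (Ruby does offer other ways to count down, such as the downto method of integers and the reverse_each method of arrays.)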
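
If you do need to count down, one option is the downto method that Ruby's integers provide. The following is a minimal sketch rather than the one true way; the empty_range and visited variables exist only to show what happens:

```ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]

# A range whose start is larger than its end contains nothing at all.
empty_range = (presidents.length...0).to_a

# downto counts from the receiver down to its argument, inclusive.
visited = []   # illustrative: records the subscripts in the order visited
(presidents.length - 1).downto(0) do |ss|
  print ss, ": ", presidents[ss], "\n"
  visited << ss
end
```

Here the subscripts stay matched to the presidents, and no tweaked subscript variable is needed.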

Iterators and Blocks

Another way to loop through an array is to use an iterator (the each method in the following code) and a block (the part enclosed in braces):

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each {|prez| puts prez}

In the preceding code, the block argument (prez) contains the current array element, and everything else until the closing brace contains code to operate on the block argument. The block argument is always enclosed in vertical lines (pipe symbols). The following is the output of the preceding code:

[slitt@mydesk slitt]$ ./hello.rb
Ford
Carter
Reagan
Bush1
Clinton
Bush2
[slitt@mydesk slitt]$
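
If you also want the subscript, as in the earlier for loops, arrays have an each_with_index iterator that hands the block two arguments. A small sketch (the lines array is only there to collect the text before printing it):

```ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]

lines = []   # illustrative: collects one line of text per president
presidents.each_with_index {|prez, ss| lines << ss.to_s + ": " + prez}
puts lines
```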


The block needn't be on one line:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each {
|prez|
puts prez
}

As shown in the previous examples, you can define the block by enclosing it in curly braces. You can also define it by enclosing it in a do and an end, where the do replaces the opening brace, and the end replaces the closing brace:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each do
|prez|
puts prez
end

Personally, I greatly prefer the do/end syntax for multiline blocks, because as a Perl/C/C++ guy I have a very different perception of braces than their limited use in Ruby, and also because of all the brace placement religious wars I've endured (I'm a Whitesmith type guy myself). However, on short single line blocks, using the braces saves valuable line space. From what I understand, the methods are interchangeable in features and performance, with one small exception...

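Speaking of performance, if you declare the block argument outside the block (in other words, make it a local variable), Ruby 1.8 and earlier reuse that variable, which can improve performance because Ruby needn't recreate a variable every iteration. HOWEVER, under those versions the loop overwrites the value of the variable, so it's best to use a specific variable for that purpose, and not use it for other purposes within the subroutine. (In Ruby 1.9 and later, block arguments are always local to the block, so the outer variable is left untouched and the last line of the following program prints -99 rather than 10.) Here's an example of using a local variable as a block argument, with output from Ruby 1.8: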

#!/usr/bin/ruby
i = -99
puts "Before: " + i.to_s
(1..10).each{|i| puts i}
puts "After : " + i.to_s
[slitt@mydesk slitt]$ ./loop.rb          
Before: -99
1
2
3
4
5
6
7
8
9
10
After : 10
[slitt@mydesk slitt]$

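If you use a local variable for a block argument under Ruby 1.8, do so only in loops with huge numbers of iterations, and use only variables that are specifically intended to serve as block arguments and nothing else. Under Ruby 1.9 and later the block argument is a separate variable, so neither the speedup nor the side effect applies.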
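
On Ruby 1.9 and later, a block argument is local to the block even when an outer variable has the same name. The following sketch assumes such a Ruby version; the total variable is made up to show that a block can still change outer variables that are not its arguments:

```ruby
i = -99
total = 0

# The block argument i exists only inside the block (Ruby 1.9 and later),
# so the outer i is left alone. total is not a block argument, so the
# block works on the outer variable.
(1..10).each {|i| total += i}

puts "After : " + i.to_s
puts "Total : " + total.to_s
```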

A Difference Between {} and do/end

As mentioned, there's one small difference between brace enclosed blocks and do/end enclosed blocks: Braces bind tighter. Watch this:

#!/usr/bin/ruby
my_array = ["alpha", "beta", "gamma"]
puts my_array.collect {
|word|
word.capitalize
}
puts "======================"
puts my_array.collect do
|word|
word.capitalize
end
[slitt@mydesk slitt]$ ./test.rb
Alpha
Beta
Gamma
======================
alpha
beta
gamma
[slitt@mydesk slitt]$

The braces bound tightly like this:

puts(my_array.collect {|word| word.capitalize})

Whereas do/end binds more loosely, like this:

puts(my_array.collect) do |word| word.capitalize end

In the latter, the block is handed to puts, which ignores it, and collect runs with no block at all. (The output shown is from Ruby 1.8, where collect without a block returned a copy of the array. In Ruby 1.9 and later it returns an Enumerator, so puts prints something like #<Enumerator: ...> instead.) You can make do/end bind to collect by wrapping the entire call, block included, in the parentheses of puts. Alternatively, by assigning the iterator's results to a new array, that array can be used. It's one more variable and one more line of code. If the code is short, use braces. If it's long, the added overhead is so small a percentage that it's no big deal:

#!/usr/bin/ruby
my_array = ["alpha", "beta", "gamma"]
puts my_array.collect {
|word|
word.capitalize
}
puts "======================"
new_array = my_array.collect do
|word|
word.capitalize
end
puts new_array
[slitt@mydesk slitt]$ ./test.rb
Alpha
Beta
Gamma
======================
Alpha
Beta
Gamma
[slitt@mydesk slitt]$

Generally speaking, if you want to directly use the result of iterators, use braces. For longer blocks, do/end is more readable, and the overhead for the extra variable and line of code is trivial.
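
To watch the binding rules in action without relying on what puts prints, here's a hedged sketch. The show method is hypothetical: it stands in for puts, ignores any block it receives, and returns its argument as an array so the result can be inspected:

```ruby
# show is a hypothetical stand-in for puts. It ignores any block passed
# to it and returns its argument converted to an array.
def show(list)
  list.to_a
end

my_array = ["alpha", "beta", "gamma"]

# Braces: the block belongs to collect.
tight = show my_array.collect {|word| word.capitalize}

# do/end: the block goes to show, so collect runs without a block.
loose = show my_array.collect do |word|
  word.capitalize
end

# do/end inside the call's parentheses: the block belongs to collect again.
wrapped = show(my_array.collect do |word|
  word.capitalize
end)

puts tight.inspect
puts loose.inspect
puts wrapped.inspect
```

Only the middle call leaves the words uncapitalized, because its block never reached collect.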

while Loops

All the loops previously discussed looped through either an array or a set of numbers. Sometimes you need a more generic loop. That's when you use a while loop:

#!/usr/bin/ruby
ss = 4
while ss > 0
puts ss
ss -= 1
end
puts "======================"
while ss < 5
puts ss
ss += 1
break if ss > 2
end
puts "======================"
ss = 5
while ss > 0
puts ss
ss -= 2
if ss == 1
ss += 5
end
end
[slitt@mydesk slitt]$ ./loop.rb
4
3
2
1
======================
0
1
2
======================
5
3
6
4
2
[slitt@mydesk slitt]$

The first while loop iterated from 4 down to 1, quitting when ss became 0 and hit the while condition. The second loop was intended to iterate up to 4 and quit when 5 was encountered, but a break statement inside the loop caused it to terminate after printing 2 and then incrementing to 3. This demonstrates the break statement.

The third loop was intended to loop from 5 down to 1, quitting after printing 1 and then decrementing. However, the statement in the body of the loop added 5 when it reached 1, pushing it back up to 6, so it had to count down again. On the second countdown, the numbers were even, so it didn't trigger the if statement. This shows that unlike Pascal, it's OK to tamper with the loop variable inside the loop.

Branching

Looping is one type of flow control in pure procedural languages. The other is branching. The following program implements an array called democrats and another called republicans. Depending on the command line argument, the program prints either the Democratic presidents since 1974, the Republican presidents since 1974, or an appropriate error message.

#!/usr/bin/ruby
democrats = ["Carter", "Clinton"]
republicans = ["Ford", "Reagan", "Bush1", "Bush2"]
party = ARGV[0]
if party == nil
print "Argument must be \"democrats\" or \"republicans\"\n"
elsif party == "democrats"
democrats.each { |i| print i, " "}
print "\n"
elsif party == "republicans"
republicans.each { |i| print i, " "}
print "\n"
else
print "All presidents since 1974 were either Democrats or Republicans\n"
end

Note the if, elsif, else and end keywords, and how they delineate the branching. Note also the democrats.each syntax, which is a very shorthand way of iterating through an array, assuming what you want to do to each element can be stated succinctly.

One last note. The error handling in the preceding would be much better handled by exceptions, but they haven't been covered yet.

Like Perl, the if keyword can follow the action instead of preceding it:

#!/usr/bin/ruby
democrats = ["Carter", "Clinton"]
republicans = ["Ford", "Reagan", "Bush1", "Bush2"]
party = ARGV[0]
if party != nil
democrats.each { |i| print i, " "} if party == "democrats"
republicans.each { |i| print i, " "} if party == "republicans"
print "All presidents since 1974 were either Democrats or Republicans\n"\
if (party != "democrats" && party != "republicans")
end

The preceding is a very contrived program to showcase using the if keyword after the action. Note the following:
  1. The if keyword must be on the same line as the action
  2. Only a single action can precede the if keyword. Multiple actions separated by semicolons will do quite unexpected things.
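
The second point is easy to trip over, so here's a small sketch of what actually happens. The log array is made up purely to record which actions ran:

```ruby
party = "democrats"
log = []   # illustrative: records which actions actually ran

# Only the statement directly in front of the if is conditional,
# so "first" is logged even though the test is false.
log << "first"; log << "second" if party == "republicans"

# Parentheses group both statements under the one if, so neither runs.
(log << "third"; log << "fourth") if party == "republicans"

puts log.inspect
```

If you need several actions under one condition, the ordinary if ... end form is usually easier to read than the parenthesized version.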

Containers

Containers are entities that contain other entities. Ruby has two native container types, arrays and hashes. Arrays are groups of objects ordered by subscript, while hashes are groups of key->value pairs. Besides these two native container types, you can create your own container types.

Arrays

You've already seen how to initialize an array and how to use the each method to quickly iterate each element:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.each { |i| print i, "\n"}


[slitt@mydesk slitt]$ ./array.rb          
Ford
Carter
Reagan
Bush1
Clinton
Bush2
[slitt@mydesk slitt]$

Now let's manipulate the array, starting by deleting the last three presidents:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.pop
presidents.pop
presidents.pop
presidents.each { |i| print i, "\n"}

The pop method deletes the final element and returns it. If you were to assign the return value of pop to a variable, that variable would hold the element just removed from the array. In the preceding code, you pop the last three presidents. Here is the result:

[slitt@mydesk slitt]$ ./array.rb          
Ford
Carter
Reagan
[slitt@mydesk slitt]$

Now let's prepend the previous three presidents, Kennedy, Johnson and Nixon:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.pop
presidents.pop
presidents.pop
presidents.unshift("Nixon")
presidents.unshift("Johnson")
presidents.unshift("Kennedy")
presidents.each { |i| print i, "\n"}

The result is as expected:

[slitt@mydesk slitt]$ ./array.rb          
Kennedy
Johnson
Nixon
Ford
Carter
Reagan
[slitt@mydesk slitt]$

However, you might not like the idea of prepending in the reverse order. In that case, prepend all three at once:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
presidents.pop
presidents.pop
presidents.pop
presidents.unshift("Kennedy", "Johnson", "Nixon")
presidents.each { |i| print i, "\n"}

Ruby arrays have methods shift, unshift, push, and pop:

METHOD: push
ACTION: Appends its argument(s) to the end of the array.
ARGUMENT: Element(s) to be appended to the end of the array.
RETURNS: The array itself, AFTER the action was taken.

METHOD: pop
ACTION: Returns the last element in the array and deletes that element.
ARGUMENT: None.
RETURNS: The last element of the array.

METHOD: shift
ACTION: Returns the first element of the array, deletes that element, and shifts all other elements down one location to fill its empty spot.
ARGUMENT: None.
RETURNS: The first element in the array.

METHOD: unshift
ACTION: Shifts all elements of the array up, and places its argument(s) at the beginning of the array.
ARGUMENT: Element(s) to be prepended to the start of the array.
RETURNS: The array itself, AFTER the action was taken.
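
Here's a short sketch of what each of the four methods hands back. In current Ruby, pop and shift return the removed element, while push and unshift return the array itself; the variable names are made up for illustration:

```ruby
presidents = ["Ford", "Carter", "Reagan"]

last_one  = presidents.pop     # removes and returns the last element
first_one = presidents.shift   # removes and returns the first element

# push and unshift return the array itself, so calls can be chained.
returned = presidents.push("Clinton")
presidents.unshift("Nixon").push("Bush2")

puts last_one
puts first_one
puts presidents.inspect
```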

You can assign individual elements of an array:
#!/usr/bin/ruby
presidents = []
presidents[2] = "Adams"
presidents[4] = "Madison"
presidents[6] = "Adams"
presidents.each {|i| print i, "\n"}
print "=======================\n"
presidents[6] = "John Quincy Adams"
presidents.each {|i| print i, "\n"}
print "\n"

The preceding code produces this output:

[slitt@mydesk slitt]$ ./array.rb
nil
nil
Adams
nil
Madison
nil
Adams
=======================
nil
nil
Adams
nil
Madison
nil
John Quincy Adams

[slitt@mydesk slitt]$

The length of the array is determined by the last assigned element, even if that element was assigned nil. That can be very tricky, especially because if you read past the end of the array it returns nil. Be careful.
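
A minimal sketch of those two traps, plus one way out. The compact method, which returns a copy of the array without its nil elements, is not used elsewhere in this tutorial:

```ruby
presidents = []
presidents[2] = "Adams"

puts presidents.length            # subscripts 0, 1 and 2 now exist
puts presidents[10].inspect       # reading past the end gives nil, not an error
puts presidents.compact.inspect   # a copy with the nil elements dropped
```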

You can insert an element by assignment, as shown in the preceding code. If you assign to an element that already exists, you simply change its value, as we changed "Adams" to "John Quincy Adams".

Another thing you can do is get a slice of an array.

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
p123=presidents[1..3]
p123.each { |i| print i, "\n"}

Notice this time I used the two period version of the ellipsis operator, so you'd expect it to list Carter, Reagan and Bush1, and indeed it does. The preceding slice produces the following output:

[slitt@mydesk slitt]$ ./array.rb
Carter
Reagan
Bush1
[slitt@mydesk slitt]$

Another way to slice an array is with a start and a count instead of a range. The following is another way to write basically the same code as the preceding code:

#!/usr/bin/ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]
p123=presidents[1,3]
p123.each { |i| print i, "\n"}


The preceding used a starting subscript of 1 and a count of 3, instead of a range 1 through 3.
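
To compare the slicing forms side by side, here's a small sketch. The variable names are illustrative:

```ruby
presidents = ["Ford", "Carter", "Reagan", "Bush1", "Clinton", "Bush2"]

by_range  = presidents[1..3]    # subscripts 1 through 3
by_count  = presidents[1, 3]    # start at subscript 1, take 3 elements
exclusive = presidents[1...3]   # subscripts 1 and 2 only
tail      = presidents[-2, 2]   # start 2 from the end, take 2 elements

puts by_range.inspect
puts by_count.inspect
puts exclusive.inspect
puts tail.inspect
```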

You can also use slices in insertions, deletions and replacements, and you can insert/replace with elements or whole arrays. Our first example deletes unneeded elements from the middle of an array:

#!/usr/bin/ruby
numbers = ["one", "two", "buckle", "my", "shoe", "three", "four"]
numbers.each { |i| print i, "\n"}
print "=====================\n"
numbers[2,3]=[]
numbers.each { |i| print i, "\n"}

In the preceding, we have extraneous elements "buckle", "my" and "shoe", which we want to delete. So we replace element 2, for a count of 3 (element 2 and the next 2, in other words), with an empty array, effectively deleting them. The result follows:


[slitt@mydesk slitt]$ ./array.rb
one
two
buckle
my
shoe
three
four
=====================
one
two
three
four
[slitt@mydesk slitt]$

Next, let's replace three numeric representations with their spelled out equivalents, plus add in another element we had forgotten:
#!/usr/bin/ruby
numbers = ["one", "two", "3", "4", "5", "seven"]
numbers.each { |i| print i, "\n"}
print "=====================\n"
numbers[2,3]=["three", "four", "five", "six"]
numbers.each { |i| print i, "\n"}

You can see we deleted the three numerics, and then added the four spelled out versions in their place. Here's the output:

[slitt@mydesk slitt]$ ./array.rb
one
two
3
4
5
seven
=====================
one
two
three
four
five
six
seven
[slitt@mydesk slitt]$

But what if you don't want to replace anything -- what if you just want to insert in the middle? No problem -- use 0 for the count...

#!/usr/bin/ruby
numbers = ["one", "two", "five"]
numbers.each { |i| print i, "\n"}
print "=====================\n"
numbers[2,0]=["three", "four"]
numbers.each { |i| print i, "\n"}

The only trick here is that with a count of 0 nothing is deleted, and the insertion occurs BEFORE the element at the starting subscript. In this case "three" and "four" land in front of "five", which was element 2. Here is the output:

[slitt@mydesk slitt]$ ./array.rb
one
two
five
=====================
one
two
three
four
five
[slitt@mydesk slitt]$

Because a zero count slice inserts BEFORE the starting point, numbers[0,0]=[...] prepends to the array. The unshift method does the same thing and reads more clearly.
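
A zero count slice puts the new elements in front of the element at the starting subscript, so a start of 0 prepends. Here's a hedged sketch showing that form next to unshift; the letters array is made up for the comparison:

```ruby
numbers = ["three", "four"]
numbers[0, 0] = ["one", "two"]   # zero count at subscript 0: nothing is replaced
puts numbers.inspect

letters = ["c", "d"]
letters.unshift("a", "b")        # the same effect using unshift
puts letters.inspect
```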

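You can construct an array from a parenthesized range by calling the range's to_a method. Without to_a you get a Range object rather than an Array, although each works on both:

#!/usr/bin/ruby
myArray = (0..9).to_a
myArray.each{|i| puts i}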
[slitt@mydesk slitt]$ ./array.rb
0
1
2
3
4
5
6
7
8
9
[slitt@mydesk slitt]$
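
The class method shows the difference between a bare range and the array built from it. A tiny sketch, with made-up variable names:

```ruby
my_range = (0..9)        # still a Range object
my_array = (0..9).to_a   # to_a builds a real Array from the range

puts my_range.class
puts my_array.class
puts my_array.length
```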

Finally, remembering that Ruby is intended to be an object oriented language, let's look at some of the more common methods associated with arrays (which are really objects in Ruby):

#!/usr/bin/ruby
numbers = Array.new
numbers[3] = "three"
numbers[4] = nil
print "Class=", numbers.class, "\n"
print "Length=", numbers.length, "\n"
numbers.each { |i| print i, "\n"}

The Array.new method creates numbers as an empty array. You could have done the same thing with numbers=[]. The next line assigns the text "three" to the element with subscript 3, thereby setting the element and also setting the array's length. The next line sets the element whose subscript is 4 to nil, which, when you view the output, will prove that the length method returns one plus the subscript of the last assigned element, even if it's assigned nil. This, in my opinion, could cause trouble.

The class method returns the variable's class, which in a non-oop language could be thought of as its type.  The following is the output:

[slitt@mydesk slitt]$ ./hello.rb
Class=Array
Length=5
nil
nil
nil
three
nil
[slitt@mydesk slitt]$

We've gone through arrays in great detail, because you'll use them regularly. Now it's time to review Ruby's other built in container class...

Hashes

There are two ways to think of a hash:
  1. A set of key->value pairs
  2. An array whose subscripts aren't necessarily ordered or numeric
Both of the preceding are correct, and do not conflict with each other.

#!/usr/bin/ruby
litt = {"lname"=>"Litt", "fname"=>"Steve", "ssno"=>"123456789"}
print "Lastname : ", litt["lname"], "\n"
print "Firstname : ", litt["fname"], "\n"
print "Social Security Number: ", litt["ssno"], "\n"
print "\n"
litt["gender"] = "male"
litt["ssno"] = "987654321"
print "Corrected Social Security Number: ", litt["ssno"], "\n"
print "Gender : ", litt["gender"], "\n"
print "\n"
print "Hash length is ", litt.length, "\n"
print "Hash class is ", litt.class, "\n"

In the preceding, we initialized the hash with three elements whose keys were lname, fname and ssno. We later added a fourth element whose key was gender, as well as correcting the value of ssno. The class and length methods do just what we'd expect, given our experience from arrays. This hash could be thought of as a single row in a database table. Here is the result:

[slitt@mydesk slitt]$ ./hash.rb
Lastname : Litt
Firstname : Steve
Social Security Number: 123456789

Corrected Social Security Number: 987654321
Gender : male

Hash length is 4
Hash class is Hash
[slitt@mydesk slitt]$


Better yet, hash values can be objects of other classes. For instance, consider a hash of hashes:

#!/usr/bin/ruby
people = {
"torvalds"=>{"lname"=>"Torvalds", "fname"=>"Linus", "job"=>"maintainer"},
"matsumoto"=>{"lname"=>"Matsumoto", "fname"=>"Yukihiro", "job"=>"Ruby originator"},
"litt"=>{"lname"=>"Litt", "fname"=>"Steve", "job"=>"troubleshooter"}
}

keys = people.keys

for key in 0...keys.length
print "key : ", keys[key], "\n"
print "lname: ", people[keys[key]]["lname"], "\n"
print "fname: ", people[keys[key]]["fname"], "\n"
print "job : ", people[keys[key]]["job"], "\n"
print "\n\n"
end

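Here's the output from Ruby 1.8, where the order of a hash's keys is arbitrary. Ruby 1.9 and later keep keys in insertion order, so there torvalds would print first and litt last: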

[slitt@mydesk slitt]$ ./hash.rb
key : litt
lname: Litt
fname: Steve
job : troubleshooter


key : matsumoto
lname: Matsumoto
fname: Yukihiro
job : Ruby originator


key : torvalds
lname: Torvalds
fname: Linus
job : maintainer


[slitt@mydesk slitt]$

Basically, you just implemented the equivalent of a database table, whose rows correspond to Litt, Matsumoto and Torvalds, and whose columns are lname, fname and job. There are probably a dozen better ways to actually print this information, but at this point I'm still learning Ruby, so I did it with a distinctively Perl accent. Perhaps that's a good thing -- it proves that Ruby follows ordinary programming logic in addition to its many wonderful features.
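
One of those other ways, offered as a sketch rather than as the definitive Ruby idiom, is to iterate over the sorted keys with each instead of subscripting a keys array. Sorting the keys also makes the order predictable. The visited array is made up to record the order:

```ruby
people = {
  "torvalds"=>{"lname"=>"Torvalds", "fname"=>"Linus", "job"=>"maintainer"},
  "matsumoto"=>{"lname"=>"Matsumoto", "fname"=>"Yukihiro", "job"=>"Ruby originator"},
  "litt"=>{"lname"=>"Litt", "fname"=>"Steve", "job"=>"troubleshooter"}
}

visited = []   # illustrative: records the order in which keys were printed
people.keys.sort.each do |key|
  person = people[key]   # the inner hash for this key
  print "key  : ", key, "\n"
  print "lname: ", person["lname"], "\n"
  print "fname: ", person["fname"], "\n"
  print "job  : ", person["job"], "\n"
  print "\n"
  visited << key
end
```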

Sorting Hashes

You sort hashes by converting them to 2 dimensional arrays -- an array of key/value pairs, and then sorting them. The sort method does just that. Here's an example:
#!/usr/bin/ruby -w

h = Hash.new
h['size'] = 'big'
h['color'] = 'red'
h['brand'] = 'ford'

av = h.sort{|a,b| a[1] <=> b[1]}
ak = h.sort{|a,b| a[0] <=> b[0]}
ak.each do
|pair|
print pair[0]
print "=>"
print pair[1]
puts
end
puts "=============="
av.each do
|pair|
print pair[0]
print "=>"
print pair[1]
puts
end
[slitt@mydesk ~]$ ./test.rb
brand=>ford
color=>red
size=>big
==============
size=>big
brand=>ford
color=>red
[slitt@mydesk ~]$

Notice that often a simple <=> comparison does not suffice, and you actually need to write your own comparison logic to establish collation order. Simply write a block (or a function called from the block) taking two arguments (a and b) that returns 1 when a sorts after b, -1 when a sorts before b, and 0 when they are equivalent.
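
As an example of a case where plain <=> falls short, consider values that differ in capitalization. Uppercase letters sort ahead of lowercase ones, so the following sketch lowercases both sides before comparing. The hash contents are made up (note the capital F in Ford):

```ruby
h = {"size"=>"big", "color"=>"red", "brand"=>"Ford"}

# Plain comparison: "Ford" sorts first because uppercase precedes lowercase.
plain = h.sort {|a, b| a[1] <=> b[1]}

# Case-insensitive comparison: compare lowercased copies of the values.
nocase = h.sort {|a, b| a[1].downcase <=> b[1].downcase}

puts plain.inspect
puts nocase.inspect
```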

Tests and Info Requests on Hashes

has_key?(key)
Tests whether the key is present in the hash. Synonyms: include?(key), key?(key) and member?(key).
has_value?(value)
Tests whether any element of the hash has the value, returning true or false. Synonym: value?(value).
index(value)
Returns the key for an element with the value. If multiple elements have that value, you get the key of whichever one is found first. Later Ruby versions name this method key(value).
select {|key, value| block}
Returns the key/value pairs for which block evaluates true, as an array of pairs in Ruby 1.8 and as a hash in Ruby 1.9 and later. Example: h.select {|k,v| v < 200}
empty?
Returns true if the hash has no key/value pairs.
inspect
Returns the contents of the hash as a string.
invert
Returns a new hash with keys and values switched.
length
Returns the number of key/value pairs. Synonym: size().
sort {|a, b| block}
Returns an array of key/value pairs sorted according to block, as described in the preceding section.
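
Here's a brief sketch exercising several of these methods on made-up data. The to_a after select is there so the result is an array of pairs whichever Ruby version you run:

```ruby
prices = {"apple"=>100, "pear"=>150, "melon"=>300}   # made-up data

puts prices.has_key?("apple")
puts prices.key?("grape")
puts prices.has_value?(300)
puts prices.empty?
puts prices.length

cheap = prices.select {|k, v| v < 200}.to_a   # pairs whose value is under 200
inverted = prices.invert                       # values become keys

puts cheap.inspect
puts inverted.inspect
```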

Strings

String is a class that ships with Ruby. The String class has a huge number of methods, such that memorizing them all would be futile. If you really want a list of them all, go to http://www.rubycentral.com/book/ref_c_string.html, but don't say I didn't warn you.

What I'd like to do here is give you the 10% of strings you'll need for 90% of your work. By the way, Ruby has regular expressions, and that will be covered in the following section. This section covers only Ruby's String class methods.

Let's start with string assignment and concatenation:

#!/usr/bin/ruby
myname = "Steve Lit"
myname_copy = myname
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"
print "\n=========================\n"
myname << "t"
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"

The double less than sign is a Ruby String overload for concatenation. If all goes well, we'll change the original string but the copy won't change. Let's verify that:

[slitt@mydesk slitt]$ ./string.rb
myname = Steve Lit
myname_copy = Steve Lit

=========================
myname = Steve Litt
myname_copy = Steve Litt
[slitt@mydesk slitt]$

Oh, oh, it changed them both. String assignment copies by reference, not by value. Do you think that might mess up your loop break logic?

Use the String.new() method instead:

#!/usr/bin/ruby
myname = "Steve Lit"
myname_copy = String.new(myname)
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"
print "\n=========================\n"
myname << "t"
print "myname = ", myname, "\n"
print "myname_copy = ", myname_copy, "\n"

Here's the proof that it works the way you want it:

[slitt@mydesk slitt]$ ./hello.rb
myname = Steve Lit
myname_copy = Steve Lit

=========================
myname = Steve Litt
myname_copy = Steve Lit
[slitt@mydesk slitt]$
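
Another way to get an independent copy is the dup method, which every Ruby object has. The sketch below puts plain assignment and dup side by side; the variable names are made up:

```ruby
myname = "Steve Lit"
same_object = myname       # a second name for the very same string
real_copy   = myname.dup   # a separate string with the same contents

myname << "t"              # appends in place

print "myname      = ", myname, "\n"
print "same_object = ", same_object, "\n"
print "real_copy   = ", real_copy, "\n"
```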

One really nice thing about the Ruby String class is it works like an array of characters with respect to splicing:

#!/usr/bin/ruby
myname = "Steve was here"
print myname[6, 3], "\n"
myname[6, 3] = "is"
print myname, "\n"

[slitt@mydesk slitt]$ ./string.rb
was
Steve is here
[slitt@mydesk slitt]$

This gets more powerful when you introduce the index string method, which returns the subscript of the first occurrence of a substring:


#!/usr/bin/ruby
mystring = "Steve was here"
print mystring, "\n"

substring = "was"
start_ss = mystring.index(substring)
mystring[start_ss, substring.length] = "is"
print mystring, "\n"

In the preceding, the start point for replacement was the return from the index method, and the count to replace is the return from the length method (on the search text). The result is a generic replacement:

[slitt@mydesk slitt]$ ./string.rb
Steve was here
Steve is here
[slitt@mydesk slitt]$

Naturally, in real life you'd need to add code to handle cases where the search string wasn't found.
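
The index method returns nil when the substring isn't there, so the check can be as simple as the following sketch. It tries one substring that exists and one that doesn't; the second search text is made up:

```ruby
mystring = "Steve was here"

["was", "will be"].each do |substring|
  start_ss = mystring.index(substring)
  if start_ss == nil
    # Not found: leave the string alone rather than assigning at a nil subscript.
    print "\"", substring, "\" not found\n"
  else
    mystring[start_ss, substring.length] = "is"
  end
end

print mystring, "\n"
```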

You already saw in-place concatenation with the << method, but in addition there's the more standard plus sign concatenation:

#!/usr/bin/ruby
mystring = "Steve" + " " + "was" + " " + "here"
print mystring, "\n"
[slitt@mydesk slitt]$ ./string.rb
Steve was here
[slitt@mydesk slitt]$

If the addition sign means to add strings together, it's natural that the multiplication sign means string together multiple copies:

#!/usr/bin/ruby
mystring = "Cool " * 3
print mystring, "\n"
[slitt@mydesk slitt]$ ./string.rb
Cool Cool Cool
[slitt@mydesk slitt]$

Do you like the sprintf() command in C? Use the % method in Ruby:

#!/usr/bin/ruby
mystring = "There are %6d people in %s" % [1500, "the Grand Ballroom"]
print mystring, "\n"
[slitt@mydesk slitt]$ ./string.rb
There are   1500 people in the Grand Ballroom
[slitt@mydesk slitt]$
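
A few more format specifiers, sketched with made-up values. The square brackets are only there to make the padding visible:

```ruby
padded  = "[%6d]" % 1500                    # right justified in a field 6 wide
left    = "[%-6d]" % 1500                   # a minus sign left justifies
rounded = "%.2f" % 3.14159                  # two digits after the decimal point
several = "%s has %d items" % ["cart", 3]   # several values go in an array

puts padded
puts left
puts rounded
puts several
```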

You can compare strings:

#!/usr/bin/ruby
print "frank" <=> "frank", "\n"
print "frank" <=> "fred", "\n"
print "frank" <=> "FRANK", "\n"
[slitt@mydesk slitt]$ ./hello.rb
0
-1
1
[slitt@mydesk slitt]$

Here are some other handy string methods:


mystring.capitalize
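Returns new string equal to mystring except that the first character is uppercase and the rest are lowercase. Note that this is not title case: only the first letter of the string is capitalized, not the first letter of every word.
mystring.capitalize!
Capitalizes in place.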
mystring.center(mynumber)
Returns a new string mynumber long with mystring centered within it. If mynumber is already less than the length of mystring, returns a copy of mystring.
mystring.chomp
Returns a new string equal to mystring except that a newline at the end, if there is one, is deleted. If chomp has an argument, that argument serves as the record separator, replacing the default newline.
mystring.chomp!
Same as chomp, but in place. Equivalent of Perl chomp().
mystring.downcase
Returns new string equal to mystring but entirely lower case.
mystring.downcase!
In place modifies mystring, making everything lower case.
mystring.reverse
Returns new string with all characters reversed. IOWA becomes AWOI.
mystring.reverse!
Reverses mystring in place.
mystring.rindex(substring)
Returns the subscript of the last occurrence of the substring. Like index except that it returns the last instead of first occurrence. This method actually has more options, so you might want to read the documentation.
mystring.rjust(mynumber)
Returns a copy of mystring, except the new copy is mynumber long, and mystring is right justified in that string. If mynumber is smaller than the original length of mystring, it returns an exact copy of mystring.
mystring.split(pattern, limit)
Returns a new array with parts of the string split wherever pattern was encountered as a substring. If limit is given, returns at most that many elements in the array.
mystring.strip
Returns a new string that is a copy of mystring except all leading and trailing whitespace have been removed.
mystring.to_f
Returns the floating point number represented by mystring. Returns 0.0 if it's not a valid number, and never raises exceptions. Careful!
mystring.to_i
Returns an integer represented by mystring. Non-numerics at the end are ignored. Returns 0 on invalid numbers, and never raises exceptions. Careful!
mystring.upcase
Returns a new string that's an uppercase version of mystring.
mystring.upcase!
Uppercases mystring in place.





There are many, many more methods, but the preceding should get you through most programming tasks. If you end up using Ruby a lot, it would help to learn all the methods.
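
Here's a quick sketch that runs several of the preceding methods on made-up strings, so you can see the results for yourself. The square brackets make the padding visible:

```ruby
puts "steve LITT".capitalize      # only the first character ends up uppercase
puts "IOWA".reverse
puts "[" + "hi".center(6) + "]"
puts "[" + "hi".rjust(5) + "]"
puts "  padded  ".strip
puts "a,b,c".split(",").inspect
puts "12abc".to_i                 # trailing non-numerics are ignored
puts "abc".to_i                   # not a number at all: 0, with no exception
puts "3.7xyz".to_f

name = "steve"
name.upcase!                      # the ! version changes name itself
puts name
```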

A word about mystring.split(pattern). What about the reverse -- turning an array into a string? Try this:

#!/usr/bin/ruby
mystring=""
presidents = ["reagan", "bush1", "clinton", "bush2"]
presidents.each {|i| mystring << i+" "}
mystring.strip!
print mystring, "\n"
[slitt@mydesk slitt]$ ./string.rb
reagan bush1 clinton bush2
[slitt@mydesk slitt]$

Here's a version that turns it into a comma delimited string with quotes:

#!/usr/bin/ruby
mystring=""
presidents = ["reagan", "bush1", "clinton", "bush2"]
presidents.each {|i| mystring << "\"" + i + "\", "}
mystring[mystring.rindex(", "), 2] = ""
print mystring, "\n"
[slitt@mydesk slitt]$ ./string.rb
"reagan", "bush1", "clinton", "bush2"
[slitt@mydesk slitt]$
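
Ruby's Array class also has a join method, which is the direct counterpart of split. The following sketch produces the same two strings as the loops above. The variable names spaced and quoted are just for illustration:

```ruby
#!/usr/bin/ruby
presidents = ["reagan", "bush1", "clinton", "bush2"]

# join puts the separator between elements only, so nothing needs trimming.
spaced = presidents.join(" ")

# map wraps each name in double quotes before joining with a comma and space.
quoted = presidents.map { |name| "\"#{name}\"" }.join(", ")

puts spaced
puts quoted
```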

You now know most of the Ruby string techniques you need for the majority of your work. Well, except for regular expressions, of course...

Regular Expressions


NOTE

This section assumes you understand the concept of regular expressions. If you do not, there are many fine regular expression tutorials on the web, including this one on my Litt's Perls of Wisdom subsite.

Regular expressions make life so easy, often replacing 100 lines of code with 5. Perl is famous for its easy to use and intuitive regular expressions.

Ruby is a little harder because most regular expression functionality is achieved by a regular expression object that must be instantiated. However, you CAN test for a match the same as in Perl:

#!/usr/bin/ruby
string1 = "Steve was here"
print "e.*e found", "\n" if string1 =~ /e.*e/
print "Sh.*e found", "\n" if string1 =~ /Sh.*e/
[slitt@mydesk slitt]$ ./regex.rb
e.*e found
[slitt@mydesk slitt]$


Here's the code to actually retrieve the first match of /w.ll/ in the string:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
if string1 =~ /(w.ll)/
print "Matched on ", $1, "\n"
else
puts "NO MATCH"
end
[slitt@mydesk slitt]$ ./regex.rb
Matched on will
[slitt@mydesk slitt]$

This is almost exactly like Perl. You put parentheses in the regular expression to make a group, perform the regular expression search with the =~ operator, and then the match for the group is contained in the $1 variable. If there had been multiple groups in the regular expression, their matches would also have been available in $2, $3, and so on, up to the number of groups in the regular expression.


The more OOPish method of doing all this is to instantiate a new Regexp object and use its methods to gain the necessary information:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/w.ll/)
matchdata = regex.match(string1)
if matchdata
puts matchdata[0]
puts matchdata[1]
else
puts "NO MATCH"
end

[slitt@mydesk slitt]$ ./hello.rb
will
nil
[slitt@mydesk slitt]$

If you change /w.ll/ to /z.ll/, which of course does not match because there's not a "z" in string1, the output looks like this:

[slitt@mydesk slitt]$ ./hello.rb
NO MATCH
[slitt@mydesk slitt]$

The preceding example shows how to do a complete regex match in Ruby. Start by creating a regular expression object using Regexp.new(). Then use that object's match method to find a match and return it in a MatchData object. Test that the MatchData object exists, and if it does, get the first match (matchdata[0]). The reason we also printed matchdata[1] was to show that, in the absence of groups surrounded by parentheses, the match method returns only a single match. Later you'll see a special way to return all matches of a single regular expression.

Another thing to notice is that greediness plays no part in this match. The pattern /w.ll/ contains no quantifier: the dot matches exactly one character, so any match is exactly four characters long, and the leftmost one is "will". Ruby's quantifiers, such as * and +, are greedy by default, just like Perl's. Had the pattern been /w.*ll/, the match would have been:

"will drill for a well in walla wall"

In other words, it would have returned everything from the first w to the last double l. When that's not what you want, append a question mark to the quantifier (/w.*?ll/) to make it match as little as possible.
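
Greediness only comes into play once the pattern contains a quantifier such as * or +. Here is a small sketch comparing a greedy quantifier with its non-greedy counterpart on the same string. It uses the String [] operator with a regular expression, which returns the matched text. The variable names are illustrative only:

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

greedy = string1[/w.*ll/]    # .* grabs as much as it can
lazy   = string1[/w.*?ll/]   # .*? grabs as little as it can

puts greedy   # will drill for a well in walla wall
puts lazy     # will
```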

You can return several matches using multiple groups, like this:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/(w.ll).*(in).*(w.ll)/)
matchdata = regex.match(string1)
if matchdata
for ss in 0...matchdata.length
puts matchdata[ss]
end
else
puts "NO MATCH"
end
[slitt@mydesk slitt]$ ./hello.rb
will drill for a well in walla wall
will
in
wall
[slitt@mydesk slitt]$

Note the different behavior when you use parentheses. Here you see that the 0 subscript element matches the entire regular expression, while elements 1, 2 and 3 are the individual matches for the first, second and third parenthesized groups.

What if you wanted to find ALL the matches for /w.ll/ in the string, without guessing beforehand how many parentheses to put in? Here's the way you do it:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/w.ll/)
matchdata = regex.match(string1)
while matchdata != nil
puts matchdata[0]
string1 = matchdata.post_match
matchdata = regex.match(string1)
end
[slitt@mydesk slitt]$ ./regex.rb
will
well
wall
wall
[slitt@mydesk slitt]$

What you've done here is repeat the match, over and over again, each time assigning the remainder of the string after the match to string1 via the post_match method. The loop terminates when no match is found. Note that this overwrites string1, so work on a copy if you need the original string afterward.
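
For the common case of simply collecting every match, the String class has a scan method that returns all non-overlapping matches as an array. A minimal sketch, with matches as an illustrative variable name:

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

# scan walks the string once and leaves string1 untouched.
matches = string1.scan(/w.ll/)

matches.each { |m| puts m }
```

This prints the same four lines as the loop above (will, well, wall, wall).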

Regex Substitution

My research tells me Ruby's regular expressions do not, in and of themselves, have a provision for substitution. From what I've found, you need to use Ruby itself, specifically the String.gsub() method, to actually perform the substitution. If that's true, to me that represents a significant hassle, although certainly not a showstopper. If I'm wrong about this, please let me know.

The following makes all occurrences of /w.ll/ uppercase in the string:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
string1.gsub!(/(w.ll)/){$1.upcase}
puts string1
[slitt@mydesk slitt]$ ./hello.rb
I WILL drill for a WELL in WALLa WALLa washington.
[slitt@mydesk slitt]$

The preceding depends on the block form of the String.gsub() method. I could not get the non-block form to accept the matches of the regular expression.
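
For what it's worth, the non-block form can refer to the groups by writing \1, \2 and so on inside a single-quoted replacement string. What it cannot do is run Ruby code such as upcase on the match, because the replacement is treated as plain text. That is why the uppercase example needs the block form. A minimal sketch that wraps each match in square brackets instead (the variable name bracketed is illustrative):

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

# '\1' in single quotes is a backreference to the first parenthesized group.
bracketed = string1.gsub(/(w.ll)/, '[\1]')

puts bracketed
```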

If you had wanted to replace only the first occurrence of /w.ll/, you could have done it by hand with match offsets, like this (warning, ugly!):

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
regex = Regexp.new(/w.ll/)
match = regex.match(string1)
offsets = match.offset(0)
startOfMatch = offsets[0]
endOfMatch = offsets[1]
string1[startOfMatch...endOfMatch] = match[0].upcase
puts string1
[slitt@mydesk slitt]$ ./regex.rb
I WILL drill for a well in walla walla washington.
[slitt@mydesk slitt]$

Being a Perl guy, I'm used to having the regular expression do the entire substitution in a single line of code, and find the preceding quite cumbersome. Obviously, some of the preceding code was inserted just for readability. For instance, I could have done this:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
match = /w.ll/.match(string1)
string1[match.offset(0)[0]...match.offset(0)[1]] = match[0].upcase
puts string1

Or even this, which I'm sure would have fit right in with K&R first edition:

#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."
match = /w.ll/.match(string1)
string1[/w.ll/.match(string1).offset(0)[0].../w.ll/.match(string1).offset(0)[1]] = match[0].upcase
puts string1

If you can read the preceding, you're a better programmer than I.
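
For comparison, the String class also has sub and sub! methods, which work like gsub and gsub! but stop after the first match. A minimal sketch of the same first-occurrence replacement using the block form:

```ruby
#!/usr/bin/ruby
string1 = "I will drill for a well in walla walla washington."

# sub! replaces only the first match; the block receives the matched text.
string1.sub!(/w.ll/) { |m| m.upcase }

puts string1
```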

In my opinion, Ruby beats the daylights out of Perl in most aspects, but not in regular expressions.

Subroutines

A subroutine starts with def and ends with a corresponding end. Subroutines pass back values with the return keyword. In a welcome change from Perl, variables declared inside a subroutine are local by default, as shown by this program:

#!/usr/bin/ruby
def passback
howIfeel="good"
return howIfeel
end

howIfeel="excellent"
puts howIfeel
mystring = passback
puts howIfeel
puts mystring

In the preceding, note that the puts command writes the string and then prints a newline, as opposed to the print command, which doesn't print a newline unless you add a newline to the string being printed.

If the howIfeel variable inside subroutine passback were global, then after running the subroutine, the howIfeel variable in the main program would change from "excellent" to "good". However, when you run the program you get this:

[slitt@mydesk slitt]$ ./hello.rb
excellent
excellent
good
[slitt@mydesk slitt]$

The first and second printing of the howIfeel variable in the main program both print as "excellent", while the value passed back from the subroutine, and stored in variable mystring prints as "good", as we'd expect. Ruby's variables are local by default -- a huge encapsulation benefit.

You can pass variables into a subroutine as shown in the following code:

#!/usr/bin/ruby
def mult(multiplicand, multiplier)
multiplicand = multiplicand * multiplier
return multiplicand
end

num1 = 4
num2 = 5
result = mult(num1, num2)
print "num1 is ", num1, "\n"
print "num2 is ", num2, "\n"
print "result is ", result, "\n"
[slitt@mydesk slitt]$ ./hello.rb
num1 is 4
num2 is 5
result is 20
[slitt@mydesk slitt]$

The value of num1 was not changed by running mult(), showing that assigning a new value to a parameter inside the subroutine does not affect the caller's variable, at least for integers. But what about for objects like strings?

#!/usr/bin/ruby
def concat(firststring, secondstring)
firststring = firststring + secondstring
return firststring
end

string1 = "Steve"
string2 = "Litt"
result = concat(string1, string2)
print "string1 is ", string1, "\n"
print "string2 is ", string2, "\n"
print "result is ", result, "\n"
[slitt@mydesk slitt]$ ./hello.rb
string1 is Steve
string2 is Litt
result is SteveLitt
[slitt@mydesk slitt]$

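Once again, assigning a new value to an argument inside the subroutine does not change the variable passed as an argument. Be aware, though, that the subroutine receives a reference to the same string object, so a method that modifies the string in place, such as << or upcase!, would change the caller's string too.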
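
The difference between reassigning an argument and modifying the object it refers to is easy to see in a small test. In the following sketch, the subroutine names reassign and append_bang are made up for illustration:

```ruby
#!/usr/bin/ruby
def reassign(text)
  text = text + "?"   # builds a new string; the caller's object is untouched
  return text
end

def append_bang(text)
  text << "!"         # appends in place; the caller sees the change
  return text
end

original = "Steve".dup        # dup guarantees a modifiable string
copy = reassign(original)
puts original                 # Steve
puts copy                     # Steve?

append_bang(original)
puts original                 # Steve!
```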

Exceptions

Growing up with C, I wrote code for every possible error condition. Or, when I was too lazy to write code for error conditions, my code was less robust.

The modern method of error handling is with exceptions, and Ruby has that feature. Use them.

There are two things you can do: handle an exception, and raise an exception. You raise an exception by recognizing an error condition, and then associating it with an exception type. You usually don't need to raise an exception because most system calls already raise exceptions on errors. However, if you've written a new bit of logic, and encounter a forbidden state, then you would raise an exception.

You handle an exception that gets raised -- typically by system calls but possibly by your code. This handling is only for protected code starting with begin and ending with end. Here's a simple example:

#!/usr/bin/ruby
begin
input = File.new("/etc/resolv.conf", "r")
rescue
print "Failed to open /etc/resolv.conf for input. ", $!, "\n"
end
input.each {
|i|
puts i;
}
input.close()

The preceding code produces the following output:

[slitt@mydesk slitt]$ ./hello.rb
search domain.cxm
nameserver 192.168.100.103

# ppp temp entry
[slitt@mydesk slitt]$

However, if the filename in File.new() is changed to the nonexistent /etc/resolX.conf, the output looks like this:

[slitt@mydesk slitt]$ ./hello.rb
Failed to open /etc/resolv.conf for input. No such file or directory - /etc/resolX.conf
./hello.rb:7: undefined method `each' for nil:NilClass (NoMethodError)
[slitt@mydesk slitt]$

Global variable $! had the value "No such file or directory - /etc/resolX.conf", so that printed along with the error message in the rescue section. Because the rescue section handled the exception, execution continued past the end statement with input still set to nil. The call to input.each then raised a second exception, NoMethodError, which nothing handled, so Ruby printed its message and terminated the program.

Exceptions are implemented as classes (objects), all of which are descendants of the Exception class. Some have methods over and above those of the Exception class, some do not. Here is a list of the exceptions I was able to find in documentation on the web:
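
One way to see which exception classes your own interpreter defines is to ask it directly. The following is a rough sketch that assumes the standard Ruby interpreter, where ObjectSpace.each_object is available. It prints every loaded class descended from StandardError, and the variable names are illustrative:

```ruby
#!/usr/bin/ruby
# Collect every loaded class that inherits from StandardError.
error_classes = ObjectSpace.each_object(Class).select { |klass| klass < StandardError }

# Anonymous classes have no name, so compact drops the nils before sorting.
names = error_classes.map { |klass| klass.name }.compact.sort

puts names
```

The exact list depends on the Ruby version and on which libraries have been loaded.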

The following is a more generic error handling syntax:

begin
# attempt code here
rescue SyntaxError => mySyntaxError
print "Unknown syntax error. ", mySyntaxError, "\n"
# error handling specific to problem here
rescue StandardError => myStandardError
print "Unknown general error. ", myStandardError, "\n"
# error handling specific to problem here
else
# code that runs ONLY if no error goes here
ensure
# code that cleans up after a problem and its error handling goes here
end

In the preceding, variables mySyntaxError and myStandardError are local variables to store the contents of global variable $!, the exception that was raised.
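
To watch the rescue, else and ensure sections fire, here is a small runnable sketch built around integer division. The subroutine name safe_divide and the log array are made up for illustration:

```ruby
#!/usr/bin/ruby
def safe_divide(numerator, denominator)
  log = []
  begin
    quotient = numerator / denominator
  rescue ZeroDivisionError => e
    log << "rescue: #{e.message}"   # runs only when the division fails
    quotient = nil
  else
    log << "else: no error"         # runs only when nothing was raised
  ensure
    log << "ensure: always runs"    # runs in both cases
  end
  return quotient, log
end

ok_result, ok_log = safe_divide(10, 2)
bad_result, bad_log = safe_divide(10, 0)

puts ok_log
puts bad_log
```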

Retry

There's a retry keyword enabling a retry on error. This is handy when performing an activity that might benefit from a retry (reading a CD, for instance):

begin
# attempt code here
rescue
puts $!
if EscNotPressed()
print "Reload the CD, or press ESC\n"
retry
else
puts "User declined to retry further"
end
end
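
Here is a runnable sketch of the same idea that needs no CD drive. The failure is simulated with a counter, and the names attempts and status are illustrative only:

```ruby
#!/usr/bin/ruby
attempts = 0
begin
  attempts += 1
  # Simulated failure: pretend the first two reads fail.
  raise "drive not ready" if attempts < 3
  status = "read succeeded on attempt #{attempts}"
rescue RuntimeError => e
  puts "Attempt #{attempts} failed: #{e.message}"
  retry if attempts < 3   # jumps back to the begin line
  status = "gave up"
end
puts status
```

Always give a retry loop some limit or a way for the user to decline, as in the preceding examples. Otherwise a permanent error becomes an infinite loop.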

Raising an Exception

Sometimes neither the system nor the language detects an error, but you do. Perhaps the user entered someone 18 years old for Medicare. Linux doesn't know that's wrong. Ruby doesn't know that's wrong. But you do.

You can raise a generic exception (or the current exception if there is one) like this:
raise if age < 65

#!/usr/bin/ruby
age = 18
raise if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:3: unhandled exception
[slitt@mydesk slitt]$

To raise a RuntimeError exception with your own message, do this:
raise "Must be 65 or older for Medicare"

#!/usr/bin/ruby
age = 18
raise "Must be 65 or older for Medicare." if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:3: Must be 65 or older for Medicare. (RuntimeError)
[slitt@mydesk slitt]$

To raise a RangeError exception (you wouldn't really do this), you'd do this:
raise RangeError, "Must be 65 or older for Medicare", caller

#!/usr/bin/ruby
age = 18
raise RangeError, "Must be 65 or older for Medicare", caller if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:3: Must be 65 or older for Medicare (RangeError)
[slitt@mydesk slitt]$

Perhaps the best way to do it is to create a new exception class specific to the type of error:

#!/usr/bin/ruby
class MedicareEligibilityException < RuntimeError
end

age = 18
raise MedicareEligibilityException, "Must be 65 or older for Medicare", caller if age < 65
print "Age is ", age, ". This happens after the exception was raised\n"
[slitt@mydesk slitt]$ ./hello.rb
./hello.rb:6: Must be 65 or older for Medicare (MedicareEligibilityException)
[slitt@mydesk slitt]$

Now let's combine raising and handling by creating a subroutine called signHimUp(), which raises the exception, and a calling main routine, which handles it. In this rather contrived program, information about the person who triggered the exception is stored in the exception itself. The initialize() method assigns its arguments to the class's instance variables, so that this call:
myException = MedicareEligibilityException.new(name, age)
creates an instance of class MedicareEligibilityException whose instance variables contain the person's name and age for later reference. Once again, this is very contrived, but it illustrates some of the flexibility of exception handling:

#!/usr/bin/ruby
class MedicareEligibilityException < RuntimeError
def initialize(name, age)
@name = name
@age = age
end
def getName
return @name
end
def getAge
return @age
end
end

def writeToDatabase(name, age)
# This is a stub routine
print "Diagnostic: ", name, ", age ", age, " is signed up.\n"
end

def signHimUp(name, age)

if age >= 65
writeToDatabase(name, age)
else
myException = MedicareEligibilityException.new(name, age)
raise myException , "Must be 65 or older for Medicare", caller
# raise MedicareEligibilityException , "Must be 65 or older for Medicare", caller
end
end

# Main routine
begin
signHimUp("Oliver Oldster", 78)
signHimUp("Billy Boywonder", 18)
signHimUp("Cindy Centurinarian", 100)
signHimUp("Bob Baby", 2)

rescue MedicareEligibilityException => elg
print elg.getName, " is ", elg.getAge, ", which is too young.\n"
print "You must obtain an exception from your supervisor. ", elg, "\n"

end

print "This happens after signHimUp was called.\n"


In the preceding code, the main routine calls subroutine signHimUp() for each of four people, two of whom are underage. The begin/rescue/end structure in the main routine allows exceptions of type MedicareEligibilityException to be handled cleanly, although such exceptions are raised by the called subroutine, signHimUp(). The signHimUp() routine tests for age 65 and older. If the test passes, it calls the dummy writeToDatabase(). If not, it creates a new instance of MedicareEligibilityException containing the person's name and age, and then raises that exception, with the hope that the calling routine's exception handling will be able to use that information in its error message.

The MedicareEligibilityException definition itself is a typical class definition, with instance variables beginning with @, an initialize() constructor that assigns its arguments to the instance variables, and get routines for the instance variables. All of this will be covered later when we discuss classes and objects.

Here is the result:

[slitt@mydesk slitt]$ ./hello.rb
Diagnostic: Oliver Oldster, age 78 is signed up.
Billy Boywonder is 18, which is too young.
You must obtain an exception from your supervisor. Must be 65 or older for Medicare
This happens after signHimUp was called.
[slitt@mydesk slitt]$

As you can see, the first call to signHimUp() successfully ran the stub write to database routine, as indicated by the diagnostic line. The next call to signHimUp() raised a MedicareEligibilityException, and the code in the rescue block got the patient's name and age from the exception, and wrote it. At that point the begin block was terminated, and execution fell through to the line below the end matching the exception handling's begin, so the remaining two calls to signHimUp() never ran. If we had wanted to, we could have terminated the program from within the rescue block in many ways, including ending that block with a raise command or, to bail immediately, an exit command.

Catch and Throw

The catch and throw methods enable you to jump up the call stack, thereby in effect performing a goto. If you can think of a good reason to do this, research these two methods on your own. Personally, I'd prefer to stay away from them.
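
For the curious, here is a minimal sketch of one commonly cited use: bailing out of nested loops as soon as a value is found. The grid data and the :found label are made up for illustration:

```ruby
#!/usr/bin/ruby
grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# catch runs the block; throw jumps straight out of it, and the second
# argument to throw becomes the value returned by catch.
found = catch(:found) do
  grid.each_with_index do |row, r|
    row.each_with_index do |value, c|
      throw :found, [r, c] if value == 5
    end
  end
  nil   # value of the catch block when nothing was thrown
end

print "Found at row ", found[0], ", column ", found[1], "\n"
```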

We've just scratched the surface of exception handling, but you probably have enough now to at least write simple exceptions and read other people's exception code.

Terminal IO

This section will cover just a few of the many ways you can do terminal IO. You've already learned about print and puts:

#!/usr/bin/ruby
print "This is the first half of Line 1. "
print "This is the second half.", "\n"
puts "This is line 2, no newline necessary."