Troubleshooters.Com and Code Corner Present

Steve Litt's Perls of Wisdom:
Perl OOP
(With Snippets)

Copyright (C) 1998-2002 by Steve Litt




Introduction

Perl 5 can be used for object-oriented programming (OOP). It's not as encapsulated as I'd like, but it's not bad if both the class design programmer and the application programmer are careful about what they do.

This document is NOT a detailed description of OOP in Perl. It simply discusses a few very basic techniques, the 10% of Perl OOP you'll use 90% of the time.
 

Perl OOP Gotchas

Watch out!

Perl OOP can really nail you if you don't watch out. Here are some things to look for:
 

Always call methods with an object

Always do this:
 
$myobj->mymethod(arg1, arg2);
Never:
&mymethod(arg1, arg2);
That's because the object becomes arg0 ($_[0]) inside the method, and most good methods refer to themselves through arg0 as follows:
sub mymethod
    {
    my($self) = shift;    # shift removes and returns $_[0], the object
    ...
    }
This may not be so obvious to C++ programmers (like me), who call one method from another simply by mentioning the function name. Not in Perl. Instead, explicitly do this:
sub myfunction1
    {
    my($self) = shift;                # find yourself first
    $self->myfunction2($whatever);
    }
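To see why this matters, here is a minimal sketch. The Greeter class and its methods are made up for illustration. Called through the object, greet() gets the object as $_[0]. Called as a plain function, the first real argument lands in that slot instead, and with use strict in effect the hash lookup on $self dies.

```perl
use strict;
use warnings;

package Greeter;                 # hypothetical class, for illustration only

sub new
    {
    my($type) = shift;
    my($self) = {};
    $self->{'greeting'} = shift;
    return bless($self, $type);
    }

sub greet
    {
    my($self) = shift;           # the object, but only if called with ->
    my($name) = shift;
    return $self->decorate("$self->{'greeting'}, $name");
    }

sub decorate
    {
    my($self) = shift;
    my($text) = shift;
    return "*** $text ***";
    }

package main;

my($obj)   = Greeter->new("Hello");
my($right) = $obj->greet("World");          # $_[0] is $obj

# Called as a plain function, "World" becomes $_[0], so $self is a string
# and the $self->{'greeting'} lookup fails under "use strict".
my($wrong) = eval { Greeter::greet("World") };
my($error) = $@;

print "$right\n";
print "Plain call failed: $error" if $error;
```

Without use strict the plain call would not even die. It would quietly look in the wrong place, which is harder to track down.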

Tutorial: Build a Tree Traversing Class

Make a Dummy Class

Start with a file in the current directory called Tree.pm. Remember that Perl classes are contained in module files, and Perl modules have .pm extensions. On the top line, declare it to be a package with the following:
 
#File Tree.pm, module for class Tree
#The "package Tree" syntax declares it as a package (class)
package Tree;

#By convention the constructor is called new(). It can take as many args
#as required.
sub new
     {
     #Arg0 is the type because the constructor will look like
     #  my($instance) = Tree->new(arg1,arg2,whatever)
     #so arg0 will be Tree.
     my($type) = $_[0];

     #Make subroutine-local var $self, and make it a reference.
     #Specifically, make it a reference to a (right now) empty hash.
     #Later on, that hash will contain object properties.
     my($self) = {};

     #For now, we'll have one instance variable (property, whatever)
     #It will be in the hash referenced by $self, and will have
     #the index 'root'. This will be the first arg (inside the parentheses)
     #of the call to the constructor in the main program.
     $self->{'root'} = $_[1];    #remember $_[0] was the Tree before the ->

     #There's nothing reserved about the word $self. It could have been
     #called $oodolaboodola. To link the object with both the hash pointed
     #to by $self and the type (Tree), we use the 2 argument version
     #of the keyword bless:
     bless($self, $type);

     #Now finally, return the hash as a reference to be used as an "object"
     return($self);
     }

#Now make diagnostic routine tellroot to make sure everything's OK.
sub tellroot
     {
     #first "find yourself". Once again, there's nothing reserved
     #about the word $self. We simply assume that whoever called tellroot
     #was smart enough to call it like $myinstance->tellroot().
     my($self)=$_[0];

     #Now that we have $self, we can get the root from the hash after
     #dereferencing.
     print "Root is $self->{'root'}.\n";
     }

return(1);           #module files must return a true value.

Now make the main program
 
 
#main.pl

use Tree;                         #include the tree class file.

my($TreeObj) = Tree->new("c:\\"); #instantiate. Note that arg0 is Tree.

$TreeObj->tellroot();             #Note that arg0 is $TreeObj.

#This code should print out "Root is c:\.".

Run it. It should print out "Root is c:\.", just like you expect. When you understand everything that happened, go on to the next step, where we flesh it out (and drop some of the comments).
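If the bless step still feels like magic, the following sketch may help. It puts a copy of the Tree constructor in the same file as the main code (so there's no separate Tree.pm here), and uses the built-in ref() plus reftype() from the core Scalar::Util module to compare an ordinary hash reference with a blessed one. The variable names are just for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

package Tree;                      # one-file stand-in for Tree.pm

sub new
     {
     my($type) = $_[0];
     my($self) = {};
     $self->{'root'} = $_[1];
     bless($self, $type);
     return($self);
     }

package main;

my($plain)   = { 'root' => "c:\\" };       # ordinary hash reference
my($TreeObj) = Tree->new("c:\\");          # blessed hash reference

my($plain_kind) = ref($plain);             # what an unblessed hash ref reports
my($obj_kind)   = ref($TreeObj);           # the class name given to bless
my($underneath) = reftype($TreeObj);       # the data type under the blessing

print "plain: $plain_kind, object: $obj_kind, underneath: $underneath\n";
print "The root is still an ordinary hash entry: $TreeObj->{'root'}\n";
```

In other words, the object is still just a hash. All bless adds is the class name that Perl uses to find methods when you call them with ->.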

In the next step, we'll make the Tree class a dedicated tree browser, a sort of dir /s if you will. Don't worry, the final version will actually be useful and reusable.

Make it into a tree browser

This version of the Tree class does the same as a DOS dir /s.
 
#File Tree.pm, module for class Tree
package Tree;

sub new
     {
     my($type) = $_[0];
     my($self) = {};
     $self->{'root'} = $_[1];    #remember $_[0] was the Tree before the ->
     bless($self, $type);
     return($self);
     }

sub tellroot
     {
     my($self)=$_[0];
     print "Root is $self->{'root'}.\n";
     }

sub cruisetree
   {
   my($self) = $_[0];                  #Find yourself

   #*** Now call method onedir with self->onedir, NEVER &onedir ***
   $self->onedir($self->{'root'});       #note called with instance
   }

sub onedir
   {
   my($self) = $_[0];                  #Find yourself
   my($dirname) = $_[1];               #Directory passed in

   #*** Below this point there's nothing OOP, EXCEPT ***
   #*** EXCEPT for the line commented %%%% O O P %%%% ***
   opendir(DIR, $dirname);
   my(@Names) = readdir(DIR);
   closedir(DIR);

   # Blow off possible trailing backslash before appending one.
   # Don't want 2 consecutive backslashes.
   if($dirname =~ /(.*)\\$/) 
      {$dirname = $1;}

   # Loop thru directory, handle files and directories   
   my($Name);
   foreach $Name (@Names)
     {
     chomp($Name);
     my($Path) = "$dirname\\$Name";
     if( -d $Path )                     # if path represents a directory
       {
       if(($Name ne "..") && ($Name ne "."))
          {
          print "Directory $Path...\n";
          $self->onedir($Path);               #%%%% O O P %%%%
          }
       }
     else                               # if path represents a file
       {
       print "         File $Path\n"
       }
     }
   return;
   }

return(1);           #module files must return a true value.

And here is the main.pl which calls it:
#main.pl

use Tree;                         #include the tree class file.

my($TreeObj) = Tree->new("c:\\"); #instantiate. Note that arg0 is Tree.

$TreeObj->cruisetree();           #Note that arg0 is $TreeObj.

#This code should print out the entire c:\ tree.

Now make it a reusable tool

We'll now make the Tree class an encapsulated object which is instantiated with the root of the tree and references to two external subroutines. One external subroutine acts on directories found, the other on files found. The Tree object passes each of them two arguments, the path and the bare filename. From those, the external subroutines can deduce file dates and a lot more.

You'll note that the entire functionality of the program can be changed just by changing subroutines showfile() and showdir() in the main routine. The Tree.pm code requires no changes. It's truly reusable.
 
#File Tree.pm, module for class Tree
package Tree;

sub new
     {
     my($type) = $_[0];
     my($self) = {};
     $self->{'root'}    = $_[1]; #remember $_[0] was the Tree before the ->
     $self->{'dirfcn'}  = $_[2];
     $self->{'filefcn'} = $_[3];
     bless($self, $type);
     return($self);
     }

sub tellroot
     {
     my($self)=$_[0];
     print "Root is $self->{'root'}.\n";
     }

sub cruisetree
   {
   my($self) = $_[0];                  #Find yourself

   #*** Now call method onedir with self->onedir, NEVER &onedir ***
   #*** Note that dirfcn and filefcn aren't passed ***
   #*** Because they're contained in $self and don't change ***
   $self->onedir($self->{'root'});       #note called with instance
   }

sub onedir
   {
   my($self) = $_[0];                  #Find yourself
   my($dirname) = $_[1];               #Directory passed in

   #*** Below this point there's nothing OOP, EXCEPT ***
   #*** EXCEPT for the line commented %%%% O O P %%%% ***
   opendir(DIR, $dirname);
   my(@Names) = readdir(DIR);
   closedir(DIR);

   # Blow off possible trailing backslash before appending one.
   # Don't want 2 consecutive backslashes.
   if($dirname =~ /(.*)\\$/) 
      {$dirname = $1;}

   # Loop thru directory, handle files and directories   
   my($Name);
   foreach $Name (@Names)
     {
     chomp($Name);
     my($Path) = "$dirname\\$Name";
     if( -d $Path )                     # if path represents a directory
       {
       if(($Name ne "..") && ($Name ne "."))
          {
          &{$self->{'dirfcn'}}($Path, $Name);  #%%%% O O P %%%%
          $self->onedir($Path);                #%%%% O O P %%%%
          }
       }
     else                               # if path represents a file
       {
       &{$self->{'filefcn'}}($Path, $Name)  #%%%% O O P %%%%
       }
     }
   return;
   }

return(1);           #module files must return a true value.

And here is the main.pl which calls it:
#main.pl

use Tree;                         #include the tree class file.

my($TreeObj) = Tree->new("c:\\windows", \&showdir, \&showfile);

$TreeObj->cruisetree();           #Note that arg0 is $TreeObj.

sub showdir
   {
   print "Directory: $_[0] ...\n";
   }

sub showfile
   {
   print "     File: $_[0] ...\n";
   }
#This code should print out every directory and file beneath c:\windows.
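Here's a sketch of what "just change the subroutines" can look like in practice. It is not the Tree.pm above but a variation I've named TreeSketch, with two substitutions you should know about: it joins paths with File::Spec->catfile instead of a hard-coded backslash so it runs on non-DOS systems too, and it uses a lexical directory handle. The callbacks tally names instead of printing them. The throwaway directory built with File::Temp is there only so the example runs anywhere.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

package TreeSketch;               # hypothetical variant of the Tree class

sub new
   {
   my($type, $root, $dirfcn, $filefcn) = @_;
   my($self) = { 'root' => $root, 'dirfcn' => $dirfcn, 'filefcn' => $filefcn };
   return bless($self, $type);
   }

sub cruisetree
   {
   my($self) = $_[0];
   $self->onedir($self->{'root'});
   }

sub onedir
   {
   my($self, $dirname) = @_;
   opendir(my $dh, $dirname) or return;       # skip unreadable directories
   my(@Names) = sort(readdir($dh));
   closedir($dh);

   foreach my $Name (@Names)
      {
      next if $Name eq '.' or $Name eq '..';
      my($Path) = File::Spec->catfile($dirname, $Name);
      if(-d $Path)
         {
         $self->{'dirfcn'}->($Path, $Name);
         $self->onedir($Path);
         }
      else
         {
         $self->{'filefcn'}->($Path, $Name);
         }
      }
   }

package main;

# Build a tiny throwaway tree: a.txt and sub/b.txt.
my($root) = tempdir(CLEANUP => 1);
mkdir(File::Spec->catdir($root, 'sub')) or die "mkdir failed: $!";
foreach my $file ('a.txt', File::Spec->catfile('sub', 'b.txt'))
   {
   open(my $fh, '>', File::Spec->catfile($root, $file)) or die "open failed: $!";
   close($fh);
   }

# Same class, different behavior: these callbacks tally instead of printing.
my(@dirs, @files);
my($TreeObj) = TreeSketch->new($root,
                               sub { push(@dirs,  $_[1]); },
                               sub { push(@files, $_[1]); });
$TreeObj->cruisetree();

print "Directories: @dirs\n";
print "Files: @files\n";
```

The class never learns what the callbacks do with the names, which is the whole point of passing them in.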

Perl Inheritance

First, create the following directories beneath your home directory: personclass, personclass/Person, and persontest. Next, create the Person class in the $HOME/personclass directory. Note that this subdirectory could be named anything. We just gave it a descriptive and memorable name. The following is the $HOME/personclass/Person.pm file, which defines the Person class:
 
package Person;

sub new
     {
     my($type) = $_[0];
     my($self) = {};
     $self->{'name'} = $_[1];
     bless($self, $type);
     return($self);
     }

sub tellname
     {
     my($self)=$_[0];
     print "Person name is $self->{'name'}.\n";
     }

return(1);

The preceding is a plain vanilla class whose single constructor argument is the name of the person. When instantiated, it stores that argument as its name. When its tellname() method is called, it prints that name and identifies it as belonging to a person, which marks this as the base class. It knows nothing of any descendants it might have, nor of any programs that use it.

Now create a subclass of Person called Male. Like all subclasses, this one must know about its parent, because that's where it inherits from. The following file should be placed in the $HOME/personclass/Person directory, because Perl looks for a module named Person::Male in a file called Person/Male.pm beneath one of its module directories. Here's how you code the Male subclass of Person, $HOME/personclass/Person/Male.pm:
 
use Person;                        #Children must know about their parents
package Person::Male;              #This class is called Person::Male

BEGIN{@ISA = qw ( Person );}       #Declare this a child of the Person class

sub tellname
     {
     my($self)=$_[0];
     print "Male name is $self->{'name'}.\n";
     }

return(1);

Notice that this class's name is Person::Male. It is declared a child of the Person class with the @ISA = qw ( Person ); construct. In any class, the elements of @ISA identify its parent(s). The reason for the BEGIN{} construct is that the setting of @ISA must happen before the program runs.

This subclass overrides the tellname() method. Specifically, it identifies the name as belonging to a male. This subclass does NOT override the constructor. So how is the class constructed? Because this class doesn't define a new() method, its parent is consulted (via the @ISA variable). If the parent didn't have that method, its parent would be consulted. But in this case the parent defines the method, so the parent's constructor is used.
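The lookup just described can be watched in action with the universal can() method, which returns a reference to whichever subroutine a method call would reach. The following is a one-file sketch, so it differs from the module files in two ways: @ISA is set with a plain our declaration because everything is in one file, and tellname() returns its string instead of printing it so the result can be inspected.

```perl
use strict;
use warnings;

package Person;

sub new
     {
     my($type) = $_[0];
     my($self) = {};
     $self->{'name'} = $_[1];
     bless($self, $type);
     return($self);
     }

sub tellname
     {
     my($self)=$_[0];
     return "Person name is $self->{'name'}.";
     }

package Person::Male;

our @ISA = qw( Person );           # same file, so no use or BEGIN needed here

sub tellname
     {
     my($self)=$_[0];
     return "Male name is $self->{'name'}.";
     }

package main;

my($wr) = Person::Male->new("Doug");

# can() returns a reference to the sub that a method call would reach.
my($inherited)  = Person::Male->can('new')      == \&Person::new;
my($overridden) = Person::Male->can('tellname') == \&Person::Male::tellname;

print "Object class: ", ref($wr), "\n";
print $wr->tellname(), "\n";
print "new() comes from Person\n"                if $inherited;
print "tellname() comes from Person::Male\n"     if $overridden;
```

Note that even though Person's constructor does the work, the object ends up blessed as a Person::Male. That's because the constructor uses the two argument bless with $type, and $type is whatever was to the left of the arrow.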

Let's make a Female subclass of Person, similar to the Male. The following is the code for $HOME/personclass/Person/Female.pm:
 
use Person;                        #Children must know about their parents
package Person::Female;            #This class is called Person::Female

BEGIN{@ISA = qw ( Person );}       #Declare this a child of the Person class

sub tellname
     {
     my($self)=$_[0];
     print "Female name is $self->{'name'}.\n";
     }

return(1);

The only difference between the Male and Female subclasses is the gender each one identifies in its tellname() method.

Now for the main program. The main program can be placed absolutely anywhere on the disk. It doesn't need to be placed anywhere relative to the classes you've just created. Unlike .pm files, the main program needn't be named in any special manner. In our case we'll put the main program in the $HOME/persontest directory, and we'll call the file test.pl.
 
#!/usr/bin/perl -w
use strict;

use lib $ENV{"HOME"} . "/personclass" ;   #Look for modules in this tree
use Person;                               #The Person class
use Person::Male;                         #The Male subclass of Person
use Person::Female;                       #The Female subclass of Person

my($wr) = Person::Male->new("Doug");      #Make a Male
$wr->tellname();

$wr = Person::Female->new("Tiffany");     #Make a Female
$wr->tellname();

$wr = Person->new("Baby");                #Make a Person
$wr->tellname();

There are several lines of interest. The line starting with use lib tells this program where to look for the code comprising the Person class and its descendants. The three use statements following that make the main program aware of the Person, Male and Female classes. The three assignments to $wr, and the three calls to that object's tellname() method, demonstrate the use of the three classes.

In a real app you'd probably have an if statement to determine which of the three instantiations should be done, after which you'd have common code describing the methods to be called. In that way, you shift the burden of branching to the subclasses themselves, greatly simplifying your programming. I will soon be enhancing the EMDL to UMENU converter program to allow conversions to IceWM menus. To do that, I'll do the following:
 

By doing this, I'll open the door to easily make Writer objects for other types of menus in the near future. A single EMDL file can be used to produce identically structured Umenu and IceWM menus, and who knows, maybe KDE and Gnome too.
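The "one if statement, then common code" idea can be sketched with the Person classes. Everything here is illustrative: the classes are squeezed into one file, the method is called describe() and returns a string rather than printing, and make_person() and its one-letter codes are invented for the example.

```perl
use strict;
use warnings;

package Person;
sub new      { my($type, $name) = @_; return bless({ 'name' => $name }, $type); }
sub describe { my($self) = $_[0]; return "Person name is $self->{'name'}."; }

package Person::Male;
our @ISA = qw( Person );
sub describe { my($self) = $_[0]; return "Male name is $self->{'name'}."; }

package Person::Female;
our @ISA = qw( Person );
sub describe { my($self) = $_[0]; return "Female name is $self->{'name'}."; }

package main;

# The only place that branches on the kind of person.
sub make_person
   {
   my($kind, $name) = @_;
   my($class);
   if   ($kind eq 'm') { $class = 'Person::Male'; }
   elsif($kind eq 'f') { $class = 'Person::Female'; }
   else                { $class = 'Person'; }
   return $class->new($name);
   }

my(@report);
foreach my $pair (['m', 'Doug'], ['f', 'Tiffany'], ['?', 'Baby'])
   {
   my($wr) = make_person(@$pair);
   push(@report, $wr->describe());     # common code, no branching on type
   }

print "$_\n" foreach @report;
```

Adding a fourth kind of person would mean one new subclass and one new branch in make_person(). The loop that uses the objects wouldn't change.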

Perl Inheritance Summary

It's simple. A class knows its parent by the @ISA variable, which must be set with a BEGIN{} construct to ensure that it's set before the program runs. Obviously, the child class must know about its parent, which is why the parent is named in a use statement. The child can either override a parent method or leave it undefined, in which case the parent's copy of that method is run.

A parent class needs to know nothing about its children.

The main program must find the modules defining these classes. If those modules are not kept in standard places, the location of their tree must be named using the use lib construct. All classes to be used, including subclasses, must be named in use statements. These things being done, it remains only to instantiate the classes and call their methods.
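How a class name turns into a file to look for can be sketched in a few lines. This is only an illustration of the rule, not Perl's actual loader, and the two helper names are invented. The double colons become directory separators, .pm is appended, and each directory in the search list is tried in order.

```perl
use strict;
use warnings;

# Turn a module name into the relative path Perl looks for.
sub module_to_path
   {
   my($module) = $_[0];
   (my $path = $module) =~ s{::}{/}g;
   return "$path.pm";
   }

# Return the first matching file beneath the given directories,
# or nothing if no directory holds the module.
sub find_module
   {
   my($module, @dirs) = @_;
   my($relative) = module_to_path($module);
   foreach my $dir (@dirs)
      {
      next if ref($dir);                 # @INC may also hold code hooks
      my($candidate) = "$dir/$relative";
      return $candidate if -f $candidate;
      }
   return;
   }

print "Person::Male lives in ", module_to_path('Person::Male'), "\n";

my($found) = find_module('File::Basename', @INC);
print "File::Basename would load from $found\n" if defined($found);
```

That's why Male.pm had to go in a Person subdirectory, and why use lib only needs to name the top of the tree.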

Loading Modules from Various Locations

Consider the following familiar line:
use Node;
When encountering that line, Perl searches the module path held in the built-in variable @INC for a file called Node.pm. When it finds that file, it loads it, imports it, and then quits looking. One more gotcha: this all happens at compile time, not at runtime. For the more Perlesque reader, let me say that the use Node; line is exactly equivalent to the following:
BEGIN { require Node; import Node; }
In the preceding, BEGIN is a specially named block that Perl executes at compile time, as soon as the block has been parsed and before any runtime code is executed.
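A short sketch makes the timing visible. The @order array is just a scratch variable for the demonstration, and File::Basename is used only because it's a core module that is sure to be installed.

```perl
use strict;
use warnings;

our @order;

push(@order, 'first statement');            # runtime
BEGIN { push(@order, 'BEGIN block'); }      # compile time, so it runs first
push(@order, 'last statement');             # runtime

# Spelled-out form of "use File::Basename;".
BEGIN { require File::Basename; File::Basename->import(); }
my $base = basename('/home/slitt/mymodules/Node.pm');

print join(', ', @order), "\n";
print "$base\n";
```

Even though the BEGIN block sits in the middle of the file, its push lands first in @order. And because the require and import happen inside a BEGIN block, basename() is already known by the time the line that calls it is compiled.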

Next question: What if Node.pm resides in a directory not currently in @INC? You have several choices:
  1. Include the proper directory using the -I argument to the perl command
  2. Use the use lib syntax to add the directory from within the code
  3. Manually manipulate @INC within the runtime code
Alternative #1 can be used when issuing the perl command, or it can be placed on the "shebang" line (#!command) to simulate placing the inclusion within the code. The added directory(s) must be a constant.

The following is how it's done at the operating system's command prompt:
perl -I /home/slitt/mymodules umenu.pl s

The following is how it's done from the "shebang" line in the perl script:
#!/usr/bin/perl -w -I /home/slitt/mymodules

Alternative #2 is certainly the simplest. If the directory is known when the program is written, certainly this is the best way to do it.
use lib '/home/slitt/mymodules';

You might want to use this syntax to prepend a directory deduced at runtime. DON'T! The use statement is executed at compile time, meaning that a variable assigned at runtime won't contain the directory yet. There are ways to get around this using a BEGIN{} construct, but these workarounds can get truly ugly.
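For the curious, here is roughly what the simplest BEGIN{} workaround looks like. It's a sketch only: MYAPP_MODULES is a hypothetical environment variable invented for this example, and anything more involved than reading one variable (parsing a config file, say) is where it starts getting ugly.

```perl
use strict;
use warnings;

my $moduledir;
BEGIN
   {
   # MYAPP_MODULES is a hypothetical environment variable for this sketch.
   $moduledir = defined($ENV{'MYAPP_MODULES'}) ? $ENV{'MYAPP_MODULES'}
                                               : '/home/slitt/mymodules';
   }
use lib $moduledir;          # works only because the BEGIN block already ran

my $added = grep { $_ eq $moduledir } @INC;
print "$moduledir is ", ($added ? "" : "NOT "), "in \@INC\n";
```

Take away the BEGIN and make the assignment an ordinary statement, and use lib sees an undefined $moduledir, because the assignment wouldn't happen until runtime.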

Alternative #3 is useful when the added directory must be deduced at runtime. The rest of this article focuses on ways to do it comfortably at runtime.

Deducing the Module's Directory at Runtime

Why would you want to place an additional directory in the module path at runtime? Many reasons. As one example, consider my situation.

I write free software, and have no control over how it's installed. My newest Umenu and EMDL programs both use a module called Node.pm. Where should that module be located? In the Umenu program directory, the EMDL program directory, both, or somewhere else? It's really up to the user, and the user might not be up to changing the source code.

Instead, I write the code so if the user includes the following line in umenu.cnf:
nodedir=/home/slitt/mymodules
that directory is prepended to @INC. Here's a subroutine, called loadNodeModule(), that does just that:
sub loadNodeModule()
   {
   my($conffile) = $ENV{'UMENU_CONFIG'};
   $conffile = "./umenu.cnf" unless defined($conffile);
   print "Using config file $conffile.\n";

   open CONF, '<' . $conffile or die "FATAL ERROR: Could not open config file $conffile.";
   my @lines = <CONF>;
   close CONF;

   my @nodedirs;
   foreach my $line (@lines)
      {
      chomp $line;
      if($line =~ m/^\s*nodedir\s*=\s*([^\s]*)/)
         {
         my $dir = $1;
         if($dir =~ m/(.*)\$HOME(.*)/)
            {
            $dir = $1 . $ENV{'HOME'} . $2;
            }
         push @nodedirs, ($dir);
         }
      }

   if(@nodedirs)
      {
      unshift @INC, @nodedirs;
      }

   require Node;
   import Node;
   }


The first three lines define what to use as the configuration file, with umenu.cnf in the current directory being the default. The next three lines dump the configuration file to an array.

The loop parses each line for nodedir=, and if found, adds the value following the equal sign to array @nodedirs after tweaking that value to do the right thing if the value contains $HOME. At this point you have an array containing any and all values from the config file's nodedir= lines.
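To try the parsing loop by itself, you can lift it into its own subroutine and feed it a few lines. In this sketch parseNodeDirs() is a hypothetical helper, the home directory is passed in as an argument rather than read from %ENV, and the sample config lines are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: the parsing loop pulled out of loadNodeModule().
sub parseNodeDirs
   {
   my($home, @lines) = @_;
   my @nodedirs;
   foreach my $line (@lines)
      {
      chomp $line;
      if($line =~ m/^\s*nodedir\s*=\s*([^\s]*)/)
         {
         my $dir = $1;
         if($dir =~ m/(.*)\$HOME(.*)/)
            {
            $dir = $1 . $home . $2;      # replace the literal text $HOME
            }
         push @nodedirs, $dir;
         }
      }
   return @nodedirs;
   }

# Sample umenu.cnf content, made up for this sketch.
my @conf = (
   "# a comment line\n",
   "nodedir=/home/slitt/mymodules\n",
   "  nodedir = \$HOME/perl/lib\n",
   "menudir=/ignored\n",
   );

my @found = parseNodeDirs('/home/slitt', @conf);
print "$_\n" foreach @found;
```

Only the two nodedir lines produce entries, and the second one comes back with $HOME expanded.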

The if() statement after the loop adds that list of directories to @INC if the list contains elements, otherwise it does nothing. At the conclusion of the if() statement @INC contains the module directories intended by the user, who configured umenu.cnf.

The final two statements simply load Node.pm and import its subroutines and variables. They are a runtime substitute for use Node;.
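The runtime load can be tried without Node.pm at all. The following sketch writes a throwaway module into a temporary directory, adds that directory to @INC at runtime, and then loads the module with require. SketchNode is a hypothetical stand-in for Node.pm, and the method-call form of import is used here in place of import Node;.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Write a throwaway module so the sketch is self-contained.
# SketchNode is a hypothetical stand-in for Node.pm.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'SketchNode.pm');
open(my $fh, '>', $file) or die "Cannot write $file: $!";
print $fh <<'END_MODULE';
package SketchNode;
sub new
   {
   my($type) = $_[0];
   my($self) = {};
   $self->{'name'} = $_[1];
   bless($self, $type);
   return($self);
   }
sub name { return $_[0]->{'name'}; }
1;
END_MODULE
close($fh) or die "Cannot close $file: $!";

unshift @INC, $dir;          # runtime: $dir was not known at compile time
require SketchNode;          # runtime load, searched via the updated @INC
SketchNode->import();        # a no-op here, since SketchNode exports nothing

my $node = SketchNode->new('root');
print "Loaded from $INC{'SketchNode.pm'}\n";
print "Node name: ", $node->name(), "\n";
```

A use SketchNode; line in this same file would fail, because at compile time the directory isn't in @INC yet and the file doesn't even exist.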

Before instantiating any objects of type Node, in other words, calling Node->new() or new Node, you would run loadNodeModule().

Yes, this is a lot of work, but remember the old saying, it's better that the programmer do hard work than the user. Also, keep in mind that because everything in this subroutine happens at runtime, this functionality can be placed within the subroutine that originally reads the configuration file. HOWEVER, doing so often makes the code unreadable, so that with anything but the longest configuration files, it's usually better to incur the tiny performance penalty of rereading the configuration file than to endure the human task of deducing the functionality of require and import statements intermingled with other config file parsing.

