Troubleshooters.Com Presents

Linux Productivity Magazine

Volume 3 Issue 7, July 2004

Node.pm

Copyright (C) 2004 by Steve Litt. All rights reserved. Materials from guest authors copyrighted by them and licensed for perpetual use to Linux Productivity Magazine. All rights reserved to the copyright holder, except for items specifically marked otherwise (certain free software source code, GNU/GPL, etc.). All material herein provided "As-Is". User assumes all risk and responsibility for any outcome.

Have Steve Litt write you a quick application!

See also Troubleshooting Techniques of the Successful Technologist
and Rapid Learning: Secret Weapon of the Successful Technologist
by Steve Litt

[ Troubleshooters.Com | Back Issues |Troubleshooting Professional Magazine ]


 
Man is a tool-using animal.... Without tools he is nothing, with tools he is all.
-- Thomas Carlyle

CONTENTS

Editor's Desk

By Steve Litt
The original EMDL to UMENU converter was the worst code I ever wrote. Totally OOP and totally inscrutable, it consistently defied maintenance.

That's not to say it didn't work. It worked beautifully. I'd run it on my 2200 line EMDL (Easy Menu Definition Language) menu tree, and 15 seconds later I'd have a perfectly conforming set of UMENU menu definition files, ready to use with my UMENU system.
What is UMENU?

UMENU is a keyboard driven menu program that accepts user keystrokes, and in response invokes either submenus or programs. It runs on any Linux or Unix computer, and can substitute for the desktop manager's menu. It can also serve as the front end for commands with numerous options.

Crafted for the touch typist, only a single keystroke is required to trigger a submenu or program. It is not necessary to press the Enter key.

UMENU runs from a group of menu definition files, all ending in .mnu, each of which defines a single submenu and everything on it. The reason it was decided to have one file run one menu, instead of a single file defining the entire menu hierarchy, is because small files load fast relative to human keystrokes, so the user perceives no delay.

The trouble with the file per submenu paradigm is that it's cumbersome to author. To eliminate that disadvantage, EMDL (Easy Menu Definition Language), and an EMDL to UMENU converter, were created. Now an entire menu system can be created and maintained as a single tab indented outline file, which can then be compiled into performance enhancing UMENU files.

The only problem was, I wanted to make the EMDL parser generic, outputting not only UMENU menu definition files, but also IceWM menu files and whatever other menus were needed. But UMENU is based on many peer files, one for each menu, while IceWM menus are based on a single hierarchical file. There was no single place within the old EMDL parser to connect the IceWM Writer object. It would have been necessary to place hooks in several different places. Given the fact that 6 hours of study was required before making even the slightest modification to this program, adding IceWM capabilities would have made this program collapse under its own complexity.

The EMDL to UMENU parser was written in March 2002. Within 6 months I realized that adding an IceWM menu writer would be impossible. I put it aside.

-*-*-*-

A year before writing the EMDL to UMENU converter, my big project was a massive Troubleshooting Professional Magazine exploration of XML, including the DOM (Document Object Model) spec. There are many ways to describe DOM, but perhaps the most basic is that a DOM tree consists of a tree of nodes that can be navigated by functions defined in the DOM spec. Researching DOM, I found the following functions:
Why, I asked myself, would they have getNextSibling() and getPreviousSibling()? Instead of navigating siblings, why not have getNextChild() and getPreviousChild()? If you're going to recurse through a tree, that's the way it's normally done. These sibling-centric are useful only if one continually changes his reference point -- sort of like moving a checker (from a game of checkers) around the node tree. Why would one do that?

Then it hit me like a ton of bricks. By moving the reference point around the node tree, one can navigate it using iteration instead of recursion:

#                     TOP
# |
# .--------------+--------------.
# | | |
# N N N
# | | |
# .---+---. .---+---. .---+---.
# | | | | | | | | |
# N N N N N N N N N
# |
# .-+-.
# | |
# N N


while(1)

{
if ascending
if back at top node
quit
else
perform action on this node

if you can go down
go down
else if you can go right
go right
else
go up
}

The preceding algorithm is conceptually valuable, but to make it workable, when you test if you can go down, you must make sure you've never been at this node before, or you'll infinite-loop. In practice, you'd try to go down only if you are not ascending.

The XML issue of Troubleshooting Professional Magazine was a hit, I converted my business to Linux, and time moved on...

-*-*-*-

I finished "Troubleshooting Techniques of the Successful Technologist" in December 2001. It had been authored in LyX, after being outlined in VimOutliner. I had created scripts to display a subtree at a time, display an entire outline to  a specific level, convert an outline to pages, and number the outline. As time went on I realized all these scripts had something in common -- their data was a hierarchy (tree). Remembering the DOM spec, I created a Node object to represent each piece of data.

Over time, I migrated the Node object to its own .pm file (Node.pm). Over time, I added an object to parse a tab indented outline (OutlineParser) and an object to walk and manipulate a tree of Node objects (Walker). Over time, Node.pm was debugged and became more robust. Time marched on...

-*-*-*-

I had little time to spare, but at the monthly installfests in June, July and August 2003, I rewrote the EMDL to UMENU converter, from scratch, using Node.pm. Doing so revealed several subtle bugs in Node.pm, which I fixed. Finally, regression tests proved that for a given small EMDL input, my new program produced the same UMENU files as the old program.

Now it was time to test with my 2200 line personal EMDL file, and I was afraid. The new algorithm repeatedly walked the Node tree, tweaking one little property at a time, instead of walking it once or twice and tweaking everything at once. The single property tweak algorithm made for a much simpler program, but at what cost to speed?

The old program took 15 seconds to run against my 2200 line personal EMDL file -- barely tolerable. If the new algorithm took much longer, it would be useless. I crossed my fingers and ran the program.

It took 2 seconds!

I danced around the room, rejoicing the 7.5x speed increase bestowed by Node.pm. In hindsight it was predictable. Node.pm keeps all data in memory, linked by very simple pointers (references actually). Methods primarily return such pointers. The application algorithm might have been suboptimal, but it was built on a very efficient Node.pm.

By early August 2003, Node.pm was a rock solid tool, and I had become expert with the tool by crafting the new EMDL to UMENU parser. I fully understood the power of Node.pm, but wasn't yet cognizant of either its ease of use or its wide applicability. That would come soon enough...


-*-*-*-

Between 1997 and 2003 I did OOP for OOP's sake. You can't blame me -- I was just mirroring what was going on in the IT world at large. The old "Structured paradigm" was considered obsolete -- the domain of old programmers who couldn't adapt.

As the new century progressed, I became increasingly disillusioned with OOP for OOP's sake. OOP's scalability advantages were greatly overrated. Unless done brilliantly, pure OOP produced code almost as unfathomable as the spaghetti code produced by our goto ancestors. OOP adherants had many design methodologies, most of which were more intuition than method. I began to feel there's got to be a better way.

In October 2003 there was a marathon thread on the VimOutliner list and some other lists concerning programming productivity. One thing led to another, and Noel Henson recommended using UNIX pipes to connect all modules, citing the \RDB RAD tool as an example. I tried it on a domain reporting program, and the results were horrible, slow and complex.

But the discussion led me to an understanding:

It's the Data, Stupid!

The more logic that can be moved out of procedural code and into data, the more maintainable the program. The more the program's data matches the underlying problem domain, the simpler and more elegant the program.

A long time ago I started every design with functional decomposition. In the late 1990's and early 2000's I started by asking: "if the program is considered a machine, what parts would it have?". Now I started every design with the question: "What is (are) the underlying data structure(s)?". Often the answer is "a hierarchy", or "a stack" or "a short list of entities with properties". Node.pm is a natural for hierarchies, stacks and lists with entities. I began using it more often.

In December 2003 I decided that UMENU needed a rewrite...


-*-*-*-

The UMENU code wasn't as bad as the EMDL to UMENU converter. But it wasn't good.

The code was written in 2 days, with an OOP design that sounded great in concept, but produced complexity upon coding. It sported four objects types:
  1. CHOICE <pertains to a menu choice>
  2. CONFIG <stores and handles configuration info>
  3. OS <stores and handles operating system dependencies>
  4. UMENUGLOBAL <repository for global variables>
It was ugly. CHOICE.pm was over 196 lines non-blank, non-comment, with CONFIG.pm at 84 lines, and OS.pm weighing in at around 120 lines. The main program, ./umenu.pl, was a whopping 400 lines. The objects called each other -- it was a mess.

The new program replaces each of the four objects with Node trees, so that the new program consists only of a main program (umenu.pl) and the Node.pm tool. Instead of needing to understand four different object, the maintenance programmer now needs to understand only the fully documented and debugged Node object. It's likely the programmer  already has experience with Node objects, either from EMDL or elsewhere.

Without all the back and forth of objects calling objects calling objects, the program is much simpler to understand. At 649 non-blank, non-comment lines, umenu.pl isn't small, but it's much more straightforward and maintainable. And I've learned some valuable lessons...

-*-*-*-

Objects are wonderful as data wrappers and as emulators for physical objects. In other roles, objects are not ideal, and their use often becomes contrived and convoluted.

If all you have is a hammer,  everything looks like a nail. Many recent programs have been written all OOP all the time, with lousy readability, maintainability, and yes, reusability. Programmers need tools other than OOP. Sometimes we need functional decomposition. Sometimes we need data analysis. Obviously, before any of this, we need to analyze the system being automated.

Occasionally we need to make tools. Tools should be widely applicable, reusable, easy and fast. Node.pm is such a tool. It's applicable any time hierarchical information needs to be manipulated. Its interface is simple, generic and complete, making it amazingly reusable with absolutely no modification. Its ability to granularize data manipulation makes it easy and fast to write and maintain. And the fact that it's built from simple pointers gives it fast runtime performance.

This month's Linux Productivity Magazine is devoted to the Node.pm tool. If you're a Perl developer, or a  Linux or free software user, this is your magazine. Enjoy!
Steve Litt is the author of Samba Unleashed.   Steve can be reached at his email address.

Help Publicize Linux Productivity Magazine

By Steve Litt
Loyal readers, I need your help.

For months I've publicized Linux Productivity Magazine, expanding it from a new magazine to a mainstay read by thousands. There's a limit to what I can do alone, but if you take one minute to help, the possibilities are boundless.

If you like this magazine, please report it to one of the Linux magazines. Tell them the URL, why you like it, and ask them to link to it.

I report it to them, but they don't take it very seriously when an author blows his own horn. When a hundred readers report the magazine, they'll sit up and take notice.

Reporting is simple enough. Just click on one of these links, and report the magazine. It will take less than 5 minutes.

News Mag
Submission URL or Email address
Comments
Slashdot
http://slashdot.org/submit.pl
Just fill in the short form.
LinuxToday
http://linuxtoday.com/contribute.php3
Just fill in the short form.
Linux Weekly News
lwn@lwn.net
Just tell them the URL, why you like it.
NewsForge
webmaster@linux.com
Just tell them the URL, why you like it.
VarLinux
http://www.varlinux.org/vlox/html/modules/news/submit.php
Just fill in the short form.
LinuxInsider.Com,
Newsfactor Network
http://www.newsfactor.com/perl/contact_form.pl?to=contact
Just tell them the URL, why you like it.
The Linux Knowledge Portal
webmaster@linux-knowledge-portal.org
Just tell them the URL, why you like it.
OS News
http://www.osnews.com/submit.php
Just tell them the URL, why you like it.
DesktopLinux
http://www.desktoplinux.com/cgi-bin/news_post.cgi
Only for LPM issues involving the Linux desktop, not for programming or server issues.

If you really like this magazine, please take 5 minutes to help bring it to a wider audience. Submit it to one of the preceding sites.
Steve Litt is the founder and acting president of Greater Orlando Linux User Group (GoLUG).   Steve can be reached at his email address.

GNU/Linux, open source and free software

By Steve Litt
Linux is a kernel. The operating system often described as "Linux" is that kernel combined with software from many different sources. One of the most prominent, and oldest of those sources, is the GNU project.

"GNU/Linux" is probably the most accurate moniker one can give to this operating system. Please be aware that in all of Troubleshooters.Com, when I say "Linux" I really mean "GNU/Linux". I completely believe that without the GNU project, without the GNU Manifesto and the GNU/GPL license it spawned, the operating system the press calls "Linux" never would have happened.

I'm part of the press and there are times when it's easier to say "Linux" than explain to certain audiences that "GNU/Linux" is the same as what the press calls "Linux". So I abbreviate. Additionally, I abbreviate in the same way one might abbreviate the name of a multi-partner law firm. But make no mistake about it. In any article in Troubleshooting Professional Magazine, in the whole of Troubleshooters.Com, and even in the technical books I write, when I say "Linux", I mean "GNU/Linux".

There are those who think FSF is making too big a deal of this. Nothing could be farther from the truth. The GNU General Public License, combined with Richard Stallman's GNU Manifesto and the resulting GNU-GPL License, are the only reason we can enjoy this wonderful alternative to proprietary operating systems, and the only reason proprietary operating systems aren't even more flaky than they are now. 

For practical purposes, the license requirements of "free software" and "open source" are almost identical. Generally speaking, a license that complies with one complies with the other. The difference between these two is a difference in philosophy. The "free software" crowd believes the most important aspect is freedom. The "open source" crowd believes the most important aspect is the practical marketplace advantage that freedom produces.

I think they're both right. I wouldn't use the software without the freedom guaranteeing me the right to improve the software, and the guarantee that my improvements will not later be withheld from me. Freedom is essential. And so are the practical benefits. Because tens of thousands of programmers feel the way I do, huge amounts of free software/open source is available, and its quality exceeds that of most proprietary software.

In summary, I use the terms "Linux" and "GNU/Linux" interchangably, with the former being an abbreviation for the latter. I usually use the terms "free software" and "open source" interchangably, as from a licensing perspective they're very similar. Occasionally I'll prefer one or the other depending if I'm writing about freedom, or business advantage.

Steve Litt is the author of Troubleshooting Techniques of the Successful Technologist.   Steve can be reached at his email address.

Node.pm Installation

By Steve Litt
For the purposes of this magazine, install Node.pm in /home/yourid/Node. as follows:
  1. Download http://troubleshooters.a3b3.com/projects/Node/download/0.2.0/Node.0.2.0.tgz into your home directory.
  2. tar xzvf Node.0.2.0.tgz
  3. cd ~/Node
  4. ./example_hello.pl
If everything's gone correctly, example_hello.pl should yield the following results:

[slitt@mydesk Node]$ ./example_hello.pl

::: myname ::: mytype ::: myvalue :::
[slitt@mydesk Node]$

If so, go on to the next article. If not, check that Node.pm exists in the directory, as well as example_hello.pl. Troubleshoot as necessary.
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Node.pm Hello World

By Steve Litt
Create the following hello.pl in your home directory:
#!/usr/bin/perl -w

# Copyright (C) 2004 by Steve Litt
# Licensed with the GNU General Public License, Version 2
# ABSOLUTELY NO WARRANTY, USE AT YOUR OWN RISK
# See http://www.gnu.org/licenses/gpl.txt

use strict; # prevent hard to find errors
use Node;

my $topNode = Node->new("myname", "mytype", "myvalue");
print "\n::: ";
print $topNode->getName(), " ::: ";
print $topNode->getType(), " ::: ";
print $topNode->getValue(), " :::\n";


The "use Node" line includes the Node.pm tool. The Node->new() command instantiates a new Node object with name, type and value according to its three arguments.

Now run it, and watch what happens:

[slitt@mydesk slitt]$ ./hello.pl
Can't locate Node.pm in @INC (@INC contains:
/usr/lib/perl5/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/5.8.1
/usr/lib/perl5/site_perl/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.1
/usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.1/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.1
/usr/lib/perl5/vendor_perl/5.8.0
/usr/lib/perl5/vendor_perl .) at ./hello.pl line 9.
BEGIN failed--compilation aborted at ./hello.pl line 9.
[slitt@mydesk slitt]$

As you can see, the program could not find Node.pm because it was not installed on Perl's module path. There are many ways to address this problem. To give this document the ultimate applicability with minimal change, we'll create a shellscript to run the program, so that no matter where you home directory, it will run correctly. The shellscript is called run, is placed in your home directory, and looks like this:

#!/bin/sh
perl -w -I$HOME/Node $@

Be sure to chmod the run file executable. Now watch what happens when you run hello.pl through the run script:

[slitt@mydesk slitt]$ ./run hello.pl

::: myname ::: mytype ::: myvalue :::
[slitt@mydesk slitt]$

The hello.pl script runs correctly. All exercises in this magazine will use the run script to pass the proper Node.pm path to the perl program.

The preceding program created a single Node object with a name, a type and a value, and then output the information from that Node object.

Congratulations! You've just created a Node.pm program, including tackling the sticky issue of defining the path to Node.pm.
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Parsing an Outline

By Steve Litt
The preceding article was a simple proof of concept instantiating a single Node object and printing its name, type and value. This article guides you through parsing an outline into a Node tree, and then outputting the results.

Node.pm was originally created to parse and manipulate tab indented outlines. That's why it has an OutlineParser object whose entire job is to read an outline and convert it to a Node tree.

This article presents a typical, if oversimplified, outline manipulation program. At the most basic level, this is the logic of outline manipulation programs:
  1. Use an OutlineParser object to parse the outline into a Node tree
  2. Instantiate a Callbacks object.
  3. Use one or more Walker objects to manipulate the Node tree
  4. Use a Walker object to print the Node tree
The program in this article does not do #3, manipulate the Node tree. In that respect it's still a proof of concept.

Use an OutlineParser object to parse the outline into a Node tree

The OutlineParser object has several properties. You can define a comment character so it skips lines whose first nonblank is that character. You must define its input source either as stdin (fromStdin()) or a file (fromFile()). If from a file, the filename is passed as an argument to the parse() method. Once the OutlineParser object is configured, its parse() method is called to do the actual work of parsing the outline into a Node tree.

One more interesting point. The OutlineParser object creates a brand new Node object to serve as the ultimate ancestor of all Node objects representing lines in the outline. That way, an outline having multiple top level lines can be represented, with those top level lines being immediate children of the brand new Node object. This is a difference between the Node.pm OutlineParser object and the DOM XML parser, which cannot accommodate multiple top level entries. The OutlineParser created Node object has name="Header Node", type="Head", and value="Inserted by OutlineParser".

Once the outline is parsed into a tree of Node objects, it's ready to manipulate. Such manipulation is performed by Walker objects, which traverse the tree as if being recursed. At each node a callback function is called to perform work on the Node object.

Instantiate a Callbacks object.

Callback routines are called by the Walker object at two events:
  1. Upon first entry into a Node <called the Entry callback>
  2. Upon reentry into the Node from its children <called the return callback>
Note that the code in Node.pm calls the return callback the "exit callback". This is misleading because the return callback is NOT called on Node objects with no children.

Any callback routines used by the Node.pm tool's Walker object MUST be methods of an object. They cannot be free standing subroutines. The reason for making callbacks part of an object is so that persistent data can be accessed and manipulated by the callbacks, without the need for global variables.

The object containing the callback routine(s) needs no special naming convention. Readability favors the name Callbacks, but that's just custom. Also, there can be several different Callback objects. However, be sure to group Callback routines by needed persistent data.

A Callback object is typically instantiated like this:
my $callbacks = Callbacks->new();
However, there's nothing to stop a Callback object's new() method from instantiating variables or performing other functions.

Once the Callbacks object is instantiated, Walker objects can be instantiated using the Callbacks object's methods.

Use a Walker object to print the Node tree

A Walker object "walks" the Node tree, taking action on each Node object. The Walker object implements the algorithm discussed in this month's Editor's Desk article, as follows:

#                     TOP
# |
# .--------------+--------------.
# | | |
# N N N
# | | |
# .---+---. .---+---. .---+---.
# | | | | | | | | |
# N N N N N N N N N
# |
# .-+-.
# | |
# N N


while(1)

{
if ascending
if back at top node
quit
else
perform action on this node

if you can go down
go down
else if you can go right
go right
else
go up
}

The preceding is for the most part the exact algorithm implemented by the Walker object. Notice the perform action on this node line in the preceding pseudocode. The action obviously cannot be defined within the Walker code because it changes from application to application. Therefore, the Walker code calls a dummy subroutine reference, called a callback routine, to do the work. It is then up to the application programmer to link the callback routine with the Walker object, after which the Walker object "magically does the right thing".

The only remaining question is, what does the callback routine do? Here it is:

package Callbacks;
sub cbPrintNode()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;}
print $level, " ::: ";
print $checker->getValue();
print "\n";
}

As you can see in the preceding, the callback, which is a method of an object, has three arguments: 0) The object it's attached to, 1) The Node for which it was called, and 2) The level of that node in the node hierarchy. The Node argument is called $checker because the Walker action resembles a checker slid from Node to Node. The $level argument isn't strictly necessary, as the level of any Node could be deduced by repeated calls to getParent(). However, such repeated getParent() calls would be a major performance hit, and would also complicate callback routines. By passing the level as an argument to the callback, the Walker object makes the program faster and the programming easier.

The line starting with unless prevents the algorithm from being run on an undefined Node. In most real callback routines, several lines at the top return under certain conditions. By writing the routine like that, nested if statements are avoided, and the algorithm is kept simple.

This particular callback simply prints the level, followed by " ... ", followed by the value of the Node object.

The program code follows:

#!/usr/bin/perl -w

# Copyright (C) 2004 by Steve Litt
# Licensed with the GNU General Public License, Version 2
# ABSOLUTELY NO WARRANTY, USE AT YOUR OWN RISK
# See http://www.gnu.org/licenses/gpl.txt

use strict;

use Node;

package Callbacks;
sub new()
{
my($type) = $_[0];
my($self) = {};
bless($self, $type);
return($self);
}


sub cbPrintNode()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;}
print $level, " ::: ";
print $checker->getValue();
print "\n";
}

package Main;

sub main()
{
#### PARSE FROM FILE README.otl
my $parser = OutlineParser->new();
$parser->setCommentChar("#");
$parser->fromFile();
my $topNode=$parser->parse("Node/README.otl");

#### INSTANTIATE THE Callbacks OBJECT
my $callbacks = Callbacks->new();

#### WALK THE NODE TREE,
#### OUTPUTTING LEVEL AND TEXT
my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbPrintNode, $callbacks]
);
$walker->walk();
}

main();

Run the preceding, and notice the output:
./run test.pl | less
0 ::: Inserted by OutlineParser
1 ::: MANUAL FOR THE Node.pm Tool
2 ::: Version 0.2.0 released 5/13/2004
1 ::: License
2 ::: Litt Perl Development Tools License, version 1
3 ::: See COPYING file
3 ::: This license is the GNU GPL with an exception
4 ::: See COPYING.GPL
2 ::: NO WARRANTY!!!!! See COPYING.GPL
1 ::: Purpose
2 ::: Handling hierarchies in Perl
2 ::: Implements a tree of nodes
2 ::: Each node has a name, a type, a value, and optionally attributes
2 ::: Each node can have zero, one or many attributes
2 ::: Each attribute has a name and a value
2 ::: Especially made to handle tab indented outlines in memory
1 ::: Learning Node.pm
2 ::: Learn from the example programs: Study them in this order:
3 ::: example_hello.pl
3 ::: example_parse.pl
3 ::: example_otl2markup.pl
3 ::: example_attribs.pl
3 ::: example_bylevel.pl

Lessons Learned

This article covered most of the elements of a Node.pm application. We used an OutlineParser object to parse the outline into a Node tree, instantiated a callback object, instantiated a Walker object linked to the callback routine designed to print out the nodes, and then set the walker in action with the walk() method.

Callback routines must be methods of an object, and have three arguments: 1) The callback object, 2) the Node object on which it's called, and 3) the level of that Node.
Steve Litt is the author of Rapid Learning: Secret Weapon of the Successful Technologist.   Steve can be reached at his email address.

Outputting Markup

By Steve Litt
The preceding article has all the elements of a practical Node.pm app, but it isn't really practical. This article converts an outline to markup, with start and end tags.

If you think about the concept of tags and end tags, they're nested. If an object contains subobjects, its end tag is not printed until all its subobjects' tags and end tags are printed. This feature can be implemented with complex break logic, or with a stack. Fortunately, with Node.pm it's as simple as printing the end tags with the return callback.

Also introduced in this program is indentation, which is as simple as a tab printing loop indexed by $level. Here's the code:


#!/usr/bin/perl -w

# Copyright (C) 2004 by Steve Litt
# Licensed with the GNU General Public License, Version 2
# ABSOLUTELY NO WARRANTY, USE AT YOUR OWN RISK
# See http://www.gnu.org/licenses/gpl.txt

use strict;

use Node;

package Callbacks;
sub new()
{
my($type) = $_[0];
my($self) = {};
bless($self, $type);
return($self);
}

sub cbPrintTag()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;}

#### PRINT START TAG AND CONTENT
for(my $n = 0; $n < $level; $n++) {print "\t";}
print "<node level=", $level, ">";
print $checker->getValue() if $checker->hasValue();

#### IF THIS IS A LEAF LEVEL ITEM, PRINT THE
#### END TAG IMMEDIATELY. OTHERWISE, THE
#### RETURN CALLBACK WILL TAKE CARE OF THE END TAG.
unless($checker->hasFirstChild())
{
print "</node>";
}

#### PRINT NEWLINE
print "\n";
}

sub cbPrintEndTag()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;}

#### PRINT END TAG FOR PARENT
for(my $n = 0; $n < $level; $n++) {print "\t";}
print "</node>";
print "\n";
}

package Main;

sub main()
{
#### PARSE FROM FILE README.otl
my $parser = OutlineParser->new();
$parser->setCommentChar("#");
$parser->fromFile();
my $topNode=$parser->parse("Node/README.otl");

#### INSTANTIATE THE Callbacks OBJECT
my $callbacks = Callbacks->new();

#### WALK THE NODE TREE,
#### OUTPUTTING TAG, VALUE, AND END TAG
my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbPrintTag, $callbacks],
[\&Callbacks::cbPrintEndTag, $callbacks]
);
$walker->walk();
}

main();

When running it through the less command, you see that it succeeds:

./run test.pl | less
<node level=0>Inserted by OutlineParser
<node level=1>MANUAL FOR THE Node.pm Tool
<node level=2>Version 0.2.0 released 5/13/2004</node>
</node>
<node level=1>License
<node level=2>Litt Perl Development Tools License, version 1
<node level=3>See COPYING file</node>
<node level=3>This license is the GNU GPL with an exception
<node level=4>See COPYING.GPL</node>
</node>
</node>
<node level=2>NO WARRANTY!!!!! See COPYING.GPL</node>
</node>
<node level=1>Purpose
<node level=2>Handling hierarchies in Perl</node>
<node level=2>Implements a tree of nodes</node>
<node level=2>Each node has a name, a type, a value, and optionally attributes</node>
<node level=2>Each node can have zero, one or many attributes</node>
<node level=2>Each attribute has a name and a value</node>
<node level=2>Especially made to handle tab indented outlines in memory</node>
</node>
<node level=1>Learning Node.pm
<node level=2>Learn from the example programs: Study them in this order:
<node level=3>example_hello.pl</node>
<node level=3>example_parse.pl</node>
<node level=3>example_otl2markup.pl</node>
<node level=3>example_attribs.pl</node>
<node level=3>example_bylevel.pl</node>
<node level=3>example_delete.pl</node>
<node level=3>example_insert.pl</node>
<node level=3>example_nodepath.pl</node>
</node>

It's pretty self-explanatory. The only notable code feature is the instantiation of the Walker object, which is instantiated with two callback routines, not just one:

my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbPrintTag, $callbacks],
[\&Callbacks::cbPrintEndTag, $callbacks]
);


The second callback routine is the return callback, triggering when  a Node object is revisited after all its children have been visited. That second callback routine prints the end tag. Notice that the return callback is initiated only upon return to the Node object from its children, meaning that it's never called for leaf level Node objects. To print an end tag for leaf level nodes, the entry callback itself prints the end tag, but only if the node passes the test for being childless.

Lessons Learned

Sometimes you need to perform an action on a node, but only after all its children have been considered. End tags, tree totals, and percentage completion on trees are just a few examples. When this is a requirement, use the return callback to print the final information. Remember, return callbacks are not called on childless nodes, so if childless nodes need this same feature, the entry callback must test for children, and if there are none, print the final information.

Correct indentation is as simple as printing a Tab character $level times, typically in a loop.

Steve Litt is the author of Samba Unleashed.   Steve can be reached at his email address.

Using Attributes

By Steve Litt
A node can have zero or more attributes. An attribute is a piece of information about the node. Each attribute has a name and a value, so they're key/value pairs.

Almost anything you can do with an attribute can be done with a subnode, so you might wonder why attributes are used, and when to use each. Here are some general guidelines:
Node.pm gives the application programmer two different ways to access attributes of a node:
Collective access to attributes includes these:
Single attribute access is provided by these methods
Notice that although there's a remove method for single access, there is no remove method for collective attribute access. This is an oversight that should probably be corrected in a future version.

The program discussed in this article prints out each node, properly indented, with an asterisk as a bullet. Below each node is a list of its attributes, in key=value format, enclosed in parentheses. Below the attribute line, if and only if the node has children, is a message stating how many children. This message is in the program strictly for instructional purposes in that it shows how to retrieve a single attribute.

When you run the program, you'll notice that every Node object except the top one has a _lineno attribute. This was added by the OutlineParser object so that you can give line numbers with error messages. The top node has no such attribute because it was created by the OutlineParser object, not by the parsing of the file.

Here is the source code:

#!/usr/bin/perl -w

# Copyright (C) 2004 by Steve Litt
# Licensed with the GNU General Public License, Version 2
# ABSOLUTELY NO WARRANTY, USE AT YOUR OWN RISK
# See http://www.gnu.org/licenses/gpl.txt

use strict;

use Node;

package Callbacks;
sub new()
{
my($type) = $_[0];
my($self) = {};
bless($self, $type);
return($self);
}

sub cbCountChildren()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;}

my $childCount=0;
if($checker->hasFirstChild())
{
$childCount++;
my $checker2 = $checker->getFirstChild();
while($checker2->hasNextSibling())
{
$childCount++;
$checker2 = $checker2->getNextSibling();
}
$checker->setAttribute("children", $childCount);
}
}

sub cbPrintNode()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;}

#### PRINT NODE'S VALUE
for(my $n=0; $n < $level; $n++) {print "\t";}
print "* ";
print $checker->getValue();
print "\n";

#### PRINT NODE'S ATTRIBUTES
for(my $n=0; $n <= $level; $n++) {print "\t";}
print "(";

my %attribs = ();
%attribs = $checker->getAttributes() if $checker->hasAttributes();

my @keys = keys(%attribs);
foreach my $key (sort @keys)
{
print $key, "=", $attribs{$key}, "; ";
}

print ")\n";

#### PRINT SPECIAL MESSAGE ABOUT CHILDREN IF IT HAS ANY
if($checker->hasAttribute("children"))
{
for(my $n=0; $n <= $level; $n++) {print "\t";}
print "This node has ";
print $checker->getAttribute("children");
print " children.\n";
}
}

package Main;

sub main()
{
#### PARSE FROM FILE README.otl
my $parser = OutlineParser->new();
$parser->setCommentChar("#");
$parser->fromFile();
my $topNode=$parser->parse("Node/README.otl");


#### INSTANTIATE THE Callbacks OBJECT
my $callbacks = Callbacks->new();

#### WALK THE NODE TREE,
#### COUNTING EACH NODE'S CHILDREN
my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbCountChildren, $callbacks]
);
$walker->walk();

#### WALK THE NODE TREE,
#### PRINTING EACH NODE'S VALUE, ATTRIBUTE LIST AND CHILD MESSAGE
$walker = Walker->new
(
$topNode,
[\&Callbacks::cbPrintNode, $callbacks]
);
$walker->walk();
}

main();

The following is a partial output:

[slitt@mydesk slitt]$ ./run test.pl | head -n20
* Inserted by OutlineParser
(children=7; )
This node has 7 children.
* MANUAL FOR THE Node.pm Tool
(_lineno=1; children=1; )
This node has 1 children.
* Version 0.2.0 released 5/13/2004
(_lineno=2; )
* License
(_lineno=3; children=2; )
This node has 2 children.
* Litt Perl Development Tools License, version 1
(_lineno=4; children=2; )
This node has 2 children.
* See COPYING file
(_lineno=5; )
* This license is the GNU GPL with an exception
(_lineno=6; children=1; )
This node has 1 children.
* See COPYING.GPL
./run: line 2: 15281 Broken pipe perl -w -I$HOME/Node $@
[slitt@mydesk slitt]$

Lessons Learned

Any node can have zero, one or multiple attributes. Each attribute is a key/value pair. No two attributes of the same node can have the same key. Attributes can be accessed either collectively with hasAttributes(), getAttributes() and/or setAttributes(), or singly via hasAttribute($name), getAttribute($name), setAttribute($name, $value), and/or removeAttribute($name).
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Accessing Nodes by Level

By Steve Litt
Typical access through a hierarchy is via recursive order. In other words, go deep before going next. In a normal outline, recursive order is simply reading down the outline from the top of the page to the bottom.

Sometimes recursive order isn't what you want. Sometimes you might need to present everything at the top level, then everything at the second level, the third, continuing down to the bottom leaf levels. You might need to print each node with its direct children, but no deeper descendents. If so, you can do that by giving your callback object two properties: currentLevel and childrenAtLevel.

The currentLevel property serves as a marker for the callback, which returns with no action unless the level matches currentLevel. At each level, the callback records any children found, incrementing childrenAtLevel. The level by level loop terminates after any iteration failing to increment childrenAtLevel.

The program displayed in this article traverses and prints a node tree level by level. In the output, each level is preceded by a level header. The value of each node at that level is printed, indented. So that you can know that node's parentage, above each group of sibling nodes is a header displaying the full ancestry. That full ancestry is calculated by the Callbacks::cbCalculateFullPath() callback routine, and stored in each node's fullpath attribute.


#!/usr/bin/perl -w

# Copyright (C) 2004 by Steve Litt
# Licensed with the GNU General Public License, Version 2
# ABSOLUTELY NO WARRANTY, USE AT YOUR OWN RISK
# See http://www.gnu.org/licenses/gpl.txt

use strict;

use Node;


package Callbacks;
sub new()
{
my($type) = $_[0];
my($self) = {};
bless($self, $type);
$self->{'childrenatlevel'} = 0;
$self->{'currentlevel'} = 0;
$self->{'previousparentfullpath'} = "initialize";
return($self);
}

sub getChildrenAtLevel(){return $_[0]->{'childrenatlevel'};}
sub setChildrenAtLevel(){$_[0]->{'childrenatlevel'} = $_[1];}
sub incChildrenAtLevel(){$_[0]->{'childrenatlevel'}++;}

sub getCurrentLevel(){return $_[0]->{'currentlevel'};}
sub setCurrentLevel(){$_[0]->{'currentlevel'} = $_[1];}

sub cbCalculateFullPath()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;} # don't process undef node

if($checker->hasParent)
{
my $fullpath = $checker->getParent()->getAttribute("fullpath");
$fullpath .= "/";
$fullpath .= $checker->getValue();
$checker->setAttribute("fullpath", $fullpath);
}
else
{
$checker->setAttribute("fullpath", $checker->getValue());
}
}

sub cbPrintNode()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;} # don't process undef node

#### DO NOTHING UNLESS THIS NODE IS AT THE CURRENTLY SOUGHT LEVEL
return unless $level == $self->getCurrentLevel();

#### DO NOTHING UNLESS THIS NODE HAS CHILDREN
return unless $checker->hasFirstChild();

#### PRINT HEADER
print "\n", $checker->getAttribute("fullpath"), "\n";

#### PRINT CHILDREN AND COUNT CHILDREN AT LEVEL
my $checker2 = $checker->getFirstChild();
print "\t", $checker2->getValue(), "\n";
$self->incChildrenAtLevel();

while($checker2->hasNextSibling())
{
$checker2 = $checker2->getNextSibling();
print "\t", $checker2->getValue(), "\n";
$self->incChildrenAtLevel();
}
}


package Main;

sub main()
{
#### PARSE FROM FILE README.otl
my $parser = OutlineParser->new();
$parser->setCommentChar("#");
$parser->fromFile();
my $topNode=$parser->parse("Node/README.otl");


#### INSTANTIATE THE Callbacks OBJECT
my $callbacks = Callbacks->new();

#### WALK THE NODE TREE,
#### CALCULATING FULL PATHS AND PUTTING THEM IN AN ATTRIBUTE
my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbCalculateFullPath, $callbacks]
);
$walker->walk();

#### PRINT LEVEL 0
print "\n\n********** BEGIN LEVEL ", "0", "\n";
print "\t", $topNode->getValue(), "\n";

#### SET STARTING PARENT LEVEL,
#### AND SET $childCount SO THE LOOP WILL FIRE THE FIRST TIME
my $level=0;
my $childCount=9999;

#### LOOP THROUGH EACH LEVEL, QUIT WHEN NO MORE CHILDREN
while($childCount > 0)
{
print "\n\n********** BEGIN LEVEL ", $level + 1, "\n";
$callbacks->setChildrenAtLevel(0);
$callbacks->setCurrentLevel($level);
my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbPrintNode, $callbacks]
);
$walker->walk();
$childCount = $callbacks->getChildrenAtLevel();
$level++;
}
}

main();

z
./run ./test.pl | head -n70

********** BEGIN LEVEL 0
Inserted by OutlineParser


********** BEGIN LEVEL 1

Inserted by OutlineParser
MANUAL FOR THE Node.pm Tool
License
Purpose
Learning Node.pm
File manifest
Objects
Installation


********** BEGIN LEVEL 2

Inserted by OutlineParser/MANUAL FOR THE Node.pm Tool
Version 0.2.0 released 5/13/2004

Inserted by OutlineParser/License
Litt Perl Development Tools License, version 1
NO WARRANTY!!!!! See COPYING.GPL

Inserted by OutlineParser/Purpose
Handling hierarchies in Perl
Implements a tree of nodes
Each node has a name, a type, a value, and optionally attributes
Each node can have zero, one or many attributes
Each attribute has a name and a value
Especially made to handle tab indented outlines in memory

Inserted by OutlineParser/Learning Node.pm
Learn from the example programs: Study them in this order:
That's the only way to learn this tool
Do each example program in order
Example programs

Inserted by OutlineParser/File manifest
Documentation
Licensing
Node.pm file
Example Programs
Sample node path config file (for example_nodepath.pl)
Sample outline (used for example_delete.pl)

Inserted by OutlineParser/Objects
Node.pm implements three object types:
Node
OutlineParser
Walker

Inserted by OutlineParser/Installation
See INSTALL file


********** BEGIN LEVEL 3

Inserted by OutlineParser/License/Litt Perl Development Tools License, version 1
See COPYING file
This license is the GNU GPL with an exception

Inserted by OutlineParser/Learning Node.pm/Learn from the example programs: Study them in this order:
example_hello.pl
example_parse.pl
example_otl2markup.pl
example_attribs.pl

Lessons Learned

We kept three variables in the callback object (children at level, current level, and full path). Access to this capability is the reason that Node.pm demands all callbacks be an object method. And we walked the tree multiple times -- one for each level. This was done in the main routine.
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Deleting Nodes

By Steve Litt
Because Perl is a garbage collection language, node deletion DOES NOT deallocate memory and the like. However, in the absense of a copy of the node, it will be garbage collected and unavailable. The deletion process also specificly undef's the deleted node's first and last children.

You noticed I mentioned keeping a copy of the deleted node. The algorithm of a Walker object moves a node around the tree like a checker. Calling $checker->deleteSelf() does not render $checker undefined. In fact, $checker still has its parent, nextSibling and previousSibling pointers intact. What this means is that the Walker object's next iteration goes to exactly the same node as it would have if the deletion had not taken place. In other words, you do not need to "move the checker  back one" after a deletion. This makes deletion algorithms very simple.

There may come a time when you want to delete a node but keep its children. In that case, you must first attach its children to nodes that will not be deleted.

This article showcases a program that deletes every node containing the word deleteme, and all of its children. The input file is the deletetest.otl file packaged with the Node.pm distribution. This file looks like this:

[slitt@mydesk slitt]$ cat Node/deletetest.otl
Top
Level2
Level2b
Level3
deleteme
gone
gone
deleteme
gone
gone
this should stay
deleteme
deleteme
gone
gone
Level3b
2level2
Top2
[slitt@mydesk slitt]$

 A Walker object walks the tree, calling a callback routine that tests for the deleteme string, and if found deletes the calling node. The result is as follows:
z
[slitt@mydesk slitt]$ ./run ./test.pl
Inserted by OutlineParser
Top
Level2
Level2b
Level3
this should stay
Level3b
2level2
Top2
[slitt@mydesk slitt]$

Please remember,
x
#!/usr/bin/perl -w

# Copyright (C) 2004 by Steve Litt
# Licensed with the GNU General Public License, Version 2
# ABSOLUTELY NO WARRANTY, USE AT YOUR OWN RISK
# See http://www.gnu.org/licenses/gpl.txt

use strict;

use Node;

package Callbacks;
sub new()
{
my($type) = $_[0];
my($self) = {};
bless($self, $type);
return($self);
}

sub cbDelete()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return;}

#### DELETE THIS NODE IF ITS VALUE CONTAINS deleteme
my $text = "init";
$text = $checker->getValue() if $checker->hasValue();
if($text =~ m/deleteme/)
{
$checker->deleteSelf();
}
}

sub cbPrintNode()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return -999;}

for(my $n=0; $n < $level; $n++) {print "\t";}
print $checker->getValue(), "\n";
}

package Main;

sub main()
{
#### PARSE FROM FILE deletetest.otl
my $parser = OutlineParser->new();
$parser->setCommentChar("#");
$parser->fromFile();
my $topNode=$parser->parse("Node/deletetest.otl");

#### INSTANTIATE THE Callbacks OBJECT
my $callbacks = Callbacks->new();

#### WALK THE NODE TREE,
#### DELETING NODES WITH "deleteme" IN THEM
my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbDelete, $callbacks]
);
$walker->walk();

#### WALK THE NODE TREE,
#### OUTPUTTING INDENTED VALUE
$walker = Walker->new
(
$topNode,
[\&Callbacks::cbPrintNode, $callbacks]
);
$walker->walk();
}

main();

Lessons Learned

Deletion is as simple as calling the Node::deleteSelf() method. This is Perl, so the deletion doesn't actually deallocate the node's memory. Instead, it sets pointers of its parents and siblings to "cut it out of the picture". Because the deleted node's parent and sibling pointers are still intact, there is no need for the Walker object or any other algorithm to "backslide" the pointer. The algorithm can continue. Once all references to the deleted node are gone, Perl's garbage collection reclaims the node's memory.

Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Inserting Nodes

By Steve Litt
Node.pm isn't rocket science. A node holds these pieces of data:
A node also holds the following location/navigation information:
To insert a new first child for node $currNode, you must:
  1. Create a new node
  2. Set the new node's parent pointer to $currNode
  3. If $currNode has a first child pointer, set the new node's next sibling pointer to the current first child
  4. If $currNode has a first child pointer, set the current first child's previous sibling pointer to the new node
  5. Set $currNode's first child pointer to the new node
If the preceding seems like too much of a hassle, simply do this:
$currNode->insertFirstChild(Node->new("myName", "myType", "myValue"));
There's a Node::insertLastChild() that does the same, except inserts the node as the last child.

What if you want to insert a child in the middle, instead of as the first or last child? In that case, navigate to the node just before where you want to insert the new one, and do this, assuming the node before the one you want inserted is called $currNode:
$currNode->insertSiblingAfterYou(Node->new("myName", "myType", "myValue"));
This single call does the following:
  1. Create the new node
  2. Set the new node's parent pointer to the parent of $currNode
  3. Set the new node's previous sibling pointer to $currNode
  4. If $currNode has a next sibling pointer, set the new node's next sibling pointer to the next sibling pointer of $currNode
  5. If $currNode has a next sibling pointer, set the previous sibling pointer of $currNode's next sibling to the new node
  6. If $currNode does not have a next sibling pointer, set $currNode's parent's last child pointer to the new node
  7. Set $currNode's next sibling pointer to the new node
As you can see, methods like Node::insertFirstChild() and  Node::insertSiblingAfterYou() make tree insertions a breeze. The biggest challenge, and it's not much of a challenge at that, is to navigate either to the node before the intended insertion, after the intended insertion, or if the intended insertion is the first or last child of a parent node, move to that parent node.

This article builds a program to create and manipulate an appointment calender as a node tree, the hierarchy being:
The calendar is built with the help of two arrays: @monthNames and @monthLengths. @monthLengths does not take into account leap years because this is an exercise, not a practical program. The following is a general description of what this program does:
This is a pretty challenging program. It's long, with lots of steps. The section that switches the two nodes is extremely complex, but valuable because it teaches you working with clone nodes.

Here's the code, all 468 lines of it:

#!/usr/bin/perl -w

# Copyright (C) 2004 by Steve Litt
# Licensed with the GNU General Public License, Version 2
# ABSOLUTELY NO WARRANTY, USE AT YOUR OWN RISK
# See http://www.gnu.org/licenses/gpl.txt

use strict; # prevent hard to find errors

use Node; # Use Node.pm tool

package Callbacks;
sub new()
{
my($type) = $_[0];
my($self) = {};
bless($self, $type);
return($self);
}

#=================================================================
# This callback places asterisks in the value of years, months and
# days whose anscestors have appointments, one asterisk per appointment
#=================================================================
sub cbMakeMarks()
{
my($self, $checker, $level) = @_;
return unless defined($checker); # don't process undef node

#### PROCESS ONLY DAY, MONTH OR YEAR NODES
return unless (
$checker->getType() eq "Day" ||
$checker->getType() eq "Month" ||
$checker->getType() eq "Year"
);

#### COUNT APPOINTMENTS IN DESCENDENTS
my $count = 0;
my $childNode = $checker->getFirstChild();
while(defined($childNode))
{
if($checker->getType() eq "Day")
{
if(defined($childNode->getValue()))
{
$count++;
}
}
else
{
if($childNode->hasAttribute("appointments"))
{
$count += $childNode->getAttribute("appointments");
}
}
$childNode = $childNode->getNextSibling();
}

#### PLACE COUNT IN ATTRIBUTE
$checker->setAttribute("appointments", $count);

#### MARK STARS, ONE FOR EACH APPOINTMENT IN DESCENDENTS
if($count > 0)
{
my $string;
for(my $n=0; $n < $count; $n++){$string .= '*';}
$checker->setValue($string);
}
}

#=================================================================
# This callback operates ONLY on day nodes. When
# called from a day node, it inserts hourlong appointment slots
# starting at 8am and ending at 5pm. The code is pretty
# straightforward.
#=================================================================
sub cbInsertHours()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return -999;} # don't process undef node


return unless $checker->getType() eq "Day"; # Insert hours under days only

my $checker2;
for(my $n=8; $n <= 16; $n++)
{
my $startHour = "$n:00";
my $n2 = $n + 1;
my $endHour = "$n2:00";
my $node = Node->new("$startHour" . "-" . "$endHour", "Hour", undef);
if($checker->hasFirstChild())
{
$checker2 = $checker2->insertSiblingAfterYou($node);
}
else
{
$checker2 = $checker->insertFirstChild($node);
}
}
}

#=================================================================
# The cbPrintNode() callback prints the name of the node,
# and its value if a value is defined. It's very straighforward.
#=================================================================
sub cbPrintNode()
{
my($self, $checker, $level) = @_;
unless (defined($checker)) {return -999;} # don't process undef node

#### DON'T PRINT LEVEL 0 (CALENDER)
return if $level == 0;

for(my $n=1; $n < $level; $n++) { print "\t";}

print $checker->getName() if $checker->hasName();
print ": ";

print $checker->getValue() if $checker->hasValue();
print "\n";
}


package Main;

###########################################################################
# The makeAppointments() subroutine manually makes several appointments
# To facilitate a later demonstration of node swapping, the June 22
# and Sept 22 appoinments are accidentally switched
###########################################################################
sub makeAppointments($)
{
my $yearNode = shift;

#### MARCH 22 AT 8AM
my $monthNode = $yearNode->getFirstChild() -> #January
getNextSibling() -> #February
getNextSibling(); #March
my $dayNode = $monthNode->getFirstChild();
while($dayNode->getName() != 22)
{
$dayNode = $dayNode->getNextSibling();
unless(defined($dayNode))
{
die "No March 22\n";
}
}
my $hourNode = $dayNode->getFirstChild();
$hourNode->setValue("Spring Cleaning");

#### JUNE 22 AT 9AM
#### WRONGLY LABELED AS FALL FESTIVAL
#### INSTEAD OF SUMMER BREAK
$monthNode = $monthNode->getNextSibling() -> # April
getNextSibling() -> # May
getNextSibling(); # June
$dayNode = $monthNode->getFirstChild();
while($dayNode->getName() != 22)
{
$dayNode = $dayNode->getNextSibling();
unless(defined($dayNode))
{
die "No June 22\n";
}
}
$hourNode = $dayNode->getFirstChild()->getNextSibling();
$hourNode->setValue("Fall Festival");

#### SEPTEMBER 22 AT 10AM
#### WRONGLY LABELED AS FALL FESTIVAL
#### INSTEAD OF SUMMER BREAK
$monthNode = $monthNode->getNextSibling() -> # July
getNextSibling() -> # August
getNextSibling(); # September
$dayNode = $monthNode->getFirstChild();
while($dayNode->getName() != 22)
{
$dayNode = $dayNode->getNextSibling();
unless(defined($dayNode))
{
die "No September 22\n";
}
}
$hourNode = $dayNode -> getFirstChild() -> #8-9
getNextSibling() -> # 9-10
getNextSibling(); # 10-11
$hourNode->setValue("Summer Break");

#### DECEMBER 22 FROM 3PM TO 5PM (2 TIMESLOTS)
#### HAPPY HOLIDAYS PARTY
$monthNode = $monthNode->getNextSibling() -> # October
getNextSibling() -> # November
getNextSibling(); # December
$dayNode = $monthNode->getFirstChild();
while($dayNode->getName() != 22)
{
$dayNode = $dayNode->getNextSibling();
unless(defined($dayNode))
{
die "No December 22\n";
}
}
$hourNode = $dayNode->getFirstChild();
while($hourNode->getName() ne "15:00-16:00")
{
$hourNode = $hourNode->getNextSibling();
unless(defined($hourNode))
{
die "No 4pm slot\n";
}
}
$hourNode->setValue("Happy Holidays Party");
$hourNode = $hourNode->getNextSibling();
$hourNode->setValue("Happy Holidays Party");

#### DECEMBER 30 AT 9AM BUY PARTY SUPPLIES
while($dayNode->getName() != 30)
{
$dayNode = $dayNode->getNextSibling();
unless(defined($dayNode))
{
die "No December 30\n";
}
}
$hourNode = $dayNode->getFirstChild()->getNextSibling();
$hourNode->setValue("Buy Party Supplies");
}

###########################################################################
# The insertMonthsAndDays() iterates through @monthNames. For each, it
# creates a month node, and then under than node iterates days from 1 to
# that month's entry in the @monthLengths array.
#
# Note that we could have avoided using a nested loop by using a Walker
# and associated callback to install the days under every month. In such
# a case the array of month lengths would have been placed in the Callback
# object. However, for the sake of variety, we chose to use a nested loop
# to load the months and days.
###########################################################################
sub insertMonthsAndDays($)
{
my $yearNode = shift;
my $checker = $yearNode;
my $checker2;
my @monthNames=("January", "February", "March", "April", "May",
"June", "July", "August", "September", "October",
"November", "December");
my @monthLengths=(31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31);
my $monthSS = 0;
foreach my $monthName (@monthNames)
{
my $node = Node->new($monthName, "Month", undef);
$node->setAttribute("days", $monthLengths[$monthSS]);
if($yearNode->hasFirstChild())
{
$checker = $checker->insertSiblingAfterYou($node);
}
else
{
$checker = $yearNode->insertFirstChild($node);
}
for(my $n=1; $n <= $monthLengths[$monthSS]; $n++)
{
$node = Node->new($n, "Day", undef);
if($checker->hasFirstChild())
{
$checker2 = $checker2->insertSiblingAfterYou($node);
}
else
{
$checker2 = $checker->insertFirstChild($node);
}
}
$monthSS++;
}
}

###########################################################################
# This subroutine switches the June 22 9am appointment and the
# September 22 10am appointment. In each case, both the appointment
# text and the time needed switching.
#
# The sane way to accomplish this task would have been to modify
# the nodes in place. However, this subroutine was created solely to
# demonstrate the movement of nodes, so that's what we did.
#
# Note that the fact that the two are at different times complicates the
# situation. It's not enough to just trade nodes -- the Sept 9am node
# must be placed after the existing June 10am node, which itself is after
# the erroneous June 9am node containing what should be September's
# appointment. After such placement, the original June 9am node must
# have its name updated so that it is a 10am node. A similar process
# takes place for September. The original nodes are also deleted.
#
# Please follow the (convoluted and contrived) logic:
# 1. Store the June hour node in $juneNode
# 2. Store the September hour node in $septNode
# 3. After the existing June 10am, place a CLONE of the Sept appointment
# 4. Before the existing Sept 9am, place a CLONE of the June appointment
# 5. Delete the original June appointment
# 6. Delete the original September appointment
# 7. On the original June 10am node, make it 9am
# 8. On the original September 9am node, make it 10am
###########################################################################
sub switchJuneAndSeptemberAppointments($)
{
my $yearNode = shift;

#### FIND NODE FOR JUNE 22 9AM APPOINTMENT
my $juneNode = $yearNode->getFirstChild();
while(defined($juneNode))
{
last if $juneNode->getName() eq "June";
$juneNode = $juneNode->getNextSibling();
}
die "Cannot find month of June\n" unless defined($juneNode);

$juneNode = $juneNode->getFirstChild();
while(defined($juneNode))
{
last if $juneNode->getName() eq "22";
$juneNode = $juneNode->getNextSibling();
}
die "Cannot find June 22\n" unless defined($juneNode);

$juneNode = $juneNode->getFirstChild();
while(defined($juneNode))
{
last if $juneNode->getName() eq "9:00-10:00";
$juneNode = $juneNode->getNextSibling();
}
die "Cannot find June 22 at 9am\n" unless defined($juneNode);

#### FIND NODE FOR SEPTEMBER 22 10AM APPOINTMENT
my $septNode = $yearNode->getFirstChild();
while(defined($septNode))
{
last if $septNode->getName() eq "September";
$septNode = $septNode->getNextSibling();
}
die "Cannot find month of September\n" unless defined($septNode);

$septNode = $septNode->getFirstChild();
while(defined($septNode))
{
last if $septNode->getName() eq "22";
$septNode = $septNode->getNextSibling();
}
die "Cannot find September 22\n" unless defined($septNode);

$septNode = $septNode->getFirstChild();
while(defined($septNode))
{
last if $septNode->getName() eq "10:00-11:00";
$septNode = $septNode->getNextSibling();
}
die "Cannot find September 22 at 9am\n" unless defined($septNode);

#### SWITCH THE NODES
my $newJune = $juneNode->getNextSibling()->insertSiblingAfterYou($septNode->clone());
my $newSept = $septNode->getPrevSibling()->insertSiblingBeforeYou($juneNode->clone());
$juneNode->deleteSelf();
$septNode->deleteSelf();

#### FIX NAMES OF SURROUNDING CLONES
$newJune->getPrevSibling()->setName("9:00-10:00");
$newSept->getNextSibling()->setName("10:00-11:00");

return;
}


###########################################################################
# In the main routine, you carry out or delegate the following tasks
# in order to create an appointment calendar:
# 1. Insert single level 0 and 1 nodes
# 2. Instantiate the Callbacks object
# 3. Insert all month and day nodes
# 4. Insert all hour nodes
# 5. Make appointments
# erroneously switching the june 22 & Sept 22 appointments
# 6. Mark days, months and years containing appointments
# 7. Output the calendar
# 8. Switch back June22 and Sept22
# 9. Re mark days, months and years
# 10. Output a separator between bad and good calendars
# 11. Re output the calendar
#
###########################################################################
sub main()
{
#### INSERT SINGLE LEVEL 0 AND 1 NODES
my $topNode=Node->new("Calender", "Calender", "Calender");
my $yearNode=$topNode->insertFirstChild(Node->new("2004", "Year", undef));

#### INSTANTIATE THE Callbacks OBJECT
my $callbacks = Callbacks->new();

#### INSERT MONTH AND DAY NODES
insertMonthsAndDays($yearNode);

#### INSERT THE HOURS USING A Walker
my $walker = Walker->new
(
$topNode,
[\&Callbacks::cbInsertHours, $callbacks]
);
$walker->walk();


#### MAKE A FEW APPOINTMENTS
#### ACCIDENTALLY SWITCHING SUMMER AND FALL
makeAppointments($yearNode);

#### MARK DAYS, MONTHS AND YEAR THAT HAVE APPOINTMENTS
#### USING A WALKER WITH ONLY A RETURN CALLBACK
$walker = Walker->new
(
$topNode,
undef,
[\&Callbacks::cbMakeMarks, $callbacks]
);
$walker->walk();

#### WALK THE NODE TREE,
#### OUTPUTTING THE CALENDER
$walker = Walker->new
(
$topNode,
[\&Callbacks::cbPrintNode, $callbacks]
);
$walker->walk();

#### CORRECT THE MISTAKE
#### SWITCH JUNE 22 AND SEPT 22
switchJuneAndSeptemberAppointments($yearNode);

#### RE-MARK DAYS, MONTHS AND YEAR THAT HAVE APPOINTMENTS
#### USING A WALKER WITH ONLY A RETURN CALLBACK
$walker = Walker->new
(
$topNode,
undef,
[\&Callbacks::cbMakeMarks, $callbacks]
);
$walker->walk();

#### OUTPUT A SEPARATOR BETWEEN ORIGINAL AND CORRECTED CALENDARS
for (my $n=0; $n<5; $n++)
{
print "######################################################\n";
}

#### RE-WALK THE NODE TREE,
#### RE-OUTPUTTING THE CALENDER
$walker = Walker->new
(
$topNode, # start with this node
[\&Callbacks::cbPrintNode, $callbacks] # do this on entry to each node
);
$walker->walk();
}

main();


View the output of this program like this:
./run ./test.pl | less

Key Points

This exercise introduces several key points.

Child Insertion Riff

Look at the cbInsertHours() method and notice how hour nodes are inserted:

for(my $n=8; $n <= 16; $n++)
{
# ...
my $node = Node->new($range, "Hour", undef);

if($checker->hasFirstChild())
{
$checker2 = $checker2->insertSiblingAfterYou($node);
}
else
{
$checker2 = $checker->insertFirstChild($node);
}
}


In the preceding, $checker is the node under consideration, with $checker2 being the node repeatedly added as a child of $checker. The first addition is performed via insertFirstChild(), returning itself as $checker2. Thereafter, insert more children of $checker by performing $checker2->insertSiblingAfterYou(). This riff is ubitquitous in the Node.pm world.

Child Search Riff

Look at the switchJuneAndSeptemberAppointments() method and notice how the children of the original $juneNode are searched:

$juneNode = $juneNode->getFirstChild();
while(defined($juneNode))
{
last if $juneNode->getName() eq "22";
$juneNode = $juneNode->getNextSibling();
}
die "Cannot find June 22\n" unless defined($juneNode);


In the preceding, we perform the equivalent of a priming read on the parent node, and then loop through the children using getNextSibling(). If the search is unsuccessful, $juneNode is undefined; a condition that can be caught at the bottom. This riff is ubitquitous in the Node.pm world.

Returns at Top of Callbacks

My circa 1983 professors