Troubleshooters.Com and
Steve Litt's Free Software Projects Present

Node.pm
An object library to parse outlines and handle hierarchies

NO WARRANTY!
There is no warranty for anything contained in the Node.pm distribution or documentation or its web pages, to the extent permitted by applicable law. Except when otherwise stated in writing the copyright holders and/or other parties provide the program, documentation and web pages "as is" without warranty of any kind, either expressed or implied, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The entire risk as to the quality and performance of the program is with you. Should the program, documentation or web pages prove defective, you assume the cost of all necessary servicing, repair or correction.

Node.pm is a Perl object library designed to quickly and easily parse outline files into node hierarchies, and to manipulate those node hierarchies. It is similar to DOM but lighter weight, and it can handle multiple top level items in parsed files.

Copyright and license
Project charter (why Node.pm exists, design manifesto, who is the audience. A must for contributors and maintainers).
Project specifications (a must for contributors and maintainers).
Download links for the project sources.
Maintainers list
Needed Programming and Documentation Tasks
How to Participate
Instructions on how to join the project mailing list
FAQ (Frequently Asked Questions) list.
HTMLized versions of the project documentation
Links to related projects.
Dedication: We Stand On Their Shoulders
Progress

Copyright and License

This software is copyright (C) 2003 by Steve Litt, all rights reserved. I have licensed it under the Litt Perl Development Tool License (LPDTL). The LPDTL is the GNU GPL with an exception and an exception to that exception. The intent of the LPDTL is to provide you with a tool that is copylefted free software, but does not require programs you write by including that tool to be either copylefted, free software, or GPL compatible. I have tried to craft this license so that badguys cannot proprietarize modifications of the tool itself.

The LPDTL contains provisions for the user to issue modifications of the code as pure GPL, in case you're uncomfortable with the LPDTL. Keep in mind, however, that if you do that, nobody (including yourself) will be able to use that modified tool to create software that is either proprietary or GPL incompatible.

Project Charter

The purpose of Node.pm is the quick and easy parsing of outline files, and quick and easy manipulation of hierarchies. This might not sound especially impressive, but consider the uses:

configuration file parsing
Menu creation and service
Conversion between markup languages
XML creation
HTML structuring
Creation of book structure
outline processing
Hierarchy drilldown
Hierarchy flattening

Node.pm is created as an object oriented tool. At the lowest level are Node objects, which have a name, type, value, and a list of named attributes, as well as containing pointers to parent, previous sibling, next sibling, first child and last child. Hierarchies are built from trees of these Node objects.

The design priorities of Node.pm are:

Quick, simple and easy development of hierarchy/outline manipulation algorithms
Easy learning curve relative to the tool's power
Ability to implement very complex hierarchical requirements in an easy to maintain manner

To simplify working with such hierarchies, the Walker object walks the tree recursively, handling all a Node's descendants before going on to its younger siblings. In the preceding sentence the word "recursively" is used loosely, because although the order of Node visitation mimics what you would see in a recursive algorithm, in fact the algorithm is a simple loop, conserving memory. Having the Walker "walk the Node tree" is a nice intellectual exercise, but it accomplishes nothing unless it takes action. The action taken is determined by two callback routines passed to the Walker's new() function.

The purpose of the EMDL language is to facilitate fast and easy manipulation of large menu systems using a standard text editor capable of automatic indentation (the ability to indent to the same level as the preceding line, plus ways to cut, paste, indent and exdent multiple lines). The Vim editor is one such editor. VimOutliner is ideally suited for EMDL authoring because of its ability to collapse and expand trees.

Priorities of the EMDL initiative are:

Quick and easy creating and editing of large menu systems

Cut and paste
Promotion and demotion
Assignment of commands, directories, prepaths, and other parameters to choices

Authoring can be done with a text editor
Readability
Use of a text outliner to define menu hierarchies

And the following are not design priorities at all:

Ease of parsing
Run time efficiency
Gui environments
Inclusion of every last possible feature of every possible menu system

Priorities #1, 3 and 4 preclude anything with end tags, and suggest that the language should be line based. So it is. #3 suggests use of an outline to simulate the hierarchy of a menu system, with its tree of menus and each menu's selection of choices, many of which have multiple parameters.

EMDL is so productive you can create a substantial personal custom "start menu" in an hour or so. It's featureful enough to create fairly complex menus, although it lacks a generic facility for prompted argument substitution. However, if the destination menu supports prompted argument substitution in its commands, then the prompted argument substitution tags can be placed in an EMDL command parameter. This is why EMDL supports UMENU prompted argument substitution. It's anticipated that version 2 of the EMDL language specification will support prompted argument substitution natively.

The first program to use EMDL is emdl2umenu, which performs the function suggested by its name. It's a GNU GPL licensed program that runs on any Linux computer with a standard Perl 5 installation, and will likely run on any UNIX, UNIX workalike, or BSD computer. It hasn't been tested with Windows, but making it run on Windows would likely require only the most trivial changes.

emdl2umenu consists of three objects -- the Parser, the Menutree, and the Writer. Making it work with another menu system would involve only changes to the Writer object. Creating a UMENU to EMDL converter, thus making this design tool round trip, would involve creating a Writer object to write EMDL and a Parser object to parse a tree of UMENU menu definition files.

The intended audience of EMDL is those who possess all of the following traits:

Have, or can create, a tool to convert EMDL to their menuing system of choice
Willing to edit and create menus

Maintainer's Guide

All current and future maintainers of Node.pm should be very cognizant of the project's priorities. Node.pm must be easy for the human maintaining or manipulating hierarchies.

Project Specifications

Node.pm implements three types of objects:

Node: Data for a single entity in a hierarchy
OutlineParser: An object to convert a tab indented outline into a hierarchy of Node objects
Walker: An object that "walks" a hierarchy of Node objects, taking action on each Node as specified by its callback functions.

It should be noted that Node.pm was inspired by the Apache Software Foundation's DOM (Document Object Model) software. It differs from DOM in that it is less inclusive, and:

Much easier to add to Perl than DOM
Much smaller footprint than DOM
Includes a Walker object to simplify tree manipulation
Ability to parse input with multiple top level entities
Parses tab indented outlines instead of XML
No facilities for DTD's or Schemas

Node Object

The Node object represents one entity in a hierarchy or tree. It contains data, navigational pointers and methods:

Data

Name
Type
Value
Zero, one or more attributes in key/value pairs

Navigational Pointers

Parent
Previous sibling
Next sibling
First child
Last child

Methods

Data

hasName()
getName()
setName()
hasType()
getType()
setType()
hasValue()
getValue()
setValue()
Attribute methods:

hasAttribute()
getAttribute()
setAttribute()
removeAttribute()
hasAttributes()
getAttributes()
setAttributes()

Navigation

hasParent()
getParent()
setParent()
hasNextSibling()
getNextSibling()
setNextSibling()
hasPrevSibling()
getPrevSibling()
setPrevSibling()
hasFirstChild()
getFirstChild()
setFirstChild()
hasLastChild()
getLastChild()
setLastChild()

Creation

new()
clone()

Insertion

insertSiblingBeforeYou()
insertSiblingAfterYou()
insertFirstChild()
insertLastChild()

Deletion

deleteSelf() [note that this deletes the node's whole subtree]
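
To make the pointer bookkeeping concrete, here is a minimal sketch of how a node carrying these five pointers might keep them consistent during insertion and deletion. The package name SketchNode and its hash layout are assumptions made purely for illustration; this is not Node.pm's source, and only the method names insertLastChild() and deleteSelf() are borrowed from the lists above.

```perl
use strict;
use warnings;

# SketchNode is a hypothetical stand-in, NOT the real Node.pm Node class.
package SketchNode;

sub new {
    my ($class, $name, $type, $value) = @_;
    my $self = {
        name => $name, type => $type, value => $value,
        attributes  => {},
        parent      => undef,
        prevSibling => undef, nextSibling => undef,
        firstChild  => undef, lastChild   => undef,
    };
    return bless $self, $class;
}

# Append $child as the youngest child, fixing up every affected pointer.
sub insertLastChild {
    my ($self, $child) = @_;
    $child->{parent}      = $self;
    $child->{nextSibling} = undef;
    $child->{prevSibling} = $self->{lastChild};
    if (defined $self->{lastChild}) {
        $self->{lastChild}{nextSibling} = $child;
    } else {
        $self->{firstChild} = $child;    # first child of an empty parent
    }
    $self->{lastChild} = $child;
    return $child;
}

# Unlink this node from its parent and siblings; its subtree goes with it.
sub deleteSelf {
    my ($self) = @_;
    my ($prev, $next, $parent) = @{$self}{qw(prevSibling nextSibling parent)};
    if    (defined $prev)   { $prev->{nextSibling}  = $next }
    elsif (defined $parent) { $parent->{firstChild} = $next }
    if    (defined $next)   { $next->{prevSibling}  = $prev }
    elsif (defined $parent) { $parent->{lastChild}  = $prev }
    $self->{parent} = $self->{prevSibling} = $self->{nextSibling} = undef;
    return;
}

package main;

my $top = SketchNode->new(undef, 'Node', 'top');
my $n1  = $top->insertLastChild(SketchNode->new(undef, 'Node', 'a'));
my $n2  = $top->insertLastChild(SketchNode->new(undef, 'Node', 'b'));
my $n3  = $top->insertLastChild(SketchNode->new(undef, 'Node', 'c'));
$n2->deleteSelf();    # 'a' and 'c' are now adjacent siblings
```

Because the deleted node keeps its own firstChild and lastChild pointers, the whole subtree leaves the hierarchy in one step, which matches the note on deleteSelf() above.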

OutlineParser Object

The OutlineParser object is an object whose task is to parse a tab indented outline and place its information into a tree of Node objects. Because outlines frequently have multiple top level entries, but a true tree can have only one, the OutlineParser object creates a new Node object that becomes the top level, with the outline's top level entries becoming children of that Node object created by the OutlineParser.

Each line of the outline is converted to a Node object with the following data:

Name is undefined
Type is "Node"
Value is the text of the line itself
Attribute _lineno is the line number of the source outline file

This is vital for error messages

The outline file's hierarchy is mirrored in the Node tree. If line P has a child C in the outline, then node P has a child node C in the Node hierarchy.
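
The mapping from indentation to hierarchy can be sketched with a stack that remembers the most recent node at each depth. The following is a minimal illustration using plain hashes rather than Node objects; the function name parse_outline and the hash keys are assumptions, not part of Node.pm, and it performs only the simplest indentation check.

```perl
use strict;
use warnings;

# Hypothetical helper: build a tree of plain hashes from tab-indented lines.
# This is NOT Node.pm's OutlineParser, only a sketch of the same idea.
sub parse_outline {
    my (@lines) = @_;
    my $top = { value => undef, lineno => 0, children => [] };  # synthetic top node
    my @stack  = ($top);        # $stack[$n] is the current ancestor at depth $n
    my $lineno = 0;
    for my $line (@lines) {
        $lineno++;
        next if $line =~ /^\s*$/;                  # skip blank lines
        my ($tabs, $text) = $line =~ /^(\t*)(.*)$/;
        my $level = length($tabs) + 1;             # top level entries sit at depth 1
        die "Line $lineno is indented too deeply\n" if $level > @stack;
        my $node = { value => $text, lineno => $lineno, children => [] };
        push @{ $stack[$level - 1]{children} }, $node;
        $#stack = $level - 1;                      # forget deeper ancestors
        push @stack, $node;                        # this node is now the ancestor at $level
    }
    return $top;
}

my $tree = parse_outline("Fruit", "\tApple", "\tPear", "Vegetable", "\tCarrot");
print scalar(@{ $tree->{children} }), " top level entries\n";
```

The synthetic top node plays the role described above: "Fruit" and "Vegetable" both become its children, so the result is a single tree even though the outline has two top level entries.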

OutlineParser Object Methods
Properties of the Parse
setCommentChar()	Single character signifying a line is a comment. This character must be the first nonblank character on the line in order to render the line a comment. Comment lines are not converted to nodes, nor are they checked for correct indentation or other syntax. Default is undef.
hasCommentChar()	Returns true if the OutlineParser object has a comment character.
getCommentChar()	Returns the OutlineParser's comment character.
fromStdin()	Tells the OutlineParser that it will receive its input from stdin. This is the opposite of, and countermands, fromFile(). By default OutlineParser receives its input from stdin.
fromFile()	Tells the OutlineParser that it will receive its input from a file. The actual filename is passed as an argument to the parse() method. fromFile() is the opposite of, and countermands, fromStdin(). By default OutlineParser receives its input from stdin.
zapBlanks()	Sets the OutlineParser to ignore blank lines, meaning lines with no characters and also lines with only whitespace. zapBlanks() is the opposite of, and countermands, dontZapBlanks(). By default OutlineParser ignores blank lines.
dontZapBlanks()	Sets the OutlineParser to NOT ignore blank lines, meaning all-whitespace lines and lines with no characters are inserted as Node objects. dontZapBlanks() is the opposite of, and countermands, zapBlanks(). By default OutlineParser ignores blank lines.
Action Methods
new()	Instantiates a new OutlineParser object, and passes it back as a return.
getFirstNonBlankChar()	Returns the first non-whitespace character on a line. This is used primarily internally, and the return for blank lines is not well defined. It is suggested that application programmers refrain from using this method.
parse()	Perform the parse. If parsing a file, pass the filename in via an argument. All other parse properties have been previously defined. The top level node (the one created by OutlineParser) is passed back as the function return. From that top level node, the application programmer can navigate or walk anywhere in the Node tree.
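
The comment-character and blank-line rules in the table above can be summarized as one predicate. This is a sketch under stated assumptions: the function line_becomes_node and its three arguments are hypothetical names invented for this illustration, and nothing here is taken from the OutlineParser source.

```perl
use strict;
use warnings;

# Hypothetical predicate, not part of Node.pm: decide whether an input line
# would become a node under the comment-character and blank-line rules.
sub line_becomes_node {
    my ($line, $comment_char, $zap_blanks) = @_;
    if ($line =~ /^\s*$/) {               # empty or whitespace-only line
        return $zap_blanks ? 0 : 1;
    }
    if (defined $comment_char) {
        my ($first) = $line =~ /(\S)/;    # first nonblank character
        return 0 if $first eq $comment_char;
    }
    return 1;
}

# A few sample lines, checked with '#' as comment character and blanks zapped.
for my $line ("\tApple", "\t# a note", "   ") {
    my $verdict = line_becomes_node($line, '#', 1) ? 'node' : 'skipped';
    print "[$line] -> $verdict\n";
}
```

With no comment character set (undef, the documented default), a line starting with "#" is treated like any other line and becomes a node.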

Walker Object

The Walker object "walks" an entire Node hierarchy: the Node passed as the first argument to new(), together with all its descendants. This walk always goes deep before going broad. In other words, all children are processed before going to the next sibling.

The Walker object visits every Node in the hierarchy, which in itself does nothing. The Walker performs actions on these Node objects via two callback routines, the entry callback and the return callback. The entry callback is called immediately upon the Walker's first visiting the Node. The return callback is called upon the return to a Node after all its descendants have been processed. Leaf level nodes never trigger the return callback because they never return from descendants.
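
The visiting order and the timing of the two callbacks can be illustrated with a loop that follows first child, next sibling and parent pointers, with no recursion and no explicit stack. This is a minimal sketch on plain hashes; the helper names make_node, add_child and walk are assumptions for this example and are not the Walker's actual code.

```perl
use strict;
use warnings;

# Hypothetical helpers on plain hashes; NOT the real Walker or Node classes.
sub make_node {
    my ($value) = @_;
    return { value => $value, parent => undef, firstChild => undef,
             lastChild => undef, nextSibling => undef };
}

sub add_child {
    my ($parent, $child) = @_;
    $child->{parent} = $parent;
    if ($parent->{lastChild}) { $parent->{lastChild}{nextSibling} = $child }
    else                      { $parent->{firstChild} = $child }
    $parent->{lastChild} = $child;
    return $child;
}

# Depth-first walk written as a simple loop.
sub walk {
    my ($top, $on_entry, $on_return) = @_;
    my ($node, $level, $ascending) = ($top, 0, 0);
    while (defined $node) {
        if (!$ascending) {
            $on_entry->($node, $level) if $on_entry;
            if ($node->{firstChild}) {               # go deep first
                $node = $node->{firstChild};
                $level++;
                next;
            }
        } else {
            $on_return->($node, $level) if $on_return;   # back from descendants
        }
        last if $node == $top;                       # never leave the starting subtree
        if ($node->{nextSibling}) {                  # then go broad
            $node = $node->{nextSibling};
            $ascending = 0;
        } else {                                     # no more siblings: climb
            $node = $node->{parent};
            $level--;
            $ascending = 1;
        }
    }
    return;
}

my $top   = make_node('top');
my $first = add_child($top, make_node('a'));
add_child($first, make_node('a1'));
add_child($top, make_node('b'));

my @events;
walk($top,
     sub { push @events, "enter $_[0]{value}" },
     sub { push @events, "return $_[0]{value}" });
print "$_\n" for @events;
```

For this four-node tree the events arrive in the order enter top, enter a, enter a1, return a, enter b, return top. The leaves a1 and b never trigger the return callback.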

The entry and return callbacks are arg2 and arg3, respectively, of the new() function. Callbacks MUST be methods of a perl object. They CANNOT be free standing subroutines. The reason for this design is so the callbacks can keep persistent information, and so they can trade information with the "outside world" without resorting to global variables.
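
The benefit of method callbacks is easiest to see with an object that accumulates state across calls. The sketch below is illustrative only: the class Counter, its method cbCount() and the dispatcher fire() are hypothetical names, and the way fire() invokes the pair is one plausible mechanism rather than a description of the Walker's internals. The argument order (object, node, level) follows the description in this document.

```perl
use strict;
use warnings;

# Hypothetical callback class; the name Counter and its methods are illustrative.
package Counter;

sub new {
    my ($class) = @_;
    return bless { count => 0, deepest => 0 }, $class;
}

# A callback written as a method: $self carries state between calls,
# so no global variables are needed.
sub cbCount {
    my ($self, $node, $level) = @_;
    $self->{count}++;
    $self->{deepest} = $level if $level > $self->{deepest};
    return;
}

package main;

# One way a [code reference, object] pair can be dispatched:
# call the code reference with the object as its first argument.
sub fire {
    my ($callback, $node, $level) = @_;
    my ($code, $object) = @{$callback};
    return $code->($object, $node, $level);
}

my $counter  = Counter->new();
my $callback = [\&Counter::cbCount, $counter];

# Stand-in for a walk: three fake visits at levels 0, 1 and 2.
fire($callback, "node$_", $_) for 0 .. 2;
print "Visited $counter->{count} nodes, deepest level $counter->{deepest}\n";
```

After the loop the caller reads the totals straight from $counter, which is how a callback object can trade information with the outside world.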

The following is a simple example of the use of a Walker object, assuming that object $callbacks is an instance of class Callbacks, which contains method cbPrintNode(), whose three arguments are the object itself ($self), the Node object that called the callback, and the level of that node:

my $walker = Walker->new
	(
	$topNode,				# start with this node
	[\&Callbacks::cbPrintNode, $callbacks]	# do this on entry to each node
	);
$walker->walk();

You'll notice the preceding instantiates the Walker object with only two arguments. The return callback is not passed. In fact, most Walker objects use only an entry callback. If a return callback is required, it becomes the third argument. In the rare case where a return callback is required but not an entry callback, the argument for the entry callback is passed as undef.

Downloads

Node.0.2.0.tgz (tiny -- approximately 25.6K)
README.otl
INSTALL

Maintainers List

Steve Litt

Steve Litt's email address

Needed Programming and Documentation Tasks

I've been using Node.pm in its current form for several months, and it appears complete and stable. I've written several substantial apps using Node.pm, including the newest (unreleased but being heavily tested) versions of both EMDL and UMENU.

The one area where Node.pm might be a little light is in reverse direction navigation. It hasn't been tested well in reverse navigation, and there is no provision for a reverse Walker. A reverse walker could be created as a different object (ReverseWalker), or by adding a direction property to the existing Walker and placing code in the walk() method to accommodate different directions.

More esoterically, it might be nice some day to have an XMLParser object analogous to the OutlineParser object.

Where I could really use help is documentation!

How to Participate

Email Steve Litt if you'd like to participate. I'll work with you as much or as little as you want.

Mailing List

There's no mailing list yet. For now, communicate directly with Steve Litt. Once there are several participants, I'll make a mailing list.

FAQ (Frequently Asked Questions) list

None exists. The project is too new to really know what to put in it.

HTMLized versions of the project documentation

None. You might want to look at the README.otl and INSTALL files. See also http://www.troubleshooters.com/tpromag/199911/199911.htm, which discusses outlining in general, and http://www.troubleshooters.com/linux/olvim.htm, which discusses how to use Vim 6 as an outliner.

Links to related projects.

http://www.troubleshooters.com/projects/vimoutliner/: VimOutliner is the best editor I've found for authoring EMDL
http://www.vim.org/: The Vim editor project. Vim supplies all the outlining power, and VimOutliner simply supplies the configuration and some scripts.
http://www.troubleshooters.com/umenu/: The UMENU project. The first EMDL converter converts to UMENU.

Dedication: We Stand On Their Shoulders

Richard Stallman and the Free Software Foundation: Without them I shudder to think what the software world would be like today.
Linus Torvalds and the various Linux projects: Without Linux, I wouldn't need VimOutliner -- I'd just need a lot more money to purchase proprietary software and a lot more patience to deal with Blue Screens of Death and resulting data loss.
Greater Orlando Linux User Group (GOLUG): The peer to peer brain network of which I'm a small part.
The VimOutliner project, whose members gave me ideas and encouragement in developing Node.pm.
The Apache Software Foundation, whose DOM spec inspired me to create Node.pm.

Progress

On 5/14/2004 I released this software, after testing it in its current form for some six months.
