Node.pm
An object library to parse outlines and handle hierarchies
Copyright (C) 2004 by Steve Litt
NO WARRANTY!
There is no warranty for anything contained in the
Node.pm distribution or documentation or its web pages, to the extent
permitted by applicable law. Except when otherwise stated in
writing
the copyright holders and/or other parties provide the program,
documentation
and web pages "as is" without warranty of any kind, either expressed or
implied, including, but not limited to, the implied warranties of
merchantability
and fitness for a particular purpose. The entire risk as to the
quality
and performance of the program is with you. Should the program,
documentation
or web pages prove defective, you assume the cost of all necessary
servicing,
repair or correction.
Node.pm is a Perl object library
designed
to quickly and easily parse outline files into node hierarchies, and
to manipulate those node hierarchies. It is similar to DOM but lighter
weight, and it can handle multiple top level items in parsed files.
Copyright and License
This software is copyright (C) 2003 by Steve Litt, all rights reserved.
I have licensed it under the Litt Perl Development Tool License
(LPDTL). The LPDTL is the GNU GPL with an exception and an exception to
that exception. The intent of the LPDTL is to provide you with a tool
that is copylefted free software, but does not require programs you
write by including that tool to be either copylefted, free software, or
GPL compatible. I have tried to craft this license so that badguys
cannot proprietarize modifications of the tool itself.
The LPDTL contains provisions for the user to issue modifications of
the code as pure GPL, in case you're uncomfortable with the LPDTL. Keep
in mind, however, that if you do that, nobody (including yourself) will
be able to use that modified tool to create software that is either
proprietary or GPL incompatible.
Project Charter
The purpose of Node.pm is the quick and easy parsing of outline files,
and quick and easy manipulation of hierarchies. This might not sound
especially impressive, but consider the uses:
- configuration file parsing
- Menu creation and service
- Conversion between markup languages
- XML creation
- HTML structuring
- Creation of book structure
- outline processing
- Hierarchy drilldown
- Hierarchy flattening
Node.pm is created as an object oriented tool. At the lowest level are
Node objects, which have a name, type, value, and a list of named
attributes, as well as containing pointers to parent, previous sibling,
next sibling, first child and last child. Hierarchies are built from
trees of these Node objects.
The design priorities of Node.pm are:
- Quick, simple and easy development of hierarchy/outline
manipulation algorithms
- Easy learning curve relative to the tool's power
- Ability to implement very complex hierarchical requirements in an
easy to maintain manner
To simplify working with such hierarchies, the Walker object walks the
tree recursively, handling all of a Node's descendants before going on to
its younger siblings. In the preceding sentence the word "recursively"
is used loosely: although the order of Node visitation mimics
what you would see in a recursive algorithm, the algorithm is in fact a
simple loop, which conserves memory.
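The loop-based, depth-first visiting order described above can be sketched as follows. This is only an illustration, not Node.pm's own code: the plain-hash nodes and the helper names make_node(), add_child() and walk_in_order() are hypothetical stand-ins that assume each node carries the parent, sibling and child pointers mentioned earlier.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a Node: a plain hash holding the five pointers.
# This is NOT Node.pm's internal layout, just enough to show the loop.
sub make_node {
    my ($value) = @_;
    return { value => $value, parent => undef, prev => undef,
             next  => undef,  first  => undef, last => undef };
}

# Append $child as the last child of $parent, fixing up the pointers.
sub add_child {
    my ($parent, $child) = @_;
    $child->{parent} = $parent;
    if (my $last = $parent->{last}) {
        $last->{next}  = $child;
        $child->{prev} = $last;
    } else {
        $parent->{first} = $child;
    }
    $parent->{last} = $child;
    return $child;
}

# Visit $top and all its descendants depth first, using a loop, no recursion.
sub walk_in_order {
    my ($top) = @_;
    my @visited;
    my $node = $top;
    while (defined $node) {
        push @visited, $node->{value};
        if (defined $node->{first}) {       # go deep before going broad
            $node = $node->{first};
            next;
        }
        # No children: climb until a next sibling exists or we reach $top.
        while ($node != $top && !defined $node->{next}) {
            $node = $node->{parent};
        }
        last if $node == $top;
        $node = $node->{next};
    }
    return @visited;
}

my $top   = make_node('top');
my $kid_a = add_child($top, make_node('a'));
add_child($kid_a, make_node('b'));
add_child($kid_a, make_node('c'));
add_child($top, make_node('d'));
print join(' ', walk_in_order($top)), "\n";
```

Because the loop only ever holds one current node and follows pointers, it needs no call stack that grows with the depth of the tree.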
Having the Walker "walk the Node tree" is a nice intellectual exercise,
but it accomplishes nothing unless it takes action. The action taken
is determined by two callback routines passed to the Walker's new()
method: an entry callback and a return callback.
The purpose of the EMDL language is to facilitate fast and easy manipulation
of large menu systems using a standard text editor capable of automatic
indentation (the ability to indent to the same level as the preceding
line, plus ways to cut, paste, indent and exdent multiple lines). The Vim
editor is one such editor. VimOutliner is ideally suited for EMDL authoring
because of its ability to collapse and expand trees.
Priorities of the EMDL initiative are:
- Quick and easy creating and editing of large menu systems
- Cut and paste
- Promotion and demotion
- Assignment of commands, directories, prepaths, and other parameters to
choices
- Authoring can be done with a text editor
- Readability
- Use of a text outliner to define menu hierarchies
And the following are not design priorities at all:
- Ease of parsing
- Run time efficiency
- Gui environments
- Inclusion of every last possible feature of every possible menu system
Priorities #1, 3 and 4 preclude anything with end tags, and suggest
that
the language should be line based. So it is. #3 suggests use of an
outline
to simulate the hierarchy of a menu system, with its tree of menus and
each menu's selection of choices, many of which have multiple
parameters.
EMDL is so productive you can create a substantial personal custom
"start
menu" in an hour or so. It's featureful enough to create fairly complex
menus, although it lacks a generic facility for prompted argument
substitution.
However, if the destination menu supports prompted argument
substitution
in its commands, then the prompted argument substitution tags can be
placed
in an EMDL command parameter. This is why EMDL supports UMENU prompted
argument substitution. It's anticipated that version 2 of the EMDL
language
specification will support prompted argument substitution natively.
The first program to use EMDL is emdl2umenu, which performs the
function
suggested by its name. It's a GNU GPL licensed program that runs on any
Linux computer with a standard Perl 5 installation, and will likely run
on any UNIX, UNIX workalike, or BSD computer. It hasn't been tested
with
Windows, but making it run on Windows would likely require only the
most
trivial changes.
emdl2umenu consists of three objects -- the Parser, the Menutree,
and the Writer. Making it work with another menu system would involve
only
changes to the Writer object. Creating a UMENU to EMDL converter, thus
making this design tool round trip, would involve creating a Writer
object
to write EMDL and a Parser object to parse a tree of UMENU menu
definition
files.
The intended audience of EMDL is those who possess all of the following
traits:
- Have, or can create, a tool to convert EMDL to their menuing system of
choice
- Willing to edit and create menus
- Willing to use an editor instead of a "drag and drop environment"
Maintainer's Guide
All current and future maintainers of Node.pm should
be very cognizant of the project's priorities. Node.pm must be easy for
the
human maintaining or manipulating hierarchies.
Project Specifications
Node.pm implements three types of objects:
- Node: Data for a single entity in a hierarchy
- OutlineParser: An object to convert a tab indented outline into a
hierarchy of Node objects
- Walker: An object that "walks" a hierarchy of Node objects,
taking action on each Node as specified by its callback functions.
It should be noted that Node.pm was inspired by the Apache Software
Foundation's DOM (Document Object Model) software. It differs from DOM
in that it is less inclusive, and:
- Much easier to add to Perl than DOM
- Much smaller footprint than DOM
- Includes a Walker object to simplify tree manipulation
- Ability to parse input with multiple top level entities
- Parses tab indented outlines instead of XML
- No facilities for DTDs or Schemas
Node Object
The Node object represents one entity in a hierarchy or tree. It
contains data, navigational pointers, and methods:
- Data
  - Name
  - Type
  - Value
  - Zero, one or more attributes in key/value pairs
- Navigational Pointers
  - Parent
  - Previous sibling
  - Next sibling
  - First child
  - Last child
- Methods
  - Data
    - hasName()
    - getName()
    - setName()
    - hasType()
    - getType()
    - setType()
    - hasValue()
    - getValue()
    - setValue()
  - Attribute methods:
    - hasAttribute()
    - getAttribute()
    - setAttribute()
    - removeAttribute()
    - hasAttributes()
    - getAttributes()
    - setAttributes()
  - Navigation
    - hasParent()
    - getParent()
    - setParent()
    - hasNextSibling()
    - getNextSibling()
    - setNextSibling()
    - hasPrevSibling()
    - getPrevSibling()
    - setPrevSibling()
    - hasFirstChild()
    - getFirstChild()
    - setFirstChild()
    - hasLastChild()
    - getLastChild()
    - setLastChild()
  - Creation
  - Insertion
    - insertSiblingBeforeYou()
    - insertSiblingAfterYou()
    - insertFirstChild()
    - insertLastChild()
  - Deletion
    - deleteSelf() [note that this deletes the node's whole subtree]
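To make the pointer bookkeeping behind the insertion and deletion methods concrete, here is a minimal sketch. It borrows a few method names from the list above, but the class name SketchNode, the constructor arguments, the return values and the internal hash keys are all assumptions made for illustration; the real Node class may differ in every one of those details.

```perl
use strict;
use warnings;

package SketchNode;   # hypothetical name, chosen to avoid clashing with Node

sub new {
    my ($class, $name, $type, $value) = @_;
    return bless { name => $name, type => $type, value => $value,
                   attributes => {} }, $class;
}

sub getValue       { $_[0]{value} }
sub getParent      { $_[0]{parent} }
sub getNextSibling { $_[0]{nextSibling} }
sub getFirstChild  { $_[0]{firstChild} }
sub getLastChild   { $_[0]{lastChild} }

# Link $child in as the last child of this node.
sub insertLastChild {
    my ($self, $child) = @_;
    $child->{parent}      = $self;
    $child->{prevSibling} = $self->{lastChild};
    $child->{nextSibling} = undef;
    if ($self->{lastChild}) { $self->{lastChild}{nextSibling} = $child }
    else                    { $self->{firstChild} = $child }
    $self->{lastChild} = $child;
    return $child;
}

# Link $sib in immediately after this node, under the same parent.
sub insertSiblingAfterYou {
    my ($self, $sib) = @_;
    my $next = $self->{nextSibling};
    $sib->{parent}       = $self->{parent};
    $sib->{prevSibling}  = $self;
    $sib->{nextSibling}  = $next;
    $self->{nextSibling} = $sib;
    if    ($next)           { $next->{prevSibling} = $sib }
    elsif ($self->{parent}) { $self->{parent}{lastChild} = $sib }
    return $sib;
}

# Unlink this node from the tree; its whole subtree goes with it.
sub deleteSelf {
    my ($self) = @_;
    my ($parent, $prev, $next) = @{$self}{qw(parent prevSibling nextSibling)};
    if    ($prev)   { $prev->{nextSibling} = $next }
    elsif ($parent) { $parent->{firstChild} = $next }
    if    ($next)   { $next->{prevSibling} = $prev }
    elsif ($parent) { $parent->{lastChild} = $prev }
    @{$self}{qw(parent prevSibling nextSibling)} = (undef, undef, undef);
    return;
}

package main;

# Collect the values of a node's children by following sibling pointers.
sub child_values {
    my ($node) = @_;
    my @vals;
    for (my $c = $node->getFirstChild; $c; $c = $c->getNextSibling) {
        push @vals, $c->getValue;
    }
    return @vals;
}

my $top   = SketchNode->new(undef, 'Node', 'top');
my $one   = $top->insertLastChild(SketchNode->new(undef, 'Node', 'one'));
my $three = $top->insertLastChild(SketchNode->new(undef, 'Node', 'three'));
my $two   = $one->insertSiblingAfterYou(SketchNode->new(undef, 'Node', 'two'));
print join(' ', child_values($top)), "\n";   # children after the insertions
$two->deleteSelf();
print join(' ', child_values($top)), "\n";   # children after the deletion
```

Each operation touches only the neighbouring pointers, so insertion and deletion cost the same regardless of how large the tree is.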
OutlineParser Object
The OutlineParser object is an object whose task is to parse a tab
indented outline and place its information into a tree of Node objects.
Because outlines frequently have multiple top level entries, but a true
tree can have only one, the OutlineParser object creates a new Node
object that becomes the top level, with the outline's top level entries
becoming children of that Node object created by the OutlineParser.
Each line of the outline is converted to a Node object with the
following data:
- Name is undefined
- Type is "Node"
- Value is the text of the line itself
- Attribute _lineno is the line number of the source outline file
- This is vital for error messages
The outline file's hierarchy is mirrored in the Node tree. If line P
has a child C in the outline, then node P has a child node C in the
Node hierarchy.
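The mapping from a tab indented outline to a tree can be sketched with a stack of "most recent node at each depth". This is not OutlineParser's actual code: the function name parse_outline(), the plain-hash nodes with a children array, and the level key are assumptions made so the example is self-contained. The synthetic top node, the "Node" type, the undefined name and the _lineno attribute follow the description above.

```perl
use strict;
use warnings;

# Parse tab-indented outline lines into a tree of plain hashes.
# Hypothetical stand-in for OutlineParser, for illustration only.
sub parse_outline {
    my (@lines) = @_;
    # Synthetic top node: the outline's top level entries become its children.
    my $top    = { value => undef, level => -1, children => [] };
    my @stack  = ($top);     # $stack[-1] is the most recently added node
    my $lineno = 0;
    for my $line (@lines) {
        $lineno++;
        next if $line =~ /^\s*$/;                      # skip blank lines
        my ($tabs, $text) = $line =~ /^(\t*)(.*)$/;
        my $level = length $tabs;                      # one tab per level
        die "Line $lineno: indented more than one level below its parent\n"
            if $level > $stack[-1]{level} + 1;
        pop @stack while $stack[-1]{level} >= $level;  # climb to the parent
        my $node = {
            name       => undef,
            type       => 'Node',
            value      => $text,
            level      => $level,
            attributes => { _lineno => $lineno },
            children   => [],
        };
        push @{ $stack[-1]{children} }, $node;
        push @stack, $node;
    }
    return $top;
}

my $tree = parse_outline("Fruit", "\tApple", "", "\tPear", "Vegetables");
for my $entry (@{ $tree->{children} }) {
    printf "%s (line %d, %d children)\n", $entry->{value},
        $entry->{attributes}{_lineno}, scalar @{ $entry->{children} };
}
```

Note that the line counter advances even for skipped blank lines, so _lineno still points at the right place in the source file when an error message needs it.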
OutlineParser Object Methods
Properties of the Parse
- setCommentChar(): Sets the single character signifying that a line is
a comment. This character must be the first nonblank character on the
line in order to render the line a comment. Comment lines are not
converted to nodes, nor are they checked for correct indentation or
other syntax. Default is undef.
- hasCommentChar(): Returns true if the OutlineParser object has a
comment character.
- getCommentChar(): Returns the OutlineParser's comment character.
- fromStdin(): Tells the OutlineParser that it will receive its input
from stdin. This is the opposite of, and countermands, fromFile(). By
default OutlineParser receives its input from stdin.
- fromFile(): Tells the OutlineParser that it will receive its input
from a file. The actual filename is passed as an argument to the
parse() method. fromFile() is the opposite of, and countermands,
fromStdin(). By default OutlineParser receives its input from stdin.
- zapBlanks(): Sets the OutlineParser to ignore blank lines, meaning
lines with no characters and also lines with only whitespace.
zapBlanks() is the opposite of, and countermands, dontZapBlanks(). By
default OutlineParser ignores blank lines.
- dontZapBlanks(): Sets the OutlineParser to NOT ignore blank lines,
meaning all-whitespace lines and lines with no characters are inserted
as Node objects. dontZapBlanks() is the opposite of, and countermands,
zapBlanks(). By default OutlineParser ignores blank lines.
Action Methods
- new(): Instantiates a new OutlineParser object, and passes it back as
a return.
- getFirstNonBlankChar(): Returns the first non-whitespace character on
a line. This is used primarily internally, and the return for blank
lines is not well defined. It is suggested that application programmers
refrain from using this method.
- parse(): Performs the parse. If parsing a file, pass the filename in
via an argument. All other parse properties have been previously
defined. The top level node (the one created by OutlineParser) is
passed back as the function return. From that top level node, the
application programmer can navigate or walk anywhere in the Node tree.
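The comment character and blank line rules described above can be illustrated with a small filter. The helper names first_nonblank_char() and lines_to_keep() are hypothetical and the code is not taken from OutlineParser; it only assumes the documented behavior that a comment character counts only when it is the first nonblank character on the line.

```perl
use strict;
use warnings;

# Return the first non-whitespace character of a line, or undef if blank.
sub first_nonblank_char {
    my ($line) = @_;
    return $line =~ /(\S)/ ? $1 : undef;
}

# Return the lines that would become nodes, given an optional comment
# character and a flag saying whether blank lines are dropped.
sub lines_to_keep {
    my ($comment_char, $zap_blanks, @lines) = @_;
    my @kept;
    for my $line (@lines) {
        my $first = first_nonblank_char($line);
        next if !defined $first && $zap_blanks;          # blank line
        next if defined $first && defined $comment_char
             && $first eq $comment_char;                 # comment line
        push @kept, $line;
    }
    return @kept;
}

my @outline = ("# heading comment", "Menu", "   ", "\t# indented comment",
               "\tItem #1");
my @nodes = lines_to_keep('#', 1, @outline);
print scalar(@nodes), " lines would become nodes\n";
```

The line "\tItem #1" survives because its comment character is not the first nonblank character.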
Walker Object
The Walker object "walks" an entire Node hierarchy, namely the Node
passed as the first argument to new() and all its descendants. This walk
always goes deep before going broad. In other words, all of a Node's
children are processed before going to its next sibling.
The Walker object visits every Node in the hierarchy, which in itself
does nothing. The Walker performs actions on these Node objects via two
callback routines, the entry callback and the return callback. The
entry callback is called immediately upon the Walker's first visiting
the Node. The return callback is called upon the return to a Node after
all its descendants have been processed. Leaf level nodes never trigger
the return callback because they never return from descendents.
The entry and return callbacks are arg2 and arg3, respectively, of the
new() function. Callbacks MUST be methods of a Perl object. They CANNOT
be free standing subroutines. The reason for this design is so the
callbacks can keep persistent information, and so they can trade
information with the "outside world" without resorting to global
variables.
The following is a simple example of the use of a Walker object,
assuming that $callbacks is an instance of a Callbacks class
containing the method cbPrintNode(),
whose three arguments are the object itself ($self), the Node object
that triggered the callback, and the level of that node:

    my $walker = Walker->new
    (
        $topNode,                               # start with this node
        [\&Callbacks::cbPrintNode, $callbacks]  # do this on entry to each node
    );
    $walker->walk();

You'll notice the preceding instantiates the Walker object with only
two arguments. The return callback is not passed. In fact, most Walker
objects use only an entry callback. If a return callback is required,
it becomes the third argument. In the rare case where a return callback
is required but not an entry callback, the argument for the entry
callback is passed as undef.
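The way a [method reference, object] pair lets callbacks keep persistent state, and the difference between the entry and return callbacks, can be sketched as follows. The mini_walk() function is a hypothetical stand-in for the Walker, written recursively for brevity even though the real Walker uses a loop, and the nodes are plain hashes rather than Node objects. The callback argument order ($self, node, level) follows the description above; everything else is an assumption.

```perl
use strict;
use warnings;

package Callbacks;

sub new { return bless { lines => [] }, shift }

# Entry callback: record the node's value, indented by its level.
sub cbEntry {
    my ($self, $node, $level) = @_;
    push @{ $self->{lines} }, ('  ' x $level) . $node->{value};
}

# Return callback: record that the node's descendants are finished.
sub cbReturn {
    my ($self, $node, $level) = @_;
    push @{ $self->{lines} }, ('  ' x $level) . "end $node->{value}";
}

package main;

# Stand-in for Walker->walk(): each callback is a [coderef, object] pair
# and is invoked as $coderef->($object, $node, $level).
sub mini_walk {
    my ($node, $entry, $return, $level) = @_;
    $level //= 0;
    $entry->[0]->($entry->[1], $node, $level) if defined $entry;
    my @kids = @{ $node->{children} || [] };
    mini_walk($_, $entry, $return, $level + 1) for @kids;
    # Leaf nodes never trigger the return callback.
    $return->[0]->($return->[1], $node, $level) if defined $return && @kids;
}

my $tree = {
    value    => 'menu',
    children => [
        { value => 'file', children => [ { value => 'open' } ] },
        { value => 'quit' },
    ],
};

my $callbacks = Callbacks->new;
mini_walk($tree,
          [\&Callbacks::cbEntry,  $callbacks],    # entry callback
          [\&Callbacks::cbReturn, $callbacks]);   # return callback
print "$_\n" for @{ $callbacks->{lines} };
```

Because both callbacks receive the same $callbacks object, they accumulate their results in its lines array instead of in a global variable, which is the motivation given above for requiring object methods.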
Downloads
Maintainers List
Needed Programming and Documentation Tasks
I've been using Node.pm in its current form for several months, and it
appears complete and stable. I've written several substantial apps
using Node.pm, including the newest (unreleased but being heavily
tested) versions of both EMDL and UMENU.
The one area where Node.pm might be a little light is in reverse
direction navigation. It hasn't been tested well in reverse navigation,
and there is no provision for a reverse Walker. A reverse walker could
be created as a different object (ReverseWalker), or by adding a
direction property to the existing Walker and placing code in the
walk() method to accommodate different directions.
More esoterically, it might be nice some day to have an XMLParser
object analogous to the OutlineParser object.
Where I could really use help is documentation!
How to Participate
Email Steve Litt if you'd
like to participate. I'll work with you as much or as little as you
want.
Mailing List
There's no mailing list yet. For now, communicate directly with Steve
Litt. Once there are several participants, I'll make a mailing
list.
FAQ (Frequently Asked Questions) list
None exists. The project is too new to really know what to put in it.
HTMLized versions of the project documentation
None. You might want to look at the README.otl and INSTALL files.
See also http://www.troubleshooters.com/tpromag/199911/199911.htm,
which discusses outlining in general, and http://www.troubleshooters.com/linux/olvim.htm,
which discusses how to use Vim 6 as an outliner.
Links to related projects.
Dedication: We Stand On Their Shoulders
- Richard Stallman and the Free Software Foundation:
Without them I shudder to think what the software world would be like
today.
- Linus Torvalds and the various Linux projects: Without Linux, I
wouldn't need VimOutliner -- I'd just need a lot more money to purchase
proprietary software and a lot more patience to deal with Blue Screens
of Death and resulting data loss.
- Greater Orlando Linux User Group (GOLUG): The peer to peer brain
network of which I'm a small part.
- The VimOutliner project,
whose members gave me ideas and encouragement in developing Node.pm.
- The Apache Software Foundation,
whose DOM spec inspired me to create Node.pm.
Progress
On 5/14/2004 I released this software, after testing it in its current
form for some six months.