Troubleshooters.Com Presents

Linux Productivity Magazine

Volume 3 Issue 6, June 2004

Curses, Part 2

Copyright (C) 2004 by Steve Litt. All rights reserved. Materials from guest authors copyrighted by them and licensed for perpetual use to Linux Productivity Magazine. All rights reserved to the copyright holder, except for items specifically marked otherwise (certain free software source code, GNU/GPL, etc.). All material herein provided "As-Is". User assumes all risk and responsibility for any outcome.

Have Steve Litt write you a quick application!

See also Troubleshooting Techniques of the Successful Technologist
and Rapid Learning: Secret Weapon of the Successful Technologist
by Steve Litt

[ Troubleshooters.Com | Back Issues |Troubleshooting Professional Magazine ]


 
Fold knowledge into data so program logic can be stupid and robust. -- Eric Raymond, from the Rule of Representation in the Philosophy chapter of his "The Art of Unix Programming" book

CONTENTS

Editor's Desk

By Steve Litt
Last month's Linux Productivity Magazine explained the Curses command line interface extensions, including string placement, keystroke acquisition, colors, windows, and even menus. This month's issue builds on last months issue by demonstrating the use of Curses to build picklists and forms. Just for fun, there's an article showing how to create a maze runner program that runs mazes designed in an editor.

Perhaps the most notable facet of this Linux Productivity Magazine issue is its exploration of Data Driven Design -- the placement of knowledge in data instead of code, thereby making the code versatile, reusable and maintainable. These principles are demonstrated in one of the picklist articles.

This month's Linux Productivity Magazine discusses the philosophy of Data Driven Design, illustrated by the story of my long journey to its knowledge.

s how to use use Curses to create non-gui menus and picklists. If you're a Linux or free software user, of if you just want to learn more about Curses or Data Driven Design, this is your magazine. Enjoy!

Steve Litt is the author of Samba Unleashed.   Steve can be reached at his email address.

Help Publicize Linux Productivity Magazine

By Steve Litt
Loyal readers, I need your help.

For months I've publicized Linux Productivity Magazine, expanding it from a new magazine to a mainstay read by thousands. There's a limit to what I can do alone, but if you take one minute to help, the possibilities are boundless.

If you like this magazine, please report it to one of the Linux magazines. Tell them the URL, why you like it, and ask them to link to it.

I report it to them, but they don't take it very seriously when an author blows his own horn. When a hundred readers report the magazine, they'll sit up and take notice.

Reporting is simple enough. Just click on one of these links, and report the magazine. It will take less than 5 minutes.

News Mag
Submission URL or Email address
Comments
Slashdot
http://slashdot.org/submit.pl
Just fill in the short form.
LinuxToday
http://linuxtoday.com/contribute.php3
Just fill in the short form.
Linux Weekly News
lwn@lwn.net
Just tell them the URL, why you like it.
NewsForge
webmaster@linux.com
Just tell them the URL, why you like it.
VarLinux
http://www.varlinux.org/vlox/html/modules/news/submit.php
Just fill in the short form.
LinuxInsider.Com,
Newsfactor Network
http://www.newsfactor.com/perl/contact_form.pl?to=contact
Just tell them the URL, why you like it.
The Linux Knowledge Portal
webmaster@linux-knowledge-portal.org
Just tell them the URL, why you like it.
OS News
http://www.osnews.com/submit.php
Just tell them the URL, why you like it.
DesktopLinux
http://www.desktoplinux.com/cgi-bin/news_post.cgi
Only for LPM issues involving the Linux desktop, not for programming or server issues.

If you really like this magazine, please take 5 minutes to help bring it to a wider audience. Submit it to one of the preceding sites.
Steve Litt is the founder and acting president of Greater Orlando Linux User Group (GoLUG).   Steve can be reached at his email address.

GNU/Linux, open source and free software

By Steve Litt
Linux is a kernel. The operating system often described as "Linux" is that kernel combined with software from many different sources. One of the most prominent, and oldest of those sources, is the GNU project.

"GNU/Linux" is probably the most accurate moniker one can give to this operating system. Please be aware that in all of Troubleshooters.Com, when I say "Linux" I really mean "GNU/Linux". I completely believe that without the GNU project, without the GNU Manifesto and the GNU/GPL license it spawned, the operating system the press calls "Linux" never would have happened.

I'm part of the press and there are times when it's easier to say "Linux" than explain to certain audiences that "GNU/Linux" is the same as what the press calls "Linux". So I abbreviate. Additionally, I abbreviate in the same way one might abbreviate the name of a multi-partner law firm. But make no mistake about it. In any article in Troubleshooting Professional Magazine, in the whole of Troubleshooters.Com, and even in the technical books I write, when I say "Linux", I mean "GNU/Linux".

There are those who think FSF is making too big a deal of this. Nothing could be farther from the truth. The GNU General Public License, combined with Richard Stallman's GNU Manifesto and the resulting GNU-GPL License, are the only reason we can enjoy this wonderful alternative to proprietary operating systems, and the only reason proprietary operating systems aren't even more flaky than they are now. 

For practical purposes, the license requirements of "free software" and "open source" are almost identical. Generally speaking, a license that complies with one complies with the other. The difference between these two is a difference in philosophy. The "free software" crowd believes the most important aspect is freedom. The "open source" crowd believes the most important aspect is the practical marketplace advantage that freedom produces.

I think they're both right. I wouldn't use the software without the freedom guaranteeing me the right to improve the software, and the guarantee that my improvements will not later be withheld from me. Freedom is essential. And so are the practical benefits. Because tens of thousands of programmers feel the way I do, huge amounts of free software/open source is available, and its quality exceeds that of most proprietary software.

In summary, I use the terms "Linux" and "GNU/Linux" interchangably, with the former being an abbreviation for the latter. I usually use the terms "free software" and "open source" interchangably, as from a licensing perspective they're very similar. Occasionally I'll prefer one or the other depending if I'm writing about freedom, or business advantage.

Steve Litt is the author of Troubleshooting Techniques of the Successful Technologist.   Steve can be reached at his email address.

Picklists

By Steve Litt
A good data input and processing application presents a list of relevent records to the user, so the user can pick which record to process, edit, delete or whatever. This list of records is called a picklist.

The main difference between picklists and menus is that picklists usually have hundreds or thousands of lines. Scrolling is essential. They must be in an order discernable to the user, so that the user can spell out the first few characters of the desired record and navigate directly to it. For instance, if the user wants Milton Anderson, he can type in "Ande" and be close enough to use the page down key and the downarrow key to find the exact record.

Another difference is that the items on menus are typically (but not always) defined at compile time, whereas picklists are always data driven lists obtained at runtime. Menus can be programmed, and frequently are, with a magic number for a maximum number of items. Picklists cannot be so programmed, because you'll always find a data set with more items.

This presents a small problem with Curses. In our preceding programs we stored choice text in an array, but we could have just as easily used a file or SQL statement. Our first task would be to find the number of records, with an SQL count statement, or  a quick walkthrough with fgets(), or even the Linux wc command. Armed with that number, we allocate an array of that number plus one ITEM pointers. We can then loop through the file or data set, allocating an ITEM pointer for each, and assigning a relevant string and other information to each.

BUT...

What if we perform a record add directly from the picklist. There's no room for the new item, because the item list is an  array of pointers, not a linked list of pointers. Therefore, you can't just append, prepend or insert. You need to enlarge the array, and you need to insert the new item where appropriate, moving everything after it down one.

Ugh!

Perhaps the array could be created oversize by some magic number, like 100. In this way, every time something's added, there's a check to see whether the array size is exceeded. If not, it's as simple as a binary search and a shift. If the array size is exceeded, a new array is allocated, oversized by the same magic number, pointers are transferred from the old item list to the new one, and the app continues.

Fast but ugly. It's not ugly from a memory standpoint. Even with 8 byte pointers (enough to handle terrarecords), 1000 pointers would be only 8K of RAM. The problem is that the check must be made every time, and at the very least things must be shifted, and the MENU somehow refreshed.

Worse yet, when the array size is exceeded, the current ITEM ** must be stored in a temp ITEM **, the original ITEM ** must be reallocated with the larger capacity, the list of items must be copied from the temporary ITEM** to the original one, with provisions for the shift for the new record, and then the MENU must be refreshed.

Ugh!

It's enough to tempt one to just refresh from the database upon return from an insert or delete. Trouble is, the extra SQL retrieves would bring the enterprise DBMS to its knees if the app is heavily used.

Unlike a little menu, we must enable Home to go to the top, End to go to the bottom, PageUp and PageDown to go up or down a screenful. Up to six different functionalities must be enabled, together with keystrokes to enable them. And please remember, that due to pattern searching, those keystrokes cannot be simple letters, because those letters would be added to the search pattern. Here are the five possible functionalities of a picklist:
  1. Pick
  2. Add
  3. Change
  4. Delete
  5. Bail
  6. Refresh

Functionality
Typical keysrokes
What it does
Pick
Enter, Ctrl+P
Terminates the picklist and returns to the calling subroutine with the unique identifier of the item that was highlighted when the user chose
Add
Insert, Ctrl+A
The picklist calls a blank form to create a new item in the list or table under consideration. When the user terminates that form, if the user did not bail, then the newly created record is visible on the picklist
Change
Enter, Ctrl+C, Ctrl+M, Ctrl+E
Change, modify, edit are synonyms. The picklist calls a form, with the current contents of the highlighted record already in place. The user can then modify the record's data, and submit it to the database or file, returning afterward back to the calling picklist.
Delete
Del, Backspace, Ctrl+D
The picklist calls a read only form containing data from the highlighted record. The user can then choose to continue with the deletion or bail from the deletion. Either way control returns to the picklist, but if the user bails, the record is not deleted from the database.
Bail
Esc
Terminate the picklist, and return to the calling subroutine with an identifier indicating that the user declined to choose. The calling subroutine typically then acts as if the user had never gone into the picklist.
Refresh
Ctrl+L
Refresh the list of records on the picklist. Some picklist tools are incapable of putting a newly added record into the list, so the user must refresh the list. This is typically a slow operation, so it's better if the picklist and its associated routines refreshes the list. However, in a multiuser environment, refreshing might still be necessary because other users might have inserted or deleted records.

Picklists are typically called in one of two situations:
  1. Called from a menu to choose a record on which to operate
  2. Called from a field on a form to choose that field's contents from a list.
#1 is straightforward enough, but #2 can get more involved. What if upon entering a picklist from a form's field, the user finds that the desired record is not yet in the database. In many cases, the best policy is for the user to add that new record, and then choose it to fill the field.

Or, when the user is about to choose the record, she notices that the record contains bad information. In that case, often the best policy is to let the user modify the record before choosing it to fill the field.

Both these scenarios require two different choice modes in the picklist:
  1. Choose and return to the form
  2. Choose and drill down further into the data
For simplicity and speed, we like to assign the Enter key to the concept of choosing. Trouble is, on picklists with two choice modes, which will be assigned the Enter key? Typically this depends on which is more likely. Complicating this is the fact that the same picklist could be called both from a menu and from a field. This is common.

What this means is that key choices for these functionalities must be configurable at runtime for each use of a particular picklist. The key choices must be passed to whatever function runs the picklist.
Steve Litt is the author of Rapid Learning: Secret Weapon of the Successful Technologist.   Steve can be reached at his email address.

A Simple Picklist

By Steve Litt
The Curses tools have no provision for a picklist. However, they do have provisions for a scrolling menu, which is the same thing. Referring to last month's discussion of menus, when creating a scrolling menu, always keep the following rules in mind:
The following is the simple picklist program:

Data

Program
Anderson
Barnett
Carter
Charles
Childers
Colby
Daniels
Ellison
Franco
Goldstein
Heller
Ignacius
Johnson
Kelly
Lewis
Martin
Nunez
Osbourne
Peterson
Quincy
Rodriguez
Stevens
Thomas
Ullman
Victor
Watkins
Xavier
Yamaha
Zelinski
    
#include <curses.h>	/* Necessary for all Curses programs */
#include <menu.h> /* Gives you menuing capabilities */
#include <stdlib.h> /* Needed for calloc() */
#include <string.h> /* Needed for strlen() and friends */

#define WHITEONRED 1
#define WHITEONBLUE 2
#define WHITEONBLACK 3
#define BLACKONWHITE 4
#define REDONWHITE 5


void wCenterTitle(WINDOW *pwin, const char * title)
{
int x, maxy, maxx, stringsize;
getmaxyx(pwin, maxy, maxx);
stringsize = 4 + strlen(title);
x = (maxx - stringsize)/2;
mvwaddch(pwin, 0, x, ACS_RTEE);
waddch(pwin, ' ');
waddstr(pwin, title);
waddch(pwin, ' ');
waddch(pwin, ACS_LTEE);
}

void wclrscr(WINDOW * pwin)
{
int y, x, maxy, maxx;
getmaxyx(pwin, maxy, maxx);
for(y=0; y < maxy; y++)
for(x=0; x < maxx; x++)
mvwaddch(pwin, y, x, ' ');
}

bool initColors()
{
if(has_colors())
{
start_color();
init_pair(WHITEONRED, COLOR_WHITE, COLOR_RED);
init_pair(WHITEONBLUE, COLOR_WHITE, COLOR_BLUE);
init_pair(REDONWHITE, COLOR_RED, COLOR_WHITE);
return(true);
}
else
return(false);
}


int runMenu(
WINDOW *wParent,
int height,
int width,
int y,
int x,
char *choices[]
)
{
int c; /* key pressed */
ITEM **my_items; /* list of items on this menu */
MENU *my_menu; /* the menu structure */

WINDOW *wUI; /* window on which the user
interacts with the menu */
WINDOW *wBorder; /* window containing the wUI window
and the border and title */

int n_choices; /* number of items on menu */
int ssChoice; /* subscript to run around the choices array */
int my_choice = -1; /* the zero based numeric user choice */

/* CALCULATE NUMBER OF MENU CHOICES */
for(n_choices=0; choices[n_choices]; n_choices++);

/* ALLOCATE ITEM ARRAY AND INDIVIDUAL ITEMS */
my_items = (ITEM **)calloc(n_choices + 1, sizeof(ITEM *));
for(ssChoice = 0; ssChoice < n_choices; ++ssChoice)
my_items[ssChoice] = new_item(choices[ssChoice], NULL);
my_items[n_choices] = (ITEM *)NULL;

/* CREATE THE MENU STRUCTURE */
my_menu = new_menu((ITEM **)my_items);

/* PUT > TO THE LEFT OF HIGHLIGHTED ITEM */
set_menu_mark(my_menu, "> ");

/* SET UP WINDOW FOR MENU'S BORDER */
wBorder = newwin(height, width, y, x);
wattrset(wBorder, COLOR_PAIR(WHITEONRED) | WA_BOLD);
wclrscr(wBorder);
box(wBorder, 0, 0);
wCenterTitle(wBorder, "Choose one");

/* SET UP WINDOW FOR THE MENU'S USER INTERFACE */
wUI = derwin(wBorder, height-2, width-2, 2, 2);

/* ASSOCIATE THESE WINDOWS WITH THE MENU */
set_menu_win(my_menu, wBorder);
set_menu_sub(my_menu, wUI);
set_menu_format(my_menu, 12, 1);

/* MATCH MENU'S COLORS TO THAT OF ITS WINDOWS */
set_menu_fore(my_menu, COLOR_PAIR(REDONWHITE));
set_menu_back(my_menu, COLOR_PAIR(WHITEONRED) | WA_BOLD);

/* SET UP AN ENVIRONMENT CONDUCIVE TO MENUING */
keypad(wUI, TRUE); /* enable detection of function keys */
noecho(); /* user keystrokes don't echo */
curs_set(0); /* make cursor invisible */

/* DISPLAY THE MENU */
post_menu(my_menu);

/* REFRESH THE BORDER WINDOW PRIOR TO ACCEPTING USER INTERACTION */
touchwin(wBorder);
wrefresh(wBorder);

/* HANDLE USER KEYSTROKES */
while(my_choice == -1)
{
touchwin(wUI); /* refresh prior to getch() */
wrefresh(wUI); /* refresh prior to getch() */
c = getch();
switch(c)
{
case KEY_DOWN:
menu_driver(my_menu, REQ_DOWN_ITEM);
break;
case KEY_UP:
menu_driver(my_menu, REQ_UP_ITEM);
break;
case 10: /* Enter */
my_choice = item_index(current_item(my_menu));

/* RESET CURSOR IN CASE MORE SELECTION IS NECESSARY */
pos_menu_cursor(my_menu);
break;
}
}

/* FREE ALL ALLOCATED MENU AND ITEM RESOURCES */
unpost_menu(my_menu);
for(ssChoice = 0; ssChoice < n_choices; ++ssChoice)
free_item(my_items[ssChoice]);
free_menu(my_menu);

/* DESTROY MENU WINDOW AND BORDER WINDOWS */
delwin(wUI);
delwin(wBorder);

/* UNDO MENU SPECIFIC ENVIRONMENT */
curs_set(1); /* make cursor visible again */

/* REPAINT THE CALLING SCREEN IN PREPARATION FOR RETURN */
touchwin(wParent);
wrefresh(wParent);

/* RETURN THE ZERO BASED NUMERIC USER CHOICE */
return(my_choice);
}


int main()
{
char *choices[] = /* The menu choices */
{
"Anderson",
"Barnett",
"Carter",
"Charles",
"Childers",
"Colby",
"Daniels",
"Ellison",
"Franco",
"Goldstein",
"Heller",
"Ignacius",
"Johnson",
"Kelly",
"Lewis",
"Martin",
"Nunez",
"Osbourne",
"Peterson",
"Quincy",
"Rodriguez",
"Stevens",
"Thomas",
"Ullman",
"Victor",
"Watkins",
"Xavier",
"Yamaha",
"Zelinski",
NULL
};
int choiceno;

initscr(); /* start ncurses */
cbreak(); /* immediately acquire each keystroke */
noecho(); /* do not echo user keystrokes */
keypad(stdscr, TRUE); /* enable detection of function keys */
initColors(); /* enable colors and initialize pairs */

/* SET UP AND PAINT STANDARD SCREEN */
wattrset(stdscr, COLOR_PAIR(WHITEONBLUE) | WA_BOLD);
wclrscr(stdscr);
mvwaddstr(stdscr, 10, 10, "Simple color menu");
touchwin(stdscr);
wrefresh(stdscr);

/* ACQUIRE THE USER'S CHOICE */
choiceno = runMenu(stdscr, 16, 40, 2, 20, choices);

mvwaddstr(stdscr, 22, 0, "Hit any key to finish==>");
touchwin(stdscr);
wrefresh(stdscr);
getch();

endwin(); /* end ncurses */

printf("\n\nYou chose item %d, %s\n", choiceno, choices[choiceno]);
return(0);
}

Compile and run the preceding program with this script:

rm -f ./a.out
gcc -Wall -lcurses -lmenu -lform picklist.c
./a.out

You'll notice the preceding is very much like the menu subroutines discussed in last month's Linux Productivity Magazine, with the exception that they're made scrollable with the set_menu_format() routine.

The preceding code produces the following picklist:

Output of simple picklist

A simple picklist is great for a quickie 1 day app, but it's not maintainable or reusable. If you anticipate frequent changes or re-uses, you owe it to yourself to create a data driven picklist...
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Data Driven Picklist

By Steve Litt
This data driven picklist program places all picklist behavior inside a single PICKINFO data structure. That makes it incredibly adaptable, because behavior is changed in data, not in code. It wouldn't be particularly difficult to modify this program to read the picklist's entire configuration from a configuration file, thus freeing program modification from the necessity of recompilation, enabling end users to change the look and feel.

Compile and run the data driven picklist program with this script:

rm -f ./a.out
gcc -Wall -lcurses -lmenu -lform picklist.c
./a.out


Using this data file, the data driven picklist program produces the following output:

Output of data driven picklist

The placement of program behavior inside of data structures is called Data Driven Design. Data Driven Design yields the ultimately configurable, maintainable and reusable code. Data Driven Design isn't for everyone -- it usually requires more lines of code and more design, coding and debugging time. If you need to whomp out a 1 day app, go functional decomposition or a partial Data Driven Design. On the other hand, if you're creating a tool to save time and effort from now into the future, Data Driven Design is the way to go.

This program was designed around a data structure containing "everything you ever wanted to know about a picklist but were afraid to ask. Let's start with the #defines, typedefs and enums:

/* #DEFINES AND ENUMS {{{ */
#define WHITEONRED 1
#define WHITEONBLUE 2
#define WHITEONBLACK 3
#define BLACKONWHITE 4
#define REDONWHITE 5
#define GREENONBLACK 6
#define CYANONRED 7


enum KBDKEYS {ENTER, CTRL_A, CTRL_B, CTRL_C, CTRL_D, CTRL_E,
CTRL_F, CTRL_G, CTRL_H, CTRL_I, CTRL_J, CTRL_K,
CTRL_L, CTRL_M, CTRL_N, CTRL_O, CTRL_P, CTRL_Q,
CTRL_R, CTRL_S, CTRL_T, CTRL_U, CTRL_V, CTRL_W,
CTRL_X, CTRL_Y, CTRL_Z, ESC, BACKSPACE, TAB, NULLCHAR};

typedef enum KBDKEYS KBDKEYS8[8];

enum ACTIONTYPES {PICK, ADD, CHANGE, DELETE, BAIL, REFRESH, NULLACTIONTYPE};
/* #DEFINES AND ENUMS }}} */

Did you notice the {{{ and }}} markers in the comments? Those are Vim folding markers, enabling the entire program to be seen as an outline whose elements can be expanded and collapsed. When programs get over 600 lines, this is an excellent idea, always assuming you're using Vim as your editor. To read all about it, issue the following command from within Vim:
:h fold
One thing should be said right now -- this program has some design and philosophical errors. Turning it into a true data driven tool, in which all picklist configuration could be accomplished in configuration files rather than code changes, would require some more work. For instance, color pair constants shouldn't have been defined by color name, but instead by function name -- BORDERPAIR, UIPAIR, UIHIGHLIGHTPAIR and the like. And they should be variables so that they don't conflict with the application's color pair constants. The picklist tool would then contain a color string to color constant translation table so that string representations of a color, read in from a config file, could be converted to the number Curses uses for that color.

Because Curses takes color pairs only from 0 to 15, perhaps the tool should have an array of 16 ints, and subroutines to check for the next one and reserve it, and another to look up by (variable) number.

As mentioned, in order to get this magazine out on time, I did not do a complete job of making it data driven.

We start with the KBDKEYS enum we give names to all the control keys likely to be pressed during operation of the picklist. NULLCHAR is an end marker. We typedef an array of 8 such keyboard keys. This array of 8 will be used to house several keystrokes all leading to the same action.

Finally, we define ACTIONTYPES as PICK, ADD, CHANGE, DELETE, BAIL, or REFRESH, with NULLACTIONTYPE tacked on as an end marker.

The top level of our data structure is the PICKINFO struct, as follows:

struct PICKINFO						/*{{{*/
{
struct ROWSINFO *rowsInfo;
struct KEYSTROKES *keystrokes;
struct SUBROUTINES *subroutines;
struct WINDOWS *windows;
char ** actionNames;
}; /*}}}*/

The following is a short description of the contained information:

PICKINFO element
What's stored in the element
ROWSINFO
Pointer to a struct ROWSINFO, which contains everything about the rows of data presented for possible selection.
KEYSTROKES
Pointer to a struct KEYSTROKES, which contains everything about possible keystrokes, including keystroke ascii codes, string representation of keystroke names, and for each possible action of the picklist, the keystrokes that trigger such action.
SUBROUTINES
A pointer to the subroutines to be run upon the ADD, CHANGE or DELETE actions -- typically a form. Note that a picklist could also simply return the action chosen by the user, in which case the next loop iteration would run the form on the record selected by the user.
WINDOWS
The layout of the picklist -- its height, width, as well as the dimensions of the surrounding border window. Color scheme, and prompts location.
actionNames
Strings representing PICK, ADD, CHANGE, DELETE, BAIL or REFRESH.

The WINDOWS Structure

The WINDOWS * element is simplest because it contains no substructures, but simply identifies a bunch of properties about the Border Window and the UI Window, so we'll start with that. For the Border and UI windows, the following information is contained:
For the prompts (eg. Press Ctrl+D to delete), it contains the following infomation
The WINDOWS structure also contains the following information:
struct WINDOWS						/*{{{*/
{
int uiTopLeftY;
int uiTopLeftX;
int ui_height;
int ui_width;
int ui_color_combo_constant;
int ui_attrib;
int ui_highlight_color_combo_constant;
int ui_highlight_attrib;
int borderTopLeftY;
int borderTopLeftX;
int border_height;
int border_width;
int border_color_combo_constant;
int border_attrib;
char *title;
int firstPromptY;
int firstPromptX;
int prompt_color_combo_constant;
int prompt_attrib;
}; /*}}}*/

The KEYSTROKES Structure

The KEYSTROKES structure is either very simple or very complex, depending on how you view abstraction. It has only three elements, making it very simple. But one element is an array of char pointers, and another is an array of arrays of enum KBDKEYS.

struct KEYSTROKES					/*{{{*/
{
char * keySymbols[NULLCHAR + 1];
char keyChars[NULLCHAR + 1];
KBDKEYS8 action_keys[NULLACTIONTYPE];
}; /*}}}*/

The keySymbols element is an array of pointers. The array elements, which are pointers, are allocated in the declaration. But the pointer for each element must be allocated with malloc() and freed with free(). The action_keys element is entirely allocated by the declaration. It is an array of arrays of enum KBDKEYS used to map user keystrokes to various actions.

All of the elements of the KEYSTROKES pointer are manipulated by specialized routines.

The ROWSINFO Structure

The ROWSINFO structure holds all the application data scrolled in the picklist. It looks like this:

struct ROWSINFO						/*{{{*/
{
long numrows; /* number of rows currently consumed */
long maxrows; /* maximum rows in array */
long increase_maxrows_by; /* number of rows to add to array when full */
struct ROWNODE *headNode;
struct ROWINFO **rows;
}; /*}}}*/

This is very much a state variable. Rows of data are dynamically allocated increase_maxrows_by at a time. numrows and maxrows both start out zero. numrows increases by one for each row pushed into the picklist. maxrows is the allocated length of the rows array. When one more row would exceed that size, rows is reallocated to a size increase_maxrows_by larger than the current size.

Each row must be allocated a ROWINFO structure. Notice that ROWINFO is singular, ROWSINFO is plural. The ROWINFO (singular) struct looks like this:

struct ROWINFO						/*{{{*/
{
char *displayString;
char *sortString;
bool sortPointsToDisplay;
char *uid;
}; /*}}}*/

The ROWINFO structure consists of 3 strings to allocate and a boolean. If the boolean is true, it tells the application that it should use displayString in lieu of sortString.

To make this simpler, both the ROWINFO and ROWSINFO structs are manipulated by their own functions:

struct
Manipulating function
Description
struct ROWINFO
struct ROWINFO * newRowInfo(
    const char *displayString,
    const char *sortString,
    const bool sortPointsToDisplay,
    const char *uid
        );
This allocates all storage for a new row. First it allocates storage for the struct ROWINFO itself, and then it allocates strings equal to displayString, sortString and uid, leaving the original arguments untouched and unassigned. If sortPointsToDisplay=True, it does not make an allocated copy of sortString.
void freeRowInfo(struct ROWINFO * rowInfo);
This frees the displayString and uid, pointers, and also the , sortString pointer if sortPointsToDisplay=false. It then deallocates the struct ROWINFO itself.
void testRowInfo();
This is a loop to repeatedly runs newRowInfo() and freeRowInfo() in order to detect memory leak and other problems. Ideally, you should look at the memory display of the Linux top command. If memory usage continuously rises, you have a memory leak.



struct ROWSINFO
struct ROWSINFO * newRowsInfo(long increase_maxrows_by);
This allocates the ROWSINFO structure and initializes rowsInfo, maxrows and increase_maxrows_by. It does not allocate storage for rows and headNode, but instead sets them to NULL.
void freeRowsInfo(struct ROWSINFO *rowsInfo);
This deallocates EVERYTHING in the structure ROWSINFO, including any allocated rows. At this time it doesn't deallocate any ROWNODE structures because those structures aren't yet incorporated into the program, but exist as stub.
void testRowInfo();
This operates a loop that repeatedly:
  1. Calls newRowsInfo to instantiate a ROWSINFO pointer
  2. Repeatedly calls newRow to insert 30 rows into the  ROWINFO record
  3. Calls freeRowsInfo to free all rows and the ROWSINFO pointer itself.
This function is used to test for memory leak.
void addRow(
struct ROWSINFO * rowsInfo,
const char *displayString,
const char *sortString,
const bool sortPointsToDisplay,
const char *uid
);
This calls newRow to allocate and configure a new ROWINFO pointer, and also updates the numrows, maxrows

In this particular app, the picklist's appearance is input to the data structure with code, such as the setUpWindows(), buildKeyDefaults(), and assignDefaultActionKeys() functions. With a little more work it could come straight from a config file, eliminating the need for programming and recompilation. In this app, the data to be picked DOES come from a file, in this case a sorted file of strings. If it weren't sorted correctly, it could be sorted with qsort(), or each element could be inserted in a tree headed by pickInfo->rowsInfo->headNode, such that they're sorted upon insertion. This works in cases where the items to be sorted are not in an array of like-sized elements.

The routine to actually run the menu closely resembles that of its hardcoded brethren, except that it uses data from the PICKINFO struct instead of arguments or hardcodes:

int runMenu(						/*{{{*/
struct PICKINFO *pickInfo,
WINDOW *wParent
)
{
int c; /* key pressed */
ITEM **my_items; /* list of items on this menu */
MENU *my_menu; /* the menu structure */

WINDOW *wUI; /* window on which the user
interacts with the menu */
WINDOW *wBorder; /* window containing the wUI window
and the border and title */

int n_choices; /* number of items on menu */
int ssChoice; /* subscript to run around the choices array */
int my_choice = -1; /* the zero based numeric user choice */
int y;
enum ACTIONTYPES actiontype;


/* CALCULATE NUMBER OF MENU CHOICES */
n_choices = pickInfo->rowsInfo->numrows;

/* ALLOCATE ITEM ARRAY AND INDIVIDUAL ITEMS */
my_items = (ITEM **)calloc(n_choices + 1, sizeof(ITEM *));
for(ssChoice = 0; ssChoice < n_choices; ++ssChoice)
{
my_items[ssChoice] = new_item(
pickInfo->rowsInfo->rows[ssChoice]->displayString, NULL);
}
my_items[n_choices] = (ITEM *)NULL;

/* CREATE THE MENU STRUCTURE */
my_menu = new_menu((ITEM **)my_items);

/* PUT > TO THE LEFT OF HIGHLIGHTED ITEM */
set_menu_mark(my_menu, "> ");

/* SET UP WINDOW FOR MENU'S BORDER */
wBorder = newwin(
pickInfo->windows->border_height,
pickInfo->windows->border_width,
pickInfo->windows->borderTopLeftY,
pickInfo->windows->borderTopLeftX);
wattrset(
wBorder,
COLOR_PAIR(pickInfo->windows->border_color_combo_constant) |
pickInfo->windows->border_attrib);
wclrscr(wBorder);
box(wBorder, 0, 0);
wCenterTitle(wBorder, pickInfo->windows->title);

/* SET UP WINDOW FOR THE MENU'S USER INTERFACE */
wUI = derwin(
wBorder,
pickInfo->windows->ui_height,
pickInfo->windows->ui_width,
pickInfo->windows->uiTopLeftY,
pickInfo->windows->uiTopLeftX
);

/* ASSOCIATE THESE WINDOWS WITH THE MENU */
set_menu_win(my_menu, wBorder);
set_menu_sub(my_menu, wUI);
set_menu_format(my_menu, pickInfo->windows->ui_height, 1);

/* MATCH MENU'S COLORS TO THAT OF ITS WINDOWS */
set_menu_fore(
my_menu,
COLOR_PAIR(pickInfo->windows->ui_highlight_color_combo_constant) |
pickInfo->windows->ui_highlight_attrib
);
set_menu_back(
my_menu,
COLOR_PAIR(pickInfo->windows->ui_color_combo_constant) |
pickInfo->windows->ui_attrib
);

/* SET UP AN ENVIRONMENT CONDUCIVE TO MENUING */
keypad(wUI, TRUE); /* enable detection of function keys */
noecho(); /* user keystrokes don't echo */
curs_set(0); /* make cursor invisible */
menu_opts_on(my_menu, O_SHOWMATCH);
menu_opts_on(my_menu, O_IGNORECASE);
menu_opts_on(my_menu, O_NONCYCLIC);

/* DISPLAY THE MENU */
post_menu(my_menu);

/* DISPLAY THE PROMPTS */
wattrset(
wBorder,
COLOR_PAIR(pickInfo->windows->prompt_color_combo_constant) |
pickInfo->windows->prompt_attrib
);
y = pickInfo->windows->firstPromptY;
for(actiontype=PICK; actiontype < NULLACTIONTYPE; actiontype++)
{
enum KBDKEYS *kbdkeys = pickInfo->keystrokes->action_keys[actiontype];
if(kbdkeys[0] != NULLCHAR)
{
int ssKeySymbol;
mvwaddstr(
wBorder,
y,
pickInfo->windows->firstPromptX,
pickInfo->actionNames[actiontype]
);
waddstr(wBorder, ": ");
for(ssKeySymbol = 0; ssKeySymbol < 7; ssKeySymbol++)
{
enum KBDKEYS thisKey = kbdkeys[ssKeySymbol];
if(thisKey == NULLCHAR)
break;
else
{
char *pc;

if(ssKeySymbol > 0) waddstr(wBorder, ", ");
pc = pickInfo->keystrokes->keySymbols[thisKey];
waddstr(wBorder, pc);
}
}
y++;

}
}
wattrset(
wBorder,
COLOR_PAIR(pickInfo->windows->border_color_combo_constant) |
pickInfo->windows->border_attrib
);

/* REFRESH THE BORDER WINDOW PRIOR TO ACCEPTING USER INTERACTION */
touchwin(wBorder);
wrefresh(wBorder);

/* HANDLE USER KEYSTROKES */
while(my_choice == -1)
{
touchwin(wUI); /* refresh prior to getch() */
wrefresh(wUI); /* refresh prior to getch() */
c = getch();
switch(c)
{
case KEY_DOWN:
menu_driver(my_menu, REQ_DOWN_ITEM);
break;
case KEY_UP:
menu_driver(my_menu, REQ_UP_ITEM);
break;
case KEY_NPAGE:
menu_driver(my_menu, REQ_SCR_DPAGE);
break;
case KEY_PPAGE:
menu_driver(my_menu, REQ_SCR_UPAGE);
break;
case 10: /* Enter */
my_choice = item_index(current_item(my_menu));

/* RESET CURSOR IN CASE MORE SELECTION IS NECESSARY */
pos_menu_cursor(my_menu);
break;
}
}

/* FREE ALL ALLOCATED MENU AND ITEM RESOURCES */
unpost_menu(my_menu);
for(ssChoice = 0; ssChoice < n_choices; ++ssChoice)
free_item(my_items[ssChoice]);
free_menu(my_menu);

/* DESTROY MENU WINDOW AND BORDER WINDOWS */
delwin(wUI);
delwin(wBorder);

/* UNDO MENU SPECIFIC ENVIRONMENT */
curs_set(1); /* make cursor visible again */

/* REPAINT THE CALLING SCREEN IN PREPARATION FOR RETURN */
touchwin(wParent);
wrefresh(wParent);

/* RETURN THE ZERO BASED NUMERIC USER CHOICE */
return(my_choice);
} /*}}}*/

As mentioned previously, this is not a completely data driven design, in that it does not facilitate a picklist completely defined in a config file. The problem with #defined color pairs was already mentioned -- these should be in data. Also, notice that in the subroutine above, set_menu_mark() strongarms the mark to ">" rather than obtaining it from data (it's not included in the data). This magazine is very late, so I didn't want to take the time to get this program perfect.

What happens when somebody improves this program to perfection? At that point it becomes a do-anything picklist tool. Combine it with a data-driven do-anything menu tool, and a data-driven do-anything form tool (together with data driven field tools for the form), and an SQL interface, and you have the makings for some extremely fast development. If done right, the vast majority of the app could be written in a configuration file.

In my opinion, such a configuration file would best be implemented as a tab indented outline. Tab indented outlines are intuitive to humans, and easily manipulated by computer algorithms. Next month's Linux Productivity Magazine features a tool called Node.pm, which makes processing and manipulation of tab indented outlines trivial.

Imagine getting the specifications for a new application, spending a few hours with an outline processor, and then giving the client the specified app (or a close approximation thereof). With data driven menus, picklists, forms, formfields and SQL interface it's entirely possible.

One more point. Looking at the PICKINFO structure, do you notice that for the most part it's generic to the concept of a picklist, with very little dependency on Curses. This makes it feasible to have one configuration file service both Curses based picklists and QT or GTK based picklists. If you've ever admired programmers creating the ultimate level of modularity, consider they might simply be performing Data Driven Design.
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Form Hello World

By Steve Litt
The simplest possible form is one field taking keystrokes, and exiting when you press the "accept" key, which we'll define as F9. For this and all following forms, you'll see the best results when looking it on a genuine character mode terminal (Ctrl+Alt+F2, etc), rather than a GUI terminal emulator. Specifically, many GUI terminals don't support blink, so the cursor ends up looking like just space on the form.

The gnome-terminal GUI terminal does display blink. The following exercises use F9 as the "accept key" because gnome-terminal does nothing special with that key. If it had been F10, gnome-terminal would have intercepted the keystroke and displayed a menu in response.

Throughout the exercises in this magazine you'll see that different GUI terminals display slightly different. When in doubt, view the output on a genuine character terminal (Ctrl+Alt+F2, etc). We'll try to make sure that they all look good on genuine character terminals.

The following code displays a single field in reverse video, into which you can input keystrokes.

#include <form.h>

int main()
{
FIELD *field[2];
FORM *myForm;
int ch;

initscr(); /* start curses */
cbreak();
noecho(); /* don't echo keystrokes */
keypad(stdscr, TRUE); /* accept func keys */

/* CREATE FIELD ARRAY */
field[0] = new_field
(
1, /* height in rows */
10, /* width in characters */
4, /* y position in rows, 0 based */
17, /* x position in columns, 0 based */
0, /* number of offscreen rows */
0 /* number of working buffers */
);
field[1] = NULL;

/* SET FIELD OPTIONS */
set_field_back(field[0], WA_REVERSE); /* field has reverse video text */
field_opts_off(field[0], O_AUTOSKIP); /* don't walk off end of field */

/* CREATE FORM */
myForm = new_form(field);
post_form(myForm);
mvprintw(4, 10, "Field: ");
refresh();

/* KEYSTROKE ACQUISITION LOOP */
while((ch = getch()) != KEY_F(9)) /* F9 is accept keystroke */
{
form_driver(myForm, ch);
}

/* UNPOST FORM, FREE MEMORY */
unpost_form(myForm);
free_form(myForm);
free_field(field[0]);
free_field(field[1]);
free_field(field[2]);

endwin();
return 0;
}
Compile the preceding code like this:
rm -f ./a.out
gcc -Wall -lncurses -lform form.c
./a.out
When you run the program, the result looks like this:
Output of the Hello World form program

The preceding outputs a form with a single prompt and a single reverse video field. Try typing into the field. Then press the F9 key to exit the form.
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

A Trivial Input Form

By Steve Litt
The form in the preceding article does nothing except prove the concept of Curses forms. In this article we'll create a form with three fields, and provide navigation between them. Paste the following code into an editor, and then compile and run:

#include <form.h>

int main()
{
FIELD *field[4];
FORM *myForm;
int ch;

initscr(); /* start curses */
cbreak();
noecho(); /* don't echo keystrokes */
keypad(stdscr, TRUE); /* accept func keys */

/* CREATE FIELD ARRAY */
field[0] = new_field
(
1, /* height in rows */
10, /* width in characters */
4, /* y position in rows, 0 based */
22, /* x position in columns, 0 based */
0, /* number of offscreen rows */
0 /* number of working buffers */
);
field[1] = new_field(1, 10, 6, 22, 0, 0);
field[2] = new_field(1, 10, 8, 22, 0, 0);
field[3] = NULL;

/* SET FIELD OPTIONS */
set_field_back(field[0], WA_REVERSE); /* field has reverse video text */
field_opts_off(field[0], O_AUTOSKIP); /* don't walk off end of field */

set_field_back(field[1], WA_REVERSE);
field_opts_off(field[1], O_AUTOSKIP);

set_field_back(field[2], WA_REVERSE);
field_opts_off(field[2], O_AUTOSKIP);

/* CREATE FORM */
myForm = new_form(field);
post_form(myForm);
mvprintw(4, 10, "First name: ");
mvprintw(6, 10, "Last name : ");
mvprintw(8, 10, "Job : ");
form_driver(myForm, REQ_FIRST_FIELD); /* Put cursor in first field */
refresh();

/* KEYSTROKE ACQUISITION LOOP */
while((ch = getch()) != KEY_F(9)) /* F9 is accept keystroke */
{
switch(ch)
{
case 9: /* tab key */
case 10: /* enter key */
case KEY_DOWN:
form_driver(myForm, REQ_NEXT_FIELD);
form_driver(myForm, REQ_END_LINE);
break;
case KEY_BTAB:
case KEY_UP:
form_driver(myForm, REQ_PREV_FIELD);
form_driver(myForm, REQ_END_LINE);
break;
default:
form_driver(myForm, ch);
break;
}
}

/* UNPOST FORM, FREE MEMORY */
unpost_form(myForm);
free_form(myForm);
free_field(field[0]);
free_field(field[1]);
free_field(field[2]);

endwin();
return 0;
}

The preceding creates a form with three prompts and three fields. You can navigate between the fields using tab, or the  up and down arrow keys. Once again, F9 is the accept key, terminating the form.

Compile the preceding code like this:
rm -f ./a.out
gcc -Wall -lncurses -lform form.c
./a.out
The result is as follows:
Output of the Trivial Form

The lesser length of the first field is an artifact -- the cursor shows up as not part of the field.

This resembles a real form, but you'll notice several deficiencies:
In the next article, you'll see these deficiencies corrected.
Steve Litt is the author of Rapid Learning: Secret Weapon of the Successful Technologist.   Steve can be reached at his email address.

A Practical Input Form

By Steve Litt
  The form in the preceding article had several deficiencies:
In this article we cure all four of these deficiencies. We return the user's choice by reading the contents of the array of fields. We return whether the user chose to accept or bail by the function return of the runForm() subroutine. We gave the user the choice of accepting or bailing by adding provisions within the switch statement in the keystroke acquisition loop for Ctrl+A (accept) and Ctrl-Q (quit, bail). Actually, Ctrl+A fulfils the loop's stop condition, and ACCEPT is the default.

The switch statement in the loop also has provisions for the backspace and delete keys. Backspace deletes the character to the left of the cursor and pulls everything that was to the right of that, to the left. Delete deletes the character under the cursor and pulls everything to the right of that left. Note that if you use the Del key on the keypad, numlocks must be off.

The switch statement has a case for the Insert key. That case toggles the insert status of the form. In insert mode, when you type a character it goes at the cursor, and pushes what was under the cursor, and everything to the right of that, one character farther right. In overwrite mode, the character typed simply replaces what was at the cursor.

Note also that this version allows fields to be pre-filled. Here's the code:

#include <form.h>
#include <stdlib.h> /* accommodate malloc() */
#include <string.h> /* accommodate string functions */
#include <assert.h> /* accommodate assert() */

/* DEFINE A COUPLE ENUMERATIONS */
enum RESULTTYPES {ACCEPT, BAIL};
enum INSERT_MODES {INSERT, OVERLAY};


/* CREATE A FUNCTION TO RUN A FORM */
enum RESULTTYPES runForm(FIELD ** fields)
{
FORM *myForm;
int ch;
enum RESULTTYPES result;
enum INSERT_MODES insert_mode = OVERLAY;
int fieldNo;


initscr(); /* start curses */
cbreak();
raw();
noecho(); /* don't echo keystrokes */
keypad(stdscr, TRUE); /* accept func keys */

/* CREATE FORM */
myForm = new_form(fields);
post_form(myForm);
mvprintw(4, 10, "First name: ");
mvprintw(6, 10, "Last name : ");
mvprintw(8, 10, "Job : ");
mvprintw(20, 10, "Press Ctrl+A to Accept");
mvprintw(21, 10, "Press Ctrl+Q to Quit");
mvprintw(22, 10, "Press INSERT to toggle insert mode");
form_driver(myForm, (insert_mode==INSERT ? REQ_INS_MODE : REQ_OVL_MODE));
form_driver(myForm, REQ_FIRST_FIELD); /* Put cursor in first field */
refresh();

/* SET FIELD OPTIONS, TURN OFF FIELD BLANKING*/
for(fieldNo=0; fields[fieldNo] != NULL; fieldNo++)
{
field_opts_off(fields[fieldNo], O_BLANK);
}

/* KEYSTROKE ACQUISITION LOOP */
result = ACCEPT; /* assume user acceptance */
while((ch = getch()) != 1) /* Ctrl+A is accept keystroke */
{
switch(ch)
{
case 9: /* tab key */
case 10: /* enter key */
case KEY_DOWN:
form_driver(myForm, REQ_NEXT_FIELD);
form_driver(myForm, REQ_END_LINE);
form_driver(myForm, REQ_BEG_LINE);
break;
case KEY_BTAB:
case KEY_UP:
form_driver(myForm, REQ_PREV_FIELD);
form_driver(myForm, REQ_END_LINE);
form_driver(myForm, REQ_BEG_LINE);
break;
case KEY_LEFT:
form_driver(myForm, REQ_LEFT_CHAR);
break;
case KEY_RIGHT:
form_driver(myForm, REQ_RIGHT_CHAR);
break;
case 127:
form_driver(myForm, REQ_LEFT_CHAR);
form_driver(myForm, REQ_DEL_CHAR);
break;
case KEY_IC:

mvprintw(22, 10, "INS HIT");

if(insert_mode == INSERT)
{
insert_mode = OVERLAY;
form_driver(myForm, REQ_OVL_MODE);
}
else
{
insert_mode = INSERT;
form_driver(myForm, REQ_INS_MODE);
}
break;
case KEY_DC:
form_driver(myForm, REQ_DEL_CHAR);
break;
case 27: /* use pressed Esc */
case 17: /* use pressed Ctrl+Q */
case 2: /* use pressed Ctrl+B */
result = BAIL;
break;
default:
form_driver(myForm, ch);
break;
}
touchwin(stdscr);
wrefresh(stdscr);
if(result == BAIL) break;
}

/* GUARANTEE LAST ENTRY IS RECORDED */
form_driver(myForm, REQ_END_LINE);

/* UNPOST FORM, FREE MEMORY */
unpost_form(myForm);
free_form(myForm);

/* END CURSES */
endwin();
return(result);
}

int main(int argc, char *argv[])
{
FIELD *fields[4];
int fieldNo;
enum RESULTTYPES result;

/* CREATE FIELD ARRAY */
fields[0] = new_field
(
1, /* height in rows */
10, /* width in characters */
4, /* y position in rows, 0 based */
22, /* x position in columns, 0 based */
0, /* number of offscreen rows */
0 /* number of working buffers */
);
fields[1] = new_field(1, 10, 6, 22, 0, 0);
fields[2] = new_field(1, 16, 8, 22, 0, 0);
fields[3] = NULL;

/* SET FIELD OPTIONS */
set_field_back(fields[0], WA_REVERSE); /* field has reverse video text */
field_opts_off(fields[0], O_AUTOSKIP); /* don't walk off end of field */

set_field_back(fields[1], WA_REVERSE);
field_opts_off(fields[1], O_AUTOSKIP);

set_field_back(fields[2], WA_REVERSE);
field_opts_off(fields[2], O_AUTOSKIP);

/* PRESET FIELD VALUES */
set_field_buffer(fields[2], 0, "programmer");

/* RUN THE FORM */
result = runForm(fields);

/* PRINT THE USER INPUT */
printf("\n\n");
for(fieldNo=0; fields[fieldNo] != NULL; fieldNo++)
{
printf("Field %d is >%s<\n",
fieldNo,
field_buffer(fields[fieldNo], 0));
}
printf("User elected to %s\n", (result == ACCEPT ? "ACCEPT" : "BAIL"));

/* DELETE THE FIELDS */
for(fieldNo=0; fields[fieldNo] != NULL; fieldNo++)
{
free_field(fields[fieldNo]);
}

return(0);
}

Once again, compile it like this:
rm -f ./a.out
gcc -Wall -lncurses -lform form.c
./a.out
The result is as follows:
Output of the practical form

Steve Litt is the author of Samba Unleashed.   Steve can be reached at his email address.

A Maze Runner Program

By Steve Litt
Using Turbo Pascal, I wrote a maze runner program in 1986. The output was pretty easy in the days when all output was a memory write to segment B800. When I tried to rewrite the program as a C/Linux program, everything translated except the video output. What to do?

Curses, saved again! I used ncurses. The following is the maze runner program:

#include <ncurses.h>
#include <stdlib.h> /* malloc */
#include <string.h> /* strleng */
#include <assert.h> /* assert */
#include <unistd.h> /* sleep, usleep */
#include <ctype.h> /* toupper */

#define NUMCOLUMNS 80
#define NUMROWS 25

#define STARTCHAR 'S'
#define FINISHCHAR 'F'
#define DEADENDCHAR '.'
#define BLOCKREPRESENTATIVESTCHAR 'X'

#define YELLOWONBLUE 1
#define WHITEONBLUE 2
#define BLACKONWHITE 3
#define WHITEONBLACK 4
#define GREENONBLACK 5
#define YELLOWONBLACK 6
#define BLUEONBLACK 7
#define CYANONBLACK 8
#define REDONBLACK 9
#define MAGENTAONBLACK 10

typedef char LINE[NUMCOLUMNS+0];
typedef LINE SCREENCLONE[NUMROWS];

typedef struct
{
unsigned y;
unsigned x;
} POINT;

enum DIRECTION {RIGHT,UP,LEFT,DOWN};
#define DIRECTIONCHARS ">^<v"


SCREENCLONE scr; /* GLOBAL COPY OF COMPUTER SCREEN */

struct pgm_info /* INFO NECESSARY TO RUN THE PROGRAM */
{
char *mazefile;
long usecSleep;
bool stopAtEnd;
SCREENCLONE *maze;
POINT startyx;
POINT finishyx;
};

/*************************************************************
* Clear the screen with the current background color and attrib
*************************************************************/
void clrscr(void)
{
int y, x, maxy, maxx;
getmaxyx(stdscr, maxy, maxx);
for(y=0; y < maxy; y++)
for(x=0; x < maxx; x++)
mvaddch(y, x, ' ');
}

/*************************************************************
* Print usage statement
*************************************************************/
void usage(const char * msg)
{
printf("\n\n%s\n", msg);
printf("\n\nUsage is:\n");
printf("\trunmaze mazefile microseconds\n");
printf("\n\tThe microseconds argument is optional, and represents\n");
printf("\tthe delay between maze movements. If this argument is\n");
printf("\tnegative, the program waits for a keystroke at the end\n");
printf("\tbefore terminating. If it's -1 it waits, but there is no\n");
printf("\tintermovement delay.\n\n");
printf("\tBe aware that on Intel machines delays cannot be cut as\n");
printf("\tfine as microseconds, so in fact 1 and 500 produce\n");
printf("\tidentical results.\n\n");

printf("Aborting...\n\n");
exit(1);
}

/*************************************************************
* Abnormally end the program. Doesn't call endwin(), so this
* routine should be called only when ncurses is not running.
*************************************************************/
void abortt(const char * msg)
{
printf("\n\nERROR: %s\n", msg);
printf("ABORTING...");
exit(1);
}

/*************************************************************
* Set all necessary program data based on argc and argv
*************************************************************/
struct pgm_info *get_info(int argc, char *argv[])
{
FILE * fin;
struct pgm_info *pinfo;
int y, x;
char *buf;

/***** RETURN ON GROSS COMMAND LINE SYNTAX ERRORS *****/
if(argc < 2) usage("Must have at least one argument\n");
if(argc > 3) usage("Too many arguments\n");

/***** ALLOCATE RESOURCES *****/
buf = (char *) malloc(1000);
assert(buf);

pinfo = (struct pgm_info *) malloc(sizeof(struct pgm_info));
assert(pinfo);

pinfo->mazefile = (char *)malloc(strlen(argv[1]));
assert(pinfo->mazefile);
strcpy(pinfo->mazefile, argv[1]);

pinfo->maze = (SCREENCLONE *)malloc(sizeof(SCREENCLONE));
assert(pinfo->maze);

/***** SET SPEED ACCORDING TO COMMAND LINE ARGS *****/
pinfo->usecSleep = 0;
pinfo->stopAtEnd = false;
if(argc > 2)
{
pinfo->usecSleep = atol(argv[2]);
if(pinfo->usecSleep < 0)
{
pinfo->stopAtEnd = true;
if(pinfo->usecSleep == -1)
{
pinfo->usecSleep = 0;
}
pinfo->usecSleep *= -1;
}

}

/***** BLANK OUT pinfo->maze *****/
for(y=0; y < NUMROWS; y++)
memset((*pinfo->maze)[y], ' ', NUMCOLUMNS);

/***** READ MAZE FILE INTO pinfo->maze *****/
/***** THE +2 IN READING INTO BUF IS TO ACCOMMODATE THE '\n' AND '\0' ****/
fin = fopen(pinfo->mazefile, "r");
if(!fin)
{
char msg[61];
snprintf(msg, 60,
"Could not open maze file %s for read: aborting!\n\n",
pinfo->mazefile);
usage(msg);
}

y=0;
memset(buf, ' ', NUMCOLUMNS + 2);
while(fgets(buf, NUMCOLUMNS + 2, fin))
{
for(x=0; x < NUMCOLUMNS + 2; x++)
if(buf[x] < ' ')
buf[x] = ' ';
memcpy((*pinfo->maze)[y], buf, NUMCOLUMNS);
memset(buf, ' ', NUMCOLUMNS + 2);
if(++y >= NUMROWS) break;
}
fclose(fin);

/***** FIND LOCATION OF START POINT *****/
for(y=0; y < NUMROWS; y++)
for(x=0; x < NUMCOLUMNS; x++)
{
if(toupper((*pinfo->maze)[y][x]) == STARTCHAR)
{
y+=999;
x+=999;
}
}
if(y < 900) abortt("No starting point found in maze\n");
pinfo->startyx.y = y-1000;
pinfo->startyx.x = x-1000;

/***** FIND LOCATION OF FINISH POINT *****/
for(y=0; y < NUMROWS; y++)
for(x=0; x < NUMCOLUMNS; x++)
{
if(toupper((*pinfo->maze)[y][x]) == FINISHCHAR)
{
y+=999;
x+=999;
}
}
if(y < 900) abortt("No finish point found in maze\n");
pinfo->finishyx.y = y-1000;
pinfo->finishyx.x = x-1000;

return pinfo;
}

/*************************************************************
* A circular incrementer for the DIRECTION enumeration
*************************************************************/
enum DIRECTION nextDirection(enum DIRECTION dir)
{
switch (dir)
{
case RIGHT: return UP;
case UP: return LEFT;
case LEFT: return DOWN;
case DOWN: return RIGHT;
};
return RIGHT; /* kill warning */
}

/*************************************************************
* Return the next point based on current point and the current
* direction. This routine does NOT check for legal positions.
* That must be done by the calling routine.
*************************************************************/
POINT nextPoint(const POINT old, enum DIRECTION dir, POINT *new)
{
new->y = old.y;
new->x = old.x;
switch (dir)
{
case RIGHT: new->x++; break;
case UP: new->y--; break;
case LEFT: new->x--; break;
case DOWN: new->y++; break;
};

return (*new);
}

/*************************************************************
* This routine prints character c at POINT point, both on the
* screen and on the global clone screen scr. It does character
* translation on the screen, but not on the screen clone, such
* that on the screen lower or uppercase X become block characters.
*************************************************************/
void printAtPoint(const POINT point, const char c)
{
scr[point.y][point.x] = c;
if(toupper(c) == BLOCKREPRESENTATIVESTCHAR)
mvaddch(point.y, point.x, ACS_CKBOARD);
else
mvaddch(point.y, point.x, c);
touchwin(stdscr);
wrefresh(stdscr);
}

/*************************************************************
* A version of printAtPoint() not requiring a POINT as an
* argument. Calls printAtPoint().
*************************************************************/
void printAtYX(const unsigned y, const unsigned x, const char c)
{
POINT p;
p.y = y;
p.x = x;
printAtPoint(p, c);
}

/*************************************************************
* Gets the character at a specific POINT position.
*************************************************************/
char getCharAtPoint(const POINT point)
{
return(scr[point.y][point.x]);
}

/*************************************************************
* isOnRoute() is a recursive subroutine that is the "guts" of
* the program. It's called with the following arguments:
* struct pgm_info *pinfo,
* POINT p,
* enum DIRECTION curDir,
* int level
*
* * pinfo is the program's global info
* * p is the POINT location on which this routine operates
* * curDir is the direction the maze runner was going prior
* to this call. This is important because the maze runner
* tries to go straight as long as it can.
* * level is the recursion level. It is important because the
* runner always starts with the S point. Only on that starting,
* this routine should NOT return false if the POINT is occupied
* by a nonblank, because in fact it is occupied by the S. In
* other levels, bumping into a nonblank, even an S, means
* a failure, unless the nonblank is an F, in which case return
* success.
*
* Here's what the routine does. At this POINT, determine success
* or failure in finding a possible route. Going outside the 80x25
* grid is a failure, as is resting on any nonblank except 'F'.
* Resting on an 'F' means success -- you finished the maze so return
* success.
*
* If one of the preceding obvious result triggers doesn't happen, the
* next step is to see if any one of the four adjoining points on the
* grid is a possible success. For each direction you try, change the
* character written to one representing that direction (>^< or V).
* Starting with the point adjacent in the
* curDir direction, call isOnRoute on the adjoining points. The first
* point returning a true (success) ends the search, and this invocation
* returns true. However, if all 4 directions yield false, then this
* invocation returns false. In that case, a dot is printed in the current
* point.
*
* So you can think of it like this: You're on the correct route if:
* 1. You're sitting on the F point, or else
* 2. One of the 4 points adjacent to you is on the correct route.
*
* It's more complicated to explain than to code. Note that this comment
* is longer than the subroutine itself.
*************************************************************/
bool isOnRoute(struct pgm_info *pinfo, POINT p, enum DIRECTION curDir, int level)
{
enum DIRECTION orgDir = curDir;
bool success;
POINT newPoint;

if(pinfo->usecSleep > 0)
usleep(pinfo->usecSleep);

if(p.y < 0) return false;
if(p.x < 0) return false;
if(p.y >= NUMROWS) return false;
if(p.x >= NUMCOLUMNS) return false;
if(toupper(scr[p.y][p.x]) == FINISHCHAR) return true;
if((scr[p.y][p.x] != ' ') && (level != 0)) return false;

success = false;
do
{
if(level != 0)
printAtPoint(p, DIRECTIONCHARS[curDir]);
success = isOnRoute(
pinfo,
nextPoint(p, curDir, &newPoint),
curDir,
level+1
);
if(success) break;
curDir = nextDirection(curDir);
}
while(orgDir != curDir);
if(success)
{
return true;
}
else
{
if(level > 0)
printAtPoint(p, DEADENDCHAR);
return(false);
}
}

/*************************************************************
* main() handles various housekeeping and ncurses stuff in
* preparation for isOnRoute() to find the path.
*************************************************************/
int main(int argc, char *argv[])
{
int y;
struct pgm_info * pinfo;
bool successFlag;
POINT point;

/***** GET PROGRAM INFORMATION *****/
pinfo = get_info(argc, argv);

/***** BLANK GLOBAL SCREEN *****/
for(y=0; y < NUMROWS; y++)
{
memset(scr[y], ' ', NUMCOLUMNS);
scr[y][NUMCOLUMNS] = '\0';
}

/***** INITIALIZE CURSES *****/
initscr();
start_color();
init_pair(YELLOWONBLUE, COLOR_YELLOW, COLOR_BLUE);
init_pair(YELLOWONBLACK, COLOR_YELLOW, COLOR_BLACK);
init_pair(WHITEONBLUE, COLOR_WHITE, COLOR_BLUE);
init_pair(BLACKONWHITE, COLOR_BLACK, COLOR_WHITE);
init_pair(WHITEONBLACK, COLOR_WHITE, COLOR_BLACK);
init_pair(GREENONBLACK, COLOR_GREEN, COLOR_BLACK);
init_pair(BLUEONBLACK, COLOR_BLUE, COLOR_BLACK);
init_pair(REDONBLACK, COLOR_RED, COLOR_BLACK);
init_pair(MAGENTAONBLACK, COLOR_MAGENTA, COLOR_BLACK);
init_pair(CYANONBLACK, COLOR_CYAN, COLOR_BLACK);

/***** COPY MAZE TO SCREEN *****/
attrset(COLOR_PAIR(WHITEONBLACK));
clrscr();
for(point.y=0; point.y < NUMROWS; point.y++)
for(point.x=0; point.x < NUMCOLUMNS; point.x++)
{
printAtPoint(point, (*pinfo->maze)[point.y][point.x]);
}

/***** EMPHASIZE START AND FINISH *****/
attrset(COLOR_PAIR(WHITEONBLACK) | WA_BOLD);
printAtPoint(pinfo->startyx, STARTCHAR);
printAtPoint(pinfo->finishyx, FINISHCHAR);
touchwin(stdscr);
wrefresh(stdscr);

/***** RUN THE MAZE *****/
attrset(COLOR_PAIR(CYANONBLACK) | WA_BOLD);
sleep(2);
successFlag = isOnRoute(pinfo, pinfo->startyx, RIGHT, 0);
touchwin(stdscr);
wrefresh(stdscr);

/***** SHOW THE ROUTE *****/
sleep(2);
attrset(COLOR_PAIR(YELLOWONBLACK) | WA_BOLD);
for(point.y=0; point.y < NUMROWS; point.y++)
{
for(point.x=0; point.x < NUMCOLUMNS; point.x++)
{
char scrc = scr[point.y][point.x];
if (scrc == DEADENDCHAR)
printAtPoint(point, ' ');
else if(strchr(DIRECTIONCHARS, scrc))
printAtPoint(point, scrc);
}
}

if(!successFlag)
{
mvaddstr(0, 0, " THIS MAZE HAS NO SOLUTION ");
}
if(pinfo->stopAtEnd)
getch();
else
sleep(3);
endwin();
return 0;
}

To run this program, arg1 must be a "maze file", which describes an 80x25 maze. Here are several examples.

You run it something like this:
./maze valerie.maz -1
Running the preceding command produces the following:

valerie.maz run through the maze runner

Items of Note

There are some items of note:

Refreshing after each character write

Normally, curses programs are refreshed only after a whole screen is painted. But in this program, the whole point is to show each move of the maze runner. So the printAtPoint() routine concludes with a touchwin() and a wrefresh() call. This makes the program much slower than it otherwise would be.

Colors

This app is much more pleasing because it uses colors to differentiate various elements of the screen. Color changes are accommodated by writing or rewriting characters in different colors. Note that the rewrites could have been accomplished with the mvchgat() routine, but I chose to simply rewrite colors after changing the attributes with attrset().

Directions

enum DIRECTION defines the four directions: RIGHT, UP, LEFT and DOWN. Char array DIRECTIONCHARS maps arrow characters to their respective directions. The nextDirection() routine iterates through the directions.

Recursion

The actual maze running is performed by recursion. It's actually called a "backtracking algorithm" because upon failure it backtracks and tries again. It's simple if you understand the basic principles.

There are exactly two ways that a particular point in the 80x25 grid can be on the route from the start point to the finish point:
  1. The point is the finish point itself
  2. The point touches a point on the route from start to finish
The preceding is implemented by the isOnRoute() function. Like all recursive functions, it has three parts:
  1. The part before it calls itself
  2. The statement in which it calls itself
  3. The part after it calls itself
The part before itself is called when you're "going forward". The part after the self-call is executed after "coming back" from the instance of the subroutine which is called.

Whether or not the point is the finish point itself is determined in the part before self-call -- the "going forward" part. Whether or not you're touching a point on the route from start to finish is determined  after the self-call -- the "coming back" part. This makes sense because you won't know any point is on the route until you've made it all the way to the finish line.
Steve Litt is the author of the Universal Troubleshooting Process Courseware.   Steve can be reached at his email address.

Life After Windows: Data Driven Design

Life After Windows is a regular Linux Productivity Magazine column, by Steve Litt, bringing you observations and tips subsequent to Troubleshooters.Com's Windows to Linux conversion.
By Steve Litt
I've always been a good programmer. Even an excellent programmer. A sought after programmer. A quick programmer whose code took a licking and kept on ticking. A really good programmer.

But never a GREAT programmer.

The true shame is, I was one sentence away from being a great programmer. If someone had told me the single sentence I will tell you in this article, it wouldn't have taken me 20 years to achieve greatness.

It's not that I wasn't told. Mentor after mentor told me. But they didn't use the right words. Perhaps they themselves didn't understand that nugget that made them great, so they couldn't construct s a sentence containing that nugget and that nugget alone.

So let me phrase it in the way that would have given me a 20 year jumpstart, and just might give you a 20 year jumpstart:

It's the Data, Stupid!

If only my mentors had told me in just those words, instead of telling me by example, or in a roundabout manner. But then again, maybe they didn't understand which card in their hand truly held the power. Let me take you through a 20 year journey...

Part I: The Journey

The Clueless Years

In the early 1980's I was a programming student at Santa Monica Community College. For programming, this resembled a trade school rather than a university. They taught us to hit the marketplace quick and secure jobs. So they taught us the functional decomposition code design methodology until we ate, slept and breathed it. They taught us light systems analysis.

It worked. We rushed out into the local marketplace and snapped up jobs. All too often, the UCLA grads could write you a compiler, but it was the Santa Monica grads who could quickly create a working office automation app. We got hired and we got paid well. Not for greatness, but for quickly completing the desired business programming task.

Santa Monica College had a Pascal teacher named Jerry Hull who was a true computer scientist. During one class he told us there was a tradeoff between data complexity and code complexity. But I missed the part where he told us the huge advantages of data complexity. I decided I preferred code complexity, because I was a true ninja when it came to functional decomposition. The first opportunity missed...

A couple years later I was one of several junior programmers in a medical software house. We pounded out apps using tools created by the head programmer, Mike Welch. When  a fast rewrite of the entire system was needed, Mike created  a DBMS accompanied by a BASIC  interpreter with commands to fully interface to a DBMS. It was called RBASIC. In the end RBASIC lacked the speed and stability to form the backbone of our app, but consider for a second that in the space of maybe 9 months this one person created a tool for the PDP11 with the ease and power of DBASEIII. Mike Welch was a GREAT programmer.

One day I read through the C code of Mike's database. It was a massive tree of struct pointers that I found almost unfathomable. I asked Mike why he coded that way, and he explained, but not in a way I could understand. I was great at functional decomposition and whomping out procedural code, so I decided not to code such huge data structures. Another opportunity lost...

At the medical software house I did take one step in the right direction, creating a tool to print at any point on a monospace printer. This was in the low memory days, so the tool was NOT simply a 66x80 array -- it actually counted lines and spaces. So if you were on line 10 and needed the next print to be column 4 of line 12, you'd simply issue the following command:
atyxprint(myReport, 13, 4, stringToBePrinted);
The myReport variable was a pointer to a struct containing the current line and column coordinates, which are updated by atyxprint(). It also contained the file to which the report was printed, the filename of that file, the current page number, the page length (lines per sheet), and a flag indicating whether this structure was valid (initialized by openrpt()).

The results were outstanding. I could whomp out a complexly formatted report in a couple hours when it took everyone else days. I called my myReport structure a "context variable", and because this tool was so successful, I vowed to write other software using context variables. I was so, so close to the truth. I used many context variables from that point forward, and they helped immensely, but context variables weren't what I thought of first. I thought of functional decomposition. I could do it quicker than almost anyone else. That was why they paid me the big bucks...

Skip ahead another 4 years and I was a senior programmer at a huge law firm. I wrote 90% of the code for their timesheet program. I also wrote tools for myself and other programmers. I used context variables and other structures for many of these tools. Trouble was, I coded them quickly. Why use malloc() when it's so much easier to declare an array whose size is a magic number? I was smart enough to #define that magic number. So what if my tools had some limitations? Ugh!

In my first decade of programming, my initial question was always "how can I break this down into tasks". Data structure was postponed until needed to make the functional decomposition work. It's hard to get the right results when you ask the wrong question...

OOP

A decade after Santa Monica College, OOP came alive.  I bought Turbo C++. I viewed Phillippe Kahn's OOP video endlessly. I even rewrote a substantial portion of the law firm's timesheet processing software in C++. It worked flawlessly, but its complexity was oppressive.

From 1992 to 1996 I was like most other programmers. I tried to use OOP. I knew "OOP is the future". But I just didn't get it. I dabbled in OOP when I could, and forsook it when deadlines were tight. I viewed OOP as a more cohesive way to do my "context variable" trick, which obviously was a tiny portion of what OOP evangelists were saying about it. If only I'd known how close I was to the truth.

One day in 1996 I fell asleep reading an OOP design book I just couldn't grasp. Sleep overcame me. When I woke up, I understood! Just like that, I knew the process to design OOP apps. Simply view the program as a machine, and the objects as subsystems, boards, or components. Two weeks later I became the OOP coach of a Powerbuilder team.

Using this realization, I whipped out quite a bit of great OOP code, including a very simple and elegant replacement for the oppressively complex timesheet processing component written a few years before.

The IT Director at the law firm was a hardware guy turned executive named Ken Heaps. He was one of the best managers I've ever encountered. One day the firm's head managment guy met with Ken and me to ask whether a desired timesheet program change was doable. I said I didn't know. Ken Heaps then asked me whether the current timesheet front end was capturing the data involved in the change. I said it was. Ken turned to the head guy and said "he can do it!". I was about to protest when I realized that Ken was right -- if I had the data, one way or another I could get it to the back end in a useful format.

With 15 years programming experience, I'd just been taught a programming lesson by an ex hardware guy in a suit. When confronted with the question of doability, my first question became "can we capture all the necessary data?".

That realization was earthshaking, but it was page 8 small print next to the truth that still eluded me. Why, oh why didn't Ken throw a glass of water in my face and tell me "It's the Data, Stupid!". I travelled on, coding and evangelizing OOP.

OOP had come, and all was well with the world. Or so I thought...

OOP Disallusionment

I spent the remainder of the 1990's writing OOP code. No matter what the task, no matter what the underlying problem domain, my answer was OOP. 100% OOP. C++, Perl or Turbo Pascal, I wrote OOP. I'd view the program as a machine, split it into subsystems, and those subsystems into components. The main program would consist of one line:
my $controller = Controller->new();
Yep, even the main logic would be an object called $controller or controller. The programs got written. Everyone ooh'ed and ah'ed. But there was a problem.

Invariably, that super-understandable machine/subsystem/component design became an albatross in real life. Look at the original EMDL program as an example. The original design diagram fit cleanly on a single piece of 8.5 x 11 paper. The final product, written exactly to that clean design, was unfathomable. It was the worst code I've ever written. The original UMENU wasn't much better.

I began having OOP envy again. If only I really understood OOP like the master programmers did, I could make beautiful programs. Little did I know that OOP was strictly a bit player. What I didn't understand was the relevance and importance of data structure.

My entire career had been spent in small shops. With a couple exceptions, I'd never had a chance to read truly great code. That would soon change...

The First Light of Dawn

In September 1999 I began writing the book "Samba Unleashed". To fully understand Samba, I downloaded, compiled and installed the source. Then spent a few days studying and modifying the source, and noticed something common among Samba's many source files.

Most source included a header file containing a very large and detailed data structure, together with subroutines to manipulate that data structure. That data structure was a context variable on steroids. Funny thing was, to anybody who had ever run the testparm program or written a detailed smb.conf, that data structure was very readable. All of Samba's configuration, and most of what makes Samba Samba, was contained in that structure -- either directly or within a has-a relationship. It was C, not C++, but they had achieved a lot of the encapsulation we commonly associate with OOP.

I had discovered that with data structures and static variables, one could simulate OOP with C. There was also a tiny voice in my subconscious. A voice my conscious mind couldn't hear, but a voice that would make me receptive to other future voices. That voice whispered:

It's the data, stupid.

The Search

By 2002 the OOP well was running dry. My 1999 all OOP all the time UMENU was fiercely unmaintainable. Yet UMENU was downright readable compared to my 2002 monument to OOP, EMDL. The original EMDL parser was absolutely unmaintainable, even by me. On paid gigs I'd gone back to my coding practices of the mid 1990's -- functional decomposition where I could, and objects for distinct tools. That worked out much better when on the clock.

I was in trouble. My entire programming career was about code design methodology. My expertise at 20 year old functional decomposition was irrelevant. Object Oriented Design wasn't doing the job except for those rare projects involving simulation of tangible objects.

I looked for a new design methodology. Discussed it endlessly. Looked for tools.

The first signpost on my road to that design methodology happened in February 2001 during the writing of the XML themed "Troubleshooting Professional Magazine". The Xerces XML to DOM parser yielded a DOM node tree that was one of the most beautiful, useful, productive and understandable tools I'd ever seen.

By 2002 I'd created VimOutliner and repeatedly needed to parse tab indented outlines. I needed something like DOM, but DOM was just too bulky. I make a lightweight knockoff of DOM called Node.pm. Node.pm centered around an object called a Node, which, just like the DOM spec it emulated, contained:
Naturally, Node.pm contained get and set routines for all the above. I used Node.pm to create several nice outline extraction routines. Unfortunately, I didn't fully understand the power of this tool, and therefore didn't give it the design and programming time it required. Node.pm changed regularly, breaking programs based on previous versions. Another missed opportunity.

My whole programming life I've searched for a true Rapid Application Development environment. The only one I ever found was Clarion 2.1. All the rest pailed in comparison. During the go-go late 1990's you didn't need RAD -- there was plenty of money available to attack any job with massive programming talent. Besides, I could whomp out apps lightning quick.

After they began emailing American jobs to India, "lightning quick" took on a whole new meaning. It no longer meant 2 weeks -- now it meant a day. I looked for Open Source RAD environments, and found nothing suitable. I contemplated building my own RAD tools. The February and March 2003 Linux Productivity Magazine issues on Perl-TK were nothing more than a search for RAD, but the result was insufficient.

My exploration of Curses during the last two months was another example of seeking a RAD tool that I didn't find. Or maybe I did, but not in the way I though. More on that later...

In August 2003, I rewrote the obscenely convoluted EMDL parser program using Node.pm. In doing so, I got to debug and enhance Node.pm. Node.pm now had not only the Node object, but also an EMDL to Node Parser object, and a Node tree walker object. Node.pm was a completely reusable tool, whose use made complex operations into rather short, understandable programs.

In December 2003 I rewrote UMENU using Node.pm. Once again, the new program was much better than the old, OOP-only version. Now questioning why Node.pm seemed to turn everything into gold,  I touched on the realization that Node.pm is a data form exactly mimicking the hierarchies occuring in outlines and menu systems. Perhaps the secret to great development is using a data format that best represents the applications needs.

My RAD explorations led to a October 2003 thread on the VimOutliner mailing list about methods of rapid application development, which led to discussion of \rdb, which led to discussions of apps cobbled together with tiny commands via pipes. I put together a couple such apps, but the results were less than expected. bash is not a particularly efficient program, and data piping is efficient only if you can truly get a record, process it, and send it on. Any keeping of multiple records slows piping immensely.

Luckily, that October 2003 VimOutliner thread led to a subthread in which I posted the following to the VimOutliner mailing list:

Hi all,

Does any of you have any tips for moving computer program logic into computer
program data?

Yes, this really is on-topic. I'm trying to make a new design methodology
based on VimOutliner, and I suspect that the secret is in putting as much as
possible into data to relieve the need for complex logic.

SteveT

The stage was set. For 20 long years the words of my mentors had deflected off my functional decomposition comfort zone. Now I was taking a long, hard look at data.

The Discovery

In 2004 I wrote several programs using Node.pm. Every one turned out beautifully. In May 2004 I released Node.pm as an Open Source tool. In documenting it, I began to understand what a truly powerful tool it was. Its use extended to anything remotely connected to a hierarchy, or anything that could be connected to a hierarchy. But the farther I got from hierarchical use, the less useful Node.pm became. Jerry Hull had told me of the tradeoff between data complexity and code complexity. I was now beginning to suspect that data complexity was much more desireable.

Next month's Linux Productivity Magazine, which I've already completed, includes a binary tree using Node.pm. I got it to work, but only by simulating left and right child pointers with Node attributes. The algorithm would have been easier had I used a more representative piece of data -- one with only left and right pointers, with no siblings.

There was no doubt. The effectiveness of a design was proportional to the congruence of its data structure to the problem at hand.

Somewhere around this time, I read parts of "The Art of Unix Programming" by Eric Raymond. In chapter 1, "Philosophy", Raymond lists 17 rules of the Unix programming philosophy. Rule 9 is called the "Rule of Representation", and is stated like this:

Fold knowledge into data so program logic can be stupid and robust.

Exactly what Jerry Hull told our class so many years ago, except this time, making it clear that complex data was superior to complex code. I understood, I believed, and I followed through.

For this magazine's Data Driven Picklist article, I understood the need to have ALL configuration data for the picklist contained in a data structure. So I used the outline processor called VimOutliner to record all necessary data in an outline, catagorizing it with VimOutliner. I kept refining it as I went. Once I was satisfied I had all major data, I used Vim to create C data structures from that outline. I then built all the subroutines around it.

It wasn't easy. The resulting program is almost 1300 lines long, and doesn't even use all the data. It's enough code that I had to use Vim's marker foldmethod to organize and view the source code. One might be driven to question my sanity, because this almost 1300 line program does the same thing as the 230 line simple picklist shown earlier in this magazine.

The distinction is simple. Most of the code for the data driven version is generic to any picklist -- Curses, Perl/TK, or any other technology. Any data source -- flat file, outline, XML or SQL. It's a tool you can use over and over again. Better still, if done properly, the picklist's entire configuration could be defined in a file outside the source code, with no compilation necessary. Do you think that might speed development?

It's a 1300 line program, but if you look at the 100 lines comprising the data structures, the #defines and the enums, you can read this program like a book, understanding it completely. To improve it, you'd simply add or change the data structure, and then add/modify the routines that work on/with that data structure.

20 years after starting Santa Monica College I finally understand.

It's the data, stupid!

Part II: The Observations

I really came to understand this during my protracted search for a RAD tool, and my considerations for how to write one myself if nothing suitable existed. I realized that the only way to get all the needed features and flexibility, and also the only way to make the tool lightning quick and trivially easy to use would be to move the complexity from code to data.

In hindsight it all seems so obvious. What else would you look at besides data. Data is one of the very early products produced in systems analysis.

Einstein said that formulating the question is more important than the ultimate answer. The first generation of programmers were taught to ask the question "how should the logic flow". The spaghetti code results mirrored the appropriateness of that question.

My 1983 Santa Monica College professors told us to ask the question "how can I divide this into tasks?". The resulting code was written quickly, and was quite robust, but tended toward hardening of the arteries, lacking reuse, and after maybe 25,000 lines it collapsed on itself.

As far as I know, the OOP era guys weren't even taught to answer a question. They were pretty much told to "use objects". Maybe they were told to look for "is a" and "has a" relationships. I think the code of the last 10 years reflects the lack of a good, solid question.

Throughout all these programming fads (scuse me, paradigms), an elite few knew the truth:

It's the data, stupid!

My professor, Jerry Hull, knew it. So did my first mentor, Mike Welch. So did the hardware guy turned executive, Ken Heaps. And now I know it. And you know it. Match the program's data to the task at hand, and your life will be an easy one.

Look deeper and it starts looking better. In the Data Driven Picklist article, notice that the PICKINFO structure is about the concept of a picklist -- with very little dependency on Curses or any operating system features. This means your picklist is easily ported to QT, GTK, HTML or even Windows. By separating data into problem domain vs. OS/platform dependencies, our programs become much more portable.

Does all of this mean that functional decomposition is worthless? Obviously not. For 20 years I've made a living on a reputation for quickly producing robust code that solved the problem at hand. I used functional decomposition to do that. But now I know that data is the foundation, with functional decomposition serving as the studs in the wall. Functional decomposition is an excellent way to create code to support the underlying data structure.

Does this mean that OOP is worthless? Not in a million years. Think of how nicely OOP represents the underlying data. Had I written the data driven picklist in C++ or Perl, I'd have had classes called PICKINFO, WINDOWS, SUBROUTINES, AND KEYSTROKES, ROWSINFO and its contained class, ROWINFO. All those support routines I wrote would have been grouped directly with the data as methods. Once again, data is the foundation, and OOP is a great material with which to build on that foundation.

Compare code designed with data driven design compared to a code centered design. Code centered design invariably has more if statements, and its if statements are much longer. Those if statements are how logic is done in code. Data Driven Design simply assigns a variable to the correct data element, and the program runs just right.

Pure Data Driven Design is a lot like pure OOP -- it's overkill for small projects. You can see that my data driven picklist is a hefty 1200+ lines -- more than a 1 or 2 day project. For smaller projects, make the program less data driven and more hardcoded, but doesn't remove the need for data driven design. This will be discussed more later in this article...

Part III: The Methodology

The fact that this article skips over systems analysis does not mean it's unnecessary. Certainly, before committing time to anything but the most throwaway software project, a work and paperflow analysis should be done, users, stakeholders and top management should be consulted, feasibility and cost estimates must be performed. The system's analysis gives you a pretty good idea as to the data that needs to be collected and the data that needs to be output. That's an excellent start.

Unfortunately, many programmers think their data analysis is complete once they lay out data tables, indexes and foreign keys. They fail to address the fact that the process itself requires its own kind of data.

How will you address in-transit, in-memory data? What are the business differences between exempt and non-exempt employees? Could there be a third employee type in the future? How many other employee spectra exist or might later exist?

To what extent and frequency will the program require modification? What aspects of the program will require frequent modification? How can the various program flavors be represented in data, so that they can be configured with an editor or with a user operated front end?

If the program being written is intended for widespread use amongst disparate organizations, which fields will be structurally disparate? Employee number might be 5 digits in MomAndPop Groceries, and 8 to 14 alphanumeric characters in BigOldCorporation Inc. These differences can, and probably should, be expressed in data rather than code, so that a single body of Form code can handle any employee number field format.

If the employee number format is expressed in data, when BigOldCorporation Inc. switches to a 16 digit number in the future, a simple data change is all that's required. Yes, this is difficult to design, but it saves you from having to change the whole code base to accommodate business quirks of that new customer you "just have to get".

Example:

To accommodate differing field formats, you might have a field format table consisting of several items, each with a format name and a regular expression describing that field format. The name would be the unique key. Then each input field in the application would have, as one of its data members, the name of the format to be used. The actual code would simply compare the data in with the regex, on a character by character basis where possible, or on a field by field basis where necessary.

If SQL lookup of the field format proved too slow, the code could do the lookup when the form is opened, with the regexes stored as elements of the fields.

The point is, the format of each field is defined in data, not in code.


Maybe you are coding a tool. Tools must be extremely maleable. Good tools have no magic numbers. Magic numbers not only restrict usability, but also necessitate costly error checking that shouldn't be an issue. malloc() is your friend. Or Myclass->new().

Ask yourself what data you would need in order to describe the appearance, performance, form and operation of the tool. Also ask yourself what kind of data the tool handles, and what the tool needs to do with that data. Will the data always be able to fit into memory? Is the data hierarchical? Is it relational? Is it flat? Will it need to be sorted? Frequently? Will it be read only, read usually, or read write?

Armed with the answers to these questions, use a good outline processor (VimOutliner comes to mind) to lay out pseudocode for the data. Take time with this. It's every bit as important as functional decomposition, or outlining a prospective book. The more time you spend outlining the expected data needs, the less time you'll spend coding, debugging and re-coding.

Once you've laid out the data, you're half way home. You now have a very readable, very adaptable and very solid foundation for your app. Now it's time for real-world considerations.

Start with this. Does your language have garbage collection? Garbage collection isn't that important with straight procedural code. But when you store vast quantities of nested data without any magic numbers, the necessity of malloc()  and free() becomes error prone drudgery. Your best bet is, for each type of data structure, to write a routine to allocate that structure, another to free that structure, one to test for memory leak by repeatedly calling that allocate and free routines, and another to display the contents of the data structure. Nesting data structures should free nested data structures only by calling the nested structures free routines. Obviously, the requirements discussed in this paragraph become easier to construct if you use objects.

If your language lacks garbage collection, expect a significant and ongoing debugging task. Check early and often for segfaults and memory leaks. Code defensively. All pointers should be initialized to NULL, and then malloc'ed. When they're free'd, they should be set to NULL. Following this custom minimizes the chance of double-nulling a pointer, and gives you more reliable information about a pointer. This information can be checked with assert() commands.

If your language has garbage collection, you might be able to get away with less constructors, destructors and testers. It might (or might not) be reasonable to destruct three levels deep with a single subroutine. This would NOT be practical without garbage collection.

Data driven programs require more code than their hardwired brethren. That's the price you pay for versitility. The trouble is, such large programs are difficult to handle within an editor. Ideally, you want to group subroutines operating on the same data, or in OOP terms, group methods for the same class.

The Vim editor has a nifty mechanism to do just that: The marker foldmethod:

struct ROWINFO						/*{{{*/
{
char *displayString;
char *sortString;
bool sortPointsToDisplay;
char *uid;
}; /*}}}*/

Notice the /*{{{*/ and /*}}}*/ markers? In Vim, {{{ means "start fold here", while }}} marks the end of that fold. These can and should be nested. In fact, the top level of my data driven picklist looks like this:

Top level of pi.c

You can copy and paste pi.c and place in Vim. Then issue the following Vim command:
:set foldmethod=marker
You'll then see the screen pretty much as shown above. To "drill down", issue the zo command while the cursor rests on a header line. To close an opened fold, use zc while the cursor is within the text covered by that fold.. To close all folds on all levels, issue the following command:
:set foldlevel=0
Because I perform that action so often, I mapped the ,,1 (that's the numeral 1, not a lowercase L) key sequence to that command. Those knowledgeable about VimOutliner recognize that as identical to VimOutliner's command to perform the same action.

By organizing your code as described, and by using set foldmethod=marker and placing markers within your code, you'll soon be able to navigate instantly. Better yet, you'll usually have a .h file containing includes, #defines, enums, structs, and function prototypes. The .c file will contain the functions. You can also use Vim's ctags interface to ease your development and maintenance, but that's beyond the scope of this article.

As mentioned earlier, pure data driven design is overkill for small projects. You can't write a 2 day app if you spend a week writing allocation and destruction subroutines.

On quick projects, I'd recommend spending the first 2 hours of code design laying out necessary data with an outline processor. Remember, I said code design. I'm assuming you've already analyzed the problem domain, work and paper flow, etc. Why 2 hours? I've heard of few projects where saving an hour or two would make a material difference, so spend the full 2 hours so you know the full extent of needed data.

Once you know all the necessary data, NOW is the time to decide what to hardcode. Whatever's not likely to change over the near future, or be needed on future projects, should be flattened and hardcoded. But flattened I mean put it in, or near, the root of the data structure hierarchy. Don't spend a lot of time with allocation, freeing, instantiation or destruction. Just have the data element, and if possible, initialize it right within the structure or class. If that's not possible, instantiate all this hard coded data in a single subroutine. Then write your code to use the hardcoded data.

This gives you the hooks to an eventually pure data driven design, but without the voluminous code required by neverending levels of allocations and destructions.

Summary

In days gone by, a program's logic was contained in its code. Such programs were easy to write, but were difficult to modify, and almost impossible for a user to modify and configure. OOP sounded promising, but in fact was used incorrectly more often than not, and OOP for OOP's sake missed the point. The point is,

It's the data, stupid!

By placing the program's behavior and configuration in data, the program's versatility is maximized, with user modification of the program's behavior becoming a reality. Data driven design begins with a thorough investigation of needed data, and the best structure for that data to take. In doing so, an outline processor is your friend.

Once the data is decided upon, supporting routines are written for that data structure, and then the data and supporting routines are operated upon by the app. The result is an incredibly maintainable and configurable application or tool.

For quick apps, you'll want to de-datafy your design by flattening and hardcoding some of the less likely to be changed data. The result is faster development, while still retaining at least the hooks for data driven design. That way, adding the removed features will require an addition, not a rewrite.

Data Driven Design: Use it!
Steve Litt is the founder and acting president of Greater Orlando Linux User Group (GoLUG).   Steve can be reached at his email address.

Letters to the Editor

All letters become the property of the publisher (Steve Litt), and may be edited for clarity or brevity. We especially welcome additions, clarifications, corrections or flames from vendors whose products have been reviewed in this magazine. We reserve the right to not publish letters we deem in bad taste (bad language, obscenity, hate, lewd, violence, etc.).


Submit letters to the editor to Steve Litt's email address, and be sure the subject reads "Letter to the Editor". We regret that we cannot return your letter, so please make a copy of it for future reference.

How to Submit an Article

We anticipate two to five articles per issue, with issues coming out monthly. We look for articles that pertain to the GNU/Linux or open source. This can be done as an essay, with humor, with a case study, or some other literary device. A Troubleshooting poem would be nice. Submissions may mention a specific product, but must be useful without the purchase of that product. Content must greatly overpower advertising. Submissions should be between 250 and 2000 words long.

Any article submitted to Linux Productivity Magazine must be licensed with the Open Publication License, which you can view at http://opencontent.org/openpub/. At your option you may elect the option to prohibit substantive modifications. However, in order to publish your article in Linux Productivity Magazine, you must decline the option to prohibit commercial use, because Linux Productivity Magazine is a commercial publication.

Obviously, you must be the copyright holder and must be legally able to so license the article. We do not currently pay for articles.

Troubleshooters.Com reserves the right to edit any submission for clarity or brevity, within the scope of the Open Publication License. If you elect to prohibit substantive modifications, we may elect to place editors notes outside of your material, or reject the submission, or send it back for modification. Any published article will include a two sentence description of the author, a hypertext link to his or her email, and a phone number if desired. Upon request, we will include a hypertext link, at the end of the magazine issue, to the author's website, providing that website meets the Troubleshooters.Com criteria for links and that the author's website first links to Troubleshooters.Com. Authors: please understand we can't place hyperlinks inside articles. If we did, only the first article would be read, and we can't place every article first.

Submissions should be emailed to Steve Litt's email address, with subject line Article Submission. The first paragraph of your message should read as follows (unless other arrangements are previously made in writing):

Copyright (c) 2003 by <your name>. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, version  Draft v1.0, 8 June 1999 (Available at http://www.troubleshooters.com/openpub04.txt/ (wordwrapped for readability at http://www.troubleshooters.com/openpub04_wrapped.txt). The latest version is presently available at  http://www.opencontent.org/openpub/).

Open Publication License Option A [ is | is not] elected, so this document [may | may not] be modified. Option B is not elected, so this material may be published for commercial purposes.

After that paragraph, write the title, text of the article, and a two sentence description of the author.

Why not Draft v1.0, 8 June 1999 OR LATER

The Open Publication License recommends using the word "or later" to describe the version of the license. That is unacceptable for Troubleshooting Professional Magazine because we do not know the provisions of that newer version, so it makes no sense to commit to it. We all hope later versions will be better, but there's always a chance that leadership will change. We cannot take the chance that the disclaimer of warranty will be dropped in a later version. 

Trademarks

All trademarks are the property of their respective owners. Troubleshooters.Com(R) is a registered trademark of Steve Litt.

URLs Mentioned in this Issue

_