"GNU/Linux" is probably the most accurate moniker one can give to
this
operating system. Please be aware that in all of Troubleshooters.Com,
when I say "Linux" I really mean "GNU/Linux". I completely believe that
without
the GNU project, without the GNU Manifesto and the GNU/GPL license it
spawned,
the operating system the press calls "Linux" never would have happened.
I'm part of the press and there are times when it's easier to say
"Linux"
than explain to certain audiences that "GNU/Linux" is the same as what
the
press calls "Linux". So I abbreviate. Additionally, I abbreviate in the
same
way one might abbreviate the name of a multi-partner law firm. But make
no
mistake about it. In any article in Troubleshooting Professional
Magazine,
in the whole of Troubleshooters.Com, and even in the technical books I
write,
when I say "Linux", I mean "GNU/Linux".
There are those who think the FSF is making too big a deal of this. Nothing could be further from the truth. Richard Stallman's GNU Manifesto, and the GNU General Public License it produced, are the only reason we can enjoy this wonderful alternative to proprietary operating systems, and the only reason proprietary operating systems aren't even more flaky than they are now.
For practical purposes, the license requirements of "free software" and "open source" are almost identical. Generally speaking, a license that complies with one complies with the other. The difference between the two is a difference in philosophy. The "free software" crowd believes the most important aspect is freedom. The "open source" crowd believes the most important aspect is the practical marketplace advantage that freedom produces.
I think they're both right. I wouldn't use the software without the freedom guaranteeing me the right to improve the software, and the guarantee that my improvements will not later be withheld from me. Freedom is essential. And so are the practical benefits. Because tens of thousands of programmers feel the way I do, huge amounts of free software/open source are available, and its quality exceeds that of most proprietary software.
In summary, I use the terms "Linux" and "GNU/Linux" interchangeably, with the former being an abbreviation for the latter. I usually use the terms "free software" and "open source" interchangeably, as from a licensing perspective they're very similar. Occasionally I'll prefer one or the other, depending on whether I'm writing about freedom or business advantage.
It's the Data, Stupid!
If only my mentors had told me in just those words, instead of telling
me by example, or in a roundabout manner. But then again, maybe they
didn't understand which card in their hand truly held the power. Let me
take you through a 20 year journey...
Part I: The Journey
The Clueless Years
In the early 1980's I was a programming student at Santa Monica Community College. For programming, it resembled a trade school more than a university: they taught us to hit the marketplace quickly and secure jobs. So they taught us the functional decomposition code design methodology until we ate, slept and breathed it. They taught us light systems analysis.
It worked. We rushed out into the local marketplace and snapped up
jobs. All too often, the UCLA grads could write you a compiler, but it
was the Santa Monica grads who could quickly create a working office
automation app. We got hired and we got paid well. Not for
greatness, but for quickly completing the desired business programming
task.
Santa Monica College had a Pascal teacher named Jerry Hull who was a
true computer scientist. During one class he told us there was a
tradeoff between data complexity and code complexity. But I missed the
part where he told us the huge advantages of data complexity. I decided
I preferred code complexity, because I was a true ninja when it came to
functional decomposition. The first opportunity missed...
A couple years later I was one of several junior programmers in a
medical software house. We pounded out apps using tools created by the
head programmer, Mike Welch. When a fast rewrite of the entire
system was needed, Mike created a DBMS accompanied by a BASIC interpreter with commands to fully interface with that DBMS. It was called RBASIC. In the end RBASIC lacked the speed and stability to form the backbone of our app, but consider for a second that in the space of maybe 9 months this one person created a tool for the PDP-11 with the ease and power of dBASE III. Mike Welch was a GREAT programmer.
One day I read through the C code of Mike's database. It was a massive
tree of struct pointers that I found almost unfathomable. I asked Mike
why he coded that way, and he explained, but not in a way I could
understand. I was great at functional decomposition and whomping out
procedural code, so I decided not to code such huge data structures.
Another opportunity lost...
At the medical software house I did take one step in the right direction, creating a tool to print at any point on a monospace printer. This was in the low memory days, so the tool was NOT simply a 66x80 array -- it actually counted lines and spaces. So if you were on line 10 and needed the next print to be column 4 of line 12, you'd simply issue the following command:

atyxprint(myReport, 12, 4, stringToBePrinted);
The myReport variable was a pointer to a struct containing the current line and column coordinates, which are updated by atyxprint(). It also contained the file to which the report was printed, the filename of that file, the current page number, the page length (lines per sheet), and a flag indicating whether this structure was valid (initialized by openrpt()).
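To picture it, here's a rough sketch in C of what that context variable might have looked like. The names are my reconstruction, not the original source:

#include <stdio.h>

typedef struct {
    FILE *fp;              /* file to which the report is printed */
    char filename[256];    /* name of that file */
    int line;              /* current line, updated by atyxprint() */
    int column;            /* current column, updated by atyxprint() */
    int page;              /* current page number */
    int pageLength;        /* lines per sheet */
    int valid;             /* set by openrpt(), checked by the print calls */
} REPORT;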
The results were outstanding. I could whomp out a complexly formatted
report in a couple hours when it took everyone else days. I called my
myReport structure a "context
variable", and because this tool was so successful, I vowed to write
other software using context variables. I was so, so close to the
truth. I used many context variables from that point forward, and they
helped immensely, but context variables weren't what I thought of
first. I thought of functional decomposition. I could do it quicker
than almost anyone else. That was why they paid me the big bucks...
Skip ahead another 4 years and I was a senior programmer at a huge law
firm. I wrote 90% of the code for their timesheet program. I also wrote
tools for myself and other programmers. I used context variables and
other structures for many of these tools. Trouble was, I coded them quickly. Why use malloc() when it's so much easier to declare an array whose size is a magic number? I was smart enough to #define that magic number. So what if my tools had some limitations? Ugh!
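To make the shortcut concrete, here's a sketch of the contrast, with invented names:

#include <stdlib.h>

#define MAXROWS 500                  /* the magic number */
static char *fixedRows[MAXROWS];     /* quick to write, but the tool dies at row 501 */

/* The malloc() way: sized at runtime, no built-in ceiling. */
char **allocRows(size_t numRows)
{
    return malloc(numRows * sizeof(char *));
}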
In my first decade of programming, my initial question was always "how can I break this down into tasks?". Data structure was postponed until needed to make the functional decomposition work. It's hard to get the right results when you ask the wrong question...
OOP
A decade after Santa Monica College, OOP came alive. I bought
Turbo C++. I viewed Philippe Kahn's OOP video endlessly. I even
rewrote a substantial portion of the law firm's timesheet processing
software in C++. It worked flawlessly, but its complexity was
oppressive.
From 1992 to 1996 I was like most other programmers. I tried to use
OOP. I knew "OOP is the future". But I just didn't get it. I dabbled in
OOP when I could, and forsook it when deadlines were tight. I viewed
OOP as a more cohesive way to do my "context variable" trick, which
obviously was a tiny portion of what OOP evangelists were saying about
it. If only I'd known how close I was to the truth.
One day in 1996, sleep overcame me while I was reading an OOP design book I just couldn't grasp. When I woke up, I understood! Just like that, I knew the process to design OOP apps: simply view the program as a machine, and the objects as subsystems, boards, or components. Two weeks later I became the OOP coach of a PowerBuilder team.
Using this realization, I whipped out quite a bit of great OOP code,
including a very simple and elegant replacement for the oppressively
complex timesheet processing component written a few years before.
The IT Director at the law firm was a hardware guy turned executive
named Ken Heaps. He was one of the best managers I've ever encountered.
One day the firm's head management guy met with Ken and
me to ask whether a desired timesheet program change was doable. I said
I didn't know. Ken Heaps then
asked me whether the current timesheet front end was capturing the data
involved in the change. I said it was. Ken turned to the head guy and
said "he can do it!". I was about to protest when I realized that Ken
was right -- if I had the data, one way or another I could get it to
the back end in a useful format.
With 15 years programming experience, I'd just been taught a
programming lesson by
an ex-hardware guy in a suit. From then on, when confronted with a question of doability, my first question became "can we capture all the necessary data?".
That realization was earthshaking, but it was page 8 small print next
to the truth that still eluded me. Why, oh why didn't Ken throw a glass
of water in my face and tell me "It's the Data, Stupid!". I travelled
on, coding and evangelizing OOP.
OOP had come, and all was well with the world. Or so I thought...
OOP Disillusionment
I spent the remainder of the 1990's writing OOP code. No matter what
the task, no matter what the underlying problem domain, my answer was
OOP. 100% OOP. C++, Perl or Turbo Pascal, I wrote OOP. I'd view the
program as a machine, split it into subsystems, and those subsystems
into components. The main program would consist of one line:
my $controller = Controller->new();
Yep, even the main logic would be an object called
$controller or
controller. The programs got
written. Everyone ooh'ed and ah'ed. But there was a problem.
Invariably, that super-understandable machine/subsystem/component
design became an albatross in real life. Look at the original EMDL
program as an example. The original design diagram fit cleanly on a
single
piece of 8.5 x 11 paper. The final product, written exactly to that
clean design, was unfathomable. It was the worst code I've ever
written. The original UMENU wasn't much better.
I began having OOP envy again. If only I really understood OOP like the
master programmers did, I could
make beautiful programs. Little did I know that OOP was strictly a bit
player. What I didn't understand was the relevance and importance of
data structure.
My entire career had been spent in small shops. With a couple
exceptions, I'd never had a chance to read truly great code. That would
soon change...
The First Light of Dawn
In September 1999 I began writing the book "Samba Unleashed". To fully understand Samba, I downloaded, compiled and installed the source. I then spent a few days studying and modifying the source, and noticed something common among Samba's many source files.
Most source files included a header file containing a very large and detailed data structure, together with subroutines to manipulate that data structure. That data structure was a context variable on steroids. Funny thing was, to anybody who had ever run the testparm program or written a detailed smb.conf, that data structure was very readable. All of Samba's configuration, and most of what makes Samba Samba, was contained in that structure -- either directly or within a has-a relationship. It was C, not C++, but they had achieved a lot of the encapsulation we commonly associate with OOP.
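Here's a minimal sketch of that technique -- encapsulation via a static struct and accessor functions. The names are invented, not Samba's actual ones:

#include <string.h>

struct config {
    char workgroup[64];
    int maxConnections;
};

/* The single instance is static, so only this file's "methods"
   can touch it -- encapsulation without C++. */
static struct config conf;

void config_set_workgroup(const char *wg)
{
    strncpy(conf.workgroup, wg, sizeof(conf.workgroup) - 1);
    conf.workgroup[sizeof(conf.workgroup) - 1] = '\0';
}

const char *config_get_workgroup(void)
{
    return conf.workgroup;
}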
I had discovered that with data structures and static variables, one
could simulate OOP with C. There was also a tiny voice in my
subconscious. A voice my conscious mind couldn't hear, but a voice that
would make me receptive to other future voices. That voice whispered:
It's the data, stupid.
The Search
By 2002 the OOP well was running dry. My 1999 all OOP all the time
UMENU was fiercely unmaintainable. Yet UMENU was downright readable
compared to my 2002 monument to OOP, EMDL. The original EMDL parser was
absolutely unmaintainable, even by me. On paid gigs I'd gone back to my
coding practices of the mid 1990's -- functional decomposition where I
could, and objects for distinct tools. That worked out much better when
on the clock.
I was in trouble. My entire programming career was about code design
methodology. My expertise at 20 year old functional decomposition was
irrelevant. Object Oriented Design wasn't doing the job except for
those rare projects involving simulation of tangible objects.
I looked for a new design methodology. Discussed it endlessly. Looked
for tools.
The first signpost on my road to that design methodology happened in
February 2001 during the writing of the XML themed "Troubleshooting
Professional Magazine". The Xerces XML to DOM parser yielded a DOM node
tree that was one of the most beautiful, useful, productive and
understandable tools I'd ever seen.
By 2002 I'd created VimOutliner and repeatedly needed to parse tab indented outlines. I needed something like DOM, but DOM was just too bulky. So I made a lightweight knockoff of DOM called Node.pm. Node.pm centered around an object called a Node, which, just like the DOM spec it emulated, contained:
- Parent pointer
- Next sibling pointer
- Previous sibling pointer
- First child pointer
- Last child pointer
- Node Name
- Node Type
- Node Value
- Key/value list of attributes
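Rendered as a C struct (Node.pm itself is Perl), those fields amount to something like this sketch:

struct Attribute {
    char *key;
    char *value;
    struct Attribute *next;
};

struct Node {
    struct Node *parent;
    struct Node *nextSibling;
    struct Node *prevSibling;
    struct Node *firstChild;
    struct Node *lastChild;
    char *name;                      /* Node Name */
    int type;                        /* Node Type */
    char *value;                     /* Node Value */
    struct Attribute *attributes;    /* key/value list of attributes */
};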
Naturally, Node.pm contained get and set routines for all the above. I
used Node.pm to create several nice outline extraction routines.
Unfortunately, I didn't fully understand the power of this tool, and
therefore didn't give it the design and programming time it required.
Node.pm changed regularly, breaking programs based on previous
versions. Another missed opportunity.
My whole programming life I've searched for a true Rapid Application Development environment. The only one I ever found was Clarion 2.1. All the rest paled in comparison. During the go-go late 1990's you didn't need RAD -- there was plenty of money available to attack any job with massive programming talent. Besides, I could whomp out apps lightning quick.
After they began emailing American jobs to India, "lightning quick"
took on
a whole new meaning. It no longer meant 2 weeks -- now it meant a day.
I looked for Open Source RAD environments, and found nothing suitable.
I contemplated building my own RAD tools. The February and March 2003
Linux Productivity Magazine issues
on Perl-TK were nothing more than a search for RAD, but the result was
insufficient.
My exploration of Curses during the last two months was another example of seeking a RAD tool that I didn't find. Or maybe I did, but not in the way I thought. More on that later...
In August 2003, I rewrote the obscenely convoluted EMDL parser program
using Node.pm. In doing so, I got to debug and enhance Node.pm. Node.pm
now had not only the
Node
object, but also an EMDL to Node Parser object, and a Node tree walker
object. Node.pm was a completely reusable tool, whose use made complex
operations into rather short, understandable programs.
In December 2003 I rewrote UMENU using Node.pm. Once again, the new program was much better than the old, OOP-only version. Now questioning why Node.pm seemed to turn everything into gold, I touched on the realization that Node.pm is a data form exactly mimicking the hierarchies occurring in outlines and menu systems. Perhaps the secret to great development is using a data format that best represents the application's needs.
My RAD explorations led to an October 2003 thread on the VimOutliner mailing list about methods of rapid application development, which led to discussion of \rdb, which led to discussions of apps cobbled together with tiny commands via pipes. I put together a couple such apps, but the results were less than expected.
bash is not a particularly
efficient program, and data piping is efficient only if you can truly
get a record, process it, and send it on. Any keeping of multiple
records slows piping immensely.
Luckily, that October 2003 VimOutliner thread led to a
subthread in which I posted the
following to the VimOutliner mailing list:
Hi all,

Does any of you have any tips for moving computer program logic into computer program data?

Yes, this really is on-topic. I'm trying to make a new design methodology based on VimOutliner, and I suspect that the secret is in putting as much as possible into data to relieve the need for complex logic.

SteveT
The stage was set. For 20 long years the words of my mentors had
deflected off my functional decomposition comfort zone. Now I was
taking a long, hard look at data.
The Discovery
In 2004 I wrote several programs using Node.pm. Every one turned out beautifully. In May 2004 I released Node.pm as an Open Source tool. In documenting it, I began to understand what a truly powerful tool it was. Its use extended to anything remotely connected to a hierarchy, or anything that could be connected to a hierarchy. But the farther I got from hierarchical use, the less useful Node.pm became. Jerry Hull had told me of the tradeoff between data complexity and code complexity. I was now beginning to suspect that data complexity was much more desirable.
Next month's Linux Productivity
Magazine, which I've already completed, includes a binary tree using
Node.pm. I got it to work, but only by simulating left and right child
pointers with Node attributes. The algorithm would have been easier had
I used a more representative piece of data -- one with only left and
right pointers, with no siblings.
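For comparison, the more representative structure would have been as simple as this sketch -- just the pointers a binary tree actually needs:

struct BTreeNode {
    struct BTreeNode *left;
    struct BTreeNode *right;
    char *key;               /* whatever the tree stores */
};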
There was no doubt. The effectiveness of a design was proportional to
the congruence of its data structure to the problem at hand.
Somewhere around this time, I read parts of "The Art of Unix
Programming" by Eric Raymond. In chapter 1, "Philosophy", Raymond lists
17 rules of the Unix programming philosophy. Rule 9 is called the "Rule of Representation", and is stated like this:

    Fold knowledge into data, so program logic can be stupid and robust.
This was exactly what Jerry Hull told our class so many years ago, except this time it was made clear that complex data is superior to complex code. I understood, I believed, and I followed through.
For this magazine's Data Driven Picklist article, I understood the need to have ALL configuration data for the picklist contained in a data structure. So I used the VimOutliner outline processor to record and categorize all necessary data in an outline, refining it as I went. Once I was satisfied I had all major data, I used Vim to create C data structures from that outline. I then built all the subroutines around it.
It wasn't easy. The resulting program is almost 1300 lines long, and
doesn't even use all the data. It's enough code that I had to use Vim's
marker foldmethod to organize and view the source code. One might be
driven to question my sanity, because this almost 1300 line program
does the same thing as the 230 line simple picklist shown earlier in
this magazine.
The distinction is simple. Most of the code for the data driven version
is generic to any picklist -- Curses, Perl/TK, or any other technology.
Any data source -- flat file, outline, XML or SQL. It's a tool you can
use over and over again. Better still, if done properly, the picklist's
entire configuration could be defined in a file outside the source
code, with no compilation necessary. Do you think that might speed
development?
It's a 1300 line program, but if you look at the 100 lines comprising
the data structures, the #defines and the enums, you can read this
program like a book, understanding it completely. To improve it, you'd
simply add or change the data structure, and then add/modify the
routines that work on/with that data structure.
20 years after starting Santa Monica College I finally understand.
It's
the data, stupid!
Part II: The Observations
I really came to understand this during my protracted search for a RAD
tool, and my considerations for how to write one myself if nothing
suitable existed. I realized that the only way to get all the needed
features and flexibility, and also the only way to make the tool
lightning quick and trivially easy to use would be to move the
complexity from code to data.
In hindsight it all seems so obvious. What else would you look at besides data? Data is one of the very earliest products of systems analysis.
Einstein said that formulating the question is more important than the ultimate answer. The first generation of programmers were taught to ask the question "how should the logic flow?". The resulting spaghetti code mirrored the appropriateness of that question.
My 1983 Santa Monica College professors told us to ask the question "how can I divide this into tasks?". The resulting code was written quickly and was quite robust, but it tended toward hardening of the arteries, lacked reuse, and after maybe 25,000 lines collapsed on itself.
As far as I know, the OOP era guys weren't even taught to ask a question. They were pretty much told to "use objects". Maybe they were told to look for "is a" and "has a" relationships. I think the code of the last 10 years reflects the lack of a good, solid question.
Throughout all these programming fads ('scuse me, paradigms), an elite few knew the truth:

It's the data, stupid!
My professor, Jerry Hull, knew it. So did my first mentor, Mike Welch.
So did the hardware guy turned executive, Ken Heaps. And now I know it.
And you know it. Match the program's data to the task at hand, and your
life will be an
easy one.
Look deeper and it starts looking better. In the
Data Driven Picklist article, notice
that the PICKINFO structure is about the concept of a picklist -- with
very little dependency on Curses or any operating system features. This
means your picklist is easily ported to QT, GTK, HTML or even Windows.
By separating data into problem domain vs. OS/platform dependencies,
our programs become much more portable.
Does all of this mean that functional decomposition is worthless?
Obviously
not. For 20 years I've made a living on a reputation for quickly
producing robust code that solved the problem at hand. I used
functional decomposition to do that. But now I know that data is the
foundation, with functional decomposition serving as the studs in the
wall. Functional decomposition is an excellent way to create code to
support the underlying data structure.
Does this mean that OOP is worthless? Not in a million years. Think of
how nicely OOP represents the underlying data. Had I written the data
driven picklist in C++ or Perl, I'd have had classes called PICKINFO, WINDOWS, SUBROUTINES, KEYSTROKES, ROWSINFO, and ROWSINFO's contained class, ROWINFO. All those
support routines I wrote would have been grouped
directly with the data as methods. Once again, data is the foundation,
and OOP is a great material with which to build on that foundation.
Compare code built with data driven design to code built with code centered design. Code centered design invariably has more if statements, and its if statements are much longer. Those if statements are how logic is done in code. Data driven design simply assigns a variable to the correct data element, and the program runs just right.
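As an illustration (the command names are invented), here's the data driven alternative in C: a dispatch table. The logic moves out of the if statements and into the data:

#include <stdio.h>
#include <string.h>

static void doOpen(void)  { puts("opening");  }
static void doClose(void) { puts("closing"); }
static void doSave(void)  { puts("saving");  }

struct command {
    const char *name;
    void (*handler)(void);
};

/* The behavior lives in data. Adding a command means adding a row. */
static struct command commands[] = {
    {"open",  doOpen},
    {"close", doClose},
    {"save",  doSave},
    {NULL, NULL}
};

void dispatch(const char *name)
{
    struct command *c;
    for (c = commands; c->name != NULL; c++)
        if (strcmp(name, c->name) == 0) {
            c->handler();
            return;
        }
    printf("unknown command: %s\n", name);
}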
Pure data driven design is a lot like pure OOP -- it's overkill for small projects. You can see that my data driven picklist is a hefty 1200+ lines -- more than a 1 or 2 day project. For smaller projects, make the program less data driven and more hardcoded, but that doesn't remove the need for data driven design. This will be discussed more later in this article...
Part III: The Methodology
The fact that this article skips over systems analysis does not mean it's unnecessary. Certainly, before committing time to anything but the most throwaway software project, a work and paperflow analysis should be done, users, stakeholders and top management should be consulted, and feasibility and cost estimates must be performed. The systems analysis gives you a pretty good idea of the data that needs to be collected and the data that needs to be output. That's an excellent start.
Unfortunately, many programmers think their data analysis is complete
once they lay out data tables, indexes and foreign keys. They fail to
address the fact that the process itself requires its own kind of data.
How will you address in-transit, in-memory data? What are the business
differences between exempt and non-exempt employees? Could there be a
third employee type in the future? How many other employee spectra
exist or might later exist?
To what extent and frequency will the program require modification?
What aspects of the program will require frequent modification? How can
the various program flavors be represented in data, so that they can be
configured with an editor or with a user operated front end?
If the program being written is intended for widespread use amongst
disparate organizations, which fields will be structurally disparate?
Employee number might be 5 digits in MomAndPop Groceries, and 8 to 14
alphanumeric characters in BigOldCorporation Inc. These differences
can, and probably should, be expressed in data rather than code, so
that a single body of Form code can handle any employee number field
format.
If the employee number format is
expressed in data, when BigOldCorporation Inc. switches to a 16 digit
number in the future, a simple data change is all that's required. Yes,
this is difficult to design, but it saves you from having to change the
whole code base to accommodate business quirks of that new customer you
"just have to get".
Example:

To accommodate differing field formats, you might have a field format table consisting of several items, each with a format name and a regular expression describing that field format. The name would be the unique key. Then each input field in the application would have, as one of its data members, the name of the format to be used. The actual code would simply compare the incoming data with the regex, on a character by character basis where possible, or on a field by field basis where necessary.

If SQL lookup of the field format proved too slow, the code could do the lookup when the form is opened, with the regexes stored as elements of the fields.

The point is, the format of each field is defined in data, not in code.
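Here's one way that might look in C, using POSIX regexes. The format names and patterns are invented for illustration; a real app might load these rows from SQL:

#include <regex.h>
#include <string.h>

struct fieldFormat {
    const char *name;     /* unique key */
    const char *regex;    /* describes legal field contents */
};

static struct fieldFormat formats[] = {
    {"empno_momandpop",  "^[0-9]{5}$"},            /* 5 digits */
    {"empno_bigoldcorp", "^[A-Za-z0-9]{8,14}$"},   /* 8-14 alphanumeric */
    {NULL, NULL}
};

/* Look up the named format and match the input against its regex. */
int fieldIsValid(const char *fmtname, const char *input)
{
    struct fieldFormat *f;
    for (f = formats; f->name != NULL; f++) {
        if (strcmp(f->name, fmtname) == 0) {
            regex_t re;
            int ok;
            if (regcomp(&re, f->regex, REG_EXTENDED | REG_NOSUB) != 0)
                return 0;
            ok = (regexec(&re, input, 0, NULL, 0) == 0);
            regfree(&re);
            return ok;
        }
    }
    return 0;   /* unknown format name */
}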
Maybe you are coding a tool. Tools must be extremely malleable. Good tools have no magic numbers. Magic numbers not only restrict usability, but also necessitate costly error checking that shouldn't be an issue. malloc() is your friend. Or Myclass->new().
Ask yourself what data you would need in order to describe the
appearance, performance, form and operation of the tool. Also ask
yourself what kind of data the tool handles, and what the tool needs to
do with that data. Will the data always be able to fit into memory? Is
the data hierarchical? Is it relational? Is it flat? Will it need to be
sorted? Frequently? Will it be read only, read usually, or read write?
Armed with the answers to these questions, use a good outline processor
(VimOutliner comes to mind) to
lay out pseudocode for the data. Take time with this. It's every bit as
important as functional decomposition, or outlining a prospective book.
The more time you spend outlining the expected data needs, the less
time you'll spend coding, debugging and re-coding.
Once you've laid out the data, you're half way home. You now have a
very readable, very adaptable and very solid foundation for your app.
Now it's time for real-world considerations.
Start with this: does your language have garbage collection? Garbage collection isn't that important with straight procedural code. But when you store vast quantities of nested data without any magic numbers, the necessity of malloc() and free() becomes error prone drudgery. Your best bet is, for each type of data structure, to write a routine to allocate that structure, another to free that structure, one to test for memory leaks by repeatedly calling those allocate and free routines, and another to display the contents of the data structure. Data structures that nest other data structures should free the nested structures only by calling the nested structures' free routines. Obviously, the requirements discussed in this paragraph become easier to construct if you use objects.
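Here's a sketch of that discipline, using simplified versions of the ROWINFO and ROWSINFO structures from the picklist article. Note that the outer free routine frees its nested structures only through their own free routine:

#include <stdlib.h>
#include <string.h>

struct ROWINFO  { char *displayString; };
struct ROWSINFO { struct ROWINFO **rows; int count; };

struct ROWINFO *rowinfo_new(const char *s)
{
    struct ROWINFO *ri = malloc(sizeof(struct ROWINFO));
    if (ri != NULL)
        ri->displayString = strdup(s);
    return ri;
}

void rowinfo_free(struct ROWINFO *ri)
{
    if (ri == NULL) return;
    free(ri->displayString);
    free(ri);
}

void rowsinfo_free(struct ROWSINFO *rs)
{
    int i;
    if (rs == NULL) return;
    for (i = 0; i < rs->count; i++)
        rowinfo_free(rs->rows[i]);   /* delegate to the nested free */
    free(rs->rows);
    free(rs);
}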
If your language lacks garbage collection, expect a significant and ongoing debugging task. Check early and often for segfaults and memory leaks. Code defensively. All pointers should be initialized to NULL, and then malloc'ed. When they're free'd, they should be set back to NULL. Following this custom minimizes the chance of double-freeing a pointer, and gives you more reliable information about a pointer. This information can be checked with assert() commands.
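In practice the custom looks like this -- a pointer is NULL exactly when it owns no memory, so assert() can say something meaningful about it:

#include <assert.h>
#include <stdlib.h>

char *buf = NULL;           /* starts NULL: owns nothing */

void example(void)
{
    assert(buf == NULL);    /* safe to allocate */
    buf = malloc(100);
    assert(buf != NULL);    /* allocation succeeded */
    /* ... use buf ... */
    free(buf);
    buf = NULL;             /* never left dangling */
}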
If your language has garbage collection, you might be able to get away with fewer constructors, destructors and testers. It might (or might not) be reasonable to destruct three levels deep with a single subroutine. This would NOT be practical without garbage collection.
Data driven programs require more code than their hardwired brethren. That's the price you pay for versatility. The trouble is, such large programs are difficult to handle within an editor. Ideally, you want to group subroutines operating on the same data, or in OOP terms, group methods for the same class.
The Vim editor has a nifty mechanism to do just that: The marker
foldmethod:
struct ROWINFO /*{{{*/
{
    char *displayString;
    char *sortString;
    bool sortPointsToDisplay;
    char *uid;
}; /*}}}*/
Notice the /*{{{*/ and /*}}}*/ markers? In Vim, {{{ means "start fold here", while }}} marks the end of that fold. These can and should be nested. In fact, the top level of my data driven picklist looks like this:
You can copy and paste pi.c into Vim. Then issue the following Vim command:

:set foldmethod=marker

You'll then see the screen pretty much as shown above. To "drill down", issue the zo command while the cursor rests on a header line. To close an opened fold, use zc while the cursor is within the text covered by that fold. To close all folds on all levels, issue the following command:
:set foldlevel=0
Because I perform that action so often, I mapped the
,,1 (that's the numeral 1, not
a lowercase L) key sequence to that
command. Those knowledgeable about VimOutliner recognize that as
identical to VimOutliner's command to perform the same action.
By organizing your code as described, and by using set foldmethod=marker and placing markers within your code, you'll soon be able to navigate instantly. Better yet, you'll usually have a .h file containing includes, #defines, enums, structs, and function prototypes. The .c file will contain the functions. You can also use Vim's ctags interface to ease your development and maintenance, but that's beyond the scope of this article.
As mentioned earlier, pure data driven design is overkill for
small projects. You can't write a 2 day app if you spend a week writing
allocation and destruction subroutines.
On quick projects, I'd recommend spending the first 2 hours of code
design laying out necessary data with an outline processor. Remember, I
said
code design. I'm
assuming you've already analyzed the problem domain, work and paper
flow, etc. Why 2 hours? I've heard of few projects where saving an hour
or two would make a material difference, so spend the full 2 hours so
you know the full extent of needed data.
Once you know all the necessary data, NOW is the time to decide what to hardcode. Whatever's not likely to change over the near future, or be needed on future projects, should be flattened and hardcoded. By flattened, I mean put it in, or near, the root of the data structure hierarchy. Don't spend a lot of time with allocation, freeing, instantiation or destruction. Just have the data element, and if possible, initialize it right within the structure or class. If that's not possible, instantiate all this hardcoded data in a single subroutine. Then write your code to use the hardcoded data.
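For instance (names invented), flattened, hardcoded data might look like this -- initialized right within the structure, near the root, ready to be data driven later:

struct appConfig {
    int pageLength;
    const char *title;
};

/* Hardcoded today; tomorrow these values could come from a file,
   without restructuring the program. */
static struct appConfig conf = {24, "Main Menu"};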
This gives you the hooks to an eventually pure data driven design, but
without the voluminous code required by neverending levels of
allocations and destructions.
Summary
In days gone by, a program's logic was contained in its code. Such
programs were easy to write, but were difficult to modify, and almost
impossible for a user to modify and configure. OOP sounded promising,
but in fact was used incorrectly more often than not, and OOP for OOP's
sake missed the point. The point is,
It's
the data, stupid!
By placing the program's behavior and configuration in data, the
program's versatility is maximized, with user modification of the
program's behavior becoming a reality. Data driven design begins with a
thorough investigation of needed data, and the best structure for that
data to take. In doing so, an outline processor is your friend.
Once the data is decided upon, supporting routines are written for that
data structure, and then the data and supporting routines are operated
upon by the app. The result is an incredibly maintainable and
configurable application or tool.
For quick apps, you'll want to de-datafy your design by flattening and
hardcoding some of the less likely to be changed data. The result is
faster development, while still retaining at least the hooks for data
driven design. That way, adding the removed features will require an
addition, not a rewrite.
Data Driven Design: Use it!
Any article submitted to Linux Productivity Magazine must be licensed with the Open Publication License, which you can view at http://opencontent.org/openpub/. At your option you may elect to prohibit substantive modifications. However, in order to publish your article in Linux Productivity Magazine, you must decline the option to prohibit commercial use, because Linux Productivity Magazine is a commercial publication.
Obviously, you must be the copyright holder and must be legally able to so license the article. We do not currently pay for articles. Troubleshooters.Com reserves the right to edit any submission for clarity or brevity, within the scope of the Open Publication License. If you elect to prohibit substantive modifications, we may elect to place editor's notes outside of your material, or reject the submission, or send it back for modification.
Any published article will include a two sentence description of the author, a hypertext link to his or her email, and a phone number if desired. Upon request, we will include a hypertext link, at the end of the magazine issue, to the author's website, provided that website meets the Troubleshooters.Com criteria for links and that the author's website first links to Troubleshooters.Com. Authors: please understand we can't place hyperlinks inside articles. If we did, only the first article would be read, and we can't place every article first.
Submissions should be emailed to Steve Litt's email address, with subject line Article Submission. The first paragraph of your message should read as follows (unless other arrangements are previously made in writing):

After that paragraph, write the title, text of the article, and a two sentence description of the author.