Troubleshooters.Com Presents

Troubleshooting Professional Magazine

 
Volume 8 Issue 1, Winter, 2004
Damage Control
Copyright (C) 2004 by Steve Litt. All rights reserved. Materials from guest authors copyrighted by them and licensed for perpetual use to Troubleshooting Professional Magazine. All rights reserved to the copyright holder, except for items specifically marked otherwise (certain free software source code, GNU/GPL, etc.). All material herein provided "As-Is". User assumes all risk and responsibility for any outcome.


Steve Litt is the author of Troubleshooting Techniques of the Successful Technologist and Rapid Learning: Secret Weapon of the Successful Technologist.




 
A stitch in time saves nine. -- Proverb


Editor's Desk

By Steve Litt
On my first day as a technician at Pacific Stereo, my new boss read me the rules of the road:

"Here's the most important thing, Steve. Make sure you never, and I mean NEVER, make a unit worse than when it came in. NEVER plug it in without first running it up on a Variac. When measuring voltages, NEVER let your probe slip in a way that can short out the unit in a serious way. NEVER make things worse!"

That was my introduction to Pacific Stereo. But more importantly, it was my introduction to Step 2 of the Universal Troubleshooting Process: Make a Damage Control Plan.

In my 3-year stint at Pacific Stereo, I saw high-power receivers burst into flame. On other technicians' benches. I saw turntables flung across the room in utter frustration. But I was never the flinger. I heeded my first boss's words, and made sure never to make things worse. Years later as a software developer, I made sure never to jeopardize data unless it was backed up. Finally, some 15 years after my first day at Pacific Stereo, I incorporated my first boss's words into the Universal Troubleshooting Process as Step 2.

Better yet, about 8 years after that, a very astute customer made me aware that damage control was more than not making things worse. It was also limiting the damage caused by the defect to be troubleshot.

Today we'll discuss damage control as it relates to Troubleshooting. So kick back, relax, and read this issue. And remember, if you're a Troubleshooter, this is your magazine. Enjoy!
Steve Litt is the author of "Troubleshooting Techniques of the Successful Technologist".  Steve can be reached at Steve Litt's email address.

Don't Make Things Worse

By Steve Litt
Imagine the guy who decided to disable Chernobyl's automatic shutdown mechanisms during a 1986 test. Might that have been a career limiting event? Who in their right mind would disable a nuclear reactor's automatic shutdown mechanism for any reason? There are always other ways to test things.

You and I are probably lucky enough not to work on equipment so dangerous that a failure would be spoken of 17 years later, but screwups are still career landmines. Everyone knows Troubleshooters can't instantly find the problem, but they expect at least that you won't make it worse.

If you make it worse you have three unattractive choices:
  1. Admit it, and pay for the damage
  2. Admit it, and refuse to pay for the damage
  3. Fix the troubleshooter-caused damage on your own time, with your own labor and money
None is attractive. That's why every troubleshooting adventure must begin (after getting the right attitude) with formulation of a damage control plan. At Chernobyl, your damage control plan would have included "never defeat safety interlocks". Working on a car, it would include not placing your skin near the carburetor, always having the oil pipe capped to prevent screws dropping into the crankcase, and pinning up long hair before looking in an engine compartment.

Working on a computerized system, make sure to back up any data even remotely in danger.

A damage control plan is just that -- a plan. It's not an action. Taking specific actions in every troubleshooting episode would be too time consuming for most troubleshooting in non-safety-critical situations (consumer electronic repair, data processing computer repair and the like). Instead, stop for a moment and determine what risks your troubleshooting actions could trigger, and how to limit those risks. It's as simple as determining a line over which you won't step: "I won't edit the Windows Registry without backing up the computer's data".
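
The "line over which you won't step" can also be enforced mechanically. The following is a minimal Python sketch, not a prescribed method: the function name `backup_then_edit`, the backup naming scheme, and the example file `settings.ini` are all hypothetical, chosen only for illustration. It refuses to run an edit until a verified copy of the file exists.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_then_edit(path, edit, backup_dir):
    """Copy `path` into `backup_dir`, verify the copy, then apply `edit`.

    `edit` is any callable that takes the Path and modifies the file.
    Returns the Path of the backup so the change can be rolled back.
    (Hypothetical helper, for illustration only.)
    """
    path = Path(path)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = backup_dir / f"{path.name}.{stamp}.bak"
    if backup.exists():
        # Never overwrite an earlier backup.
        raise FileExistsError(f"backup already exists: {backup}")

    shutil.copy2(path, backup)

    # The line we won't step over: no verified backup, no edit.
    if backup.read_bytes() != path.read_bytes():
        raise RuntimeError(f"backup of {path} failed verification")

    edit(path)
    return backup


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = Path(tmp) / "settings.ini"  # stand-in for data at risk
        cfg.write_text("mode=safe\n")
        saved = backup_then_edit(
            cfg,
            lambda p: p.write_text("mode=test\n"),
            Path(tmp) / "backups",
        )
        print("edited file:", cfg.read_text().strip())
        print("backup copy:", saved.read_text().strip())
```

The design point is that the backup and its verification come before the edit in the same function, so the risky step cannot be reached by skipping the safe one. A real plan would also decide where backups live; a copy on the same disk does not protect against that disk failing.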

Here are some of the major risk categories to consider:
Making a damage control plan minimizes such risks.
Steve Litt is the creator of the Universal Troubleshooting Process.  Steve can be reached at Steve Litt's email address.

Limit Damage from the Defect Itself

By Steve Litt
Summer of 1989. You could still get "45" records, and if you bought one, it just might be Good Thing by Fine Young Cannibals. You could get 45 records, but one thing you couldn't easily get back then was an antivirus program. Viruses were relatively new, and few IT people had seen them. Certainly none of the IT people at my large law firm client had ever seen a virus. Until that nice summer day in 1989.

One programmer said he thought we had a virus on several computers. I proved his hypothesis correct by demonstrating that executable programs were changing themselves. I told the MIS manager, and the effect was like a row of dominos. She panicked. That was really scary, because she was one of the calmest, coolest, most rational people I've met. So I panicked (and I knew better -- I was in the middle of writing Troubleshooting: Tools, Tips and Techniques). Every programmer panicked. One by one, we all fell down like a row of dominos.

Just then the CFO walked out of her office, saw our consternation, and asked what was wrong. I told her, and her reply was swift:

"OK. First, unplug all the network junctions to keep this thing from spreading. Then power down every machine in the place to keep it from spreading on the machines. Then take an infected machine, study it, and find a way to fix this thing. One of you go tell the central office about this problem to minimize inconvenience to everyone else."

The effect was galvanic. Imagine a video tape of a falling row of dominos, played in reverse. We each took a deep breath, said "yeah", got together, made a plan, informed the central office, and got to work. Another programmer and I analyzed several infected executables and managed to find a string occurring in every one. We could now scan for infected files. I went ahead and created a program that would scan an entire hard disk for files containing that string, and report on them. My program would be run from a write-protected bootable floppy. A particularly skillful assembly language programmer analyzed the virus, reverse engineered it, found it replicated itself by an undocumented interrupt call (if memory serves me it was interrupt 35), and wrote a program to act as a "birth control device" for the virus, so that no re-infection would occur later on.
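
The scanning idea described above, walking a disk and reporting every file that contains a known byte string, can be sketched in a few lines of modern Python. This is an illustration only, not the 1989 program, which ran under DOS from a floppy. The function names and the signature `b"EXAMPLE-SIGNATURE"` are hypothetical; the actual string found in the infected executables is not given here.

```python
import os
import tempfile


def find_infected(root, signature, chunk_size=64 * 1024):
    """Return sorted paths under `root` whose contents contain `signature` (bytes)."""
    if not signature:
        raise ValueError("signature must not be empty")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if _contains(path, signature, chunk_size):
                    hits.append(path)
            except OSError:
                # Unreadable file: skip it rather than abort the whole scan.
                continue
    return sorted(hits)


def _contains(path, signature, chunk_size):
    """Read the file in chunks, carrying over the last len(signature) - 1
    bytes so a signature that straddles a chunk boundary is still found."""
    keep = len(signature) - 1
    tail = b""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return False
            window = tail + chunk
            if signature in window:
                return True
            tail = window[-keep:] if keep else b""


if __name__ == "__main__":
    sig = b"EXAMPLE-SIGNATURE"  # hypothetical; stands in for the real string
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "clean.com"), "wb") as fh:
            fh.write(b"\x90" * 200)
        with open(os.path.join(tmp, "bad.exe"), "wb") as fh:
            fh.write(b"\x90" * 50 + sig + b"\x90" * 50)
        for path in find_infected(tmp, sig):
            print("contains signature:", os.path.basename(path))
```

The scanner only reads and reports. That matches the damage control theme: detection is separated from repair, so running the scan cannot itself make an infected machine worse.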

Our programs written, we hired about 30 people from a body shop, and disinfected the hundreds of computers in the office. Three days after discovery of the virus, we were up and running, and virus free.

We later found out the virus's name was Jerusalem B, a particularly evil virus from the early days of DOS viruses. It had originated in Jerusalem, Israel, and when the University of Jerusalem had been infected, it took them longer than 3 days to fix it. Our crew of about 5 programmers had worked miracles.

But imagine for a second what would have happened without the quick and sure advice of the CFO. Had our panic and indecision continued much longer, upper management would have had meetings. Blame might have been placed. Alternative actions contemplated. Time wasted, possibly while the network was left up to continue the spread. The network might have been down for a couple weeks.

Fast Action Prevents Further Damage

One monkey don't stop no show. Neither does one computer, or one network, or one machine. Or several machines. Nothing stops the show. The show must go on.

After dispensing with personal safety issues, the next priority is business continuation. How can we gracefully shut down? How can we find alternate ways of working while this problem is being solved? How can we inform everyone, and give everyone ways to work around the problem until it's fixed? These questions must be answered quickly.
Steve Litt is the author of the Universal Troubleshooting Process courseware. Steve can be reached at Steve Litt's email address.

The State of Troubleshooting Address

By Steve Litt
The old Chinese curse states "may you live in interesting times". We are so cursed.

Outsourcing, Offshoring and Recession (OOR) have decimated the ranks of technologists. Many engineers and computer programmers, especially older ones, are teaching school, driving cabs and even flipping burgers. Americans take the layoffs, and foreign workers get many of the newly created jobs. Where does this leave the Troubleshooter?

That depends on how well the expert Troubleshooter markets that skill. I predict that all too soon this love affair with cheap made-in-India software systems will sour. Outsourcing, whether across the street or across the Pacific, is difficult. The software systems are cheaper, but it remains to be seen whether they'll be cheap enough to compensate for the hassle.

Meanwhile, small business employs over half the U.S. workforce, and is responsible for most of the new job creation. Small business isn't big enough to offshore a huge app. Small business will be serviced by generalist computer expert freelancers. These freelancers must deliver more for less -- not just with respect to software creation and network configuration, but also with respect to troubleshooting.

Troubleshooting is on the short list of skills vital to keep you employed.

Outside of information technology, things might be even better. There are more cars than ever, and cars break. Excellent car repair places, both large and small, continue to thrive. Today's complexity is countered by specialization, or by training.

If the recession continues, purchase of new items will falter, meaning repair of old items will become more attractive. The Troubleshooter stands to gain.

These are tough times for all but the richest, with all too many middle class people drowning in the flood of outsourcing, offshoring and recession. Especially in this economy, Troubleshooting Process knowledge serves as a flotation device.

And remember -- no recession lasts forever. When good times return, great Troubleshooters will be scooped up fast.

Letters to the Editor

All letters become the property of the publisher (Steve Litt), and may be edited for clarity or brevity. We especially welcome additions, clarifications, corrections or flames from vendors whose products have been reviewed in this magazine. We reserve the right to not publish letters we deem in bad taste (bad language, obscenity, hate, lewd, violence, etc.).
Submit letters to the editor to Steve Litt's email address, and be sure the subject reads "Letter to the Editor". We regret that we cannot return your letter, so please make a copy of it for future reference.

How to Submit an Article

We anticipate two to five articles per issue, with issues coming out monthly. We look for articles that pertain to the Troubleshooting Process, or articles on tools, equipment or systems with a Troubleshooting slant. This can be done as an essay, with humor, with a case study, or some other literary device. A Troubleshooting poem would be nice. Submissions may mention a specific product, but must be useful without the purchase of that product. Content must greatly overpower advertising. Submissions should be between 250 and 2000 words long.

Any article submitted to Troubleshooting Professional Magazine must be licensed with the Open Publication License, which you can view at http://opencontent.org/openpub/. You may elect the option to prohibit substantive modifications. However, in order to publish your article in Troubleshooting Professional Magazine, you must decline the option to prohibit commercial use, because Troubleshooting Professional Magazine is a commercial publication.

Obviously, you must be the copyright holder and must be legally able to so license the article. We do not currently pay for articles.

Troubleshooters.Com reserves the right to edit any submission for clarity or brevity, within the scope of the Open Publication License. If you elect to prohibit substantive modifications, we may elect to place editor's notes outside of your material, or reject the submission, or send it back for modification. Any published article will include a two-sentence description of the author, a hypertext link to his or her email, and a phone number if desired. Upon request, we will include a hypertext link, at the end of the magazine issue, to the author's website, provided that website meets the Troubleshooters.Com criteria for links and that the author's website first links to Troubleshooters.Com. Authors: please understand we can't place hyperlinks inside articles. If we did, only the first article would be read, and we can't place every article first.

Submissions should be emailed to Steve Litt's email address, with subject line Article Submission. The first paragraph of your message should read as follows (unless other arrangements are previously made in writing):

Copyright (c) 2001 by <your name>. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, version  Draft v1.0, 8 June 1999 (Available at http://www.troubleshooters.com/openpub04.txt/ (wordwrapped for readability at http://www.troubleshooters.com/openpub04_wrapped.txt). The latest version is presently available at  http://www.opencontent.org/openpub/).

Open Publication License Option A [ is | is not] elected, so this document [may | may not] be modified. Option B is not elected, so this material may be published for commercial purposes.

After that paragraph, write the title, text of the article, and a two sentence description of the author.

Why not Draft v1.0, 8 June 1999 OR LATER

The Open Publication License recommends using the words "or later" to describe the version of the license. That is unacceptable for Troubleshooting Professional Magazine because we do not know the provisions of that newer version, so it makes no sense to commit to it. We all hope later versions will be better, but there's always a chance that leadership will change. We cannot take the chance that the disclaimer of warranty will be dropped in a later version.
 

Trademarks

All trademarks are the property of their respective owners. Troubleshooters.Com (R) is a registered trademark of Steve Litt.
 

URLs Mentioned in this Issue