As a Human Resources professional, you might be brought into the loop if
there's a perception that the employees "can't solve problems". Or perhaps
you'll be involved in some other sort of complaint that, to you, looks
like employees' inability to solve problems. Either way, it's a complex
subject, so your response should be thoughtful and measured. This document
provides an overview of problem solving, and in doing so, perhaps gives
you the insight needed to research and formulate an answer to the
"can't solve problems" problem, from an organizational and Human
Resources perspective.
Problem solving is the most basic and most important human mental
activity. We all want to hire excellent problem solvers. We all want to
improve our employees' problem solving ability. But how?
The best problem solvers solve problems with a process. Once your
employees learn the process, their problem solving productivity will
skyrocket. How can the process be taught to them?
Here's where things get challenging. There are many categories of
problem solving, and many problem solving processes. Each problem
solving process (methodology) is optimized
for a specific problem solving category. Sometimes a process can be
used on a problem for which it's not optimized, but if that's done,
it's likely to be slow going. It's always best to approach each problem
with a process optimized for that category of problems.
Here's a very
partial list of problem solving categories:
- Conflict resolution problems
- Legal problems
- Money problems
- Math problems
- Science problems
- Mechanical/electrical/computer problems (Technical troubleshooting)
- Sales problems
- Personal problems
- Factory throughput problems
- Business decision problems
The following table lists some categories and their most famous problem solving methodologies:
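|Math problems ||Algebra, geometry, calculus, etc
|Mechanical/electrical/computer problems (Technical troubleshooting) ||Universal Troubleshooting Process
|Factory throughput problems ||Theory of Constraints
|Business decision problems ||Methods of Kepner and Tregoe
|Finding why a mishap occurred, to prevent recurrence ||Root Cause Analysis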
When training employees to solve problems, the key is to train
them in methodologies optimized
for the category of problem they will be solving. If they'll be solving
more than one class of problems, train them in methodologies optimized for
each problem class, rather than trying to shoehorn several problem classes
into a "one size fits all" methodology.
Who's This Steve Litt Guy?
I created, teach and market the Universal Troubleshooting Process
(UTP for short). This process is highly optimized for quick, economical
and accurate solutions of well defined systems with abundant test
points. In other words, machines, electronics, computerized systems,
software and the like.
Early in my career I tried to market the UTP for solving ALL problems,
but quickly found out that it doesn't work well for things like
business and relationship problems. Meanwhile, "competitors" with
problem solving methodologies optimized for vastly different problem
categories tried to sell my customers their methodologies for technical
troubleshooting, and I had to explain why those non-optimized
methodologies would have been costly, even if the training had been
free.
Besides being an authority on the Universal Troubleshooting Process, I
also have a good working knowledge of the Theory of Constraints and
Root Cause Analysis.
The Most General Problem Solving Process
The most general problem solving process involves three steps:
- Analyze the problem state
- Analyze the solved state
- Analyze how to make the transition
In other words:
- What don't I like about the current situation?
- What would I like the situation to be instead?
- How can I go from what is to what I want?
This description is too general to solve real world problems, but it gives you the basic structure of most problem solving.
Universal Troubleshooting Process
This is what you use to fix machines, electronics, computerized systems
and software. It's optimized for well defined systems with abundant,
easy, safe and cheap test points. One of the ways it's optimized is that the
solved state is simply "get it to work as designed". There's no need to
evaluate alternate solutions, no need to manage expectations. Making
the transition is as simple as replacing the bad component.
The Universal Troubleshooting Process incorporates the fact that steps
2 and 3 of the most generic problem solving methodology are obvious in
technical troubleshooting. That's one of the many reasons it's much more
efficient at technical troubleshooting than more generic problem
solving methodologies.
I've written about it at http://www.troubleshooters.com/hr/utp.htm
Theory of Constraints
The Theory of Constraints (TOC) is what you use to identify and
correct bottlenecks. It's most often associated with factory floors,
but it can be used for any problem in which the entire system is slowed
by a bottleneck. The authoritative text on the TOC is "The Goal" by
Eliyahu M. Goldratt and Jeff Cox, ISBN-13 # 978-0884270614.
I'm not going to summarize the book in this article, but here are a few concepts from the book:
- The bottleneck is that component or subsystem that constrains the throughput of the entire system.
- Speed up the bottleneck and you speed up the system.
- Offload work from the bottleneck and you speed up the system.
- Speeding up a non-bottleneck does not speed up the system.
- Slowing a non-bottleneck doesn't slow the system, unless you slow the non-bottleneck to an extent that it becomes a bottleneck.
- The bottleneck is not necessarily bad. It can be used to
control the throughput of the system. The gas pedal on your car is an
intentional bottleneck to control the speed of your car.
- Bottlenecks leave clues. On the factory floor, the bottleneck
machine will probably have a large pile of work waiting to be processed in
front of it. Machines downstream from the bottleneck that use parts
from the bottleneck machine will be starved for incoming work.
- The sales department can be the bottleneck.
- It is impossible to run the factory at 100% capacity, and if you try, the factory's overall productivity slows to a crawl.
That last point seems counterintuitive so let me explain. A factory
could theoretically run at 100% capacity if every single component of
the factory ran at 100% all the time. However, the first time a bad
part is created, or an operator comes back from lunch late, or an
extra-large order is received, every factory machine plays the "hurry
up and wait" game, resulting in grossly compromised production. This is
explained quite well, in the book, by "the match game". Additionally, if
non-bottlenecks are run at 100%, the factory drowns in work-in-process
inventory. "Full capacity" and "100% efficiency" are myths.
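The first few bottleneck concepts listed above can be sketched in a few lines of Python. This is only a rough illustration, assuming a simple serial line in which every stage runs at a steady rate. The stage names and units-per-hour figures are made up for the example, and the sketch deliberately ignores the fluctuations that "the match game" is about.

```python
def bottleneck(stage_rates):
    """Return the name of the slowest stage, which constrains the whole line."""
    return min(stage_rates, key=stage_rates.get)


def system_throughput(stage_rates):
    """Units per hour the line can finish: no faster than its slowest stage."""
    return min(stage_rates.values())


# Hypothetical units-per-hour capacities for a three-stage line.
line = {"cutting": 120, "welding": 45, "painting": 90}
print(bottleneck(line), system_throughput(line))

# Speeding up a non-bottleneck does not speed up the system.
faster_painting = {**line, "painting": 200}
print(system_throughput(faster_painting))

# Speeding up the bottleneck does.
faster_welding = {**line, "welding": 60}
print(system_throughput(faster_welding))

# Slowing a non-bottleneck only matters once it becomes the new bottleneck.
slower_painting = {**line, "painting": 30}
print(bottleneck(slower_painting), system_throughput(slower_painting))
```

Changing any rate other than the smallest one leaves the result of `system_throughput` unchanged, which is the whole point of finding the bottleneck before spending money on speedups.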
Root Cause Analysis
Imagine this situation: An employee backs up your company's commercial
website, and the
website immediately goes down, stranding a hundred thousand customers.
What do you do?
Obviously you get the website back up, but that's the tip of the
iceberg. You need to find out WHY the website went down so that you can
prevent future occurrence. It's likely you'll use Root Cause Analysis.
Root Cause Analysis is backward looking detective work. It looks something like this:
- Significant event occurs
- Problem definition and initial data collection
- Task analysis
- Change analysis
- Control barrier analysis
- Begin the event causal factor chart
- Conduct interviews
- Determine root causes
- Recommend corrective actions
- Report conclusions
Task analysis is just what it sounds like -- analyze the task that failed (backup). What's involved with backup?
Change analysis involves analyzing the difference between the way the
task was performed when the problem occurred, and the way it's normally
performed.
Control barrier analysis is the analysis of control barriers. A control
barrier is a component or procedure in place to prevent a bad outcome.
Modern presses have safety mechanisms that shut off if a person's hand
is in the jaws. That's a control barrier. Your company might have a
policy that backups are not done until the web server has failed over to
the alternate server. That's a control barrier. Theoretically, if a
system had all necessary control barriers and all those control
barriers worked as designed, there would be no mishaps. In performing a
control barrier analysis you might find a control barrier missing, or
one that was ineffective.
An event causal factor chart is technical and beyond the scope of this
document.
The next step is to conduct interviews to find out what really happened
that day. This reduces the possibility of proceeding on false assumptions.
Armed with the task analysis, change analysis, control barrier
analysis, event causal factor chart and interviews, you're in an
excellent position to find out what went wrong, and fix things so it
doesn't happen again. Perhaps in this case, the person performing the
backup was inadequately trained, so he didn't fail over the website to
the alternate server before performing the backup, resulting in a crash
of the web server software. He didn't fail over to the alternate server
because he wasn't trained to do so. He wasn't trained to do so because
full training is scheduled only four times a year, so his training was
only unofficial training from his supervisor, and his supervisor forgot
to tell him to switch servers before backup.
The root cause is that the system failed to train him to fail over. You can
think of that as the four-times-a-year training control barrier
failing, or you can think of it as a missing control barrier for new
hires hired between trainings. To fix the root cause and prevent future
occurrence, you could either fix the existing control barrier by
specifying that any new employee get full official training immediately
upon hire, and not operate the servers until trained, or you could
install a new control barrier by creating an official training guide
and checklist for the supervisor to provide immediate training.
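The "he didn't because..." reasoning in the backup example can be pictured as a chain of causes followed back to its end. The sketch below is a simplification and not the formal method: a real event causal factor chart can branch, while this assumes one cause per finding. The function name and the wording of each finding are paraphrased for illustration.

```python
def trace_to_root(event, caused_by):
    """Follow 'caused by' links from an event until no deeper cause is recorded."""
    chain = [event]
    while chain[-1] in caused_by:
        cause = caused_by[chain[-1]]
        if cause in chain:  # guard against a circular chart
            raise ValueError(f"circular cause: {cause}")
        chain.append(cause)
    return chain


# Each finding maps to the cause uncovered for it (paraphrased from the example).
caused_by = {
    "website went down":
        "web server software crashed during backup",
    "web server software crashed during backup":
        "backup ran without failing over to the alternate server",
    "backup ran without failing over to the alternate server":
        "operator was never trained to fail over",
    "operator was never trained to fail over":
        "no official training for hires between the quarterly sessions",
}

chain = trace_to_root("website went down", caused_by)
for depth, finding in enumerate(chain):
    print("  " * depth + finding)  # indent each deeper cause
root_cause = chain[-1]
```

The last entry in the chain is the one worth fixing. Fixing anything earlier in the chain gets the website back up but leaves the next new hire in the same position.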
Method of Kepner and Tregoe
As a business decision support methodology, this methodology is
one of the most generic problem solving methodologies around, and looks
like this from a high level:
- Analyze the problem state
- Analyze the solved state
- Analyze the transition
- Analyze how to prevent future problems and find future opportunities
The best known technique of the Method of Kepner and Tregoe is its
"is/isn't analysis", used in problem state analysis, where questions
like these are asked and answered:
- Where is the problem occurring, and where is it definitely not?
- When did it start occurring, and when did it definitely not?
- Who has this problem, and who definitely does not?
Basically, the problem solver constructs a substantial series of such questions, using the 6 W's as a starting point:
- How much?
If you don't have abundant easy and safe test points, this is a very
powerful way of narrowing down a problem. If you have abundant easy and
safe test points, diagnosis by diagnostic test is quicker, more certain
and more accurate. That's
why this methodology is suboptimal for fixing most machines,
electronics, data systems and software.
With the methodology of Kepner and Tregoe, once you've analyzed the
problem domain you analyze the solved domain, using a prescribed set of
metrics. Solved state features are divided into wants and needs. Any
proposed solution not addressing every need is thrown out. The wants
are computed using metrics, and the list of possible solutions is
sorted by numerical value. As stated before, this is not necessary in
fixing machines, electronics, data systems and software, because the
one and only solved state is "performs as designed", and the one and only
transition path is "repair or replace the defective component".
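The needs-then-wants screening described above can be sketched roughly as follows. This is an illustration only: the option names, the weights and the 1 to 10 rating scale are all assumptions made up for the example, and the actual Kepner and Tregoe materials prescribe their own metrics, which this document doesn't detail.

```python
def rank_solutions(solutions, needs, want_weights):
    """Drop solutions that miss any need, then sort the rest by weighted want score."""
    def score(solution):
        return sum(weight * solution["ratings"].get(want, 0)
                   for want, weight in want_weights.items())

    # A solution is viable only if the set of needs it meets covers every need.
    viable = [s for s in solutions if needs <= s["meets"]]
    return [(s["name"], score(s)) for s in sorted(viable, key=score, reverse=True)]


# Hypothetical decision: every name, weight and rating here is made up.
needs = {"within budget", "available this quarter"}
want_weights = {"ease of use": 5, "vendor support": 3}
solutions = [
    {"name": "Option A", "meets": {"within budget", "available this quarter"},
     "ratings": {"ease of use": 8, "vendor support": 4}},
    {"name": "Option B", "meets": {"within budget", "available this quarter"},
     "ratings": {"ease of use": 6, "vendor support": 9}},
    {"name": "Option C", "meets": {"within budget"},
     "ratings": {"ease of use": 10, "vendor support": 10}},
]

ranking = rank_solutions(solutions, needs, want_weights)
print(ranking)
```

Option C has the best ratings but is thrown out because it misses a need, which is exactly the behavior the paragraph above describes. None of this machinery is needed when the only acceptable solved state is "performs as designed".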
The methodology of Kepner and Tregoe is great for decision analysis. It can be well suited to redesigning
machines, electronics, data systems and software because redesign is
really a business decision and requires analysis of all states.
However, for fixing broken machines, electronics, data systems and
software, this methodology is horribly suboptimal. Unfortunately, it is
sometimes sold as a troubleshooting tool for broken machines,
electronics, data systems and software, perhaps on the theory that you
can teach one process to help with both mechanical/electrical/data
problems and business problems. Don't fall for that. Teach the method
of Kepner and Tregoe for business decisions, the Universal
Troubleshooting Process for troubleshooting, and teach both to anyone
performing both types of tasks.
Cars and Tanks
Rush hour is frustrating. Ever wish you could drive a tank to work?
Imagine driving to work in an M1A1 Main Battle Tank, also called an
Abrams tank. It has tracks instead of wheels, and it can go absolutely
anywhere. It can roll over obstacles 42 inches high. It can cross trenches 9 feet
wide. It can go up a 60 degree slope. And due to its almost
impenetrable armor, its 120mm main gun, and three auxiliary machine guns, it can
survive the most hostile environments. An M1A1 main battle tank can go almost
anywhere on land, including the freeway. So why use a car to go to work, when
cars accommodate only a small subset of terrains?
One reason is that the M1A1 gets less than 1 mile per gallon of gas.
Working only 10 miles from home, you'd pay $300/week in fuel alone. On
those rare occasions when the freeway travels full speed, the M1A1's 45
mph maximum speed is a liability. At a length of 32 feet, one inch, a
height of 9.5 feet, and a width of 9.5 feet, parking is a problem.
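The fuel figure above can be reproduced with a quick calculation. The five-day work week and the $3.00 per gallon price are assumptions not stated in the text, and since the tank gets less than 1 mile per gallon, plugging in exactly 1.0 gives a floor on the cost rather than an estimate.

```python
def weekly_fuel_cost(miles_each_way, workdays, miles_per_gallon, dollars_per_gallon):
    """Cost of one week's round-trip commutes."""
    weekly_miles = 2 * miles_each_way * workdays
    gallons = weekly_miles / miles_per_gallon
    return gallons * dollars_per_gallon


# 10 miles each way comes from the text; 5 workdays and $3.00/gallon are assumed.
cost = weekly_fuel_cost(miles_each_way=10, workdays=5,
                        miles_per_gallon=1.0, dollars_per_gallon=3.00)
print(f"${cost:.2f} per week")
```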
There's no doubt the M1A1 can get you to work. But your friends
driving Ford Focuses get there faster, cheaper and more conveniently. Yes, the
M1A1 can go anywhere, but that ability is costly indeed.
Reminds me of companies selling general problem solving training to
those requiring electronic, mechanical or computer Troubleshooting.
Mechanical, electronic and computer troubleshooting is a subset of
problem solving. Machines and automated systems are well defined systems.
By that I mean they have a documented and well defined state and
behavior. Fixing them requires only returning them to their as-designed state
and behavior. You needn't analyze the solved state, with its heavy
design and creative thinking requirements. You needn't ask how you want
the machine to perform after repair -- you already know that. It must
perform as designed. You needn't ask if there's some better way you can do it.
All that's necessary is to get it back to its as-designed state and
behavior.
Some training vendors are all too happy to sell you a generic problem
solving course for your technical people to use on technical
problems. Such generic problem solving methodologies contain several time
consuming steps necessary only to design the solved state (which is built
into the as-designed state and behavior for machine, computer and
software problems). The vendor might justify this by mentioning that a
generic problem solving methodology can solve all problems, including those
of machines, computers and software. They're telling the truth, and it's
about as practical as trading in your car for an Abrams tank.
If you want to win, you go to war in a tank and to the office in a car.
If you want to win, you fix fuzzily defined problems with a generic
problem solving methodology, and technical problems with a Troubleshooting
Process optimized for technical problems. If a person solves both types of problems,
train him in both methodologies.
So the question you need to ask is this: How would it affect my
company if my competitors used more optimized Troubleshooting methodologies
than mine?
Deciding on Training
If you're partially responsible for decisions on who gets what
training, you have a tough job. This document hasn't told you how to
make such decisions, but perhaps it has given you a high
level framework with which to work with others in making training decisions.
The information in this document is presented "as is",
without warranty of any kind, either expressed or implied, including, but
not limited to, the implied warranties of merchantability and fitness for
a particular purpose. The entire risk as to the quality and performance
of the information is with you. Should this information prove
you assume the cost of all necessary servicing, repair, legal costs,
negotiations with insurance companies or others, correction or
In no event unless required
by applicable law or agreed to in writing
will the copyright holder, authors, or any other party who may modify or
redistribute the information, be liable to you for damages, including
general, special, incidental or consequential damages or personal injury
arising out of the use or inability to use the information, even if such
holder or other party has been advised of the possibility of such damages.
If this is not acceptable to
you, you may not read this information.
(C)2007 by Steve Litt. -- Legal