The Impact

No description of a problem should be considered complete without some explanation of the impact. Impact is simply the result of not solving the problem. Every statement of impact should have a cost, a timeline to realize the impact, and a likelihood or probability of realization. None of these are precise measurements, because measurement would only be valid after the impact was realized. These are informed conjecture.

Let me walk through an example:

In a manufacturing plant, a piece of equipment essential to making hottentot widgets breaks down. We are currently manufacturing to backfill our inventory and do not have any open customer orders for hottentot widgets. This piece of equipment is one of two that are capable of doing the required work, so I can continue to manufacture, but at half speed.

In this case, impact is only realized after I have sufficient customer orders to exhaust my inventory of hottentot widgets, and insufficient capacity to meet my customers' delivery expectations (causing a customer to cancel the order and place it with an alternate supplier). The immediate cost would be the profit on the canceled order. A subsequent cost might be that the customer would then favor the alternate supplier in a way that causes me a longer term loss of profitable business.

Given that I have three clients who purchase hottentot widgets regularly, my timeline to the immediate cost would be determined by the normal ordering schedule of those clients and my current inventory. The probability of this impact would be relatively high, say 95%, in that if I don't fix that machine, it will happen. The longer term cost might only happen with one client, and only after two missed ship dates, maybe with only a 60% probability. This second risk can be mitigated by buying the widgets from the competitor myself and selling them at a loss to keep my customer happy.

Let's say that the immediate cost is $10,000 and the timeline to realization is 2 weeks, with a probability of 95%. That is an easy impact to understand. If I have the machine fixed within 2 weeks, there is a strong likelihood that I can get by with no impact. Call the service company and schedule the technician. However, if by the middle of week 2 things have not been resolved, I might get a little excited. I would probably be calling the service provider daily or more frequently, escalating with their management, and potentially threatening to use a competitor if they can't meet my need.
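The three parts of an impact statement can be written down as a small record. The sketch below uses the example numbers ($10,000, 2 weeks, 95%); the names `Impact`, `expected_cost` and `urgency`, the probability-weighted cost, and the half-week escalation threshold are assumptions made for illustration, not part of any established method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Impact:
    """One statement of impact. Every field is an informed estimate."""
    description: str
    cost: float              # dollars lost if the impact is realized
    weeks_to_realize: float  # estimated time until the impact is realized
    probability: float       # estimated likelihood, between 0 and 1

    def expected_cost(self) -> float:
        # Probability-weighted cost: an assumed, rough way to compare impacts.
        return self.cost * self.probability


def urgency(impact: Impact, weeks_elapsed: float,
            escalate_within_weeks: float = 0.5) -> str:
    """Suggest a posture based on how much of the timeline is left.

    The half-week escalation threshold is a hypothetical default.
    """
    weeks_left = impact.weeks_to_realize - weeks_elapsed
    if weeks_left <= 0:
        return "too late"
    if weeks_left <= escalate_within_weeks:
        return "escalate"
    return "schedule the fix"


lost_order = Impact("canceled widget order", cost=10_000,
                    weeks_to_realize=2, probability=0.95)
print(lost_order.expected_cost())
print(urgency(lost_order, weeks_elapsed=0))
print(urgency(lost_order, weeks_elapsed=1.5))
```

On day one the sketch suggests scheduling the technician; by the middle of week 2 it suggests escalating, which matches the behavior described in the example.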

Understanding the impact in terms of cost, timeline and probability is important to assessing the urgency of a solution. Together they tell me when I need to get out my cape and tights, and when even that is too late. More than anything, they allow me to manage my customers' and my management's expectations and to react to their concerns in ways that build confidence and credibility.

The Symptom

Problem solving is done better when the symptom is articulated separately from the cause and the impact. The symptom should be articulated as the experience of the person reporting the problem, and his or her opinion about where the process/system failed to meet his or her expectations. There may be some steps leading up to the failure, and maybe an outcome (what happened after the failure).

Analysis of what is necessary to reproduce the symptom is valuable. Validation that the symptom can be reproduced by following some steps is extremely valuable.

The context, or business process that the reporter was executing, is important. A description of what should have happened instead is also valuable. These are part of the symptom.

— the cause is separate from the symptom. When reporting the problem, speculation with respect to the cause is simply that, speculation.
— the impact is separate from the symptom. When reporting the problem, the impact is only to be understood in the context in which the problem was reported. 

The Problem

Problem solving is difficult. The good news is that problem solving is domain independent, so good problem solving skills can be applied to any problem. They can be applied to business process problems, to technical software problems, organizational problems, anything that presents itself as a system, for which a potential customer can ask you to solve a problem.

Perhaps the most difficult part is understanding enough about the problem to ensure that any potential solution will be effective. Your customer doesn't necessarily have a complete understanding of the symptoms. Your technical team doesn't necessarily have a complete understanding of the causes. Your customer relations group doesn't necessarily have a complete understanding of the impact. Yet those three aspects of the problem are required to assess potential solutions.

To get a complete enough understanding of the symptoms, you should be able to reproduce the symptom. If the symptom occurs inconsistently or unpredictably, it is much harder to tell whether you have addressed it. At the same time, an understanding of the symptom, including the circumstances under which it can be reproduced and divorced from any speculation about the cause, is an essential aspect of the problem definition.

With a good understanding of the symptom, you can document the impact of the problem. What does it mean to the individual customer, and how frequently or how likely does it happen? What does it prevent the customer from doing? How many customers is it likely to affect? Are there any opportunities to work around the problem to reduce or remove the impact? What is the potential for customer relationship impact (are you going to lose customers)? Sometimes the identity of the customer alone is an impact. When your customer is a public figure or an organizational leader, the reputational impact is greater.

If your impact is a risk (no real damage has been done yet), what is the timeline on the realization of that risk? Is it end of day? End of billing cycle? How long is the fuse on this bomb? Does the customer have a business event that is impacted? When is that business event? (Do you routinely ask the customer those questions?)

When you have a complete enough understanding of the symptom, you can isolate the causes of the symptom easily enough. For symptoms that appear inconsistent, it may be the case that several causal factors are required to align, rather than an individual cause. Isolating these individual causes is the challenge.

Understanding the causes then exposes a more comprehensive impact statement. What other symptoms might result from the issues that are causing the currently known symptoms? What other potential impacts does this problem have, based on these new possible symptoms and the understanding of the causes?

A solution that addresses the symptom without reducing or eliminating the impact merely "pushes the bubble." A solution that addresses symptoms without mitigation at the cause increases the complexity of the system and its long term maintenance or operating costs. A good solution eliminates the impact, not necessarily the symptoms, without increasing the complexity or reducing the sustainability of the system.

To solve a problem well, it is necessary to understand the symptoms, the causes and the impact in isolation from each other, as well as in relation to each other. A well documented problem statement has these three elements, clearly described, and also explains how they are related.

Truth is, most people and organizations waste a lot of time because they start to work on a solution before they have a good grasp on the problem. If you cannot explain the problem to others, you definitely should not be working on the solution. If you start to solve every problem by attempting to articulate the problem in this way, you will likely be regarded as a genius at problem solving. You will save your company time and money and quickly become a star performer.

Behavioral Taxonomy

When developing requirements for a business process in which there are valid process variants, one usually describes the process variants as a behavior. When modeling these variants, it is useful to consider each aspect of a process that has variants, and isolate unique behaviors based on some decision framework. Both the distinct list of behaviors (behavioral taxonomy) and the attributes driving the decision framework (driver mapping) are important to the model.

The list of behaviors is important, because the words that identify each behavior become important words in the language used to describe your business process. When you talk about these behaviors, you all (technicians and subject matter experts) can use the same unambiguous terminology to describe the process.

Sometimes in the business process documentation, the language is about the steps and the business rules that govern each step. At a higher level of abstraction, we benefit from aggregating these business rules into patterns, or behaviors that have some cohesion around a small number of attributes or facts. When we observe these patterns, we can simplify the language around the business process, by naming the distinct patterns as specific behaviors, and identifying them with a business driver as observed in the attributes.

When we isolate and name the patterns as specific behaviors, we also can understand the data elements or attributes and values that drive the decision framework. This mapping of data elements and values to select a behavior is also part of the requirements, as it is important to ensure that all valid values for each attribute are considered in the behavior selection.

More complex processes may have several different aspects that are governed by distinct behavioral taxonomies, and isolating these taxonomies from each other is important. Sometimes we try to render a single behavioral taxonomy that governs a process and find that we cannot easily recognize the behaviors. We then actually have a more complex case, where there are nested behavior variants, or non-correlated (independent) behaviors governing several aspects of a process. In these cases, we isolate the individual lists or taxonomies of behaviors, then review them against each other to determine whether relationships exist between taxonomies. Those relations can be classified as governing hierarchies (where the available selections in one taxonomy are limited by the selection in another "governing" taxonomy) or incidental constraining relations (where, as it happens, the selection of a behavior in one taxonomy either requires or invalidates one or more behaviors in other taxonomies, but those constraints are not imposed exclusively in any one direction between two taxonomies).

Clearly identifying the distinct behaviors of the business process, and the data elements or attributes that can be used to select each behavior supports a good requirements modeling practice.
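A behavioral taxonomy and its driver mapping can be captured directly as data. The sketch below is a minimal illustration under assumed names: the `Fulfillment` behaviors, the `in_stock` and `made_in_house` driver attributes, and the mapping itself are hypothetical and stand in for whatever a real business process would supply.

```python
from enum import Enum
from itertools import product


class Fulfillment(Enum):
    """Behavioral taxonomy: the named behaviors (hypothetical example)."""
    SHIP_FROM_STOCK = "ship from stock"
    MAKE_TO_ORDER = "make to order"
    DROP_SHIP = "drop ship"


# Driver attributes and every valid value for each (assumed for illustration).
DRIVERS = {
    "in_stock": (True, False),
    "made_in_house": (True, False),
}

# Driver mapping: which combination of attribute values selects which behavior.
DRIVER_MAP = {
    (True, True): Fulfillment.SHIP_FROM_STOCK,
    (True, False): Fulfillment.SHIP_FROM_STOCK,
    (False, True): Fulfillment.MAKE_TO_ORDER,
    (False, False): Fulfillment.DROP_SHIP,
}


def select_behavior(in_stock: bool, made_in_house: bool) -> Fulfillment:
    """Pick the behavior for one set of driver values."""
    return DRIVER_MAP[(in_stock, made_in_house)]


def unmapped_combinations() -> list:
    """List valid driver value combinations that select no behavior."""
    return [combo for combo in product(*DRIVERS.values())
            if combo not in DRIVER_MAP]


print(select_behavior(in_stock=False, made_in_house=True).value)
print(unmapped_combinations())
```

The `unmapped_combinations` check reflects the requirement that all valid values of each attribute be considered in behavior selection: an empty list means the mapping is complete.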

Semantic Clarity

One of the most difficult aspects of gathering and documenting software requirements is to get all of your stakeholders and project team participants to agree on terminology, especially words commonly used in the business domain. Common usage is never very precise, and often business problems arise from ambiguities in common usage. In developing a conceptual model of the business domain, it is necessary to define key metaphors very precisely, to remove these ambiguities, and allow software to implement logic that is correct.

There are two key types of metaphor ambiguities: semantic diversity and multiple identity.

Semantic Diversity – when a key metaphor has different properties in different contexts. This presents itself as a word that means different things in different aspects of the domain.

Multiple identity – when multiple metaphors actually share the same meaning or identity. This presents as different words that are used in different aspects of the domain to mean the exact same thing.

There is one other pattern that we see, which is called contextual revelation. Contextual revelation is when a metaphor presents different attributes in different aspects of the domain. In this case, it is the same metaphor, but different information is relevant to different parts of the business process. This often presents itself as if we have different metaphors, that have a mandatory one to one relationship with each other.

In some business domains, certain aspects appear optional, in that not all aspects are in play for every situation. Here, the contextual revelation pattern may present as the metaphor having behavioral variants, and one way to disambiguate is to derive the behavioral taxonomy. Simply put, this can be accomplished by determining the simple list of behavioral variants that are needed to support the business process, and mapping the driver attribute(s) used in determining which behavior is selected.
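The two ambiguity types defined earlier in this section, semantic diversity and multiple identity, can be checked for mechanically once term usage has been recorded. The sketch below assumes a hypothetical list of (term, context, concept) records gathered from stakeholders; the sample terms and both function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical usage records: (term, context where it is used, concept meant).
USAGE = [
    ("account", "billing", "billing_account"),
    ("account", "sales", "customer"),
    ("client", "support", "customer"),
    ("customer", "sales", "customer"),
]


def semantic_diversity(usage):
    """Terms that mean different things in different contexts."""
    concepts_by_term = defaultdict(set)
    for term, _context, concept in usage:
        concepts_by_term[term].add(concept)
    return {term: sorted(concepts)
            for term, concepts in concepts_by_term.items()
            if len(concepts) > 1}


def multiple_identity(usage):
    """Concepts that are referred to by more than one term."""
    terms_by_concept = defaultdict(set)
    for term, _context, concept in usage:
        terms_by_concept[concept].add(term)
    return {concept: sorted(terms)
            for concept, terms in terms_by_concept.items()
            if len(terms) > 1}


print(semantic_diversity(USAGE))
print(multiple_identity(USAGE))
```

In this sample, "account" shows semantic diversity (two concepts), while the concept "customer" shows multiple identity (three terms). The hard part in practice is agreeing on the concept column, which is exactly the precise definition work this section describes.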

Transposition of the Familiar

Have you ever had the experience of having someone explain something new to you, and somewhere along the way, your brain replaced something new with something familiar? So when you went to the next step, to work on the new thing, you immediately became confused and either did the wrong thing, or did something that made no sense. You unknowingly transposed something familiar for something new without proving that they are identical. Call it a synaptic misfire, or just mental laziness; it happens with alarming frequency.

Transposition of the familiar is an antipattern of knowledge acquisition. In knowledge acquisition, one is required to assimilate new information. New information must be analyzed, and assigned to categories and taxonomies. The antipattern happens when you identify the category of a new fact, but transpose attributes of familiar facts from the same category onto the new fact, even when they are not really valid. The worst instance of this is when you ignore the new attributes, preferring the transposed attributes.

This is extremely common in learning new software development paradigms (languages, patterns, and related constructs), when the developer sees a logic construct, like an if-then-else or loop construct, in a new language and assumes that it has the properties of a similar construct in a more familiar language. For this reason, until the new language is completely internalized, it is good for the developer to be completely immersed in it.

The same thing is true of entities in a new business domain model. Our brain automatically pulls out the terms we recognize and substitutes familiar metaphors. When the metaphors in the new business domain differ from the familiar, those transpositions introduce errors or at minimum inconsistencies in the model.

This is an antipattern that can be avoided through diligence. Understanding the prevalence of the antipattern is the first and easiest preventative measure.

Another preventative is immersion – you immerse yourself in the new paradigm (whatever that is) and force yourself to unlearn everything else for a period of time, until the new paradigm sticks.

A third preventative is to engage others who have diverse backgrounds (and therefore familiarities) and collaborate with them to embrace the new paradigm together, holding each other accountable for adherence to the new.

The last and potentially most effective preventative is to engage someone who is expert in the new to assess whether you are "getting" it. This can be expensive if you end up hiring a trainer, or coach or bringing in experts from the outside of your organization.

Many times, when organizational management wants to implement a transformation to a new paradigm, this last preventative is the approach they take first. But this approach is limited when it is used to "push the paradigm": like all the other approaches, it assumes that a desire to assimilate the new paradigm already exists in the person who is trying to assimilate it. Knowledge acquisition is, by definition, a pull activity.

Document Driven Design

Software design is a process. It is a process of taking one abstraction – a business oriented model, and constructing a different abstraction – a technology oriented model.

Since there are many different technology paradigms, the technology oriented model differs based on the paradigms selected. The business oriented model differs by business, and the problem or value proposition. Part of every good design process is to decide how to represent or document each of those models.

For the most part, software design methodologies have been selected because they help the developers represent the models that are in play, in ways that reflect the technology paradigms that are in use. Each methodology has recommended some set of artifacts and a sequence for developing, reviewing and revising those artifacts (e.g. flowcharts, dataflow diagrams, UML class diagrams, data dictionary, data model, etc.).

Enter the antipattern:
In my experience, different organizations have used these methodologies and produced these diagrams with varying amounts of rigor. Somewhere along the way, auditors who are charged with proving that capital spent on software development actually resulted in a depreciable software asset decided that reviewing the artifacts of the software development life cycle (SDLC) was a good way to do this.

In publicly traded corporations, these audits are now required by law. Since the enactment of this law, known as Sarbanes-Oxley, many large corporations that develop software products to be used as assets in the operation of their business require that the documentation produced by their SDLC be designed to pass these audits. Many of them have produced templates that are designed more to pass these audits than to help software developers capture the essence of the abstract models and prepare to build software.

When developers who originally had less rigor in their design practice are confronted with these document templates, their already ad hoc design process simply conforms to the template they are presented with. When a process conforms to a documentation template without anyone contemplating whether an appropriate methodology is in place, or whether the template logically represents the essentials of the specific design being contemplated, we have an anti-pattern.

Document driven design betrays a focus on meeting audit requirements through our design documentation, rather than on articulating what it means to produce software capabilities. Similar antipatterns can manifest themselves in any aspect of our practice where process compliance is more important than work product. Management's insistence on process compliance as a means to improve quality moves us rapidly towards least common denominator solutions. Process compliance is inherently measurable, and its failures are easy to detect. It cannot, however, help you detect opportunity cost, missed innovations, or other work product that was ignored while maintaining process compliance.

Document driven design and similar template driven practice antipatterns often use the phrase "Best Practice" to rationalize their existence. But they tend to institutionalize a one-size-fits-all mindset, and as with any best practice, what is best for some may not be suitable for all.

I am not suggesting that we eliminate design documentation, nor am I suggesting that we should not use document templates of some kind. Design doc is important both to clearly articulate design decisions we have made, and to help those who come behind understand how we got here. Templates are a useful product, for any repeatable process. However, templates must be aligned with the established repeatable practice, and if no practice is established, then all templates are simply taking the place of established repeatable practice. Those who create a document template should publish a process guide, to explain how their template aligns with a specific practice, or how it can be used by practitioners of alternate practices. The point of templates is to give someone responsible for documentation a leg up, or a head start. But often we are just providing an easy out, or an alternate version of chaos.

The next time you are confronted with a set of process documentation templates, ask yourself if you know how they align with the process you are performing. If you don't and are tempted to discern the process from the templates, then you are entering a template driven practice antipattern… proceed with caution.

A Template That Inspires Innovation

After adopting a new design process philosophy, and implementing a trial project using it, a design anti-pattern has presented itself.

Document Driven Design – this anti-pattern is the practice of allowing whatever document template is mandated to drive the actual process or practice of design.

It is common in the enterprise space, especially post Sarbanes-Oxley, where such documentation is as essential for passing regulatory audits as it is for constructing application systems. Sometimes enterprise management gets confused as to which is the primary objective. While I haven't read the federal register to determine what actually is required, many organizations have followed the consulting firm model, and developed a documentation framework for software projects that is mandatory.

I would expect that organizations that are towards the higher end of the capability maturity model use templates that are reflective of their already mature process, but organizations that are more steeped in chaos may be tempted to produce document templates that are an amalgamation of several different processes, thinking that those responsible for authoring those documents will know how to complete them. I have seen that this is not the case, and it can cause the aforementioned anti-pattern, that is reflected in the following statement: Design is complete, when the document is done.

As I said, I realized this after trialing a new design philosophy. The new philosophy has five steps: initiate, decide, articulate, analyze, review. Articulate is the step where the first draft of formal design documentation is created. The team had done fairly well following the requirements of the philosophy until this point, and when we got there, we tripped over the template. Boom! Instead of figuring out what needed to be articulated, we attempted to cram all of the decisions we had made into the existing template and became confused, and the result was frustration. We got to the review step, and questions and accusations were flying. I had a great deal of trouble figuring out what went wrong.

What was the problem? Template thinking trumps philosophy! Why? I suppose that this is an instance of transposition of the familiar.

Now I realize that I have a challenge: to support our new design philosophy, I need to construct an "articulation template" that can inspire innovation and help the team with the remaining steps of analysis (which in my philosophy is impact analysis) and review.

The Solution

The notion that there is a single solution to any problem is a fallacy. There may be a solution to an equation, but every problem has more than one possible solution.

We learned in math that there is one right answer for every arithmetic "problem". But in fact even that is fallacious, because we can represent that answer correctly in many forms. 1/2 = 2/4 = 0.5 = 50%, etc… We also learned in math that the teacher expected us to show our work. Because the exercise was not to get the answer (that was in the back of the textbook) but to learn the method.

In the real world, when we have a problem, it is more likely to be a "word" problem, and if there is math behind it (rather than boolean logic) we need to represent the answer, or solution, in a form that fits whatever we are going to do with it. We don't get credit for doing the right method, or showing our work (unless you are building a repeatable process, that others will follow) – we get the answer and move on.

In the real world, there are always multiple paths to a solution, and if the problem has any degree of complexity, it is the fastest, least cost, least effort, optimized path that is valued. In the real world, sometimes a quick approximation is more valuable than a 100% certainty. In the real world, the need changes faster than we can solve problems, so sometimes a quick fix is more valuable than a perfect solution.

In the real world, knowing what the likely points of failure are within the solution, what the probability of experiencing those failures is, and what external events would trigger them is as important as knowing how to construct the solution itself.

In the real world, understanding the problem, the impact of the problem, and the timeline for realizing that impact is as important to the solution as the solution itself.

In the real world, there are almost always solution options. Sometimes the right answer is to solve the problem multiple times.

1) A quick fix to manage the risk – in hours to days (you may call this a workaround, a band-aid, or a hack)

This is like applying a tourniquet. It is good for stopping the bleeding, but only for a short period of time; otherwise we will lose a limb. This solution incurs technical debt, as this fix will need to be unwound, and soon.

2) A more thorough solution to provide more of whatever attribute needs to be increased – in days to weeks (this might be a well structured hardening or bullet-proofing exercise, or a non-behavioral system change, or a behavioral change to accommodate new real world conditions)

This can be a long term fix, but usually adds complexity at the expense of continual maintenance. Every new project that has to change this thing will need to contemplate this complexity. Enough of these in our system and change becomes difficult; beyond a certain point it is better to start over than to fix. This solution will last for a long time, but may make us despise our own handiwork after a while. The benefit here is that we keep the change inside the bounds of our own control.

3) A correction at the root cause of the problem – in months to years, and may require agreement/negotiation with multiple stakeholders, change to work process, legal documents, and other elements outside of the direct control of those who are impacted by the problem (this is what we always want to do, but may require changes to human behavior, customer expectations, etc. that require a much greater planning effort to realize)

This change requires organizational management attention. If the root cause is outside the bounds of our control, or if the consequences of the change exert influence beyond the bounds of our control, we need to negotiate, and exert influence to get others (systems, teams, business units, customers) to accept the net impact of this change.

How many times do we stop after the first, or especially the second solution, and how much complexity is inserted and maintained because we do not go the distance to solve the root cause? How many times have we moaned that "There is never time to do it right, but there is always time to do it over…"?

A Framework

The first step in problem solving is root cause analysis. Identifying the root cause requires a framework. A simple set of questions, the answers to which allow you to rule out probable causes, so that you can investigate only the probable causes you can't rule out. I have watched as team members exercise a brute force framework, investigating each identified cause to conclusion in sequence, instead of using a more optimized approach. I have observed colleagues spend hours investigating problems based on a single assumption about the root cause of the problem, only to realize that their assumption was incorrect, and they could have proven it in 15 minutes.

Try the following framework:

1) Carefully document the symptom that is manifest.
— make sure that you separate the symptom from any speculation as to cause.

2) Do you understand the system or portion of the system where the symptom is manifest?
— if not then engage someone who does – knowledge of the system is essential.

3) List out as many probable causes of the problem as you can think of.

4) For each probable cause, find the simplest way to prove that it is not the cause.
— ask yourself the question – if this were the cause, what else would need to be true?
— ask yourself the question – if this were the cause, what could not possibly also be true?

5) Select some optimized order or sequence to disprove
— the goal would be to disprove the most likely first, or to let the work of each proof build on the last – whichever feels more efficient.

6) Disprove each probable cause

7) The remaining probable causes must be investigated

8) Do not stop at one probable cause.
— a symptom can be the result of more than one cause.
— an occurrence of a symptom can be the result of exactly one cause.
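Steps 3 through 8 of the framework can be sketched as a small routine. Everything here is hypothetical: the `Cause` record, the sample symptom (an empty nightly report), the `facts` dictionary standing in for real checks, and in particular the ordering rule (likelihood per minute of effort), which is only one assumed way to choose an "optimized order".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cause:
    name: str
    likelihood: float                 # rough guess, between 0 and 1
    minutes_to_disprove: float        # estimated effort of the simplest disproof
    is_ruled_out: Callable[[], bool]  # returns True when the cause is disproved


def remaining_causes(causes: list) -> list:
    """Try to disprove every probable cause, cheapest likely ones first.

    Returns the names of causes that could not be ruled out. It does not
    stop at the first survivor, since a symptom can have several causes.
    """
    ordered = sorted(causes,
                     key=lambda c: c.likelihood / c.minutes_to_disprove,
                     reverse=True)
    return [c.name for c in ordered if not c.is_ruled_out()]


# Hypothetical symptom: the nightly report is empty.
# These observations stand in for real log and database checks.
facts = {"job_ran": True, "source_rows": 0, "filter_date_ok": True}

causes = [
    Cause("scheduler did not run the job", 0.5, 5,
          lambda: facts["job_ran"]),
    Cause("source table is empty", 0.3, 10,
          lambda: facts["source_rows"] > 0),
    Cause("report filter uses the wrong date", 0.2, 30,
          lambda: facts["filter_date_ok"]),
]

print(remaining_causes(causes))
```

Only the causes that survive every disproof are left for full investigation, which is the point of the framework: spend minutes ruling causes out before spending hours chasing a single assumption.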