Frequency Information Enhances the Value of Trouble Reports
Have you witnessed that discussion between a software developer and a call centre associate in which the developer wants no duplicate entries in the bugs database, while the associate points out that the reports come from two different important customers? It is almost as if they are talking past each other.
For the developer, whether an issue is reported by one person or by many, the fix takes the same amount of effort. Since the developer usually needs to answer the question of when the work will be done, which in turn requires figuring out how many distinct things need to be fixed, removing duplicates is the appropriate thing to do.
For the call centre associate, duplicates are an indication that multiple customers are impacted. In addition, when talking with a specific customer, the associate really needs the record of that customer’s previous interactions, so that they can form a picture of that customer’s experience. Therefore, the associate must keep a copy of all of the duplicates, perhaps with a link to the “canonical” bug report, which is usually located in a different database.
There is an additional cost to identifying and grouping duplicates: first one has to find the previous instance, and that is not a trivial thing to do. In general, determining that two problems have the same root cause requires a specialized effort similar to that of debugging. In some cases, it is possible to automate the identification. For issues that leave traces in system logs, for example, one can develop simple pattern matchers (complicated ones are possible too, but simpler is usually sufficient, and thus better).
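Such a pattern matcher can be quite small. As a sketch, assuming a hypothetical set of known-issue signatures (the issue ids and regular expressions below are invented for illustration), counting log occurrences might look like:

```python
import re
from collections import Counter

# Hypothetical known-issue signatures: each maps an issue id to a
# regular expression matching the tell-tale line in the system log.
KNOWN_ISSUE_PATTERNS = {
    "DB-1042": re.compile(r"connection pool exhausted"),
    "UI-2210": re.compile(r"render timeout after \d+ms"),
}

def count_known_issues(log_lines):
    """Count how often each known issue's signature appears in the log."""
    counts = Counter()
    for line in log_lines:
        for issue_id, pattern in KNOWN_ISSUE_PATTERNS.items():
            if pattern.search(line):
                counts[issue_id] += 1
    return counts
```

Each match is an occurrence, so the resulting counts are exactly the frequency data discussed below, gathered without anyone having to triage duplicates by hand.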
When I look at this situation I see two kinds of work. One is the primary work of debugging and fixing the root issue. The other is the downstream impact of not yet having addressed the issue. The more times an issue has been reported, the larger that impact, so a first, simplified model is one where the impact is directly proportional to the frequency of occurrence. A more sophisticated model assigns different burdens depending on the source of the reports, starting with differentiating user-reported issues from those found internally.
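Both models fit in a few lines. The weights below are purely illustrative assumptions, not values from any real project; with all weights equal, the score reduces to the simple frequency-proportional model.

```python
# Illustrative weights (an assumption): a customer-reported occurrence
# is taken to burden the organization more than one found internally.
SOURCE_WEIGHTS = {"customer": 3.0, "internal": 1.0}

def issue_impact(report_sources):
    """Estimate the downstream impact of one unfixed issue.

    report_sources has one entry per report, duplicates included.
    Each report contributes a weight based on where it came from;
    with equal weights this is just the frequency of occurrence.
    """
    return sum(SOURCE_WEIGHTS.get(source, 1.0) for source in report_sources)
```

Because the duplicates are the input here, this is one concrete reason the associate’s instinct to keep them is right: throwing them away destroys the very signal the model needs.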
The frequency of occurrence is not always available. Many, if not most, of the otherwise valuable quality assurance techniques either do not provide this information or provide distorted numbers. The reason is that the frequency depends not just on the code itself, but also on the usage patterns, and there are many usage patterns: those exercised by the various testing scenarios and simulations, for example. Certainly, one can make use of such frequency data, if available. However, the most valuable frequency data are those obtained when the software is used by the final customers.
In early development, it is rare to have frequency data. Later on, however, as the software starts to be deployed, be on the lookout for opportunities to obtain it. Those who can use their own software the same way their customers will are at a huge advantage: they can get data quickly, even before the software is released. It is not always possible to be in this desirable situation, but perhaps your beta testers will allow you to gather this technical usage data. Your testing efforts might also provide data, but recognize that testing scenarios distort the usage patterns (that is partly the point: they need to catch rare events).
There is a certain cost to obtaining the frequency data; however, there is already a cost from having to deal with the presence of the issue (support cost, managing the issues list, and so on). In addition, it may very well be possible to automate the collection (just be aware of concerns like privacy).
Once you have frequency data, it becomes easier to determine which changes will have the largest positive impact on the users’ experience, and to give those changes a correspondingly higher priority.
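At its simplest, that prioritization is just a sort over the frequency data. A minimal sketch, with hypothetical issue ids:

```python
def prioritize_by_frequency(frequencies):
    """Order issue ids so the most frequently reported come first.

    frequencies maps an issue id to its observed number of reports
    (or to a weighted impact score, if one is available).
    """
    return sorted(frequencies, key=frequencies.get, reverse=True)
```

In practice one would combine the frequency with an estimate of the fix effort, but even this bare ordering already answers the question the developer and the associate were arguing about: which report to act on first.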