December 8, 2005

"Whatever remains, however improbable..."

Every now and then, I hit a regression bug that exists in one version of a file (or project) and not in another, for reasons that are not easily deduced. It's maddening when diffs of a particular file offer no clues as to the cause of a bug.

I have a tried-and-true method of isolating such bugs:

  1. Take a known state where the desired behavior works (state A).
  2. Take a known state where the desired behavior is broken (state B).
  3. Determine a list of changes between state A and state B.
  4. In a copy of state B (state C), replace one of the changes in the list with the original code from state A.
  5. Compile and test state C for the desired behavior.
  6. If state C fails to compile, flag that change for further examination (to be revisited if we never find a state C that succeeds).
  7. While state C fails, remove the reverted change from the list of changes, and repeat steps 4 through 7.

Logically speaking, sooner or later you will eliminate all the incorrect causes of failure and arrive at the correct cause. Or, you will identify a much smaller set of possible regressions to look through.
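
A minimal sketch of steps 4 through 7, in Python rather than Perl. The names here are hypothetical: isolate_regression and the try_revert callback are made up for illustration, and try_revert is assumed to build state C for one change, compile and test it, and report the outcome. It also assumes each state C starts from a fresh copy of state B.

```python
from typing import Callable, Iterable, List, Optional, Tuple, TypeVar

Change = TypeVar("Change")

# Possible outcomes of building and testing one "state C".
PASS, FAIL, BUILD_ERROR = "pass", "fail", "build-error"


def isolate_regression(
    changes: Iterable[Change],
    try_revert: Callable[[Change], str],
) -> Tuple[Optional[Change], List[Change]]:
    """Revert one change at a time until the desired behavior returns.

    try_revert(change) is a hypothetical callback: it creates state C
    (state B with that single change restored from state A), compiles
    and tests it, and returns PASS, FAIL, or BUILD_ERROR.

    Returns (culprit, flagged): the first change whose reversal made
    the test pass (or None), and the changes whose reversal did not
    compile and so need a closer look.
    """
    flagged: List[Change] = []
    for change in changes:
        outcome = try_revert(change)      # steps 4 and 5
        if outcome == PASS:
            return change, flagged        # found a state C that works
        if outcome == BUILD_ERROR:
            flagged.append(change)        # step 6: examine later
        # FAIL: step 7, this change is ruled out; try the next one
    return None, flagged
```

If no single reversal passes, the flagged list is the "much smaller set" left to look through by hand.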

Now, could someone find or write a Perl script to generate the "state C" sources for me (and possibly invoke another Perl script to run the test)? In the first bug where I had this problem, the set of changes was assumed to be all directories one level below the mozilla source directory. In the second bug, the set of changes was the XBL methods and properties in a single XML file. It wouldn't be easy. But it would be doable. (I've written a Perl script for the first testcase, but it was somewhat hackish.)
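
For the directory-level case, generating state C could look roughly like the sketch below. It is in Python rather than Perl, and it is not the hackish script mentioned above. The function names and the layout (state A, state B, and state C as three separate directory trees) are assumptions made for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def candidate_subdirs(state_a: Path, state_b: Path) -> List[str]:
    """List every directory name found one level below either tree."""
    names = {
        entry.name
        for root in (state_a, state_b)
        for entry in root.iterdir()
        if entry.is_dir()
    }
    return sorted(names)


def build_state_c(state_a: Path, state_b: Path, state_c: Path, subdir: str) -> Path:
    """Make state_c a fresh copy of state B with one subdirectory from state A."""
    if state_c.exists():
        shutil.rmtree(state_c)            # start over from a clean copy
    shutil.copytree(state_b, state_c)

    target = state_c / subdir
    if target.exists():
        shutil.rmtree(target)             # drop state B's version
    source = state_a / subdir
    if source.exists():
        shutil.copytree(source, target)   # restore state A's version
    return state_c
```

Each name from candidate_subdirs would be passed to build_state_c in turn, with a compile-and-test run after each one. The XBL case would need something that understands the XML file rather than whole directories.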

Sir Arthur Conan Doyle, and I, would thank you.

Posted by WeirdAl at December 8, 2005 1:27 PM
Comments

Revision Control Systems make generation of your "state C" easy.

In CVS, you need either the date or the tag of C; then run "cvs update -D $date" or "cvs update -r $tag".

In SVN, you are more likely to have a revision number, so you can pass that to "svn update" with the -r option.

(From Alex: The situation I have in mind is when CVS and SVN are not helpful -- like, when someone dumped a boatload of changes into a checkin without keeping the history.)

Posted by: Chris Dolan at December 8, 2005 2:15 PM

Look up "delta debugging" "Andreas Zeller".

Posted by: Robert O'Callahan at December 8, 2005 4:40 PM

Git has "bisect", which takes a Good and a Bad commit and checks out a commit roughly halfway between them for you to test. Repeat as necessary until your last good commit is next to your first bad commit. Very cool.

Posted by: Randal L. Schwartz at December 11, 2005 2:14 PM