Tutorial "Mining Programs and Processes": Assignments

Arrangements

Send your solutions electronically (see instructions below) to Andreas Zeller <zeller@cs.uni-sb.de>
This e-mail address also can be used for questions.
Please turn in assignments by July 15, 2008.

Assignment 1: Simplifying Input

In this project, you write a simple tool that simplifies failure-inducing input. The idea is to use delta debugging to find a minimal input that causes an XML parser to fail.

This project requires you to program in Python. Learning Python is not hard, though; and even if you include the time it takes you to learn Python, you will still complete your work faster than in most other languages.

Read this first

Tutorial Slides
Chapter 5 "Simplifying Problems" of the "Why Programs Fail" book (Slides)
Paper: Simplifying and Isolating Failure-Inducing Input
Python documentation and tutorial

The failure

XMLProc is a small XML parser written in Python. It has a small defect. To exhibit the defect, follow these steps:

Download the XMLProc parser. It should unpack into a directory called xmlproc.

In the xmlproc dictionary, the file xpcmd.py is a command-line interface to the XML parser. To have the parser parse the XML file demo/nstest1.xml, type

$ cd xmlproc
$ python xpcmd.py demo/nstest1.xml
xmlproc version 0.70

Parsing 'demo/nstest1.xml'
Parse complete, 0 error(s) and 0 warning(s)
$ _

The input file demo/urls.xml causes the parser to fail:

$ python xpcmd.py demo/urls.xml
xmlproc version 0.70

Traceback (most recent call last):
  File "xpcmd.py", line 112, in ?
    p.parse_resource(sysid) [...]
  File "xmlproc/xml/parsers/xmlproc/xmlproc.py", line 401, in parse_charref
    if digs==None: return
UnboundLocalError: local variable 'digs' referenced before assignment
$ _

Your task

Use Delta Debugging to simplify the failure-inducing input. Proceed in eight steps:

Step 1: Write a testing function.

Start with a Python program that invokes the parser, as described above, and assesses the result. To avoid complications, you may wish to conduct this work in the xmlproc dictionary.

To invoke the parser, you have two options:

Invoke the parser as a separate program. Use Python's os module to execute the parser. You can use fork(), execv(), and wait() functions, as in Figure 5.11; however, for this particular task, system() or popen() are far easier to use. Python's built-in File Objects are useful in communicating with the parser via files.
Invoke the parser directly from Python. Take a look at the xpcmd.py source to see how this works. The essential lines are
```
from xml.parsers.xmlproc import xmlproc
	
app = xmlproc.Application()
p = xmlproc.XMLProcessor()
p.parse_resource(file)
```
In order to catch the exception thrown by the parser, be sure to read and understand exception handling in Python.

Take care to differentiate three outcomes:

The parser completely parses the file (PASS), without errors or warnings;
The parser fails as in the original failure (FAIL)
The parser has another outcome, in particular syntax errors (UNRESOLVED)

Step 2: Write a splitting function.

You can start with the split() function as described in the book (Figure 5.9).

In the long term, you may want to split the input along token delimiters, or even use syntactic simplification (Section 5.8.3).

Step 3: Attach Delta Debugging.

To complete Delta Debugging, you need an implementation of the listminus() function as described in the book (Figure 5.10). The listsets module gives you exactly this.

Finally, you need the ddmin() function from Figure 5.7. This ddmin module even comes with a self-contained test function.

Step 4: Choose a representation.

How do you represent the configuration that is to be minimized?

If you want to simplify input, each delta might be a single character—thus, the string "defect" becomes the list ['d', 'e', 'f', 'e', 'c', 't']. However, this will cause trouble as you cannot distinguish between the same character at different locations. For instance, listminus(['d', 'e', 'f', 'e', 'c', 't'], ['e']) = ['d', 'f', 'c', 't']—which is not what you want.
A better solution is to store characters with their indices, as in [('d', 1), ('e', 2), ('f', 3), ('e', 4), ('c', 5), ('t', 6)]—this way, each circumstance can be uniquely identified.
For a more general solution, you may wish to set up Circumstance or Delta classes and to store lists of these objects.

Step 5: Run it.

What is the simplified failure-inducing input in demo/urls.xml? Record the number of tests that Delta Debugging takes.

Step 6: Document it.

Be sure to have docstrings for every function which describe its purpose.

Have a README file that describes how to invoke the simplification program.

Step 7 (optional): Make it more general.

Convert your testing function into Python's unittest framework.

Generalize the delta debugging function such that it can be parameterized with an arbitrary unit test and an arbitrary input—that is, it becomes independent from XMLProc and can easily be adapted to other input minimization tasks. (As examples of such tasks, see the paper Simplifying and Isolating Failure-Inducing Input.) Eventually, you will have a general stand-alone framework; to have it solve the XMLProc problem, instantiate it with the XMLProc unit test and the failing input.

Step 8 (optional): Improve efficiency.

In the book, Section 5.8.3 sketches how to implement syntactic simplification. Using a Python XML parser such as Expat, split the input along the XML structure to narrow down the failure cause more rapidly. Validate the success by counting the number of tests.

Assignment 2: Mine a Bug Database

In this project, you write a simple tool that summarizes defect data as collected from the AspectJ project. The idea is to identify those components which have had the most defects in the past — such that quality assurance efforts would be redirected towards these components.

You do not interact with version and bug databases directly; instead, you operate on the iBugs data set which contains all the necessary data in XML format.

Read this first

Tutorial Slides
Chapter 4 "Predicting Bugs from History" of the "Software Evolution" book (PDF)
The iBugs project (www.ibugs.org) and the papers referred therein.
Learn how to process XML data in your favorite programming language.

The data

Download this archive. It contains

The AspectJ source code (in aspectj.org)
Defect data from iBugs (in repository.xml)

The repository.xml file has an entry for every single bug (enclosed in <bug>...</bug>).

Each bug, among other characteristics, lists the file(s) where the bug was fixed, enclosed in <fixedFiles>...</fixedFiles>. Each <file> has a name attribute listing the file that was fixed.

A property name="severity" attribute lists the severity of the bug, from enhancement over minor, normal, and major to critical.

Task 1: Find the most defect-prone files.

By parsing repository.xml, create a top 5 list of the most defect-prone files in AspectJ — that is, those five files in which the most defects were fixed. Ignore all enhancement bugs.

A common observation in software systems is Pareto's Law: 80% of the defects can be found in 20% of the locations. Is this true for AspectJ, too?

Task 2: Find the most defect-prone directories.

Every file is part of a directory, which represents a higher-level component of a system. By parsing "repository.xml", create a top 5 list of the most defect-prone directories in AspectJ — that is, those directories in which the most defects were fixed. Again, ignore all enhancement bugs.

Note: Be sure to count every bug only once per directory. That is, a single bug whose fix affects foo/a.c and foo/b.c should be counted only once for foo/.

Task 3: Hypothesize.

In your own words, give an assessment of what makes these files or directories the most defect-prone. Use the supplied source code and bug reports as additional information.

Please note: As the supplied AspectJ source code is just a snapshot, it does not contain every file referred in repository.xml, which records a multi-year version history.

Task 4 (optional): Where are the most critical bugs?

Repeat Task 1 and 2, above, but with using only bugs that are of critical severity. Do the results differ? Why?