Andreas Dautovic: Automatic Measurement of Software Documentation Quality, Doctoral thesis, Department of Business Informatics - Software Engineering, Johannes Kepler University Linz, September 2012.


The quality of a software product is a significant driver of its success. However, most applied quality assurance methods focus mainly on the executable source code, and quality reviews of the software documentation are often omitted. Nonetheless, software documents such as requirements specifications, design documents, or test plans are essential parts of a software product. The quality of such documents therefore considerably influences the overall quality of a software product.

The results of an empirical study show that quality attributes such as Accuracy, Clarity, Consistency, Readability, Structuredness, and Understandability of software documentation are considered particularly important in practice. Generally accepted methods for measuring these quality attributes are usually based on manual reviews. However, because of the resources that reviews require, these methods are rarely applied to software documentation in practice. Although the study results show that there is a basic demand for automatic measurement of software documentation quality, tools that would support this process are hardly used.

In this thesis I present an approach that enables the automatic measurement of software documentation quality. To this end, generally accepted documentation practices are formulated as automatically measurable rules, whose violations indicate problem spots in software documents, which in turn directly affect the documentation quality. To support the automatic identification of such problem spots, I developed a Java framework for documentation quality defect detection, which allows predefined rules to be applied to software documents of various types and formats. Furthermore, the presented approach includes the automated import of such violations into a documentation quality model that I developed, in order to determine the impacts of rule violations on different quality attributes and to regard software documentation from a quality model perspective.
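
The rule-based idea described above can be illustrated with a minimal Java sketch. It is not the API of the framework developed in the thesis, which this abstract does not describe. All names (DocRuleSketch, Rule, Violation, BrokenReferenceRule, LongSentenceRule), the "[anchor:ID]" / "[ref:ID]" notation, and the assignment of rules to quality attributes are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DocRuleSketch {

    // Subset of the quality attributes named in the abstract.
    enum QualityAttribute { ACCURACY, CONSISTENCY, READABILITY, STRUCTUREDNESS }

    // A detected problem spot. Which attribute a rule affects is an assumption of this sketch.
    static final class Violation {
        final String ruleId;
        final int lineNumber; // 1-based line in the document
        final String message;
        final QualityAttribute affects;

        Violation(String ruleId, int lineNumber, String message, QualityAttribute affects) {
            this.ruleId = ruleId;
            this.lineNumber = lineNumber;
            this.message = message;
            this.affects = affects;
        }
    }

    // Hypothetical rule contract: inspect the document lines and report violations.
    interface Rule {
        List<Violation> check(List<String> lines);
    }

    // Every "[ref:ID]" must point to an "[anchor:ID]" defined somewhere in the document.
    static final class BrokenReferenceRule implements Rule {
        private static final Pattern ANCHOR = Pattern.compile("\\[anchor:(\\w+)\\]");
        private static final Pattern REF = Pattern.compile("\\[ref:(\\w+)\\]");

        @Override
        public List<Violation> check(List<String> lines) {
            Set<String> anchors = new HashSet<>();
            for (String line : lines) {
                Matcher m = ANCHOR.matcher(line);
                while (m.find()) anchors.add(m.group(1));
            }
            List<Violation> result = new ArrayList<>();
            for (int i = 0; i < lines.size(); i++) {
                Matcher m = REF.matcher(lines.get(i));
                while (m.find()) {
                    if (!anchors.contains(m.group(1))) {
                        result.add(new Violation("broken-reference", i + 1,
                                "Undefined anchor '" + m.group(1) + "'", QualityAttribute.CONSISTENCY));
                    }
                }
            }
            return result;
        }
    }

    // Flags sentences with more than maxWords whitespace-separated words.
    static final class LongSentenceRule implements Rule {
        private final int maxWords;

        LongSentenceRule(int maxWords) { this.maxWords = maxWords; }

        @Override
        public List<Violation> check(List<String> lines) {
            List<Violation> result = new ArrayList<>();
            for (int i = 0; i < lines.size(); i++) {
                for (String sentence : lines.get(i).split("[.!?]")) {
                    String trimmed = sentence.trim();
                    if (!trimmed.isEmpty() && trimmed.split("\\s+").length > maxWords) {
                        result.add(new Violation("long-sentence", i + 1,
                                "Sentence exceeds " + maxWords + " words", QualityAttribute.READABILITY));
                    }
                }
            }
            return result;
        }
    }

    // Applies all rules in order and collects their violations.
    static List<Violation> runRules(List<? extends Rule> rules, List<String> lines) {
        List<Violation> all = new ArrayList<>();
        for (Rule rule : rules) all.addAll(rule.check(lines));
        return all;
    }

    // Very simple stand-in for a quality model: number of violations per attribute.
    static Map<QualityAttribute, Integer> countByAttribute(List<Violation> violations) {
        Map<QualityAttribute, Integer> counts = new EnumMap<>(QualityAttribute.class);
        for (Violation v : violations) counts.merge(v.affects, 1, Integer::sum);
        return counts;
    }

    public static void main(String[] args) {
        List<String> doc = List.of("[anchor:intro] Introduction.", "See [ref:intro] and [ref:glossary].");
        List<Violation> found = runRules(List.of(new BrokenReferenceRule(), new LongSentenceRule(25)), doc);
        for (Violation v : found) System.out.println(v.ruleId + " at line " + v.lineNumber + ": " + v.message);
        System.out.println(countByAttribute(found));
    }
}
```

Counting violations per attribute is the simplest conceivable aggregation. A real quality model would presumably weight rules and normalize by document size, but the abstract gives no details on this, so the sketch leaves it out.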

In an experimental study I evaluate the suitability of the presented approach (1) by applying the developed documentation quality defect detection framework with a set of 46 rules to several software documents of a real software project, (2) by importing the detected violations into the documentation quality model, and (3) by analyzing their impacts on the documentation quality. The results of the study show that most of the applied rules reliably identify shortcomings of the software documentation (e.g., incorrect document structure, broken references, inconsistent glossary content) that were overlooked in previously conducted manual reviews. Only a few rules return unreliable results. The study results also indicate that relating rule violations to quality attributes defined in a quality model helps to measure impacts on the quality of software documentation and to view them from a more abstract perspective. Furthermore, the results of an evaluation show that the automatic measurement results correlate with independent, manually measured quality data.

The automatic measurement of software documentation quality based on rule violations and the use of a quality model follows an established rule-based concept for measuring source code quality. The presented approach has some limitations, which are discussed in detail in this work. Nevertheless, there are scenarios (e.g., the assessment, the monitoring, and the improvement of software documentation quality) in which applying the developed automatic quality measurement approach to software documentation is beneficial.