Why software isn’t so soft
For something with the word “soft” in the name, software is very hard indeed. Every study I’ve seen has shown that we, as an industry, are terrible at estimating how long software will take to create and not good at all at producing it without defects. There are all kinds of reasons for this, but mostly it boils down to the fact that creating software is far more akin to a craft than to an engineering discipline. Each software product is lovingly sculpted from the depths of our creative minds. Developing software is simultaneously artistic and scientific, which accounts for its appeal to some of the smartest and most intuitive people on the planet. And, like art, it must reveal a consistent truth. Unlike many works of art, though, software is intended to be functional as well. In many environments, it must also provide safety (“do no harm”), reliability (“provide services all the time”), and security (“resist interference or theft”). Such systems – those that exhibit safety, reliability, and security – are said to be dependable because of our ability to depend on them for mission- and life-critical applications[1].
The most common ways of managing the correctness of software are inspections and testing. Inspections are valuable, but they are error-prone, expensive, and difficult to perform well. Testing is limited by the fact that software, in real use, is not simple. Software is deterministic in that, given the same sequence of inputs with the same timings, it will always do the same thing. The problem is that software has a near-infinite state space. Software has many (countable, but huge numbers of) control paths. In addition, data influences the behavior of the software, so that f(-1) may perform a different behavior and use a different control path than f(0). To exhaustively test a “simple” function that takes a single 32-bit integer input parameter requires 2^32 test cases. If you add a second input parameter of the same size, the number of test cases balloons to 2^64. Although the number of test cases may be trimmed down with structural coverage analyses (such as statement coverage, decision coverage, and modified condition/decision coverage) and data equivalency classes, the number of test cases for a typical software application remains larger than we can reasonably apply. For this reason, safety standards typically require not only verification activities (e.g., testing, inspections, and formal mathematical analysis) but also process assurance activities to make sure that the process steps meant to ensure software quality are performed properly.
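To make the equivalence-class idea concrete, here is a minimal sketch in C; the function f is hypothetical, invented only to mirror the f(-1)/f(0) example above. Four well-chosen test cases – one representative per equivalence class plus the boundary values where the classes meet – exercise both control paths of a function whose exhaustive test would need 2^32 cases.

#include <assert.h>

/* Hypothetical function for illustration: it behaves differently for
   negative and non-negative inputs, so f(-1) takes a different
   control path than f(0). */
static int f(int x)
{
    if (x < 0)
        return -x;     /* negative path */
    return x + 1;      /* non-negative path */
}

int main(void)
{
    /* Instead of all 2^32 inputs, test one representative per
       equivalence class plus the boundary values between them. */
    assert(f(-100) == 100);   /* negative class                  */
    assert(f(-1)   == 1);     /* boundary: largest negative      */
    assert(f(0)    == 1);     /* boundary: smallest non-negative */
    assert(f(100)  == 101);   /* non-negative class              */
    return 0;
}

Decision coverage is achieved here with four cases rather than 2^32: the equivalence classes do the heavy lifting, and the boundary values catch the off-by-one errors that cluster where classes meet.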
Process assurance is a set of activities that defines the steps developers will perform, ensures that these are well-defined tasks performed in a particular sequence, and specifies the transition criteria among those tasks. We call this planning. Further, process assurance analyzes these plans to make sure that the process steps make sense in terms of meeting quality objectives. We call this good planning. Finally, process assurance monitors project execution to confirm that the project team actually performs these process steps well enough to meet the quality objectives. We call this process execution governance. That is (with a small sketch following the list below to make it concrete), process assurance is about
Saying what the project team is going to do
Ensuring the plans meet the project objectives
Doing what you said you’d do
Oh yeah, and providing evidence of compliance.
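To make “well-defined tasks with transition criteria” concrete, here is a minimal sketch in C. The task names and criteria are invented for illustration; they are not drawn from any particular standard or plan.

#include <stdio.h>

/* Purely illustrative model of what a plan pins down: well-defined
   tasks, in a particular sequence, with explicit transition criteria
   between them. */
typedef struct {
    const char *name;            /* the well-defined task     */
    const char *entry_criteria;  /* when the task may begin   */
    const char *exit_criteria;   /* when the task is complete */
} ProcessTask;

int main(void)
{
    /* A fragment of a plan: tasks listed in their required sequence. */
    const ProcessTask plan[] = {
        { "Requirements review", "Requirements baselined",
          "All review findings closed" },
        { "Unit test",           "Code passes static analysis",
          "All unit tests pass with the agreed coverage" },
    };

    /* Process execution governance amounts to checking that each task
       starts and finishes only when these criteria actually hold. */
    for (size_t i = 0; i < sizeof plan / sizeof plan[0]; ++i)
        printf("%s: start when \"%s\"; done when \"%s\"\n",
               plan[i].name, plan[i].entry_criteria,
               plan[i].exit_criteria);
    return 0;
}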
Some people view quality assurance as including, but being larger than, verification testing, while others consider it a separate activity from verification. Personally, I am less concerned with its ontological relations than with ensuring that quality assurance and verification are done, and done well. For this discussion, though, I’ll consider quality assurance to be a set of activities that includes both verification and process assurance.
Quality? What’s that?
The term quality means many different things to different people. For example, people will say a system is of high quality if any of the following are true:
It meets the requirements
It meets the customer’s needs
It passes the verification tests
It is free of defects
It is easy to use
It is easy to learn
It acts according to the “principle of least surprise”
It exhibits consistent and predictable performance
It has an appealing appearance
It is certified (or certifiable)
It adheres to relevant standards
It was created by following a rigorous process
It is durable
It is reliable
It is robust
It has high performance
It is maintainable
These are all valid statements because they reveal what quality means to different sets of stakeholders, each of whom, understandably, has their own concerns and perspectives. In short, a system is of high quality if it is “fit for purpose” (usable, functional, and free of defects) and is “developed properly” (developed via a process that complies with standards and yields a fit result).
Quality – I need me some of that!
We ensure quality in our developed systems by employing a variety of mechanisms. First, it is well understood in the developer community that you “cannot test quality into a product”. Testing can reveal, identify, and localize defects, but it doesn’t address the problem of defect avoidance or defect repair. That is the job of developers. It is a Law Of Douglass[2] that “The best way to not have defects in a system is to avoid putting them there in the first place.” This suggests that the key to avoiding defects is to have good engineering and development practices that prevent, or at least inhibit, the introduction of defects. Agile practices, such as test-driven development and continuous integration, serve such a role. It also helps to have exceptional and thorough engineers executing well-defined processes with traceability among the related work products (e.g., requirements, design, code, and tests).
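As a small, framework-free sketch of the test-first idea – the clamp function is hypothetical, and a real project would use a proper unit test harness – the test is written before the code it exercises and drives the implementation:

#include <assert.h>

/* Test-first: this test exists before clamp() is written, so the
   implementation is driven by, and checked against, it from day one. */
static int clamp(int value, int lo, int hi);

static void test_clamp(void)
{
    assert(clamp( 5, 0, 10) ==  5);  /* in range: unchanged  */
    assert(clamp(-3, 0, 10) ==  0);  /* below range: clamped */
    assert(clamp(42, 0, 10) == 10);  /* above range: clamped */
}

/* The simplest implementation that makes the tests pass. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

int main(void)
{
    test_clamp();
    return 0;
}

Under continuous integration, a test like this runs on every commit, so a defect introduced later is caught within minutes of its creation rather than months later in system test.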
Of course, verification testing is another crucial activity. The Harmony process[3] identifies three levels of verification testing. Unit testing is performed by the developer or a peer at the class, function, or data structure level to ensure the correctness and robustness of the primitive software elements. Integration testing brings the work of multiple developers together to ensure that it works together properly; it focuses on test cases that cross unit boundaries and exercise interfaces among components. System verification testing is black-box, requirements-driven testing that verifies that the system input-output control and data transformations specified by the requirements are properly implemented. For safety- and mission-critical systems, system verification testing is analyzed with white-box coverage tools to ensure that every line of code and decision branch is appropriately exercised by a requirements-based test. In addition to verification testing, there is also validation testing, which is done to ensure that the system – even if it meets the requirements – also meets the needs of the customer. This last step is necessary because there is often a gap between the specified requirements and what the customer actually needs.
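One lightweight way to keep system tests requirements-driven – a sketch of the idea, not a form mandated by Harmony or any standard – is to tag each black-box test case with the identifier of the requirement it verifies. The system_respond function and the REQ-nnn identifiers below are hypothetical; the point is that traceability and coverage analysis then have something concrete to hook into.

#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the system under test; a real system
   test would drive the actual black-box interface. */
static const char *system_respond(const char *cmd)
{
    if (strcmp(cmd, "PING") == 0) return "PONG";
    return "ERR";
}

/* Each test case carries the ID of the requirement it verifies, so
   every system test is requirements-based and traceable. */
typedef struct {
    const char *req_id;    /* e.g. "REQ-017" (hypothetical)   */
    const char *input;     /* stimulus at the system boundary */
    const char *expected;  /* required response               */
} SystemTest;

int main(void)
{
    const SystemTest tests[] = {
        { "REQ-017", "PING", "PONG" },
        { "REQ-018", "BOOM", "ERR"  },
    };
    int failures = 0;
    for (size_t i = 0; i < sizeof tests / sizeof tests[0]; ++i) {
        const char *got = system_respond(tests[i].input);
        int ok = (strcmp(got, tests[i].expected) == 0);
        printf("%s %s: %s\n", tests[i].req_id, tests[i].input,
               ok ? "PASS" : "FAIL");
        if (!ok) ++failures;
    }
    return failures;
}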
The last cornerstone of quality assurance is process assurance. Process assurance is really about conformance. Plans must conform to external (regulatory) standards – such as IEC 61508, DO-178C, or CMMI – and must meet the objectives and goals of those standards. Internal company and project standards – such as requirements or coding standards – are defined to give guidance to developers and to provide checklists for process assurance reviews and audits. Work products must conform to those internal and external standards; this is managed through quality assurance inspections and reviews. Quality assurance audits ensure that the tasks performed by the project team conform to the plans. These latter two activities – audits of work tasks and inspections of work products – are the soul of process assurance. Verification activities are about the correctness of the product itself – that is, verification aims to prove (in some loose sense) the semantics of the developed system. Process assurance is more about the syntax (compliance to rules) than the correctness of the product per se.
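As a final sketch – the checklist items are invented for illustration, not taken from any real plan – a process audit is essentially a checklist walk: each item derives from a plan or standard, and each non-conforming item becomes a finding. Note that nothing here judges whether the product works; that is verification’s job.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical audit checklist item: process assurance checks the
   "syntax" (did the work conform to the plan?), not whether the
   product itself is correct. */
typedef struct {
    const char *check;     /* item derived from a plan or standard */
    bool        conforms;  /* the auditor's finding                */
} AuditItem;

int main(void)
{
    const AuditItem audit[] = {
        { "Code review held before merge, per the software plan", true  },
        { "Coding standard checked on all changed files",         true  },
        { "Unit test results archived as evidence",               false },
    };
    for (size_t i = 0; i < sizeof audit / sizeof audit[0]; ++i)
        printf("[%s] %s\n", audit[i].conforms ? "PASS" : "FINDING",
               audit[i].check);
    return 0;
}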
Summary
Developing software is hard because software is inherently complex. Although software is deterministic, it has an essentially infinite state space. This means that it cannot be exhaustively tested to ensure correctness. The way to get quality in our systems is to employ three key activities:
1. Develop with quality in mind. With experienced, dedicated, and thorough engineers, use best practices to avoid introducing defects in the first place. This means using practices such as test-driven development, continuous integration, and incremental traceability. These practices reduce the number of defects that must be found by other means.
2. Verify the software. Verification is about demonstrating correctness. I recommend three levels of verification – unit, integration, and system – to show that the system correctly performs the input-output data and control transformations. For system-level verification, coverage analysis should be used to confirm that adequate path testing of the software has been achieved using requirements-based tests.
3. Assure the process. Process assurance checks compliance to plans and standards: it checks the syntax of the project tasks (through process audits) and of the work products (via work product inspections). This is usually driven with checklists derived from the internal and external standards with which the process and work products must comply.
If we, as development teams and organizations, can adopt and apply these best practices, maybe our software will become just a bit more dependable and we can avoid panicked pilots requesting developer assistance.
[1] I know I just hate it when I’m on a commercial aircraft and the pilot comes on the loudspeaker to say “OMG! Is there a programmer on board?” Talk about just-in-time development!
[2] See my book “Doing Hard Time” (Addison-Wesley, 1999) for some of the currently more than two hundred Laws.
[3] See my book “Real-Time Agility” (Addison-Wesley, 2009) for a detailed discussion of the Harmony process.