A Case for Formal Specification

Journal Coryoth's Journal: A Case for Formal Specification 8

Journal by Coryoth on Wednesday August 03, 2005 @02:42AM

Formal Specification helps build more robust, more maintainable software with fewer bugs and defects. It has a long history, but it is still a developing field. While it may not be suitable for all software projects, a case can be made that there are many projects not currently using formal specification that stand to benefit from it. As the methods and tools for formal specification develop it is increasingly becoming something that developers and software engineers should learn to use to their advantage.

What is Formal Specification

Formal specification is simply a matter of being more explicit and specific in defining the requirements of software. At the simplest level this can take the form of Design by Contract, where functions and procedures specify pre- and post-conditions and loops and objects include a set of invariant properties. In the more rigorous case formal specification involves building an explicit mathematical definition of the requirements of the software. Using such a definition one can prove the correctness of the system, or simply prove theorems about properties of the system. An implementation can also be checked against such a formalised specification, verifying that the implemented code does indeed do precisely what the requirements claim. At the most rigorous level the initial formal requirement specification can be expanded, through (mathematically rigorous) refinement, to ever more specific and detailed specifications resulting eventually in executable code.

All of these different levels allow a significantly greater degree of analysis of the software, be it improved static and runtime checking with contracts, to more complex data flow analysis and proof with more complete specifications. In the same way that static types allow more rigorous checking at compile time, catching a lot of simple errors, contracts and specifications allow even more analysis and checking, catching even more errors at the early stages of development when they are most easily and efficiently fixed.

A Simple Example

For good examples of formal specification at work I would suggest you try this Z case study or the case study for CASL in the CASL user manual. For those who want something more immediate I have a warning: To be short enough to present easily in this space the specifications are necessarily very simple and should be taken as a sample of the flavour of formal specifications rather than a serious example.

The specification for a simple stack, in SPARK, a version of Ada that adds formal specification semantics, would look something like the following:

package Stack --# own State: Stack_Type; --# initializes State; is --# type Stack_Type is abstract

--# function Not_Full(S: Stack_Type) return Boolean; --# function Not_Empty(S: Stack_Type) return Boolean; --# function Append(S: Stack_Type; X: Integer) return Stack_Type;

procedure Push(X: in Integer); --# global in out State; --# derives State from State, X; --# pre Not_Full(State); --# post State = Append(State~, X);

procedure Pop(X: out Integer); --# global in out State; --# derives State, X from State; --# pre Not_Empty(State) --# post State~ = Append(State, X); end Stack;

The package body, containing the implementation, can then follow.

The own simply declares the variable State to be a package global variable, and initializes means that the variable State must be initialized internally in this package. We then declare an abstract type (to be defined in implementation) and some simple functions. The in and out keywords in the procedure declarations tag the parameters: in means that the parameters current value will be read in the procedure, and out means the parameter will be written to in the procedure. The global keyword declares the the package global variable State will be used in the procedure (and both read from and written to). The derives keyword provides explicit declarations of which input will be used in determining values for variables that will be output or written to. The pre- and post-conditions should be largely self explanatory.

As you can see, mostly all we are doing is making explicit exactly how we intend the procedures to operate. This specificity provides automated verification tools the information necessary to properly analyse and validate implemented code: Prior to a compile step you can run a verification and flow analysis tool that catches many subtle errors.

A similar object, specified in an algebraic specification language like CASL might look something like this:

spec Stack[sort Elem] = sort stack ops empty: stack push: stack * Elem -> stack pop: stack ->? stack top: stack ->? Elem axioms not def pop(empty) not def top(empty) forall s : stack; e : Elem pop(push(s,e)) = s top(push(s,e)) = e end

CASL offers syntax that can compact this considerably, but this longer version makes the workings of the specification more clear.

We declare a specification for a stack of generic elements, of sort stack; and with operations empty (which is in a sense the constructor creating and empty stack); push which maps a stack and an element to a "new" stack; pop a partial function (denoted by the question mark) which maps a stack to "new" stack; and top another partial function which maps a stack to an element. We then define axioms: first, as pop and top are partial functions, we say they are not defined on empty stacks; secondly we declare that pop and top should behave as (partial) inverses for push for all stacks and elements.

This is sufficient information to completely specify stacks as a universal algebra and brings a great deal of powerful mathematical machinery to bear on the problem, much of which has been coded into a set of analysis tools for CASL.

While these examples are very simple, they should give you the flavour of how formal specification can work. Both SPARK and CASL support structured specifications, allowing you to build up libraries of specification, making them scalable to very large problems.

Why Use Formal Specification?

Why use static types? Why do any compile time checking? Why produce design documents for your software? Not every project is worth writing design documents for, and some projects are better off being developed quickly using a dynamically typed language and runtime checking. Not every project really needs formal specification, but there are a great many software projects that could benefit greatly from some level of formal specification - a great many more than make use of it at present.

I would suggest that it is probably unimportant whether your paint program, music player, word processor or desktop weather applet uses formal specification. What about your spreadsheet application? Bothering with formal specification for the GUI might be a waste of time. Bothering to do some specification for the various mathematical routines, ensuring their correctness, would potentially be worth the extra trouble. Formal specification doesn't need to be used for a whole project, only those parts of it that are sensitive to error. Likewise any network services could easily benefit from formal specification on the network facing portions of the code to significantly reduce the possibility of exploits: it is far easier to audit and verify code that has been properly specified. Security software, and implementations of cryptographic protocols, are far safer if formally specified: with cryptography the protocols are often rigorously checked, and many exploits relate to errors where the implementation fails to correctly follow the protocol. Finally mission critical business software, where downtime can costs millions of dollars, could most assuredly benefit form the extra assurances that formal specification and verification can provide.

None the less, barring a few industries, formal specification has seen little use in the past 25 years. Developers and software engineers offer many excuses and complaints as to why formal specification isn't suitable, isn't required, or is too hard. Many of these complaints are founded in a poor understanding of how formal specification works and how it can be used. Let's consider some of the common complaints:

But Formal Specification is Too Much Work...

Formal specification can be as much work as you choose to make it. You can do as little as adding contracts to critical functions or classes, or you can write the entire project from the top down by progressive refinement of (and verification against) a specification for which you have created formal proofs of all important properties. There is a sliding scale from a dynamically typed script with no documentation or comments, all the way up to a completely explicitly specified and proven system. You can choose the level of specificity and verification, and you can specify as much or as little of the system as you need. Formal specification covers everything from adding contracts to a couple of critical routines to complete specification of the entire project.

Formal specification isn't necessarily significantly more work than you do now. Writing contracts is only a little more work than static typing: you are simply declaring formally what you intend the function to do, something you ought to have a clear idea of before you write the function anyway. Writing a basic specification is only a little more work that writing a design document: you are simply formalising the diagrams and comments into explicit statements. For a little extra work in the initial stages you gain considerably in debugging and testing. Extended static checking based on contracts can catch many simple errors that would otherwise go unnoticed, and debugging is significantly accelerated by the added information contracts supply about runtime errors. If you've gone to the trouble of writing a formal specification, you can statically verify properties of the system before it is even implemented. Once implemented you can validate the implementation against the specification. When performing testing you can use the specification to help determine the best coverage with the least testing.

Formal specification does work in practice, and the gains in efficiency in testing and debugging can often outweigh the potential extra overhead during design. Design by Contract, for instance, is often cited as saving time in the development process, and there are many real world success stories where systems were developed faster by using contracts. A 20 person team completed in 5 months what took a team of 100 people using standard C++ almost a year. For projects where robustness is critical, full formal specification has been used successfully. Companies like Praxis High Integrity Systems use formal methods successfully in the rail, aerospace, nuclear, finance and telecommunications industries (among others), and have sold their formal method tool-set to a variety of large companies. Reading through some of their recent work and case studies makes it clear that formal specification and verification can prove to be more efficient and faster to use.

But Formal Specification is Not Practical...

On the contrary, formal specification is used all the time in many industries, and there are various companies like Praxis High Integrity Systems, B-Core, Eiffel Software, Kind Software, and Escher Technologies who provide formal specification tools as a major part of their business.

Formal specification has proved to be practical in the real world. What it has lacked is mature tools to make development using formal specification faster and easier (try writing code without a good compiler; try doing formal specification without good static checking tools), developer awareness, and developers skilled in the languages and techniques of formal specification. It doesn't take much to learn - learning a specification language is not much harder than learning a programming language. Just as programming appears to be a difficult and daunting task to a person who doesn't know any programming languages, specification looks difficult and impractical to people who don't know specification languages. Take the time to learn and your perspective will change.

But the Project Requirements are Always Being Changed...

It is true that many projects face ever changing requirements. The question is not whether this renders formal specification useless (it doesn't) but how formal specification compares to informally specified software in such a situation.

First of all it should be noted that engaging in formal specification can help reduce the problem to begin with. By requiring you to rigorously specify at least some portions of the software, formal specification can assist in finding ambiguities, cases that were not covered, and other problems much earlier in the development cycle when changes are easiest to make. Those parts of the system, such as GUIs, that are often the target of more whimsical design or requirements changes are precisely the parts that benefit least from, and are least likely to use, formal specification. Changes to design or requirements for those parts of the system, then, are not likely to be more detrimental than for informally specified projects.

Secondly formal specification is quite capable of dealing with change, and even offers benefits in the face of changing requirements. A change in requirements results in a change in specification, but because the specification is formally written the impact of the changes can be analysed. You can check that new requirements remain consistent with the rest of the system. You can also determine all parts of the system that will be impacted by the change and avoid bugs introduced by the changes. Having a formal specification for the system when a change in requirements occurs makes it easier to see how the change will effect the system and easier to make the changes consistently and completely. Formal specification can make changes in requirements easier to deal with, not harder.

But I Already do Unit Testing...

Unit testing is fantastic, and there's no reason to stop just because you can write proofs for some aspects of your code. Unit testing is not a replacement for formal specification however. Testing only verifies the component for a very limited and specific range of cases. Specifically testing covers the cases that the developer thought of. A great many bugs, however, are those that occur from situations the developer didn't consider, and probably wouldn't have included in his test. You can improve the situation with testing tools like fuzzers, but you're still only checking a sampling of the whole space. Formal specification, on the other hand, allows construction of formal proofs that cover all possible cases. I would much rather specify and prove Pythagoras' theorem than simply assume it is true by testing it on random right angled triangles. Likewise, while testing is valuable (and easy) when assurance matters it is no replacement for proof based on a specification.

Does this mean you should give up Unit testing? In no way shape or form. Just as formal proofs in mathematics are much harder than trying a few cases, formal proofs from specifications require significant work - you most likely can't prove everything. That means the best method is prove those properties that are critical, and continue testing the system much as you normally would. Hopefully you'll see a lot less errors to debug based on the extended static checking of course.

But Formal Specification Still Can't Guarantee Perfect Software...

Correct, there are no guarantees. Complete proofs are often too complicated to perform, so theorem proving for specific properties is usually undertaken instead. Static checking can be enhanced significantly with suitable specifications, but not all errors can be caught. Your software is only going to be as stable and secure as the system it is running on. If you can't be perfect why try at all? Because significant gains can be made and the costs may be smaller than you think. Even if the whole system can't be proved, managing to prove key properties may well be all the assurance of correctness that is required for the application at hand. Even if you can't catch all the errors statically, reducing the number and severity of runtime errors can bring significant efficiency gains in the testing phase. Furthermore the cost of errors in software found after release can be sufficiently high that any reduction is worthwhile. Finally if you can't trust your compiler and operating system, then everything is compromised. At some level you have to trust something - we all trust a great deal of software right now that we expect to have flaws - reducing the source of errors to trusted components is still a significant gain.

But Formal Specification Is Only Viable for Trivial Examples...

Not even close to the truth. Formal specification and formal methods have been used for large projects involving hundreds of thousands of lines of code. Just look at many of the examples provided above: these were not small projects, and they were not toy trivial examples; these were large scale real world software systems. Examples of formal specification tend to be small simplistic cases because they are meant to be just that: easily comprehensible examples. In the same way that programming languages often use "Hello World" and quicksort as examples, formal specification languages use equally simple but demonstrative examples. Don't take this to mean that the language and method doesn't scale.

Conclusions

Formal specification isn't a silver bullet, and it isn't the right choice for every project. On the other hand you can use as much specification and verification as is suitable for your project. You can apply formal specification only to the portions that are critical, and develop the rest normally. More importantly, the techniques, languages and tools for formal specification continue to improve. The more powerful the methods and the tools, the more projects for which formal specification becomes a viable option. In an increasingly network oriented computing world where security and software assurance is becoming increasingly important, the place for formal specification is growing. If you are a software engineer it makes sense to be aware of options available to you.

This discussion has been archived. No new comments can be posted.