Software Correctness

If I were to choose one overarching philosophy for how software should be developed, it would be to program for correctness — that is, do everything practical to reduce errors.

Introduction

There are many attributes we may want in geophysical software:

  • Developed quickly
  • Runs quickly
  • Easy to use
  • Well documented
  • Correct
  • Maintainable
  • Portable
  • Feature rich
  • Technically advanced

All of these are worthy goals, but if you had to pick one to be the first priority in your software-development strategy, which would it be?

My choice would be correctness. Without this, all other items are moot. What good is software that is easy to use or runs fast if it produces the wrong answer? But in a way we don’t have to choose. Programming for correctness will with time improve most other items on the list, foremost the speed of development.

But a note on what I am NOT referring to. “Programming for correctness” is often associated with formal proofs of software correctness. While this field attracted much attention in the early years of computer science, it hasn’t had noticeable impact on the average programmer, and I suspect it never will.

So what do we mean by correct software? In day-to-day programming, it might mean software that:

  1. Compiles.
  2. Completes normally for at least one set of input.
  3. Completes normally for common forms of input.
  4. Completes normally for all valid input.
  5. Gives the correct answer for at least one set of valid input.
  6. Gives the correct answer for common forms of valid input.
  7. Gives the correct answer for all valid input.
  8. Gives the correct answer for all valid input, and properly handles all invalid input.

Too often developers achieve level 3 and declare the software to be working. We should set our sights higher.

Software correctness is particularly important for researchers. How many excellent ideas were abandoned because they weren’t implemented correctly, coding errors being mistaken for weaknesses in the algorithm? And there are many cases of new software working great for the data set it was developed with, but never giving good results afterward. Were there hardwired assumptions in the code that made it incorrect for all but the first data set?

But writing correct software is difficult. To achieve it, you must use every tool at your disposal, coming at the problem from many directions at once. Even then, you will not eradicate all errors — that is probably not possible. Improvement, not perfection, is the aim here.

Programming allows you immense freedom, but you must not use it all. Writing quality software requires method and discipline so as not to make an enormous mess. This is called “software engineering”, a term coined in the 1960s when people realized that software was far harder to develop and maintain than they had initially imagined.

The early literature on software engineering was accessible and rewarding to the everyday programmer, and is still worth reading today. In particular, structured programming (limit yourself to certain control structures, and avoid others like the dreaded “go to”), structured design (build modules with high functional cohesion and low data coupling), and structured analysis (design and analyze software systems by how data flows through them) were all easily applied, and significantly improved the quality of software.

By the 1980s, software engineering had blossomed into a full-blown academic field, and the literature became increasingly abstract and decreasingly relevant to the everyday programmer. Some of it was certainly valuable for the architects of large systems, but to most programmers, software engineering was morphing into “Here are the 172 things you must do before you write a line of code.” In the last two decades, this trend has reversed somewhat with the introduction of light-weight styles such as agile development. Regrettably, this has sometimes been used as an excuse to return to undisciplined hacking.

My interest has always been in the practical everyday things that programmers can do to improve their products, without burdening them with layers of mind-numbing bureaucracy. So how can we improve the correctness of code? Here are some ideas:

  • Modularity
  • Assertions
  • Testing
  • Write it twice
  • Code reading
  • Libraries
  • High-level languages
  • Automatic code analysis
  • Source configuration management

Modularity

Without modularity — that is to say, writing software in small independent functions or classes — forget it. You have little chance of producing sound software. Modularity is the foundation for quality.

There are three main styles for sound modular development:

  • Structured design
  • Object-oriented programming
  • Functional programming

Structured design says you develop modules with high cohesion and low coupling. High cohesion means that a module is singular in purpose. It does one thing, with no side effects that are detectable by the rest of the software. Low coupling means that a module communicates with the rest of the software in a way that is simple, minimal, and through formal interfaces. Enemy #1 in data coupling was the Fortran COMMON block, or in C the static variable.
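As a sketch of the contrast, here are two ways of computing the RMS amplitude of a seismic trace. The names (GlobalState, TraceMath, rms) are illustrative only, not taken from any real library. The first version couples itself to the rest of the program through shared mutable state; the second has one purpose, a formal interface, and no side effects:

```java
// Low cohesion, high coupling: the function reads and writes shared
// mutable state, so its behavior depends on code far away from it.
class GlobalState {
    static double[] trace;   // set by someone, somewhere
    static double rms;       // written as a side effect

    static void computeRms() {
        double sum = 0.0;
        for (double s : trace) sum += s * s;
        rms = Math.sqrt(sum / trace.length);
    }
}

// High cohesion, low coupling: one job, done entirely through the
// parameter list and return value.
class TraceMath {
    static double rms(double[] trace) {
        double sum = 0.0;
        for (double s : trace) sum += s * s;
        return Math.sqrt(sum / trace.length);
    }
}
```

The second version can be called from anywhere, tested in isolation, and reasoned about without knowing anything else about the program — which is exactly what low coupling buys you.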

The second approach is object-oriented programming. When it first became popular in the 1990s, some over-excited fans claimed that it was an utterly different way of thinking, and that companies should fire their old programmers because their minds were forever warped by past habits. This was a bone-headed suggestion, and I hope the advice was not followed. Those who practiced structured design recognized that object-oriented design was a natural extension of it. Such people were likely to make fine object-oriented programmers.

Just because you work in an object-oriented language and regularly define classes doesn’t mean you are an object-oriented programmer. Sound object-oriented design produces many small classes whose member functions have simple interfaces and serve to hide data and decisions from the rest of the software. Abstraction — “programming to an interface” — is the order of the day. Too many programmers, however, write massive, complicated classes whose long lists of data members, freely and directly accessed by large blocks of code, are indistinguishable from Fortran COMMON blocks.
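A minimal sketch of “programming to an interface”, with hypothetical names (TraceFilter, ScaleFilter, Processor) invented for illustration. The calling code depends only on the abstraction, so the concrete filter — and the decisions hidden inside it — can change without touching the caller:

```java
import java.util.Arrays;

// The abstraction callers program to.
interface TraceFilter {
    double[] apply(double[] trace);
}

// One concrete implementation; its scale factor is a hidden decision.
class ScaleFilter implements TraceFilter {
    private final double factor;
    ScaleFilter(double factor) { this.factor = factor; }
    public double[] apply(double[] trace) {
        return Arrays.stream(trace).map(s -> s * factor).toArray();
    }
}

class Processor {
    // Depends only on the interface; swapping in a different filter
    // requires no changes here.
    static double[] process(double[] trace, TraceFilter filter) {
        return filter.apply(trace);
    }
}
```

A massive class with public data members offers no such seam: every caller is welded to its representation, just as every Fortran routine was welded to the COMMON block.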

I strongly recommend object-oriented over structured design, but either approach is fine provided you do it well.

There is a third wave of software methodology crashing on our shores: functional programming, an old idea that is now gaining a lot of attention. It has some appealing features that, according to many, enhance correctness. Early reviews suggest that some hybrid of object-oriented and functional programming will be the development method of choice for the future.
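Java itself has absorbed some of this wave. The sketch below (the name normalize is illustrative) shows the functional style: a pure function that mutates nothing and whose output depends only on its input — properties that make it trivially safe to test and to call from anywhere:

```java
import java.util.List;
import java.util.stream.Collectors;

class Functional {
    // Pure function: no state is read or mutated, the input list is
    // untouched, and a fresh normalized list is returned.
    static List<Double> normalize(List<Double> trace) {
        double max = trace.stream().mapToDouble(Math::abs).max().orElse(1.0);
        return trace.stream().map(s -> s / max).collect(Collectors.toList());
    }
}
```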

Assertions

An assertion causes a program to abort if a Boolean expression evaluates to false. It guarantees that a certain condition holds at a certain point in the code — that is to say, it tests for an invariant. The most common use is to check that function arguments have allowable values (preconditions). Less common is to ensure that a block of code gives the expected output (postconditions):

int greatestCommonDivisor (int m, int n)
  {
    assert m > 0 && n > 0;                 // Arguments valid?
    int mOrig = m, nOrig = n;
    while (m != n)                         // Euclid's algorithm 
        if (m < n)
            n -= m * ((n-1)/m);
        else
            m -= n * ((m-1)/n);
    assert m > 0;                          // Here m is > 0 and
    assert mOrig%m == 0 && nOrig%m == 0;   // divides the input
    return m;
  }

Few things are as simple and powerful as assertions. The concept has been greatly expanded, and is an integral part of “programming by contract”. It is well worth studying in detail.

But be warned. Assertions are designed to catch coding errors. Problems with user or data input should be handled in a more controlled and user-friendly manner.
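To make the distinction concrete, here is a hypothetical sketch (the name parseTrace and its format are invented for illustration). Bad user input gets a controlled, recoverable response — an exception the caller can catch and report — while the assertion guards only against a coding error that no input should ever be able to trigger:

```java
class InputHandling {
    // Parses a whitespace-separated line of samples into a trace.
    static double[] parseTrace(String line) {
        // Invalid USER input: handle it in a controlled, reportable way.
        if (line == null || line.isBlank())
            throw new IllegalArgumentException("empty trace record");
        String[] fields = line.trim().split("\\s+");
        double[] trace = new double[fields.length];
        for (int i = 0; i < fields.length; i++)
            trace[i] = Double.parseDouble(fields[i]);
        // CODING error: a successful parse must yield at least one sample.
        assert trace.length > 0;
        return trace;
    }
}
```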

One of the sillier ideas is that assertions should be turned off once the code is in production, on the assumption that the code is now correct and assertions just slow things down. Nonsense on both counts. Released software is almost never bug-free. In fact, production often reveals new problems that were missed during testing. Further, assertions rarely take a significant amount of run time. Most require trivial computation, as in the example above. The alternative to triggering an assertion in production is having the program crash somewhere in the code well removed from the error, or give incorrect results, perhaps without being detected. Neither is an inviting prospect.

Testing

The purpose of testing is to find as-yet-undiscovered deficiencies in the software. It can be broken down into three types:

  • Unit (Are there errors in this class or function?)
  • Integration (Are there errors in how these classes or functions work together?)
  • System (Does the program or system as a whole have errors? Does it meet specifications?)

These are huge topics that cannot be adequately covered here, but if I were to recommend a place to begin, it would be with unit and integration testing supported by frameworks like cppunit or junit.

The impact on the quality of software and speed of development is immense. It is not possible to produce quality software without extensive unit tests. The returns compound when you start to design software so that it CAN be easily unit tested.

Write it twice

Suppose you have written a fast but complicated algorithm. How do you test it? Write a slow brute-force version, which is often easy to do. Then compare the two using thousands or millions of generated test cases.

Below we unit test the above greatest common divisor function against a brute-force version within the junit framework:

@Test
public void testGCD () 
  {
    for (int m = 1; m <= 1000; m++)
    for (int n = 1; n <= 1000; n++)
        assertTrue (greatestCommonDivisor (m, n) == slowGCD (m, n));
  }
 
int slowGCD (int m, int n)
  {
    for (int k = Math.min (m, n); k > 1; k--)
        if (m%k == 0 && n%k == 0) return k;
    return 1;
  }

Code reading

Testing isn’t enough. There are many errors that you could test all day and never detect. You just didn’t think to try them. Reading code was something early programmers learned to do well. Computer time was scarce, so they inspected their code closely before wasting machine resources.

The best time to read code is first thing in the morning the day after writing it, when you’re relaxed and patient and can view the code with fresh eyes. Reading code in printed form gives a different perspective than viewing it on a screen. You will almost always end up improving the code.

But first, the code has to be written in a style meant to be read. Modularity is critical. An 800-line function is near impossible to comprehend. A group of small functions with simple, well-defined purposes and interfaces is intellectually manageable.

Libraries

The surest way to avoid new bugs is to not write new code. Reusable software libraries — both external and those you develop yourself — are a godsend for correctness and speed of production. Every programmer knows this. Very few act like they know this. The exigencies of the current project never seem to allow time to bring in outside software, or develop reusable libraries yourself. And let’s face it — developing high-quality reusable code that’s easy for others to understand and use is difficult and time-consuming. The payoff, however, is enormous. Anytime you find yourself writing or copying over the same code repeatedly, ask yourself if it belongs in a library. If the answer is yes, do it.

High-level languages

Programming languages are improving. Ever more tasks that the programmer had to handle are now being done by the language itself. Writing container structures and sorting-related algorithms mostly disappeared with the introduction of C++’s Standard Template Library. Memory leaks, uninitialized memory, and undetected array-bound violations — constant irritants in C and C++ — mostly disappeared with Java. People noticed how a few pipelined Unix commands, often written on a single line, could do the work of hundreds of lines of C code,  inspiring the development of TCL, Perl, and the cascade of scripting languages that followed. Mathematical languages such as Matlab let us treat vectors and matrices more like the abstract entities that they represent.
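The same shift has reached Java itself. As an illustrative sketch (the function name is invented), a stream pipeline states what is wanted and leaves the loops, temporaries, and index bookkeeping — the classic breeding ground for off-by-one errors — to the language:

```java
import java.util.stream.IntStream;

class HighLevel {
    // Sum of squares of the even values: one declarative pipeline
    // replaces an explicit loop, an accumulator, and a conditional.
    static double sumOfSquaresOfEvens(int[] values) {
        return IntStream.of(values)
                        .filter(v -> v % 2 == 0)
                        .mapToDouble(v -> (double) v * v)
                        .sum();
    }
}
```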

We need to program at a higher level in languages naturally suited to the application, leaving tedious, error-prone details for the language to handle.

Automatic code analysis

Most popular languages have freely available automatic tools that can critique your code, identifying possible problems or improvements with little effort on your part. It’s like having a really sharp programmer reviewing your code for free. Typically the lower the level of the programming language, the more automatic analysis you need. For Fortran, C and C++, for example, it’s near essential for quality software.

There are two types of code analysis: static, which inspects your source code, and dynamic, which finds problems during execution. Both are worth having. Profiling is a type of dynamic analysis which identifies where CPU and other resources go. Although its principal aim is to help you speed up execution, it’s surprising how often profilers find programming errors.
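Here is the kind of bug static analysis excels at (the method names are illustrative). The buggy version compiles cleanly and often appears to work during testing thanks to string interning, but it compares references rather than contents — a pattern that analyzers such as SpotBugs and Error Prone flag immediately:

```java
class Suspicious {
    // BUG: == compares object references, not string contents.
    // Static analyzers flag this; casual testing often misses it.
    static boolean sameHeaderBuggy(String a, String b) {
        return a == b;
    }

    // Correct: compare contents.
    static boolean sameHeader(String a, String b) {
        return a.equals(b);
    }
}
```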

Source configuration management

One of the greatest sources of errors is not knowing what software is on the system and how it has changed. Often software problems get solved but not put in production because the programmer failed to update the system properly. And many times a new bug has appeared and the programmer is left wondering “This used to work, so what happened? Did the system change?”

Source configuration management systems help solve these problems. If you don’t have one, or if you have one but it’s not used consistently, or if it is not inextricably tied to production releases, you need to make changes.

Finally…

Geophysical software is often written by people with backgrounds in the earth sciences or engineering rather than computer science. As a result, they often have few software-engineering skills. If that’s true in your case, don’t fret. Pick a single idea from above that seems inviting and run with it. Become an expert by practicing and learning about it. Spread the methodology to your associates. Then, perhaps many months later, pick another idea and run with that.

Stewart Trickett © 2019 – 2020

The above article is copyrighted by Juniper Bay Software Ltd. under the Creative Commons Attribution 4.0 International Public License. Briefly, this means you are free to distribute, remix, adapt, and build upon this work, even commercially, as long as you credit the author for the original creation.
