Sunday, April 10, 2011

Writing preconditions and postconditions incrementally

Once you start enforcing pre- and postconditions at compile time, as in ParaSail, some interesting things start to happen.  Unless programmers are very familiar with writing pre- and postconditions as a matter of course, chances are that the code will be written initially with few if any pre/postconditions.  However, presuming that the language-provided abstractions, such as Arrays, already have some preconditions (such as you cannot index into an array with an index that is out of the array bounds), the compiler pretty quickly identifies places where the programmer will have to write some of their own preconditions, constraints, invariants, etc.  For example, consider the classic Stack abstraction.  This might be written in ParaSail as follows, initially without any pre/postconditions:
interface Stack
  <Component is Assignable<>; 
   Size_Type is Integer<>> is
    function Create(Max : Size_Type) -> Stack;
    function Top(S : ref Stack) -> ref Component;
    procedure Pop(S : ref var Stack);
    procedure Push
      (S : ref var Stack; 
       X : Component);     
end interface Stack;

class Stack is
    const Max_Len : Size_Type;
    var Cur_Len : Size_Type;
    type Index_Type is Size_Type {Index_Type in 1..Max_Len};
    var Data : Array<optional Component, Indexed_By => Index_Type>;    
  exports
    function Create(Max : Size_Type) -> Stack is
      return (Max_Len => Max, Cur_Len => 0, Data => [.. => null]);
    end function Create;

    procedure Push(S : ref var Stack; X : Component) is
      S.Cur_Len += 1;
      S.Data[S.Cur_Len] := X;
    end procedure Push;
      
    function Top(S : ref Stack) -> ref Component is
      return S.Data[S.Cur_Len];
    end function Top;                

    procedure Pop(S : ref var Stack) is
      S.Cur_Len -= 1;
    end procedure Pop;     
end class Stack;
This all looks pretty straightforward until we try to compile it, and the compiler begins to complain at various places. For example, we blithely add one to S.Cur_Len in Push without any apparent concern about overflow. Then we blithely use the incremented S.Cur_Len as an index into S.Data, again with no apparent concern for the index being out of bounds.

The ParaSail compiler is never blithe, and will complain because the Array abstraction imposes a precondition on the indexing into S.Data, requiring that the index be in the range of Index_Type (1..Max_Len). It will also complain because the increment of S.Cur_Len might cause it to exceed the maximum Size_Type value (this precondition is from the "+=" operation of the Integer abstraction).

We could resolve these problems by adding a meaningful constraint on the range of values of Cur_Len (namely {Cur_Len in 0 .. Max_Len}) and adding a precondition on the Push operation ensuring that Cur_Len < Max_Len before any call on Push.  However, Cur_Len and Max_Len are not exported in the Stack interface, so we need to provide functions that reveal their value, say "Count" and "Max_Stack_Size."  So our new Stack abstraction would look like this:
interface Stack
  <Component is Assignable<>; 
   Size_Type is Integer<>> is
    function Max_Stack_Size(S : Stack) -> Size_Type;
    function Count(S : Stack) -> Size_Type;
    function Create(Max : Size_Type) -> Stack;
    function Top(S : ref Stack) -> ref Component;
    procedure Pop(S : ref var Stack);
    procedure Push
      (S : ref var Stack {Count(S) < Max_Stack_Size(S)}; 
       X : Component);
end interface Stack;

class Stack is
    const Max_Len : Size_Type;
    var Cur_Len : Size_Type {Cur_Len in 0..Max_Len};
    type Index_Type is Size_Type {Index_Type in 1..Max_Len};
    var Data : Array<optional Component, Indexed_By => Index_Type>;    
  exports
    function Max_Stack_Size(S : Stack) -> Size_Type is
      return S.Max_Len;
    end function Max_Stack_Size;
    
    function Count(S : Stack) -> Size_Type is
      return S.Cur_Len;
    end function Count; 

    function Create(Max : Size_Type {Max > 0}) -> Stack is
      return (Max_Len => Max, Cur_Len => 0, Data => [.. => null]);
    end function Create;

    procedure Push
      (S : ref var Stack {Count(S) < Max_Stack_Size(S)}; 
       X : Component) is
      S.Cur_Len += 1;
      S.Data[S.Cur_Len] := X;
    end procedure Push;
      
    function Top(S : ref Stack) -> ref Component is
      return S.Data[S.Cur_Len];
    end function Top;                

    procedure Pop(S : ref var Stack) is
      S.Cur_Len -= 1;
    end procedure Pop;
end class Stack;
So with these changes, Push should now compile. However, we are going to run into similar problems in Top and Pop, and we will need a precondition on both of those of {Count(S) > 0}.  Top also presents a more subtle problem.  The Data array is defined as being an array of optional Element_Type, but we are returning S.Data[S.Cur_Len] with a return type of simply Element_Type, implying that we somehow know that S.Data[S.Cur_Len] is not null.  So how do we know that?  Well, we know that Push only pushes non-null elements onto the stack (because the parameter type for X is a non-optional Element_Type.  But Data starts out all null, and when looking only at Top we have no way of knowing that S.Data[S.Cur_Len] is not null.

What we need is some kind of class invariant which ensures that when we get into Top, S.Data[S.Cur_Len] is not null.  What might that invariant look like?  What we know about a stack is that all of the elements that have been pushed onto the stack are non-null, but the slots in the stack past Cur_Len might be null if they have never been used.  So our invariant would seem to be:
   {for all I in 1..Cur_Len => Data[I] not null}
The basic requirement for a class invariant is that, presuming it is true on entering any externally visible operation, it is true again on exiting that operation.  It might be false in the middle of some operation, but it is preserved overall by all visible operations.  Furthermore, for any operation that creates an object of type Stack, the returned object must satisfy the invariant.

So let us check whether the above proposed invariant is true upon creation of a Stack object, and is preserved by operations that modify a Stack.  Create produces a new value with Cur_Len == 0, so the invariant is trivially true, since there are no values in the interval 1..Cur_Len.  Push and Pop modify a Stack object.  Pop is trivial since it is decrementing Cur_Len, so the interval 1..Cur_Len is shrinking.  Push is not trivial, because it is adding one to Cur_Len.  But if we presume the invariant holds on entry to Push, then the only new value to worry about is S'.Data[S'.Cur_Len], where S' represents the value of S after the Push operation is complete.  Since S'.Data[S'.Cur_Len] is set to X, and X is not null, our invariant is preserved.

Note that we might have tried an invariant of simply {Data[Cur_Len] not null}, which would have satisfied Top, and would be preserved by Push, but it would not be preserved by Pop, since without the more sophisticated invariant, Pop would know nothing about S'.Data[S'.Cur_Len].  It might be null if the invariant only claims that the very top element is non-null.

So after adding the class invariant, and the various preconditions, we can now get the Stack abstraction to compile.  However, it now becomes time to write some code that tests the Stack abstraction.  What we quickly discover is that the compiler rejects any call on any Stack operation with a precondition, since there are no postconditions that would allow it to know whether Count(S) > 0 or Count(S) < Max_Stack_Size(S).  At the call site there is no peeking into the class that defines the interface.  The compiler must prove the preconditions of the call looking only at information in the interface.  And without any postconditions, the compiler is stuck.

So now we have to start adding postconditions to the operations.  If we do a Create and then immediately try to do a Push onto the created stack object, our first job is to prove that Count(S) < Max_Stack_Size(S).  This implies we need a postcondition on Create that indicates the values of Count() and Max_Stack_Size() for the result of Create.  We could write this as follows:
   {Max_Stack_Size(Create) == Max; Count(Create) == 0} 
What this says is that the result of Create (within a postcondition, we can use the name of the creating function to represent the result) has a Max_Stack_Size that corresponds to the passed in parameter Max, and starts out with a Count of zero.  No big surprise here, but without such a postcondition the compiler would be pretty much at a loss.  We also might want to impose a precondition on Max of {Max > 0} just so the stack is at least of some use.

Next question we might have is what happens to Count as a result of Push.  This brings up an interesting question that is sometimes called the frame problem.  How do we know what might be written by an operation -- what is the write frame?  (We might also be interested in exactly what might be read by an operation -- what is the read frame -- but that is less relevant as far as checking preconditions.)  Since the Push operation has S as a ref var parameter, we could assume the worst and assume any function of S' (the value of S after Push) might return a different value, so after Push we know pretty much nothing about Count(S') and Max_Stack_Size(S').  Alternatively, we could take the view that if there is no postcondition, then nothing is changed. 

For ParaSail we are (tentatively) adopting a model that for any function of S mentioned in any precondition of an operation on a Stack object S, if such a function is not mentioned in a postcondition then it is presumed unchanged.  Now that we have mentioned Count() and Max_Stack_Size() in the preconditions of Push, we are duty bound to write postconditions any time the value of Count() or Max_Stack_Size() is changed.  This obligation will be checked when we compile the Stack class, so having written some preconditions, the compiler is going to start complaining because Count(S') is clearly going to be different from Count(S) after the call on Push.  Similar obligations will apply to Pop.  So now we will need to add a postcondition to Push of {Count(S') == Count(S) + 1} and a postcondition to Pop of {Count(S') == Count(S) - 1}.  We don't need to say that {Max_Stack_Size(S') == Max_Stack_Size(S)} because as indicated, our default assumption is that anything used in any precondition, but not mentioned in a postcondition, is presumed to be unchanged by the operation.  Again, the compiler will check that when it compiles the code for the operation.  If we don't want to say anything about the new value other than it is not the same as the old value, we could say {Some_Property(S') != Some_Property(S)}.  That won't help the compiler with proving later preconditions, but it will at least satisfy the obligation to mention the changed property in a postcondition.

So if we continue this process until we make the compiler happy, both with the Stack module itself, and with our various test programs, we will probably end up with a relatively fully annotated Stack like the following:
interface Stack
  <Component is Assignable<>; 
   Size_Type is Integer<>> is
    function Max_Stack_Size(S : Stack) -> Size_Type;
    function Count(S : Stack) -> Size_Type;

    function Create(Max : Size_Type {Max > 0}) -> Stack 
      {Max_Stack_Size(Create) == Max; Count(Create) == 0};
    
    procedure Push
      (S : ref var Stack {Count(S) < Max_Stack_Size(S)}; 
       X : Component) {Count(S') == Count(S) + 1};

    function Top(S : ref Stack {Count(S) > 0}) -> ref Component;
    
    procedure Pop(S : ref var Stack {Count(S) > 0}) 
      {Count(S') == Count(S) - 1};
end interface Stack;

class Stack is
    const Max_Len : Size_Type;
    var Cur_Len : Size_Type {Cur_Len in 0..Max_Len};
    type Index_Type is Size_Type {Index_Type in 1..Max_Len};
    var Data : Array<optional Component, Indexed_By => Index_Type>;    
  exports
    {for all I in 1..Cur_Len => Data[I] not null}   // invariant for Top()

    function Max_Stack_Size(S : Stack) -> Size_Type is
        return S.Max_Len;
    end function Max_Stack_Size;
    
    function Count(S : Stack) -> Size_Type is
      return S.Cur_Len;
    end function Count; 

    function Create(Max : Size_Type {Max > 0}) -> Stack
      {Max_Stack_Size(Create) == Max; Count(Create) == 0} is
      return (Max_Len => Max, Cur_Len => 0, Data => [.. => null]);
    end function Create;

    procedure Push
      (S : ref var Stack {Count(S) < Max_Stack_Size(S)}; 
       X : Component) {Count(S') == Count(S) + 1} is
      S.Cur_Len += 1;
      S.Data[S.Cur_Len] := X;
    end procedure Push;
      
    function Top(S : ref Stack {Count(S) > 0}) -> ref Component is
      return S.Data[S.Cur_Len];
    end function Top;                

    procedure Pop(S : ref var Stack {Count(S) > 0}) 
      {Count(S') == Count(S) - 1} is
        S.Cur_Len -= 1;
    end procedure Pop;
end class Stack;
One thing to note is that the compiler has not forced us to create postconditions that ensure all the nice properties of Stacks, such as after Push(S,X), then Top(S') == X, or if Top(S) == A, then after Push(S, X) and Pop(S'), Top(S'') == A.  These might be called correctness preconditions as opposed to merely safety preconditions, and they are up to the programmer to add if they see fit.  We have considered having a second class annotation in ParaSail which is not required to be provable at compile-time, but is designed to aid in correctness proofs, and provide additional debugging support.  We have set aside a syntax of doubly nested {{...}} for such correctness preconditions, but we aren't worrying about them more than that at this point.