Wednesday, March 9, 2011

WoDet 2011, ParaSail, and back to the future with Value Semantics

This past Sunday WoDet 2011, the 2nd Workshop on Determinism and Correctness in Parallel Programming was held in Newport Beach, CA.  It was a very interesting event, with a number of excellent talks and a lively debate at the end on the topic of whether current programming models are sufficient given the right tools, etc., or whether we need fundamentally new models to support programming going forward in a multi-core world. 

After listening to the various talks at the workshop, and while thinking about the debate topic, it seemed that ParaSail aligned nicely with an underlying thread in the research community moving toward, or perhaps returning to, what might best be called value semantics.  A number of the talks described one variant or another of multi-version concurrency control, which at its heart involves giving different threads their own private copy/version/snapshot of an object, and then dealing with merging any updates at specific later points, possibly rolling back one of the threads if no meaningful merge is possible.  This is in contrast to an approach where multiple threads share access to a single object, and use some kind of locking or "careful" lock-free operations to make direct updates to the shared object.  The multi-version approach results in fewer interactions between the threads, fewer synchronization points, and hence generally higher throughput in a highly parallel world. 

Looked at another way, the multi-version approach means that objects are being passed around by value rather than by reference, and updating the original object is done after all the new "values" have been generated independently and then appropriately merged (perhaps using some kind of parallel merging process).  This isn't going all the way to pure functional programming, because updates can eventually happen, but the bulk of the processing is performed on object values, with no underlying sharing.

If we go all the way back to the original versions of BASIC, a la Kemeny and Kurtz or Microsoft CP/M BASIC, pointers didn't exist, but there were strings, and you could create them and concatenate them and assign them, and you never by mistake overwrote memory because you concatenated into a too-short buffer, nor did you ever have two variables unintentionally referring to the "same" string in memory, such that changing one would unintentionally alter what you had in the other (even though the implementation might choose to share the space until one of them was altered). 

These kinds of value-semantics strings still exist in many scripting languages, but they have essentially disappeared from compiled languages in the years since BASIC was widely used (and compiled, in some cases).  In almost all modern compiled languages, strings are manipulated by reference, though in some languages they are immutable which to some extent skirts the issue of by-value vs. by-reference semantics.  And certainly when you get to larger, more abstract objects, by-reference semantics rules, with only the simplest of "struct" objects being manipulated by value.  And even when declared objects have value semantics, parameters generally revert to by-reference semantics, again introducing the possibility of unintended aliasing between two distinctly named objects.  (Or if by-value semantics are available for parameter passing (as in C++), it can result in "slicing" of the object, so that it is truncated from its original run-time type to the compile-time type of the formal parameter.)

Unintended aliasing due to by-reference semantics can cause nasty (but still reproducible) bugs in any sequential language that uses by-reference semantics, but in a parallel language, such bugs can become race conditions, which can result in hard-to-reproduce, timing-dependent bugs.

In any case, the net effect of the various ParaSail rules that restrict aliasing, eliminate reassignable pointers, disallow global variables, etc. is that ParaSail provides value semantics for all non-concurrent objects, of arbitrarily complex types.  Note that this does not imply that objects need to be copied when passed as a parameter.  Rather the ParaSail rules allow non-concurrent objects to be passed by reference or by copy, with the same results. This allows efficient by-reference parameter passing when the call stays within the same address space, but allows transferring the object between address spaces, such as might be needed by a call between a CPU and a GPU.  This essentially value semantics eliminates a large class of common bugs, and makes it straightforward for the compiler to detect and disallow at compile-time any possible race conditions.

As implied by the growing interest in multi-version concurrency control in the research community, a return to the world of value semantics might be a good answer for the multi-core challenges.

Tuesday, March 1, 2011

Streaming in Parasail -- A Unix-like pipeline

We have been considering how streaming can best be supported in ParaSail.  By streaming we mean having a series of components each of which reads inputs from a stream and writes outputs to a stream, with each component running in parallel with the others.  The most natural seems to be to define an abstract Stream_Component interface which takes an Input_Stream and Output_Stream type and declares an operation Transform which reads one or more input elements from the Input_Stream, transforms the input(s) in some way, and then writes one or more output elements to the Output_Stream.   An actual component would be defined by some concrete module that implements this abstract interface, with an appropriate constructor function that provides whatever arguments are needed to specify or control the transformation.

This might be the abstract Stream_Component interface:
abstract interface Stream_Component
  <Input_Stream<>; Output_Stream<>> is
    procedure Transform
      (Args : Stream_Component;
       Inp : ref var Input_Stream; 
       Outp : ref var Output_Stream);
end interface Stream_Component;
The Input_Stream and Output_Stream interfaces might look like this:
abstract concurrent interface Input_Stream
  <Element_Type is Assignable<>> is
    function Get(Stream : queued var Input_Stream) -> optional Element_Type;
end interface Input_Stream;

abstract concurrent interface Output_Stream
  <Element_Type is Assignable<>> is
    procedure Put
      (Stream : queued var Output_Stream; Element : Element_Type);
    procedure Close(Stream : queued var Output_Stream);
end interface Output_Stream;
We now want to put together two or more stream components into a pipeline. Let us presume we define a Pipeline interface:
interface Pipeline
  <Pipe_Input is Input_Stream<>;
   Pipe_Output is Output_Stream<>> 
  implements Stream_Component<Pipe_Input, Pipe_Output> is
    operator "|"
     (Left is Stream_Component
        <Pipe_Input,
         Output_Stream<Internal_Type is Assignable<>>;
      Right is Stream_Component
        <Input_Stream<Internal_Type>,
         Pipe_Output>)
     -> Pipeline;
    procedure Transform
      (Args : Pipeline;
       Inp : ref var Pipe_Input; 
       Outp : ref var Pipe_Output);
end interface Pipeline;
The pipeline "|" operator takes two stream components and produces a Pipeline stream component, feeding the output of the first stream component into the input for the second stream component, by using an IO_Queue:
concurrent interface IO_Queue<Element_Type is Assignable<>> 
  implements Input_Stream<Element_Type>, Output_Stream<Element_Type> is
    function Create(Buffer_Size : Univ_Integer := 100) -> IO_Queue;
    function Get(Queue : queued var IO_Queue) -> optional Element_Type;
    procedure Put
      (Queue : queued var IO_Queue; Element : Element_Type);
    procedure Close(Queue : queued var IO_Queue);
end interface IO_Queue;
When constructing a pipeline from two stream components, the output stream of the first must have the same Element_Type as the input stream for the second, so that they can be opposite ends of the IO_Queue. The result pipeline has the same input stream as the first, and the same output stream as the second. The implementation of the pipeline Transform procedure would construct an IO_Queue object and then pass it to parallel invocations of the Transform procedures of the two stream components:
class Pipeline is
    const Left : Stream_Component+;
    const Right : Stream_Component+;
  exports
    operator "|"
     (Left is Stream_Component
        <Pipe_Input,
         Output_Stream<Internal_Type is Assignable<>>;
      Right is Stream_Component
        <Input_Stream<Internal_Type>,
         Pipe_Output>)
     -> Pipeline is
      return (Left => Left; Right => Right);
    end operator "|"; 
    procedure Transform
      (Args : Pipeline;
       Inp : ref var Pipe_Input; 
       Outp : ref var Pipe_Output) is
        var Queue : IO_Queue := Create();
      then
        Transform(Left, Inp, Queue);
      || 
        Transform(Right, Queue, Outp);
    end procedure Transform;
end class Pipeline;
If we have a set of components with both input and output streams being streams of characters, then the "|" operator is essentially equivalent to the Unix shell's "|" pipe command. To write a "program" that is compatible with such a pipeline, we define a module that implements Stream_Component, with Input_Stream and Output_Stream both having Element_Type of Character<>. Any constructor function within such a module may be used to provide the parameters to the program. The Transform procedure of the module is the one that actually performs the action of the program. Alternatively, we can be somewhat more flexible, and allow the streams to have higher level element types, such as words or lines.

For example, here is a trivial "element count" program, which allows the input element type to be any sort of thing. The output element type is presumed to be some Character type:
interface Elem_Count
  <Input_Stream<>, Output_Stream<Character<>>>
  implements Stream_Component
    <Input_Stream, Output_Stream> is
    function Args()->Elem_Count;  // takes no parameters
    procedure Transform
      (Args : Elem_Count;
       Inp : ref var Input_Stream; Outp : ref var Output_Stream);
       // Prints a count of the elements in Inp on the Outp stream
end interface Elem_Count;

class Elem_Count is
  exports
    function Args() -> Result : Elem_Count is
      return;
    end function Args;
    procedure Transform
      (Args : Character_Count;
       Inp : ref var Input_Stream; Outp : ref var Output_Stream) is
       // Prints a count of the elements in Inp on the Outp stream
       var Count : Univ_Integer := 0;
       while Get(Inp) not null loop
         Count += 1;
       end loop;
       const Answer : String<Output_Stream::Element_Type> := To_String(Count);
       // Copy the answer to the output stream one character at a time
       // (in all probability, we would actually use an output stream
       //  with an operation that takes a string of characters directly).
       for each C of Answer forward loop
          Put(Outp, C);
       end loop;
       Close(Outp);
    end procedure Transform;
end class Elem_Count; 
In a more realistic example, the Args constructor function would take one or more arguments, which it would store as fields of the stream component object. The Transform procedure would refer to the arguments using the Args parameter, which is an instance of the stream component object.

Presuming we create a number of stream components like Elem_Count, we could construct a pipeline like the following:
Read::Args("my_file.txt") | Break_Into_Words::Args() | Elem_Count::Args()
This would presumably read my_file.txt producing a stream of characters, break it into a stream of words, and then count the words and produce the count as a string on the output stream of characters. With a little syntactic sugar by some kind of ParaSail "shell" this could become:
Read my_file.txt | Break_Into_Words | Elem_Count
This is a little bit of a "cheat" as we would have to pass in the "Context" or equivalent to "Read" to provide it access to the underlying file system. The Context argument could presumably be supplied automatically as part of the syntactic sugar provided by the "shell" if desired.