Thursday, May 20, 2010

ParaSail without pointers?

We are working on defining the semantics for pointers in ParaSail.  As described in an earlier entry, we plan to use region-based storage management rather than more fine-grained garbage collection.  However, there is the question of how pointers are associated with their region, and more generally how pointers are handled in the type system.  We are considering a somewhat radical approach:  effectively eliminate most pointers, replacing them with generalized container indexing and expandable objects.  Why bother?  Because unrestricted use of pointers can create storage management complexity, including both dangling references and storage leakage, and because pointers create aliasing of variables, where there are multiple ways to get to the same variable, interfering with any attempts to use handoff semantics to avoid unsynchronized access to a (non-concurrent) variable between threads.

By generalized container indexing, we mean using an abstract notion of index into an abstract notion of container.  The most basic kind of container is an array, with the index being an integer value or an enumeration value.  Another kind of container is a hash table, with the index (the key) being something like a string value.  We could imagine that a single container might have multiple kinds of indices, in the same way that a relation in a relational DBMS can have multiple indices.  So for example, a hash table might have an index that identifies directly the node in the hash table using something that is the moral equivalent of a pointer, though it can presumably survive the node being deleted and will properly indicate that the node no longer exists, rather than simply acting as a dangling reference.  If we go to the extreme of a generalized region of heterogeneous objects as the container, and the index identifies the object within the region, we are very close to a pointer.  However, we would still require that such an index correctly handle deletion, indicating that the object no longer exists, rather than simply dangling.

Although indexing can also produce aliasing, it is very well defined and can be easily checked for at the point of the indexing.  Two expressions that index into the same container are aliased if and only if the reference into the container determined by the two indices identify the same element of the container.  For example, A[I] and A[J] are aliased only if I == J.  Or Table["abc"] and Table["def"] are aliased only if the two keys reference the same element of Table.

By expandable objects, we mean treating a declared but not initialized object (including a component of another object) as a kind of stub into which a value can be stored.  Even an initialized object could be overwritten via an assignment which might change the size of the object.  Finally, an initialized object could be set back to its stub state, where the object doesn't really exist anymore.  We discussed a related issue in a past entry on ParaSail references.

Expandable objects eliminate a common use of pointers as a level of indirection to support objects of variable or unknown size.  We propose to provide qualifiers optional and mutable on objects to indicate that they are expandable.  The qualifier optional means that the object starts out null, but may be assigned a non-null value.  The qualifier mutable means that the object may be assigned multiple values of different sizes during its lifetime.  If both are given, then the object starts out null, can be assigned non-null values of varying size during its lifetime, and can be assigned back to null.

Another source of pointers is for back pointers, that is pointers back to the enclosing object (or to the previous node in a doubly-linked list).  We propose to provide something analogous, where a component of an object has the equivalent of a (potentially null) reference to its enclosing object.  In this way a binary tree structure can be constructed by an object with Left and Right optional subobjects, plus an encloser reference to the enclosing object, if any.  Similarly a linked list could be represented as an object with an optional Next subobject, and an encloser reference to the Prev node/enclosing object.  Some kind of move or swap operation would be useful in addition to assignment to support rearranging a tree or linked list structured in this way.

Note that it is perfectly safe to use conventional pointers to objects that are no longer changing (or where any remaining dynamic parts are within concurrent subobjects), and that are known to outlive the object containing the pointer.  So ParaSail should probably allow such pointers, as they introduce no race conditions, nor any other safety issues.

2 comments:

  1. If we use expandable subobjects to represent, for example, the branches of a tree, then we need a syntax for walking such an object tree without presuming we are loading and storing pointers. Instead, we are establishing a reference to an object, and then establishing a reference to a subobject.

    By-reference parameter passing is essentially establishing a reference, and we have proposed using "=>" for named parameter association. So perhaps "=>" might also be used for establishing a reference:

      ref var X => Root;
      ref var Left_Subtree => X.Left;

    Using our generalized "for" loop syntax (see subsequent post on generalized loops), this might be:

      for X => Root then X.Left || X.Right
       while Exists(X) loop
        Process(X);
      end loop;

    Using the explicit "continue loop with ..." syntax, this could be:

      for X => Root
       while Exists(X) loop
        Process(X);
       || continue loop with X => X.Left;
       || continue loop with X => X.Right;
      end loop;

    ReplyDelete
  2. If we translate "=>" as "refers to" then this seems to read fairly naturally.

    ReplyDelete