[R6RS] rationale for sealed record types

William D Clinger will at ccs.neu.edu
Wed May 3 19:09:02 EDT 2006


I have been asked to write a rationale for the "sealed"
feature of SRFI 76, which has already been withdrawn.
This rationale is related to issues 35 and 37, on which
we just voted by email and telephone conference call:

35. flush sealed clause in records

    Vote: abstain, yes, yes, yes, yes

    Outcome: REVOTE

37. allow record? to be true of built-in types

    Vote: no, yes, no, yes, yes

    Outcome: yes

****************************************************************

First, let me quote one of the better statements of a rationale
for omitting the "sealed" feature.  The following is from
http://srfi.schemers.org/srfi-76/mail-archive/msg00103.html

    To: Michael Sperber <sperber at informatik.uni-tuebingen.de>
    Subject: Opacity considered harmful [was Re: Revised SRFI-76 (R6RS Records) draft]
    From: Tony Garnock-Jones <tonyg at kcbbs.gen.nz>
    Date: Sat, 07 Jan 2006 10:29:52 +0000
    Cc: srfi-76 at srfi.schemers.org
    
    Michael Sperber wrote:
    > Without opacity and with reflection, a client who gets its hands on a
    > record can get at its fields, thus breaking the abstractions provided
    > by the maker of that record.
    
    I see this as an unqualified benefit.
    
    Sealed types and opaque records seem very similar to final classes and
    private members from the Java world. I have never had *any* benefit in
    my professional programming work from sealedness/finality, nor from
    opacity/privateness; quite the opposite: every time I have even
    *noticed* that something is either final, or private, it has been
    because I'm trying to correct for an error in some library code or
    trying to extend it in some direction unforeseen by the author. (A
    similar objection applies to non-virtual methods in languages like C#.)
    
    Sealedness and opacity have given me nothing but trouble, and have on
    occasion cost my clients real money.
    
    Just a thought!
    
    Regards,
      Tony
    
    PS: Opacity can be recovered in Scheme, should it be absolutely
    necessary, by building capabilities out of lambdas.

This argument was supported by an implementor of Scheme, in
http://srfi.schemers.org/srfi-76/mail-archive/msg00105.html

    > Sealedness and opacity have given me nothing but trouble, and have on
    > occasion cost my clients real money.
    >
    
    I absolutely and strongly support this!
    
    Building up insurmountable walls to avoid peeking into
    data-structures originates in my experience mainly from
    psychological reasons, it's practical use being minor or
    nonexistant. Reflection and transparent data, on the other
    hand, have very much practical use, even if it's just for
    debugging.
    
    cheers,
    felix

Against this anti-rationale, I will describe a situation in which
it is appropriate for the author of a record declaration to use
the sealed feature, and in which the author of that declaration
cannot use procedures instead.

****************************************************************

As the outcome of issue 37 shows, several R6RS editors regard
records as, potentially, a basic feature that can be used to
implement the Scheme language itself.  Without bothering to
argue the point, as most of the editors have experience with
implementation of Scheme and also with implementation of other
applications, I will state that the use of records to implement
Scheme is not all that different from the use of records or
similar features (e.g. sums/structs/classes) to implement other
applications.  As can be seen from the two emails quoted above,
some of those who oppose the sealed feature have based their
arguments on this similarity.

Let us therefore consider how records might be used to implement
some feature of Scheme, such as pairs.  One possible definition
of pairs as records is (untested):

    (define pair-type-descriptor
      (make-record-type-descriptor 'pair #f #f #f #f
       '((mutable car) (mutable cdr))))

    (define cons
      (make-record-constructor-descriptor
       pair-type-descriptor
       #f
       (lambda (new) (lambda (x y) (new x y)))))

    (define pair? (record-predicate pair-type-descriptor))

    (define car (record-accessor pair-type-descriptor 0))
    (define cdr (record-accessor pair-type-descriptor 1))

    (define set-car! (record-mutator pair-type-descriptor 0))
    (define set-cdr! (record-mutator pair-type-descriptor 1))

This might well suffice for a meta-circular interpreter of R6RS
Scheme, which might well use the same record facility that is
used by interpreted programs.

Now consider what will happen if the meta-circular interpreter
becomes popular enough to justify the creation of a compatible
compiler.  By then, many programmers will have discovered that
they can define new record types that inherit from pairs, and
will have used this ability to create extended pairs of many
different kinds.  That makes it surpassingly difficult for the
compiler to generate efficient code for such basic operations
as car and cdr.

The problem is that car and cdr will be irrevocably polymorphic.
While the compiler writer can expend arbitrarily large amounts
of effort on flow analysis to detect special cases in which the
arguments to car and cdr can only have been created by the cons
procedure, and these flow analyses will work admirably on small
benchmarks, they will fail (conservatively) on most uses of car
and cdr in most of the large programs that contain even modest
extensions to the pair record type.  A compiler that works well
on small programs but generates inefficient code for large
programs is not likely to be an effective compiler.

****************************************************************

This problem arises in ordinary applications as well.  When the
implementor of an abstract data type chooses to represent that
ADT by a Java class type, and allows one of the classes that
represent the ADT to be exposed and subclassed, then the ADT
is no longer abstract.  Its implementors must expose enough
information to allow for effective subclassing, and must commit
to enough of the representation to allow those subclasses to
continue to work even as the ADT evolves.

In practice, the ADT cannot evolve freely over time.  Its
representation cannot be changed without breaking most of its
subclasses.  This kind of brittleness has been the ruin of many
large programs.  While it is true that some programmers do not
have enough experience to have seen the disastrous consequences
of this brittleness, or have seen the consequences without
understanding the causes, their inexperience is not a valid
argument against language features that can be used to create
software that evolves more gracefully over time.

Our meta-circular interpreter provides a concrete example of
this.  The implementors of the meta-circular interpreter and
its companion compiler cannot change the representation of pairs
to some more efficient representation without breaking all of
the programs that extended the pair type.

****************************************************************

Returning to the problem of adding an effective Scheme compiler
to an implementation that has defined pairs as records:  What
could the implementor of the original meta-circular interpreter
have done to prevent this problem from arising?

The implementor could have made pair records opaque via

    (define pair-type-descriptor
      (make-record-type-descriptor 'pair #f #f #f #t
       '((mutable car) (mutable cdr))))

and also have taken pains to prevent the pair-type-descriptor
from being exposed to interpreted programs.  Then interpreted
programs would not be able to use the reflective facilities of
SRFI 76 to extract the pair-type-descriptor from a pair, and
would not be able to extend the pair record type in any way.

On the other hand, this means that the record? procedure would
have returned #f when given a pair argument.  Judging by the
outcome of issue 37, a majority of the R6RS editors somehow
feel that it is somehow appropriate for pairs to be exposed as
records.  I myself do not understand any rationale for that,
but let us suppose some rationale actually exists, and that it
had applied to the meta-circular interpreter, so that it would
not have been appropriate to make pairs opaque.

What else then could the implementor of the original
meta-circular interpreter have done to prevent the eventual
compiler from encountering a serious inefficiency?

The implementor could have made pair records transparent but
sealed, via

    (define pair-type-descriptor
      (make-record-type-descriptor 'pair #f #f #t #f
       '((mutable car) (mutable cdr))))

This would have worked just fine.

****************************************************************

But suppose the sealed feature were not available.  How then
could the original implementor have implemented pairs as records
without making them opaque, and without allowing programmers to
extend the pair type in ways that would make it impossible for
the eventual compiler to generate efficient code for car and cdr?

I don't know.  I doubt whether it could be done.  The implementor
could not have represented pairs as procedures, because the R5RS
(and presumably the R6RS) require pairs and procedures to be
disjoint; in any case, it is hard to see how car and cdr could be
made efficient when operating on pairs that are represented by
procedures.

That is a rationale for sealed records.

Will



More information about the R6RS mailing list