[R6RS] Ticket #110: Remove double phase semantics

William D Clinger will at ccs.neu.edu
Tue Nov 28 11:13:42 EST 2006

Thanks for responding, Matthew.  It really helped to
explain where you've been coming from.  It looks like
we're in agreement on most technical issues, but do
not yet agree on what the R6RS should require.

Matthew wrote:
> >     (define-syntax m (lambda (e) 13))
> I agree that it's not clear whether this is valid. But the question is
> not whether `13' as a compile-time expression somehow produces a
> run-time 13, but whether `13' is allowed as a syntax object.
> In any case, this one should certainly work:
>  (define-syntax m (lambda (e) (datum->syntax e 13)))

Thanks for the correction.  The example without the
call to datum->syntax was simplified from an example
given by Andre, which also omitted the call.
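
For reference, a minimal sketch of the corrected macro as a
complete top-level program.  It assumes an implementation
that follows the draft syntax-case library; the syntax-case
wrapper is an addition here, used only to obtain an
identifier (k) to pass as the first argument of
datum->syntax, since some implementations may insist on an
identifier there rather than the whole form e.

```scheme
(import (rnrs))

;; The transformer ignores its operands and returns the
;; datum 13 wrapped as a syntax object.  The keyword k is
;; used as the template identifier for datum->syntax.
(define-syntax m
  (lambda (e)
    (syntax-case e ()
      ((k) (datum->syntax #'k 13)))))

;; The use (m) expands into the constant 13.
(display (m))
(newline)
```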

> I think what you're saying is that `datum->syntax' and `syntax->datum'
> must know how to deal with all things that have an external
> representation, i.e., all datums.

Almost.  We have to be careful about the meanings of
"datum" and "deal with", lest we draw false conclusions
such as

> To put it another way, things with an external representation can't be
> implemented in a library, because the syntax+library system implements
> libraries, so concepts with an external representation must already
> exist before the library system exists.

The datum->syntax and syntax->datum procedures must be
able to recognize and to expose certain kinds of datum
in the forms in which those data exist at the relevant
phases.  They do not even have to know about every kind
of datum, and they certainly don't have to know about
the (potentially much larger) set of things that have
external representations.

In an extensible system written in Scheme, it is natural
to define new types dynamically, and to define external
representations for some of those new types.  It would
be a serious mistake for the R6RS to preclude libraries
that define new types along with external representations
for values of those new types.

> I don't believe that allowing sharing across phases would avoid this
> chicken-and-egg problem.

Nor do I.  *Allowing* sharing across phases doesn't do
much good; we would have to *require* it, and we would
probably have to require some other things as well.

If we can't require sharing across phases, the best we
can do is to allow it, while making sure that nothing
else in the R6RS inadvertently assumes the separated
binding semantics.  The contracts for libraries that
define new types with new external representations
(e.g. the reference implementations of the arithmetic
libraries) will just have to say they probably won't
work in systems that use separated binding semantics.

> `datum->syntax' works only on datums. It should raise an exception when
> given any kind of value that is not a datum. (I think this is implicit
> in the conventions of R6RS, since the argument for `datum->syntax' is
> called "datum", but it may make sense to be more explicit).

I object to this proposed change.  (The naming conventions
in section 5.1 of the draft R6RS do not list "datum" as
one of the metavariables that imply a type restriction,
so the restriction is not implied by the 5.91 draft.
Note also that the 5.91 draft says "datum should be a
datum value."  The word "should" is not the word "must".)

I do not object to requiring a syntax exception to be
raised when someone tries to insert a non-datum into the
output of a macro.  Implementations should be allowed to
detect that within the compiler, however; they should
not be required to enforce it within datum->syntax.  This
distinction is, alas, observable to programmers who
establish an exception handler within a macro transformer
and then call datum->syntax experimentally to see whether
it raises an exception when passed a non-datum, but there
is nothing to be gained by trying to make that kind of
nonsensical programming portable.
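
To make the observable difference concrete, here is a
hedged sketch of that kind of experiment.  The macro name
probe is hypothetical, and the result is deliberately
unspecified: it depends on whether the implementation
checks for non-datums inside datum->syntax or later, in
the compiler.

```scheme
(import (rnrs))

;; probe calls datum->syntax on a procedure (not a datum)
;; at expand time, under an exception handler, and expands
;; into #t if an exception was raised and #f otherwise.
;; The wrapped procedure itself is discarded, so it never
;; reaches the macro's output.
(define-syntax probe
  (lambda (e)
    (syntax-case e ()
      ((k)
       (let ((raised?
              (guard (c (#t #t))
                (datum->syntax #'k (lambda () 0))
                #f)))
         (datum->syntax #'k raised?))))))

;; Which boolean is displayed is implementation-dependent.
(display (probe))
(newline)
```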

The reason this matters to me is that extensible systems
may want to support non-R6RS modes in which any value that
has an external representation can be used within a quoted
constant.  These systems should be allowed to place mode
dependencies within a single module (typically some phase
of the compiler), and should be allowed to use the R6RS
library and syntax-case subsystems without change even in
non-R6RS modes.  In my opinion, this would increase the
acceptance of the R6RS; if implementors were to support
R6RS features only for scripts, disabling them for other
modes, it would increase the already widespread perception
that implementors don't like the R6RS.

> > Since PLT Scheme has been using the separated binding semantics
> > for some time, one might ask why these problems have not been
> > observed in practice.  The answer, I believe, is that PLT Scheme
> > does not use the separated binding semantics consistently; it
> > escapes from that semantics by writing its basic libraries in
> > some other language (probably C or C++), and libraries written
> > in those other languages use a shared binding semantics.
> It's because the module system exists on top of the layer that
> defines external representations.

Agreed.  The reason that matters is as I said: the layer
that defines external representations must use a shared
binding semantics.

The bottom line is that implementors who want to write
most of their code in Scheme, and allow Scheme libraries
to define new types and new external representations,
will use a shared binding semantics for libraries.

> The fact that the lower layer is in C is unfortunate, but not
> important. The fact that the low layer exists outside the library
> system is indeed important.

Let's remember, however, that the reason it is important
for the low layer to exist outside the PLT library system
is that, in PLT Scheme, the library system uses separated
binding semantics.

> But, it's also not important that the lower layer persists across
> phases. You can always separate compile time and run time in MzScheme
> --- even at the level of symbols --- by restarting MzScheme.

Sure.  Persistence isn't what matters.  What matters is
that two incompatible registries never exist within a
single execution, or if they do exist then incompatible
representations that result from incompatible registries
become resolved via semantics-preserving marshalling.

> > To fix this problem, I recommend the following.
> > 
> >   * Accept the recommendation of Ticket #110, and require
> >     implementations of R6RS Scheme to use the shared binding
> >     semantics.
> This doesn't seem practical to me. A compiler like Chicken, for
> example, would have to package compiled code with all state accumulated
> at expand time.

I give up.  What's so special about Chicken?

> Implicitly wrap a `datum->syntax' around the result of any transformer?
> That's fine with me.

It's fine with me, too, so long as the draft R6RS specification
of datum->syntax isn't changed to interfere with the modularity
of extensible systems.

> >   * Requiring that automagical transformation would not be
> >     enough to make portable libraries possible in general,
> >     but it might become possible to write portable libraries
> >     provided they don't define any record types.
> I think portable libraries are possible, and I also think it's
> important to restrict the range of `datum->syntax' to datums.

I think the draft R6RS may make portable libraries possible
in practice, if not theory, because most implementors of
extensible systems will figure out that they need to use
the shared binding semantics for libraries.  We probably
don't have to worry too much about implementors who satisfy
the R6RS in a legalistic sense just to show how badly it is
broken from a theoretical point of view.

As stated earlier, I think it is important we not restrict
the domain of datum->syntax to datums.

