[R6RS] syntax-case semantics
William D Clinger
will at ccs.neu.edu
Wed Mar 22 10:08:32 EST 2006
> I'm not sure I buy this argument that opacity is primarily an
> implementation concern. How is the non-symbolness of
> identifiers not primarily driven by implementation?
The non-symbolness of identifiers is definitely driven by
implementation concerns.
> I know this is again naive, but the recent discussions on
> c.l.s. have shown (again) it's a view some people would like
> to hold---they don't understand why they can't just use symbols.
They *can* use symbols. In particular, Larceny represents
identifiers by symbols, in both its high-level (R5RS) and
low-level (explicit renaming) macro systems.
The disadvantage of using symbols is that it's slow. If
we explain to people that the macro system will run faster
if we represent identifiers by non-symbols, then most will
accept that, because it offers an advantage to programmers.
The question at hand is whether we can honestly say that
opaqueness for syntax objects in general conveys similar
advantages to programmers.
The main advantage that has been offered is that opaqueness
makes it easier for systems to correlate source positions
with macro-generated code. The problem with this argument
is that it's pretty easy to achieve most of the same thing
using association lists---you don't even need to use hash
tables, let alone weak hash tables.
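As a minimal sketch of the alist approach (all names here are
invented for illustration, not from any actual system), a reader
or expander could record locations on the side like this:

```scheme
;; Hypothetical sketch: correlate S-expressions with their
;; source locations using an association list keyed on object
;; identity (assq uses eq?).  No hash tables required.

(define source-locations '())   ; alist of (datum . location)

(define (record-location! datum location)
  (set! source-locations
        (cons (cons datum location) source-locations)))

(define (lookup-location datum)
  (let ((entry (assq datum source-locations)))
    (and entry (cdr entry))))
```

This works for pairs and vectors, which are distinct objects,
but keying on eq? identity is exactly where the atoms problem
discussed next comes in.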
The main problem with using association lists, so far as I can
see, is with atoms. Some have suggested that systems should
have multiple representations for the empty list, et cetera.
This creates a problem for eq? and null?, which must treat all
empty lists the same: either eq? and null? become slower, or
special versions of them must be used during macro expansion,
which implies special versions of all the R5RS procedures that
implicitly rely on them. Having a second implementation of
the standard procedure library defeats the purpose of reusing
the standard library.
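To make that cost concrete, here is a hypothetical sketch (the
names are invented) of what null? would have to become if the
expander introduced a second empty-list object:

```scheme
;; Hypothetical: suppose the expander used a distinguished
;; second empty list, *expander-nil*, so each occurrence of ()
;; could carry its own source information.

(define *expander-nil* (vector 'expander-nil))  ; invented stand-in

;; Then null? must recognize both representations:
(define (null?* x)
  (or (eq? x '())
      (eq? x *expander-nil*)))
```

Every list-walking procedure that terminates on null? (map,
length, append, and so on) would pay this cost on every call,
or else need a second, expansion-time-only definition.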
On the other hand, opaque syntax objects don't fully solve
the problem with atoms either. A procedural macro can still
unwrap a syntax object down to the datum level, and then put
an atomic datum so obtained into a new syntax object. That
will also lose the correlation between the original source
location of the atomic datum and its location in the output
of the macro.
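The rewrap problem can be seen directly with the syntax-case
unwrapping operations (shown here with the R6RS names
syntax->datum and datum->syntax; the macro itself is just an
illustration):

```scheme
;; Unwrapping an atom to a bare datum and putting it back into
;; a new syntax object discards the original wrap, and with it
;; any source-location information the system attached.
(define-syntax copy-atom
  (lambda (stx)
    (syntax-case stx ()
      ((_ x)
       (let ((d (syntax->datum #'x)))        ; down to the bare datum
         (datum->syntax #'copy-atom d))))))  ; fresh wrap: old location lost
```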
In practice, I doubt whether the difference in accuracy
between opaque and non-opaque syntax objects is worth the
overhead of defining a whole set of new procedures for
taking apart syntax objects.
I would be less opposed to those new procedures if we could
keep them out of the standard runtime environment, where
they are essentially useless, and restrict them to the
environment used by procedural macro-expanders. That gets
us into a discussion of the phasing issues.
For example: Are procedural macros allowed to refer to the
same variables that will exist at runtime? In particular,
are they allowed to change the values of those variables?
I certainly hope not, because the order of side effects
matters, yet macro expansion time will not be well-defined
by the R6RS: we will still have systems ranging from pure
interpreters to batch compilers with separate compilation.
Furthermore, many of the same arguments we have given for
opaque syntax objects can be made against mixing the macro
and runtime environments. In particular, the correlation
between source and object will be easier for some systems
if procedural macros are not allowed to refer to any help
procedures except for those they define locally and to
those that are defined by some subset (preferably pure!)
of the standard R6RS procedures. (This gets back to Kent's
point about the problem of finding syntax objects hidden
in closures that escape from procedural macros. If there
is no way for such closures to escape, the problem goes
away.)
I don't believe these issues have yet been addressed by
our discussions. If anything, I think the proposals Kent
has set forth assume arbitrary mixing of macro and runtime
environments, which I think is a mistake.