[R6RS] new syntax srfi draft, reference implementation

dyb at cs.indiana.edu
Fri Jun 16 15:35:01 EDT 2006


Mike,

I tried to think of a way to move forward without responding to your "bit
of a rant", as you put it, but I haven't thought of one, so here goes. 
I've tried to be diplomatic, but it's not easy to be diplomatic in
responding to rants.  Please forgive me if I'm not diplomatic enough, and
try to respond without further ranting if possible so that we can come to
some sort of closure on this and get the syntax SRFI out the door.

>> [Kent]
>> Although I've tried to be as even handed as possible, I'm afraid
>> you'll be unhappy because the resulting text doesn't support the
>> conclusion you want, which is that flushing internal define-syntax
>> will somehow eliminate the need to specify the expansion algorithm.

> [Mike]
> You keep putting that into my mouth, but I never said that.  (In fact,
> I repeatedly said that I never said that.)  So, to be perfectly clear:
> I want R6RS to specify the expansion algorithm completely and
> precisely.

I don't recall even thinking that this was your reason for not liking
internal define-syntax before the brief exchange that led to your
Issue text.  I thought your reason was some generic concern about
complexity, although I never understood why you thought internal
define-syntax adds complexity.

In any case, I'll try to explain where I got the impression after the
exchange began.  First, in the srfi-83 discussion you suggested I
reference (http://srfi.schemers.org/srfi-83/mail-archive/msg00081.html),
you claimed that define-syntax requires one to "explain the expansion
algorithm" and implied that this was a bad thing:

> So, with internal DEFINE-SYNTAX, you get more generality (DEFINE-SYNTAX
> can occur wherever DEFINE occurs), but you also have to explain the
> expansion algorithm, ...

My impression was strengthened by your Issue text.  The text started out
with:

> Allowing `define-syntax' in local environments incurs subtle issues
> with the interaction between shadowing and macro expansion.  For
> example, inconsistencies arise between the behavior of internal
> `define' and internal `define-syntax'.

then presented a pair of examples, and continued with:

> The semantics of expressions like these are highly sensitive to the
> particular operational specification of macro expansion, and Scheme
> implementations have historically used different such formulations.

then presented a reference to back this up, and continued with:

> Similar issues already arise in the context of R5RS Scheme, but are
> much harder to trigger there.  Internal `define-syntax' aggravates
> them.

In other words, the text claims that internal define-syntax is "highly
sensitive" to the operational semantics of macro expansion while internal
define is not.  I inferred from this that, in your opinion, we can get by
without specifying the semantics operationally, i.e., without specifying
the expansion algorithm, if we leave out internal define-syntax.

> If my issue had been about this, I would have written so
> explicitly.  Also for the record:  Whatever the issue is, I don't want
> to support conclusions---I want to stimulate debate.

You didn't say explicitly what your purpose in writing was, so I was
forced to draw my own inferences.  As far as wanting to stimulate debate,
that begs the question.  You could stimulate debate on many issues---why
this one?

> Now, you also completely rewrote what I had written to be about the
> need to specify an expansion algorithm, including changing the
> examples, which I had carefully constructed to be about something
> else.

Okay, let's look at the examples in the context of the Issue text that
surrounds them:

> Allowing `define-syntax' in local environments incurs subtle issues
> with the interaction between shadowing and macro expansion.  For
> example, inconsistencies arise between the behavior of internal
> `define' and internal `define-syntax'.
>
> As an example, consider the following expression:
> 
> (let-syntax ((foo (syntax-rules ()
>                     ((foo) 'outer))))
>   (let ()
>     (define a (foo))
>     (define-syntax foo
>       (syntax-rules ()
>         ((foo) 'inner)))
>     a))
>
> As the right-hand side of the definition of `a' is expanded only after
> the inner macro definition for `foo' has been collected, the result is 'inner.
> 
> Next, consider this expression:
> 
> (let-syntax ((foo (syntax-rules ()
>                     ((foo ?x) (define ?x 'outer)))))
>   (let ()
>     (foo a)
>     (define-syntax foo
>       (syntax-rules ()
>         ((foo ?x) (define ?x 'inner))))
>     a))
>
> Again, the first form in the let refers to a macro definition named
> `foo', definitions for which occur in places analogous to the first
> example.  However, this time (foo a) gets expanded before the inner
> definition for the `foo' macro has been seen---and hence uses the
> outer macro.  Consequently, this expression returns 'outer.

From the lead-in text, I expected the examples to illustrate some subtle
issue with internal syntax definitions, and/or some inconsistency between
internal define-syntax and internal define.  So it's confusing that the
only differences between the two examples involve whether the macros
introduce a whole variable definition or just the RHS.  Given this, they
actually seem to illustrate a subtle difference with internal *variable*
definitions rather than with internal syntax definitions.

Indeed, it turns out that the internal syntax definitions are essentially
red herrings.  If we eliminate them by replacing them with internal
variable definitions, here's what we get.

(let-syntax ((foo (syntax-rules ()
                    ((foo) 'outer))))
  (let ()
    (define a (foo))
    (define foo (lambda () 'inner))
    a))                              ;=> undefined variable foo

(let-syntax ((foo (syntax-rules ()
                    ((foo ?x) (define ?x 'outer)))))
  (let ()
    (foo a)
    (define foo (lambda (x) 'inner))
    a))                              ;=> outer

The outcomes still differ, and the new pair still appears to illustrate a
subtle issue with internal variable definitions.  In any case, neither
pair of examples seems to me to illustrate a more subtle issue than the
other, and neither illustrates an inconsistency between internal syntax
definitions and internal variable definitions.

On the other hand, if your Issue text had included the first of each pair
above, this might have illustrated the inconsistency you were hoping to
illustrate.  Probably not, since that inconsistency is a direct consequence
of the fact that the macro foo in the first example is defined at
expansion time, obviously before the reference on the RHS of
(define a (foo)) is evaluated, while the variable foo in the second
example is not defined until after the attempted reference at run time.
This kind of inconsistency arises between let and let-syntax bindings.

I rewrote the examples several times in order to try to illustrate a
deeper issue with internal syntax definitions relating to shadowing, but
discovered each time that the same issues could just as easily be
illustrated with internal variable definitions.  In retrospect, this makes
sense because the scoping rules for all identifiers are the same.

In the end, I decided that there actually was a subtle and troublesome
issue:  the way the expansion algorithm treats essentially bogus examples
like the second of each pair above.  So that's the issue I decided to
address.  I could have used the two pairs of examples above, but I rewrote
them to make them more parallel.  I also rewrote them to avoid the
implication that the possibly shadowed binding was necessarily in the
immediate vicinity of the shadowing binding---where it is less likely to
cause confusion---by positing that the example code appears somewhere
within the scope of a keyword binding for foo.  (I think most people would
agree that shadowing problems are more subtle when they involve distant
bindings, such as bindings imported from a separate library.)

I've since addressed this "bogus keyword reference" issue, incidentally,
in the current draft of the SRFI and reference implementation.  The
algorithm now raises an exception in cases like the second of each pair of
examples above where a keyword referenced during the expansion of one
definition is redefined by the same or a subsequent definition in the same
body.  This is basically an expand-time version of the letrec*
restriction.
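To make the new behavior concrete, here is a sketch of the second example
from the pair above as the revised algorithm now treats it (the exact
exception raised is an assumption; the draft is described only as raising
an exception in such cases):

```scheme
;; Sketch of the "bogus keyword reference" case under the revised draft.
;; The outer foo is referenced while expanding (foo a), and a subsequent
;; definition in the same body then redefines foo, so the expander
;; detects the violation.  (Exact condition/message is an assumption.)
(let-syntax ((foo (syntax-rules ()
                    ((foo ?x) (define ?x 'outer)))))
  (let ()
    (foo a)                          ; uses the outer foo during expansion
    (define-syntax foo               ; redefines the keyword just used
      (syntax-rules ()
        ((foo ?x) (define ?x 'inner))))
    a))
;; => expand-time exception, rather than silently returning 'outer
```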

> The result is actually an interesting and worthwhile writeup,
> but it is not at all about what I had in mind.  Possibly I wasn't
> explicit enough, so let me try again (you'll forgive me for a bit of a
> rant, I hope):

I forgive you, but I hope you'll try not to rant in the future.

> I am worried about the complexity of macro expansion.  It is already
> complex and subtle in R5RS, and what we've decided on for R6RS already
> raises the bar significantly.  As I pointed out, adding internal
> `define-syntax' doesn't really introduce completely new issues, but
> new combinations of old issues with, say, shadowing.

I don't believe that it does.  The scoping (including shadowing) rules are
the same for variables as for keywords; in fact, they are the same for all
identifiers, and while internal syntax definitions can do strange things
to obscure scoping, so can other forms of keyword bindings.
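For instance, a plain let-syntax binding can shadow a variable exactly the
way an internal syntax definition can (this small illustration is mine,
not from the SRFI or Mike's text):

```scheme
;; let-syntax shadows the variable x with a keyword binding, just as an
;; internal define-syntax would; the scoping rules for the identifier x
;; are the same either way.
(let ((x 'variable))
  (let-syntax ((x (syntax-rules ()
                    ((x) 'keyword))))
    (x)))                            ; => keyword
```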

> (My original text explicitly referred to shadowing, but you deleted that.)

I may not have used the word "shadowing" but the word appears in the R5RS
paragraph I quoted, and discussions about an internal definition of foo
"within the scope of a keyword binding for foo" are certainly about the
concept of shadowing.

> This makes it much easier to write programs that expose those issues,
> and are sensitive to subtle aspects of the operational behavior of macro
> expansion.

I'm not convinced.  Your text and examples do not demonstrate that it is
"much easier" or even "easier" to expose those issues with internal syntax
definitions.

> Consequently, it is a legitimate question of whether the added
> complexity is worth the benefit.  I haven't seen convincing (to me,
> that is) examples of a significant benefit.  You claim that the added
> complexity is negligible, but I disagree: I'm getting the impression
> that many programs you deem simple I see as complex.

One measure of complexity is the number of lines of code required to
implement a construct.  In the reference implementation, the handling of
internal syntax definitions adds 10 lines of code out of a total of 1273
lines of code for the expander plus derived syntax definitions.  The
expander would probably be a couple of hundred lines shorter if I had not
been concerned about portability, but even so, the code for handling
internal syntax definitions is less than one percent of the code.  Yes, I
do claim that this is negligible.

In fact, eliminating internal define-syntax might actually make the final
expander longer and more complex.  We can probably use a common helper for
handling both library and lambda bodies.  Eliminating internal
define-syntax would make the common helper, and hence the expander, more
rather than less complex, since the helper would have to treat
define-syntax differently depending upon the context.

There is also more than one way to look at complexity.  C is less complex
to understand and implement because of its restrictions.  It is not less
complex to use.  For example, the statement/expression dichotomy and the
lack of a binding expression (let) prevent programmers from writing
expression macros that create their own temporary bindings.  This forces
the programmer to write more awkward, complex code.

A Scheme analogy I mentioned during our April 11 phone call is that
macro-generating macros must come in two flavors if syntax definitions can
appear only at the library level:  one that expands into syntax
definitions for the library level, and one that expands into let-syntax or
letrec-syntax forms in other contexts.  In other words, even if a language
without local syntax definitions were easier to understand and implement,
it would force the programmer to write more complex code.
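The two-flavors point can be made concrete with a hypothetical
macro-writing macro (the names define-alias and with-alias are mine,
invented for illustration).  Without internal syntax definitions, a
definition-context variant and an expression-context variant are both
needed:

```scheme
;; Variant 1: expands into a syntax definition, usable only where
;; define-syntax is allowed (e.g., the library level).  The (... ...)
;; escape produces a literal ellipsis in the generated template.
(define-syntax define-alias
  (syntax-rules ()
    ((define-alias new old)
     (define-syntax new
       (syntax-rules ()
         ((_ arg (... ...)) (old arg (... ...))))))))

;; Variant 2: expands into let-syntax, for expression contexts.
(define-syntax with-alias
  (syntax-rules ()
    ((with-alias new old body)
     (let-syntax ((new (syntax-rules ()
                         ((_ arg (... ...)) (old arg (... ...))))))
       body))))
```

With internal define-syntax, define-alias alone serves both purposes,
e.g. (let () (define-alias plus +) (plus 1 2)) ; => 3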

> Consider that you've been immersed in the field of macro expansion for
> something like 15 years, while others are still struggling, if the
> volume of email on c.l.s or the PLT mailing list on the subject is any
> indication.

The volume of email could just mean that a lot of people are using
internal define-syntax.  It could also mean that the change to the new
expansion algorithm was long overdue.  I've certainly seen a reduction in
questions about body expansion since I made the change in Chez Scheme.

> Now, you've been pounding on this sentence in R5RS:
> 
> > Programming languages should be designed not by piling feature on
> > top of feature, but by removing the weaknesses and restrictions that
> > make additional features appear necessary.
> 
> ... arguing that it effectively mandates allowing internal
> `define-syntax'.

You left out the most relevant part of the paragraph:

  with no restrictions on how they are composed

I don't know if I've "pounded" on it, but I do see this as a valid
argument in favor of including internal define-syntax unless we have a
compelling reason not to do so.

> I would like to put a different spin on it: I see
> the complexity of macro expansion as a weakness in the sense of that
> sentence.  (It certainly makes the restriction on internal
> `define-syntax' appear necessary to me.)  Yet, we've spent very little
> effort on reducing or removing that weakness.  (For example, I'm
> reasonably sure there are ways to address the top-level shadowing
> problem with keywords.)  I'm not primarily a macrologist, and
> certainly not the expert you, Matthew, or Will are, so I feel on very
> unsure footing here, and you can argue rings around me on anything I'm
> likely to come up with.

The macro system may be complex, but you're picking on the wrong part. 
Internal define-syntax isn't complex---it accounts for only a few lines of
text in the SRFI and adds only minimally to the size of the reference
implementation.  And, as I pointed out on April 11 and again above, the
language would be more complex to use without it.

> Your kind of argument puts us on a very slippery slope: "Just because
> the problem basically exists in R5RS already, it's OK to exacerbate it
> in R(5+x)RS."

I have never said such a thing, and it would not apply to this situation
anyway.  In my view, the only problem is an inadequate specification in
R5RS.  It can be fixed by making the specification adequate.  It can also
be fixed by eliminating both internal variable and internal syntax
definitions (curing the disease by killing the patient).  It cannot be
fixed by eliminating just internal syntax definitions.

Furthermore, while internal syntax definitions can make some programs
harder to understand (and should be avoided in such cases), they can make
other programs easier to write and understand.  The benefits of internal
syntax definitions and of consistency between lambda and library bodies
outweigh, IMO, the additional complexity, if any.

> My impression of the historical process is that, in
> Scheme, internal `define' was added before macros---it wasn't a
> problem then, just a bit of syntactic sugar.  Then macros were added,
> and, because nothing was done about internal `define', the screw was
> turned a bit more.  Now, the next bit for R6RS.

Your impression is wrong.  Many Scheme systems (including perhaps all of
those represented by the revised-report authors) had macro expanders at
the time internal define was adopted as an optional feature, and by the
time it was required by the IEEE standard, we were actively working on
syntax-rules and various low-level hygienic macro expanders---including
syntax-case.

> [Polemical aside: This is what seems to have happened to templates in
> C++.  The basic idea was there, and they just kept turning the screws
> in the directions they had been turned in the past.  Look what
> happened.]

Many smart people think templates are great, including colleagues who work
down the hall from me, although they acknowledge that the syntax forced on
them by C++'s restricted syntactic abstraction facilities is terrible.

> And while I'm on a roll here: It's not just the order of expansion that
> worries me.  It seems at least three different interpretations of
> hygiene are possible with `syntax-case', again with subtle differences
> that affect real programs.  It worries me that the description in the
> current document refers to a marking algorithm, but in no way really
> defines what its notion of hygiene is, or how the specification agrees
> with that notion.  (As, for example, the Clinger/Rees paper does.)  In
> fact, the current semantics of `syntax' seems inconsistent with the
> notion as I understand it.

Okay, let's do a better job nailing down what it means.  I believe that
the syntax SRFI is already more detailed and precise than some of the
other R6RS SRFIs, but it can certainly be improved.

> I don't think it'll be a catastrophe if we go ahead for R6RS, but I
> thought very long and very hard about this issue over a period of many
months, and I'm still unhappy about it.  So I insist on at least bringing
> the issue out in the open.

Fair enough.  If you're still not convinced, we have two options.  Either
we can work together on some text for an Issue that we both agree isn't
slanted one way or another, or you can open up your own thread on the SRFI
discussion when it comes out.

Kent


