Formal comment #236 (defect)

the formal semantics should be a non-normative appendix
Reported by: 	William D Clinger
Version: 	5.92

Full description of issue:

   1. The summary says that "Appendix 10 provides a formal semantics
      for a core of Scheme", but there is no appendix 10. There is
      only a chapter 10, giving the impression that chapter 10 is
      supposed to be normative.
   2. Chapter 10 describes a rewriting system, not a formal semantics.
   3. That rewriting system is incomplete and also contains several
      errors. If the rewriting system is taken as normative, then it
      must be completed, its errors repaired, and its semantic import
      clarified before the R6RS can be ratified.
   4. The rewriting system is unnecessarily complicated, which reduces
      its usefulness.

Suggestions: Move the rewriting system into an appendix that is
clearly identified as non-normative. Convert the rewriting system into
a semantics by using it to describe the observable meaning of a
program. Correct the errors and describe a semantics. Simplify.

I will discuss these four issues, and then proceed to a tediously
detailed list of the errors I happened to spot while reading chapter
10.

Normative vs non-normative

I would love to have a normative formal semantics for Scheme, but the
rewriting system in chapter 10 contains too many errors and is too
complicated for it or any quick revision of it to be considered
normative in the R6RS. It should be noted that the formal semantics of
previous reports was never considered normative, and IEEE Std
1178-1990 was explicit about the semantics being non-normative (and
not a part of the standard).

If the formal semantics were considered normative, then any conflict
between it and the rest of the document would render the language
unimplementable in theory, and would lead in practice to unresolvable
disputes about the semantics. If the formal semantics is
non-normative, then any conflicts between it and the informal
semantics will be resolved in favor of the informal semantics.  Not a
semantics

A formal operational semantics is not just a rewriting system; a
semantics must also describe a relationship between the rewriting
system and the meanings of programs. This is usually done by using the
rewriting system to define the observable effects and/or outcomes of
programs.

Chapter 10 does not define the observable effects and/or outcomes of
programs. Chapter 10 defines a set of possible execution traces for
each program, but the meaning of a Scheme program should be more
abstract than a set of execution traces. Chapter 10 should explain
that implementations of Scheme are required only to deliver the same
observable effects and/or results as one of the execution sequences,
and that programmers are responsible for ensuring that all of the
legal execution sequences deliver observable effects and/or results
that satisfy the requirements of the program.  Incompleteness and
errors

Chapter 10 makes no attempt to explain the semantics of macro
expansion or libraries. That's fine if chapter 10 is non-normative,
but a normative semantics would have to be connected to the output of
macro expansion. That would be difficult for R6RS Scheme, since the
output of macro expansion is explicitly implementation-dependent.

Chapter 10 omits the char, vector, and bytevector types of (r6rs
base), and omits almost all of the additional types that are described
in the standard libraries.

Chapter 10 requires several premature arity checks, sanctions
violations of the letrec* requirement, and both over-specifies and
under-specifies the semantics of eqv?. There are several other errors
of lesser importance, which I will list below.

Complicated

Chapter 10 is overly complicated in two distinct ways: (1) it attempts
to specify the semantics of constructs that could be (and probably
would be) eliminated during macro expansion and (2) it performs
several optimizing transformations that implementations might perform,
but are out of place in a formal semantics.

Both of these complexities have been defended on the grounds that they
make it easier for a human to follow individual execution traces, but
that misses the point of a formal semantics. If someone wants to
understand the execution traces of some particular program, they will
read the informal prose, or run the program in several
implementations, or employ a user interface that presents the
execution traces generated by the rewriting system in a form that is
easier for humans to understand. They are unlikely to look at the
actual execution traces produced by a rewriting system.

The purpose of a formal semantics is to prove theorems, not to provide
a tutorial. The unnecessary complexity of the rewriting system in
chapter 10 makes it harder to prove theorems.

The formal semantics should be simplified by:

    * dropping beginF forms (which will be eliminated by macro
      expansion)

    * dropping one-armed if (which can be converted into two-armed if
      by macro expansion)

    * requiring the body of a lambda expression to be a single
      expression (which the macro expander can easily arrange)

    * dropping the last condition of [6appN!] and dropping [6appN]
      altogether

    * dropping [6app0] 

    * dropping figure 10.13 altogether 

    * dropping [6beginF] 

The first sentence of section 10.9, "Library Top Level", says the
semantics is limited to modelling the bodies of libraries that don't
export anything and import only the primitives described by the
semantics. In other words, the rewriting system does not attempt to
shed light on the problematic issues of library semantics: macro
expansion, phases, importing the same name from different libraries,
and so on. The rewriting system does describe a semantics for
redefinition, but that semantics is (at least partly) at odds with the
prose in chapter 6, and appears to be a mere artifact of the
formalization. It would be better to drop section 10.9 altogether.
Detailed list of errors

Some of these have already been reported, but it is easier for me to
report them all. They are listed in order of occurrence, not in order
of importance. I'm sure there are other errors I failed to notice.

page 61, paragraph 1: there's a space between the c and o of "c
ontext-".

page 61, paragraph 2, last sentence: E[ep] should be E[e'].

page 61, paragraph 3, second display: missing curly braces.

page 62, section 10.1, paragraph 6, first sentence: "program" should be plural.

page 62, bullets: the bullets omit unspecified?, raise*, and
with-exception-handler.

page 63, figure 10.2, productions for PG and G: these have no
restriction on the number of values, which may be correct but looked
suspect to me.

page 63, figure 10.3: there are no productions for (dot sym) and (dot
sqv). If the formal semantics is supposed to use the convention that
those are equivalent to sym and sqv, it needs to say so.

page 63, last paragraph, first sentence: "are" should be "is", since
it refers to "a set".

page 64, last paragraph before 10.2, last sentence: "are" should be
"is".

page 64, section 10.4, first sentence: "workhorse" should be
"workhorses".

page 64, section 10.4, paragraph 1, sentence 3: "is relevant" should
be "are relevant".

boundary between pages 64 and 65: "P and E" should be "E and P".

page 65, first full paragraph, sentence 1: [6xuhee] should be
[6xunee].

page 66, second full paragraph, sentence 3: "installer" should be
"installed".

page 67, section 10.6, paragraph 2, sentence 2: it is misleading to
call pp an identifier, since the correctness of the rewriting system
depends upon pair pointers being distinct from identifiers.

page 68, figure 10.9: [6cons] and [6pair?t] refer to pp as an
individual, but the last three rules refer to pp as a set. The last
three rules are correct; [6cons] and [6pair?t] should refer to a
subscripted pp_i.

page 68, figure 10.10: [6eqt] overspecifies; for example, it says that
(eqv? + +) is true. [6eqf] underspecifies; for example, it does not
require (eqv? 3 +) to be false.

page 69, rule [6appN!]: refers to a mysterious bp, which must be fresh
but is otherwise unexplained. This may be a typo, or bp may have been
omitted from figure 10.1.

page 69, rule [6unarity]: why is this an arity mismatch instead of a
call to a non-procedure?

page 69, right column, line 4: "picks on" should be "picks one".

pages 69 and 70, paragraph that spans those pages: The two rules that
have a Greek mu in their names do not appear in any of the
figures. Presumably they should refer to [6app] and [6arity].

page 71, section 10.7, last paragraph: the first sentence of that
paragraph is not written in English.

page 72, figure 10.14, rules [6wind] and [6dwerr]: both rules mandate
premature arity checks. This is consistent with the report's prose,
but the report's prose is apparently in error also and should be fixed
in the next draft.

page 72, figure 10.14, typography: I think the script letters at the
extreme left of the last 9 lines are intended to be T, S, and R, but
it's hard to tell because they are in a different script from some
(but not all) of the T, S, and R that appear on the right hand sides
of those equations.

page 72, figure 10.14, specification of what I think is supposed to be
T: the fourth of the five lines for T overlaps the case covered by the
fifth line, creating an unintended ambiguity in the rewriting
system. This is an error, since dropping the fourth line would not
give the intended semantics. A more complete case dispatch is required
here, alas.

page 72, figure 10.14, specification of what I think is supposed to be
S, first equation: E_1 should be F_1.

page 72, first full paragraph, last sentence: what is the point of
guaranteeing that there is always some result to a program, since the
rewriting system (as it stands) does not relate said result to any
observable meaning of the program?

page 72, section 10.9, last paragraph: there is no rule named [6setf].

page 73, figure 10.15, rule [6setd]: this rule violates the letrec*
restriction.

page 73, figure 10.15, rules [6refu] and [6setu]: what is the point
(or meaning) or "format" here?

Will


RESPONSE:

The next draft will have the rewriting system in an appendix that is
clearly identified as non-normative.

The remaining suggestions will be forwarded to the authors of the formal
semantics appendix.