[R6RS] The unspecified values, multiple-value semantics and all that

Sun Nov 5 13:21:06 EST 2006

Will put this entry on the Wiki a while ago:

> Lots of unhappiness about "the unspecified value", but no evident
> consensus within the community.
> 
> Will Clinger thinks there are four distinct issues here: 
> 
> - whether it is truly necessary to specify the value returned by the
>   procedures that, in the current draft, are required to return the
>   unspecified value;
> 
> - if there is going to be a specific unspecified value, what the
>   name of the procedure that returns it should be;
> 
> - its external representation, were it to have one (and Will thinks
>   it should if it exists);
>
> - the editorial question of how the R6RS should refer to the
>   creature, assuming it exists.
> 
> Mike has the following to add: 
> 
> Most of the rationale for having an unspecified value would go away if
> the procedures returning it currently would return zero return
> values. This has been discussed at length among the editors, but the
> issue should probably be explained to and raised with the community
> (along with the currently unspecified bit of the multiple-value spec),
> possibly using one or several polls.

It's my impression that we've taken this issue as far as we can on our
own, and (given the level of unhappiness Will is alluding to) that
it's not enough.  I would like to start a new thread on the discuss
list, polling the community on what their opinion is.  As the issue is
complicated, I'd appreciate your help drafting that post (if you agree
there should be one; otherwise, please say so).  I've drafted my take
here:

http://uighur.ccs.neu.edu:3456/r6rs-trac/wiki/TheUnspecifiedValue

Here's a copy.  Help welcome.

---
Various operations in Scheme exist only to perform side effects; they
do not have a natural return value.  Examples include `set!' forms,
calls to `vector-set!' and other data-structure mutators, as well as
calls to I/O output procedures.

Before R5RS, these operations had to return a value because every
operation did.  Previous versions of the report left this value
unspecified to encourage programmers to write readable programs that
do not rely on it.  Starting with R5RS, which introduced multiple
return values, another natural option for these operations is to
return zero values.

The 5.91 draft for the R6RS specifies that these procedures return a
new "unspecified value," a value with a fixed identity.  This
tightening of the specification with respect to R5RS came from a
desire to increase portability.  Also, the "unspecified value" is
occasionally useful as a placeholder in "uninitialized" fields of data
structures, and as a marker in program text that explicitly indicates
that no specific return value makes sense in a given context.

Arguably, this change potentially improves portability, but also
potentially degrades readability by inviting programmers to rely on
this specific value.

The design issues associated with the unspecified value are
interdependent with the semantics of multiple values.  In an
alternative language design where multiple values work as in the 5.91
draft, but the operations that currently return the unspecified value
return zero values instead, the following program idiom no longer
works:

(let* ((foo ...)
       (dummy (set! ...))
       (bar ...))
  ...)
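To make the idiom concrete, here is a minimal sketch.  The names
`demo-with-dummy' and `demo-no-dummy' are made up for illustration.
The first procedure relies on `vector-set!' returning exactly one
value; the second shows one way the same sequencing could be written
if it returned zero values and the continuation insisted on one.

```scheme
;; Hypothetical example: `dummy' is bound only so that the side effect
;; happens between the two other bindings.  This relies on
;; `vector-set!' delivering exactly one value to the binding.
(define (demo-with-dummy)
  (let* ((v (make-vector 2 0))
         (dummy (vector-set! v 0 'first))
         (w (vector-ref v 0)))
    w))

;; The same sequencing without binding the result of `vector-set!'.
;; Nothing here depends on how many values `vector-set!' returns.
(define (demo-no-dummy)
  (let ((v (make-vector 2 0)))
    (vector-set! v 0 'first)
    (let ((w (vector-ref v 0)))
      w)))
```

The rewrite is mechanical, but it costs an extra level of nesting for
every side effect that sits between two bindings.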

Actually, the 5.91 draft also admits, but does not mandate, a
semantics in which continuations created by procedure application
(and thus by `let' and `let*' forms) accept any number of return
values.  In such a semantics, zero return values passed to such a
continuation could be coerced to an unspecified value (or even *the*
unspecified value).  (For more than one return value, the values
beyond the first could be ignored, but this is less relevant to the
issue at hand.)
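The coercion described above can be approximated in user code with
`call-with-values'.  This is only a sketch of the behaviour, not a
proposal for how an implementation would do it; `placeholder' and
`first-value-or-placeholder' are hypothetical names, and the
placeholder object merely stands in for "an unspecified value".

```scheme
;; Hypothetical stand-in for "an unspecified value".
(define placeholder (list 'unspecified))

;; Call THUNK and deliver exactly one value to the caller:
;; - zero values are coerced to the placeholder,
;; - for more than one value, all but the first are ignored.
(define (first-value-or-placeholder thunk)
  (call-with-values thunk
    (lambda results              ; rest argument: accepts any number of values
      (if (null? results)
          placeholder
          (car results)))))

(first-value-or-placeholder (lambda () (values 1 2 3)))   ; => 1
(first-value-or-placeholder (lambda () 42))               ; => 42
```

With zero values, as in `(first-value-or-placeholder (lambda ()
(values)))', the result is the placeholder object.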

This is not the semantics for multiple return values currently
implemented by many Scheme systems, which require the number of return
values to match exactly the number of values accepted by the
continuation.  In such a system, any attempt to use the return value
of an operation returning zero values is an error, and is usually
signalled as such.
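As an illustration of the strict reading, the sketch below defines a
hypothetical mutator `store!' that returns zero values, standing in
for what `vector-set!' would do under the "zero return values" option.
A continuation that explicitly accepts zero values works everywhere;
the commented-out form is the one a strict system would reject.

```scheme
;; Hypothetical mutator that returns zero values.
(define (store! v i x)
  (vector-set! v i x)
  (values))

;; The consumer takes no arguments, so it explicitly accepts the zero
;; values produced by `store!'.
(define (store-then-read v)
  (call-with-values
    (lambda () (store! v 0 'x))
    (lambda () (vector-ref v 0))))

;; In a strict system the following is an error, because the
;; continuation that binds `result' accepts exactly one value:
;;
;;   (let ((result (store! (make-vector 1 0) 0 'x))) result)
```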

The "zero return values" option might impact the way teaching works,
where one might want to introduce multiple return values later than,
say, assignment.  It also might impact the way the implementation is
presented and structured.  A CPS-based intermediate representation
can deal with multiple values implicitly; not so in ANF-based
representations, for example.
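A rough way to see the CPS point, written as ordinary Scheme in
continuation-passing style rather than as any particular compiler's
intermediate representation (the procedure names are made up): a
"return" is just a call to the continuation `k', so the number of
return values is simply the number of arguments in that call.

```scheme
;; Zero return values: call the continuation with no arguments.
(define (store!-cps v i x k)
  (vector-set! v i x)
  (k))

;; Two return values: call the continuation with two arguments.
(define (div-mod-cps a b k)
  (k (quotient a b) (remainder a b)))

(div-mod-cps 7 2 (lambda (q r) (list q r)))   ; => (3 1)
```

In an ANF-style representation, by contrast, each intermediate result
is bound to a single name, so zero or several values need some
explicit extra construct.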

Here are some possible directions for these issues to take:

1. Leave things as they are.

2. Drop the unspecified value from the report, and specify the
   operations currently specified to return it to return *an*
   unspecified value.

3. Specify the operations that currently return the unspecified value
   to return zero return values, and

   3a) possibly drop the unspecified value from the report.

   3b) possibly mandate that continuations created by procedure
       applications accept zero return values, coercing them to the
       (or an) unspecified value
       3b*) possibly also mandate that they also accept more than one
            return value, ignoring all but the first

   3c) possibly mandate that continuations created by procedure
       applications accept only exactly one return value, and raise an
       exception if zero or more than one value is passed

4. Choose option 1. or option 2. in connection with option 3b, and
   possibly in connection with 3b*.

5. Choose option 1. or option 2. in connection with option 3c.

Other questions to consider are whether the unspecified value, if it
remains in the report, should have an external representation, and if
`unspecified' is an appropriate name for it.
---

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla