Sun Jan 25 20:46:58 EST 2004
OK, here is my list of issues to address and requirements. As you
will notice it is quite long. I kept adding items over the last few
days, so I'm sure it is incomplete, but in the interest of getting
started I think it is best to submit the list as it currently stands.
Some issues are important, others are secondary. For some items I
have included a brief motivation, others I will have to motivate if
needed and when it is appropriate. I have more-or-less ordered the
list by importance, but some issues are hard to order as they are
often interdependent (for example, for binary I/O it is useful to have
byte vectors as specified by SRFI-4 and this requires that the lexical
syntax be changed so that #f must be followed by a delimiter, to allow
#f64(...)). I do not know if we can tackle all these issues in the
design of R6RS, but let's see which issues are shared by others and
decide which are worth pursuing.
I remind you that we shouldn't start discussing these issues yet. The
first goal is to put together a comprehensive list of issues to
address by merging and organizing the lists contributed by all the
editors. Of course we can, and should, read through each of the lists
submitted to start forming an opinion.
1) Modules and libraries.
1.1) Add a module system. Modules must be containers for variable
bindings and syntax bindings. For ease of debugging it must
be possible to use a fully qualified name (such as
module:identifier or mod1:mod2:identifier if modules can be
nested) to name each exported binding. This would also
simplify implementation (i.e. it is still possible to use a
single global environment).
1.2) A small core language must be defined (containing the basic
forms and also the module system, the macro system and the
record definition form), and other functionality (strings,
numbers, ports, etc) put in standard libraries.
1.3) The specification of eval needs to be updated with respect
to modules. What does (eval '(module ...)) mean? What does
(eval '(f 1 2)) mean (i.e. which module does f come from)?
2) Non-hygienic macros.
Replace syntax-rules by a non-hygienic macro system (syntax-case
or something like it). Moreover, the macro system must allow
redefining the expansion of primitive expressions such as variable
references, assignments, procedure call, etc (something like
MzScheme's approach should be considered).
3) Records.
3.1) Add a record definition special form. New record types must
be distinct from all other types. It must be possible to
define a record type that can later be used to define a
record type that is a subtype of that type, by single
inheritance (by default a record type definition would not
allow extension). Moreover, record definitions would be
generative by default, but non-generative definitions must be
possible (they are needed to allow two programs to exchange
records). Non-generative definitions would specify a
"globally unique identifier" (GUID) that is used in testing
membership to the type, and that is separate from the name of
the type used in the source code.
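As a rough illustration of what "distinct from all other types" means in practice, here is a minimal sketch that assumes SRFI-9's define-record-type (an existing proposal, not necessarily the form asked for here). The names point, make-point, point-x and so on are hypothetical, and SRFI-9 covers neither inheritance nor non-generative definitions.

```scheme
;; Assumption: SRFI-9 define-record-type is available.
(define-record-type point
  (make-point x y)          ; constructor
  point?                    ; type predicate
  (x point-x)               ; read-only field
  (y point-y set-point-y!)) ; mutable field

(define p (make-point 1 2))

(point? p)     ; => #t
(vector? p)    ; => #f, a record is not a vector
(point-x p)    ; => 1
(set-point-y! p 5)
(point-y p)    ; => 5
```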
3.2) Add a pattern matching special form that allows destructuring
records as well as the predefined types. This form should allow
expressions to be evaluated in the patterns. This would allow
"case"-like expressions that use symbolic constants (this is one
of the big problems with "case").
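The problem with "case" and symbolic constants can be seen in a small sketch (the names red and color-name are made up for the example): the clause data are literal, so a variable name in a clause is never evaluated.

```scheme
;; A symbolic constant we would like to dispatch on.
(define red 0)

(define (color-name c)
  (case c
    ((red) "red")        ; compares c with the SYMBOL red, not the value 0
    (else  "unknown")))

(color-name 0)     ; => "unknown"
(color-name 'red)  ; => "red"
```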
3.3) A standard external representation for (generative) records
should be adopted, such as
Generative records can be output with "write", but can't be
read back with "read". Non-generative records could use a
different syntax that can be written out and read back, such as
#record-type-name(#type(...)  ;<-- this type descriptor includes a GUID
3.4) It would be great to standardize on the structure of the type
descriptor, so that records can be exchanged between
implementations of Scheme. One issue is that implementations
may extend the record definition facility in various ways
(which would mean an extended type descriptor in that
implementation). This shouldn't be a problem if the type of
these type descriptor records is a subtype of the standard
type descriptor record. Note that type descriptors are
probably circular structures, so we would need to standardize
on a syntax for this, such as #n# and #n=<datum>.
3.5) Remove multiple values. Their usefulness is questionable
when record types are available, and the syntax to use them
is clumsy (I find that for the code I write explicit CPS is
more readable and flexible!).
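For comparison, a small sketch of the two styles mentioned in 3.5 (the procedures div-mod and div-mod-k are hypothetical names used only for this example):

```scheme
;; Multiple values: the consumer is attached with call-with-values.
(define (div-mod n d)
  (values (quotient n d) (remainder n d)))

(call-with-values
  (lambda () (div-mod 17 5))
  (lambda (q r) (list q r)))              ; => (3 2)

;; Explicit CPS: the consumer k is passed as an ordinary argument.
(define (div-mod-k n d k)
  (k (quotient n d) (remainder n d)))

(div-mod-k 17 5 (lambda (q r) (list q r))) ; => (3 2)
```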
4) cond-expand.
Add cond-expand as specified in SRFI-0.
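A minimal sketch of how cond-expand might be used, assuming an implementation that supports SRFI-0 and may or may not advertise the srfi-4 feature (make-buffer is a hypothetical name):

```scheme
;; Pick a buffer representation depending on what the implementation offers.
(cond-expand
  (srfi-4
    ;; Homogeneous byte vectors are available.
    (define (make-buffer n) (make-u8vector n 0)))
  (else
    ;; Fall back to an ordinary vector.
    (define (make-buffer n) (make-vector n 0))))
```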
5) Numbers.
5.1) Add -0., +0., -inf., +inf. and +nan., and extend the numeric
primitives to behave properly on them (in particular
(eqv? -0. +0.) => #f even though (= -0. +0.) => #t, and
(rational? +inf.) => #f even though (real? +inf.) => #t, and
the same for +nan.).
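The behaviour described in 5.1 can be checked with a few expressions. This sketch assumes IEEE floating point and writes the special values in the +inf.0 / +nan.0 notation used by later Scheme reports rather than the +inf. / +nan. spelling above.

```scheme
(= -0.0 0.0)        ; => #t, numerically equal
(eqv? -0.0 0.0)     ; => #f, but distinguishable objects

(real? +inf.0)      ; => #t
(rational? +inf.0)  ; => #f

(real? +nan.0)      ; => #t
(rational? +nan.0)  ; => #f
```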
5.2) Add bitwise operations on exact integers (i.e. bitwise-and,
arithmetic-shift, integer-length, etc).
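For the operations named in 5.2, a short sketch assuming an implementation that provides them under these names (for example through SRFI-60):

```scheme
(bitwise-and 12 10)       ; => 8    (#b1100 and #b1010 give #b1000)
(arithmetic-shift 5 2)    ; => 20   (shift left by 2)
(arithmetic-shift 20 -2)  ; => 5    (negative count shifts right)
(integer-length 255)      ; => 8    (bits needed to represent 255)
```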
6) Optional and keyword parameters.
6.1) Add keyword objects. Keywords are essentially like symbols
but they self-evaluate. They are written with a trailing colon.
6.2) Add optional and keyword parameters, as specified by the
DSSSL standard. Keyword parameters are particularly useful
when a procedure has many optional parameters. The semantics
should be fairly obvious from this example:
> (define (f a b #!optional (c 10) (d (+ a c)) #!key (e #f) (f d))
(list a b c d e f))
> (f 1 2)
(1 2 10 11 #f 11)
> (f 1 2 3)
(1 2 3 4 #f 4)
> (f 1 2 3 55 f: 99)
(1 2 3 55 #f 99)
7) Lexical syntax extensions.
7.1) Add #\#d1000 syntax for characters (i.e. #\ can be followed
immediately by an exact integer with an explicit prefix).
7.2) Extend the named characters and escape characters in strings.
The escape characters should cover those of C and Java, in
particular "\n" is the newline character.
7.3) Add the |...| syntax for symbols. The symbol's name
corresponds verbatim to the characters between the vertical
bars except for escaped characters. The same escape
sequences as for strings are permitted except that
doublequote does not need to be escaped and the vertical bar
needs to be escaped (in other words the function of the
doublequote and the vertical bar characters is interchanged
with respect to the string syntax).
7.4) Multiline comments: #| ... |# .
7.5) Circular structures: #n=<datum> and #n# .
7.6) Uninterned symbols: #:g0 .
7.7) #!eof is the end-of-file object.
7.8) Require that #f, #t and characters be followed by a delimiter. This
would eliminate problems with the syntax of homogeneous numeric
vectors (SRFI-4), which use the syntax: #f64(1.0 2.0 3.0)
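A few of these lexical extensions in use, as a sketch that assumes a reader accepting the #| ... |# and |...| syntaxes with the escaping rules described in 7.3:

```scheme
#| A multiline comment:
   everything up to the closing marker is ignored. |#

(symbol->string '|hello world|)           ; => "hello world"
(eq? '|abc| 'abc)                         ; => #t, same symbol either way
(string-length (symbol->string '|a\|b|))  ; => 3, the bar is escaped
```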
8) Homogeneous numeric vectors.
These are specified in SRFI-4. They are useful, among other things,
to do binary I/O and pass data to other languages using an FFI.
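A minimal sketch of the SRFI-4 procedures, assuming an implementation that supports that SRFI (the variable names are only illustrative):

```scheme
;; A vector of 64-bit floats; the literal form would be #f64(1.0 2.0 3.0).
(define v (f64vector 1.0 2.0 3.0))
(f64vector-ref v 1)       ; => 2.0

;; A byte buffer such as one might hand to an I/O routine or an FFI.
(define buf (make-u8vector 4 0))
(u8vector-set! buf 0 255)
(u8vector-ref buf 0)      ; => 255
(u8vector-length buf)     ; => 4
```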
9) String ports.
Add open-input-string, open-output-string, get-output-string,
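The three procedures named above behave as in SRFI-6; a short sketch:

```scheme
;; Reading data from a string.
(define in (open-input-string "(a b) 42"))
(read in)                 ; => (a b)
(read in)                 ; => 42

;; Accumulating output in a string.
(define out (open-output-string))
(write 'hello out)
(display " " out)
(write "world" out)       ; write keeps the double quotes
(get-output-string out)   ; => "hello \"world\""
```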
10) Unicode.
10.1) Require that strings and characters, as well as source-code,
use Unicode. Clean up char-alphabetic?, char-upcase, etc so
that they respect the Unicode standard (or drop them
altogether if there are complications such as introducing
"locales", etc). The string comparison procedures, such
as string<?, need to be reconsidered because lexicographic
string comparison is dependent on culture (in French for
example "élément" < "elle", even though the first characters
compare "é" > "e" in the latin-1 character encoding; actually
the same thing happens in English for foreign words that have
been adopted into the English language).
10.2) For I/O, adopt a set of encodings/decodings from Scheme
strings to/from bytes (i.e. latin-1, utf-8, ucs-2, ...). The
encoding should be attached to the port when it is opened, and
it should be possible to change the encoding while the port is open.
10.3) Make the syntax case-sensitive, i.e. (eq? 'INVERT 'invert) => #f.
This is useful for interaction with other programming
languages that are case-sensitive. Moreover, the meaning of
"case conversion" is dependent on culture (in Turkish for
example, the lowercase variant of "I" is an "i" with no dot,
which is a specific Unicode character different from "i").
11) Binary I/O.
Distinguish between character ports (such as string ports and
ports associated with text files) and byte ports (where the unit
of I/O is the byte). Note that some character ports, such as
ports associated with text files, are also considered byte ports
(there is an encoding/decoding of the bytes to/from characters and
the encoding is attached to the port). Procedures such as
read-byte and write-byte should be considered for byte ports.
Also needed are bulk transfer procedures that read/write a
sequence of bytes into/from a buffer (of type u8vector).
12) Continuations.
Define a type for continuations that is separate from procedures,
and operations to capture and invoke continuations of this type.
For example: continuation-capture, continuation-call, and
continuation-return. The procedure call-with-current-continuation
can be defined in terms of this type for backward compatibility.
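A rough sketch of that backward-compatible definition. It assumes continuation-capture passes a first-class continuation object to its argument and that continuation-return resumes that continuation with a value (this is how Gambit-C behaves; other designs are possible). It handles a single return value only, and my-call/cc is a hypothetical name.

```scheme
;; Assumption: continuation-capture and continuation-return exist with
;; the behaviour described above; they are not standard procedures.
(define (my-call/cc receiver)
  (continuation-capture
    (lambda (k)                        ; k is a continuation object,
      (receiver                        ; not a procedure
        (lambda (v)                    ; wrap it so it can be called
          (continuation-return k v))))))

;; Used like call-with-current-continuation:
;; (+ 1 (my-call/cc (lambda (return) (return 41))))
```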
13) Dynamic environment.
Define more precisely the meaning of dynamic environment (as used
by current-input-port, etc). Adopt an API to define new dynamic
variables, access them and bind them. Parameter objects are one
approach. I would suggest a type separate from procedures.
Parameter objects could be defined in terms of this type.
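For reference, the parameter-object approach looks like this in SRFI-39 style (the names radix and show are made up for the example):

```scheme
;; A dynamic variable with default value 10.
(define radix (make-parameter 10))

(define (show n)
  (number->string n (radix)))   ; reads the current dynamic binding

(show 5)                        ; => "5"
(parameterize ((radix 2))       ; rebinds radix for the dynamic extent
  (show 5))                     ; => "101"
(show 5)                        ; => "5", the old binding is restored
```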
14) dynamic-wind.
Define the semantics of escaping and returning to a "before" and
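The simple, uncontroversial case of dynamic-wind can be sketched as follows (trace and note are hypothetical helpers that record the order of events):

```scheme
(define trace '())
(define (note x) (set! trace (cons x trace)))

(call-with-current-continuation
  (lambda (k)
    (dynamic-wind
      (lambda () (note 'before))
      (lambda ()
        (note 'during)
        (k 'escaped)              ; escape out of the body thunk
        (note 'unreached))
      (lambda () (note 'after))))) ; still runs on the way out

(reverse trace)                   ; => (before during after)
```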
15) transcript-on / transcript-off
Let's remove these procedures from the standard.
16) Exceptions.
Add the with-exception-handler and raise procedures, and require that
runtime errors raise exceptions.
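A minimal sketch of the two procedures, assuming the argument order of SRFI-34 (handler first, then a thunk) and a handler that escapes through a continuation:

```scheme
(call-with-current-continuation
  (lambda (k)
    (with-exception-handler
      (lambda (e)                    ; handler receives the raised object
        (k (list 'caught e)))        ; and escapes to the outer context
      (lambda ()
        (+ 1 (raise 'oops))))))      ; the addition never completes
; => (caught oops)
```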
17) Boxes.
Add the box type.
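A box is a mutable cell holding one value. The sketch below assumes the procedure names box, unbox and set-box!, which several implementations already use; the names are an assumption, since none are given above.

```scheme
(define b (box 1))             ; create a box containing 1
(unbox b)                      ; => 1
(set-box! b (+ (unbox b) 1))   ; replace the contents
(unbox b)                      ; => 2
```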
18) Uninterned symbols.
Add the procedures gensym, string->uninterned-symbol and
19) Semantics of internal definitions.
The letrec* form should be added, and internal definitions should
expand into a letrec*. This would make internal definitions and
toplevel definitions closer semantically.
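A small sketch of the intended expansion (f and g are hypothetical names): with letrec* the initializers are evaluated left to right, so a later definition may use the value of an earlier one, just as at toplevel.

```scheme
;; Internal definitions...
(define (f)
  (define a 1)
  (define b (+ a 1))      ; uses the value of a
  (list a b))

;; ...would expand into a letrec* body:
(define (g)
  (letrec* ((a 1)
            (b (+ a 1)))
    (list a b)))

(f)   ; => (1 2)
(g)   ; => (1 2)
```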