[R6RS] Goals

Sun Jan 25 20:46:58 EST 2004

OK, here is my list of issues to address and requirements.  As you
will notice it is quite long.  I kept adding items over the last few
days, so I'm sure it is incomplete, but in the interest of getting
started I think it is best to submit the list as it currently stands.
Some issues are important; others are secondary.  For some items I
have included a brief motivation; the others I will motivate if and
when it becomes necessary.  I have more-or-less ordered the
list by importance, but some issues are hard to order as they are
often interdependent (for example, for binary I/O it is useful to have
byte vectors as specified by SRFI-4 and this requires that the lexical
syntax be changed so that #f must be followed by a delimiter, to allow
#f64(...)).  I do not know if we can tackle all these issues in the
design of R6RS, but let's see which issues are shared by others and
decide which are worth pursuing.

I remind you that we shouldn't start discussing these issues yet.  The
first goal is to put together a comprehensive list of issues to
address by merging and organizing the lists contributed by all the
editors.  Of course we can, and should, read through each of the lists
submitted to start forming an opinion.

Marc

 1) Modules and libraries.

    1.1) Add a module system.  Modules must be containers for variable
         bindings and syntax bindings.  For ease of debugging it must
         be possible to use a fully qualified name (such as
         module:identifier or mod1:mod2:identifier if modules can be
         nested) to name each exported binding.  This would also
         simplify implementation (i.e. it is still possible to use a
         single global environment).

    1.2) A small core language must be defined (containing the basic
         forms and also the module system, the macro system and the
         record definition form), and other functionality (strings,
         numbers, ports, etc) put in standard libraries.

    1.3) The specification of eval needs to be updated with respect
         to modules.  What does (eval '(module ...)) mean?  What does
         (eval '(f 1 2)) mean (i.e. which module does f come from)?
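
    One possible answer to the second question in 1.3 is to make the
    environment an explicit argument of eval.  A minimal sketch,
    assuming an R7RS-small implementation (the (scheme eval) library
    and the environment procedure are assumptions, not part of this
    list):

    ```scheme
    ;; Assumption: R7RS-small libraries are available.
    (import (scheme base) (scheme eval) (scheme write))

    ;; The environment names the library whose bindings are visible,
    ;; so the "*" in the evaluated form can only come from (scheme base).
    (define base-env (environment '(scheme base)))

    (define product (eval '(* 7 3) base-env))

    (display product)   ; prints 21
    (newline)
    ```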

 2) Non-hygienic macros.

    Replace syntax-rules with a macro system that can also express
    non-hygienic macros (syntax-case or something like it).  Moreover,
    the macro system must allow redefining the expansion of primitive
    expressions such as variable references, assignments, procedure
    calls, etc. (something like MzScheme's approach should be
    considered).
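
    To illustrate what syntax-rules cannot express, here is a sketch
    of an "anaphoric if" that deliberately captures the identifier
    "it".  It assumes a syntax-case system with the R6RS names
    (with-syntax, datum->syntax) and the (rnrs) library; the macro
    name aif is hypothetical.

    ```scheme
    ;; Assumption: an implementation providing the R6RS (rnrs) library.
    (import (rnrs))

    (define-syntax aif
      (lambda (stx)
        (syntax-case stx ()
          ((k test consequent alternative)
           ;; Create the identifier `it` in the lexical context of the
           ;; macro use (k), so user code can refer to the binding.
           (with-syntax ((it (datum->syntax #'k 'it)))
             #'(let ((it test))
                 (if it consequent alternative)))))))

    (display (aif (assv 2 '((1 . one) (2 . two)))
                  (cdr it)
                  'none))          ; prints two
    (newline)
    ```

    The binding of "it" is introduced by the macro yet visible at the
    use site, which is exactly the capture that hygiene prevents.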

 3) Records.

    3.1) Add a record definition special form.  New record types must
         be distinct from all other types.  It must be possible to
         define a record type that can later be used to define a
         record type that is a subtype of that type, by single
         inheritance (by default a record type definition would not
         allow extension).  Moreover, record definitions would be
         generative by default, but non-generative definitions must be
         possible (they are needed to allow two programs to exchange
         records).  Non-generative definitions would specify a
         "globally unique identifier" (GUID) that is used in testing
         membership to the type, and that is separate from the name of
         the type used in the source code.

    3.2) Add a pattern matching special form that allows destructuring
         records as well as the predefined types.  This form should allow
         expressions to be evaluated in the patterns.  This would allow
         "case"-like expressions that use symbolic constants (this is one
         of the big problems with "case").

    3.3) A standard external representation for (generative) records
         should be adopted, such as

         #record-type-name(field1-name: field1-value
                           field2-name: field2-value)

         Generative records can be output with "write", but can't be
         read back with "read".  Non-generative records could use a
         different syntax that can be written out and read back, such
         as:

         #record-type-name(#type(...);<-- this type descriptor includes a GUID
                           field1-name: field1-value
                           field2-name: field2-value)

    3.4) It would be great to standardize on the structure of the type
         descriptor, so that records can be exchanged between
         implementations of Scheme.  One issue is that implementations
         may extend the record definition facility in various ways
         (which would mean an extended type descriptor in that
         implementation).  This shouldn't be a problem if the type of
         these type descriptor records is a subtype of the standard
         type descriptor record.  Note that type descriptors are
         probably circular structures, so we would need to standardize
         on a syntax for this, such as #n# and #n=<datum>.

    3.5) Remove multiple values.  Their usefulness is questionable
         when record types are available, and the syntax to use them
         is clumsy (I find that for the code I write explicit CPS is
         more readable and flexible!).
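
    The comparison made in 3.5 can be sketched as follows, assuming
    an R7RS-small implementation.  The procedure names div-mod and
    div-mod-k are hypothetical.

    ```scheme
    (import (scheme base) (scheme write))

    ;; Multiple-values style: the producer calls `values`, and the
    ;; consumer has to be wrapped with call-with-values.
    (define (div-mod a b)
      (values (quotient a b) (remainder a b)))

    (define via-values
      (call-with-values
        (lambda () (div-mod 17 5))
        (lambda (q r) (list q r))))

    ;; Explicit CPS: the producer receives the consumer `k` as an
    ;; ordinary argument and calls it with both results.
    (define (div-mod-k a b k)
      (k (quotient a b) (remainder a b)))

    (define via-cps
      (div-mod-k 17 5 (lambda (q r) (list q r))))

    (display (list via-values via-cps))   ; prints ((3 2) (3 2))
    (newline)
    ```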

 4) cond-expand

    Add cond-expand as specified in SRFI-0.
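
    A minimal sketch of cond-expand, assuming an R7RS-small
    implementation.  The feature identifier full-unicode is taken
    from the R7RS feature list (SRFI-0 itself uses identifiers of the
    form srfi-N); the variable char-support is hypothetical.

    ```scheme
    (import (scheme base) (scheme write))

    ;; Exactly one clause is expanded; which one depends on the
    ;; features the implementation advertises.
    (cond-expand
      (full-unicode
       (define char-support 'full-unicode))
      (else
       (define char-support 'limited)))

    (display char-support)
    (newline)
    ```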

 5) Numbers.

    5.1) Add -0., +0., -inf., +inf. and +nan., and extend the numeric
         primitives to behave properly on them (in particular
         (eqv? -0. +0.) => #f even though (= -0. +0.) => #t, and
         (rational? +inf.) => #f even though (real? +inf.) => #t, and
         the same for +nan.).

    5.2) Add bitwise operations on exact integers (e.g. bitwise-and,
         arithmetic-shift, integer-length, etc.).
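
    The behaviour asked for in 5.1 can be checked with a sketch like
    this one.  It assumes an implementation with IEEE 754 flonums and
    uses the spellings +inf.0 and +nan.0 that R6RS and R7RS later
    adopted, rather than the +inf. and +nan. notation above.

    ```scheme
    (import (scheme base) (scheme write))

    (define checks
      (list (= -0.0 0.0)          ; #t: numerically equal
            (eqv? -0.0 0.0)       ; #f: distinguishable objects
            (real? +inf.0)        ; #t
            (rational? +inf.0)    ; #f
            (real? +nan.0)        ; #t
            (rational? +nan.0)))  ; #f

    (display checks)
    (newline)
    ```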

 6) Optional and keyword parameters.

    6.1) Add keyword objects.  Keywords are essentially like symbols
         but they self-evaluate.  They are written with a trailing colon.

    6.2) Add optional and keyword parameters, as specified by the
         DSSSL standard.  Keyword parameters are particularly useful
         when a procedure has many optional parameters.  The semantics
         should be fairly obvious from this example:

         > (define (f a b #!optional (c 10) (d (+ a c)) #!key (e #f) (f d))
             (list a b c d e f))
         > (f 1 2)              
         (1 2 10 11 #f 11)
         > (f 1 2 3)
         (1 2 3 4 #f 4)
         > (f 1 2 3 55 f: 99)
         (1 2 3 55 #f 99)

 7) Lexical syntax extensions.

    7.1) Add #\#d1000 syntax for characters (i.e. #\ can be followed
         immediately by an exact integer with an explicit prefix).

    7.2) Extend the named characters and escape characters in strings.
         The escape characters should cover those of C and Java, in
         particular "\n" is the newline character.

    7.3) Add the |...| syntax for symbols.  The symbol's name
         corresponds verbatim to the characters between the vertical
         bars except for escaped characters.  The same escape
         sequences as for strings are permitted except that
         doublequote does not need to be escaped and the vertical bar
         needs to be escaped (in other words the function of the
         doublequote and the vertical bar characters is interchanged
         with respect to the string syntax).

    7.4) Multiline comments: #| ... |# .

    7.5) Circular structures: #n=<datum> and #n# .

    7.6) Uninterned symbols: #:g0 .

    7.7) #!eof is the end-of-file object.

    7.8) Require that #f, #t and characters be followed by a delimiter.  This
         would eliminate problems with the syntax of homogeneous numeric
         vectors (SRFI-4), which use the syntax:  #f64(1.0 2.0 3.0)
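
    Items 7.3 and 7.5 can be tried out with a sketch like the
    following, assuming an R7RS-small implementation (which adopted
    both notations).  The circular datum is read from a string so
    that the example does not depend on how the implementation treats
    circular literals in source code.

    ```scheme
    (import (scheme base) (scheme read) (scheme write))

    ;; 7.3: the text between the bars is the symbol's name, verbatim.
    (define s '|hello world|)
    (display (symbol->string s))                      ; prints hello world
    (newline)
    (display (eq? s (string->symbol "hello world")))  ; prints #t
    (newline)

    ;; 7.5: #0= labels a datum and #0# refers back to it, so the
    ;; cdr of the second pair is the list itself.
    (define ring (read (open-input-string "#0=(a b . #0#)")))
    (display (eq? ring (cddr ring)))                  ; prints #t
    (newline)
    ```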

 8) Homogeneous numeric vectors.

    These are specified in SRFI-4.  They are useful, among other things,
    to do binary I/O and pass data to other languages using an FFI.
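
    A short sketch of the SRFI-4 procedures, assuming the
    implementation exposes them as the library (srfi 4); the library
    name is an assumption and varies between implementations.

    ```scheme
    ;; Assumption: SRFI-4 is importable as (srfi 4).
    (import (scheme base) (scheme write) (srfi 4))

    ;; A vector of 64-bit floats, stored without boxing each element.
    (define samples (f64vector 1.0 2.0 3.0))
    (f64vector-set! samples 0 10.0)

    ;; A vector of unsigned bytes, the natural buffer for binary I/O.
    (define bytes (u8vector 1 2 255))

    (display (f64vector-length samples))   ; prints 3
    (newline)
    (display (u8vector-ref bytes 2))       ; prints 255
    (newline)
    ```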

 9) String ports.

    Add open-input-string, open-output-string, get-output-string,
    with-input-from-string, with-output-to-string.
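
    A minimal sketch of the first three procedures, assuming an
    R7RS-small implementation (with-input-from-string and
    with-output-to-string are not covered here):

    ```scheme
    (import (scheme base) (scheme read) (scheme write))

    ;; Reading: the string behaves like the contents of a text file.
    (define in (open-input-string "(1 2 3) foo"))
    (define first-datum (read in))     ; the list (1 2 3)
    (define second-datum (read in))    ; the symbol foo

    ;; Writing: output accumulates until get-output-string is called.
    (define out (open-output-string))
    (write first-datum out)
    (write-string " then " out)
    (write second-datum out)

    (display (get-output-string out))  ; prints (1 2 3) then foo
    (newline)
    ```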

10) Unicode.

   10.1) Require that strings and characters, as well as source-code,
         use Unicode.  Clean up char-alphabetic?, char-upcase, etc. so
         that they respect the Unicode standard (or drop them
         altogether if there are complications such as introducing
         "locales", etc).  The string comparison procedures, such
         as string<?, need to be reconsidered because lexicographic
         string comparison is dependent on culture (in French for
         example "élément" < "elle", even though the first characters
         compare "é" > "e" in the latin-1 character encoding; actually
         the same thing happens in English for foreign words that have
         been adopted into the English language).

   10.2) For I/O, adopt a set of encodings/decodings from Scheme
         strings to/from bytes (e.g. latin-1, utf-8, ucs-2, ...).  The
         encoding should be attached to the port when it is opened, and
         it should be possible to change the encoding while the port
         is open.

   10.3) Make the syntax case-sensitive, i.e. (eq? 'INVERT 'invert) => #f.
         This is useful for interaction with other programming
         languages that are case-sensitive.  Moreover, the meaning of
         "case conversion" is dependent on culture (in Turkish for
         example, the lowercase variant of "I" is an "i" with no dot,
         which is a specific Unicode character different from "i").

11) Binary I/O.

    Distinguish between character ports (such as string ports and
    ports associated with text files) and byte ports (where the unit
    of I/O is the byte).  Note that some character ports, such as
    ports associated with text files, are also considered byte ports
    (there is an encoding/decoding of the bytes to/from characters and
    the encoding is attached to the port).  Procedures such as
    read-byte and write-byte should be considered for byte ports.
    Also needed are bulk transfer procedures that read/write a
    sequence of bytes into/from a buffer (of type u8vector).
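
    A sketch of byte-at-a-time and bulk transfers on byte ports.  It
    assumes an R7RS-small implementation, whose names differ from the
    ones suggested above: read-u8 and write-u8 stand in for read-byte
    and write-byte, and a bytevector stands in for the u8vector
    buffer.

    ```scheme
    (import (scheme base) (scheme write))

    ;; Writing: one byte, then a bulk write of two more.
    (define out (open-output-bytevector))
    (write-u8 72 out)
    (write-bytevector (bytevector 105 33) out)
    (define bytes (get-output-bytevector out))   ; the bytes 72 105 33

    ;; Reading: one byte, then a bulk read into a preallocated buffer.
    (define in (open-input-bytevector bytes))
    (define first-byte (read-u8 in))             ; 72
    (define buf (make-bytevector 2 0))
    (define count (read-bytevector! buf in))     ; number of bytes read

    ;; Decoding the bytes as UTF-8 gives the string "Hi!".
    (display (list first-byte count (utf8->string bytes)))  ; prints (72 2 Hi!)
    (newline)
    ```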

12) Continuations.

    Define a type for continuations that is separate from procedures,
    and operations to capture and invoke continuations of this type.
    For example: continuation-capture, continuation-call, and
    continuation-return.  The procedure call-with-current-continuation
    can be defined in terms of this type for backward compatibility.
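
    The relationship between the two interfaces can be sketched in
    the opposite direction, by wrapping an ordinary continuation in a
    record.  This is only an illustration under the assumption of an
    R7RS-small implementation, not how an implementation would
    provide the type natively; continuation-call is omitted and
    my-call/cc is a hypothetical name.

    ```scheme
    (import (scheme base) (scheme write))

    ;; A continuation type that is disjoint from procedures.
    (define-record-type continuation
      (make-continuation proc)
      continuation?
      (proc continuation-proc))

    (define (continuation-capture receiver)
      (call/cc (lambda (k) (receiver (make-continuation k)))))

    (define (continuation-return cont . vals)
      (apply (continuation-proc cont) vals))

    ;; call/cc recovered from the continuation type, as the text suggests.
    (define (my-call/cc receiver)
      (continuation-capture
        (lambda (cont)
          (receiver (lambda vals (apply continuation-return cont vals))))))

    (define saved #f)
    (define result
      (+ 1 (continuation-capture
             (lambda (cont)
               (set! saved cont)
               (continuation-return cont 41)))))

    (display (list result (continuation? saved) (procedure? saved)))
    ;; prints (42 #t #f)
    (newline)
    ```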

13) Dynamic environment.

    Define more precisely the meaning of dynamic environment (as used
    by current-input-port, etc).  Adopt an API to define new dynamic
    variables, access them and bind them.  Parameter objects are one
    approach.  I would suggest a type separate from procedures.
    Parameter objects could be defined in terms of this type.
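
    The parameter-object approach can be sketched as follows,
    assuming an R7RS-small implementation (make-parameter and
    parameterize, as in SRFI-39).  The parameter name indent is
    hypothetical.

    ```scheme
    (import (scheme base) (scheme write))

    ;; A dynamic variable with default value 0.
    (define indent (make-parameter 0))

    ;; Reads the current dynamic binding, whatever the caller set up.
    (define (current-indent) (indent))

    (define before (current-indent))                  ; 0
    (define inside (parameterize ((indent 4))
                     (current-indent)))               ; 4
    (define after (current-indent))                   ; 0 again

    (display (list before inside after))   ; prints (0 4 0)
    (newline)
    ```

    In this design a parameter is itself a procedure, which is the
    point where a separate type, as suggested above, would differ.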

14) dynamic-wind.

    Define the semantics of escaping from, and re-entering, a "before"
    or "after" thunk.

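    For reference, the well-defined case is an escape from the middle
    thunk, sketched below under the assumption of an R7RS-small
    implementation.  The open question concerns what should happen
    when the escape, or a later re-entry, occurs inside the "before"
    or "after" thunk itself; this sketch does not cover that case.

    ```scheme
    (import (scheme base) (scheme write))

    (define trace '())
    (define (note event) (set! trace (cons event trace)))

    (define result
      (call/cc
        (lambda (k)
          (dynamic-wind
            (lambda () (note 'before))
            (lambda ()
              (note 'during)
              (k 'escaped)          ; leaves the extent early
              (note 'unreached))
            (lambda () (note 'after))))))

    (display (list result (reverse trace)))
    ;; prints (escaped (before during after))
    (newline)
    ```
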
15) transcript-on / transcript-off

    Let's remove these procedures from the standard.

16) Exceptions.

    Add the with-exception-handler and raise procedures, and require that
    runtime errors raise exceptions.
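
    A minimal sketch of the two procedures, assuming the semantics
    later adopted by R7RS-small (SRFI-18 and SRFI-34 are similar).
    The handler escapes through a continuation, because returning
    from a handler invoked by raise is itself an error.

    ```scheme
    (import (scheme base) (scheme write))

    (define result
      (call/cc
        (lambda (k)
          (with-exception-handler
            ;; The handler receives the raised object.
            (lambda (condition) (k (list 'caught condition)))
            ;; The thunk never completes the addition.
            (lambda () (+ 1 (raise 'bad-input)))))))

    (display result)   ; prints (caught bad-input)
    (newline)
    ```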

17) Boxes.

    Add the box type.

18) Uninterned symbols.

    Add the procedures gensym, string->uninterned-symbol and
    uninterned-symbol?.

19) Semantics of internal definitions.

    The letrec* form should be added, and internal definitions should
    expand into a letrec*.  This would make internal definitions and
    toplevel definitions closer semantically.
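
    The intended equivalence can be sketched as follows, assuming an
    implementation that already provides letrec* (R7RS-small does).
    Under plain letrec the initializer of b would not be allowed to
    use the value of a.

    ```scheme
    (import (scheme base) (scheme write))

    (define (with-internal-defines)
      (define a 1)
      (define b (+ a 1))   ; uses a's value: evaluated left to right
      (list a b))

    ;; The same body written out as the proposed expansion.
    (define (with-letrec*)
      (letrec* ((a 1)
                (b (+ a 1)))
        (list a b)))

    (display (list (with-internal-defines) (with-letrec*)))
    ;; prints ((1 2) (1 2))
    (newline)
    ```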