[R6RS] R6RS Unicode SRFI controversial issues

Fri Jun 17 19:52:06 EDT 2005

On 17-Jun-05, at 7:00 PM, dyb at cs.indiana.edu wrote:

> Common Lisp allows newlines in characters, treats them as newlines,
> and doesn't ignore whitespace after \<newline>.  You may be thinking
> of Common Lisp format, which treats ~<newline> as you propose for
> \<newline> (with various options for suppressing or not suppressing
> the newline and whitespace that follows, of course).  I don't mind
> \<newline> having the behavior you suggest.

I don't feel strongly about it, but it seems like the right thing.  I
dislike being unable to break long strings in my code, and I dislike
having to violate my indentation rules to do so.  Note that if we
specify that whitespace is ignored after \<newline>, we probably have
to adopt the \<space> escape (which would expand to a single space
character) so that the continuation line can start with a space.  For
example,

     (display "hello\
              \   world")

would print "hello   world".  Note that this particular example could
also be written

     (display "hello   \
               world")

The reason for the \<space> escape is to avoid being forced to put all
the spaces at the end of the first line when there are many of them,
which might exceed the width of the page.
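
To make the two rules concrete, here is a minimal sketch of a decoder for
the body of a string literal.  It is written in Python only so that it can
be run; the function name read_string_body is made up for illustration,
and treating "whitespace" as spaces and tabs only is an assumption, since
the proposal does not spell that out.

```python
def read_string_body(src):
    """Decode the text found between the quotes of a string literal.

    Hypothetical reader rules, following the proposal in this message:
      \\<newline>  drops the newline and any spaces/tabs that follow it
      \\<space>    yields exactly one space
    Any other escaped character is kept as-is (a simplification).
    """
    out = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        if i + 1 >= len(src):
            raise ValueError("dangling backslash at end of string")
        nxt = src[i + 1]
        i += 2
        if nxt == "\n":
            # Line continuation: skip the indentation of the next line.
            while i < len(src) and src[i] in " \t":
                i += 1
        else:
            # Covers \<space> (one space) and, in this sketch, \\ and \".
            out.append(nxt)
    return "".join(out)


# The two spellings from the message decode to the same string.
first = read_string_body("hello\\\n              \\   world")
second = read_string_body("hello   \\\n               world")
print(repr(first), first == second)
```

In the first spelling the \<space> escape marks where the ignored
indentation stops, so the two literal spaces after it survive.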

>   I'd prefer that we not disallow newlines in
> strings, but don't feel strongly about it.
>

I'm OK with that.  However, unescaped newlines are not allowed in C
string literals, and the obligation to break strings across multiple
lines with \<newline> makes multiline strings somewhat easier for
humans to parse (i.e. you have a visual cue at the end of each line
that this text is in a multiline string).

> Why not allow \o<o><o><o> for octal character notation in both strings
> and characters?  We could still allow \<o><o><o> as well for strings.
>

Octal is dead.  I don't mind the \<o><o><o> notation in strings for
"C compatibility", but given that Scheme's character syntax is
completely different from C's, I don't feel the least bit compelled
to support octal in the character syntax.  I feel a decimal notation
(as a string escape and for characters) would be more useful, but I
see problems with that: it would have to be variable length and
probably include a delimiter, e.g. "4\d32;1" = "4 1".  I don't see
how to make this work elegantly for character literals, so I'm willing
to live without it.
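
As an illustration of why the decimal form needs a delimiter while the
octal form does not, here is a small sketch of both string escapes.  It
is in Python for the sake of being runnable; the name
expand_numeric_escapes and the exact \d<digits>; spelling are assumptions
based on the example in this message, not settled syntax.

```python
import re

# Hypothetical escape forms from the discussion:
#   \<o><o><o>    exactly three octal digits (C-style, fixed length)
#   \d<digits>;   decimal code point, variable length, closed by ';'
_ESCAPE = re.compile(r"\\(?:([0-7]{3})|d([0-9]+);)")


def expand_numeric_escapes(text):
    """Replace octal and decimal escapes in text with their characters."""
    def replace(match):
        octal, decimal = match.groups()
        code = int(octal, 8) if octal is not None else int(decimal, 10)
        return chr(code)
    return _ESCAPE.sub(replace, text)


# Code point 32 is the space character in both notations.
print(repr(expand_numeric_escapes(r"4\d32;1")))
print(repr(expand_numeric_escapes(r"4\0401")))
```

The octal form knows where it ends because it is always three digits
long.  The decimal form is variable length, so without the ';' a reader
could not tell whether the trailing 1 in "4\d32;1" belongs to the code
point or to the text.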

> I would like for all characters to be followed by a delimiter, so
> that, for example, #\s3 is an error.  Requiring all characters to be
> delimited also helps leave open future extensions.
>

Fine by me.  Do you also mean that #\(foo and #\ foo are errors?   
That too is fine by me.
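
A minimal sketch of the delimiter rule, assuming the answer to the
question above is "yes".  It is in Python so that it can be run; the
function read_char_literal, the delimiter set and the two named
characters are illustrative assumptions, not an agreed specification.

```python
# Assumed delimiter set: whitespace, parentheses, double quote, semicolon.
DELIMITERS = set(" \t\n()\";")
# Assumed named characters, kept deliberately short.
NAMED = {"space": " ", "newline": "\n"}


def read_char_literal(src, pos=0):
    """Return (character, next_pos) for the literal starting at src[pos]."""
    if not src.startswith("#\\", pos) or pos + 2 >= len(src):
        raise ValueError("not a character literal")
    start = pos + 2
    end = start + 1  # the first character always belongs to the literal
    while end < len(src) and src[end] not in DELIMITERS:
        end += 1
    token = src[start:end]
    if len(token) == 1:
        return token, end
    if token in NAMED:
        return NAMED[token], end
    raise ValueError(f"#\\{token} is not followed by a delimiter")


print(read_char_literal("#\\a)"))
for bad in ["#\\s3", "#\\(foo", "#\\ foo"]:
    try:
        read_char_literal(bad)
    except ValueError as err:
        print("error:", err)
```

Under this reading #\a followed by a closing parenthesis is accepted,
while #\s3, #\(foo and #\ foo are all rejected because something other
than a delimiter follows the character.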

Marc