[R6RS] Unicode scalar value escape sequences

Marc Feeley
Tue Mar 1 12:07:37 EST 2005


> Marc> Well strings would still use the "..." notation, so instead of
> Marc> writing (string? #\a) you'd write (string? "a"), and instead
> Marc> of (char->integer #\a) you'd write (char->integer "a"), and instead
> Marc> of (string-ref "abc" 1) you'd write (substring "abc" 1 2), or
> Marc> you could keep the string-ref procedure and define it as  [...]
> 
> Well, sure.  But given that you have all kinds of procedures that
> operate on strings of length 1 only, I don't see how you're making the
> character data type go away in any real sense---you still effectively
> have a separate type.  It's just that the type is wedged into the
> language in a way that, to me, makes way less sense than the current
> setup.

How would you define (char-upcase (integer->char #x00df))?

What about (char-ci<? (integer->char #x00df) #\T)?

Why are there separate string and character types in Scheme, but no
digit type to go with the exact integer type?  This asymmetry
bothers me.  Why is there a need to distinguish #\a and "a" but not
to distinguish the digit 9 and the number 9?  The only good reason I
see is mutability, but as I said before I think strings should be
immutable.

> This thread started with me asking about making the syntax for scalar
> values in string literals agree with what you have in Gambit-C.  You
> never replied to that---any opinion?

I'm against it.  I think the syntax would be strange, overly complex
and redundant.

Here's another proposal.  Keep the \xhh and \uhhhh syntaxes for
compatibility with C and Java (i.e. exactly 2 and 4 hexadecimal digits
respectively) but require a ";" delimiter after \U and allow any number
of hexadecimal digits, i.e.

    "\U20;\U00000021;\Ua;"   =   " !\n"
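
To make the rule concrete, here is a rough sketch of a decoder for the
three escapes, written in plain Scheme.  It is only an illustration
under stated assumptions: the names decode-escapes and hex-value are
hypothetical, the input is the raw text between the double quotes
(backslashes still present), errors are signalled with an R7RS/SRFI 23
style (error message irritant ...), and other escapes such as \n are
deliberately left out.  It is not how Gambit-C or any other reader
actually implements this.

```scheme
;; Value of one hexadecimal digit character, or #f if c is not a hex digit.
(define (hex-value c)
  (cond ((and (char>=? c #\0) (char<=? c #\9)) (- (char->integer c) 48))
        ((and (char>=? c #\a) (char<=? c #\f)) (- (char->integer c) 87))
        ((and (char>=? c #\A) (char<=? c #\F)) (- (char->integer c) 55))
        (else #f)))

;; Decode \xhh (exactly 2 digits), \uhhhh (exactly 4 digits) and
;; \U<digits>; (one or more digits, terminated by a semicolon).
(define (decode-escapes s)
  (define n (string-length s))
  ;; Read hex digits from index i, stopping at index limit or at the
  ;; first non-hex character.  Returns (value . next-index).
  (define (read-hex i limit)
    (let loop ((i i) (v 0))
      (let ((d (and (< i limit) (hex-value (string-ref s i)))))
        (if d
            (loop (+ i 1) (+ (* v 16) d))
            (cons v i)))))
  ;; Exactly count hex digits starting at index i.
  (define (fixed i count)
    (let ((r (read-hex i (min n (+ i count)))))
      (if (= (cdr r) (+ i count))
          r
          (error "escape needs exactly this many hex digits:" count))))
  ;; One or more hex digits starting at index i, then a mandatory ";".
  (define (delimited i)
    (let ((r (read-hex i n)))
      (if (and (> (cdr r) i)
               (< (cdr r) n)
               (char=? (string-ref s (cdr r)) #\;))
          (cons (car r) (+ (cdr r) 1))
          (error "\\U escape needs hex digits followed by ;"))))
  (let loop ((i 0) (acc '()))
    (cond ((= i n) (list->string (reverse acc)))
          ;; A backslash followed by at least one more character.
          ((and (char=? (string-ref s i) #\\) (< (+ i 1) n))
           (let* ((c (string-ref s (+ i 1)))
                  (r (cond ((char=? c #\x) (fixed (+ i 2) 2))
                           ((char=? c #\u) (fixed (+ i 2) 4))
                           ((char=? c #\U) (delimited (+ i 2)))
                           (else (error "unsupported escape:" c)))))
             (loop (cdr r) (cons (integer->char (car r)) acc))))
          ;; Ordinary character (or a lone trailing backslash): copy it.
          (else (loop (+ i 1) (cons (string-ref s i) acc))))))

;; The backslashes are doubled here only because this is Scheme source:
;; (decode-escapes "\\U20;\\U00000021;\\Ua;")
;;   => the 3-character string: space, "!", newline
```

The point of the ";" is visible in the delimited helper: without a
terminator the decoder could not tell where a variable-length run of
digits ends when the next character of the string is itself a hex
digit.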

Marc

