[R6RS] Unicode scalar value escape sequences

Marc Feeley
Thu Mar 3 23:33:08 EST 2005


> Marc> How would you define (char-upcase (integer->char #x00df))?
> Marc> What about (char-ci<? (integer->char #x00df) #\T)?
> 
> We've been through this a zillion times: via the standard Unicode case
> mapping.

But that does not conform to the Unicode specification.  My point is
that this problem would not exist if there were no character type (it
would also go away if char-upcase were removed).
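
To illustrate the mismatch, here is a minimal sketch in Python, used only
because its str.upper() and str.casefold() apply Unicode's full case
mappings.  It is not Scheme and not part of any proposal in this thread;
the variable names are made up for the example.

```python
# U+00DF LATIN SMALL LETTER SHARP S, the character from the quoted examples.
sharp_s = chr(0x00DF)

# Full Unicode uppercasing of U+00DF yields two code points, so no single
# character can be returned as "the" uppercase of this character.
upper = sharp_s.upper()
print(repr(upper), len(upper))

# Case folding, the basis for caseless comparison, also expands it.
folded = sharp_s.casefold()
print(repr(folded), len(folded))
```

A procedure that maps one character to one character cannot return the
two-character result, which is why a char-upcase restricted to single
characters has to diverge from the full mapping for this input.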

> Marc> I'm against it.  I think the syntax would be strange, overly complex
> Marc> and redundant.
> 
> I don't get this argument: Matthew's proposal has three different
> escape sequences for scalar values.  Implicit termination incurs
> complexity.  Also, specifying scalar values via the vanilla numerical
> literal syntax *removes* redundancy, as you can just re-use the lexer
> for the numerical literals.

But the redundancy in escape sequences you allude to is justified by
compatibility with C and Java.  We don't want to add more redundancy.
The "\Uh...h;" syntax I propose allows expressing codes beyond 65535.
I think the character syntax should offer a similar notation.  And yes,
if we standardize on such a notation, the #\#... syntax for characters
will be removed from Gambit.
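
As a rough sketch of how an explicitly terminated escape of the form
"\Uh...h;" might be decoded, the Python below expands such escapes in a
string.  The function name decode_scalar_escapes, the regular expression,
and the range checks are assumptions made for illustration; the proposal
as quoted here only fixes the general shape of the escape.

```python
import re

# Hypothetical pattern: backslash, "U", one or more hex digits, then ";".
_ESCAPE = re.compile(r"\\U([0-9A-Fa-f]+);")


def decode_scalar_escapes(text):
    """Replace each \\Uh...h; escape with the character it denotes."""

    def replace(match):
        value = int(match.group(1), 16)
        # Assumed check: accept Unicode scalar values only, which
        # excludes the surrogate range and anything above U+10FFFF.
        if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
            raise ValueError("not a Unicode scalar value: %X" % value)
        return chr(value)

    return _ESCAPE.sub(replace, text)


# U+1D11E lies beyond 65535; the ";" keeps the "!" out of the escape.
decoded = decode_scalar_escapes(r"G clef: \U1D11E;!")
```

Because the terminator is explicit, the decoder never has to guess where
the hex digits end, whatever character follows the escape.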

Marc

