[R6RS] Unicode scalar value escape sequences

Matthew Flatt mflatt
Fri Mar 4 00:01:26 EST 2005


At Thu, 3 Mar 2005 23:33:01 -0500, Marc Feeley wrote:
> > Marc> How would you define (char-upcase (integer->char #x00df))?
> > Marc> What about (char-ci<? (integer->char #x00df) #\T)?
> > 
> > We've been through this a zillion times: via the standard Unicode case
> > mapping.
> 
> But that does not conform to the Unicode specification.

I believe that Mike is referring to the code-point -> code-point
mappings that are defined by the Unicode standard.

Specifically, in "UnicodeData.txt", the last three fields of the entry
for a given code point provide one code point each for the upcase,
downcase, and titlecase mappings. (Well, zero or one code point each,
but an empty field is usually interpreted as "maps to itself".)

Thus, if a character is defined to be a Unicode code point, there is a
specific standard mapping that we might use to define
locale-insensitive `char-upcase', `char-downcase', and even
`char-titlecase' operations.
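
As a rough illustration of the above, here is a minimal sketch in Python
(chosen only for brevity; it is not a proposed R6RS implementation). It
assumes a few sample entries written in the "UnicodeData.txt" format, and
the helper names `load_simple_case_table', `char_upcase', `char_downcase',
and `char_titlecase' are hypothetical. An empty field is treated as "maps
to itself", following the reading given above.

```python
# Sample entries in the UnicodeData.txt format: semicolon-separated fields,
# the last three being the upcase, downcase, and titlecase mappings.
# A real implementation would read the complete file instead.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
00DF;LATIN SMALL LETTER SHARP S;Ll;0;L;;;;;N;;;;;
01C6;LATIN SMALL LETTER DZ WITH CARON;Ll;0;L;<compat> 0064 017E;;;;N;LATIN SMALL LETTER D Z HACEK;;01C4;;01C5
"""


def load_simple_case_table(text):
    """Map each code point to an (upcase, downcase, titlecase) triple."""
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        code = int(fields[0], 16)
        # Assumption: an empty mapping field means "maps to itself".
        up, down, title = (int(f, 16) if f else code for f in fields[-3:])
        table[code] = (up, down, title)
    return table


TABLE = load_simple_case_table(SAMPLE)


def char_upcase(cp):
    # Code points without an entry also map to themselves.
    return TABLE.get(cp, (cp, cp, cp))[0]


def char_downcase(cp):
    return TABLE.get(cp, (cp, cp, cp))[1]


def char_titlecase(cp):
    return TABLE.get(cp, (cp, cp, cp))[2]


if __name__ == "__main__":
    for cp in (0x0061, 0x00DF, 0x01C6):
        print("U+%04X -> up U+%04X, down U+%04X, title U+%04X"
              % (cp, char_upcase(cp), char_downcase(cp), char_titlecase(cp)))
```

Under this sketch, upcasing U+00DF returns U+00DF itself, because its
entry has no upcase field. The multi-character "SS" mapping is listed
separately in "SpecialCasing.txt" and is not a code-point -> code-point
mapping.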

Matthew


