[R6RS] revised Unicode SRFI

Tue Apr 11 09:44:03 EDT 2006

Matthew Flatt <mflatt at cs.utah.edu> writes:

> At Wed, 29 Mar 2006 15:47:28 +0200, Michael Sperber wrote:
>>   Moreover, it might be worth mentioning that
>>   CaseFolding.txt contains purely derived information.
>
> I think it's not derived. A comment in the file says that it's a
> "supplement to the UnicodeData file".

I'm still confused on this---if only for the purposes of implementing.
I've currently implemented case-folding by first doing string-upcase
and then mapping char-downcase over the result, like so:

(define (string-foldcase s)
  ;; map to uppercase, then back to lowercase char-by-char
  (let* ((ucase (string-upcase s))
         (size (string-length ucase)))
    (let loop ((i 0))
      (if (< i size)
          (begin
            (string-set! ucase i (char-downcase (string-ref ucase i)))
            (loop (+ 1 i))))
      ucase)))

Both steps require only UnicodeData.txt and SpecialCasing.txt. I generally
have a hard time reading the algorithms as written up in the Unicode
Consortium TRs, so I'm probably wrong---could you clarify?
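
One way to probe the question is to run the upcase-then-downcase
approach side by side with a string-foldcase that is driven by
CaseFolding.txt. The following is only a sketch: it assumes an
implementation that already provides such a string-foldcase, and the
names upcase-then-downcase and compare are made up for illustration.

```scheme
(import (rnrs))  ; under R7RS: (scheme base) (scheme char) (scheme write)

;; Hypothetical name, chosen so the built-in string-foldcase stays visible.
;; Same idea as the definition above, written without string-set!.
(define (upcase-then-downcase s)
  (list->string (map char-downcase (string->list (string-upcase s)))))

;; Hypothetical helper: print the input, the upcase-then-downcase result,
;; and the result of the implementation's own string-foldcase.
(define (compare s)
  (display (list s (upcase-then-downcase s) (string-foldcase s)))
  (newline))

(compare "Straße")
(compare "\x131;")  ; U+0131 LATIN SMALL LETTER DOTLESS I
```

The second input is the interesting one: UnicodeData.txt maps U+0131 up
to U+0049, which maps back down to U+0069, whereas CaseFolding.txt has
no default folding for U+0131. What actually gets printed depends on
the implementation and on the version of the Unicode data it uses.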

Also, an implementation of char-upper-case? and char-lower-case? must
look at PropList.txt, I believe, which isn't mentioned yet.

I'm very slightly concerned about the inconsistent spelling of
"title-case" (and, by transitivity, the other -cases) but that
probably can't be helped.

Other than that, it's ready to go by me.

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla