[R6RS] Source code encoding

Michael Sperber
Mon Mar 7 02:23:21 EST 2005


I believe it's important that we specify the encoding of Scheme source
code files in the standard for the obvious portability reasons.

Manuel mentioned in Snowbird that it might be a bad idea to just pick
UTF-8, as the standard Unicode encoding on Windows is UTF-16 + BOM.

My suggestion is to allow UTF-8 + BOM, UTF-16 + BOM, or Latin-1.  (We
may allow UTF-32 + BOM as well, but that seems a rare encoding in
files.)

This allows auto-detecting the actual UTF encoding used, except for
Latin-1 files that start with LATIN SMALL LETTER THORN, LATIN SMALL
LETTER Y WITH DIAERESIS (or the same in opposite order): those two
bytes, 0xFE 0xFF, are exactly the UTF-16 BOM.
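
As a rough illustration of the auto-detection described above, here is
a minimal Python sketch. It is not part of any proposal text, and the
names detect_encoding and decode_source are made up for this example.
It assumes only the UTF-8 and UTF-16 BOMs are checked (UTF-32 is left
out) and that anything without a BOM is treated as Latin-1:

```python
import codecs
from typing import Tuple

# BOM byte sequences taken from the Python standard library.
# UTF-32 is deliberately omitted in this sketch.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def detect_encoding(data: bytes) -> Tuple[str, int]:
    """Return (encoding name, number of BOM bytes to skip)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    # Assumption: no BOM means Latin-1, as in the suggestion above.
    return "latin-1", 0


def decode_source(data: bytes) -> str:
    """Decode source bytes using the detected encoding, dropping the BOM."""
    encoding, skip = detect_encoding(data)
    return data[skip:].decode(encoding)
```

With this sketch, a Latin-1 file whose first two characters are THORN
and Y WITH DIAERESIS, i.e. the bytes "þÿ".encode("latin-1"), is
reported as UTF-16 (big-endian), which is the misdetection mentioned
above.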

Alternatively, if we want to avoid this wart, we could replace Latin-1
by ASCII.

Opinions?

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

