[R6RS] Source code encoding

Michael Sperber sperber
Tue Mar 8 02:34:04 EST 2005


>>>>> "Marc" == Marc Feeley <feeley at IRO.UMontreal.CA> writes:

Marc> Something's strange here.  First of all there is no need for a BOM in
Marc> UTF-8 because UTF-8 is a sequence of bytes. [...]
>> 
>> For an explanation, check 
>> 
>> http://www.unicode.org/faq/utf_bom.html#BOM

Marc> But this reference also says that adding a BOM on UTF-8 is only useful
Marc> as a signature to disambiguate it from some encodings like UTF-32 and
Marc> Latin-1,

That's exactly what I intended it for.  Specifically, if we were going
to go with a meta-encoding, I didn't want to exclude UTF-32 and
Latin-1 a priori.

Marc> It would mean that you can't use a UTF-8 encoded Scheme source
Marc> file as a shell script.  That would be bad.

Yes, that's a good point.
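
To illustrate the concern: Unix-like kernels recognize a script by the
two bytes "#!" at the very start of the file, and a UTF-8 BOM
(EF BB BF) pushes those bytes to offset 3. A minimal Python sketch;
has_shebang is a hypothetical helper that only mimics that check and
is not part of any proposal here:

```python
import codecs


def has_shebang(data: bytes) -> bool:
    """Return True if the file contents begin with the two bytes '#!'.

    This mirrors, in simplified form, the check an operating system
    makes before handing a file to an interpreter.
    """
    return data[:2] == b"#!"


# Hypothetical script contents, used only for illustration.
script = b"#!/usr/bin/env scheme-script\n(display \"hello\")\n"

plain = script                        # UTF-8 without a BOM
with_bom = codecs.BOM_UTF8 + script   # same text, BOM prepended

print(has_shebang(plain))
print(has_shebang(with_bom))
```

The first check succeeds and the second fails, because the BOM
occupies the first three bytes of the file.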

Marc> I maintain that allowing UTF-16 + BOM and UTF-8 is a good compromise
Marc> (it covers the two most popular Unicode file encodings, allows shell
Marc> scripts, plain ASCII files need not be changed, and a wide range of
Marc> editors can be used).  We could however add that an initial BOM
Marc> on a UTF-8 encoded file is ignored.
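
The rule Marc describes could be sketched roughly as follows in
Python. This is only one reading of the proposal: decode_source is a
hypothetical name, and the sketch assumes that UTF-16 is accepted only
when a BOM is present and that everything else is treated as UTF-8,
with an optional leading BOM dropped.

```python
import codecs


def decode_source(data: bytes) -> str:
    """Decode source bytes as UTF-16 (BOM required) or UTF-8.

    Assumption: UTF-32 is not supported at all, so the overlap between
    the UTF-32 LE BOM (FF FE 00 00) and the UTF-16 LE BOM (FF FE) is
    deliberately not handled.
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    # No UTF-16 BOM: treat as UTF-8 and ignore an initial UTF-8 BOM.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.decode("utf-8")


# Plain ASCII is valid UTF-8, so existing files need no change.
text = decode_source(b"(define x 1)")
```

Under these assumptions, a file starting with "#!" and no BOM is still
decoded as UTF-8, so it remains usable as a shell script.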

I'm hesitant to adopt a meta-encoding that we have little experience
with, so I guess I'm backpedalling to wanting only UTF-8.  Are there
any editors / IDEs that matter to Scheme that can't produce UTF-8?

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

