[R6RS] Source code encoding

Marc Feeley feeley
Mon Mar 7 13:13:22 EST 2005


> >>>>> "Marc" == Marc Feeley <feeley at IRO.UMontreal.CA> writes:
> 
> Marc> The beauty of UTF-8 is that plain ASCII files (probably most current
> Marc> Scheme files) are compatible with UTF-8.  For UTF-8 + BOM you would
> Marc> need to add a byte order mark at the beginning of the ASCII
> Marc> file,
> 
> No you wouldn't, as I noted slightly below in my email in a part you
> didn't quote: you can always tell a file in UTF-8 + BOM apart from an
> ASCII file.

Why would this be interesting, since an ASCII-encoded file also happens
to be a valid UTF-8-encoded file?  Why would you want to distinguish these
two encodings by adding a BOM to UTF-8?
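
To make the point concrete, here is a minimal sketch in Python (not
something proposed in this thread; the helper names and the sample
source are made up for illustration).  It checks that a pure ASCII file
decodes to the same text as ASCII and as UTF-8, and that prefixing the
UTF-8 BOM (bytes EF BB BF) is exactly what makes the file stop being
ASCII:

```python
import codecs


def is_ascii(data: bytes) -> bool:
    """True when every byte is in the 7-bit ASCII range."""
    return all(byte < 0x80 for byte in data)


def has_utf8_bom(data: bytes) -> bool:
    """True when the data starts with the UTF-8 signature EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


# Hypothetical Scheme source file contents, pure ASCII.
sample = b"(define (square x) (* x x))\n"

# A pure ASCII file decodes to the same text under both codecs.
same_text = sample.decode("ascii") == sample.decode("utf-8")

# Prefixing a BOM makes the file non-ASCII, because 0xEF is above 0x7F.
with_bom = codecs.BOM_UTF8 + sample
```

So the two can indeed be told apart, but only because the BOM breaks
ASCII compatibility; without it the question never arises.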

> Marc> I feel a better solution is to allow UTF-8 and UTF-32 + BOM encodings
> Marc> of Scheme source files.  As for end-of-line encodings, I propose that
> Marc> all three end-of-line encodings (NL, CR, CR+NL) be equivalent.
> 
> If you drop UTF-16 (which is there mainly because it's the standard
> encoding on Windows), you might as well drop UTF-32 and just use UTF-8
> (sans BOM) as the only encoding.

Sorry, autocompletion didn't do the right thing.  I meant UTF-16 + BOM,
not UTF-32 + BOM.
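
As a rough sketch of what a reader for such files could look like
(Python, purely illustrative; `decode_source` is a hypothetical name,
and I am assuming a file without a UTF-16 BOM is treated as UTF-8):
pick the encoding from the leading bytes, then map all three
end-of-line encodings to a single newline.

```python
import codecs


def decode_source(data: bytes) -> str:
    """Decode source bytes as UTF-16 (when a BOM is present) or UTF-8,
    then treat CR+NL, CR, and NL as the same end-of-line."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM, picks the byte order,
        # and drops the BOM from the result.
        text = data.decode("utf-16")
    else:
        # Assumption: no BOM means UTF-8, which also covers plain ASCII.
        text = data.decode("utf-8")
    # Replace CR+NL first so it does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


# Hypothetical source mixing CR+NL and bare CR line endings.
raw_text = '(display "hi")\r\n(newline)\r'
from_utf8 = decode_source(raw_text.encode("utf-8"))
from_utf16 = decode_source(codecs.BOM_UTF16_LE + raw_text.encode("utf-16-le"))
```

Both calls should yield the same text, with every line ending in a
single NL.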

Marc

