[R6RS] Source code encoding
Mon Mar 7 13:13:22 EST 2005
> >>>>> "Marc" == Marc Feeley <feeley at IRO.UMontreal.CA> writes:
> Marc> The beauty of UTF-8 is that plain ASCII files (probably most current
> Marc> Scheme files) are compatible with UTF-8. For UTF-8 + BOM you would
> Marc> need to add a byte order mark at the beginning of the ASCII
> Marc> file,
> No you wouldn't, as I noted slightly below in my email in a part you
> didn't quote: you can always tell a file in UTF-8 + BOM apart from an
> ASCII file.
Why would this be interesting, since an ASCII-encoded file also happens
to be a UTF-8-encoded file? Why would you want to distinguish these
encodings by adding a BOM to UTF-8?
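
To make the point concrete, here is a minimal Python sketch (an
illustration only, not part of any proposal; the helper name
has_utf8_bom is made up). It checks that ASCII bytes decode to the same
text under both codecs, and that a UTF-8 BOM is just the three bytes
EF BB BF at the start of the file:

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

# Hypothetical Scheme source containing only ASCII characters.
ascii_source = b"(define (square x) (* x x))\n"

# Every ASCII byte sequence is also valid UTF-8 and decodes to the same text.
assert ascii_source.decode("ascii") == ascii_source.decode("utf-8")

# A BOM-prefixed file can be told apart because 0xEF is not an ASCII byte.
with_bom = codecs.BOM_UTF8 + ascii_source
assert has_utf8_bom(with_bom)
assert not has_utf8_bom(ascii_source)
```

So the BOM does make the two cases distinguishable, but the decoded text
is the same either way, which is why the distinction buys nothing.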
> Marc> I feel a better solution is to allow UTF-8 and UTF-32 + BOM encodings
> Marc> of Scheme source files. As for end-of-line encodings, I propose that
> Marc> all three end-of-line encodings (NL, CR, CR+NL) be equivalent.
> If you drop UTF-16 (which is there mainly because it's the standard
> encoding on Windows), you might as well drop UTF-32 and just use UTF-8
> (sans BOM) as the only encoding.
Sorry, autocompletion didn't do the right thing. I meant UTF-16 + BOM,
not UTF-32 + BOM.
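
As a rough sketch of what a reader would have to do under this proposal
(UTF-8 by default, UTF-16 only when a BOM is present, and all three
end-of-line encodings treated as equivalent), assuming Python and a
made-up helper name decode_source:

```python
import codecs

def decode_source(data: bytes) -> str:
    """Decode source bytes as UTF-16 if a BOM is present, else as UTF-8,
    then map CR+NL and lone CR to NL."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # Python's "utf-16" codec reads the BOM to pick the byte order
        # and drops it from the decoded text.
        text = data.decode("utf-16")
    else:
        # Plain ASCII files take this path unchanged.
        text = data.decode("utf-8")
    # Replace CR+NL first so it does not turn into two line breaks.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

For example, decode_source(b"(a)\r\n(b)\r(c)\n") gives "(a)\n(b)\n(c)\n".
This is only one possible reading of the proposal; it does not handle a
UTF-8 BOM or UTF-32.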