[R6RS] Source code encoding

Michael Sperber sperber
Tue Mar 15 08:33:18 EST 2005


>>>>> "Marc" == Marc Feeley <feeley at IRO.UMontreal.CA> writes:

Marc> Now you are advocating for using UTF-8 only.  Why not allow UTF-16 +
Marc> BOM also, since it does not conflict in any way with UTF-8 and UTF-16
Marc> + BOM is the norm on Windows for encoding Unicode text files?  What is
Marc> the downside of supporting both of these popular Unicode encodings?

- Because there are standard decoders out there where you can say
  "UTF-xx + BOM" where the auto-detection wouldn't work in the setup
  you describe.

- Because, if we allow two different concrete encodings now, we might
  want to add a third one in the future, and it's not clear that
  leaving out the BOM on one of them where it's actually allowed will
  scale.

- Because this auto-detection based on a tag that isn't always there
  makes me feel queasy, and doesn't seem very robust.

- Because of the perceived (by me) complexity.

- Because this is relevant only when you actually ship a Scheme source
  file to someone or some other implementation, in which case it
  shouldn't be hard to see that it's converted to UTF-8 if it isn't
  already.
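
For concreteness, here is a minimal sketch in Python of the kind of
BOM-based auto-detection under discussion. The function names
detect_source_encoding and decode_source are hypothetical, and treating
a missing BOM as UTF-8 is an assumption taken from the proposal quoted
above, not a settled rule.

```python
import codecs
from typing import Tuple


def detect_source_encoding(data: bytes) -> Tuple[str, int]:
    """Guess the encoding of raw source bytes from a leading BOM.

    Returns (encoding name, number of BOM bytes to skip).
    """
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8", len(codecs.BOM_UTF8)
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le", len(codecs.BOM_UTF16_LE)
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be", len(codecs.BOM_UTF16_BE)
    # Assumption: no BOM means UTF-8. This branch is a guess, because
    # the tag that would confirm the encoding is simply absent.
    return "utf-8", 0


def decode_source(data: bytes) -> str:
    """Decode source bytes using the BOM-based guess above."""
    encoding, skip = detect_source_encoding(data)
    return data[skip:].decode(encoding)
```

Every additional permitted encoding adds another branch, and the final
fallback stays a guess rather than a check.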

You're right that I should have stated my own position more
clearly at the outset.  I was just trying to summarize what had been
said, and to suggest a compromise.  (Also, I'm learning through this
discussion.)  Sorry about that.

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla

