[R6RS] Changing the transcoding mid-stream
sperber at informatik.uni-tuebingen.de
Sat Aug 19 04:56:50 EDT 2006
William D Clinger <will at ccs.neu.edu> writes:
> Mike wrote:
>> > The ones that appear to perform binary i/o use the procedures that,
>> > according to you, were not intended for binary i/o.
>> I think we're using different definitions of the word "binary I/O".
>> You seem to mean "untranscoded I/O" whereas I mean "I/O to and from
>> bytes objects and octets." Is that a correct interpretation of what
>> you're asking for?
> I don't want to answer in the affirmative, because
> I no longer have any confidence in my understanding
> of what you mean by transcoding. You are using that
> term to include both "compression or SSL or whatever"
> and Unicode encoding schemes, which to me are radically
> different things.
They are not to me. The SRFI spells out its notion of transcoding
under "Encoding" in the "Design rationale" section. Specifically:
>> This SRFI avoids this problem by specifying that textual I/O always
>> uses UTF-8. This means that, if the target or source of an I/O port
>> is to use a different encoding, a translated port needs to be used,
>> for which this SRFI offers the required facilities. This means that
>> text decoders or encoders are expressed as binary-to-binary
>> mappings, and as such compose.
Moreover, the second sentence in the section on "Text Transcoders" is:
>> A transcoder is an opaque object encapsulating a specific
>> translation from byte sequences to byte sequences.
Thus, the SRFI essentially treats textual I/O as a variant of binary
I/O, and transcoding works on the binary data. That's also why
they're called "transcoders" implying a translation from one encoding
to another, and not "encoders", "decoders," or "codecs."
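
To make the composition point concrete, here is a minimal sketch in Python rather than in the SRFI's own API. It assumes a transcoder can be modeled as a plain bytes-to-bytes function applied to a whole buffer. The names Transcoder, inflate, latin1_to_utf8, compose, and read_text are hypothetical and do not come from the SRFI. A real port would work incrementally on a stream.

```python
import zlib
from typing import Callable

# Assumption for illustration: a "transcoder" is any bytes -> bytes mapping.
Transcoder = Callable[[bytes], bytes]


def inflate(data: bytes) -> bytes:
    """Undo zlib compression; the bytes stay uninterpreted."""
    return zlib.decompress(data)


def latin1_to_utf8(data: bytes) -> bytes:
    """Translate ISO 8859-1 encoded bytes into UTF-8 encoded bytes."""
    return data.decode("latin-1").encode("utf-8")


def compose(*stages: Transcoder) -> Transcoder:
    """Chain transcoders left to right; the result is again bytes -> bytes."""
    def run(data: bytes) -> bytes:
        for stage in stages:
            data = stage(data)
        return data
    return run


def read_text(data: bytes) -> str:
    """In this model, textual input always interprets its bytes as UTF-8."""
    return data.decode("utf-8")


# A compressed Latin-1 source: two translations are stacked before text is read.
source = zlib.compress("naïve café".encode("latin-1"))
to_utf8 = compose(inflate, latin1_to_utf8)
text = read_text(to_utf8(source))
```

Because both stages have the same shape, decompression and character-set translation stack in the same way, and the textual layer only ever has to understand UTF-8.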
This is clearly contrary to your notion:
>> The first maps from uninterpreted binary to uninterpreted binary,
>> while Unicode encoding schemes are all about interpreting binary.
I don't know how to argue with you about this---the way the SRFI deals
with transcoding is one of its distinguishing characteristics (and has
been since day 1), and I made sure to point that out every time the
I/O discussion came up. There are tradeoffs with every approach to
this, which I considered, and I stand by the one I chose.
I don't think there's any "most programmers" notion about this as most
programmers (including myself when I started out designing this)
haven't really considered the implications of mixed binary and textual
I/O in a multi-encoding and multi-byte encoding setting.
> I think we should either change the proposal so it can
> support what most programmers mean by mixed binary and
> textual i/o, or we should change the proposal to support
> completely separate binary and textual i/o through
> completely separate sets of i/o procedures, or we should
> give up on binary i/o for R6RS and eliminate all of the
> operations that give the misleading impression of
> performing (what most programmers mean by) binary i/o.
Cheers =8-} Mike
Peace, international understanding, and all that blah blah