[R6RS] I/O questions for everyone: encoding errors

Michael Sperber sperber at informatik.uni-tuebingen.de
Fri Jul 21 11:55:53 EDT 2006


Matthew Flatt <mflatt at cs.utah.edu> writes:

> Options "a" and "b" can be expressed as decodings. For example, there's
> a decoding like UTF-8, except that bytes that would be bad in UTF-8 are
> decoded as U+FFFD. Similarly, there's a decoding that ignores bytes
> that would be bad for UTF-8.

Like so?

(transcoder (codec codec)
            (eol-style eol-style)
            (errors handling))

where `handling' is one of the symbols:

o ignore
o replace
o raise

Moreover, the text on transcoding errors would read:

---
When read-char and the various read-string-... procedures encounter
an invalid UTF-8 encoding, they raise an exception with condition
type &i/o-encoding.

If a transcoder for an input port encounters an invalid or incomplete
character encoding, it will behave according to the specified
error-handling mode.  If it is ignore, the first byte of the invalid
encoding will be ignored and decoding continues with the next byte.
If it is replace, the replacement character U+FFFD will be emitted by
the transcoder, and decoding continues with the next byte.  If it is
raise, an exception with condition type &i/o-encoding is raised.

If a transcoder for an output port encounters a character it cannot
encode, it will behave according to the specified error-handling mode.
If it is ignore, the character is ignored and encoding continues with
the next character.  If it is replace, the encoding of U+FFFD will be
emitted by the transcoder, and encoding continues with the next
character.  If it is raise, an exception with condition type
&i/o-encoding is raised.
---
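
To make the input-side wording concrete, here is a rough model of the
three modes.  It is written in Python rather than Scheme because the
transcoder syntax above is still a proposal.  The names decode_utf8
and EncodingError are made up for illustration (EncodingError stands
in for the &i/o-encoding condition type).  It covers only decoding and
is a sketch of one reading of the text, not a reference
implementation.

```python
class EncodingError(Exception):
    """Hypothetical stand-in for the &i/o-encoding condition type."""


def decode_utf8(data: bytes, errors: str = "raise") -> str:
    """Decode UTF-8 bytes, handling bad input as the draft text describes.

    errors is one of "ignore", "replace" or "raise".
    """
    if errors not in ("ignore", "replace", "raise"):
        raise ValueError("unknown error-handling mode: " + errors)
    out = []
    i = 0
    while i < len(data):
        # Find the shortest prefix at position i that is one valid character.
        for n in range(1, 5):  # a UTF-8 character is at most 4 bytes long
            try:
                ch = data[i:i + n].decode("utf-8")
            except UnicodeDecodeError:
                continue
            out.append(ch)
            i += n
            break
        else:
            # Invalid or incomplete encoding starting at byte i.
            if errors == "raise":
                raise EncodingError("invalid encoding at byte %d" % i)
            if errors == "replace":
                out.append("\ufffd")
            # For both ignore and replace, only the first byte is skipped
            # and decoding continues with the next byte.
            i += 1
    return "".join(out)
```

With the input bytes 61 FF 62, ignore yields "ab", replace yields "a"
followed by U+FFFD and "b", and raise signals the error at byte 1.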

Now, this leaves open the question of whether something should be
said about the state of the port after an &i/o-encoding exception.
I'd say we either leave it unspecified or require that the position
stay at the place it was before the read-... or write-... procedure
was called.
The latter option has the obvious advantage, but also the downside
that it restricts the implementation in some ways, and composes
poorly.
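
For illustration, the "position stays where it was" option might look
roughly like this on the input side.  Again this is a Python model
with made-up names (read_char here is not the Scheme procedure, and
EncodingError stands in for &i/o-encoding).  It assumes a seekable
byte port, which not every port can offer.

```python
import io


class EncodingError(Exception):
    """Hypothetical stand-in for the &i/o-encoding condition type."""


def read_char(port):
    """Read one UTF-8 character from a seekable binary port.

    If the bytes at the current position are not a valid encoding, the
    position is restored before the exception is raised.
    """
    start = port.tell()
    chunk = port.read(4)  # a UTF-8 character is at most 4 bytes long
    if not chunk:
        return ""  # end of input
    for n in range(1, len(chunk) + 1):
        try:
            ch = chunk[:n].decode("utf-8")
        except UnicodeDecodeError:
            continue
        port.seek(start + n)  # consume only the bytes of this character
        return ch
    port.seek(start)  # a failed read leaves the position untouched
    raise EncodingError("invalid encoding at byte %d" % start)


port = io.BytesIO(b"a\xffb")
first = read_char(port)  # reads "a", position is now 1
try:
    read_char(port)  # the byte FF is invalid
except EncodingError:
    position_after_error = port.tell()  # still 1
```

The caller can then decide what to do with the offending byte, for
example skip it with port.seek(2) and carry on reading.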

-- 
Cheers =8-} Mike
Peace, international understanding and all that blah blah


