[R6RS] Protocol for handling encoding/decoding errors

Michael Sperber sperber at informatik.uni-tuebingen.de
Thu Aug 24 10:58:32 EDT 2006


Here's a first cut at it, mainly for Will:

For encoding errors, i.e. putting text on an output port:

The idea is that, when the error is raised, the output port points
past the character that caused the error.  A continuable exception
with condition type &i/o-encoding is raised, with the following
definition:

(define-condition-type &i/o-encoding &i/o-port
  i/o-encoding-error?
  (char encoding-error-char))

The exception handler must itself write a replacement to the port,
e.g. with `put-bytes'.  Its return values are ignored.

Continuing means picking up after that character.  I.e., `put-string'
will continue after the character that caused the error.
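
A minimal sketch of the encoding side, under some assumptions: it
uses the names from the final R6RS `(rnrs)' library
(`make-i/o-encoding-error', `i/o-encoding-error?', `put-u8'), which
differ slightly from the draft names in this message, and
`put-string/latin-1' and `encode-with-question-marks' are hypothetical
helpers standing in for a real transcoding port.

```scheme
(import (rnrs))

;; Hypothetical toy encoder: writes STR to the binary port OUT as Latin-1.
(define (put-string/latin-1 out str)
  (string-for-each
   (lambda (ch)
     (let ((n (char->integer ch)))
       (if (< n 256)
           (put-u8 out n)
           ;; Unencodable character: nothing is written for it, so the
           ;; port is already "past" it.  The handler is expected to
           ;; write a replacement; its return values are ignored.
           (raise-continuable (make-i/o-encoding-error out ch)))))
   str))

;; Hypothetical driver: encodes STR, substituting #\? for anything
;; that cannot be represented.
(define (encode-with-question-marks str)
  (let-values (((out get-bytes) (open-bytevector-output-port)))
    (with-exception-handler
     (lambda (c)
       (if (i/o-encoding-error? c)
           (put-u8 (i/o-error-port c) (char->integer #\?))
           (raise-continuable c)))      ; not ours: pass it on
     (lambda () (put-string/latin-1 out str)))
    (get-bytes)))
```

With this sketch, encoding the three-character string made of `a',
U+03BB and `b' yields the bytes 97, 63, 98: the handler wrote the `?'
and encoding picked up again at `b'.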

For decoding errors, i.e. getting text from an input port:

When the error is raised, the input port points at the beginning of
the incorrect encoding.  A continuable exception with condition type
&i/o-decoding is raised, with the following definition:

(define-condition-type &i/o-decoding &i/o-port
  i/o-decoding-error?)

The exception handler must return a character or string representing
the decoded text starting at the port's current position, and update
the port's position to point past the error.

This has the disadvantage that a handler may want to try handling the
error, decline and leave the handling to some higher-level handler,
but that "try handling" involves removing data from the port, which is
not reversible.  I don't know an easy way around this.

Continuing means continuing the operation as though the returned
string had been read from the input.  Consider `get-line' seeing this:

foo<invalid byte>bar<linefeed>

with an exception handler like so:

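(lambda (c)
  ;; Consume the invalid byte so the port points past the error,
  ;; then return the replacement character.
  (get-byte (i/o-error-port c))
  #\?)

`get-line' would then return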

"foo?bar"
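
The `get-line' example can be reproduced with a small simulation.
This is only a sketch under assumptions: it uses the final R6RS names
(`get-u8', `lookahead-u8', `make-i/o-decoding-error') rather than the
draft `get-byte', and `get-line/ascii' and `read-line-replacing' are
hypothetical helpers that treat any byte above 127 as an invalid
encoding.

```scheme
(import (rnrs))

;; Hypothetical toy decoder: reads one line of ASCII text from the
;; binary input port IN.
(define (get-line/ascii in)
  (let loop ((chars '()))               ; characters so far, reversed
    (let ((b (lookahead-u8 in)))        ; peek without consuming
      (cond
       ((eof-object? b)
        (if (null? chars) b (list->string (reverse chars))))
       ((= b 10)                        ; linefeed ends the line
        (get-u8 in)
        (list->string (reverse chars)))
       ((< b 128)
        (get-u8 in)
        (loop (cons (integer->char b) chars)))
       (else
        ;; The port still points at the invalid byte.  The handler
        ;; must consume it and return a character or string, which is
        ;; spliced in as though it had been read from the input.
        (let* ((r (raise-continuable (make-i/o-decoding-error in)))
               (s (if (char? r) (string r) r)))
          (loop (append (reverse (string->list s)) chars))))))))

;; Hypothetical driver using the handler from the example.
(define (read-line-replacing bytes)
  (with-exception-handler
   (lambda (c)
     (cond ((i/o-decoding-error? c)
            (get-u8 (i/o-error-port c)) ; skip the invalid byte
            #\?)
           (else (raise-continuable c))))
   (lambda () (get-line/ascii (open-bytevector-input-port bytes)))))
```

Feeding it the bytes of "foo", one byte 255, "bar" and a linefeed
gives "foo?bar".  A handler that returns without consuming anything
would make this loop see the same invalid byte again, which is why
the handler has to move the port past the error.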

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla


